rocm-systems

Author	SHA1	Message	Date
vsytch	f72a669487	Add missing texturePitchAlignment member to the hipDeviceProp_t struct. (#1802 ) * Add missing texturePitchAlignment member to the hipDeviceProp_t struct. * Add missing hipDeviceAttributeTexturePitchAlignment enumerator to the hipDeviceAttribute_t enum. * Initialize texturePitchAlignment to 256. This works for gfx9+, but is technically overaligned in most cases for pre-gfx9. * Add the texturePitchAlignment property to the NVCC path.	2020-01-27 16:37:00 -08:00
Siu Chi Chan	f4555c835a	Detect when an explicit printf buffer flush is required (#1766 ) * Detect when an explicit printf buffer flush is required in a device/stream synchronization function. * hip_module.cpp: add missing hc_am.hpp header	2020-01-07 09:06:38 -08:00
Evgeny Mankov	0dadb23327	Merge pull request #1759 from emankov/master [HIP] Unify hipError_t (Step 2)	2019-12-30 19:21:09 +03:00
Evgeny Mankov	4921678b6c	[HIP] Clean-up deprecated HIP error codes hipErrorMemoryAllocation -> hipErrorOutOfMemory hipErrorInitializationError -> hipErrorNotInitialized hipErrorMapBufferObjectFailed -> hipErrorMapFailed hipErrorInvalidResourceHandle -> hipErrorInvalidHandle	2019-12-23 17:01:35 +03:00
Alex Voicu	75a11330aa	Fix late-coming issues. (#1724 ) Implementation for hipMemcpyWithStream.	2019-12-23 19:11:24 +05:30
Sarbojit2019	153a959280	Revert [HIP] Fixed hipStreamAddCallback (#1674 ) This reverts commit `45613311d7`. Addresses SWDEV#212675.	2019-11-20 11:55:46 +05:30
Jeff Daily	3a7eb694f5	hipStreamSynchronize can skip marker if stream is empty (#1667 )	2019-11-19 09:42:43 -08:00
Sarbojit2019	45613311d7	[HIP] Fixed hipStreamAddCallback [SWDEV#165185] (#1425 ) Fixed hipStreamAddCallback() as requested in SWDEV#165185 Added unit test to test the behavior	2019-11-07 13:18:12 +05:30
Rahul Garg	579a4f36fa	Rename hip/hip_hcc.h to hip/hip_ext.h (#1341 ) * Rename hip/hip_hcc.h to hip/hip_ext.h * Deprecate hip_hcc.h	2019-11-07 13:17:10 +05:30
Jeff Daily	85080905c0	hipEventRecord only needs one lock; remove locked_eventIsReady	2019-11-06 15:56:32 +00:00
Rahul Garg	96530cba3b	Fix PCI Domain ID query (#1424 ) * Fix PCI Domain ID query * Update BDF comment	2019-10-07 14:11:52 +05:30
satyanveshd	3d661e4706	Reimplement hipMemGetInfo (#1447 ) Addresses SWDEV-136570. hipMemGetInfo changed to compute free memory based on information from kfd instead of relying on hc::am_tracker.	2019-10-01 12:40:36 +05:30
Sarbojit2019	0fa42af08c	[HIP] Add tccDriver info in hipDeviceProp Fixes #1433.	2019-09-26 13:53:33 +05:30
ansurya	ceb734b917	Added new device attributes (#1377 ) * Added new device attributes * updated comment * updated with new device attributes supported	2019-09-16 08:31:30 +00:00
Jeff Daily	8384f487ad	fix bug where HIP_DB=1 seg faults at startup (#1388 )	2019-09-05 10:04:19 +00:00
Sarbojit2019	0722704f35	Updated hipErrorString and CUDAErrorTohipError (#1365 )	2019-08-29 01:02:59 +00:00
Siu Chi Chan	83af327ef2	Compile HIP runtime with hidden visibility by default (#1303 ) * add default visibility to most APIs in program_state * remove unwanted C++ headers * Add symbol visibility pragmas and compiler flags * Add visibility attribute to APIs in channel_descriptor and hip_hcc * remove unused headers * simplify build flags with hcc * add pragma visibility hidden to functional_grid_launch * [CMake] add gfx908 back	2019-08-08 08:33:04 +00:00
Alex Voicu	fbbed603ff	Fix hip_throw. (#1285 ) * Fix hip_throw. * Fix typo * No, really fix typo	2019-08-05 09:52:22 +00:00
Jeff Daily	1eb3dbf065	consolidate thread local storage (#915 ) * all thread local access now through single struct * clean up old commented-out code, more use of GET_TLS() * fewer calls to GET_TLS by passing tls as a function argument * revert unnecessary change to printf * fix failing tests due to TLS change * fix merge conflicts in ihipOccupancyMaxActiveBlocksPerMultiprocessor	2019-08-05 09:51:02 +00:00
wkwchau	aaec4f73a6	Added CooperativeLaunch and CooperativeMultiDeviceLaunch flag and property for hipDeviceGetAttribute() and hipGetDeviceProperties() (#1247 )	2019-08-02 10:00:25 +00:00
wkwchau	e7447d5809	Added query of hipDeviceAttributeHdpMemFlushCntl and hipDeviceAttribu… (#1238 ) * Added query of hipDeviceAttributeHdpMemFlushCntl and hipDeviceAttributeHdpRegFlushCntl * Added NVCC blocker for the hip*FlushCntl test cases	2019-08-01 16:03:35 +00:00
Jeff Daily	f096a3239e	remove stream locks where it is safe to do so	2019-07-22 17:38:51 +00:00
ansurya	8e496c09d9	Add Max Texture 1D,2D,3D device properties (#1226 ) * Add Max Texture 1D,2D,3D device properties * Corrected testcase to use enums defined in hipDeviceAttribute_t * Added texture 1D,2D and 3D support for NVIDIA path	2019-07-18 03:18:50 +00:00
Rahul Garg	1dcf618d20	Fix HIP_VISIBLE_DEVICES order (#1184 ) * Fix HIP_VISIBLE_DEVICES order * Fix device IDs mismatch * Fix review comments- loop order and device range check * Handle incomplete VISIBLE device env variable * Revert "Handle incomplete VISIBLE device env variable"	2019-07-18 03:18:04 +00:00
Aryan Salmanpour	999f45fc11	[hip] Move _criticalData of ihipStream_t class to private section and use criticalData() to access it (#1177 )	2019-07-04 00:42:19 +00:00
Aryan Salmanpour	96dc74897d	[hip] implement the hipExtLaunchMultiKernelMultiDevice API (#1165 ) * [hip] implement the hipExtLaunchMultiKernelMultiDevice API * add a guard to check the HCC version for acquire_locked_hsa_queue() API which was introduced in HCC for ROCm 2.5 * modified code based on the requested changes * changes to lock all streams before launching kernels for each device and unlock them after the dispatches * check each stream to be valid before starting to lock all the streams	2019-06-20 05:59:05 +05:30
Siu Chi Chan	00824be34c	move executable_cache into program_state.cpp	2019-05-24 17:27:25 -04:00
Maneesh Gupta	693bd556d4	Merge pull request #1083 from gargrahul/fix_hip_impl_visible_agents Maintain HIP_VISIBLE_DEVICES for kernel launch	2019-05-13 14:20:18 +05:30
Siu Chi Chan	f5eb91d53d	migrate program_state logic from header into shared library (phase I) (#1077 ) * Revert "Revert "Use COMgr to read Kernel Args Metadata (#1006)"" This reverts commit `a3d118eaa8`. * Revert "Use COMgr to read Kernel Args Metadata (#1006)" This reverts commit `8a548bf40b`. * Revert "improve program state commentary" This reverts commit `7aada87cbd`. * Revert "load program state once per agent" This reverts commit `c9117de8eb`. * start moving function_names() into the hip shared lib * start moving code_object_blobs to a new "state" object * Consolidate various program state related static objects into a single program_state object * minor clean up * move more stuffs from functional_grid_launch into program_state * debug make_kernarg * moving lookup for kernargs size_align into program_state * clean up old code for kernarg size and alignment * update hip_module to use newer api in program_state * Create public member functions for program_state * move most program state functions into shared library * Pass the data buffer size to load_executable Otherwise, it can't figure what the data size is just from the char* (since the data is not really a string) * turning free functions in program state into members of program_state_impl * change the free function globals() into a member of program_state_impl * replace the static mutex used for populating globals * moving associate_code_object_symbols_with_host_allocation into program_state_impl * move load_code_object_and_freeze_executable into program_state_impl * moving executables and functions_names into program_state_impl * moving kernels() into program_state_impl * moving functions() into program_state_impl * move get_kernargs into program_state_impl * moving kernel_descriptor into program_state_impl * moving kernargs_size_align calculation into program_state_impl * Changing the handle to program_state_impl to a pointer * moving program_state_impl into a separate inline source file * 
fixing/cleaning up some header file includes * moving member function for kernargs_size_align into program_state.cpp * moving Kernel_descriptor into program_state.inl * add a new class to manage agent globals * moving all agent globals processing functions into agent_globals_impl * load program state once per agent re-merging PR991 against other program state changes * fix per-agent program state member initialization * cache executables based on elf name, isa, and agent. This avoids program state reloading executables after a shared library is dlopened. re-merging PR1057 against other program state changes * protect executables cache by a global mutex * return ref to executables cache * adapt PR#981 Make hipModuleGetGlobal be in HIP runtime	2019-05-12 19:24:03 +05:30
wkwchau	29b3b46b42	Return hipErrorInsufficientDriver status when CPU device not found (#1064 ) * Return hipErrorInsufficientDriver status when CPU device not found - no exception thrown * Return hipErrorInsufficientDriver status when CPU device not found	2019-05-07 15:58:25 +05:30
Rahul Garg	620a07102d	Maintain HIP_VISIBLE_DEVICES for kernel launch	2019-05-07 05:09:02 +05:30
Sameer Sahasrabuddhe	abb9375707	minor cleanup: eliminate repetition	2019-04-25 20:41:16 +05:30
Jeff Daily	2b3037a6ea	In hipFree, synchronize owner of memory (#1018 ) * In hipFree, if memory is associated with a device, synchronize that device's streams. This changes the behavior from synchronizing the currently set TLS device. * All devices sync in hipFree for _appId=-1 case. * Revert "All devices sync in hipFree for _appId=-1 case." This reverts commit 1efb34d6a8426661e45bc5f763422a1147aeac10. * add HIP_SYNC_FREE env var	2019-04-16 08:35:55 +05:30
Maneesh Gupta	eb03d50de9	Merge pull request #962 from gargrahul/add_2d_copy_fallback Add 2D fallback to use copy kernel	2019-03-25 07:46:43 +00:00
Rahul Garg	9bbfbceb64	2D Fallback needs hcc workweek 19101 or higher	2019-03-25 12:07:28 +05:30
Siu Chi Chan	24d08beef8	reimplement HIP_INIT as hip_impl::hip_init(), add hip_init() to some of the inlined API (#966 ) * reimplement HIP_INIT as a function, expose it as hip_impl::hip_init() so that it could be called from hipLaunchKernelGGL and other inlined HIP functions * Don't call hip_init from ihipPreLaunchKernel	2019-03-20 05:11:15 +00:00
Rahul Garg	918d7e3a40	Add 2D fallback to use copy kernel	2019-03-14 13:03:06 +05:30
Alex Voicu	ea0fcf3e61	dlopen() fixes (#929 ) * Initial attempt to switch over to internally linked state. * Add missing CMake update. * hipLaunchKernelGGLImpl must be inline as well. Ensure internal linkage. * Ensure global retrieval uses internally linked state. * Hide HC in the implementation. Minimise ADL woes. * Strange software exists, and must be catered to. * Use a less spammy mechanism for ensuring internal linkage / non-export. * Remove leftover internal detail.	2019-03-06 17:31:44 +05:30
Maneesh Gupta	0dd26b4f63	Merge pull request #608 from gargrahul/add_pinned_2d_sdma_copy Added support for pinned 2D SDMA copy	2018-12-12 07:44:16 +05:30
Maneesh Gupta	7ce082415b	Merge pull request #773 from fronteer/master Support of printing process ID for HIP tracing	2018-11-23 11:16:22 +05:30
Qianfeng Zhang	81cf7cabfa	Add support of printing process ID for HIP Tracing	2018-11-22 18:58:06 +08:00
Evgeny	e5ba097afd	renaming HIP_INIT_CB_API to HIP_INIT_API	2018-11-13 15:33:26 +00:00
Evgeny	b8b1637ef7	adding activity prof layer	2018-11-13 15:33:26 +00:00
Yaxun Sam Liu	f5d8842f6a	Add HIP_DUMP_CODE_OBJECT	2018-10-26 14:14:00 -04:00
Yaxun Sam Liu	1299b65e15	Add HIP_DB=fatbin for debugging fat binary issues	2018-08-17 11:53:45 -04:00
Rahul Garg	1e57764378	Added support for pinned 2D SDMA copy	2018-07-31 14:05:35 +05:30
Sarunya Pumma	8111fd3b8b	Remove device mapping from shareWithAll memory When shareWithAll memory (e.g., host memory) is allocated, set appId in hc::AmPointerInfo to -1 to indicate that this memory is not mapped to any device. Peer checking in ihipStream_t::canSeeMemory is not necessary if memory is shared with all devices. Thus, it is skipped. Note that earlier host memory is always mapped to device 0 and HIP always performs peer checking for all kinds of hipMemcpy. Since the peer checking process requires context locking, hipMemcpy from/to host memory always grabs device 0's context lock. Therefore, if there is another thread holding the context lock of device 0 (e.g., hipDeviceSynchronize on device 0), hipMemcpy will have to wait for the lock until it can actually perform memcpy. This can significantly deteriorate execution performance. Signed-off-by: Sarunya Pumma <sarunya.pumma@amd.com>	2018-07-28 23:15:16 -07:00
Maneesh Gupta	7311b60220	Merge pull request #491 from scchan/fix_wait callback handling: don't need to wait for the thread to become ready	2018-06-06 14:38:25 +05:30
Siu Chi Chan	a1f3b587fb	remove the _ready flag in ihipStreamCallback_t and the mutex that protects it.	2018-06-04 17:29:04 -04:00
Rahul Garg	1a02bc364f	Add integrated device property	2018-06-02 13:11:16 +05:30

397 commits