rocm-systems

Author	SHA1	Message	Date
David Yat Sin	7ea25ebb85	rocr: Add thread priority for AsyncEventHandler Set priority to maximum for signal event handler and minimum for exceptions event handler. Change-Id: I1b982d3c2e4c880fafc073fe1a542d01692a6fdc	2025-01-24 10:08:12 -05:00
Saleel Kudchadker	26e105d9ab	Initial external logging API New API to accept a file stream for logging Co-authored-by: David Yat Sin <David.YatSin@amd.com> Change-Id: Ie09c35ae14ca86a97eb25f61251be287c55d7169 Signed-off-by: Chris Freehill <cfreehil@amd.com>	2024-08-07 02:59:00 +00:00
David Yat Sin	8d3fee5095	Use HybridMutex for signal mutexes Implement HybridMutex to improve latencies compared to KernelMutex when there is contention between several threads calling hsa_signal_create and hsa_amd_signal_async_handler. Change-Id: If53377033e749b0050727964c9303f09b02527cc	2024-01-16 21:29:39 +00:00
David Yat Sin	0ed1568afc	Add function to parse CPUID information Used to detect whether mwaitx instruction is supported Change-Id: I66fe906325aa523c8815133cf782df3a17a7edab	2023-02-22 16:55:42 +00:00
Shweta Khatri	8aac885318	Fixes hang due to change in order of initialization of libraries Fixes hang due to change in order of initialization of libraries that have cyclical dependencies and they call hsa_init() during their initialization phase. This implementation looks for a symbol called "HSA_AMD_TOOL_PRIORITY" across all loaded shared libraries using dynamic section entries of the loaded lib instead of using dlopen and dlsym for the same purpose. Change-Id: I4865f2fd18dd186ec311a432ec38fbb5583805d2	2023-01-26 01:17:22 -05:00
skhatri	e7fc301aa7	Adding support for rocrtracer tools loading without environment variable During hsa initializing stage, ROCr now searches all the loaded libraries for a symbol "HSA_AMD_TOOL_PRIORITY" and adds all those libraries to the tools library init list. Tools libraries listed in HSA_TOOLS_LIB env variable are also loaded in the given order and take priority over HSA_AMD_TOOL_PRIORITY. Change-Id: I739af42bbd777c44a9152c11e17dd69979b65e82	2022-06-23 20:08:30 -04:00
Sean Keely	0ee82742a7	Switch to CLOCK_BOOTTIME for HSA system clock. This is consistent with KFD and has significantly better latency. KFD is taking this as the definition of the SystemClockCounter. Change-Id: I4c1b3bc58c738206265c55ebefd41356c013bfe5	2022-05-05 15:27:29 -04:00
Sean Keely	df55cb0450	Rework memory locks to allow device parallelism in alloc/free. Prior solution used a single global lock to protect the memory tracking structures. This change protects the memory tracking structure with a shared mutex (rw lock) in shared (r) mode for memory allocations and frees so that long duration processes, calling to kfd, can be done in parallel. Operations which must modify the memory map take the mutex in exclusive mode (w) and must not call to the thunk while holding the mutex. The fragment allocator now requires separate protection and is protected with a mutex at the device level. Protecting at the device level, rather than pool, allows retention of the current recursive design and allows calling Trim from within Allocate. This could be made finer (pool level locks) but would require backing out of Allocate entirely to call Trim. Trim and any retried Allocation must be done in isolation (per device) or we may report OOM when memory is actually available in some pool's fragment cache. So some device level serialization is required in at least some paths. Change-Id: I7c1e94d6965ffcc602b12fefdd3a6e97b84b5e00	2021-11-24 19:22:05 -06:00
Ramesh Errabolu	fa13208698	Add rocr namespace to core header and impl files Change-Id: I1e1b33f9bba1078d049bc19797889988c3e43360	2020-06-19 22:34:21 -04:00
Sean Keely	ce19721c88	Update copyright date. Change-Id: If4bf4c20cf051878bfe759080bb7345d884dd53d	2020-06-19 22:34:01 -04:00
Ramesh Errabolu	45958c727d	Extend ROCr to surface UUID of GPU devices that support Change-Id: I478db68d69a01578770403fa695f9e6391637573	2020-04-08 19:19:22 -05:00
Sean Keely	a913549190	Correct pthread join/detach handling. Joined threads can not be joined more than once nor can they be detached. Thread library wait and close allows multiple waits and separate close so this fixes the pthread implementation. Change-Id: I0019271a438f11ed4c6c11854011f5c4f6e16b65	2019-05-16 12:14:06 -05:00
Sean Keely	426d41e27c	Adjust signal sleep to reflect null kernel latency. Performance tested on Gromacs. Change-Id: I3851148ee8544b15d840f2c26ca73a83f8d0df2e	2017-03-09 15:20:53 -05:00
James Edwards (xN/A) TX	7d2bc9d113	Separate open source core runtime code from DK makefiles. [git-p4: depot-paths = "//depot/stg/hsa/drivers/hsa/runtime/": change = 1250152]	2016-03-22 18:10:13 -05:00
James Edwards (xN/A) TX	7d1e6c3a57	Remove opensrc test files. [git-p4: depot-paths = "//depot/stg/hsa/drivers/hsa/runtime/": change = 1249961]	2016-03-22 13:39:51 -05:00
James Edwards (xN/A) TX	c9ffe0004e	Check open source core runtime code into perforce. This includes license and README files. [git-p4: depot-paths = "//depot/stg/hsa/drivers/hsa/runtime/": change = 1249136]	2016-03-20 15:39:40 -05:00

16 Commits