rocm-systems

Author	SHA1	Message	Date
Ken O'Brien	24d10e5c76	rocr: Fixes memory allocation issue Fixes a bug in memory allocation in which dmabuf export only works on GPU 0 in a multi-GPU environment. [ROCm/ROCR-Runtime commit: `7b8a6f8ca2`]	2025-06-24 14:53:14 -04:00
Sunday Clement	315b1abaf9	rocr: Add hsa-agent Queries for Clock Counters Add support for querying the HSA_AMD_INFO_GET_CLOCK_COUNTERS agent info through the HSA API in rocr, so the user no longer has to make a direct IOCTL call through the kernel driver. Signed-off-by: Sunday Clement <Sunday.Clement@amd.com> [ROCm/ROCR-Runtime commit: `e97d06530e`]	2025-06-23 18:45:09 -04:00
Tony Gutierrez	a62368e2ba	rocr: Update Driver queue-related APIs Update the user-mode driver queue APIs to leverage KMT types. Move queue-related calls to the core::Driver API. [ROCm/ROCR-Runtime commit: `e03d44d742`]	2025-06-23 12:21:01 -07:00
David Yat Sin	39bddd8b9d	rocr: support reserving non-registered VA Extend hsa_amd_vmem_address_reserve/hsa_amd_vmem_address_reserve_align to support HSA_AMD_VMEM_ADDRESS_NO_REGISTER flag. This allocation can be used to reserve virtual address ranges that can later be used by hsa_amd_svm_attributes_set for SVM based memory allocations. [ROCm/ROCR-Runtime commit: `b3c48cc68c`]	2025-06-18 18:21:11 -04:00
Chris Freehill	14b5faf333	rocr: Add missing close of dmabuf after import [ROCm/ROCR-Runtime commit: `24f36de037`]	2025-06-17 20:22:34 -04:00
David Yat Sin	e3b013b208	rocr: Always send free scratch notifications Always send notification to profiler tools when scratch memory is freed. [ROCm/ROCR-Runtime commit: `488cfd467c`]	2025-06-16 17:39:33 -04:00
Alysa Liu	a36892da4d	rocr: Fix wrong sizeof argument Update size calculation from 2 * sizeof(void) to 2 * sizeof(uint64_t) Signed-off-by: Alysa Liu <Alysa.Liu@amd.com> [ROCm/ROCR-Runtime commit: `3b450397d6`]	2025-06-16 13:11:07 -04:00
Sunday Clement	90e35e8486	rocr: Remove Recursive Include Removed unnecessary header include in file to prevent circular include. Signed-off-by: Sunday Clement <Sunday.Clement@amd.com> [ROCm/ROCR-Runtime commit: `31b6474801`]	2025-06-13 12:29:52 -04:00
Sunday Clement	76dbfc159c	rocr: Fix Recursive Include in header files scratch_cache.h includes amd_gpu_agent.h, which in turn includes scratch_cache.h. This has been fixed by removing the unnecessary header include. Signed-off-by: Sunday Clement <Sunday.Clement@amd.com> [ROCm/ROCR-Runtime commit: `06efa50c09`]	2025-06-13 12:29:52 -04:00
David Yat Sin	b66b6991b0	rocr: Remove scratch_backing_memory_byte_size scratch_backing_memory_byte_size was originally removed, and then put back in `e130172218`. This was because it was used by rocgdb. rocgdb code has been updated to not use this field. Bumped _amdgpu_r_debug for the ABI change. [ROCm/ROCR-Runtime commit: `3c0af843e3`]	2025-06-12 15:33:47 -04:00
David Yat Sin	37afa1c0eb	rocr: Remove support for Kaveri GPUs Kaveri GPUs are EoL [ROCm/ROCR-Runtime commit: `24ce840732`]	2025-06-12 10:38:58 -04:00
David Yat Sin	8982f2c2c6	rocr: Fix compile warning when using clang [ROCm/ROCR-Runtime commit: `96d0f07b15`]	2025-06-12 10:38:58 -04:00
Alysa Liu	ab747b1ffd	rocr: Prevent int overflow in arithmetic operation Cast range->x and range->y to uint64_t before performing multiplication Signed-off-by: Alysa Liu <Alysa.Liu@amd.com> [ROCm/ROCR-Runtime commit: `77b86ca908`]	2025-06-11 19:36:36 -04:00
David Yat Sin	ec4830eb5c	rocr: document pseudo-code for scratch reclaim Document CP FW and ROCr pseudo-code for asynchronous reclaim. No code change. [ROCm/ROCR-Runtime commit: `df5d66eae5`]	2025-06-11 16:19:59 -04:00
Chris Freehill	287986ab65	rocr: Add hsa_amd_portable_export_dmabuf_v2 The original version of hsa_amd_portable_export_dmabuf() did not consider the conditions under which a dmabuf could be shared. In the new version (hsa_amd_portable_export_dmabuf_v2()), the caller can specify the flag HSA_AMD_DMABUF_MAPPING_TYPE_PCIE, which means they want to share the dmabuf over PCIe. In that case, the new code returns an error if the device is a PCIe GPU that is not in an XGMI hive and large-BAR is not supported. [ROCm/ROCR-Runtime commit: `3a9d14bb66`]	2025-06-09 15:42:58 -05:00
Sunday Clement	5c7524ba3e	rocr: Fix Unintentional Integer Overflow It's safer to make the integer literal in this expression explicitly an unsigned long, since that is the type the errorCode variable resolves to; this prevents any overflow errors. Signed-off-by: Sunday Clement <Sunday.Clement@amd.com> [ROCm/ROCR-Runtime commit: `dce52be686`]	2025-06-09 15:16:10 -04:00
Alysa Liu	03430838af	rocr: Remove structurally dead code Remove unreachable return statement. Signed-off-by: Alysa Liu <Alysa.Liu@amd.com> [ROCm/ROCR-Runtime commit: `9b3d15e68d`]	2025-06-09 14:01:39 -04:00
Sunday Clement	1da312af87	rocr: Fix Potential Deadlock Moved the call to pthread_mutex_lock to an else statement for better code readability. Signed-off-by: Sunday Clement <Sunday.Clement@amd.com> [ROCm/ROCR-Runtime commit: `1635746a9c`]	2025-06-04 10:18:09 -04:00
Sunday Clement	25886ecda8	rocr: Fix Potential Deadlock Because eventDescrp->mutex is a non-recursive lock, attempting to acquire it with pthread_mutex_lock can cause the system to hang indefinitely if the lock was already acquired by the preceding call to pthread_mutex_trylock. Signed-off-by: Sunday Clement <Sunday.Clement@amd.com> [ROCm/ROCR-Runtime commit: `a97b7df4b9`]	2025-06-04 10:18:09 -04:00
Alysa Liu	6de1c81b71	rocr: Fix inefficient copy operations Refactor variable assignments to use std::move() where appropriate. Update function headers to accept parameters by const& where appropriate. Signed-off-by: Alysa Liu <Alysa.Liu@amd.com> [ROCm/ROCR-Runtime commit: `f6c8cbd293`]	2025-06-02 11:18:36 -04:00
Alysa Liu	65f5ce6f0a	rocr: Fixed inefficient copy operations Changed variable assignments to use std::move() where appropriate. Changed function headers to pass string arguments by reference where appropriate. Signed-off-by: Alysa Liu <Alysa.Liu@amd.com> [ROCm/ROCR-Runtime commit: `ae6851dbb4`]	2025-06-02 11:18:36 -04:00
Alysa Liu	b97f9ba6d5	rocr: Fixed inefficient copy operations Changed variable assignments to use std::move() where appropriate. Revert change in amd_kfd_driver.cpp. Signed-off-by: Alysa Liu <Alysa.Liu@amd.com> [ROCm/ROCR-Runtime commit: `a945b5d493`]	2025-06-02 11:18:36 -04:00
Alysa Liu	88dd451c64	rocr: Fixed inefficient copy operations Changed variable assignments to use std::move() where appropriate Signed-off-by: Alysa Liu <Alysa.Liu@amd.com> [ROCm/ROCR-Runtime commit: `369d89ade3`]	2025-06-02 11:18:36 -04:00
Sunday Clement	3d3cca8083	rocr: Fix Resource Leak allocated memory was previously not freed in the event of an error with rwlock initialization. Signed-off-by: Sunday Clement <Sunday.Clement@amd.com> [ROCm/ROCR-Runtime commit: `293092f32f`]	2025-05-30 09:16:26 -04:00
David Yat Sin	d2982b797a	rocr: Add all sysfs entries for L2 Cache For L2 Cache and above, we report the total amount of cache for the whole partition, so we add up the L2 Cache entry for each partition. [ROCm/ROCR-Runtime commit: `fc561ff37a`]	2025-05-29 19:02:38 -04:00
David Yat Sin	8f7c7458aa	rocr: Remove extra check for page-aligned ROCr initially had a bug where memory allocations that were not 4K aligned were internally 4K aligned, but ROCr would not keep track of the user-requested size. This would cause some pointer_info queries to fail, but HIP was already aligning the buffer sizes for IPC requests. For backward compatibility across 2 minor versions, we allowed IPC look-ups to be both aligned and un-aligned. Removing this check, as 4 minor versions have been released since then. [ROCm/ROCR-Runtime commit: `d52f1d0453`]	2025-05-29 12:35:15 -04:00
David Yat Sin	1b1d4e017a	rocr: Fix compile warnings [ROCm/ROCR-Runtime commit: `11da1293de`]	2025-05-28 16:12:02 -04:00
David Yat Sin	39ecc88315	rocr: Remove deprecated doorbell type 1 support [ROCm/ROCR-Runtime commit: `0d70045817`]	2025-05-28 16:12:02 -04:00
David Yat Sin	b83b8b4535	rocr: Remove deprecated queue doubleMap code [ROCm/ROCR-Runtime commit: `4bae509296`]	2025-05-28 16:12:02 -04:00
David Yat Sin	e84a855c98	rocr: Remove queue_full_workaround code Remove deprecated queue_full_workaround code as gfx7 and gfx8 GPUs are EoL. [ROCm/ROCR-Runtime commit: `b8434529a5`]	2025-05-28 16:12:02 -04:00
David Yat Sin	4ecd0382b7	rocr: update required CP FW version Update the CP FW version required for async-scratch memory support on gfx950. [ROCm/ROCR-Runtime commit: `04dbf769f6`]	2025-05-28 13:03:58 -04:00
David Yat Sin	5e7bd6145d	rocr: Fix compile error when using clang [ROCm/ROCR-Runtime commit: `9d38ca0d22`]	2025-05-27 23:56:28 -04:00
cfreeamd	c20e30db93	rocr: Support unmap adjacent mem sections in 1 try [ROCm/ROCR-Runtime commit: `f0ce7a8e59`]	2025-05-27 15:13:20 -04:00
cfreeamd	7fe67829ef	rocr: Fix ISA generic's for gfx906 wrt sramecc gfx9-generic cannot support sramecc- and sramecc+. The sramecc feature is only configurable on gfx906. The code object produced for gfx9-generic can be loaded on gfx906 with any sramecc setting; the compiler will produce ISA that works correctly with both (EF_AMDGPU_FEATURE_SRAMECC_ANY_V4). [ROCm/ROCR-Runtime commit: `b7361c5ee4`]	2025-05-27 07:45:00 -05:00
cfreeamd	b7d56427ec	rocr: Fix ISA generic's for gfx906 wrt sramecc gfx9-generic cannot support sramecc- and sramecc+. The sramecc feature is only configurable on gfx906. The code object produced for gfx9-generic can be loaded on gfx906 with any sramecc setting; the compiler will produce ISA that works correctly with both (EF_AMDGPU_FEATURE_SRAMECC_ANY_V4). [ROCm/ROCR-Runtime commit: `3e99bb6150`]	2025-05-27 07:45:00 -05:00
David Yat Sin	342e478e7d	rocr: Perform memcpy for small code-object loads On large BAR systems, for small-sized code-objects, we get better performance using direct memcpy due to latencies when doing the blit-copy. [ROCm/ROCR-Runtime commit: `da2607024b`]	2025-05-22 18:39:19 -04:00
David Yat Sin	9c5bb61708	rocr: Perform range based cache invalidates Invalidate only the address range that covers the newly copied code-object. This avoids invalidating I$ for old code objects and thus might increase I$ hit rate. [ROCm/ROCR-Runtime commit: `e969e01f54`]	2025-05-22 18:39:19 -04:00
Ben Vanik	62cd7e1f54	rocr: Fix SVM profiler QUEUE_RESTORE parsing [ROCm/ROCR-Runtime commit: `1a32392912`]	2025-05-21 13:17:25 -04:00
Flora Cui	89e5075ce0	rocr: try defaultSignal for intercept_queue if interrupt is not supported Signed-off-by: Flora Cui <flora.cui@amd.com> [ROCm/ROCR-Runtime commit: `8cf4b7fc05`]	2025-05-21 09:37:47 -04:00
Yiannis Papadopoulos	69505ab60c	Fix formatting [ROCm/ROCR-Runtime commit: `700078d335`]	2025-05-20 13:59:22 -05:00
Yiannis Papadopoulos	38c54b09ac	rocr/aie: Correct operand count [ROCm/ROCR-Runtime commit: `c80616d807`]	2025-05-20 13:59:22 -05:00
David Yat Sin	38ea4370c1	rocr: Fix doorbell ring When compiling with -O0, some compilers generate an xchg instruction for the __atomic_store(...) built-in. Using xchg on MMIO memory is undefined behavior and may be ignored on certain CPUs. [ROCm/ROCR-Runtime commit: `f011a9506d`]	2025-05-20 09:19:10 -04:00
Jiadong Zhu	b99015f30a	rocr/dtif: use default signal for intercept queue for DTIF Signed-off-by: Jiadong Zhu <Jiadong.Zhu@amd.com> Reviewed-by: David Yat Sin <David.YatSin@amd.com> [ROCm/ROCR-Runtime commit: `0f9d2b836c`]	2025-05-13 16:44:31 -04:00
Aaron Liu	85a11c729c	rocr/dtif: disable interrupt signal for DTIF backend Signed-off-by: Aaron Liu <aaron.liu@amd.com> Reviewed-by: David Yat Sin <David.YatSin@amd.com> [ROCm/ROCR-Runtime commit: `8c1b1201b7`]	2025-05-13 16:44:31 -04:00
Jiadong Zhu	a0dc167541	rocr/dtif: add hsaKmtQueueRingDoorbell in thunk loader hsaKmtQueueRingDoorbell is specific to DTIF backend Signed-off-by: Flora Cui <flora.cui@amd.com> Signed-off-by: Jiadong Zhu <Jiadong.Zhu@amd.com> Signed-off-by: Aaron Liu <aaron.liu@amd.com> Reviewed-by: Shane Xiao <shane.xiao@amd.com> Reviewed-by: David Yat Sin <David.YatSin@amd.com> [ROCm/ROCR-Runtime commit: `e2d767879d`]	2025-05-13 16:44:31 -04:00
Aaron Liu	008bbd94d5	rocr/dtif: add CreateThunkInstance/DestroyThunkInstance interfaces Signed-off-by: Aaron Liu <aaron.liu@amd.com> Reviewed-by: David Yat Sin <David.YatSin@amd.com> [ROCm/ROCR-Runtime commit: `e9088d6e47`]	2025-05-13 16:44:31 -04:00
Aaron Liu	c6ffc85a47	rocr/dtif: add DRM APIs wrapper in thunk loader Signed-off-by: Aaron Liu <aaron.liu@amd.com> Reviewed-by: David Yat Sin <David.YatSin@amd.com> [ROCm/ROCR-Runtime commit: `0cd4ddd62b`]	2025-05-13 16:44:31 -04:00
Aaron Liu	6cf184a0d4	rocr/dtif: replace hsakmt interfaces with HSAKMT_CALL(...) Signed-off-by: Aaron Liu <aaron.liu@amd.com> Reviewed-by: David Yat Sin <David.YatSin@amd.com> [ROCm/ROCR-Runtime commit: `1b79caa214`]	2025-05-13 16:44:31 -04:00
Aaron Liu	87dcbf1255	rocr/dtif: add thunk loader to wrap hsaKmt APIs For native and DTIF backends, unify to use HSAKMT_CALL(...) to call hsaKmt APIs. Signed-off-by: Aaron Liu <aaron.liu@amd.com> Reviewed-by: David Yat Sin <David.YatSin@amd.com> [ROCm/ROCR-Runtime commit: `7ba77fb193`]	2025-05-13 16:44:31 -04:00
Aaron Liu	137b168b46	rocr/dtif: add dtif environment variable Using HSA_ENABLE_DTIF to control dtif/native thunk code path Signed-off-by: Aaron Liu <aaron.liu@amd.com> Reviewed-by: David Yat Sin <David.YatSin@amd.com> [ROCm/ROCR-Runtime commit: `166b0fa45a`]	2025-05-13 16:44:31 -04:00
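
The integer-overflow fix in `ab747b1ffd` ("Cast range->x and range->y to uint64_t before performing multiplication") can be illustrated with a minimal sketch. The `Range` struct and both function names below are hypothetical stand-ins, and the 32-bit unsigned field types are an assumption; the actual ROCr types and code are not shown in this log.

```cpp
#include <cstdint>

// Hypothetical stand-in for a 2D range; NOT the actual ROCr type.
// Assumption: the fields are 32-bit unsigned integers.
struct Range {
  uint32_t x;
  uint32_t y;
};

// The multiplication is carried out in 32 bits, so the product wraps
// modulo 2^32 before it is widened to the 64-bit return type.
uint64_t area_narrow(const Range& r) {
  return r.x * r.y;
}

// Each operand is widened to 64 bits first, so the multiplication
// itself is performed in 64 bits and large products are preserved.
uint64_t area_wide(const Range& r) {
  return static_cast<uint64_t>(r.x) * static_cast<uint64_t>(r.y);
}
```

On a platform where `unsigned int` is 32 bits wide, calling both functions with `Range{65536u, 65536u}` gives different results: the narrow version wraps, while the widened version returns the full product.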

898 commits