Commit graph

2936 commits

Author SHA1 Message Date
Honglei Huang dee5bdc679 rocr: replace direct libhsakmt calls with driver interfaces
Replace direct hsakmt API calls with calls through the driver abstraction layer
in queue management related functions. This includes:
- CreateQueue/DestroyQueue operations
- Queue update and GWS allocation
- CU masking configuration

Also update the corresponding error status types from HSAKMT_STATUS to
hsa_status_t and adjust error handling accordingly.

Signed-off-by: Honglei Huang <Honglei1.Huang@amd.com>
2025-06-26 15:53:01 +08:00
Honglei Huang 046591419f rocr: use driver interface for memory and cache properties query
Replace direct libhsakmt calls with driver interface methods
in GpuAgent initialization:
- Replace hsaKmtGetNodeMemoryProperties with driver().GetMemoryProperties
- Replace hsaKmtGetNodeCacheProperties with driver().GetCacheProperties

Signed-off-by: Honglei Huang <Honglei1.Huang@amd.com>
2025-06-26 15:53:01 +08:00
Honglei Huang ffa07e28e7 rocr: remove unused agent properties reference in scratch initialization
The agent properties variable `agent_props` was declared but never used
in the `InitScratchSRD()` function, which caused a compile warning:

runtime/core/runtime/amd_aql_queue.cpp:1880:15: warning:
unused variable ‘agent_props’ [-Wunused-variable]
 1880 |   const auto& agent_props = agent_->properties();

No functional changes, purely a code cleanup commit.
2025-06-26 13:05:40 +08:00
Tony Gutierrez 1a339feb1f rocr: Move OpenSMI call to Driver 2025-06-25 15:53:02 -07:00
Yiannis Papadopoulos 2ca4d8f6d4 rocr/aie: Remove redundant and unused functions. 2025-06-25 11:32:42 -04:00
Yiannis Papadopoulos e5125c9d5e rocr/aie: Correct calculation of neural cores and avoid error on invalid queue ID. 2025-06-25 11:32:42 -04:00
Ken O'Brien 7b8a6f8ca2 rocr: Fixes memory allocation issue
Fixes a bug in memory allocation in which dmabuf export only works on
GPU 0 in a multi-GPU environment.
2025-06-24 14:53:14 -04:00
Sunday Clement d2b35dfee6 rocrtst: Add new test for querying Clock Counters
Added a new subtest to the Agent Properties test to check the
functionality of the query.

Signed-off-by: Sunday Clement <Sunday.Clement@amd.com>
2025-06-23 18:45:09 -04:00
Sunday Clement e97d06530e rocr: Add hsa-agent Queries for Clock Counters
Support has been added to query the HSA_AMD_INFO_GET_CLOCK_COUNTERS
agent info through the HSA API in rocr, so the user no longer has to
make a direct IOCTL call through the kernel driver.

Signed-off-by: Sunday Clement <Sunday.Clement@amd.com>
2025-06-23 18:45:09 -04:00
Tony Gutierrez e03d44d742 rocr: Update Driver queue-related APIs
Update the user-mode driver queue APIs to leverage KMT types.

Move queue-related calls to the core::Driver API.
2025-06-23 12:21:01 -07:00
David Yat Sin b3c48cc68c rocr: support reserving non-registered VA
Extend hsa_amd_vmem_address_reserve/hsa_amd_vmem_address_reserve_align
to support HSA_AMD_VMEM_ADDRESS_NO_REGISTER flag. This allocation can be
used to reserve virtual address ranges that can later be used by
hsa_amd_svm_attributes_set for SVM based memory allocations.
2025-06-18 18:21:11 -04:00
Chris Freehill 24f36de037 rocr: Add missing close of dmabuf after import 2025-06-17 20:22:34 -04:00
David Yat Sin 649ec63a4f rocrtst: Reduce host memory limit to 90%
Further reduce upper bound for rocrtstFunc.Memory_Max_Mem
as previous limit of 95% can still trigger OOM killer.
2025-06-16 21:02:20 -04:00
David Yat Sin 488cfd467c rocr: Always send free scratch notifications
Always send notification to profiler tools when scratch memory is freed.
2025-06-16 17:39:33 -04:00
Alysa Liu 3b450397d6 rocr: Fix wrong sizeof argument
Update size calculation from 2 * sizeof(void*) to 2 * sizeof(uint64_t)

Signed-off-by: Alysa Liu <Alysa.Liu@amd.com>
2025-06-16 13:11:07 -04:00
Sunday Clement 31b6474801 rocr: Remove Recursive Include
Removed an unnecessary header include to prevent a circular include.

Signed-off-by: Sunday Clement <Sunday.Clement@amd.com>
2025-06-13 12:29:52 -04:00
Sunday Clement 06efa50c09 rocr: Fix Recursive Include in header files
scratch_cache.h includes amd_gpu_agent.h, which in turn includes
scratch_cache.h. This has now been fixed by removing the unnecessary
header include.

Signed-off-by: Sunday Clement <Sunday.Clement@amd.com>
2025-06-13 12:29:52 -04:00
David Yat Sin 3c0af843e3 rocr: Remove scratch_backing_memory_byte_size
scratch_backing_memory_byte_size was originally removed, and then put
back in 02b38d0614. This was because it
was used by rocgdb. rocgdb code has been updated to not use this field.
Bumped _amdgpu_r_debug for the ABI change.
2025-06-12 15:33:47 -04:00
David Yat Sin 17b8f9b24d cmake: Remove unused file 2025-06-12 10:38:58 -04:00
David Yat Sin 24ce840732 rocr: Remove support for Kaveri GPUs
Kaveri GPUs are EoL
2025-06-12 10:38:58 -04:00
David Yat Sin 96d0f07b15 rocr: Fix compile warning when using clang 2025-06-12 10:38:58 -04:00
Alysa Liu 77b86ca908 rocr: Prevent int overflow in arithmetic operation
Cast range->x and range->y to uint64_t before performing multiplication

Signed-off-by: Alysa Liu <Alysa.Liu@amd.com>
2025-06-11 19:36:36 -04:00
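The widening cast described in 77b86ca908 can be illustrated with a minimal sketch. The `Range` struct and `ElementCount` function below are hypothetical stand-ins; the commit message does not state the field types, so 32-bit unsigned fields are an assumption here.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the range type mentioned in the commit.
// The 32-bit field width is an assumption made for illustration.
struct Range {
  std::uint32_t x;
  std::uint32_t y;
};

// Number of elements covered by a 2D range, computed in 64-bit arithmetic.
std::uint64_t ElementCount(const Range* range) {
  assert(range != nullptr);
  // Casting before the multiplication makes the whole expression 64-bit.
  // Without the casts, the product would wrap around at 32 bits and only
  // then be widened, which is too late to recover the lost high bits.
  return static_cast<std::uint64_t>(range->x) *
         static_cast<std::uint64_t>(range->y);
}
```

With `x` and `y` both set to 65536, the 32-bit product would wrap to 0, while the widened version yields 4294967296.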
David Yat Sin df5d66eae5 rocr: document pseudo-code for scratch reclaim
Document CP FW and ROCr pseudo-code for asynchronous reclaim.
No code change.
2025-06-11 16:19:59 -04:00
Chris Freehill a34604bddb rocr: Add hsa_amd_portable_export_dmabuf_v2
The original version of hsa_amd_portable_export_dmabuf() did not
consider the conditions under which a dmabuf could be shared.
In the new version (hsa_amd_portable_export_dmabuf_v2()), the caller
can specify the flag HSA_AMD_DMABUF_MAPPING_TYPE_PCIE, which means they
want to share the dmabuf over PCIe. In that case, if the GPU is a PCIe
GPU that is not in an XGMI hive and large-BAR is not supported, the new
code returns an error.
2025-06-09 15:42:58 -05:00
Chris Freehill 3a9d14bb66 rocr: Add hsa_amd_portable_export_dmabuf_v2
The original version of hsa_amd_portable_export_dmabuf() did not
consider the conditions under which a dmabuf could be shared.
In the new version (hsa_amd_portable_export_dmabuf_v2()), the caller
can specify the flag HSA_AMD_DMABUF_MAPPING_TYPE_PCIE, which means they
want to share the dmabuf over PCIe. In that case, if the GPU is a PCIe
GPU that is not in an XGMI hive and large-BAR is not supported, the new
code returns an error.
2025-06-09 15:42:58 -05:00
Sunday Clement dce52be686 rocr: Fix Unintentional Integer Overflow
It's safer to make the integer literal in this expression explicitly an
unsigned long, as that is what the type of the errorCode variable
resolves to, preventing any overflow errors.

Signed-off-by: Sunday Clement <Sunday.Clement@amd.com>
2025-06-09 15:16:10 -04:00
Sunday Clement d00ca2e9b7 rocr: Fix Unintended Sign Extension
ehdr->e_shentsize and ehdr->e_shnum are both 16-bit unsigned integers,
so they are implicitly promoted to signed int during the multiplication.
They must be explicitly cast to a larger unsigned type; otherwise, if
the signed product is large enough, the value is sign-extended,
resulting in incorrect values.

Signed-off-by: Sunday Clement <Sunday.Clement@amd.com>
2025-06-09 15:16:10 -04:00
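The promotion issue described in d00ca2e9b7 can be sketched as follows. `SectionTableSize` is a hypothetical helper, not the function touched by the commit; its parameters stand in for `e_shentsize` and `e_shnum`.

```cpp
#include <cstdint>

// Hypothetical helper: size in bytes of an ELF section header table.
// entry_size and entry_count stand in for e_shentsize and e_shnum.
std::uint64_t SectionTableSize(std::uint16_t entry_size,
                               std::uint16_t entry_count) {
  // Writing `entry_size * entry_count` directly would promote both
  // operands to signed int. A product above INT_MAX overflows the int,
  // and widening the result afterwards can sign-extend it.
  // Casting one operand first keeps the multiplication in unsigned
  // 64-bit arithmetic.
  return static_cast<std::uint64_t>(entry_size) * entry_count;
}
```

For the largest inputs, 0xFFFF times 0xFFFF, the widened product is 4294836225, which does not fit in a 32-bit signed int.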
Alysa Liu 9b3d15e68d rocr: Remove structurally dead code
Remove unreachable return statement.

Signed-off-by: Alysa Liu <Alysa.Liu@amd.com>
2025-06-09 14:01:39 -04:00
Alysa Liu 167602edfb rocr: Add proper file descriptor cleanup
Ensure file descriptor 'in' is properly closed in error cases
when calling _lseek() during readFrom() operations.
Fix potential resource leak when errors occur during file operations.

Signed-off-by: Alysa Liu <Alysa.Liu@amd.com>
2025-06-04 22:37:21 -04:00
Sunday Clement 1635746a9c rocr: Fix Potential Deadlock
Moved the call to pthread_mutex_lock into an else statement for better
code readability.

Signed-off-by: Sunday Clement <Sunday.Clement@amd.com>
2025-06-04 10:18:09 -04:00
Sunday Clement a97b7df4b9 rocr: Fix Potential Deadlock
Because eventDescrp->mutex is a non-recursive lock, attempting to
acquire it with pthread_mutex_lock can cause the system to hang
indefinitely if the lock was already acquired by the preceding call to
pthread_mutex_trylock.

Signed-off-by: Sunday Clement <Sunday.Clement@amd.com>
2025-06-04 10:18:09 -04:00
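The pattern behind the two deadlock fixes above (a97b7df4b9 and 1635746a9c) can be sketched as below. `AcquireOnce` is a hypothetical function written for illustration; the actual code around `eventDescrp->mutex` is not shown in the commit messages.

```cpp
#include <pthread.h>

// Acquires a non-recursive mutex exactly once.
// Returns true if the lock was taken without blocking, false if the
// caller had to wait for it.
bool AcquireOnce(pthread_mutex_t* mutex) {
  if (pthread_mutex_trylock(mutex) == 0) {
    // The lock is now held by this thread. Calling pthread_mutex_lock
    // again here on a non-recursive mutex could hang forever.
    return true;
  } else {
    // trylock failed, so this thread does not hold the lock yet and a
    // blocking acquire is safe.
    pthread_mutex_lock(mutex);
    return false;
  }
}
```

Putting the blocking call in the `else` branch makes it explicit that the two acquisition paths are mutually exclusive.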
Alysa Liu f6c8cbd293 rocr: Fix inefficient copy operations
Refactor variable assignments to use std::move() where appropriate.
Update function headers to accept parameters by const& where appropriate.

Signed-off-by: Alysa Liu <Alysa.Liu@amd.com>
2025-06-02 11:18:36 -04:00
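The two techniques named in these copy-related commits, `std::move()` for assignments and `const&` for read-only parameters, can be illustrated with a small sketch. `NameList` is a hypothetical class, not taken from the ROCr sources.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical container used only to illustrate the two techniques.
class NameList {
 public:
  // Read-only parameter: a const reference avoids copying the string
  // at every call.
  bool Contains(const std::string& name) const {
    for (const std::string& stored : names_) {
      if (stored == name) return true;
    }
    return false;
  }

  // Sink parameter: the argument is taken by value and then moved into
  // storage, so an rvalue argument is never deep-copied.
  void Add(std::string name) { names_.push_back(std::move(name)); }

  std::size_t Size() const { return names_.size(); }

 private:
  std::vector<std::string> names_;
};
```

A caller that no longer needs its string can write `list.Add(std::move(s))`, which transfers the buffer instead of duplicating it.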
Alysa Liu ae6851dbb4 rocr: Fixed inefficient copy operations
Changed variable assignments to use std::move() where appropriate.
Changed function headers to pass string arguments by reference where appropriate.

Signed-off-by: Alysa Liu <Alysa.Liu@amd.com>
2025-06-02 11:18:36 -04:00
Alysa Liu a945b5d493 rocr: Fixed inefficient copy operations
Changed variable assignments to use std::move() where appropriate.
Revert change in amd_kfd_driver.cpp.

Signed-off-by: Alysa Liu <Alysa.Liu@amd.com>
2025-06-02 11:18:36 -04:00
Alysa Liu 369d89ade3 rocr: Fixed inefficient copy operations
Changed variable assignments to use std::move() where appropriate

Signed-off-by: Alysa Liu <Alysa.Liu@amd.com>
2025-06-02 11:18:36 -04:00
Sunday Clement 293092f32f rocr: Fix Resource Leak
Allocated memory was previously not freed when rwlock initialization
failed.

Signed-off-by: Sunday Clement <Sunday.Clement@amd.com>
2025-05-30 09:16:26 -04:00
David Yat Sin fc561ff37a rocr: Add all sysfs entries for L2 Cache
For L2 Cache and above, we report the total amount of cache for the
whole partition, so we add up the L2 Cache entry for each partition.
2025-05-29 19:02:38 -04:00
David Yat Sin d52f1d0453 rocr: Remove extra check for page-aligned
ROCr initially had a bug where memory allocations that were not 4K
aligned were internally 4K aligned but ROCr would not keep track
of user-requested size. This would cause some pointer_info queries
to fail, but HIP was already aligning the buffer sizes for IPC
requests. For backward compatibility across 2 minor versions,
we allowed IPC look-ups to be both aligned and un-aligned.
Removing this check as 4 minor versions have been released
since then.
2025-05-29 12:35:15 -04:00
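The 4K alignment mentioned in d52f1d0453 is the usual round-up-to-page-size calculation. The sketch below shows that calculation only; `kPageSize` and `AlignUp` are hypothetical names, and the removed IPC look-up check itself is not reproduced here.

```cpp
#include <cstddef>

// Assumed page size of 4 KiB, matching the "4K aligned" wording above.
constexpr std::size_t kPageSize = 4096;

// Rounds size up to the next multiple of alignment.
// alignment must be a power of two for the bit mask to be valid.
constexpr std::size_t AlignUp(std::size_t size, std::size_t alignment) {
  return (size + alignment - 1) & ~(alignment - 1);
}
```

A request of 4097 bytes is backed by 8192 bytes, so a look-up keyed on the user-requested size and one keyed on the aligned size can disagree unless the runtime tracks both.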
David Yat Sin c3978d03a4 rocr: Update async-scratch reclaim API doc 2025-05-28 20:08:52 -04:00
David Yat Sin 11da1293de rocr: Fix compile warnings 2025-05-28 16:12:02 -04:00
David Yat Sin 0d70045817 rocr: Remove deprecated doorbell type 1 support 2025-05-28 16:12:02 -04:00
David Yat Sin 4bae509296 rocr: Remove deprecated queue doubleMap code 2025-05-28 16:12:02 -04:00
David Yat Sin b8434529a5 rocr: Remove queue_full_workaround code
Remove deprecated queue_full_workaround code as gfx7 and gfx8 GPUs are
EoL.
2025-05-28 16:12:02 -04:00
David Yat Sin 2b691c3d5f rocr: Remove addrlib files for EoL GPUs 2025-05-28 16:12:02 -04:00
David Yat Sin 04dbf769f6 rocr: update required CP FW version
Update the CP FW version required for async-scratch memory support
on gfx950.
2025-05-28 13:03:58 -04:00
David Yat Sin 9d38ca0d22 rocr: Fix compile error when using clang 2025-05-27 23:56:28 -04:00
Apurv Mishra d9a95605cc kfdtest: Disable KFD RAS test case
Disable the KFD RAS test case, as the tests cause a GPU reset
which affects the active kfdtest. The tests can only be
run successfully as separate processes.

Signed-off-by: Apurv Mishra <Apurv.Mishra@amd.com>
2025-05-27 19:04:04 -04:00
cfreeamd f0ce7a8e59 rocr: Support unmap adjacent mem sections in 1 try 2025-05-27 15:13:20 -04:00
Alysa Liu 625425326d rocr: Add check for 'value' pointer
Replaces the assertion check assert(value) with an explicit null pointer check.
Returns HSA_STATUS_ERROR_INVALID_ARGUMENT on null values.

Signed-off-by: Alysa Liu <Alysa.Liu@amd.com>
2025-05-27 12:18:04 -04:00
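The change described in 625425326d, replacing `assert(value)` with an explicit check, matters because `assert` is compiled out when `NDEBUG` is defined, leaving release builds unprotected. A minimal sketch follows; the `Status` enum and `GetInfo` function are hypothetical stand-ins for `hsa_status_t` and the real query method.

```cpp
#include <cstdint>

// Hypothetical stand-ins for hsa_status_t values.
enum Status {
  kSuccess = 0,
  kErrorInvalidArgument = 1,  // plays the role of HSA_STATUS_ERROR_INVALID_ARGUMENT
};

// Hypothetical query that writes a result through an output pointer.
Status GetInfo(std::uint32_t* value) {
  // An explicit check stays active in release builds, unlike assert().
  if (value == nullptr) {
    return kErrorInvalidArgument;
  }
  *value = 42;  // placeholder result for the sketch
  return kSuccess;
}
```

Callers then receive an error status for a null argument instead of a crash on dereference.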
Alysa Liu f1f34da4f6 rocr: Unchecked return value as arg
v1: Add value pointer validation before
dereferencing in GetInfo method for MODULE_NAME case.

Signed-off-by: Alysa Liu <Alysa.Liu@amd.com>
2025-05-27 12:18:04 -04:00