A copy-paste mistake in a previous commit reversed source and dest.
Restore the correct order of the source and dest params.
Fixes: 6de67d5d7c
Signed-off-by: Allen Hubbe <allen.hubbe@amd.com>
- Update the pinned SHA for TheRock in CI workflows.
- Update the version for actions in those same workflows.
- Comment out the rm .patch line and provide details on its use.
* Changes:
- Check read access only, instead of attempting to open files
to check permissions.
Do not try to open every path; doing so may cause driver issues.
Read access is sufficient to check permissions.
Reason: on GPUs that support partitioning (memory/compute),
logical devices are not valid until configured.
See `sudo amd-smi set -h` or the applicable APIs
to configure on supported hardware.
Example error dmesg output:
[965358.883112] amdgpu 0000:15:00.0: amdgpu: renderD153 partition 1 not valid!
[965358.883283] amdgpu 0000:15:00.0: amdgpu: renderD154 partition 2 not valid!
[965358.883438] amdgpu 0000:15:00.0: amdgpu: renderD155 partition 3 not valid!
[965358.883594] amdgpu 0000:15:00.0: amdgpu: renderD156 partition 4 not valid!
[965358.883749] amdgpu 0000:15:00.0: amdgpu: renderD157 partition 5 not valid!
[965358.883904] amdgpu 0000:15:00.0: amdgpu: renderD158 partition 6 not valid!
[965358.884060] amdgpu 0000:15:00.0: amdgpu: renderD159 partition 7 not valid!
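The read-access check described above can be sketched roughly as follows. This is a minimal illustration only, not the actual amd-smi code: the function name `readable_paths` and the example device paths are assumptions. The point is that `os.access` asks about permissions without opening the node, so the driver never sees an open on an unconfigured partition.

```python
import os


def readable_paths(paths):
    """Return the paths the current user may read, without opening any of them.

    os.access() only queries permissions; it does not open the file, so it
    should not trigger the "partition N not valid!" messages shown above.
    """
    return [p for p in paths if os.access(p, os.R_OK)]


if __name__ == "__main__":
    # Hypothetical render nodes, used purely for illustration.
    candidates = ["/dev/dri/renderD128", "/dev/dri/renderD153"]
    print(readable_paths(candidates))
```

Paths that do not exist are simply reported as not readable, so the caller does not need a separate existence check.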
---------
Signed-off-by: Charis Poag <Charis.Poag@amd.com>
[ROCm/amdsmi commit: d73726698b]
* clr: SWDEV-547890 - Maintain an MQD for the emulated AQL queue
To simplify the shader debugger implementation, maintain the relevant
parts of the emulated AQL queue's MQD (amd_queue_t): read_dispatch_id,
write_dispatch_id, compute_tmpring_size.
With this MQD, the shader debugger can handle the emulated AQL queue
the same way it does the real AQL queue; no specialization is required.
* clr: SWDEV-547890 - Conservatively update the MQD's read_dispatch_id
The read_dispatch_id cannot be smaller than the current
(aql_packet_id - hsa_queue.size) for the debugger to work correctly.
The read_dispatch_id really should be updated when the CmdBuf is marked
as complete. Left a FIXME to address it in a future commit.
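As a rough illustration of the lower bound stated above, the conservative update could look like the sketch below. The function name and its parameters are hypothetical and do not come from the clr sources; only the bound itself (read_dispatch_id >= aql_packet_id - hsa_queue.size) is taken from this message. The clamp at zero is an added assumption for the case where fewer packets than the queue size have been written.

```python
def conservative_read_dispatch_id(current_read_id, aql_packet_id, queue_size):
    """Return a read_dispatch_id that is never below aql_packet_id - queue_size.

    The value is also never moved backwards: if the current read id is
    already past the lower bound, it is kept as is.
    """
    lower_bound = max(aql_packet_id - queue_size, 0)
    return max(current_read_id, lower_bound)


if __name__ == "__main__":
    # A queue of 4 slots with 10 packets written so far: the read id
    # must be at least 6.
    print(conservative_read_dispatch_id(0, 10, 4))
```

This only enforces the minimum; it does not track actual CmdBuf completion, which is what the FIXME refers to.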
---------
Co-authored-by: Laurent Morichetti <laurent.morichetti@amd.com>
Use all available threads for polling the cq to increase the maximum
message rate. Even when posting a single wqe in the wave, use all
available threads for polling the cq to reserve space in the sq.
Changes were needed in the rocshmem abstraction to avoid disabling gpu
threads, like taking turns or using only the first thread in a wave or
wavefront. To avoid breaking other gda implementations, reimplement
turn-based or single thread strategy in post_wqe_rma_turn and
post_wqe_rma_single.
Signed-off-by: Allen Hubbe <allen.hubbe@amd.com>
[ROCm/rocshmem commit: 6de67d5d7c]
When creating a Python venv in the install_dependencies script, try to use ensurepip if it is installed, since it copes better with multiple venvs being active simultaneously (as seen on the CI buildbot).
[ROCm/rocshmem commit: b7a6d86c6b]
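The "use ensurepip if it is installed" check for the venv step can be sketched in Python as below. This is an assumption about the general approach, not the contents of the install_dependencies script: the helper names `ensurepip_available` and `create_venv` are hypothetical. `venv.create(..., with_pip=True)` bootstraps pip through ensurepip, so the flag is only set when the module can be found.

```python
import importlib.util
import venv


def ensurepip_available():
    """Return True if the ensurepip module can be imported by this interpreter."""
    return importlib.util.find_spec("ensurepip") is not None


def create_venv(env_dir):
    """Create a venv at env_dir, bootstrapping pip via ensurepip when possible.

    Returns whether pip was requested, so the caller can fall back to
    another way of installing pip when ensurepip is missing.
    """
    with_pip = ensurepip_available()
    venv.create(env_dir, with_pip=with_pip)
    return with_pip
```

Some distributions ship Python without ensurepip, which is why the check comes before the venv is created rather than relying on an error afterwards.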
Enable the image build on Windows.
Remove unused code that fails to build on Windows.
Minor improvements.
Temporarily exclude mipmap test files.
Prevent negative tests from affecting other tests.
Move some catch info logging code into the failed cases.
The use of std::call_once caused the initialization flag to be set permanently,
preventing proper re-attempts to load libdxcore.so when needed. This change removes
the once_flag mechanism and relies solely on dxcore_handle_ checks to manage library
loading, allowing proper re-initialization attempts.
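The difference between a one-shot flag and a handle check can be illustrated with the sketch below. It is written in Python for brevity and is not the actual loader: the class name `LazyLibrary` and the injectable `loader` parameter are hypothetical. The idea mirrored here is that a failed load leaves the handle empty, so the next call tries again instead of being blocked by a flag that was already set.

```python
import ctypes


class LazyLibrary:
    """Load a shared library on first use and retry after a failed attempt."""

    def __init__(self, name, loader=ctypes.CDLL):
        self._name = name
        self._loader = loader  # injectable so the retry logic can be tested
        self._handle = None

    def get(self):
        """Return the library handle, or None if it cannot be loaded right now."""
        if self._handle is None:
            try:
                self._handle = self._loader(self._name)
            except OSError:
                # Leave the handle unset so a later call can try again.
                return None
        return self._handle


if __name__ == "__main__":
    lib = LazyLibrary("libdxcore.so")
    print("loaded" if lib.get() is not None else "not available yet")
```

Once a load succeeds, the cached handle is returned on every later call, so the cost of retrying is only paid while the library is unavailable.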
Signed-off-by: yangsu13 <Yang.Su2@amd.com>
Reviewed-by: Flora Cui <flora.cui@amd.com>
Replace direct D3DKMT API calls with DXCORE_CALL macro in WDDM
thunk layer. This enables dynamic loading of DXCore functions
while maintaining the same API interface.
Updated thunk functions:
- MapGpuVirtualAddress, CreateAllocation, DestroyAllocation
- ReserveGpuVirtualAddress, FreeGpuVirtualAddress
- MakeResident, Evict, ShareObjects
- QueryResourceInfoFromNtHandle, OpenResourceFromNtHandle
All existing functionality is preserved while adding flexibility
for runtime DXCore availability detection.
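The general shape of dispatching through a dynamically loaded library, as opposed to calling a linked symbol directly, can be sketched as follows. This is a Python analogy and not the DXCORE_CALL macro itself: the helper `dxcore_call` and the `STATUS_UNAVAILABLE` value are hypothetical. The caller keeps the same call shape, while the lookup by name happens at run time and a missing library or symbol becomes a status code rather than a link error.

```python
from types import SimpleNamespace

# Hypothetical status returned when the library or symbol is not available.
STATUS_UNAVAILABLE = -1


def dxcore_call(handle, name, *args):
    """Call function `name` on a loaded library handle, if it is available."""
    if handle is None:
        return STATUS_UNAVAILABLE
    func = getattr(handle, name, None)
    if func is None:
        return STATUS_UNAVAILABLE
    return func(*args)


if __name__ == "__main__":
    # Stand-in for a loaded library; a ctypes.CDLL handle exposes exported
    # functions as attributes in the same way.
    fake = SimpleNamespace(D3DKMTEvict=lambda arg: 0)
    print(dxcore_call(fake, "D3DKMTEvict", None))
    print(dxcore_call(None, "D3DKMTEvict", None))
```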
Signed-off-by: Chengjun Yao <Chengjun.Yao@amd.com>
Signed-off-by: Yang Su <Yang.Su2@amd.com>
Reviewed-by: Shi.Leslie <Yuliang.Shi@amd.com>
Remove static linking to libdxcore library from CMakeLists.txt.
This prepares for dynamic loading implementation and eliminates
hard dependency on DXCore being present at build time.
The DXCore functionality will be loaded dynamically at runtime
in subsequent patches, making the library more flexible for
different deployment scenarios.
Signed-off-by: Chengjun Yao <Chengjun.Yao@amd.com>
Signed-off-by: Yang Su <Yang.Su2@amd.com>
Signed-off-by: Flora Cui <flora.cui@amd.com>
Reviewed-by: Shi.Leslie <Yuliang.Shi@amd.com>