* SyncAll test case would run Sync
* Despecialized name for argument reader
* Rename sync-test to team-sync-test as it uses teams
* Another stab at probing NUM_GPUS
[ROCm/rocshmem commit: 054bc33dc4]
Changes:
- Fix `rocm-smi --setsclk [0 .. n]` for multiple devices so that it continues
on failure when in a partitioned configuration (e.g. DPX/QPX/CPX).
- For partitioned configurations, or devices that do not support changing
sclk/mclk/PCIe clocks, the command now continues on failure and reports a
"not supported" or other (rocm-smi) error code for these devices.
- The same change applies to other clock settings such as `--setmclk` and
`--setpcie`.
Signed-off-by: Charis Poag <Charis.Poag@amd.com>
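The continue-on-failure behavior described above can be illustrated with a minimal sketch. This is not the rocm-smi implementation: `NotSupportedError`, `set_clock_on_devices`, and the `set_clock` callable are hypothetical names used only to show the control flow of trying every device and recording a per-device status instead of stopping at the first failure.

```python
class NotSupportedError(Exception):
    """Hypothetical error for devices that cannot change the clock."""


def set_clock_on_devices(devices, level, set_clock):
    """Apply a clock level to every device, continuing past failures.

    `set_clock(device, level)` is a hypothetical callable standing in for
    the real per-device setter. Returns a dict of device -> status string.
    """
    results = {}
    for dev in devices:
        try:
            set_clock(dev, level)
            results[dev] = "ok"
        except NotSupportedError:
            # e.g. a partitioned device: report and move on to the next one
            results[dev] = "not supported"
        except Exception as err:
            # any other failure is reported, but the loop still continues
            results[dev] = f"error: {err}"
    return results
```

Collecting one status per device keeps a single unsupported device from hiding the outcome for the rest.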
* initial commit
* add csv support extraction for non kernel selection mode
* add --kernel-trace for rocprofiler-sdk mode
* make non kernel selective mode runnable
* make kernel selection work with -k
* remove upper case of arg hint
* update documentation
* display same kernel name at only one place and merge instruction id with same obj id as well as offset
* remove kernel name's display for single kernel selection
* change log added
---------
Co-authored-by: Fei Zheng <44449748+feizheng10@users.noreply.github.com>
* SWDEV-554608 - Add hipHostRegisterIoMemory for hipHostRegister
* SWDEV-554174 Added hipHostRegisterIoMemory flag in test cases
* SWDEV-554174 - Formatting corrections
* SWDEV-554608 - set HSA_AMD_MEMORY_POOL_UNCACHED_FLAG if IoMemory is set
---------
Co-authored-by: Anavena Venkatesh <Anavena.Venkatesh@amd.com>
Co-authored-by: Rambabu Swargam <rambabu.swargam@amd.com>
* [SWDEV-542718] Correct socket_affinity
Updated socket affinity to show the bitmask and the expanded CPU list.
Signed-off-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>
* Update per-device local_cpulist for socket_affinity
Signed-off-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>
* Added amdsmi_get_cpu_affinity_from_local_cpulist API.
Updated the wrapper.
Signed-off-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>
* Revert "Added amdsmi_get_cpu_affinity_from_local_cpulist API."
This reverts commit 9a2ef934b1787f8aa09d3e4efe02f897b4295215.
* Moved the changes to the C API.
In case of SOCKET_SCOPE, use local_cpulist first.
If it is unavailable or not readable, fall back to
NUMA.
Signed-off-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>
* Addressed review comments
Signed-off-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>
---------
Signed-off-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>
[ROCm/amdsmi commit: 09a97f02ed]
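As a rough illustration of the socket_affinity change above, the sketch below expands a kernel-style CPU list string (such as `0-3,8-9`) into an explicit CPU list and a bitmask, and falls back to a NUMA-derived list when local_cpulist cannot be read. The real change lives in the amdsmi C API; the function names and the two reader callables here are assumptions made for the example.

```python
def parse_cpulist(text):
    """Expand a CPU list string like '0-3,8-9' into a sorted list of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-", 1)
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def cpus_to_bitmask(cpus):
    """Set bit N for every CPU N in the list."""
    mask = 0
    for cpu in cpus:
        mask |= 1 << cpu
    return mask


def socket_affinity(read_local_cpulist, read_numa_cpulist):
    """Prefer local_cpulist; fall back to the NUMA list if unreadable or empty.

    Both arguments are hypothetical callables that return the raw text.
    """
    try:
        text = read_local_cpulist()
    except OSError:
        text = ""
    if not text.strip():
        text = read_numa_cpulist()
    cpus = parse_cpulist(text)
    return cpus, hex(cpus_to_bitmask(cpus))
```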
- Remove unimplemented older API functions
- Remove mentions of reattach API
- Remove details on implementing a process attachment library
- This will return later as a theory of operation
* add gfx1100 support
Add support for Radeon 7900 GPUs (RX and PRO), and 7800 PRO.
I considered adding gfx1101 and gfx1102 GPUs as well, but those are lower-end models that are less likely to be used for compute-intensive jobs. In addition, I do not have access to them to test the support.
* update WF_SIZE for different options
Radeon systems use a warp size of 32, unlike current Instinct systems,
which use a warp size of 64. For the device side, a gfx specific ifdef
is sufficient. For the host side, we need to query the device
properties.
* adjust functional tests to wf_size of 32
* update unit tests to handle wf_size of 32
* address reviewer comments
[ROCm/rocshmem commit: d0c2845031]
* Add `ROCSHMEM_CTX_INVALID` for invalid context handling
- Define `ROCSHMEM_CTX_INVALID` as {nullptr, nullptr}
- Add == and != operators to rocshmem_ctx_t
- Use `ROCSHMEM_CTX_INVALID` on failed context creation
- Skip ctx destroy if context is invalid
* Update docs for context create and destroy APIs usage and behavior
[ROCm/rocshmem commit: 955c22aeed]
* Check if test exists before adding validation
* Adjust validation parameters for rocpd_string
Signed-off-by: David Galiffi <David.Galiffi@amd.com>
---------
Signed-off-by: David Galiffi <David.Galiffi@amd.com>
* Add rdhc script to the rocm-core package
* Create the rdhc symlink within the package itself.
* rdhc tool support is not enabled for Windows.
* [RDHC] Check whether the required pip packages are present and warn if not.
rdhc checks whether the required pip packages are present.
If they are not, it warns the user and exits gracefully.
Signed-off-by: Saravanan Solaiyappan <saravanan.solaiyappan@amd.com>
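The package check described above could look roughly like the following sketch. It is not the rdhc code: the function names are hypothetical and the list of required packages is left to the caller. It relies only on the standard-library `importlib.metadata` lookup of installed distributions.

```python
from importlib import metadata


def missing_packages(required):
    """Return the names in `required` that are not installed distributions."""
    missing = []
    for name in required:
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing


def check_requirements(required):
    """Warn about missing packages and return a non-zero exit status.

    The caller decides how to exit; returning a status instead of raising
    keeps the exit graceful (no traceback shown to the user).
    """
    missing = missing_packages(required)
    if missing:
        print("Warning: missing pip packages: " + ", ".join(missing))
        return 1
    return 0
```

A caller might end with `sys.exit(check_requirements([...]))`, passing whatever package list the tool actually needs.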
* Create the directory if it doesn't exist, whether the path is the default or user-provided
* Fix npkit_dump_dir in npkit_trace_generator.py
---------
Co-authored-by: BertanDogancay <bertan.dogancay@gmail.com>
[ROCm/rccl commit: aec4f0a659]
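The directory handling described in this change can be sketched as below. This is an assumption-based example rather than the actual npkit_trace_generator.py code: `DEFAULT_DUMP_DIR` and `ensure_dump_dir` are hypothetical names, and the point is only that the same creation step runs for both the default and a user-provided path.

```python
import os

# Hypothetical default; the real script's default path is not shown here.
DEFAULT_DUMP_DIR = "npkit_dump"


def ensure_dump_dir(path=None):
    """Create the dump directory if needed and return the path used."""
    target = path if path else DEFAULT_DUMP_DIR
    # exist_ok=True makes this a no-op when the directory already exists
    os.makedirs(target, exist_ok=True)
    return target
```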