Although the value is correct, there is no single source of truth between
kernel and userspace. This leads to problems if the kernel has strict
restrictions (such as kernel 6.17 or earlier). The restrictions were
lifted in 6.17.9 and 6.18, but there is no guarantee userspace is
running one of those kernels.
So in the short term this value will be wrong, but on newer kernels the
kernel will communicate the right size and rocr-runtime will be adjusted
to use that.
Link: https://github.com/ROCm/TheRock/pull/2505
Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
* When writing bulk packets, always invalidate packet headers. It's
possible that the CP fetcher can have multiple packets in flight. In
such cases we may end up with a malformed packet because the writes are
not complete, yet the CP finds a valid header.
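The write ordering described above can be sketched as a small Python model. This is only an illustration under stated assumptions, not the ROCr implementation: `INVALID_HEADER`, `write_bulk`, and the list-of-dicts ring are hypothetical names, and a real runtime would also need the appropriate memory ordering on the header stores.

```python
INVALID_HEADER = 1  # hypothetical stand-in for an "invalid packet" header value


def write_bulk(ring, start, packets):
    """Write packets into ring slots so that a reader never finds a valid
    header on a half-written packet (illustrative model only)."""
    size = len(ring)
    slots = [(start + i) % size for i in range(len(packets))]

    # 1. Invalidate every header in the range before touching any body.
    for slot in slots:
        ring[slot]["header"] = INVALID_HEADER

    # 2. Write the packet bodies while all headers are still invalid.
    for slot, pkt in zip(slots, packets):
        ring[slot]["body"] = pkt["body"]

    # 3. Publish the real headers last, once the bodies are complete.
    for slot, pkt in zip(slots, packets):
        ring[slot]["header"] = pkt["header"]
    return slots
```

Without step 1, a slot that still holds a stale valid header from an earlier pass around the ring could be fetched while its body is only partly rewritten.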
* [hip] Docs: Overhaul HW implementation page
* Update hardware implementation and glossary
* Update programming model
* Add performance optimization
* Split into how-to and understanding
---------
Signed-off-by: Jan Stephan <jan.stephan@amd.com>
Co-authored-by: Jan Stephan <jan.stephan@amd.com>
Co-authored-by: Julia Jiang <julia.jiang@amd.com>
* rocrtst: Updated CMake files to use find_package instead of hardcoded paths
This is to support the TheRock build environment
* rocrtst: Fix CMake to use find_package() instead of hardcoded ENV paths
Fixed CMake style issues from the first commit's code review
* rocrtst: Fix rocrtst NUMA dependency detection to use find_package
Also added handling of missing headers
* rocrtst: Fix NUMA and hwloc detection for cross-platform builds
---------
Co-authored-by: Shweta Khatri <shweta.khatri@amd.com>
* [hip-tests] Update API coverage report generator
Updates the HIP API coverage tool. It now takes
extra arguments for the location of the catch test folder
and for the working directory. This avoids issues where the output
of the executable is dependent on the path where it is being
executed from.
Also updates CMakeLists.txt to integrate seamlessly with the
hip-tests project and avoid using commands which rely on
relative paths.
* Remove double new line
* Remove CMake option to generate coverage
Removes CMake option to generate coverage. Instead, explicitly removes
the gen_coverage target from all (this is already the default but
doing it explicitly prevents confusion).
* SWDEV-558848 - VMM API support for ROCr on Windows
* Fixes to VMM handle Map/Unmap Set/Get Access
* Fix GetShareableHandle to use pointer for shareable handle
* Update os specific map/unmap memory calls
* clang format update
* Minor syntax fixes from code review
Co-authored-by: Yiannis Papadopoulos <102817138+ypapadop-amd@users.noreply.github.com>
---------
Co-authored-by: Rahul Manocha <rmanocha@amd.com>
Co-authored-by: Yiannis Papadopoulos <102817138+ypapadop-amd@users.noreply.github.com>
* Remove TCP_TCP_LATENCY_sum counter for MI300
* Remove TCP_TCP_LATENCY_sum counter which is unsupported for MI300 per register specification
* Remove VL1 Lat metric from memory chart section (block 3) for MI300
since it uses the unsupported TCP_TCP_LATENCY_sum counter
* Remove references to TCP_TCP_LATENCY_sum
* Update CHANGELOG
* reword changelog
* Read the ids_flags when fetching GPU info
The ids_flags contains the flags that can help identify if a GPU
is a dGPU or an APU.
* Show correct memory pool for APUs
The kernel policy for APUs will be to choose the bigger pool of
memory (GTT or VRAM) for KFD work. Adjust the policy for the monitor
and default commands to show the right memory pool when using an APU.
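The pool selection described above can be sketched as follows. This is a minimal sketch, not the tool's actual code: `pick_memory_pool` and its parameters are hypothetical, reporting VRAM for dGPUs is an assumption, and so is the tie-break, since the commit only says "the bigger pool".

```python
def pick_memory_pool(is_apu, vram_total, gtt_total):
    """Return the name of the memory pool to report (hypothetical helper)."""
    if not is_apu:
        # Assumption: discrete GPUs keep reporting VRAM.
        return "vram"
    # APU: mirror the kernel policy and report the bigger pool.
    # Assumption: a tie goes to VRAM.
    return "gtt" if gtt_total > vram_total else "vram"
```

The `is_apu` flag would come from the ids_flags read in the previous commit.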
* Add ls statement for debugging /opt directory file naming
* Update ROCM_VERSION from 7.0.0 to 7.1.1 in SDK CI
* Update amdgpu debian package for Ubuntu in Dockerfile.ci
* disable HIP/CLR build in codeql (#2242)
---------
Co-authored-by: Venkateshwar Reddy Kandula <Venkateshwarreddy.Kandula@amd.com>
* Change rocprofiler-sdk CMake compatibility to AnyNewerVersion
Update CMake package version compatibility from SameMinorVersion to
AnyNewerVersion to allow downstream packages (like RDC) to use newer
versions of rocprofiler-sdk without requiring exact minor version match.
This fixes compatibility issues where RDC requests 1.0.0 but finds 1.1.0.
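The difference between the compatibility modes can be illustrated with a simplified Python model of the checks that a version file generated by `write_basic_package_version_file` performs. It is a sketch only: `parse` and `is_compatible` are hypothetical helpers, and version ranges, `ExactVersion`, and tweak components are ignored.

```python
def parse(version):
    """Split "1.2.3" into a tuple of ints, padded to three components."""
    parts = [int(p) for p in version.split(".")]
    return tuple(parts + [0] * (3 - len(parts)))


def is_compatible(installed, requested, mode):
    """Simplified model of CMake's COMPATIBILITY modes."""
    inst, req = parse(installed), parse(requested)
    if inst < req:
        # Every mode modelled here rejects an installed version that is too old.
        return False
    if mode == "AnyNewerVersion":
        return True
    if mode == "SameMajorVersion":
        return inst[0] == req[0]
    if mode == "SameMinorVersion":
        return inst[:2] == req[:2]
    raise ValueError(f"unsupported mode: {mode}")
```

With this model, `is_compatible("1.1.0", "1.0.0", "SameMinorVersion")` is `False`, which is the RDC failure described above, while the same request under `AnyNewerVersion` is accepted.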
* Update projects/rocprofiler-sdk/cmake/rocprofiler_config_install.cmake
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* Change rocpd and roctx CMake compatibility to SameMajorVersion
Update COMPATIBILITY setting from SameMinorVersion to SameMajorVersion
for both rocpd and roctx packages to allow compatibility across minor
versions within the same major version.
---------
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* SWDEV-554626 - return hipErrorInvalidDeviceFunction when we cannot load module
Return correct error code when modules are empty
* Match the error codes
* Revert the error code
* Don't require powercap support
APUs don't necessarily support setting a power cap from sysfs.
Ignore failures caused by the file being missing.
Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
* Show edge temperature in default output if hotspot is missing
APUs don't have a hotspot temperature, but they do have an edge
temperature. Use that.
Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
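The fallback described above can be sketched as follows. This is a minimal sketch: `default_temperature` and the `"hotspot"` / `"edge"` dictionary keys are hypothetical names, not the tool's actual data model.

```python
def default_temperature(temps):
    """Pick the temperature to show: hotspot if present, otherwise edge.

    `temps` maps a sensor name to a reading, or to None when the sensor
    is missing. Returns (sensor_name, value), or (None, None) if neither
    sensor is available.
    """
    for key in ("hotspot", "edge"):
        value = temps.get(key)
        if value is not None:
            return key, value
    return None, None
```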
* Format all "power" keys as watts
There will be more power keys when APU support is added, so format
them properly.
Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
* Don't show power limit in output if it's invalid
APUs can't set power limit using power_cap1 interface. The limit
will be 0 and thus the UX looks weird in default output.
Only add the `/power_limit` if it's valid.
Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
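The conditional formatting described above might look like the following. This is a sketch under assumptions: `format_power`, the `W` suffix placement, and treating `None` or `0` as "invalid" are illustrative choices, not the tool's actual output code.

```python
def format_power(power_w, limit_w):
    """Format power usage, appending "/limit" only when the limit is valid."""
    if limit_w:
        # A non-zero limit was reported, so show "usage/limit".
        return f"{power_w}/{limit_w} W"
    # Limit is missing or 0 (for example on an APU): show the usage alone.
    return f"{power_w} W"
```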
* Unify sizes of `amdsmi_power_info_t`
Sizes are used inconsistently. This causes tools to not show
N/A when they should. Make them unified.
Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
---------
Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
* While reusing signals, it's possible we can come across a timestamp
that can contain several signals, like when profiling a graph. Reading
timestamps from all signals can make the call severely CPU bound.
Instead, cache only that signal to avoid the overhead on the critical
path.
* Run pre-commit's whitespace related hooks on projects/rocr-runtime
In order for pre-commit to be useful, everything needs to meet a common
baseline.
Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
* Add missing semicolon which would block compilation on big endian CPUs
Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
---------
Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
* Fix malformed packets when copying back to the original buffer
* Addressing Copilot Comments
* Addressing Review comments
* Adjust staging buffer size allocation
Change staging buffer size to match the number of packets.
* Initial work in progress for compute CI workflow
* Update run-ci.py script location, enable test creation
* Add new lines to files
* Add coverage file argument to run-ci.py
* Remove run-ci.py script usage from rocprofiler-compute-continuous-integration.yml workflow
* Add --break-system-packages parameter
* Add --ignore-installed to pip install
* Checkout specific branch until amdclang issue fixed in develop
* Add missing slash to path for cxx compiler
* Remove specific branch from checkout action
* Use run-ci.py in rocprofiler-compute-continuous-integration.yml
* Update install python requirements step
* Fix typo in build-name
* Update run-ci.py to have toggle for code coverage
* Apply ruff formatting
* Ruff again
* Exclude live attach detach and roofline tests in CI
* Add ctest args
* Revert run-ci.py changes
* Try new run-ci-2.py
* Update type of pytest-numprocs argument
* Try casting arg to str
* Fix typo in arg reference
* upgrade pip before running python installs
* Use jammy instead of noble for CI
* Remove python nproc arg from run-ci-2.py
* Switch to MI325 runners for CI
* Fix spacing issue
* Rename run-ci.py to run-code-coverage.py, add new run-ci.py
* Update to ROCm version 7.1.0 to debug sdk issues
* Testing out tarball install again
* Update regex on tarball version
* Update tarball regex on compute
* ruff formatting
* Revert change to systems CI file
* Switch back to rocm-dev install
* ruff formatting again
* Add ld_lib_path for rocm_sysdeps
* Remove excluded tests temporarily
* Add back excluded tests, add timeout for test step
* Address PR feedback
* Add git safe directory lines
* Revert dependencies change to debug new failures
* Exclude roofline again, rework dependencies
* Add in hip-runtime-amd dependency
* Install hip dev package
* Add TEST_FROM_INSTALL cmake arg to compute CI workflow
* Remove test_from_install for now
* Enable roofline tests again