* expose dimensional info in rocprofiler_counter_info_v1_t.
* add counter_id in dim info.
* address review comments
* format.
* address comments.
* use array of pointers for dimensions_instances.
* format and comments.
* address comments.
* new line.
* Update counter_defs.yaml
* Update counter_defs.yaml
* Update counter_defs.yaml
* counter_defs.
* format counter defs.
* format counter defs.
* format counter defs.
* show only counters being profiled in metadata.
* Format.
* use config for counters and fix warnings.
* add version for rocprofiler_counter_dimension_info_v1_t struct.
* rename rocprofiler_counter_record_dimension_instance_v1_info_t.
* account device id from pmc for counters metadata.
* move dim structs to counters.h.
* address comments to compare value.
* fix tests.
* Address comments. use pointer of arrays for ABI.
* rebase.
* fix build error.
* use separate metadata::init() for rocprofv3.
* also print not found counters.
* precompute all the perf counters needed to be in metadata.
* Misc.
* format
* Format.
* rocprofiler::sdk::container::c_array
* Address comments.
* source/lib/output/metadata.cpp
* lint.
* add unit test for c_array.
* add unit test and serialization support for c_array container.
* Misc.
* Clean files.
* Format.
* clang-tidy.
* add more checks to c_array.
* misc. typo
* Addr comments.
---------
Co-authored-by: Venkateshwar Reddy Kandula <vkandula@amd.com>
Co-authored-by: Jonathan R. Madsen <Jonathan.Madsen@amd.com>
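The array-of-pointers layout and the c_array container mentioned above can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: `dim_info_t`, `c_array_view`, and `total_instances` are hypothetical names, not the real `rocprofiler_counter_dimension_info_v1_t` layout or the `rocprofiler::sdk::container::c_array` API. The likely motivation for an array of pointers is that each element is reached through its own pointer, so the struct can grow in a later version without changing how callers step through the array.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>

// Hypothetical C-style record (illustration only, not the SDK's struct).
struct dim_info_t
{
    std::uint64_t size;           // sizeof(dim_info_t) at build time, acts as a version tag
    std::uint64_t id;             // dimension id
    const char*   name;           // dimension name
    std::uint64_t instance_size;  // number of instances in this dimension
};

// Minimal non-owning view over a C array given as pointer + count.
template <typename T>
class c_array_view
{
public:
    c_array_view(T* data, std::size_t count)
    : m_data{data}
    , m_size{count}
    {
        // a null pointer is only acceptable for an empty array
        assert(data != nullptr || count == 0);
    }

    std::size_t size() const { return m_size; }
    bool        empty() const { return m_size == 0; }
    T*          begin() const { return m_data; }
    T*          end() const { return m_data + m_size; }

    // bounds-checked element access
    T& at(std::size_t idx) const
    {
        if(idx >= m_size) throw std::out_of_range{"c_array_view index out of range"};
        return m_data[idx];
    }

private:
    T*          m_data = nullptr;
    std::size_t m_size = 0;
};

// Walk an array of pointers and sum the instance counts, skipping null entries.
inline std::uint64_t
total_instances(const dim_info_t* const* dims, std::size_t count)
{
    std::uint64_t total = 0;
    for(const dim_info_t* itr : c_array_view<const dim_info_t* const>{dims, count})
        if(itr != nullptr) total += itr->instance_size;
    return total;
}
```

The view owns nothing, so it can wrap arrays handed out by a C API without copying them.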
* Fix null handle
- use .handle=0, not .handle=numeric_limits<>::max()
* Update lib.common.hasher
* Fix ROCPROFILER_CONTEXT_NONE
* Use context operator==
* Update CHANGELOG
* Updated null handle for scratch memory and changed allocation test so that free ops account for null agent
---------
Co-authored-by: Ian Trowbridge <Ian.Trowbridge@amd.com>
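A minimal sketch of the null-handle convention described above, assuming an opaque handle struct with a single 64-bit `handle` field. `context_id_t`, `CONTEXT_NONE`, `is_null`, and `raw_handle` are hypothetical stand-ins, not the SDK's definitions. One plausible benefit of using `.handle=0` is that zero-initialized storage already compares equal to the null value.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for an opaque handle type.
struct context_id_t
{
    std::uint64_t handle;
};

// Null value uses handle == 0 rather than numeric_limits<uint64_t>::max().
constexpr context_id_t CONTEXT_NONE{0};

// Compare handles through operator== instead of reaching into .handle at call sites.
constexpr bool
operator==(context_id_t lhs, context_id_t rhs)
{
    return lhs.handle == rhs.handle;
}

constexpr bool
operator!=(context_id_t lhs, context_id_t rhs)
{
    return !(lhs == rhs);
}

constexpr bool
is_null(context_id_t id)
{
    return id == CONTEXT_NONE;
}

// Example consumer that refuses to operate on the null context.
inline std::uint64_t
raw_handle(context_id_t id)
{
    assert(!is_null(id) && "null context has no usable handle");
    return id.handle;
}
```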
- Refactored scratch memory handling by introducing fmm_is_scratch_aperture to
replace repeated for-loops.
- Simplified code paths in hsakmt_fmm_release, hsakmt_fmm_map_to_gpu, and
hsakmt_fmm_unmap_from_gpu by using the new helper.
Signed-off-by: Honglei Huang <Honglei1.Huang@amd.com>
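The refactoring pattern in the commit above (one helper instead of a repeated for-loop) might look roughly like the following sketch. The types and the helper name `is_scratch_address` are hypothetical; the real `fmm_is_scratch_aperture` signature and the libhsakmt data structures are not shown in this log and may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical, simplified bookkeeping (illustration only).
struct aperture_t
{
    std::uintptr_t base;   // first address in the aperture
    std::uintptr_t limit;  // last address in the aperture (inclusive)
};

struct gpu_mem_t
{
    aperture_t scratch;  // per-GPU scratch aperture
};

// Single helper that replaces a hand-written loop at each call site.
// Returns true if addr lies inside any GPU's scratch aperture.
inline bool
is_scratch_address(const gpu_mem_t* gpus, std::size_t count, std::uintptr_t addr)
{
    assert(gpus != nullptr || count == 0);
    for(std::size_t i = 0; i < count; ++i)
    {
        const aperture_t& ap = gpus[i].scratch;
        if(addr >= ap.base && addr <= ap.limit) return true;
    }
    return false;
}
```

With a helper like this, the release, map, and unmap paths each reduce to a single branch on its result.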
* replace azure runners with internal
* change to mi300a for debug
* revert back to mi300
* move some of the load to mi300a
* use mi300a for clang-tidy
---------
Co-authored-by: Kandula, Venkateshwar reddy <Venkateshwarreddy.Kandula@amd.com>
* Arbitrary host-trap sampling skid (doc)
Host-trap PC sampling might introduce a skid of [0, 2]
instructions. We document this and provide
advice to application developers on how to find
hot spots in profiles generated by host-trap sampling.
Adds the following presets:
- `ci` - to match the common CI settings - including tests and asserts
- `debug` - True debug build - include building tests
- `debug-optimized` - include building tests
- `release` - To match the `build-release` script - no tests.
The default build folder will be `${sourceDir}/build/<preset>`.
---------
Co-authored-by: David Galiffi <David.Galiffi@amd.com>
* Fix: Add missing <string.h> include for C string functions in RCCL tests
* Update examples/rccl/rccl-tests/src/common.h
Yes, confirmed: <cstring> alone works in my environment. Updated the PR.
Co-authored-by: David Galiffi <David.Galiffi@amd.com>
* clang-format
---------
Co-authored-by: David Galiffi <David.Galiffi@amd.com>
This commit introduces MakeMemoryResident and MakeMemoryUnresident
functions to KfdDriver and XdnaDriver classes.
- Added implementations in amd_kfd_driver.cpp
- Added stubs in amd_xdna_driver.cpp returning HSA_STATUS_ERROR
- Updated header files amd_kfd_driver.h and amd_xdna_driver.h
- Removed MakeKfdMemoryResident/Unresident from amd_memory_region.cpp
Signed-off-by: Honglei Huang <Honglei1.Huang@amd.com>
This commit completes the memory register/deregister interface change.
- Removed static RegisterMemory and DeregisterMemory from MemoryRegion class
- Added pure virtual methods to base Driver interface in driver class
- Added implementation in KFD driver
- Modified MemoryRegion Lock and Unlock to use driver interface
Signed-off-by: Honglei Huang <Honglei1.Huang@amd.com>
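The shape of the register/deregister change above can be sketched as follows. This is an illustration under assumptions, not the ROCR code: `status_t` stands in for `hsa_status_t`, `FakeKfdDriver` is a toy backend that only records pointers, and the method signatures are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_set>

// Hypothetical stand-in for hsa_status_t.
enum class status_t { success, error };

// Base interface: every backend must say how it registers and deregisters memory.
class Driver
{
public:
    virtual ~Driver() = default;
    virtual status_t RegisterMemory(void* ptr, std::size_t size) = 0;
    virtual status_t DeregisterMemory(void* ptr)                  = 0;
};

// Toy backend that only tracks which pointers are currently registered.
class FakeKfdDriver final : public Driver
{
public:
    ~FakeKfdDriver() override
    {
        // debug-only check that every registration was released
        assert(m_registered.empty() && "leaked memory registrations");
    }

    status_t RegisterMemory(void* ptr, std::size_t size) override
    {
        if(ptr == nullptr || size == 0) return status_t::error;
        return m_registered.insert(ptr).second ? status_t::success : status_t::error;
    }

    status_t DeregisterMemory(void* ptr) override
    {
        return m_registered.erase(ptr) == 1 ? status_t::success : status_t::error;
    }

    std::size_t registered_count() const { return m_registered.size(); }

private:
    std::unordered_set<void*> m_registered;
};

// The region no longer owns static register/deregister helpers; it delegates to the driver.
class MemoryRegion
{
public:
    explicit MemoryRegion(Driver& driver)
    : m_driver{driver}
    {}

    status_t Lock(void* host_ptr, std::size_t size)
    {
        return m_driver.RegisterMemory(host_ptr, size);
    }

    status_t Unlock(void* host_ptr) { return m_driver.DeregisterMemory(host_ptr); }

private:
    Driver& m_driver;
};
```

Routing Lock and Unlock through the interface means a second backend only has to implement the two virtual methods.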
This commit introduces a new AvailableMemory API to the KfdDriver and
XdnaDriver classes.
- Implemented AvailableMemory in KfdDriver to return the available memory size
using hsaKmtAvailableMemory.
- Added a stub implementation of AvailableMemory in XdnaDriver that returns an error.
- Updated the GpuAgent class to use the new AvailableMemory API instead of
directly calling hsaKmtAvailableMemory.
Signed-off-by: Honglei Huang <Honglei1.Huang@amd.com>
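A rough sketch of the AvailableMemory pattern described above: one backend reports a value, another is a stub that returns an error, and agent code asks its driver instead of calling the thunk directly. All names and signatures here (`mem_status`, `MemoryQueryDriver`, `KfdLikeDriver`, `XdnaLikeDriver`, `agent_available_memory`) are hypothetical, and the real query through `hsaKmtAvailableMemory` is replaced by a stored value.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for hsa_status_t.
enum class mem_status { success, error };

class MemoryQueryDriver
{
public:
    virtual ~MemoryQueryDriver() = default;
    // On success, writes the number of available bytes to *available.
    virtual mem_status AvailableMemory(std::uint64_t* available) const = 0;
};

// Backend that can answer the query; the value is injected here for illustration.
class KfdLikeDriver final : public MemoryQueryDriver
{
public:
    explicit KfdLikeDriver(std::uint64_t free_bytes)
    : m_free{free_bytes}
    {}

    mem_status AvailableMemory(std::uint64_t* available) const override
    {
        assert(available != nullptr);
        *available = m_free;  // a real driver would query the kernel driver here
        return mem_status::success;
    }

private:
    std::uint64_t m_free;
};

// Backend without support yet: a stub that reports an error.
class XdnaLikeDriver final : public MemoryQueryDriver
{
public:
    mem_status AvailableMemory(std::uint64_t*) const override { return mem_status::error; }
};

// Agent-side code goes through the driver interface and treats an error as "0 bytes".
inline std::uint64_t
agent_available_memory(const MemoryQueryDriver& driver)
{
    std::uint64_t bytes = 0;
    return driver.AvailableMemory(&bytes) == mem_status::success ? bytes : 0;
}
```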