ldconfig is run during rocm-opencl package installation.
Installing libamdocl.so in /opt/rocm-xxx/lib exposes all ROCm libraries when /opt/rocm/lib is added to ldconfig.
To prevent this, libamdocl.so is now installed in /opt/rocm-xxx/lib/opencl.
ldconfig will use the updated path, limiting exposure to only libamdocl.so library.
Co-authored-by: raramakr <raramakr@amd.com>
* Split roofline tests
* Use N/A for missing values
* Test eval_expression for no valid data
* Fixed tests
* Updated Changelog for N/A
* Fixed platform specific test failure
* Use native tool for counter collection
* Add native counter collection tool which uses rocprofiler-sdk C++
library public API to get counter collection data
* This is enabled by default, unless --no-native-tool option is
provided or ROCPROF=rocprofv3 env. var. is provided
* This tool is only supported for ROCm version >=7.x.x
* This tool is not supported for attach/detach scenario
* Build native tool shared object during build time
* If using rocprof-compute without building then runtime compilation of
t push native tool shared object is performed
* rocprofiler-sdk tools is still used for services other than counter
collection and data collected by native tool is merged into the
rocpd/csv output of rocprofiler-sdk tool
* Make `rocpd` choice the default choice for `--format-rocprof-output`
option
* If `rocpd` public API from rocprofiler-sdk library is not present,
then fallback to `csv` choice
* In this case only `pmc_perf.csv` is written in workload folder
instead of multiple `csv` files for each profiling run
* Remove `json` choice from `--format-rocprof-output` option since it
functions identical to `csv` option
* Rename option `--rocprofiler-sdk-library-path` to
`--rocprofiler-sdk-tool-path` since we LD_PRELOAD the
rocprofiler-sdk tool shared object and not the rocprofiler-sdk library
shared object
* Fix the meaning of `--dispatch` option in `profile` mode to mention
dispatch iteration filtering instead of dispatch id filtering
* --dispatch option in analyze mode does dispatch id filtering
* Move standalone binary creation logic from cmake file to docker file
* fix native counter collection tool during attach/detach
* improve logging
* fix attach detach with native tool
* fix attach detach with native tool
* do not support attach/detach in native tool
* Update changelog
* add standalone binary creation functionality in cmake
* address review comments
* address review comments
* fix formatting
* address review comments
* Adding paths for cmake to search. Also updated min. cmake requirement to 3.21 as this was when hip was supported.
Signed-off-by: Carrie Fallows <Carrie.Fallows@amd.com>
* Update hip compiler ID check, sometimes comes up as Clang, sometimes ROCMClang- depends on setup.
Updated formatting.
Signed-off-by: Carrie Fallows <Carrie.Fallows@amd.com>
* RHEL8.10 unable to compile due to defaulting to old c++ version, need to force c++17
Signed-off-by: Carrie Fallows <Carrie.Fallows@amd.com>
* Updating changelog per docs team recommendations
Signed-off-by: Carrie Fallows <Carrie.Fallows@amd.com>
* Apply suggestions from code review to changelog
Co-authored-by: Pratik Basyal <pratik.basyal@amd.com>
* Do not required HIP complier to build native counter collection tool
* fix cmake
* gersemi formatting on latest cmake change
Signed-off-by: Carrie Fallows <Carrie.Fallows@amd.com>
* ex ci updated dependencies to include rocprofiler-sdk, but cmake was still not capturing the path- there was a commit that added to the cmake_prefix_path entry that specified rocprof-sdk's cmake location ut was too specific for the search paths in find_package's config mode.
removing the cmake_prefix_path var and adding hints to find_package call instead, and specifying config mode so it knows how to construct the search paths
Signed-off-by: Carrie Fallows <Carrie.Fallows@amd.com>
* gersemi run for formatting
Signed-off-by: Carrie Fallows <Carrie.Fallows@amd.com>
* Still need prefix path, should not have been removed in last commit but does need to be shortened to just the rocm path to allow for find_package config mode to do the job
Signed-off-by: Carrie Fallows <Carrie.Fallows@amd.com>
* include cstdint for uint32_t
* Run formatting on helper.cpp
Signed-off-by: Carrie Fallows <Carrie.Fallows@amd.com>
* Remove rocm 7.2 release stuff from version and changelog and handle it in separate pr
* fix version
* fix changelog
* fix changelog
* run ruff formatter
Signed-off-by: Carrie Fallows <Carrie.Fallows@amd.com>
* fix rocprofiler-sdk attach so path
---------
Signed-off-by: Carrie Fallows <Carrie.Fallows@amd.com>
Co-authored-by: Carrie Fallows <Carrie.Fallows@amd.com>
Co-authored-by: Pratik Basyal <pratik.basyal@amd.com>
Force tencentos to use rhel-based bin since tencent is branched off of centos, which is branch of fedora.
Verified rocprof-compute run correctly selects bin to use, and the roofline benchmark values look similar between runs on rhel vs tencentos4 docker images on same system.
Signed-off-by: Carrie Fallows <Carrie.Fallows@amd.com>
* Add user start/stop bool
* Update documentation for user-api
* Update projects/rocprofiler-systems/source/lib/rocprof-sys-dl/dl.cpp
Co-authored-by: Aleksandar Djordjevic <aleksandar.djordjevic@amd.com>
* Format fix
---------
Co-authored-by: Aleksandar Djordjevic <aleksandar.djordjevic@amd.com>
* attach: rename librocprofv3-attach
- Renames library to librocprofiler-sdk-rocattach
- ROCAttach library will be formalized and documented in future commit
* Address review comments
- Rename rocprofv3-attach.py to rocprof-attach.py
- Use common filesystem.hpp in rocattach
* Fix component name typo
* Doc fixup
---------
Co-authored-by: JeniferC99 <150404595+JeniferC99@users.noreply.github.com>
* roll back json file processing logic for pc sampling
* format cmake files
* Revert "format cmake files"
This reverts commit e64df65a8f30abcb6738e3a0d7ffd4270bd1d302.
* Add XGMI and PCIe metrics to the profiling data
Add support for AMD XGMI (GPU-to-GPU interconnect) and PCIe
metrics:
* XGMI link width in bits
* XGMI link speed in GT/s
* Per-link read bandwidth (KB)
* Per-link write bandwidth (KB)
- Add new categories for PCIe metrics:
* PCIe link width
* PCIe link speed in GT/s
* Accumulated bandwidth (MB)
* Instantaneous bandwidth (MB/s)
* Fix VCN/JPEG insert logic
* Modify the gpu_metrics struct to accomodate XCP structure
* Add ctest automation for gpu interconnect metrics
* Refactor to move gpu_metrics struct and serialization to another file
* Possible fix for timeout in CI
Fix redundant skip check in ctest
Add xgmi and pcie option in rocprof-sys-avail.
* Change2: Address review comments
Change ctest sampling to avoid timeout
Change variable name and code structuring
* Add option in ctest to run rocprof-sys-run without rewrite
Run transferbench with rocprof-sys-run without sampling
* Change3: Fix sample insert bug and address review comments
xgmi and pci support check
renaming variables
additional hip_api validation in rocpd
* Reduce the load from the trnasferBench sample
The CI builds were timing out when flushing a big temporary file to the
DB: (2720824.23 KB / 2720.82 MB / 2.72 GB)...
* rocr: Fix exception on AsyncEventControl init
Fix exception on init when compiling with in release mode.
* rocr: Fix crash when interrupts are disabled
Fix segfault due to assert for signal->EopEvent() being false when
HSA_ENABLE_INTERRUPT=0. Use Signal::WaitMultiple(..) when interrupt is
disabled.
---------
Co-authored-by: JeniferC99 <150404595+JeniferC99@users.noreply.github.com>
* SWDEV-533237 Added test cases for hipOccupancyAvailableDynamicSMemPerBlock API
* SWDEV-533237 : Added test cases for hipOccupancyAvailableDynamicSMemPerBlock
* SWDEV-533237 : Addressed review comments for hipOccupancyAvailableDynamicSMemPerBlock aip test cases
---------
Co-authored-by: jainprad <92369414+jainprad@users.noreply.github.com>
* Forward ctest labels from the execution test to the validation test.
* Adjust test validation parameters for amid_smi samples
The actual number of samples will vary depending on the GPU. This test
is just to validate the presence of the samples
The test verifies that all shared memory objects for
IPC events used internally by HIP are properly cleaned
up after use and do not leave persistent files in /dev/shm.
Co-authored-by: Ioannis Assiouras <Ioannis.Assiouras@amd.com>