`tests/rocprofiler/rocprofiler.cpp` uses `std::string` without including `<string>` directly.
This works with libstdc++ due to transitive includes, but fails with libc++.
Closes#1240
* Fix merging logic for multi process
* Fix dispatch id reset logic in case of rocpd format
* Fix kernel id reset logic in case of csv format
* Revert correlation logic change in csv format
* Do inner join instead of left join
* Added tool for dumping counter and metric values
* Skip Linting
* Added support for iteration multiplexing
* Remove subparser and supress compute options
* Specify output dir
* Add kernel info
* csv name change
* Added comments
* Support dispatch id-less dataframes
* Formatting fix
* Add default for path
* Print help with no args
* Support only single workload
Fixed incorrect error code expectation in FrequenciesRead
test when calling amdsmi_get_gpu_pci_bandwidth() with nullptr
parameter.
Signed-off-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>
* Faster counter accuracy testing
* Better handle SPI_CSN_* metrics for lesser than MI350 series
* Use metric filtering to collect only relevant counters for comparison
* Ensure all workload folders are deleted after testing is completed
* Dont use clean_existing=False
* Add manual test for all counter accuracy
* Test env. vars. in rocprofiler-sdk backend
* Improve rocprofiler-sdk backend test case to check for env. vars. and
ensure we do not overwrite irrelevant env. vars.
* Remove unnecessary usage of ROCPROF_INDIVIDUAL_XCC_MODE env. var.
* Formatting fixes
* Test fixes
* Remove redundant code in tests
* Remove usage of utils_mod and use utils instead, this prevents
duplicate imports
* Fix for multi process workload profiling
Native counter collection tool updates:
* Do not dump empty counter data for a process
* Use PID instead of UUID for dumped csv files to facilitate correlation
* Handle merging multiple pairs of rocpd (from sdk tool) and csv (from
native tool) files
* Handle merging multiple pairs of csv (from sdk tool) and csv (from
native tool) files
Rocpd output format updates:
* Merge multiple rocpd databases into a single csv
* Reset dispatch id and kernel id for unique dispatches and unique
kernels respectively
* Retain multiple rocpd databases per run for multi process workloads
* Add test case for multiprocess profiling using rocflop workload
* Add rocflop
* Fix native counter csv to rocprofv3 csv conversion
* Use kernel_id instead of dispatch_id to correlate native counter csv
and kernel trace csv
* python formatting using ruff 0.14 instead of 0.13
* Added support for AMD ROCm net-ib alongside vanilla net-ib, with auto-generation to detect conflicts early during NCCL sync and enable future customizations.
* Integrated AMD AINIC support in RCCL for out-of-the-box usage, leveraging performance improvements by default, channel pinning for optimal pipeline performance, and extended support for 32B in-line CTS messages.
* Implemented internal derivation of AINIC-specific flags when RCCL AINIC environment parameter is set, and checks before initializing AINIC net-ib methods.
* Included snapshot of auto-generated ROCm net-ib file (src/transport/net_ib_rocm.cc) for reference.
* Fixed typos in RCCL param API (RCCL_AINIC_ROCE) and dlclose.
* Updated plugin loading logic:
* Load internal ROCmIB plugin only when NCCL_NET_PLUGIN is not set.
* Load default internal net-ib only when not AINIC and no external plugin env is set.
[ROCm/rccl commit: 9f4651f20f]
* Added support for AMD ROCm net-ib alongside vanilla net-ib, with auto-generation to detect conflicts early during NCCL sync and enable future customizations.
* Integrated AMD AINIC support in RCCL for out-of-the-box usage, leveraging performance improvements by default, channel pinning for optimal pipeline performance, and extended support for 32B in-line CTS messages.
* Implemented internal derivation of AINIC-specific flags when RCCL AINIC environment parameter is set, and checks before initializing AINIC net-ib methods.
* Included snapshot of auto-generated ROCm net-ib file (src/transport/net_ib_rocm.cc) for reference.
* Fixed typos in RCCL param API (RCCL_AINIC_ROCE) and dlclose.
* Updated plugin loading logic:
* Load internal ROCmIB plugin only when NCCL_NET_PLUGIN is not set.
* Load default internal net-ib only when not AINIC and no external plugin env is set.