* adding ROCpd database merge
* adding ROCpd database merge concatenating all tables
* update merge script
- copy all tables from files
* fix merge format
* Add package submodule, initial POC. Need to refine
* Minor fixes and clean up duplicated code in package.py
* Revamp metadata layout, add wildcard and .rpdb parsing
* Add auto merge & package when > 5 DBs, add examples, don't use auto_merge when using sub-commands merge & package
* - Extend package/yaml inputs to all rocpd modules
- Improve handling more corner cases for bad input files when parsing input parameters (bad yaml files, bad .rpdb folder, folders as input)
- Changed to use UUID in merged filename instead of the time, in auto-merge algorithm
* Minor text fixes for consistancy between modules
* Add more wildcard support and add package, merge tests
* Make changes based on review suggestions
* Move parsing packages into importer.py, simplified adding required params to a function
* fix package test by flattening input list before processing
* Integrate merge.py changes from Jonathan to add name-collision checks, recreating indexes, foreign key check (disabled for now, due to processing time)
* Rework rocpd.<submodule>.{add_args,process_args}
- add_args function returns a functor which accepts input and args
- time_window functor returned from add_args automatically applies time windowing of input
* change merge&package limit to 1, merge should create data views
* Move files by default instead of making copies
- copying can be enabled by passing "copy=True" or --copy cmdline argument
* refactor package to make the logic cleaner, set merge limit back to 5
* Allow automerge-limit param to override limit, change default back to 1. Tests updated to use query, much quicker
* Update --help instructions for package
---------
Co-authored-by: acanadas <acanadas@amd.com>
Co-authored-by: a-canadasruiz <Araceli.CanadasRuiz@amd.com>
Co-authored-by: Young Hui <young.hui@amd.com>
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
Fix race condition when SetAsyncSignalHandler for the first time because
async_events_thread_ could be null and launched twice.
Refactored async-events to use lazy_pointer.
* Round the sum of percentages before validating to account for floating point errors
---------
Co-authored-by: Kian Cossettini <Kian.Cossettini@amd.com>
* Update workflow files to use general public rocm dev build images from dockerhub.
Old method was to borrow rocprofiler-systems images but they do not contain rocm install anymore, so we cannot rely on them.
Signed-off-by: Carrie Fallows <Carrie.Fallows@amd.com>
* Add workflow files to paths on push and PR
* Revert change of image for red hat variant because the image offered in official rocm image release is too large for runners.
Going back to using systems team images and installing rocm on them (as they do) as a workaround until we can get a smaller package size docker image with ROCm included.
Signed-off-by: Carrie Fallows <Carrie.Fallows@amd.com>
Adjusted python3-devel install line with an if else determined by distro version.
Signed-off-by: Carrie Fallows <Carrie.Fallows@amd.com>
---------
Signed-off-by: Carrie Fallows <Carrie.Fallows@amd.com>
Co-authored-by: jbonnell-amd <jason.bonnell@amd.com>
* Cleaning up some BUILD_<dep> config variables
The `ROCPROFSYS_BUILD_<dep>` settings were being translated to `BUILD_<dep>` for the old Dyninst dependencies.
Remove this extra layer
Add `rocprofiler_systems_add_option` for the `ROCPROFSYS_BUILD_<dep>` options, so there is a better description in the in the CMakeCache.
* Changes to support USE_ROCM in TheRock builds
* Removed `amd-smi::roctx` from Findamd-smi.cmake
* Fix linking error on rocm-6.4 when including amd_smi
* Format cmake
* Fix typo in logs
* Removing Findamd-smi.cmake
* Refactor the cmake parameters for `amd-smi`.
The `drm` libraries were only required ba amdsmi for rocm-6.4.0. There was no point adding them for other versions.
* Fix cmake formatting
* Updated rev. in `.pre-commit-config.yaml`
* Pin the gersemi used in CI to v0.23.1, matching the pre-commit
---------
Co-authored-by: Aleksandar Djordjevic <adjordje@amd.com>
Co-authored-by: David Galiffi <David.Galiffi@amd.com>
* Fail the job if compiler flag HIP_HOST_UNCACHED_MEMORY is not turned on on mi350x
Place the check after initTransportsRank as the GPU arch info in comm->topo->nodes info is populated after that.
* Update src/init.cc to use ERROR instead of WARN
Co-authored-by: Nilesh M Negi <Nilesh.Negi@amd.com>
[ROCm/rccl commit: 05f914c997]
* Fail the job if compiler flag HIP_HOST_UNCACHED_MEMORY is not turned on on mi350x
Place the check after initTransportsRank as the GPU arch info in comm->topo->nodes info is populated after that.
* Update src/init.cc to use ERROR instead of WARN
Co-authored-by: Nilesh M Negi <Nilesh.Negi@amd.com>
* Enhance logging in NCCL initialization
It's convenient to log comms obj and default channels together for debugging
* Add opCount to collDevWork and update increment logic
Added opCount to collDevWork and incremented it when proxyOpQueue is empty (e.g., for intra-node comms)
* Clarify opCount increment logic in enqueue.cc
Updated comment to clarify incrementing opCount for intranode communications.
* Refactor NCCL_INIT logging format
Updated logging format for NCCL_INIT to improve clarity.
* Remove duplicate INFO logging in init.cc
[ROCm/rccl commit: b00ee4c83c]