* look for symbols in dynsym table
* checking both symtab and dynsym
* Avoid symbol duplication in non-stripped binaries
* clang-format
* Minor elf_utils.cpp updates
- use 'else if' instead of 'if'
- logging tweaks
* Update registration
- tweak logging
* Update testing
- strip the rocprofiler-sdk-c-tool library
- add test-c-tool-rocp-tool-lib-execute test which does NOT LD_PRELOAD the library (uses only ROCP_TOOL_LIBRARIES instead)
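
A minimal sketch of the symtab/dynsym lookup idea, assuming a simplified `Symbol` record and a hypothetical `merge_symbols` helper (neither is taken from elf_utils.cpp). Reading both tables covers stripped libraries, which keep only `.dynsym`, and de-duplicating covers non-stripped binaries, where dynamic symbols also appear in `.symtab`:

```cpp
#include <cstdint>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified symbol record; real ELF symbols carry more fields.
struct Symbol
{
    std::string   name;
    std::uint64_t address;
};

// Merge symbols from .symtab and .dynsym, keeping only the first occurrence
// of each (name, address) pair so that non-stripped binaries, which list
// dynamic symbols in both tables, do not produce duplicates.
inline std::vector<Symbol>
merge_symbols(const std::vector<Symbol>& symtab, const std::vector<Symbol>& dynsym)
{
    std::vector<Symbol>                             merged;
    std::set<std::pair<std::string, std::uint64_t>> seen;

    for(const auto* table : {&symtab, &dynsym})
    {
        for(const auto& sym : *table)
        {
            // emplace().second is true only when the key was not seen before
            if(seen.emplace(sym.name, sym.address).second) merged.push_back(sym);
        }
    }
    return merged;
}
```

For a stripped library the `symtab` argument would simply be empty, and the result is the content of `dynsym`.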
---------
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
* updating CI for RHEL and SLES builds
* Turn off Build CI for rhel/sles
* update CI workflow
* Attempt to fix job names
* core-deb and core-rpm
* picking the right runner-set
* compiler version check for rhel/sles
* PC sampling check
* BUILD_CI off for
* trying with sudo
* suppressing unused variable error
* removing LD_LIBRARY_PATH
* Revert "removing LD_LIBRARY_PATH"
This reverts commit eb2d79ab65c00a97056f6bb4b679de3aad59f593.
* Removing duplicate code
* Convert counter def format to YAML
Converts counter definition format to YAML with the
following structure:
```yaml
COUNTER_NAME:
  architectures:
    gfxXX: # Can be more than one, '/'-delimited if they share identical data
      block: <Optional>
      event: <Optional>
      expression: <Optional>
      description: <Optional> # In case per-arch notes are needed
    gfxYY:
      ...
  description: General counter description
```
All counters (derived and hardware) are now defined
in the same file for ease of future additions/subtractions.
Removes existing XML parser. Keeps the existing XML
definitions for now (since other tools still rely on
its presence).
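
As an illustration of the '/'-delimited architecture keys described above, here is a small sketch of how such a key could be expanded into individual architectures. The `split_architectures` helper and the example key are assumptions for illustration, not the parser added by this change:

```cpp
#include <string>
#include <vector>

// Split a '/'-delimited architecture key (e.g. "gfx90a/gfx942") into the
// individual architecture names that share one counter definition.
inline std::vector<std::string>
split_architectures(const std::string& key)
{
    std::vector<std::string> archs;
    std::string::size_type   begin = 0;
    while(begin <= key.size())
    {
        auto end = key.find('/', begin);
        if(end == std::string::npos) end = key.size();
        // skip empty pieces such as those produced by "a//b" or a trailing '/'
        if(end > begin) archs.emplace_back(key.substr(begin, end - begin));
        begin = end + 1;
    }
    return archs;
}
```

Each returned name would then map to the same block/event/expression entry.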
* Update lib/rocprofiler-sdk/agent.cpp
- use static_object wrapper for vector of agent_pair (rocp agent <-> hsa agent)
* Fix get_aql_handles() shutdown error
- use `static_object` wrapper for vector of `aqlprofile_agent_handle_t`
- missing-new-line CI job: ensures all source files end with new line
- logging updates
- add new line to the end of many files
- fix header include ordering in misc places
- transition to use hsa::get_core_table() and hsa::get_amd_ext_table() in various places instead of making copies
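
The shutdown errors mentioned above are typical of static containers that get destroyed while other code still uses them. The sketch below shows the general idiom of constructing an object in static storage and never running its destructor. It is a hypothetical illustration (`never_destroyed` is an invented name) and is not the SDK's `static_object` implementation, which may manage lifetime differently:

```cpp
#include <new>
#include <vector>

// Hypothetical sketch: construct T exactly once inside static storage and
// never run its destructor, so references handed out earlier remain usable
// while other static objects are being torn down at process exit.
template <typename T>
T&
never_destroyed()
{
    alignas(T) static unsigned char storage[sizeof(T)];
    // placement-new runs once; initialization of function-local statics is thread-safe
    static T* instance = new(storage) T{};
    return *instance;
}
```

The trade-off is that any resources owned by the object are released only by the operating system when the process ends.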
* General fixes to ATT, packets and event ID retrieval
* Update source/lib/rocprofiler-sdk/hsa/aql_packet.hpp
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
---------
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
* PC sampling: integration test with instruction decoding
* PC sampling: verifying internal and external CIDs
The PC sampling integration test has been extended
to verify internal and external correlation IDs.
* tmp solution of using Instructions as keys
* wrapper for HIP call
* PCS integration test: ld_addr as instruction id
For the sake of the integration test, use ld_addr as the
instruction identifier. To support code object unloading
and relocations, use as the identifier
(the change in the decoder is required).
* PCS integration test: removing shared_ptr
Completely removing usage of shared pointers.
* PCS integration test: removing decoder
When a code object has been unloaded, ensure all PC samples
corresponding to that object are decoded, prior to removing
the decoder.
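
The ordering constraint above (decode first, then remove the decoder) can be sketched as follows. All types here (`ToyDecoder`, `PendingSample`, `DecoderRegistry`) are hypothetical stand-ins and do not mirror the test's actual classes:

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical decoder: turns an offset into a printable label.
struct ToyDecoder
{
    std::uint64_t code_object_id;
    std::string   decode(std::uint64_t offset) const
    {
        return "co" + std::to_string(code_object_id) + "+" + std::to_string(offset);
    }
};

struct PendingSample
{
    std::uint64_t code_object_id;
    std::uint64_t offset;
};

class DecoderRegistry
{
public:
    void load(std::uint64_t id) { m_decoders.emplace(id, ToyDecoder{id}); }
    void record(PendingSample sample) { m_pending.push_back(sample); }

    // Decode every pending sample that belongs to the unloaded code object,
    // and only then drop its decoder.
    void unload(std::uint64_t id)
    {
        auto itr = m_decoders.find(id);
        if(itr == m_decoders.end()) return;

        std::vector<PendingSample> remaining;
        for(const auto& sample : m_pending)
        {
            if(sample.code_object_id == id)
                m_decoded.push_back(itr->second.decode(sample.offset));
            else
                remaining.push_back(sample);
        }
        m_pending = std::move(remaining);
        m_decoders.erase(itr);
    }

    const std::vector<std::string>& decoded() const { return m_decoded; }
    std::size_t                     pending() const { return m_pending.size(); }
    bool has_decoder(std::uint64_t id) const { return m_decoders.count(id) > 0; }

private:
    std::map<std::uint64_t, ToyDecoder> m_decoders;
    std::vector<PendingSample>          m_pending;
    std::vector<std::string>            m_decoded;
};
```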
* PCS integration test: fixing build flags and imports
* PCS integration test: fixing labels
* PCS integration test: cmake flags fix
* PC sampling cmake labels renamed
* PCS integration test refactoring
* PCS integration test: minimize usage of raw pointers
* PCS integration test: at least one sample should be delivered.
* PC sampling labels: pc-sampling
- source/lib/rocprofiler-sdk/hsa/queue.cpp
- Optimize WriteInterceptor to eliminate extra barrier packets causing gaps between kernels in kernel tracing
- increase timeout_hint in hsa_signal_wait in set_profiler_active_on_queue
- misc logging improvements
- source/lib/rocprofiler-sdk/counters/agent_profiling.cpp
- increase timeout_hint in hsa_signal_wait in set_profiler_active_on_queue
- tests/rocprofv3/hsa-queue-dependency/CMakeLists.txt
- add TIMEOUT for rocprofv3-test-hsa-multiqueue-execute
* readthedocs updates
* Adding License
* correcting table of contents path
* Move doc requirements to sphinx dir
* Compile requirements.txt
* Update path to reqs
* Adding missing python module
* changing sphinx version
* changing docutils version
* enabling sphinx extensions
* trying sphinx-rtd-theme
* Remove unused doc configs
* Remove unused html theme options
* Add files to toc
* temp commit to test
* updating environment.yml for CI build
* Update doc requirements
To include rocprofiler-sdk in projects.yaml
* Set external_projects_current_project as rocprofiler-sdk
* Exclude external projects
* Fix warning for missing static path
* updating conf.py
* Removing reST syntax
* Use rocm-docs-core doxygen integration
* Remove RST syntax from Markdown files
* Generate doxyfile post checkout on RTD
* Use custom RTD env
* Specify mambaforge
* Put conda before post checkout cmd
* Add doxyfile for RTD
* Run cmake from conf.py
* Update environment.yml
* Use mambaforge
* Fix path to environment.yml
* Call build doxyfile
* Add Developer API title to Doxyfile
* Config version header
* Fix typo in conf.py
* Format fix for conf.py
* Increasing timeout for build-docs-from-source
* Remove README as mainpage for doxyfile
* Fix formatting in conf.py
---------
Co-authored-by: Sam Wu <22262939+samjwu@users.noreply.github.com>
The following changes are introduced:
- Use functions instead of macros.
- Verify the error code when querying KFD IOCTL version.
- Skip tests and samples if KFD IOCTL < 1.16 or PC Sampling IOCTL < 0.1.
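
A minimal sketch of a function-based (rather than macro-based) version check matching the thresholds above. The `ioctl_version` struct and both function names are assumptions for illustration, and querying the actual versions from KFD is out of scope here:

```cpp
#include <cstdint>

// Hypothetical version record; field names avoid the glibc major/minor macros.
struct ioctl_version
{
    std::uint32_t major_version;
    std::uint32_t minor_version;
};

// True when `have` is the same as or newer than `need`.
constexpr bool
is_at_least(ioctl_version have, ioctl_version need)
{
    return have.major_version > need.major_version ||
           (have.major_version == need.major_version &&
            have.minor_version >= need.minor_version);
}

// Tests and samples would be skipped when this returns false:
// KFD IOCTL must be >= 1.16 and PC Sampling IOCTL must be >= 0.1.
constexpr bool
pc_sampling_supported(ioctl_version kfd, ioctl_version pc_sampling)
{
    return is_at_least(kfd, {1, 16}) && is_at_least(pc_sampling, {0, 1});
}
```

Being a `constexpr` function, the comparison is type-checked and can also be evaluated at compile time, which a macro cannot guarantee.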
* Incremental Counter Profile Creation
Adds support for incremental counter creation by changing the
behavior of rocprofiler_create_profile_config:
    rocprofiler_create_profile_config(rocprofiler_agent_id_t           agent_id,
                                      rocprofiler_counter_id_t*        counters_list,
                                      size_t                           counters_count,
                                      rocprofiler_profile_config_id_t* config_id)
The behavior of this function now allows an existing config_id to be
supplied via config_id. The counters contained in this config will be
copied over and used as a base for a new config along with any counters
supplied in counters_list. The new config id is returned via config_id
and can be used in future dispatch/agent counting sessions.
A new config is created, rather than modifying the existing one, because
there is no guarantee that the existing config isn't already in use. While
we could add locks (or other mutual exclusion mechanisms) to check whether
it's in use and reject an update, the benefit of doing so is minor compared
to just creating a new config. This also accommodates a common pattern in
which a tool adds counters at some point later during execution: it can
now do so without destroying the existing config.
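
The copy-then-extend semantics described above can be modeled with a toy registry. This is a hypothetical sketch (`ConfigRegistry`, `counter_id`, and `config_id` are invented for illustration) and does not use or reproduce the rocprofiler-sdk API:

```cpp
#include <cstdint>
#include <map>
#include <set>
#include <utility>

using counter_id = std::uint64_t;
using config_id  = std::uint64_t;

// Toy model: configs are immutable once created; extending one yields a new id.
class ConfigRegistry
{
public:
    // `base` == 0 means "no existing config". Otherwise the counters of the
    // base config are copied and combined with `counters` into a new config.
    config_id create(const std::set<counter_id>& counters, config_id base = 0)
    {
        std::set<counter_id> merged = counters;
        auto                 itr    = m_configs.find(base);
        if(itr != m_configs.end()) merged.insert(itr->second.begin(), itr->second.end());

        config_id id = m_next_id++;
        m_configs.emplace(id, std::move(merged));
        return id;
    }

    const std::set<counter_id>& counters(config_id id) const { return m_configs.at(id); }

private:
    std::map<config_id, std::set<counter_id>> m_configs;
    config_id                                 m_next_id = 1;
};
```

Because the base config is never mutated, a session that is already using it is unaffected, and no locking is needed to extend it.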
---------
Co-authored-by: Benjamin Welton <ben@amd.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
* SWDEV-465322: Adding support for Perfcounter SIMD Mask in ATT
* Apply suggestions from code review
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Benjamin Welton <bewelton@amd.com>
* Adding unit tests
* Adding counters check for gfx9 and SQ block only
* Addressing review comments
* changing the struct size
* fixing header includes
---------
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Benjamin Welton <bewelton@amd.com>