* Remove std::regex usage from rocprofiler-sdk and common library
- See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118408
- std::regex usage produces exceptions or segfaults when used in applications compiled with dual ABI
- Add code restrictions workflow
- simple workflow ensuring code restrictions (such as std::regex) are not used
* Update CHANGELOG
* Explicitly set permissions for restrictions workflow
* Fix handling of /proc/cpuinfo entries with no info
- e.g. "power_management:" (colon is last character)
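
A minimal sketch of the kind of regex-free parsing this fix implies, assuming a simple `key : value` line layout. `parse_cpuinfo_line` is a hypothetical helper, not the function used in rocprofiler-sdk; it only illustrates how an entry whose colon is the last character can yield an empty value instead of an out-of-range access.

```cpp
#include <optional>
#include <string>
#include <string_view>
#include <utility>

// Hypothetical helper (not the SDK's actual code): split one /proc/cpuinfo
// line into a (key, value) pair without using std::regex.
std::optional<std::pair<std::string, std::string>>
parse_cpuinfo_line(std::string_view line)
{
    // Trim spaces and tabs from both ends of a view.
    auto trim = [](std::string_view s) {
        const auto first = s.find_first_not_of(" \t");
        if(first == std::string_view::npos) return std::string_view{};
        const auto last = s.find_last_not_of(" \t");
        return s.substr(first, last - first + 1);
    };

    const auto pos = line.find(':');
    if(pos == std::string_view::npos) return std::nullopt;  // not a key/value line

    const auto key = trim(line.substr(0, pos));
    // substr(pos + 1) is valid even when ':' is the last character:
    // pos + 1 == line.size() yields an empty view rather than throwing.
    const auto value = trim(line.substr(pos + 1));
    return std::make_pair(std::string{key}, std::string{value});
}
```

With this sketch, an entry such as "power_management:" produces the key with an empty value, and a line without a colon is reported as absent.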
---------
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
* Update header file includes
* Fix includes for lib/rocprofiler-sdk/hip/hip.hpp
* Minor touch ups
* Minor include improvements
* Doxygen tweak
---------
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
* Model Name fix for rocprofiler_lib.agent
* fixing format
* formatting source
* Adding comments and example
---------
Co-authored-by: Sushma Vaddireddy <svaddire@amd.com>
* Fix segfault on fail to query GPU name
* Format
* Review comments
* Format
* Review comment
---------
Co-authored-by: Giovanni Baraldi <gbaraldi@amd.com>
* rocDecode API Tracing support
* Test bin file added to rocdecode. Need to add validate python methods
* Added option to not make rocDecode tests
* Added rocdecode and rocprofv3 tests
* Added csv test
* Address PR comments. Changed tests to use built-in rocstreambit decoder to remove ffmpeg dependency. Changed cmake option to disable tests rather than not build them. Tests work locally, but will fail until rocDecode is built with tracing enabled on CI
* Add option to avoid building rocdecode tests
* Added option to avoid building rocdecode bin file
* Support for rocJPEG API Trace
* Added newline to rocjpeg_version.h
* json-tool code added, initial test/bin commit
* Formatting
* Resolved rocjpeg bin test compilation errors
* Tests implemented. Perfetto module currently resulting in errors, so need to retest whenever it is fixed
* Formatting and compilation errors
* Minor fixes
* Copyright year update and minor fixes
* Doc update fix
* Added rocjpeg csv file in data
* Addresses review comments: Updated fixed Findroc.. and uses root directory as a hint, fixed documentation error, changed tables to use _CORE, minor style fixes
* Added rocdecode and rocjpeg to CI
* Removed rocdecode and rocjpeg from CI and added back build tests option
* Updated Cmake Files
* Added rocDecode and rocJPEG to CI
* Remove cmake line added in error
* Temporarily modified tests to pass if rocdecode or rocjpeg tracing are not supported for CI, cmake changes
* Added find_package for test
* Added back use of system rocDecode and rocJPEG, modifies system files to include prefix path
* Updated no-link to include INCLUDE_DIR/roc(decode|jpeg), added comments for tests
* Resolve merge conflicts and formatting
* Added regex find and replace instead of include for CI
* VAAPI package causing errors on Vega20
* Removed system rocjpeg and rocdecode use temporarily until cmake issues resolved
* Removed workflows regex
* Formatting and minor test modification
* Modified test for vega20
* Update rocDecode and rocJPEG cmake and tests
* Changelog
* Fix merge conflict
* Added back if-statements around add-tests since cmake-generator-expressions are resulting in errors when the packages are missing
* Removed if found statements, replaced with TARGET:EXISTS
* Skip json file for rocjpeg and rocdecode tests if not supported
* Add os import
---------
Co-authored-by: Kandula, Venkateshwar reddy <Venkateshwarreddy.Kandula@amd.com>
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
* [DO NOT MERGE] Misc UUID updates
- this is WIP
* Agent visibility
- Support for ROCR_VISIBLE_DEVICES, HIP_VISIBLE_DEVICES, CUDA_VISIBLE_DEVICES, GPU_DEVICE_ORDINAL
* Update CHANGELOG
* tweak to rocprofiler_agent_runtime_visiblity_t
* Code object kernel address
- new fields in code_object_kernel_symbol_register_data_t
- kernel_code_entry_byte_offset
- kernel_address
* Support ROCR_VISIBLE_DEVICES reordering devices for HIP
* Addressed code review changes
---------
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
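
For the agent visibility change above, a rough sketch of how a comma-separated list of device indices (as accepted by variables such as ROCR_VISIBLE_DEVICES or HIP_VISIBLE_DEVICES) might be parsed. `parse_visible_devices` is a hypothetical name, and the rule of stopping at the first invalid or negative entry is an assumption made for illustration; UUID entries and the SDK's actual handling are not covered.

```cpp
#include <cstdlib>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (not the SDK's implementation): turn "2,0,1" into {2, 0, 1}.
// Assumption: parsing stops at the first empty, non-numeric, or negative entry.
std::vector<int>
parse_visible_devices(const std::string& value)
{
    std::vector<int>   indices;
    std::istringstream stream{value};
    std::string        token;
    while(std::getline(stream, token, ','))
    {
        char*      end = nullptr;
        const long idx = std::strtol(token.c_str(), &end, 10);
        // end == start means no digits were consumed; *end != '\0' means trailing junk
        if(end == token.c_str() || *end != '\0' || idx < 0) break;
        indices.push_back(static_cast<int>(idx));
    }
    return indices;
}
```

The order of the returned indices is preserved, which is what allows a visibility variable to reorder devices as well as hide them.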
* Misc AFAR VII updates + clang-tidy-19 + bump version to 0.6.0
- move tests/rocprofv3/trace-period to tests/rocprofv3/collection-period
- bump clang-tidy to v19
- fix misc clang-tidy errors
* Update the collection period test
- don't attach files on fail because it causes problems when the test is disabled
---------
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
* Update lib/rocprofiler-sdk/agent.cpp
- use static_object wrapper for vector of agent_pair (rocp agent <-> hsa agent)
* Fix get_aql_handles() shutdown error
- use `static_object` wrapper for vector of `aqlprofile_agent_handle_t`
* cmake formatting (cmake-format) (#188)
Co-authored-by: vlaindic <vlaindic@users.noreply.github.com>
* source formatting (clang-format v11) (#189)
Co-authored-by: vlaindic <vlaindic@users.noreply.github.com>
* pcs: design of the pc sampling data struct; guarding parts of code that uses ROCr marker packets
* source formatting (clang-format v11) (#191)
Co-authored-by: vlaindic <vlaindic@users.noreply.github.com>
* cmake formatting (cmake-format) (#192)
Co-authored-by: vlaindic <vlaindic@users.noreply.github.com>
* pcs: shadow variable fix
* pcs: fix for compiler errors reported by CI/CD
* source formatting (clang-format v11) (#193)
Co-authored-by: vlaindic <vlaindic@users.noreply.github.com>
* pcs: docs fix; samples uses rocprofiler::rocprofiler library
* cmake formatting (cmake-format) (#195)
Co-authored-by: vlaindic <vlaindic@users.noreply.github.com>
* pcs: client in samples folder fixed
* pcs: client requires rocprofiler package as dependency
* pcs: client uses single context
* source formatting (clang-format v11) (#196)
Co-authored-by: vlaindic <vlaindic@users.noreply.github.com>
* pcs: client using single buffer; no buffer destroy in client
* pcs: client::setup explicitly called from the example
* pcs: rocprofiler_pc_sample_record_t updated
* pcs: fixed init of external correlation id
* source formatting (clang-format v11) (#198)
Co-authored-by: vlaindic <vlaindic@users.noreply.github.com>
* pcs: remove outdated files; update CMakeLists
* cmake formatting (cmake-format) (#212)
Co-authored-by: vlaindic <vlaindic@users.noreply.github.com>
* pcs: using rocprofiler_agent_id_t
* pcs: Removing trailing whitespaces
Co-authored-by: Jonathan R. Madsen <jrmadsen@users.noreply.github.com>
* source formatting (clang-format v11) (#214)
Co-authored-by: vlaindic <vlaindic@users.noreply.github.com>
* pcs: mapping agent_id to the agent
* source formatting (clang-format v11) (#215)
Co-authored-by: vlaindic <vlaindic@users.noreply.github.com>
* pcs: const while iterating over agents
* source formatting (clang-format v11) (#216)
Co-authored-by: vlaindic <vlaindic@users.noreply.github.com>
* pcs: calling get_buffer instead of get_buffers
* pcs: workgroup typo
* pcs: documentation for the public PC sampling API
* pcs: queue_cb_t signature adaptation
* pcs: mocks removed
* pcs: updating HsaApiTable with HSA/ROCr PC sampling API
* pcs: querying available PC sampling configs through IOCTL
* pcs: create the PCS session in IOCTL
* pcs: first actual PC samples delivered to the rocprofiler's client :)
* pcs: works with marker packet too
* pcs: using HSA table to call pc sampling related functions
* pcs: using ioctl instead of kfd in naming
* pcs: configuration service test fixed
* pcs: sample processing test fixed
* pcs: marker packet macro wrapper removed
* pcs: marker packet is part of the rocprofiler_packet union
* pcs: one fixme added
* pcs: client that uses pc-sampling and code obj tracing
* pcs: client that supports PC sampling and code obj tracing refactored
* pcs: show more info for each PC sample
* pcs: hex output for the samples that do not belong to the matmul kernel
* pcs: querying avail configuration happens immediately before configuring
* pcs: hsa_ven_amd_pcs_create_from_id renamed
* pcs: using hsa_stop; accessing a buffer by id from parser
* pcs: includes reworked, tests returned to life
* pcs: rocprofiler dir removed as outdated
* cmake formatting (cmake-format) (#271)
Co-authored-by: vlaindic <vlaindic@users.noreply.github.com>
* source formatting (clang-format v11) (#272)
Co-authored-by: vlaindic <vlaindic@users.noreply.github.com>
* pcs: some warnings fixed
* source formatting (clang-format v11) (#273)
Co-authored-by: vlaindic <vlaindic@users.noreply.github.com>
* cmake formatting (cmake-format) (#274)
Co-authored-by: vlaindic <vlaindic@users.noreply.github.com>
* pcs: show MI200 relevant information in the sample
* pcs: queue cb fixed; rocr.h include fixed
* source formatting (clang-format v11) (#296)
Co-authored-by: vlaindic <vlaindic@users.noreply.github.com>
* pcs: getting hsa_agent and the doorbell_id from hsa_queue
* source formatting (clang-format v11) (#297)
Co-authored-by: vlaindic <vlaindic@users.noreply.github.com>
* pcs: correlation ID logic fixed
* source formatting (clang-format v11) (#303)
Co-authored-by: vlaindic <vlaindic@users.noreply.github.com>
* pcs: pure pc sampling example fixed
* source formatting (clang-format v11) (#307)
Co-authored-by: vlaindic <vlaindic@users.noreply.github.com>
* cmake formatting (cmake-format) (#308)
Co-authored-by: vlaindic <vlaindic@users.noreply.github.com>
* pcs: interval value if the PC sampling is already configured
* pcs: ROCPROFILER_STATUS_ERROR_PC_SAMPLING_ALREADY_CONFIGURED
New status code if another process configured PC sampling service with different configuration.
Samples are extended to consider this case and retry if it happens.
* pcs: hsa_amd_queue_get_info mocked in tests
* source formatting (clang-format v11) (#328)
Co-authored-by: vlaindic <vlaindic@users.noreply.github.com>
* pcs (tests): query configs after configuring service
* source formatting (clang-format v11) (#329)
Co-authored-by: vlaindic <vlaindic@users.noreply.github.com>
* pcs: sample checks workgroup_id_* and wave_id
* source formatting (clang-format v11) (#330)
Co-authored-by: vlaindic <vlaindic@users.noreply.github.com>
* pcs samples: running samples on the device 0
* pcs: kfd_ioctl updated
* pcs: ioctl config struct changed fields names
* pcs: status when PC sampling is configured by another process is renamed
* pcs: HSA PC sampling API table fixed
* pcs: tmp hack to be able to use HSA pc sampling table
* source formatting (clang-format v11) (#443)
Co-authored-by: vlaindic <vlaindic@users.noreply.github.com>
* pcs service use CIDs generated by HIP API tracing service
* source formatting (clang-format v11) (#455)
Co-authored-by: vlaindic <vlaindic@users.noreply.github.com>
* cmake formatting (cmake-format) (#456)
Co-authored-by: vlaindic <vlaindic@users.noreply.github.com>
* pcs: CID manager
* pcs: explicit flush with no delivered data executes retirement logic
* source formatting (clang-format v11) (#464)
Co-authored-by: vlaindic <vlaindic@users.noreply.github.com>
* pcs: rocprofiler_query_pc_sampling_agent_configurations docs update
* source formatting (clang-format v11) (#465)
Co-authored-by: vlaindic <vlaindic@users.noreply.github.com>
* pcs: rocprofiler_configure_pc_sampling_service docs update
* pcs: explicit sync introduced in PCSCIDManager
* pcs: new logic for retiring CIDs in PC sampling service documented
* pcs: queue interception cb signature updated
* source formatting (clang-format v11) (#471)
Co-authored-by: vlaindic <vlaindic@users.noreply.github.com>
* pcs: if no agent supports PC sampling, fail gracefully
* elaborating when KFD returns EBUSY and EEXIST
* pcs: the second PC sampling example fails gracefully
* code samples use only single kernel for now
* pcs: CID manager refactored
* source formatting (clang-format v11) (#481)
Co-authored-by: vlaindic <vlaindic@users.noreply.github.com>
* pcs: ioctl update
* source formatting (clang-format v11) (#531)
Co-authored-by: vlaindic <139573562+vlaindic@users.noreply.github.com>
* pcs: code sample to test PC sampling applied on concurrent kernels
* source formatting (clang-format v11) (#533)
Co-authored-by: vlaindic <139573562+vlaindic@users.noreply.github.com>
* pcs: pc sampling stress test included
* cmake formatting (cmake-format) (#539)
Co-authored-by: vlaindic <139573562+vlaindic@users.noreply.github.com>
* source formatting (clang-format v11) (#540)
Co-authored-by: vlaindic <139573562+vlaindic@users.noreply.github.com>
* pcs: standalone benchmark
* cmake formatting (cmake-format) (#555)
Co-authored-by: vlaindic <139573562+vlaindic@users.noreply.github.com>
* pcs: glance in external correlation IDs
* source formatting (clang-format v11) (#557)
Co-authored-by: vlaindic <139573562+vlaindic@users.noreply.github.com>
* another change in ioctl interface
* pcs: update queue interceptor callbacks and samples according to the agent 0 version
* source formatting (clang-format v11) (#611)
Co-authored-by: vlaindic <139573562+vlaindic@users.noreply.github.com>
* pcs: avoid running problematic PC sampling test
* pcs: guarding tests not to fail on architectures not supporting PC sampling
* source formatting (clang-format v11) (#617)
Co-authored-by: vlaindic <139573562+vlaindic@users.noreply.github.com>
* pcs: check IOCTL version prior to each KFD call
* pcs: ioctl refactoring
* pcs: PC sampling service increases the ref_count of the correlation ID of the kernel dispatch
* cmake formatting (cmake-format) (#631)
Co-authored-by: vlaindic <139573562+vlaindic@users.noreply.github.com>
* source formatting (clang-format v11) (#632)
Co-authored-by: vlaindic <139573562+vlaindic@users.noreply.github.com>
* pcs: PC sampling service provides external correlation IDs
* source formatting (clang-format v11) (#644)
Co-authored-by: vlaindic <139573562+vlaindic@users.noreply.github.com>
* pcs: use rocprofiler_dim3_t for workgroup_id
* source formatting (clang-format v11) (#645)
Co-authored-by: vlaindic <139573562+vlaindic@users.noreply.github.com>
* pcs: minor fixes
* pcs: updating the documentation for the pc sampling API functions
* pcs: api table and queue controller fix
* pcs: don't generate marker packets for the agent if PC sampling is not configured on it
* pcs: multi-GPU and single-GPU clients
* source formatting (clang-format v11) (#700)
Co-authored-by: vlaindic <139573562+vlaindic@users.noreply.github.com>
* pcs: warning and errors fixed
* source formatting (clang-format v11) (#702)
Co-authored-by: vlaindic <139573562+vlaindic@users.noreply.github.com>
* pcs: clang compiler errors and warnings fixed
* source formatting (clang-format v11) (#716)
Co-authored-by: vlaindic <139573562+vlaindic@users.noreply.github.com>
* pcs: const reference in cid manager
* source formatting (clang-format v11) (#717)
Co-authored-by: vlaindic <139573562+vlaindic@users.noreply.github.com>
* pcs: const & func in manager explicit
* pcs: test to cover creating PC sampling service of agent that does not exist
* pcs: generate marker packets if service is active
* source formatting (clang-format v11) (#719)
Co-authored-by: vlaindic <139573562+vlaindic@users.noreply.github.com>
* pcs: refactoring hsa_adapter; use the correlation_id->thread_idx
* Update source/lib/rocprofiler-sdk/pc_sampling/cid_manager.cpp
* Update source/lib/rocprofiler-sdk/pc_sampling/cid_manager.cpp
* Update source/lib/rocprofiler-sdk/pc_sampling/hsa_adapter.cpp
* Update source/lib/rocprofiler-sdk/pc_sampling/hsa_adapter.cpp
* Update source/lib/rocprofiler-sdk/pc_sampling/hsa_adapter.cpp
* Update source/lib/rocprofiler-sdk/pc_sampling/hsa_adapter.cpp
* Update source/lib/rocprofiler-sdk/pc_sampling/utils.cpp
* Update utils.cpp
* moving pc-sampling tests and samples to pc-sampling label
* Format fix
* pcs: use configured instead of active service
* Update source/lib/rocprofiler-sdk/pc_sampling/service.cpp
* pcs: ensure configuring PC sampling on the HSA level is called only once
* pcs: minor fix
* Update CMakeLists.txt
* Update CMakeLists.txt
* Update CMakeLists.txt
* Update CMakeLists.txt
* pcs: refactoring IOCTL integration
* Update source/lib/rocprofiler-sdk/pc_sampling/tests/CMakeLists.txt
Co-authored-by: Ammar ELWazir <ammar.elwazir@amd.com>
* Update source/lib/rocprofiler-sdk/pc_sampling/ioctl/ioctl_adapter.cpp
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
* Update source/lib/rocprofiler-sdk/pc_sampling/ioctl/ioctl_adapter_types.hpp
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
* Update source/lib/rocprofiler-sdk/pc_sampling/ioctl/ioctl_adapter.cpp
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
* Update source/lib/rocprofiler-sdk/pc_sampling/ioctl/ioctl_adapter_types.hpp
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
* Update source/lib/rocprofiler-sdk/pc_sampling/ioctl/ioctl_adapter.hpp
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
* pcs: reverting back what bot doubled
* Update source/lib/rocprofiler-sdk/pc_sampling/ioctl/ioctl_adapter_types.hpp
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
* pcs: retesting the bot
* Update source/lib/rocprofiler-sdk/pc_sampling/ioctl/ioctl_adapter_types.hpp
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
* pcs: why bot fails on this IOCTL status
* pcs: why failing on <vector>
* Update source/lib/rocprofiler-sdk/pc_sampling/ioctl/ioctl_adapter.cpp
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
* pcs: returning commits removed by bot
* pcs: formatting locally
* pcs: clients are flushing buffers inside the tool_fini
* pcs: sync function in public API
* pcs: sync prior to unloading the code object
* pcs: sync function requires context
* pcs: client uses CID retirement service
* pcs: test for flushing internal ROCr buffers
* pcs: source formatting
* Update source/lib/rocprofiler-sdk/pc_sampling/tests/CMakeLists.txt
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
* pcs: code samples refactoring
* pcs: public API header refactored
* pcs: rocprofiler_buffer_flush drains internal PC sampling buffers too
* pcs: remove unnecessary functions
* pcs: do not call hsa's copytables
* pcs: include reordering
* pcs: using ROCP_ERROR inside PC sampling implementation
* pcs: pc_sampling sample uses ostream instead of printfs
* pcs: pc_sampling_codeobj tracing using ostream instead of prints
* pcs: registering once for interceptor callbacks
* pcs: do not generate internal CIDs if not in debug mode
* pcs: rebasing fixed; missing external correlation IDs
* pcs: code formatting
* enable kernel tracing service to receive external correlation IDs
* pcs: using ROCPROFILER_STATUS_ERROR_INCOMPATIBLE_KERNEL
* pcs: polishing parser
* formatting
* updating parser to use workgroup_id
* kfd_ioctl.h extracted in details folder
* refactoring
* pcs: preparing to generate code object information
* flush internal buffers prior to unloading code object
* pcs: generating marker records
* pcs: wrap code_object's shutdown function
* ROCR_VISIBLE_DEVICES and HIP_VISIBLE_DEVICES unsupported at the moment
* documenting that ROCR/HIP_VISIBLE_DEVICES are ignored
* pcs: separate structs for code object loading/unloading markers
* pcs: inst_pkt_t changed the namespace
* pcs: removing wrapper around the shutdown function
* pcs: size in record field
* pcs: documentation refactoring + typedefs
* renaming PCSAgentConfig to PCSAgentSession
* pcs: service does not keep a pointer to the context
* pcs: static assertions related to the versioning
* pcs: rocprofiler_pc_sampling_configuration_t size field
* pcs: report API unimplemented unless explicitly enabled
* pcs: skip tests if KFD does not support PC sampling
* pcs: if ROCr hides some devices, no PC samples will be delivered for it
* pcs: hip error check after kernel launch
* formatting
* removing PCS info from agent.h
* fix based on review
* Update continuous integration workflow
- use mi200 runner for code coverage (supports PC sampling)
- split sanitizer jobs across navi3, vega20, and mi300
* Updating pc sampling test labels
* ROCP_PC_SAMPLING_ENABLED env in CI
* ROCP_PC_SAMPLING_ENABLED for all CI mi200 jobs
* Rearrange sanitizer assignments
* fixes according to review
* removed unused functions
* pcs: rocprofiler_agent_id_t instead of handle as a key in map
* Update source/lib/rocprofiler-sdk/context/context.hpp
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
* removing drm_fd from the agent.h
* pcs: removing one sample due to complexity
* pcs: refactoring sample
* simplifying sample
* new lines
* Improve queue_control enable interceptor logic
* Update lib/rocprofiler-sdk/hsa/types.hpp
- handle amd_ext size for HSA 1.12.0
* ROCP_PC_SAMPLING_ENABLED -> ROCPROFILER_PC_SAMPLING_BETA_ENABLED
* Update hsa_adapter.cpp
- anonymous namespace + remove debug
* parser update
* Apply suggestions from code review
---------
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: vlaindic <vlaindic@users.noreply.github.com>
Co-authored-by: vlaindic <vladimir.indic@amd.com>
Co-authored-by: vlaindic <vlaindic@amd.com>
Co-authored-by: Vladimir Indic <139573562+vlaindic@users.noreply.github.com>
Co-authored-by: Jonathan R. Madsen <jrmadsen@users.noreply.github.com>
Co-authored-by: gobhardw <gopesh.bhardwaj@amd.com>
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
* Minor fix
Removal of HSA from counter collection
Tests for AQL
Updated counter collection client to build profiles in tool init
* Rebased
* Debug printing
* Formatting
* More format
* fix shadowing
---------
Co-authored-by: Jonathan R. Madsen <jrmadsen@users.noreply.github.com>
* Fix agent node id + randomize offset id
- fixes the node_id value
- randomizes a constant offset for the id.handle values
- switch to using node ids in rocprofiler-sdk-tool library
- update tests related to agents
* Logical node id
- sequential node id values from 0 to (N-1) where N is the number of agents
* Moved tests/apps to tests/bin
* Renamed cmake project in tests/bin
* Update samples
- Use ROCPROFILER_DEFAULT_FAIL_REGEX
- tweaks to stdout messages
* Update tests
- Use ROCPROFILER_DEFAULT_FAIL_REGEX
* Add tests/lib
- libraries with HIP code
* Update PTL submodule
- remove atexit delete of thread_id_map
* Update cmake/rocprofiler_options.cmake
- Set ROCPROFILER_DEFAULT_FAIL_REGEX
* Update common lib: env + logging
- improved customization of logging settings
- default to disabling logging to files
- install failure handler for rocprofv3
- set_env support in environment.*
* Add lib/rocprofiler-sdk/shared_library.cpp
- shared library constructor
* Update lib/rocprofiler-sdk-tool/tool.cpp
- destructor thread safety
- convert callback_name_info and buffered_name_info to pointers
- install failure handler for logging
* Add tests/bin/hip-in-libraries
- hip-in-libraries is an exe which uses two shared libraries where each shared library contains HIP kernels
- used for testing deadlocking within __hipRegisterFatBinary
* Update bin/rocprofv3
- reorganized the env variables
- use exec to launch command
- set ROCPROFILER_LIBRARY_CTOR=1
* Add tests/rocprofv3/tracing-hip-in-libraries
- uses the hip-in-libraries exe, which uses shared libraries to launch HIP kernels
* Update bin/rocprofv3
- fix counter collection (no exec)
* Update lib/rocprofiler-sdk-tool/tool.cpp
- replace "Kernel-Name" with "Kernel_Name"
* Update lib/rocprofiler-sdk/registration.cpp
- use RTLD_LOCAL instead of RTLD_GLOBAL for env libraries
* Update tests/rocprofv3
- replace "Kernel-Name" with "Kernel_Name"
* Update tests
- vector-ops (bin) stream syncs + runs with 4 queues per device
- improve counter-collection/input1 validation
- rocprofv3/tracing-hip-in-libraries does not do sys-trace
- improved validation script for tracing-hip-in-libraries
- updated dispatch_callback in json-tool.cpp following reworking of prototypes for counter collection
* Update samples/counter_collection
- updated dispatch_callback(s) and record_callback(s) following reworking of prototypes
* Update bin/rocprofv3
- reorganized help menu
- added options for sub-HSA tables
- added --hip-runtime-trace
- changed --hip-trace to include --hip-compiler-trace
* Update lib/rocprofiler-sdk-tool
- improved kernel filtering
- removed arch_vgpr, accum_vgpr, sgpr code (in rocprofiler-sdk)
- fixed issue with counter-collection w/o tracing
- added support for fine grained HSA API tracing
- removed directly linking to HSA-runtime
* Update lib/rocprofiler-sdk/agent.cpp
- rocp_agents != hsa_agents is non-fatal when ROCPROFILER_BUILD_CI=OFF (CMake option)
* GPR (vector and scalar) info in kernel symbol data
- rocprofiler_callback_tracing_code_object_kernel_symbol_register_data_t contains general purpose register info
* Header include order fix
- Include repo headers first
- Third party library headers next
- standard library headers last
* Update dispatch profiling public API
- introduce rocprofiler_profile_counting_dispatch_data_t
- change signature of rocprofiler_profile_counting_dispatch_callback_t and rocprofiler_profile_counting_record_callback_t
- provide rocprofiler_user_data_t pointer in dispatch callback
- provide rocprofiler_user_data_t value (from dispatch cb) in record callback
* Update tests/bin/CMakeLists.txt
- fix add_subdirectory(hip-in-libraries) order
* Update VERSION
- bump to 0.2.0 in prep for AFAR
* Update include/rocprofiler-sdk/{counters,profile_config}.h
- use rocprofiler_agent_id_t instead of rocprofiler_agent_t
* Update samples
- use rocprofiler-sdk::rocprofiler-sdk instead of rocprofiler::rocprofiler in cmake
- api_callback_tracing sample roctxProfiler{Pause,Resume}
- api_callback_tracing sample uses ROCTx
- updates to use rocprofiler_agent_id_t
* Update run-ci.py
- exclude rocprofiler-sdk-tool from samples (no sample uses that code)
* Update lib/rocprofiler-sdk-tool/tool.cpp
- Update rocprofiler_iterate_agent_supported_counters to use agent ID
* Update lib/rocprofiler-sdk/counters/core.*
- profile_config has pointer to agent instead of copy
* Update lib/rocprofiler-sdk/agent.*
- provide get_agent(...) func via rocp agent id
* Update lib/rocprofiler-sdk/{buffer,callback}_tracing.cpp
- return ROCPROFILER_STATUS_ERROR_NOT_IMPLEMENTED for enums missing implementation
* Update lib/rocprofiler-sdk/counters.cpp
- update to use rocprofiler_agent_id_t instead of rocprofiler_agent_t
* Update lib/rocprofiler-sdk/profile_config.cpp
- update to use rocprofiler_agent_id_t instead of rocprofiler_agent_t
* Update source/docs
- requirements.txt + install reqs in cmake
* Bump version to 0.1.0
* Update samples/api_callback_tracing/CMakeLists.txt
- LD_LIBRARY_PATH for test
* Update test/rocprofv3/tracing/CMakeLists.txt
- reorder validation files so memory copy comes first
* Update lib/rocprofiler-sdk-tool/tool.cpp
- logging for flushing buffers
- variables for buffer_size and buffer_watermark
- increase the watermark to a full buffer
- use dedicated threads for each buffer
* Update lib/rocprofiler-sdk-tool/CMakeLists.txt
- test sets ROCPROF_LOG_LEVEL and ROCPROFILER_LOG_LEVEL to info
* Remove lib/rocprofiler-sdk-tool/trace_buffer.hpp
* Update lib/rocprofiler-sdk-tool/CMakeLists.txt
- drop log level to warning when leak sanitizer is enabled (produces small memory leak)
* Update lib/rocprofiler-sdk/context/context.*
- get_registered_contexts functions (local copy)
* Update lib/rocprofiler-sdk/hsa/{queue,queue_controller}.cpp
- remove ROCPROFILER_BUFFER_TRACING_MEMORY_COPY code
* Update tests/kernel-tracing/kernel-tracing.cpp
- move stop() and flush() in tool_fini to before reporting of sizes of data collected
* Update lib/rocprofiler-sdk/hsa/hsa.*
- remove stale set_callback / activity_functor_t code
* Update lib/rocprofiler-sdk/buffer.cpp
- full wait instead of returning busy when buffer is busy
- use task_group::join instead of task_group::wait to fully wait for tasks to finish (bug fix)
* Update lib/rocprofiler-sdk/agent.cpp
- support agent mapping for CPU agents
* Remove direct access to vector of registered contexts
* Update GitHub links
* Update samples/api_buffered_tracing/client.cpp
- check if initialized before forcing initialization
* Add lib/common/static_object.*
- template class for creating a static allocation in the binary which has all the properties of a heap allocated singleton but does not trigger leak sanitizers
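
One way such a wrapper could look, as a minimal sketch only: the object is placement-constructed into a static buffer and its destructor is never run. The class body below is an assumption for illustration and is not the contents of lib/common/static_object.*; the member names (`construct`, `get`, `m_buffer`, `m_object`) are hypothetical, and the sketch is not thread-safe.

```cpp
#include <new>
#include <utility>

// Hypothetical sketch (not the SDK's static_object): storage lives in the
// binary's static data, so there is no heap allocation for a leak sanitizer
// to report, yet the object is never destroyed, like a leaked heap singleton.
template <typename Tp>
class static_object
{
public:
    // Construct the object on first call; later calls return the existing
    // object and ignore their arguments. Not thread-safe in this sketch.
    template <typename... Args>
    static Tp* construct(Args&&... args)
    {
        if(!m_object) m_object = new(m_buffer) Tp(std::forward<Args>(args)...);
        return m_object;
    }

    // Returns nullptr until construct() has been called.
    static Tp* get() { return m_object; }

private:
    alignas(Tp) static inline unsigned char m_buffer[sizeof(Tp)] = {};
    static inline Tp* m_object                                   = nullptr;
};
```

Because the destructor never runs, pointers obtained from `get()` remain usable during static destruction and late shutdown callbacks, which is presumably the motivation for the later entries that switch to common::static_object.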
* Update include/rocprofiler-sdk/internal_threading.h
- document return values
* Update lib/rocprofiler-sdk/internal_threading.cpp
- return codes from rocprofiler_create_callback_thread and rocprofiler_assign_callback_thread
- use common::static_object for thread-pool object
* Update lib/rocprofiler-sdk/agent.cpp
- use common::static_object to store array of strings and their hashes
* Update lib/rocprofiler-sdk/hsa/code_object.cpp
- use common::static_object to store array of strings and their hashes to ensure strings exist until termination
* Update lib/rocprofiler-sdk/registration.cpp
- use common::static_object to store status and client libraries
- update return values for rocprofiler_set_api_table
* Update lib/rocprofiler-sdk/hsa/hsa.cpp
- check registration::get_fini_status() in hsa_api_impl::functor<Idx>(args...)
* Update lib/rocprofiler-sdk/context/context.cpp
- using common::static_object for correlation id map
* Fix find_package(rocprofiler) in build tree
* Move include/rocprofiler to include/rocprofiler-sdk
* Update include/CMakeLists.txt
- add_subdirectory(rocprofiler-sdk)
* Move lib/rocprofiler to lib/rocprofiler-sdk
* Move lib/rocprofiler-tool to lib/rocprofiler-sdk-tool
* Update lib/CMakeLists.txt
- add_subdirectory(rocprofiler-sdk)
- add_subdirectory(rocprofiler-sdk-tool)
* Update lib/rocprofiler-sdk/CMakeLists.txt
* Rename rocprofiler-tool to rocprofiler-sdk-tool
* Replace include rocprofiler/ with include rocprofiler-sdk/
* Replace include lib/rocprofiler/ with include lib/rocprofiler-sdk/
* Set VERSION to 0.0.0 and finish install to rocprofiler-sdk
* More fixes for rocprofiler -> rocprofiler-sdk
- fix issue with rocprofiler-sdk-config.cmake.in
- fix counters xml install path
* Fix documentation generation
* Create rocprofiler_LIB_ROCPROFILER_SDK_DIR for build tree
* cmake formatting (cmake-format) (#264)
Co-authored-by: jrmadsen <jrmadsen@users.noreply.github.com>
---------
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>