a3819f09ad554fc0a945aaef197dfd7989edfc50
31 次代码提交
| 作者 | SHA1 | 备注 | 提交日期 | |
|---|---|---|---|---|
|
|
2a146259c7 |
Add support for RCCL tracing (#1047)
* [Draft]: Add support for RCCL tracing Address comments * [Draft]: Add support for RCCL tracing Address PR comments, changes from RCCL upstream * Add RCCL library table registration Working on adding support to rocprofiler-register * Support compilation w/o <rccl/amd_detail/api_trace.h> - dummy api_trace.h header - return ROCPROFILER_STATUS_ERROR_NOT_IMPLEMENTED when RCCL does not have api_trace.h header * RCCL API tracing tool support - add to rocprofv3 - add to json-tool --------- Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com> |
||
|
|
5d54682468 |
Misc cleanup and stale code removal (#1026)
* Remove custom allocators - remove unused lib/rocprofiler-sdk/allocator.* - remove unused lib/rocprofiler-sdk/context/allocator.hpp * Fix rocprofiler_strip_target (rocprofiler_utilities.cmake) * Remove old HSA_TOOLS_LIB support - remove OnLoad/OnUnload functions used by HSA_TOOLS_LIB env variable * Fix linter warnings + specific NOLINT exceptions - replace bare NOLINT with NOLINT(<warning-name>) |
||
|
|
fa1b9e67ab |
ATT Agent fixes and improvements (#1011)
* Tidying ATT dispatch API. ATT Agent to be initialized with rest of profiler. Removing read_index-based wait. * Formatting * Adding some input validation * Add perf test for agent * Removing async |
||
|
|
dc671497da |
look for symbols in dynsym table (#990)
* look for symbols in dynsym table * checking both symtab and dynsym * Avoid symbol duplication in non stripped binaries * clang-format * Minor elf_utils.cpp updates - use 'else if' instead of 'if' - logging tweaks * Update registration - tweak logging * Update testing - strip the rocprofiler-sdk-c-tool library - add test-c-tool-rocp-tool-lib-execute test which does NOT LD_PRELOAD the library (uses only ROCP_TOOL_LIBRARIES instead) --------- Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com> |
||
|
|
2be3543c7b |
Parse ELF format for rocprofiler_configure symbol (#970)
* Parse ELF format to search for rocprofiler_configure * Use ELF parsing in registration |
||
|
|
8da0c35079 |
Adding wrappers on HSA for executable load/unload and allowing multiple agents per context on ATT (#951)
* Codeobj wrappers around HSA calls for ATT * Formatting * Bookeeping * Tidy * Tidy * Update source/lib/rocprofiler-sdk/thread_trace/code_object.hpp Co-authored-by: Vladimir Indic <139573562+vlaindic@users.noreply.github.com> * Update source/lib/rocprofiler-sdk/thread_trace/att_core.hpp Co-authored-by: Vladimir Indic <139573562+vlaindic@users.noreply.github.com> * Variable naming --------- Co-authored-by: Vladimir Indic <139573562+vlaindic@users.noreply.github.com> |
||
|
|
62ec95eae6 | Sync queue and async copy on client finalizer (#950) | ||
|
|
987ae3cc47 |
PC Sampling Support (#715)
* cmake formatting (cmake-format) (#188) Co-authored-by: vlaindic <vlaindic@users.noreply.github.com> * source formatting (clang-format v11) (#189) Co-authored-by: vlaindic <vlaindic@users.noreply.github.com> * pcs: design of the pc sampling data struct; guarding parts of code that uses ROCr marker packets * source formatting (clang-format v11) (#191) Co-authored-by: vlaindic <vlaindic@users.noreply.github.com> * cmake formatting (cmake-format) (#192) Co-authored-by: vlaindic <vlaindic@users.noreply.github.com> * pcs: shadow variable fix * pcs: fix for compiler errors reported by CI/CD * source formatting (clang-format v11) (#193) Co-authored-by: vlaindic <vlaindic@users.noreply.github.com> * pcs: docs fix; samples uses rocprofiler::rocprofiler library * cmake formatting (cmake-format) (#195) Co-authored-by: vlaindic <vlaindic@users.noreply.github.com> * pcs: client in samples folder fixed * pcs: client requires rocprofiler package as dependency * pcs: client uses single context * source formatting (clang-format v11) (#196) Co-authored-by: vlaindic <vlaindic@users.noreply.github.com> * pcs: client using single buffer; no buffer destroy in client * pcs: client::setup explicitly called from the example * pcs: rocprofiler_pc_sample_record_t updated * pcs: fixed init of external correlation id * source formatting (clang-format v11) (#198) Co-authored-by: vlaindic <vlaindic@users.noreply.github.com> * pcs: remove outdated files; update CMakeLists * cmake formatting (cmake-format) (#212) Co-authored-by: vlaindic <vlaindic@users.noreply.github.com> * pcs: using rocprofiler_agent_id_t * pcs: Removing trailing whitespaces Co-authored-by: Jonathan R. Madsen <jrmadsen@users.noreply.github.com> * source formatting (clang-format v11) (#214) Co-authored-by: vlaindic <vlaindic@users.noreply.github.com> * pcs: mapping agent_id to the agent * source formatting (clang-format v11) (#215) Co-authored-by: vlaindic <vlaindic@users.noreply.github.com> * pcs: const while iterating over agents * source formatting (clang-format v11) (#216) Co-authored-by: vlaindic <vlaindic@users.noreply.github.com> * pcs: calling get_buffer instead of get_buffers * pcs: workgroup typo * pcs: documentation for the public PC sampling API * pcs: queue_cb_t signature adaptation * pcs: mocks removed * pcs: updating HsaApiTable with HSA/ROCr PC sampling API * pcs: querying available PC sampling configs through IOCTL * pcs: create the PCS session in IOCTL * pcs: first actual PC samples delivered to the rocprofiler's client :) * pcs: works with marker packet too * pcs: using HSA table to call pc sampling related functions * pcs: using ioctl instead of kfd in naming * pcs: configuration service test fixed * pcs: sample processing test fixed * pcs: marker packet macro wrapper removed * pcs: marker packet is part of the rocprofiler_packet union * pcs: one fixme added * pcs: client that uses pc-sampling and code obj tracing * pcs: client that supprts PC sampling and code obj tracing refactored * pcs: show more info for each PC sample * pcs: hex output for the samples that do not belong to the matmul kernel * pcs: querying avail configuration happens immediately before configuring * pcs: hsa_ven_amd_pcs_create_from_id renamed * pcs: using hsa_stop; accessing a buffer by id from parser * pcs: includes reworked, tests returned to life * pcs: rocrofiler dir removed as outdated * cmake formatting (cmake-format) (#271) Co-authored-by: vlaindic <vlaindic@users.noreply.github.com> * source formatting (clang-format v11) (#272) Co-authored-by: vlaindic <vlaindic@users.noreply.github.com> * pcs: some warnings fixed * source formatting (clang-format v11) (#273) Co-authored-by: vlaindic <vlaindic@users.noreply.github.com> * cmake formatting (cmake-format) (#274) Co-authored-by: vlaindic <vlaindic@users.noreply.github.com> * pcs: show MI200 relevant information in the sample * pcs: queue cb fixed; rocr.h include fixed * source formatting (clang-format v11) (#296) Co-authored-by: vlaindic <vlaindic@users.noreply.github.com> * pcs: getting hsa_agent and the doorbell_id from hsa_queue * source formatting (clang-format v11) (#297) Co-authored-by: vlaindic <vlaindic@users.noreply.github.com> * pcs: correlation ID logic fixed * source formatting (clang-format v11) (#303) Co-authored-by: vlaindic <vlaindic@users.noreply.github.com> * pcs: pure pc sampling example fixed * source formatting (clang-format v11) (#307) Co-authored-by: vlaindic <vlaindic@users.noreply.github.com> * cmake formatting (cmake-format) (#308) Co-authored-by: vlaindic <vlaindic@users.noreply.github.com> * pcs: interval value if the PC sampling is already configured * pcs: ROCPROFILER_STATUS_ERROR_PC_SAMPLING_ALREADY_CONFIGURED New status code if another process configured PC sampling service with different configuration. Samples are extended to consider this case and retry if it happens. * pcs: hsa_amd_queue_get_info mocked in tests * source formatting (clang-format v11) (#328) Co-authored-by: vlaindic <vlaindic@users.noreply.github.com> * pcs (tests): query configs after configuring service * source formatting (clang-format v11) (#329) Co-authored-by: vlaindic <vlaindic@users.noreply.github.com> * pcs: sample checks workgroup_id_* and wave_id * source formatting (clang-format v11) (#330) Co-authored-by: vlaindic <vlaindic@users.noreply.github.com> * pcs samples: running samples on the device 0 * pcs: kfd_ioctl updated * pcs: ioctl config struct changed fields names * pcs: status when PC sampling is configured by another process is renamed * pcs: HSA PC sampling API table fixed * pcs: tmp hack to be able to use HSA pc sampling table * source formatting (clang-format v11) (#443) Co-authored-by: vlaindic <vlaindic@users.noreply.github.com> * pcs service use CIDs generated by HIP API tracing service * source formatting (clang-format v11) (#455) Co-authored-by: vlaindic <vlaindic@users.noreply.github.com> * cmake formatting (cmake-format) (#456) Co-authored-by: vlaindic <vlaindic@users.noreply.github.com> * pcs: CID manager * pcs: explicit flush with no delivered data executes retirement logic * source formatting (clang-format v11) (#464) Co-authored-by: vlaindic <vlaindic@users.noreply.github.com> * pcs: rocprofiler_query_pc_sampling_agent_configurations docs update * source formatting (clang-format v11) (#465) Co-authored-by: vlaindic <vlaindic@users.noreply.github.com> * pcs: rocprofiler_configure_pc_sampling_service docs update * pcs: explicit sync introduced in PCSCIDManager * pcs: new logic for retiring CIDs in PC sampling service documented * pcs: queue interception cb signature updated * source formatting (clang-format v11) (#471) Co-authored-by: vlaindic <vlaindic@users.noreply.github.com> * pcs: if no agents supports PC sampling, fail gracefully * elaborating when KFD returns EBUSY and EEXIST * pcs: the second PC sampling examples fails gracefully * code samples use only single kernel for now * pcs: CID manager refactored * source formatting (clang-format v11) (#481) Co-authored-by: vlaindic <vlaindic@users.noreply.github.com> * pcs: ioctl update * source formatting (clang-format v11) (#531) Co-authored-by: vlaindic <139573562+vlaindic@users.noreply.github.com> * pcs:code sample to test PC sampling applied on concurrent kernels * source formatting (clang-format v11) (#533) Co-authored-by: vlaindic <139573562+vlaindic@users.noreply.github.com> * pcs: pc sampling strest test included * cmake formatting (cmake-format) (#539) Co-authored-by: vlaindic <139573562+vlaindic@users.noreply.github.com> * source formatting (clang-format v11) (#540) Co-authored-by: vlaindic <139573562+vlaindic@users.noreply.github.com> * pcs: standalone benchmark * cmake formatting (cmake-format) (#555) Co-authored-by: vlaindic <139573562+vlaindic@users.noreply.github.com> * pcs: glance in external correlation IDs * source formatting (clang-format v11) (#557) Co-authored-by: vlaindic <139573562+vlaindic@users.noreply.github.com> * another change in ioctl interface * pcs: update queue interceptor callbacks and samples accroding to the agent 0 version * source formatting (clang-format v11) (#611) Co-authored-by: vlaindic <139573562+vlaindic@users.noreply.github.com> * pcs: avoid running problematic PC sampling test * pcs: guarding tests not to fail on architectures not supporting PC sampling * source formatting (clang-format v11) (#617) Co-authored-by: vlaindic <139573562+vlaindic@users.noreply.github.com> * pcs: check IOCTL version prior to each KFD call * pcs: ioctl refactoring * pcs: PC sampling service increases the ref_count of the correlation ID of the kernel dispatch * cmake formatting (cmake-format) (#631) Co-authored-by: vlaindic <139573562+vlaindic@users.noreply.github.com> * source formatting (clang-format v11) (#632) Co-authored-by: vlaindic <139573562+vlaindic@users.noreply.github.com> * pcs: PC sampling service provides external correlation IDs * source formatting (clang-format v11) (#644) Co-authored-by: vlaindic <139573562+vlaindic@users.noreply.github.com> * pcs: use rocprofiler_dim3_t for workgrou_ip * source formatting (clang-format v11) (#645) Co-authored-by: vlaindic <139573562+vlaindic@users.noreply.github.com> * pcs: minor fixes * pcs: updating the documentation for the pc sampling API functions * pcs: api table and queue controller fix * pcs: don't generate marker packets for the agent if PC sampling is not configured on it * pcs: multi-GPU and single-GPU clients * source formatting (clang-format v11) (#700) Co-authored-by: vlaindic <139573562+vlaindic@users.noreply.github.com> * pcs: warning and errors fixed * source formatting (clang-format v11) (#702) Co-authored-by: vlaindic <139573562+vlaindic@users.noreply.github.com> * pcs: clang compiler errors and warnings fixed * source formatting (clang-format v11) (#716) Co-authored-by: vlaindic <139573562+vlaindic@users.noreply.github.com> * pcs: const reference in cid manager * source formatting (clang-format v11) (#717) Co-authored-by: vlaindic <139573562+vlaindic@users.noreply.github.com> * pcs: const & func in manager explicit * pcs: test to cover creating PC sampling service of agent that does not exist * pcs: generate marker packets if service is active * source formatting (clang-format v11) (#719) Co-authored-by: vlaindic <139573562+vlaindic@users.noreply.github.com> * pcs: refactoring hsa_adapter; use the correlation_id->thread_idx * Update source/lib/rocprofiler-sdk/pc_sampling/cid_manager.cpp * Update source/lib/rocprofiler-sdk/pc_sampling/cid_manager.cpp * Update source/lib/rocprofiler-sdk/pc_sampling/hsa_adapter.cpp * Update source/lib/rocprofiler-sdk/pc_sampling/hsa_adapter.cpp * Update source/lib/rocprofiler-sdk/pc_sampling/hsa_adapter.cpp * Update source/lib/rocprofiler-sdk/pc_sampling/hsa_adapter.cpp * Update source/lib/rocprofiler-sdk/pc_sampling/utils.cpp * Update utils.cpp * moving pc-sampling tests and samples to pc-sampling label * Format fix * pcs: use configured instead of active service * Update source/lib/rocprofiler-sdk/pc_sampling/service.cpp * pcs: ensure configuring PC sampling on the HSA level is called only once * pcs: minor fix * Update CMakeLists.txt * Update CMakeLists.txt * Update CMakeLists.txt * Update CMakeLists.txt * pcs: refactoring IOCTL integration * Update source/lib/rocprofiler-sdk/pc_sampling/tests/CMakeLists.txt Co-authored-by: Ammar ELWazir <ammar.elwazir@amd.com> * Update source/lib/rocprofiler-sdk/pc_sampling/ioctl/ioctl_adapter.cpp Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> * Update source/lib/rocprofiler-sdk/pc_sampling/ioctl/ioctl_adapter_types.hpp Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> * Update source/lib/rocprofiler-sdk/pc_sampling/ioctl/ioctl_adapter.cpp Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> * Update source/lib/rocprofiler-sdk/pc_sampling/ioctl/ioctl_adapter_types.hpp Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> * Update source/lib/rocprofiler-sdk/pc_sampling/ioctl/ioctl_adapter.hpp Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> * pcs: reverting back what bot doubled * Update source/lib/rocprofiler-sdk/pc_sampling/ioctl/ioctl_adapter_types.hpp Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> * pcs: retesting the bot * Update source/lib/rocprofiler-sdk/pc_sampling/ioctl/ioctl_adapter_types.hpp Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> * pcs: why bot fails on this IOCTL status * pcs: why failing on <vector> * Update source/lib/rocprofiler-sdk/pc_sampling/ioctl/ioctl_adapter.cpp Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> * pcs: returning commits removed by bot * pcs: formatting locally * pcs: clients are flushing buffers inside the tool_fini * pcs: sync function in public API * pcs: sync prior to unloading the code object * pcs: sync function requires context * pcs: client uses CID retirement service * pcs: test for flusing internal ROCr buffers * pcs: source formatting * Update source/lib/rocprofiler-sdk/pc_sampling/tests/CMakeLists.txt Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> * pcs: code samples refactoring * pcs: public API header refactored * pcs: rocprofiler_buffer_flush drains internal PC sampling buffers too * pcs: remove unnecessary functions * pcs: do not call hsa's copytables * pcs: include reordering * pcs: using ROCP_ERROR inside PC sampling implementation * pcs: pc_sampling sample uses ostream instean of printfs * pcs: pc_sampling_codeobj tracing using ostream instead of prints * pcs: registering once for interceptor callbacks * pcs: do not generate internal CIDs if not in debug mode * pcs: rebasing fixed; missing external correlation IDs * pcs: code formatting * enable kernel tracing service to receive external correlation IDs * pcs: using ROCPROFILER_STATUS_ERROR_INCOMPATIBLE_KERNEL * pcs: polishing parser * formatting * updating parser to use workgroup_id * kfd_ioctl.h extracted in details folder * refactoring * pcs: preparing to generate code object information * flush internal buffers prior to unloading code object * pcs: generating marker records * pcs: wrap code_object's shutdown function * ROCR_VISIBLE_DEVICES and HIP_VISISBLE_DEVICES unsupported at the moment * documenting the ignorance of ROCR/HIP_VISIBLE_DEVICES * pcs: separate structs for code object loading/unloading markers * pcs: inst_pkt_t changed the namespace * pcs: removing wrapper around the shutdown function * pcs: size in record field * pcs: documentation refactoring + typdefs * renaming PCSAgentConfig to PCSAgentSession * pcs: service does not keep a pointer to the context * pcs: static assertions related to the versioning * pcs: rocprofiler_pc_sampling_configuration_t size field * pcs: report API unimplemented unleass explicitly enabled * pcs: skip tests if KFD does not support PC sampling * pcs: if ROCr hides some devices, no PC samples will be delivered for it * pcs: hip error check after kernel launch * formatting * removing PCS info from agent.h * fix based on review * Update continuous integration workflow - use mi200 runner for code coverage (supports PC sampling) - split sanitizer jobs across navi3, vega20, and mi300 * Updating pc sampling test labels * ROCP_PC_SAMPLING_ENABLED env in CI * ROCP_PC_SAMPLING_ENABLED for all CI mi200 jobs * Rearrange sanitizer assignments * fixes according to review * removed unused functions * pcs: rocprofiler_agent_id_t instead of handle as a key in map * Update source/lib/rocprofiler-sdk/context/context.hpp Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> * removing drm_fd from the agent.h * pcs: removing one sample due to complexity * pcs: refactoring sample * simplifying sample * new lines * Improve queue_control enable intercepter logic * Update lib/rocprofiler-sdk/hsa/types.hpp - handle amd_ext size for HSA 1.12.0 * ROCP_PC_SAMPLING_ENABLED -> ROCPROFILER_PC_SAMPLING_BETA_ENABLED * Update hsa_adapter.cpp - anonymous namespace + remove debug * parser update * Apply suggestions from code review --------- Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: vlaindic <vlaindic@users.noreply.github.com> Co-authored-by: vlaindic <vladimir.indic@amd.com> Co-authored-by: vlaindic <vlaindic@amd.com> Co-authored-by: Vladimir Indic <139573562+vlaindic@users.noreply.github.com> Co-authored-by: Jonathan R. Madsen <jrmadsen@users.noreply.github.com> Co-authored-by: gobhardw <gopesh.bhardwaj@amd.com> Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com> |
||
|
|
28e6430d04 |
[2/N] Agent Counter implementation with unit tests to check functionality (#846)
Agent Counter Collection API with tests and samples. --------- Co-authored-by: Benjamin Welton <ben@amd.com> |
||
|
|
4d5b71b0e7 |
Update logging (#838)
* Update logging * Remove unused function * Fix lib/rocprofiler-sdk/hsa/pc_sampling.cpp logging compilation * Fix logging FLAGS_vmodule string leak and numerical log level * Update logging * Update glog submodule * Leak fixes * format |
||
|
|
733aa8e438 |
Restructure code object source code (#826)
* public codeobj info * Restructure code object source code file layout * Update get_unloaded_code_objects + add iterate_loaded_code_objects * Remove get_unloaded_code_objects from visible internal API - iterate_loaded_code_objects + functor which filters on the hsa_executable_t effectively reproduces this behavior * Whitespace removal --------- Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com> |
||
|
|
deabd869b5 |
Introducing PcSamplingExtTable (#735)
* pcs: updating the PCS table * Fixing Clang Tidy errors * pcs: reverting old table version * testint wrong table size * new size * testing step * reverting old steps * hsa_amd_queue_get_info introduced * pcs: testing table version * formatting * removing redundand declarations * removing unnecessary files * Apply suggestions from code review * Apply suggestions from code review Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> * Update source/lib/rocprofiler-sdk/hsa/pc_sampling.cpp Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> * Enable function pointer offset check in hsa::pc_sampling::copy_table - add offset() to HSA_API_META_DEFINITION - check if offset() >= size of struct * Support build without PC sampling API table * ids for ROCr's PC sampling public functions --------- Co-authored-by: Ammar ELWazir <aelwazir@hpe6u-21.amd.com> Co-authored-by: Jonathan R. Madsen <jrmadsen@users.noreply.github.com> Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com> |
||
|
|
fd3d97287c |
Page migration reporting (#651)
* Page migration reporting support * Page migration: Update parser and reporting Container does not lave latest KFD header, so CI might fail * Add kfd_ioctl.h * Formatting * Update get_key - get key was not used (and shouldn't be), so delete it * clang-tidy fixes * Tests for page migration * Apply suggestions from code review Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> * Update tests/bin/page-migration/CMakeLists.txt Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> * Update page-migration test app - add hipHostRegister to register mmap'ed allocation with HIP - misc cleanup and reorg - remove HSA_XNACK=1 from test env * Update lib/rocprofiler-sdk/tests/page_migration.cpp - fix compilation error * Minor updates (reorg, rename) * Page migration reporting support * Page migration: Update parser and reporting Container does not lave latest KFD header, so CI might fail * Update page migration tests, fix trigger types * Page Migration Tracing Support Refactoring (#753) * Reorganization * Update page migration init/fini * Formatting * Update page_migration.cpp - change logging severity * Skip test if KFD does not support page migration reporting * Rework skipping test if KFD does not support page migration * Fix event trigger enum values * Fix clang-diagnostic-unused-const-variable --------- Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com> Co-authored-by: Jonathan R. Madsen <jrmadsen@users.noreply.github.com> |
||
|
|
e2d8ccad4b |
adding pandas and pytest to rquirements.txt (#748)
* adding pandas and pytest to rquirements.txt * setting up requrements.txt * Update requirements - formatting packages - remove packages not directly used by rocprofiler-sdk * Update cmake formatting, linting, and options - if BUILD_CI -> force BUILD_DEVELOPER and BUILD_WERROR - support python installed clang-format and python installed clang-tidy * Update build.sh - split into install-deps.sh and install-apt-deps.sh * Improve code coverage --------- Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com> |
||
|
|
4fa165ec1a |
Add support for scratch reporting (#523)
* Add ToolsApiTable Add ToolsApiTable wrapping for scratch memory tracking * Add initial support for scratch memory tracking Buffering is implemented * cmake formatting (cmake-format) (#525) Co-authored-by: MythreyaK <MythreyaK@users.noreply.github.com> * source formatting (clang-format v11) (#524) Co-authored-by: MythreyaK <MythreyaK@users.noreply.github.com> * Add callback tracing for scratch Fixed the error where scratch tracking init was called irrespective of whether any client requested for it * Apply suggestions from code review Co-authored-by: Jonathan R. Madsen <jrmadsen@users.noreply.github.com> * Fix tools api copy/update Table were saved/updated incorrectly in previous commit. Also adds passing user data through the callback * Fix OpKind sequence for scratch tracking Previously scratch was using OpKind from rocprofiler-sdk, but templates were instantiated using API ID. These differ by 1 * Integration tests for scratch reporting Added buffer and callback integration tests for scratch reporting * source formatting (clang-format v11) (#550) Co-authored-by: MythreyaK <26112391+MythreyaK@users.noreply.github.com> * cmake formatting (cmake-format) (#551) Co-authored-by: MythreyaK <26112391+MythreyaK@users.noreply.github.com> * python formatting (black) (#549) Co-authored-by: MythreyaK <26112391+MythreyaK@users.noreply.github.com> * CI fixes * source formatting (clang-format v11) (#554) Co-authored-by: MythreyaK <26112391+MythreyaK@users.noreply.github.com> * Update api Rebase on main and updates based on PR feedback * Update scratch reporting and address PR comments - Added agent id to buffer records - Updated `test_internal_correlation_ids` - Is almost identical to one in async-copy - Updated scratch test to check for agent id - Updated queue id serialization in callback records (prints handle as nested key) - Remove `marker_api_traces` from scratch `test_internal_correlation_ids` validation test - Rename `amd_tools_api` to `scratch_memory` - Added doxygen comments - Remove scratch callback from `tool.cpp` - Replace assert with `LOF_IF` in `scratch_memory.cpp` * Update tools table Changed to match up with changes to hsa tables in main branch * Rework scratch memory structure * Update tests - Added suggestions from PR review, and updated tests accordingly * Misc cleanup * Update scratch test As of Apr 4th, `hsa_amd_agent_set_async_scratch_limit` is disabled. Note, > This API: `hsa_amd_agent_set_async_scratch_limit` is currently > disabled. We need some changes in CP firmware to be able to do this > and these changes are not ready yet. > With the current code, you will also not get notifications for > alternate-scratch allocations because this feature has been disabled > while CP firmware is making additional changes > We are hoping to have that feature enabled by ROCm-6.3 * Minor update to lib/rocprofiler-sdk/internal_threading.* - delay destruction of shared_ptrs of the tasks to prevent rare (but possible) data race on the destruction of the shared_ptr --------- Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: MythreyaK <MythreyaK@users.noreply.github.com> Co-authored-by: Jonathan R. Madsen <jrmadsen@users.noreply.github.com> Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com> |
||
|
|
41c0ddd72d |
Convert LOG() -> ROCP_X logging macros. (#695)
* Convert LOG() -> ROCP_X logging macros. This patch converts the LOG() macro to the ROCP_X logging macros. There are the following levels of logs. Logs whos expressions are not evaluated unless the log level is enabled: ROCP_TRACE - VLOG(2) (enabeled by env variable GLOG_v=2) ROCP_INFO - VLOG(1) (enabeled by env variable GLOG_v=1) Logs whos expressions are always evaluated: ROCP_WARNING - LOG(WARNING) ROCP_ERROR - LOG(ERROR) ROCP_FATAL - LOG(FATAL) ROCP_DFATAL - DLOG(FATAL) (only fatal in debug mode) * source formatting (clang-format v11) (#696) Co-authored-by: bwelton <1683479+bwelton@users.noreply.github.com> * Minor fix * Fixes for VLOG before main * fix vmodule * source formatting (clang-format v11) (#718) Co-authored-by: bwelton <1683479+bwelton@users.noreply.github.com> * memory leak fix * Vlog change --------- Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: bwelton <1683479+bwelton@users.noreply.github.com> |
||
|
|
939e23e9d1 |
Stop all client contexts prior to finalization (#721)
* Stop all client contexts prior to finalization
* Update lib/common/container/static_vector.hpp
- improve emplace_back for non-{move,copy}-assignable object
* Update samples/intercept_table/client.cpp
- improve robustness against static object destruction
* Update lib/rocprofiler-sdk/context/context.cpp
- change storage of registered context array
- stable_vector of optional contexts
- common::static_object wrapper around stable_vector
* Update samples/intercept_table/client.cpp
- use variable template for underlying function pointer
|
||
|
|
bc9f86ec62 |
Update HSA copy table (#687)
- two copies of HSA table: internal and tracing - internal is used to invoke HSA function without any possibility of triggering tracing, etc. |
||
|
|
7b6d3c70bd |
Shared Library Constructor (rocprofv3 deadlock fix) (#599)
* Moved tests/apps to tests/bin * Renamed cmake project in tests/bin * Update samples - Use ROCPROFILER_DEFAULT_FAIL_REGEX - tweaks to stdout messages * Update tests - Use ROCPROFILER_DEFAULT_FAIL_REGEX * Add tests/lib - libraries with HIP code * Update PTL submodule - remove atexit delete of thread_id_map * Update cmake/rocprofiler_options.cmake - Set ROCPROFILER_DEFAULT_FAIL_REGEX * Update common lib: env + logging - improved customization of logging settings - default to disabling logging to files - install failure handler for rocprofv3 - set_env support in environment.* * Add lib/rocprofiler-sdk/shared_library.cpp - shared library constructor * Update lib/rocprofiler-sdk-tool/tool.cpp - destructor thread safety - convert callback_name_info and buffered_name_info to pointers - install failure handler for logging * Add tests/bin/hip-in-libraries - hip-in-libraries is an exe which uses two shared libraries where each shared library contains HIP kernels - used for testing deadlocking within __hipRegisterFatBinary * Update bin/rocprofv3 - reorganized the env variables - use exec to launch command - set ROCPROFILER_LIBRARY_CTOR=1 * Add tests/rocprofv3/tracing-hip-in-libraries - uses hip-in-libraries exe for exe which uses shared libraries to launch HIP kernels * Update bin/rocprofv3 - fix counter collection (no exec) * Update lib/rocprofiler-sdk-tool/tool.cpp - replace "Kernel-Name" with "Kernel_Name" * Update lib/rocprofiler-sdk/registration.cpp Use RTLD_LOCAL instead of RTLD_GLOBAL for env libraries * Update tests/rocprofv3 - replace "Kernel-Name" with "Kernel_Name" * Update tests - vector-ops (bin) stream syncs + runs with 4 queues per device - improve counter-collection/input1 validation - rocprofv3/tracing-hip-in-libraries does not do sys-trace - improved validation script for tracing-hip-in-libraries - updated dispatch_callback in json-tool.cpp following reworking of prototypes for counter collection * Update samples/counter_collection - updated dispatch_callback(s) and record_callback(s) following reworking of prototypes * Update bin/rocprofv3 - reorganized help menu - added options for sub-HSA tables - added --hip-runtime-trace - changed --hip-trace to include --hip-compiler-trace * Update lib/rocprofiler-sdk-tool - improved kernel filtering - removed arch_vgpr, accum_vgpr, sgpr code (in rocprofiler-sdk) - fixed issue with counter-collection w/o tracing - added support for fine grained HSA API tracing - removed directly linking to HSA-runtime * Update lib/rocprofiler-sdk/agent.cpp - rocp_agents != hsa_agents is non-fatal when ROCPROFILER_BUILD_CI=OFF (CMake option) * GPR (vector and scalar) info in kernel symbol data - rocprofiler_callback_tracing_code_object_kernel_symbol_register_data_t contains general purpose register info * Header include order fix - Include repo headers first - Third party library headers next - standard library headers last * Update dispatch profiling public API - introduce rocprofiler_profile_counting_dispatch_data_t - change signature of rocprofiler_profile_counting_dispatch_callback_t and rocprofiler_profile_counting_record_callback_t - provide rocprofiler_user_data_t pointer in dispatch callback - provide rocprofiler_user_data_t value (from dispatch cb) in record callback * Update tests/bin/CMakeLists.txt - fix add_subdirectory(hip-in-libraries) order * Update VERSION - bump to 0.2.0 in prep for AFAR |
||
|
|
b0a88d9124 |
Update registration client search (#569)
* Update registration client search - Search ROCP_TOOL_LIBRARIES before dlopen search - Fatal error if ROCP_TOOL_LIBRARIES entry does not contain rocprofiler_configure symbol - Use RTLD_DEFAULT and RTLD_NEXT to (potentially) find first two instances of rocprofiler_configure - if no rocprofiler_configure found via RTLD_NEXT, do not do extensive search via link map * _GNU_SOURCE instead of GNU_SOURCE * Clang-tidy fix |
||
|
|
1bb94add11 |
Fix rocprofiler_iterate_callback_tracing_kind_operation_args for HIP compiler callbacks (#532)
* Fix HIP compiler iterate args
- `include/rocprofiler-sdk/hip/api_args.h`
- replace struct fields named "f" with "func"
- replace hip stream fields named "hStream" with "stream"
- `lib/rocprofiler-sdk/callback_tracing.cpp`
- iterate_args for HIP compiler table
- `lib/rocprofiler-sdk/registration.cpp`
- fix warning about roctx num_tables
- `lib/rocprofiler-sdk/hip/hip.def.cpp`
- replace struct fields named "f" with "func"
- replace hip stream fields named "hStream" with "stream"
- `lib/rocprofiler-sdk/{hip,hsa,marker}/utils.hpp`
- improve `stringize_impl`
- `lib/rocprofiler-sdk/hsa/code_object.cpp`
- remove stale commented out code
- `lib/rocprofiler-sdk/hsa/queue_controller.*`
- destory_queue -> destroy_queue
- `tests/tools/json-tool.cpp`
- improve parallelism in tool_tracing_callback
- serialize the marker api args
- only invoke rocprofiler_iterate_callback_tracing_kind_operation_args in exit phase
- `samples/counter_collection/CMakeLists.txt`
- reduce timeout on tests to 120 seconds
* Update lib/rocprofiler-sdk/hsa/utils.hpp
- disable dereference of double pointer in stringize_impl
* Update lib/common
- indirection_level in mpl.hpp
- stringize_arg.hpp
* Rework rocprofiler_iterate_callback_tracing_kind_operation_args
- provide more information in rocprofiler_callback_tracing_operation_args_cb_t
- support specifying the dereference level to account for output paramters
|
||
|
|
875f53b608 |
Correlation ID Retirement + misc (#527)
* Correlation ID Retirement
- include/rocprofiler-sdk/buffer_tracing.h
- add rocprofiler_buffer_tracing_correlation_id_retirement_record_t
- include/rocprofiler-sdk/fwd.h
- ROCPROFILER_BUFFER_TRACING_CORRELATION_ID_RETIREMENT
- lib/rocprofiler-sdk/buffer_tracing.cpp
- kind string for correlation id retirement
- lib/rocprofiler-sdk/buffer.hpp
- emplace returns bool
- lib/rocprofiler-sdk/registration.cpp
- pass lib_instance to copy_table functions
- lib/rocprofiler-sdk/context/context.*
- update correlation_id struct
- make ref_count private
- {get,add,sub}_ref_count() functions
- sub_ref_count() performs correlation id retirement
- use stack for "latest" thread-local correlation id
- lib/rocprofiler-sdk/hip/hip.*
- migrate to new {get,add,sub}_ref_count() for correlation ids
- return in iterate_args
- handle table instance in copy_table
- lib/rocprofiler-sdk/hsa/hsa.*
- migrate to new {get,add,sub}_ref_count() for correlation ids
- return in iterate_args
- handle table instance in copy_table
- lib/rocprofiler-sdk/marker/marker.*
- migrate to new {get,add,sub}_ref_count() for correlation ids
- return in iterate_args
- handle table instance in copy_table
- lib/rocprofiler-sdk/hsa/async_copy.cpp
- migrate to new {get,add,sub}_ref_count() for correlation ids
- handle table instance in async_copy_init / async_copy_save
- lib/rocprofiler-sdk/hsa/queue.cpp
- migrate to new {get,add,sub}_ref_count() for correlation ids
- tweak to external correlation id mapping in WriteInterceptor
- tests/async-copy-tracing/validate.py
- check retired_correlation_ids
- tests/common/serialization.hpp
- support rocprofiler_buffer_tracing_correlation_id_retirement_record_t
- tests/kernel-tracing/validate.py
- check retired_correlation_ids
- tests/common/CMakeLists.txt
- perfetto external project
- tests/common/perfetto.hpp
- perfetto categories + aliases
- add_perfetto_annotation
- metaprogramming helpers
- tests/tools/CMakeLists.txt
- link to tests-perfetto
- tests/tools/json-tool.cpp
- demangling functions
- serialization of marker API callback args
- reduce parallel bottleneck in tool_tracing_callback
- support correlation id retirement
- Multiple threads for buffers
- Support ROCPROFILER_TOOL_CONTEXTS_EXCLUDE env variable
- write_perfetto() function
* Update tests/rocprofv3/tracing/validate.py
- tweak test_hsa_api_trace
* Update PTL submodule
- fixes for data race during destruction of task
* Update lib/rocprofiler-sdk/buffer.*
- unique_buffer_vec_t uses std::unique_ptr instead of allocator::unique_static_ptr_t
* Reduce timeouts in counter collection samples [skip ci]
* Update tests/tools/json-tool.cpp
- tweak demangle(string_view, int*) -> demangle(string_view, int&)
* Update lib/rocprofiler-sdk/hsa/async_copy.cpp
- move sub_ref_count() to later in async_copy_handler to delay retirement slightly more
|
||
|
|
3f39339926 |
API Tracing Overhaul (#437)
* Update include/rocprofiler-sdk/hsa/*
- split HSA API IDs into separate enumerations
- add support for finalize ext table
* Update include/rocprofiler-sdk/hip/*
- remove compiler_api_args.h
- rocprofiler_hip_api_args_t contains all for HIP runtime and HIP compiler
- ROCPROFILER_HIP_API_ID_ -> ROCPROFILER_HIP_RUNTIME_API_ID_
* Update include/rocprofiler-sdk/marker/table_api_id.h
- ROCPROFILER_MARKER_API_TABLE_ID_ -> ROCPROFILER_MARKER_TABLE_ID_
* Update include/rocprofiler-sdk/*/table_api_id.h
- table_api_id.h -> table_id.h
* Update include/rocprofiler-sdk/*/table_api_id.h
- table_api_id.h -> table_id.h
* Update include/rocprofiler-sdk/fwd.h
- ROCPROFILER_CALLBACK_TRACING_HSA_API split into 4 enum values:
- ROCPROFILER_CALLBACK_TRACING_HSA_CORE_API
- ROCPROFILER_CALLBACK_TRACING_HSA_AMD_EXT_API
- ROCPROFILER_CALLBACK_TRACING_HSA_IMAGE_EXT_API
- ROCPROFILER_CALLBACK_TRACING_HSA_FINALIZE_EXT_API
- ROCPROFILER_BUFFER_TRACING_HSA_API split into 4 enum values:
- ROCPROFILER_BUFFER_TRACING_HSA_CORE_API
- ROCPROFILER_BUFFER_TRACING_HSA_AMD_EXT_API
- ROCPROFILER_BUFFER_TRACING_HSA_IMAGE_EXT_API
- ROCPROFILER_BUFFER_TRACING_HSA_FINALIZE_EXT_API
- rocprofiler_callback_tracing_code_object_operation_t renamed to rocprofiler_code_object_operation_t (more consistent)
- doxygen updates
* Update include/rocprofiler-sdk/buffer_tracing.h
- improved doxygen comments
- removed unused rocprofiler_buffer_tracing_queue_scheduling_record_t
- removed unused rocprofiler_buffer_tracing_correlation_record_t
* Update include/rocprofiler-sdk/callback_tracing.h
- removed rocprofiler_callback_tracing_hip_compiler_api_data_t
- rocprofiler_hip_api_args_t and rocprofiler_hip_compiler_api_args_t were combined
- rocprofiler_hsa_api_retval_t and rocprofiler_hsa_compiler_api_retval_t were combined
* Update lib/rocprofiler-sdk/hsa/*
- utils.hpp
- formatters for hsa_ext_program_t and hsa_ext_control_directives_t
- defines.hpp
- removed variadic macros from lib/common/defines.hpp
- HSA_API_META_DEFINITION, HSA_API_INFO_DEFINITION_0, HSA_API_INFO_DEFINITION_V specialize on table id
- async_copy.cpp
- ROCPROFILER_HSA_API_ID_* -> ROCPROFILER_HSA_AMD_EXT_API_ID_*
- add table id to templates
- improve async_copy_fini
- hsa.hpp
- add hsa_table_id_lookup
- add hsa_domain_info
- add table id to templates
- add copy_table function
- hsa.cpp
- add table id to templates
- require hsa tables to be trivial and standard layout
- remove set_data_args specialization for hsa_amd_memory_async_copy_rect
- implement copy_table function
- hsa.def.cpp
- update enums
* Update lib/rocprofiler-sdk/hip/*
- defines.hpp
- use lib/common/defines.hpp
- add hip_table_id_lookup to HIP_API_TABLE_LOOKUP_DEFINITION
- hip.hpp
- hip_table_id_lookup
- template iterate_args on table id
- templated copy_table and update_table
- hip.cpp
- replaced api_id_bounds with hip_domain_info
- templated iterate_args on table id
- templated copy_table and update_table
* Update lib/rocprofiler-sdk/marker/*
- defines.hpp
- use lib/common/defines.hpp
- marker.cpp
- updated enums
- marker.def.cpp
- updated enums
* Update lib/rocprofiler-sdk/tests
- common.hpp
- ROCPROFILER_CALL_EXPECT
- callback_data_ext
- update get_callback_tracing_names with new enums
- update get_buffer_tracing_names with new enums
- external_correlation.cpp
- support new HSA API enums
- intercept_table.cpp
- use test/common.hpp
- update to new HSA API enums
- registration.cpp
- support new HSA API enums
- naming.cpp
- validation for all get_ids(), get_names(), name_by_id(), id_by_name(), etc.
* Update lib/common
- defines.hpp
- Move IMPL_DETAIL_FOR_EACH_NARG, GET_ADDR_MEMBER_FIELDS, and GET_NAMED_MEMBER_FIELDS here
- used by HSA, HIP, and Marker
- static_object.hpp
- is_trivial_standard_layout static constexpr member function
- suppress register_static_dtor when is_trivial_standard_layout
* Update lib/rocprofiler-sdk/hsa/code_object.*
- name_by_id
- id_by_name
- get_names
- get_ids
* Update lib/rocprofiler-sdk/registration.cpp
- Update rocprofiler_set_api_table for HSA
* Update lib/rocprofiler-sdk/callback_tracing.cpp
- Update for new HSA enums
- Rework to use switch statement
- rocprofiler_query_callback_tracing_kind_operation_name
- rocprofiler_iterate_callback_tracing_kind_operations
- rocprofiler_iterate_callback_tracing_kind_operation_args
* Update lib/rocprofiler-sdk/buffer_tracing.cpp
- Update for new HSA enums
- Rework to use switch statement
- rocprofiler_query_buffer_tracing_kind_operation_name
- rocprofiler_iterate_buffer_tracing_kind_operations
* Update lib/rocprofiler-sdk-tool
- helper.cpp
- update get_buffer_id_names with new enums
- update get_callback_id_names with new enums
- tools.cpp
- update to use new HSA enums
* Update samples/common
- added call_stack.hpp
- source_location struct
- call_stack_t alias
- print_call_stack function
- added name_info.hpp
- utils for getting buffer/callback domain and operation names
* Update samples/api_buffered_tracing/client.cpp
- use samples/common/call_stack.hpp
- use samples/common/name_info.hpp
- update for new HSA enums
* Update samples/api_callback_tracing/client.cpp
- use samples/common/call_stack.hpp
- use samples/common/name_info.hpp
- update for new HSA enums
* Update tests/tools/json-tool.cpp
- update for new HSA enums
* Update tests/rocprofv3/tracing/validate.py
- update for new HSA domain names
* Update samples/counter_collection/main.cpp
- reduce number of kernels to 50,000 since 200,000 causes issues with thread sanitizer
|
||
|
|
9efafc4d23 |
Split ROCTx API tables and update intercept table API (#421)
* Update include/rocprofiler-sdk
- buffer_tracing.h
- fix doxygen for rocprofiler_buffer_tracing_hip_api_record_t
- update doxygen for rocprofiler_buffer_tracing_marker_api_record_t
- remove unused marker_id field
- fwd.h
- Split ROCPROFILER_CALLBACK_TRACING_MARKER_API into ROCPROFILER_CALLBACK_TRACING_MARKER_{CORE,CONTROL,NAME}_API
- Split ROCPROFILER_BUFFER_TRACING_MARKER_API into ROCPROFILER_BUFFER_TRACING_MARKER_{CORE,CONTROL,NAME}_API
- split rocprofiler_runtime_library_t into rocprofiler_runtime_library_t and rocprofiler_intercept_table_t
- after split of ROCTx into 3 tables, specifying rocprofiler_at_internal_thread_create became confusing
* Update include/rocprofiler-sdk-roctx/api_trace.h
- Split into three tables: core, control, and name
- core: what it sounds like
- control: functions for controling the profiler
- name: functions for giving resources names
* Update lib/rocprofiler-sdk-roctx/roctx.cpp
- modifications following split into multiple tables
* Update lib/rocprofiler-sdk/marker/*
- modifications following split of ROCTx API into multiple intercept tables
* Update lib/rocprofiler-sdk/tests
- common.hpp
- add enums to get_callback_tracing_names() and get_buffer_tracing_names()
- intercept_table.cpp
- update test to use rocprofiler_intercept_table_t (and enums) instead of rocproifler_runtime_library_t
- update OR combos tested
- roctx.cpp
- updates following split of ROCTx API table into multiple tables
- use simplified specification of control API
* Update lib/rocprofiler-sdk
- buffer_tracing.cpp
- Updates for ROCPROFILER_BUFFER_TRACING_MARKER_{CORE,CONTROL,NAME}_API enum values
- callback_tracing.cpp
- Updates for ROCPROFILER_CALLBACK_TRACING_MARKER_{CORE,CONTROL,NAME}_API enum values
- intercept_table.hpp
- notify_runtime_api_registration -> notify_intercept_table_registration
- intercept_table.cpp
- updates for new rocprofiler_intercept_table_t enum and new ROCTx tables
- registration.cpp
- updates for new rocprofiler_intercept_table_t enum and new ROCTx tables
- updates for notify_runtime_api_registration -> notify_intercept_table_registration
* Update lib/rocprofiler-sdk-tool
- helper.cpp
- Updates for new enums in get_callback_id_names() and get_buffer_id_names()
- tool.cpp
- migrate to new enums for split ROCTx tables
- use simplified split for control table vs. core+name tables
* Update samples/{api_callback_tracing,intercept_table}
- intercept_table/client.cpp
- rocprofiler_runtime_library_t -> rocprofiler_intercept_table_t
- api_callback_tracing/client.cpp
- Updates for new enums in get_callback_id_names()
- use simplified split for control table vs. core+name tables
- migrate to new enums for split ROCTx tables
* Update tests
- rocprofv3/tracing/validate.py
- handle new marker domain names
- tools/json-tool.cpp
- Updates for new enums in get_callback_id_names() and get_buffer_id_names()
- use simplified split for control table vs. core+name tables
- migrate to new enums for split ROCTx tables
* Update tests/rocprofv3/tracing/CMakeLists.txt
- fix FAIL_REGULAR_EXPRESSION for rocprofv3-test-trace-execute
* Update lib/rocprofiler-sdk-tool/{output_file,tool}.*
- logging in output_file dtor
- support stdout/stderr
* Update lib/common/container/record_header_buffer.hpp
- reduce probability of is_empty() returning true while emplace is happening
* Update lib/rocprofiler-sdk-tool/tool.cpp
- logging for buffered_tracing_callback
- counter collection uses CSV encoder
* Update bin/rocprofv3
- remove -i flag from help menu
|
||
|
|
c641749fe6 |
HIP API Tracing (#357)
* Update include/rocprofiler-sdk/hip*
- updates for intercept table
* Update lib/common/units.hpp
- clang-tidy fixes
* Add lib/rocprofiler-sdk/hip
- tracing implementation for the HIP intercept table
* Update source/lib/rocprofiler-sdk/CMakeLists.txt
- add_subdirectory(hip)
* Update source/lib/rocprofiler-sdk/hsa
- offset function in hsa_api_info<Idx>
- remove report_activity, set_callback
- Tweak HSA_API_TABLE_LOOKUP_DEFINITION
* Update lib/rocprofiler-sdk/hip
- rocprofiler::hip::copy_table
- stringize_impl print dereferenced pointers when possible
* Update lib/rocprofiler-sdk/hsa/utils.hpp
- stringize_impl print dereferenced pointers when possible
* Update lib/rocprofiler-sdk/tests/intercept_table.cpp
- remove failures for intercepting HIP API tables
* Update include/rocprofiler-sdk/fwd.h
- add ROCPROFILER_HIP_RUNTIME_LIBRARY (== ROCPROFILER_HIP_LIBRARY)
- add ROCPROFILER_HIP_COMPILER_LIBRARY
* Update lib/rocprofiler-sdk/buffer_tracing.cpp
- Support ROCPROFILER_BUFFER_TRACING_HIP_API in rocprofiler_query_buffer_tracing_kind_operation_name
- Support ROCPROFILER_BUFFER_TRACING_HIP_API in rocprofiler_iterate_buffer_tracing_kind_operations
* Update lib/rocprofiler-sdk/callback_tracing.cpp
- Support ROCPROFILER_CALLBACK_TRACING_HIP_API in rocprofiler_query_callback_tracing_kind_operation_name
- Support ROCPROFILER_CALLBACK_TRACING_HIP_API in rocprofiler_iterate_callback_tracing_kind_operations
- Support ROCPROFILER_CALLBACK_TRACING_HIP_API in rocprofiler_iterate_callback_tracing_kind_operation_args
* Update lib/rocprofiler-sdk/intercept_table.cpp
- support HipDispatchTable and HipCompilerDispatchTable
* Update lib/rocprofiler-sdk/internal_threading.cpp
- Support ROCPROFILER_HIP_COMPILER_LIBRARY
* Update lib/rocprofiler-sdk/registration.cpp
- Support "hip" and "hip_compiler" in rocprofiler_set_api_table
- Added some extra logging
* Update samples/api_{buffered,callback}_tracing
- Modifications to demonstrate HIP API tracing
* Update tests/kernel-tracing
- Modifications to handle/test HIP API tracing
* Separate HIP tracing from HIP compiler tracing
* Fix installation of include/rocprofiler-sdk/hip/*
- add compiler and table headers to install
* Fixes to HIP interception
- hip_api_trace.hpp was updated a bit
- removed hipGetDeviceProperties (generic)
- added hipGetDevicePropertiesR0600
- added hipGetDevicePropertiesR0000
- removed hipRegisterTracerCallback
- reordered hipCreateChannelDesc, hipExtModuleLaunchKernel, hipHccModuleLaunchKernel
- added hipDrvGraphAddMemsetNode
- static asserts in hsa_api_info ensuring ordering of pointers
* Update lib/rocprofiler-sdk/hip/hip.*
- use size_t instead of rocprofiler_hip_table_api_id_t as non-type template parameter (smaller binary)
- separated out population of callback_context_data and buffered_context_data into non-template function (significantly smaller binary)
* Update lib/rocprofiler-sdk/hsa/hsa.*
- separated out population of callback_context_data and buffered_context_data into non-template function (significantly smaller binary)
* Update test/kernel-tracing/validate.py
- does not expect any hip_api_traces until libamdhip.so actually starts using rocprofiler-register
* Update tests/tools/json-tool.cpp
- fix context associated with "HIP_API_CALLBACK"
* Update external/CMakeLists.txt
- move misc variables to top of CMakeLists.txt so they apply to all external subprojects
- BUILD_TESTING (OFF)
- BUILD_SHARED_LIBS (OFF)
- BUILD_OBJECT_LIBS (OFF)
- BUILD_STATIC_LIBS (ON)
- CMAKE_POSITION_INDEPENDENT_CODE (ON)
- CMAKE_VISIBILITY_INLINES_HIDDEN (ON)
- CMAKE_CXX_VISIBILITY_PRESET (hidden)
- disable using libunwind in glog
* Update lib/rocprofiler-{sdk,sdk-tool}/CMakeLists.txt
- remove explicit setting of SKIP_BUILD_RPATH
* Update CMakeLists.txt
- set high-level CMAKE_BUILD_RPATH and CMAKE_INSTALL_RPATH_USE_LINK_PATH
* Update tests/CMakeLists.txt
- include(GNUInstallDirs)
* Update samples/CMakeLists.txt
- include(GNUInstallDirs)
* Update include/rocprofiler-sdk/hip/{compiler_api,api}_args.h
- remove extern "C" due to incompatibility b/t empty struct in C (size 0) vs. empty struct in C++ (size 1)
* Update lib/rocprofiler-sdk/hip/details/ostream.hpp
- clang-tidy fixes
* Update cmake/rocprofiler_linting.cmake
- add a feature for clang tidy exe
* Update lib/rocprofiler-sdk/hip/hip.cpp
- use recursion instead of fold expression due to clang-tidy errors (maximum nesting level exceeded)
* Update lib/rocprofiler-sdk/buffer_tracing.cpp
- fix merge
* Update lib/rocprofiler-sdk/callback_tracing.cpp
- fix merge
* Update bin/rocprofv3
- args for marker, HIP runtime, and HIP compiler tracing
* Update tests/apps/simple-transpose
- use roctx
* Update tests/rocprofv3/tracing
- validate marker API data
* Update lib/rocprofiler-sdk-tool
- support for HIP runtime, HIP compiler, marker API
* Update queue/queue_controller/registration/utility
- call hsa::queue_controller_fini() during finalization
- add a yield function to common/utility.hpp
- implements a thread yield + sleep
- add a sync function to Queue class
- add a iterate_queues member function to QueueController
- this is used to sync each queue during queue_controller_fini()
* Fix data races: queue/context/stable_vector
- stable_vector::emplace_back returns reference
- correlation id map uses stable_vector
- queue_info_session has explicit fields for queue id, hsa agent, rocp agent
- use hsa::get_table() in AsyncSignalHandler
- WriteInterceptor does not use TLS for context array
* Update lib/rocprofiler-sdk/hsa/hsa.*
- static object for API subtables
- accessors for API subtables
- google tests for HSA API subtables
* Update lib/rocprofiler-sdk/hsa/{queue,async_copy}.cpp
- use HSA subtable accessors
* Update rocprofiler_memcheck and CI workflow
- use GCC 13 instead of GCC 11 due to suspected false positives in thread sanitizer
- GCC 13 uses libtsan.so.2
* Update CI workflow
* Update lib/rocprofiler-sdk/counters/{metrics,counters}
- fix possibly dangling reference to a temporary from gcc-13
* Update thread-sanitizer-suppr.txt
- Ignore data races originating in hsa-runtime library
* Update cmake/rocprofiler_memcheck.cmake
- Deduce the sanitizer library to preload by compiling an application and extracting the linked sanitizer library
* Update tests/rocprofv3/tracing/CMakeLists.txt
- add csv files to REQUIRED_FILES and ATTACH_ON_FAIL in validate test
* Update lib/common/container/record_header_buffer.hpp
- fix data race identified by gcc v13 and libtsan.so.2
* Update hip API id, args, and def
- remove hipDrvGraphAddMemsetNode (not part of ROCm 6.0
* Update lib/common/container/record_header_buffer.hpp
- fix deadlock in save/read/reset
* Update source/docs/CMakeLists.txt
- remove COMMAND_ERROR_IS_FATAL ANY to allow for printing of stdout/stderr
* Update lib/rocprofiler-sdk/hip/details/ostream.hpp
- remove overloads for HIP_MEMSET_NODE_PARAMS
* Update docs/CMakeLists.txt
- use find_program for shell instead of hardcoded /bin/bash
|
||
|
|
1f4cf1aa39 |
Tools update (#397)
* Srnagara/tool counters collect (#331) * Adding counter collection capability to tools * Adding counter collection feature to tools * Adding counter collection capability to tools * Fixing merge down issues * Small tool fixes for build + prevent profile realloc * Reproducing the counter name query issue in buffered callback * Minor fix for init order + sample that directly uses sdk-tool for debug purposes * Adding a temporary fix to print the counter names * Fixing the output file name and reverting the changes of caching the profile config * Fixing SGPR_Count value * cleaning up debug prints * Adding header to counter collection file * Adding kernel filtering support * Remove threading * Cleaning up the code * Removing redundant prints * Revert "Remove threading" This reverts commit 05c58fb9de826e92cf8d2e3d1c31d5578525dcb4. * Revert "Cleaning up the code" This reverts commit 1d964882bf2396dee8ad020cbb6c83b36e0674e9. * Changing the tools code to align with init-order fix * cmake formatting (cmake-format) (#335) Co-authored-by: SrirakshaNag <SrirakshaNag@users.noreply.github.com> * source formatting (clang-format v11) (#336) Co-authored-by: SrirakshaNag <SrirakshaNag@users.noreply.github.com> * Adding support for async memory copy * source formatting (clang-format v11) (#391) Co-authored-by: SrirakshaNag <SrirakshaNag@users.noreply.github.com> * Fixing header typo * Fixing tool_fini * Replaceing the direction and kind fields values with description * Update lib/rocprofiler-sdk-tool/helper.cpp - Remove use of VLA * Update lib/rocprofiler-sdk-tool/tool.cpp - Formatting * Migrate common/config.* to rocprofiler-sdk-tool * Update lib/rocprofiler-sdk-tool/tool.cpp - fix clang-tidy issues * source formatting (clang-format v11) (#392) Co-authored-by: jrmadsen <jrmadsen@users.noreply.github.com> * Update lib/common/mpl.hpp - is_string_type / is_string_type_impl for deducing if type is a string type * Update include/rocprofiler-sdk/fwd.h - ROCPROFILER_BUFFER_TRACING_MEMORY_COPY_NONE starts at zero * Update lib/rocprofiler-sdk/hsa/async_copy.* - functions for operation ids and names * Update lib/rocprofiler-sdk/buffer_tracing.cpp - support iterating and getting names for ROCPROFILER_BUFFER_TRACING_MEMORY_COPY * Update lib/rocprofiler-sdk-tool/config.* - env ROCPROFILER_ prefix -> ROCPROF_ prefix - add support for memory copy tracing, counter collection, etc. * Update lib/rocprofiler-sdk-tool/helper.* - removed TracerFlushRecord - removed cxa_demangle (use one in common library) - removed GetCounterNames (handled in config) - removed GetKernelNames (handled in config) * Add lib/rocprofiler-sdk-tool/output_file.* - separate out get_output_stream function and output_file struct from tool.cpp * Add lib/rocprofiler-sdk-tool/csv.hpp - write_csv_entry automatically quotes strings - csv_encoder struct enforces correct number of columns * Update lib/rocprofiler-sdk-tool/CMakeLists.txt - add new files * Update lib/rocprofiler-sdk-tool/tool.cpp - update construction of output_file class - add kernel_symbol_data for serializing kernel trace data - use config instead of env lookups - optimize counter collection profile config lookup/creation * Update bin/rocprofv3 - rocprofv3 --help exits with 0 (as it should) - command-line arg for memory copy tracing - command-line arg for mangled kernels - command-line arg for truncated kernels - env ROCPROFILER_ prefix -> env ROCPROF_ prefix * Update tests/async-copy-tracing/validate.py - update test_async_copy_direction to new enum values * Update tests/kernel-tracing/validate.py - update test_async_copy_direction to new enum values * Update tests/tools/json-tool.cpp - add ROCPROFILER_BUFFER_TRACING_MEMORY_COPY to supported buffer_name_info * Update samples/counter_collection/{CMakeLists.txt,main.cpp} - remove counter-collection-sdk-tool * Update .github/workflows/docs.yml - fix paths triggering running the workflow --------- Co-authored-by: Benjamin Welton <bewelton@amd.com> Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: SrirakshaNag <SrirakshaNag@users.noreply.github.com> Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com> Co-authored-by: jrmadsen <jrmadsen@users.noreply.github.com> * adding counter collection support * Adding counter collection test * changing directory structure of counter collection tests * Fixing test path for rocprofv3 * Adding hsa-tracing basic test * cmake formatting (cmake-format) (#362) Co-authored-by: bgopesh <bgopesh@users.noreply.github.com> * counter collection tests drop2 * fixing hsa-trace test for rocprofv3 path * python formatting (black) (#371) Co-authored-by: bgopesh <bgopesh@users.noreply.github.com> * both counter colleciton and tracing should work together * Fixing rocprofv3 path * Attempt to fix Segfault with AddressSanitizer * fixing sanitizer segfault * Update rocprofv3 * Update lib/rocprofiler-sdk-tool/README.md - update env variables * Update lib/rocprofiler-sdk/buffer_tracing.cpp - return ROCPROFILER_STATUS_BUFFER_NOT_FOUND if buffer tracing service is configured with invalid buffer * Update lib/rocprofiler-sdk-tool/tool.cpp - designated hsa API trace buffer * Update tests/hsa-tracing/CMakeLists.txt - Fix environment * Update rocprofv3 - do not override HSA_TOOLS_LIB - support ROCPROF_PRELOAD - LD_PRELOAD librocprofiler-sdk.so * Restructure tests directory - move all rocprofv3 integration tests into subfolder * Update cmake/Templates/rocprofiler-sdk/config.cmake.in - create rocprofiler-sdk::rocprofv3 cmake target * Update tests/rocprofv3/hsa-tracing - improve validate.py - convert input to dict via csv.DictReader * Update tests/apps/CMakeLists.txt - fix build rpath for simple-transpose * Update cmake/rocprofiler_memcheck.cmake - prefer libtsan.so.0 * Update tests/rocprofv3/hsa-tracing - move to tests/rocprofv3/tracing - include kernel tracing and memory copy tracing * Update lib/rocprofiler-sdk-tool/tool.cpp - normalize "_ID" vs. "_Id" in CSV column names (use "_Id") * Update lib/rocprofiler-sdk/buffer.{hpp,cpp} - change signature of buffer::get_buffers() - buffer::get_buffers() uses static_object * Update lib/rocprofiler-sdk/context/context.cpp - update usage of buffer::get_buffers() - now returns pointer * Update lib/rocprofiler-sdk/tests/buffer.cpp - update to change for signature of buffer::get_buffers() * Update tests/rocprofv3/tracing/CMakeLists.txt - use %argt% with -d argument * Update lib/rocprofiler-sdk-tool/tool.cpp - use atexit for finalization * Update tests/rocprofv3/tracing/CMakeLists.txt - tweaked name of tests * Update lib/rocprofiler-sdk/hsa/async_copy.* - async_copy_fini + reference counting signals * Update lib/rocprofiler-sdk/registration.cpp - invoke hsa::async_copy_fini() to prevent data race on signals --------- Co-authored-by: SrirakshaNag <104580803+SrirakshaNag@users.noreply.github.com> Co-authored-by: Benjamin Welton <bewelton@amd.com> Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: SrirakshaNag <SrirakshaNag@users.noreply.github.com> Co-authored-by: gobhardw <gopesh.bhardwaj@amd.com> Co-authored-by: bgopesh <bgopesh@users.noreply.github.com> |
||
|
|
21dd088c8e |
ROCTx Library Tracing (#390)
* Update include/rocprofiler-sdk/marker/*
- Update rocprofiler_marker_api_args_t for all API functions
- Add ROCPROFILER_MARKER_API_ID_roctxGetThreadId to rocprofiler_marker_api_id_t
* Update include/rocprofiler-sdk/marker/api_args.h
- fix include
* Update lib/common/mpl.hpp
- is_pair
- is_type_complete_v
* Update include/rocprofiler-sdk/marker/*
- fix rocprofiler_marker_api_retval_t
- add roctxGetThreadId to rocprofiler_marker_api_args_t
- fix type in enum: HsaDevice -> HsaAgent
- add table_api_id.h
* Update include/rocprofiler-sdk/marker.h
- include marker/table_api_id.h
* Update include/rocprofiler-sdk/buffer_tracing.h
- Buffer marker tracer records have begin and end timestamp
* Add lib/rocprofiler-sdk/marker
- tracing implementation for marker (roctx) library
* Update include/rocprofiler-sdk/{buffer_tracing,marker/table_api_id}.h
- rocprofiler_buffer_tracing_marker_record_t -> rocprofiler_buffer_tracing_marker_api_record_t
* Update lib/rocprofiler-sdk/buffer_tracing.cpp
- support for ROCPROFILER_BUFFER_TRACING_MARKER_API
* Update lib/rocprofiler-sdk/callback_tracing.cpp
- support for ROCPROFILER_CALLBACK_TRACING_MARKER_API
* Update lib/rocprofiler-sdk/intercept_table.cpp
- template instantiation for notify_runtime_api_registration
* Update lib/rocprofiler-sdk/registration.cpp
- enable roctx in rocprofiler_set_api_table
* Update lib/rocprofiler-sdk/marker/marker.cpp
- rocprofiler_buffer_tracing_marker_record_t -> rocprofiler_buffer_tracing_marker_api_record_t
* Update lib/rocprofiler/tests for roctx testing
- add roctx.cpp
- unit tests for roctx callback and buffer tracing
- support marker API in get_{buffer,callback}_tracing_names()
* Update lib/common/logging.cpp
- logging initialized message mentions env variable
* Update lib/common/mpl.hpp
- NOLINT for misc-definitions-in-headers
* Update lib/rocprofiler-sdk/tests/CMakeLists.txt
- include LD_LIBRARY_PATH in rocprofiler-lib-tests-shared tests
* Update lib/rocprofiler-sdk/registration.cpp
- client_library_vec_t is now vector of option<client_library>
- enables resetting the client_library after finalization
- removed acquiring registration lock when invoke_client_finalizers called via atexit
- this was causing some lock-order-inversion warnings (potential deadlock)
* Update lib/rocprofiler-sdk/agent.cpp
- model name for agent supports spaces
* Update tests/common/serialization.hpp
- add serialization support for marker tracing data structures
* Update tests/apps
- Add ROCTx markers into reproducible-runtime and transpose
* Update tests/tools/json-tools.cpp
- add marker tracing support
- remove strdup (no longer necessary)
* Update tests/kernel-tracing/validate.py
- validate marker API tracing data
* Update tests/async-copy-tracing/validate.py
- validate marker API tracing data
* Update cmake for load path resolution during testing
* Update tests/async-copy-tracing/CMakeLists.txt
- fix test LD_LIBRARY_PATH
* Update cmake/Templates/rocprofiler-sdk-roctx/config.cmake.in
- fix constructing rocprofiler-sdk-roctx::rocprofiler-sdk-roctx
|
||
|
|
dc8b8aa448 |
Cleanup + logging env variable (#387)
* [CP] Update tests/common/serialization.hpp
- remove duplication in rocprofiler_callback_tracing_code_object_load_data_t
* [CP] Update lib/rocprofiler-sdk/tests
- create common.hpp
- update registration.cpp to use common.hpp
* [CP] Add lib/common/logging.{hpp,cpp}
- generic init_logging function
* [CP] Update lib/rocprofiler-sdk/hsa/async_copy.cpp
- remove excess logging
* [CP] Update lib/rocprofiler-sdk/registration.cpp
- use common::init_logging(...)
- enforce ROCPROFILER_REGISTER_FORCE_LOAD in rocprofiler_force_configure
- logging updates in rocprofiler_set_api_table
* Update include/rocprofiler-sdk/buffer_tracing.h
- rocprofiler_buffer_tracing_marker_record_t -> rocprofiler_buffer_tracing_marker_api_record_t
* Update lib/common/utility.hpp
- remove active_capacity_gate
* Update lib/rocprofiler-sdk/tests/common.hpp
- fix get_{callback,buffer}_tracing_names()
* Update lib/rocprofiler-sdk/counters/xml/{basic,derived}_counters.xml
- add entries for gfx1102
|
||
|
|
936816f762 |
Async memory copy tracing (#317)
* Update samples/api_buffered_tracing/client.cpp
- support ROCPROFILER_BUFFER_TRACING_MEMORY_COPY
* Update include/rocprofiler-sdk/{buffer_tracing,fwd}.h
- update rocprofiler_buffer_tracing_memory_copy_record_t
- add ROCPROFILER_BUFFER_TRACING_MEMORY_COPY_HOST_TO_HOST to rocprofiler_memory_copy_operation_t
* Update lib/rocprofiler-sdk/context/context.*
- get_registered_contexts functions (local copy)
* Update tests/apps/reproducible-runtime/reproducible-runtime.cpp
- include some memory allocations and memory copies for better testing
* Update tests/common/serialization.hpp
- update serialization save function for rocprofiler_buffer_tracing_memory_copy_record_t
* Update lib/rocprofiler-sdk/hsa/hsa.*
- remove stale set_callback / activity_functor_t code
- forward decl hsa_api_meta
- template struct hsa_api_func for getting function return type and args
* Update tests/kernel-tracing/validate.py
- enforce memory_copies data size
- test timestamps in memory copies data
- improve internal and external correlation id validation
* Update lib/rocprofiler-sdk/hsa/defines.hpp
- HSA_API_META_DEFINITION macro
* Update lib/rocprofiler/hsa/rocprofiler-sdk/hsa/hsa.def.cpp
- HSA_API_META_DEFINITION specializations for async copy functions
* Add lib/rocprofiler-sdk/hsa/async_copy.{hpp,cpp}
- implements buffer memory tracing
* Update lib/rocprofiler-sdk/registration.cpp
- invoke rocprofiler::hsa::async_copy_init
* Update lib/rocprofiler-sdk/hsa/async_copy.cpp
- logging improvements
- improve hsa <-> rocp agent mapping
* Update lib/rocprofiler-sdk/hsa/async_copy.cpp
- load original signal in async signal handler before store_screlease
* Update lib/rocprofiler-sdk/hsa/async_copy.cpp
- use store_relaxed instead of store_screlease
* Update lib/rocprofiler-sdk/hsa/async_copy.cpp
- logging
* Update lib/rocprofiler-sdk/hsa/async_copy.cpp
- logging
* Update lib/rocprofiler-sdk/hsa/async_copy.cpp
- misc changes
* Update lib/rocprofiler-sdk/hsa/async_copy.cpp
- misc changes
* Update lib/rocprofiler-sdk/hsa/async_copy.cpp
- misc changes
* Update lib/rocprofiler-sdk/hsa/async_copy.cpp
- return function pointer instead of lambda
* Update reproducible-runtime.cpp
- device sync
* Update tests/apps/reproducible-runtime/reproducible-runtime.cpp
- use *Async variants of hipMalloc and hipMemcpy
* Update lib/rocprofiler-sdk/hsa/async_copy.cpp
- populate async data properly
* Update tests/kernel-tracing/validate.py
- verification of async copy direction
* Update tests/apps/reproducible-runtime/reproducible-runtime.cpp
- temporarily disable async memcpy functions
* Create tests/tools
- directory containing tool libraries used for collecting data in integration tests
* Update tests/kernel-tracing
- remove kernel-tracing-test-tool library (now rocprofiler-sdk-json-tool)
- update cmake, validate.py, conftest.py accordingly
* Add tests/async-copy-tracing
- integration test validating async copy tracing in transpose example
* Update tests/CMakeLists.txt
- updates for restructuring
* Revert tests/apps/reproducible-runtime
- restore code to semi-original state (no memory copying)
* Update tests/async-copy-tracing/validate.py
- fix comment in test_async_copy_direction
* Fix building tests against installation
|
||
|
|
6b374b8e68 |
Improve static singleton memory safety (#316)
* Update GitHub links * Update samples/api_buffered_tracing/client.cpp - check if initialized before forcing initialization * Add lib/common/static_object.* - template class for creating a static allocation in the binary which has all the properties of a heap allocated singleton but does not trigger leak sanitizers * Update include/rocprofiler-sdk/internal_threading.h - document return values * Update lib/rocprofiler-sdk/internal_threading.cpp - return codes from rocprofiler_create_callback_thread and rocprofiler_assign_callback_thread - use common::static_object for thread-pool object * Update lib/rocprofiler-sdk/agent.cpp - use common::static_object to store array of strings and their hashes * Update lib/rocprofiler-sdk/hsa/code_object.cpp - use common::static_object to store array of strings and their hashes to ensure strings exist until termination * Update lib/rocprofiler-sdk/registration.cpp - use common::static_object to store status and client libraries - update return values for rocprofiler_set_api_table * Update lib/rocprofiler-sdk/hsa/hsa.cpp - check registration::get_fini_status() in hsa_api_impl::functor<Idx>(args...) * Update lib/rocprofiler-sdk/context/context.cpp - using common::static_object for correlation id map |
||
|
|
9a0c84efa6 |
Use -sdk suffix and reset VERSION to 0.0.0 (#263)
* Fix find_package(rocprofiler) in build tree * Move include/rocprofiler to include/rocprofiler-sdk * Update include/CMakeLists.txt - add_subdirectory(rocprofiler-sdk) * Move lib/rocprofiler to lib/rocprofiler-sdk * Move lib/rocprofiler-tool to lib/rocprofiler-sdk-tool * Update lib/CMakeLists.txt - add_subdirectory(rocprofiler-sdk) - add_subdirectory(rocprofiler-sdk-tool) * Update lib/rocprofiler-sdk/CMakeLists.txt * Rename rocprofiler-tool to rocprofiler-sdk-tool * Replace include rocprofiler/ with include rocprofiler-sdk/ * Replace include lib/rocprofiler/ with include lib/rocprofiler-sdk/ * Set VERSION to 0.0.0 and finish install to rocprofiler-sdk * More fixes for rocprofiler -> rocprofiler-sdk - fix issue with rocprofiler-sdk-config.cmake.in - fix counters xml install path * Fix documentation generation * Create rocprofiler_LIB_ROCPROFILER_SDK_DIR for build tree * cmake formatting (cmake-format) (#264) Co-authored-by: jrmadsen <jrmadsen@users.noreply.github.com> --------- Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> |