Commit Graph

24 Commits

Author SHA1 Message Date
Jonathan R. Madsen af2f85ca93 Add logical_node_type_id field to rocprofiler_agent_t (#948)
* Add logical_node_type_id field to rocprofiler_agent_t

* Patch queue_controller
2024-06-24 23:18:58 -05:00
Jonathan R. Madsen 62ec95eae6 Sync queue and async copy on client finalizer (#950) 2024-06-24 20:38:34 -05:00
Giovanni Lenzi Baraldi 9676295d3d ATT API changes - add user_data field and separation of dispatch vs agent profiling (#893)
* DRM Issue Fix for SLES 15 (#897)

* DRM Issue Fix

* Formatting Fix

* PC sampling: CID manager unit test (#898)

* Adding per-dispatch userdata field to ATT

* Clang tidy

* Formatting

* Update source/lib/rocprofiler-sdk/hsa/aql_packet.hpp

Co-authored-by: Vladimir Indic <139573562+vlaindic@users.noreply.github.com>

* Adding dispatch_id, fixing user_data and update aql_profile_v2

* Formatting

* Tidy fixes

* Second fix for userdata

* removing assert for union

* Adding serialization. Created agent profiling-like thread trace

* Implemented agent thread trace

* Update source/lib/rocprofiler-sdk/hsa/aql_packet.hpp

Co-authored-by: Vladimir Indic <139573562+vlaindic@users.noreply.github.com>

* Restructured thread trace packets

* Added agent API tests

* Fixing multigpu for agent test

* Formatting

* Formatting

* Improving header locations

* Fixing merge conflicts

* Tidy

* Tidy

* Tidy

---------

Co-authored-by: Ammar ELWazir <ammar.elwazir@amd.com>
Co-authored-by: Vladimir Indic <139573562+vlaindic@users.noreply.github.com>
2024-06-13 15:29:29 -03:00
Benjamin Welton c32d8b6696 Fix agent profiling for SQ counters (#919)
* Fix agent profiling for SQ counters


---------

Co-authored-by: Benjamin Welton <ben@amd.com>
2024-06-12 20:39:18 -07:00
Giovanni Lenzi Baraldi 1b95089c28 Enable ATT continuous mode and code object tracing registration (#850)
* Adding ATT continuous mode and ATT code object tracking

* Fixing aql_packet.cpp

* Updating to aqlprofile codeobj changes

* Removing kernel packet from ATT dispatch callback

* Changing getSymbolMap() to return relative vaddr

* Tidy fixes

* Formatting

* Fix shadowing

* Fixing packet test

* Updating tests

* Simplifying multi-agent traces

* Adding dynamic codeobj tracking

* leftover book-keeping for codeobj markers

* Formatting

* Formatting

* Temporarily removing codeobj marker

* Formatting

* Re-enabling codeobj tracking

* Making copy of coreapi table

* Fixing issues with toolData lifetime

* Formatting

* Fixing issues with ASAN

* Improving memory profile

* Removing misplaced annotation

* Fixing queue type and allowing shared_locks in globalThreadTracer

* Update logging

* Changing ATT formats to be more in line with the SDK (#883)

* Fixing some merge conflicts

* Fixing cmakelists

* Fixing merge conflicts

* Formatting
2024-05-29 11:09:28 -05:00
Ammar ELWazir 987ae3cc47 PC Sampling Support (#715)
* cmake formatting (cmake-format) (#188)

Co-authored-by: vlaindic <vlaindic@users.noreply.github.com>

* source formatting (clang-format v11) (#189)

Co-authored-by: vlaindic <vlaindic@users.noreply.github.com>

* pcs: design of the pc sampling data struct; guarding parts of code that uses ROCr marker packets

* source formatting (clang-format v11) (#191)

Co-authored-by: vlaindic <vlaindic@users.noreply.github.com>

* cmake formatting (cmake-format) (#192)

Co-authored-by: vlaindic <vlaindic@users.noreply.github.com>

* pcs: shadow variable fix

* pcs: fix for compiler errors reported by CI/CD

* source formatting (clang-format v11) (#193)

Co-authored-by: vlaindic <vlaindic@users.noreply.github.com>

* pcs: docs fix; samples uses rocprofiler::rocprofiler library

* cmake formatting (cmake-format) (#195)

Co-authored-by: vlaindic <vlaindic@users.noreply.github.com>

* pcs: client in samples folder fixed

* pcs: client requires rocprofiler package as dependency

* pcs: client uses single context

* source formatting (clang-format v11) (#196)

Co-authored-by: vlaindic <vlaindic@users.noreply.github.com>

* pcs: client using single buffer; no buffer destroy in client

* pcs: client::setup explicitly called from the example

* pcs: rocprofiler_pc_sample_record_t updated

* pcs: fixed init of external correlation id

* source formatting (clang-format v11) (#198)

Co-authored-by: vlaindic <vlaindic@users.noreply.github.com>

* pcs: remove outdated files; update CMakeLists

* cmake formatting (cmake-format) (#212)

Co-authored-by: vlaindic <vlaindic@users.noreply.github.com>

* pcs: using rocprofiler_agent_id_t

* pcs: Removing trailing whitespaces

Co-authored-by: Jonathan R. Madsen <jrmadsen@users.noreply.github.com>

* source formatting (clang-format v11) (#214)

Co-authored-by: vlaindic <vlaindic@users.noreply.github.com>

* pcs: mapping agent_id to the agent

* source formatting (clang-format v11) (#215)

Co-authored-by: vlaindic <vlaindic@users.noreply.github.com>

* pcs: const while iterating over agents

* source formatting (clang-format v11) (#216)

Co-authored-by: vlaindic <vlaindic@users.noreply.github.com>

* pcs: calling get_buffer instead of get_buffers

* pcs: workgroup typo

* pcs: documentation for the public PC sampling API

* pcs: queue_cb_t signature adaptation

* pcs: mocks removed

* pcs: updating HsaApiTable with HSA/ROCr PC sampling API

* pcs: querying available PC sampling configs through IOCTL

* pcs: create the PCS session in IOCTL

* pcs: first actual PC samples delivered to the rocprofiler's client :)

* pcs: works with marker packet too

* pcs: using HSA table to call pc sampling related functions

* pcs: using ioctl instead of kfd in naming

* pcs: configuration service test fixed

* pcs: sample processing test fixed

* pcs: marker packet macro wrapper removed

* pcs: marker packet is part of the rocprofiler_packet union

* pcs: one fixme added

* pcs: client that uses pc-sampling and code obj tracing

* pcs: client that supports PC sampling and code obj tracing refactored

* pcs: show more info for each PC sample

* pcs: hex output for the samples that do not belong to the matmul kernel

* pcs: querying avail configuration happens immediately before configuring

* pcs: hsa_ven_amd_pcs_create_from_id renamed

* pcs: using hsa_stop; accessing a buffer by id from parser

* pcs: includes reworked, tests returned to life

* pcs: rocrofiler dir removed as outdated

* cmake formatting (cmake-format) (#271)

Co-authored-by: vlaindic <vlaindic@users.noreply.github.com>

* source formatting (clang-format v11) (#272)

Co-authored-by: vlaindic <vlaindic@users.noreply.github.com>

* pcs: some warnings fixed

* source formatting (clang-format v11) (#273)

Co-authored-by: vlaindic <vlaindic@users.noreply.github.com>

* cmake formatting (cmake-format) (#274)

Co-authored-by: vlaindic <vlaindic@users.noreply.github.com>

* pcs: show MI200 relevant information in the sample

* pcs: queue cb fixed; rocr.h include fixed

* source formatting (clang-format v11) (#296)

Co-authored-by: vlaindic <vlaindic@users.noreply.github.com>

* pcs: getting hsa_agent and the doorbell_id from hsa_queue

* source formatting (clang-format v11) (#297)

Co-authored-by: vlaindic <vlaindic@users.noreply.github.com>

* pcs: correlation ID logic fixed

* source formatting (clang-format v11) (#303)

Co-authored-by: vlaindic <vlaindic@users.noreply.github.com>

* pcs: pure pc sampling example fixed

* source formatting (clang-format v11) (#307)

Co-authored-by: vlaindic <vlaindic@users.noreply.github.com>

* cmake formatting (cmake-format) (#308)

Co-authored-by: vlaindic <vlaindic@users.noreply.github.com>

* pcs: interval value if the PC sampling is already configured

* pcs: ROCPROFILER_STATUS_ERROR_PC_SAMPLING_ALREADY_CONFIGURED

New status code if another process configured PC sampling service with different configuration.
Samples are extended to consider this case and retry if it happens.

* pcs: hsa_amd_queue_get_info mocked in tests

* source formatting (clang-format v11) (#328)

Co-authored-by: vlaindic <vlaindic@users.noreply.github.com>

* pcs (tests): query configs after configuring service

* source formatting (clang-format v11) (#329)

Co-authored-by: vlaindic <vlaindic@users.noreply.github.com>

* pcs: sample checks workgroup_id_* and wave_id

* source formatting (clang-format v11) (#330)

Co-authored-by: vlaindic <vlaindic@users.noreply.github.com>

* pcs samples: running samples on the device 0

* pcs: kfd_ioctl updated

* pcs: ioctl config struct changed fields names

* pcs: status when PC sampling is configured by another process is renamed

* pcs: HSA PC sampling API table fixed

* pcs: tmp hack to be able to use HSA pc sampling table

* source formatting (clang-format v11) (#443)

Co-authored-by: vlaindic <vlaindic@users.noreply.github.com>

* pcs service use CIDs generated by HIP API tracing service

* source formatting (clang-format v11) (#455)

Co-authored-by: vlaindic <vlaindic@users.noreply.github.com>

* cmake formatting (cmake-format) (#456)

Co-authored-by: vlaindic <vlaindic@users.noreply.github.com>

* pcs: CID manager

* pcs: explicit flush with no delivered data executes retirement logic

* source formatting (clang-format v11) (#464)

Co-authored-by: vlaindic <vlaindic@users.noreply.github.com>

* pcs: rocprofiler_query_pc_sampling_agent_configurations docs update

* source formatting (clang-format v11) (#465)

Co-authored-by: vlaindic <vlaindic@users.noreply.github.com>

* pcs: rocprofiler_configure_pc_sampling_service docs update

* pcs: explicit sync introduced in PCSCIDManager

* pcs: new logic for retiring CIDs in PC sampling service documented

* pcs: queue interception cb signature updated

* source formatting (clang-format v11) (#471)

Co-authored-by: vlaindic <vlaindic@users.noreply.github.com>

* pcs: if no agent supports PC sampling, fail gracefully

* elaborating when KFD returns EBUSY and EEXIST

* pcs: the second PC sampling example fails gracefully

* code samples use only single kernel for now

* pcs: CID manager refactored

* source formatting (clang-format v11) (#481)

Co-authored-by: vlaindic <vlaindic@users.noreply.github.com>

* pcs: ioctl update

* source formatting (clang-format v11) (#531)

Co-authored-by: vlaindic <139573562+vlaindic@users.noreply.github.com>

* pcs: code sample to test PC sampling applied on concurrent kernels

* source formatting (clang-format v11) (#533)

Co-authored-by: vlaindic <139573562+vlaindic@users.noreply.github.com>

* pcs: pc sampling stress test included

* cmake formatting (cmake-format) (#539)

Co-authored-by: vlaindic <139573562+vlaindic@users.noreply.github.com>

* source formatting (clang-format v11) (#540)

Co-authored-by: vlaindic <139573562+vlaindic@users.noreply.github.com>

* pcs: standalone benchmark

* cmake formatting (cmake-format) (#555)

Co-authored-by: vlaindic <139573562+vlaindic@users.noreply.github.com>

* pcs: glance in external correlation IDs

* source formatting (clang-format v11) (#557)

Co-authored-by: vlaindic <139573562+vlaindic@users.noreply.github.com>

* another change in ioctl interface

* pcs: update queue interceptor callbacks and samples according to the agent 0 version

* source formatting (clang-format v11) (#611)

Co-authored-by: vlaindic <139573562+vlaindic@users.noreply.github.com>

* pcs: avoid running problematic PC sampling test

* pcs: guarding tests not to fail on architectures not supporting PC sampling

* source formatting (clang-format v11) (#617)

Co-authored-by: vlaindic <139573562+vlaindic@users.noreply.github.com>

* pcs: check IOCTL version prior to each KFD call

* pcs: ioctl refactoring

* pcs: PC sampling service increases the ref_count of the correlation ID of the kernel dispatch

* cmake formatting (cmake-format) (#631)

Co-authored-by: vlaindic <139573562+vlaindic@users.noreply.github.com>

* source formatting (clang-format v11) (#632)

Co-authored-by: vlaindic <139573562+vlaindic@users.noreply.github.com>

* pcs: PC sampling service provides external correlation IDs

* source formatting (clang-format v11) (#644)

Co-authored-by: vlaindic <139573562+vlaindic@users.noreply.github.com>

* pcs: use rocprofiler_dim3_t for workgrou_ip

* source formatting (clang-format v11) (#645)

Co-authored-by: vlaindic <139573562+vlaindic@users.noreply.github.com>

* pcs: minor fixes

* pcs: updating the documentation for the pc sampling API functions

* pcs: api table and queue controller fix

* pcs: don't generate marker packets for the agent if PC sampling is not configured on it

* pcs: multi-GPU and single-GPU clients

* source formatting (clang-format v11) (#700)

Co-authored-by: vlaindic <139573562+vlaindic@users.noreply.github.com>

* pcs: warning and errors fixed

* source formatting (clang-format v11) (#702)

Co-authored-by: vlaindic <139573562+vlaindic@users.noreply.github.com>

* pcs: clang compiler errors and warnings fixed

* source formatting (clang-format v11) (#716)

Co-authored-by: vlaindic <139573562+vlaindic@users.noreply.github.com>

* pcs: const reference in cid manager

* source formatting (clang-format v11) (#717)

Co-authored-by: vlaindic <139573562+vlaindic@users.noreply.github.com>

* pcs: const & func in manager explicit

* pcs: test to cover creating PC sampling service of agent that does not exist

* pcs: generate marker packets if service is active

* source formatting (clang-format v11) (#719)

Co-authored-by: vlaindic <139573562+vlaindic@users.noreply.github.com>

* pcs: refactoring hsa_adapter; use the correlation_id->thread_idx

* Update source/lib/rocprofiler-sdk/pc_sampling/cid_manager.cpp

* Update source/lib/rocprofiler-sdk/pc_sampling/cid_manager.cpp

* Update source/lib/rocprofiler-sdk/pc_sampling/hsa_adapter.cpp

* Update source/lib/rocprofiler-sdk/pc_sampling/hsa_adapter.cpp

* Update source/lib/rocprofiler-sdk/pc_sampling/hsa_adapter.cpp

* Update source/lib/rocprofiler-sdk/pc_sampling/hsa_adapter.cpp

* Update source/lib/rocprofiler-sdk/pc_sampling/utils.cpp

* Update utils.cpp

* moving pc-sampling tests and samples to pc-sampling label

* Format fix

* pcs: use configured instead of active service

* Update source/lib/rocprofiler-sdk/pc_sampling/service.cpp

* pcs: ensure configuring PC sampling on the HSA level is called only once

* pcs: minor fix

* Update CMakeLists.txt

* Update CMakeLists.txt

* Update CMakeLists.txt

* Update CMakeLists.txt

* pcs: refactoring IOCTL integration

* Update source/lib/rocprofiler-sdk/pc_sampling/tests/CMakeLists.txt

Co-authored-by: Ammar ELWazir <ammar.elwazir@amd.com>

* Update source/lib/rocprofiler-sdk/pc_sampling/ioctl/ioctl_adapter.cpp

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* Update source/lib/rocprofiler-sdk/pc_sampling/ioctl/ioctl_adapter_types.hpp

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* Update source/lib/rocprofiler-sdk/pc_sampling/ioctl/ioctl_adapter.cpp

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* Update source/lib/rocprofiler-sdk/pc_sampling/ioctl/ioctl_adapter_types.hpp

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* Update source/lib/rocprofiler-sdk/pc_sampling/ioctl/ioctl_adapter.hpp

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* pcs: reverting back what bot doubled

* Update source/lib/rocprofiler-sdk/pc_sampling/ioctl/ioctl_adapter_types.hpp

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* pcs: retesting the bot

* Update source/lib/rocprofiler-sdk/pc_sampling/ioctl/ioctl_adapter_types.hpp

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* pcs: why bot fails on this IOCTL status

* pcs: why failing on <vector>

* Update source/lib/rocprofiler-sdk/pc_sampling/ioctl/ioctl_adapter.cpp

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* pcs: returning commits removed by bot

* pcs: formatting locally

* pcs: clients are flushing buffers inside the tool_fini

* pcs: sync function in public API

* pcs: sync prior to unloading the code object

* pcs: sync function requires context

* pcs: client uses CID retirement service

* pcs: test for flushing internal ROCr buffers

* pcs: source formatting

* Update source/lib/rocprofiler-sdk/pc_sampling/tests/CMakeLists.txt

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* pcs: code samples refactoring

* pcs: public API header refactored

* pcs: rocprofiler_buffer_flush drains internal PC sampling buffers too

* pcs: remove unnecessary functions

* pcs: do not call hsa's copytables

* pcs: include reordering

* pcs: using ROCP_ERROR inside PC sampling implementation

* pcs: pc_sampling sample uses ostream instead of printfs

* pcs: pc_sampling_codeobj tracing using ostream instead of prints

* pcs: registering once for interceptor callbacks

* pcs: do not generate internal CIDs if not in debug mode

* pcs: rebasing fixed; missing external correlation IDs

* pcs: code formatting

* enable kernel tracing service to receive external correlation IDs

* pcs: using ROCPROFILER_STATUS_ERROR_INCOMPATIBLE_KERNEL

* pcs: polishing parser

* formatting

* updating parser to use workgroup_id

* kfd_ioctl.h extracted in details folder

* refactoring

* pcs: preparing to generate code object information

* flush internal buffers prior to unloading code object

* pcs: generating marker records

* pcs: wrap code_object's shutdown function

* ROCR_VISIBLE_DEVICES and HIP_VISIBLE_DEVICES unsupported at the moment

* documenting that ROCR/HIP_VISIBLE_DEVICES are ignored

* pcs: separate structs for code object loading/unloading markers

* pcs: inst_pkt_t changed the namespace

* pcs: removing wrapper around the shutdown function

* pcs: size in record field

* pcs: documentation refactoring + typedefs

* renaming PCSAgentConfig to PCSAgentSession

* pcs: service does not keep a pointer to the context

* pcs: static assertions related to the versioning

* pcs: rocprofiler_pc_sampling_configuration_t size field

* pcs: report API unimplemented unless explicitly enabled

* pcs: skip tests if KFD does not support PC sampling

* pcs: if ROCr hides some devices, no PC samples will be delivered for it

* pcs: hip error check after kernel launch

* formatting

* removing PCS info from agent.h

* fix based on review

* Update continuous integration workflow

- use mi200 runner for code coverage (supports PC sampling)
- split sanitizer jobs across navi3, vega20, and mi300

* Updating pc sampling test labels

* ROCP_PC_SAMPLING_ENABLED env in CI

* ROCP_PC_SAMPLING_ENABLED for all CI mi200 jobs

* Rearrange sanitizer assignments

* fixes according to review

* removed unused functions

* pcs: rocprofiler_agent_id_t instead of handle as a key in map

* Update source/lib/rocprofiler-sdk/context/context.hpp

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* removing drm_fd from the agent.h

* pcs: removing one sample due to complexity

* pcs: refactoring sample

* simplifying sample

* new lines

* Improve queue_control enable interceptor logic

* Update lib/rocprofiler-sdk/hsa/types.hpp

- handle amd_ext size for HSA 1.12.0

* ROCP_PC_SAMPLING_ENABLED -> ROCPROFILER_PC_SAMPLING_BETA_ENABLED

* Update hsa_adapter.cpp

- anonymous namespace + remove debug

* parser update

* Apply suggestions from code review

---------

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: vlaindic <vlaindic@users.noreply.github.com>
Co-authored-by: vlaindic <vladimir.indic@amd.com>
Co-authored-by: vlaindic <vlaindic@amd.com>
Co-authored-by: Vladimir Indic <139573562+vlaindic@users.noreply.github.com>
Co-authored-by: Jonathan R. Madsen <jrmadsen@users.noreply.github.com>
Co-authored-by: gobhardw <gopesh.bhardwaj@amd.com>
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
2024-05-24 09:49:44 -05:00
Jonathan R. Madsen 4d5b71b0e7 Update logging (#838)
* Update logging

* Remove unused function

* Fix lib/rocprofiler-sdk/hsa/pc_sampling.cpp logging compilation

* Fix logging FLAGS_vmodule string leak and numerical log level

* Update logging

* Update glog submodule

* Leak fixes

* format
2024-05-20 15:38:18 -05:00
Giovanni Lenzi Baraldi 099ac7c72d Gbaraldi/att tool (#766)
* Enabling codeobj and thread trace samples

* Updating aqlprofile_v2 header

* Codeobj and thread trace samples with output log files

* Fixing clang format

* Cmake formatting

* Adding coverage to codeobj

* Comment trace sample

* Adding ATT Parser API

* Fixing forwarding to aqlprofile

* Clang formatting

* Clang tidy

* Adding option to print memory kernels

* Clang format

* Remove default from switch case

* Separating  client/main on codeobj sample for ASAn

* Formatting

* Gbaraldi/att tool rebase (#801)

* Enabling codeobj and thread trace samples

* Updating aqlprofile_v2 header

* Codeobj and thread trace samples with output log files

* Fixing clang format

* Cmake formatting

* Adding coverage to codeobj

* Comment trace sample

* Removing python from workflow

* Adding ATT Parser API

* Fixing forwarding to aqlprofile

* Clang formatting

* Clang tidy

* Adding option to print memory kernels

* Clang format

* Remove default from switch case

* Separating  client/main on codeobj sample for ASAn

* Formatting

* Enabling codeobj and thread trace samples

* Updating aqlprofile_v2 header

* Codeobj and thread trace samples with output log files

* Fixing clang format

* Cmake formatting

* Adding coverage to codeobj

* Comment trace sample

* Adding ATT Parser API

* Fixing forwarding to aqlprofile

* Clang formatting

* Clang tidy

* Adding option to print memory kernels

* Clang format

* Remove default from switch case

* Separating  client/main on codeobj sample for ASAn

* Formatting

* Fix codeobj library

* Allow thread trace in parallel with other service

* Zeroing the HSA signals

* Adding exception wrappers in ATT sample

* Removed force configure

* Remove force configure from ISA decode

* Removing codecov flag

* Gbaraldi/att tool tests (#828)

* Adding tests for codeobj ISA decode

* Adding ATT tests

* Adding ATT integration tests

* Formatting

* Changing codeobj binary extension

* Renaming codeobj library spaces

* Fixing samples

* Formatting

* Formatting

* Fixing int test

* Fixing linker error

* Fixing memory fault

* Moving kernel ot inside namespace

* ASAN linking fix

* Removing unnecessary headers

* Formatting

* Fixing target_cu

* Remove codeobj binary

* Revert "Remove codeobj binary"

This reverts commit 7d286f89d8096bc36925cd79cd742a5e6d10d179.

* Enable memory snapshot

* adding comgr

---------

Co-authored-by: Ammar ELWazir <ammar.elwazir@amd.com>
2024-05-03 18:45:47 -03:00
Benjamin Welton cb3fc070c7 [1/N] Agent Counter Collection Implementation (#832)
Added public API call to setup agent counter collection on a context.

Refactored the return types internally for dispatch counter collection
to use rocprofiler_status_t (allow for more verbose failures to be
surfaced via the API)

Subsequent commits will fill out the sampling functionality for agent
counter collection.

Co-authored-by: Benjamin Welton <ben@amd.com>
2024-05-01 13:34:54 -07:00
Jonathan R. Madsen 48273d6a65 Remove -Wno-missing-field-initializers from build flags (#810)
* Remove -Wno-missing-field-initializers

- Compiler errors if missing field initializers

* Update lib/rocprofiler-sdk/counters/evaluate_ast.cpp

- copy over dispatch ID in perform_reduction/evaluate
2024-04-22 22:26:01 -05:00
Jonathan R. Madsen 8cc28ae51d Enable HSA packet write interception for callback kernel tracing (#780) 2024-04-17 14:55:04 -05:00
Benjamin Welton c2f659ab5c Removal of HSA from counter collection (#697)
* Minor fix

Removal of HSA from counter collection

Tests for AQL

Updated counter collection client to build profiles in tool init

* Rebased

* Debug printing

* Formatting

* More format

* fix shadowing

---------

Co-authored-by: Jonathan R. Madsen <jrmadsen@users.noreply.github.com>
2024-04-12 18:46:10 -07:00
Jonathan R. Madsen 56030018dc Callback tracing for kernel dispatches + External correlation ID request service (#682)
* Support ROCPROFILER_CALLBACK_TRACING_KERNEL_DISPATCH

* Fix doxygen

* Update callback tracing

- temporary hacks for kind operation name and iterate kind operations

* Update source/include/rocprofiler-sdk

- introduce sequence id for kernel dispatches

* Update lib/rocprofiler-sdk (seq id)

- support sequence id passing

* Update tests (seq id)

- testing for sequence ids

* Cleanup include/rocprofiler-sdk/fwd.h

* Misc cleanup

* External Correlation ID Request Service (#699)

* External correlation ID request service

- callback requesting an external correlation ID instead of fetching from top of pushed external correlation ID stack

* Update external correlation id request support

- pass internal correlation ID in callback
- async copy generates a correlation ID if none already exists
- added external correlation ID request support for scratch memory tracing
- updated scratch memory tracing to use tracing:: functions

* Update hsa/queue.hpp

- new line at EOF

* Misc tweaks

- remove unnecessary logging in agent.cpp
- correlation_id::add_ref_count check for retirement
- finalization check in HSA queue AsyncSignalHandler

* Improve assertion failure logging in misc tests

* Update include/rocprofiler-sdk/fwd.h

- remove rocprofiler_record_counter_header_t

* Move lib/rocprofiler-sdk/tracing.hpp into lib/rocprofiler-sdk/tracing/ folder

* Update lib/rocprofiler-sdk/hsa/*

- hsa::get_hsa_status_string
- queue_info_session.hpp header
- rocprofiler_packet.hpp

* Update lib/rocprofiler-sdk/{counters,hip,marker}

- execute_phase_exit_callbacks tweaks
- queue_info_session tweaks

* Move rocprofiler_kernel_dispatch_operation_t to include/rocprofiler-sdk/fwd.h

* Update rocprofiler_buffer_tracing_kernel_dispatch_record_t

- add operation field and thread_id field

* Add lib/rocprofiler-sdk/kernel_dispatch

- enum <-> string mapping for kernel dispatch
- tracing implementations

* Update lib/rocprofiler-sdk/CMakeLists.txt

- tracing and kernel dispatch sub-directories

* Update lib/rocprofiler-sdk/{buffer,callback}_tracing.cpp

- invoke rocprofiler::kernel_tracing functions

* Update tests/common/serialization.hpp

- support operation and thread_id fields for rocprofiler_buffer_tracing_kernel_dispatch_record_t

* Update tests/tools/json-tool.cpp

- use external correlation id request service

* Rename sequence_id to dispatch_id
2024-04-11 19:49:49 -05:00
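The external correlation ID request service described in the commit above (#682, #699) replaces "read the top of a pushed stack" with "ask a callback, passing the internal correlation ID". The difference can be sketched roughly as follows. This is a conceptual illustration only: `ExternalIdStack`, `resolve_external_id`, and the callback type are hypothetical names, not the rocprofiler-sdk API.

```python
from typing import Callable, List, Optional


class ExternalIdStack:
    """Push/pop model: the tool pushes IDs ahead of time, the tracer reads the top.

    Hypothetical class, for illustration only.
    """

    def __init__(self) -> None:
        self._stack: List[int] = []

    def push(self, external_id: int) -> None:
        self._stack.append(external_id)

    def pop(self) -> int:
        return self._stack.pop()

    def top(self) -> Optional[int]:
        return self._stack[-1] if self._stack else None


# Request model: the callback receives the internal correlation ID and
# returns the external ID to attach to the record (assumed signature).
RequestCallback = Callable[[int], int]


def resolve_external_id(
    internal_id: int,
    stack: ExternalIdStack,
    request_cb: Optional[RequestCallback] = None,
) -> Optional[int]:
    """Prefer the request callback when one is registered, else use the stack top."""
    if request_cb is not None:
        return request_cb(internal_id)
    return stack.top()


stack = ExternalIdStack()
stack.push(100)

from_stack = resolve_external_id(7, stack)
from_request = resolve_external_id(7, stack, lambda internal: 1000 + internal)
print(from_stack, from_request)
```

In this sketch the request callback can derive the external ID from the internal one at the moment of the event, which the stack model cannot do because its value was fixed when it was pushed.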
Giovanni Lenzi Baraldi 69b8a43dc6 Gbaraldi/threadtrace2 (#724)
* Added first ATT API

* Finalizing thread trace API

* Fixing more rebase conflicts

* Added codeobj disassembly sample

* Fixing merge issues with rebase [2]

* Adding ATT packets

* Implemented thread trace intercept

* Moved codeobj parser to same repo as rocprofiler

* Moved thread trace to new API

* Fixing merge conflicts

* Fixing more merge conflicts

* Adding thread trace packet reuse

* Merged aql_profile_v2 headers

* Linked ATT sample to aqlprofile

* Updated decoder to include non-loaded codeobjs

* Implemented ISA decoder into ATT sample

* Added marker_id to vaddr

* Updating aql_profile_v2 API to memcpy

* Updating thread trace API to include 64bit markers. Using the result of ISA matching.

* Added instruction type and cycles summary

* Updated sample with selection of kernel by kernel_object

* Added option to copy from memory kernels

* Moved tool_data in thread_trace to dynamic alloc

* Restoring hsa.cpp

* Fixed ATT sample crash. General improvements.

* Moved codeobj library to outside src/

* Updated license header

* Moved codeobj_capture to camelcase

* Solving some more merge conflicts

* Update samples/advanced_thread_trace/CMakeLists.txt

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* Update samples/advanced_thread_trace/CMakeLists.txt

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* Update samples/code_object_isa_decode/CMakeLists.txt

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* Update source/lib/rocprofiler-sdk/thread_trace/CMakeLists.txt

* Removing unused parameter check

* Adding const to isEmpty

* Removing unused warning

* Adding libdw-dev to requirements

* Running clang-format

* Commenting out new aql calls

* Clang format

* Unused variable fix

* Adding codeobj-decoder coverage

* Commenting out threadtrace

* Update samples/CMakeLists.txt

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* P

* WOverloaded

* Addressing clang-tidy

* Virtual destructor on ttracer class

* Corr id

* Fixing code source format

* Update CMakeLists.txt

* Build fixes

* Update source/lib/rocprofiler-sdk-codeobj/code_object_track.cpp

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* Fix shadowing

* Update CMakeLists.txt

* Update samples/CMakeLists.txt

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

---------

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Ammar ELWazir <ammar.elwazir@amd.com>
Co-authored-by: Ammar ELWazir <aelwazir@amd.com>
Co-authored-by: Benjamin Welton <bewelton@amd.com>
2024-04-08 12:43:02 -07:00
Benjamin Welton 41c0ddd72d Convert LOG() -> ROCP_X logging macros. (#695)
* Convert LOG() -> ROCP_X logging macros.

This patch converts the LOG() macro to the ROCP_X logging macros.
There are the following levels of logs.

Logs whose expressions are not evaluated unless the log level is enabled:

ROCP_TRACE - VLOG(2) (enabled by env variable GLOG_v=2)
ROCP_INFO - VLOG(1) (enabled by env variable GLOG_v=1)

Logs whose expressions are always evaluated:

ROCP_WARNING - LOG(WARNING)
ROCP_ERROR - LOG(ERROR)
ROCP_FATAL - LOG(FATAL)
ROCP_DFATAL - DLOG(FATAL) (only fatal in debug mode)

* source formatting (clang-format v11) (#696)

Co-authored-by: bwelton <1683479+bwelton@users.noreply.github.com>

* Minor fix

* Fixes for VLOG before main

* fix vmodule

* source formatting (clang-format v11) (#718)

Co-authored-by: bwelton <1683479+bwelton@users.noreply.github.com>

* memory leak fix

* Vlog change

---------

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: bwelton <1683479+bwelton@users.noreply.github.com>
2024-04-02 17:15:30 -07:00
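The commit above (#695) separates log levels whose expressions are evaluated only when the level is enabled (ROCP_TRACE, ROCP_INFO) from levels whose expressions are always evaluated (ROCP_WARNING and above). The following is a Python analogy of that distinction, not the C++ macros themselves: `log_verbose`, `log_always`, and `expensive` are hypothetical helpers, and only the `GLOG_v` variable name comes from the commit message.

```python
import os

evaluations = []  # records which message expressions were actually evaluated


def verbosity() -> int:
    """Read the verbosity level from GLOG_v, defaulting to 0."""
    try:
        return int(os.environ.get("GLOG_v", "0"))
    except ValueError:
        return 0


def expensive(label: str) -> str:
    """Stand-in for a costly expression inside a log statement."""
    evaluations.append(label)
    return f"<{label}>"


def log_verbose(level: int, make_message, sink: list) -> None:
    """Lazy logging: make_message() runs only if the level is enabled."""
    if verbosity() >= level:
        sink.append(make_message())


def log_always(severity: str, message: str, sink: list) -> None:
    """Eager logging: the caller has already evaluated the message."""
    sink.append(f"{severity}: {message}")


os.environ["GLOG_v"] = "1"
sink = []
log_verbose(2, lambda: expensive("trace"), sink)   # level 2 disabled: not evaluated
log_verbose(1, lambda: expensive("info"), sink)    # level 1 enabled: evaluated
log_always("WARNING", expensive("warning"), sink)  # always evaluated
print(evaluations)
print(sink)
```

With `GLOG_v=1`, only the "info" and "warning" expressions are evaluated; the "trace" expression is never run.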
Jonathan R. Madsen 2f9b1767e9 Handle hsa_queue_destroy after finalization (#679)
* Handle hsa_queue_destroy after finalization

- fixes issue where hsa_queue_destroy(...) is invoked after rocprofiler-sdk has finalized
- hsa::get_queue_controller() returns pointer
- if queue controller is a null pointer, skip invoking QueueController::destroy_queue

* Update HIP/HSA/marker update_table logging

* Update rocprofv3 tests

- remove HSA_TOOLS_LIB env variable
- remove setting ROCPROFILER_LOG_LEVEL env variable
- add timeouts to tests which are missing them

* Disable thread sanitizer deadlock detection

* Update CI workflow

- rename vega20-ubuntu job to core-ci
- enable navi32 in core-ci and sanitizers

* Update run-ci.py

- set gcovr html medium and high threshold

* Update lib/rocprofiler-sdk/hsa/queue_controller.cpp

- remove this capture from enable/disable serialization

* Update lib/rocprofiler-sdk/hsa/{hsa_barrier,profile_serializer}.*

- hsa_barrier::set_barrier accepts const-ref to queue map
- profile_serializer::enable and profile_serializer::disable accept const-ref to queue map

* Logging for HIP/HSA/marker/profile_serializer

* Logging for HIP/HSA/marker/queue_controller

* Improve test_retired_correlation_ids asserts

* Fix tests/counter-collection/validate.py

- scale expected SQ_WAVES counter value based on warp size of GPU

* Tweak github comment for code coverage

* Remove gcovr html high/medium threshold args

* Fix tests/counter-collection/validate.py

- round before casting to int in test_counter_values

* operator bool for profile_serializer

- only wait on CV if profile_serializer is used

* Logging updates (profile_serializer + code_object)

* Update counter-collection validate.py

* QueueController does not wait on CV if finalizing/finalized

* Update CI workflow

- remove navi32 from core job

* Improve HIP/HSA/marker tracing get_functor/functor

- remove lambda wrapper around functor

* Update lib/rocprofiler-sdk/hsa/queue_controller.cpp

- do not acquire cvmutex lock during finalization

* Update lib/rocprofiler-sdk/hsa/hsa_barrier.*

- move ctor and dtor to implementation
- skip signal store screlease and destroy if already finalized

* Update CI workflow

- remove navi32 runners

* bwelton fixes for hangs

* CMake improvements + simplified demangle

- remove amd-comgr from common target (and thus removed from roctx DT_NEEDED)

---------

Co-authored-by: Benjamin Welton <bewelton@amd.com>
2024-03-21 17:52:15 -05:00
Benjamin Welton 1de44447f4 Deadlock Fix for HSA and Serialization Disable/Enabling support (#582)
* Initial barrier

* Working on profiler serializer extraction

* Current progress

* Serialization Support

* source formatting (clang-format v11) (#583)

Co-authored-by: bwelton <1683479+bwelton@users.noreply.github.com>

* cmake formatting (cmake-format) (#584)

Co-authored-by: bwelton <1683479+bwelton@users.noreply.github.com>

* Minor fix

* Current Progress

* Current progress

* More fixes

* Serialization Fixes

* Bug fix

* source formatting (clang-format v11) (#600)

Co-authored-by: bwelton <1683479+bwelton@users.noreply.github.com>

* More fixes

* More minor fixes

* source formatting (clang-format v11) (#603)

Co-authored-by: bwelton <1683479+bwelton@users.noreply.github.com>

* source formatting (clang-format v11) (#604)

Co-authored-by: bwelton <1683479+bwelton@users.noreply.github.com>

* Lock order inversion false positive

* order fix

* More changes

* source formatting (clang-format v11) (#607)

Co-authored-by: bwelton <1683479+bwelton@users.noreply.github.com>

* minor test fix

* Minor test changes

---------

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: bwelton <1683479+bwelton@users.noreply.github.com>
2024-03-08 09:02:43 -06:00
Jonathan R. Madsen 1bb94add11 Fix rocprofiler_iterate_callback_tracing_kind_operation_args for HIP compiler callbacks (#532)
* Fix HIP compiler iterate args

- `include/rocprofiler-sdk/hip/api_args.h`
  - replace struct fields named "f" with "func"
  - replace hip stream fields named "hStream" with "stream"
- `lib/rocprofiler-sdk/callback_tracing.cpp`
  - iterate_args for HIP compiler table
- `lib/rocprofiler-sdk/registration.cpp`
  - fix warning about roctx num_tables
- `lib/rocprofiler-sdk/hip/hip.def.cpp`
  - replace struct fields named "f" with "func"
  - replace hip stream fields named "hStream" with "stream"
- `lib/rocprofiler-sdk/{hip,hsa,marker}/utils.hpp`
  - improve `stringize_impl`
- `lib/rocprofiler-sdk/hsa/code_object.cpp`
  - remove stale commented out code
- `lib/rocprofiler-sdk/hsa/queue_controller.*`
  - destory_queue -> destroy_queue
- `tests/tools/json-tool.cpp`
  - improve parallelism in tool_tracing_callback
  - serialize the marker api args
  - only invoke rocprofiler_iterate_callback_tracing_kind_operation_args in exit phase
- `samples/counter_collection/CMakeLists.txt`
  - reduce timeout on tests to 120 seconds

* Update lib/rocprofiler-sdk/hsa/utils.hpp

- disable dereference of double pointer in stringize_impl

* Update lib/common

- indirection_level in mpl.hpp
- stringize_arg.hpp

* Rework rocprofiler_iterate_callback_tracing_kind_operation_args

- provide more information in rocprofiler_callback_tracing_operation_args_cb_t
- support specifying the dereference level to account for output parameters
2024-03-01 01:46:07 -06:00
Benjamin Welton 3eb6a27bc6 Add support for AQL dimensions (#262)
* Add support for AQL dimension changes

Adds support for returning dimensions from AQLProfile through rocprofiler
to tools. Includes a much larger test suite that covers nearly
all files in counter collection.

Specific changes below:

samples/counter_collection/print_functional_counters: Modified to check
the validity of dimensions returned in comparison to the actual underlying
data obtained from a kernel execution.

rocprofiler-sdk/aql/helpers: adds function calls to support fetching
dimension information from AQLProfile.

rocprofiler-sdk/aql/packet_construct: modified to allow for events
to be exported to aid evaluate_ast in decoding the output buffer.

lib/rocprofiler-sdk/counters: Instance count now derived from dimension
sizes. rocprofiler_query_counter_dimensions now moved to a callback format
to improve usability.

rocprofiler-sdk/counters/core: Code migrations and exports of functions
for testing.

rocprofiler-sdk/counters/dimensions: Generates a dimension cache to be
used when querying dimension information for a counter id.

rocprofiler-sdk/counters/evaluate_ast: Modified to pass back correct
dimension information and to check/determine output dimensions for derived
counters.

rocprofiler-sdk/counters/id_decode: Modified to have a map between
dimension name -> dimension along with a conversion from the aql profile
id for a dimension (string) -> integer based id (happens only once during
init).

rocprofiler-sdk/hsa/queue: Modified to make testing easier.
Specifically, Queue can now be mocked in unit tests for counter
collection.

* Merge with changes for serialization

* Added suggestions

* source formatting (clang-format v11) (#457)

Co-authored-by: bwelton <bwelton@users.noreply.github.com>

* Minor fix

* Test change

---------

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: bwelton <bwelton@users.noreply.github.com>
2024-02-07 22:03:21 -06:00
SrirakshaNag f6198f226a Kernel Serialization Support (#379)
* Serialization-rebased with main branch

* Removing client_id from queue completion callbacks

* removing debugging code

* source formatting (clang-format v11) (#449)

Co-authored-by: SrirakshaNag <SrirakshaNag@users.noreply.github.com>

* moving ready signal handler to anonymous namespace

* source formatting (clang-format v11) (#450)

Co-authored-by: SrirakshaNag <SrirakshaNag@users.noreply.github.com>

* Handling deque search better in queue destructor

* source formatting (clang-format v11) (#451)

Co-authored-by: SrirakshaNag <SrirakshaNag@users.noreply.github.com>

* disabling test_total_runtime test in code coverage

---------

Co-authored-by: Benjamin Welton <bewelton@amd.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: SrirakshaNag <SrirakshaNag@users.noreply.github.com>
2024-02-05 12:42:59 -06:00
Jonathan R. Madsen aa813f5c9b Update lib/rocprofiler-sdk/hsa/queue_controller.cpp (#420)
- designated initializers for default_agent
2024-01-26 07:13:15 -06:00
Jonathan R. Madsen c641749fe6 HIP API Tracing (#357)
* Update include/rocprofiler-sdk/hip*

- updates for intercept table

* Update lib/common/units.hpp

- clang-tidy fixes

* Add lib/rocprofiler-sdk/hip

- tracing implementation for the HIP intercept table

* Update source/lib/rocprofiler-sdk/CMakeLists.txt

- add_subdirectory(hip)

* Update source/lib/rocprofiler-sdk/hsa

- offset function in hsa_api_info<Idx>
- remove report_activity, set_callback
- Tweak HSA_API_TABLE_LOOKUP_DEFINITION

* Update lib/rocprofiler-sdk/hip

- rocprofiler::hip::copy_table
- stringize_impl print dereferenced pointers when possible

* Update lib/rocprofiler-sdk/hsa/utils.hpp

- stringize_impl print dereferenced pointers when possible

* Update lib/rocprofiler-sdk/tests/intercept_table.cpp

- remove failures for intercepting HIP API tables

* Update include/rocprofiler-sdk/fwd.h

- add ROCPROFILER_HIP_RUNTIME_LIBRARY (== ROCPROFILER_HIP_LIBRARY)
- add ROCPROFILER_HIP_COMPILER_LIBRARY

* Update lib/rocprofiler-sdk/buffer_tracing.cpp

- Support ROCPROFILER_BUFFER_TRACING_HIP_API in rocprofiler_query_buffer_tracing_kind_operation_name
- Support ROCPROFILER_BUFFER_TRACING_HIP_API in rocprofiler_iterate_buffer_tracing_kind_operations

* Update lib/rocprofiler-sdk/callback_tracing.cpp

- Support ROCPROFILER_CALLBACK_TRACING_HIP_API in rocprofiler_query_callback_tracing_kind_operation_name
- Support ROCPROFILER_CALLBACK_TRACING_HIP_API in rocprofiler_iterate_callback_tracing_kind_operations
- Support ROCPROFILER_CALLBACK_TRACING_HIP_API in rocprofiler_iterate_callback_tracing_kind_operation_args

* Update lib/rocprofiler-sdk/intercept_table.cpp

- support HipDispatchTable and HipCompilerDispatchTable

* Update lib/rocprofiler-sdk/internal_threading.cpp

- Support ROCPROFILER_HIP_COMPILER_LIBRARY

* Update lib/rocprofiler-sdk/registration.cpp

- Support "hip" and "hip_compiler" in rocprofiler_set_api_table
- Added some extra logging

* Update samples/api_{buffered,callback}_tracing

- Modifications to demonstrate HIP API tracing

* Update tests/kernel-tracing

- Modifications to handle/test HIP API tracing

* Separate HIP tracing from HIP compiler tracing

* Fix installation of include/rocprofiler-sdk/hip/*

- add compiler and table headers to install

* Fixes to HIP interception

- hip_api_trace.hpp was updated a bit
  - removed hipGetDeviceProperties (generic)
  - added hipGetDevicePropertiesR0600
  - added hipGetDevicePropertiesR0000
  - removed hipRegisterTracerCallback
  - reordered hipCreateChannelDesc, hipExtModuleLaunchKernel, hipHccModuleLaunchKernel
  - added hipDrvGraphAddMemsetNode
- static asserts in hsa_api_info ensuring ordering of pointers

* Update lib/rocprofiler-sdk/hip/hip.*

- use size_t instead of rocprofiler_hip_table_api_id_t as non-type template parameter (smaller binary)
- separated out population of callback_context_data and buffered_context_data into non-template function (significantly smaller binary)

* Update lib/rocprofiler-sdk/hsa/hsa.*

- separated out population of callback_context_data and buffered_context_data into non-template function (significantly smaller binary)

* Update test/kernel-tracing/validate.py

- does not expect any hip_api_traces until libamdhip.so actually starts using rocprofiler-register

* Update tests/tools/json-tool.cpp

- fix context associated with "HIP_API_CALLBACK"

* Update external/CMakeLists.txt

- move misc variables to top of CMakeLists.txt so they apply to all external subprojects
  - BUILD_TESTING (OFF)
  - BUILD_SHARED_LIBS (OFF)
  - BUILD_OBJECT_LIBS (OFF)
  - BUILD_STATIC_LIBS (ON)
  - CMAKE_POSITION_INDEPENDENT_CODE (ON)
  - CMAKE_VISIBILITY_INLINES_HIDDEN (ON)
  - CMAKE_CXX_VISIBILITY_PRESET (hidden)
- disable using libunwind in glog

* Update lib/rocprofiler-{sdk,sdk-tool}/CMakeLists.txt

- remove explicit setting of SKIP_BUILD_RPATH

* Update CMakeLists.txt

- set high-level CMAKE_BUILD_RPATH and CMAKE_INSTALL_RPATH_USE_LINK_PATH

* Update tests/CMakeLists.txt

- include(GNUInstallDirs)

* Update samples/CMakeLists.txt

- include(GNUInstallDirs)

* Update include/rocprofiler-sdk/hip/{compiler_api,api}_args.h

- remove extern "C" due to incompatibility between empty struct in C (size 0) vs. empty struct in C++ (size 1)
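
The size mismatch can be checked on the C++ side with a short sketch. `empty_args` is a hypothetical type, not one of the actual argument structs. C++ requires every complete object type to have a nonzero size (1 on common compilers), whereas GNU C accepts an empty struct as an extension and gives it size 0, so the same declaration would have two different layouts.

```cpp
#include <cstddef>

// Hypothetical argument struct for an API call that takes no parameters.
struct empty_args
{};

// C++ guarantees a nonzero size so that distinct objects have distinct addresses.
static_assert(sizeof(empty_args) >= 1, "empty class types occupy at least one byte in C++");

constexpr std::size_t empty_args_size = sizeof(empty_args);
```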

* Update lib/rocprofiler-sdk/hip/details/ostream.hpp

- clang-tidy fixes

* Update cmake/rocprofiler_linting.cmake

- add a feature for clang tidy exe

* Update lib/rocprofiler-sdk/hip/hip.cpp

- use recursion instead of fold expression due to clang-tidy errors (maximum nesting level exceeded)
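
A sketch of the two forms, assuming C++17 and hypothetical helper names (`for_each_index_fold`, `for_each_index_recursive`); the actual code in `hip.cpp` is not shown here. The fold expression expands every call into one expression, while the recursive form handles one index per instantiation.

```cpp
#include <cstddef>
#include <utility>

namespace sketch
{
// Fold-expression form: a single comma-fold over every index in the pack.
template <typename Func, std::size_t... Idx>
void for_each_index_fold(Func&& func, std::index_sequence<Idx...>)
{
    (func(Idx), ...);
}

// Recursive form: each instantiation handles one index and then recurses,
// so no single expression grows with the number of indices.
template <std::size_t Idx, std::size_t End, typename Func>
void for_each_index_recursive(Func&& func)
{
    if constexpr(Idx < End)
    {
        func(Idx);
        for_each_index_recursive<Idx + 1, End>(func);
    }
}
}  // namespace sketch
```

Both helpers visit the indices in the same order; the recursive one trades a single large expression for a chain of small instantiations.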

* Update lib/rocprofiler-sdk/buffer_tracing.cpp

- fix merge

* Update lib/rocprofiler-sdk/callback_tracing.cpp

- fix merge

* Update bin/rocprofv3

- args for marker, HIP runtime, and HIP compiler tracing

* Update tests/apps/simple-transpose

- use roctx

* Update tests/rocprofv3/tracing

- validate marker API data

* Update lib/rocprofiler-sdk-tool

- support for HIP runtime, HIP compiler, marker API

* Update queue/queue_controller/registration/utility

- call hsa::queue_controller_fini() during finalization
- add a yield function to common/utility.hpp
  - implements a thread yield + sleep
- add a sync function to Queue class
- add a iterate_queues member function to QueueController
  - this is used to sync each queue during queue_controller_fini()
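
A minimal sketch of a yield-plus-sleep helper and a polling sync built on it. The names `yield_and_sleep` and `sync_with`, the 1 ms interval, and the attempt limit are assumptions for illustration, not the signatures in `common/utility.hpp` or the `Queue` class.

```cpp
#include <chrono>
#include <thread>

namespace sketch
{
// Gives up the current time slice, then sleeps for the requested duration.
template <typename Rep, typename Period>
void yield_and_sleep(std::chrono::duration<Rep, Period> duration)
{
    std::this_thread::yield();
    std::this_thread::sleep_for(duration);
}

// Polls done() until it returns true or max_attempts polls have failed.
// Returns whether the condition was eventually satisfied.
template <typename Pred>
bool sync_with(Pred&& done, int max_attempts)
{
    for(int i = 0; i < max_attempts; ++i)
    {
        if(done()) return true;
        yield_and_sleep(std::chrono::milliseconds{1});
    }
    return done();
}
}  // namespace sketch
```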

* Fix data races: queue/context/stable_vector

- stable_vector::emplace_back returns reference
- correlation id map uses stable_vector
- queue_info_session has explicit fields for queue id, hsa agent, rocp agent
- use hsa::get_table() in AsyncSignalHandler
- WriteInterceptor does not use TLS for context array

* Update lib/rocprofiler-sdk/hsa/hsa.*

- static object for API subtables
- accessors for API subtables
- google tests for HSA API subtables

* Update lib/rocprofiler-sdk/hsa/{queue,async_copy}.cpp

- use HSA subtable accessors

* Update rocprofiler_memcheck and CI workflow

- use GCC 13 instead of GCC 11 due to suspected false positives in thread sanitizer
  - GCC 13 uses libtsan.so.2

* Update CI workflow

* Update lib/rocprofiler-sdk/counters/{metrics,counters}

- fix possibly dangling reference to a temporary from gcc-13

* Update thread-sanitizer-suppr.txt

- Ignore data races originating in hsa-runtime library

* Update cmake/rocprofiler_memcheck.cmake

- Deduce the sanitizer library to preload by compiling an application and extracting the linked sanitizer library

* Update tests/rocprofv3/tracing/CMakeLists.txt

- add csv files to REQUIRED_FILES and ATTACH_ON_FAIL in validate test

* Update lib/common/container/record_header_buffer.hpp

- fix data race identified by gcc v13 and libtsan.so.2

* Update hip API id, args, and def

- remove hipDrvGraphAddMemsetNode (not part of ROCm 6.0)

* Update lib/common/container/record_header_buffer.hpp

- fix deadlock in save/read/reset

* Update source/docs/CMakeLists.txt

- remove COMMAND_ERROR_IS_FATAL ANY to allow for printing of stdout/stderr

* Update lib/rocprofiler-sdk/hip/details/ostream.hpp

- remove overloads for HIP_MEMSET_NODE_PARAMS

* Update docs/CMakeLists.txt

- use find_program for shell instead of hardcoded /bin/bash
2024-01-24 16:32:54 -06:00
Jonathan R. Madsen 199f0b5421 Contexts update + buffer flushing + cleanup (#338)
* Update lib/rocprofiler-sdk/context/context.*

- get_registered_contexts functions (local copy)

* Update lib/rocprofiler-sdk/hsa/{queue,queue_controller}.cpp

- remove ROCPROFILER_BUFFER_TRACING_MEMORY_COPY code

* Update tests/kernel-tracing/kernel-tracing.cpp

- move stop() and flush() in tool_fini to before reporting of sizes of data collected

* Update lib/rocprofiler-sdk/hsa/hsa.*

- remove stale set_callback / activity_functor_t code

* Update lib/rocprofiler-sdk/buffer.cpp

- full wait instead of returning busy when buffer is busy
- use task_group::join instead of task_group::wait to fully wait for tasks to finish (bug fix)

* Update lib/rocprofiler-sdk/agent.cpp

- support agent mapping for CPU agents

* Remove direct access to vector of registered contexts
2024-01-03 04:26:46 -06:00
Jonathan R. Madsen 9a0c84efa6 Use -sdk suffix and reset VERSION to 0.0.0 (#263)
* Fix find_package(rocprofiler) in build tree

* Move include/rocprofiler to include/rocprofiler-sdk

* Update include/CMakeLists.txt

- add_subdirectory(rocprofiler-sdk)

* Move lib/rocprofiler to lib/rocprofiler-sdk

* Move lib/rocprofiler-tool to lib/rocprofiler-sdk-tool

* Update lib/CMakeLists.txt

- add_subdirectory(rocprofiler-sdk)
- add_subdirectory(rocprofiler-sdk-tool)

* Update lib/rocprofiler-sdk/CMakeLists.txt

* Rename rocprofiler-tool to rocprofiler-sdk-tool

* Replace include rocprofiler/ with include rocprofiler-sdk/

* Replace include lib/rocprofiler/ with include lib/rocprofiler-sdk/

* Set VERSION to 0.0.0 and finish install to rocprofiler-sdk

* More fixes for rocprofiler -> rocprofiler-sdk

- fix issue with rocprofiler-sdk-config.cmake.in
- fix counters xml install path

* Fix documentation generation

* Create rocprofiler_LIB_ROCPROFILER_SDK_DIR for build tree

* cmake formatting (cmake-format) (#264)

Co-authored-by: jrmadsen <jrmadsen@users.noreply.github.com>

---------

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2023-11-29 20:43:18 -06:00