Squashed commit of the following: commit f029195705a15700380c6f832ba5d15d46fd6de7 Author: Jonathan R. Madsen <jrmadsen@users.noreply.github.com> Date: Thu Jul 13 14:38:56 2023 -0500 Formatting workflows for source (clang-format) and cmake (cmake-format) (#4) * Add .cmake-format.yaml file * Add formatting workflow * provide base input for creating PR * Update scheme for extracting branch name - disable running formatting on push to amd-staging branch * patch .cmake-format.yaml for find_package signature - apparently cmake-format doesn't format the full signature of find_package * run formatting (clang-format v11) (#7) Co-authored-by: jrmadsen <jrmadsen@users.noreply.github.com> * run cmake formatting (cmake-format) (#6) Co-authored-by: jrmadsen <jrmadsen@users.noreply.github.com> --------- Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> commit bc4d135fdd8a1a9e51235f18a5d575fd2b3735e6 Author: Ammar ELWazir <aelwazir@amd.com> Date: Thu Jul 13 12:55:17 2023 -0500 Removing Build cache for potential issues with auto-generated header files (#5) Change-Id: I9e2319f4335e2f88585ffa6fac2bd88a1c952e6e commit ce86dea6a311d44d880fa684eb78f3329295e2a4 Author: Jonathan R. Madsen <jrmadsen@users.noreply.github.com> Date: Thu Jul 13 11:08:58 2023 -0500 Fix decltype(<hsa-function>) function pointer usage (#3) - the following is done in several places: decltype(hsa_memory_allocate)* hsa_memory_allocate - above can cause compiler errors - replace decltype(<hsa-function>) with decltype(::<hsa-function>) - this ensures that the type within the decltype is recognized as the global scope HSA function, not the variable - in many places, the variable has a "_fn" suffix to prevent this issue but added '::' anyway for consistency commit ac49fdd92a72e9c99394253a02da413a6c2e3b3a Merge: a07946a 03a0855 Author: Ammar ELWazir <aelwazir@amd.com> Date: Wed Jul 12 11:36:24 2023 -0500 Merge pull request #2 from ROCm-Developer-Tools/gerrit-amd-staging Pull from gerrit commit 03a085588cffe863e8f466de67be1cfb205b675a Merge:c26b32ba07946a Author: Ammar ELWazir <aelwazir@amd.com> Date: Wed Jul 12 10:57:30 2023 -0500 Merge branch 'amd-staging' into gerrit-amd-staging commit a07946a5cd4c670c83c27ad1a076a9d4567ce6d7 Author: Ammar ELWazir <Ammar.ELWazir@amd.com> Date: Wed Jul 12 15:46:04 2023 +0000 Enabling Cached Builds commit 525e494a7f13941077a8fd4ad6840904db4d27d4 Author: Ammar ELWazir <Ammar.ELWazir@amd.com> Date: Wed Jul 12 04:53:54 2023 +0000 Updating missed GPU Targets commit 42c75862f628c9bee7cfb7dc04dff2619430efbc Author: Ammar ELWazir <Ammar.ELWazir@amd.com> Date: Wed Jul 12 04:43:02 2023 +0000 Adding V1 Testing commit 9d72fd4aee85e4b0c12e717060d2730fa5b73be1 Author: Ammar ELWazir <Ammar.ELWazir@amd.com> Date: Wed Jul 12 03:34:31 2023 +0000 Fixing Artifacts directory path commit f4000cc558b3b2e4676f7994f7ce8c8e6f94518e Author: Ammar ELWazir <Ammar.ELWazir@amd.com> Date: Wed Jul 12 03:27:26 2023 +0000 Fixing CMake for test build job commit 2ce8115d4c33948c3c8f957f545a95a04e1d6cd2 Author: Ammar ELWazir <Ammar.ELWazir@amd.com> Date: Wed Jul 12 03:16:18 2023 +0000 Fixing Ubuntu CMake for ubuntu test build commit 6d0ed439191be900748d0c025157f9d689a73ec7 Author: Ammar ELWazir <Ammar.ELWazir@amd.com> Date: Wed Jul 12 01:28:41 2023 +0000 Removing Navi21 commit e349a7642e5ae5eb03ab9fcd0a0f74f09f78cab5 Author: Ammar ELWazir <Ammar.ELWazir@amd.com> Date: Wed Jul 12 01:14:14 2023 +0000 Removing Navi21 commit fefd02fe68d2a4bca7ec2e381960ad004ee9fc5b Author: Ammar ELWazir <Ammar.ELWazir@amd.com> Date: Wed Jul 12 00:42:48 2023 +0000 Fixing CMake Job commit 2ea46abf7bf92643efa8c549fa70346ffbd79d65 Author: Ammar ELWazir <Ammar.ELWazir@amd.com> Date: Wed Jul 12 00:35:13 2023 +0000 Fixing CMake Job commit d99d681ed1999c5fcf291dc678b11a77205fb0f3 Author: Ammar ELWazir <Ammar.ELWazir@amd.com> Date: Wed Jul 12 00:32:13 2023 +0000 Fixing Pull Latest Dockers and CMake Jobs commit dfc4498072d13b4a1df3a63047d34c682c3d9a29 Author: Ammar ELWazir <Ammar.ELWazir@amd.com> Date: Tue Jul 11 23:54:21 2023 +0000 Fixing CMake job commit 919efe04de707f7c702031be15c3e2c5f8442cbb Author: Ammar ELWazir <Ammar.ELWazir@amd.com> Date: Tue Jul 11 23:52:13 2023 +0000 Adding Pull Last dockers job commit be1b1256e8b0e05308e8f7e7e69bee3acca55281 Author: Ammar ELWazir <aelwazir@amd.com> Date: Tue Jul 11 18:25:40 2023 -0500 Update cmake.yml commit 212299fa4355ae6ec18f9aaacbb79c51ea6c6f97 Author: Ammar ELWazir <aelwazir@amd.com> Date: Tue Jul 11 18:23:35 2023 -0500 Update cmake.yml commit 7c2c1327086a61466cc6cac39f70865c051a8bc7 Author: Ammar ELWazir <aelwazir@amd.com> Date: Tue Jul 11 18:18:53 2023 -0500 Update cmake.yml commit 191b5ce007e612e814c1d7a3afb4ad398f3852e1 Author: Ammar ELWazir <aelwazir@amd.com> Date: Tue Jul 11 16:03:22 2023 -0500 Update cmake.yml commit 8824113d95f3e13c7ce4d0af8e0d9d8f522a6c4a Author: Ammar ELWazir <Ammar.ELWazir@amd.com> Date: Tue Jul 11 16:28:09 2023 +0000 Fixing Pull from Gerrit job name Change-Id: I9e7ed9a27a13ca49d62c93bdadb30f0057e4d385 commit cc3d5e4b02ffb439e8cc2b3efa53527c376f9982 Author: Ammar ELWazir <Ammar.ELWazir@amd.com> Date: Tue Jul 11 16:21:43 2023 +0000 Adding Staging sync job Change-Id: I0551f43878b0678ce4b3e74e27d62357cf95ad95 commit b9be2eee71380a2e6dd34d520e92d0c4209277a0 Author: Ammar ELWazir <Ammar.ELWazir@amd.com> Date: Tue Jul 11 15:57:11 2023 +0000 Fixing build.sh Change-Id: Ia987b0244f0875370d5fe69907b3f5e9cea914de commit 9eee33a95a1abd656a7ac5ca10a9f245e9825431 Author: Ammar ELWazir <aelwazir@amd.com> Date: Mon Jul 10 21:39:46 2023 -0500 Update cmake.yml commit 7093b85a78497140e8b52632ca2a002bdaeacd62 Author: Ammar ELWazir <aelwazir@amd.com> Date: Mon Jul 10 21:33:29 2023 -0500 Update cmake.yml commit f54697172c72a67740f9fdfa0c217b6ea6931576 Author: Ammar ELWazir <aelwazir@amd.com> Date: Mon Jul 10 21:01:26 2023 -0500 Update cmake.yml commit 1b6620e16f8940386b0f4f04e69e2410d21c0e26 Author: Ammar ELWazir <aelwazir@amd.com> Date: Mon Jul 10 20:21:02 2023 -0500 Update cmake.yml commit a94bec740c6b42c4b79c87bca20fa87b99bf060d Author: Ammar ELWazir <aelwazir@amd.com> Date: Mon Jul 10 19:46:35 2023 -0500 Update cmake.yml commit 85d6b29d4375a69d575c18ece8542c50f2ddfcc3 Author: Ammar ELWazir <aelwazir@amd.com> Date: Mon Jul 10 19:34:39 2023 -0500 Update cmake.yml commit 8c004887cf1435f1a6214c3d2455299a8a27bd4c Author: Ammar ELWazir <aelwazir@amd.com> Date: Mon Jul 10 19:31:17 2023 -0500 Update cmake.yml commit a14a9168e17d9348a53c6e9c9a47ba1edb4c4509 Author: Ammar ELWazir <aelwazir@amd.com> Date: Mon Jul 10 19:25:46 2023 -0500 Update cmake.yml commit 000f2f40b84e6a2f7d4becdbf5aed01436ca4c83 Author: Ammar ELWazir <aelwazir@amd.com> Date: Mon Jul 10 19:08:18 2023 -0500 Update cmake.yml commit a28a53d56731cad848fa9133d1c4dbaa8fc7afa7 Author: Ammar ELWazir <aelwazir@amd.com> Date: Mon Jul 10 19:03:39 2023 -0500 Update cmake.yml commit a6a2db01027f0b01fdfbb5997ddb772c7f51b649 Author: Ammar ELWazir <aelwazir@amd.com> Date: Mon Jul 10 18:21:53 2023 -0500 Update cmake.yml commit 118ef2a88b2d44e3207c31c343da3e5e5ec6f176 Author: Ammar ELWazir <aelwazir@amd.com> Date: Mon Jul 10 17:55:57 2023 -0500 Update cmake.yml commit 03c4c232396440cd0be6d2dd7baf4ceea1c2589d Author: Ammar ELWazir <aelwazir@amd.com> Date: Mon Jul 10 17:48:49 2023 -0500 Create cmake.yml Change-Id: I77992f15694e77cbae49c56f9ff02f4f9079235d [ROCm/rocprofiler commit:d4a33cf33a]
DISCLAIMER
The information presented in this document is for informational purposes only and may contain technical inaccuracies, omissions, and typographical errors. The information contained herein is subject to change and may be rendered inaccurate for many reasons, including but not limited to product and roadmap changes, component and motherboard version changes, new model and/or product releases, product differences between differing manufacturers, software changes, BIOS flashes, firmware upgrades, or the like. Any computer system has risks of security vulnerabilities that cannot be completely prevented or mitigated. AMD assumes no obligation to update or otherwise correct or revise this information. However, AMD reserves the right to revise this information and to make changes from time to time to the content hereof without obligation of AMD to notify any person of such revisions or changes.THIS INFORMATION IS PROVIDED ‘AS IS.” AMD MAKES NO REPRESENTATIONS OR WARRANTIES WITH RESPECT TO THE CONTENTS HEREOF AND ASSUMES NO RESPONSIBILITY FOR ANY INACCURACIES, ERRORS, OR OMISSIONS THAT MAY APPEAR IN THIS INFORMATION. AMD SPECIFICALLY DISCLAIMS ANY IMPLIED WARRANTIES OF NON-INFRINGEMENT, MERCHANTABILITY, OR FITNESS FOR ANY PARTICULAR PURPOSE. IN NO EVENT WILL AMD BE LIABLE TO ANY PERSON FOR ANY RELIANCE, DIRECT, INDIRECT, SPECIAL, OR OTHER CONSEQUENTIAL DAMAGES ARISING FROM THE USE OF ANY INFORMATION CONTAINED HEREIN, EVEN IF AMD IS EXPRESSLY ADVISED OF THE POSSIBILITY OF SUCH DAMAGES. AMD, the AMD Arrow logo, and combinations thereof are trademarks of Advanced Micro Devices, Inc. Other product names used in this publication are for identification purposes only and may be trademarks of their respective companies.
© 2023 Advanced Micro Devices, Inc. All Rights Reserved.
ROCProfiler API Concepts
- Session
- Filter
- Buffer
API Philosophy
The APIs provide a common interface to the users for different features such as profiling, tracing.
In order to make use of any functionality of rocprofv2, one needs to create a "Session" object. This session could be a profiling session/tracing/pc-sampling session etc.
In order to set user inputs, one needs to provide a "Filter" to a session object. This filter could be for counters/traces/pc-samples etc.
Now that the input is taken care of, one also needs to provide a "Buffer" which will store the output results generated during a session. This buffer will contain different records corresponding to the filter type chosen. A flush function can also be specified for the buffer, which will be used to flush the buffer records. A filter and buffer are associated together.
Once a Session, Buffer, Filter have all been created, the session can be started. One can control when the session can be started, stopped and destroyed.
Descriptions of Code Samples
kernel_profiling_sample.cpp
This code sample demonstrates how to use the APIs to collect performance counters and metrics for every kernel dispatch.
tracer_sample.cpp
This code sample demonstrates how to use the APIs to collect different API and activity traces:
- HIP API
- HIP OPS
- HSA API
- HSA OPS
- ROCTX
device_profiling_sample.cpp
This code sample demonstrates how to use the APIs to collect counters and metrics from the GPU via user defined sampling, instead of per-kernel dipatch measurements.
How to compile
In order to get the samples to compile, make sure to copy rocprofiler binaries into /opt/rocm/lib Running 'make install' inside the rocprofiler/build folder will copy the binaries to /opt/rocm/lib
Alternately, change the 'ROCPROFILER_LIBS_PATH' variable in the Makefile to point to the rocprofiler/build folder. After modifications to Makefile are done, run:
# compile all samples
make
# compile kernel_profiling_no_replay_sample.cpp
make kernel_profiling_no_replay_sample
How to run
Before running, ROCPROFILER_METRICS_PATH needs to be set to point to 'derived_counters.xml' If the rocprofiler binaries are present in the rocm installation path /opt/rocm then below command will work:
export ROCPROFILER_METRICS_PATH=/opt/rocm/libexec/rocprofiler/counters/derived_counters.xml
Otherwise, make it point to rocprofiler/build/libexec/rocprofiler/counters/derived_counters.xml like below:
export ROCPROFILER_METRICS_PATH=<path_to_rocprofiler>/rocprofiler/build/libexec/rocprofiler/counters/derived_counters.xml
Finally, run a sample:
./kernel_profiling_no_replay_sample
PC-Sampler
The ROCProfiler library includes an API to enable periodic sampling of the GPU program counter during kernel execution. An example program is included that demonstrates the PC sampling API, with additional code to illustrate a typical non-trivial use case: correlation of sampled PC addresses with their disassembled machine code, as well as source code and symbolic debugging information if available.