Ammar ELWazir a7e5509e8e Merging From Github
Squashed commit of the following:

commit 7ab6644fd04db189801f6cee70a09bb621070b60
Author: Ammar ELWazir <aelwazir@amd.com>
Date:   Sat Jul 15 00:47:30 2023 -0500

    Gerrit amd staging (#49)

    * removing README from tests directory

    Change-Id: Id1162dbbe911e24f02d1bf42dafc93a9a934f71f

    * Fixing navi v1 test hang

    Change-Id: I7416170c126a2d3ec564ed27111f1befc3778b4a

    - Added all gpu targets in build script.
    - Changed memory order to relaxed, which seems to fix the navi hang
    - need to set power state for consistent results
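
For readers unfamiliar with the term, "memory order relaxed" refers to `std::memory_order_relaxed` on C++ atomics. The commit does not say which atomic was changed, so the counter and function names below are hypothetical and only sketch what a relaxed atomic access looks like; they are not taken from the rocprofiler sources.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical completion counter, used only to illustrate relaxed ordering.
std::atomic<std::uint32_t> completed{0};

// Relaxed operations stay atomic but impose no ordering on surrounding
// memory accesses, unlike the default std::memory_order_seq_cst.
void mark_completed() { completed.fetch_add(1, std::memory_order_relaxed); }

std::uint32_t completed_count() {
  return completed.load(std::memory_order_relaxed);
}
```

Relaxed ordering is only appropriate when no other data is published through the atomic; whether that holds for the code touched by this change cannot be told from the commit message.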

    Change-Id: I7416170c126a2d3ec564ed27111f1befc3778b4a

    * Pull from Github

    Squashed commit of the following:

    commit ac49fdd92a72e9c99394253a02da413a6c2e3b3a
    Merge: a07946a 03a0855
    Author: Ammar ELWazir <aelwazir@amd.com>
    Date:   Wed Jul 12 11:36:24 2023 -0500

        Merge pull request #2 from ROCm-Developer-Tools/gerrit-amd-staging

        Pull from gerrit

    commit 03a085588cffe863e8f466de67be1cfb205b675a
    Merge: e88cad2 a07946a
    Author: Ammar ELWazir <aelwazir@amd.com>
    Date:   Wed Jul 12 10:57:30 2023 -0500

        Merge branch 'amd-staging' into gerrit-amd-staging

    commit a07946a5cd4c670c83c27ad1a076a9d4567ce6d7
    Author: Ammar ELWazir <Ammar.ELWazir@amd.com>
    Date:   Wed Jul 12 15:46:04 2023 +0000

        Enabling Cached Builds

    commit 525e494a7f13941077a8fd4ad6840904db4d27d4
    Author: Ammar ELWazir <Ammar.ELWazir@amd.com>
    Date:   Wed Jul 12 04:53:54 2023 +0000

        Updating missed GPU Targets

    commit 42c75862f628c9bee7cfb7dc04dff2619430efbc
    Author: Ammar ELWazir <Ammar.ELWazir@amd.com>
    Date:   Wed Jul 12 04:43:02 2023 +0000

        Adding V1 Testing

    commit 9d72fd4aee85e4b0c12e717060d2730fa5b73be1
    Author: Ammar ELWazir <Ammar.ELWazir@amd.com>
    Date:   Wed Jul 12 03:34:31 2023 +0000

        Fixing Artifacts directory path

    commit f4000cc558b3b2e4676f7994f7ce8c8e6f94518e
    Author: Ammar ELWazir <Ammar.ELWazir@amd.com>
    Date:   Wed Jul 12 03:27:26 2023 +0000

        Fixing CMake for test build job

    commit 2ce8115d4c33948c3c8f957f545a95a04e1d6cd2
    Author: Ammar ELWazir <Ammar.ELWazir@amd.com>
    Date:   Wed Jul 12 03:16:18 2023 +0000

        Fixing Ubuntu CMake for ubuntu test build

    commit 6d0ed439191be900748d0c025157f9d689a73ec7
    Author: Ammar ELWazir <Ammar.ELWazir@amd.com>
    Date:   Wed Jul 12 01:28:41 2023 +0000

        Removing Navi21

    commit e349a7642e5ae5eb03ab9fcd0a0f74f09f78cab5
    Author: Ammar ELWazir <Ammar.ELWazir@amd.com>
    Date:   Wed Jul 12 01:14:14 2023 +0000

        Removing Navi21

    commit fefd02fe68d2a4bca7ec2e381960ad004ee9fc5b
    Author: Ammar ELWazir <Ammar.ELWazir@amd.com>
    Date:   Wed Jul 12 00:42:48 2023 +0000

        Fixing CMake Job

    commit 2ea46abf7bf92643efa8c549fa70346ffbd79d65
    Author: Ammar ELWazir <Ammar.ELWazir@amd.com>
    Date:   Wed Jul 12 00:35:13 2023 +0000

        Fixing CMake Job

    commit d99d681ed1999c5fcf291dc678b11a77205fb0f3
    Author: Ammar ELWazir <Ammar.ELWazir@amd.com>
    Date:   Wed Jul 12 00:32:13 2023 +0000

        Fixing Pull Latest Dockers and CMake Jobs

    commit dfc4498072d13b4a1df3a63047d34c682c3d9a29
    Author: Ammar ELWazir <Ammar.ELWazir@amd.com>
    Date:   Tue Jul 11 23:54:21 2023 +0000

        Fixing CMake job

    commit 919efe04de707f7c702031be15c3e2c5f8442cbb
    Author: Ammar ELWazir <Ammar.ELWazir@amd.com>
    Date:   Tue Jul 11 23:52:13 2023 +0000

        Adding Pull Last dockers job

    commit be1b1256e8b0e05308e8f7e7e69bee3acca55281
    Author: Ammar ELWazir <aelwazir@amd.com>
    Date:   Tue Jul 11 18:25:40 2023 -0500

        Update cmake.yml

    commit 212299fa4355ae6ec18f9aaacbb79c51ea6c6f97
    Author: Ammar ELWazir <aelwazir@amd.com>
    Date:   Tue Jul 11 18:23:35 2023 -0500

        Update cmake.yml

    commit 7c2c1327086a61466cc6cac39f70865c051a8bc7
    Author: Ammar ELWazir <aelwazir@amd.com>
    Date:   Tue Jul 11 18:18:53 2023 -0500

        Update cmake.yml

    commit 191b5ce007e612e814c1d7a3afb4ad398f3852e1
    Author: Ammar ELWazir <aelwazir@amd.com>
    Date:   Tue Jul 11 16:03:22 2023 -0500

        Update cmake.yml

    commit 8824113d95f3e13c7ce4d0af8e0d9d8f522a6c4a
    Author: Ammar ELWazir <Ammar.ELWazir@amd.com>
    Date:   Tue Jul 11 16:28:09 2023 +0000

        Fixing Pull from Gerrit job name

        Change-Id: I9e7ed9a27a13ca49d62c93bdadb30f0057e4d385

    commit cc3d5e4b02ffb439e8cc2b3efa53527c376f9982
    Author: Ammar ELWazir <Ammar.ELWazir@amd.com>
    Date:   Tue Jul 11 16:21:43 2023 +0000

        Adding Staging sync job

        Change-Id: I0551f43878b0678ce4b3e74e27d62357cf95ad95

    commit b9be2eee71380a2e6dd34d520e92d0c4209277a0
    Author: Ammar ELWazir <Ammar.ELWazir@amd.com>
    Date:   Tue Jul 11 15:57:11 2023 +0000

        Fixing build.sh

        Change-Id: Ia987b0244f0875370d5fe69907b3f5e9cea914de

    commit 9eee33a95a1abd656a7ac5ca10a9f245e9825431
    Author: Ammar ELWazir <aelwazir@amd.com>
    Date:   Mon Jul 10 21:39:46 2023 -0500

        Update cmake.yml

    commit 7093b85a78497140e8b52632ca2a002bdaeacd62
    Author: Ammar ELWazir <aelwazir@amd.com>
    Date:   Mon Jul 10 21:33:29 2023 -0500

        Update cmake.yml

    commit f54697172c72a67740f9fdfa0c217b6ea6931576
    Author: Ammar ELWazir <aelwazir@amd.com>
    Date:   Mon Jul 10 21:01:26 2023 -0500

        Update cmake.yml

    commit 1b6620e16f8940386b0f4f04e69e2410d21c0e26
    Author: Ammar ELWazir <aelwazir@amd.com>
    Date:   Mon Jul 10 20:21:02 2023 -0500

        Update cmake.yml

    commit a94bec740c6b42c4b79c87bca20fa87b99bf060d
    Author: Ammar ELWazir <aelwazir@amd.com>
    Date:   Mon Jul 10 19:46:35 2023 -0500

        Update cmake.yml

    commit 85d6b29d4375a69d575c18ece8542c50f2ddfcc3
    Author: Ammar ELWazir <aelwazir@amd.com>
    Date:   Mon Jul 10 19:34:39 2023 -0500

        Update cmake.yml

    commit 8c004887cf1435f1a6214c3d2455299a8a27bd4c
    Author: Ammar ELWazir <aelwazir@amd.com>
    Date:   Mon Jul 10 19:31:17 2023 -0500

        Update cmake.yml

    commit a14a9168e17d9348a53c6e9c9a47ba1edb4c4509
    Author: Ammar ELWazir <aelwazir@amd.com>
    Date:   Mon Jul 10 19:25:46 2023 -0500

        Update cmake.yml

    commit 000f2f40b84e6a2f7d4becdbf5aed01436ca4c83
    Author: Ammar ELWazir <aelwazir@amd.com>
    Date:   Mon Jul 10 19:08:18 2023 -0500

        Update cmake.yml

    commit a28a53d56731cad848fa9133d1c4dbaa8fc7afa7
    Author: Ammar ELWazir <aelwazir@amd.com>
    Date:   Mon Jul 10 19:03:39 2023 -0500

        Update cmake.yml

    commit a6a2db01027f0b01fdfbb5997ddb772c7f51b649
    Author: Ammar ELWazir <aelwazir@amd.com>
    Date:   Mon Jul 10 18:21:53 2023 -0500

        Update cmake.yml

    commit 118ef2a88b2d44e3207c31c343da3e5e5ec6f176
    Author: Ammar ELWazir <aelwazir@amd.com>
    Date:   Mon Jul 10 17:55:57 2023 -0500

        Update cmake.yml

    commit 03c4c232396440cd0be6d2dd7baf4ceea1c2589d
    Author: Ammar ELWazir <aelwazir@amd.com>
    Date:   Mon Jul 10 17:48:49 2023 -0500

        Create cmake.yml

    Change-Id: I2223efd600dcd8a4f695e61491b94b7f12ae2c5b

    * SWDEV-409195: Added instructions for ATT help.

    Change-Id: Ie76518dd54c3de82abfbd64b5e8c44a43edc8a09

    * Pull from Github

    Squashed commit of the following:

    commit f029195705a15700380c6f832ba5d15d46fd6de7
    Author: Jonathan R. Madsen <jrmadsen@users.noreply.github.com>
    Date:   Thu Jul 13 14:38:56 2023 -0500

        Formatting workflows for source (clang-format) and cmake (cmake-format) (#4)

        * Add .cmake-format.yaml file

        * Add formatting workflow

        * provide base input for creating PR

        * Update scheme for extracting branch name

        - disable running formatting on push to amd-staging branch

        * patch .cmake-format.yaml for find_package signature

        - apparently cmake-format doesn't format the full signature of find_package

        * run formatting (clang-format v11) (#7)

        Co-authored-by: jrmadsen <jrmadsen@users.noreply.github.com>

        * run cmake formatting (cmake-format) (#6)

        Co-authored-by: jrmadsen <jrmadsen@users.noreply.github.com>

        ---------

        Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

    commit bc4d135fdd8a1a9e51235f18a5d575fd2b3735e6
    Author: Ammar ELWazir <aelwazir@amd.com>
    Date:   Thu Jul 13 12:55:17 2023 -0500

        Removing Build cache for potential issues with auto-generated header files (#5)

        Change-Id: I9e2319f4335e2f88585ffa6fac2bd88a1c952e6e

    commit ce86dea6a311d44d880fa684eb78f3329295e2a4
    Author: Jonathan R. Madsen <jrmadsen@users.noreply.github.com>
    Date:   Thu Jul 13 11:08:58 2023 -0500

        Fix decltype(<hsa-function>) function pointer usage (#3)

        - the following is done in several places:
            decltype(hsa_memory_allocate)* hsa_memory_allocate
        - above can cause compiler errors
        - replace decltype(<hsa-function>) with decltype(::<hsa-function>)
          - this ensures that the type within the decltype is recognized as the global scope HSA function, not the variable
        - in many places, the variable has a "_fn" suffix to prevent this issue but added '::' anyway for consistency

    commit ac49fdd92a72e9c99394253a02da413a6c2e3b3a
    Merge: a07946a 03a0855
    Author: Ammar ELWazir <aelwazir@amd.com>
    Date:   Wed Jul 12 11:36:24 2023 -0500

        Merge pull request #2 from ROCm-Developer-Tools/gerrit-amd-staging

        Pull from gerrit

    commit 03a085588cffe863e8f466de67be1cfb205b675a
    Merge: e88cad2 a07946a
    Author: Ammar ELWazir <aelwazir@amd.com>
    Date:   Wed Jul 12 10:57:30 2023 -0500

        Merge branch 'amd-staging' into gerrit-amd-staging

    commit a07946a5cd4c670c83c27ad1a076a9d4567ce6d7
    Author: Ammar ELWazir <Ammar.ELWazir@amd.com>
    Date:   Wed Jul 12 15:46:04 2023 +0000

        Enabling Cached Builds

    commit 525e494a7f13941077a8fd4ad6840904db4d27d4
    Author: Ammar ELWazir <Ammar.ELWazir@amd.com>
    Date:   Wed Jul 12 04:53:54 2023 +0000

        Updating missed GPU Targets

    commit 42c75862f628c9bee7cfb7dc04dff2619430efbc
    Author: Ammar ELWazir <Ammar.ELWazir@amd.com>
    Date:   Wed Jul 12 04:43:02 2023 +0000

        Adding V1 Testing

    commit 9d72fd4aee85e4b0c12e717060d2730fa5b73be1
    Author: Ammar ELWazir <Ammar.ELWazir@amd.com>
    Date:   Wed Jul 12 03:34:31 2023 +0000

        Fixing Artifacts directory path

    commit f4000cc558b3b2e4676f7994f7ce8c8e6f94518e
    Author: Ammar ELWazir <Ammar.ELWazir@amd.com>
    Date:   Wed Jul 12 03:27:26 2023 +0000

        Fixing CMake for test build job

    commit 2ce8115d4c33948c3c8f957f545a95a04e1d6cd2
    Author: Ammar ELWazir <Ammar.ELWazir@amd.com>
    Date:   Wed Jul 12 03:16:18 2023 +0000

        Fixing Ubuntu CMake for ubuntu test build

    commit 6d0ed439191be900748d0c025157f9d689a73ec7
    Author: Ammar ELWazir <Ammar.ELWazir@amd.com>
    Date:   Wed Jul 12 01:28:41 2023 +0000

        Removing Navi21

    commit e349a7642e5ae5eb03ab9fcd0a0f74f09f78cab5
    Author: Ammar ELWazir <Ammar.ELWazir@amd.com>
    Date:   Wed Jul 12 01:14:14 2023 +0000

        Removing Navi21

    commit fefd02fe68d2a4bca7ec2e381960ad004ee9fc5b
    Author: Ammar ELWazir <Ammar.ELWazir@amd.com>
    Date:   Wed Jul 12 00:42:48 2023 +0000

        Fixing CMake Job

    commit 2ea46abf7bf92643efa8c549fa70346ffbd79d65
    Author: Ammar ELWazir <Ammar.ELWazir@amd.com>
    Date:   Wed Jul 12 00:35:13 2023 +0000

        Fixing CMake Job

    commit d99d681ed1999c5fcf291dc678b11a77205fb0f3
    Author: Ammar ELWazir <Ammar.ELWazir@amd.com>
    Date:   Wed Jul 12 00:32:13 2023 +0000

        Fixing Pull Latest Dockers and CMake Jobs

    commit dfc4498072d13b4a1df3a63047d34c682c3d9a29
    Author: Ammar ELWazir <Ammar.ELWazir@amd.com>
    Date:   Tue Jul 11 23:54:21 2023 +0000

        Fixing CMake job

    commit 919efe04de707f7c702031be15c3e2c5f8442cbb
    Author: Ammar ELWazir <Ammar.ELWazir@amd.com>
    Date:   Tue Jul 11 23:52:13 2023 +0000

        Adding Pull Last dockers job

    commit be1b1256e8b0e05308e8f7e7e69bee3acca55281
    Author: Ammar ELWazir <aelwazir@amd.com>
    Date:   Tue Jul 11 18:25:40 2023 -0500

        Update cmake.yml

    commit 212299fa4355ae6ec18f9aaacbb79c51ea6c6f97
    Author: Ammar ELWazir <aelwazir@amd.com>
    Date:   Tue Jul 11 18:23:35 2023 -0500

        Update cmake.yml

    commit 7c2c1327086a61466cc6cac39f70865c051a8bc7
    Author: Ammar ELWazir <aelwazir@amd.com>
    Date:   Tue Jul 11 18:18:53 2023 -0500

        Update cmake.yml

    commit 191b5ce007e612e814c1d7a3afb4ad398f3852e1
    Author: Ammar ELWazir <aelwazir@amd.com>
    Date:   Tue Jul 11 16:03:22 2023 -0500

        Update cmake.yml

    commit 8824113d95f3e13c7ce4d0af8e0d9d8f522a6c4a
    Author: Ammar ELWazir <Ammar.ELWazir@amd.com>
    Date:   Tue Jul 11 16:28:09 2023 +0000

        Fixing Pull from Gerrit job name

        Change-Id: I9e7ed9a27a13ca49d62c93bdadb30f0057e4d385

    commit cc3d5e4b02ffb439e8cc2b3efa53527c376f9982
    Author: Ammar ELWazir <Ammar.ELWazir@amd.com>
    Date:   Tue Jul 11 16:21:43 2023 +0000

        Adding Staging sync job

        Change-Id: I0551f43878b0678ce4b3e74e27d62357cf95ad95

    commit b9be2eee71380a2e6dd34d520e92d0c4209277a0
    Author: Ammar ELWazir <Ammar.ELWazir@amd.com>
    Date:   Tue Jul 11 15:57:11 2023 +0000

        Fixing build.sh

        Change-Id: Ia987b0244f0875370d5fe69907b3f5e9cea914de

    commit 9eee33a95a1abd656a7ac5ca10a9f245e9825431
    Author: Ammar ELWazir <aelwazir@amd.com>
    Date:   Mon Jul 10 21:39:46 2023 -0500

        Update cmake.yml

    commit 7093b85a78497140e8b52632ca2a002bdaeacd62
    Author: Ammar ELWazir <aelwazir@amd.com>
    Date:   Mon Jul 10 21:33:29 2023 -0500

        Update cmake.yml

    commit f54697172c72a67740f9fdfa0c217b6ea6931576
    Author: Ammar ELWazir <aelwazir@amd.com>
    Date:   Mon Jul 10 21:01:26 2023 -0500

        Update cmake.yml

    commit 1b6620e16f8940386b0f4f04e69e2410d21c0e26
    Author: Ammar ELWazir <aelwazir@amd.com>
    Date:   Mon Jul 10 20:21:02 2023 -0500

        Update cmake.yml

    commit a94bec740c6b42c4b79c87bca20fa87b99bf060d
    Author: Ammar ELWazir <aelwazir@amd.com>
    Date:   Mon Jul 10 19:46:35 2023 -0500

        Update cmake.yml

    commit 85d6b29d4375a69d575c18ece8542c50f2ddfcc3
    Author: Ammar ELWazir <aelwazir@amd.com>
    Date:   Mon Jul 10 19:34:39 2023 -0500

        Update cmake.yml

    commit 8c004887cf1435f1a6214c3d2455299a8a27bd4c
    Author: Ammar ELWazir <aelwazir@amd.com>
    Date:   Mon Jul 10 19:31:17 2023 -0500

        Update cmake.yml

    commit a14a9168e17d9348a53c6e9c9a47ba1edb4c4509
    Author: Ammar ELWazir <aelwazir@amd.com>
    Date:   Mon Jul 10 19:25:46 2023 -0500

        Update cmake.yml

    commit 000f2f40b84e6a2f7d4becdbf5aed01436ca4c83
    Author: Ammar ELWazir <aelwazir@amd.com>
    Date:   Mon Jul 10 19:08:18 2023 -0500

        Update cmake.yml

    commit a28a53d56731cad848fa9133d1c4dbaa8fc7afa7
    Author: Ammar ELWazir <aelwazir@amd.com>
    Date:   Mon Jul 10 19:03:39 2023 -0500

        Update cmake.yml

    commit a6a2db01027f0b01fdfbb5997ddb772c7f51b649
    Author: Ammar ELWazir <aelwazir@amd.com>
    Date:   Mon Jul 10 18:21:53 2023 -0500

        Update cmake.yml

    commit 118ef2a88b2d44e3207c31c343da3e5e5ec6f176
    Author: Ammar ELWazir <aelwazir@amd.com>
    Date:   Mon Jul 10 17:55:57 2023 -0500

        Update cmake.yml

    commit 03c4c232396440cd0be6d2dd7baf4ceea1c2589d
    Author: Ammar ELWazir <aelwazir@amd.com>
    Date:   Mon Jul 10 17:48:49 2023 -0500

        Create cmake.yml

    Change-Id: I77992f15694e77cbae49c56f9ff02f4f9079235d

    * Added error handling to att iterate_data. Fix for genasm.

    Change-Id: Ia86e629e74c6e00b98155355beabf69681a88875

    * SWDEV-409575 - Append additional RPATH to libraries and binaries installed in /opt/rocm-ver/lib/rocprofiler

    Append the RPATH $ORIGIN/.. to component-specific libraries.
    Binaries installed in /opt/rocm-ver/lib/rocprofiler get $ORIGIN/.. appended.
    Binaries installed in /opt/rocm-ver/libexec/rocprofiler get $ORIGIN/../../lib appended.
    Use the TARGET form to install rocprof-ctrl and librocprof-tool in the runtime component.

    Change-Id: I53b7a283c6a8ddea97d4889db6010832389894bb

    * run cmake formatting (cmake-format) (#50)

    Co-authored-by: ammarwa <ammarwa@users.noreply.github.com>

    * run formatting (clang-format v11) (#51)

    Co-authored-by: ammarwa <ammarwa@users.noreply.github.com>

    * Update CMakeLists.txt

    Co-authored-by: Jonathan R. Madsen <jrmadsen@users.noreply.github.com>

    ---------

    Co-authored-by: gobhardw <gopesh.bhardwaj@amd.com>
    Co-authored-by: Giovanni  LB <gbaraldi@amd.com>
    Co-authored-by: Ranjith Ramakrishnan <Ranjith.Ramakrishnan@amd.com>
    Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
    Co-authored-by: ammarwa <ammarwa@users.noreply.github.com>
    Co-authored-by: Jonathan R. Madsen <jrmadsen@users.noreply.github.com>

commit 3476ef7afe4e7af0a282b42da4b06ec8b0b9301a
Author: Jonathan R. Madsen <jrmadsen@users.noreply.github.com>
Date:   Fri Jul 14 19:11:41 2023 -0500

    Workflow simplification and cancellation (#48)

    * Update formatting workflow

    - ignore changes to pull_*.yml workflows

    * Update pull_from_gerrit.yml workflow

    - allow manual trigger (workflow dispatch)
    - concurrent cancellation

    * Update pull_latest_dockers.yml workflow

    - simplify workflow significantly by using a matrix
    - allow manual trigger (workflow dispatch)
    - concurrent cancellation
    - run when pushed

    * Update CMake workflow

    - ignore changes to pull_*.yml workflows
    - concurrent cancellation

commit f053319a4873b3d0d5d6a5074238c0371e0c9f60
Author: Ammar ELWazir <aelwazir@amd.com>
Date:   Thu Jul 13 20:27:40 2023 -0500

    Update and rename pull.yml to pull_from_gerrit.yml (#44)

commit 90b423ebfaf35cc14d6c3b07c617e2346140853f
Author: Ammar ELWazir <aelwazir@amd.com>
Date:   Thu Jul 13 19:40:45 2023 -0500

    Update pull_latest_dockers.yml

commit 93acde8ed69766fb6d3482a1be8238f322b1db75
Author: Ammar ELWazir <aelwazir@amd.com>
Date:   Thu Jul 13 19:38:29 2023 -0500

    Update pull_latest_dockers.yml

commit 0092267a800ef1571bdb423272a8f2a2b8a641e6
Author: Ammar ELWazir <aelwazir@amd.com>
Date:   Thu Jul 13 19:29:47 2023 -0500

    Gerrit amd staging (#9)

    * removing README from tests directory

    Change-Id: Id1162dbbe911e24f02d1bf42dafc93a9a934f71f

    * Fixing navi v1 test hang

    Change-Id: I7416170c126a2d3ec564ed27111f1befc3778b4a

    - Added all gpu targets in build script.
    - Changed memory order to relaxed, which seems to fix the navi hang
    - need to set power state for consistent results

    Change-Id: I7416170c126a2d3ec564ed27111f1befc3778b4a

    * Pull from Github

    Squashed commit of the following:

    commit ac49fdd92a72e9c99394253a02da413a6c2e3b3a
    Merge: a07946a 03a0855
    Author: Ammar ELWazir <aelwazir@amd.com>
    Date:   Wed Jul 12 11:36:24 2023 -0500

        Merge pull request #2 from ROCm-Developer-Tools/gerrit-amd-staging

        Pull from gerrit

    commit 03a085588cffe863e8f466de67be1cfb205b675a
    Merge: e88cad2 a07946a
    Author: Ammar ELWazir <aelwazir@amd.com>
    Date:   Wed Jul 12 10:57:30 2023 -0500

        Merge branch 'amd-staging' into gerrit-amd-staging

    commit a07946a5cd4c670c83c27ad1a076a9d4567ce6d7
    Author: Ammar ELWazir <Ammar.ELWazir@amd.com>
    Date:   Wed Jul 12 15:46:04 2023 +0000

        Enabling Cached Builds

    commit 525e494a7f13941077a8fd4ad6840904db4d27d4
    Author: Ammar ELWazir <Ammar.ELWazir@amd.com>
    Date:   Wed Jul 12 04:53:54 2023 +0000

        Updating missed GPU Targets

    commit 42c75862f628c9bee7cfb7dc04dff2619430efbc
    Author: Ammar ELWazir <Ammar.ELWazir@amd.com>
    Date:   Wed Jul 12 04:43:02 2023 +0000

        Adding V1 Testing

    commit 9d72fd4aee85e4b0c12e717060d2730fa5b73be1
    Author: Ammar ELWazir <Ammar.ELWazir@amd.com>
    Date:   Wed Jul 12 03:34:31 2023 +0000

        Fixing Artifacts directory path

    commit f4000cc558b3b2e4676f7994f7ce8c8e6f94518e
    Author: Ammar ELWazir <Ammar.ELWazir@amd.com>
    Date:   Wed Jul 12 03:27:26 2023 +0000

        Fixing CMake for test build job

    commit 2ce8115d4c33948c3c8f957f545a95a04e1d6cd2
    Author: Ammar ELWazir <Ammar.ELWazir@amd.com>
    Date:   Wed Jul 12 03:16:18 2023 +0000

        Fixing Ubuntu CMake for ubuntu test build

    commit 6d0ed439191be900748d0c025157f9d689a73ec7
    Author: Ammar ELWazir <Ammar.ELWazir@amd.com>
    Date:   Wed Jul 12 01:28:41 2023 +0000

        Removing Navi21

    commit e349a7642e5ae5eb03ab9fcd0a0f74f09f78cab5
    Author: Ammar ELWazir <Ammar.ELWazir@amd.com>
    Date:   Wed Jul 12 01:14:14 2023 +0000

        Removing Navi21

    commit fefd02fe68d2a4bca7ec2e381960ad004ee9fc5b
    Author: Ammar ELWazir <Ammar.ELWazir@amd.com>
    Date:   Wed Jul 12 00:42:48 2023 +0000

        Fixing CMake Job

    commit 2ea46abf7bf92643efa8c549fa70346ffbd79d65
    Author: Ammar ELWazir <Ammar.ELWazir@amd.com>
    Date:   Wed Jul 12 00:35:13 2023 +0000

        Fixing CMake Job

    commit d99d681ed1999c5fcf291dc678b11a77205fb0f3
    Author: Ammar ELWazir <Ammar.ELWazir@amd.com>
    Date:   Wed Jul 12 00:32:13 2023 +0000

        Fixing Pull Latest Dockers and CMake Jobs

    commit dfc4498072d13b4a1df3a63047d34c682c3d9a29
    Author: Ammar ELWazir <Ammar.ELWazir@amd.com>
    Date:   Tue Jul 11 23:54:21 2023 +0000

        Fixing CMake job

    commit 919efe04de707f7c702031be15c3e2c5f8442cbb
    Author: Ammar ELWazir <Ammar.ELWazir@amd.com>
    Date:   Tue Jul 11 23:52:13 2023 +0000

        Adding Pull Last dockers job

    commit be1b1256e8b0e05308e8f7e7e69bee3acca55281
    Author: Ammar ELWazir <aelwazir@amd.com>
    Date:   Tue Jul 11 18:25:40 2023 -0500

        Update cmake.yml

    commit 212299fa4355ae6ec18f9aaacbb79c51ea6c6f97
    Author: Ammar ELWazir <aelwazir@amd.com>
    Date:   Tue Jul 11 18:23:35 2023 -0500

        Update cmake.yml

    commit 7c2c1327086a61466cc6cac39f70865c051a8bc7
    Author: Ammar ELWazir <aelwazir@amd.com>
    Date:   Tue Jul 11 18:18:53 2023 -0500

        Update cmake.yml

    commit 191b5ce007e612e814c1d7a3afb4ad398f3852e1
    Author: Ammar ELWazir <aelwazir@amd.com>
    Date:   Tue Jul 11 16:03:22 2023 -0500

        Update cmake.yml

    commit 8824113d95f3e13c7ce4d0af8e0d9d8f522a6c4a
    Author: Ammar ELWazir <Ammar.ELWazir@amd.com>
    Date:   Tue Jul 11 16:28:09 2023 +0000

        Fixing Pull from Gerrit job name

        Change-Id: I9e7ed9a27a13ca49d62c93bdadb30f0057e4d385

    commit cc3d5e4b02ffb439e8cc2b3efa53527c376f9982
    Author: Ammar ELWazir <Ammar.ELWazir@amd.com>
    Date:   Tue Jul 11 16:21:43 2023 +0000

        Adding Staging sync job

        Change-Id: I0551f43878b0678ce4b3e74e27d62357cf95ad95

    commit b9be2eee71380a2e6dd34d520e92d0c4209277a0
    Author: Ammar ELWazir <Ammar.ELWazir@amd.com>
    Date:   Tue Jul 11 15:57:11 2023 +0000

        Fixing build.sh

        Change-Id: Ia987b0244f0875370d5fe69907b3f5e9cea914de

    commit 9eee33a95a1abd656a7ac5ca10a9f245e9825431
    Author: Ammar ELWazir <aelwazir@amd.com>
    Date:   Mon Jul 10 21:39:46 2023 -0500

        Update cmake.yml

    commit 7093b85a78497140e8b52632ca2a002bdaeacd62
    Author: Ammar ELWazir <aelwazir@amd.com>
    Date:   Mon Jul 10 21:33:29 2023 -0500

        Update cmake.yml

    commit f54697172c72a67740f9fdfa0c217b6ea6931576
    Author: Ammar ELWazir <aelwazir@amd.com>
    Date:   Mon Jul 10 21:01:26 2023 -0500

        Update cmake.yml

    commit 1b6620e16f8940386b0f4f04e69e2410d21c0e26
    Author: Ammar ELWazir <aelwazir@amd.com>
    Date:   Mon Jul 10 20:21:02 2023 -0500

        Update cmake.yml

    commit a94bec740c6b42c4b79c87bca20fa87b99bf060d
    Author: Ammar ELWazir <aelwazir@amd.com>
    Date:   Mon Jul 10 19:46:35 2023 -0500

        Update cmake.yml

    commit 85d6b29d4375a69d575c18ece8542c50f2ddfcc3
    Author: Ammar ELWazir <aelwazir@amd.com>
    Date:   Mon Jul 10 19:34:39 2023 -0500

        Update cmake.yml

    commit 8c004887cf1435f1a6214c3d2455299a8a27bd4c
    Author: Ammar ELWazir <aelwazir@amd.com>
    Date:   Mon Jul 10 19:31:17 2023 -0500

        Update cmake.yml

    commit a14a9168e17d9348a53c6e9c9a47ba1edb4c4509
    Author: Ammar ELWazir <aelwazir@amd.com>
    Date:   Mon Jul 10 19:25:46 2023 -0500

        Update cmake.yml

    commit 000f2f40b84e6a2f7d4becdbf5aed01436ca4c83
    Author: Ammar ELWazir <aelwazir@amd.com>
    Date:   Mon Jul 10 19:08:18 2023 -0500

        Update cmake.yml

    commit a28a53d56731cad848fa9133d1c4dbaa8fc7afa7
    Author: Ammar ELWazir <aelwazir@amd.com>
    Date:   Mon Jul 10 19:03:39 2023 -0500

        Update cmake.yml

    commit a6a2db01027f0b01fdfbb5997ddb772c7f51b649
    Author: Ammar ELWazir <aelwazir@amd.com>
    Date:   Mon Jul 10 18:21:53 2023 -0500

        Update cmake.yml

    commit 118ef2a88b2d44e3207c31c343da3e5e5ec6f176
    Author: Ammar ELWazir <aelwazir@amd.com>
    Date:   Mon Jul 10 17:55:57 2023 -0500

        Update cmake.yml

    commit 03c4c232396440cd0be6d2dd7baf4ceea1c2589d
    Author: Ammar ELWazir <aelwazir@amd.com>
    Date:   Mon Jul 10 17:48:49 2023 -0500

        Create cmake.yml

    Change-Id: I2223efd600dcd8a4f695e61491b94b7f12ae2c5b

    * run formatting (clang-format v11) (#10)

    Co-authored-by: ammarwa <ammarwa@users.noreply.github.com>

    ---------

    Co-authored-by: gobhardw <gopesh.bhardwaj@amd.com>
    Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
    Co-authored-by: ammarwa <ammarwa@users.noreply.github.com>

commit f029195705a15700380c6f832ba5d15d46fd6de7
Author: Jonathan R. Madsen <jrmadsen@users.noreply.github.com>
Date:   Thu Jul 13 14:38:56 2023 -0500

    Formatting workflows for source (clang-format) and cmake (cmake-format) (#4)

    * Add .cmake-format.yaml file

    * Add formatting workflow

    * provide base input for creating PR

    * Update scheme for extracting branch name

    - disable running formatting on push to amd-staging branch

    * patch .cmake-format.yaml for find_package signature

    - apparently cmake-format doesn't format the full signature of find_package

    * run formatting (clang-format v11) (#7)

    Co-authored-by: jrmadsen <jrmadsen@users.noreply.github.com>

    * run cmake formatting (cmake-format) (#6)

    Co-authored-by: jrmadsen <jrmadsen@users.noreply.github.com>

    ---------

    Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

commit bc4d135fdd8a1a9e51235f18a5d575fd2b3735e6
Author: Ammar ELWazir <aelwazir@amd.com>
Date:   Thu Jul 13 12:55:17 2023 -0500

    Removing Build cache for potential issues with auto-generated header files (#5)

    Change-Id: I9e2319f4335e2f88585ffa6fac2bd88a1c952e6e

commit ce86dea6a311d44d880fa684eb78f3329295e2a4
Author: Jonathan R. Madsen <jrmadsen@users.noreply.github.com>
Date:   Thu Jul 13 11:08:58 2023 -0500

    Fix decltype(<hsa-function>) function pointer usage (#3)

    - the following is done in several places:
        decltype(hsa_memory_allocate)* hsa_memory_allocate
    - above can cause compiler errors
    - replace decltype(<hsa-function>) with decltype(::<hsa-function>)
      - this ensures that the type within the decltype is recognized as the global scope HSA function, not the variable
    - in many places, the variable has a "_fn" suffix to prevent this issue but added '::' anyway for consistency
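
The name clash described in this commit can be sketched as follows. This is a minimal illustration, not the project's code: the function here is a stand-in defined locally with a simplified signature (the real `hsa_memory_allocate` comes from the HSA headers), and `HsaTable` and `allocate_through_table` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>

// Stand-in for an HSA entry point, defined here so the sketch is self-contained.
int hsa_memory_allocate(std::size_t size, void** ptr) {
  static char buffer[64];
  if (size > sizeof(buffer)) return 1;  // non-zero signals failure in this sketch
  *ptr = buffer;
  return 0;
}

// Hypothetical dispatch table whose member shares the function's name.
struct HsaTable {
  // Unqualified form, which some compilers reject because the name
  // changes meaning inside the class once the member is declared:
  //   decltype(hsa_memory_allocate)* hsa_memory_allocate;
  // Qualified form: '::' always refers to the global function.
  decltype(::hsa_memory_allocate)* hsa_memory_allocate = &::hsa_memory_allocate;
};

// Calls the global function through the table's function pointer.
int allocate_through_table(std::size_t size) {
  HsaTable table;
  void* ptr = nullptr;
  return table.hsa_memory_allocate(size, &ptr);
}
```

Giving the member a different name, such as the "_fn" suffix mentioned above, avoids the clash as well; the `::` qualifier simply makes the intent explicit in both cases.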

commit ac49fdd92a72e9c99394253a02da413a6c2e3b3a
Merge: a07946a 03a0855
Author: Ammar ELWazir <aelwazir@amd.com>
Date:   Wed Jul 12 11:36:24 2023 -0500

    Merge pull request #2 from ROCm-Developer-Tools/gerrit-amd-staging

    Pull from gerrit

commit 03a085588cffe863e8f466de67be1cfb205b675a
Merge: e88cad2 a07946a
Author: Ammar ELWazir <aelwazir@amd.com>
Date:   Wed Jul 12 10:57:30 2023 -0500

    Merge branch 'amd-staging' into gerrit-amd-staging

commit a07946a5cd4c670c83c27ad1a076a9d4567ce6d7
Author: Ammar ELWazir <Ammar.ELWazir@amd.com>
Date:   Wed Jul 12 15:46:04 2023 +0000

    Enabling Cached Builds

commit 525e494a7f13941077a8fd4ad6840904db4d27d4
Author: Ammar ELWazir <Ammar.ELWazir@amd.com>
Date:   Wed Jul 12 04:53:54 2023 +0000

    Updating missed GPU Targets

commit 42c75862f628c9bee7cfb7dc04dff2619430efbc
Author: Ammar ELWazir <Ammar.ELWazir@amd.com>
Date:   Wed Jul 12 04:43:02 2023 +0000

    Adding V1 Testing

commit 9d72fd4aee85e4b0c12e717060d2730fa5b73be1
Author: Ammar ELWazir <Ammar.ELWazir@amd.com>
Date:   Wed Jul 12 03:34:31 2023 +0000

    Fixing Artifacts directory path

commit f4000cc558b3b2e4676f7994f7ce8c8e6f94518e
Author: Ammar ELWazir <Ammar.ELWazir@amd.com>
Date:   Wed Jul 12 03:27:26 2023 +0000

    Fixing CMake for test build job

commit 2ce8115d4c33948c3c8f957f545a95a04e1d6cd2
Author: Ammar ELWazir <Ammar.ELWazir@amd.com>
Date:   Wed Jul 12 03:16:18 2023 +0000

    Fixing Ubuntu CMake for ubuntu test build

commit 6d0ed439191be900748d0c025157f9d689a73ec7
Author: Ammar ELWazir <Ammar.ELWazir@amd.com>
Date:   Wed Jul 12 01:28:41 2023 +0000

    Removing Navi21

commit e349a7642e5ae5eb03ab9fcd0a0f74f09f78cab5
Author: Ammar ELWazir <Ammar.ELWazir@amd.com>
Date:   Wed Jul 12 01:14:14 2023 +0000

    Removing Navi21

commit fefd02fe68d2a4bca7ec2e381960ad004ee9fc5b
Author: Ammar ELWazir <Ammar.ELWazir@amd.com>
Date:   Wed Jul 12 00:42:48 2023 +0000

    Fixing CMake Job

commit 2ea46abf7bf92643efa8c549fa70346ffbd79d65
Author: Ammar ELWazir <Ammar.ELWazir@amd.com>
Date:   Wed Jul 12 00:35:13 2023 +0000

    Fixing CMake Job

commit d99d681ed1999c5fcf291dc678b11a77205fb0f3
Author: Ammar ELWazir <Ammar.ELWazir@amd.com>
Date:   Wed Jul 12 00:32:13 2023 +0000

    Fixing Pull Latest Dockers and CMake Jobs

commit dfc4498072d13b4a1df3a63047d34c682c3d9a29
Author: Ammar ELWazir <Ammar.ELWazir@amd.com>
Date:   Tue Jul 11 23:54:21 2023 +0000

    Fixing CMake job

commit 919efe04de707f7c702031be15c3e2c5f8442cbb
Author: Ammar ELWazir <Ammar.ELWazir@amd.com>
Date:   Tue Jul 11 23:52:13 2023 +0000

    Adding Pull Last dockers job

commit be1b1256e8b0e05308e8f7e7e69bee3acca55281
Author: Ammar ELWazir <aelwazir@amd.com>
Date:   Tue Jul 11 18:25:40 2023 -0500

    Update cmake.yml

commit 212299fa4355ae6ec18f9aaacbb79c51ea6c6f97
Author: Ammar ELWazir <aelwazir@amd.com>
Date:   Tue Jul 11 18:23:35 2023 -0500

    Update cmake.yml

commit 7c2c1327086a61466cc6cac39f70865c051a8bc7
Author: Ammar ELWazir <aelwazir@amd.com>
Date:   Tue Jul 11 18:18:53 2023 -0500

    Update cmake.yml

commit 191b5ce007e612e814c1d7a3afb4ad398f3852e1
Author: Ammar ELWazir <aelwazir@amd.com>
Date:   Tue Jul 11 16:03:22 2023 -0500

    Update cmake.yml

commit 8824113d95f3e13c7ce4d0af8e0d9d8f522a6c4a
Author: Ammar ELWazir <Ammar.ELWazir@amd.com>
Date:   Tue Jul 11 16:28:09 2023 +0000

    Fixing Pull from Gerrit job name

    Change-Id: I9e7ed9a27a13ca49d62c93bdadb30f0057e4d385

commit cc3d5e4b02ffb439e8cc2b3efa53527c376f9982
Author: Ammar ELWazir <Ammar.ELWazir@amd.com>
Date:   Tue Jul 11 16:21:43 2023 +0000

    Adding Staging sync job

    Change-Id: I0551f43878b0678ce4b3e74e27d62357cf95ad95

commit b9be2eee71380a2e6dd34d520e92d0c4209277a0
Author: Ammar ELWazir <Ammar.ELWazir@amd.com>
Date:   Tue Jul 11 15:57:11 2023 +0000

    Fixing build.sh

    Change-Id: Ia987b0244f0875370d5fe69907b3f5e9cea914de

commit 9eee33a95a1abd656a7ac5ca10a9f245e9825431
Author: Ammar ELWazir <aelwazir@amd.com>
Date:   Mon Jul 10 21:39:46 2023 -0500

    Update cmake.yml

commit 7093b85a78497140e8b52632ca2a002bdaeacd62
Author: Ammar ELWazir <aelwazir@amd.com>
Date:   Mon Jul 10 21:33:29 2023 -0500

    Update cmake.yml

commit f54697172c72a67740f9fdfa0c217b6ea6931576
Author: Ammar ELWazir <aelwazir@amd.com>
Date:   Mon Jul 10 21:01:26 2023 -0500

    Update cmake.yml

commit 1b6620e16f8940386b0f4f04e69e2410d21c0e26
Author: Ammar ELWazir <aelwazir@amd.com>
Date:   Mon Jul 10 20:21:02 2023 -0500

    Update cmake.yml

commit a94bec740c6b42c4b79c87bca20fa87b99bf060d
Author: Ammar ELWazir <aelwazir@amd.com>
Date:   Mon Jul 10 19:46:35 2023 -0500

    Update cmake.yml

commit 85d6b29d4375a69d575c18ece8542c50f2ddfcc3
Author: Ammar ELWazir <aelwazir@amd.com>
Date:   Mon Jul 10 19:34:39 2023 -0500

    Update cmake.yml

commit 8c004887cf1435f1a6214c3d2455299a8a27bd4c
Author: Ammar ELWazir <aelwazir@amd.com>
Date:   Mon Jul 10 19:31:17 2023 -0500

    Update cmake.yml

commit a14a9168e17d9348a53c6e9c9a47ba1edb4c4509
Author: Ammar ELWazir <aelwazir@amd.com>
Date:   Mon Jul 10 19:25:46 2023 -0500

    Update cmake.yml

commit 000f2f40b84e6a2f7d4becdbf5aed01436ca4c83
Author: Ammar ELWazir <aelwazir@amd.com>
Date:   Mon Jul 10 19:08:18 2023 -0500

    Update cmake.yml

commit a28a53d56731cad848fa9133d1c4dbaa8fc7afa7
Author: Ammar ELWazir <aelwazir@amd.com>
Date:   Mon Jul 10 19:03:39 2023 -0500

    Update cmake.yml

commit a6a2db01027f0b01fdfbb5997ddb772c7f51b649
Author: Ammar ELWazir <aelwazir@amd.com>
Date:   Mon Jul 10 18:21:53 2023 -0500

    Update cmake.yml

commit 118ef2a88b2d44e3207c31c343da3e5e5ec6f176
Author: Ammar ELWazir <aelwazir@amd.com>
Date:   Mon Jul 10 17:55:57 2023 -0500

    Update cmake.yml

commit 03c4c232396440cd0be6d2dd7baf4ceea1c2589d
Author: Ammar ELWazir <aelwazir@amd.com>
Date:   Mon Jul 10 17:48:49 2023 -0500

    Create cmake.yml

Change-Id: I994a9fa743d47de3640d2bf7ae9ea3e01ea44f6a

DISCLAIMER

The information presented in this document is for informational purposes only and may contain technical inaccuracies, omissions, and typographical errors. The information contained herein is subject to change and may be rendered inaccurate for many reasons, including but not limited to product and roadmap changes, component and motherboard version changes, new model and/or product releases, product differences between differing manufacturers, software changes, BIOS flashes, firmware upgrades, or the like. Any computer system has risks of security vulnerabilities that cannot be completely prevented or mitigated. AMD assumes no obligation to update or otherwise correct or revise this information. However, AMD reserves the right to revise this information and to make changes from time to time to the content hereof without obligation of AMD to notify any person of such revisions or changes. THIS INFORMATION IS PROVIDED "AS IS." AMD MAKES NO REPRESENTATIONS OR WARRANTIES WITH RESPECT TO THE CONTENTS HEREOF AND ASSUMES NO RESPONSIBILITY FOR ANY INACCURACIES, ERRORS, OR OMISSIONS THAT MAY APPEAR IN THIS INFORMATION. AMD SPECIFICALLY DISCLAIMS ANY IMPLIED WARRANTIES OF NON-INFRINGEMENT, MERCHANTABILITY, OR FITNESS FOR ANY PARTICULAR PURPOSE. IN NO EVENT WILL AMD BE LIABLE TO ANY PERSON FOR ANY RELIANCE, DIRECT, INDIRECT, SPECIAL, OR OTHER CONSEQUENTIAL DAMAGES ARISING FROM THE USE OF ANY INFORMATION CONTAINED HEREIN, EVEN IF AMD IS EXPRESSLY ADVISED OF THE POSSIBILITY OF SUCH DAMAGES. AMD, the AMD Arrow logo, and combinations thereof are trademarks of Advanced Micro Devices, Inc. Other product names used in this publication are for identification purposes only and may be trademarks of their respective companies.

© 2022 Advanced Micro Devices, Inc. All Rights Reserved.

ROC Profiler v1

Introduction

Profiling with metrics and traces, based on performance counters (PMC) and traces (SPM). The implementation is based on the AqlProfile HSA extension. The library supports GFX8/GFX9. The last API library version for ROCProfiler v1 is 8.0.0.

The library source tree:

  • doc - Documentation
  • include/rocprofiler/rocprofiler.h - Library public API
  • include/rocprofiler/v2/rocprofiler.h - V2 Beta Library public API
  • include/rocprofiler/v2/rocprofiler_plugins.h - V2 Beta Tool's Plugins Library public API
  • src - Library sources
    • core - Library API sources
    • util - Library utils sources
    • xml - XML parser
  • test - Library test suite
    • ctrl - Test control
    • util - Test utils
    • simple_convolution - Simple convolution test kernel

Build environment

Roctracer & Rocprofiler need to be installed in the same directory.

export CMAKE_PREFIX_PATH=<path to hsa-runtime includes>:<path to hsa-runtime library>
export CMAKE_BUILD_TYPE=<debug|release> # release by default
export CMAKE_DEBUG_TRACE=1 # 1 to enable debug tracing

To build with the currently installed ROCm:

cd .../rocprofiler
./build.sh ## (for clean build use `-cb`)

To run the test:

$ cd .../rocprofiler/build
$ export LD_LIBRARY_PATH=.:<other paths> # paths to ROC profiler and other libraries
$ export HSA_TOOLS_LIB=librocprofiler64.so.1 # ROC profiler library loaded by HSA runtime
$ export ROCP_TOOL_LIB=test/librocprof-tool.so # tool library loaded by ROC profiler
$ export ROCP_METRICS=metrics.xml # ROC profiler metrics config file
$ export ROCP_INPUT=input.xml # input file for the tool library
$ export ROCP_OUTPUT_DIR=./ # output directory for the tool library, for metrics results file 'results.txt' and trace files
$ <your test>
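
Before launching a test, it can help to confirm that the variables listed above are actually set. The following is a minimal sketch and not part of rocprofiler; the function name check_rocp_env is hypothetical:

```shell
# Hypothetical helper (not part of rocprofiler): report which of the
# environment variables listed above are unset or empty.
check_rocp_env() {
  missing=0
  for var in HSA_TOOLS_LIB ROCP_TOOL_LIB ROCP_METRICS ROCP_INPUT ROCP_OUTPUT_DIR; do
    # Indirect lookup of the variable named by $var (POSIX-compatible)
    eval "value=\${$var:-}"
    if [ -z "$value" ]; then
      echo "missing: $var"
      missing=1
    fi
  done
  return "$missing"
}
```

The function returns non-zero when any variable is missing, so it can gate a run, for example by calling it before ./run.sh and stopping on failure.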

Internal 'simple_convolution' test run script:
$ cd .../rocprofiler/build
$ ./run.sh

To enable error message logging to '/tmp/rocprofiler_log.txt':

$ export ROCPROFILER_LOG=1

To enable verbose tracing:

$ export ROCPROFILER_TRACE=1

ROCProfiler v2

Introduction

ROCProfilerV2 is a newly developed design for AMD's tooling infrastructure that provides a hardware-specific, low-level performance analysis interface for profiling GPU compute applications. The first API library version for ROCProfiler v2 is 9.0.0.

Note: ROCProfilerV2 is currently considered a beta version and is subject to change in future releases.

ROCProfilerV2 Modules

  • Counters
  • Hardware
  • Generic Buffer
  • Session
  • Filter
  • Tools
  • Plugins
  • Samples
  • Tests

Getting started

Requirements

  • makecache
  • Gtest Development Package (Ubuntu: libgtest-dev)
  • libsystemd-dev, libelf-dev, libnuma-dev, libpciaccess-dev on Ubuntu, or their corresponding packages on any other OS
  • CppHeaderParser, websockets, matplotlib, lxml, barectf Python3 packages
  • Python packages can be installed using:
    pip3 install -r requirements.txt
    

Build

The user has two options for building:

  • Option 1 (installs to the path saved in the ROCM_PATH environment variable, or to /opt/rocm if ROCM_PATH is empty):

    • Run
    # Normal Build
    ./build.sh --build OR ./build.sh -b
    # Clean Build
    ./build.sh --clean-build OR ./build.sh -cb
    
  • Option 2 (the ROCM_PATH environment variable needs to be set to the current installation directory of ROCm), run the following:

    # Creating the build directory
    mkdir build && cd build
    
    # Configuring the rocprofv2 build
    cmake -DCMAKE_PREFIX_PATH=$ROCM_PATH -DCMAKE_MODULE_PATH=$ROCM_PATH/hip/cmake <CMAKE_OPTIONS> ..
    
    # Building the main runtime of the rocprofv2 project
    cmake --build . -- -j
    
    # Optionally, for building API documentation
    cmake --build . -- -j doc
    
    # Optionally, for building ROCProfiler V2 samples
    cmake --build . -- -j samples
    
    # Optionally, for building tests
    cmake --build . -- -j tests
    
    # Optionally, for building packages (DEB, RPM, TGZ)
    # Note: Requires the rpm package on Ubuntu
    cmake --build . -- -j package
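
Option 2 expects ROCM_PATH to be set. As a minimal sketch, the /opt/rocm fallback described for Option 1 can be applied before configuring; using that fallback for Option 2 is an assumption, and rocm_prefix is a hypothetical helper name:

```shell
# Hypothetical helper: resolve the ROCm prefix, falling back to /opt/rocm
# when ROCM_PATH is unset or empty.
rocm_prefix() {
  echo "${ROCM_PATH:-/opt/rocm}"
}

prefix="$(rocm_prefix)"
# Print (do not run) the configure command from Option 2 with the resolved path
echo "cmake -DCMAKE_PREFIX_PATH=$prefix -DCMAKE_MODULE_PATH=$prefix/hip/cmake .."
```

Printing the command first makes it easy to check which installation the build will pick up before running it from the build directory.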
    

Install

  • Optionally, run the following to install

    # Install rocprofv2 in the ROCM_PATH path
    ./rocprofv2 --install
    

    OR, if you are using option 2 in building

    cd build
    # Install rocprofv2 in the ROCM_PATH path
    cmake --build . -- -j install
    

Features & Usage

  • rocsys

    A command line utility to control a session (launch/start/stop/exit), with the required application to be traced or profiled in a rocprofv2 context. Usage:
    # Launch the application with the required profiling and tracing options, giving a session identifier to be used later
    rocsys --session session_name launch mpiexec -n 2 ./rocprofv2 -i samples/input.txt Histogram
    
    # Start a session with a given identifier created at launch
    rocsys --session session_name start
    
    # Stop a session with a given identifier created at launch
    rocsys --session session_name stop
    
    # Exit a session with a given identifier created at launch
    rocsys --session session_name exit
    
  • Counters and Metric Collection

    HW counters and derived metrics can be collected using the following option:

    rocprofv2 -i samples/input.txt <app_relative_path>
    input.txt
    

    Example input.txt content (details of what is needed inside input.txt are given with each feature):

    pmc: SQ_WAVES GRBM_COUNT GRBM_GUI_ACTIVE SQ_INSTS_VALU
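
As a minimal sketch, the counter list above can be written to an input file from a script and passed to rocprofv2 with -i. The temporary file and the ./my_app path are placeholders, not names from the project:

```shell
# Write the counter list from the example above to a temporary input file
input_file="$(mktemp)"
printf 'pmc: SQ_WAVES GRBM_COUNT GRBM_GUI_ACTIVE SQ_INSTS_VALU\n' > "$input_file"

# Print (do not run) the collection command; ./my_app is a placeholder
# for the application to profile
echo "rocprofv2 -i $input_file ./my_app"
```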
    
  • Application Trace Support

    Different trace options are available while profiling an app:

    # HIP API & asynchronous activity tracing
    rocprofv2 --hip-api <app_relative_path> ## For synchronous HIP API Activity tracing
    rocprofv2 --hip-activity <app_relative_path> ## For both synchronous & asynchronous HIP API Activity tracing
    rocprofv2 --hip-trace <app_relative_path> ## Same as --hip-activity, added for backward compatibility
    
    # HSA API & asynchronous activity tracing
    rocprofv2 --hsa-api <app_relative_path> ## For synchronous HSA API Activity tracing
    rocprofv2 --hsa-activity <app_relative_path> ## For both synchronous & asynchronous HSA API Activity tracing
    rocprofv2 --hsa-trace <app_relative_path> ## Same as --hsa-activity, added for backward compatibility
    
    # Kernel dispatches tracing
    rocprofv2 --kernel-trace <app_relative_path> ## Kernel Dispatch Tracing
    
    # HIP & HSA API and asynchronous activity and kernel dispatches tracing
    rocprofv2 --sys-trace <app_relative_path> ## Same as combining --hip-trace & --hsa-trace & --kernel-trace
    

    For complete usage options, please run rocprofv2 help

    rocprofv2 --help
    
  • Plugin Support

    We have a template for adding new plugins. New plugins can be written on top of rocprofv2 to support the desired output format using include/rocprofiler/v2/rocprofiler_plugins.h header file. These plugins are modular in nature and can easily be decoupled from the code based on need. Installation files:

    rocprofiler-plugins_9.0.0-local_amd64.deb
    rocprofiler-plugins-9.0.0-local.x86_64.rpm
    
    • file plugin: outputs the data in txt files.
    • Perfetto plugin: outputs the data in protobuf format.
      • Protobuf files can be viewed using ui.perfetto.dev or using trace_processor
    • ATT (Advanced thread tracer) plugin: outputs advanced hardware trace data in binary format. Please refer to the ATT section.
    • CTF plugin: outputs the data in CTF format (a binary trace format).
      • CTF binary output can be viewed using TraceCompass or babeltrace.

    Usage:

    # plugin_name can be file, perfetto, or ctf
    ./rocprofv2 --plugin plugin_name -i samples/input.txt -d output_dir <app_relative_path> # -d is optional; it sets the output directory for results
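
To compare output formats, the same run can be repeated once per plugin with a separate output directory. This is a minimal sketch that only prints the commands; ./my_app and the out_<plugin> directory names are placeholders:

```shell
# Print one rocprofv2 command per output plugin, each with its own
# output directory. ./my_app and out_<plugin> are placeholders.
plugin_commands() {
  for plugin in file perfetto ctf; do
    echo "rocprofv2 --plugin $plugin -i samples/input.txt -d out_$plugin ./my_app"
  done
}

plugin_commands
```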
    
    • (ATT) Advanced Thread Trace

      Tool used to collect fine-grained hardware metrics. Provides ISA-level instruction hotspot analysis via hardware tracing.

      # ATT(Advanced Thread Trace) needs some preparation before running.
      
      # 1. Make sure to generate the assembly file for the application by setting the following before compiling your HIP application
      # This can be achieved globally with the following environment variable
      export HIPCC_COMPILE_FLAGS_APPEND="--save-temps -g"
      # Similarly, the --save-temps -g flags can be added per file for better ISA generation control.
      
      # 2. Install plugin package
      # see Plugin Support section for installation
      
      # 3. Run the following to view the trace
      # Att-specific options must come right after the assembly file
      rocprofv2 -i input.txt --plugin att <app_assembly_file> --mode network <app_relative_path>
      
      # Example for vectoradd on navi31.
      # Special attention to gfx1100.s==navi31 in the ISA file name. 
      # Use gfx1030 for navi21, gfx90a for MI200 and gfx940 for MI300
      hipcc -g --save-temps vectoradd_hip.cpp -o vectoradd_hip.exe
      rocprofv2 -i input.txt --plugin att vectoradd_hip-hip-amdgcn-amd-amdhsa-gfx1100.s --mode network ./vectoradd_hip.exe
      # Then open the browser at http://localhost:8000
      # The ISA can also be obtained from llvm/roc objdump, however, annotations will be different
      
      • app_assembly_file_relative_path
        AMDGCN ISA file with .s extension generated in 1st step
      • app_relative_path
        Path for the running application
      • ATT plugin optional parameters
        • --depth [n]: How many waves per slot to parse (maximum).
        • --mpi [proc]: Parse with this many mpi processes, for greater analysis speed. Does not change results. Requires mpi4py.
        • --att_kernel "filename": Kernel filename to use (instead of ATT asking which one to use).
        • --trace_file "files": glob (wildcards allowed) of traces files to parse. Requires quotes for use with wildcards.
        • --mode [network, file, off (default)]
          • network
            Opens the server with the browser UI. att needs 2 ports available (e.g. 8000, 18000). There is an option (default: --ports "8000,18000") to change these. In case rocprofv2 is running on a different machine, use port forwarding "ssh -L 8000:localhost:8000 user@IP" so the browser can be used locally. For docker, use --network=host --ipc=host -p8000:8000 -p18000:18000
          • file
            Dumps the analyzed JSON files to disk for viewing at a later time. Run python3 httpserver.py from within the generated ui/ folder to view the trace, similarly to network mode. The folder can be copied to another machine, and will run without ROCm.
          • off
            Runs trace collection but not analysis, so it can be analyzed at a later time. Run rocprofv2 ATT [network, file] with the same parameters, removing the application binary, to analyze previously generated traces.
      • input.txt
        Required. Used to select specific compute units and other trace parameters. For first time users, we recommend compiling and running vectorAdd with
        att: TARGET_CU=1
        SE_MASK=0x1
        SIMD_MASK=0x3
        
        and histogram with
        att: TARGET_CU=0
        SE_MASK=0xFF
        SIMD_MASK=0xF // 0xF for GFX9, SIMD_MASK=0 for Navi
        
        Possible contents:
        • att: TARGET_CU=1 //or some other CU [0,15] - WGP for Navi [0,8]
        • SE_MASK=0x1 // bitmask of shader engines. The fewer, the easier on the hardware. Default enables 1 out of 4 shader engines.
        • SIMD_MASK=0xF // GFX9: bitmask of SIMDs. Navi: SIMD Index [0-3].
        • DISPATCH=ID,RN // collect trace only for the given dispatch_ID and MPI rank RN. RN ignored for single processes. Multiple lines with varying combinations of RN and ID can be added.
        • KERNEL=kernname // Profile only kernels containing the string kernname (c++ mangled name). Multiple lines can be added.
        • PERFCOUNTERS_COL_PERIOD=0x3 // Multiplier period for counter collection [0~31]. 0=fastest (usually once every 16 cycles). GFX9 only. Counters will be shown in a graph over time in the browser UI.
        • PERFCOUNTER=counter_name // Add a SQ counter to be collected with ATT; period defined by PERFCOUNTERS_COL_PERIOD. GFX9 only.
        • BUFFER_SIZE=[size] // Sets size of the ATT buffer collection, per dispatch, in megabytes (shared among all shader engines).
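
Because SIMD_MASK is interpreted differently on GFX9 and Navi, the input file can be generated per architecture family. The following is a minimal sketch based on the histogram example above; write_att_input and the family names "gfx9" and "navi" are hypothetical:

```shell
# Hypothetical helper: write an ATT input file for the histogram example.
# $1 = architecture family ("gfx9" or "navi"), $2 = output file path.
write_att_input() {
  case "$1" in
    gfx9) simd_mask=0xF ;;  # GFX9: bitmask of SIMDs
    navi) simd_mask=0 ;;    # Navi: SIMD index
    *) echo "unknown family: $1" >&2; return 1 ;;
  esac
  printf 'att: TARGET_CU=0\nSE_MASK=0xFF\nSIMD_MASK=%s\n' "$simd_mask" > "$2"
}

att_input="$(mktemp)"
write_att_input gfx9 "$att_input"
cat "$att_input"
```

The resulting file can then be passed to rocprofv2 with -i, as in the ATT commands above.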
  • Flush Interval

    Flush interval controls the time, in milliseconds, between buffer flushes for the tool. If the buffers become full, a flush is triggered regardless of the interval. It can be used as in the following example:

    rocprofv2 --flush-interval <TIME_INTERVAL_IN_MILLISECONDS> <rest_of_rocprofv2_arguments> <app_relative_path>
    
  • Trace Period

    Trace period controls when profiling or tracing is enabled, using two arguments. The first is the delay time: the time spent idle, without tracing or profiling. The second is the active time: the time during which profiling and tracing are running. The session therefore follows this timeline:

    # <DELAY_TIME> => <PROFILING_OR_TRACING_SESSION_START> => <ACTIVE_PROFILING_OR_TRACING_TIME> => <PROFILING_OR_TRACING_SESSION_STOP>
    

    This feature can be enabled with the following command:

    rocprofv2 --trace-period <delay>:<active_time>:<interval> <rest_of_rocprofv2_arguments> <app_relative_path>
    
    • delay: Time delay to start profiling (ms).
    • active_time: How long to profile for (ms).
    • interval: If set, profiling sessions will start (loop) every "interval", and run for "active_time", until the application ends. Must be higher than "active_time".
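
To see how the three values interact, the expected profiling windows can be worked out by hand. The sketch below assumes the first window opens after "delay" and each later window opens "interval" milliseconds after the previous one started; that timing model is an assumption, and trace_windows is a hypothetical helper:

```shell
# Hypothetical helper: print the first N profiling windows (start-stop, in ms)
# for --trace-period <delay>:<active_time>:<interval>.
# Assumes window i starts at delay + i * interval (an assumption).
trace_windows() {
  delay=$1; active=$2; interval=$3; count=$4
  # The interval must be higher than the active time
  if [ "$interval" -le "$active" ]; then
    echo "interval must be higher than active_time" >&2
    return 1
  fi
  i=0
  while [ "$i" -lt "$count" ]; do
    start=$((delay + i * interval))
    echo "$start-$((start + active))"
    i=$((i + 1))
  done
}

# Windows for --trace-period 100:50:200
trace_windows 100 50 200 3
```

Under this model, 100:50:200 gives windows at 100-150, 300-350, and 500-550 ms.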
  • Device Profiling: A device profiling session allows the user to profile the GPU device for counters irrespective of the applications running on the GPU. This is different from application profiling: a device profiling session does not depend on the processes and threads running on the host. It directly provides low-level profiling information.

  • Session Support: A session is a unique identifier for a profiling/tracing/pc-sampling task. A ROCProfilerV2 session has enough information about what needs to be collected or traced, and it allows the user to start/stop profiling/tracing whenever required. More details on the API can be found in the API specification documentation, which can be installed using the rocprofiler-doc package. Samples showing how to use the API can be found in the samples directory.

Tests

We make use of the GoogleTest (Gtest) framework to automatically find and add test cases to the CMake testing environment. ROCProfilerV2 testing is categorized as follows:

  • unittests (Gtest based): These include tests for core classes. Any newly added functionality should have a unit test written for it.

  • featuretests (standalone and Gtest based): These include both API tests and tool tests. The tool is tested against different applications to make sure we have the right output in every run.

  • memorytests (standalone): These include running the address sanitizer to detect memory leaks and corruption.

    installation:

    rocprofiler-tests_9.0.0-local_amd64.deb
    rocprofiler-tests-9.0.0-local.x86_64.rpm
    
  • Optionally, to run the tests, use one of the following:

  • Option 1, using rocprofv2 script:

    cd build && ./rocprofv2 -t
    
  • Option 2, using cmake directly:

    cd build && cmake --build . -- -j check
    

Logging

To enable error message logging to '/tmp/rocprofiler_log.txt':

$ export ROCPROFILER_LOG=1

Documentation

We make use of Doxygen to automatically generate API documentation. The generated documentation can be found in the following path:

# ROCM_PATH by default is /opt/rocm
# It can be set by the user in different location if needed.
<ROCM_PATH>/share/doc/rocprofv2

installation:

rocprofiler-docs_9.0.0-local_amd64.deb
rocprofiler-docs-9.0.0-local.x86_64.rpm

Samples

  • Profiling: Profiling Samples depending on replay mode
  • Tracing: Tracing Samples

installation:

rocprofiler-samples_9.0.0-local_amd64.deb
rocprofiler-samples-9.0.0-local.x86_64.rpm

usage:

Samples can be run as independent executables once installed.

Project Structure

  • bin: ROCProf scripts along with V1 post processing scripts
  • doc: Documentation settings for doxygen, V1 API Specifications pdf document.
  • include:
    • rocprofiler.h: V1 API Header File
    • v2:
      • rocprofiler.h: V2 API Header File
      • rocprofiler_plugins.h: V2 Tool Plugins API
  • plugin
    • file: File Plugin
    • perfetto: Perfetto Plugin
    • att: Advanced thread tracer Plugin
    • ctf: CTF Plugin
  • samples: Samples of how to use the API, and also input.txt input file samples for counter collection and ATT.
  • script: Scripts needed for tracing
  • src: Source files of the project
    • api: API implementation for rocprofv2
    • core: Core source files needed for the V1/V2 API
      • counters: Basic and Derived Counters
      • hardware: Hardware support
      • hsa: Provides support for profiler and tracer to communicate with HSA
        • queues: Intercepting HSA Queues
        • packets: Packets Preparation for profiling
      • memory: Memory Pool used in buffers that saves the output data
      • session: Session Logic
        • filter: Type of profiling or tracing and its properties
        • tracer: Tracing support of the session
        • profiler: Profiling support of the session
        • spm: SPM support of the session
        • att: ATT support of the session
    • tools: Tools needed to run profiling and tracing
      • rocsys: Controlling Session from another CLI
    • utils: Utilities needed by the project
  • tests: Tests folder
  • CMakeLists.txt: Handles cmake list for the whole project
  • build.sh: To easily build and compile rocprofiler
  • CHANGELOG.md: Changes that are happening per release

Support

Please report issues in GitHub Issues.

Limitations

  • Navi3x requires a stable power state for counter collection.
    Currently, this state needs to be set by the user. To do so, set "power_dpm_force_performance_level" to be writeable for non-root users, then set performance level to profile_standard:
    sudo chmod 777 /sys/class/drm/card0/device/power_dpm_force_performance_level
    echo profile_standard >> /sys/class/drm/card0/device/power_dpm_force_performance_level
    
    Recommended: "profile_standard" for counter collection and "auto" for all other profiling. Use rocm-smi to verify the current power state. For multi-GPU systems (including integrated graphics), replace "card0" with the desired card.
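
Besides rocm-smi, the current level can be checked from a script before counter collection. The following is a minimal sketch that assumes reading the same sysfs file returns the current performance level; is_profile_standard is a hypothetical helper:

```shell
# Hypothetical helper: succeed only if the given sysfs file reports
# the profile_standard performance level.
# $1 = path to power_dpm_force_performance_level (defaults to card0)
is_profile_standard() {
  level_file="${1:-/sys/class/drm/card0/device/power_dpm_force_performance_level}"
  [ -r "$level_file" ] && [ "$(cat "$level_file")" = "profile_standard" ]
}

if is_profile_standard; then
  echo "power state OK for counter collection"
else
  echo "power state is not profile_standard (or the file is unreadable)"
fi
```

On multi-GPU systems, pass the path for the desired card as the first argument instead of relying on the card0 default.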