Contributors:
Ammar ELWazir <aelwazir@amd.com>
AravindanC <aravindan.cheruvally@amd.com>
Benjamin Welton <bewelton@amd.com>
Ma, Bing <Bing.Ma@amd.com>
Chun Yang <chun.yang@amd.com>
Cole Nelson <cole.nelson@amd.com>
Ethan Stewart <ethan.stewart@amd.com>
Evgeny <evgeny.shcherbakov@amd.com>
Freddy Paul <Freddy.paul@amd.com>
Giovanni Baraldi <gbaraldi@amd.com>
Gopesh Bhardwaj <Gopesh.Bhardwaj@amd.com>
Icarus Sparry <icarus.sparry@amd.com>
itrowbri <Ian.Trowbridge@amd.com>
James Edwards <JamesAdrian.Edwards@amd.com>
jatang <jatang@amd.com>
Jeremy Newton <Jeremy.Newton@amd.com>
Jonathan Kim <jonathan.kim@amd.com>
Kent Russell <kent.russell@amd.com>
Kiumars Sabeti <kiumars.sabeti@amd.com>
Lang Yu <lang.yu@amd.com>
Laurent Morichetti <laurent.morichetti@amd.com>
Mallya, Ameya Keshava <AmeyaKeshava.Mallya@amd.com>
Manjunath Jakaraddi <manjunath.jakaraddi@amd.com>
Mark Laws <markdavid.laws@amd.com>
Mohan Kumar Mithur <Mohan.KumarMithur@amd.com>
Nicholas Curtis <nicurtis@amd.com>
Nirmal Unnikrishnan <Nirmal.Unnikrishnan@amd.com>
Parag Bhandari <parag.bhandari@amd.com>
Ranjith Ramakrishnan <Ranjith.Ramakrishnan@amd.com>
Robert Gregory <Robert.Gregory@amd.com>
Saravanan Solaiyappan <saravanan.solaiyappan@amd.com>
Saurabh Verma <saurabh.verma@amd.com>
Srihari Uttanur <srihari.u@amd.com>
Srinivasan Subramanian <srinivasan.subramanian@amd.com>
Sriraksha Nagaraj <Sriraksha.Nagaraj@amd.com>
Sushma Vaddireddy <svaddire@amd.com>
Xianwei Zhang <Xianwei.Zhang@amd.com>
[ROCm/aqlprofile commit: 1ed169e30c]
HSA extension AMD AQL profile library.
Provides AQL packet helper methods for performance counters (PMC) and SQ thread traces (SQTT).

The library supports GFX9 APIs.
The library source tree:
 - doc - Documentation, the API specification and the presentation
 - <hsa-runtime>/inc/hsa_ven_amd_aqlprofile.h - AMD AQL profile library public API
 - src - AMD AQL profile library sources
   - core - AQL API sources
   - pm4 - cmd/pmc/sqtt pm4 builders
   - def - Generated GFXIP definition headers
 - test - Library test suite
   - ctrl - Test control
   - util - Test utils
   - simple_convolution - Simple convolution test kernel

Build environment:

$ export CMAKE_PREFIX_PATH=<path to hsa-runtime includes>:<path to hsa-runtime library>
$ export CMAKE_BUILD_TYPE=<debug|release> # release by default
$ export CMAKE_DEBUG_TRACE=1 # 1 to enable debug tracing

To build against the currently installed ROCm:

$ export CMAKE_PREFIX_PATH=/opt/rocm/lib:/opt/rocm/include/hsa

$ cd .../aqlprofile
$ mkdir build
$ cd build
$ cmake ..
$ make
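
The build steps above can be collected into a small shell helper. This is only a sketch, not something shipped with the library: the function names aql_prefix_path and aql_build are hypothetical, and the ROCm root is assumed to default to /opt/rocm with the lib and include/hsa layout used in the example above.

```shell
#!/bin/sh
# Hypothetical helpers (not shipped with aqlprofile).

# Print a CMAKE_PREFIX_PATH value for a ROCm install root.
# Assumes the <root>/lib and <root>/include/hsa layout used above.
aql_prefix_path() {
    rocm_root="${1:-/opt/rocm}"
    printf '%s\n' "${rocm_root}/lib:${rocm_root}/include/hsa"
}

# Configure and build out of tree in <source dir>/build.
# The body runs in a subshell, so the caller's working directory
# and environment are left unchanged.
aql_build() (
    src_dir="${1:?usage: aql_build <source dir> [rocm root]}"
    CMAKE_PREFIX_PATH="$(aql_prefix_path "${2:-}")"
    export CMAKE_PREFIX_PATH
    mkdir -p "${src_dir}/build" &&
        cd "${src_dir}/build" &&
        cmake .. &&
        make
)
```

After sourcing the file, running aql_build "$PWD" from the aqlprofile source directory would perform the same steps as the commands above; a second argument selects a different ROCm root.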

To regenerate src/def headers:

The 'clang' compiler is required:
$ export CXX=/usr/bin/clang++
$ export CC=/usr/bin/clang

Use the 'mygen' make target to regenerate the headers from the full set of gfxip headers:
$ make mygen

To reset the generated headers:
$ make mygenreset

To run the test:

$ cd ../aqlprofile/build
$ export LD_LIBRARY_PATH=$PWD
$ run.sh

To enable error message logging to '/tmp/aql_profile_log.txt':

$ export HSA_VEN_AMD_AQLPROFILE_LOG=1

To enable verbose tracing:

$ export AQLPROFILE_TRACE=1
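
The two diagnostic variables above can be set for a single command rather than exported for the whole session. The wrapper below is a minimal sketch under that assumption; the name aql_run_diag is hypothetical and is not part of the library or its test suite.

```shell
#!/bin/sh
# Hypothetical wrapper (not shipped with aqlprofile): run one command with
# error logging and verbose tracing enabled, then show the error log.
aql_run_diag() {
    log_file=/tmp/aql_profile_log.txt   # log path named in this README
    # 'env' sets both variables for this command only.
    env HSA_VEN_AMD_AQLPROFILE_LOG=1 AQLPROFILE_TRACE=1 "$@"
    status=$?
    # Print the log to stderr if it exists and is non-empty.
    if [ -s "$log_file" ]; then
        echo "--- ${log_file} ---" >&2
        cat "$log_file" >&2
    fi
    # Preserve the wrapped command's exit status.
    return "$status"
}
```

Pass the test command and its arguments after the function name. Note that the log file may still hold messages from earlier runs, so removing it beforehand makes the output easier to read.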

To recompile the kernel object:

$ /opt/rocm/opencl/bin/clang -cl-std=CL2.0 -include /opt/rocm/opencl/include/opencl-c.h -nogpulib -Xclang -mlink-bitcode-file -Xclang /opt/rocm/amdgcn/bitcode/opencl.amdgcn.bc -Xclang -mlink-bitcode-file -Xclang /opt/rocm/amdgcn/bitcode/ockl.amdgcn.bc -target amdgcn-amd-amdhsa -mcpu=gfx906 vector_add_kernel.cl -o vector_add_kernel.so

With the newer device-libs layout, use this recompile command instead:
$ /opt/rocm/opencl/bin/clang -cl-std=CL2.0 -include /opt/rocm/opencl/include/opencl-c.h --hip-device-lib-path=/opt/rocm/amdgcn/bitcode -target amdgcn-amd-amdhsa -mcpu=gfx906 vector_add_kernel.cl -o vector_add_kernel.so
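
The recompile command hard-codes the gfx906 target and the kernel file name. A possible way to parameterize the newer-layout command is sketched below. The function name aql_compile_kernel and the ROCM_ROOT and AQL_ECHO variables are hypothetical conveniences introduced here; the clang flags are copied unchanged from the command above.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with aqlprofile): recompile an OpenCL
# kernel for a chosen GPU target using the newer device-libs command.
# Set AQL_ECHO=1 to print the command instead of running it.
aql_compile_kernel() {
    if [ "$#" -ne 2 ]; then
        echo "usage: aql_compile_kernel <gfx target> <kernel.cl>" >&2
        return 2
    fi
    mcpu="$1"
    src="$2"
    rocm="${ROCM_ROOT:-/opt/rocm}"   # ROCM_ROOT is an assumed override
    # ${AQL_ECHO:+echo} expands to 'echo' only when AQL_ECHO is non-empty.
    # The output file is the source name with .cl replaced by .so.
    ${AQL_ECHO:+echo} "${rocm}/opencl/bin/clang" -cl-std=CL2.0 \
        -include "${rocm}/opencl/include/opencl-c.h" \
        --hip-device-lib-path="${rocm}/amdgcn/bitcode" \
        -target amdgcn-amd-amdhsa -mcpu="${mcpu}" \
        "${src}" -o "${src%.cl}.so"
}
```

Called as aql_compile_kernel gfx906 vector_add_kernel.cl with the default root, this runs the same command line as shown above. Whether a given target name is accepted depends on the installed clang.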

### ROCm 5.7
Added support for GFX10/GFX11.