Chun Yang 2b79931631 SWDEV-301543 SWDEV-276146 : Fix profile output buff allocation
L2 flush is triggered by explicit cache flush PM4 packet in aqlprofile
packets to GPU. This cache flush is used to sync up CPU and GPU to make
sure perfomance counters copied to profile output buffer is visible to
CPU. To get rid of this cache flush the followings are done:
  1. This explicit cache flush packet is removed from aqlprofile code
     (another commit to aqlprofile code).
  2. This commit which changed profile output buffer to use kernarg
     memory since it is uncached for GPU.
After these changes profile counter values when copied by GPU to output
buffer they are guaranteed to be visible to CPU.

Change-Id: Ie953949c85fbee2f4369f1de966bcfb33daec084
2021-09-02 17:30:57 -07:00
2021-08-20 10:45:39 -07:00
2018-06-15 10:53:06 -05:00
2020-09-11 10:01:54 -05:00
2021-08-20 10:45:39 -07:00
2020-01-14 10:25:13 -06:00

ROC Profiler library.
Profiling with metrics and traces based on perfcounters (PMC) and traces (SPM).
Implementation is based on AqlProfile HSA extension.
Library supports GFX8/GFX9.

The library source tree:
 - doc  - Documentation
 - inc/rocprofiler.h - Library public API
 - src  - Library sources
   - core - Library API sources
   - util - Library utils sources
   - xml - XML parser
 - test - Library test suite
   - ctrl - Test controll
   - util - Test utils
   - simple_convolution - Simple convolution test kernel

Build environment:

$ export CMAKE_PREFIX_PATH=<path to hsa-runtime includes>:<path to hsa-runtime library>
$ export CMAKE_BUILD_TYPE=<debug|release> # release by default
$ export CMAKE_DEBUG_TRACE=1 # 1 to enable debug tracing

To build with the current installed ROCM:

$ cd .../rocprofiler
$ export CMAKE_PREFIX_PATH=/opt/rocm/include/hsa:/opt/rocm
$ mkdir build
$ cd build
$ cmake ..
$ make

To run the test:

$ cd .../rocprofiler/build
$ export LD_LIBRARY_PATH=.:<other paths> # paths to ROC profiler and oher libraries
$ export HSA_TOOLS_LIB=librocprofiler64.so # ROC profiler library loaded by HSA runtime
$ export ROCP_TOOL_LIB=test/libtool.so # tool library loaded by ROC profiler
$ export ROCP_METRICS=metrics.xml # ROC profiler metrics config file
$ export ROCP_INPUT=input.xml # input file for the tool library
$ export ROCP_OUTPUT_DIR=./ # output directory for the tool library, for metrics results file 'results.txt' and trace files
$ <your test>

Internal 'simple_convolution' test run script:
$ cd .../rocprofiler/build
$ run.sh

To enabled error messages logging to '/tmp/rocprofiler_log.txt':

$ export ROCPROFILER_LOG=1

To enable verbose tracing:

$ export ROCPROFILER_TRACE=1
S
설명
제공된 설명이 없습니다
Readme 282 MiB
언어
C++ 67.5%
C 20.6%
Python 6.6%
CMake 3.4%
Shell 0.6%
기타 1.1%