Add the v1 XML changes that were missed in change 6cf9df4ff0
Change-Id: I338f2736ee61e316522f1ce42cee74abec201499
[ROCm/rocprofiler commit: 2047bf4b8b]
Revision - Addition [Impact SoC: MI200, MI300]
Note: this set of counters is important for understanding the
bottleneck.
1. TCC_TAG_STALL
a. Metric: TCC_TAG_STALL/TCC_CYCLE: percentage of time the TCC
tag lookup pipeline is stalled
2. TCP_TCR_TCP_STALL_CYCLES
a. Metric: TCP_TCR_TCP_STALL_CYCLES/TCP_GATE_EN1: percentage
of time the TCP is stalled by the TCR
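
As a rough illustration of the two stall metrics above, the sketch below divides a stall counter by its cycle counter. The function name and the sample counter values are hypothetical, and scaling by 100 is an assumption made because the message describes the ratio as a percentage.

```python
def stall_percent(stall_cycles, total_cycles):
    """Return stall_cycles / total_cycles as a percentage.

    Hypothetical helper; the x100 scaling is an assumption based on the
    word "percentage" in the metric description.
    """
    if total_cycles == 0:
        # Avoid dividing by zero when the block was idle or not profiled.
        return 0.0
    return 100.0 * stall_cycles / total_cycles


# Made-up sample values, for illustration only.
tcc_tag_stall_pct = stall_percent(250, 1000)   # TCC_TAG_STALL / TCC_CYCLE
tcp_tcr_stall_pct = stall_percent(300, 1500)   # TCP_TCR_TCP_STALL_CYCLES / TCP_GATE_EN1
print(tcc_tag_stall_pct, tcp_tcr_stall_pct)
```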
Revision - Addition [Impact SoC: MI300]
3. TCC_BUBBLE:
a. Definition: Number of 128-byte read requests sent to EA
b. Revised Metric #1, TCC-EA Read BW:
ReadBW = 128 * TCC_BUBBLE
+ 64 * (TCC_EA0_RDREQ - TCC_BUBBLE - TCC_EA0_RDREQ_32B)
+ 32 * TCC_EA0_RDREQ_32B
c. Revised Metric #2, TCC-EA Read Latency:
ReadLatency = TCC_EA0_RDREQ_LEVEL / (TCC_BUBBLE + TCC_EA0_RDREQ)
/* [Fineprint] More detailed arithmetic:
* ReadLatency = TCC_EA0_RDREQ_LEVEL / (#32B_req + #64B_req + #128B_req * 2)
*/
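
The two revised MI300 formulas above can be sketched as follows. This is a minimal illustration, not the expression stored in the metrics XML: the function names and sample values are hypothetical, and the zero-denominator guard is an added assumption. As written, the ReadBW formula yields a byte total.

```python
def tcc_ea_read_bytes(tcc_bubble, tcc_ea0_rdreq, tcc_ea0_rdreq_32b):
    """Byte total from the ReadBW formula (hypothetical helper)."""
    n128 = tcc_bubble                                      # 128-byte requests
    n32 = tcc_ea0_rdreq_32b                                # 32-byte requests
    n64 = tcc_ea0_rdreq - tcc_bubble - tcc_ea0_rdreq_32b   # remaining 64-byte requests
    return 128 * n128 + 64 * n64 + 32 * n32


def tcc_ea_read_latency(tcc_ea0_rdreq_level, tcc_bubble, tcc_ea0_rdreq):
    """ReadLatency = TCC_EA0_RDREQ_LEVEL / (TCC_BUBBLE + TCC_EA0_RDREQ)."""
    # Adding TCC_BUBBLE to TCC_EA0_RDREQ counts each 128-byte request twice,
    # which matches the fine-print form (#32B + #64B + 2 * #128B).
    denom = tcc_bubble + tcc_ea0_rdreq
    return tcc_ea0_rdreq_level / denom if denom else 0.0


# Made-up sample values: 10 x 128B, 20 x 32B, and 70 x 64B requests.
print(tcc_ea_read_bytes(10, 100, 20))       # 1280 + 4480 + 640 = 6400
print(tcc_ea_read_latency(22000, 10, 100))  # 22000 / 110 = 200.0
```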
Change-Id: I0a2dfc1b64ca97023b1e8ba0f9830330b3034946
[ROCm/rocprofiler commit: 46e02a9866]
1. XML files updated for gfx940 counters
2. File plugin changes to keep rocprofv2 backward compatible for results.csv
3. rocprofv2 script changed to use tblextr.py to generate results.csv, just like rocprof
Change-Id: I7798f4411ce01f6fbfffb126de654ed806ca7045
(cherry picked from commit 86cbaf38c436be876f0426fa27803b1e64d90378)
[ROCm/rocprofiler commit: 8f82ff6a46]
This is an attempt to support basic and derived counters for navi21. This code will not work correctly until navi counters are added to metrics.xml and gfx_metrics.xml.
Change-Id: Ied06a81345a6fbb02fa0fde1889d94bbe64e9a03
[ROCm/rocprofiler commit: b53fd84ade]
Added approved HW counters for MI200. Also added derived metrics for the same
Change-Id: I1c6abfdfde4e4fd4ba8bd5eec0557ad08fd71c77
[ROCm/rocprofiler commit: 6d233c65d7]
Compared to gfx900 (e.g., Vega 10), gfx906 adds extra counter events.
A typical difference is in TCC-EA: gfx906 (e.g., Vega 20) has 2
EAs per TCC, while gfx900 has only a single EA per TCC. As such,
additional counters must be profiled to get correct results. This
patch adds one extra event to specifically handle gfx906.
Change-Id: Id6c9d37548a102c80bbfddcfa11e77d20f17431a
[ROCm/rocprofiler commit: ca9a714b77]
To validate cache and memory block profiling, this patch adds
tests that profile dedicated kernels using specified counters,
compare the profiled results against expected ones, and report
whether each test passes or fails. The tests focus on cache
hit/miss and memory fetch/write size.
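
The compare-and-report step described above might look roughly like the sketch below. It is not the test code added by this patch: the function name, the 5% relative tolerance, and the sample values are all assumptions chosen for illustration.

```python
import math


def counter_matches(profiled, expected, rel_tol=0.05):
    """Return True when a profiled value is within rel_tol of the expected one.

    Hypothetical helper; the default tolerance is an assumption.
    """
    return math.isclose(profiled, expected, rel_tol=rel_tol)


# Made-up sample values for one hypothetical metric.
for profiled, expected in [(1000, 1024), (500, 1024)]:
    verdict = "PASS" if counter_matches(profiled, expected) else "FAIL"
    print(f"profiled={profiled} expected={expected}: {verdict}")
```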
Change-Id: Icbc8096a6e15256dec66297597a57c7665a533b8
[ROCm/rocprofiler commit: 8b445d2c00]