Update fetch_size metric (#1165)
Co-authored-by: Gopesh Bhardwaj <gopesh.bhardwaj@amd.com>
[ROCm/rocprofiler-sdk commit: 4acca76edb]
Committed by GitHub
Parent: 2800fb526f
Commit: f6b0641a2a
@@ -107,6 +107,7 @@ Full documentation for ROCprofiler-SDK is available at [Click Here](source/docs/
 - Changed naming of agent profiling to device counting service (which more closely follows its name). To convert existing tool/user code to the new names, the following sed can be used: `find . -type f -exec sed -i 's/rocprofiler_agent_profile_callback_t/rocprofiler_device_counting_service_callback_t/g; s/rocprofiler_configure_agent_profile_counting_service/rocprofiler_configure_device_counting_service/g; s/agent_profile.h/device_counting_service.h/g; s/rocprofiler_sample_agent_profile_counting_service/rocprofiler_sample_device_counting_service/g' {} +`
 - Changed naming of dispatch profiling service to dispatch counting service (which more closely follows its name). To convert existing tool/user code to the new names, the following sed can be used: `-type f -exec sed -i -e 's/dispatch_profile_counting_service/dispatch_counting_service/g' -e 's/dispatch_profile.h/dispatch_counting_service.h/g' -e 's/rocprofiler_profile_counting_dispatch_callback_t/rocprofiler_dispatch_counting_service_callback_t/g' -e 's/rocprofiler_profile_counting_dispatch_data_t/rocprofiler_dispatch_counting_service_data_t/g' -e 's/rocprofiler_profile_counting_dispatch_record_t/rocprofiler_dispatch_counting_service_record_t/g' {} +`
 - Support specifying HW counters via command-line in rocprofv3, e.g. `rocprofv3 --pmc [COUNTER [COUNTER ...]]`
+- FETCH_SIZE metric on gfx94x uses TCC_BUBBLE for 128B reads.

 ### Fixes

+16 -3
@@ -215,13 +215,15 @@ FETCH_SIZE:
     gfx908/gfx90a/gfx9/gfx900:
       expression: (TCC_EA_RDREQ_32B_sum*32+(TCC_EA_RDREQ_sum-TCC_EA_RDREQ_32B_sum)*64)/1024
     gfx942/gfx941/gfx940:
-      expression: (TCC_EA0_RDREQ_32B_sum*32+(TCC_EA0_RDREQ_sum-TCC_EA0_RDREQ_32B_sum)*128)/1024
+      expression: (TCC_BUBBLE_sum*128 + (TCC_EA0_RDREQ_sum-TCC_BUBBLE_sum-TCC_EA0_RDREQ_32B_sum)*64 + TCC_EA0_RDREQ_32B_sum*32)/1024
   description: The total kilobytes fetched from the video memory. This is measured with all extra fetches
     and any cache or memory effects taken into account.
 BANDWIDTH_EA:
   architectures:
-    gfx90a/gfx940/gfx941/gfx942:
-      expression: 1024*(FETCH_SIZE+WRITE_SIZE)/reduce(GRBM_GUI_ACTIVE,max)
+    gfx940/gfx941/gfx942:
+      expression: (WRITE_SIZE*1024+TCC_BUBBLE_sum*128+(TCC_BUBBLE_sum-TCC_EA0_RDREQ_sum)*64)/reduce(GRBM_GUI_ACTIVE,max)
+    gfx90a:
+      expression: 1024*(WRITE_SIZE+FETCH_SIZE)/reduce(GRBM_GUI_ACTIVE,max)
   description: Memory Bandwidth measured at the TCC_EA interface. In units of bytes/cycle.
 FetchSize:
   architectures:
@@ -3210,6 +3212,17 @@ TCC_WRREQ_STALL_max:
     gfx942/gfx941/gfx940:
       expression: reduce(TCC_EA0_WRREQ_STALL,max)
   description: Number of cycles a write request was stalled. Max over TCC instances.
+TCC_BUBBLE:
+  architectures:
+    gfx942/gfx941/gfx940:
+      block: TCC
+      event: 56
+  description: Number of 128-byte read requests sent to EA.
+TCC_BUBBLE_sum:
+  architectures:
+    gfx942/gfx941/gfx940:
+      expression: reduce(TCC_BUBBLE,sum)
+  description: Number of 128-byte read requests sent to EA. Sum over all TCC instances.
 # TCP Block (Texture Cache per Pipe)
 TCP_ATOMIC_TAGCONFLICT_STALL_CYCLES:
   architectures:
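The effect of the gfx942/gfx941/gfx940 FETCH_SIZE change can be checked with a little arithmetic. The sketch below is not part of the commit: it transcribes the removed and added expressions into two Python helpers (the function names and the counter values are hypothetical, chosen only for illustration) and evaluates both on the same inputs. It assumes, as the new expression implies, that reads which are neither 32B nor counted by TCC_BUBBLE are 64B.

```python
def fetch_size_old_kb(rdreq_sum, rdreq_32b_sum):
    """Removed gfx94x expression: every non-32B read is counted as 128 bytes."""
    return (rdreq_32b_sum * 32 + (rdreq_sum - rdreq_32b_sum) * 128) / 1024


def fetch_size_new_kb(rdreq_sum, rdreq_32b_sum, bubble_sum):
    """Added gfx94x expression: TCC_BUBBLE_sum supplies the 128B read count."""
    # Reads that are neither 128B (TCC_BUBBLE) nor 32B are weighted as 64B.
    reads_64b = rdreq_sum - bubble_sum - rdreq_32b_sum
    return (bubble_sum * 128 + reads_64b * 64 + rdreq_32b_sum * 32) / 1024


# Hypothetical counter values, not measured data.
rdreq_sum = 1000     # stands in for TCC_EA0_RDREQ_sum
rdreq_32b_sum = 200  # stands in for TCC_EA0_RDREQ_32B_sum
bubble_sum = 300     # stands in for TCC_BUBBLE_sum

old_kb = fetch_size_old_kb(rdreq_sum, rdreq_32b_sum)
new_kb = fetch_size_new_kb(rdreq_sum, rdreq_32b_sum, bubble_sum)
print(f"old FETCH_SIZE: {old_kb} KB, new FETCH_SIZE: {new_kb} KB")
```

With these inputs the old expression gives 106.25 KB and the new one 75.0 KB. The two agree only when every non-32B read is a 128B read, that is, when TCC_BUBBLE_sum equals TCC_EA0_RDREQ_sum minus TCC_EA0_RDREQ_32B_sum.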