Update fetch_size metric (#1165)

Co-authored-by: Gopesh Bhardwaj <gopesh.bhardwaj@amd.com>

[ROCm/rocprofiler-sdk commit: 4acca76edb]
Giovanni Lenzi Baraldi
2024-10-29 21:44:27 -03:00
Committed by GitHub
Parent 2800fb526f
Commit f6b0641a2a
2 changed files with 17 additions and 3 deletions
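As a rough illustration of what the FETCH_SIZE change in this commit does on gfx94x, the sketch below evaluates the previous and the updated expressions side by side. The function names and the sample counter values are hypothetical and only serve the example; the arithmetic follows the two `expression:` lines in the diff, reading `TCC_BUBBLE_sum` as the count of 128-byte reads and the remaining non-32B requests as 64-byte reads.

```python
def fetch_size_kb_previous(rdreq_sum: int, rdreq_32b_sum: int) -> float:
    """Previous gfx94x expression: every non-32B read is counted as 128 bytes."""
    return (rdreq_32b_sum * 32 + (rdreq_sum - rdreq_32b_sum) * 128) / 1024


def fetch_size_kb_updated(rdreq_sum: int, rdreq_32b_sum: int, bubble_sum: int) -> float:
    """Updated gfx94x expression: TCC_BUBBLE_sum counts the 128-byte reads."""
    reads_128b = bubble_sum
    reads_32b = rdreq_32b_sum
    # Requests that are neither 32B nor 128B are treated as 64B reads.
    reads_64b = rdreq_sum - bubble_sum - rdreq_32b_sum
    return (reads_128b * 128 + reads_64b * 64 + reads_32b * 32) / 1024


# Made-up counter values, purely for illustration.
TCC_EA0_RDREQ_sum = 1000
TCC_EA0_RDREQ_32B_sum = 200
TCC_BUBBLE_sum = 300

previous_kb = fetch_size_kb_previous(TCC_EA0_RDREQ_sum, TCC_EA0_RDREQ_32B_sum)
updated_kb = fetch_size_kb_updated(
    TCC_EA0_RDREQ_sum, TCC_EA0_RDREQ_32B_sum, TCC_BUBBLE_sum
)
print(f"previous: {previous_kb} KB, updated: {updated_kb} KB")
```

With these sample values the previous expression reports 106.25 KB and the updated one 75.0 KB, because only the 300 requests counted by `TCC_BUBBLE_sum` are weighted at 128 bytes.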
@@ -107,6 +107,7 @@ Full documentation for ROCprofiler-SDK is available at [Click Here](source/docs/
- Changed naming of agent profiling to device counting service (which more closely follows its name). To convert existing tool/user code to the new names, the following sed can be used: `find . -type f -exec sed -i 's/rocprofiler_agent_profile_callback_t/rocprofiler_device_counting_service_callback_t/g; s/rocprofiler_configure_agent_profile_counting_service/rocprofiler_configure_device_counting_service/g; s/agent_profile.h/device_counting_service.h/g; s/rocprofiler_sample_agent_profile_counting_service/rocprofiler_sample_device_counting_service/g' {} +`
- Changed naming of dispatch profiling service to dispatch counting service (which more closely follows its name). To convert existing tool/user code to the new names, the following sed can be used: `find . -type f -exec sed -i -e 's/dispatch_profile_counting_service/dispatch_counting_service/g' -e 's/dispatch_profile.h/dispatch_counting_service.h/g' -e 's/rocprofiler_profile_counting_dispatch_callback_t/rocprofiler_dispatch_counting_service_callback_t/g' -e 's/rocprofiler_profile_counting_dispatch_data_t/rocprofiler_dispatch_counting_service_data_t/g' -e 's/rocprofiler_profile_counting_dispatch_record_t/rocprofiler_dispatch_counting_service_record_t/g' {} +`
- Support specifying HW counters via command-line in rocprofv3, e.g. `rocprofv3 --pmc [COUNTER [COUNTER ...]]`
+- FETCH_SIZE metric on gfx94x uses TCC_BUBBLE for 128B reads.
### Fixes
@@ -215,13 +215,15 @@ FETCH_SIZE:
gfx908/gfx90a/gfx9/gfx900:
expression: (TCC_EA_RDREQ_32B_sum*32+(TCC_EA_RDREQ_sum-TCC_EA_RDREQ_32B_sum)*64)/1024
gfx942/gfx941/gfx940:
-expression: (TCC_EA0_RDREQ_32B_sum*32+(TCC_EA0_RDREQ_sum-TCC_EA0_RDREQ_32B_sum)*128)/1024
+expression: (TCC_BUBBLE_sum*128 + (TCC_EA0_RDREQ_sum-TCC_BUBBLE_sum-TCC_EA0_RDREQ_32B_sum)*64 + TCC_EA0_RDREQ_32B_sum*32)/1024
description: The total kilobytes fetched from the video memory. This is measured with all extra fetches
and any cache or memory effects taken into account.
BANDWIDTH_EA:
architectures:
-gfx90a/gfx940/gfx941/gfx942:
-expression: 1024*(FETCH_SIZE+WRITE_SIZE)/reduce(GRBM_GUI_ACTIVE,max)
+gfx940/gfx941/gfx942:
+expression: (WRITE_SIZE*1024+TCC_BUBBLE_sum*128+(TCC_BUBBLE_sum-TCC_EA0_RDREQ_sum)*64)/reduce(GRBM_GUI_ACTIVE,max)
+gfx90a:
+expression: 1024*(WRITE_SIZE+FETCH_SIZE)/reduce(GRBM_GUI_ACTIVE,max)
description: Memory Bandwidth measured at the TCC_EA interface. In units of bytes/cycle.
FetchSize:
architectures:
@@ -3210,6 +3212,17 @@ TCC_WRREQ_STALL_max:
gfx942/gfx941/gfx940:
expression: reduce(TCC_EA0_WRREQ_STALL,max)
description: Number of cycles a write request was stalled. Max over TCC instances.
+TCC_BUBBLE:
+architectures:
+gfx942/gfx941/gfx940:
+block: TCC
+event: 56
+description: Number of 128-byte read requests sent to EA.
+TCC_BUBBLE_sum:
+architectures:
+gfx942/gfx941/gfx940:
+expression: reduce(TCC_BUBBLE,sum)
+description: Number of 128-byte read requests sent to EA. Sum over all TCC instances.
# TCP Block (Texture Cache per Pipe)
TCP_ATOMIC_TAGCONFLICT_STALL_CYCLES:
architectures: