File
vedithal-amd 27585a8a2b Support MI 350 profiling (#632)
* Add MI 350 hardware information

* Refactor MI GPU YAML file and corresponding interface

* Add SoC file for gfx950 architecture

* Add analysis report configs for MI 350 containing existing metrics

* Add placeholder None valued metrics for previous architectures to make
  baseline comparison work

* Enable testing on MI 350

* Analysis config metric changes
    - SPI changes
        - Update metric formula for default SPI pipe counter
            - Use efficiently collected pipe wise SPI counters
        - Add SPI Wave Occupancy
        - Add Scheduler-Pipe Wave Utilization
        - Update formula for VGPR Writes
        - Add Scheduler-Pipe FIFO Full Rate
    - CPC changes
        - Add CPC SYNC FIFO Full Rate
        - Add CPC CANE Stall Rate
        - Add CPC ADC Utilization
    - SQ changes
        - Add VALU co-issue efficiency
        - Add F6F4 datatype metrics
        - Update formula for total FLOPs by adding F6F4 counters
        - Add LDS STORE / LOAD / ATOMIC metrics
        - Add LDS STORE / LOAD / ATOMIC bandwidth
        - Add LDS FIFO and TA ADDR / CMD / DATA FIFO full rates

* Collect TCP_TCP_LATENCY_sum only for gfx950 (MI 350)

* Do not inject SQ_ACCUM_PREV_HIRES unnecessarily

* Do not hardcode memory and shader clock speeds

* Write num_hbm_channels to sysinfo.csv instead of hbm_bw while profiling

* Move generate sysinfo.csv to pre processing step of profiling

* Add warnings to use --specs-correction for missing sysinfo.csv values during analysis phase

* Update CHANGELOG

* Analysis phase warning to use --specs-correction when needed

[ROCm/rocprofiler-compute commit: f9aa7be97c]
2025-04-03 02:21:18 -04:00

825 B

workload_name,command,ip_blocks,timestamp,version,hostname,cpu_model,sbios,linux_distro,linux_kernel_version,amd_gpu_kernel_version,cpu_memory,gpu_memory,rocm_version,vbios,compute_partition,memory_partition,gpu_model,gpu_arch,gpu_l1,gpu_l2,cu_per_gpu,simd_per_cu,se_per_gpu,wave_size,workgroup_max_size,max_waves_per_cu,max_sclk,max_mclk,cur_sclk,cur_mclk,total_l2_chan,lds_banks_per_cu,sqc_per_gpu,pipes_per_gpu,hbm_bw,num_xcd,num_hbm_channels
path,./tests/vcopy -n 1048576 -b 256 -i 3,SQ|LDS|SQC|TA|TD|TCP|TCC|SPI|CPC|CPF,Thu 21 Mar 2024 03:52:12 PM (CDT),2,t007-001.hpcfund,AMD EPYC 7V13 64-Core Processor,American Megatrends Inc.0602,Rocky Linux 9.1 (Blue Onyx),5.14.0-162.18.1.el9_1.x86_64,,527651008,,6.0.2-115,113-D3431401-100,NA,NA,MI100,gfx908,16,8192,120,4,8,64,1024,40,1502,1200,1502,1200,32,32,64,4,1228.8,1,32
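
The commit message mentions warning users to pass --specs-correction when sysinfo.csv values are missing at analysis time. As a rough illustration of that idea only (this is not the project's actual implementation), the sketch below reads a sysinfo.csv-style header and row and lists the columns that are blank. The helper name `find_missing_fields`, the sample data, and the rule that only an empty string counts as "missing" are all assumptions made for this example.

```python
import csv
import io

# Illustrative sample only: a small subset of sysinfo.csv columns, with
# num_hbm_channels deliberately left blank. Not taken from a real run.
SAMPLE = """gpu_model,gpu_arch,max_sclk,max_mclk,num_hbm_channels
MI100,gfx908,1502,1200,
"""


def find_missing_fields(csv_text):
    """Return the column names whose value in the first data row is blank.

    Hypothetical helper: treats only an empty (or whitespace-only) value
    as missing, which is an assumption of this sketch.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    row = next(reader)
    return [name for name, value in row.items() if not (value or "").strip()]


missing = find_missing_fields(SAMPLE)
if missing:
    # The flag name comes from the commit message; its argument syntax is
    # intentionally not shown here.
    print("Missing sysinfo.csv values: " + ", ".join(missing)
          + "; consider supplying them with --specs-correction")
```

Keeping the check to a single pass over one `csv.DictReader` row means the same function works whether the file carries the full column set or only a subset.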