vedithal-amd 27585a8a2b Support MI 350 profiling (#632)
* Add MI 350 hardware information

* Refactor MI GPU YAML file and corresponding interface

* Add SoC file for gfx950 architecture

* Add analysis report configs for MI 350 containing existing metrics

* Add placeholder None-valued metrics for previous architectures so that
  baseline comparison works

* Enable testing on MI 350

* Analysis config metric changes
    - SPI changes
        - Update metric formula for default SPI pipe counter
            - Use efficiently collected pipe-wise SPI counters
        - Add SPI Wave Occupancy
        - Add Scheduler-Pipe Wave Utilization
        - Update formula for VGPR Writes
        - Add Scheduler-Pipe FIFO Full Rate
    - CPC changes
        - Add CPC SYNC FIFO Full Rate
        - Add CPC CANE Stall Rate
        - Add CPC ADC Utilization
    - SQ changes
        - Add VALU co-issue efficiency
        - Add F6F4 datatype metrics
        - Update formula for total FLOPs by adding F6F4 counters
        - Add LDS STORE / LOAD / ATOMIC metrics
        - Add LDS STORE / LOAD / ATOMIC bandwidth
        - Add LDS FIFO and TA ADDR / CMD / DATA FIFO full rates

* Collect TCP_TCP_LATENCY_sum only for gfx950 (MI 350)

* Do not inject SQ_ACCUM_PREV_HIRES unnecessarily

* Do not hardcode memory and shader clock speeds

* Write num_hbm_channels to sysinfo.csv instead of hbm_bw while profiling

* Move sysinfo.csv generation to the pre-processing step of profiling

* Add warnings to use --specs-correction for missing sysinfo.csv values during analysis phase

* Update CHANGELOG

* Analysis phase warning to use --specs-correction when needed

[ROCm/rocprofiler-compute commit: f9aa7be97c]
2025-04-03 02:21:18 -04:00

809 B

workload_name,command,ip_blocks,timestamp,version,hostname,cpu_model,sbios,linux_distro,linux_kernel_version,amd_gpu_kernel_version,cpu_memory,gpu_memory,rocm_version,vbios,compute_partition,memory_partition,gpu_series,gpu_model,gpu_arch,gpu_chip_id,gpu_l1,gpu_l2,cu_per_gpu,simd_per_cu,se_per_gpu,wave_size,workgroup_max_size,max_waves_per_cu,max_sclk,max_mclk,cur_sclk,cur_mclk,total_l2_chan,lds_banks_per_cu,sqc_per_gpu,pipes_per_gpu,num_xcd,num_hbm_channels
vcopytests/vcopy -n 1048576 -b 256 -i 3SQ|LDS|SQC|TA|TD|TCP|TCC|SPI|CPC|CPFFri Mar 28 22:43:57 2025 (UTC)3f77021840818AMD Ryzen Threadripper PRO 7985WX 64-CoresAMDVBS1052957N.FDUbuntu 22.04.5 LTS5.15.0-70-generic5274561166.5.0-831113-M3550101-100SPXNPS1MI350MI350gfx950301123240961284166410243200128326448128
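
The commit message mentions warning users to pass --specs-correction when sysinfo.csv has missing values. A minimal sketch of such a check follows. It is not the tool's actual implementation: the function names, the REQUIRED_FIELDS subset, and the sample values are hypothetical and chosen only for illustration. Only the column names and the --specs-correction option come from the text above.

```python
import csv
import io
import warnings

# Hypothetical subset of columns to check; the real tool may check others.
REQUIRED_FIELDS = ("max_sclk", "max_mclk", "cur_sclk", "cur_mclk", "num_hbm_channels")


def missing_sysinfo_fields(csv_text, required=REQUIRED_FIELDS):
    """Return required column names that are absent or empty in the first data row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    row = next(reader, None)
    if row is None:
        # Header only (or empty input): every required value is missing.
        return list(required)
    return [name for name in required if not (row.get(name) or "").strip()]


def warn_if_incomplete(csv_text):
    """Emit a warning that points at --specs-correction when values are missing."""
    missing = missing_sysinfo_fields(csv_text)
    if missing:
        warnings.warn(
            "sysinfo.csv is missing values for: "
            + ", ".join(missing)
            + "; consider supplying them with --specs-correction"
        )
    return missing


# Illustrative input only: clock columns left empty, channel count filled in.
sample = (
    "workload_name,max_sclk,max_mclk,cur_sclk,cur_mclk,num_hbm_channels\n"
    "vcopy,,,,,128\n"
)
print(missing_sysinfo_fields(sample))
```

With the illustrative sample, the four clock columns are reported as missing and num_hbm_channels is not, since it is the only required column with a value.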