Wykres commitów

52 Commity

Autor SHA1 Wiadomość Data
adapryor 33924ea79e Profiler - Fix SIMD Utilization
Change-Id: I6775cce9901a714d20e80c8c17e7a563edeb48a4
2025-05-07 00:56:52 -05:00
Galantsev, Dmitrii fa8b89f4ae CMAKE - Format with cmake-format
Change-Id: I08e71fc5060b1f6e0168225cc5fe66886c2044bd
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
2025-05-06 17:28:14 -05:00
Galantsev, Dmitrii 02c0786a2c Profiler - Add SIMD_UTILIZATION (#171)
Change-Id: I19d5acd80dbed8c4fc4e1c85eec71ca89398d299

Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
2025-05-06 13:20:03 -07:00
Galantsev, Dmitrii dfae9cd37f Profiler - Remove buffer to fix memory leaks
Change-Id: Ia3717ccfc147221557f5469965c2abb76b3f451c
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
2025-04-11 17:27:27 -05:00
Galantsev, Dmitrii 91be467cad Profiler - Fix eval fields
The 'value' pointer was being written to a lot and then used for reading
within the same function. This likely caused issues all over RDC when
reading the metrics.

This commit changes it so *value is written to only once.

Change-Id: I83c158c1e46c6ce46ff87d8a2e769f26ffa8c0da
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
2025-04-09 20:06:21 -05:00
Galantsev, Dmitrii bdb2367010 RVS - Add long-running tests
Change-Id: Iddeb7f2d4fdcd69d7ac1ae94b2fa128ee3011b1a
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
2025-03-27 23:42:56 -05:00
Galantsev, Dmitrii 58350a8bb8 Profiler - Remove bootstrap link
Change-Id: Ieea57515d77c2d521d95568c3bc2660cc829d829
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
2025-03-27 23:29:30 -05:00
Galantsev, Dmitrii 51de344be7 Profiler - Add CPC and CPF metrics
Change-Id: I27fd725e9e1868c9afe7624d6e4aafad2a42d47e
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
2025-03-27 19:01:23 -05:00
Galantsev, Dmitrii f5a4402ce5 RVS - Use config files and make GPU aware
Change-Id: I7a5c80ed4e6122d102e494d1ae38b4b7d40c42cd
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
2025-03-11 15:39:16 -05:00
Galantsev, Dmitrii 247c8c7d5e RVS - Disable IET test
Change-Id: I015d68735316d2dc6af18d16f972d9f379b76bcf
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
2025-03-11 09:51:08 -05:00
Pryor, Adam 6f358ddc9e SWDEV-508477 Eval Flops Percent (#85)
SWDEV-508477 - Profiler add FP*_PERCENT

Change-Id: Idb6250fe6b7ba3df6fe7d30861e0fbbda7e9bdce

Signed-off-by: adapryor <Adam.pryor@amd.com>
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
2025-01-24 10:07:32 -06:00
Galantsev, Dmitrii e033fd4c55 CMAKE - Rename SMI_*_DIR into AMD_SMI_*_DIR
Change-Id: I3b8b852e6b68f1448c8ed5d5e6ea4579c470ff53
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
2025-01-23 20:56:00 -06:00
adapryor 290b90dc89 Implementation for RDC_FI_PROF_OCCUPANCY_PER_ACTIVE_CU SWDEV-50895
Signed-off-by: adapryor <Adam.pryor@amd.com>
Change-Id: I8da7d9846edabe5629c75f50cd2bb4b23e019a17
Signed-off-by: adapryor <Adam.pryor@amd.com>
2025-01-21 21:49:19 -06:00
Pryor, Adam 0ae4404a09 SWDEV-510089 Fix rocprof segfaulting on ctrl+c (#94)
Change-Id: Iaa0f3856bb8fed174cbc935b85739414ecd44758

Signed-off-by: adapryor <Adam.pryor@amd.com>
2025-01-21 10:30:31 -06:00
Galantsev, Dmitrii 5861ec7663 RVS - Add IET and PEBB tests
Change-Id: Ia032901d74c882e5cbfa5a3164199cd4d571341f
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
2025-01-08 18:23:13 -06:00
Galantsev, Dmitrii b058cbecf1 RVS - Add memory bandwidth test
Change-Id: I4c8990170861f6a0f3853615db68634fdaa7a622
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
2025-01-08 18:23:13 -06:00
Greg Scaffidi f4de4b0529 Add RDC_FI_PROF_SM_ACTIVE metric.
Signed-off-by: Greg Scaffidi <salvatore.scaffidi@amd.com>
Change-Id: I63aaf5eb05d74ba696ace2b088e17c2cfb1bd74b
Signed-off-by: adapryor <Adam.pryor@amd.com>
2024-12-21 15:21:46 -06:00
Galantsev, Dmitrii 7c91a07a43 Profiler - Migrate from rocprofv1 to rocprofv3
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>

Fixed RDC for Rocprofv3

Updates

Signed-off-by: adapryor <Adam.pryor@amd.com>
Change-Id: Ic9162bacf1322b265e6bbcdd9fbb9b1fdef414fd

last updates

Change-Id: I12e168501327c5e4cff8a9273b0512fb0e098fe7

comment

Change-Id: I61da61e66dcc017ec46f98ff4c90fb064c9679e8
2024-12-20 15:39:02 -06:00
Galantsev, Dmitrii 2c61dfe2ce Profiler - Remove averaging
Averaging happens very slowly and only confuses people...

Change-Id: I60754d3b896b6ffeb6104bb1c2fcc54e9869b331
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
2024-12-11 11:58:50 -06:00
Galantsev, Dmitrii 2605eda5f3 Profiler - Fix fp64 metric
Change-Id: Iab27e21740c2c51143a9e88d085b80716bf193e2
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
2024-12-11 11:27:41 -06:00
Chen Gong 251fcbe49d rocprofiler: add valu utilization
SWDEV-475242

For the description of "FP32 Engine Activity" and "FP64 Engine Activity" in dcgm,
It seems that we do not have an equivalent to these pipe-utilizations on our hardware.

In rocprofiler, I think VALU Utilization is the closest to what we want.

Change-Id: Ibce8835ef4757084cdfd73258de6fc1606ca0158
Signed-off-by: Chen Gong <curry.gong@amd.com>
2024-11-21 15:24:01 +08:00
Galantsev, Dmitrii e1b57c43f3 RVS - Fix cookie_t -> rdc_diag_callback_t types issue
Issue introduced in 37ddd5bf50

Change-Id: I2b6a8024d45fc44d92cf2770be9887dfc0fb3ede
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
2024-11-12 10:36:52 -06:00
Galantsev, Dmitrii 37ddd5bf50 RVS - Report test progress in realtime
Change-Id: Id9fea71f242f372f408ecd777c030465b7ef9989
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
2024-11-07 11:21:22 -06:00
Galantsev, Dmitrii 9c77312c51 Finish basic logging impl
Change-Id: Ia3d6ac80f4832f1bfb63573c543659abd5f84341
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
2024-11-07 11:21:22 -06:00
Galantsev, Dmitrii cdf1588974 CMAKE - Find modules at build time
Change-Id: I9370ef1433579aff1a37f3636050f525638d8658
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
2024-11-07 11:21:22 -06:00
Galantsev, Dmitrii dd50027748 CMAKE - Fix RVS include
Change-Id: I65095cc3d04fc2a5daeee5c809f635cb1662822f
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>

Revert "Disable RVS as the error scares people"

This reverts commit 660c5afaf4.

Change-Id: I5086c25772444aa3bfc4c10abc1ea58d3f3f1f27
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
2024-11-07 11:18:41 -06:00
Galantsev, Dmitrii 28acbf0436 Profiler - Modify metrics
Remove occupancy metrics and replace with OccupancyPercent

Add OCCUPANCY_PERCENT which uses OccupancyPercent
Add GR_ENGINE_ACTIVE which uses GPU_UTIL/100
Add TENSOR_ACTIVE_PERCENT which uses MfmaUtil
Modify FLOPS_64 to use FP64_ACTIVE

Change-Id: I5f30d77a0c80f5ac78abd1a9e57f8a0a3c6cc00b
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
2024-10-15 19:00:30 -05:00
Galantsev, Dmitrii c40a6308c5 SWDEV-466829 - Disable ROCP when in GTest
Change-Id: I3b218fe256717c1dc9187d5f17476dfc990656c2
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
2024-09-26 17:00:05 -05:00
Bill(Shuzhou) Liu 9800528c19 Update the hsaco for diagonstic on MI300X
Add hsaco for gfx940, gfx941 and gfx942

Change-Id: Ibd55fcc2d036d1190357e1e86d4e170568426d94
2024-09-17 14:15:35 -05:00
Galantsev, Dmitrii d4bb33d100 Use correct rocprofiler metrics
Change-Id: I26603de7425abb6588f770ed68c22e14d6d20d56
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
2024-06-11 11:15:18 -05:00
Galantsev, Dmitrii 3514225b83 Rewrite rocprofiler plugin
Change-Id: Ic7dd967cc60cacd2b16a465180505ea2a342fccf
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
2024-06-11 03:11:15 -05:00
Galantsev, Dmitrii 20ca2ce574 Fix rocprofiler plugin
- Replace non-working fields with working ones
    - remove CU_OCCUPANCY completely as it isn't well supported
- Fix rocprofiler initialization with shared_ptr and rdc_module_init
- Replace env var ROCPROFILER_METRICS_PATH with ROCP_METRICS
    - ROCPROFILER_METRICS_PATH is only relevant for rocprofv2
    - ROCP_METRICS is only relevant for rocprofv1 (which we are using)

Change-Id: I21e6fa3f0e1694c38f44ca0e5659d672559f7380
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
2024-06-06 01:51:39 -05:00
Galantsev, Dmitrii 07c414af5e Finalize the rocprofiler fields
Change-Id: I4ed1c4309f21bdcc7281d911663036caf5947182
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
2024-06-04 19:49:06 -05:00
Galantsev, Dmitrii e11afbf60f Add GPU indexing and fix check for fields in rocprof
- Fix RUNPATH for tests

Change-Id: I79517592b49d27080a010a2e41e5878adf24a157
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
2024-06-04 12:56:22 -05:00
Galantsev, Dmitrii 53033a5b77 Add memory bandwidth metrics
Change-Id: I310ca8af0536497be619d2bda1e540d1f11c2565
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
2024-05-17 14:55:01 -05:00
Galantsev, Dmitrii c7fcb1ad25 Add rocprofiler_example.cc and fix logging
Change-Id: Ib3ed8754f314edc76ea56bfec9a645d720f8926d
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
2024-05-17 14:55:01 -05:00
Galantsev, Dmitrii c3a4c899d5 Profiler - Add all required metrics
Change-Id: Iea3938df9407789c061c3a6ead9167a69069d6e6
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
2024-05-09 23:24:02 -05:00
Galantsev, Dmitrii 234b2d835b Add rocprofiler plugin
Rename ROCR -> Runtime and ROCP -> Profiler

Change-Id: If90953da8fa5d695b681813dad4a3e7ec26a9c7e
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
2024-05-07 04:39:39 -05:00
Galantsev, Dmitrii 9702d0f2d7 SWDEV-439576 - rocmsmi -> amdsmi
- Migrate to amdsmi library
- NOTE: raslib still uses rocmsmi
- Remove unused rocmsmi service
- Remove unused RDC client code
- Remove RSMI calls from protos/rdc.proto

Change-Id: Ifc34a264c506b0ec5792307ee56b34526268762d
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
2024-04-09 20:19:28 -05:00
Galantsev, Dmitrii 662cc0f8b2 Revert "Sort the ROCr gpu index based on BDF"
Fix 'rdcd diag' compute and system tests.
This reverts commit 61a2773875.

Change-Id: Ia092c46649c1d6338fb96ffe7e6feba4b045f027
2024-04-09 10:27:19 -05:00
Galantsev, Dmitrii 9d55c26247 Remove -X from .hsaco files
Change-Id: I1f1b4f07eb854ce2e254564b83719be52b553b02
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
2024-03-27 20:35:08 -05:00
Galantsev, Dmitrii 6d5d9971c2 CMAKE - Find hsa-runtime64
Change-Id: Id877eb9cfcc61d81993a6a43703ef2e5f72e1e8f
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
2024-02-19 23:49:38 -05:00
Galantsev, Dmitrii f9e80cc37a Use templates for module population
Also add stddef.h workaround for old GCC.
RHEL-8 still uses GCC 8.5 and templates are not well supported.

Change-Id: Ia4dae23892ec63682ea848c46ba81de85cf6d209
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
2024-01-10 00:27:09 -06:00
Galantsev, Dmitrii eaa1862a80 RVS: Finish initial RVS integration
NOTE: RVS Build is disabled by default due to CI build issues.

Change-Id: I1593f0fe22075a9f86f54afa3ac151e109f1f7bd
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
2024-01-10 00:27:04 -06:00
Galantsev, Dmitrii 434e40305d LINT: Add cpplint, clang-format and pre-commit support
Change-Id: I3cbb787ef27d90486b212dfb1a8c77c460acc2ac
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
2024-01-09 11:37:11 -06:00
Bill(Shuzhou) Liu 61a2773875 Sort the ROCr gpu index based on BDF
The rocm-smi index is changed to sort based on BDF. The rocr plugin
is also changed based on that.

Change-Id: I5851431db336d50266b253dec1894a7bd9f3554b
2023-11-16 09:07:22 -05:00
Galantsev, Dmitrii 24b3f138e9 SWDEV-380364 - Resolve dmon + rocmtools halt
* Move hsa_init out of rocmtools and into RDC
* Remove secondary hsa_shut_down from ROCR module

Change-Id: I57d84d41ddc51595b98e734265f10bc5129a7352
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
Depends-On: I2b389ee1a9ba3507b2df1fc2fe83598f67731aac
2023-02-02 18:33:14 -06:00
Galantsev, Dmitrii 861a843ed7 Add rocmtools support
This commit adds integration with ROCmTools

Additional changes:
- Fix DEB and RPM installation issue when systemd is not present
- Fix typos in rdc.h
- Wrap negative values in parentheses in rdc.h
- CMAKE: Improve rocm_smi searching
- README: Improve formatting, add info about ROCmTools

Metrics added: 700-714
Metrics can be listed with `rdci dmon --list-all`
Majority of the metrics are only supported by Instict (MI) series GPUs
700 RDC_FI_PROF_ELAPSED_CYCLES should be available on most devices
See README for more information

Change-Id: I907d3eacdc92fc5588ca6c76c2fa1ce0ad900770
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
2022-12-16 12:19:59 -06:00
Galantsev, Dmitrii 2c171767b3 Compile rdctst and improve CMakeLists
Main CMake improvements:

* Add rdctst with -DBUILD_TESTS=ON
* Set default ROCM_DIR to /opt/rocm/
* Split rdc_libs/CMakeLists.txt into subdirectories
* Package tests into rdc-tests.deb and .rpm

Misc improvements:

* Add .editorconfig to normalize code formatting
* Add .gitignore
* Expand RPATH for gRPC to reduce LD_LIBRARY_PATH usage
* Export compile_commands.json
* Show warning and do not install gRPC if GRPC_ROOT is left as default
* Move .in files into relevant subdirectories
* Move most variables into project CMakeLists.txt to avoid redefinitions
* Normalize CMakeLists.txt formatting (4 spaces indentation)
* Rename DIAGNOSTIC_LIB to RDC_ROCR_LIB
* Update gRPC version in README to 1.44.0
* Remove gtest source
* Pull gtest from github if not installed

Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
Depends-On: I1039ef61247e3f0ff822925cc869fb0c2bf3af85
Change-Id: I879b21428e6642f19fda67092b365d8b78b7ba7b
2022-10-07 13:58:50 -05:00
Ranjith Ramakrishnan 52a3463147 File reorganization with backward compatibility
SWDEV-291455 -  Binary , header files and libraries installed in bin,include and lib folder under /opt/rocm-ver
Prebuilt ras library with updated search path
cmake config files in lib/cmake/rdc
grpc,sp3,hsaco and private libraries installed in lib/rdc
config  installed in share/rdc
authentication and python_binding installed in libexec/rdc
Backward compatibility added for header files and libraries

Depends-On: I3f3d192935923f71737b3fe55ded536654a73dd7
Change-Id: Ia1a6cadc59034b155631a1ee5fdbe692d2a8a71b
2022-08-04 23:42:42 -07:00