39 Commits

Author SHA1 Message Date
Dmitrii 8abe24d3b0 rdc: Add CPU support and CPU metrics infrastructure (#770) 2025-09-12 16:14:38 -05:00
Galantsev, Dmitrii 1d55c1d820 CMAKE - Format with gersemi
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>


[ROCm/rdc commit: 40545dcb49]
2025-06-27 17:25:51 -05:00
Galantsev, Dmitrii 1e8bc4dc96 CMAKE - Format with cmake-format
Change-Id: I08e71fc5060b1f6e0168225cc5fe66886c2044bd
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>


[ROCm/rdc commit: fa8b89f4ae]
2025-05-06 17:28:14 -05:00
Galantsev, Dmitrii 874a7b438f CMAKE - Fix build types
Addresses issue https://github.com/ROCm/rdc/issues/43

Change-Id: I456184358524a6feef4bf83eecb655678c3bc42d
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>


[ROCm/rdc commit: 80ee980cdb]
2025-03-30 18:54:54 -05:00
Galantsev, Dmitrii d5ce61d95e CMAKE - Move rdc_options into share/rdc/conf/
Change-Id: Ib2e792aef180f0f267d86d68c57b852b2cdc8ea6
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>


[ROCm/rdc commit: 99d4d77e20]
2025-01-24 12:06:05 -06:00
Galantsev, Dmitrii 3218c2af5c CMAKE - Rename SMI_*_DIR into AMD_SMI_*_DIR
Change-Id: I3b8b852e6b68f1448c8ed5d5e6ea4579c470ff53
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>


[ROCm/rdc commit: e033fd4c55]
2025-01-23 20:56:00 -06:00
Galantsev, Dmitrii efd58742db AMDSMI - Fix kRasErrStateStrings in tests
Change-Id: Ia9498fae215397baf7201715574954313c17da93
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>


[ROCm/rdc commit: 4f7e441566]
2024-11-07 11:21:22 -06:00
Galantsev, Dmitrii 999cae5e2c SWDEV-466829 - Disable ROCP when in GTest
Change-Id: I3b218fe256717c1dc9187d5f17476dfc990656c2
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>


[ROCm/rdc commit: c40a6308c5]
2024-09-26 17:00:05 -05:00
Galantsev, Dmitrii 9a2806ac95 SWDEV-452795 - Disable RAS plugin, fix XGMI
The RAS plugin loaded rocm-smi, which conflicts with the amd-smi library.

The main source of grief was the map 'devInfoTypesStrings', which is defined
in both rocm-smi and amd-smi.

We assume that rocm-smi gets lazy-loaded by the RAS library and
overwrites symbols defined in amd-smi. devInfoTypesStrings in rocm-smi
contains a different number of elements, and the enums also differ.
RDC relies on amd-smi's enums.

One such enum is kDevGpuMetrics:
  rocm-smi: kDevGpuMetrics = 68
  amd-smi:  kDevGpuMetrics = 75

Example of overlapping map definitions:

  $ objdump --dynamic-syms /opt/rocm/lib/libamd_smi.so | grep devInfoTypesStrings
  00000000003c4980 g    DO .data.rel.ro 0000000000000008  Base        devInfoTypesStrings
  00000000003db830 g    DO .bss 0000000000000030  Base        _ZN3amd3smi6Device19devInfoTypesStringsE
  $ objdump --dynamic-syms /opt/rocm/lib/librocm_smi64.so | grep devInfoTypesStrings
  00000000003dc590 g    DO .bss 0000000000000030  Base        _ZN3amd3smi6Device19devInfoTypesStringsE
  00000000003c9c68 g    DO .data.rel.ro 0000000000000008  Base        devInfoTypesStrings
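
The overlap in the listing above can also be checked with a script. The following is a minimal Python sketch and is not part of RDC: the names `dynamic_symbols` and `overlapping_symbols` are hypothetical, and the sample rows are copied from the objdump listing in this commit message. It assumes each symbol row starts with a hex address and ends with the symbol name, and it skips undefined (`*UND*`) entries, since those are imports rather than definitions.

```python
import re

# Sample rows copied from the objdump listing in this commit message.
AMD_SMI_ROWS = """\
00000000003c4980 g    DO .data.rel.ro 0000000000000008  Base        devInfoTypesStrings
00000000003db830 g    DO .bss 0000000000000030  Base        _ZN3amd3smi6Device19devInfoTypesStringsE
"""

ROCM_SMI_ROWS = """\
00000000003dc590 g    DO .bss 0000000000000030  Base        _ZN3amd3smi6Device19devInfoTypesStringsE
00000000003c9c68 g    DO .data.rel.ro 0000000000000008  Base        devInfoTypesStrings
"""


def dynamic_symbols(objdump_output):
    """Map symbol name -> address for each defined symbol row."""
    symbols = {}
    for line in objdump_output.splitlines():
        # Undefined symbols are imports, not definitions, so skip them.
        if "*UND*" in line:
            continue
        fields = line.split()
        # A symbol row starts with a hex address and ends with the name;
        # header lines such as "DYNAMIC SYMBOL TABLE:" do not match.
        if len(fields) < 3 or not re.fullmatch(r"[0-9a-f]{8,16}", fields[0]):
            continue
        symbols[fields[-1]] = int(fields[0], 16)
    return symbols


def overlapping_symbols(first_output, second_output):
    """Return the sorted names that both libraries define."""
    first = dynamic_symbols(first_output)
    second = dynamic_symbols(second_output)
    return sorted(first.keys() & second.keys())


if __name__ == "__main__":
    for name in overlapping_symbols(AMD_SMI_ROWS, ROCM_SMI_ROWS):
        print(name)
```

To run this against real libraries, the two input strings could be captured with `subprocess.run(["objdump", "--dynamic-syms", path], capture_output=True, text=True).stdout`, without the `grep` filter, so that every defined symbol is compared.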

Change-Id: Ib2f2db32b6abd7ebe84e7807c25581461eb86bae
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>


[ROCm/rdc commit: d85657e5f2]
2024-06-26 03:42:07 -05:00
Galantsev, Dmitrii 29b86095ed Fix rocprofiler plugin
- Replace non-working fields with working ones
    - remove CU_OCCUPANCY completely as it isn't well supported
- Fix rocprofiler initialization with shared_ptr and rdc_module_init
- Replace env var ROCPROFILER_METRICS_PATH with ROCP_METRICS
    - ROCPROFILER_METRICS_PATH is only relevant for rocprofv2
    - ROCP_METRICS is only relevant for rocprofv1 (which we are using)

Change-Id: I21e6fa3f0e1694c38f44ca0e5659d672559f7380
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>


[ROCm/rdc commit: 20ca2ce574]
2024-06-06 01:51:39 -05:00
Galantsev, Dmitrii f73e123900 Add GPU indexing and fix check for fields in rocprof
- Fix RUNPATH for tests

Change-Id: I79517592b49d27080a010a2e41e5878adf24a157
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>


[ROCm/rdc commit: e11afbf60f]
2024-06-04 12:56:22 -05:00
Maisam Arif d9adf280cd Updated RDC to use AMD-SMI 24.6.0 structs
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: I9ef0f3cb786c1238e53cf21df5c6afafac829175


[ROCm/rdc commit: 7c6bd4dc1c]
2024-05-31 10:37:39 -05:00
Galantsev, Dmitrii f74f1684de Update kBlockNameMap
Change-Id: I096f40f2b953fad7081d4b9bc05c0291c0f8058d
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>


[ROCm/rdc commit: cb87eeeae7]
2024-04-24 23:50:55 -05:00
Galantsev, Dmitrii 028355dff0 SWDEV-439576 - rocmsmi -> amdsmi
- Migrate to amdsmi library
- NOTE: raslib still uses rocmsmi
- Remove unused rocmsmi service
- Remove unused RDC client code
- Remove RSMI calls from protos/rdc.proto

Change-Id: Ifc34a264c506b0ec5792307ee56b34526268762d
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>


[ROCm/rdc commit: 9702d0f2d7]
2024-04-09 20:19:28 -05:00
Galantsev, Dmitrii ea624cbb7c LINT: Add cpplint, clang-format and pre-commit support
Change-Id: I3cbb787ef27d90486b212dfb1a8c77c460acc2ac
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>


[ROCm/rdc commit: 434e40305d]
2024-01-09 11:37:11 -06:00
Galantsev, Dmitrii d4440d392e Upgrade to CXX-17 gtest-1.14
Change-Id: I1c7316f151128cbc9318b226dac14950e399d2c7
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>


[ROCm/rdc commit: 8f9a6796f1]
2023-09-28 12:54:49 -05:00
Galantsev, Dmitrii a337dc062b SWDEV-392942 - Disable rocmtools
Temporarily disable rocmtools because of hsa_shut_down issues

Change-Id: I5e8b6729b8200ccdd5c399862bfc632ba69f884c
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>


[ROCm/rdc commit: 90e824c63b]
2023-04-05 13:20:19 -05:00
Galantsev, Dmitrii 4091faf4f4 SWDEV-376779 - Fix linking for rdctst
Ieb198ad96e26e89b09cb85986214a5b1451b17a6 broke linking
for rdctst and rdcd by removing "../lib/rdc" path.
This change adds it back and makes the paths more visible.

- Link librdc_ras and librdc_rocp to rdctst
- Add longer RUNPATH for rdctst to link rdc libraries

Change-Id: Id4f128c217a6de8bb67df6750ecafdb96545811b
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>


[ROCm/rdc commit: fc097d44ff]
2023-01-11 19:40:59 -05:00
Galantsev, Dmitrii 5c803f6b03 SWDEV-352414 - Fix gRPC linker issues
- Replace gRPC library with gRPC package
- Relax RUNPATH
- Make LINKER_FLAGS global

gRPC package includes its dependencies:
SSL, UPB, ABSL, etc.

Change-Id: Ieb198ad96e26e89b09cb85986214a5b1451b17a6
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>


[ROCm/rdc commit: 3e4c55ec6c]
2023-01-04 18:50:07 -06:00
Galantsev, Dmitrii 2b89ab397c Improve CMake and relocate tests
- Respect CMAKE_INSTALL_PREFIX and ignore RDC_CLIENT_INSTALL_PREFIX
- Move example and rdctst from rocm/bin to rocm/share/rdc
- Add README for examples

Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
Change-Id: I0b1d996d206327fd1b51ac6e82d548829bdb1570


[ROCm/rdc commit: f6efd7fbf6]
2022-10-27 13:49:54 -05:00
Galantsev, Dmitrii 9ff80828e5 Compile rdctst and improve CMakeLists
Main CMake improvements:

* Add rdctst with -DBUILD_TESTS=ON
* Set default ROCM_DIR to /opt/rocm/
* Split rdc_libs/CMakeLists.txt into subdirectories
* Package tests into rdc-tests.deb and .rpm

Misc improvements:

* Add .editorconfig to normalize code formatting
* Add .gitignore
* Expand RPATH for gRPC to reduce LD_LIBRARY_PATH usage
* Export compile_commands.json
* Show warning and do not install gRPC if GRPC_ROOT is left as default
* Move .in files into relevant subdirectories
* Move most variables into project CMakeLists.txt to avoid redefinitions
* Normalize CMakeLists.txt formatting (4 spaces indentation)
* Rename DIAGNOSTIC_LIB to RDC_ROCR_LIB
* Update gRPC version in README to 1.44.0
* Remove gtest source
* Pull gtest from github if not installed

Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
Depends-On: I1039ef61247e3f0ff822925cc869fb0c2bf3af85
Change-Id: I879b21428e6642f19fda67092b365d8b78b7ba7b


[ROCm/rdc commit: 2c171767b3]
2022-10-07 13:58:50 -05:00
Ranjith Ramakrishnan 3df8b88ca6 File reorganization with backward compatibility
SWDEV-291455 - Binaries, header files and libraries installed in the bin, include and lib folders under /opt/rocm-ver
Prebuilt ras library with updated search path
cmake config files in lib/cmake/rdc
grpc, sp3, hsaco and private libraries installed in lib/rdc
config installed in share/rdc
authentication and python_binding installed in libexec/rdc
Backward compatibility added for header files and libraries

Depends-On: I3f3d192935923f71737b3fe55ded536654a73dd7
Change-Id: Ia1a6cadc59034b155631a1ee5fdbe692d2a8a71b


[ROCm/rdc commit: 52a3463147]
2022-08-04 23:42:42 -07:00
Bill(Shuzhou) Liu 234ef250e2 Upgrade GoogleTest to v1.11.0
The old GoogleTest has compile errors on CentOS 9. Upgrade it
to the latest version.

Change-Id: Ifc95c68ddf2321509b90e20af11c8d468a63f431


[ROCm/rdc commit: c465d29d8c]
2022-03-14 10:23:06 -04:00
Bill(Shuzhou) Liu 84eca4cf9e Add -g compiler option for ADDRESS_SANITIZER
Add -g compiler option for Address Sanitizer

Change-Id: I5c4a72dd06a7242715c537fc0d44770b126862d2


[ROCm/rdc commit: 6f95200387]
2021-08-03 13:52:21 -04:00
Bill(Shuzhou) Liu 0370836c04 Add the Address Sanitizer Support for RDC
Change the CMakeLists.txt to add -fsanitize=address.
Refer to Jira ticket SWDEV-259873.

Change-Id: Ie37fd661787eaea16f366b925d9a97db233cd136


[ROCm/rdc commit: ceb562d630]
2021-01-07 12:11:12 -05:00
Chris Freehill 79b5e54d3b Add event notification support and rdci timestamps
Also:
* print header line every 50 lines of output
* print events that are being listened for with header
* cpplint clean-up

Change-Id: Ic049eb79156a9528b556e56f0fa43e1344f898cc


[ROCm/rdc commit: b278cd379b]
2020-11-22 07:10:39 -05:00
Chris Freehill 8d297e07c1 Fix how test deals with terminating rdcd
Previously we would return -1 if we detected rdcd was
still running. But the rdcd process ID is alive as long
as the test is running. So now we return 0, and the rdcd
process ends, allowing the test to end cleanly.

Change-Id: I98a5aa0a03d14127824b86e1190047c9f9d2edb7


[ROCm/rdc commit: 15be17539f]
2020-08-17 14:09:37 -05:00
Chris Freehill 14b109f888 Fix rdcd start/kill issues in rdctst
Change-Id: I7e4b1c19832b09b17720892d2c4f200d304ef2fb


[ROCm/rdc commit: 63863cbd2e]
2020-08-17 14:09:37 -05:00
Chris Freehill 6b246dcf4b rdc_field_t replaces uint32_t; centralize field data
Make the RDC use the new rdc_field_t enum instead of uint32_t.
This will help prevent invalid field types from being passed in.

Also, centralize where data related to fields is kept. This will
reduce the number of places where changes are required each
time a new field is added.

Finally, cleaned up several cpplint issues.

Change-Id: I48e4512e18c164411d8b09ae3d4bed99fba359ec


[ROCm/rdc commit: 5950ebadc4]
2020-08-17 14:09:37 -05:00
Chris Freehill bf248131cb Fix rdctst build
Compile and link steps were looking in wrong directories for
include and library files.

Change-Id: I5cbfd67ca2a02cab898f820587a9793f2105f2e6


[ROCm/rdc commit: 9efb55b06f]
2020-08-17 14:09:37 -05:00
Chris Freehill cd5f37a3aa Prepare rdctst for automated test runs
Mostly this involves creating a "batch mode" which does not
have any interactive prompts. Also, in batch mode, both stand-
alone and embedded modes are run.

Change-Id: I9703e501ab1f853e992b6b401fa0215681ab69f0


[ROCm/rdc commit: 5f947270c1]
2020-08-17 14:09:29 -05:00
Divya Shikre b156e86589 Implement gtests for RDC
Add gtest placeholder
Add discovery, group, fieldgroup, dmon and stats tests

Change-Id: I71428f70345af5c8025fb66c1d411dc348daa2ef


[ROCm/rdc commit: 61579371f8]
2020-08-17 14:07:25 -05:00
Chris Freehill 819c4febca Make gRPC and protobuf external components to RDC
Pass in GRPC root (or use default location) for RDC to use
when building RDC components.

Change-Id: I89db2ac2be27ab6449c817d210a94c11fef965fd


[ROCm/rdc commit: 1b58033183]
2020-08-17 14:07:25 -05:00
Chris Freehill 2f59e7e1ab Add support for gRPC authenticated communications
Also, make a few namespace corrections and some minor refactoring.

Change-Id: Iedcaf6b43cb7576bc11dfefe980abd190c838831


[ROCm/rdc commit: 47fdfa4c7e]
2020-08-17 14:07:25 -05:00
Chris Freehill 5c33103352 Make rdcd run as user "rdc"
The rdc account will be created on installation if it does
not already exist. It will be a system account with no
home directory.

rdcd will be started as a systemd service, but will change to
user "rdc". The rdc user will drop all privileges except
CAP_DAC_OVERRIDE, permitted. This means the default mode
will have no special privileges, but have the ability to
gain write access (e.g., to sysfs) when needed.

rdc tests were being inadvertently added to the
installation. This was adversely impacting the new
functionality, so it was corrected in this commit.

Also included are a few small formatting changes.

Change-Id: I9c6bb132fee28119fd3960594dfb97bd2e7b282a


[ROCm/rdc commit: 5cc498c6aa]
2020-08-17 14:07:25 -05:00
Chris Freehill 87aa4ff77c Add read fan values and associated tests
Change-Id: I89322e93d5f3110adace15e5a576f00d4934be79


[ROCm/rdc commit: 4729c47866]
2020-08-17 14:07:25 -05:00
Chris Freehill 77683bf0e8 Add Google test based tests.
Initial tests include an "id test", which is really just a
template test at this point, and a temperature sensor test.

The Google Test code is included in this commit. It will
eventually be taken out and replaced with a pull from an external
Google repo.

Change-Id: I591818a9c169f4654fc8d8f17cf648f227d72545


[ROCm/rdc commit: ca4344f5fa]
2020-08-17 14:06:56 -05:00
Chris Freehill ba14edbb4d Break srvs. into rsmi & admin srvs. Add VerifyConnection api.
Change-Id: I67567264c37e31f3409062a14e56eba4801cd944


[ROCm/rdc commit: dc6f6f3e9a]
2020-01-09 20:02:33 -06:00
Chris Freehill bc7f01e992 Initial RDC commit
Includes server, client and example targets.

Change-Id: I30596fb0453af71d49b8390a8468a6d073200836


[ROCm/rdc commit: 5898345d17]
2020-01-09 17:57:29 -06:00