* Implement CPU discovery support
SWDEV-482949:
Add CPU model name support to RDC so that the rdci command
can discover GPU and CPU modules at the same time.
CPU info is queried through the amdsmi interface. Example output:
1 GPUs found.
-----------------------------------------------------------------
GPU Index  Device Information
0          AMD Radeon PRO W7800
=================================================================
1 CPUs found.
-----------------------------------------------------------------
CPU Index  Device Information
0          AMD Ryzen Threadripper PRO 7995WX 96-Cores
-----------------------------------------------------------------
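For scripts that consume this listing, a minimal parsing sketch follows. It assumes the layout is exactly as in the sample above (a "N GPUs found." or "N CPUs found." line, then rows of index and device name); parse_discovery is a hypothetical helper and is not part of rdci or amdsmi.

```python
import re

# Sample copied from the commit message above.
SAMPLE = """\
1 GPUs found.
-----------------------------------------------------------------
GPU Index Device Information
0 AMD Radeon PRO W7800
=================================================================
1 CPUs found.
-----------------------------------------------------------------
CPU Index Device Information
0 AMD Ryzen Threadripper PRO 7995WX 96-Cores
-----------------------------------------------------------------
"""


def parse_discovery(text):
    """Map device kind ("GPU"/"CPU") to {index: device name}.

    Hypothetical helper: assumes the layout shown in SAMPLE.
    """
    devices = {}
    kind = None
    for line in text.splitlines():
        line = line.strip()
        # Section header, e.g. "1 GPUs found."
        found = re.fullmatch(r"\d+ (GPU|CPU)s found\.", line)
        if found:
            kind = found.group(1)
            devices[kind] = {}
            continue
        # Data row: leading index, whitespace, then the device name.
        row = re.fullmatch(r"(\d+)\s+(\S.*)", line)
        if kind and row:
            devices[kind][int(row.group(1))] = row.group(2)
    return devices


if __name__ == "__main__":
    for kind, rows in parse_discovery(SAMPLE).items():
        for index, name in rows.items():
            print(kind, index, name)
```

Separator lines and column headers are skipped because they do not start with a digit.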
Change-Id: Ibc6533c9a61000cd86c45b1bae14c3eb6788c119
Signed-off-by: Perry Yuan <perry.yuan@amd.com>
* CMAKE - Add required version for amdsmi
Change-Id: I341a89351d196ec66cce215a5d1d3953302fcc66
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
---------
Signed-off-by: Perry Yuan <perry.yuan@amd.com>
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
Co-authored-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
[ROCm/rdc commit: 3bdca8b8b6]
1. For temperature, the unit is millidegrees Celsius.
2. For power, the unit is microwatts.
3. Fix the second register call to rdcd not working because of the start flag.
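A minimal sketch of converting these raw units for display. Only the units (millidegrees Celsius, microwatts) come from this change; the function names are hypothetical and are not RDC APIs.

```python
# Hypothetical helpers: only the units are taken from the commit message.
MILLIDEGREES_PER_DEGREE = 1_000
MICROWATTS_PER_WATT = 1_000_000


def millicelsius_to_celsius(raw):
    """Convert a temperature reading in millidegrees Celsius to degrees Celsius."""
    return raw / MILLIDEGREES_PER_DEGREE


def microwatts_to_watts(raw):
    """Convert a power reading in microwatts to watts."""
    return raw / MICROWATTS_PER_WATT


if __name__ == "__main__":
    print(millicelsius_to_celsius(45_000))   # 45.0
    print(microwatts_to_watts(250_000_000))  # 250.0
```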
Co-authored-by: Chao Fei <chao.fei@amd.com>
[ROCm/rdc commit: bd7d7c99c1]
Add the RdcSmiHealth module, which calls rocm_smi_lib.
It will support the following health checks:
- XGMI error detected
- PCIE replay count detected
- Memory check
- InfoROM check
- Power/Thermal check
The gRPC client-side and server-side health function is added.
The health module is added to rdci.
At present, the XGMI and PCIE checks and part of the Memory check are implemented.
The others will be added as soon as possible.
Change-Id: I1bd99290bdc7dea733f21a41a8c4bcefb2138112
[ROCm/rdc commit: 853d3b0cc5]
Detach the thread that handles shutdown signals instead of joining
it. This avoids a segfault on specific ASICs.
Signed-off-by: Li Ma <li.ma@amd.com>
Change-Id: I74ac53c027ac370605caaa87115c83fd8027526a
[ROCm/rdc commit: ca569346a3]
Implement an API to obtain the version information of the rdc calling component.
See rdc_component_t for details on available components.
It can be expanded later if necessary.
Change-Id: I03b48f774179c52c57b606704283add74ca39a02
Signed-off-by: Chen Gong <curry.gong@amd.com>
[ROCm/rdc commit: 5a3fd9fbc1]
Display version information along with the hash value.
Change-Id: I0f9ad576f8f66747ce2e84d4f524ccd16d399927
Signed-off-by: Chen Gong <curry.gong@amd.com>
[ROCm/rdc commit: ac874d3921]
Modifying the /opt/rocm/etc/rdc file modifies RDC launch options. If
the file doesn't exist, the service should still launch (though a new
file should likely be included with the next released package of 'rdc').
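The "optional file" behaviour can be sketched as below. This is an illustration under stated assumptions, not the service's actual logic: the KEY=VALUE file format, the RDC_OPTS key, and the load_launch_options helper are all hypothetical.

```python
import shlex
from pathlib import Path


def load_launch_options(path, key="RDC_OPTS"):
    """Return launch options read from `path`, or [] if the file is missing.

    Assumption: the file holds shell-style KEY=VALUE lines and the options
    live under `key`. Both the format and the key name are hypothetical.
    """
    try:
        text = Path(path).read_text()
    except FileNotFoundError:
        # A missing file is not an error: launch with no extra options.
        return []
    options = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        name, sep, value = line.partition("=")
        if sep and name.strip() == key:
            # Drop surrounding quotes, then split into individual options.
            options = shlex.split(value.strip().strip("\"'"))
    return options


if __name__ == "__main__":
    print(load_launch_options("/opt/rocm/etc/rdc"))
```

Returning an empty list for a missing file keeps the launch path identical whether or not the package shipped the file.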
Change-Id: I1a1891e9c5c3e6048754eb555779a97a170754c0
[ROCm/rdc commit: de3cb36ce0]
The executable rdcd was using an absolute path in rdc.service. Using update-alternatives gives the flexibility to invoke the binary from anywhere and no absolute path is required.
Change-Id: I2f3d6fcbf9dd854870cfc2e00532c504ce6cd6fc
[ROCm/rdc commit: 0ca6d6fa59]
These RUNPATH changes make it so libraries can be found without setting
LD_LIBRARY_PATH.
Mostly tested on installed RDC binaries and libraries. The
build binaries should also work.
Change-Id: Ifd908a5b61d24dfcbb1d08d21b4ee830156d8643
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
[ROCm/rdc commit: 32806681ca]
Also add stddef.h workaround for old GCC.
RHEL-8 still uses GCC 8.5 and templates are not well supported.
Change-Id: Ia4dae23892ec63682ea848c46ba81de85cf6d209
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
[ROCm/rdc commit: f9e80cc37a]
NOTE: RVS Build is disabled by default due to CI build issues.
Change-Id: I1593f0fe22075a9f86f54afa3ac151e109f1f7bd
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
[ROCm/rdc commit: eaa1862a80]
Join the signal handling thread instead of cancelling it to prevent
a crash with "terminate called without an active exception".
Change-Id: I2e18eb825728fd3a94f67b1b0049516bb7b6ebbc
[ROCm/rdc commit: 1ab4110d46]
- Replace gRPC library with gRPC package
- Relax RUNPATH
- Make LINKER_FLAGS global
The gRPC package includes its dependencies:
SSL, UPB, ABSL, etc.
Change-Id: Ieb198ad96e26e89b09cb85986214a5b1451b17a6
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
[ROCm/rdc commit: 3e4c55ec6c]
- Respect CMAKE_INSTALL_PREFIX and ignore RDC_CLIENT_INSTALL_PREFIX
- Move example and rdctst from rocm/bin to rocm/share/rdc
- Add README for examples
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
Change-Id: I0b1d996d206327fd1b51ac6e82d548829bdb1570
[ROCm/rdc commit: f6efd7fbf6]
Main CMake improvements:
* Add rdctst with -DBUILD_TESTS=ON
* Set default ROCM_DIR to /opt/rocm/
* Split rdc_libs/CMakeLists.txt into subdirectories
* Package tests into rdc-tests.deb and .rpm
Misc improvements:
* Add .editorconfig to normalize code formatting
* Add .gitignore
* Expand RPATH for gRPC to reduce LD_LIBRARY_PATH usage
* Export compile_commands.json
* Show warning and do not install gRPC if GRPC_ROOT is left as default
* Move .in files into relevant subdirectories
* Move most variables into project CMakeLists.txt to avoid redefinitions
* Normalize CMakeLists.txt formatting (4 spaces indentation)
* Rename DIAGNOSTIC_LIB to RDC_ROCR_LIB
* Update gRPC version in README to 1.44.0
* Remove gtest source
* Pull gtest from github if not installed
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
Depends-On: I1039ef61247e3f0ff822925cc869fb0c2bf3af85
Change-Id: I879b21428e6642f19fda67092b365d8b78b7ba7b
[ROCm/rdc commit: 2c171767b3]
With the file reorganization changes, binaries are moved to /opt/rocm-ver/bin.
Similarly, rdc.service is moved to /opt/rocm-ver/libexec/rdc.
Test suites still use the old paths.
Once the test suites are updated, backward compatibility for binaries and rdc.service can be removed.
Corrected binary path in rdc.service.in
Corrected GRPC runpath
Change-Id: I306924d81cedc19586305a79d51eea8af6e70e83
[ROCm/rdc commit: c3ea96dd71]
SWDEV-291455 - Binaries, header files and libraries installed in the bin, include and lib folders under /opt/rocm-ver
Prebuilt ras library with updated search path
cmake config files in lib/cmake/rdc
grpc, sp3, hsaco and private libraries installed in lib/rdc
config installed in share/rdc
authentication and python_binding installed in libexec/rdc
Backward compatibility added for header files and libraries
Depends-On: I3f3d192935923f71737b3fe55ded536654a73dd7
Change-Id: Ia1a6cadc59034b155631a1ee5fdbe692d2a8a71b
[ROCm/rdc commit: 52a3463147]
grpc v1.44.0 needs to link to library absl_synchronization. The
CMakeLists.txt is changed to link to that library if available.
Change-Id: I92f7247473a70e7a83416b9744e788e45d104565
[ROCm/rdc commit: 2a46ee2ab2]
Provide an RdcSmiDiagnostic module, which calls rocm_smi_lib.
It supports the following diagnostics: get GPU topology, check GPU
parameters, and check processes running on the GPUs.
The gRPC client-side and server-side diagnostics function is added.
The diag module is added to rdci.
Change-Id: I10a0cf3c20556a61373ab686f82cae75acaa40dd
[ROCm/rdc commit: 76ccf58008]