* Implement CPU discovery support
SWDEV-482949:
enable the CPU model name info support to the RDC, rdci command
can detect GPU and CPU modules at the same time.
It will query the CPU info through the amdsmi interface like below:
1 GPUs found.
-----------------------------------------------------------------
GPU Index Device Information
0 AMD Radeon PRO W7800
=================================================================
1 CPUs found.
-----------------------------------------------------------------
CPU Index Device Information
0 AMD Ryzen Threadripper PRO 7995WX 96-Cores
-----------------------------------------------------------------
Change-Id: Ibc6533c9a61000cd86c45b1bae14c3eb6788c119
Signed-off-by: Perry Yuan <perry.yuan@amd.com>
* CMAKE - Add required version for amdsmi
Change-Id: I341a89351d196ec66cce215a5d1d3953302fcc66
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
---------
Signed-off-by: Perry Yuan <perry.yuan@amd.com>
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
Co-authored-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
1. For temperature the unit in milli Celsius
2. For power the unit in microwatts.
3. Fix second register call to rdcd doesn't functional because start flag
Co-authored-by: Chao Fei <chao.fei@amd.com>
Add the RdcSmiHealth module, which will call rocm_smi_lib.
It will support following health:
- XGMI error detected
- PCIE replay count detected
- Memory check
- InfoROM check
- Power/Thermal check
The grpc client and server side health function is added.
The health module is added to the rdci.
At present, XGMI/PCIE and a part of Memory have been implemented.
Others will be added as soon as possible.
Change-Id: I1bd99290bdc7dea733f21a41a8c4bcefb2138112
Detcah the thread which handle shutdown signals instead of joining
thread can avoid the segfault issue on specific ASIC.
Signed-off-by: Li Ma <li.ma@amd.com>
Change-Id: I74ac53c027ac370605caaa87115c83fd8027526a
Implement an API to obtain the version information of the rdc calling component.
See rdc_component_t for details on available components.
It can be expanded later if necessary.
Change-Id: I03b48f774179c52c57b606704283add74ca39a02
Signed-off-by: Chen Gong <curry.gong@amd.com>
Want to display version information along with the hash value.
Change-Id: I0f9ad576f8f66747ce2e84d4f524ccd16d399927
Signed-off-by: Chen Gong <curry.gong@amd.com>
Modifying the /opt/rocm/etc/rdc file modifies RDC launch options. If
the file doesn't exist, the service should still launch (though a new
file should likely be included with the next released package of 'rdc'.
Change-Id: I1a1891e9c5c3e6048754eb555779a97a170754c0
The executable rdcd was using an absolute path in rdc.service. Using update-alternatives gives the flexibility to invoke the binary from anywhere and no absolute path is required.
Change-Id: I2f3d6fcbf9dd854870cfc2e00532c504ce6cd6fc
These RUNPATH changes make it so libraries can be found without setting
LD_LIBRARY_PATH.
Mostly tested on installed RDC binaries and libraries. The
build binaries should also work.
Change-Id: Ifd908a5b61d24dfcbb1d08d21b4ee830156d8643
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
Also add stddef.h workaround for old GCC.
RHEL-8 still uses GCC 8.5 and templates are not well supported.
Change-Id: Ia4dae23892ec63682ea848c46ba81de85cf6d209
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
NOTE: RVS Build is disabled by default due to CI build issues.
Change-Id: I1593f0fe22075a9f86f54afa3ac151e109f1f7bd
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
Join the signal handling thread instead of cancel it to prevent
crash with "terminate called without an active exception".
Change-Id: I2e18eb825728fd3a94f67b1b0049516bb7b6ebbc
- Replace gRPC library with gRPC package
- Relax RUNPATH
- Make LINKER_FLAGS global
gRPC package includes its dependencies:
SSL, UPB, ABSL, and etc.
Change-Id: Ieb198ad96e26e89b09cb85986214a5b1451b17a6
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
- Respect CMAKE_INSTALL_PREFIX and ignore RDC_CLIENT_INSTALL_PREFIX
- Move example and rdctst from rocm/bin to rocm/share/rdc
- Add README for examples
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
Change-Id: I0b1d996d206327fd1b51ac6e82d548829bdb1570
Main CMake improvements:
* Add rdctst with -DBUILD_TESTS=ON
* Set default ROCM_DIR to /opt/rocm/
* Split rdc_libs/CMakeLists.txt into subdirectories
* Package tests into rdc-tests.deb and .rpm
Misc improvements:
* Add .editorconfig to normalize code formatting
* Add .gitignore
* Expand RPATH for gRPC to reduce LD_LIBRARY_PATH usage
* Export compile_commands.json
* Show warning and do not install gRPC if GRPC_ROOT is left as default
* Move .in files into relevant subdirectories
* Move most variables into project CMakeLists.txt to avoid redefinitions
* Normalize CMakeLists.txt formatting (4 spaces indentation)
* Rename DIAGNOSTIC_LIB to RDC_ROCR_LIB
* Update gRPC version in README to 1.44.0
* Remove gtest source
* Pull gtest from github if not installed
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
Depends-On: I1039ef61247e3f0ff822925cc869fb0c2bf3af85
Change-Id: I879b21428e6642f19fda67092b365d8b78b7ba7b
With file reorganization changes binaries are moved to /opt/rocm-ver/bin.
Similarly rdc.service moved to /opt/rocm-ver/libexec/rdc
Test suites still used old paths
Once test suites changes are made, backward compatibility for binaries and rdc.service can be removed
Corrcted binary path in rdc.service.in
Corrected GRPC runpath
Change-Id: I306924d81cedc19586305a79d51eea8af6e70e83
SWDEV-291455 - Binary , header files and libraries installed in bin,include and lib folder under /opt/rocm-ver
Prebuilt ras library with updated search path
cmake config files in lib/cmake/rdc
grpc,sp3,hsaco and private libraries installed in lib/rdc
config installed in share/rdc
authentication and python_binding installed in libexec/rdc
Backward compatibility added for header files and libraries
Depends-On: I3f3d192935923f71737b3fe55ded536654a73dd7
Change-Id: Ia1a6cadc59034b155631a1ee5fdbe692d2a8a71b
grpc v1.44.0 needs to link to library absl_synchronization. The
CMakeLists.txt is changed to link to that library if available.
Change-Id: I92f7247473a70e7a83416b9744e788e45d104565
Provides a RdcSmiDiagnostic module, which will call rocm_smi_lib.
It will support following diagnostics: Get GPU Topology, Check GPU
parameters and check processes running on the GPUs.
The grpc client and server side diagnostics function is added.
The diag module is added to the rdci.
Change-Id: I10a0cf3c20556a61373ab686f82cae75acaa40dd
Write access is required for some RSMI services. This change
temporarily permits write access so configuration can be done,
and then turns it off.
To help with this, the ScopedCapability struct is introduced to
provide scope limited access, helping to ensure a process is not
left with extra capability, should an exception occur.
Change-Id: I4978a1a688db935b8bfc27b3b537a0dd07959d3f
Install the grpc lib to rdc/grpc/lib and add miss libraries.
Add “--no-as-needed” and all extra grpc libraries in rdci/rdcd as
RUNPATH will only search direct dependencies.
Change-Id: I596acb2eb3a7228d703e79db64699bc20d0e7c09
Installing files to standard path across each version and using
ldconfig has issues with side-by-side install.
Usage of RUNPATH/RPATH for ROCm to ensure all ROCm libraries are
picked without the need for ldconfig.
For RDC server to be picked up by systemctl, service config file
shall be a symlink from /lib/systemctl/system/rdc.service to
corresponding RDC file path in a given version of ROCm
For side-by-side install packages of RDC post install scripts
will be removed. Hence Use will have to set the symlink explicitly
for now.
Change-Id: I916da7cf132f0f9c667e2470fac2b0875e3db9d0
Also:
* print header line every 50 line on output
* print events that are being listened for with header
* cpplint clean-up
Change-Id: Ic049eb79156a9528b556e56f0fa43e1344f898cc