rocm-systems/projects/rocm-core/rdhc
solaiys eab103d4ed [RDHC] Update rocm-core package scripts to include rdhc script (#1482)
* Add rdhc script in to rocm-core package
  * Create the rdhc symlink within the package itself.
  * rdhc tool support is not enabled for windows.

  * [RDHC] Check if the required pip pkgs are present and warn .
     rdhc checks the required pip packages are present or not.
     if not warns the user and exits gracefully.

Signed-off-by: Saravanan Solaiyappan <saravanan.solaiyappan@amd.com>
2025-10-22 19:54:40 +05:30

rdhc

ROCm Deployment Health Check Tool

Features of the ROCm Deployment Health Check Tool

  1. Cross-Platform Support: Works on Ubuntu, RHEL, and SLES distributions
  2. Comprehensive Testing:
    • Default tests (GPU presence, driver status, rocminfo, rocm-smi)
    • Library dependency verification
    • Checks for the presence of selected kernel parameters and environment variables
    • Component-specific tests
      • Dynamically builds and runs the test programs from the rocm-examples Git repository.
  3. Dynamic Component Detection: Identifies installed ROCm components using distribution-specific package manager commands
  4. Flexible Reporting:
    • Pretty table output for terminal display
    • JSON export for further analysis or integration
  5. Configurable Verbosity: Through command-line options (-v for verbose, -s for silent)
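
The dynamic component detection in item 3 could be sketched roughly as below. This is an illustrative sketch, not rdhc's actual code: the function names `parse_os_release` and `package_list_command` are hypothetical, and the mapping from distribution ID to package manager command is an assumption based only on the three distributions listed above.

```python
import shlex

# Assumed mapping from the os-release ID field to a package-listing command.
# dpkg-query and rpm are real tools; the exact commands rdhc runs are not
# documented in this README.
PACKAGE_LIST_COMMANDS = {
    "ubuntu": ["dpkg-query", "-W", "-f=${Package}\\n"],
    "rhel": ["rpm", "-qa", "--qf", "%{NAME}\\n"],
    "sles": ["rpm", "-qa", "--qf", "%{NAME}\\n"],
}


def parse_os_release(text):
    """Parse os-release style KEY=value lines into a dict."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # shlex.split removes the optional quotes around the value.
        parts = shlex.split(value)
        info[key] = parts[0] if parts else ""
    return info


def package_list_command(os_release_text):
    """Return the package-listing command for the distribution, or None."""
    distro_id = parse_os_release(os_release_text).get("ID", "").lower()
    return PACKAGE_LIST_COMMANDS.get(distro_id)
```

On a real system the input text would typically come from /etc/os-release, and the returned command could be passed to `subprocess.run` to list installed packages before filtering for ROCm components.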

Install dependency pip packages

sudo pip3 install -r requirements.txt
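
rdhc checks whether the required pip packages are present and, if not, warns the user and exits gracefully. A minimal sketch of such a check is shown below. The function names are hypothetical, and the module names to check are an assumption, since the contents of requirements.txt are not listed here.

```python
import importlib.util
import sys


def missing_modules(module_names):
    """Return the top-level module names that cannot be found for import."""
    return [
        name for name in module_names
        if importlib.util.find_spec(name) is None
    ]


def check_dependencies(module_names):
    """Warn on stderr and return False if any module is missing."""
    missing = missing_modules(module_names)
    if missing:
        print("Warning: missing Python modules: " + ", ".join(missing),
              file=sys.stderr)
        print("Install them with: sudo pip3 install -r requirements.txt",
              file=sys.stderr)
        return False
    return True
```

A caller could run `check_dependencies([...])` at startup and call `sys.exit(1)` when it returns False, so the user sees a clear message instead of an import traceback.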

Usage

./rdhc.py -h
usage: sudo -E rdhc.py [options]

ROCm Deployment Health Check Tool

optional arguments:
  -h, --help            show this help message and exit
  --quick               Run quick tests only (default)
  --all                 Default tests + Compile and executes simple program for each component.
  -v, --verbose         Enable verbose output
  -s, --silent          Silent mode (errors only)
  -j FILE, --json FILE  Export results to JSON file
  -d DIR, --dir DIR     Directory path for temporary files (default: /tmp/rdhc/)

Usage examples:
# Run quick test (default tests only)
sudo -E ./rdhc.py

# Run all tests, including compiling and executing the rocm-examples program for each component
sudo -E ./rdhc.py --all

# Run all tests with verbose output
sudo -E ./rdhc.py --all -v

# Enable verbose output
sudo -E ./rdhc.py -v

# Run in silent mode (only errors shown)
sudo -E ./rdhc.py -s

# Export results to a specific JSON file
sudo -E ./rdhc.py --all --json rdhc-results.json

# Specify a directory for temp files and logs (default: /tmp/rdhc/)
sudo -E ./rdhc.py -d /home/user/rdhc-dir/
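
For readers who want to wrap or script the tool, the options above could be declared with Python's argparse as in the sketch below. Whether rdhc.py is built this way is an assumption; the function name `build_parser` and the destination name `run_all` are hypothetical, and only the flags and defaults are taken from the help text.

```python
import argparse


def build_parser():
    """Build a parser that mirrors the options shown in the help text."""
    parser = argparse.ArgumentParser(
        prog="rdhc.py",
        usage="sudo -E rdhc.py [options]",
        description="ROCm Deployment Health Check Tool",
    )
    parser.add_argument("--quick", action="store_true",
                        help="Run quick tests only (default)")
    # "run_all" is a hypothetical destination name chosen for readability.
    parser.add_argument("--all", action="store_true", dest="run_all",
                        help="Default tests plus a compile-and-run test per component")
    parser.add_argument("-v", "--verbose", action="store_true",
                        help="Enable verbose output")
    parser.add_argument("-s", "--silent", action="store_true",
                        help="Silent mode (errors only)")
    parser.add_argument("-j", "--json", metavar="FILE",
                        help="Export results to JSON file")
    parser.add_argument("-d", "--dir", metavar="DIR", default="/tmp/rdhc/",
                        help="Directory path for temporary files (default: /tmp/rdhc/)")
    return parser
```

For example, `build_parser().parse_args(["--all", "--json", "rdhc-results.json"])` yields a namespace with `run_all` set, `json` pointing at the output file, and `dir` left at its default.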

RDHC Environment Variables

The RDHC tool reads the following environment variables and adjusts its behavior if they are set.

# The ROCm installation path can be set with the environment variable below. Default is "/opt/rocm/"
export ROCM_PATH="/opt/rocm"

# For library dependency validation, the library search depth can be set with the environment variable below.
# The default is full depth: all library files under ROCM_PATH/lib/ are checked recursively.
export LIBDIR_MAX_DEPTH=""

# To check only the libraries directly inside ROCM_PATH/lib/, set the depth to 1.
export LIBDIR_MAX_DEPTH=1
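
The effect of these two variables can be illustrated with the sketch below. It is not taken from rdhc.py: the function names are hypothetical, and matching library files by a `.so` suffix is an assumption made for the example.

```python
import os


def rocm_lib_dir(environ):
    """Return the lib directory under ROCM_PATH (default /opt/rocm)."""
    rocm_path = environ.get("ROCM_PATH") or "/opt/rocm"
    return os.path.join(rocm_path, "lib")


def max_depth(environ):
    """Return LIBDIR_MAX_DEPTH as an int, or None for full depth."""
    raw = environ.get("LIBDIR_MAX_DEPTH", "").strip()
    return int(raw) if raw else None


def find_shared_libs(lib_dir, depth_limit=None):
    """Yield shared-library paths under lib_dir.

    A depth_limit of 1 means only files directly inside lib_dir;
    None means unlimited recursion.
    """
    base_depth = lib_dir.rstrip(os.sep).count(os.sep)
    for root, dirs, files in os.walk(lib_dir):
        depth = root.rstrip(os.sep).count(os.sep) - base_depth + 1
        if depth_limit is not None and depth >= depth_limit:
            dirs[:] = []  # prune: do not descend below the limit
        for name in sorted(files):
            # Assumed pattern: libfoo.so or versioned libfoo.so.1
            if name.endswith(".so") or ".so." in name:
                yield os.path.join(root, name)
```

Calling `find_shared_libs(rocm_lib_dir(os.environ), max_depth(os.environ))` would then list the files that a dependency check could inspect.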

The tool is designed to be extended easily: to add a component test, write a new test method that follows the naming convention test_check_component_name().
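
One way such a naming convention can be used is to discover test methods by prefix, as in the sketch below. How rdhc actually finds and runs its tests is not described here, so the class `HealthChecks`, the example methods, and the helper functions are all hypothetical.

```python
class HealthChecks:
    """Hypothetical container for component tests."""

    def test_check_rocminfo(self):
        return "PASS"  # placeholder result for illustration

    def test_check_hipcc(self):
        return "PASS"  # placeholder result for illustration

    def helper(self):
        return "not a test"  # ignored: does not match the prefix


def discover_component_tests(obj, prefix="test_check_"):
    """Map component name to bound method for every matching method."""
    tests = {}
    for name in sorted(dir(obj)):
        attr = getattr(obj, name)
        if name.startswith(prefix) and callable(attr):
            tests[name[len(prefix):]] = attr
    return tests


def run_component_tests(obj):
    """Run every discovered test and collect results by component name."""
    return {
        component: test()
        for component, test in discover_component_tests(obj).items()
    }
```

With this pattern, adding a method such as `test_check_mycomponent` to the class is enough for it to be picked up, with no separate registration step.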