harkgill-amd 782dc9214b Fix: Error messages printed to stderr to trigger CMake Error Variable (#743)
This PR intends to cover the edge case seen in https://github.com/ROCm/rocm-systems/issues/694. 

`hip-config-amd.cmake` uses rocm_agent_enumerator to determine which GPU architecture to target when no target is specified.
https://github.com/ROCm/rocm-systems/blob/9a02dae75f8df9d8f08923d34d06d76e96ced7b4/projects/clr/hipamd/hip-config-amd.cmake.in#L86-L95

On WSL, both `readFromKFD` and `readFromLSPCI` are skipped. If `readFromTargetLstFile()` isn't in use, `readFromROCMINFO()` is called. If rocminfo times out, it prints the following message to stdout.
```
"Timeout querying rocminfo.  Are you compiling with more than 254 threads?"
```
Because this message goes to stdout rather than stderr, `execute_process` in the code linked above captures it in `OUTPUT_VARIABLE` and passes it on as if it were a valid gfx arch, which causes these errors in CMake:
```
clang++: error: invalid target ID 'Timeout'; format is a processor name followed by an optional colon-delimited list of features followed by an enable/disable sign (e.g., 'gfx908:sramecc+:xnack-')
clang++: error: invalid target ID 'querying'; format is a processor name followed by an optional colon-delimited list of features followed by an enable/disable sign (e.g., 'gfx908:sramecc+:xnack-')
clang++: error: invalid target ID 'rocminfo.'; format is a processor name followed by an optional colon-delimited list of features followed by an enable/disable sign (e.g., 'gfx908:sramecc+:xnack-')
clang++: error: invalid target ID 'Are'; format is a processor name followed by an optional colon-delimited list of features followed by an enable/disable sign (e.g., 'gfx908:sramecc+:xnack-')
clang++: error: invalid target ID 'you'; format is a processor name followed by an optional colon-delimited list of features followed by an enable/disable sign (e.g., 'gfx908:sramecc+:xnack-')
clang++: error: invalid target ID 'compiling'; format is a processor name followed by an optional colon-delimited list of features followed by an enable/disable sign (e.g., 'gfx908:sramecc+:xnack-')
```
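The errors show one rejected target ID per word, which is consistent with the captured text being split on whitespace and each token being handed to the compiler as a target. The sketch below illustrates that effect in Python. It is not code from `hip-config-amd.cmake` or clang: `split_targets` is a hypothetical helper, and the `TARGET_ID` pattern is only an assumed approximation of the format quoted in the error text.

```python
import re

# Assumed approximation of the target ID format described in the clang error:
# a processor name, then optional ":feature+" / ":feature-" suffixes.
TARGET_ID = re.compile(r"^gfx[0-9a-f]+(:[a-z]+[+-])*$")


def split_targets(raw_output):
    """Split captured output on whitespace and classify each token."""
    valid, invalid = [], []
    for token in raw_output.split():
        (valid if TARGET_ID.match(token) else invalid).append(token)
    return valid, invalid


if __name__ == "__main__":
    message = "Timeout querying rocminfo.  Are you compiling with more than 254 threads?"
    print(split_targets(message))
    print(split_targets("gfx908:sramecc+:xnack-\ngfx1100"))
```

Every word of the timeout message lands in the invalid list, while genuine target names pass through unchanged.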
The output can be properly pushed to `ERROR_VARIABLE` if rocm_agent_enumerator pushes the output to stderr instead of stdout. This can be done with the changes to the print statement in this PR or using the `logging` module.
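A minimal sketch of both options follows. It is an illustration, not the change made in this PR: the child script and the `run_child` helper are hypothetical. The child writes the message once with `print(..., file=sys.stderr)` and once through `logging`, whose default handler also writes to stderr. The parent captures the two streams separately, much as `execute_process` does with `OUTPUT_VARIABLE` and `ERROR_VARIABLE`.

```python
import subprocess
import sys

TIMEOUT_MESSAGE = "Timeout querying rocminfo.  Are you compiling with more than 254 threads?"

# Child script: writes the diagnostic to stderr in two ways, nothing to stdout.
CHILD = f"""
import logging, sys
msg = {TIMEOUT_MESSAGE!r}
print(msg, file=sys.stderr)          # option 1: redirect print to stderr
logging.basicConfig(format="%(message)s")
logging.error(msg)                   # option 2: logging (stderr by default)
"""


def run_child():
    """Run the child and capture stdout and stderr separately."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD],
        capture_output=True, text=True, check=True,
    )
    return result.stdout, result.stderr


if __name__ == "__main__":
    out, err = run_child()
    print(repr(out))   # stdout stays empty, so no bogus target names
    print(err, end="")
```

With either option the captured stdout is empty, so nothing from the diagnostic can be mistaken for a gfx arch.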

rocminfo

ROCm Application for Reporting System Info

To Build

Use the standard CMake build procedure to build rocminfo. The location of the ROCm root (the parent directory containing the ROCm headers and libraries) must be provided as a CMake argument using the standard CMAKE_PREFIX_PATH variable.

After cloning the rocminfo git repo, run git fetch --tags to get the tags residing on the repo. These tags are used for versioning. For example:

$ git fetch --tags origin

Building from the CMakeLists.txt directory might look like this:

$ mkdir -p build
$ cd build
$ cmake -DCMAKE_PREFIX_PATH=/opt/rocm ..
$ make
$ cd ..

Upon a successful build, the binary, rocminfo, and the Python script, rocm_agent_enumerator, will be in the build folder.

Execution

"rocminfo" gives information about the HSA system attributes and agents.

"rocm_agent_enumerator" prints the list of available AMD GCN ISA or architecture names. With the option '-name', it prints the available architecture names obtained from rocminfo. Otherwise, it generates the ISA list in one of five different ways:

  1. ROCM_TARGET_LST : a user-defined environment variable, set to the path and filename of the "target.lst" file. This can be used in a sandboxed install environment, where execution of "rocminfo" is not possible.

  2. target.lst : a user-supplied text file, in the same folder as "rocm_agent_enumerator". This is used in a container setting where the ROCm stack may not be available.

  3. HSA topology : gathers the information from the HSA node topology in /sys/class/kfd/kfd/topology/nodes/

  4. lspci : enumerates the PCI bus and locates supported devices from a hard-coded lookup table.

  5. rocminfo : a tool shipped with this script to enumerate GPU agents available on a working ROCm stack.
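
The sources listed above can be sketched as a fallback chain. This is an illustration under stated assumptions, not the script's actual code: `read_target_lst` and `first_nonempty` are hypothetical names, the layout of "target.lst" (one target name per line) is assumed, and so is trying the sources in the listed order. Sources 3 to 5 are passed in as callables rather than implemented.

```python
import os


def read_target_lst(script_dir, environ=None):
    """Return targets from ROCM_TARGET_LST, else from target.lst beside the script.

    Assumes one target name per line; blank lines are ignored.
    """
    environ = os.environ if environ is None else environ
    candidates = []
    env_path = environ.get("ROCM_TARGET_LST")
    if env_path:
        candidates.append(env_path)                             # source 1
    candidates.append(os.path.join(script_dir, "target.lst"))   # source 2
    for path in candidates:
        if os.path.isfile(path):
            with open(path) as handle:
                return [line.strip() for line in handle if line.strip()]
    return []


def first_nonempty(sources):
    """Call each source in order and return the first non-empty target list."""
    for source in sources:
        targets = source()
        if targets:
            return targets
    return []


if __name__ == "__main__":
    # Stand-ins for the HSA topology, lspci, and rocminfo sources (3 to 5).
    no_result = lambda: []
    print(first_nonempty([
        lambda: read_target_lst(os.getcwd()),
        no_result, no_result, no_result,
    ]))
```

Passing the sources as callables keeps each lookup lazy, so a later source such as rocminfo is only queried when the earlier ones return nothing.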