Files
rocm-systems/include
Galantsev, Dmitrii d85657e5f2 SWDEV-452795 - Disable RAS plugin, fix XGMI
RAS plugin loaded rocm-smi which is in conflict with amd-smi library

Main source of grief was the map 'devInfoTypesStrings' that is defined
in both rocm-smi and amd-smi

We assume that rocm-smi would get lazy-loaded by RAS library and
overwrite symbols defined in amd-smi. devInfoTypesStrings in rocm-smi
contains different number of elements, the enums are also different.
RDC relies on amd-smi's enums.

One such enum is kDevGpuMetrics:
  rocm-smi: kDevGpuMetrics = 68
  amd-smi:  kDevGpuMetrics = 75

Example of overlapping map definitions:

  $ objdump --dynamic-syms /opt/rocm/lib/libamd_smi.so | grep devInfoTypesStrings
  00000000003c4980 g    DO .data.rel.ro0000000000000008  Base        devInfoTypesStrings
  00000000003db830 g    DO .bss0000000000000030  Base        _ZN3amd3smi6Device19devInfoTypesStringsE
  $ objdump --dynamic-syms /opt/rocm/lib/librocm_smi64.so  | grep devInfoTypesStrings
  00000000003dc590 g    DO .bss0000000000000030  Base        _ZN3amd3smi6Device19devInfoTypesStringsE
  00000000003c9c68 g    DO .data.rel.ro0000000000000008  Base        devInfoTypesStrings

Change-Id: Ib2f2db32b6abd7ebe84e7807c25581461eb86bae
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
2024-06-26 03:42:07 -05:00
..
2024-06-11 03:11:15 -05:00