* Updates:
- Fixed infinite loop on systems
which did not have VRAM files
- Fixed concise info throwing an exception
when no amdgpu driver is loaded
- Fixed ability to see all nodes
after switching partitions (mirrors
original card display/settings)
- Added build type, lib path, and the
env. variables that are set to the logs
Change-Id: Ic0333df355144ce2242cecea93fe4ce51caf311c
Signed-off-by: Charis Poag <Charis.Poag@amd.com>
[ROCm/rocm_smi_lib commit: ed6777a8e7]
Code changes related to the following:
* Reverts earlier fix for the same issue
* Check for existence of files before reading
Change-Id: I175b20c3343c414b12b79dc3fc404f53fbaabf3a
Signed-off-by: Oliveira, Daniel <daniel.oliveira@amd.com>
[ROCm/rocm_smi_lib commit: 328ce0150b]
Change the affinity from unsigned int to integer to represent -1.
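Why a signed type is needed for a -1 sentinel can be illustrated with a short Python sketch using ctypes. This is only an illustration, not the library's code, and the 32-bit width is an assumption made for the example.

```python
import ctypes

# Assumption: a 32-bit width is used purely for illustration;
# the commit message does not state the width of the affinity field.
as_unsigned = ctypes.c_uint32(-1).value  # -1 wraps to the largest unsigned value
as_signed = ctypes.c_int32(-1).value     # -1 is preserved as a sentinel

print(as_unsigned)  # 4294967295
print(as_signed)    # -1
```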
Change-Id: I82dc6f476b45fa4ec03a3c686fe8e6e2b7761b56
[ROCm/rocm_smi_lib commit: 471fbfddc1]
Code changes related to the following:
* rocm_smi.py
Change-Id: I600e776bf479f972b8d639ce5a658a24916aed3c
Signed-off-by: Oliveira, Daniel <daniel.oliveira@amd.com>
[ROCm/rocm_smi_lib commit: 3602447109]
Properly handles 'Unable to detect' vs 'Not supported' fan cases where:
* sysfs file (pwm#) exists, and readings report zero (0), "Unable to detect fan speed"
* sysfs file (pwm#) does not exist, then "Not supported"
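The two cases above could be distinguished roughly as follows. This is a minimal Python sketch, not the actual rocm_smi.py code: describe_fan is a hypothetical helper, and treating unreadable or non-numeric content like a zero reading is an assumption.

```python
import os
import tempfile


def describe_fan(pwm_path):
    """Classify a fan reading from a pwm sysfs file (hypothetical helper)."""
    if not os.path.isfile(pwm_path):
        # The sysfs file does not exist at all.
        return "Not supported"
    try:
        with open(pwm_path) as f:
            value = int(f.read().strip())
    except (OSError, ValueError):
        # Assumption: unreadable or non-numeric content is treated like zero.
        return "Unable to detect fan speed"
    if value == 0:
        # The file exists but the reading is zero.
        return "Unable to detect fan speed"
    return str(value)


# Demonstration with temporary files standing in for sysfs entries.
with tempfile.TemporaryDirectory() as tmp:
    present = os.path.join(tmp, "pwm1")
    with open(present, "w") as f:
        f.write("0\n")
    zero_case = describe_fan(present)
    missing_case = describe_fan(os.path.join(tmp, "pwm2"))

print(zero_case)     # Unable to detect fan speed
print(missing_case)  # Not supported
```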
Change-Id: If4b0312c872b76647a3e54427ba2a3f3e8e6dab1
Signed-off-by: Oliveira, Daniel <daniel.oliveira@amd.com>
[ROCm/rocm_smi_lib commit: f9fd6b0a96]
Sending RSMI_STATUS_UNEXPECTED_DATA for drivers
which do not set some clock freqs
Change-Id: I43a9515c2757dddd412bb25cfd54095e63367030
Signed-off-by: Charis Poag <Charis.Poag@amd.com>
[ROCm/rocm_smi_lib commit: f191c2753c]
The driver may not expose VRAM sysfs on certain systems. Add a
fallback for that case.
Change-Id: Ib3be71b4f4d2c79318d5026b0a97f3657d8a97b6
[ROCm/rocm_smi_lib commit: a10f00bf57]
* Updates:
- Fix for devices which have junction sensors but no edge sensors
- Added partitioning (memory and dynamic) displays for
base rocm-smi CLI calls
- Added subheading for base rocm-smi call output
- Added better hwmon and device detection logging
Change-Id: I8219884b2e532d6ed379527cacdc1f2b232a5451
Signed-off-by: Charis Poag <Charis.Poag@amd.com>
[ROCm/rocm_smi_lib commit: 755e14dbad]
Code changes related to the following:
* All reinforcement work moved to its own files
* Self-contained changes only to support them
* New files added to CMakeLists.txt
Change-Id: I761e91f54392824df9145eaed8b9805986861285
Signed-off-by: Oliveira, Daniel <daniel.oliveira@amd.com>
[ROCm/rocm_smi_lib commit: cc5ab079df]
* Updates:
- Env variable RSMI_LOGGING=0 or any other value
-> all logging off
- Env variable RSMI_LOGGING=1 -> logs only
- Env variable RSMI_LOGGING=2 -> console only
- Env variable RSMI_LOGGING=3 -> both logs + console
- Metrics output includes hexdump of current file
and decoded metrics (functions: logHexDump
and log_gpu_metrics)
- System info gathered now includes the system's
perceived endianness (little or big endian),
helpful for viewing decoded hexdump or any
binary translation
- Added templates for printing unsigned hex
(print_unsigned_hex_and_int), unsigned integers
(print_unsigned_int), and printing both unsigned
hex and int with an optional header
(print_unsigned_hex_and_int)
- Fixed some build warnings/errors, e.g.
strncpy calls for sku or board names (this
operation is expected and needed); also,
unsuccessful temp file writes now properly
return RSMI_STATUS_FILE_ERROR
- Fixed logrotate not initializing properly
on RHEL 8.8/9.x
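The RSMI_LOGGING values listed above can be read as a small lookup table. The following Python sketch is only an illustration of that mapping, not the library's implementation: logging_targets is a hypothetical helper, and treating an unset variable as "logging off" is an assumption.

```python
import os

# (to_log_file, to_console) for each value named in the list above.
_MODES = {
    "1": (True, False),  # logs only
    "2": (False, True),  # console only
    "3": (True, True),   # both logs and console
}


def logging_targets(env=None):
    """Return (to_log_file, to_console) for RSMI_LOGGING (hypothetical helper)."""
    env = os.environ if env is None else env
    # "0" or any other value turns all logging off.
    # Assumption: an unset variable is also treated as off.
    return _MODES.get(env.get("RSMI_LOGGING", ""), (False, False))


print(logging_targets({"RSMI_LOGGING": "3"}))  # (True, True)
print(logging_targets({"RSMI_LOGGING": "0"}))  # (False, False)
```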
Change-Id: Ifa0f0218c9cafd0a8cd6aa8e7f94d61e9107200f
Signed-off-by: Charis Poag <Charis.Poag@amd.com>
[ROCm/rocm_smi_lib commit: 9c7eed7edc]
Since the reset will continue if the reset power and current power
are the same, the error may confuse the user.
Change-Id: I35b9ef17afd47b5af5bd2b8882a44f63991fe509
[ROCm/rocm_smi_lib commit: aeb6c61f54]
Fix the error where only one CSV line could be printed when the
output is not based on device.
Change-Id: Idacc5d98acc223e932fb3d46c888bfa04778b73c
[ROCm/rocm_smi_lib commit: 80d650b95a]
Updates:
* [rocm-smi] Logging can now update files on
a per-project basis for install/remove
* [rocm-smi] README now has latest build
instructions, including test builds
* [rocm-smi] Updated README to include
revision dates
Change-Id: Ifb19a6f32ccf6938f47225db53fef88021909264
Signed-off-by: Charis Poag <Charis.Poag@amd.com>
[ROCm/rocm_smi_lib commit: 4613e8dec3]
Code changes related to the following:
* Added 'rsmi_dev_revision_get()' related code
* Test code
* Functional tests
Change-Id: I8c2097c65384a028c8c8437b717d05d52fe45250
Signed-off-by: Oliveira, Daniel <daniel.oliveira@amd.com>
[ROCm/rocm_smi_lib commit: 573620f586]
The following read tests were failing:
*.TestIdInfoRead
*.TestSysInfoRead
1. *.TestIdInfoRead failed because rsmi_dev_brand_get did not specify
dependency on vbios_version.
2. *.TestSysInfoRead failed because the test didn't expect vbios_version to
be missing, which is new behavior in Aqua Vanjaram.
Change-Id: I9ee88a12fcf6cff2032049e2ecdfb2957efb03ab
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
[ROCm/rocm_smi_lib commit: 8fe848d10e]
The librocm_smi64.so is used for development, while
librocm_smi64.so.MAJOR is used for runtime, thus the python front end
should not be loading the .so binary, but rather the .so.MAJOR binary.
As well, it's good not to hardcode "lib" as some distros will change
this.
rsmiBindings.py is now generated with CMake
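The idea of preferring the versioned runtime library and not hardcoding "lib" can be sketched in Python as below. This is not how the generated rsmiBindings.py works (that file is produced by CMake): find_runtime_library is a hypothetical helper, and the major version 5 and the directory names are assumptions for the demonstration.

```python
import os
import tempfile


def find_runtime_library(search_dirs, major, base="librocm_smi64.so"):
    """Return the first existing '<base>.<major>' path, or None (hypothetical helper)."""
    runtime_name = f"{base}.{major}"
    for directory in search_dirs:
        candidate = os.path.join(directory, runtime_name)
        if os.path.isfile(candidate):
            return candidate
    return None


# Demonstration: "lib" holds only the development symlink name,
# while "lib64" holds the versioned runtime library.
with tempfile.TemporaryDirectory() as tmp:
    lib_dir = os.path.join(tmp, "lib")
    lib64_dir = os.path.join(tmp, "lib64")
    os.makedirs(lib_dir)
    os.makedirs(lib64_dir)
    open(os.path.join(lib_dir, "librocm_smi64.so"), "w").close()
    open(os.path.join(lib64_dir, "librocm_smi64.so.5"), "w").close()

    found = find_runtime_library([lib_dir, lib64_dir], major=5)
    found_dir = os.path.basename(os.path.dirname(found))
    found_name = os.path.basename(found)
    missing = find_runtime_library([lib_dir], major=5)

print(found_dir, found_name)  # lib64 librocm_smi64.so.5
print(missing)                # None
```

In real use the returned path would be handed to ctypes.CDLL; the sketch stops short of loading anything so that it runs without the library installed.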
Signed-off-by: Jeremy Newton <Jeremy.Newton@amd.com>
Change-Id: I7cb745f8936fdf10d3ebd6c1e606031f713184ca
[ROCm/rocm_smi_lib commit: 2d2c73a5e6]
There seems to be a scope issue with the existing variables, but just
putting in the pkg version string seems sufficient.
Change-Id: I4ccef872ff848a70cb2abc07bf605c5f29a608e8
Signed-off-by: Jeremy Newton <Jeremy.Newton@amd.com>
[ROCm/rocm_smi_lib commit: 4f481dd7f3]
Building this package on Fedora reports this warning:
In file included from rpmbuild/BUILD/rocm_smi_lib-rocm-5.5.1/src/rocm_smi_main.cc:62:
In member function 'amd::smi::Device::set_bdfid(unsigned long)',
inlined from 'amd::smi::RocmSMI::Initialize(unsigned long)' at rpmbuild/BUILD/rocm_smi_lib-rocm-5.5.1/src/rocm_smi_main.cc:330:27:
rpmbuild/BUILD/rocm_smi_lib-rocm-5.5.1/include/rocm_smi/rocm_smi_device.h:199:42: warning: 'bdfid' may be used uninitialized [-Wmaybe-uninitialized]
199 | void set_bdfid(uint64_t val) {bdfid_ = val;}
| ~~~~~~~^~~~~
rpmbuild/BUILD/rocm_smi_lib-rocm-5.5.1/src/rocm_smi_main.cc: In member function 'amd::smi::RocmSMI::Initialize(unsigned long)':
rpmbuild/BUILD/rocm_smi_lib-rocm-5.5.1/src/rocm_smi_main.cc:324:12: note: 'bdfid' was declared here
324 | uint64_t bdfid;
| ^~~~~
Only set the bdfid when it is known to be valid.
Signed-off-by: Tom Rix <trix@redhat.com>
Change-Id: I839b4d2d2d4e3b25469cf5972245b9630da00c87
[ROCm/rocm_smi_lib commit: 19c3e2aff9]
When building from GitHub, these tags don't exist, so the defaults
should try to match the internal tags.
Change-Id: Id570341f27e21916b1a7f3605ee2b5b9716cad9b
Signed-off-by: Jeremy Newton <Jeremy.Newton@amd.com>
[ROCm/rocm_smi_lib commit: 74dc98114f]
This looks like a typo, as the following variables are not defined:
- AMD_SMI_LIBS_TARGET_VERSION_MAJOR
- AMD_SMI_LIBS_TARGET_VERSION_MINOR
- AMD_SMI_LIBS_TARGET_VERSION_PATCH
Change-Id: I43449e7bd2a2de643d33e79fad063a7859679c8d
Signed-off-by: Jeremy Newton <Jeremy.Newton@amd.com>
[ROCm/rocm_smi_lib commit: 1a86dd75bb]
The keyword "PROGRAMS" should be used in place of "FILES" in order to
make sure executable scripts have the correct permissions.
Change-Id: I6c287dc1291774ad6d97a04d621957dea0a1b697
Signed-off-by: Jeremy Newton <Jeremy.Newton@amd.com>
[ROCm/rocm_smi_lib commit: d00d885394]
See SWDEV-391039 and SWDEV-391040 for details
Change-Id: I662ba43363d949465454ea4af4d4586b3d47a811
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
[ROCm/rocm_smi_lib commit: ac94bf5ed5]
If the temp file in hwmon was missing, rocm-smi crashed.
e.g. /sys/class/drm/card1/device/hwmon/hwmon5/temp1_input
This change displays "N/A" for temp instead of crashing.
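The behavior described above amounts to guarding the sysfs read. A minimal Python sketch follows, assuming a hypothetical helper read_temp_celsius rather than the actual rocm_smi.py code; hwmon temp*_input files report millidegrees Celsius.

```python
import os
import tempfile


def read_temp_celsius(temp_input_path):
    """Return the temperature in degrees Celsius as a string, or "N/A" (hypothetical helper)."""
    try:
        with open(temp_input_path) as f:
            millidegrees = int(f.read().strip())
    except (OSError, ValueError):
        # Missing, unreadable, or malformed file: report "N/A" instead of crashing.
        return "N/A"
    return f"{millidegrees / 1000:.1f}"


# Demonstration with a temporary file standing in for temp1_input.
with tempfile.TemporaryDirectory() as tmp:
    present = os.path.join(tmp, "temp1_input")
    with open(present, "w") as f:
        f.write("45000\n")
    present_reading = read_temp_celsius(present)
    missing_reading = read_temp_celsius(os.path.join(tmp, "temp2_input"))

print(present_reading)  # 45.0
print(missing_reading)  # N/A
```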
Change-Id: I02f84a466bd3acfbd9b65e7e4ca0f18e76606c3b
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
[ROCm/rocm_smi_lib commit: 713f85721b]
Used pyright to show errors and warnings and resolved most of them
Change-Id: I0fdf7dcdf08db5c35dec80f6645e0a395fbe4197
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
[ROCm/rocm_smi_lib commit: e8391c9d7c]