Updates:
* [rocm-smi] Provide a thread-safe logging feature
* [rocm-smi] Added log rotation to install/upgrade/remove
scripts
* [rocm-smi] Updated cmake lists to include rocm_smi_logger
* [rocm-smi] Updated DEB/RPM install/remove of the logging file &
folder, with all users having r/w privileges for
/var/log/rocm_smi_lib/ROCm-SMI-lib.log
* [rocm-smi] Added ability to do a glob search for multiple files
(globFileExists); assists file searches that use * wildcards
* [rocm-smi] Added ability to log system details when RSMI_LOGGING
is turned on (getSystemDetails())
* [rocm-smi] Added logging of which ROCm API is being called
when RSMI_LOGGING is on
* [rocm-smi] Added logging of the SYSFS path and read value
when RSMI_LOGGING is on. Provides an error response on failure.
* [rocm-smi] Added environment variable RSMI_LOGGING to control
whether logging is enabled. By default (env. variable not
set), logging is turned off. Setting RSMI_LOGGING=<any value>
enables logging, which is written to the
/var/log/rocm_smi_lib/ROCm-SMI-lib.log file.
RSMI_LOGGING can be set in both debug and release builds.
* [rocm-smi] Removed an initialization procedure that kept
debug_inf_loop. This feature appears to be unused.
Change-Id: I79b48387609c6233c6f05b04fb8bba66b68c2399
Signed-off-by: Charis Poag <Charis.Poag@amd.com>
[ROCm/rocm_smi_lib commit: c3a095a180]
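
The RSMI_LOGGING behavior described in the commit above can be illustrated roughly as follows. This is a minimal Python sketch, not the library's C++ implementation: `logging_enabled` and `make_logger` are hypothetical names, and treating any set value (including an empty string) as "enabled" is an assumption drawn from the `RSMI_LOGGING=<any value>` wording.

```python
import logging
import os


def logging_enabled(env=None):
    """Return True when RSMI_LOGGING is set at all (assumption: any value counts)."""
    env = os.environ if env is None else env
    return "RSMI_LOGGING" in env


def make_logger(log_path, env=None):
    """Build a logger that writes to log_path only when RSMI_LOGGING is set.

    Python's logging handlers serialize emits with a lock, which stands in
    here for the thread-safe logging the commit mentions.
    """
    logger = logging.getLogger("rsmi_sketch")  # hypothetical logger name
    logger.handlers.clear()
    logger.propagate = False
    if logging_enabled(env):
        handler = logging.FileHandler(log_path)
        logger.setLevel(logging.INFO)
    else:
        # Logging is off by default: discard every record.
        handler = logging.NullHandler()
    logger.addHandler(handler)
    return logger
```

A caller would pass /var/log/rocm_smi_lib/ROCm-SMI-lib.log as `log_path`; the sketch takes the path as a parameter so it can be tried without write access to /var/log.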
Updates:
* Added rsmi_dev_nps_mode_set and rsmi_dev_nps_mode_get
* Added ability to set multiple SYSFS files in debug build
* Added ability to see user's env variables set for debug build
* Added tests for rsmi_dev_nps_mode_set and rsmi_dev_nps_mode_get
* Added ability to restart AMD GPU driver, used in nps_mode_set
* Updated ROCm_SMI_Manual.pdf to include new APIs
* Added progress bar for long-running python_smi_tools operations;
used when setting nps_mode if it runs longer than 0.1 seconds
Change-Id: I6d61bedd28d7cba6aff432ad2d127ba741b7d15a
Signed-off-by: Charis Poag <Charis.Poag@amd.com>
[ROCm/rocm_smi_lib commit: 9ef376cd61]
Add DEBUG_LOG, which optionally prints an error
message when RSMI_DEBUG_BITFIELD is set to 2.
Signed-off-by: Divya Shikre <DivyaUday.Shikre@amd.com>
Change-Id: I6017e92d8a9e5f9861ae29ece0488d4bc198f996
[ROCm/rocm_smi_lib commit: 99be3451d7]
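
As an illustration of the DEBUG_LOG gating described above, here is a small Python sketch. It assumes, based only on the name RSMI_DEBUG_BITFIELD, that the value is tested as a bit mask (so 3 would also enable the message); the commit itself only says "set to 2". `debug_bitfield`, `debug_log`, and `DEBUG_LOG_BIT` are hypothetical names.

```python
import os
import sys

DEBUG_LOG_BIT = 0x2  # assumption: the value 2 is a single bit in a mask


def debug_bitfield(env=None):
    """Parse RSMI_DEBUG_BITFIELD as an integer; unset or unparsable means 0."""
    env = os.environ if env is None else env
    try:
        return int(env.get("RSMI_DEBUG_BITFIELD", "0"), 0)
    except ValueError:
        return 0


def debug_log(message, env=None, stream=None):
    """Print message only when the DEBUG_LOG bit is set; return whether it printed."""
    if debug_bitfield(env) & DEBUG_LOG_BIT:
        out = sys.stderr if stream is None else stream
        print(message, file=out)
        return True
    return False
```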
Previously, during the rsmi_init discovery process, the existence
of an hwmon# directory was used to distinguish GPU nodes from
non-GPU nodes. This isn't reliable in some scenarios. Instead,
the existence of the vbios_version file is now used as an
indicator that the node is indeed a GPU.
Change-Id: Icfbe5c42ed0970077b05f25c3d209308a31bec85
[ROCm/rocm_smi_lib commit: ff9546aa62]
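
The discovery check described above could look something like the following Python sketch. The directory layout (`<drm_root>/cardN/device/vbios_version`) is an assumption about where the file lives, and `is_gpu_node` and `discover_gpus` are hypothetical helpers, not functions from rocm_smi_lib.

```python
import os


def is_gpu_node(device_dir):
    """Treat a node as a GPU when its device directory has a vbios_version file."""
    return os.path.isfile(os.path.join(device_dir, "vbios_version"))


def discover_gpus(drm_root="/sys/class/drm"):
    """Return the cardN entries under drm_root that look like GPUs."""
    gpus = []
    if not os.path.isdir(drm_root):
        return gpus
    for entry in sorted(os.listdir(drm_root)):
        # Only consider cardN entries; skip connector entries such as card0-DP-1.
        if not (entry.startswith("card") and entry[4:].isdigit()):
            continue
        if is_gpu_node(os.path.join(drm_root, entry, "device")):
            gpus.append(entry)
    return gpus
```

A node that has an hwmon directory but no vbios_version file is skipped by this check, which is the distinction the commit draws.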
The environment variable RSMI_DEBUG_INFINITE_LOOP is introduced
to facilitate debugging RSMI in user applications. When this
env. variable is non-zero, an infinite loop will be entered in
rsmi_init(). At this point, a debugger can be attached and RSMI
can be debugged. This only applies to debug builds.
Change-Id: I23f6dd730fc965764295070de053314a1cc5b6aa
[ROCm/rocm_smi_lib commit: 68095b50e7]
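
The attach-a-debugger pattern described above can be sketched in Python as follows. This is only an illustration of the idea: `DebugState`, `wait_for_debugger`, and the `debug_build` parameter are hypothetical, and the real library does this inside rsmi_init() in C++. The loop spins on a flag that a person would clear from the attached debugger.

```python
import os
import time


class DebugState:
    """Hypothetical holder for the flag a debugger would clear to resume."""

    def __init__(self):
        self.spin = True


def wait_for_debugger(state, env=None, debug_build=True, sleep=time.sleep):
    """Spin while state.spin is True, if RSMI_DEBUG_INFINITE_LOOP is non-zero.

    Returns the number of loop iterations (0 when the loop was never entered).
    """
    env = os.environ if env is None else env
    try:
        requested = int(env.get("RSMI_DEBUG_INFINITE_LOOP", "0")) != 0
    except ValueError:
        requested = False
    # The commit states this only applies to debug builds.
    if not (debug_build and requested):
        return 0
    spins = 0
    while state.spin:
        sleep(1)  # attach a debugger here, then set state.spin = False
        spins += 1
    return spins
```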
This corrects issues that arose after the OAM reorganization.
It should address SWDEV-243294.
Also, fix some compile warnings that show up on RHEL.
Change-Id: Id14d444905da35cd7346bcfbcd82b6d0572708c4
[ROCm/rocm_smi_lib commit: c2ef9a6879]
This solution takes into account that some hwmons use
label files to map sensor types. The previous solution
did not take this into account.
Change-Id: I1d6204573cefa8197b2cfe0ffb412b545df3d80a
[ROCm/rocm_smi_lib commit: 324c0ca0e5]
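
To illustrate what mapping sensor types through label files might involve, here is a minimal Python sketch. It assumes the common hwmon convention of `<kind>N_label` files next to `<kind>N_input` files; `map_sensor_labels` is a hypothetical helper, and the label strings a given hwmon exposes are not specified by the commit.

```python
import glob
import os
import re


def map_sensor_labels(hwmon_dir, kind="temp"):
    """Map each label string to its sensor index by reading <kind>N_label files."""
    mapping = {}
    name_re = re.compile(rf"{re.escape(kind)}(\d+)_label")
    for path in glob.glob(os.path.join(hwmon_dir, f"{kind}*_label")):
        match = name_re.fullmatch(os.path.basename(path))
        if match is None:
            continue
        with open(path) as f:
            label = f.read().strip()
        mapping[label] = int(match.group(1))
    return mapping
```

With such a mapping, a caller looks up a sensor by its label and then reads the matching `<kind>N_input` file, rather than assuming a fixed index for each sensor type.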
The new functions added in this commit allow a caller to determine
up front which functions, function variants, and monitors are
supported.
Also,
* fixed a few documentation/formatting issues
* fixed a process_info test issue
Change-Id: I2184ab1a4a6898f847e791f273e2185d556e78e9
[ROCm/rocm_smi_lib commit: 551b15182b]