Sending RSMI_STATUS_UNEXPECTED_DATA for drivers
which do not set some clock freqs
Change-Id: I43a9515c2757dddd412bb25cfd54095e63367030
Signed-off-by: Charis Poag <Charis.Poag@amd.com>
Code changes related to the following:
* Added 'rsmi_dev_revision_get()' related code
* Test code
* Functional tests
Change-Id: I8c2097c65384a028c8c8437b717d05d52fe45250
Signed-off-by: Oliveira, Daniel <daniel.oliveira@amd.com>
Updates:
* VoltRead - needed to properly send out RSMI_STATUS_NOT_SUPPORTED
when device does not have voltage hwmon files
* ComputePart. - test failure was likely caused due to EvtNotif
causing conflicts (unknown exactly why). Test passes when
moving it ahead of the event notifier. Both API calls may have
a system resource issue, TBD.
* rocm_smi_example - now indicates when an API call
returns RSMI_STATUS_NOT_SUPPORTED or
RSMI_STATUS_NOT_YET_IMPLEMENTED. Allows example to fully complete
on systems which may not provide support for all API calls.
Change-Id: I520b8584e078d412414e8e5797c664220a7e823a
Signed-off-by: Charis Poag <Charis.Poag@amd.com>
Updates:
* Added RSMI_STATUS_SETTING_UNAVAILABLE for
rsmi_dev_compute_partition_set - gives users
better error output when attempting to set
compute partition to values not listed in
available_compute_partition SYSFS
* Updated python --setcomputepartition to
provide better output when receiving
RSMI_STATUS_SETTING_UNAVAILABLE
* Updated all test & example files to check for
RSMI_STATUS_SETTING_UNAVAILABLE when doing
rsmi_dev_compute_partition_set
Change-Id: Ida5d54880d9b9b6e4a0468cdb962fdc0c18d6257
Signed-off-by: Charis Poag <Charis.Poag@amd.com>
Updates:
* Added rsmi_dev_compute_partition_reset & rsmi_dev_nps_mode_reset
* Added --resetcomputepartition and --resetnpsmode python smi calls
* Added temp data files rocmsmi_boot_compute_partition_<device num>
& rocmsmi_boot_nps_mode_partition_<device num>, writes UNKNOWN
if data cannot be read or device does not support
* Cleaned up NPS & compute API documentation
* Added creation and reading of API temp files (used in reset
functionality)
* Cleaned up output of rocm_smi_example
* Updated rocm_smi_example to check if running with sudo permission
before executing write API calls (cleans up erroneous output)
* Added template specialization for storing temp data, requires
specific rsmi_type_t enums (restrics what data can be stored)
* Added storage of temp data, if temp files do not exist
* Updated google tests for NPS & compute to include reset API calls
Change-Id: I69895a466b97107617e6dbb355737b84499a76c9
Signed-off-by: Charis Poag <Charis.Poag@amd.com>
Updates:
* Added rsmi_dev_nps_mode_set and rsmi_dev_nps_mode_get
* Added ability to set multiple SYSFS files in debug build
* Added ability to see user's env variables set for debug build
* Added tests for rsmi_dev_nps_mode_set and rsmi_dev_nps_mode_get
* Added ability to restart AMD GPU driver, used in nps_mode_set
* Updated ROCm_SMI_Manual.pdf to include new APIs
* Added progress bar for long running python_smi_tools, used
in setting nps_mode if runs longer than .1 seconds
Change-Id: I6d61bedd28d7cba6aff432ad2d127ba741b7d15a
Signed-off-by: Charis Poag <Charis.Poag@amd.com>
Original updates:
* Added .gitignore to help with future commits
* Updated/added copyrights on modified or added files
* Updated rocm_smi.h/.cc
- Added 3 new SMI API functions:
rsmi_dev_compute_partition_set &
rsmi_dev_compute_partition_get
- Added helpful maps/enums used in
new get/set compute_partition API calls
* Updated rocm_smi.py
- Added --showcomputepartition
- Added --setcomputepartition
- Fixed a few mistypes
* Updated rsmiBindings.py - added helpful class/dict/list
* Updated rocm_smi_example.cc
- Added helpful MACRO to detect if api is not supported.
- Added current_compute_partition set/get rocm lib calls
- Added helpful macro to discover future RSMI errors
- Commented out test_set_freq, was having permission issues
on a Navi21
* Updated rocm_smi_main.cc
- Added helpful map to debug API calls, left in for future use
- Added comment to better understand a non-class function returns
* Added computepartition_read_write.cc/.h
- Added get/set compute partition API test calls
- Confirmed on devices that do not support the API calls, tests pass
* Updated rocm_smi_test/main.cc
- Calls new compute partition gtests
Added following updates from review feedback:
* Updated rocm_smi.h/cc
- Removed C++ API calls, adding support for both C/C++
API calls could cause confusion and adds extra work for us
- rsmi_dev_compute_partition_get -> Fixed an edge case where
user gives a small buffer length size (smaller than data
received), but does not receive the partial buffer back.
google Tests are updated to reflect this find.
* Updated rocm_smi_example.cc
- Fixed test_set_freq, issue was that file was not writable.
We now indicate this warning, so prior errors make sense.
- General test code cleanup. Removed extra code,
by creating loops for tests.
* Updated rocm_smi_main.cc
- Moved and got rid of an external reference to a map used
for debugging RSMI enums, now is a const public reference.
* Updated rocm_smi.py
- Updated python code to identify NOT_SUPPORTED due to
(currently) only a few GPU support the feature
Change-Id: I4a567acbb59d6771fb64df08d19175fe3604fd1b