device_count to processor_count
devices to processor
device to processor
also handle_to_device is renamed to handle_to_processor
Change-Id: Ie9c7e01bc1b83058eeaae80934526d06468f4f5c
Fix was needed due to hwmon updates.
Several voltage sensors (ex. vddgfx/vddnb)
are unsupported or not applicable
to upcoming hardware. This was not the case
for previous hardware sensors, resulting in
the rocm-smi crash observed.
Change-Id: Ib8593e10811638def26fc7a1eda29309e328db09
Signed-off-by: Charis Poag <Charis.Poag@amd.com>
Updates:
* Added RSMI_STATUS_SETTING_UNAVAILABLE for
rsmi_dev_compute_partition_set - gives users
better error output when attempting to set
compute partition to values not listed in
available_compute_partition SYSFS
* Updated python --setcomputepartition to
provide better output when receiving
RSMI_STATUS_SETTING_UNAVAILABLE
* Updated all test & example files to check for
RSMI_STATUS_SETTING_UNAVAILABLE when doing
rsmi_dev_compute_partition_set
Change-Id: Ida5d54880d9b9b6e4a0468cdb962fdc0c18d6257
Signed-off-by: Charis Poag <Charis.Poag@amd.com>
Updates:
* Added rsmi_dev_compute_partition_reset & rsmi_dev_nps_mode_reset
* Added --resetcomputepartition and --resetnpsmode python smi calls
* Added temp data files rocmsmi_boot_compute_partition_<device num>
& rocmsmi_boot_nps_mode_partition_<device num>, writes UNKNOWN
if data cannot be read or device does not support
* Cleaned up NPS & compute API documentation
* Added creation and reading of API temp files (used in reset
functionality)
* Cleaned up output of rocm_smi_example
* Updated rocm_smi_example to check if running with sudo permission
before executing write API calls (cleans up erroneous output)
* Added template specialization for storing temp data, requires
specific rsmi_type_t enums (restrics what data can be stored)
* Added storage of temp data, if temp files do not exist
* Updated google tests for NPS & compute to include reset API calls
Change-Id: I69895a466b97107617e6dbb355737b84499a76c9
Signed-off-by: Charis Poag <Charis.Poag@amd.com>
Updates:
* Added rsmi_dev_nps_mode_set and rsmi_dev_nps_mode_get
* Added ability to set multiple SYSFS files in debug build
* Added ability to see user's env variables set for debug build
* Added tests for rsmi_dev_nps_mode_set and rsmi_dev_nps_mode_get
* Added ability to restart AMD GPU driver, used in nps_mode_set
* Updated ROCm_SMI_Manual.pdf to include new APIs
* Added progress bar for long running python_smi_tools, used
in setting nps_mode if runs longer than .1 seconds
Change-Id: I6d61bedd28d7cba6aff432ad2d127ba741b7d15a
Signed-off-by: Charis Poag <Charis.Poag@amd.com>
The format of the fdinfo file has changed
Change-Id: Iad2e26487e75f3e614e364456e929aa1f6f949a4
Signed-off-by: Dalibor Stanisavljevic <Dalibor.Stanisavljevic@amd.com>
The tag values largely were not used and were causing doxygen
generation issues.
In the few cases where the tags were being referenced, clean up
those compile issues.
Signed-off-by: Jason Albert <jason.albert@amd.com>
Change-Id: I7b32eac742fb5af560400c13dda2721705d882bc
Signed-off-by: Dalibor Stanisavljevic <Dalibor.Stanisavljevic@amd.com>
Original updates:
* Added .gitignore to help with future commits
* Updated/added copyrights on modified or added files
* Updated rocm_smi.h/.cc
- Added 3 new SMI API functions:
rsmi_dev_compute_partition_set &
rsmi_dev_compute_partition_get
- Added helpful maps/enums used in
new get/set compute_partition API calls
* Updated rocm_smi.py
- Added --showcomputepartition
- Added --setcomputepartition
- Fixed a few mistypes
* Updated rsmiBindings.py - added helpful class/dict/list
* Updated rocm_smi_example.cc
- Added helpful MACRO to detect if api is not supported.
- Added current_compute_partition set/get rocm lib calls
- Added helpful macro to discover future RSMI errors
- Commented out test_set_freq, was having permission issues
on a Navi21
* Updated rocm_smi_main.cc
- Added helpful map to debug API calls, left in for future use
- Added comment to better understand a non-class function returns
* Added computepartition_read_write.cc/.h
- Added get/set compute partition API test calls
- Confirmed on devices that do not support the API calls, tests pass
* Updated rocm_smi_test/main.cc
- Calls new compute partition gtests
Added following updates from review feedback:
* Updated rocm_smi.h/cc
- Removed C++ API calls, adding support for both C/C++
API calls could cause confusion and adds extra work for us
- rsmi_dev_compute_partition_get -> Fixed an edge case where
user gives a small buffer length size (smaller than data
received), but does not receive the partial buffer back.
google Tests are updated to reflect this find.
* Updated rocm_smi_example.cc
- Fixed test_set_freq, issue was that file was not writable.
We now indicate this warning, so prior errors make sense.
- General test code cleanup. Removed extra code,
by creating loops for tests.
* Updated rocm_smi_main.cc
- Moved and got rid of an external reference to a map used
for debugging RSMI enums, now is a const public reference.
* Updated rocm_smi.py
- Updated python code to identify NOT_SUPPORTED due to
(currently) only a few GPU support the feature
Change-Id: I4a567acbb59d6771fb64df08d19175fe3604fd1b
The amdsmi_dev_get_temp_metric() will cover both function:
amdsmi_get_temperature_measure() using AMDSMI_TEMP_CURRENT
and
amdsmi_get_temperature_limit() using AMDSMI_TEMP_CRITICAL
Remove those two function.
It also merge the amdsmi_get_power_limit() into
amdsmi_get_power_measure()
Change-Id: I40d4afeb2ec0ac7b64832729f36adfaae120c990
Some of the amdsmi function have the verb (set or get) at the
end of the function. Move it to the middle to be consistent with
other APIs.
Change-Id: I8053d16f46af951c25aaaf8febf2896a33633fa1
Change dev to device_handle throughout the file
Change the pcie_info pcie_speed field type to uint32_t
Add AMDSMI prefix before amdsmi_mm_ip enum
Change-Id: I242145389ddc3f2ad05dfd6ca371640f4d118fc4
Signed-off-by: Dalibor Stanisavljevic <Dalibor.Stanisavljevic@amd.com>
- Made all doxygen formatting consistent with @ use
- Added @file definition to fix a lot of missed references
- Simplified return definitions for easier maintainability
- Fixed bad formatting and missing section closures
Signed-off-by: Jason Albert <jason.albert@amd.com>
Change-Id: I02cc55f7d0ae277f318a4620978af096f56cac6c
Assign fixed values to status codes to prevent enum auto assign
from changing them.
Signed-off-by: Jason Albert <jason.albert@amd.com>
Change-Id: I0ca1de7ba503ce8a75c56026f5a54e212204595b
- Add up to date README file for python tool
Change-Id: I7a02f79469e990870398b3741b033ea447998fdd
Signed-off-by: Dalibor Stanisavljevic <Dalibor.Stanisavljevic@amd.com>
Init the ams_smi using the rocm-smi, which makes the GPU discovery
consistent with or without libdrm.
Change-Id: Ic714781f8ce791451b0c057621525926edb7f5ee
Those two APIs are changed to let the user get the handles count,
allocate memory, and then return handles to the allocated memory.
Change-Id: Ibe28a89ad188c99da6af3af1740b2b25ff22ba06
Move rocm_smi related function to rocm_smi folder. Move amd_smi to
top level include/ and src/ folder. Remove obsolte oam folder.
Change the CMakeLists.txt to update folder locations.
Change-Id: I52e6be739e49f3b0545865f25364787f5985e9c3
Changes made on rsmi_perf_determinism_mode_set function documentation
as well for styling consistency.
Change-Id: I09ce8139eb9cbda94352ac7725c4c9b9bb06bd59
Show an optional debug log (RSMI_DEBUG_BITFIELD=2) to
the user in the following scenarios:
1. If more than one current frequency is found
2. If frequencies are not read in increasing order of
their value
If current frequency is not available, index for it is
set to -1, values will not have * next to it in the
output. This will also be handled in rocm_smi.py.
Signed-off-by: Divya Shikre <DivyaUday.Shikre@amd.com>
Change-Id: I477ec065f7513c8045d6392f12ef6cb835a6b8f6
Add DEBUG_LOG that will optionally print error
message when RSMI_DEBUG_BITFIELD is set to 2.
Signed-off-by: Divya Shikre <DivyaUday.Shikre@amd.com>
Change-Id: I6017e92d8a9e5f9861ae29ece0488d4bc198f996
showclocks/showclkfrq does not display pp_dpm_pcie values
in sriov. This fix adds pcie clocks to rsmi_clk_type_t
where rest of the clocks are present.
Signed-off-by: Divya Shikre <DivyaUday.Shikre@amd.com>
Change-Id: I6d129ae412623b369c14456ae9781b2dbceb2139
This patch adds the following 4 missing GPU blocks to the SMI LIB:
-RSMI_GPU_BLOCK_MMHUB
-RSMI_GPU_BLOCK_PCIE_BIF
-RSMI_GPU_BLOCK_HDP
-RSMI_GPU_BLOCK_XGMI_WAFL
Signed-off-by: Ori Messinger <Ori.Messinger@amd.com>
Change-Id: Ia1ec6f53e195f4bf7b8f073d6bed4fdb6572e546
'bool' keyword is supported only from C99 onwards. Include stdbool.h
for older compilers
Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Change-Id: I09fd5cf6eac20e7185e85a1123bc4826958b2b7c