Also update copyright years
Signed-off-by: Kent Russell <kent.russell@amd.com>
Change-Id: Ic9ead543c4937680afc1957623c4d5fcbfbd58b0
[ROCm/rocm_smi_lib commit: 85571318e2]
Wrapper header files
Soft link to libraries and binaries
rocm_smi.py and rsmiBindings.py installed in libexec/rocm_smi
Binaries, libraries and header files installed as per File Reorg folder structure
Change-Id: I3166ab67f89c2ae4aafbc87bb00c9a5233221ade
[ROCm/rocm_smi_lib commit: f1da5591b5]
The purpose of this patch is to hide 'One or more commands failed.'
from showing up, unless an appropriate log level has been set.
You can set the loglevel in the CLI with:
--loglevel <debug/info/warning/error/critical>
Signed-off-by: Ori Messinger <Ori.Messinger@amd.com>
Change-Id: Ifa309cd62596491a6ea5892e0752251f037fc0e9
[ROCm/rocm_smi_lib commit: 007f326c34]
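A minimal sketch of how a --loglevel option can gate the summary message, assuming the CLI uses argparse and the standard logging module. The helper names (build_parser, configure_logging, report_failures) and the quiet default are assumptions for illustration, not taken from rocm_smi.py.

```python
import argparse
import logging

# Hypothetical names for this sketch; not taken from rocm_smi.py.
LOG_LEVELS = ("debug", "info", "warning", "error", "critical")


def build_parser():
    parser = argparse.ArgumentParser(description="loglevel sketch")
    parser.add_argument("--loglevel", choices=LOG_LEVELS, default=None,
                        help="minimum severity of messages to show")
    return parser


def configure_logging(loglevel):
    """Apply the chosen level to the root logger and return it."""
    # Assumption: with no --loglevel, only CRITICAL messages are emitted.
    level = getattr(logging, loglevel.upper()) if loglevel else logging.CRITICAL
    logging.getLogger().setLevel(level)
    return level


def report_failures(failed):
    """Route the summary through logging instead of printing it unconditionally."""
    if failed:
        logging.warning("One or more commands failed.")
```

With this arrangement the summary only appears when the user passes --loglevel warning or a more verbose level.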
Remove carriage return at the end of the line in printLog function.
On Linux, the end of a line is encoded with \n, not \n\r.
Change-Id: If3835d773033b53a7f25b4a0284df359a6f9555d
[ROCm/rocm_smi_lib commit: 1aeb27c4c9]
The driver fills in 0xFF for all the metrics not supported on that ASIC.
So if 0xFF is detected, return RSMI_STATUS_NOT_SUPPORTED
Signed-off-by: Divya Shikre <DivyaUday.Shikre@amd.com>
Change-Id: I86a38148c7a288ea0db94893f685560eaac098ab
[ROCm/rocm_smi_lib commit: 7b1daaef96]
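The check described above can be sketched as follows. The library itself is not written in Python, so this is only an illustration of the logic; the status strings are placeholders for the real enum values, and is_all_ff and read_metric are hypothetical names.

```python
# Placeholder status values for this sketch; the real codes are defined by the library.
STATUS_SUCCESS = "RSMI_STATUS_SUCCESS"
STATUS_NOT_SUPPORTED = "RSMI_STATUS_NOT_SUPPORTED"


def is_all_ff(value, width_bytes):
    """True if an unsigned field of width_bytes has every byte set to 0xFF."""
    return value == (1 << (8 * width_bytes)) - 1


def read_metric(value, width_bytes):
    """Return (status, value); an all-0xFF field is treated as unsupported."""
    if is_all_ff(value, width_bytes):
        return STATUS_NOT_SUPPORTED, None
    return STATUS_SUCCESS, value
```

Comparing against the full field width matters: a 16-bit field holding 0x00FF is a valid reading, while 0xFFFF is the "not supported" sentinel.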
This patch removes every erroneous occurrence of a third argument
when calling printErrLog(device, err), since it takes two arguments.
Signed-off-by: Ori Messinger <Ori.Messinger@amd.com>
Change-Id: I5971cc68b69c86f37c69f44e4785dabfc82c7955
[ROCm/rocm_smi_lib commit: 40eed25a3b]
Display min and max bandwidth between gpu nodes
Signed-off-by: Elena Sakhnovitch
Change-Id: I7289fb83f80e2f899996b7d7560ece670cc5f31f
[ROCm/rocm_smi_lib commit: 13cde8429d]
Print the "Primary die (usually one above or below the secondary) shows
total (primary + secondary) socket power information" footnote only once, not
for every secondary die.
Signed-off-by: Elena Sakhnovitch
Change-Id: Iae9c5c94945ec38ecdb128a576a4eacafc30a044
[ROCm/rocm_smi_lib commit: 15e4fe80e1]
The purpose of this patch is to implement --showtopoaccess
functionality in the CLI, which shows True or False if P2P is
possible between two given GPUs.
Signed-off-by: Ori Messinger <Ori.Messinger@amd.com>
Change-Id: I07d70d80ae7b484136b31d5d22780c4990029391
[ROCm/rocm_smi_lib commit: e2d9a37e5f]
Fix error message in -P for secondary die
Signed-off-by: Elena Sakhnovitch
Change-Id: Ica3c0a83b565d2231fad23389b9378056a0f56b3
[ROCm/rocm_smi_lib commit: 6a01b6b2ec]
At the tail end, when the process is terminating, the subprocess module fails
to find the process. This results in an extraneous line containing the
character 'b' being printed. Fix this.
BUG: SWDEV-296409
Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Change-Id: I39aacf8ae948a5acec0aa93296cc0e0aec88b3ef
[ROCm/rocm_smi_lib commit: cef19745d1]
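One plausible way a stray 'b' reaches the output in Python 3 is formatting the bytes returned by subprocess directly. The sketch below assumes that cause, which the commit message does not confirm; output_to_text is a hypothetical helper.

```python
def output_to_text(raw):
    """Convert subprocess output (bytes or str) to stripped text."""
    if isinstance(raw, bytes):
        raw = raw.decode("utf-8", errors="replace")
    return raw.strip()


# Formatting bytes directly leaks the literal prefix into the output.
assert str(b"") == "b''"
# Decoding first yields an empty string, which the caller can skip.
assert output_to_text(b"") == ""
```

A caller can then print the process name only when output_to_text returns a non-empty string.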
Fix error message in -P for secondary die
Signed-off-by: Elena Sakhnovitch
Change-Id: Ica3c0a83b565d2231fad23389b9378056a0f56b3
[ROCm/rocm_smi_lib commit: 2db7e2a312]
At the tail end, when the process is terminating, the subprocess module fails
to find the process. This results in an extraneous line containing the
character 'b' being printed. Fix this.
BUG: SWDEV-296409
Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Change-Id: I39aacf8ae948a5acec0aa93296cc0e0aec88b3ef
[ROCm/rocm_smi_lib commit: a03acf2c07]
Python's default 'print' implementation is not thread safe, causing
empty lines to be printed during multithreaded code execution.
This fixes the --showevents output for multi-GPU systems.
Signed-off-by: Ori Messinger <Ori.Messinger@amd.com>
Change-Id: I72f7341cdf4401f1fed4cd8f7d7a4a90bf9a3a4c
[ROCm/rocm_smi_lib commit: 8d5ced1f60]
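A minimal sketch of one way to avoid interleaved output, assuming the cause is that print() can issue the text and the trailing newline as separate writes. The lock and the print_atomic name are hypothetical; the actual fix in the CLI may differ.

```python
import sys
import threading

_print_lock = threading.Lock()  # hypothetical module-level lock


def print_atomic(message, stream=None):
    """Write the message and its newline in a single locked write call."""
    stream = stream if stream is not None else sys.stdout
    with _print_lock:
        stream.write(message + "\n")
        stream.flush()
```

Each event-listener thread would call print_atomic instead of print, so no thread can slip a newline between another thread's text and its line ending.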
Use zero padding for the hexadecimal value 'device_model' inside
showProductName with a padding length of 4.
Signed-off-by: Ori Messinger <Ori.Messinger@amd.com>
Change-Id: I962b94d414c6ba050d951486ad9e7559123f8850
[ROCm/rocm_smi_lib commit: 034caf6f76]
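The padding described above can be sketched with a format specifier. The function name, the '0x' prefix and the lowercase digits are assumptions; only the padding length of 4 comes from the commit message.

```python
def format_device_model(device_model):
    """Format an integer ID as hexadecimal, zero-padded to 4 digits."""
    return "0x%04x" % device_model
```

For example, a value of 0x66 is rendered as 0x0066 instead of 0x66, so IDs line up and match the usual four-digit form.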
Python's default 'print' implementation is not thread safe, causing
empty lines to be printed during multithreaded code execution.
This fixes the --showevents output for multi-GPU systems.
Signed-off-by: Ori Messinger <Ori.Messinger@amd.com>
Change-Id: I72f7341cdf4401f1fed4cd8f7d7a4a90bf9a3a4c
[ROCm/rocm_smi_lib commit: 95348f37cc]
Use zero padding for the hexadecimal value 'device_model' inside
showProductName with a padding length of 4.
Signed-off-by: Ori Messinger <Ori.Messinger@amd.com>
Change-Id: I962b94d414c6ba050d951486ad9e7559123f8850
[ROCm/rocm_smi_lib commit: 03ae187a35]
Since device is a list, we need to pass a single item to the isAmdGpu
function.
Fixes: 17bdc065a1 "rocm_smi.py: Don't try to reset non-AMD GPUs"
Signed-off-by: Kent Russell <kent.russell@amd.com>
Change-Id: I19a74377636ff4589f11d092f41e1d35c1acb307
[ROCm/rocm_smi_lib commit: 242d94a668]
Instead of throwing "Unsupported clock" errors for ASICs that don't
support a certain clock type (e.g. dcefclk on MI-series), just dump the
warning to logging.debug and don't try to read the clock.
Signed-off-by: Kent Russell <kent.russell@amd.com>
Change-Id: If3cb9a472b03aa535a76fc24bcd9f77122090634
[ROCm/rocm_smi_lib commit: b931380f02]
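A minimal sketch of the skip-and-log behavior, assuming the set of supported clocks is known before reading. read_clocks, its parameters and the read_clock callback are hypothetical stand-ins for the CLI's own functions.

```python
import logging


def read_clocks(clock_types, supported, read_clock):
    """Return {clock: value}, skipping clocks the ASIC does not support."""
    values = {}
    for clk in clock_types:
        if clk not in supported:
            # Visible only at debug log level; no error is raised.
            logging.debug("Unsupported clock: %s", clk)
            continue
        values[clk] = read_clock(clk)
    return values
```

Unsupported clocks are simply absent from the result, so callers print only what the hardware exposes.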
Use default power cap exposed via sysfs to determine when to
show 'Out of Spec' warning.
Signed-off-by: Ori Messinger <Ori.Messinger@amd.com>
Change-Id: I0fa3612b50e230856b0d5a390f876b35268d9587
[ROCm/rocm_smi_lib commit: b71e07b3fb]
Implement showevent functionality in the ROCm SMI Python CLI.
It can be called using --showevents with any combination of:
VM_FAULT, THERMAL_THROTTLE, and/or GPU_RESET
For example:
./rocm-smi --showevents VM_FAULT, THERMAL_THROTTLE, GPU_RESET
Signed-off-by: Ori Messinger <Ori.Messinger@amd.com>
Change-Id: I905fd9c949e91423b79833a04ab89d6ba3760e62
[ROCm/rocm_smi_lib commit: a9e7e5a475]
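A sketch of how the event list from the example could be normalized, assuming argparse collects the names as separate tokens that may carry trailing commas. parse_events and the parser setup are illustrative, not the CLI's actual code.

```python
import argparse

# Event names taken from the commit message above.
VALID_EVENTS = ("VM_FAULT", "THERMAL_THROTTLE", "GPU_RESET")


def parse_events(tokens):
    """Normalize tokens like ['VM_FAULT,', 'gpu_reset'] into valid event names."""
    events = []
    for token in tokens:
        for name in token.split(","):
            name = name.strip().upper()
            if not name:
                continue
            if name not in VALID_EVENTS:
                raise ValueError("unknown event type: %s" % name)
            if name not in events:
                events.append(name)
    return events


parser = argparse.ArgumentParser(description="showevents sketch")
parser.add_argument("--showevents", nargs="+", metavar="EVENT")
```

Splitting on commas and stripping whitespace lets both "VM_FAULT, GPU_RESET" and "VM_FAULT GPU_RESET" select the same events.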
Many data center cards are fanless. Don't show a warning if unable to get
the fan speed. The fan speed will be reported as 0.
Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Change-Id: I53efe67ac88fb0824cf4820430b46c18bc7692df
[ROCm/rocm_smi_lib commit: 1c9e384c8f]
This won't work for obvious reasons, so exit with an error instead of
trying to access a file that doesn't exist and segfaulting
Change-Id: Id1230922fa6e9a19e9394280faad88a43c7d2e34
[ROCm/rocm_smi_lib commit: c7c2ac5559]
rocm_smi.py --set<m|s>clk was treating the freq as a string.
This causes problems in parsing when the index is more than 1
digit. Now, treat the indexes as integers.
Change-Id: Ia0d859d33b685fe90689a86ff1c83980808b1514
[ROCm/rocm_smi_lib commit: 11440536cf]
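A short sketch of why string handling breaks for multi-digit indexes and how integer conversion avoids it. The exact failure in the original code is not stated, so the string-iteration case is an assumed illustration; parse_clock_levels is a hypothetical helper.

```python
def parse_clock_levels(tokens):
    """Convert level tokens such as ['0', '10'] into integers."""
    levels = []
    for token in tokens:
        if not token.isdigit():
            raise ValueError("clock level must be a non-negative integer: %r" % token)
        levels.append(int(token))
    return levels


# Iterating over a string splits a two-digit index into separate digits.
assert list("10") == ["1", "0"]
# Converting whole tokens keeps the index intact.
assert parse_clock_levels(["0", "10"]) == [0, 10]
```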
The purpose of this patch is to fix a power cap bug for --setpoweroverdrive.
This bug occurs when the user attempts to set a lower wattage than the current
or default wattage, which displays an unnecessary warning message.
Signed-off-by: Ori Messinger <Ori.Messinger@amd.com>
Change-Id: I730d2c6031b7d7c4af5acf32ecd28da5ca21ab12
[ROCm/rocm_smi_lib commit: 20e2d260fb]
The purpose of this patch is to implement GPU reset functionality
in the LIB, and to call it from the rocm_smi python CLI.
Signed-off-by: Ori Messinger <Ori.Messinger@amd.com>
Change-Id: Iaf525f7016f8354a7fd93af0209ca2e97ef4fd56
[ROCm/rocm_smi_lib commit: 80f629b9be]
The purpose of this patch is to fix a fan speed bug for --showfan.
This bug occurs when the current and/or maximum fan speeds are not
found by the LIB, which displayed an unclear error message.
Signed-off-by: Ori Messinger <Ori.Messinger@amd.com>
Change-Id: Ied06e460f22391238dd2d86572813e2a5a64f45b
[ROCm/rocm_smi_lib commit: 4f297bdeb3]