Commit graph

343 commits

Author SHA1 Message Date
Bill(Shuzhou) Liu bfa976cf08 Port rocm-smi function to amd-smi
Port most rocm-smi functions to amd-smi and add unit tests.

Change-Id: I6387a4bdaf20ead2389c99bb01d438156ccd0747


[ROCm/amdsmi commit: f1d02aca79]
2022-09-06 12:08:59 -04:00
Bill(Shuzhou) Liu 236e4e2d3e Port more rocm-smi function to amd-smi
Port the API support functions, performance counters, process
information, topology, and XGMI info.

Change-Id: I3350ec75fdd2ca1438e79134582ae83c49763056


[ROCm/amdsmi commit: 86017b799c]
2022-08-24 12:49:27 -05:00
Bill(Shuzhou) Liu 39c1c4334e Support events in the amdsmi
Port the events handling from rocm-smi to amd-smi

Change-Id: I0b4cb30a585cb2188a24be0e21c1c156b461bb1d


[ROCm/amdsmi commit: 7b92c694a0]
2022-08-23 16:49:56 -04:00
Bill(Shuzhou) Liu 06a481c563 Add unit test support
Add gtest based unit test framework. Implement fan read/write function.

Change-Id: I83375c24b99d24d01d12bccda863a38f75f5987f


[ROCm/amdsmi commit: 98df483bef]
2022-08-05 09:55:34 -04:00
Alexsandar Nedeljkovic f79d7838f4 Update amdsmi header to include GpuvSMI related APIs and definitions
Signed-off-by: Alexsandar Nedeljkovic <alexsandar.nedeljkovic@amd.com>
Change-Id: Iff46d724f35b52028b67ce272f800fcf820c96ac


[ROCm/amdsmi commit: 61289339d8]
2022-07-22 16:06:20 +02:00
Bill(Shuzhou) Liu f7e5bb73d0 Load libdrm at run time
Remove the compile-time dependency on libdrm. Load it at run time
instead.

Add the headers missing from smi-lib.

Change-Id: Ie1ecf293b51425b6a61c502d11a42809dc099f70


[ROCm/amdsmi commit: 5ba371f285]
2022-06-28 14:48:59 -04:00
Bill(Shuzhou) Liu 325173a20d The init version of amd_smi
The initial version includes the amd_smi.h header, an example that uses
amd_smi, the folder structure, and the CMake files.

Add support for libdrm.

Change-Id: I779e55e4cf9491c61dc226a30d24e96be9bc6016


[ROCm/amdsmi commit: 91ad08aa65]
2022-06-14 09:14:24 -04:00
Elena Sakhnovitch ccf3ac2b15 [rocm_smi.py]: shownodesbw fix for non xgmi
Improve error output for non-xgmi nodes bandwidth

signed-off-by: Elena Sakhnovitch
Change-Id: I833970d3200a75c7639d33bf19e0e83afe176c8d


[ROCm/amdsmi commit: 44ea49eb01]
2022-05-24 16:45:32 -04:00
Ori Messinger 23b3bcc038 ROCm SMI CLI: Fix --showvoltagerange bug
This patch fixes a --showvoltagerange bug, which attempts to check
the voltage curve on a device that does not have any voltage
regions in its OverDrive voltage frequency data (odvf).

Signed-off-by: Ori Messinger <Ori.Messinger@amd.com>
Change-Id: I647c30c978ffb13f6819ac3d069ee340710a7f99


[ROCm/amdsmi commit: 786f66671a]
2022-05-21 05:02:15 -04:00
Ori Messinger cf61df76ad ROCm SMI CLI: Fix setPowerOverdrive restPowerOverdrive Bugs
Fixes bug in the 'setPowerOverdrive' function which mishandles
GPUs with secondary dies. Secondary dies have a default power cap
of 0W and cannot be changed, so they are now skipped.

Fixes bug in the 'resetPowerOverdrive' function which incorrectly
resets the wattage to the current value.

Signed-off-by: Ori Messinger <Ori.Messinger@amd.com>
Change-Id: I483fa3f58b1fa44a3bf7bae3b52c59ce523ae152


[ROCm/amdsmi commit: 4298cbb400]
2022-05-21 05:01:32 -04:00
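
The skip rule for secondary dies can be sketched as follows. This is an illustration only: `plan_power_caps` and its arguments are hypothetical names, and the only fact taken from the commit message is that a default power cap of 0 W marks a die whose cap cannot be changed.

```python
def plan_power_caps(default_caps_w, requested_w):
    """Split devices into those to update and those to skip.

    default_caps_w maps a device index to its default power cap in watts.
    """
    to_set = {}
    skipped = []
    for dev, default_cap in sorted(default_caps_w.items()):
        if default_cap == 0:
            # A 0 W default cap marks a secondary die; leave it untouched.
            skipped.append(dev)
        else:
            to_set[dev] = requested_w
    return to_set, skipped
```

For example, `plan_power_caps({0: 300, 1: 0}, 250)` plans a 250 W cap for device 0 and skips device 1.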
Divya Shikre 92657b2380 Fix mem leaks observed while running rsmitst
1. Memory allocated for the handle was not deleted when no variant,
   subvariant, or supported function was found.
2. The handle->func_id_iter address was set to 0 before delete[].

Signed-off-by: Divya Shikre <DivyaUday.Shikre@amd.com>
Change-Id: Iab50fdfbe03eec8e6fd0e84e03bd2c47e645b3d8


[ROCm/amdsmi commit: b23cfc0e82]
2022-05-18 14:31:44 -04:00
Divya Shikre f4e33b90c9 Update get_frequencies to handle failures.
Show an optional debug log (RSMI_DEBUG_BITFIELD=2) to
the user in the following scenarios:
1. More than one current frequency is found.
2. Frequencies are not read in increasing order of
   their value.
If the current frequency is not available, its index is
set to -1 and no value has * next to it in the
output. This will also be handled in rocm_smi.py.

Signed-off-by: Divya Shikre <DivyaUday.Shikre@amd.com>
Change-Id: I477ec065f7513c8045d6392f12ef6cb835a6b8f6


[ROCm/amdsmi commit: afe996c2ed]
2022-05-11 15:33:15 -04:00
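
The checks described in this commit can be illustrated with a minimal Python sketch. It is not the library's C++ code: the function name `parse_frequencies`, the return shape, and the warning strings are hypothetical, and the input is assumed to follow the `pp_dpm_*` sysfs layout of `index: valueMhz`, with `*` marking the current level.

```python
import re

# Matches lines such as "1: 1000Mhz *": index, value in MHz, optional current marker.
_LINE_RE = re.compile(r"^\s*(\d+):\s*(\d+)\s*Mhz\s*(\*)?\s*$", re.IGNORECASE)


def parse_frequencies(text):
    """Return (frequencies_mhz, current_index, warnings) for a pp_dpm_* style listing."""
    freqs = []
    current = -1  # -1 means no current frequency was reported
    warnings = []
    for line in text.splitlines():
        match = _LINE_RE.match(line)
        if match is None:
            continue  # ignore blank or unrecognized lines
        freqs.append(int(match.group(2)))
        if match.group(3):
            if current != -1:
                warnings.append("more than one current frequency found")
            current = len(freqs) - 1  # hypothetical choice: keep the last marked entry
    # Compare each value with its successor to detect out-of-order listings.
    if any(a > b for a, b in zip(freqs, freqs[1:])):
        warnings.append("frequencies are not in increasing order")
    return freqs, current, warnings
```

A caller can print the warnings only when debugging is enabled, and omit the `*` marker whenever the returned index is -1.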
Divya Shikre 3cbd1652de Add DEBUG_LOG macro
Add a DEBUG_LOG macro that optionally prints an error
message when RSMI_DEBUG_BITFIELD is set to 2.

Signed-off-by: Divya Shikre <DivyaUday.Shikre@amd.com>
Change-Id: I6017e92d8a9e5f9861ae29ece0488d4bc198f996


[ROCm/amdsmi commit: 99be3451d7]
2022-05-11 11:03:24 -04:00
Divya Shikre 8504d79dfa Add RSMI_CLK_TYPE_PCIE to rsmi_clk_type_t
showclocks/showclkfrq does not display pp_dpm_pcie values
in SR-IOV. This fix adds the PCIe clock to rsmi_clk_type_t,
where the rest of the clocks are defined.

Signed-off-by: Divya Shikre <DivyaUday.Shikre@amd.com>
Change-Id: I6d129ae412623b369c14456ae9781b2dbceb2139


[ROCm/amdsmi commit: c9b42bff57]
2022-05-06 09:15:39 -04:00
Ori Messinger 821ffaa5b9 ROCm SMI LIB: Add Missing GPU Blocks
This patch adds the following 4 missing GPU blocks to the SMI LIB:
-RSMI_GPU_BLOCK_MMHUB
-RSMI_GPU_BLOCK_PCIE_BIF
-RSMI_GPU_BLOCK_HDP
-RSMI_GPU_BLOCK_XGMI_WAFL

Signed-off-by: Ori Messinger <Ori.Messinger@amd.com>
Change-Id: Ia1ec6f53e195f4bf7b8f073d6bed4fdb6572e546


[ROCm/amdsmi commit: 9d6403bb17]
2022-05-05 00:44:16 -04:00
Elena Sakhnovitch 65841a8fd0 Revert "rocm_smi.py: Don't try to print absent clock files"
This reverts commit 4de1e4094a.
The DRM device ID does not always match the GPU ID in rocm_smi.py. This leads to cases where the wrong device is checked by os.path.isfile().

Change-Id: Ib6f2b9be123b7eb64334d3feec57f63d7eb37d6f


[ROCm/amdsmi commit: be66d67ef2]
2022-05-03 16:42:42 -04:00
Elena Sakhnovitch 67d69e127e [rocm_smi.py] Hide unsupported clocks under debug
Signed-off-by: Elena Sakhnovitch <elena.sakhnovitch@amd.com>
Change-Id: I1f2c7b93d9a81f2735c76e8d441f9e298288f5c0


[ROCm/amdsmi commit: 9d7fd34d2b]
2022-05-03 16:38:22 -04:00
Bill(Shuzhou) Liu 9bf38c36a3 Sanity check amdgpu module is loaded in rocm_smi.py
Instead of checking /proc/modules for amdgpu, the code checks
/sys/module/amdgpu/initstate, which also covers the case where the
driver is compiled into the kernel.

Change-Id: Id39ec5b0eb9b68204bc9f5f779057ba8cc090bdc


[ROCm/amdsmi commit: 9f6614e83b]
2022-04-14 11:28:38 -04:00
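
A minimal sketch of such a check, assuming the kernel reports a fully initialized module as `live` in its `initstate` file. The function name `amdgpu_is_live` and its path parameter are hypothetical and are not taken from rocm_smi.py.

```python
from pathlib import Path


def amdgpu_is_live(initstate_path="/sys/module/amdgpu/initstate"):
    """Return True when the initstate file exists and reports "live"."""
    try:
        return Path(initstate_path).read_text().strip() == "live"
    except OSError:
        # A missing or unreadable file is treated as "driver not available".
        return False
```

Taking the path as a parameter keeps the function testable on machines without an AMD GPU.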
Bill(Shuzhou) Liu 538dc09a8b Suppress "rsmi_init() failed" error message
When an application calls the library on a system without amdgpu,
it may always print "rsmi_init() failed". Suppress the error
message in the library.

Change-Id: Ice63dd3a764b221a6935536bff1bfa6aa3e51a46


[ROCm/amdsmi commit: 7860de5107]
2022-04-12 09:44:00 -04:00
Ori Messinger a21208fc4e ROCm SMI CLI: Fix formatCsv Bug
Fixes a bug in the 'formatCsv' function which mishandles JSON
data conversion for 'system' data types.

Signed-off-by: Ori Messinger <Ori.Messinger@amd.com>
Change-Id: I705060409bf5ae75b994ffda270843065ca12321


[ROCm/amdsmi commit: e800cbf161]
2022-04-07 19:33:46 -04:00
Bill(Shuzhou) Liu 4884de63fd Correct the __pycache__ folder
Remove the __pycache__ in the folder libexec/rocm_smi

Change-Id: I0ad505ff7e7368d5fe86e1eee12080039edc7111


[ROCm/amdsmi commit: 9f814e150e]
2022-03-24 09:44:33 -04:00
Bill(Shuzhou) Liu ee67414d43 Remove python pyc file when uninstall
Remove python pyc file when uninstall.

Change-Id: I383faf8fcfaeeb346c9ee38c1aad8577a460281e


[ROCm/amdsmi commit: c37d4bac8f]
2022-03-23 13:39:57 -04:00
Ranjith Ramakrishnan a2586d1044 Remove rocm_smi/bin folder and prefix name correction in pragma message
The /opt/rocm/rocm_smi/bin folder was added by mistake as part of the file reorg and has been removed.
File reorg commit: 2a0ecb1e5689bfe5851bf91039b72df580fec372
The pragma message for the oam header files showed the prefix as rocm_smi; it is now oam.

Change-Id: I74b3c1d2bd7e0ff0eee5738c1658063bc855066c


[ROCm/amdsmi commit: 869670866d]
2022-03-17 18:16:10 -07:00
Kent Russell da9b4c606e README: Remove restrictive licensing language
Also update copyright years

Signed-off-by: Kent Russell <kent.russell@amd.com>
Change-Id: Ic9ead543c4937680afc1957623c4d5fcbfbd58b0


[ROCm/amdsmi commit: 85571318e2]
2022-03-16 13:52:25 -04:00
Sreekant Somasekharan 5bc962bf3d make string variable 'tpath' an empty string.
A string variable that is not empty can lead to incorrect compilation
and corrupted output.

Change-Id: Ie66756c28aef7417759c29387500970a8b53e44c


[ROCm/amdsmi commit: dbe3403bd3]
2022-03-11 21:22:28 -05:00
Bill(Shuzhou) Liu 26df407089 Upgrade GoogleTest to v1.11.0
The old GoogleTest has compile errors on CentOS 9. Upgrade it
to the latest version.

Change-Id: I6bbe6afdfad6422a210f258880ddc87a9f088d76


[ROCm/amdsmi commit: 8ce9289bc2]
2022-03-09 15:18:43 -05:00
Sreekant Somasekharan b3ba591ee6 Add blacklist filter 'virtualization' for rsmi tests failing in SRIOV
Change-Id: Ibbaef092482c0b78ecd86a29f0b9b4331b51abe2


[ROCm/amdsmi commit: e6ae697e9c]
2022-03-04 22:13:44 -05:00
Elena Sakhnovitch 26ef2abe05 [rocm_smi.py] resetPowerOverdrive fix
resetPowerOverdrive: improve output messages.

Signed-off-by: Elena Sakhnovitch
Change-Id: Ic5b9084f0637458c36e460231f2d3622b0a23aa6


[ROCm/amdsmi commit: a3317714cb]
2022-03-04 11:26:45 -05:00
Ranjith Ramakrishnan 2a0ecb1e56 File reorganization with backward compatibility
Wrapper header files
Soft link to libraries and binaries
rocm_smi.py and rsmiBindings.py installed in libexec/rocm_smi
Binaries, libraries and header files installed as per File Reorg folder structure

Change-Id: I3166ab67f89c2ae4aafbc87bb00c9a5233221ade


[ROCm/amdsmi commit: f1da5591b5]
2022-03-03 18:48:52 -05:00
Bill(Shuzhou) Liu ef0b0eb0af Prevent stack buffer overflow
readlink() does not append a null byte to the buffer. Initialize
tpath to prevent a stack buffer overflow.

Change-Id: I17895dc3576b080a0c35bd0528a5b83223ec1c1b


[ROCm/amdsmi commit: 4b65b0307f]
2022-03-03 15:43:53 -05:00
Saravanan Solaiyappan e8ea51109b Consider apt/yum upgrade operation check in package scripts.
Include the upgrade operation check in the prerm and postun scripts
for rocm-smi-lib package.

Signed-off-by: Saravanan Solaiyappan <saravanan.solaiyappan@amd.com>
Change-Id: Ic3dee7ae50a2ac317f1aab88472b6d4805c4de90


[ROCm/amdsmi commit: 3a3b8dd25d]
2022-02-24 10:11:32 -05:00
Elena Sakhnovitch 99a9fbfea8 [rocm_smi.py]: fix input error type for --setclock
signed-off-by: Elena Sakhnovitch
Change-Id: I9626978780f360c591fb8908f5b759f2289dff0b


[ROCm/amdsmi commit: 9b871fcd9f]
2022-02-22 14:24:38 -05:00
Freddy Paul 58e2cf0508 rocm-smi:Fix cmake target files to reflect correct location
Change-Id: I86fda8447609c42e0f0615abd837b53ca5fbe717


[ROCm/amdsmi commit: d0545854dd]
2022-02-18 09:53:43 -08:00
Ori Messinger e9afb27da3 ROCm SMI CLI: Hide Failed Command Warning
The purpose of this patch is to hide 'One or more commands failed.'
from showing up, unless an appropriate log level has been set.

You can set the loglevel in the CLI with:
--loglevel <debug/info/warning/error/critical>

Signed-off-by: Ori Messinger <Ori.Messinger@amd.com>
Change-Id: Ifa309cd62596491a6ea5892e0752251f037fc0e9


[ROCm/amdsmi commit: 007f326c34]
2022-02-09 11:52:33 -05:00
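
One way to get this behavior with Python's standard `logging` module is sketched below. The `--loglevel` flag and its choices come from the commit message. The logger name, the helper functions, and the decision to emit the message at warning level are assumptions, not the actual rocm_smi.py code.

```python
import argparse
import logging

LEVELS = ("debug", "info", "warning", "error", "critical")


def configure_logger(argv):
    """Build a logger whose threshold follows an optional --loglevel flag."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--loglevel", choices=LEVELS, default=None)
    args = parser.parse_args(argv)
    logger = logging.getLogger("smi_cli_sketch")  # hypothetical logger name
    if args.loglevel is None:
        # No flag given: use a threshold above CRITICAL so nothing is emitted.
        logger.setLevel(logging.CRITICAL + 1)
    else:
        logger.setLevel(getattr(logging, args.loglevel.upper()))
    return logger


def report_failure(logger):
    """Emit the failure notice at warning level (an assumed level)."""
    logger.warning("One or more commands failed.")
```

With no flag the message is dropped. With `--loglevel warning`, or a more verbose level, it reaches whatever handlers are attached.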
Bill(Shuzhou) Liu 159a3cfea3 Link the library using sha1 build-id
The address sanitizer build requires a build ID longer than 8 bytes.

Change-Id: I530fe87dffbf4c46f010bf8a1c2914f733678e9a


[ROCm/amdsmi commit: 3aab7b199e]
2022-02-02 17:04:11 -05:00
Divya Shikre ee42de6190 Temporary blacklist TestPerfLevelReadWrite for navi21
Signed-off-by: Divya Shikre <DivyaUday.Shikre@amd.com>
Change-Id: Iee2146170b6828fe4fe2846c3ebfd57f95734f34


[ROCm/amdsmi commit: 8c4635acea]
2022-01-27 22:56:37 -05:00
Laurent Morichetti d9fba4453f Don't use NDEBUG when the intent is !DEBUG
CMakeLists.txt does not set up the DEBUG macro correctly to mean
!NDEBUG, so, as a workaround, replace all uses of ifdef NDEBUG with
ifndef DEBUG in the library sources.

Change-Id: I408adb36d1a2310fb894a486574469662ebb27cd
(cherry picked from commit 9f87197d8d)


[ROCm/amdsmi commit: 2804bf7c28]
2022-01-27 11:08:48 -05:00
Divya Shikre b8360c38b8 Add fix to check for vector size while reading pp_dpm_pcie
pop_back() was causing a segfault when the pp_dpm_pcie file is empty and returns only whitespace.

Signed-off-by: Divya Shikre <DivyaUday.Shikre@amd.com>
Change-Id: I888f1f79751cd456e43751a5b96d08560a039677


[ROCm/amdsmi commit: ec71380e1c]
2022-01-26 10:34:57 -05:00
Bill(Shuzhou) Liu f0cd40a10b Add rpm License header
Add rpm License header for cpack

Change-Id: I2f4a89015b6389cfde801f41d4f6e0f59e7087aa


[ROCm/amdsmi commit: ce9cfa584f]
2022-01-20 13:30:40 -05:00
Divya Shikre 2d7a5566e4 Don't assert when fan is not supported.
Add a check when RSMI_STATUS_NOT_SUPPORTED is returned for fanRead/fanReadWrite.
Fix for SWDEV-314176 & SWDEV-314175 reported.

Signed-off-by: Divya Shikre <DivyaUday.Shikre@amd.com>
Change-Id: Icf2cc541a3fa5ca4794aff5d6bc91104adc45e6d


[ROCm/amdsmi commit: 11a71c63b1]
2022-01-20 12:29:12 -05:00
Bill(Shuzhou) Liu 8f4a82612d Add license file to smi-lib package
Install LICENSE.txt to share/doc/smi-lib

Change-Id: Idcbb70db8808111203e8e4a4c3ab4d1e070ac79d


[ROCm/amdsmi commit: 3356084074]
2022-01-19 12:15:31 -05:00
Sreekant Somasekharan 304636c27d Print ASD firmware version in hex instead of decimal format
Change-Id: Idf113f63b79f2d2903ae795d272d232a43680516


[ROCm/amdsmi commit: cf2f0b0508]
2022-01-18 10:44:20 -05:00
Bill(Shuzhou) Liu fcbb9e5945 Enable the linker build id generation for address sanitizer build
The -Wl,--build-id option is added for address sanitizer build

Change-Id: I0d75bc8e6169010c460e62e51708828e75de478e


[ROCm/amdsmi commit: 7b69dde24f]
2022-01-17 09:06:34 -05:00
Bill(Shuzhou) Liu e21e1aff43 strip the library instead of link when build release
When building the release, strip the library file instead of linking.

Change-Id: Ib2d4cea614e8938bdb2be0fd74f046680158d256


[ROCm/amdsmi commit: 77502bed2a]
2022-01-14 10:39:15 -05:00
Harish Kasiviswanathan a014132bba rocm_smi_lib: add stdbool.h needed for C90
'bool' keyword is supported only from C99 onwards. Include stdbool.h
for older compilers

Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Change-Id: I09fd5cf6eac20e7185e85a1123bc4826958b2b7c


[ROCm/amdsmi commit: 8de6ed2b8d]
2021-12-14 15:25:59 -05:00
Elena Sakhnovitch 48a2251ff6 [rocm_smi.py] remove \r symbol at print
Remove carriage return at the end of the line in printLog function.
On Linux, the end of line is encoded as \n, not \n\r.

Change-Id: If3835d773033b53a7f25b4a0284df359a6f9555d


[ROCm/amdsmi commit: 1aeb27c4c9]
2021-12-08 10:13:56 -05:00
Divya Shikre d72346c920 Add null ptr check for temperature read from all sensors.
The (temperature == nullptr) check happens only when HBM temperature is retrieved.
This check needs to apply in other cases as well, hence moving this outside the HBM condition.
This should return RSMI_STATUS_INVALID_ARGS consistently in all cases when nullptr is passed through rsmitst.

Signed-off-by: Divya Shikre <DivyaUday.Shikre@amd.com>
Change-Id: Iea3cec75312a0a669c7da27e15e9782e6a885c5f


[ROCm/amdsmi commit: 432df20321]
2021-12-01 14:05:46 -05:00
Divya Shikre 656b39646e Update temp_read rsmitst.
Check for RSMI_STATUS_INVALID_ARGS when invalid args are passed.

Signed-off-by: Divya Shikre <DivyaUday.Shikre@amd.com>
Change-Id: I0d5ff84aee5cce4214026ddcd860a17ae3e43147


[ROCm/amdsmi commit: b4fd9c0d94]
2021-11-29 18:09:45 -05:00
Sreekant Somasekharan 1a4346e6ba Skip TestFrequenciesReadWrite for unsupported ASICs
For ASICs NAVI10 and above, setting the display clock [DCEFCLK] is not supported and the sysfs entry is
read-only. As a result, the test falsely fails for these ASICs. ROCm SMI Lib is ASIC independent,
so the display clock set cannot be selectively disabled for these ASICs.

As a compromise, if the set (write to the sysfs entry) fails due to a permission error and the euid is root,
assume that the set feature is not supported and skip the test.

Change-Id: I7a273878cbf1465b01728705323e8a92a42378dd


[ROCm/amdsmi commit: c6f695f5a9]
2021-11-29 11:23:38 -05:00
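
The compromise described above can be sketched in Python. The real test is part of the C++ rsmitst suite; `try_set_clock`, its `write` callable, and the string results are hypothetical stand-ins.

```python
import os


def try_set_clock(write, euid=None):
    """Run write(); return "ok", or "skip" when root is denied permission."""
    if euid is None:
        euid = os.geteuid()
    try:
        write()
    except PermissionError:
        if euid == 0:
            # Root was refused, so assume the entry is read-only on this ASIC.
            return "skip"
        raise  # a non-root failure is an ordinary permission problem
    return "ok"
```

Passing the effective user ID as a parameter lets both branches be exercised without running as root.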
Divya Shikre 58b5a538a7 Add fix to display correct GPU Memory Activity and GFX Activity value.
The driver fills in 0xFF for all the metrics not supported on that ASIC.
So if 0xFF is detected, return RSMI_STATUS_NOT_SUPPORTED.

Signed-off-by: Divya Shikre <DivyaUday.Shikre@amd.com>
Change-Id: I86a38148c7a288ea0db94893f685560eaac098ab


[ROCm/amdsmi commit: 7b1daaef96]
2021-11-25 14:28:06 -05:00
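
A minimal sketch of the detection, assuming each metric is a fixed-width unsigned field and that "0xFF" means every byte of the field is 0xFF. The function name `metric_or_none` is hypothetical, and it returns `None` where the library returns RSMI_STATUS_NOT_SUPPORTED.

```python
def metric_or_none(value, width_bytes):
    """Return value, or None when every byte of the field is 0xFF."""
    all_ones = (1 << (8 * width_bytes)) - 1  # e.g. 0xFFFF for a 2-byte field
    return None if value == all_ones else value
```

Comparing against the full-width pattern avoids rejecting a legitimate reading such as 0xFF in a 2-byte field.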