Commit Graph

266 Commits

Author SHA1 Message Date
Chris Freehill 6fc9f802ae Quiet address sanitizer warnings
Also,
* Fix some doxygen issues
* Fix address sanitizer issues in rsmitst

Change-Id: Ie6c6fd9af5c418210b7064e79650fb92cd4a5e2b


[ROCm/rocm_smi_lib commit: 63064b0000]
2020-11-10 14:16:39 -06:00
Chris Freehill 0fb36c2f41 Make CMakeLists.txt recognize ADDRESS_SANITIZER
Change-Id: Ic80ac42c62cd400e48fb26d504547931fdd6863a


[ROCm/rocm_smi_lib commit: e7c8dfe2a2]
2020-11-04 17:57:31 -06:00
Chris Freehill b7df80c34b Use relative path to find librocm_smi
Change-Id: Ifca3f54d680a802c1c5fa360d17e64338b9ac9a8


[ROCm/rocm_smi_lib commit: 438d28612f]
2020-10-29 14:36:48 -05:00
Elena Sakhnovitch c17f9e05e1 ROCm SMI Python CLI: --rasinject partial support
This implementation is copied directly from the previous rocm_smi.py
script. This feature is experimental and will be updated or removed in
future releases.

Signed-off-by: Elena Sakhnovitch
Change-Id: I5cd38266946302bc4123aeafaa825e13f704235e


[ROCm/rocm_smi_lib commit: 4117719edd]
2020-10-22 17:22:13 -04:00
Chris Freehill bbbdd0cb2c Add new XGMI counter events to rsmiBindings.py
Also, correct RSMI_EVNT_LAST to new value.

Change-Id: I9f693cb398bba583201f6b5b5f0e2d45ede2e4e0


[ROCm/rocm_smi_lib commit: 1982fdc4fb]
2020-10-22 17:21:50 -04:00
Divya Shikre d7d7d1e7ea Fix for weight/hops not being updated
Signed-off-by: Divya Shikre <DivyaUday.Shikre@amd.com>
Change-Id: I333d49fa011b85d41eca63c082c0615febe2f7e9


[ROCm/rocm_smi_lib commit: 94291bf882]
2020-10-20 15:01:06 -04:00
Ori Messinger 4e97667f31 ROCm SMI Python CLI: Add CU Occupancy to showPids function
The purpose of this patch is to add CU occupancy functionality to showPids
by calling rsmi_compute_process_info_get from the LIB.

Now showPids shows the following information on (KFD compute) processes:
PID, process name, GPU(s), VRAM used, SDMA used, and CU occupancy.

Change-Id: Ie005901e0eb946ef0fbb3523245ca451c4eed595
Signed-off-by: Ori Messinger <Ori.Messinger@amd.com>


[ROCm/rocm_smi_lib commit: 20ae72b078]
2020-10-15 21:21:32 -04:00
Ramesh Errabolu 8b53e7812f Update ROCm SMI library with ability to read CU occupancy
Change-Id: Ib9882fa2d81c13604af282279bfa116bc2fd05a4


[ROCm/rocm_smi_lib commit: 328878343c]
2020-10-14 09:33:37 -04:00
Divya Shikre caf3a5132b Adding gtest for gpu metrics read
Signed-off-by: Divya Shikre <DivyaUday.Shikre@amd.com>
Change-Id: I66edb15c8b7380f3427822b33e845202bfac7a2b


[ROCm/rocm_smi_lib commit: f397cba414]
2020-10-08 13:37:47 -04:00
Ori Messinger 9a20f6fa3e ROCm SMI Python CLI: Check for amdgpu Driver Initialization
The purpose of this patch is to check for amdgpu driver initialization
before attempting to initialize rocmsmi in the CLI.

Additionally, since the '--help' functionality does not rely on anything
external to the CLI, it can now be called without the driver initialized.

Change-Id: I2fcce60ca6d9f77835549e3558c4bb1747499c5c
Signed-off-by: Ori Messinger <Ori.Messinger@amd.com>


[ROCm/rocm_smi_lib commit: e3c9aec714]
2020-10-08 11:17:45 -04:00
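The driver check described in commit 9a20f6fa3e can be sketched roughly as follows. This is a minimal illustration, not the CLI's actual code: it assumes that a loaded amdgpu module reports `live` in `/sys/module/amdgpu/initstate`, and the function name `driver_initialized` and its `initstate_path` parameter are hypothetical.

```python
from pathlib import Path

# Assumption: the kernel exposes the module state here and reports "live" once loaded.
AMDGPU_INITSTATE = "/sys/module/amdgpu/initstate"


def driver_initialized(initstate_path: str = AMDGPU_INITSTATE) -> bool:
    """Return True if the module state file exists and reads 'live'."""
    try:
        return Path(initstate_path).read_text().strip() == "live"
    except OSError:
        # Missing or unreadable sysfs entry: treat the driver as not initialized.
        return False
```

A CLI built this way could parse its arguments first, so that `--help` still works, and call `driver_initialized()` only before touching the library.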
Chris Freehill c2acc451af Revert "Revert "Support for RSMI_EVNT_GRP_XGMI_DATA_OUT counters""
This reverts commit c009809fcd.



Change-Id: Ic412a64d35aab74caf12bf4c791f0a66ac15b061


[ROCm/rocm_smi_lib commit: 5465d872aa]
2020-10-08 10:36:30 -04:00
Kent Russell f5015e6cb4 Remove extraneous mutexes
We already grab the mutex before getting the device name, so we don't
need to grab it again

Change-Id: Ib627ba3a39c485f6069af052cfd3e6c522873d43


[ROCm/rocm_smi_lib commit: e350278b68]
2020-10-08 07:55:07 -04:00
Chris Freehill c009809fcd Revert "Support for RSMI_EVNT_GRP_XGMI_DATA_OUT counters"
This reverts commit 8acd845e5b.

Temporarily reverting until the driver side of this is upstream

Change-Id: I2d8243208c1271ebad90bc2ee0fda2dfefb0831b


[ROCm/rocm_smi_lib commit: ae6d3fbdd0]
2020-10-07 18:42:56 -04:00
Kent Russell e432b4e9e3 Check FRU-based product information if available
WKS and server cards have an FRU with product information, so try to use
that for product name and product SKU if it exists.

Signed-off-by: Kent Russell <kent.russell@amd.com>
Change-Id: I40bbd3bf62f4cb02e96015ed1630112691cacbc3


[ROCm/rocm_smi_lib commit: df7c3434cd]
2020-10-07 14:09:23 -04:00
Chris Freehill 624f906f07 Fail gracefully if drm directory is not found
Change-Id: I0f3ab2721108355752caf0280124469b98af4967


[ROCm/rocm_smi_lib commit: c6f02b4d62]
2020-10-05 21:12:11 -04:00
Chris Freehill 8acd845e5b Support for RSMI_EVNT_GRP_XGMI_DATA_OUT counters
Also some format fixes

Change-Id: Id3c0f6b3cf5b327bb9ca6acb6091dc67764c8032


[ROCm/rocm_smi_lib commit: 946bf93dfb]
2020-10-05 17:22:19 -05:00
Divya Shikre 94fc1524c3 Adding functionality that will parse gpu_metrics sysfs file
Signed-off-by: Divya Shikre <DivyaUday.Shikre@amd.com>
Change-Id: I3a84870b83eb4cd0ed46f10bb19169c91f99fd8e


[ROCm/rocm_smi_lib commit: 8b48564ce3]
2020-10-02 10:25:41 -04:00
Chris Freehill 91267d1440 Add gtest lib dir to library search path
Change-Id: I57bb20e2a67a4eaac2d0e24314e22d1a5fbe3533


[ROCm/rocm_smi_lib commit: 3522e94ed0]
2020-10-01 23:46:33 -04:00
Ori Messinger 1b36ce7e6d ROCm SMI Python CLI: Implement --setclock for all Valid Clocks
The purpose of this patch is to implement --setclock functionality for
all of the valid clocks (can be set with --setclock TYPE LEVEL).

The valid clocks are: dcefclk, fclk, mclk, pcie, sclk, socclk.
This functionality uses the existing 'setClocks' method.

Change-Id: I1d62baf372427ac1c0642c26a949663b673ef335
Signed-off-by: Ori Messinger <Ori.Messinger@amd.com>


[ROCm/rocm_smi_lib commit: 4ed1c1d492]
2020-09-22 15:41:51 -04:00
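The `--setclock TYPE LEVEL` form from commit 1b36ce7e6d could be validated along these lines. The list of clock types comes from the commit message; the parser itself, the function name `parse_setclock`, and the rule that LEVEL is a non-negative integer are assumptions made for illustration and are not taken from the CLI source.

```python
import argparse

# Clock types listed in the commit message.
VALID_CLOCKS = ("dcefclk", "fclk", "mclk", "pcie", "sclk", "socclk")


def parse_setclock(argv):
    """Parse '--setclock TYPE LEVEL' and return (type, level), or None if absent."""
    parser = argparse.ArgumentParser(prog="setclock-sketch")
    parser.add_argument("--setclock", nargs=2, metavar=("TYPE", "LEVEL"))
    args = parser.parse_args(argv)
    if args.setclock is None:
        return None
    clk_type, level = args.setclock
    if clk_type not in VALID_CLOCKS:
        raise ValueError(f"invalid clock type: {clk_type}")
    # Assumption: a level is a non-negative integer index.
    if not level.isdecimal():
        raise ValueError(f"level must be a non-negative integer: {level}")
    return clk_type, int(level)
```

Validating the type against one shared tuple keeps the accepted values in a single place if more clocks are added later.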
Mukul Joshi 602f182344 Use correct string conversion function for VRAM and SDMA usage
VRAM and SDMA usage can be 64-bit long numbers. Use stoull()
instead of stoi() to convert the VRAM and SDMA usage strings to
numbers.

Change-Id: Ifadbada9f33320fc67666036ce8439823c1d1fb7


[ROCm/rocm_smi_lib commit: fb2ed24372]
2020-09-21 12:28:22 -04:00
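The fix in commit 602f182344 is in C++ (`stoi()` versus `stoull()`); the Python sketch below only illustrates the range problem behind it. It assumes a 32-bit `int` and a 64-bit `unsigned long long`, which is typical but platform-dependent, and the sample value is hypothetical.

```python
INT_MAX = 2**31 - 1      # upper bound of a 32-bit int, the result type of stoi()
ULLONG_MAX = 2**64 - 1   # upper bound of a 64-bit unsigned long long, as used by stoull()


def fits_int(text: str) -> bool:
    """True if the decimal string fits in a signed 32-bit int."""
    return -INT_MAX - 1 <= int(text) <= INT_MAX


def fits_unsigned_long_long(text: str) -> bool:
    """True if the decimal string is a non-negative value that fits in 64 bits."""
    return 0 <= int(text) <= ULLONG_MAX


vram_used = "8589934592"  # hypothetical usage string: 8 GiB in bytes
print(fits_int(vram_used), fits_unsigned_long_long(vram_used))
```

Any usage counter above roughly 2 GiB already exceeds the 32-bit range, which is why the wider conversion is needed.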
Mukul Joshi 40bf5754fd Add support for GPU reset SMI events
Add handling for both pre GPU reset and post GPU reset SMI
events.

Change-Id: I64d5e006bef58cb28b1c580c75f482a4590427da


[ROCm/rocm_smi_lib commit: 8b95705e6f]
2020-09-16 13:25:06 -04:00
Mukul Joshi 4ad8b300d8 Add support for KFD Thermal Throttling SMI event
Add handling for receiving thermal throttling SMI event from the
kernel.
Also, update the event notification test to work with the new event.

Change-Id: Ib89c12b244f90998ccbae0a38b37f25705d156e0


[ROCm/rocm_smi_lib commit: aff75c955f]
2020-09-16 13:24:57 -04:00
Mukul Joshi 8082416569 Update KFD SMI event notification handling
The event bitmask in the KFD SMI event is now replaced with an event index in
the SMI event message. Sending an event bitmask, a 64-bit field with only
one bit set, wasted memory and potentially limited the interface to
64 events. Instead, the kernel now sends the event index in the SMI
event message. As a result, update the KFD SMI event handling to
expect the event index in the message.

Change-Id: I3e74620788d3c1f7c0bdaa69e9d9ab3d1aba2c92


[ROCm/rocm_smi_lib commit: 406859ca8a]
2020-09-16 13:24:50 -04:00
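The relationship between the two encodings discussed in commit 8082416569 can be illustrated as below. This is a sketch under stated assumptions: the 0-based index convention and the helper names are hypothetical, and the kernel's actual numbering may differ.

```python
def bitmask_to_index(mask: int) -> int:
    """Return the 0-based position of the single set bit in a 64-bit mask."""
    if not 0 < mask < 2**64 or mask & (mask - 1):
        raise ValueError("mask must have exactly one bit set within 64 bits")
    return mask.bit_length() - 1


def index_to_bitmask(index: int) -> int:
    """Return a 64-bit mask with only the bit for the given 0-based index set."""
    if not 0 <= index < 64:
        raise ValueError("a 64-bit mask can only represent indices 0..63")
    return 1 << index
```

The range check in `index_to_bitmask` shows the limit the commit message mentions: a one-hot 64-bit field cannot describe more than 64 distinct events, while a plain index can.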
Chris Freehill 74113a5594 Enable library-based rocm_smi.py
Change-Id: I5443308905456defc9818fac07ac2f20fe9426fd


[ROCm/rocm_smi_lib commit: 8f9f9433d8]
2020-09-16 09:31:30 -05:00
Chris Freehill fb7952f401 Make sure all sensor labels have valid mappings
There may not be label files for some sensors on older
devices. We need to make sure there is a valid dummy
mapping in these cases.

Change-Id: Id6a8b71e554552be84a0e42a477070b504151e7f


[ROCm/rocm_smi_lib commit: b015052a07]
2020-09-11 17:32:54 -05:00
Chris Freehill c2381bff52 Add missing docs section for EvntNotif
Change-Id: I69187c734d2618ddb4272c58bb76d04646908793


[ROCm/rocm_smi_lib commit: cafd678d5d]
2020-09-11 15:48:56 -05:00
Elena Sakhnovitch 8116b10d72 ROCm SMI CLI: Add JSON support for topo functions
-Add divider between devices for --showclocks to increase readability.
-Fix fan rounding error
-Fix spaces to comply with coding standard
-Fix @param description error in topo functions
-JSON result for topology:
{
  "card0": {
    "(Topology) Numa Node": "0",
    "(Topology) Numa Affinity": "4294967295"
  },
  "card1": {
    "(Topology) Numa Node": "0",
    "(Topology) Numa Affinity": "4294967295"
  },
  "system": {
    "(Topology) Weight between DRM devices 0 and 1": "40",
    "(Topology) Hops between DRM devices 0 and 1": "2",
    "(Topology) Link type between DRM devices 0 and 1": "PCIE"
  }
}

Signed-off-by: Elena Sakhnovitch <Elena.Sakhnovitch@amd.com>
Change-Id: I711c100362826ed729ff90edd407009237d64f8f


[ROCm/rocm_smi_lib commit: 91f8fcb7b1]
2020-09-10 12:57:14 -04:00
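The topology JSON shown in commit 8116b10d72 can be consumed with the standard `json` module. The sketch below embeds that sample verbatim; the helper `link_info` is hypothetical and simply assumes the key naming seen in the sample.

```python
import json

# Sample output copied from the commit message above.
TOPOLOGY_JSON = """
{
  "card0": {
    "(Topology) Numa Node": "0",
    "(Topology) Numa Affinity": "4294967295"
  },
  "card1": {
    "(Topology) Numa Node": "0",
    "(Topology) Numa Affinity": "4294967295"
  },
  "system": {
    "(Topology) Weight between DRM devices 0 and 1": "40",
    "(Topology) Hops between DRM devices 0 and 1": "2",
    "(Topology) Link type between DRM devices 0 and 1": "PCIE"
  }
}
"""


def link_info(doc: dict, src: int, dst: int) -> dict:
    """Collect weight, hops, and link type for one device pair."""
    system = doc["system"]
    suffix = f"between DRM devices {src} and {dst}"
    return {
        # Values are emitted as strings, so convert the numeric ones.
        "weight": int(system[f"(Topology) Weight {suffix}"]),
        "hops": int(system[f"(Topology) Hops {suffix}"]),
        "type": system[f"(Topology) Link type {suffix}"],
    }


topology = json.loads(TOPOLOGY_JSON)
print(link_info(topology, 0, 1))
```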
Elena Sakhnovitch 248fee7425 Add README.md starter file
Signed-off-by: Elena Sakhnovitch
Change-Id: I677b7d643c6559693c5ad627b704ee36631cc32e


[ROCm/rocm_smi_lib commit: edcae88fe9]
2020-09-10 11:09:42 -04:00
Elena Sakhnovitch 889bda96e1 ROCm SMI Python CLI: Implement --showbw
PCIE bandwidth functionality

Signed-off-by: Elena Sakhnovitch
Change-Id: I5a9ddc589846b6032739d491319078ead5723a27


[ROCm/rocm_smi_lib commit: 8b82621e72]
2020-09-09 14:52:58 -04:00
Harish Kasiviswanathan 43831998c9 Don't hard code rocm_smi_lib path
During rocm_smi_lib installation the path should be set using ldconfig

Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Change-Id: I0cab18f492013b783d1ce632591ce295f934a168


[ROCm/rocm_smi_lib commit: f1786a3095]
2020-09-08 19:29:09 -04:00
Divya Shikre 31f3b6d33d Adding setsrange, setmrange, setvc, setslevel and setmlevel functionality to rocm lib and cli
Signed-off-by: Divya Shikre <DivyaUday.Shikre@amd.com>
Change-Id: I5fd65ea7bcd5403aaf2e42d2aa28d837929da253


[ROCm/rocm_smi_lib commit: 54d4b9d500]
2020-09-08 18:42:39 -04:00
Ori Messinger e4aff0d37c ROCm SMI Python CLI: Implement show/set mclk OverDrive
The purpose of this patch is to implement show and set mclk OverDrive.
This implementation is copied directly from the previous rocm_smi.py
script since this functionality is mostly deprecated.

Change-Id: I705430f873a73f954b6812c222a385ff4e9b6eb2
Signed-off-by: Ori Messinger <Ori.Messinger@amd.com>


[ROCm/rocm_smi_lib commit: 95d43e30e3]
2020-09-08 14:24:11 -04:00
Ori Messinger c73a70b431 ROCm SMI Python CLI: Implement Valid Clocks
The purpose of this patch is to implement the remaining valid clocks.
The valid clocks are: dcefclk, fclk, mclk, pcie, sclk, socclk
This functionality is needed for the 'setClocks' method.

Change-Id: Ie648fb29dbbd61f0f064d4462ac566911f1ca2aa
Signed-off-by: Ori Messinger <Ori.Messinger@amd.com>


[ROCm/rocm_smi_lib commit: 2d59d0877b]
2020-09-02 06:40:59 -04:00
Divya Shikre b6ca634dcd Adding voltage range functionality to rocm cli
Signed-off-by: Divya Shikre <DivyaUday.Shikre@amd.com>
Change-Id: I9288c0c6cda2a984c34cfd2570deec640b6c9f0d


[ROCm/rocm_smi_lib commit: d1f4c252b0]
2020-08-28 12:04:36 -04:00
Divya Shikre 3e5469164e Adding logic to skip the loop if src and dest device are the same in HW Topology.
Signed-off-by: Divya Shikre <DivyaUday.Shikre@amd.com>
Change-Id: Ib9cfbf5a7238ba75f6463e8fa6250bb9946b7979


[ROCm/rocm_smi_lib commit: 49734f8d34]
2020-08-20 10:44:28 -04:00
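The skip described in commit 3e5469164e amounts to ignoring the diagonal when walking all source/destination pairs. A minimal sketch, with a hypothetical generator name and no claim to match the library's loop:

```python
def device_pairs(num_devices: int):
    """Yield every ordered (src, dst) pair of device indices with src != dst."""
    for src in range(num_devices):
        for dst in range(num_devices):
            if src == dst:
                # A device has no link to itself, so skip this iteration.
                continue
            yield src, dst
```

For `n` devices this visits `n * (n - 1)` pairs instead of `n * n`, and avoids querying weight or hops for a device against itself.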
Harish Kasiviswanathan a659ff0a72 Update rsmi_process_info_t with sdma_usage field
Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Change-Id: Ie326e75674127a2e13f17fac344e2b672e877ce1


[ROCm/rocm_smi_lib commit: 9f5d4a698e]
2020-08-19 17:54:15 -04:00
Divya Shikre c4efd99208 Adding gpu reset functionality to rocm cli
Signed-off-by: Divya Shikre <DivyaUday.Shikre@amd.com>
Change-Id: Ifc0a239e8e8046fd7f56893d0101e0866cc3185f


[ROCm/rocm_smi_lib commit: 1276e4b9e9]
2020-08-19 13:37:47 -04:00
Chris Freehill 48d986f0c2 Clean up comments for rsmitst
Change-Id: Iea5322a5fd3bffe77557fa2cecbce70716e1258c


[ROCm/rocm_smi_lib commit: 7be97ec2aa]
2020-08-17 11:48:07 -05:00
Divya Shikre a7e1b13cef Adding Sdma Usage to showpids
Signed-off-by: Divya Shikre <DivyaUday.Shikre@amd.com>
Change-Id: I72a9e1adc61eba382f1ac17c8e50b2a8bd6d6898


[ROCm/rocm_smi_lib commit: 2e8dc4f2a9]
2020-08-14 12:12:34 -04:00
Divya Shikre 990202f033 Adding Hw Topology option to ROCm SMI Python CLI
Signed-off-by: Divya Shikre <DivyaUday.Shikre@amd.com>
Change-Id: Ic46334567703f705e38b3a8b4a08ab388c749251


[ROCm/rocm_smi_lib commit: 4032898d1b]
2020-08-13 18:51:21 -04:00
Ori Messinger e5b1ba0f10 ROCm SMI Python CLI: properly cast pid to int
The purpose of this patch is to fix --showpids and --showpidgpus functionality.
When pid is passed into a LIB function, it must be cast to int first.

Change-Id: I5cb7ac41052abeefff0dedf2384c4bb3c8d577a3
Signed-off-by: Ori Messinger <Ori.Messinger@amd.com>


[ROCm/rocm_smi_lib commit: b568270f55]
2020-08-13 04:34:08 -04:00
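The cast mentioned in commit e5b1ba0f10 matters because `ctypes` integer types reject strings. The sketch below assumes the PID arrives as text (for example from the command line) and that the library expects a 32-bit unsigned value; the helper name `to_c_pid` is hypothetical.

```python
import ctypes


def to_c_pid(pid) -> ctypes.c_uint32:
    """Convert a PID given as str or int into a ctypes 32-bit unsigned value."""
    # int() comes first: ctypes.c_uint32("1234") raises TypeError.
    return ctypes.c_uint32(int(pid))


print(to_c_pid("1234").value)
```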
Chris Freehill 43b908abd7 Move README back to root
README should be at root to display in github main page.
Also, removed paragraph related to API changes early
in development.

Change-Id: I2e92573a31d3caa7790364de9356c6d7e7be553d


[ROCm/rocm_smi_lib commit: da64e284dc]
2020-08-06 09:27:48 -05:00
Chris Freehill fc4d433877 Correct event counter documentation example
Change-Id: I74c41de8e4aacbd42d9e156983369eb76bec3367


[ROCm/rocm_smi_lib commit: 0468aa4971]
2020-08-06 08:49:21 -05:00
Ori Messinger 217b9b2aea ROCm SMI Python CLI
This tool acts as a command line interface for manipulating
and monitoring the Radeon Open Compute Kernel, similar to the
rocm_smi.py python tool.

The purpose of this commit is for the initial upload and cleanup
of the (incomplete) rocmSmiLib_cli.py and rsmiBindings.py files.

In the near future, this tool should have full feature parity with
rocm_smi.py by relying on the available rocm_smi_lib functions.

Change-Id: Ifbafd5118c15c68c240e3c83a47d2690a27c9353
Signed-off-by: Ori Messinger <Ori.Messinger@amd.com>


[ROCm/rocm_smi_lib commit: 2b909252ac]
2020-08-05 12:38:11 -04:00
Chris Freehill 0fe5175ed1 Replace "." in pkg name with "-"
Package name should have a hyphen (not a period) between
NumCommitsSinceLastTag and ROCMIntegrationJobIdentifier.

Fixes SWDEV-245838

Change-Id: I28c4337af6f92ac51a4aed03a09af23b92bd89b5


[ROCm/rocm_smi_lib commit: 92c258c364]
2020-07-27 20:54:52 -04:00
Chris Freehill b662e7ce51 Correct usage of bitwise &
Also, fix warning related to catch() and cpplint error.

Change-Id: I4292170538d0f700fccb605814c5058543abe74a


[ROCm/rocm_smi_lib commit: c2439d28e8]
2020-07-26 20:08:24 -05:00
Ashutosh Mishra 4371cc7afd Adding "BUILD_SHARED_LIBS" flag to cmake files
JIRA : SWDEV-234471
Change cmake to build shared or archive (static) libraries dynamically, depending on the parameter passed to cmake

Adapted comments.

Change-Id: Ice5925719b8c307c32310b252f61cbc211d1af27


[ROCm/rocm_smi_lib commit: d325613220]
2020-07-16 22:32:55 -04:00
Chris Freehill 5c2ac56166 Update xgmi event counter documentation
Also:
* fix doxygen manual generation that was altered during
  OAM refactor
* quiet some compile warnings.

Change-Id: I548a3cf00eb887bea3dbf58e362ca6dfe90bde28


[ROCm/rocm_smi_lib commit: 52514835f0]
2020-07-16 17:42:56 -05:00
Mukul Joshi fd17bdb90f Fix compiler warning in TestPciReadWrite
Use an unsigned number for the left shift operation. If it is not
specified as unsigned, the compiler warns about a left shift of a
negative number.

Change-Id: I05948073b0c40700bee69399b08df6031fc49d70


[ROCm/rocm_smi_lib commit: 9d24fc9175]
2020-07-13 17:32:17 -04:00
Mukul Joshi fdda24038f Add support to retrieve process SDMA usage information.
Also, print SDMA usage information in TestProcInfoRead.

Change-Id: I8d19be3b8653e298c81237e5067eca75a1743e70


[ROCm/rocm_smi_lib commit: eea1ed8c3d]
2020-07-13 17:32:08 -04:00