830 Commits

Author SHA1 Message Date
jingyu1l ff020dae44 Merge amd-staging into amd-master 20240622
Signed-off-by: jingyu1l <Jingyu1.Li@amd.com>
Change-Id: I7d1c62c8e61c5e43200efd4b5abd7f48e8182e65


[ROCm/rocm_smi_lib commit: 5463955787]
2024-06-27 14:37:24 +08:00
Bill(Shuzhou) Liu 0a11e23c08 Change error message for concise json/csv
The message now reports "not supported" instead of an error.

Change-Id: I28bd1e009770674389534be12519cc34673ba846


[ROCm/rocm_smi_lib commit: 57e8e72b79]
2024-06-21 16:16:36 -04:00
jingyu1l e1135540da Merge amd-staging into amd-master 20240619
Signed-off-by: jingyu1l <Jingyu1.Li@amd.com>
Change-Id: Ie1bdecfc8415cd8ca242a7f2de14ac4824015b09


[ROCm/rocm_smi_lib commit: e9cc9b8e81]
2024-06-21 11:47:01 +08:00
Ranjith Ramakrishnan 0e2b7a7623 SWDEV-468081 - Remove package provides field from RPM and DEB package
The provides tag is required only when the package provides a virtual package.
The package name and version are provided by default, so the provides tag is not required for that.
Using the tag to provide the name without a version was causing package upgrade issues.

Change-Id: I74506d8c3bbd75d028bcdc03525c29541dce2b4c


[ROCm/rocm_smi_lib commit: d54bade574]
2024-06-18 18:27:53 -04:00
Galantsev, Dmitrii 4c21eea71d Fix assignment of member dismiss_
This patch fixes error:
    error: assignment of member 'dismiss_' in read-only object

Reported by kind Gentoo folks:
<https://bugs.gentoo.org/918709>

Change-Id: I7cc593043e97402afd85593c528ace86952b1350
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>


[ROCm/rocm_smi_lib commit: 12c8237705]
2024-06-17 16:56:46 -04:00
Sam Wu 6d4aa1f2e6 [ROCDOC-593] Update Read the Docs configuration to Python 3.10 and latest rocm-docs-core
Change-Id: Ia086cd708f5bfcff71780cc104afe1e0908923c9


[ROCm/rocm_smi_lib commit: 2d7d7c449a]
2024-06-12 15:06:24 -06:00
guanyu12 c76332a9c6 Merge amd-staging into amd-master 20240606
Signed-off-by: guanyu12 <guanyu12@amd.com>
Change-Id: I06c0f47701f580cd7440dc9fb5d394fad97d06aa


[ROCm/rocm_smi_lib commit: e7e8b59cba]
2024-06-06 16:48:50 +08:00
Roopa Malavally 5ba90607ee ROCm SMI Documentation Reorg
Change-Id: I3e4db2c50a43a51eeea4d3e06ba4811ad1859368
Signed-off-by: Charis Poag <Charis.Poag@amd.com>


[ROCm/rocm_smi_lib commit: 2fd36e33ad]
2024-05-31 16:25:35 -05:00
Galantsev, Dmitrii 0d40412479 SWDEV-464886 - Fix ASAN REGEX error in cmake
Change-Id: Iaa5deed3ac833ebf1a010b98cfd4493359653ffe
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>


[ROCm/rocm_smi_lib commit: 10f3c2325c]
2024-05-30 16:42:00 -05:00
Galantsev, Dmitrii bd1b14c176 SWDEV-464886 - Fix REGEX error in cmake
Simplify rocm-core dependency handling

Change-Id: I07de1d40e4a3c90481c2de3abe9aac3dbfdd6d93
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>


[ROCm/rocm_smi_lib commit: 2096c8225c]
2024-05-30 14:54:44 -05:00
guanyu12 dc9b10151f Merge amd-staging into amd-master 20240530
Signed-off-by: guanyu12 <guanyu12@amd.com>
Change-Id: Ia5eb36449a72cbfd579942b62d1168dc7598ef65


[ROCm/rocm_smi_lib commit: 96b2ccde76]
2024-05-30 11:34:52 +08:00
Joseph Macaranas a1aa625059 Fix path typo
Change-Id: If6d539447f29bc5ac0449b7c8a717ae31c9f4bf0
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>


[ROCm/rocm_smi_lib commit: 69c74d696b]
2024-05-27 11:17:02 -05:00
guanyu12 b0a7661e6d Merge amd-staging into amd-master 20240522
Signed-off-by: guanyu12 <guanyu12@amd.com>
Change-Id: Iefd6b0b0161521b0108ac9ce3d09c8e527c6e081


[ROCm/rocm_smi_lib commit: cbf0890bf9]
2024-05-22 11:12:27 +08:00
Ranjith Ramakrishnan 10a438406a SWDEV-442738 - Static package generation for rocm_smi_lib
Package name will have suffix static-dev/devel

Change-Id: Ia273a66c663c56b023f6d765d024b30f1c35639d


[ROCm/rocm_smi_lib commit: 9f7e69bd5e]
2024-05-21 13:31:00 -04:00
Galantsev, Dmitrii 11b6c6374d Azure - Add rocm-ci.yml
Change-Id: If5db660729c732d96eb66897f0339850db98bb6b
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>


[ROCm/rocm_smi_lib commit: 0c3c3442e0]
2024-05-15 12:54:10 -05:00
Maisam Arif 16d60e6dd3 Merge amd-staging into amd-master 20240515
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: I6f0514e16bc48f58d8ec2a1bf134370f82057ef2


[ROCm/rocm_smi_lib commit: c464a77576]
2024-05-15 03:36:28 -05:00
Oliveira, Daniel 0f2074d85c fix: [MIT-License] [rocm/rocm_smi_lib]
Updates the license to MIT

Code changes related to the following: None

Change-Id: I62d0a5f02a2d5e58c1952337dff54892793c16cf
Signed-off-by: Oliveira, Daniel <daniel.oliveira@amd.com>


[ROCm/rocm_smi_lib commit: e7d54946fb]
2024-05-15 01:38:36 -04:00
Oliveira, Daniel 05596bc060 fix: [SWDEV-461904] [rocm/rocm_smi_lib]
Checks returned error by rsmi_dev_od_volt_info_get() before assert

Code changes related to the following:
  * Unit tests

Change-Id: Icc0f329e35992aae19f07243024521181467bcd3
Signed-off-by: Oliveira, Daniel <daniel.oliveira@amd.com>


[ROCm/rocm_smi_lib commit: 497ef4a7ef]
2024-05-14 18:25:00 -05:00
Bill(Shuzhou) Liu 33bbb8efde Discover the amdgpu when card numbers are not consecutive.
When discovering amdgpu devices, not all GPUs were found if the
assigned card numbers were not consecutive. The code is changed to
discover GPUs based on the maximum card number.

Change-Id: I8b6a8b49594d6a54c7feb2645bedb83dc5c1b4cc


[ROCm/rocm_smi_lib commit: 8c44416410]
2024-05-08 13:59:16 -05:00
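
The discovery change described in the commit above can be illustrated with a small sketch. This is not the library's code: the function names, the `cardN` entry naming, and the sample entries are assumptions made for illustration. It contrasts a scan that stops at the first missing index with one that probes every index up to the maximum card number.

```python
import re

# Hypothetical pattern for DRM entries such as "card0"; connector entries
# like "card2-DP-1" and render nodes are intentionally not matched.
CARD_RE = re.compile(r"card(\d+)")


def present_indices(entries):
    """Collect the numeric index of every entry named exactly cardN."""
    found = set()
    for name in entries:
        match = CARD_RE.fullmatch(name)
        if match:
            found.add(int(match.group(1)))
    return found


def discover_stop_at_gap(entries):
    """Naive scan: walk 0, 1, 2, ... and stop at the first missing index."""
    found = present_indices(entries)
    cards = []
    index = 0
    while index in found:
        cards.append(index)
        index += 1
    return cards


def discover_up_to_max(entries):
    """Scan every index up to the maximum card number, skipping gaps."""
    found = present_indices(entries)
    if not found:
        return []
    return [i for i in range(max(found) + 1) if i in found]


sample = ["card0", "card2", "card2-DP-1", "renderD128", "card5"]
print(discover_stop_at_gap(sample))  # [0]
print(discover_up_to_max(sample))    # [0, 2, 5]
```

With non-consecutive numbering, the naive scan misses cards 2 and 5, while the max-based scan finds all three.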
Maisam Arif cbe3fe6383 Merge amd-staging into amd-master 20240508
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: I8ce1757eb0666c7c3242556e20c0fd5de22d740b


[ROCm/rocm_smi_lib commit: 217827e2c1]
2024-05-08 00:29:23 -05:00
Maisam Arif d68a3ffc8f Bump version lib:7.2.0 tool:2.2.0+hash
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: I07138dad67d796fb8c2dd418a384f663dd8532c0


[ROCm/rocm_smi_lib commit: 9c16cc8baf]
2024-05-07 21:04:29 -05:00
Oliveira, Daniel af02873dfb fix: [SWDEV-458862] [rocm/rocm_smi_lib]
Fixes reading pp_od_clk_voltage new variable format and size.

Code changes related to the following:
  * get_od_clk_volt_info()
  * get_od_clk_volt_curve_regions()
  * Unit tests
  * CLI options restored: --showclkvolt, --showvc, --showvoltagerange, --setvc
    * Rework: 162d1d24
  * Bump CLI version
  * CHANGELOG.md

Change-Id: I817ca224de923fdaa992df84592d63b4d5a12b22
Signed-off-by: Oliveira, Daniel <daniel.oliveira@amd.com>


[ROCm/rocm_smi_lib commit: 8e6d66e15b]
2024-05-07 20:47:26 -05:00
guanyu12 388257944b Merge amd-staging into amd-master 20240506
Signed-off-by: guanyu12 <guanyu12@amd.com>
Change-Id: Ifecbc41972411afaf0e7d7b9f07982114402a65a


[ROCm/rocm_smi_lib commit: 2938796bc2]
2024-05-06 14:47:47 +08:00
Ori Messinger 02a862bde1 ROCm SMI LIB: Fix rsmiBindings.py.in Mismatch
This commit aligns the rsmiBindings.py.in file's
"notification_type_names" & "rsmi_evt_notification_type_t" with
those found in the rsmiBindings.py file.

Change-Id: I67f36606c505992fb98495651310bd70a1755033
Signed-off-by: Ori Messinger <Ori.Messinger@amd.com>


[ROCm/rocm_smi_lib commit: 0c48cd9122]
2024-05-02 23:22:44 -05:00
Bill(Shuzhou) Liu aad878c1d5 Remove thread safe only mutex warning message
In multi-GPU environments too many warning messages were generated,
so the warning is removed.

Change-Id: I275de2397eb0e6b189e2e17e94335cb1e8f97815


[ROCm/rocm_smi_lib commit: 3d82f1799d]
2024-05-02 11:11:11 -05:00
Maisam Arif 5764275b71 Bump version lib:7.1.0 tool:2.1.0+hash
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: I6f3d7c64aacf36c9d33d663e23559a7f50cd8db6


[ROCm/rocm_smi_lib commit: c425848141]
2024-05-02 03:30:48 -04:00
Oliveira, Daniel 162d1d24a4 fix: [SWDEV-458862] [rocm/rocm_smi_lib]
Fixes reading pp_od_clk_voltage new variable format and size.

Code changes related to the following:
  * get_od_clk_volt_info()
  * get_od_clk_volt_curve_regions()
  * Unit tests
  * CLI options removed: --showclkvolt, --showvc, --showvoltagerange, --setvc

Change-Id: Ieedb845eeadcea2f2e447ec576c253ad2a814176
Signed-off-by: Oliveira, Daniel <daniel.oliveira@amd.com>


[ROCm/rocm_smi_lib commit: 48ddd9abd7]
2024-05-02 03:29:59 -04:00
Ori Messinger 38b048f5f9 ROCm SMI LIB: Add Ring Hang Event Enums
This patch adds 'ring hang' enums to ROCM SMI LIB.
This event type name is KFD_SMI_EVENT_RING_HANG.

Signed-off-by: Ori Messinger <Ori.Messinger@amd.com>
Change-Id: I9b886eb1fc027f03bcca1e5d1a89a2a186b64bf5


[ROCm/rocm_smi_lib commit: 3282aaa8de]
2024-05-01 17:02:52 -05:00
Galantsev, Dmitrii 62880844a6 Merge amd-staging into amd-master 20240423
Change-Id: I55800aeb88b62f85aae2715a9d314aa4530d8576
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>


[ROCm/rocm_smi_lib commit: 913d6fd173]
2024-04-23 10:53:41 -05:00
Bill(Shuzhou) Liu ada2cf681d Support thread only mutex
Set the environment variable RSMI_MUTEX_THREAD_ONLY=1 to enable the thread-only mutex.
The RSMI_INIT_FLAG_THRAD_ONLY_MUTEX flag can also be passed to rsmi_init()
to enable the thread-only mutex.

Change-Id: I2d9844039b774e386f03bb9bb130d8c342504ea6


[ROCm/rocm_smi_lib commit: 6ff95e55da]
2024-04-23 10:49:17 -05:00
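
As a minimal sketch of the environment-variable route from the commit above, the helper below builds a process environment with RSMI_MUTEX_THREAD_ONLY=1 set. The helper name `thread_only_env` is hypothetical, and the assumption that the variable must be in place before the library initializes is mine, not stated in the commit.

```python
import os


def thread_only_env(base=None):
    """Return a copy of the environment with the thread-only mutex enabled.

    `base` defaults to the current process environment; the input mapping
    is never modified.
    """
    env = dict(os.environ if base is None else base)
    env["RSMI_MUTEX_THREAD_ONLY"] = "1"
    return env


# Example: pass the result as `env=` to subprocess.run() when launching a
# tool that links against the library (not executed here).
print(thread_only_env({"PATH": "/usr/bin"}))
```

Copying the environment rather than mutating `os.environ` keeps the setting scoped to the child process that needs it.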
Oliveira, Daniel c10a216702 fix: [SWDEV-458101] [rocm/rocm_smi_lib]
Drops checks that are invalid with the new pp_od_clk_voltage format

Code changes related to the following:
  * get_od_clk_volt_info()
  * get_od_clk_volt_curve_regions()

Change-Id: I5ebe23aa0ed4ea77d5ab5a94ce34ad9b1b51281f
Signed-off-by: Oliveira, Daniel <daniel.oliveira@amd.com>


[ROCm/rocm_smi_lib commit: e95d80f7ef]
2024-04-23 00:16:53 -05:00
guanyu12 8c7d9e7f84 Merge amd-staging into amd-master 20240411
Signed-off-by: guanyu12 <guanyu12@amd.com>
Change-Id: I25ed71cca91a0d78110a995861cff93ba748e056


[ROCm/rocm_smi_lib commit: 6881fc9c2e]
2024-04-11 10:24:26 +08:00
Oliveira, Daniel 5ddf42fe4e fix: [SWDEV-450058] [rocm/rocm_smi_lib]
Fixes TestMeasureApiExecutionTime test failures

Code changes related to the following:
  * Unit tests

Change-Id: I6223078f219448deb6bfbd78edae371a5a4cf03c
Signed-off-by: Oliveira, Daniel <daniel.oliveira@amd.com>


[ROCm/rocm_smi_lib commit: adf5c1da67]
2024-04-09 16:20:14 -04:00
Charis Poag dfab9d3f4e [SWDEV-450463] Fix --showmemuse clarity
* Updates:
  - [CLI] Updated --showmemuse:
    -> Add VRAM%, provide better context as "GPU Allocated Memory (VRAM%)"
    -> Update "GPU memory use (%)" as
       "GPU Memory Read/Write Activity(%)"
  - [CLI] Updated --showmaxpower and rocm-smi (no arg)
    -> Rounding was inconsistent for values past the decimal point.
       This now provides the floor of the device's value

Change-Id: Ib76dea2cb8483a1d7f53df675b0a94d8d01c81b9
Signed-off-by: Charis Poag <Charis.Poag@amd.com>


[ROCm/rocm_smi_lib commit: b86f92230d]
2024-04-08 10:25:46 -04:00
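
The rounding note in the commit above can be illustrated with a short sketch. It does not reproduce the CLI's code; the helper name `floor_watts` and the sample values are assumptions. Python's built-in `round()` is used only as one example of rounding that can look inconsistent near .5, in contrast to always taking the floor.

```python
import math


def floor_watts(value):
    """Drop everything past the decimal point by taking the floor."""
    return math.floor(value)


# round() uses round-half-to-even, so neighbouring .5 values can move in
# different directions; floor always moves toward the lower integer.
for watts in (224.5, 225.5, 225.9):
    print(watts, round(watts), floor_watts(watts))
```

Here 224.5 rounds to 224 but 225.5 rounds to 226, whereas the floor gives 224, 225 and 225 respectively.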
WhiskyAKM eb582df502 Update rocm_smi.h
Fix for issue: https://github.com/ROCm/rocm_smi_lib/issues/162

Change-Id: I8778f5b8c034f2289625acb841676c144c967aa3
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>


[ROCm/rocm_smi_lib commit: 54af22ca61]
2024-04-05 11:16:07 -04:00
Junyi Hou e5075f260e Fix typos in rocm_smi.py, README.md, rsmiBindings.py
Change-Id: Ib03cec6130983a56657a388799fc2afaf3b8f728
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>


[ROCm/rocm_smi_lib commit: 9e2a6ea4bf]
2024-04-05 11:15:41 -04:00
Daniel Martinez 6477ccfc05 change CMAKE_HOST_SYSTEM_PROCESSOR to CMAKE_SYSTEM_PROCESSOR
Change-Id: I8e379676091903e2af3909e6d90daf6d62b8232c
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>


[ROCm/rocm_smi_lib commit: 38d1275d64]
2024-04-05 11:15:14 -04:00
Galantsev, Dmitrii c94401714f GIT - Sync dependabot settings with amdsmi
Change-Id: Id67a7f5273fd274291a1044dca50cc4006e853a5
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>


[ROCm/rocm_smi_lib commit: 65cf46dc76]
2024-04-04 17:00:50 -05:00
Charis Poag 4d1fab2ef6 Merge amd-staging into amd-master 20240401
Change-Id: I52c8665735e86deed53645197c11889fc7ece8c5
Signed-off-by: Charis Poag <Charis.Poag@amd.com>


[ROCm/rocm_smi_lib commit: 6fada8c4a6]
2024-04-01 17:48:06 -05:00
Charis Poag 0025ebafca Add ROCm 6.1.1 changelog, ROCm SMI deprecation, vbios fix
* Updates:
    - Add ROCm 6.1.1 Changelog updates
    - Add planned ROCm SMI deprecation notice
    - Fix rocm-smi --showvbios showing extra errors
      for GPUs which do not have a VBIOS (MI300a ASICs)

Change-Id: I0e5ccfe2677f9c7909ca13863a920e323e82b439
Signed-off-by: Charis Poag <Charis.Poag@amd.com>


[ROCm/rocm_smi_lib commit: f5c32b5415]
2024-03-30 00:11:09 -05:00
guanyu12 c48164d375 Merge amd-staging into amd-master 20240329
Signed-off-by: guanyu12 <guanyu12@amd.com>
Change-Id: Iea46d075f0ee45bb68469e87b377ce3519b39e2b


[ROCm/rocm_smi_lib commit: fe5648805f]
2024-03-29 10:26:04 +08:00
Bill(Shuzhou) Liu 2ddcade4e7 Unlock the mutex when process is dead
After the dead process is detected, pthread_mutex_consistent() will
be called. After that, the pthread_mutex_unlock() should also be
called to unlock it: "It is the responsibility of the application to
recover the state so it can be reused."

Change-Id: I45d3e2e68c3b06779f3acb1e908dbec0c6a39297


[ROCm/rocm_smi_lib commit: 750704720b]
2024-03-21 15:31:10 -05:00
guanyu12 47db0d99a0 Merge amd-staging into amd-master 20240321
Signed-off-by: guanyu12 <guanyu12@amd.com>
Change-Id: I006fc6c187f134a4851e262fa53ab6bf8d58759d


[ROCm/rocm_smi_lib commit: 8d4261c5c5]
2024-03-21 14:03:51 +08:00
Charis Poag e9608d8963 Update ROCm 6.0/6.1 CHANGELOG.md & README.md
* Updates:
    - [CHANGELOG.md] Provide 6.1 and 6.0 changes
    - [README.md] Update readme with relevant changes
    - [CLI] Updated --showpower to expand on types of power provided to users

Change-Id: Ic653cc81f80b7973654e2c23e1ab70567b930aa7
Signed-off-by: Charis Poag <Charis.Poag@amd.com>


[ROCm/rocm_smi_lib commit: c5acd4ee88]
2024-03-20 00:17:33 -05:00
guanyu12 31e14064e3 Merge amd-staging into amd-master 20240314
Signed-off-by: guanyu12 <guanyu12@amd.com>
Change-Id: I1d79ce09196cf101c2a885fd6be8f1094e8d5f9f


[ROCm/rocm_smi_lib commit: ab8ebd4dea]
2024-03-14 11:15:44 +08:00
Galantsev, Dmitrii 08aadb1135 Fix misc memory leaks
Change-Id: I3dbf56e98d8c1312f9081956ed590962b2bdace3
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>


[ROCm/rocm_smi_lib commit: 9a3a50f929]
2024-03-08 16:26:47 -06:00
Galantsev, Dmitrii 1dd7207f58 Fix memory leak created by hanging opendir
Change-Id: I01e372c6a6b427f21e89cb5e4217f876346a35be
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>


[ROCm/rocm_smi_lib commit: b60541ef42]
2024-03-08 16:26:47 -06:00
Galantsev, Dmitrii 1ae9383f70 Add .github/CONTRIBUTING.md
Change-Id: Ie20c720514666dec307a92ec05fe9c3b56ba9cc5
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>


[ROCm/rocm_smi_lib commit: 46ea462189]
2024-03-08 16:25:35 -06:00
Charis Poag 3cf895b67d [SWDEV-436308] Add Partition_ID from KFD
* Updates:
    - [CLI] rocm-smi (no arg) and --showhw:
      Now displays 'ID'/'PARTITION ID' from the pcie_id identifier
      Helps users identify which partition # the device is
      Information provided by KFD
      Note: a partition_id of 0 means a primary node (AKA root node),
      e.g. ASICs which do not have partitioning support will show 0
    - [API] Fix partition nodes which do not enumerate with domain:
            Adding KFD's domain allows ASICs which have domains
            to enumerate in proper order.
            Full pcie_id / bdf propagates to all partition nodes.
    - [API] Update rsmi_dev_pci_id_get() to allow users to extract
      partition_id from device
    - [CLI] Added fix for devices which have a modprobe failure
      where DRM does not come up properly, even though the driver
      shows that initialization was successful.
    - [API/Utils] Overloaded print_int_as_hex() template:
      Now accepts bitsize, and prints in smallest byte size
      possible. Note: for a bitsize of < 8, please just print as decimal.

Change-Id: Ib0c6f73b2b9c9fea29442a39a669c432874382d8
Signed-off-by: Charis Poag <Charis.Poag@amd.com>


[ROCm/rocm_smi_lib commit: c2035fa1b9]
2024-03-08 10:51:15 -05:00
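
The print_int_as_hex() behaviour described in the commit above (accept a bit size and print in the smallest byte size possible) might look roughly like the sketch below. This is not the library's C++ template: the function name `int_as_hex`, the zero-padding, and the masking are assumptions made for illustration.

```python
def int_as_hex(value, bitsize):
    """Format `value` as hex, padded to the fewest bytes that hold `bitsize` bits.

    Following the commit note, values narrower than 8 bits are printed
    as plain decimal instead.
    """
    if bitsize < 8:
        return str(value)
    nbytes = (bitsize + 7) // 8          # round the bit size up to whole bytes
    mask = (1 << (nbytes * 8)) - 1       # keep only the bits that fit
    width = nbytes * 2                   # two hex digits per byte
    return f"0x{value & mask:0{width}x}"


print(int_as_hex(0x2F, 8))    # 0x2f
print(int_as_hex(0x2F, 16))   # 0x002f
print(int_as_hex(3, 4))       # 3
```

Padding to whole bytes keeps the output width predictable for a given field size, which is the motivation assumed here.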
guanyu12 c01e17965d Merge amd-staging into amd-master 20240308
Signed-off-by: guanyu12 <guanyu12@amd.com>
Change-Id: I2edf51a9b8f93589bf6eadee7b2691629c433977


[ROCm/rocm_smi_lib commit: 4d1ea826e1]
2024-03-08 16:22:17 +08:00