Commit graph

798 commits

Author SHA1 Message Date
Adam McDaniel b3caa4972c Exposed the energy count and current socket power metrics to profilers like PAPI
[ROCm/rocm_smi_lib commit: f688c9938d]
2025-07-02 09:47:17 -05:00
Galantsev, Dmitrii ff02dc85da Revert "rsmi_init: Do not complain loudly when no driver is found"
This reverts commit 42dc44f54d.


[ROCm/rocm_smi_lib commit: 731be3f743]
2025-07-02 09:45:41 -05:00
Samuel Thibault 42dc44f54d rsmi_init: Do not complain loudly when no driver is found
When librocm-smi is pulled in through a dependency, we may end up on a system
without actual hardware supported by ROCm, where rsmi_init() failing is
actually expected; we do not want to frighten the user in such a case.


[ROCm/rocm_smi_lib commit: 8ca4207d5c]
2025-06-19 13:30:51 -05:00
Ranjith Ramakrishnan e8477f460f SWDEV-534264 - Add liboam.a to static rocm-smi package
liboam.a was missing from the static rocm-smi package, resulting in compilation errors in applications that use rocm-smi


[ROCm/rocm_smi_lib commit: 59468e3f78]
2025-06-13 12:09:41 -05:00
Arif, Maisam 9002bcc5a8 Revert "SWDEV-534264 - Add liboam.a to static package"
This reverts commit ae9bcb11e1.


[ROCm/rocm_smi_lib commit: 5cc6c1ca1c]
2025-06-12 16:26:19 -05:00
Ranjith Ramakrishnan ae9bcb11e1 SWDEV-534264 - Add liboam.a to static package
liboam.a was missing from the static package. The library was being created but not packaged.
Fixed this.


[ROCm/rocm_smi_lib commit: ff7561607e]
2025-06-11 13:45:21 -05:00
Charis Poag b45713faf5 [SWDEV-530035] Fix tests run with partitioned configurations (CPX, DPX, QPX, etc.)
Changes:
  - Updates to APIs to handle null pointers or RSMI_STATUS_NOT_SUPPORTED
  - Fixes to tests to handle partitioned configurations correctly
  - Synced with latest AMD SMI API changes
Change-Id: I7a932f9336ef29ccb01d3b15e2101f6136b45720


[ROCm/rocm_smi_lib commit: 12b78439d2]
2025-06-06 16:39:29 -05:00
Peter Park 5a3556ca85 update copyright years to 2025
revert shared_mutex.h


[ROCm/rocm_smi_lib commit: a156bfa4ae]
2025-06-03 17:16:54 -05:00
Peter Park 148400af45 update license year to 2025
[ROCm/rocm_smi_lib commit: b0831d79cf]
2025-06-03 17:16:54 -05:00
Charis Poag 5e31509711 Removed backwards compatibility for jpeg_activity/vcn_activity
Updated:
- Removed backwards compatibility for jpeg_activity/vcn_activity
- On supported ASICs users can use XCP (partition) stat values:
  jpeg_busy and vcn_busy

Change-Id: I78c403f8462668738ec57cac12b107f6a3989b18
Signed-off-by: Charis Poag <Charis.Poag@amd.com>


[ROCm/rocm_smi_lib commit: 1c6b2adae7]
2025-05-29 13:47:56 -05:00
Stella Laurenzo a768477da4 [PATCH] Miscellaneous CMake fixes.
Change-Id: Ibca31745d2e9375523193310bc1ca5994c87aa32


[ROCm/rocm_smi_lib commit: 92db324944]
2025-05-27 12:12:42 -05:00
Afzal Patel 29602fec52 add interface drm include directory


[ROCm/rocm_smi_lib commit: f3c6e80fab]
2025-05-27 12:06:56 -05:00
Pham, Gabriel 87c455684d [SWDEV-533221] Synced rocm-smi with amd-smi lib to fix warning messages (#71)
* Removed URL that was on prohibited source list

---------

Signed-off-by: Gabriel Pham <Gabriel.Pham@amd.com.>


[ROCm/rocm_smi_lib commit: 4243e42758]
2025-05-26 10:08:16 -05:00
Castillo, Juan eaa2000af5 [SWDEV-523359] fan_read_write: Add set fan speed validation check. (#61)
[SWDEV-523359] fan_read_write: Add set fan speed validation check.
- Handled NOT_SUPPORTED status, which previously caused rsmitst to fail falsely
- Added continue statement to proceed with the rest of the FanReadWrite test.
- Fixed spacing on line 140

Signed-off-by: Juan Castillo <juan.castillo@amd.com>

[ROCm/rocm_smi_lib commit: ac31c6e576]
2025-05-26 09:54:41 -05:00
Ramakrishnan, Ranjith 623a6452e5 SWDEV-532478 - Add rocm-core dependency to RPM packages (#67)
The rocm-core dependency was missing from the RPM packages; this is now fixed.

[ROCm/rocm_smi_lib commit: d32ff28ebf]
2025-05-20 14:41:38 -07:00
Maisam Arif ffb52bf65e Updated Maintenance mode notice
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>


[ROCm/rocm_smi_lib commit: 3f11537401]
2025-05-20 01:44:57 -05:00
Arif, Maisam a2443a5efe Revert "Correct the dependencies of rocm_smi package."
This reverts commit 93ff9b3547.


[ROCm/rocm_smi_lib commit: 40671be7c9]
2025-05-18 10:34:28 -05:00
Arif, Maisam a9ddb0dcea Revert "Use devel in RPM package requires field"
This reverts commit 22777b75b5.


[ROCm/rocm_smi_lib commit: a0ad4a3fcd]
2025-05-18 10:34:28 -05:00
Arif, Maisam afc87ee2f8 Revert "Use the alphabetic 'or' in the CPack RPM package Requires field."
This reverts commit edd6134d16.


[ROCm/rocm_smi_lib commit: c7797b19a3]
2025-05-18 10:32:12 -05:00
Ranjith Ramakrishnan edd6134d16 Use the alphabetic 'or' in the CPack RPM package Requires field.
Using the OR symbol "|" causes an error in CPack


[ROCm/rocm_smi_lib commit: 4d201ad2c2]
2025-05-15 15:28:26 -07:00
Castillo, Juan 08062c0577 Fix WARNING: AMD GPUs visible, but data is inaccessible (#58)
* [SWDEV-531834] Fix AMD GPUs visible, but data is inaccessible:
- Scans directories under /sys/bus/pci/drivers/amdgpu
- Verifies each device's runtime_status to determine if it's active
- Returns False if any device is not in active state
- Handles permission errors gracefully with proper debug logging
- Includes comments explaining behavior differences between Instinct / NAVI hardware

The default status is set to True, assuming devices are active unless
proven otherwise, which accommodates hardware such as some Instinct ASICs
that do not support runtime power management.
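
The check described above can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: the function signature, the `driver_dir` parameter, and the assumption that each device exposes its state in a `power/runtime_status` file are chosen here for the example.

```python
import logging
import os


def check_runtime_status(driver_dir="/sys/bus/pci/drivers/amdgpu"):
    """Return False if any device under driver_dir reports a non-active state.

    Hypothetical sketch: the real implementation may differ in layout and naming.
    """
    all_active = True  # assume devices are active unless proven otherwise
    try:
        entries = sorted(os.listdir(driver_dir))
    except OSError as err:
        logging.debug("cannot list %s: %s", driver_dir, err)
        return all_active

    for entry in entries:
        # Assumed location of the runtime power-management state for a device.
        status_path = os.path.join(driver_dir, entry, "power", "runtime_status")
        try:
            with open(status_path) as fh:
                status = fh.read().strip()
        except OSError as err:
            # Not a device entry, no runtime PM file, or permission denied:
            # log at debug level and keep the default.
            logging.debug("skipping %s: %s", status_path, err)
            continue
        if status != "active":
            all_active = False
    return all_active
```

Entries that cannot be read are skipped rather than treated as inactive, which mirrors the stated default of True for hardware without runtime power management.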

Signed-off-by: Juan Castillo <juan.castillo@amd.com>

[ROCm/rocm_smi_lib commit: 47f80145cb]
2025-05-15 14:30:33 -05:00
Ranjith Ramakrishnan 22777b75b5 Use devel in RPM package requires field
[ROCm/rocm_smi_lib commit: cea68cd940]
2025-05-13 18:03:14 -05:00
Ranjith Ramakrishnan 93ff9b3547 Correct the dependencies of rocm_smi package.
Added libdrm/libdrm_amdgpu to the package requires/depends list and removed the same from suggests list.
The rocm smi header files are using drm.h


[ROCm/rocm_smi_lib commit: 6d53d9f9cf]
2025-05-13 18:03:14 -05:00
Ranjith Ramakrishnan e4f50f06cb SWDEV-531400 - Remove file reorganization backward compatibility code
The backward compatibility is already disabled and no longer required


[ROCm/rocm_smi_lib commit: 532f4be8be]
2025-05-13 17:54:20 -05:00
Choudhary, Rahul 2256d67503 Update palamida.yml (#49) - removing url
[ROCm/rocm_smi_lib commit: 9a5dec1c26]
2025-05-12 21:49:08 -07:00
Galantsev, Dmitrii 142d42392e CMAKE - Fix libc6 RPM dep issue
Change-Id: Id68e8a268f1b7e8561b9c9f741ae6e10b4ad7d8a
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>


[ROCm/rocm_smi_lib commit: 6d33c08346]
2025-05-08 10:56:38 -05:00
Galantsev, Dmitrii e0b9bf1dcb CMAKE - Fix lintian issues
Change-Id: Ie0099a27986eec017ea1e554c15dc06e6bd35c76
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>


[ROCm/rocm_smi_lib commit: a355bb6664]
2025-05-07 00:35:09 -05:00
Hila, Nino 23fdad4ea9 Add palamida.yml
[ROCm/rocm_smi_lib commit: 59c5213367]
2025-05-05 16:27:51 -04:00
Choudhary, Rahul 6dd08920ff Add palamida.yml (#47)
[ROCm/rocm_smi_lib commit: 2b8e118028]
2025-04-22 15:45:26 -07:00
Poag, Charis 25e0678501 [SWDEV-528097] Unique ID fix for missing ID in KGD -> use KFD's (#44)
Changes:
 - Unique Id tries reading from KGD
   -> falls back to use KFD if not found
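
A rough sketch of this fallback, for illustration only. The helper names, the file paths passed in, and the assumed formats (a hexadecimal string from KGD, a decimal `unique_id` entry in a KFD `properties` file) are assumptions of this example, not details taken from the commit.

```python
def read_kgd_unique_id(path):
    """Return the KGD unique ID as an int, or None if missing or empty.

    Assumes the file holds a single hexadecimal string.
    """
    try:
        with open(path) as fh:
            text = fh.read().strip()
        return int(text, 16) if text else None
    except (OSError, ValueError):
        return None


def read_kfd_unique_id(properties_path):
    """Return the KFD unique ID as an int, or None if not found.

    Assumes a "name value" line format with a decimal unique_id entry.
    """
    try:
        with open(properties_path) as fh:
            for line in fh:
                parts = line.split()
                if len(parts) == 2 and parts[0] == "unique_id":
                    return int(parts[1])
    except (OSError, ValueError):
        return None
    return None


def get_unique_id(kgd_path, kfd_properties_path):
    """Try KGD first; fall back to KFD only when KGD yields nothing."""
    unique_id = read_kgd_unique_id(kgd_path)
    if unique_id is None:
        unique_id = read_kfd_unique_id(kfd_properties_path)
    return unique_id
```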

Change-Id: I8fb8f38df5db7413805f4a20621ad12ed3fc89a3
Signed-off-by: Charis Poag <Charis.Poag@amd.com>

[ROCm/rocm_smi_lib commit: 4276207ff8]
2025-04-22 16:37:23 -05:00
Hila, Nino 4f0e809bf9 Add palamida.yml
[ROCm/rocm_smi_lib commit: 461e5dbc11]
2025-04-22 12:01:04 -04:00
Poag, Charis efb37d89bc [SWDEV-522992] Make libdrm / libdrm_amdgpu load dynamically (#43)
Changes:
- Now load libdrm/libdrm_amdgpu dynamically
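
To illustrate the general idea of run-time loading (as opposed to a link-time dependency), here is a small Python sketch using `ctypes`. It is only an analogy: the library itself would use its own loading mechanism, and `load_optional_library` is a hypothetical helper named for this example.

```python
import ctypes
import ctypes.util


def load_optional_library(name):
    """Try to load a shared library at run time; return None if unavailable."""
    path = ctypes.util.find_library(name)
    if path is None:
        return None
    try:
        return ctypes.CDLL(path)
    except OSError:
        return None


# Callers check for None and degrade gracefully when a library is absent.
libdrm = load_optional_library("drm")
libdrm_amdgpu = load_optional_library("drm_amdgpu")
```

With this pattern, a missing libdrm no longer prevents the program from starting; only the features that need it become unavailable.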

Change-Id: I49fb1f3540b3235a25370f7cfcfb9778db34c2a5
Signed-off-by: Charis Poag <Charis.Poag@amd.com>

[ROCm/rocm_smi_lib commit: ce405476ca]
2025-04-16 16:03:42 -05:00
Charis Poag aacf23778d [SWDEV-518325/SWDEV-518320/SWDEV-443309] Fix Partition Enumeration
* Changes:
  - Updates to DRM renderD* / card* pathing for partition devices
  - Now use KFD to discover AMD devices and populate accordingly
    Device MUST have an accessible KFD node (via cgroups)
  - Updated several ROCm SMI CLI outputs to handle SYSFS files
    which are not accessible on partition nodes
  - Added a new method to help get card/drm info
    (rsmi_dev_device_identifiers_get) from ROCm SMI

Change-Id: If844f27ffc595942272abe9c8167ed90a0b0e225
Signed-off-by: Charis Poag <Charis.Poag@amd.com>


[ROCm/rocm_smi_lib commit: a0df877fdf]
2025-04-14 16:03:24 -05:00
Castillo, Juan 07c06318ad [SWDEV-516013]-rocm-smi runtime status check fix (#28)
rocm-smi was not working in mGPU setups, which blocked DLM tests.
Updates include:
 - Created a check_runtime_status function to check whether a device's status is active.
 - Added a warning to users when no AMD GPUs are available, advising them to check power status/control.
 - Added a check for an empty string coming from HWMON; if empty, returns unexpected data.

---------

Signed-off-by: Juan Castillo <juan.castillo@amd.com>

[ROCm/rocm_smi_lib commit: 2630bf0a8c]
2025-04-14 13:05:22 -05:00
Mallya, Ameya Keshava 4d8e9cfa1d Added !verify features
[ROCm/rocm_smi_lib commit: d18e40f204]
2025-04-08 07:59:30 -07:00
Arif, Maisam d7e8c02428 [SWDEV-524528] Nullptr check correction for TestErrCntRead (#38)
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>

[ROCm/rocm_smi_lib commit: 91f30c5756]
2025-03-28 12:14:28 -05:00
Mallya, Ameya Keshava 76e4f02aaf Added KWS check for amd-mainline
[ROCm/rocm_smi_lib commit: fb8d0c256b]
2025-03-28 08:10:45 -07:00
Arif, Maisam 5e3b10bf8c [SWDEV-524147] Patch for handling new ras filenames (#34)
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>

[ROCm/rocm_smi_lib commit: 847ac53444]
2025-03-27 22:14:28 -05:00
Su, Daniel 2ae703fed6 External CI: enable trigger for amd-mainline (#30)
[ROCm/rocm_smi_lib commit: 172707cbd3]
2025-03-26 08:24:51 -05:00
Castillo, Juan 3aa80ec0e4 SWDEV-518214: GPU Metrics 1.8 (#31)
* SWDEV-518214: GPU Metrics 1.8 (#31)

- Updates:
    - Adding the following metrics to allow new calculations for violation status:
        - Per XCP metrics gfx_below_host_limit_ppt_acc
        - Per XCP metrics gfx_below_host_limit_thm_acc
        - Per XCP metrics gfx_low_utilization_acc
        - Per XCP metrics gfx_below_host_limit_total_acc
    - Increasing available JPEG engines to 40. Current ASICs may not support all 40. These will be indicated as UINT16_MAX or N/A in CLI.
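
As a small illustration of the sentinel convention mentioned above, the following sketch maps `UINT16_MAX` readings to `N/A` for display. The function name and the sample readings are made up for this example and are not taken from the CLI source.

```python
UINT16_MAX = 0xFFFF       # sentinel for "engine not supported on this ASIC"
NUM_JPEG_ENGINES = 40     # engine slots available after this change


def format_jpeg_activity(values):
    """Render per-engine readings, showing N/A for unsupported engines."""
    return ["N/A" if value == UINT16_MAX else str(value) for value in values]


# Hypothetical readings: two engines report data, the rest are unsupported.
readings = [12, 0] + [UINT16_MAX] * (NUM_JPEG_ENGINES - 2)
formatted = format_jpeg_activity(readings)
```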

Signed-off-by: Juan Castillo <juan.castillo@amd.com>
Co-authored-by: Charis Poag <Charis.Poag@amd.com>

[ROCm/rocm_smi_lib commit: f69e65f7bd]
2025-03-20 18:07:32 -05:00
Zhang, Ava 6075f89576 Merge branch 'amd-mainline' into amd-staging
[ROCm/rocm_smi_lib commit: 7327e645c6]
2025-03-17 08:58:09 +08:00
Mallya, Ameya Keshava 19a9ff813d Added release trigger for further releases
[ROCm/rocm_smi_lib commit: 8fe9882c49]
2025-03-14 14:06:21 -07:00
Arif, Maisam 9ae7a0b7b1 [SWDEV-517717] Maintenance Mode Notice (#20)
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>

[ROCm/rocm_smi_lib commit: 1416c2043d]
2025-03-09 14:23:33 -05:00
Kanangot Balakrishnan, Bindhiya 165cf24119 SWDEV-510419: Restore compute partition after memory partition test (#15)
The memory partition test was changing the original compute partition based
on the default compute mode. Corrected this to restore the original
compute partition.

Signed-off-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>

[ROCm/rocm_smi_lib commit: d8de415960]
2025-03-09 14:23:33 -05:00
Charis Poag 5fc11e8325 [SWDEV-514998/SWDEV-511662] Fix tests for Guest and BM with static CPX config
Guest: Tests needed to account for not supporting changing compute
partitions.

BM: Tests need to account for invalid responses from Driver (due to
static CPX config).


[ROCm/rocm_smi_lib commit: 967493c39a]
2025-03-09 14:23:22 -05:00
Galantsev, Dmitrii 00ff814afc [SWDEV-508785] Bump version number to 7.6.0
Change-Id: I084f139802f73311f15c68f94bc98f631c7f2bd8
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>


[ROCm/rocm_smi_lib commit: 9c82706fc1]
2025-03-09 14:23:22 -05:00
Charis Poag 8e841f22ac [SWDEV-504146] Fix Device Name
Changes:
  - Fixed Device Name (market name)
  - Added new API rsmi_dev_market_name_get()
  - Updated tests
  - Updated amdgpu_drm.h to match latest mainline kernel
  - Fixed subsystem ID to only show hex value (not subsystem name)
  - rocm_smi_lib now has a recommended requirement for libdrm
Change-Id: Ic438529e16c8c3dbbdd620da664918148c40c997


[ROCm/rocm_smi_lib commit: b951a65cf2]
2025-03-09 14:23:22 -05:00
Galantsev, Dmitrii 83f16ffa06 Fix warnings on CXX/linker flags (#12)
1) When `clang` is used as system compiler, libraries were built without respecting LDFLAGS. For example, this affected LTO flags, if any (and it only affected clang, not gcc).

2) Linker flags are registered as CXX flags, which produces warnings during compilation:
```
clang++: warning: -Wl,-z,noexecstack: 'linker' input unused [-Wunused-command-line-argument]
clang++: warning: -Wl,-znoexecheap: 'linker' input unused [-Wunused-command-line-argument]
clang++: warning: -Wl,-z,relro: 'linker' input unused [-Wunused-command-line-argument]
clang++: warning: -Wl,-z,now: 'linker' input unused [-Wunused-command-line-argument]
```

3) Clang does not support `-Wtrampolines` flag:
```
warning: unknown warning option '-Wtrampolines' [-Wunknown-warning-option]
```

4) No linkers support `noexecheap` anymore. The `noexecheap` linker flag was part of the PaX patches to GNU ld, [which were dropped in 2017](https://www.gentoo.org/support/news-items/2017-08-19-hardened-sources-removal.html). Now ld/ld.lld/ld.gold don't support it, and protection of the heap is managed by the NX bit. Therefore every compiler produces this warning:
```
ld.lld: warning: unknown -z value: noexecheap
```

Closes #210.

Co-authored-by: Sv. Lockal <lockalsash@gmail.com>

[ROCm/rocm_smi_lib commit: 59cbeb57d1]
2025-03-09 14:23:22 -05:00
Johar, Adel 957699f7ba Docs: Fix broken links, warnings and use automodule (#11)
- Fixes the broken links in rocm_smi.h
- Uses automodule instead of autofunction in docs/reference/python_api.rst
- Fixes some warnings during docs build
- Update some of the versions in requirements.txt

[ROCm/rocm_smi_lib commit: fc61e40506]
2025-03-09 14:23:22 -05:00
Mallya, Ameya Keshava d86308f726 Added !verify trigger
[ROCm/rocm_smi_lib commit: a938e743f0]
2025-03-09 14:23:22 -05:00