2
0

138 Cometimentos

Autor(a) SHA1 Mensagem Data
Mario Limonciello a08170bc75 Apu prerequisites (#1946)
* Don't require powercap support

APUs don't necessarily support setting a power cap from sysfs.
Ignore failures of the file missing.

Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>

* Show edge temperature in default output if hotspot is missing

APUs don't have a hotspot temperature, they have an edge though.
Use that.

Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>

* Format all "power" keys as watts

There will be more power keys when APU support is added, so format
them properly.

Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>

* Don't show power limit in output if it's invalid

APUs can't set power limit using power_cap1 interface.  The limit
will be 0 and thus the UX looks weird in default output.
Only add the `/power_limit` if it's valid.

Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>

* Unify sizes of `amdsmi_power_info_t`

Sizes are used inconsistently.  This causes tools to not show
N/A when they should.  Make them unified.

Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>

---------

Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
2025-12-08 21:36:45 -06:00
Maisam Arif 405f34e4d1 [SWDEV-554587] Added IFWI Version and boot_firmware API
- Changed amd-smi static --vbios to accept ifwi
- Change population logic for vbios version API
- Added IFWI boot_firmware to the CLI, C++, Rust, and Python API

Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: I4ea504d40a43cfb011ab38fc9a664ecf12d39c8a


[ROCm/amdsmi commit: cd21b5edcc]
2025-09-23 16:05:10 -05:00
Kanangot Balakrishnan, Bindhiya e0995ce7a0 [SWDEV-534605] Increase max devices supported and drm test link type (#625)
Increased the AMDSMI_MAX_DEVICES to 64 to accomodate all
devices in CPX mode. The link type has been modified in
amd-smi to match with rocm-smi types, updated the same
for drm tests.

---------

Signed-off-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>

[ROCm/amdsmi commit: 6715c5aa92]
2025-09-17 16:30:04 -05:00
Park, Peter 0f75c19e4d [SWDEV-551318] Add doc about RAS / CPER (#636)
* add doc about ras/cper
* add sample code examples for CPER and AFID
---------

Signed-off-by: Park, Peter <Peter.Park@amd.com>
Signed-off-by: Arif, Maisam <Maisam.Arif@amd.com>
Co-authored-by: Oosman Saeed <oossaeed@amd.com>

[ROCm/amdsmi commit: 5e92adc5b3]
2025-09-09 10:27:15 -05:00
Mario Limonciello (AMD) c9eddf75e7 Remove unnecessary includes
Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>


[ROCm/amdsmi commit: 924a06d1e1]
2025-09-05 17:44:17 -05:00
Poag, Charis 07dfa789d0 [SWDEV-542223] Update Violation Status Changes to Design + Minor cleanup (#558)
Changes:
  - Update violation status logic and metric naming for XCP/XCC metrics (thrm/thm consistency)
  - Added XCP identifier in monitor to allow partition metrics to be shown with applicable APIs
    (Violation Status is the first example of this in monitor)
  - Improve CLI monitor output:
    support multiple GPU lines per GPU, add new columns, and better formatting
  - Refactor helpers and logger for flexible unit formatting and table rendering
  - Add examples for amdsmi_get_gpu_pm_metrics_info()/amdsmi_get_gpu_reg_table_info()
    new metrics APIs in C++ example
  - Sync Python/C++ interface and structures for new metrics fields and naming
  - Remove deprecated/unused RSMI activity APIs, documentation not needed since
    the APIs no longer exist in ROCm SMI either.
  - Cleanup metric violations + fix handle watch arguments
  - Provide better handling/doc for average_flattened_ints()
  - Group xcp metrics with brackets in human readable + adjust output size

Signed-off-by: Poag, Charis <Charis.Poag@amd.com>

[ROCm/amdsmi commit: e2e4fc65c1]
2025-08-06 16:03:06 -05:00
Poag, Charis bf8bbd99c6 [SWDEV-518561] Separate Driver Reload from Memory Partition Sets (#582)
Description:
  - Added a new API `amdsmi_gpu_driver_reload()` to reload the AMD GPU driver independently.
  - Updated CLI (`sudo amd-smi reset -r`) and Python bindings to support driver reload functionality.
  - Removed automatic driver reload from `amdsmi_set_gpu_memory_partition()` and `amdsmi_set_gpu_memory_partition_mode()`.
  - Enhanced CLI and test cases to allow users to control when the driver reload occurs.
  - Updated documentation and changelog to reflect the new driver reload process.
  - Improved error handling and logging for driver reload operations.
  - Added progress bar and user confirmation prompts for driver reload commands.

* Update build/test strategy to only allow one test execution at a time
* Modify API verbage + modify systemctl error output
  - Systemctl is typically not enabled on docker.
  - And is an edge case for gpu being active process/etc for display devices.
* Remove AMDSMI_STATUS_AMDGPU_RESTART_ERR from the return values
* Move driver reload to after we save original compute partitions

---------

Signed-off-by: Charis Poag <Charis.Poag@amd.com>

[ROCm/amdsmi commit: d24dc7ef89]
2025-08-05 20:44:28 -05:00
Saeed, Oosman 7a6d75af7c [SWDEV-533349] codeQL erors in amdsmi source code (#588)
Signed-off-by: Saeed, Oosman <Oosman.Saeed@amd.com>

[ROCm/amdsmi commit: 753a5ea326]
2025-08-05 20:17:21 -05:00
Williams, Justin 21f5755794 Fixed NoDRM Failures
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Signed-off-by: Williams, Justin <Justin.Williams@amd.com>

[ROCm/amdsmi commit: 738627af29]
2025-06-25 13:18:25 -05:00
Justin Williams 010f95bfb7 Fixed NoDRM Failures
Signed-off-by: Justin Williams <Justin.Williams@amd.com>


[ROCm/amdsmi commit: bad4868f39]
2025-06-25 13:18:25 -05:00
Maisam Arif 6e37490e87 [SWDEV-529665] PLDM Bundle naming
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: Id7f652ddc4e790027869683a4aaa3226ffc05c83


[ROCm/amdsmi commit: 6da33b8ded]
2025-06-12 02:19:37 -05:00
Galantsev, Dmitrii 6892907072 CMAKE - Remove example build from src/CMakeLists.txt (#469)
* CMAKE - Remove example build from src/CMakeLists.txt
For some reason it was building examples every time even when not
necessary...
* CMAKE - Format
* Fix drm_example broken PRIu32
* CMAKE - Do NOT create lib64 when building examples
* CMAKE - Examples should only install C and CMake files

---------

Change-Id: I6274b72a085a41b5bd5ae698af798f60a8a092a0
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>

[ROCm/amdsmi commit: f9b8066c26]
2025-06-11 07:12:44 -05:00
Charis Poag df6de25624 [SWDEV-529030/SWDEV-531217] Fix tests & output for partitioned configurations (CPX, DPX, QPX, etc.)
Changes:
  - Updated AMD SMI firmware to display "N/A" for unavailable firmware in partitioned environments, improving clarity.
    Example (in DPX):
    $ amd-smi firmware
    GPU: 0
        FW_LIST:
            ...
            FW 12:
                FW_ID: PM
                FW_VERSION: 00.86.39.00
    GPU: 1
        FW_LIST: N/A
  - Fixed amd-smi partition not showing current partition information on
    asics with inablity to set memory or accelerator partitions.
    $ amd-smi partition -c -m
    CURRENT_PARTITION:
    GPU_ID  MEMORY  ACCELERATOR_TYPE  ACCELERATOR_PROFILE_INDEX  PARTITION_ID
    0       NPS1    CPX               2                          0
    1       N/A     N/A               N/A                        1
    2       N/A     N/A               N/A                        2
    3       N/A     N/A               N/A                        3
    4       N/A     N/A               N/A                        4
    5       N/A     N/A               N/A                        5
    6       NPS1    SPX               0                          0
    7       NPS1    SPX               0                          0
    8       NPS1    SPX               0                          0

    MEMORY_PARTITION:
    GPU_ID  MEMORY_PARTITION_CAPS  CURRENT_MEMORY_PARTITION
    0       N/A                    NPS1
    1       N/A                    N/A
    2       N/A                    N/A
    3       N/A                    N/A
    4       N/A                    N/A
    5       N/A                    N/A
    6       N/A                    NPS1
    7       N/A                    NPS1
    8       N/A                    NPS1

  - Refactored amd_smi_drm_example.cc:
    - Grouped partition changes and restores original partition settings.
    - Now handles partitioned environments allowing example to continue even if some APIs are not supported in partitioned configurations.
  - Modified amdsmi_asic_info_t (see amdsmi_get_gpu_asic_info()) to report OAM ID as N/A if 0xFFFFFFFF (was 0xFFFF).
    Allows for better handling of OAM IDs in partitioned environments (DNE for non-primary nodes,
    since its a physical identifier). Easier to handle in tests and example code (ie. now consistent w/ max size of the structure's value).
  - Introduced amdsmi_RAII_open_FD() (internal API) to manage file descriptors using RAII, ensuring proper closure and preventing resource leaks.
    Updated the following APIs to use this function:
      - amdsmi_get_gpu_asic_info(), amdsmi_get_gpu_vram_usage(),
        amdsmi_get_gpu_vram_info(), amdsmi_get_gpu_vbios_info(),
        amdsmi_get_gpu_driver_info(), amdsmi_get_gpu_virtualization_mode()
  - Updated AMD SMI test_base.cc/.h:
    - Improved output and handling for partitioned environments.
    - Added detailed ASIC information logging to align with structure changes.
    - Enhanced error messages for better context before ASSERT checks.
  - Resolved test failures in partitioned environments by updating
    logic and handling for partition-specific configurations.
    Fixed tests include:
      - computepartition_read_write.cc, frequencies_read_write.cc,
        gpu_metrics_read.cc, mem_util_read.cc, memorypartition_read_write.cc,
        perf_level_read.cc, perf_level_read_write.cc, power_cap_read_write.cc,
        power_read.cc, sys_info_read.cc, gpu_busy_read.cc

Change-Id: I36e903f8fddd714c74c719459c71aba8bbb77e6f
Signed-off-by: Charis Poag <Charis.Poag@amd.com>

Resetting head + adding fixes for tests ran in partitions

Change-Id: I0c1e9ac07488b50c95f3bc6d8a724e67d2c715dc
Signed-off-by: Charis Poag <Charis.Poag@amd.com>


[ROCm/amdsmi commit: 391451752b]
2025-06-05 19:24:49 -05:00
Maisam Arif 16d60f3411 [SWDEV-488303] Fixed process list information source
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: Iec3416cb5ca1bdd806c3225b514bbf3dbf8c0d2e


[ROCm/amdsmi commit: cebb0799cb]
2025-05-30 20:48:29 -05:00
Arif, Maisam 465f2e6a41 [SWDEV-488303] Updated CU occupancy for per-process retrieval (#243)
Change-Id: I2990597c6dd4b2e8cf3e11ce60f72049ebdd9a8c
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>

[ROCm/amdsmi commit: 0fdaebdbaa]
2025-05-29 20:35:27 -05:00
Saeed, Oosman b793acaa71 [SWDEV-530385] Fix CPER "--follow" & "--file-limit" (#380)
* --follow option fix & --file_limit option added
* change --file_limit and --cper_file to --file-limit and --cper-file

---------

Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>

[ROCm/amdsmi commit: 91c9969b72]
2025-05-29 11:59:55 -05:00
Narlo, Joseph d2bf77401e [SWDEV-532129] Update amdsmi asic info (#369)
* Added `subsystem_id` to `amdsmi_get_gpu_asic_info`
---------
Signed-off-by: Narlo, Joseph <Joseph.Narlo@amd.com>

[ROCm/amdsmi commit: 9862db63dd]
2025-05-28 18:26:58 -05:00
Daniel Oliveira 806c3c62ed [SWDEV-529665] Add PLDM Bundle version
feat: Report PLDM Bundle from SMC to IB

Code changes related to the following:
  * APIs
  * CLI
  * Unit tests

Signed-off-by: Oliveira, Daniel <daniel.oliveira@amd.com>
Change-Id: I35ccf01eb612ca80e3ae6b72039085c18c989222


[ROCm/amdsmi commit: fe9b6eeb49]
2025-05-20 01:37:00 -05:00
Mewar, Deepak 9a3c59c63c [SWDEV-512393] Added amdsmi_get_cpu_affinity_with_scope (#198)
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Signed-off-by: Deepak Mewar <deepak.mewar@amd.com>

[ROCm/amdsmi commit: b999f86611]
2025-05-20 01:06:09 -05:00
josnarlo ffd54f5c3b [SWDEV-532119] Fix building examples
Signed-off-by: josnarlo <Joseph.Narlo@amd.com>


[ROCm/amdsmi commit: dd69aa1924]
2025-05-13 20:19:51 -05:00
Galantsev, Dmitrii c6c01ee675 CMAKE - Format with cmake-format
Change-Id: I5b86b7b83e3d151c3d6e1c216ecb28f1313d538a
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>


[ROCm/amdsmi commit: 42c77a5912]
2025-05-06 17:09:53 -05:00
Charis Poag 4f626e536f [SWDEV-528647/SWDEV-528450] Follow up Fix incorrect domain
Changes:
- Misc improvements
- Domain showed incorrectly for devices with different domains
  ex.
  GPU: 3
      BDF: 3000:01:00.0

  Fix provides in proper format -
    GPU: 3
        BDF: 0003:01:00.0

Change-Id: Ida4a0acb4922f3c2cb61a9e9cd0b7d1be31061a8
Signed-off-by: Charis Poag <Charis.Poag@amd.com>


[ROCm/amdsmi commit: da1024cf96]
2025-05-06 12:50:43 -05:00
Galantsev, Dmitrii 3c32ef6c39 CMAKE - Clean-up cmake changes introduced in a9b8b6d369b390af0c00bbffab2b4fe1748b8bad
Change-Id: Ida0e9475a926a2495e36b0d9bc2468c48aee0e77
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>


[ROCm/amdsmi commit: fe98b8bd63]
2025-05-05 15:43:12 -05:00
Poag, Charis fb3d7b9fb9 [SWDEV-528647/SWDEV-528450] Reduce API load times and libdrm/libdrm_amdgpu dynamic loading (#333)
Changes:
- Removed libdrm/libdrm_amdgpu dependencies
- Added/updated new internal libdrm/libdrm_amdgpu/xf86drm APIs
  to allow our APIs to reference before dynamic loading
  the libdrm/libdrm_amdgpu libraries:
  1. amdgpu_drm.h to what's seen in mainline
  2. Added xf86drm.h to whats seen in mainline
- Modified internal DRM capabilities:
  1. Require each API to independently connect to libdrm/libdrm_amdgpu
     + validate API handles reponses accordingly
  2. Initialization of AMD SMI no longer has as strong of a tie to
     libdrm
- Updated internal implementations of several APIs which have
connections to libdrm/libdrm_amdgpu or APIs which have conflicts
with open libdrm/libdrm_amdgpu connections:
  1. amdsmi_init()
  2. amdsmi_get_gpu_vram_usage()
  3. amdsmi_get_gpu_asic_info()
  4. amdsmi_get_gpu_vram_info()
  5. amdsmi_get_gpu_vbios_info()
  6. amdsmi_get_gpu_driver_info()
  7. amdsmi_get_gpu_virtualization_mode()
  8. amdsmi_set_gpu_memory_partition()
  9. amdsmi_set_gpu_memory_partition_mode()
- Cleaned up effected tests/APIs

Change-Id: I96e2cf1b06b0cfee1b01a5e991ccc6116c4245a8

[ROCm/amdsmi commit: b5a43b7744]
2025-05-02 21:58:53 -05:00
Galantsev, Dmitrii 2c88f8ebe6 CI - Disable example builds after breakage
Change-Id: I8a070dd65ed752b2485c17e0eeb5bc1dc875931e
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>


[ROCm/amdsmi commit: a0e6c1c1bd]
2025-04-04 18:19:20 -05:00
Galantsev, Dmitrii ea1fcea378 CMAKE - Fix examples and clean up unused variables
Change-Id: Ie072476a525b49bb7c9c0fb9e49393a482a7d0b0
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>


[ROCm/amdsmi commit: 396afadd43]
2025-04-04 18:19:20 -05:00
Galantsev, Dmitrii 633d2a8890 Make amdsmi_get_power_info backwards compatible
Change-Id: Ie5b4c35265827e78934caa94c142d31efce597e4
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>


[ROCm/amdsmi commit: 4a3c70136f]
2025-03-19 23:23:48 -05:00
Castillo, Juan fff2d21baf SWDEV-518209: GPU Metrics 1.8 (#177)
- Updates:
    - Adding the following metrics to allow new calculations for violation status:
        - Per XCP metrics gfx_below_host_limit_ppt_acc
        - Per XCP metrics gfx_below_host_limit_thm_acc
        - Per XCP metrics gfx_low_utilization_acc
        - Per XCP metrics gfx_below_host_limit_total_acc
    - Increasing available JPEG engines to 40. Current ASICs may not support all 40. These will be indicated as UINT16_MAX or N/A in CLI.

Signed-off-by: Juan Castillo <juan.castillo@amd.com>
Co-authored-by: Charis Poag <Charis.Poag@amd.com>

[ROCm/amdsmi commit: 7c882b2f69]
2025-03-19 10:24:02 -05:00
Poag, Charis 267fa91e8a [SWDEV-493274/SWDEV-514998] Add AMD SMI partition tests + Add Guest amd-smi static --partition (#127)
* [SWDEV-493274/SWDEV-514998] Add AMD SMI partition tests + Add Guest amd-smi static --partition

Changes:
    - Added amd-smi static --partition for guest systems
    - Added C++ tests for memory and compute (accelerator) partitions
    - Added Python tests for amdsmi_get_gpu_vram_info(),
       amdsmi_get_gpu_accelerator_partition_profile_config()
    - Updated Python tests for
      amdsmi_get_gpu_accelerator_partition_profile()
      Now includes more profile and resource detail
    - Added amdsmi_get_gpu_xcd_counter();
      Tests provided for both C++/Python APIs
    - Added AmdSmiVramType & AmdSmiVramVendor: they were missing
      python testing required adding.

Change-Id: Ib6549d8ccc5fb68726f38745b87c78f890186022
Signed-off-by: Charis Poag <Charis.Poag@amd.com>

[ROCm/amdsmi commit: 48cb5529d2]
2025-03-11 16:38:46 -05:00
Narlo, Joseph c26da62fc7 [SWDEV-513651] Sync Unified And Linux Header (#98)
Signed-off-by: Joseph Narlo <joseph.narlo@amd.com>

[ROCm/amdsmi commit: dc4a16da6f]
2025-02-06 22:25:50 -06:00
Joseph Narlo ab88f38fa0 [SWDEV-504583] Resolve Additional Compiler Warnings
Signed-off-by: Joseph Narlo <joseph.narlo@amd.com>


[ROCm/amdsmi commit: dc228398d0]
2025-01-28 15:36:44 -06:00
Juan Castillo 2ddb2ef032 [SWDEV-496693]GPU Metrics 1.7
Features added:
- [SWDEV-475244] Add new interface to get max memory bandwidth
Updated API: amdsmi_get_gpu_vram_info
Updated: struct amdsmi_vram_info_t to include vram_max_bandwidth
CLI: amd-smi static --vram

- [SWDEV-488349] Add new interface for XGMI link status
New API: amdsmi_get_gpu_xgmi_link_status
CLI: amd-smi xgmi --link-status

Signed-off-by: Juan Castillo <juan.castillo@amd.com>
Change-Id: I1aa35b741136eb4f02f7ea9a95b865886273eb72


[ROCm/amdsmi commit: f8b8347627]
2024-12-18 10:57:06 -06:00
Joe Narlo 68497c68e9 SWDEV-492272 [AMDSMI] Build/Compiler warnings messages
Fix compiler warnings

Signed-off-by: Joe Narlo <Joseph.Narlo@amd.com>
Change-Id: I10657b8f3ef18a9b45311e8f6509958297a57823


[ROCm/amdsmi commit: d0a7332d32]
2024-12-13 00:38:07 -05:00
Joe Narlo bad2cc9c23 SWDEV-495787 [AMDSMI] Different license headers
Change copyrights to MIT and remove date

Signed-off-by: Joe Narlo <Joseph.Narlo@amd.com>
Change-Id: I16f5b412f2b9ddefaaa1771aa714cc18829a1be4


[ROCm/amdsmi commit: 3052ad4220]
2024-11-22 08:55:28 -05:00
Charis Poag 7a35c805b0 [SWDEV-422195/SWDEV-440985] GPU metrics 1.6
Changes:
    - Added new GPU metrics:
      1) Violation status' (ex. PVIOL/TVIOL) accumulators
      2) XCP (Graphics Compute Partitions) statistics
      3) pcie other end recovery counter
    - CLI/API/tests changes were made accordingly

Change-Id: I589b9b1f570f25dda12d95bb501feca85da8b3bb
Signed-off-by: Charis Poag <Charis.Poag@amd.com>


[ROCm/amdsmi commit: 3a4abbd8c0]
2024-09-27 12:04:21 -05:00
Lang Yu 94d349573d SWDEV-463405: Add amdsmi_get_link_topology_nearest support
amdsmi_get_link_topology_nearest() is used to retrieve
the set of GPUs that are nearest to a given device
at a specific interconnectivity level.

Code changes related to the following:
    * API
    * CLI
    * Unit tests
    * Examples

Header Unification Change: "/amdsmi/+/1122408"

Change-Id: Id0317797c652c267742513936d321677793ec634
Signed-off-by: Lang Yu <lang.yu@amd.com>


[ROCm/amdsmi commit: 7a557b1c50]
2024-09-26 16:43:27 -05:00
Maisam Arif c2b9cdfd2e Udpated License Dates
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: I8ca199c129c06508bc3e23745ab5ac2d20dce928


[ROCm/amdsmi commit: 105db1afcd]
2024-09-16 16:14:47 -04:00
Oliveira, Daniel 55b88706e1 SWDEV-463401: amdsmi_get_gpu_asic_info() adds num_of_compute_units
number of compute units `amdgpu_gpu_info.num_of_compute_units` is exposed through amdsmi_get_gpu_asic_info().

Code changes related to the following:
  * API
  * CLI
  * Unit tests
  * Examples

Change-Id: Ibeb612d079ed87437a0e56124b8504098fc2dcfd
Signed-off-by: Oliveira, Daniel <daniel.oliveira@amd.com>


[ROCm/amdsmi commit: b05849dad0]
2024-08-28 10:15:07 -04:00
Oliveira, Daniel e2e63055a6 SWDEV-463399: amdsmi_get_gpu_vram_info() adds bit-width
Driver info `amdgpu_gpu_info.vram_bit_width` is exposed through amdsmi_get_gpu_vram_info().

Code changes related to the following:
  * API
  * CLI
  * Unit tests
  * Examples

Change-Id: I8abd8db7a603078b2b1c008b2685cecf35caf3d2
Signed-off-by: Oliveira, Daniel <daniel.oliveira@amd.com>


[ROCm/amdsmi commit: 893f13ab98]
2024-08-27 18:22:50 -04:00
Maisam Arif 7dfe4276cc Use different sysfs for soc_pstate and xmgi_plpd
The sysfs is changed to use the pm_policy folder with multiple
dpm_policy files.

Change-Id: I40fac8de2d0cb127950d238b8196f6d2416778d0


[ROCm/amdsmi commit: e5d1ba4621]
2024-05-31 01:38:41 -04:00
Dalibor Stanisavljevic cdd24a7b0f SWDEV-457337 - Fix header alignment
Change-Id: I9f25f6c4f0d00c76b66d13162f30be11368f5b59
Signed-off-by: Dalibor Stanisavljevic <Dalibor.Stanisavljevic@amd.com>


[ROCm/amdsmi commit: 7b2463abe0]
2024-05-23 04:41:57 -04:00
Charis Poag f7d3417a7f SWDEV-450580 - Fix powercap set
Updates:
     * CLI - Added AMDSMIHelpers.convert_SI_unit() to help
       conversion of units
     * API - Reverted to uW for power cap limits
     * CLI - amd-smi static --limit now includes MIN_POWER
     * Tests now are all using uW units to keep W conversion
       to only happen in CLI
     * Python API now reflects same units as uW (what is seen
       in amdgpu driver)
     * CLI - amd-smi metric --power:
       Fixed power seen on gpu_metrics v1.3

Change-Id: I32d9ba78d0d8806772f0860f9a803a885b3f316a
Signed-off-by: Charis Poag <Charis.Poag@amd.com>


[ROCm/amdsmi commit: c24d66740e]
2024-05-02 10:13:39 -05:00
Oliveira, Daniel 9e2b1d8a09 fix: [SWDEV-442525] [rocm/amd_smi_lib]
Fixes gpu_process_list

Code changes related to the following:
  * amdsmi_get_gpu_process_list()
  * CLI
  * Examples
  * Unit tests
  * Changelog
  * Readme
  * rocm_smi_lib commit: 677433b367

Change-Id: I9210fbca7a5da92d0a8b472b72ca82597c8e4fb5
Signed-off-by: Oliveira, Daniel <daniel.oliveira@amd.com>


[ROCm/amdsmi commit: 08e2e21bab]
2024-03-27 16:48:24 -05:00
Oliveira, Daniel 51d25d8feb fix: [SWDEV-448201] [rocm/amd_smi_lib]
Adds Add PCIE Errors

Code changes related to the following:
  * amdsmi_get_pcie_info()
  * CLI
  * examples

Change-Id: Ie0b7053e77c88fb18309c16e74bce75d862c45a9
Signed-off-by: Oliveira, Daniel <daniel.oliveira@amd.com>


[ROCm/amdsmi commit: 1310c767ce]
2024-03-24 23:33:32 -04:00
Bill(Shuzhou) Liu 46ab68f840 Set and get DPM policy for GPU device
Add new APIs to set and get dpm policy for the GPU device.

Change-Id: I26fa49cd17d0ce66bda3446c38945a6cf35717ff


[ROCm/amdsmi commit: 108e6d4ae6]
2024-03-12 10:32:31 -04:00
Bill(Shuzhou) Liu 21cf0c1b5c Unify the amdsmi_get_pcie_info python interface
Make the python interface consistent with the C interface.

Change-Id: Idda08f888947c757e475d5a024b0ec3d8e1d846a


[ROCm/amdsmi commit: db33cda0c1]
2024-02-22 03:33:59 -05:00
Maisam Arif 4728d05c5f Align list and cache_info to Host
Signed-off-by: Maisam Arif <maisarif@amd.com>
Change-Id: I4fa55b360b74d5a202d0b9b4eb7aee660b0a1bcf


[ROCm/amdsmi commit: 77710921a4]
2024-02-15 01:47:59 -05:00
Oliveira, Daniel e9e246f23d fix: [rocm/amd_smi_lib] amdsmi_get_gpu_activity gfx/memory activity does not update
Checks and forces rereading gpu metrics unconditionally

Code changes related to the following:
  * Device::dev_log_gpu_metrics()
  * amdsmi_get_gpu_metrics_header_info()
    Removed unintentionally during work on 'header cleanup Remove non-unified headers'
  * Examples
  * Unit tests

Change-Id: I83710e173c0f7102d0b7f865c18474c979a95cd8
Signed-off-by: Oliveira, Daniel <daniel.oliveira@amd.com>


[ROCm/amdsmi commit: 78074d7d77]
2024-02-13 10:15:17 -06:00
Maisam Arif b6f62bb651 Renamed amdsmi_get_metrics_table to amdsmi_get_cpu_metrics_table
Renamed structs to be more conistent with what they are calling

Signed-off-by: Maisam Arif <maisarif@amd.com>
Change-Id: I6f2be2fcb76f004aa592f0dad8545565700ccd4b


[ROCm/amdsmi commit: f831cf49f7]
2024-02-12 16:30:18 -06:00
Maisam Arif 39537d999d SWDEV-436533 - Cache Info Struct Update
Signed-off-by: Maisam Arif <maisarif@amd.com>
Change-Id: Ic640fa657cdcc32d7b00ff78fc9452ec7e05dd07


[ROCm/amdsmi commit: 88192d8b6b]
2024-02-05 16:51:04 -05:00