Commit graph

522 Commits

Author SHA1 Message Date
gabrpham_amdeng 18faddf6f3 Added support for configuring PPT1 power cap
- Updated python integration test to account for PPT1 support changes
  - Updated set/reset power-cap input format
  - Adjusted python API and updated C++ API test

Signed-off-by: gabrpham_amdeng <Gabriel.Pham@amd.com>
Change-Id: Ia9d02868b6e91c88c10a9772d9e2d9f37c3c352f
2025-11-13 13:08:12 -06:00
Charis Poag 0a5fdc944f [SWDEV-560847] Fix Vram type not showing newer types
* Changes:
  - Allows `amd-smi static --vram` (`amdsmi_get_gpu_vram_info()`)
    to read the following types:
    DDR5, LPDDR4, LPDDR5, and HBM3E.

Change-Id: I1eddf9dcb574e1868541cc5063ae95cb6d6e1c59
Signed-off-by: Charis Poag <Charis.Poag@amd.com>
2025-10-29 16:13:42 -05:00
Pryor, Adam 2144cfbba4 [SWDEV-357472] Add evicted_ms metric (#620)
- **Added evicted_time metric for kfd processes**.  
  - Time that queues are evicted on a GPU in milliseconds
  - Added to CLI in `amd-smi monitor -q` and `amd-smi process`
  - Added to C API and Python API:
    - amdsmi_get_gpu_process_list()
    - amdsmi_get_gpu_compute_process_info()
    - amdsmi_get_gpu_compute_process_info_by_pid()

---------

Signed-off-by: Pryor, Adam <Adam.Pryor@amd.com>
2025-10-28 14:49:03 -05:00
Saeed, Oosman 90f4b8c43d Sync with latest ras-decode @bc6b43c (#770)
Signed-off-by: Oosman Saeed <oossaeed@amd.com>
2025-10-27 14:10:00 -05:00
Kanangot Balakrishnan, Bindhiya 09a97f02ed [SWDEV-542718] Correct socket_affinity (#760)
* [SWDEV-542718] Correct socket_affinity

Updated Socket affinity to show bitmask and expanded cpu list.

Signed-off-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>

* Update per-device local_cpulist for socket_affinity

Signed-off-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>

* Added amdsmi_get_cpu_affinity_from_local_cpulist API.
Updated the wrapper.

Signed-off-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>

* Revert "Added amdsmi_get_cpu_affinity_from_local_cpulist API."

This reverts commit 9a2ef934b1787f8aa09d3e4efe02f897b4295215.

* Moved the changes to C API.
In case of SOCKET_SCOPE, use local_cpulist first.
If it is unavailable or not readable, fall back to
NUMA.

Signed-off-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>

* Addressed review comments

Signed-off-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>

---------

Signed-off-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>
2025-10-22 16:20:41 -05:00
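
Illustration for 09a97f02ed: the message says socket affinity now shows both a bitmask and an expanded CPU list. The sketch below shows one way such a conversion could work, assuming the input uses the kernel's cpulist text format (comma-separated CPUs and ranges such as `0-3,8-11`). The function names `expand_cpulist` and `cpus_to_bitmask` are hypothetical and are not part of the AMD SMI API.

```python
def expand_cpulist(cpulist):
    """Expand a cpulist string such as '0-3,8-11' into a sorted list of CPU ids."""
    cpus = set()
    for part in cpulist.strip().split(","):
        if not part:
            continue  # tolerate an empty string or a trailing comma
        if "-" in part:
            first, last = part.split("-", 1)
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def cpus_to_bitmask(cpus):
    """Build an integer bitmask with bit N set for every CPU id N."""
    mask = 0
    for cpu in cpus:
        mask |= 1 << cpu
    return mask


if __name__ == "__main__":
    cpus = expand_cpulist("0-3,8-11")
    print(cpus, hex(cpus_to_bitmask(cpus)))
```
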
Poag, Charis 01b4fe6614 [SWDEV-535159] Add support for GPU partition metrics (#490)
[SWDEV-535159] Add support for GPU partition metrics

Changes include:
  - Internal logic to smart-switch between gpu_metrics/xcp_metrics files
  - [WIP] Initial plumbing for new partition metric API

Change-Id: I4340fb1b48bac0117d80d5d486b9e871430d5cd8
Signed-off-by: Charis Poag <Charis.Poag@amd.com>

Add amdsmi_get_gpu_partition_metrics_info() + minor cleanup

Change-Id: I5d60604f18baddbd03852dc90e88aa0b8107d50e
Signed-off-by: Charis Poag <Charis.Poag@amd.com>

Fix partition metric logic + update logging/tests

Change-Id: I9e89b19ead17694c54e224f8e13ff8ee3eb2e22a
Signed-off-by: Charis Poag <Charis.Poag@amd.com>

Adjust amd-smi metric/monitor/default to show (some) partition information

Change-Id: I2e8d2745876a19bdaec3c039daa97345c9f701b5
Signed-off-by: Charis Poag <Charis.Poag@amd.com>

Add C++ tests

Change-Id: Ib9eb0b57a6d7a280992e05a4c6eba632826952ef
Signed-off-by: Charis Poag <Charis.Poag@amd.com>

Remove modification of energy counter, not needed

Change-Id: I5c48eaaae248ee6dc79abba609d837ec35d78022
Signed-off-by: Charis Poag <Charis.Poag@amd.com>

[CLI] amd-smi metric: cleaned up N/A'd multi-valued to show just N/A

Changes:
1. amd-smi metric: cleaned up N/A'd multi-valued to show just N/A
ex.
JPEG_ACTIVITY: [N/A, N/A, N/A, N/A, N/A, N/A, N/A, N/A, N/A, N/A, N/A, N/A, N/A, N/A, N/A, N/A, N/A, N/A, N/A, N/A, N/A, N/A, N/A, N/A, N/A, N/A, N/A, N/A, N/A, N/A, N/A, N/A]

Now just shows: N/A

2. [Python Unit Test] Changed testname TestAmdSmiPythonBDF(unittest.TestCase) ->
 AmdSmiPythonUnitTest

Test name was confusing.

Change-Id: Ieb3b036f30002fd22362508eb9fc5d443df395ae
Signed-off-by: Charis Poag <Charis.Poag@amd.com>

Log cleanup

Change-Id: I1b1a95f1844d35bec7a7bd8cb996f87e4914c069
Signed-off-by: Charis Poag <Charis.Poag@amd.com>

Add amd-smi partition-metrics CLI + general cleanup

Change-Id: Ia91488e6cb3a4d62b4087afbddfe0b3bb9378fdc
Signed-off-by: Charis Poag <Charis.Poag@amd.com>

[1.3 metrics] Remove forwards compatibility for partition metrics

Change-Id: Iab928983e6f6f1587bc9307f6f3fa2b2696ca6f7
Signed-off-by: Charis Poag <Charis.Poag@amd.com>

Fixed violation output not showing % + general cleanup

Change-Id: Icac1b0a55b18c7628b07109ae0c377d17e0825f1
Signed-off-by: Charis Poag <Charis.Poag@amd.com>

Clean up amdsmi_get_gpu_partition_metrics_info & amd-smi partition-metric outputs

Change-Id: I6427028b980874641e9ffb3b5d88ad493dbf9cf4
Signed-off-by: Charis Poag <Charis.Poag@amd.com>

* Fix metrics not found + extra logging/formatting

Change-Id: I841a27bb2c305e97ec7579a13ac915e5be497c3a
Signed-off-by: Charis Poag <Charis.Poag@amd.com>

* Update license to current default

Change-Id: I0de9b8a2d5dbbeab4491097f0354ba17b0d30866
Signed-off-by: Charis Poag <Charis.Poag@amd.com>

* Cleanup for review

Change-Id: I96ed25c3f2b8968eea1af24c5e5860c2b4e74e6e
Signed-off-by: Charis Poag <Charis.Poag@amd.com>

* Modernize updated/new internal APIs.

Change-Id: I3c48a250eeb703709b14cb5ffa68268d8321626c
Signed-off-by: Charis Poag <Charis.Poag@amd.com>

* Remove extra logging in dynamic metrics

Change-Id: Idb97547bcbe143d6fa1cb5cb278ffe4da615ce14
Signed-off-by: Charis Poag <Charis.Poag@amd.com>

* Remove amd-smi partition-metric command

Change-Id: Ib83c17e5cd7e0da3798198943bddd46c296b411c
Signed-off-by: Charis Poag <Charis.Poag@amd.com>

* Move new CLI updates to another PR + minor fixes

Change-Id: I3b1163eec12f9b5f7d95ee33de08e168cec1b1fe
Signed-off-by: Charis Poag <Charis.Poag@amd.com>

* Allow dynamic metrics to work for gpu/xcp metrics 1.9+/1.1+

Updated some logging as well.

Change-Id: I2ed9f5a5ef8afb1520508820ca6153525f0644b4
Signed-off-by: Charis Poag <Charis.Poag@amd.com>

* Allow dyn gpu/xcp metric v1.9+/v1.1+

Added tests for quick check

Change-Id: I576d6f6582a55afb08e5ac57791ce95e2fa184a2
Signed-off-by: Charis Poag <Charis.Poag@amd.com>

* Update tests for larger subset of version checks

Change-Id: I3cdf4f8bb4fc6161f4c76566939f90545d0f362a
Signed-off-by: Charis Poag <Charis.Poag@amd.com>

* Fix XCP metrics in gpu/partition metric pre-v1.9/v1.1 (dynamic)

Change-Id: I4dabc1ed6bef6b86c8e7f92bf9cb5992f3966fe2
Signed-off-by: Charis Poag <Charis.Poag@amd.com>

---------

Signed-off-by: Charis Poag <Charis.Poag@amd.com>
2025-10-20 14:43:40 -05:00
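
Illustration for 01b4fe6614: one of the squashed changes collapses a multi-valued metric whose entries are all N/A into a single N/A. The sketch below shows that behavior under stated assumptions; the helper name `collapse_na` is hypothetical and this is not the CLI's actual code.

```python
def collapse_na(values, na="N/A"):
    """Return a single N/A marker when every entry of a non-empty list is N/A.

    Anything else (mixed lists, empty lists, scalars) is returned unchanged.
    """
    if isinstance(values, list) and values and all(v == na for v in values):
        return na
    return values


if __name__ == "__main__":
    print(collapse_na(["N/A"] * 32))     # every entry unavailable
    print(collapse_na(["N/A", 12, 7]))   # mixed list is left as-is
```
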
Narlo, Joseph 460cfcba1f [SWDEV-555807] TestCudaMallocAsync test power draw failing (#755)
* Clarified comments regarding power limit retrieval and its support on virtualized systems.
* Change unsupported comment to UINT32_MAX

---------

Signed-off-by: josnarlo <Joseph.Narlo@amd.com>
Signed-off-by: Arif, Maisam <Maisam.Arif@amd.com>
2025-10-17 08:57:57 -05:00
Pryor, Adam cba4c871d3 [SWDEV-559082] Add asic info cache (#756)
Signed-off-by: adapryor <Adam.pryor@amd.com>
2025-10-08 21:48:08 -05:00
Oosman Saeed c6698c9100 [SWDEV-553168] Add support for decoding out of band boot time CPER files.
Change-Id: Ic4278698f9c5b5ae56bd56fd43150c0653c1ef05
2025-10-07 22:23:33 -05:00
Pryor, Adam c967aead58 [SWDEV-525336] Use KFD to determine process start/stop (#723)
* Used KFD to determine the linking between GPUs and PIDs rather than depending on the per-PID, single-GPU BDF info that we were getting from fdinfo.

Signed-off-by: adapryor <Adam.pryor@amd.com>

---------

Signed-off-by: adapryor <Adam.pryor@amd.com>
Signed-off-by: Arif, Maisam <Maisam.Arif@amd.com>
2025-10-02 10:57:08 -05:00
Maisam Arif e61eac1368 [SWDEV-538483] Fix amdsmi.h doc tag for amdsmi_set_power_cap
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: I919eba1745990fd05ca3ff1077317e7b1244fded
2025-09-26 13:28:43 -05:00
Stella Laurenzo 4d5d24d1c6 Fix delay loading of drm by soname.
2025-09-24 20:44:03 -05:00
Narlo, Joseph 3c8fd1bf54 [SWDEV-554880] Sync Unified and Linux Header (#686)
Sync Unified and Linux Header

---------

Signed-off-by: josnarlo <Joseph.Narlo@amd.com>
2025-09-23 16:56:32 -05:00
Maisam Arif cd21b5edcc [SWDEV-554587] Added IFWI Version and boot_firmware API
- Changed amd-smi static --vbios to accept ifwi
- Change population logic for vbios version API
- Added IFWI boot_firmware to the CLI, C++, Rust, and Python API

Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: I4ea504d40a43cfb011ab38fc9a664ecf12d39c8a
2025-09-23 16:05:10 -05:00
Kanangot Balakrishnan, Bindhiya 6715c5aa92 [SWDEV-534605] Increase max devices supported and drm test link type (#625)
Increased the AMDSMI_MAX_DEVICES to 64 to accommodate all
devices in CPX mode. The link type has been modified in
amd-smi to match with rocm-smi types, updated the same
for drm tests.

---------

Signed-off-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>
2025-09-17 16:30:04 -05:00
Galantsev, Dmitrii 85e37bb6ce Add extra element to array for bounds checking
Decrement padding to keep struct size the same

Change-Id: I4bea5d4b4d5c908423c7cc55a7e8c404b4a6b5e8
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
2025-09-05 17:44:17 -05:00
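
Illustration for 85e37bb6ce: adding one array element while removing one element of padding leaves the struct size, and so the binary layout size, unchanged. The message does not name the struct, so the layouts below are hypothetical and only demonstrate the size arithmetic with `ctypes`.

```python
import ctypes


class Before(ctypes.Structure):
    # Hypothetical layout: 8 data slots followed by 4 reserved slots.
    _fields_ = [
        ("values", ctypes.c_uint32 * 8),
        ("reserved", ctypes.c_uint32 * 4),
    ]


class After(ctypes.Structure):
    # One extra data slot, one fewer reserved slot.
    _fields_ = [
        ("values", ctypes.c_uint32 * 9),
        ("reserved", ctypes.c_uint32 * 3),
    ]


if __name__ == "__main__":
    # Both layouts occupy 12 * 4 bytes.
    print(ctypes.sizeof(Before), ctypes.sizeof(After))
```
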
Mario Limonciello (AMD) eacec681dd Use nested namespace for amd::smi
Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
2025-09-05 17:44:17 -05:00
Mario Limonciello (AMD) 3d0ea25af3 Remove unnecessary typedef declarations
amd_smi_cper.h:32:1: warning: ‘typedef’ was ignored in this declaration

Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
2025-09-05 17:44:17 -05:00
Mario Limonciello (AMD) 924a06d1e1 Remove unnecessary includes
Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
2025-09-05 17:44:17 -05:00
Oosman Saeed ffca095246 [SWDEV-547223] RAS HBM CRC Read CE failed due to AFID missing 24
cherry-pick aca-decode repo changeset: aca-decode repo: f9e5ad5 (HEAD -> main, origin/main, origin/HEAD) Fix bug in Corrected HBM Error being decoded as AFID 34 (#5)
2025-08-21 11:00:30 -05:00
Charis Poag d3b73fac82 Revert Major ABI break for amdsmi_get_violation_status()
Changes:
- This aligns back to original struct naming for ROCm 7.0. This removes
any Major ABI breakages for updates for 7.0 release.
- Minor ABI breakage is required since there were additions to the
header. Refer to changelog for these updates.

Change-Id: If35af74eac6beac8c267d05ce789b7761ed24bff
Signed-off-by: Charis Poag <Charis.Poag@amd.com>
2025-08-18 11:36:57 -05:00
Galantsev, Dmitrii e7d6590bbc Bump version to 26.1
Change-Id: I1b6ab552c9be965524ad49a866374a0d21b9ceb3
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
2025-08-08 08:12:10 -05:00
Poag, Charis e2e4fc65c1 [SWDEV-542223] Update Violation Status Changes to Design + Minor cleanup (#558)
Changes:
  - Update violation status logic and metric naming for XCP/XCC metrics (thrm/thm consistency)
  - Added XCP identifier in monitor to allow partition metrics to be shown with applicable APIs
    (Violation Status is the first example of this in monitor)
  - Improve CLI monitor output:
    support multiple GPU lines per GPU, add new columns, and better formatting
  - Refactor helpers and logger for flexible unit formatting and table rendering
  - Add examples for amdsmi_get_gpu_pm_metrics_info()/amdsmi_get_gpu_reg_table_info()
    new metrics APIs in C++ example
  - Sync Python/C++ interface and structures for new metrics fields and naming
  - Remove deprecated/unused RSMI activity APIs, documentation not needed since
    the APIs no longer exist in ROCm SMI either.
  - Cleanup metric violations + fix handle watch arguments
  - Provide better handling/doc for average_flattened_ints()
  - Group xcp metrics with brackets in human readable + adjust output size

Signed-off-by: Poag, Charis <Charis.Poag@amd.com>
2025-08-06 16:03:06 -05:00
Poag, Charis d24dc7ef89 [SWDEV-518561] Separate Driver Reload from Memory Partition Sets (#582)
Description:
  - Added a new API `amdsmi_gpu_driver_reload()` to reload the AMD GPU driver independently.
  - Updated CLI (`sudo amd-smi reset -r`) and Python bindings to support driver reload functionality.
  - Removed automatic driver reload from `amdsmi_set_gpu_memory_partition()` and `amdsmi_set_gpu_memory_partition_mode()`.
  - Enhanced CLI and test cases to allow users to control when the driver reload occurs.
  - Updated documentation and changelog to reflect the new driver reload process.
  - Improved error handling and logging for driver reload operations.
  - Added progress bar and user confirmation prompts for driver reload commands.

* Update build/test strategy to only allow one test execution at a time
* Modify API verbiage + modify systemctl error output
  - Systemctl is typically not enabled on docker.
  - And is an edge case for gpu being active process/etc for display devices.
* Remove AMDSMI_STATUS_AMDGPU_RESTART_ERR from the return values
* Move driver reload to after we save original compute partitions

---------

Signed-off-by: Charis Poag <Charis.Poag@amd.com>
2025-08-05 20:44:28 -05:00
Liu, Shuzhou (Bill) abd3c02a3c Query UBB/OAM temperature API (#581)
Add support to Query UBB/OAM temperature.
* Updated Python API with new temperature metrics enum

---------

Co-authored-by: Bill Liu <shuzhliu@amd.com>
Co-authored-by: gabrpham_amdeng <Gabriel.Pham@amd.com>
2025-08-05 20:37:45 -05:00
Arif, Maisam 240a607904 Revert "[SWDEV-505176] Submodule Unified Header in AMDSMI"
This reverts commit a315b62e37.
2025-07-30 14:08:24 -05:00
Narlo, Joseph a315b62e37 [SWDEV-505176] Submodule Unified Header in AMDSMI
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
2025-07-30 13:37:01 -05:00
Pham, Gabriel e2eac98496 [SWDEV-545342] Fixed amdsmi_link_type_t enumeration (#560)
Signed-off-by: Pham, Gabriel <Gabriel.Pham@amd.com>
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Co-authored-by: Maisam Arif <Maisam.Arif@amd.com>
2025-07-22 18:22:49 -05:00
Bindhiya Kanangot Balakrishnan 645c313f00 [SWDEV-543308] Revert amdsmi_link_metrics structure change
Moved the bit_rate and max_bandwidth back into links in the
amdsmi_link_metrics_t struct as this change was impacting
other teams. Modified the C and python API's, wrapper, and
CLI accordingly.

Signed-off-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>
2025-07-14 13:56:26 -05:00
Narlo, Joseph 2cf6272b53 [SWDEV-541675] Remove Unnecessary API from amdsmi.h (#530)
Signed-off-by: josnarlo <Joseph.Narlo@amd.com>
2025-07-07 11:14:27 -05:00
Galantsev, Dmitrii 9b5bbf555a DRM - Remove FD usage
Change-Id: I77dfa778ccd0d39a03289c2e11cf10357566ff16
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
2025-06-20 11:00:42 -05:00
Galantsev, Dmitrii 202b46d96f DRM - Remove caching
Change-Id: I21716cc953462e385e981024f75a9a7c2d76a466
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
2025-06-20 11:00:42 -05:00
Galantsev, Dmitrii cb2f152205 DRM - Update to latest public
Change-Id: I9f7b46acbae654c377702a599c4b094fd621f101
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
2025-06-20 11:00:42 -05:00
josnarlo 5ed9fba9be [SWDEV-538604] Sync Unified Header and AMDSMI Comments
Signed-off-by: josnarlo <Joseph.Narlo@amd.com>
2025-06-18 09:13:01 -05:00
josnarlo d4a946717b [SWDEV-537983] Fix comments about temperature units for amdsmi_get_temp_metric
Signed-off-by: josnarlo <Joseph.Narlo@amd.com>
2025-06-13 16:51:59 -05:00
josnarlo 4aee30f49b [SWDEV-537983] Fix comments about temperature units for amdsmi_get_temp_metric
Signed-off-by: josnarlo <Joseph.Narlo@amd.com>
2025-06-13 16:51:59 -05:00
Maisam Arif 6da33b8ded [SWDEV-529665] PLDM Bundle naming
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: Id7f652ddc4e790027869683a4aaa3226ffc05c83
2025-06-12 02:19:37 -05:00
Maisam Arif 5763412f7d [SWDEV-537491] Updated Copyright to aca-decode files
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: I9621e4c54f3b490c6eb4cfc3e9bdfb4d489f0052
2025-06-11 20:51:51 -05:00
Saeed, Oosman 815e0252b1 [SWDEV-536417] AFID & addc decode fixes (#449)
* fix endian problem
* use hw_revision and flags_mask from cper section instead of hardcoded values

---------

Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
2025-06-06 13:41:16 -05:00
Charis Poag 391451752b [SWDEV-529030/SWDEV-531217] Fix tests & output for partitioned configurations (CPX, DPX, QPX, etc.)
Changes:
  - Updated AMD SMI firmware to display "N/A" for unavailable firmware in partitioned environments, improving clarity.
    Example (in DPX):
    $ amd-smi firmware
    GPU: 0
        FW_LIST:
            ...
            FW 12:
                FW_ID: PM
                FW_VERSION: 00.86.39.00
    GPU: 1
        FW_LIST: N/A
  - Fixed amd-smi partition not showing current partition information on
    ASICs that are unable to set memory or accelerator partitions.
    $ amd-smi partition -c -m
    CURRENT_PARTITION:
    GPU_ID  MEMORY  ACCELERATOR_TYPE  ACCELERATOR_PROFILE_INDEX  PARTITION_ID
    0       NPS1    CPX               2                          0
    1       N/A     N/A               N/A                        1
    2       N/A     N/A               N/A                        2
    3       N/A     N/A               N/A                        3
    4       N/A     N/A               N/A                        4
    5       N/A     N/A               N/A                        5
    6       NPS1    SPX               0                          0
    7       NPS1    SPX               0                          0
    8       NPS1    SPX               0                          0

    MEMORY_PARTITION:
    GPU_ID  MEMORY_PARTITION_CAPS  CURRENT_MEMORY_PARTITION
    0       N/A                    NPS1
    1       N/A                    N/A
    2       N/A                    N/A
    3       N/A                    N/A
    4       N/A                    N/A
    5       N/A                    N/A
    6       N/A                    NPS1
    7       N/A                    NPS1
    8       N/A                    NPS1

  - Refactored amd_smi_drm_example.cc:
    - Grouped partition changes and restores original partition settings.
    - Now handles partitioned environments allowing example to continue even if some APIs are not supported in partitioned configurations.
  - Modified amdsmi_asic_info_t (see amdsmi_get_gpu_asic_info()) to report OAM ID as N/A if 0xFFFFFFFF (was 0xFFFF).
    Allows for better handling of OAM IDs in partitioned environments (does not exist for non-primary nodes,
    since it's a physical identifier). Easier to handle in tests and example code (i.e., now consistent w/ max size of the structure's value).
  - Introduced amdsmi_RAII_open_FD() (internal API) to manage file descriptors using RAII, ensuring proper closure and preventing resource leaks.
    Updated the following APIs to use this function:
      - amdsmi_get_gpu_asic_info(), amdsmi_get_gpu_vram_usage(),
        amdsmi_get_gpu_vram_info(), amdsmi_get_gpu_vbios_info(),
        amdsmi_get_gpu_driver_info(), amdsmi_get_gpu_virtualization_mode()
  - Updated AMD SMI test_base.cc/.h:
    - Improved output and handling for partitioned environments.
    - Added detailed ASIC information logging to align with structure changes.
    - Enhanced error messages for better context before ASSERT checks.
  - Resolved test failures in partitioned environments by updating
    logic and handling for partition-specific configurations.
    Fixed tests include:
      - computepartition_read_write.cc, frequencies_read_write.cc,
        gpu_metrics_read.cc, mem_util_read.cc, memorypartition_read_write.cc,
        perf_level_read.cc, perf_level_read_write.cc, power_cap_read_write.cc,
        power_read.cc, sys_info_read.cc, gpu_busy_read.cc

Change-Id: I36e903f8fddd714c74c719459c71aba8bbb77e6f
Signed-off-by: Charis Poag <Charis.Poag@amd.com>

Resetting head + adding fixes for tests ran in partitions

Change-Id: I0c1e9ac07488b50c95f3bc6d8a724e67d2c715dc
Signed-off-by: Charis Poag <Charis.Poag@amd.com>
2025-06-05 19:24:49 -05:00
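
Illustration for 391451752b: the message says the OAM ID is reported as N/A when the raw value is 0xFFFFFFFF (previously 0xFFFF). The sketch below shows that kind of sentinel check; the constant and function names are hypothetical and are not taken from the AMD SMI sources.

```python
# Sentinel from the commit message: the all-ones 32-bit value means "not available".
OAM_ID_UNAVAILABLE = 0xFFFFFFFF


def format_oam_id(raw_value):
    """Render an OAM ID for display, mapping the sentinel to 'N/A'."""
    if raw_value == OAM_ID_UNAVAILABLE:
        return "N/A"
    return str(raw_value)


if __name__ == "__main__":
    for raw in (0, 3, 0xFFFF, 0xFFFFFFFF):
        print(hex(raw), "->", format_oam_id(raw))
```

Under this check the old 16-bit sentinel 0xFFFF is printed as an ordinary number, which is why the sentinel has to match the full width of the field.
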
Saeed, Oosman 2c3fa591b5 [SWDEV-530385] Update aca-decode with parsing fixes (#435)
* Update aca-decode to #4cd539d that fixes some errors in parsing cper files for afid extraction
* Without this fix, we get garbage values for some cper input files relating to GFX_poison_cpers

Signed-off-by: Oosman Saeed <oossaeed@amd.com>
2025-06-04 18:49:05 -05:00
Arif, Maisam e2692ab533 Add Directory Not Found Status code to map to ENOTDIR (#238)
* Corrected ecc count error return
* Added directory not found error code
* Added ENOTDIR mapping to RSMI_STATUS_DIRECTORY_NOT_FOUND in ErrnoToRsmiStatus

---------

Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Co-authored-by: gabrpham_amdeng <Gabriel.Pham@amd.com>
2025-06-03 17:53:33 -05:00
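
Illustration for e2692ab533: the message adds an ENOTDIR mapping to a "directory not found" status in ErrnoToRsmiStatus. The Python sketch below is only an analogue of that idea; the `Status` enum, its members, and `errno_to_status` are hypothetical stand-ins, not the library's definitions.

```python
import errno
from enum import Enum


class Status(Enum):
    # Hypothetical status codes for illustration only.
    SUCCESS = 0
    FILE_NOT_FOUND = 1
    DIRECTORY_NOT_FOUND = 2
    PERMISSION_DENIED = 3
    UNKNOWN_ERROR = 4


_ERRNO_TO_STATUS = {
    0: Status.SUCCESS,
    errno.ENOENT: Status.FILE_NOT_FOUND,
    errno.ENOTDIR: Status.DIRECTORY_NOT_FOUND,  # the mapping this commit describes
    errno.EACCES: Status.PERMISSION_DENIED,
}


def errno_to_status(err):
    """Translate an errno value to a status, defaulting to UNKNOWN_ERROR."""
    return _ERRNO_TO_STATUS.get(err, Status.UNKNOWN_ERROR)


if __name__ == "__main__":
    print(errno_to_status(errno.ENOTDIR))
```

Without a dedicated entry, ENOTDIR would fall through to the generic unknown-error status, which hides the fact that a path component was not a directory.
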
Narlo, Joseph c0c4e021ea [SWDEV-532069] Doxygen Not Picking Non-Documented Values (#362)
Signed-off-by: Narlo, Joseph <Joseph.Narlo@amd.com>
Signed-off-by: Arif, Maisam <Maisam.Arif@amd.com>
Signed-off-by: Deepak Mewar <deepak.mewar@amd.com>
Co-authored-by: Arif, Maisam <Maisam.Arif@amd.com>
Co-authored-by: Deepak Mewar <deepak.mewar@amd.com>
2025-06-03 17:24:44 -05:00
Narlo, Joseph ce7d6dfe61 [SWDEV-532769] amd-smi APIs mismatch with documentation (#428)
* Populated socket_power to get power info
---------

Signed-off-by: josnarlo <Joseph.Narlo@amd.com>
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Co-authored-by: Maisam Arif <Maisam.Arif@amd.com>
2025-06-03 17:12:13 -05:00
Joseph Narlo ee43ec71e8 [SWDEV-522996] Syncing Unified Header and AMDSMI
Signed-off-by: Joseph Narlo <joseph.narlo@amd.com>
2025-06-02 13:44:33 -05:00
Maisam Arif c89b5db09d Deprecated PASID
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: Ib008f80f3d736172079358c0ceb3ebca87340d28
2025-05-30 20:48:29 -05:00
Maisam Arif cebb0799cb [SWDEV-488303] Fixed process list information source
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: Iec3416cb5ca1bdd806c3225b514bbf3dbf8c0d2e
2025-05-30 20:48:29 -05:00
Maisam Arif cc4dfd834f Version Bump 26.0.0
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: I29ea6fa781dfc338a60b390ff498c46b4a1efe52
2025-05-30 20:48:29 -05:00
Kanangot Balakrishnan, Bindhiya 2eff0b3764 [SWDEV-530633] Use gpu_metric speed and BW for xgmi (#366)
The xgmi command was showing pcie bit rate and bandwidth instead of xgmi. Corrected the API to get xgmi data from gpu metric.
Added python API for amdsmi_get_link_metrics. Modified the amdsmi_link_metrics struct.
Added check to confirm non zero partition got xgmi command.

---------

Signed-off-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
2025-05-30 16:51:11 -05:00
Arif, Maisam 42441c78ea [SWDEV-488303] Adjusted process vram_mem data source (#411)
* [SWDEV-488303] Adjusted process vram_mem data source
* Standardized sscanf format strings

---------

Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Co-authored-by: gabrpham_amdeng <Gabriel.Pham@amd.com>
2025-05-29 23:26:12 -05:00