Commit Graph

1820 Commits

Author SHA1 Message Date
Deepak Mewar 7571eb014f Updated display format of cpu & socket affinities
Signed-off-by: Deepak Mewar <deepak.mewar@amd.com>
2025-06-13 17:37:00 -05:00
Bindhiya Kanangot Balakrishnan 6fbda16098 [SWDEV-512393] Print keys of lists in custom_dump
The custom_dump function was not printing the keys of lists,
so static numa was not displaying the list keys
CPU affinity and Socket affinity. Updated custom_dump
to print the keys.
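
A minimal sketch of the behavior described in this message, assuming a recursive dumper over nested dicts and lists. The function name `dump_with_keys` and the sample data are hypothetical and are not taken from the amd-smi source; the real custom_dump may be structured differently.

```python
def dump_with_keys(data, indent=0):
    """Return output lines for a nested dict, printing the key of every list."""
    pad = " " * indent
    lines = []
    for key, value in data.items():
        if isinstance(value, dict):
            lines.append(f"{pad}{key}:")
            lines.extend(dump_with_keys(value, indent + 4))
        elif isinstance(value, list):
            # Emit the list's key first so the items below it are labeled.
            lines.append(f"{pad}{key}:")
            for item in value:
                if isinstance(item, dict):
                    lines.extend(dump_with_keys(item, indent + 4))
                else:
                    lines.append(f"{pad}    {item}")
        else:
            lines.append(f"{pad}{key}: {value}")
    return lines


# Hypothetical sample data, for illustration only.
sample = {"numa": {"cpu_affinity": [0, 1], "socket_affinity": [0]}}
print("\n".join(dump_with_keys(sample)))
```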

Signed-off-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>
2025-06-13 17:37:00 -05:00
josnarlo d4a946717b [SWDEV-537983] Fix comments about temperature units for amdsmi_get_temp_metric
Signed-off-by: josnarlo <Joseph.Narlo@amd.com>
2025-06-13 16:51:59 -05:00
josnarlo 4aee30f49b [SWDEV-537983] Fix comments about temperature units for amdsmi_get_temp_metric
Signed-off-by: josnarlo <Joseph.Narlo@amd.com>
2025-06-13 16:51:59 -05:00
Pham, Gabriel 940ece6813 Added GTT Memory to default output process table (#480)
* Added GTT Memory to default command and adjusted table format

---------

Signed-off-by: gabrpham <Gabriel.Pham@amd.com>
2025-06-13 16:43:56 -05:00
dependabot[bot] 152184dd49 Bump rocm-docs-core[api_reference] from 1.17.0 to 1.20.1 in /docs/sphinx
Bumps [rocm-docs-core[api_reference]](https://github.com/ROCm/rocm-docs-core) from 1.17.0 to 1.20.1.
- [Release notes](https://github.com/ROCm/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/v1.20.1/CHANGELOG.md)
- [Commits](https://github.com/ROCm/rocm-docs-core/compare/v1.17.0...v1.20.1)

---
updated-dependencies:
- dependency-name: rocm-docs-core[api_reference]
  dependency-version: 1.20.1
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-06-13 16:35:08 -05:00
Maisam Arif 049c59c5bb Update workflows and Contrib docs
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: I2ae31144ee1ab29c8bbba83f0c7eb0bb9dc079ba
2025-06-13 16:19:10 -05:00
Maisam Arif 772b572913 Updated 6.4.2 Changelog
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: I975f5db0bde9ebccec3756415cb1e7dc47e78988
2025-06-12 17:17:13 -05:00
Maisam Arif 6da33b8ded [SWDEV-529665] PLDM Bundle naming
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: Id7f652ddc4e790027869683a4aaa3226ffc05c83
2025-06-12 02:19:37 -05:00
Maisam Arif 5763412f7d [SWDEV-537491] Updated Copyright to aca-decode files
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: I9621e4c54f3b490c6eb4cfc3e9bdfb4d489f0052
2025-06-11 20:51:51 -05:00
Arif, Maisam 23b9da656c Fixed type hinting & Added copy rights (#462)
* Added copyrights
* Fixed type hinting for processor_handle in python_interface
* Fixed Incorrect type hinting to actual return types

---------

Signed-off-by: Arif, Maisam <Maisam.Arif@amd.com>
Change-Id: Ie2a09acf628ed0c43eacc8ec78c159d125acbcdb
2025-06-11 17:19:02 -05:00
Justin Williams 6d03ca79ff CI - Added Build Warnings
Signed-off-by: Justin Williams <Justin.Williams@amd.com>
2025-06-11 13:13:38 -05:00
Maisam Arif b579d89ae2 [SWDEV-537062] Fixed CU Occupancy reporting UINT MAX
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: I975579997a9e455eb930f6c0b8fc5f3dc3cbfae4
2025-06-11 10:42:00 -05:00
dependabot[bot] 7e956ce4f3 Bump requests from 2.32.3 to 2.32.4 in /docs/sphinx (#471)
Bumps [requests](https://github.com/psf/requests) from 2.32.3 to 2.32.4.
- [Release notes](https://github.com/psf/requests/releases)
- [Changelog](https://github.com/psf/requests/blob/main/HISTORY.md)
- [Commits](https://github.com/psf/requests/compare/v2.32.3...v2.32.4)

---
updated-dependencies:
- dependency-name: requests
  dependency-version: 2.32.4
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-06-11 08:23:27 -05:00
Maisam Arif 93404a6bff [SWDEV-529665] Fix PLDM version format
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: I7df4c2068e32a5c81c83adc69dc82a9f5d725533
2025-06-11 07:35:25 -05:00
Galantsev, Dmitrii f9b8066c26 CMAKE - Remove example build from src/CMakeLists.txt (#469)
* CMAKE - Remove example build from src/CMakeLists.txt
For some reason it was building examples every time even when not
necessary...
* CMAKE - Format
* Fix drm_example broken PRIu32
* CMAKE - Do NOT create lib64 when building examples
* CMAKE - Examples should only install C and CMake files

---------

Change-Id: I6274b72a085a41b5bd5ae698af798f60a8a092a0
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
2025-06-11 07:12:44 -05:00
Maisam Arif ac63f410c2 Fixed Parser Folder Checking
* Adjusted help text
* Adjusted --afid to run only with --cper-file
* Fixed interface return error

Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: I2b96f4515c85f3b9dd84ba5c2d819729a997141b
2025-06-10 15:58:06 -05:00
Maisam Arif fb592e003a [SWDEV-536417] CPER Display fixes
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: Ic2f3901d0f4c95bd9ed4beda8aa5fd3d596df8d2
2025-06-10 15:58:06 -05:00
Williams, Justin ae4f56d14b CI v5.0 (#459)
Signed-off-by: Justin Williams <Justin.Williams@amd.com>
2025-06-06 16:29:20 -05:00
Saeed, Oosman 815e0252b1 [SWDEV-536417] AFID & addc decode fixes (#449)
* fix endian problem
* use hw_revision and flags_mask from cper section instead of hardcoded values
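
As an illustration of the kind of byte-order issue this fix refers to, the sketch below decodes a 32-bit field with an explicit byte order. The byte values are made up, and the actual CPER layout and the code changed by this commit are not shown in this log, so treat this as an assumption about the general technique only.

```python
import struct

# Hypothetical 4-byte field as it might appear in a binary record.
raw = bytes([0x78, 0x56, 0x34, 0x12])

# Stating the byte order explicitly avoids depending on the host's endianness:
# "<" is little-endian, ">" is big-endian.
little = struct.unpack("<I", raw)[0]
big = struct.unpack(">I", raw)[0]

print(hex(little))
print(hex(big))
```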

---------

Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
2025-06-06 13:41:16 -05:00
Maisam Arif 8bc37a19d2 [SWDEV-536417] CPER & AFID CLI Fixes
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: I20aafb1cd2bf8386c30e6d0a0fff8df9c8587554
2025-06-06 12:26:13 -05:00
Charis Poag 391451752b [SWDEV-529030/SWDEV-531217] Fix tests & output for partitioned configurations (CPX, DPX, QPX, etc.)
Changes:
  - Updated AMD SMI firmware to display "N/A" for unavailable firmware in partitioned environments, improving clarity.
    Example (in DPX):
    $ amd-smi firmware
    GPU: 0
        FW_LIST:
            ...
            FW 12:
                FW_ID: PM
                FW_VERSION: 00.86.39.00
    GPU: 1
        FW_LIST: N/A
  - Fixed amd-smi partition not showing current partition information on
    ASICs that cannot set memory or accelerator partitions.
    $ amd-smi partition -c -m
    CURRENT_PARTITION:
    GPU_ID  MEMORY  ACCELERATOR_TYPE  ACCELERATOR_PROFILE_INDEX  PARTITION_ID
    0       NPS1    CPX               2                          0
    1       N/A     N/A               N/A                        1
    2       N/A     N/A               N/A                        2
    3       N/A     N/A               N/A                        3
    4       N/A     N/A               N/A                        4
    5       N/A     N/A               N/A                        5
    6       NPS1    SPX               0                          0
    7       NPS1    SPX               0                          0
    8       NPS1    SPX               0                          0

    MEMORY_PARTITION:
    GPU_ID  MEMORY_PARTITION_CAPS  CURRENT_MEMORY_PARTITION
    0       N/A                    NPS1
    1       N/A                    N/A
    2       N/A                    N/A
    3       N/A                    N/A
    4       N/A                    N/A
    5       N/A                    N/A
    6       N/A                    NPS1
    7       N/A                    NPS1
    8       N/A                    NPS1

  - Refactored amd_smi_drm_example.cc:
    - Grouped partition changes and restores original partition settings.
    - Now handles partitioned environments allowing example to continue even if some APIs are not supported in partitioned configurations.
  - Modified amdsmi_asic_info_t (see amdsmi_get_gpu_asic_info()) to report OAM ID as N/A if 0xFFFFFFFF (was 0xFFFF).
    This allows better handling of OAM IDs in partitioned environments (the ID does not exist for non-primary nodes,
    since it is a physical identifier). It is also easier to handle in tests and example code (i.e., now consistent with the max size of the structure's value).
  - Introduced amdsmi_RAII_open_FD() (internal API) to manage file descriptors using RAII, ensuring proper closure and preventing resource leaks.
    Updated the following APIs to use this function:
      - amdsmi_get_gpu_asic_info(), amdsmi_get_gpu_vram_usage(),
        amdsmi_get_gpu_vram_info(), amdsmi_get_gpu_vbios_info(),
        amdsmi_get_gpu_driver_info(), amdsmi_get_gpu_virtualization_mode()
  - Updated AMD SMI test_base.cc/.h:
    - Improved output and handling for partitioned environments.
    - Added detailed ASIC information logging to align with structure changes.
    - Enhanced error messages for better context before ASSERT checks.
  - Resolved test failures in partitioned environments by updating
    logic and handling for partition-specific configurations.
    Fixed tests include:
      - computepartition_read_write.cc, frequencies_read_write.cc,
        gpu_metrics_read.cc, mem_util_read.cc, memorypartition_read_write.cc,
        perf_level_read.cc, perf_level_read_write.cc, power_cap_read_write.cc,
        power_read.cc, sys_info_read.cc, gpu_busy_read.cc
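
The RAII-managed file descriptor mentioned in the change list above can be sketched in Python with a context manager. This is only an analogue: amdsmi_RAII_open_FD() is an internal C++ API whose signature is not shown in this log, and the helper name `open_fd` and the temporary file used here are hypothetical.

```python
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def open_fd(path, flags=os.O_RDONLY):
    """Yield a raw file descriptor and always close it on exit."""
    fd = os.open(path, flags)
    try:
        yield fd
    finally:
        # Runs on normal exit and on exceptions, so the descriptor cannot leak.
        os.close(fd)


with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "node")  # hypothetical stand-in for a device node
    with open(path, "wb") as f:
        f.write(b"amdgpu")

    with open_fd(path) as fd:
        data = os.read(fd, 6)

    # After the block, the descriptor is closed; using it raises OSError.
    try:
        os.fstat(fd)
        closed = False
    except OSError:
        closed = True
```

The point of the pattern is that every early return or error path closes the descriptor without a matching manual close call.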

Change-Id: I36e903f8fddd714c74c719459c71aba8bbb77e6f
Signed-off-by: Charis Poag <Charis.Poag@amd.com>

Resetting head + adding fixes for tests run in partitions

Change-Id: I0c1e9ac07488b50c95f3bc6d8a724e67d2c715dc
Signed-off-by: Charis Poag <Charis.Poag@amd.com>
2025-06-05 19:24:49 -05:00
Pham, Gabriel f0233eb664 [SWDEV-536184] Removed extra debug print statement (#447)
Signed-off-by: gabrpham_amdeng <Gabriel.Pham@amd.com>
Signed-off-by: Pham, Gabriel <Gabriel.Pham@amd.com>
2025-06-05 17:50:56 -05:00
gabrpham_amdeng 7130de3058 [SWDEV-536184] Modified KFD fallback condition for getting VRAM to include sysfs read failures
Signed-off-by: gabrpham_amdeng <Gabriel.Pham@amd.com>
2025-06-05 01:49:16 -05:00
Bindhiya Kanangot Balakrishnan 872c58b7a3 [SWDEV-534746] Generate valid json output for partition command
The amd-smi partition --json output was not valid JSON.
Updated the command so that its output is valid JSON.

Signed-off-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>
2025-06-05 01:40:52 -05:00
Saeed, Oosman 2c3fa591b5 [SWDEV-530385] Update aca-decode with parsing fixes (#435)
* Update aca-decode to #4cd539d, which fixes some errors in parsing CPER files for AFID extraction
* Without this fix, we get garbage values for some CPER input files relating to GFX_poison_cpers

Signed-off-by: Oosman Saeed <oossaeed@amd.com>
2025-06-04 18:49:05 -05:00
Arif, Maisam e2692ab533 Add Directory Not Found Status code to map to ENOTDIR (#238)
* Corrected ecc count error return
* Added directory not found error code
* Added ENOTDIR mapping to RSMI_STATUS_DIRECTORY_NOT_FOUND in ErrnoToRsmiStatus
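
A minimal sketch of an errno-to-status lookup like the one described above. Only the ENOTDIR to "directory not found" mapping comes from this commit message; the `Status` enum, its other members, and the remaining table entries are hypothetical stand-ins and do not reflect the real ErrnoToRsmiStatus.

```python
import errno
from enum import Enum


class Status(Enum):
    # Hypothetical stand-ins for the library's status codes.
    SUCCESS = 0
    FILE_ERROR = 1
    PERMISSION = 2
    DIRECTORY_NOT_FOUND = 3
    UNKNOWN_ERROR = 4


_ERRNO_TO_STATUS = {
    0: Status.SUCCESS,
    errno.ENOENT: Status.FILE_ERROR,           # assumed entry
    errno.EACCES: Status.PERMISSION,           # assumed entry
    errno.ENOTDIR: Status.DIRECTORY_NOT_FOUND, # the mapping this commit adds
}


def errno_to_status(err):
    """Translate an errno value, falling back to UNKNOWN_ERROR."""
    return _ERRNO_TO_STATUS.get(err, Status.UNKNOWN_ERROR)


print(errno_to_status(errno.ENOTDIR))
```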

---------

Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Co-authored-by: gabrpham_amdeng <Gabriel.Pham@amd.com>
2025-06-03 17:53:33 -05:00
Narlo, Joseph c0c4e021ea [SWDEV-532069] Doxygen Not Picking Non-Documented Values (#362)
Signed-off-by: Narlo, Joseph <Joseph.Narlo@amd.com>
Signed-off-by: Arif, Maisam <Maisam.Arif@amd.com>
Signed-off-by: Deepak Mewar <deepak.mewar@amd.com>
Co-authored-by: Arif, Maisam <Maisam.Arif@amd.com>
Co-authored-by: Deepak Mewar <deepak.mewar@amd.com>
2025-06-03 17:24:44 -05:00
Narlo, Joseph ce7d6dfe61 [SWDEV-532769] amd-smi APIs mismatch with documentation (#428)
* Populated socket_power to get power info
---------

Signed-off-by: josnarlo <Joseph.Narlo@amd.com>
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Co-authored-by: Maisam Arif <Maisam.Arif@amd.com>
2025-06-03 17:12:13 -05:00
Bindhiya Kanangot Balakrishnan 8f943b03e1 [SWDEV-534745] Generate valid json output for xgmi command
The amd-smi xgmi --json output was not valid JSON.
Updated the command so that its output is valid JSON.

Signed-off-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>
2025-06-03 12:48:02 -05:00
Saeed, Oosman fab13c5b60 [SWDEV-530385] show afids on each line of printout (#422)
* show afids on each line of printout
* clean up afids and cper code
---------

Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Co-authored-by: Maisam Arif <Maisam.Arif@amd.com>
2025-06-02 17:22:10 -05:00
Pham, Gabriel 91021da055 [SWDEV-446039] Added Flat Process table to default output (#425)
Signed-off-by: gabrpham_amdeng <Gabriel.Pham@amd.com>
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Co-authored-by: Maisam Arif <Maisam.Arif@amd.com>
2025-06-02 17:15:15 -05:00
Kanangot Balakrishnan, Bindhiya 8ed52616ad [SWDEV-519061] xgmi command output shows zero for all xgmi acc read/write data in the first column (#392)
The accumulated xgmi read and write data in the gpu metric is indexed
according to the sysfs xgmi_port_num file. Mapped the two so that
read and write are displayed per src_gpu vs. dst_gpu.
---------

Signed-off-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>
2025-06-02 14:01:06 -05:00
Justin Williams bf0448ff96 [SWDEV-533596] CI - Fixed Docs
Signed-off-by: Justin Williams <Justin.Williams@amd.com>
2025-06-02 13:48:01 -05:00
Joseph Narlo ee43ec71e8 [SWDEV-522996] Syncing Unified Header and AMDSMI
Signed-off-by: Joseph Narlo <joseph.narlo@amd.com>
2025-06-02 13:44:33 -05:00
Maisam Arif 996917e9bc Updated Changelog
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: I10efa8ed10288d3445a330ad27081d1f03113b38
2025-05-30 20:48:29 -05:00
Maisam Arif c89b5db09d Deprecated PASID
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: Ib008f80f3d736172079358c0ceb3ebca87340d28
2025-05-30 20:48:29 -05:00
Maisam Arif cebb0799cb [SWDEV-488303] Fixed process list information source
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: Iec3416cb5ca1bdd806c3225b514bbf3dbf8c0d2e
2025-05-30 20:48:29 -05:00
Maisam Arif cc4dfd834f Version Bump 26.0.0
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: I29ea6fa781dfc338a60b390ff498c46b4a1efe52
2025-05-30 20:48:29 -05:00
gabrpham_amdeng c8f33c96c3 Updated CLI Tool Help
Signed-off-by: gabrpham_amdeng <Gabriel.Pham@amd.com>
2025-05-30 20:10:32 -05:00
dependabot[bot] dd81cfd688 Bump tornado from 6.4.2 to 6.5.1 in /docs/sphinx (#418)
Bumps [tornado](https://github.com/tornadoweb/tornado) from 6.4.2 to 6.5.1.
- [Changelog](https://github.com/tornadoweb/tornado/blob/master/docs/releases.rst)
- [Commits](https://github.com/tornadoweb/tornado/compare/v6.4.2...v6.5.1)

---
updated-dependencies:
- dependency-name: tornado
  dependency-version: 6.5.1
  dependency-type: indirect
...
2025-05-30 19:53:58 -05:00
gabrpham_amdeng 1fa4cdacf3 Suppressed help text of default command
Signed-off-by: gabrpham_amdeng <Gabriel.Pham@amd.com>
2025-05-30 19:53:14 -05:00
Pham, Gabriel daf74d1cd6 [SWDEV-511822] Added group check to default command (#415)
Signed-off-by: Pham, Gabriel <Gabriel.Pham@amd.com>
2025-05-30 18:40:18 -05:00
Kanangot Balakrishnan, Bindhiya 2eff0b3764 [SWDEV-530633] Use gpu_metric speed and BW for xgmi (#366)
The xgmi command was showing pcie bit rate and bandwidth instead of xgmi. Corrected the API to get xgmi data from gpu metric.
Added python API for amdsmi_get_link_metrics. Modified the amdsmi_link_metrics struct.
Added a check for non-zero partitions in the xgmi command.

---------

Signed-off-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
2025-05-30 16:51:11 -05:00
Castillo, Juan 2e8aaf02c9 [SWDEV-534728] Fixed deep_sleep status does not work with --json flag (#413)
- In JSON output mode, the .rstrip function does not work because the value is a dict object.
    - clk_value is now checked for being a dict instance before the value is extracted.
    - If clk_value is a dict, the .get() function is used to extract the value.
    - Otherwise it is a string object, and .split() is used to extract the value.
    - If clk_value is < min_clk_value, deep_sleep is set to ENABLED.
    - Initialize clk_value and min_clk_value to 0 for each loop.
    - Fix if/else for better readability.
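
The dict-versus-string handling described in this list can be sketched as follows. The dict key `"value"`, the `"800 MHz"` string format, and both function names are assumptions made for illustration; the actual CLI code is not shown in this log.

```python
def extract_clk(clk_value):
    """Return the numeric clock from a JSON-style dict or a '<number> <unit>' string."""
    if isinstance(clk_value, dict):
        # JSON mode: the value arrives as a dict, so string methods do not apply.
        return int(clk_value.get("value", 0))
    # Human-readable mode: take the leading number from the string.
    return int(str(clk_value).split()[0])


def deep_sleep_status(clk_value, min_clk_value):
    """Report ENABLED when the current clock is below the minimum clock."""
    clk = extract_clk(clk_value)
    min_clk = extract_clk(min_clk_value)
    return "ENABLED" if clk < min_clk else "DISABLED"


print(deep_sleep_status({"value": 90, "unit": "MHz"}, {"value": 500, "unit": "MHz"}))
print(deep_sleep_status("800 MHz", "500 MHz"))
```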

---------

Signed-off-by: Juan Castillo <juan.castillo@amd.com>
2025-05-30 16:45:32 -05:00
Arif, Maisam 42441c78ea [SWDEV-488303] Adjusted process vram_mem data source (#411)
* [SWDEV-488303] Adjusted process vram_mem data source
* Standardized sscanf format strings

---------

Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Co-authored-by: gabrpham_amdeng <Gabriel.Pham@amd.com>
2025-05-29 23:26:12 -05:00
Maisam Arif 876f3976e0 [SWDEV-523247] Corrected amdsmi_get_gpu_vram_usage total
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: I0f8bb067bf34f64d1b8d41e2a89d3a79a6745990
2025-05-29 21:30:00 -05:00
Arif, Maisam 0fdaebdbaa [SWDEV-488303] Updated CU occupancy for per-process retrieval (#243)
Change-Id: I2990597c6dd4b2e8cf3e11ce60f72049ebdd9a8c
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
2025-05-29 20:35:27 -05:00
Maisam Arif fba62e2270 [SWDEV-534707] Adjust power value documentation
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: I1c4516e403715b9a1fe9c78fae94848c89daa920
2025-05-29 18:55:44 -05:00
Liu, Shuzhou (Bill) 970560fc7c [SWDEV-520665] Add support for board voltage (#303)
* Add the API and CLI to show the board voltage. 

---------

Change-Id: Icb25bd653bb1d004704b5a21b378ca31b2b242c7
Signed-off-by: Arif, Maisam <Maisam.Arif@amd.com>
Signed-off-by: AL Musaffar, Yazen <Yazen.ALMusaffar@amd.com>
2025-05-29 18:55:08 -05:00