Commit graph

1745 Commits

Author SHA1 Message Date
AL Musaffar, Yazen bd39e461a5 [SWDEV-530385] CPER cannot be dumped continuously with "--follow" fix (#377)
--follow fix

Co-authored-by: Yazen ALMusaffar <yalmusaf@amd.com>
Change-Id: I911f456f3f658e694979c7ae014fb0b6bb3e45c1
2025-05-20 01:10:22 -05:00
Mewar, Deepak b999f86611 [SWDEV-512393] Added amdsmi_get_cpu_affinity_with_scope (#198)
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Signed-off-by: Deepak Mewar <deepak.mewar@amd.com>
2025-05-20 01:06:09 -05:00
Pryor, Adam 51e99965b3 [SWDEV-527092] - Fix ringhang event removal (#372)
Signed-off-by: adapryor <Adam.pryor@amd.com>
2025-05-16 16:41:31 -05:00
Pryor, Adam 8713305f80 [SWDEV-527092] - Process Start/Stop event addition (#368)
- Added more events to `amdsmi_evt_notification_type_t`

Change-Id: I6a256fe828e4bec3197c7fecbed374ab17c6f850
Signed-off-by: Adam Pryor <Adam.Pryor@amd.com>
2025-05-16 11:01:15 -05:00
Poag, Charis bacbaac0b1 [SWDEV-517154] Rename compute_partition to accelerator_partition (#358)
Changes:
  - Updated references in the codebase to rename `COMPUTE_PARTITION` to `ACCELERATOR_PARTITION`
  - Moved around and rephrased several duplicated lines in the CHANGELOG.md file

Change-Id: Id6bc86a7133e952cca6ef0acb1616ad6251d19d4
Signed-off-by: Charis Poag <Charis.Poag@amd.com>
2025-05-16 10:44:54 -05:00
Galantsev, Dmitrii 912bcfaae9 Pad asic_serial and UUID with zeros
Change-Id: Icf9bfdf9be60525433da378a77ddf5a8bcc21579
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
2025-05-16 10:34:32 -05:00
Saeed, Oosman 1bb1f8acc2 [SWDEV-522623] Add afid functionality to API and CLI (#330)
Change-Id: I015bde926491d54e09da8f39b05650515711e09f

Signed-off-by: Oosman Saeed <oossaeed@amd.com>
Co-authored-by: Oosman Saeed <oossaeed@amd.com>
2025-05-16 10:49:56 +08:00
Park, Peter d4f057f95f [SWDEV-528854] docs: Add description of N/A in SMI tool output (#363)
Signed-off-by: Park, Peter <Peter.Park@amd.com>
2025-05-14 11:43:33 -05:00
josnarlo dd69aa1924 [SWDEV-532119] Fix building examples
Signed-off-by: josnarlo <Joseph.Narlo@amd.com>
2025-05-13 20:19:51 -05:00
Maisam Arif ca297419c4 Fix reference to gpu_metrics adjustment for v1_8
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: Idf6307292d37c973b0d44a187a2334f2cad8047d
2025-05-13 18:21:29 -05:00
Narlo, Joseph a66c6fa03b [SWDEV-418701] rsmitst Fallback with Driver-less Hardware (#329)
Signed-off-by: josnarlo <Joseph.Narlo@amd.com>
2025-05-09 12:33:13 -05:00
Hila, Nino 24fdf0ae0f Update palamida.yml (#346)
* Add palamida.yml
2025-05-09 11:57:02 -05:00
Castillo, Juan e123e986f9 [SWDEV-530211] Fix for VCLK & DCLK N/A values + Update deep sleep logic (#342)
- Updated VCLK and DCLK min/max clock logic to populate N/A values.
- Updated VCLK and DCLK to show all available clocks.
- Updated deep_sleep logic using sys/fs clk_deep_sleep true/false.
- Added clarifying comments.
- Updated error output using e.get_error_info() instead of just error.
- Updated changelog

---------

Signed-off-by: Juan Castillo <juan.castillo@amd.com>
2025-05-08 14:39:21 -05:00
Arif, Maisam 249537b2ff CPER Doc update (#352)
Change-Id: I59053eda863fc2b7349a3071a02e4557a8abe8c7

Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
2025-05-08 12:20:00 -05:00
Arif, Maisam ace3b0901a Version & Doc update (#343)
Change-Id: Ibf8e1809913e30aba4b21ba889b72e5db7205736

Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
2025-05-08 12:19:04 -05:00
Williams, Justin 33283d281c Updated CODEOWNERS usernames (#354)
Signed-off-by: Justin Williams <Justin.Williams@amd.com>
2025-05-08 12:02:26 -05:00
Galantsev, Dmitrii bd82e881f5 [SWDEV-529762] CMAKE - Fix lintian issues (#325)
Change-Id: Ide3563a876cb530d0e80676e78f36f18a233a3ba

Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
2025-05-06 17:59:47 -05:00
Galantsev, Dmitrii 42c77a5912 CMAKE - Format with cmake-format
Change-Id: I5b86b7b83e3d151c3d6e1c216ecb28f1313d538a
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
2025-05-06 17:09:53 -05:00
Galantsev, Dmitrii 17b01e2456 CI - Add cmake-format to workflows
Change-Id: Iba0ab896a42abecf389e6b92811343e1fd51c302
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
2025-05-06 17:09:53 -05:00
Arif, Maisam ee14ef7b95 [SWDEV-531364] Removed Python API debug statements (#351)
Removed Python API debug statements

Change-Id: Ifc17a7b49b11bce56075d620a9b0e7cbbdb5f417

Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
2025-05-06 14:01:59 -05:00
Charis Poag da1024cf96 [SWDEV-528647/SWDEV-528450] Follow up Fix incorrect domain
Changes:
- Misc improvements
- Domain showed incorrectly for devices with different domains
  ex.
  GPU: 3
      BDF: 3000:01:00.0

  The fix reports the domain in the proper format:
    GPU: 3
        BDF: 0003:01:00.0

Change-Id: Ida4a0acb4922f3c2cb61a9e9cd0b7d1be31061a8
Signed-off-by: Charis Poag <Charis.Poag@amd.com>
2025-05-06 12:50:43 -05:00
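The corrected output in this commit follows the usual PCI `domain:bus:device.function` layout, with the domain printed as four hex digits. A minimal sketch of that formatting, assuming a hypothetical helper `format_bdf` that is not part of the AMD SMI API:

```python
def format_bdf(domain: int, bus: int, device: int, function: int) -> str:
    """Format a PCI address as DDDD:BB:DD.F using lowercase hex.

    Hypothetical helper for illustration only; the commit's real
    implementation is not shown in this log.
    """
    return f"{domain:04x}:{bus:02x}:{device:02x}.{function:x}"


# Domain 3, bus 1, device 0, function 0
print(format_bdf(3, 0x01, 0x00, 0))  # 0003:01:00.0
```

Zero-padding each field to a fixed width keeps the domain from being read as the leading digits of a longer number, which is the kind of confusion the `3000:01:00.0` output invites.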
Justin Williams 1d89ec207b [SWDEV-527430] Added Major & Minor ABI Breakage Labels
Signed-off-by: Justin Williams <Justin.Williams@amd.com>
2025-05-05 18:41:46 -05:00
Galantsev, Dmitrii fe98b8bd63 CMAKE - Clean-up cmake changes introduced in a9b8b6d369b390af0c00bbffab2b4fe1748b8bad
Change-Id: Ida0e9475a926a2495e36b0d9bc2468c48aee0e77
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
2025-05-05 15:43:12 -05:00
Bindhiya Kanangot Balakrishnan 62294df49a [SWDEV-518229] Display single N/A in case of empty clock
When all clocks are N/A, they are all filtered out. To
avoid confusion, a single N/A is displayed instead.

Signed-off-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>
2025-05-05 14:06:31 -05:00
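The behavior described in this commit can be sketched as a filter with a fallback. This is only an illustration: `filter_clocks` and the dictionary layout are assumptions, not the CLI's actual code.

```python
from typing import Dict, Union

NA = "N/A"


def filter_clocks(clocks: Dict[str, object]) -> Union[Dict[str, object], str]:
    """Drop N/A entries; collapse to a single N/A when nothing is left.

    Hypothetical helper: the real CLI data structures are not shown in the log.
    """
    available = {name: value for name, value in clocks.items() if value != NA}
    # If every clock was N/A (or there were none), show one N/A
    # rather than an empty section.
    return available if available else NA


print(filter_clocks({"clk0": 500, "clk1": NA}))
print(filter_clocks({"clk0": NA, "clk1": NA}))
```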
Poag, Charis b5a43b7744 [SWDEV-528647/SWDEV-528450] Reduce API load times and libdrm/libdrm_amdgpu dynamic loading (#333)
Changes:
- Removed libdrm/libdrm_amdgpu dependencies
- Added/updated new internal libdrm/libdrm_amdgpu/xf86drm APIs
  to allow our APIs to reference before dynamic loading
  the libdrm/libdrm_amdgpu libraries:
  1. Updated amdgpu_drm.h to what's seen in mainline
  2. Added xf86drm.h to what's seen in mainline
- Modified internal DRM capabilities:
  1. Require each API to independently connect to libdrm/libdrm_amdgpu
     + validate API handles responses accordingly
  2. Initialization of AMD SMI no longer has as strong of a tie to
     libdrm
- Updated internal implementations of several APIs which have
connections to libdrm/libdrm_amdgpu or APIs which have conflicts
with open libdrm/libdrm_amdgpu connections:
  1. amdsmi_init()
  2. amdsmi_get_gpu_vram_usage()
  3. amdsmi_get_gpu_asic_info()
  4. amdsmi_get_gpu_vram_info()
  5. amdsmi_get_gpu_vbios_info()
  6. amdsmi_get_gpu_driver_info()
  7. amdsmi_get_gpu_virtualization_mode()
  8. amdsmi_set_gpu_memory_partition()
  9. amdsmi_set_gpu_memory_partition_mode()
- Cleaned up affected tests/APIs

Change-Id: I96e2cf1b06b0cfee1b01a5e991ccc6116c4245a8
2025-05-02 21:58:53 -05:00
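The idea of loading libdrm only when an API actually needs it can be sketched as a lazy loader. The library itself is native code and most likely uses `dlopen`-style loading, so this Python `ctypes` version is an analogy under that assumption. `LazyLibrary` is a hypothetical name.

```python
import ctypes
import ctypes.util
from typing import Optional


class LazyLibrary:
    """Load a shared library on first use instead of at initialization.

    Hypothetical illustration; not the AMD SMI implementation.
    """

    def __init__(self, name: str) -> None:
        self._name = name
        self._handle: Optional[ctypes.CDLL] = None
        self._tried = False

    def get(self) -> Optional[ctypes.CDLL]:
        # Attempt the load only once; later calls reuse the cached result.
        if not self._tried:
            self._tried = True
            path = ctypes.util.find_library(self._name)
            if path is not None:
                try:
                    self._handle = ctypes.CDLL(path)
                except OSError:
                    self._handle = None
        return self._handle


drm = LazyLibrary("drm")  # nothing is loaded yet
handle = drm.get()        # loaded here, or None if libdrm is unavailable
```

Callers check for `None` and degrade gracefully, so initialization no longer fails or stalls just because the library is missing.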
Williams, Justin 757aa016f4 [SWDEV-500518/SWDEV-527430] CI v4.1 (#316)
[SWDEV-500518] CI v4.1

Signed-off-by: Justin Williams <Justin.Williams@amd.com>
2025-04-29 18:52:32 -05:00
Saeed, Oosman 9c297639f3 [SWDEV-529266] [MI308][AMDSMI][RAS CPER] CPER dump not working on CPX mode (#319)
* Do not raise exception for cper status not found, but keep iterating to next gpu

* use partition id and skip if non-zero

* reverting un-needed change

---------

Co-authored-by: Oosman Saeed <oossaeed@amd.com>
2025-04-29 18:52:32 -05:00
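The control flow described in this commit (skip non-zero partitions, and continue past GPUs with no CPER status instead of raising) might look like the sketch below. Every name here (`CperStatusNotFound`, `collect_cper`, and the two callables) is hypothetical and stands in for code the log does not show.

```python
from typing import Callable, Dict, Iterable, List


class CperStatusNotFound(Exception):
    """Hypothetical error for a GPU that has no CPER status to report."""


def collect_cper(
    gpus: Iterable[int],
    partition_id: Callable[[int], int],
    read_cper: Callable[[int], List[str]],
) -> Dict[int, List[str]]:
    """Gather CPER entries per GPU without aborting on a missing status."""
    entries: Dict[int, List[str]] = {}
    for gpu in gpus:
        # Only partition 0 of each device is queried.
        if partition_id(gpu) != 0:
            continue
        try:
            entries[gpu] = read_cper(gpu)
        except CperStatusNotFound:
            # Keep iterating to the next GPU rather than failing the dump.
            continue
    return entries
```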
Kanangot Balakrishnan, Bindhiya e26e26e308 [SWDEV-518229] Avoid N/A leaves filtering from static (#326)
The N/A leaves filtering was removing clock in static.
To avoid this, removed N/A filtering from single tier.

Signed-off-by: Kanangot Balakrishnan, Bindhiya <Bindhiya.KanangotBalakrishnan@amd.com>
2025-04-29 18:52:32 -05:00
gabrpham_amdeng 81c65ed15d Added DRM check back for market name
2025-04-29 18:52:32 -05:00
gabrpham_amdeng 59f0f3f9eb [SWDEV-529889] Fixed incorrect vendor_id reporting in amdsmi_get_gpu_asic_info
Signed-off-by: gabrpham_amdeng <Gabriel.Pham@amd.com>
2025-04-29 18:52:32 -05:00
gabrpham_amdeng 4e232e287d [SWDEV-529889] Fixed incorrect vendor_id reporting in amdsmi_get_gpu_asic_info
Signed-off-by: gabrpham_amdeng <Gabriel.Pham@amd.com>
2025-04-29 18:52:32 -05:00
gabrpham_amdeng 1ab57ce7dd [SWDEV-529889] Fixed incorrect vendor_id reporting in amdsmi_get_gpu_asic_info
Signed-off-by: gabrpham_amdeng <Gabriel.Pham@amd.com>
2025-04-29 18:52:32 -05:00
Maisam Arif ea1b24003e Removed post install pciids update
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: If6a3fcec4778c99246e899890f1d586656841d3e
2025-04-29 18:52:32 -05:00
Pham, Gabriel 34678cc0ae [SWDEV-522336] Enabled Topology command in Guest (#304)
Enabled Topology command in Guest

---------

Signed-off-by: gabrpham_amdeng <Gabriel.Pham@amd.com>
2025-04-29 18:52:32 -05:00
Poag, Charis 6a43b64d0c [SWDEV-518561] P1: Inconsistent memory partition change (#315)
Changes:
- Cleaned up drm connections before changing memory partitions

Change-Id: I08430d22536bb6094c91a06fbf2f74d9b75a4fa7

Signed-off-by: Charis Poag <Charis.Poag@amd.com>
2025-04-29 18:52:32 -05:00
Kanangot Balakrishnan, Bindhiya 797e4fba07 [SWDEV-518229] Filter N/A's from amd-smi metric clock CLI
The 'amd-smi metric --clock' was listing values with N/A. Filtered these outputs to show only available values.

Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
2025-04-28 12:28:59 -05:00
Maisam Arif 4099fee17f [SWDEV-529603] Fix subcommand aliasing
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: I1c15faffa5fda1b8f5c2f5d6e4c7ae746ff8ee7c
2025-04-28 10:44:01 -05:00
Narlo, Joseph d5ce95573f [SWDEV-522996] Sync Unified Header and AMDSMI (#305)
Sync Unified Header and AMDSMI

Signed-off-by: josnarlo <Joseph.Narlo@amd.com>

---------

Signed-off-by: josnarlo <Joseph.Narlo@amd.com>
2025-04-24 13:31:08 -05:00
Narlo, Joseph 30ebf19893 [SWDEV-489696] Improve Functional Test (#241)
Improve Functional Test for 
- SetCheckPowerCap
- TestPowerCapReadWrite
---------

Signed-off-by: josnarlo <Joseph.Narlo@amd.com>
2025-04-24 12:33:55 -05:00
Williams, Justin 661989c339 [SWDEV-527430] Fixed Mainline ABI Check (#310)
Signed-off-by: Justin Williams <Justin.Williams@amd.com>
2025-04-24 10:21:34 -05:00
Poag, Charis 817c077067 [SWDEV-528097] Fix HIP_UUID output from updates (#311)
Changes:
- Changed HIP_UUID to reference rsmi_dev_unique_id_get()
- Added better logging for amdsmi_get_gpu_device_uuid() references

Change-Id: Ie233044de8c6e85b807faf22121b450233db861b

Signed-off-by: Charis Poag <Charis.Poag@amd.com>
2025-04-24 10:21:04 -05:00
Kanangot Balakrishnan, Bindhiya 8e5f6b1a8d [SWDEV-520371] Generate valid json format output (#273)
Earlier, the amd-smi metric and static json output
was not in valid json format. Changes are done to
get the output in valid json format.

---------
Change-Id: I5576333269509f63b3c800f225c3d73127ce80cf

Signed-off-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Co-authored-by: Maisam Arif <Maisam.Arif@amd.com>
2025-04-23 00:08:43 -05:00
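The commit does not say what made the earlier output invalid. One common cause, assumed here purely for illustration, is printing one JSON object per GPU back to back instead of serializing a single top-level array. The function names below are hypothetical.

```python
import json
from typing import Dict, List


def render_json(per_gpu: List[Dict[str, object]]) -> str:
    """Serialize all GPUs as one JSON array, which parses as one document."""
    return json.dumps(per_gpu, indent=4)


def render_concatenated(per_gpu: List[Dict[str, object]]) -> str:
    """Anti-pattern: several objects back to back are not one JSON document."""
    return "\n".join(json.dumps(entry, indent=4) for entry in per_gpu)


data = [{"gpu": 0, "clock": 500}, {"gpu": 1, "clock": "N/A"}]
parsed = json.loads(render_json(data))  # parses cleanly
```

Passing the result of `render_concatenated(data)` to `json.loads` raises `json.JSONDecodeError`, which is the failure a consumer of the CLI would hit.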
Arif, Maisam b2f1df85a3 Improved import structure (#275)
* Improved import structure

Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: Id265cbb7dba5ba805b7cf7c353af870fef6cbb4a

---------

Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
2025-04-22 23:52:06 -05:00
Arif, Maisam 53dbb7bf58 CLI Help text and parser formatting updates (#218)
* Small Fixes
* CLI Help text and parser formatting updates
* Changed metavar for set partition

---------
Change-Id: Ia8809665f6fac670452cd4db4e5e8f9c7270faba
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Signed-off-by: Pham, Gabriel <Gabriel.Pham@amd.com>
Co-authored-by: Pham, Gabriel <Gabriel.Pham@amd.com>
2025-04-22 23:32:42 -05:00
Arif, Maisam 63b13ecb05 Reduce Load times for Partition CLI (#290)
* Reduced Load times for CLI in partition mode
* Change rsmi_dev_id_get() to use KFD, if KGD interface does not exist
* Make gpu_device_uuid fallback to rsmi_wrapper
* Moved Enumeration info calls in list for more speed
* Excluded the group check from recursion

---------

Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Signed-off-by: Charis Poag <Charis.Poag@amd.com>
Co-authored-by: Charis Poag <Charis.Poag@amd.com>
Co-authored-by: gabrpham_amdeng <Gabriel.Pham@amd.com>
2025-04-22 22:54:43 -05:00
AL Musaffar, Yazen f1f782312d [SWDEV-528364] CPER CLI --follow fix (#298)
Signed-off-by: Yazen ALMusaffar <yalmusaf@amd.com>
2025-04-22 22:52:03 -05:00
Poag, Charis b58625cafa [SWDEV-528097] Unique ID fix for missing ID in KGD -> use KFD's (#292)
Changes:
   - Unique Id tries reading from KGD
     -> falls back to use KFD if not found

Change-Id: I05456dd79715e04d83f118b5bb4f1d3612822173
---------

Signed-off-by: Charis Poag <Charis.Poag@amd.com>
Signed-off-by: Arif, Maisam <Maisam.Arif@amd.com>
2025-04-22 16:27:33 -05:00
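The fallback described in this commit (try KGD first, then KFD) can be sketched with two reader callables. `read_unique_id` and its parameters are hypothetical. The actual sources and their paths are not given in this log.

```python
from typing import Callable, Optional


def read_unique_id(
    read_kgd: Callable[[], Optional[str]],
    read_kfd: Callable[[], Optional[str]],
) -> Optional[str]:
    """Return the unique ID from KGD, falling back to KFD if it is missing.

    Hypothetical sketch: each reader returns None (or an empty string)
    when its source has no ID for the device.
    """
    unique_id = read_kgd()
    if unique_id:
        return unique_id
    return read_kfd()
```

Passing the readers in as callables keeps the lookup order in one place and makes the fallback easy to test without real hardware.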
Hila, Nino bd20b1b150 Add palamida.yml
2025-04-22 16:07:51 -05:00
Justin Williams 7aa45b2f55 [SWDEV-527430] Add API breakage checks for amd-smi
Signed-off-by: Justin Williams <Justin.Williams@amd.com>
2025-04-22 13:18:39 -05:00
AL Musaffar, Yazen d6954bcc62 Removed CPER tests and adjust the implementation (#269)
- Moved helper functions into amdsmi_utils.cc
- Removed tests since they are not working.

---------

Co-authored-by: Saeed, Oosman <Oosman.Saeed@amd.com>
2025-04-21 14:54:47 -05:00