Commit Graph

40 Commits

Author SHA1 Message Date
Saeed, Oosman 5b95d227bc [SWDEV-538308] CPER CLI 20 limit bug (#499)
The bug was reproduced as follows.

In terminal #1, run the command:
sudo amd-smi ras --cper --gpu 6 --severity all --folder /tmp/cper_dump --follow

In terminal #2, inject errors:
while true; do sudo amdgpuras -b 7 -s 1 -m 6 -t 2; sleep 2; done

Terminal #1 starts printing the CPER entries it captures. After 20 entries have been captured, open terminal #3 and run the same command as in terminal #1:
sudo amd-smi ras --cper --gpu 6 --severity all --folder /tmp/cper_dump --follow

Terminal #3 produces no output, even while terminal #1 continues capturing and printing entries.

The fix:

Because the GPU buffer already holds more than 20 CPER entries, running the command from terminal #3 starts capturing from the beginning and passes 20 buffers to copy entries into, so the C++ API returns a status code indicating that more data is available.

The Python CLI should not treat this code as an error; it should continue to print what the API returned.
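
The handling described above can be sketched roughly as follows. This is only an illustration, not the actual amd-smi code: `Status`, `fetch_entries`, and `drain` are hypothetical names, and the in-memory list stands in for the GPU's CPER buffer. The point it shows is that a "more data" status is a normal result whose entries are printed before the next read.

```python
from enum import Enum


class Status(Enum):
    """Hypothetical stand-in for the library's status codes."""
    SUCCESS = 0
    MORE_DATA = 1
    ERROR = 2


def fetch_entries(buffer, cursor, max_entries=20):
    """Hypothetical stand-in for the C++ API call.

    Copies at most max_entries items starting at cursor and reports
    MORE_DATA when entries remain beyond what was copied.
    """
    chunk = buffer[cursor:cursor + max_entries]
    new_cursor = cursor + len(chunk)
    status = Status.MORE_DATA if new_cursor < len(buffer) else Status.SUCCESS
    return status, chunk, new_cursor


def drain(buffer, max_entries=20):
    """Collect every entry, treating MORE_DATA as a non-error status."""
    printed = []
    cursor = 0
    while True:
        status, chunk, cursor = fetch_entries(buffer, cursor, max_entries)
        if status not in (Status.SUCCESS, Status.MORE_DATA):
            raise RuntimeError(f"unexpected status: {status}")
        # Keep whatever the call returned, even if more data is pending.
        printed.extend(chunk)
        if status is Status.SUCCESS:
            break
    return printed


if __name__ == "__main__":
    entries = [f"cper-{i}" for i in range(45)]  # more than 20 entries pending
    for entry in drain(entries):
        print(entry)
```

In this sketch a reader that starts late, like terminal #3, still receives the first 20 entries on its first call and picks up the rest on later calls, rather than discarding the result because the status was not a plain success.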

---------

Signed-off-by: Oosman Saeed <oossaeed@amd.com>
2025-07-07 11:11:13 -05:00
Narlo, Joseph ce7d6dfe61 [SWDEV-532769] amd-smi APIs mismatch with documentation (#428)
* Populated socket_power to get power info
---------

Signed-off-by: josnarlo <Joseph.Narlo@amd.com>
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Co-authored-by: Maisam Arif <Maisam.Arif@amd.com>
2025-06-03 17:12:13 -05:00
Saeed, Oosman fab13c5b60 [SWDEV-530385] show afids on each line of printout (#422)
* show afids on each line of printout
* clean up afids and cper code
---------

Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Co-authored-by: Maisam Arif <Maisam.Arif@amd.com>
2025-06-02 17:22:10 -05:00
Maisam Arif c89b5db09d Deprecated PASID
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: Ib008f80f3d736172079358c0ceb3ebca87340d28
2025-05-30 20:48:29 -05:00
Maisam Arif cebb0799cb [SWDEV-488303] Fixed process list information source
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: Iec3416cb5ca1bdd806c3225b514bbf3dbf8c0d2e
2025-05-30 20:48:29 -05:00
Kanangot Balakrishnan, Bindhiya 2eff0b3764 [SWDEV-530633] Use gpu_metric speed and BW for xgmi (#366)
The xgmi command was showing the PCIe bit rate and bandwidth instead of the XGMI values. Corrected the API to get XGMI data from the GPU metrics.
Added a Python API for amdsmi_get_link_metrics. Modified the amdsmi_link_metrics struct.
Added a check for non-zero partitions to the xgmi command.

---------

Signed-off-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
2025-05-30 16:51:11 -05:00
Arif, Maisam 42441c78ea [SWDEV-488303] Adjusted process vram_mem data source (#411)
* [SWDEV-488303] Adjusted process vram_mem data source
* Standardized sscanf format strings

---------

Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Co-authored-by: gabrpham_amdeng <Gabriel.Pham@amd.com>
2025-05-29 23:26:12 -05:00
Arif, Maisam 0fdaebdbaa [SWDEV-488303] Updated CU occupancy for per-process retrieval (#243)
Change-Id: I2990597c6dd4b2e8cf3e11ce60f72049ebdd9a8c
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
2025-05-29 20:35:27 -05:00
Maisam Arif fba62e2270 [SWDEV-534707] Adjust power value documentation
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: I1c4516e403715b9a1fe9c78fae94848c89daa920
2025-05-29 18:55:44 -05:00
Liu, Shuzhou (Bill) 970560fc7c [SWDEV-520665] Add support for board voltage (#303)
* Add the API and CLI to show the board voltage. 

---------

Change-Id: Icb25bd653bb1d004704b5a21b378ca31b2b242c7
Signed-off-by: Arif, Maisam <Maisam.Arif@amd.com>
Signed-off-by: AL Musaffar, Yazen <Yazen.ALMusaffar@amd.com>
2025-05-29 18:55:08 -05:00
Kanangot Balakrishnan, Bindhiya 8e486c832b [SWDEV-463406] Update python doc for amdsmi_get_violation_status (#406)
* Updated the amdsmi_get_violation_status python API doc with newly added fields.
---------

Signed-off-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>
2025-05-29 14:59:16 -05:00
Pryor, Adam d0a89393df Remove ring hang (#391)
Change-Id: I856cd0949d3661911ab9302148aa1bc6e72abeed

Signed-off-by: adapryor <Adam.pryor@amd.com>
2025-05-29 11:58:46 -05:00
Narlo, Joseph 9862db63dd [SWDEV-532129] Update amdsmi asic info (#369)
* Added `subsystem_id` to `amdsmi_get_gpu_asic_info`
---------
Signed-off-by: Narlo, Joseph <Joseph.Narlo@amd.com>
2025-05-28 18:26:58 -05:00
Pham, Gabriel c40d4291f6 Updated docs with new KFD events (#382)
* Updated docs with new KFD events

---------

Signed-off-by: Pham, Gabriel <Gabriel.Pham@amd.com>
2025-05-27 12:21:38 -05:00
Mewar, Deepak b999f86611 [SWDEV-512393] Added amdsmi_get_cpu_affinity_with_scope (#198)
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Signed-off-by: Deepak Mewar <deepak.mewar@amd.com>
2025-05-20 01:06:09 -05:00
Saeed, Oosman 1bb1f8acc2 [SWDEV-522623] Add afid functionality to API and CLI (#330)
Change-Id: I015bde926491d54e09da8f39b05650515711e09f

Signed-off-by: Oosman Saeed <oossaeed@amd.com>
Co-authored-by: Oosman Saeed <oossaeed@amd.com>
2025-05-16 10:49:56 +08:00
Arif, Maisam 249537b2ff CPER Doc update (#352)
Change-Id: I59053eda863fc2b7349a3071a02e4557a8abe8c7

Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
2025-05-08 12:20:00 -05:00
Kanangot Balakrishnan, Bindhiya 9d7964dff5 [SWDEV-516592] Add python interface API for Bad Page Threshold (#141)
- Added Python interface APIs for amdsmi_get_gpu_bad_page_threshold()
- Updated the docs and changelog.

---------

Signed-off-by: Kanangot Balakrishnan, Bindhiya <Bindhiya.KanangotBalakrishnan@amd.com>
Signed-off-by: Arif, Maisam <Maisam.Arif@amd.com>
2025-04-14 04:19:45 -05:00
Arif, Maisam d81871ef16 [SWDEV-511234] Added amdsmi_get_gpu_cper_entries & CLI implementation
Added amdsmi_get_gpu_cper_entries() in the python and C APIs

Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Signed-off-by: Oliveira, Daniel <daniel.oliveira@amd.com>
Co-authored-by: Saeed, Oosman <Oosman.Saeed@amd.com>
Co-authored-by: AL Musaffar, Yazen <Yazen.ALMusaffar@amd.com>
2025-04-12 01:54:57 -05:00
Arif, Maisam 35fbe2cbf1 [SWDEV-521408] Fixed call to amdsmi_get_gpu_virtualization_mode (#230)
Change-Id: I29c86f8982b53cc139004ebc06b26a5d8f430091

Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
2025-04-01 16:57:23 -05:00
Yuan, Perry 68e44c7f66 [SWDEV-482949] Add CPU model name querying support (#33)
- Add support for checking CPU vendor info, which will be called by RDC to
discover CPU information
- Move esmi headers declaration to impl/amd_smi_common.h
- Remove duplicated amdsmi_cpu_util_t

---------

Signed-off-by: Perry Yuan <perry.yuan@amd.com>
Signed-off-by: Deepak Mewar <deepak.mewar@amd.com>
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Co-authored-by: Deepak Mewar <deepak.mewar@amd.com>
2025-03-28 21:21:39 -05:00
Arif, Maisam 0e67568902 [SWDEV-501958] Doc Update deprecating pasid in 7.0 (#166)
Change-Id: Ie19ba271c901d0be324143474871241272166124

Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: I024f7e2b5e7a5fcd6e1d12181d21ffacfe29c00f
2025-03-07 14:56:46 -06:00
Park, Peter 15c32f6116 [SWDEV-510820] Add missing goamdsmi documentation (#147)
* add API doc comments to goamdsmi.go
* update README and usage
* add sphinx directive to parse go doc
* fix walrus operator typos
* make docs more consistent
* add Go docs to index.md

---------

Signed-off-by: Arif, Maisam <Maisam.Arif@amd.com>
2025-03-07 12:37:54 -06:00
AL Musaffar, Yazen 2936e00fed [SWDEV-453922] AMD SMI to provide mapping feature of other enumeration methods (#51)
Added enumeration mapping for 
- drm render
- drm card
- hsa id 
- hip id
- hip uuid (rocminfo uuid)

Signed-off-by: AL Musaffar, Yazen <Yazen.ALMusaffar@amd.com>
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Co-authored-by: Maisam Arif <Maisam.Arif@amd.com>
2025-03-07 09:09:12 -06:00
Pham, Gabriel d5b2763aba [SWDEV-515730] Updated set partition documentation (#151)
Signed-off-by: gabrpham <Gabriel.Pham@amd.com>
2025-03-06 23:16:32 -06:00
Park, Peter 0b4a6ff149 [SWDEV-513210] Add references to AMDGPU RAS Support info in API docs (#144)
Add reference to AMDGPU RAS Support info in API docs
2025-03-04 09:32:23 -06:00
Arif, Maisam 52b3ee2dc6 [SWDEV-503520] Add amdsmi_get_rocm_version() in python library (#76)
Changed amdsmi_get_rocm_version() to be an API in the Python library only.
Updated usage and version detection.
Updated path detection of librocm-core.so.
Updated docs to reflect that both amdsmi_get_rocm_version() and amdsmi_get_lib_version() do not require initialization.

Signed-off-by: Arif, Maisam <Maisam.Arif@amd.com>
2025-02-26 05:45:58 -06:00
Narlo, Joseph dc4a16da6f [SWDEV-513651] Sync Unified And Linux Header (#98)
Signed-off-by: Joseph Narlo <joseph.narlo@amd.com>
2025-02-06 22:25:50 -06:00
Pham, Gabriel 09379f8438 Changed default behavior of amdsmi_get_gpu_virtualization_mode (#97)
Changed return behavior of amdsmi_get_gpu_virtualization_mode

Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Signed-off-by: gabrpham <Gabriel.Pham@amd.com>
Co-authored-by: Maisam Arif <Maisam.Arif@amd.com>
2025-02-05 19:09:44 -06:00
Pham, Gabriel e663bed7d6 [SWDEV-462952] Updated passthrough to use virtualization mode struct
Signed-off-by: gabrpham <Gabriel.Pham@amd.com>
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Co-authored-by: Maisam Arif <Maisam.Arif@amd.com>
2025-01-31 17:34:01 -06:00
Scaffidi, Salvatore 9fbdaa66ab [SWDEV-463406] Updating Violations Documentation
Signed-off-by: Greg Scaffidi <salvatore.scaffidi@amd.com>
2025-01-30 02:45:13 -06:00
Ramalingam, Muthusamy ced110dbb6 amdsmi: Adding Support to get hsmp Driver version
* amdsmi: Adding Support to get hsmp Driver version

Added support for fetching the hsmp driver version from the ESmi interfaces.
Added support for fetching memory bandwidth per socket.

Signed-off-by: muthusamy <muthusamy.ramalingam@amd.com>
2025-01-29 13:45:02 -06:00
Maisam Arif 803b18fe95 Dropped count from amdsmi_get_link_topology_nearest() python API
The count field was neither pythonic nor needed.

Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: I212f43dc11f2f2c7eddd39900e6e3aaec03f3f8f
2025-01-22 19:07:01 -06:00
Park, Peter d9bba639df [SWDEV-503717] Remove occurrences of "Fusion" in docs
Tiny PR to remove occurrences of "Kernel **Fusion** Driver" in
public-facing docs.

Signed-off-by: Peter Park <peter.park@amd.com>
2025-01-07 16:11:46 -06:00
Maisam Arif 8ca2c6e247 Deprecated amdsmi_get_energy_count() power field
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: I1b5fe8e278b797458e57dff689e692347901bbfd
2025-01-07 12:45:55 -06:00
Peter Park cbfe403b1d remove duplicated changelog
black format docs/conf.py
add seealso to python api reference

Change-Id: I60fa754f0af662669282dc90eea4b7dc5c5030cc
Signed-off-by: Peter Park <peter.park@amd.com>
2024-11-13 11:46:47 -05:00
gabrpham 19cc4718c0 Documented and adjusted APIs for asic info, vram info, and P2P topology
Signed-off-by: gabrpham <Gabriel.Pham@amd.com>
Change-Id: I7ac9a868148e29c92299b21540e057f64cb4123e
2024-11-11 20:45:37 -05:00
gabrpham 4d26db84ca Documented and adjusted python apis for pm metrics and reg table info
* amdsmi_get_gpu_pm_metrics_info and amdsmi_get_gpu_reg_table_info
were added to python api documentation
* AmdSmiRegType added as enum
* amdsmi_get_gpu_reg_table_info reg_type changed to AmdSmiRegType

Signed-off-by: gabrpham <Gabriel.Pham@amd.com>
Change-Id: I57239ecf048e82226151db071e8d9299e9182647
2024-11-11 20:45:37 -05:00
Peter Park e196f98dba docs: Remove redundant/stale docs
bump rocm-docs-core to 1.8.2

rm unused files

rm stale docs

fix sphinx conf

reorg docs

SWDEV-482203 -- add note to usage guides

update readmes

Change-Id: I9e0111ac8fe2a691ac964b27436ba47747c27904
Signed-off-by: Peter Park <Peter.Park@amd.com>
2024-11-11 16:49:17 -04:00
Roopa Malavally af225a6deb Amdsmidocs reorg
Change-Id: I836fc341d2a3567f531ba753463e57cd4b9b6495
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
2024-05-15 04:26:41 -04:00