Peter Park
e196f98dba
docs: Remove redundant/stale docs
...
bump rocm-docs-core to 1.8.2
rm unused files
rm stale docs
fix sphinx conf
reorg docs
SWDEV-482203 -- add note to usage guides
update readmes
Change-Id: I9e0111ac8fe2a691ac964b27436ba47747c27904
Signed-off-by: Peter Park <Peter.Park@amd.com>
2024-11-11 16:49:17 -04:00
gabrpham
f5b7761ac7
[SWDEV-490187] reset gpu partition support was removed
...
The reset gpu partition support for both compute and memory was removed
Code changes related to the following:
* amdsmi_reset_gpu_compute_partition()
* amdsmi_reset_gpu_memory_partition()
* CLI
Change-Id: I372589074b4da172bedd39223edde18939e373ae
Signed-off-by: Oliveira, Daniel <daniel.oliveira@amd.com>
2024-10-18 16:22:26 -05:00
Lang Yu
7a557b1c50
SWDEV-463405: Add amdsmi_get_link_topology_nearest support
...
amdsmi_get_link_topology_nearest() is used to retrieve
the set of GPUs that are nearest to a given device
at a specific interconnectivity level.
Code changes related to the following:
* API
* CLI
* Unit tests
* Examples
Header Unification Change: "/amdsmi/+/1122408"
Change-Id: Id0317797c652c267742513936d321677793ec634
Signed-off-by: Lang Yu <lang.yu@amd.com>
2024-09-26 16:43:27 -05:00
gabrpham
c9a489d437
Moved partition_id from static --asic-info to static --partition.
...
partition_id was also removed from the `amdsmi_asic_info_t` struct, and a
supporting API has been added for querying partition information.
Signed-off-by: gabrpham <Gabriel.Pham@amd.com>
Change-Id: Id5a6291a77d11bb97a1c7a200fc465898e86e081
2024-09-20 03:48:42 -04:00
Maisam Arif
3b7f661e71
Moved KFD information to separate structure and API
...
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: If6eaea589edc704cf408d6391b5f2154134035e7
2024-09-20 03:48:42 -04:00
Maisam Arif
639daa3d90
Fixed amdsmi_get_utilization_count() wrapper generation
...
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: Ifd59fca042c4b3b0fc53e100b6892c6b4f7b3e95
2024-09-17 16:34:42 -04:00
Maisam Arif
787d4462fa
[SWDEV-482412] Optimized PCIe Bandwidth gpu_metrics calls
...
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: Ib37d232b94a080e9b490dd065628d2567aaf4642
2024-09-11 23:26:30 -05:00
gabrpham
7d8e54d0e1
[SWDEV-450553] Added gpu memory overdrive to metric function
...
Signed-off-by: gabrpham <Gabriel.Pham@amd.com>
Change-Id: If7bd6865d641a5a83c594a4d3c57938b1b6dc18e
2024-09-04 12:54:14 -04:00
gabrpham
95ca2b83a1
Changed power parameter in amdsmi_get_energy_count() to energy_accumulator
...
Issue linked here: https://github.com/ROCm/amdsmi/issues/38
Signed-off-by: gabrpham <Gabriel.Pham@amd.com>
Change-Id: I622236eb3f0144aefeb6c82d2713b4822bfeeb11
2024-09-04 09:38:08 -04:00
Maisam Arif
413c9ef6fe
SWDEV-466302 - Changed blank processes to N/A & Updated Docs
...
Change-Id: I2d68430dda8036879f58b0f1dea5d2825b441179
2024-06-24 00:38:17 -04:00
Maisam Arif
92f014059e
SWDEV-435197 - Add process table to CLI monitor subcommand
...
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: Ibe06f4a4be619ae9cba909c2474b0e482eeb87d5
2024-06-19 23:36:55 -05:00
Bill(Shuzhou) Liu
e3c63628e5
Change the clean shader API to clean local data
...
To align with the unified API.
Change-Id: I2819339fba6f528204cebd3e9605109e82cbc5b4
2024-06-17 16:23:33 -05:00
Maisam Arif
f9bfb746fb
Update Python API README example code
...
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: I5dbb2e3cdab31b41e6f502d3257fe899eed1ee97
2024-06-07 16:20:00 -04:00
Maisam Arif
37c044696d
Removed Throttle Status from CLI Tool
...
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: I8eb8f30f821589003201d6d8bb96592ec5f8a483
2024-06-07 15:19:48 -05:00
Bill(Shuzhou) Liu
4cf59c4edb
Change the name of clear sram to run cleaner shader
...
The function cleans the local data in LDS/GPRs. The name "clear sram"
is misleading.
Change-Id: I0385e6d6348602fe0f347d17e48ed8983f7ceb87
2024-06-05 12:07:39 -05:00
Maisam Arif
e5d1ba4621
Use different sysfs for soc_pstate and xgmi_plpd
...
The sysfs is changed to use the pm_policy folder with multiple
dpm_policy files.
Change-Id: I40fac8de2d0cb127950d238b8196f6d2416778d0
2024-05-31 01:38:41 -04:00
Dalibor Stanisavljevic
7b2463abe0
SWDEV-457337 - Fix header alignment
...
Change-Id: I9f25f6c4f0d00c76b66d13162f30be11368f5b59
Signed-off-by: Dalibor Stanisavljevic <Dalibor.Stanisavljevic@amd.com>
2024-05-23 04:41:57 -04:00
Maisam Arif
7d999aa34c
SWDEV-458102 - Updates to pp_od_clk_voltage parsing
...
Signed-off-by: Maisam Arif <maisarif@amd.com>
Change-Id: I650dae1a99856dcde914fe66917cf9111f3ce0e2
2024-05-15 03:18:24 -05:00
Maisam Arif
52843152a5
SWDEV-444567 - Added Ring Hang Event
...
Signed-off-by: Maisam Arif <maisarif@amd.com>
Change-Id: I2e73ba08ee0004f6f30660b2fa425ea94bafceca
2024-05-03 17:21:28 -04:00
Maisam Arif
11c72946eb
Revert "SWDEV-458102 - Deprecated Voltage Curve API"
...
This reverts commit 1423fb632e.
Change-Id: I8a3eaf0a9f28200e09fb35d5260fbc070fe8a4a9
2024-05-02 15:27:16 -05:00
Charis Poag
c24d66740e
SWDEV-450580 - Fix powercap set
...
Updates:
* CLI - Added AMDSMIHelpers.convert_SI_unit() to help
conversion of units
* API - Reverted to uW for power cap limits
* CLI - amd-smi static --limit now includes MIN_POWER
* Tests now are all using uW units to keep W conversion
to only happen in CLI
* Python API now reflects same units as uW (what is seen
in amdgpu driver)
* CLI - amd-smi metric --power:
Fixed power seen on gpu_metrics v1.3
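The unit split described above (the API and tests stay in uW, and conversion to W happens only in the CLI) can be sketched as follows. This is a minimal illustration, not the actual `AMDSMIHelpers.convert_SI_unit()` implementation; the function names and the `MICROWATTS_PER_WATT` constant are hypothetical.

```python
# Hypothetical helpers illustrating "uW in the API, W only at display time".
# These are NOT the real AMDSMIHelpers.convert_SI_unit() code.

MICROWATTS_PER_WATT = 1_000_000


def microwatts_to_watts(value_uw: int) -> float:
    """Convert a power value from microwatts (API units) to watts."""
    return value_uw / MICROWATTS_PER_WATT


def watts_to_microwatts(value_w: float) -> int:
    """Convert a user-supplied value in watts back to microwatts."""
    return round(value_w * MICROWATTS_PER_WATT)


def format_power_cap(value_uw: int) -> str:
    """Render a microwatt power cap as a watt string for CLI-style output."""
    return f"{microwatts_to_watts(value_uw):g} W"
```

Keeping a single conversion point at the display layer means API callers and tests compare raw uW values and never round-trip through floating-point watts.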
Change-Id: I32d9ba78d0d8806772f0860f9a803a885b3f316a
Signed-off-by: Charis Poag <Charis.Poag@amd.com>
2024-05-02 10:13:39 -05:00
Maisam Arif
1423fb632e
SWDEV-458102 - Deprecated Voltage Curve API
...
Signed-off-by: Maisam Arif <maisarif@amd.com>
Change-Id: I111c3ce26d2ab66d5e755432f4b8a9bfa631f805
2024-05-02 02:53:29 -04:00
Maisam Arif
962e217d08
Updated README example output
...
Signed-off-by: Maisam Arif <maisarif@amd.com>
Change-Id: I45e7ecea022a028501f381fea4291bf78cc4494b
2024-04-30 19:13:46 -05:00
Maisam Arif
e6054be6e7
SWDEV-453493 - Fix Null pointer reference in amd-smi bad-pages
...
Signed-off-by: Maisam Arif <maisarif@amd.com>
Change-Id: I10a1278b68cbb464dd0fb38a2de50413f6f43959
2024-04-26 04:04:43 -05:00
Bill(Shuzhou) Liu
7d2ab7970d
Process isolation and clean shader
...
A few APIs and command line options are added to support process
isolation and clean shader.
Change-Id: I98ad3fc9fc7429799a21798b7fca1c307de7f403
2024-04-24 13:22:20 -04:00
Maisam Arif
0d6626db0d
Removed print in python interface
...
Signed-off-by: Maisam Arif <maisarif@amd.com>
Change-Id: I80e8cf18dc7631c66d4863251438327b8853cead
2024-04-23 04:49:47 -05:00
Maisam Arif
c551c3caed
SWDEV-455131 - Updated process APIs
...
- Removed amdsmi_get_gpu_process_info from python API
- Updated documentation
- Aligned process --json output format to unit & value format
Signed-off-by: Maisam Arif <maisarif@amd.com>
Change-Id: I82bba1b6df71020b4a5995ff63b9aa62611ce4fe
2024-04-18 14:00:59 -05:00
Maisam Arif
50450a2a69
Added amdsmi_get_gpu_process_info python library documentation
...
Signed-off-by: Maisam Arif <maisarif@amd.com>
Change-Id: I2218bf664a8a155e6b3085378db0fb20f3be3f70
2024-04-05 02:30:13 -05:00
Oliveira, Daniel
08e2e21bab
fix: [SWDEV-442525] [rocm/amd_smi_lib]
...
Fixes gpu_process_list
Code changes related to the following:
* amdsmi_get_gpu_process_list()
* CLI
* Examples
* Unit tests
* Changelog
* Readme
* rocm_smi_lib commit: 677433b367
Change-Id: I9210fbca7a5da92d0a8b472b72ca82597c8e4fb5
Signed-off-by: Oliveira, Daniel <daniel.oliveira@amd.com>
2024-03-27 16:48:24 -05:00
Maisam Arif
51b3f8cccb
SWDEV-452739 - Add CEM slot type to amd-smi
...
Updated CHANGELOG.md and re-added spaces after bolded lines
Signed-off-by: Maisam Arif <maisarif@amd.com>
Change-Id: Ic728b3e9b083c62fe4c9791b8ede991f5dacc1ca
2024-03-27 02:01:25 -04:00
Maisam Arif
e2e4349bd2
SWDEV-445664 - Aligned metric --ecc & --ecc-blocks with Host
...
Signed-off-by: Maisam Arif <maisarif@amd.com>
Change-Id: I93cf2bdab8c4c066bacf0e910e5620d37b362b07
2024-03-26 16:30:31 -04:00
Maisam Arif
93b81e5012
SWDEV-445664 - Aligned metric --clock with Host
...
Change-Id: Ib4dc372aed61f6301680ac746eccf448e9d0ed00
Signed-off-by: Maisam Arif <maisarif@amd.com>
2024-03-26 16:30:31 -04:00
Maisam Arif
8bf2bd4b89
SWDEV-447333 - Corrected amdsmi_init() python documentation
...
Signed-off-by: Maisam Arif <maisarif@amd.com>
Change-Id: If46e7236316687cd97cf1a69770f87154e2681ff
2024-03-26 16:30:22 -04:00
Bill(Shuzhou) Liu
e4085c6414
Get and set the XGMI PLPD
...
Update the API and CLI to support XGMI Per-Link Power Down Policy.
Change-Id: Iaf04a771eb8bb0829a5b3088d803a7355a8dfd0b
2024-03-26 01:48:14 -05:00
Oliveira, Daniel
1310c767ce
fix: [SWDEV-448201] [rocm/amd_smi_lib]
...
Adds PCIe errors
Code changes related to the following:
* amdsmi_get_pcie_info()
* CLI
* examples
Change-Id: Ie0b7053e77c88fb18309c16e74bce75d862c45a9
Signed-off-by: Oliveira, Daniel <daniel.oliveira@amd.com>
2024-03-24 23:33:32 -04:00
Maisam Arif
57a43babad
Removed old Python API function documentation
...
Signed-off-by: Maisam Arif <maisarif@amd.com>
Change-Id: Ib145fae98f1e99ab474b86ec4f6ddc2c8c44126e
2024-02-26 14:10:49 -06:00
Maisam Arif
a719ae9707
SWDEV-445396 - Aligned Static Command with Host
...
Signed-off-by: Maisam Arif <maisarif@amd.com>
Change-Id: I4182b9104e173f54830fc44819a61d74d31d65d7
2024-02-22 03:35:00 -05:00
Bill(Shuzhou) Liu
db33cda0c1
Unify the amdsmi_get_pcie_info python interface
...
Make the python interface consistent with the C interface.
Change-Id: Idda08f888947c757e475d5a024b0ec3d8e1d846a
2024-02-22 03:33:59 -05:00
Maisam Arif
f58613561c
Refactor ESMI Initialization and Argument Parsing
...
Signed-off-by: Maisam Arif <maisarif@amd.com>
Change-Id: Iefab3a8110e0d3c525ee0cef1bdef9101550e9de
2024-02-21 19:02:14 -05:00
Maisam Arif
703fdb0ed2
Aligned cache property enum with Host
...
Signed-off-by: Maisam Arif <maisarif@amd.com>
Change-Id: Ie64a33f55c9a9a7cc8c806419509897351f37c70
2024-02-20 05:48:53 -06:00
Maisam Arif
77710921a4
Align list and cache_info to Host
...
Signed-off-by: Maisam Arif <maisarif@amd.com>
Change-Id: I4fa55b360b74d5a202d0b9b4eb7aee660b0a1bcf
2024-02-15 01:47:59 -05:00
Maisam Arif
f831cf49f7
Renamed amdsmi_get_metrics_table to amdsmi_get_cpu_metrics_table
...
Renamed structs to be more consistent with what they are calling
Signed-off-by: Maisam Arif <maisarif@amd.com>
Change-Id: I6f2be2fcb76f004aa592f0dad8545565700ccd4b
2024-02-12 16:30:18 -06:00
Deepak Mewar
6f7273fda5
Added amdsmi cpu family & cpu model
...
- Updated header and source files
- Updated python interface
- Generated python wrapper for updated header
- Updated the CLI to have cpu family & cpu model
as part of metric table
Change-Id: Iea440251797270d5d29ffe883b0ad6db790be658
2024-02-06 18:46:27 -05:00
Maisam Arif
88192d8b6b
SWDEV-436533 - Cache Info Struct Update
...
Signed-off-by: Maisam Arif <maisarif@amd.com>
Change-Id: Ic640fa657cdcc32d7b00ff78fc9452ec7e05dd07
2024-02-05 16:51:04 -05:00
Deepak Mewar
3aabb927b4
amdsmi README updated for python interface
...
Change-Id: I92c1e8eb646488a9cdc32d0933f27e5db8c172ef
2024-01-25 02:19:38 -05:00
Maisam Arif
0550c9352c
Updated engine_activity api
...
Signed-off-by: Maisam Arif <maisarif@amd.com>
Change-Id: I3f62e093fdc0254015c0837dca59763551d3659c
2024-01-24 22:23:48 -05:00
Charis Poag
34bd26c68e
Fix metric type error output + re-align with ROCm SMI metrics
...
Changes:
* [CLI] Provide fix for "/opt/rocm/bin/amd-smi metric
TypeError: '>' not supported between instances of 'str' and 'i"
--> Python API was updated, CLI needed to reflect these changes
* [API] Updated amdsmi.h's with ROCm SMI
--> Incorrectly added mem_bandwidth_acc & mem_max_bandwidth
--> Realigned wrapper with updates
* [Test] Added metrics not shown in gpu_metrics_read.cc
Change-Id: Ia3a172377fd5a582254dd5a46d81dbec7e763cd9
Signed-off-by: Charis Poag <Charis.Poag@amd.com>
2024-01-24 21:23:40 -06:00
Bill(Shuzhou) Liu
0b67c2ccc4
Unified API
...
amdsmi_get_link_metrics() and amdsmi_get_pcie_info()
Change-Id: Iea060e449813b842236243b772e8809497ce98fe
2024-01-24 18:27:20 -05:00
Charis Poag
fe86afed8c
SWDEV-436533 [CLI/Python API] Align Cache Info BM UI to Host
...
- [CLI] Refactored cache info to display
cache flags as "cache_properties" names.
Names are displayed as a list of comma-separated
cache type strings. Previously, values
were shown one by one as ENABLED.
ex.
CACHE_PROPERTIES = <a,b,c>
- [JSON] mirrors CLI fields.
No longer display "cache_flags", renamed
field as "cache_properties" dictionary. This
allows users to better understand the
list of names provided.
- [Python API] Updated amdsmi_get_gpu_cache_info
to mirror Host return.
README.md - updated to reflect all changes.
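The change from per-flag ENABLED values to a single comma-separated list of names can be sketched roughly as below. This is an illustration only: the flag names (`DATA_CACHE`, `INST_CACHE`, `SIMD_CACHE`) and both helper functions are placeholders, not the identifiers used by the library or the CLI.

```python
from typing import Dict, List


def cache_properties(flags: Dict[str, bool]) -> List[str]:
    """Return the names of the cache flags that are enabled, in input order."""
    return [name for name, enabled in flags.items() if enabled]


def format_cache_properties(flags: Dict[str, bool]) -> str:
    """Render enabled flags as one comma-separated CACHE_PROPERTIES line."""
    return "CACHE_PROPERTIES = " + ", ".join(cache_properties(flags))


# Placeholder flag names, for illustration only.
example_flags = {"DATA_CACHE": True, "INST_CACHE": False, "SIMD_CACHE": True}
print(format_cache_properties(example_flags))
```

Listing only the enabled names keeps the human-readable and JSON outputs aligned, since both can be built from the same list.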
Change-Id: Ife2ef5adcef30058937d1376efb01749e45c02fb
Signed-off-by: Charis Poag <Charis.Poag@amd.com>
2024-01-24 06:21:55 -05:00
Charis Poag
4575990ae7
GPU Usage/activity update
...
CLI:
Every usage field is now denoted by "activity"
gfx_usage -> gfx_activity
umc_usage -> umc_activity
vcn_activities -> vcn_activity
jpeg_activities[AID#] -> jpeg_activity
Wrapper: fixed metric output, misalignment
with generator
update_wrapper.sh:
DOCKER_BUILDKIT to 0 (if unset)
API:
amdsmi_get_gpu_metrics_info:
1.3: Removed commenting out avg socket power
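For scripts that parse the older CLI field names, the renames listed above could be applied with a small mapping like the one below. This is a sketch under assumptions: `rename_usage_fields` is a hypothetical helper, the input is assumed to be a flat dictionary of fields, and the `[AID#]` suffix on the jpeg entry is not handled.

```python
# Old CLI field name -> new CLI field name, taken from the list above.
USAGE_TO_ACTIVITY = {
    "gfx_usage": "gfx_activity",
    "umc_usage": "umc_activity",
    "vcn_activities": "vcn_activity",
    "jpeg_activities": "jpeg_activity",
}


def rename_usage_fields(record: dict) -> dict:
    """Return a copy of record with old usage keys renamed; other keys are kept."""
    return {USAGE_TO_ACTIVITY.get(key, key): value for key, value in record.items()}
```

Keys that are not in the mapping pass through unchanged, so the helper is safe to apply to output that already uses the new names.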
Signed-off-by: Charis Poag <Charis.Poag@amd.com>
Change-Id: Id3fcc20aef420c7b7a90ba22fa3bc643b2716333
2024-01-15 23:34:08 -06:00