Commit graph

1558 commits

Author SHA1 Message Date
Charis Poag 7756539bd1 Merge amd-dev into amd-master 20241112
Change-Id: I04105595f37bb2903a8c3901592b597084e7fc25
Signed-off-by: Charis Poag <Charis.Poag@amd.com>
2024-11-12 18:04:21 -06:00
Charis Poag 3ea4a42a6e [SWDEV-488276/SWDEV-497613] Update memory partition set functionality
Changes:
  - [CLI] Added a warning screen for AMD SMI users
    setting the memory partition
  - [CLI] Added a progress bar (time bar) to CLI sets, displayed for up to 40 seconds
  - [API] Updated to wait until the driver reloads with SYSFS files active
  - [CLI] Users can now set or reset without providing `-g all`. Instead of:
    amd-smi set -g all <set arguments>
    or amd-smi reset -g all <set arguments>
    they can directly call -> sudo amd-smi set <set arguments>
    or sudo amd-smi reset <set arguments>
  - [SWDEV-475712][CLI/API] Fixed target_graphics_version field
    not properly displaying for older MI or Navi ASICs.
  - [All APIs] Added a catch for invalid arguments reported by the driver;
    these APIs now return AMDSMI_STATUS_INVAL
    (e.g., changing to NPS8 if the device does not support it)
  - [Install] Modified paths for Python install commands to support
    multi-ROCm installs

Change-Id: Id11f25d68a82d23c6b2d77ccb30b51e860dd0ca7
Signed-off-by: Charis Poag <Charis.Poag@amd.com>
2024-11-12 16:50:32 -04:00
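
The CLI change described in the commit above (set/reset no longer needing `-g all`) can be illustrated with a minimal argparse sketch. This is not the actual amd-smi parser: the program name `amd-smi-sketch` and the `--option` flag are hypothetical placeholders for `<set arguments>`, and the only behavior shown is a GPU selector that defaults to `all` when omitted.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a toy parser whose set/reset subcommands default to all GPUs."""
    parser = argparse.ArgumentParser(prog="amd-smi-sketch")  # hypothetical name
    subcommands = parser.add_subparsers(dest="command", required=True)
    for name in ("set", "reset"):
        cmd = subcommands.add_parser(name)
        # Omitting -g behaves like "-g all"; passing -g narrows the selection.
        cmd.add_argument("-g", "--gpu", nargs="+", default=["all"])
        # Hypothetical stand-in for the real <set arguments>.
        cmd.add_argument("--option")
    return parser


parser = build_parser()
implicit = parser.parse_args(["set", "--option", "x"])
explicit = parser.parse_args(["reset", "-g", "0", "1", "--option", "x"])
print(implicit.gpu, explicit.gpu)
```

Using a default on the selector keeps one code path for both the explicit and the implicit form, which is one plausible way to get the behavior the commit message describes.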
Maisam Arif 9e221ebd93 Merge amd-dev into amd-master 20241111
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: Ica217a6e48f7deaf52d7021511ec8e8bf26c9a37
2024-11-11 19:46:42 -06:00
gabrpham 19cc4718c0 Documented and adjusted APIs for asic info, vram info, and P2P topology
Signed-off-by: gabrpham <Gabriel.Pham@amd.com>
Change-Id: I7ac9a868148e29c92299b21540e057f64cb4123e
2024-11-11 20:45:37 -05:00
gabrpham 4d26db84ca Documented and adjusted python apis for pm metrics and reg table info
* amdsmi_get_gpu_pm_metrics_info and amdsmi_get_gpu_reg_table_info
were added to python api documentation
* AmdSmiRegType added as enum
* amdsmi_get_gpu_reg_table_info reg_type changed to AmdSmiRegType

Signed-off-by: gabrpham <Gabriel.Pham@amd.com>
Change-Id: I57239ecf048e82226151db071e8d9299e9182647
2024-11-11 20:45:37 -05:00
gabrpham 2273d95a6c [SWDEV-492739] Partial fix for sclk min/max out of bounds
Signed-off-by: gabrpham <Gabriel.Pham@amd.com>
Change-Id: I1f0230955c890c11a735c8cb352c8a9ee4cebe27
2024-11-11 20:45:37 -05:00
Maisam Arif 4b511a31e1 Bump Version to 24.7.1.0
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: I0fc42fe55cb653102d189db9aa5eaf723280170e
2024-11-11 19:23:20 -06:00
gabrpham 0f067488e1 updated cli tool examples doc to reflect current CLI
Signed-off-by: gabrpham <Gabriel.Pham@amd.com>
Change-Id: Iab78a412464ba6d7919aeb7da04a031b063a7d09
2024-11-11 17:12:40 -05:00
Maisam Arif 7932de967a Updated parser help text
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: I8cc65edb1e629a55e0efbfc1109b1c549ed81101
2024-11-11 15:07:21 -06:00
Peter Park e196f98dba docs: Remove redundant/stale docs
bump rocm-docs-core to 1.8.2

rm unused files

rm stale docs

fix sphinx conf

reorg docs

SWDEV-482203 -- add note to usage guides

update readmes

Change-Id: I9e0111ac8fe2a691ac964b27436ba47747c27904
Signed-off-by: Peter Park <Peter.Park@amd.com>
2024-11-11 16:49:17 -04:00
Maisam Arif 6e843436f5 Updated amdsmi_get_energy_count() C API documentation
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: Iac75a0dcd583f39eb97aada769c736c3305cc8a2
2024-11-08 16:37:10 -05:00
Maisam Arif 5449d78cc4 Adjusted private helper variables
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: I0590b9ee5a1b4d5e6d4ae71c9587550c8d95033b
2024-11-08 11:25:50 -06:00
Maisam Arif abee26d4ab Added ras and ecc counting back to Linux VMs
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: Ie981f7fe8f481f2137e95dda2e200d00ab4d92c8
2024-11-08 11:05:15 -06:00
Peter Park 31821cb585 Mod changelog to fit internal standard
Change-Id: Id90136f16f15a30b2791ed0634a408a7eb73f96f
2024-11-08 11:57:14 -05:00
Zhang Ava e0b913b30d Merge amd-dev into amd-master 20241107
Signed-off-by: Zhang Ava <niandong.zhang@amd.com>
Change-Id: I06979911d0e335c142eea73f95dc3801611ac275
2024-11-08 10:57:45 +08:00
gabrpham 27996aef18 [SWDEV-495985] Changed ACCELERATOR_TYPE default value.
Default value changed from 0 to "N/A".
Actual values for all fields will be filled out in a later API update.

Signed-off-by: gabrpham <Gabriel.Pham@amd.com>
Change-Id: I69b08fff894a032ef79301754807ed4b5c85257f
2024-11-07 21:22:28 -05:00
gabrpham 4effd48fe2 [SWDEV-489060] Added python3-setuptools and wheel as prereqs in README.
Signed-off-by: gabrpham <Gabriel.Pham@amd.com>
Change-Id: I51cf938033d746bd6c255d518d7e0d3a87296be4
2024-11-07 14:42:04 -04:00
Charis Poag 7fc4b853d4 [SWDEV-495305] Fix AttributeError: 'Namespace' object has no attribute 'compute_partition'
Changes:
   - [CLI] Earlier we removed compute & memory partition resets;
     this fix restores the correct spacing for
     reset commands

Change-Id: I707ff197baf7a32bfb7ef20f2b26a63acd13f08a
Signed-off-by: Charis Poag <Charis.Poag@amd.com>
2024-11-05 18:49:08 -05:00
Maisam Arif 2678e1f3f7 [SWDEV-492031] Update Market Names
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: I13c2047fd8c7af0dc566f88a3cac8b365697a092
2024-11-05 17:52:02 -04:00
Jorge López 172a3e233b Updates driverInitialized() to support amdgpu built as module as well as kernel built-in. Fixes ROCm/rocm_smi_lib#102 and is an updated version of ROCm/rocm_smi_lib#104
Change-Id: Icb3abe820bc67035b822358a1c04bd09a7c22b6b
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
Reviewed-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
2024-11-05 16:30:34 -05:00
adapryor 02cbffb42a [SWDEV-412505] Handle mclk permission errors as not supported
Change-Id: Idb3eeed76ff55c507f28b5e692f8704704c3e46e
Signed-off-by: adapryor <Adam.pryor@amd.com>
2024-10-31 17:40:34 -04:00
Joe Narlo 54462ab447 SWDEV-495316 [AMDSMI] In amdsmi.h, change typedef amdsmi_accelerator_partition_profile_t to match definition in Confluence
Move memory_caps definition and correct the number in reserved to match Confluence

Signed-off-by: Joe Narlo <Joseph.Narlo@amd.com>
Change-Id: Id94144f4b3d2d3d7b4d7327211ffc1957ffd0a93
2024-10-31 12:48:48 -04:00
adapryor 6e01df00ca [SWDEV-446215] Update cmake to put test libs in proper lib dir
Change-Id: I2e91b904b3f869cdba717d872c10d799d0260c30
Signed-off-by: adapryor <Adam.pryor@amd.com>
2024-10-29 16:07:58 -04:00
Zhang Ava ec9d73640b Merge amd-dev into amd-master 20241024
Signed-off-by: Zhang Ava <niandong.zhang@amd.com>
Change-Id: Ifafeecb8429c64440e9612c10927ea94d2de82b8
2024-10-24 18:13:11 +08:00
Charis Poag 0ceca28f41 [SWDEV-463406] Update sample rate + align metric output
Changes:
- Corrected: the fastest rate at which users can sample from FW/driver
  is once every 100 ms
- Added a warning to the amdsmi_get_violation_status()
  call about the 100 ms delay required to sample
- Removed guest support, this API will not be supported
- Updated CLI `amd-smi metric --throttle` outputs from
    XXX_active -> XXX_status
    XXX_percent -> XXX_activity
  to align with host
- Changelog updated

Change-Id: Ib30dd35dcc04ff67904ca82c86a55a16689df226
Signed-off-by: Charis Poag <Charis.Poag@amd.com>
2024-10-23 17:36:35 -04:00
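
For scripts that consumed the old `amd-smi metric --throttle` field names, the renaming in the commit above can be sketched as a small key-mapping helper. Only the two suffix changes (`XXX_active` to `XXX_status`, `XXX_percent` to `XXX_activity`) come from the commit message; the sample keys such as `ppt_active` and the helper name are hypothetical.

```python
# Suffix mapping taken from the commit message.
SUFFIX_MAP = {"_active": "_status", "_percent": "_activity"}


def rename_throttle_keys(metrics: dict) -> dict:
    """Return a copy of `metrics` with old-style throttle suffixes renamed."""
    renamed = {}
    for key, value in metrics.items():
        for old, new in SUFFIX_MAP.items():
            if key.endswith(old):
                key = key[: -len(old)] + new
                break
        renamed[key] = value
    return renamed


# Hypothetical sample data; real field names may differ.
old_output = {"ppt_active": True, "ppt_percent": 12, "accumulated_counter": 7}
new_output = rename_throttle_keys(old_output)
print(new_output)
```

Keys without one of the two old suffixes pass through unchanged, so the helper is safe to apply to output that already uses the new names.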
Charis Poag 148b015cab Merge amd-dev into amd-master 20241022
Change-Id: I76f797f267679882d02106a4177359bd31e0b0d5
Signed-off-by: Charis Poag <Charis.Poag@amd.com>
2024-10-22 18:16:58 -05:00
gabrpham 00b3184e9f SWDEV-478748 Changed TestPciReadWrite Test Failure message to Warning
TEST FAILURE message for `amdsmi_get_gpu_pci_throughput` and
`amdsmi_get_gpu_pci_bandwidth` changed to WARNING to indicate that
pcie_bw and/or pp_dpm_pcie sysfs files may not be supported on respective
devices.

Signed-off-by: gabrpham <Gabriel.Pham@amd.com>
Change-Id: I1ad6e15eceacb5a00b022458ee5fb21df9d845c7
2024-10-18 16:32:57 -05:00
gabrpham f5b7761ac7 [SWDEV-490187] Reset GPU partition support was removed
Reset GPU partition support for both compute and memory was removed

Code changes related to the following:
  * amdsmi_reset_gpu_compute_partition()
  * amdsmi_reset_gpu_memory_partition()
  * CLI

Change-Id: I372589074b4da172bedd39223edde18939e373ae
Signed-off-by: Oliveira, Daniel <daniel.oliveira@amd.com>
2024-10-18 16:22:26 -05:00
Justin Williams 2e5b164c43 [SWDEV-482058 / SWDEV-482971] Added setup.py install
Signed-off-by: Justin Williams <Justin.Williams@amd.com>
Change-Id: Ibad07d34dfb455043ce307fe036289f1d5c20a9a
2024-10-18 16:59:13 -04:00
Zhang Ava 3ff80be742 Merge amd-dev into amd-master 20241018
Signed-off-by: Zhang Ava <niandong.zhang@amd.com>
Change-Id: I1442e59dc6e77b6e07ac870a62eab8aad0343a84
2024-10-18 18:24:27 +08:00
Oliveira, Daniel 25bcf6af2a [SWDEV-488526] BI-Direction Table mismatch
Implements DiscoverIOLinkPerNodeDirection() based on KFD Node infrastructure;
'/kfd/topology/nodes/*/io_links'

Code changes related to the following:
  * Internal implementation

Change-Id: Iccd84d1d69234dbeae4d4925f657e7e3bd801106
Signed-off-by: Oliveira, Daniel <daniel.oliveira@amd.com>
2024-10-17 15:27:09 -04:00
gabrpham 27b5a35d65 [SWDEV-488846] Removed '--ecc' option from 'amd-smi monitor' when platform is VM
Change-Id: I8f5d7771cbfac3fe5f52dbccbd9f28020adb5f6f
2024-10-16 10:34:19 -04:00
gabrpham eb9116e8c2 [SWDEV-486872] Removed '--ras' from static command when platform is VM
Signed-off-by: gabrpham <Gabriel.Pham@amd.com>
Change-Id: I0b03f168d7011428cfea3ab303865f4eaeea78ac
2024-10-16 09:29:24 -05:00
Maisam Arif 9a0d56fea8 [SWDEV-491466] Fix throttle metrics CLI on VM
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: I41166df4d155ec1d7d5f30b51dd9e0e02e655eb9
2024-10-16 09:14:25 -05:00
Joe Narlo b5887c2f05 SWDEV-487604 [AMD SMI][Unified Header] integration_test.py is failing with unified header
The script generator.py was not handling all of the anonymous and unnamed structures.
Logic was added to correct the errors seen in the script amdsmi_wrapper.py

Removed adding _ to structure definitions

Change-Id: I51958d23b3da40ec67e883e13dc74feaeaf1d58e
Signed-off-by: Joe Narlo <Joseph.Narlo@amd.com>
2024-10-15 16:22:02 -04:00
Khader Basha Shaik 8308ede9e8 amdsmi [CPU]: Add implementation to get cpu handles and core handles API
- Update the API names, parameters to return cpu handles and core
handles in the system.
  - Update the amdsmi_wrapper.py.
  - Update the amdsmi_interface.py to use the processor handles and
    core handles API.

Change-Id: Ie24f62f345864f8b6773fdb3c6369993bca7e25b
2024-10-14 05:41:19 -04:00
Jeremy Newton dd8795b099 goamdsmi: Use CMAKE_INSTALL_LIBDIR
Match libamd_smi and don't hardcode to "lib", so distros can customize
the library location.

Signed-off-by: Jeremy Newton <Jeremy.Newton@amd.com>
Change-Id: I0d2ff761975529fc06776c75cefea6907ec1ee8f
2024-10-10 15:12:35 -04:00
Maisam Arif 27a48e69d8 Corrected clean local data partition indexing
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: Ib0eeb065f160fccd3c3f4a2d13f0869af01a74ae
2024-10-10 10:54:45 -05:00
Zhang Ava 19bb018129 Merge amd-dev into amd-master 20241010
Signed-off-by: Zhang Ava <niandong.zhang@amd.com>
Change-Id: I70496500788e8890924d8d3169b296d9dde34a41
2024-10-10 18:23:14 +08:00
Maisam Arif 4fcf281f1d [SWDEV-447451] Fix attribute error for set/reset on Linux Guest
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: I5d55bef44d2eea75c33ba489a57544976900c4a4
2024-10-09 12:59:19 -05:00
Charis Poag 5eff39915b [SWDEV-463406] Add violation_status current counter/accumulated values
Changes:
  - amdsmi_violation_status_t now includes current accumulated/counter
   values
  - Tests/wrapper now include added values
  - Removed ASIC references in header for host/bm alignment
  - Fix violation_status->per_hbm_thrm /
    violation_status->active_hbm_thrm
    calculations.

Change-Id: Ic86a7cbad5198a41018f82f6b588b83158d9ba0b
Signed-off-by: Charis Poag <Charis.Poag@amd.com>
2024-10-04 15:56:01 -04:00
gabrpham a440e12fb8 Merging amd-dev into amd-master 20241004
Signed-off-by: gabrpham <Gabriel.Pham@amd.com>
Change-Id: I412f8377b75280988f813ce66079a2eed57c1e5b
2024-10-04 12:56:13 -05:00
Maisam Arif e402fe7f36 Updated market name
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: I71948b185b6ac60610fedf2d48dd9c95c26e5777
2024-10-02 14:24:03 -05:00
Maisam Arif 30f6a114e1 [SWDEV-488819] - Backward Compatibility Disclaimer
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: I8b00d2009e3d01da134ac21ddcb0994357d76a54
2024-10-01 14:57:23 -05:00
Maisam Arif b233db729b Corrected throttle status value check
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: I2d75108c64c3ca3e290be1dd5b8c1435c5576f91
2024-09-30 13:40:32 -05:00
Maisam Arif 3408477fbf Merge amd-dev into amd-master 20240927
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: I6559163305fa1967c3d9105f2d45df9063a02f74
2024-09-27 18:55:37 -05:00
Maisam Arif a266d602c5 Bump Version to 24.7.0.0
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: Ife9277f6abf64ed862e11e12a6472c6e6ea4d68f
2024-09-27 18:55:19 -05:00
Galantsev, Dmitrii 88ed9e2f09 CMAKE - Fix version
Change-Id: Ieefdd4c64ae657a53f1f5fd9a7fc94b3d2c899c2
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
2024-09-27 18:34:16 -05:00
gabrpham 4e2fc2d604 Added amd-smi partition as preliminary command.
The new command includes the following arguments:
  - current - display the current partition information for the selected
    gpu(s)
  - memory - display memory partition information for the selected
    gpu(s)
  - accelerator - display accelerator partition information for the
    selected gpu(s)
Additional functionality will be added as more partition APIs are added.

Signed-off-by: gabrpham <Gabriel.Pham@amd.com>
Change-Id: Ica86160139002ef5213d6d4b0e390670aeef01c8
2024-09-27 17:05:04 -05:00
Maisam Arif 2c8e2060cb Adjusted throttle unit logic in amdsmi_commands.py
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: Icce949ff93f45c9751f43df0a80614fd377318fa
2024-09-27 13:26:58 -05:00