1558 Commits

Author SHA1 Message Date
Maisam Arif 927b9c644b Moved --clear-sram-data to 'amd-smi reset'
Signed-off-by: Maisam Arif <maisarif@amd.com>
Change-Id: I46eaf7f887b15d6a8d8a31155bb3e448ef0ec04a
2024-05-30 02:26:40 -05:00
Maisam Arif 3855fb2939 Add Process Isolation and Clear SRAM to VM
Signed-off-by: Maisam Arif <maisarif@amd.com>
Change-Id: I7776e5b10efb2eea798e3e3d523ec5c01a162dc3
2024-05-23 15:33:27 -04:00
Maisam Arif 3cf50dff0b Header unification fixes
Signed-off-by: Maisam Arif <maisarif@amd.com>
Change-Id: I84bb9a8121927980e4306a9db47ae04d7d03d85f
2024-05-23 14:32:57 -05:00
Dalibor Stanisavljevic 7b2463abe0 SWDEV-457337 - Fix header alignment
Change-Id: I9f25f6c4f0d00c76b66d13162f30be11368f5b59
Signed-off-by: Dalibor Stanisavljevic <Dalibor.Stanisavljevic@amd.com>
2024-05-23 04:41:57 -04:00
Maisam Arif d0cafb7d93 Merge amd-dev into amd-master 20240521
Signed-off-by: Maisam Arif <maisarif@amd.com>
Change-Id: I995cbe11e4ca8f0916e6f66de176e013b11b20c5
2024-05-21 01:20:57 -05:00
Maisam Arif 1cee1baac2 Make product name empty when unable to find pciid
Signed-off-by: Maisam Arif <maisarif@amd.com>
Change-Id: If2300bf2deb4fa099db695949bd4c74393dbbbfc
2024-05-21 02:19:19 -04:00
Charis Poag c5da93ab90 SWDEV-462728 Add update-pciids to install + remove subsystem name
Install now runs update-pciids if there is a network connection.
Removed the subsystem name from the output under model. Added a TODO
to add it back later on.

Change-Id: I028269f2931f61e094116a85a7a1286de548122a
Signed-off-by: Charis Poag <Charis.Poag@amd.com>
2024-05-20 12:03:31 -05:00
Galantsev, Dmitrii a0bfa0e44e Azure - Add rocm-ci.yml
Change-Id: I1086884fe70081822a79d0e7a814ceb81813d62c
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
2024-05-15 12:53:43 -05:00
Maisam Arif b0fc5d2004 Merge amd-dev into amd-master 20240515
Signed-off-by: Maisam Arif <maisarif@amd.com>
Change-Id: I64bba263badc1b1dfa8fab5baf90a5f2ad8ea11d
2024-05-15 03:29:23 -05:00
Maisam Arif 721e3ed3ea Bump Version to 24.5.3.0
Signed-off-by: Maisam Arif <maisarif@amd.com>
Change-Id: I0d1ecddd650320287446a06cd8ce680c52a89342
2024-05-15 04:28:27 -04:00
Roopa Malavally af225a6deb Amdsmidocs reorg
Change-Id: I836fc341d2a3567f531ba753463e57cd4b9b6495
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
2024-05-15 04:26:41 -04:00
Maisam Arif 7d999aa34c SWDEV-458102 - Updates to pp_od_clk_voltage parsing
Signed-off-by: Maisam Arif <maisarif@amd.com>
Change-Id: I650dae1a99856dcde914fe66917cf9111f3ce0e2
2024-05-15 03:18:24 -05:00
Maisam Arif 8f8d88416f Added #defines from amdsmi.h to python interface
Signed-off-by: Maisam Arif <maisarif@amd.com>
Change-Id: Ic1a17d20f9f1f76e55813db8e2fe287279cb231e
2024-05-15 00:54:25 -05:00
Charis Poag 4295bba37f [SWDEV-451104] Update static --board + amdsmi_get_gpu_board_info()
Updates:
    * Expanded `amdsmi_get_gpu_board_info()` amdsmi_board_info_t structure size
      Updated sizes that work for retrieving relevant board
      information across AMD's ASIC products.
    * Fixed `amdsmi_get_gpu_board_info()` to no longer return junk char strings

Change-Id: Ie1553c6109d678d283d82c24e9284f8e19cd6ccc
Signed-off-by: Charis Poag <Charis.Poag@amd.com>
2024-05-13 23:05:32 -05:00
Bill(Shuzhou) Liu 437cb07db6 Discover the amdgpu when card numbers are not consecutive.
When discovering amdgpu devices, if the assigned card numbers are not
consecutive, not all GPUs are discovered. The code is changed to
discover GPUs based on the max card number.

Change-Id: Icf4c1df4a1651093b5de3cd7a25a9bd69a299075
2024-05-13 09:53:09 -04:00
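
The discovery fix described in 437cb07db6 can be illustrated with a minimal Python sketch. It is not the library's implementation (which this log does not show); the function names and the `cardN` entry-name pattern are assumptions made for illustration. The first function stops at the first missing index, while the second probes every index up to the highest card number present.

```python
import re

# Assumption: card entries are named "card<N>" (connector entries such as
# "card1-DP-1" and render nodes such as "renderD128" are ignored).
CARD_RE = re.compile(r"^card(\d+)$")


def discover_cards_naive(entries):
    """Hypothetical 'before' behaviour: stop at the first missing index."""
    present = set(entries)
    found = []
    index = 0
    while f"card{index}" in present:
        found.append(index)
        index += 1
    return found


def discover_cards(entries):
    """Hypothetical 'after' behaviour: scan up to the max card number."""
    indices = {int(m.group(1)) for name in entries if (m := CARD_RE.match(name))}
    if not indices:
        return []
    # Probe every index up to the maximum so gaps do not end the search.
    return [i for i in range(max(indices) + 1) if i in indices]


entries = ["card0", "card2", "renderD128", "card5", "card2-DP-1"]
print(discover_cards_naive(entries))
print(discover_cards(entries))
```

With the gap at `card1`, the naive loop reports only card 0, whereas the max-based scan also finds cards 2 and 5.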
marifamd ca520ac761 Update CHANGELOG.md
Change-Id: I6fe5e6c588f7823c271cb098bd932dabc620e8dd
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
2024-05-09 13:11:11 -05:00
Peter Jun Park debe52f284 Prettify changelog for upcoming release
- Makes formatting more consistent with other ROCm components.

Change-Id: I16d65c14a0c52ddddd9cfa66365c2fde59bc4354
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
2024-05-09 13:11:07 -05:00
Maisam Arif 7f59c586e2 Merge amd-dev into amd-master 20240506
Signed-off-by: Maisam Arif <maisarif@amd.com>
Change-Id: Ie8056f3fb1ad99b732362385de3b70fc27176466
2024-05-06 17:28:26 -05:00
Sam Wu a37b0a3ffe Update RTD config to use Python 3.10 and rocm-docs-core 1.1.1
Change-Id: Icfd12dd44e8779dbf95a1f1e9277f9227ff816f6
2024-05-05 23:59:13 -04:00
Maisam Arif 52843152a5 SWDEV-444567 - Added Ring Hang Event
Signed-off-by: Maisam Arif <maisarif@amd.com>
Change-Id: I2e73ba08ee0004f6f30660b2fa425ea94bafceca
2024-05-03 17:21:28 -04:00
Galantsev, Dmitrii 65379b39dd CMAKE - Update to 3.20 due to compilation issues
Change-Id: If06b039a7aa7ba5966ceabaac864fda448b100a0
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
2024-05-02 21:52:10 -04:00
Galantsev, Dmitrii a5f889930a DOCKER - Lock to 22.04 and install modern cmake
Change-Id: I032ef3c0b968a622cca49342f7e85170fc300b7f
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
2024-05-02 21:52:10 -04:00
Maisam Arif f8c19dce67 Merge amd-dev into amd-master 20240502
Signed-off-by: Maisam Arif <maisarif@amd.com>
Change-Id: I9d8d0cd0f4ffe39605d087dd52a7768fc15db49d
2024-05-02 16:40:26 -05:00
Maisam Arif bf6fc51f4f Moved Changelog fixes to correspond with release
Signed-off-by: Maisam Arif <maisarif@amd.com>
Change-Id: I28c91f63ceb5d635d588e3d1d5ec1a385ddc467f
2024-05-02 16:38:58 -05:00
Maisam Arif 733ec3cd20 Updated Changelog with process isolation updates
Signed-off-by: Maisam Arif <maisarif@amd.com>
Change-Id: Ic773137ff05b1819f60d42b8a933ef6ebb9addec
2024-05-02 16:16:29 -05:00
Maisam Arif 11c72946eb Revert "SWDEV-458102 - Deprecated Voltage Curve API"
This reverts commit 1423fb632e.

Change-Id: I8a3eaf0a9f28200e09fb35d5260fbc070fe8a4a9
2024-05-02 15:27:16 -05:00
Charis Poag c24d66740e SWDEV-450580 - Fix powercap set
Updates:
     * CLI - Added AMDSMIHelpers.convert_SI_unit() to help
       conversion of units
     * API - Reverted to uW for power cap limits
     * CLI - amd-smi static --limit now includes MIN_POWER
     * Tests now all use uW units so that conversion to W
       happens only in the CLI
     * Python API now reports the same uW units as seen
       in the amdgpu driver
     * CLI - amd-smi metric --power:
       Fixed power seen on gpu_metrics v1.3

Change-Id: I32d9ba78d0d8806772f0860f9a803a885b3f316a
Signed-off-by: Charis Poag <Charis.Poag@amd.com>
2024-05-02 10:13:39 -05:00
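
The unit split described in c24d66740e (API values in uW, conversion to W only at the CLI layer) can be sketched as follows. The signature of `AMDSMIHelpers.convert_SI_unit()` is not shown in this log, so the two helpers below are hypothetical stand-ins rather than the real helper.

```python
MICRO = 1_000_000  # microwatts per watt


def microwatts_to_watts(value_uw: int) -> float:
    """Convert an API-level power cap (uW) to watts for display."""
    return value_uw / MICRO


def watts_to_microwatts(value_w: float) -> int:
    """Convert a user-supplied power cap (W) back to the uW the API expects."""
    return round(value_w * MICRO)


# The API layer keeps uW throughout; only the presentation layer converts.
power_cap_uw = 250_000_000
print(f"{microwatts_to_watts(power_cap_uw):.0f} W")
```

Keeping a single unit below the CLI means tests and the Python API can compare values directly against what the driver reports.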
Maisam Arif 051d5a4d42 Bump Version to 24.5.2.0
Signed-off-by: Maisam Arif <maisarif@amd.com>
Change-Id: I2f51ed93a356e55156983c56bac293a5d7d3b5c1
2024-05-02 02:53:48 -04:00
Maisam Arif 1423fb632e SWDEV-458102 - Deprecated Voltage Curve API
Signed-off-by: Maisam Arif <maisarif@amd.com>
Change-Id: I111c3ce26d2ab66d5e755432f4b8a9bfa631f805
2024-05-02 02:53:29 -04:00
Maisam Arif 962e217d08 Updated README example output
Signed-off-by: Maisam Arif <maisarif@amd.com>
Change-Id: I45e7ecea022a028501f381fea4291bf78cc4494b
2024-04-30 19:13:46 -05:00
Maisam Arif e6054be6e7 SWDEV-453493 - Fix Null pointer reference in amd-smi bad-pages
Signed-off-by: Maisam Arif <maisarif@amd.com>
Change-Id: I10a1278b68cbb464dd0fb38a2de50413f6f43959
2024-04-26 04:04:43 -05:00
Maisam Arif 25ef420407 Updated monitor --pcie to use gpu_metrics pcie bandwidth
Signed-off-by: Maisam Arif <maisarif@amd.com>
Change-Id: Id37aebc0297317edcd0f459a4817f56a6030d902
2024-04-25 20:10:02 -04:00
Bill(Shuzhou) Liu a0d0210761 Process isolation sysfs format change
The process isolation sysfs format has changed. This fix adapts
to the new sysfs format.

Change-Id: Id6fd7eeb3e25525047dccab248fd9cfb206cbf62
2024-04-25 08:11:05 -05:00
khashaik aad42d414a amd-smi_cli: Fix issue for set core boost limit in CPU
Change-Id: I1af4e9d14b1667c5279fcf02cebb4103a92e162c
2024-04-25 02:05:41 -04:00
Maisam Arif 614816ab7e Added new functions to py-interface __init__.py
Signed-off-by: Maisam Arif <maisarif@amd.com>
Change-Id: I4bd591f834b026793cc9158890e30999cba46e82
2024-04-24 14:26:23 -04:00
Bill(Shuzhou) Liu 7d2ab7970d Process isolation and clean shader
A few APIs and command line options are added to support process
isolation and clean shader.

Change-Id: I98ad3fc9fc7429799a21798b7fca1c307de7f403
2024-04-24 13:22:20 -04:00
Maisam Arif 881920c864 Merge amd-dev into amd-master 20240424
Signed-off-by: Maisam Arif <maisarif@amd.com>
Change-Id: Ic78df64607f1541f34305b4955a9696c9933a1c1
2024-04-24 05:10:55 -05:00
Oliveira, Daniel 1ae3a5b6cb fix: [SWDEV-458102] [rocm/amd_smi_lib]
Drops checks that are invalid with the new pp_od_clk_voltage format

Code changes related to the following:
  * get_od_clk_volt_info()
  * get_od_clk_volt_curve_regions()

Change-Id: I534c920e00fa3dacdb980f431db5eef260ac93f5
Signed-off-by: Oliveira, Daniel <daniel.oliveira@amd.com>
2024-04-23 18:23:39 -05:00
Maisam Arif 0d6626db0d Removed print in python interface
Signed-off-by: Maisam Arif <maisarif@amd.com>
Change-Id: I80e8cf18dc7631c66d4863251438327b8853cead
2024-04-23 04:49:47 -05:00
Maisam Arif e81051a724 Updated Sphinx to include Changelog
Signed-off-by: Maisam Arif <maisarif@amd.com>
Change-Id: I858d0579ab0e8ba9f228373b6d31dfd3088703ae
2024-04-23 04:48:42 -05:00
Maisam Arif 1bd18c1a65 Added new ecc blocks and adjusted metric --ecc-block filtering
Signed-off-by: Maisam Arif <maisarif@amd.com>
Change-Id: Ib2f69c7d59ee5108024794434fb202b5e4f58738
2024-04-18 15:01:41 -04:00
Maisam Arif c551c3caed SWDEV-455131 - Updated process APIs
- Removed amdsmi_get_gpu_process_info from python API
  - Updated documentation
  - Aligned process --json output format to unit & value format

Signed-off-by: Maisam Arif <maisarif@amd.com>
Change-Id: I82bba1b6df71020b4a5995ff63b9aa62611ce4fe
2024-04-18 14:00:59 -05:00
guanyu12 e2c2f7f8eb Merge amd-dev into amd-master 20240411
Signed-off-by: guanyu12 <guanyu12@amd.com>
Change-Id: I9aafdcd4f12c4e194afdb16ca4c389466a40f81b
2024-04-11 10:22:06 +08:00
Galantsev, Dmitrii c06be55b6a GIT - Set dependabot checks to monthly
Change-Id: If4db71c0d7b68bc03ba302a01e6cf779a32e4c2b
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
2024-04-05 11:14:52 -04:00
Maisam Arif 1171c233cf Merge amd-dev into amd-master 20240405
Signed-off-by: Maisam Arif <maisarif@amd.com>
Change-Id: I7bc20ca726cb5bcd8ac166d2263074562e6e752c
2024-04-05 02:32:00 -05:00
Maisam Arif 092908daee Bump Version to 24.5.1.0
Signed-off-by: Maisam Arif <maisarif@amd.com>
Change-Id: I842e223b78f337a39098f652fa6e7ef51948fbaf
2024-04-05 02:31:08 -05:00
Maisam Arif 50450a2a69 Added amdsmi_get_gpu_process_info python library documentation
Signed-off-by: Maisam Arif <maisarif@amd.com>
Change-Id: I2218bf664a8a155e6b3085378db0fb20f3be3f70
2024-04-05 02:30:13 -05:00
Maisam Arif 9758a8bc33 Removed fb_sharing fields from Linux BM
Signed-off-by: Maisam Arif <maisarif@amd.com>
Change-Id: Ia2942b9d33699ced1683270454c479701bce1246
2024-04-05 03:01:24 -04:00
Charis Poag cdf920f0f4 Merge amd-dev to amd-master 20240401
Change-Id: Ic4c10b49f58e1f2e220a6c295d54e7ad20f6be2c
Signed-off-by: Charis Poag <Charis.Poag@amd.com>
2024-04-01 23:04:38 -05:00
Charis Poag 08a3e76b26 SWDEV-445668 - Align topology JSON
Updates:
    - [CLI] Updated JSON output to provide a format
      similar to host
      e.g.
      [
    {
        "gpu": 0,
        "bdf": "0000:01:00.0",
        "links": [
            {
                "gpu": 0,
                "bdf": "0000:01:00.0",
                "weight": 0,
                "link_status": "ENABLED",
                "link_type": "SELF",
                "num_hops": 0,
                "bandwidth": "N/A",
                "fb_sharing": "ENABLED"
            },
            {
                "gpu": 1,
                "bdf": "0001:01:00.0",
                "weight": 15,
                "link_status": "ENABLED",
                "link_type": "XGMI",
                "num_hops": 1,
                "bandwidth": "50000-100000",
                "fb_sharing": "ENABLED"
            },
        ...
        ]
    },
    {
    ...

Change-Id: I63217f63a4d6ebc23a8a84eaac9dbb7aff5f4cb4
Signed-off-by: Charis Poag <Charis.Poag@amd.com>
2024-04-01 18:37:06 -04:00
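
As a rough illustration of consuming the topology JSON layout shown in 08a3e76b26, the Python sketch below parses a trimmed copy of that example and lists each GPU's XGMI peers. The sample is cut down to the two links given in the commit message (the elided entries are not reconstructed), and `xgmi_peers` is a hypothetical helper, not part of amd-smi.

```python
import json

# Trimmed from the commit message example; elided entries are omitted.
SAMPLE = """
[
    {
        "gpu": 0,
        "bdf": "0000:01:00.0",
        "links": [
            {"gpu": 0, "bdf": "0000:01:00.0", "weight": 0,
             "link_status": "ENABLED", "link_type": "SELF",
             "num_hops": 0, "bandwidth": "N/A", "fb_sharing": "ENABLED"},
            {"gpu": 1, "bdf": "0001:01:00.0", "weight": 15,
             "link_status": "ENABLED", "link_type": "XGMI",
             "num_hops": 1, "bandwidth": "50000-100000",
             "fb_sharing": "ENABLED"}
        ]
    }
]
"""


def xgmi_peers(topology):
    """Map each GPU index to the GPU indices it reaches over XGMI."""
    return {
        entry["gpu"]: [
            link["gpu"] for link in entry["links"] if link["link_type"] == "XGMI"
        ]
        for entry in topology
    }


print(xgmi_peers(json.loads(SAMPLE)))
```

Because every link carries its own `gpu` and `bdf` fields, a consumer can filter by `link_type` without cross-referencing a separate matrix.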