Commit Graph

420 Commits

Author SHA1 Message Date
Bill(Shuzhou) Liu 1f2d0cefb3 Handle csv output when the command is not based on the device
Fix the error where only one CSV line is printed when the output
is not based on a device.

Change-Id: Idacc5d98acc223e932fb3d46c888bfa04778b73c


[ROCm/amdsmi commit: 80d650b95a]
2023-07-26 15:28:18 -05:00
Maisam Arif 8c2266573f SWDEV-394316 - Handle not applicable vbios
Change-Id: I3390078a63c9a5eff67024b84a3be1369c4b1460
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>


[ROCm/amdsmi commit: c78ec46671]
2023-07-25 16:33:22 -05:00
Charis Poag 43075e2886 Update logging and README for other project usage
Updates:
    * [rocm-smi] Logging now can update files on
      per-project-basis for install/remove
    * [rocm-smi] README now has latest build
      instructions, including test builds
    * [rocm-smi] Updated README to include
      revision dates

Change-Id: Ifb19a6f32ccf6938f47225db53fef88021909264
Signed-off-by: Charis Poag <Charis.Poag@amd.com>


[ROCm/amdsmi commit: 4613e8dec3]
2023-07-20 19:09:11 -05:00
Oliveira, Daniel bec2ebc893 Add revision to --showhw
Code changes related to the following:
  * Added 'rsmi_dev_revision_get()' related code
  * Test code
  * Functional tests

Change-Id: I8c2097c65384a028c8c8437b717d05d52fe45250
Signed-off-by: Oliveira, Daniel <daniel.oliveira@amd.com>


[ROCm/amdsmi commit: 573620f586]
2023-07-18 16:17:33 -05:00
Galantsev, Dmitrii 77d8364211 Fix sys and id tests
The following read tests were failing:
*.TestIdInfoRead
*.TestSysInfoRead

1. *.TestIdInfoRead failed because rsmi_dev_brand_get did not specify
   dependency on vbios_version.

2. *.TestSysInfoRead failed because the test didn't expect vbios_version to
   be missing, which is new behavior in Aqua Vanjaram.

Change-Id: I9ee88a12fcf6cff2032049e2ecdfb2957efb03ab
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>


[ROCm/amdsmi commit: 8fe848d10e]
2023-07-17 15:52:23 -04:00
Galantsev, Dmitrii fa34ddea56 Add .cache to gitignore
Change-Id: Ida03bf1f50704bea44827d7578cd74c1896d4368
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>


[ROCm/amdsmi commit: b0fe2fbd07]
2023-07-17 15:52:23 -04:00
Bill(Shuzhou) Liu c5d1e3f8c0 rocm-smi --showevents shows wrong gpuID
Use the gpuid returned from the event data instead.

Change-Id: I7f286cc105f7ea12985223e603504f0ef3d9724e


[ROCm/amdsmi commit: 0aeb6025bd]
2023-07-13 08:28:53 -05:00
Galantsev, Dmitrii e925cc320e Simplify gitignore
Remove generic gitignore to simplify tracking of generated files

Change-Id: Idf1f9719b2cfd16b31332a3ed87be5943c2c1ce7
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>


[ROCm/amdsmi commit: e6c42c6626]
2023-07-07 11:48:09 -04:00
Jeremy Newton b214d2047e Fix python loading of librocm_smi64
The librocm_smi64.so is used for development, while
librocm_smi64.so.MAJOR is used for runtime, thus the python front end
should not be loading the .so binary, but rather the .so.MAJOR binary.

As well, it's good not to hardcode "lib" as some distros will change
this.

rsmiBindings.py is now generated with CMake

Signed-off-by: Jeremy Newton <Jeremy.Newton@amd.com>
Change-Id: I7cb745f8936fdf10d3ebd6c1e606031f713184ca


[ROCm/amdsmi commit: 2d2c73a5e6]
2023-07-06 09:52:56 -04:00
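
As an illustration of the librocm_smi64 loading fix above (b214d2047e), here is a minimal sketch of building the runtime library name from a major version and a configurable library directory. The function names, the `/opt/rocm` prefix, and the `lib64` directory are assumptions for illustration; this is not the content of the generated rsmiBindings.py.

```python
import ctypes
import os


def candidate_library_paths(prefix, libdir, major, name="rocm_smi64"):
    """Return paths to try, preferring the versioned runtime library.

    All parameters are hypothetical; a build system such as CMake would
    normally substitute them when generating the bindings file.
    """
    # Runtime name (.so.MAJOR), not the development-only .so link.
    soname = f"lib{name}.so.{major}"
    # Try the configured install location first, then the loader's search path.
    return [os.path.join(prefix, libdir, soname), soname]


def load_library(paths):
    """Load the first library that opens; return None if none can be loaded."""
    for path in paths:
        try:
            return ctypes.CDLL(path)
        except OSError:
            continue
    return None
```

Passing the library directory in as a parameter, rather than hardcoding "lib", is what lets a distro that installs into `lib64` reuse the same front end.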
Jeremy Newton dd8eaa40b2 Only install asan license if enabled
Change-Id: I79c6fce84c23ed12e65db8e234a29dbfedd11f68
Signed-off-by: Jeremy Newton <Jeremy.Newton@amd.com>


[ROCm/amdsmi commit: 828f46b445]
2023-06-30 23:34:43 -04:00
Jeremy Newton 9381fe711c Actually fix version string
There seems to be a scope issue with the existing variables, but just
putting in the pkg version string seems sufficient.

Change-Id: I4ccef872ff848a70cb2abc07bf605c5f29a608e8
Signed-off-by: Jeremy Newton <Jeremy.Newton@amd.com>


[ROCm/amdsmi commit: 4f481dd7f3]
2023-06-30 23:34:14 -04:00
Tom Rix 830de53e50 Improve handling of ConstructBDFID errors
Building this package on Fedora reports this warning:
In file included from rpmbuild/BUILD/rocm_smi_lib-rocm-5.5.1/src/rocm_smi_main.cc:62:
In member function 'amd::smi::Device::set_bdfid(unsigned long)',
    inlined from 'amd::smi::RocmSMI::Initialize(unsigned long)' at rpmbuild/BUILD/rocm_smi_lib-rocm-5.5.1/src/rocm_smi_main.cc:330:27:
rpmbuild/BUILD/rocm_smi_lib-rocm-5.5.1/include/rocm_smi/rocm_smi_device.h:199:42: warning: 'bdfid' may be used uninitialized [-Wmaybe-uninitialized]
  199 |     void set_bdfid(uint64_t val) {bdfid_ = val;}
      |                                   ~~~~~~~^~~~~
rpmbuild/BUILD/rocm_smi_lib-rocm-5.5.1/src/rocm_smi_main.cc: In member function 'amd::smi::RocmSMI::Initialize(unsigned long)':
rpmbuild/BUILD/rocm_smi_lib-rocm-5.5.1/src/rocm_smi_main.cc:324:12: note: 'bdfid' was declared here
  324 |   uint64_t bdfid;
      |            ^~~~~

Only set the bdfid when it is known to be valid.

Signed-off-by: Tom Rix <trix@redhat.com>
Change-Id: I839b4d2d2d4e3b25469cf5972245b9630da00c87


[ROCm/amdsmi commit: 19c3e2aff9]
2023-06-30 00:16:44 -04:00
Jeremy Newton ebc4826ee9 Update default version to match tags
When building from github, these tags don't exist, so the defaults
should try to match the internal tags

Change-Id: Id570341f27e21916b1a7f3605ee2b5b9716cad9b
Signed-off-by: Jeremy Newton <Jeremy.Newton@amd.com>


[ROCm/amdsmi commit: 74dc98114f]
2023-06-30 00:16:22 -04:00
Jeremy Newton 70fa8a9903 Fix version file generation
This looks like a typo, as the following variables are not defined:
- AMD_SMI_LIBS_TARGET_VERSION_MAJOR
- AMD_SMI_LIBS_TARGET_VERSION_MINOR
- AMD_SMI_LIBS_TARGET_VERSION_PATCH

Change-Id: I43449e7bd2a2de643d33e79fad063a7859679c8d
Signed-off-by: Jeremy Newton <Jeremy.Newton@amd.com>


[ROCm/amdsmi commit: 1a86dd75bb]
2023-06-29 14:42:30 -04:00
Jeremy Newton 16f5c150d2 Fix python script install permissions
The keyword "PROGRAMS" should be used in place of "FILES" in order to
make sure executable scripts have the correct permissions.

Change-Id: I6c287dc1291774ad6d97a04d621957dea0a1b697
Signed-off-by: Jeremy Newton <Jeremy.Newton@amd.com>


[ROCm/amdsmi commit: d00d885394]
2023-06-27 14:57:59 -04:00
Bill(Shuzhou) Liu a2d80150c6 Crash if no hwmon sysfs
Return NOT_SUPPORTED if no hwmon sysfs.

Change-Id: I01356a21f004ab552ca6ef7ffb49934bfdfd5e31


[ROCm/amdsmi commit: 910bf677a9]
2023-06-26 08:00:32 -05:00
Galantsev, Dmitrii 56578b4333 SWDEV-406542 - Add gtest to install targets
Change-Id: I116505aaa33109fce66ab8daf9921e2de11a27d4
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>


[ROCm/amdsmi commit: 82078565e9]
2023-06-20 11:14:56 -05:00
Galantsev, Dmitrii 37f89fac6f SWDEV-391041 - Disable TestPowerReadWrite
Change-Id: I56b5bea3e5206a6f0d5ecdb482103881f80f0b8b
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>


[ROCm/amdsmi commit: 9519d5b8cf]
2023-06-16 15:18:27 -04:00
Galantsev, Dmitrii 849c6c3eaa Assign tests to aqua_vanjaram
Change-Id: Iee78b1e810356327261006087b081e39dab0b9e8
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>


[ROCm/amdsmi commit: e7585cc045]
2023-06-16 15:18:27 -04:00
Bill(Shuzhou) Liu f6b66b7661 Expand showpids to provide more details
Provide details of GPU usage by an application.

Change-Id: I0f36df7d358754c2c8a60432b736d98f667ee99c


[ROCm/amdsmi commit: d9b6af7a09]
2023-06-16 08:52:18 -04:00
Galantsev, Dmitrii 94d7ad71a3 SWDEV-340919 - Package rsmitst
Similar to I879b21428e6642f19fda67092b365d8b78b7ba7b.

Main CMake improvements:

* Add rsmitst with -DBUILD_TESTS=ON
* Package tests into rocm-smi-lib-tests.deb and .rpm
* Note - this breaks build_rsmitst.sh

Misc improvements:

* Add .editorconfig to normalize code formatting
* Export compile_commands.json
* Remove gtest source and pull from github instead

Change-Id: Ib87ed4a5acd9f78badae6d028e5ff3d4f56dafc2
Depends-On: I8b26795471ad1432c805e45d8b58d7bb34abfcfc
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>


[ROCm/amdsmi commit: 0478d53e23]
2023-06-13 22:52:10 -05:00
Galantsev, Dmitrii cc9a3905b9 Temporarily ignore TestFrequencies
See SWDEV-391039 and SWDEV-391040 for details

Change-Id: I662ba43363d949465454ea4af4d4586b3d47a811
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>


[ROCm/amdsmi commit: ac94bf5ed5]
2023-06-12 19:26:21 -05:00
Galantsev, Dmitrii 80dd98d778 --showtempgraph - Show N/A when no temp found
If temp in hwmon was missing - rocm-smi crashed.
e.g. /sys/class/drm/card1/device/hwmon/hwmon5/temp1_input

This change displays "N/A" for temp instead of crashing.

Change-Id: I02f84a466bd3acfbd9b65e7e4ca0f18e76606c3b
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>


[ROCm/amdsmi commit: 713f85721b]
2023-06-12 19:16:39 -05:00
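
The "N/A" fallback described in the --showtempgraph fix above (80dd98d778) can be sketched roughly as follows. The helper name `read_temp_display` is hypothetical and this is not the actual rocm-smi code; it only assumes that hwmon `temp*_input` files report millidegrees Celsius.

```python
def read_temp_display(path):
    """Return the temperature in degrees Celsius as text, or "N/A"."""
    try:
        with open(path) as f:
            millidegrees = int(f.read().strip())
    except (OSError, ValueError):
        # Missing, unreadable, or malformed sensor file: report it as
        # not available instead of letting the exception propagate.
        return "N/A"
    return f"{millidegrees / 1000:.1f}"
```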
Maisam Arif 57e2ba5fe1 SWDEV-404157 - Fixed printLog delimiter parsing
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: I3d8e22d185790f4325aeacc18e4bfcfe8777d356


[ROCm/amdsmi commit: 00e170c2f5]
2023-06-08 20:02:51 -05:00
Galantsev, Dmitrii 2bbf54b117 Fix test temp blacklist, ignore TestVoltCurvRead
Change-Id: I86fa14fdc06e1b170a0bc0c0727fc08e4f4e2074
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>


[ROCm/amdsmi commit: f78f9a4082]
2023-06-06 17:02:14 -04:00
Charis Poag 298268a0c5 [SWDEV-402336 + SWDEV-398070] Fix RPM install part2
Updates:
    [rocm-smi] RPM installation comment included a macro,
    now removed

Change-Id: Ifa7a8d2d1a713940c39e20df9d02635e0e623dd8
Signed-off-by: Charis Poag <Charis.Poag@amd.com>


[ROCm/amdsmi commit: e2dec17284]
2023-06-05 13:50:57 -05:00
Galantsev, Dmitrii 303b207caf Clean-up python errors and warnings
Used pyright to show errors and warnings and resolved most

Change-Id: I0fdf7dcdf08db5c35dec80f6645e0a395fbe4197
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>


[ROCm/amdsmi commit: e8391c9d7c]
2023-06-01 17:37:57 -04:00
Charis Poag 63def40bf2 [SWDEV-402336 + SWDEV-398070] Fix RPM install - override macros
Updates:
    * [rocm-smi] RPM installation now overrides macro usage

Change-Id: I2a5ba14670becc178f672182eabe71965a526178
Signed-off-by: Charis Poag <Charis.Poag@amd.com>


[ROCm/amdsmi commit: b0f2a9d2ef]
2023-06-01 11:58:42 -04:00
Galantsev, Dmitrii fb3672e29b Fix memset compile warning
Change-Id: If31210f3c6038e56f43ae8631ed1657d1509488e
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>


[ROCm/amdsmi commit: 2048f8978f]
2023-05-31 21:54:32 -04:00
Bill(Shuzhou) Liu 4c8f64224e Fallback to gpu_metrics if the sysfs is not available
The gpu_metrics may have the required PCI link width and speed.

Change-Id: I939d733f5f6a71088545ba042345eb1b6ad20ee5


[ROCm/amdsmi commit: a6467c4083]
2023-05-24 14:51:43 -05:00
Bill(Shuzhou) Liu a4c1afe5d4 SWDEV-400644: Reset the mutex only if errors
To prevent resetting the mutex while it is in use, only reset the mutex
if it cannot be acquired.

Change-Id: I95e0ed1bf543f285ce81b4df9c51e16a88081d38


[ROCm/amdsmi commit: 160c99d12d]
2023-05-22 11:20:44 -04:00
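
The idea in the mutex fix above (a4c1afe5d4) can be sketched in a simplified form. The library itself is C++ and the details of its mutex are not shown in this log, so the `ResettableLock` class, its timeout, and its reset counter below are purely illustrative assumptions built on Python's `threading.Lock`.

```python
import threading


class ResettableLock:
    """Hypothetical lock wrapper that is reset only when it cannot be acquired."""

    def __init__(self):
        self._lock = threading.Lock()
        self.resets = 0  # number of times the lock had to be replaced

    def acquire(self, timeout=1.0):
        # Normal path: the lock is acquired, so it is never reset while usable.
        if self._lock.acquire(timeout=timeout):
            return True
        # Failure path: assume the previous holder is gone, replace the lock,
        # and retry once with the fresh lock.
        self._lock = threading.Lock()
        self.resets += 1
        return self._lock.acquire(timeout=timeout)

    def release(self):
        self._lock.release()
```

Resetting unconditionally would risk discarding a lock that another user still holds, which is the situation the commit message describes avoiding.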
Charis Poag 14b2f93d48 [SWDEV-398070] Adding logging to ROCm SMI (by default off)
Updates:
    * [rocm-smi] Provide a thread-safe logging feature
    * [rocm-smi] Adding logrotation into install/upgrade/remove
      scripts
    * [rocm-smi] Updated cmake lists to include rocm_smi_logger
    * [rocm-smi] Updated DEB/RPM install/remove logging file &
      folder with all users having r/w privileges for
      /var/log/rocm_smi_lib/ROCm-SMI-lib.log
    * [rocm-smi] Added ability to do a glob search for multiple files
      (globFileExists), assists doing file searches with * strings
    * [rocm-smi] Added ability to log system details when RSMI_LOGGING
      is turned on (getSystemDetails())
    * [rocm-smi] Added logging to provide which ROCm API is being called
      when RSMI_LOGGING is on
    * [rocm-smi] Added logging to provide SYSFS path and read value,
      when RSMI_LOGGING is on. Provides error response on failure.
    * [rocm-smi] Added environment variable RSMI_LOGGING to control
      when logging is enabled or disabled. By default, by not
      setting this env. variable, logging is turned off. When
      setting RSMI_LOGGING=<any value>, logging is enabled
      which is placed in /var/log/rocm_smi_lib/ROCm-SMI-lib.log file.
      Setting RSMI_LOGGING is allowed in both debug and release builds.
    * [rocm-smi] Removed an initialize procedure which keeps
      debug_inf_loop. Seems this feature is not being used.

Change-Id: I79b48387609c6233c6f05b04fb8bba66b68c2399
Signed-off-by: Charis Poag <Charis.Poag@amd.com>


[ROCm/amdsmi commit: c3a095a180]
2023-05-17 21:18:52 -05:00
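
The RSMI_LOGGING switch described in the logging commit above (14b2f93d48) can be illustrated with a small sketch. The helper name `logging_enabled` is hypothetical, and treating an empty value as "set" is an assumption; the commit message only says that any value enables logging and that an unset variable leaves it off.

```python
import os


def logging_enabled(environ=None):
    """Return True when RSMI_LOGGING is present in the environment."""
    env = os.environ if environ is None else environ
    # Any value enables logging; only an unset variable disables it.
    return "RSMI_LOGGING" in env
```

With this behavior, running a tool as `RSMI_LOGGING=1 <command>` would turn logging on for that one invocation.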
Sam Wu 7e26f8ada1 sphinx documentation
ref: https://github.com/RadeonOpenCompute/rocm_smi_lib/pull/119

fix formatting in docs/index.md

Change-Id: I940ef8147a40bd3b702aa591bd56557a870621fb


[ROCm/amdsmi commit: ed74bc6eca]
2023-05-11 10:41:45 -04:00
Ranjith Ramakrishnan d248adcf16 SWDEV-383221 - Set the default value of ROCM_HEADER_WRAPPER_WERROR to OFF
Using wrapper header files will result in #warning message by default

Change-Id: I8941a96bdc1b921a7646ccb353130cb283957ff8


[ROCm/amdsmi commit: daffcdb930]
2023-05-08 16:56:52 -07:00
Charis Poag fc18ccd37a [SWDEV-392571] Fix concise info when missing VRAM info
Updates:
    * [rocm-smi] Added larger app width size, which helps
      display missing device info
    * [rocm-smi] Added better context when rsmi_ret_ok
      does not return with RSMI_STATUS_SUCCESS
    * [rocm-smi] Removed all references to an
      undefined function (printLogNoDev())
    * [rocm-smi] Fixed not detecting non-int
      values when setting the voltage curve
    * [rocm-smi] Added better context on missing
      sysfs file when setting clock overdrive
      values
    * [rocm-smi] Fixed getMemInfo() calls not
      referencing tuple values (making it easier
      to read)
    * [rocm-smi] Silenced concise info spitting
      out errors for missing VRAM files, instead
      display which metric is "unsupported" if
      the files are missing
    * [rocm-smi] Updated function descriptions for
      rsmi_ret_ok & getMemInfo
    * [rocm-smi] Updated getMemInfo to provide a
      quiet call, to silence for concise info calls.
      This provides a way to keep the output clean.
    * [rocm-smi-lib] When debug sysfs files are in
      use, now states which enums are enabled
      for debug

Change-Id: I0e9e0c97ccf71467ced0e1a1f71803327a8be2b7
Signed-off-by: Charis Poag <Charis.Poag@amd.com>


[ROCm/amdsmi commit: 6be92b9e26]
2023-04-13 15:11:35 -04:00
Bill(Shuzhou) Liu 759d14709d Validate the clock frequency when setting it
Add a check of the clock frequency when setting it.

Change-Id: I707291bfb5007bb69100c780af50a4b0f697bb37


[ROCm/amdsmi commit: b6789891b0]
2023-04-06 11:54:38 -04:00
Charis Poag a3c5120159 [SWDEV-391036 + SWDEV-392933] Fixes for VoltRead and ComputePart.
Updates:
    * VoltRead - needed to properly send out RSMI_STATUS_NOT_SUPPORTED
      when device does not have voltage hwmon files
    * ComputePart. - test failure was likely caused by conflicts
      with EvtNotif (exact cause unknown). Test passes when
      moving it ahead of the event notifier. Both API calls may have
      a system resource issue, TBD.
    * rocm_smi_example - now indicates when an API call
      returns RSMI_STATUS_NOT_SUPPORTED or
      RSMI_STATUS_NOT_YET_IMPLEMENTED. Allows example to fully complete
      on systems which may not provide support for all API calls.

Change-Id: I520b8584e078d412414e8e5797c664220a7e823a
Signed-off-by: Charis Poag <Charis.Poag@amd.com>


[ROCm/amdsmi commit: 78a0812f7f]
2023-04-05 12:44:29 -05:00
Bill(Shuzhou) Liu c71312a760 Increase the max BDF ID length
Increase the max length from 256 to 512.

Change-Id: I3114f7ce6852aafa9dfec0186f27c1121c939c69


[ROCm/amdsmi commit: 58c83eb379]
2023-03-29 10:04:28 -04:00
Bill(Shuzhou) Liu b875784232 Correct subsystem name by matching device id.
The rsmi_dev_subsystem_name_get() only matches subvendor id and
subdevice id for a vendor. The change will also match device id.

Change-Id: Ife3aedaf6fc7390ed7fa62edbde40c2340689b23


[ROCm/amdsmi commit: 0c82a9d577]
2023-03-28 15:48:31 -05:00
AravindanC 7e2d0970e5 SWDEV-351540 - ASAN packaging for rocm_smi_lib
Change-Id: Iab354d02d261a0270a3d118b825835fc6f021c15


[ROCm/amdsmi commit: 778f3b7fdc]
2023-03-20 13:14:53 -07:00
Charis Poag e543fc1e0c [SWDEV-387906] Fix rocm-smi initialize crash
Fix was needed due to hwmon updates.
Several voltage sensors (ex. vddgfx/vddnb)
are unsupported or not applicable
to upcoming hardware. This was not the case
for previous hardware sensors, resulting in
the rocm-smi crash observed.

Change-Id: Ib8593e10811638def26fc7a1eda29309e328db09
Signed-off-by: Charis Poag <Charis.Poag@amd.com>


[ROCm/amdsmi commit: f44d1ea8bc]
2023-03-17 15:04:34 -05:00
Bill(Shuzhou) Liu 6d5a10d914 Fix cppcheck static analysis report warning
Fix some warnings from the static analysis tool.

Change-Id: I7e8c2f5d6f79aff5fdcad81b1fd832900f213c47


[ROCm/amdsmi commit: 1b7eb4e1f4]
2023-03-13 09:27:19 -05:00
Ranjith Ramakrishnan ea950b8778 SWDEV-366831 - Compile time flag to switch between #warning and #error message
Using backward compatibility paths will provide an #error message. Compile time option added to enable/disable the #error message.
Disabling the same will provide a #warning message

Change-Id: Ib49633501aa6eb6d97158b1ecfc47de6f18fba85


[ROCm/amdsmi commit: 14b86107a7]
2023-03-10 08:56:45 -08:00
Bill(Shuzhou) Liu 41349aa57b Filter out the GPUs not assigned to a container in showpid
The process IDs of other containers are still visible in the sysfs file;
filter them out to prevent a crash.

Change-Id: I665912cd09c606804186aff8cba5c24f5e58ded7


[ROCm/amdsmi commit: 710649ab66]
2023-03-06 11:05:02 -06:00
Charis Poag d2497bb2d3 [SWDEV-335697 + SWDEV-342812] Fix NPS & Compute tests
Updates:
    * Fixed rsmi_dev_compute_partition_get
      & rsmi_dev_nps_mode_get to properly check
      for invalid arguments
    * Updated compute partition & NPS mode tests
      - Now properly confirms the invalid
        argument is seen
      - Spacing for multiple devices is added
        to better see distinction between
        separate device's tests (for verbose output)
      - Changed expect to assert calls, so errors
        are observed faster for test failures
      - Fixed multiple device testing where a
        variable should have been unset, but
        having multiple devices caused it to
        set
      - Updated multiple device testing to iterate
        across all devices (previously returned,
        instead of continuing checking support
        after RSMI_STATUS_NOT_SUPPORTED detected)
      - Fixed a few spelling errors & verbose output

Change-Id: Ieba9e5b46763c6cd880fbf27fcdf58be8ecbc683
Signed-off-by: Charis Poag <Charis.Poag@amd.com>


[ROCm/amdsmi commit: c252ecccd1]
2023-03-02 13:24:38 -06:00
Bill(Shuzhou) Liu f3b32d05df mem_use_pct uninitialized error
Initialize mem_use_pct if the memory info is not available.

Change-Id: Id8e285050149c51077356826c8f99719b473060d


[ROCm/amdsmi commit: fcb6afa289]
2023-02-27 16:47:45 -06:00
Charis Poag ff26973e15 [SWDEV-335697] Add RSMI_STATUS_SETTING_UNAVAILABLE for dynamic partition
Updates:
    * Added RSMI_STATUS_SETTING_UNAVAILABLE for
      rsmi_dev_compute_partition_set - gives users
      better error output when attempting to set
      compute partition to values not listed in
      available_compute_partition SYSFS
    * Updated python --setcomputepartition to
      provide better output when receiving
      RSMI_STATUS_SETTING_UNAVAILABLE
    * Updated all test & example files to check for
      RSMI_STATUS_SETTING_UNAVAILABLE when doing
      rsmi_dev_compute_partition_set

Change-Id: Ida5d54880d9b9b6e4a0468cdb962fdc0c18d6257
Signed-off-by: Charis Poag <Charis.Poag@amd.com>


[ROCm/amdsmi commit: 0d3558945b]
2023-02-27 11:17:44 -06:00
Bill(Shuzhou) Liu c4d64a56d8 Memory usage division by zero
showAllConcise failed with a division-by-zero error.

Change-Id: I469f1b9f268842cd51662be6f9036f555a8949b2


[ROCm/amdsmi commit: 55bc2e2072]
2023-02-24 10:12:36 -06:00
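
The two memory-usage fixes above (c4d64a56d8 and f3b32d05df) both concern computing a usage percentage when memory info is missing or zero. A guarded calculation might look like the sketch below; the function signature and the choice of 0 as the fallback value are assumptions, not the actual rocm-smi code.

```python
def mem_use_pct(used_bytes, total_bytes):
    """Return memory use as a whole-number percentage, or 0 if unavailable."""
    # Cover both failure modes: missing memory info (None) and a zero
    # total, which would otherwise raise ZeroDivisionError.
    if not total_bytes or used_bytes is None:
        return 0
    return round(100 * used_bytes / total_bytes)
```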
Bill(Shuzhou) Liu 5c43dc8160 Use Unified Changelog Template
The CHANGELOG.md is added to track changes.

Change-Id: I33547cb7f1596b4b8abf206aebdd664649d4d19f


[ROCm/amdsmi commit: b40933b895]
2023-02-21 14:27:55 -06:00
Charis Poag 02ca598e70 [SWDEV-381630] Add reset partition functionality
Updates:
    * Added rsmi_dev_compute_partition_reset & rsmi_dev_nps_mode_reset
    * Added --resetcomputepartition and --resetnpsmode python smi calls
    * Added temp data files rocmsmi_boot_compute_partition_<device num>
      & rocmsmi_boot_nps_mode_partition_<device num>, writes UNKNOWN
      if data cannot be read or device does not support
    * Cleaned up NPS & compute API documentation
    * Added creation and reading of API temp files (used in reset
      functionality)
    * Cleaned up output of rocm_smi_example
    * Updated rocm_smi_example to check if running with sudo permission
      before executing write API calls (cleans up erroneous output)
    * Added template specialization for storing temp data, requires
      specific rsmi_type_t enums (restricts what data can be stored)
    * Added storage of temp data, if temp files do not exist
    * Updated google tests for NPS & compute to include reset API calls

Change-Id: I69895a466b97107617e6dbb355737b84499a76c9
Signed-off-by: Charis Poag <Charis.Poag@amd.com>


[ROCm/amdsmi commit: 77c950a4bf]
2023-02-17 12:55:08 -06:00