Граф коммитов

146 Коммитов

Автор SHA1 Сообщение Дата
Oliveira, Daniel 2c8ba4cae9 rocm_smi_lib: Fix Refactoring gpu_metrics code
Uses new support for 'gpu_metrics_v1_4'

Code changes related to the following:
  * rsmi gpu_metrics APIs
  * rsmi gpu_metrics Logs
  * new data structure fields added in 1.4
  * added APIs for all other existing metrics before 1.4
  * added support to older metrics; 1.1, and 1.2
  * public APIs renamed to start with prefix 'rsmi_dev_metrics_'
  * Unit tests updated
  * Examples updated

Build changes related to the following: None

Change-Id: Ibdaf031be9d916020b4049544dbd725858c7711d
Signed-off-by: Oliveira, Daniel <daniel.oliveira@amd.com>
2023-11-10 19:05:09 -06:00
Charis Poag 57b6135e54 Partition EBUSY with RSMI_STATUS_BUSY & invalid GPU Metrics check
* Updates:
   - [API/CLI] rsmi_dev_*_partition_set &
     rsmi_dev_*_partition_reset - exposed RSMI_STATUS_BUSY for
     EBUSY writes + cleaned up accidental map insertions
     (maplookup[] can insert values that are not in the map,
     map.at(key) fixes this potential issue)
   - [API] rsmi_dev_gpu_metrics_info_get() - returns
     RSMI_STATUS_NOT_SUPPORTED for unsupported metric tables
     outside of 1v1/1v2/1v3
   - [API] writeDevInfoStr() - exposes RSMI_STATUS_BUSY for
     EBUSY write errors; kept backward compatibility
     for other writes which do not care about these states
   - [API] rsmi_dev_od_volt_info_get()
      & rsmi_dev_od_volt_curve_regions_get() have better logging
     + Expose more details on why they are erroring
   - [Utils/logs/example] Expose AMD GPU gfx target version to aid in
     system troubleshooting
   - [Utils] Added test methods that look at od volt
     freq & regions into here - for easier access across
     several tests
   - [Utils] Updated getRSMIStatusString(new argument - fullstatus;
     default to true for backwards compatibility)
     -> true shows shortened RSMI STATUS response
   - [Utils] Added splitString to cut out noisy return responses
     (used in getRSMIStatusString(), when fullstatus = true)
   - [Utils] Added getFileCreationDate() to expose build date
     of the library - helpful for local builds or experimental builds
   - [Utils] Macro cleanup
   - [Example] Added a few gpu_metric checks - helpful for upcoming
     updates
   - [Device] SYSFS/DebugFS - now have better r/w displayed in logs
   - [LOGS] Expose library build date - see above for details
   - [Tests] Add more warnings/errors to test builds
   - [Tests] Moved up Partition tests for ordered test runs - helped
     identify issues with GPU BUSY writes
   - [Tests] compute_partition_read_write - handles RSMI_STATUS_BUSY
     with waits for busy status found & cleaned up how we checked
     for partition changes - with RSMI responses exposed more clearly
   - [Tests] perf_determinism - multi gpu now properly runs through
     with full resets as needed
   - [Tests] volt_freq_curv_read - better error handling with more
     verbose output

Change-Id: Ie94c6abb6a9aab95c345996d3ad3843cf6734977
Signed-off-by: Charis Poag <Charis.Poag@amd.com>
2023-10-27 14:52:02 -04:00
Charis Poag 6f1afd2678 bdfid fix for partition & xgmi nodes
* Updates:
    - [API] After discovering all amd gpus, we now properly
      map correct bdf (xgmi nodes). Especially important for
      partition changes - aka secondary nodes.
    - [API] While adding new secondary nodes we now have
      better grouping -> due to resorting based on
      kfd properties list & matching to primary uniqueid
    - [API] All secondary nodes are now AddToDeviceList
      with correct bdf (location id), provided by kfd
    - [API] Modified AddToDeviceList(..., uint64_t bdfid):
      providing an optional field - bdfid. This allows working
      around primary pcie cards with xgmi nodes
    - [API] Utils - cpplint minor fixes
    - [Example] Removed all endl references w/ newline, fixed
      spacing, and some incorrect values displaying as hex
      (needed dec representation)
    - [API] kfd node functions - now print full path of file
      for trace logs
    - [Tests] power_read.cc: Added in generic power test to
      confirm guaranteeing specific return values

Change-Id: I143474e8d64c4915a966e789be6bcea4fa7f4472
Signed-off-by: Charis Poag <Charis.Poag@amd.com>
2023-10-13 20:14:39 -05:00
Galantsev, Dmitrii 2a7589a065 TESTS - Skip XGMI test
Change-Id: Idd9f505f36fac4a670e5129f835aa051b5c4c9fa
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
2023-10-12 21:27:55 -05:00
Charis Poag 31a1fcce7d Add rsmi_dev_power_get
* Updates:
  - [API] Added rsmi_dev_power_get(uint32_t dv_ind,
                                   uint64_t *power,
                                   RSMI_POWER_TYPE
                                   *type)
          provides generic get to average or
          current power & provides backwards
          compatibility
  - Added a utility function to get MonitorTypes
    (monitor_type_string(type)) &
    RSMI_POWER_TYPE (power_type_string(type))
    strings
  - [Tests] Added rsmi_dev_power_get tests and
    provided better verification of return values for
    all power APIs
  - [Tests] Updated power outputs to show correct
    units
  - [example] Now uses avg, current, and generic
    power functions with type output response

Change-Id: I5ca06ca37fd5f61e100f2835b664d6cdd1ca42e6
Signed-off-by: Charis Poag <Charis.Poag@amd.com>
2023-10-10 00:34:19 -05:00
Charis Poag b251bb0c9f Rename NPS -> memory partition + compute partition node fix
* Updates:
        - rocm_smi_lib + CLI:
          Rename all "NPS mode" -> "memory partition"
          related files/functions/API/CLI to align with correct
          technical naming
        - rocm_smi_main: fixed identifying primary card's unique id
          utilize rsmi_dev_unique_id_get to map which
          KFD nodes belong to it
        - rsmi_dev_*_partition*: now have better logging output
        - compute partition tests:
          Added 20 sec delay for workaround until GPU
          busy is confirmed as the issue
        - CPPLint fixes/formatting
        - [Example] Moved all endl to "\n" for efficiency
        - [Example] Added Edge & Junction temperature examples
        - [Example] Added rsmi_minmax_bandwidth_get() example - WIP

Change-Id: Ida6db6fda7e0ac9d696a34cb15b4746e69d58d51
Signed-off-by: Charis Poag <Charis.Poag@amd.com>
2023-10-06 11:51:09 -04:00
Galantsev, Dmitrii e962d3b281 TESTS - Don't fail on TestFrequenciesRead
- Return from freq_output function early if clock is unsupported
- Right-align frequencies

Change-Id: I799c9351dac8a5be161bc9243cd3816539728357
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
2023-10-04 18:24:56 -05:00
Galantsev, Dmitrii cf6bcbbb27 Upgrade to CXX-17 gtest-1.14 and cmake-3.14
Also change the TARGET from amd_smi_libraries to rocm_smi_libraries
This helps reduce confusion between rocm-smi and amd-smi

Change-Id: Ie54cedd831ba24bd9afc341ad15b7e8e20732059
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
2023-09-28 12:44:51 -05:00
Charis Poag f078375350 Add Current (Instant) Socket Power
* Updates:
    - rocm_smi_logger:
      General cleanup &
      Aligned to cpplint rules for usage
    - rocm_smi_monitor:
      Fixed MonitorTypes
      from not displaying properly in logs
      & Added socket power label + current
      socket power MonitorTypes
    - rocm_smi API:
      Added rsmi_dev_current_socket_power_get API
    - rocm_smi CLI:
      General cleanup,
      Concise info now displays device data
      in variable width (see printLogSpacer's
      new field),
      printLogSpacer now as an adjustable
      variable that overrides appWidth,
      Added Socket Power to base rocm-smi +
      --showpower CLI calls,
      --showpower & base rocm-smi CLI defaults
      to printing socket power (if not available,
      displays average power)
    - Cleaned up temp label references
    - power_read gtests:
      Added current socket power to testing

Change-Id: Ica57e6f98ad96e2584e7c7955e188f68d2dab89d
Signed-off-by: Charis Poag <Charis.Poag@amd.com>
2023-09-25 01:38:54 -04:00
Ori Messinger d44a6ef523 ROCm SMI LIB: Add Missing Firmware Blocks
The purpose of this patch is to add the following missing firmware
blocks to the SMI LIB:
-RSMI_FW_BLOCK_MES
-RSMI_FW_BLOCK_MES_KIQ

Signed-off-by: Ori Messinger <Ori.Messinger@amd.com>
Change-Id: I5d4d37d883878dd02ef8533d4eb8891d54d70630
2023-09-25 01:37:38 -04:00
Galantsev, Dmitrii 0c662611e9 SWDEV-423672 - Always compile and install gtest
This commit makes sure GTest is always compiled with rocm_smi_lib_tests.

GTest installation was inconsistent outside of AMD CI environment.
libgtest.so wouldn't get installed with rocm_smi_lib_tests if gtest
existed on the build machine. Which is undesirable when packaging.

Change-Id: I607df6c67c81480e3b6487b28f14924e8bf56ad4
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
2023-09-23 21:10:12 -04:00
Oliveira, Daniel e0483f2ee2 rocm_smi_lib: Fix [linux BM] [AMDSMI] Memory Bandwidth
Implements APIs for 'gpu_metrics_v1_3' utilization averages

Code changes related to the following:
  * rsmi_dev_activity_metric_get()
  * rsmi_dev_activity_avg_mm_get()
  * CLI shows "Avg.Memory Bandwidth" under "--showmemuse"

Change-Id: I8e4600f350a7c18499abf022534db2b875f09d5f
Signed-off-by: Oliveira, Daniel <daniel.oliveira@amd.com>
2023-09-21 11:00:29 -04:00
Galantsev, Dmitrii 5c574ac79c TESTS - Check power and frequency support
It is not guaranteed that power can be read or set for some GPUs
(MI300). It is also not guaranteed that frequencies can be set.

As this is not a tool issue - we simply skip the failing test.

Change-Id: I134e96a476040cef513cd924f00e30cd6dea42a5
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
2023-09-14 22:19:33 -04:00
Galantsev, Dmitrii a4b470fe71 Add errors for existing but empty dev files
Change-Id: Iad9febc50f9b8e6085f8b605249ee884d2f134d6
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
2023-09-14 17:30:03 -04:00
Galantsev, Dmitrii d9381b6dae Fix misspelling averge -> average
Change-Id: I3546348560acadb1e775e10ad24115de4ccfc800
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
2023-09-13 19:49:46 -05:00
Galantsev, Dmitrii ff992e9b56 TESTS - re-enable frequency tests on aqua_vanjaram
Change-Id: I8fcd9418da5b973897ccfffc7d8a2f3ea833ea77
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
2023-09-11 19:43:25 -05:00
Charis Poag ed6777a8e7 Add GPU partition nodes
* Updates:
    - Fixed infinit loop on systems
      which did not have VRAM files
    - Fixed concise info from throwing exception
      with no amdgpu driver loaded
    - Fix for ability to see all nodes when
      after switching partitions (mirrors
      original card display/settings)
    - Added to logs build type, lib path,
      and set env. variables

Change-Id: Ic0333df355144ce2242cecea93fe4ce51caf311c
Signed-off-by: Charis Poag <Charis.Poag@amd.com>
2023-09-07 22:17:54 -05:00
Galantsev, Dmitrii 4aef767596 Cleanup rocm_smi.cc
Change-Id: Ia676c237222b0dd5d9e8a054a93776f3b11e2225
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
2023-09-07 15:50:40 -04:00
Galantsev, Dmitrii 84e90e55d5 TESTS - Add 90402 and simplify description
Change-Id: Ie6ab12d4201841fcb832d6827a5ec0ae5bb65114
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
2023-08-25 14:01:53 -05:00
Bill(Shuzhou) Liu 471fbfddc1 Numa affinity shows large number
Change the affinity from unsigned int to integer to represent -1.

Change-Id: I82dc6f476b45fa4ec03a3c686fe8e6e2b7761b56
2023-08-25 09:01:08 -04:00
Galantsev, Dmitrii 613bd8ad1d TESTS - Fix incorrect TestVoltCurvRead assert if not supported
Change-Id: I2242aa9be84543276c63f1f57fdc489754c9ee07
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
2023-08-22 16:51:42 -04:00
Galantsev, Dmitrii 62f01cb150 TESTS - Use gpu version as a workaround for a missing name
Depends-On: Ifbd38f11fbde7ba28af4be1d611310dea1b5112a
Change-Id: Ia7b7975f03424854df0a470b2719cf2ff2cf8e40
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
2023-08-21 19:18:22 -04:00
Oliveira, Daniel 573620f586 Add revision to --showhw
Code changes related to the following:
  * Added 'rsmi_dev_revision_get()' related code
  * Test code
  * Functional tests

Change-Id: I8c2097c65384a028c8c8437b717d05d52fe45250
Signed-off-by: Oliveira, Daniel <daniel.oliveira@amd.com>
2023-07-18 16:17:33 -05:00
Galantsev, Dmitrii 8fe848d10e Fix sys and id tests
The following read tests were failing:
*.TestIdInfoRead
*.TestSysInfoRead

1. *.TestIdInfoRead failed because rsmi_dev_brand_get did not specify
   dependency on vbios_version.

2. *.TestSysInfoRead failed because the test didn't expect vbios_version to
   be missing. Which is a new behavior in Aqua Vanjaram.

Change-Id: I9ee88a12fcf6cff2032049e2ecdfb2957efb03ab
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
2023-07-17 15:52:23 -04:00
Galantsev, Dmitrii 82078565e9 SWDEV-406542 - Add gtest to install targets
Change-Id: I116505aaa33109fce66ab8daf9921e2de11a27d4
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
2023-06-20 11:14:56 -05:00
Galantsev, Dmitrii 9519d5b8cf SWDEV-391041 - Disable TestPowerReadWrite
Change-Id: I56b5bea3e5206a6f0d5ecdb482103881f80f0b8b
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
2023-06-16 15:18:27 -04:00
Galantsev, Dmitrii e7585cc045 Assign tests to aqua_vanjaram
Change-Id: Iee78b1e810356327261006087b081e39dab0b9e8
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
2023-06-16 15:18:27 -04:00
Bill(Shuzhou) Liu d9b6af7a09 Expand showpids to provide more details
Provide details of GPU usage by an application.

Change-Id: I0f36df7d358754c2c8a60432b736d98f667ee99c
2023-06-16 08:52:18 -04:00
Galantsev, Dmitrii 0478d53e23 SWDEV-340919 - Package rsmitst
Similar to I879b21428e6642f19fda67092b365d8b78b7ba7b.

Main CMake improvements:

* Add rsmitst with -DBUILD_TESTS=ON
* Package tests into rocm-smi-lib-tests.deb and .rpm
* Note - this breaks build_rsmitst.sh

Misc improvements:

* Add .editorconfig to normalize code formatting
* Export compile_commands.json
* Remove gtest source and pull from github instead

Change-Id: Ib87ed4a5acd9f78badae6d028e5ff3d4f56dafc2
Depends-On: I8b26795471ad1432c805e45d8b58d7bb34abfcfc
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
2023-06-13 22:52:10 -05:00
Galantsev, Dmitrii ac94bf5ed5 Temporarily ignore TestFrequencies
See SWDEV-391039 and SWDEV-391040 for details

Change-Id: I662ba43363d949465454ea4af4d4586b3d47a811
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
2023-06-12 19:26:21 -05:00
Galantsev, Dmitrii f78f9a4082 Fix test temp blacklist, ignore TestVoltCurvRead
Change-Id: I86fa14fdc06e1b170a0bc0c0727fc08e4f4e2074
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
2023-06-06 17:02:14 -04:00
Charis Poag 78a0812f7f [SWDEV-391036 + SWDEV-392933] Fixes for VoltRead and ComputePart.
Updates:
    * VoltRead - needed to properly send out RSMI_STATUS_NOT_SUPPORTED
      when device does not have voltage hwmon files
    * ComputePart. - test failure was likely caused due to EvtNotif
      causing conflicts (unknown exactly why). Test passes when
      moving it ahead of the event notifier. Both API calls may have
      a system resource issue, TBD.
    * rocm_smi_example - now indicates when an API call
      returns RSMI_STATUS_NOT_SUPPORTED or
      RSMI_STATUS_NOT_YET_IMPLEMENTED. Allows example to fully complete
      on systems which may not provide support for all API calls.

Change-Id: I520b8584e078d412414e8e5797c664220a7e823a
Signed-off-by: Charis Poag <Charis.Poag@amd.com>
2023-04-05 12:44:29 -05:00
AravindanC 778f3b7fdc SWDEV-351540 - ASAN packaging for rocm_smi_lib
Change-Id: Iab354d02d261a0270a3d118b825835fc6f021c15
2023-03-20 13:14:53 -07:00
Charis Poag c252ecccd1 [SWDEV-335697 + SWDEV-342812] Fix NPS & Compute tests
Updates:
    * Fixed rsmi_dev_compute_partition_get
      & rsmi_dev_nps_mode_get to properly check
      for invalid arguments
    * Updated compute partition & NPS mode tests
      - Now properly confirms the invalid
        argument is seen
      - Spacing for multiple devices is added
        to better see distinction between
        separate device's tests (for verbose output)
      - Changed expect to assert calls, so errors
        are observed faster for test failures
      - Fixed multiple device testing where a
        variable should have been unset, but
        having multiple devices caused it to
        set
      - Updated multiple device testing to iterate
        accross all devices (previously returned,
        instead of continuing checking support
        after RSMI_STATUS_NOT_SUPPORTED detected)
      - Fixed a few spelling errors & verbose output

Change-Id: Ieba9e5b46763c6cd880fbf27fcdf58be8ecbc683
Signed-off-by: Charis Poag <Charis.Poag@amd.com>
2023-03-02 13:24:38 -06:00
Charis Poag 0d3558945b [SWDEV-335697] Add RSMI_STATUS_SETTING_UNAVAILABLE for dynamic partition
Updates:
    * Added RSMI_STATUS_SETTING_UNAVAILABLE for
      rsmi_dev_compute_partition_set - gives users
      better error output when attempting to set
      compute partition to values not listed in
      available_compute_partition SYSFS
    * Updated python --setcomputepartition to
      provide better output when receiving
      RSMI_STATUS_SETTING_UNAVAILABLE
    * Updated all test & example files to check for
      RSMI_STATUS_SETTING_UNAVAILABLE when doing
      rsmi_dev_compute_partition_set

Change-Id: Ida5d54880d9b9b6e4a0468cdb962fdc0c18d6257
Signed-off-by: Charis Poag <Charis.Poag@amd.com>
2023-02-27 11:17:44 -06:00
Charis Poag 77c950a4bf [SWDEV-381630] Add reset partition functionality
Updates:
    * Added rsmi_dev_compute_partition_reset & rsmi_dev_nps_mode_reset
    * Added --resetcomputepartition and --resetnpsmode python smi calls
    * Added temp data files rocmsmi_boot_compute_partition_<device num>
      & rocmsmi_boot_nps_mode_partition_<device num>, writes UNKNOWN
      if data cannot be read or device does not support
    * Cleaned up NPS & compute API documentation
    * Added creation and reading of API temp files (used in reset
      functionality)
    * Cleaned up output of rocm_smi_example
    * Updated rocm_smi_example to check if running with sudo permission
      before executing write API calls (cleans up erroneous output)
    * Added template specialization for storing temp data, requires
      specific rsmi_type_t enums (restrics what data can be stored)
    * Added storage of temp data, if temp files do not exist
    * Updated google tests for NPS & compute to include reset API calls

Change-Id: I69895a466b97107617e6dbb355737b84499a76c9
Signed-off-by: Charis Poag <Charis.Poag@amd.com>
2023-02-17 12:55:08 -06:00
Charis Poag 9ef376cd61 SWDEV-342812- Add NPS support
Updates:
    * Added rsmi_dev_nps_mode_set and rsmi_dev_nps_mode_get
    * Added ability to set multiple SYSFS files in debug build
    * Added ability to see user's env variables set for debug build
    * Added tests for rsmi_dev_nps_mode_set and rsmi_dev_nps_mode_get
    * Added ability to restart AMD GPU driver, used in nps_mode_set
    * Updated ROCm_SMI_Manual.pdf to include new APIs
    * Added progress bar for long running python_smi_tools, used
      in setting nps_mode if runs longer than .1 seconds

Change-Id: I6d61bedd28d7cba6aff432ad2d127ba741b7d15a
Signed-off-by: Charis Poag <Charis.Poag@amd.com>
2023-02-14 11:54:24 -06:00
Elena Sakhnovitch 2b449fe58d Measure api execution time
Add new test to measure api execution time.

Change-Id: I0ad10c822bad4a2ae04b5785173b4ff21996021d
2023-01-16 17:00:36 -05:00
Charis Poag 4d7f3f2bc7 SWDEV-335697- Add support for dynamic partitioning
Original updates:
    * Added .gitignore to help with future commits
    * Updated/added copyrights on modified or added files
    * Updated rocm_smi.h/.cc
      - Added 3 new SMI API functions:
          rsmi_dev_compute_partition_set &
          rsmi_dev_compute_partition_get
      - Added helpful maps/enums used in
        new get/set compute_partition API calls
    * Updated rocm_smi.py
      - Added --showcomputepartition
      - Added --setcomputepartition
      - Fixed a few mistypes
    * Updated rsmiBindings.py - added helpful class/dict/list
    * Updated rocm_smi_example.cc
      - Added helpful MACRO to detect if api is not supported.
      - Added current_compute_partition set/get rocm lib calls
      - Added helpful macro to discover future RSMI errors
      - Commented out test_set_freq, was having permission issues
        on a Navi21
    * Updated rocm_smi_main.cc
      - Added helpful map to debug API calls, left in for future use
      - Added comment to better understand a non-class function returns
    * Added computepartition_read_write.cc/.h
      - Added get/set compute partition API test calls
      - Confirmed on devices that do not support the API calls, tests pass
    * Updated rocm_smi_test/main.cc
      - Calls new compute partition gtests

Added following updates from review feedback:
   * Updated rocm_smi.h/cc
       - Removed C++ API calls, adding support for both C/C++
         API calls could cause confusion and adds extra work for us
       - rsmi_dev_compute_partition_get -> Fixed an edge case where
         user gives a small buffer length size (smaller than data
         received), but does not receive the partial buffer back.
         google Tests are updated to reflect this find.
   * Updated rocm_smi_example.cc
       - Fixed test_set_freq, issue was that file was not writable.
         We now indicate this warning, so prior errors make sense.
       - General test code cleanup. Removed extra code,
         by creating loops for tests.
   * Updated rocm_smi_main.cc
     - Moved and got rid of an external reference to a map used
       for debugging RSMI enums, now is a const public reference.
   * Updated rocm_smi.py
     - Updated python code to identify NOT_SUPPORTED due to
       (currently) only a few GPU support the feature

Change-Id: I4a567acbb59d6771fb64df08d19175fe3604fd1b
2023-01-13 10:46:40 -05:00
Bill(Shuzhou) Liu 8ce9289bc2 Upgrade GoogleTest to v1.11.0
The old GoogleTest has compile errors on Centos 9. Upgrade it
to latest version.

Change-Id: I6bbe6afdfad6422a210f258880ddc87a9f088d76
2022-03-09 15:18:43 -05:00
Sreekant Somasekharan e6ae697e9c Add blacklist filter 'virtualization' for rsmi tests failing in SRIOV
Change-Id: Ibbaef092482c0b78ecd86a29f0b9b4331b51abe2
2022-03-04 22:13:44 -05:00
Divya Shikre 8c4635acea Temporary blacklist TestPerfLevelReadWrite for navi21
Signed-off-by: Divya Shikre <DivyaUday.Shikre@amd.com>
Change-Id: Iee2146170b6828fe4fe2846c3ebfd57f95734f34
2022-01-27 22:56:37 -05:00
Divya Shikre 11a71c63b1 Don't assert when fan is not supported.
Add a check when RSMI_STATUS_NOT_SUPPORTED is returned for fanRead/fanReadWrite.
Fix for SWDEV-314176 & SWDEV-314175 reported.

Signed-off-by: Divya Shikre <DivyaUday.Shikre@amd.com>
Change-Id: Icf2cc541a3fa5ca4794aff5d6bc91104adc45e6d
2022-01-20 12:29:12 -05:00
Divya Shikre b4fd9c0d94 Update temp_read rsmitst.
Check for RSMI_STATUS_INVALID_ARGS when invalid args are passed.

Signed-off-by: Divya Shikre <DivyaUday.Shikre@amd.com>
Change-Id: I0d5ff84aee5cce4214026ddcd860a17ae3e43147
2021-11-29 18:09:45 -05:00
Sreekant Somasekharan c6f695f5a9 Skip TestFrequenciesReadWrite for unsupported ASICs
For ASICs NAVI10 and above setting display clock [DCEFCLK] is not supported and the sysfs entry is
read-only. As a result, the test falsely fails for these ASICs. ROCm SMI Lib is ASIC independent.
So Display clock set cannot be selectively disabled for these ASICs.

As a compromise if the set (write to sysfs entry) fails due to permission error and euid is root,
assume that set feature is not supported and skip the test.

Change-Id: I7a273878cbf1465b01728705323e8a92a42378dd
2021-11-29 11:23:38 -05:00
Sreekant Somasekharan 3f27dcc1ac Modify bool variable to true in if condition of src=dst
Change-Id: Ie2024b3a6ad68e48384bb3472fe8785bcd643665
2021-11-17 12:53:40 -05:00
Sreekant Somasekharan ce46fd237a Add test case for rsmi_is_P2P_accessible API.
Change-Id: Iccfede42925c98d96454b5f25cc0ed6fc9258911
2021-10-28 17:06:07 -04:00
Divya Shikre e96d6ab77e Add failing rsmi tests to exclude file to enable blacklisting
Signed-off-by: Divya Shikre <DivyaUday.Shikre@amd.com>
Change-Id: Ibdad4d54ffe87391b13379c63e005fd04c6abaf5
2021-10-26 17:57:05 -04:00
Bill(Shuzhou) Liu 42d39d3e34 Add -g compiler option for ADDRESS_SANITIZER
Add -g compiler option for Address Sanitizer

Change-Id: I958fefa6c4b5871c29734ab1d4ec238c9e073192
2021-08-03 13:54:19 -04:00
Divya Shikre 6edea7a92e Add fix to ignore error returned when perf determinism is not supported.
Signed-off-by: Divya Shikre <DivyaUday.Shikre@amd.com>
Change-Id: I89b6a0a3dbba6fbd4b12ff2e20670eff9f32ed7f
2021-06-14 12:18:22 -04:00