Commit graph

271 commits

Author SHA1 Message Date
Bill(Shuzhou) Liu b52034fed8 Add API for the memory type
Get the memory type from libdrm and add a new API.

Change-Id: I89327bca2ef860f2e3f4f6ca20def2331eba66c0
2023-09-07 13:05:58 -05:00
Dmitrii Galantsev f96c7663b5 Merge "Update amdsmi_wrapper.py and name fields" into amd-dev 2023-08-30 17:30:38 -04:00
Galantsev, Dmitrii 03cfdeefd5 Update amdsmi_wrapper.py and name fields
When updating the wrapper I ran into an issue with anonymous structs.
Generated wrapper would contain a string split into multiple lines,
which is invalid python.

e.g.
    'struct_struct anonymous
    (struct.... amdsmi.h:355)'

After naming the structs - the issue is gone. BDF union now has to be
addressed with .fields

e.g.
    OLD: bdf.function_number
    NEW: bdf.fields.function_number

Change-Id: Ib3c640c088ad0cc67893d636827356902051f17f
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
2023-08-30 16:30:03 -05:00
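
A minimal sketch of why naming the inner struct changes the access path in a ctypes-based wrapper. Only `fields` and `function_number` come from the commit message above. The class names, the other field names, the `as_uint` member and the bit widths are illustrative assumptions, not the actual amdsmi.h layout.

```python
import ctypes


class BdfFields(ctypes.Structure):
    # Named inner struct. Widths and the extra field names are assumptions.
    _fields_ = [
        ("function_number", ctypes.c_uint64, 3),
        ("device_number", ctypes.c_uint64, 5),
        ("bus_number", ctypes.c_uint64, 8),
        ("domain_number", ctypes.c_uint64, 48),
    ]


class Bdf(ctypes.Union):
    # The struct is a named member, so its fields live under `.fields`.
    _fields_ = [
        ("fields", BdfFields),
        ("as_uint", ctypes.c_uint64),
    ]


bdf = Bdf()
bdf.fields.function_number = 1  # NEW style: bdf.fields.function_number
bdf.fields.device_number = 2
```

Because the struct is an ordinary named member rather than an anonymous one, `bdf.function_number` no longer resolves and callers have to go through `bdf.fields`.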
Shuzhou Liu fc5b481124 Merge "Support PCIe vendor name" into amd-dev 2023-08-30 09:58:21 -04:00
Deepak Mewar f1ade88d47 wrapper API to get first online core on cpu socket
Change-Id: Ia1785f94ff687e53fdb868e56d4a83c2466ba2ed
2023-08-29 05:15:33 -04:00
Deepak Mewar 0baa3f6b6a Renamed esmi library APIs and bound the APIs
to cpusocket handle

Change-Id: I6e3d8aa667df475339c28b27294349843f32230c
2023-08-29 05:15:12 -04:00
Deepak Mewar 7c0e21ddc7 Wrapper API declared for esmi error status
Change-Id: Ie3e00a50740d9ba58d7f4955ea6b76ab8b46fb5e
2023-08-29 05:14:01 -04:00
Galantsev, Dmitrii 1d24dd93a6 Fix uint32* -> int32* conversion error
Change-Id: I23c2a842468896e8d120ac4b8b55ef433dff6d85
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
2023-08-28 18:32:31 -05:00
Bill(Shuzhou) Liu 9021ef96dc Support PCIe vendor name
Add the support for PCIe vendor name.

Change-Id: Ibc1d289a08731e4c5a14f992f3b0d31b51482396
2023-08-28 16:46:43 -05:00
Galantsev, Dmitrii 936719eeb6 Merge remote-tracking branch 'rocmsmi/amd-staging' into amd-dev
Change-Id: I9c38b4facd472b877d1ad133f3176a023c890955
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
2023-08-23 16:04:15 -05:00
Bill(Shuzhou) Liu a10f00bf57 Fallback to kfd node when VRAM sysfs not available
The driver may not expose VRAM sysfs on certain systems. Add a
fallback to the kfd node.

Change-Id: Ib3be71b4f4d2c79318d5026b0a97f3657d8a97b6
2023-08-17 14:36:03 -05:00
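
The fallback described above can be sketched as a "first readable source wins" lookup. This is an illustration of the pattern only: the helper name `read_first_available` is hypothetical, and the commit does not state which files the library actually reads.

```python
from pathlib import Path
from typing import Iterable, Optional


def read_first_available(paths: Iterable[Path]) -> Optional[str]:
    """Return the stripped contents of the first readable path, else None."""
    for path in paths:
        try:
            return path.read_text().strip()
        except OSError:
            # Missing or unreadable file: try the next candidate.
            continue
    return None
```

A caller would list the preferred VRAM sysfs file first and the kfd node file second, so the kfd value is used only when the first read fails.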
Charis Poag 755e14dbad [SWDEV-399953] Smart Temperature detection + partitioning display
* Updates:
    - Fix for devices which have junction sensors but no edge sensors
    - Added partitioning (memory and dynamic) displays for
      base rocm-smi CLI calls
    - Added subheading for base rocm-smi call output
    - Added better hwmon and device detection logging

Change-Id: I8219884b2e532d6ed379527cacdc1f2b232a5451
Signed-off-by: Charis Poag <Charis.Poag@amd.com>
2023-08-10 19:53:38 -04:00
Oliveira, Daniel cc5ab079df Fix rsmitstReadWrite.TestPowerReadWrite test failure
Code changes related to the following:
  * All reinforcement work moved to their own files
  * Self contained changes only to support them
  * New files added to CMakeLists.txt

Change-Id: I761e91f54392824df9145eaed8b9805986861285
Signed-off-by: Oliveira, Daniel <daniel.oliveira@amd.com>
2023-08-09 21:51:05 -05:00
Maisam Arif b14da692eb Added workaround for inconsistent current pcie speed from gpumetrics
Change-Id: If8404d21341cd15eb4d0221ab92cb0b351bbdf3e
Signed-off-by: Maisam Arif <maisarif@amd.com>
2023-08-09 11:35:35 -05:00
Maisam Arif 82ac307f9b Added Gen type to pcie info
Signed-off-by: Maisam Arif <maisarif@amd.com>
Change-Id: Icaa050a6f53fad608ed0353b2a0cbea33dee1dd2
2023-08-02 23:42:48 -05:00
Charis Poag 9c7eed7edc [lib] Enhance Logger: gpu_metrics + enable console out
* Updates:
    - Env variable RSMI_LOGGING=0 or any other value
        -> all logging off
    - Env variable RSMI_LOGGING=1 -> logs only
    - Env variable RSMI_LOGGING=2 -> console only
    - Env variable RSMI_LOGGING=3 -> both logs + console
    - Metrics output includes hexdump of current file
      and decoded metrics (functions: logHexDump
      and log_gpu_metrics)
    - System info gathered now includes the system's
      perceived endianness (little or big endian),
      helpful for viewing the decoded hexdump or any
      binary translation
    - Added templates for printing unsigned hex
      (print_unsigned_hex_and_int), unsigned integers
      (print_unsigned_int), and printing both unsigned
      hex and int with an optional header
      (print_unsigned_hex_and_int)
    - Fixed some build compile warnings/errors,
      e.g. for strncpy calls on sku or board names
      (this operation is expected and needed);
      unsuccessful temp file writes now properly
      report RSMI_STATUS_FILE_ERROR
    - Fixed logrotate not initializing properly
      on RHEL 8.8/9.x

Change-Id: Ifa0f0218c9cafd0a8cd6aa8e7f94d61e9107200f
Signed-off-by: Charis Poag <Charis.Poag@amd.com>
2023-08-01 21:46:19 -05:00
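
The RSMI_LOGGING values listed in this commit message can be read as a small lookup table. The sketch below only restates that mapping in Python. The function name `logging_targets` and the tuple return shape are assumptions for illustration, not the library's code.

```python
import os
from typing import Mapping, Tuple


def logging_targets(env: Mapping[str, str] = os.environ) -> Tuple[bool, bool]:
    """Return (log_to_file, log_to_console) for the RSMI_LOGGING value."""
    value = env.get("RSMI_LOGGING", "").strip()
    table = {
        "1": (True, False),  # logs only
        "2": (False, True),  # console only
        "3": (True, True),   # both logs + console
    }
    # 0, unset, or any other value: all logging off.
    return table.get(value, (False, False))
```

Passing the environment as a parameter keeps the mapping easy to check without changing the real process environment.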
Maisam Arif a13d5be933 Updated READMEs
Signed-off-by: Maisam Arif <maisarif@amd.com>
Change-Id: Idf34bc431184414a17c3cb50c06543151ce3cb56
2023-08-01 14:28:33 -04:00
Maisam Arif ca59a60a9a Updated Versioning
Corrected to amd-smi version from rocm-smi version
Added newline characters in the gpu choices
Updated cli versioning to 23.2.1.0 to match amd-smi

Signed-off-by: Maisam Arif <maisarif@amd.com>
Change-Id: Ia6db3a281e2349e05a09209bdcfdfa5ac48e3a86
2023-08-01 14:28:27 -04:00
Maisam Arif d705801adf ASIC serial updates
Corrected asic serial fallback to use rsmi's unique id
Removed product serial due to duplication

Change-Id: Ib4e9ac00d2bf31ccbc35060bc84f7e79e5332d37
Signed-off-by: Maisam Arif <maisarif@amd.com>
2023-08-01 14:28:19 -04:00
Deepak Mewar 8a9771b225 esmi library integration update v1.0
1. new class files for cpu socket and cpu core created
2. wrapper APIs for getting energy monitoring, system
   statistics, power monitoring values implemented
3. modified amdsmi init & cleanup functions for esmi lib support
4. modified amdsmi system class for esmi lib support
5. sample test code created in example dir

Change-Id: Ic41f31641c283a681de696bb4346b557265bad42
2023-07-27 17:29:27 -05:00
Deepak Mewar 0187de61e2 esmi library header changes
1. New processor types AMD_CPU_CORE, AMD_APU added to ENUM
2. esmi errorcodes, wrappers for structures and library APIs
3. Macro introduced to enable/disable the esmi library code

Change-Id: Ia64b29303c231d3f17ac6b40fcd09b09b4380903
2023-07-27 16:21:24 -05:00
Bill(Shuzhou) Liu 55bf9cbe13 Change API to get the driver date
Support the driver date from libdrm.

Change-Id: I88e694732b538220e11fdb4029712bb5a6f44380
2023-07-21 08:28:06 -05:00
Oliveira, Daniel 573620f586 Add revision to --showhw
Code changes related to the following:
  * Added 'rsmi_dev_revision_get()' related code
  * Test code
  * Functional tests

Change-Id: I8c2097c65384a028c8c8437b717d05d52fe45250
Signed-off-by: Oliveira, Daniel <daniel.oliveira@amd.com>
2023-07-18 16:17:33 -05:00
Marko Oblak 78faf411f8 SWDEV-391188 - [AMDSMI][LinuxGuest] Added description in amdsmi header file for amdsmi_get_gpu_process_list, changed mentioned API in py_interface
Signed-off-by: Marko Oblak <Marko.Oblak@amd.com>
Change-Id: I8cb7f2c6595da6ab0263e6fa4365bde91d900979
2023-07-03 06:35:12 -04:00
Marko Oblak 01474ff14e SWDEV-392359 - [AMDSMI] [Linux] [Guest] Documented unsupported APIs
Signed-off-by: Marko Oblak <Marko.Oblak@amd.com>
Change-Id: I0cff925082e6bc637e4b5073df64445380b3a3f5
2023-06-21 13:18:32 +02:00
Bill(Shuzhou) Liu 8f26e881fb SWDEV-405668 - BDF difference between amdsmi and rocmsmi
The render node discovery is changed to match rocm-smi index.

Change-Id: I707d0844b377304f4e8fc15035902c707805c2dc
2023-06-16 17:06:00 -04:00
Bill(Shuzhou) Liu d9b6af7a09 Expand showpids to provide more details
Provide details of GPU usage by an application.

Change-Id: I0f36df7d358754c2c8a60432b736d98f667ee99c
2023-06-16 08:52:18 -04:00
Maisam Arif 9cebc93cee Cleaned up APIs
Change-Id: I93487e01d7126bdfa77439b571df927a6af3bb70
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
2023-06-07 10:48:37 -04:00
Dalibor Stanisavljevic 8dbc1d7d57 Align header changes with other platforms
Change-Id: I366e57310e0504855692626e2b2014bea235ed6b
Signed-off-by: Dalibor Stanisavljevic <Dalibor.Stanisavljevic@amd.com>
2023-06-02 12:28:09 +02:00
Bill(Shuzhou) Liu 62ce965409 Clean up the APIs
Remove and rename APIs after review.

Change-Id: I5464f200eb605b366673f8abca95183c3837843b
2023-05-30 16:08:54 -04:00
Dalibor Stanisavljevic 1bc1d431d8 SWDEV-384793 - Clean up API
Change-Id: I441b315d32df59a454e06d521e5ca8b2c229451a
Signed-off-by: Dalibor Stanisavljevic <Dalibor.Stanisavljevic@amd.com>
2023-05-19 16:40:26 +02:00
Charis Poag c3a095a180 [SWDEV-398070] Adding logging to ROCm SMI (by default off)
Updates:
    * [rocm-smi] Provide a thread-safe logging feature
    * [rocm-smi] Adding logrotation into install/upgrade/remove
      scripts
    * [rocm-smi] Updated cmake lists to include rocm_smi_logger
    * [rocm-smi] Updated DEB/RPM install/remove logging file &
      folder with all users having r/w privileges for
      /var/log/rocm_smi_lib/ROCm-SMI-lib.log
    * [rocm-smi] Added ability to do a glob search for multiple files
      (globFileExists), assists doing file searches with * strings
    * [rocm-smi] Added ability to log system details when RSMI_LOGGING
      is turned on (getSystemDetails())
    * [rocm-smi] Added logging to provide which ROCm API is being called
      when RSMI_LOGGING is on
    * [rocm-smi] Added logging to provide SYSFS path and read value,
      when RSMI_LOGGING is on. Provides error response on failure.
    * [rocm-smi] Added environment variable RSMI_LOGGING to control
      when logging is enabled or disabled. By default, by not
      setting this env. variable, logging is turned off. When
      setting RSMI_LOGGING=<any value>, logging is enabled
      which is placed in /var/log/rocm_smi_lib/ROCm-SMI-lib.log file.
      Setting RSMI_LOGGING is allowed in both debug and release builds.
    * [rocm-smi] Removed an initialize procedure which keeps
      debug_inf_loop. Seems this feature is not being used.

Change-Id: I79b48387609c6233c6f05b04fb8bba66b68c2399
Signed-off-by: Charis Poag <Charis.Poag@amd.com>
2023-05-17 21:18:52 -05:00
Dalibor Stanisavljevic ca7f965018 SWDEV-384797 - Renamed measure to info
Change-Id: I2397ed189fe0171ed29bd6440f8fa0bb210b95a5
Signed-off-by: Dalibor Stanisavljevic <Dalibor.Stanisavljevic@amd.com>
2023-05-17 05:10:58 -04:00
Suma Hegde edd8f1ae23 Renamed APIs
amdsmi_dev_open_supported_func_iterator -> amdsmi_open_supported_func_iterator
amdsmi_dev_open_supported_variant_iterator -> amdsmi_open_supported_variant_iterator
amdsmi_dev_close_supported_func_iterator -> amdsmi_close_supported_func_iterator

Change-Id: Ie9b2efa5aee7095c3c835b91de1951df6b065510
2023-05-11 11:01:37 -04:00
Suma Hegde 6256bf6f1a Renamed API amdsmi_dev_reset_xgmi_error
amdsmi_dev_reset_xgmi_error -> amdsmi_reset_gpu_xgmi_error

grep -rli 'amdsmi_dev_reset_xgmi_error' * | xargs -i@ sed -i
's/amdsmi_dev_reset_xgmi_error/amdsmi_reset_gpu_xgmi_error/g' @

Change-Id: Ic7e4c4b345fdf6187aed42d53fb7ae8536c2edea
2023-05-11 11:01:25 -04:00
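
The grep/sed pipeline in this commit message is a tree-wide, case-sensitive text substitution. A rough Python equivalent is sketched below. The function name `rename_identifier` is hypothetical, and it skips every dot-directory (such as `.git`), whereas the shell glob `*` only skips hidden entries at the top level.

```python
import os
from pathlib import Path


def rename_identifier(root: Path, old: str, new: str) -> int:
    """Replace `old` with `new` in every text file under root.

    Returns the number of files that were rewritten.
    """
    changed = 0
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune hidden directories so repository metadata is left alone.
        dirnames[:] = [d for d in dirnames if not d.startswith(".")]
        for name in filenames:
            path = Path(dirpath) / name
            try:
                text = path.read_text(encoding="utf-8")
            except (UnicodeDecodeError, OSError):
                continue  # skip binary or unreadable files
            if old in text:
                path.write_text(text.replace(old, new), encoding="utf-8")
                changed += 1
    return changed
```

Like the sed expression, this is a plain substring replacement, so any longer identifier that contains the old name is rewritten as well.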
Suma Hegde a8acfbf8ff Renamed amd_smi_device.h to amd_smi_processor.h
Change-Id: I9f3cd8b29e4b5e9e552faeb7e977d7a1002abd65
2023-05-11 11:01:14 -04:00
Suma Hegde d9ba131f73 Renamed APIs
amdsmi_dev_get_gpu_ecc_status -> amdsmi_get_gpu_ecc_status
amdsmi_dev_get_gpu_ecc_enabled -> amdsmi_get_gpu_ecc_enabled
amdsmi_dev_get_gpu_ecc_count -> amdsmi_get_gpu_ecc_count

Change-Id: I84e6489f82bae115e1a13c9e4fce8029888ca379
2023-05-11 11:00:59 -04:00
Deepak Mewar e687a72235 Renamed APIs
1) amdsmi_dev_set_overdrive_level to amdsmi_set_gpu_overdrive_level
2) amdsmi_dev_set_overdrive_level_v1 to amdsmi_set_gpu_overdrive_level_v1

grep -rli 'amdsmi_dev_set_overdrive_level' * | xargs -i@ sed -i
's/amdsmi_dev_set_overdrive_level/amdsmi_set_gpu_overdrive_level/g' @

Change-Id: Id6934e5b0962c9262cca041bdfdf02c60f69573b
2023-05-11 11:00:45 -04:00
Deepak Mewar ced22230c4 Renamed API amdsmi_dev_get_od_volt_curve_regions
to amdsmi_get_gpu_od_volt_curve_regions

grep -rli 'amdsmi_dev_get_od_volt_curve_regions' * | xargs -i@ sed -i
's/amdsmi_dev_get_od_volt_curve_regions/amdsmi_get_gpu_od_volt_curve_regions/g' @

Change-Id: I4b390c2d5173ca919c4ab5b1173a4fc40e2a0015
2023-05-11 11:00:33 -04:00
Deepak Mewar 467f3e3bb7 Renamed API amdsmi_dev_set_od_volt_info
to amdsmi_set_gpu_od_volt_info

grep -rli 'amdsmi_dev_set_od_volt_info' * | xargs -i@ sed -i
's/amdsmi_dev_set_od_volt_info/amdsmi_set_gpu_od_volt_info/g' @

Change-Id: I2364f9f555c010e1022e2c946a65b72fcf3d2233
2023-05-11 10:59:51 -04:00
Deepak Mewar a72e1ec91d Renamed API amdsmi_dev_set_od_clk_info
to amdsmi_set_gpu_od_clk_info

grep -rli 'amdsmi_dev_set_od_clk_info' * | xargs -i@ sed -i
's/amdsmi_dev_set_od_clk_info/amdsmi_set_gpu_od_clk_info/g' @

Change-Id: I0f1fd5a80322a544f7d25e09146c9e52b82091f6
2023-05-11 10:59:25 -04:00
Deepak Mewar 2bd94db02c Renamed API amdsmi_dev_get_od_volt_info
to amdsmi_get_gpu_od_volt_info

grep -rli 'amdsmi_dev_get_od_volt_info' * | xargs -i@ sed -i
's/amdsmi_dev_get_od_volt_info/amdsmi_get_gpu_od_volt_info/g' @

Change-Id: Icd8658509b28523b7c04f8d2c53efb82689e294b
2023-05-11 10:59:11 -04:00
Deepak Mewar 78ce4979e1 Renamed API amdsmi_dev_get_overdrive_level
to amdsmi_get_gpu_overdrive_level

grep -rli 'amdsmi_dev_get_overdrive_level' * | xargs -i@ sed -i
's/amdsmi_dev_get_overdrive_level/amdsmi_get_gpu_overdrive_level/g' @

Change-Id: Id33a4544a2f2fd9d77de601addcf4e45d09d65d1
2023-05-11 10:59:00 -04:00
Deepak Mewar d83dc2b005 Renamed API amdsmi_dev_xgmi_error_status
to amdsmi_gpu_xgmi_error_status

grep -rli 'amdsmi_dev_xgmi_error_status' * | xargs -i@ sed -i
's/amdsmi_dev_xgmi_error_status/amdsmi_gpu_xgmi_error_status/g' @

Change-Id: I0d2338f0e924da5d69d280fdd988c2a6f9fe4ace
2023-05-11 10:58:49 -04:00
Deepak Mewar 0cb9e157db Renamed API amdsmi_counter_get_available_counters
to amdsmi_get_gpu_available_counters

grep -rli 'amdsmi_counter_get_available_counters' * | xargs -i@ sed -i
's/amdsmi_counter_get_available_counters/amdsmi_get_gpu_available_counters/g' @

Change-Id: Ief60be6c95f2ea4d0f6f91b153263d95710e6942
2023-05-11 10:56:57 -04:00
Deepak Mewar 7a6c26244e Renamed API amdsmi_read_counter
to amdsmi_gpu_read_counter

grep -rli 'amdsmi_read_counter' * | xargs -i@ sed -i
's/amdsmi_read_counter/amdsmi_gpu_read_counter/g' @

Change-Id: Ie9fec914358dd901930db54ab94e05f2fe32fa5a
2023-05-11 10:55:52 -04:00
Deepak Mewar 6e1a72d2c1 Renamed API amdsmi_control_counter
to amdsmi_gpu_control_counter

grep -rli 'amdsmi_control_counter' * | xargs -i@ sed -i
's/amdsmi_control_counter/amdsmi_gpu_control_counter/g' @

Change-Id: Ibdcd32327ebd2646375fb5c3b913cb528ac8aa97
2023-05-11 10:55:36 -04:00
Deepak Mewar e6dd8d49ba Renamed API amdsmi_dev_destroy_counter
to amdsmi_gpu_destroy_counter

grep -rli 'amdsmi_dev_destroy_counter' * | xargs -i@ sed -i
's/amdsmi_dev_destroy_counter/amdsmi_gpu_destroy_counter/g' @

Change-Id: I328f65f5a2a86108ee5b217f95ed0f4f03745286
2023-05-11 10:55:22 -04:00
Deepak Mewar 0c435b81c2 Renamed API amdsmi_dev_create_counter
to amdsmi_gpu_create_counter

grep -rli 'amdsmi_dev_create_counter' * | xargs -i@ sed -i
's/amdsmi_dev_create_counter/amdsmi_gpu_create_counter/g' @

Change-Id: Ic296057314f98547dd6a01b1c7d51668cfe5bc9a
2023-05-11 10:55:06 -04:00
Deepak Mewar fb419ab655 Renamed API amdsmi_dev_counter_group_supported
to amdsmi_gpu_counter_group_supported

grep -rli 'amdsmi_dev_counter_group_supported' * | xargs -i@ sed -i
's/amdsmi_dev_counter_group_supported/amdsmi_gpu_counter_group_supported/g' @

Change-Id: I69a5534f779dc0013bbe75b3d9b2c6074b2f378b
2023-05-11 10:54:57 -04:00