Commit Graph

181 Commits

Author SHA1 Message Date
Chris Freehill 01401b0caa Add device mutual exclusion tests and related fixes
* Added a new test to verify mutual exclusion of access to device
  resources
* Added missing mutex acquisition to some RSMI calls, as
  well as try-catch blocks.

Change-Id: I87aac009878a0b2d1f975e1d5b794d887bb23ff9


[ROCm/rocm_smi_lib commit: f8b57c3b16]
2020-04-08 15:05:11 -05:00
Chris Freehill 49b562b209 Shared mutex fixes and improvements
* Don't make different shared memory mutexes for different users
* Don't delete (unlink) the shared mutex file if the mutex
  initialization fails. This may mess up other processes that
  are using it. Instead, print a message on how to resolve the
  situation, and then throw an error.

  Note, this situation comes up when debug builds (usually)
  either assert() or otherwise end execution without a proper
  clean up.
* Remove cpplint from shared_mutex code

Change-Id: I5f8ca6150cac5c2405fb97007516da345093f966


[ROCm/rocm_smi_lib commit: 52196caaee]
2020-04-06 17:08:33 -05:00
Mukul Joshi 7137023637 Add rsmi_topo_get_numa_affinity()
Given a device index, return the corresponding NUMA node for the
device.
Also, add NUMA node tests to Sys Info Read test.

Change-Id: I0df4937470e6362e6737ccea568d4b3e5890c91a


[ROCm/rocm_smi_lib commit: fd79e5c161]
2020-04-01 11:38:08 -04:00
Chris Freehill 024e27229c Documentation update
Change-Id: I646cf3d2fd6064295937f7e727076532894d3514


[ROCm/rocm_smi_lib commit: 7abe6dc1b2]
2020-03-27 14:08:19 -05:00
Chris Freehill 17871ecb14 More general solution to api support hwmon mapping
This solution takes into account that some hwmons use
label files to map sensor types. The previous solution
did not take this into account.

Change-Id: I1d6204573cefa8197b2cfe0ffb412b545df3d80a


[ROCm/rocm_smi_lib commit: 324c0ca0e5]
2020-03-16 11:37:47 -05:00
Chris Freehill 4e2d769dcc Fix indexing problem with api support function
Also fix potential issue with evaluating functionality of
functions with multiple sub-variants.

Change-Id: I9a09e52f3d3f3181e72578ed1f3bfd0d85516aa3


[ROCm/rocm_smi_lib commit: 1d8e16bff2]
2020-03-12 11:43:01 -05:00
Chris Freehill 06149e94bb Make rsmitst tests fail quickly if rsmi_init fails
Change-Id: I7b5d94b77305b30e08f33e1ddb6e2f089db0431f


[ROCm/rocm_smi_lib commit: d9ab846bee]
2020-03-11 12:13:28 -05:00
Chris Freehill a7ca81d161 Don't assert or re-throw exception caught at top level
Instead, return error and let caller deal with it.

Change-Id: I1a55337134b00aa4259af27281b2450fc2252be9


[ROCm/rocm_smi_lib commit: d54a9484be]
2020-03-11 12:11:29 -05:00
Chris Freehill 6ba4f32620 Correct rsmitst build instructions
Change-Id: Ia7dbdd7a489d235c6003badb79f2d0808e18143b


[ROCm/rocm_smi_lib commit: a482394263]
2020-03-02 16:29:10 -05:00
Chris Freehill e4d918aa70 Fix segmentation fault that sometimes occurs on release builds
Fixes SWDEV-216441

Change-Id: I3ea01a4edd14000a103de751757dfaadc7d358bb


[ROCm/rocm_smi_lib commit: 0bf81ed2f9]
2020-02-24 17:17:26 -06:00
Chris Freehill 95d3da04b9 Add rsmi_compute_process_gpus_get()
Given a process ID, give the device indices that process is
currently using.

Also:
* made corrections to how RSMI, amdgpu (i.e., "card#") and
  KFD indices translate from one another
* add a few missing error codes to rsmi_status_string()
* fix some formatting

Change-Id: Icd2cae66bb4fec768da96af7cf9cf8b8b66ec7f9


[ROCm/rocm_smi_lib commit: 2d6e15190c]
2020-02-22 10:47:58 -06:00
Chris Freehill 386bab024e Merge "Ensure string is non-empty before calling stoul or stoi" into amd-master
[ROCm/rocm_smi_lib commit: 842bd29568]
2020-01-30 20:16:56 -05:00
Srinivasan Subramanian 05db31fdc7 Changes for multiple ROCm installation
1. Support multiple ROCm installations
2. Support shared library versioning.

Change-Id: Id5c25b90abed084e8fe8cb7c374c2d4384653bbf


[ROCm/rocm_smi_lib commit: 29d55e001a]
2020-01-30 11:08:57 -08:00
Chris Freehill 61db4c7e15 Ensure string is non-empty before calling stoul or stoi
Change-Id: I2c6314fb86d3bba8fd6aab932dbb989263fa8542


[ROCm/rocm_smi_lib commit: f748868818]
2020-01-28 17:05:14 -06:00
Chris Freehill 078c298e7b Security improvements
Improvements include
* adding additional build flags that warn about stack-smashing
and type conversion errors
* run-time checks for valid function input values and adequate
space for the result of arithmetic operations.
* make sure the default case for switch statements does something
besides just assert
* disable using env. var. debugging in release mode

Change-Id: I5f048310c5c56e05d9ec31bcc273404d6a0dd646


[ROCm/rocm_smi_lib commit: d00b9ac07d]
2020-01-16 14:56:27 -06:00
Chris Freehill 322d1ff303 Use default value for version when git tags not present
Also, documentation typo correction.

Change-Id: I7fe4de05d3b8fb808a980862a09a9be32ed32bf5


[ROCm/rocm_smi_lib commit: fe4f7ed4a1]
2019-12-19 08:32:38 -06:00
Chris Freehill ddbe8013fe Merge "Make dpkg and rpm package names match their file names" into amd-master
[ROCm/rocm_smi_lib commit: 8ffe1bc7f6]
2019-11-09 14:27:17 -05:00
Chris Freehill d9f1009bde Make dpkg and rpm package names match their file names
For example,
$ dpkg -i rocm-smi-lib64-2.0.0.1.local-build-0-d10a391.deb 

will yield:
 ...
 Package: rocm-smi-lib64
 Version: 2.0.0.1.local-build-0-d10a391
 ...

Change-Id: I1e56e0c623b9421261cf0864958e821d10226d39


[ROCm/rocm_smi_lib commit: c926d50c3a]
2019-11-08 15:09:16 -04:00
Chris Freehill 5a0da92a99 Disable TestFrequenciesReadWrite for arcturus
Change-Id: Ia20ec853cdba34ff3dcdc68b4f869890bf58b539


[ROCm/rocm_smi_lib commit: 1004a01094]
2019-11-07 16:22:45 -05:00
Chris Freehill 7ae9816321 Merge "Docs., error checking and test improvements" into amd-master
[ROCm/rocm_smi_lib commit: 4ebb436893]
2019-11-06 20:15:26 -05:00
Chris Freehill 6a4ba6ff2a Use "-" instead of "_" for package name
This is part of fix to SWDEV-208805. The other part will
be in the build_* script.

Change-Id: I36397e3f918d08170db8bb228722a2b7389af83b


[ROCm/rocm_smi_lib commit: 0e5c44de2a]
2019-11-06 11:31:50 -05:00
Chris Freehill edc222faea Docs., error checking and test improvements
* Update doc. on api-support function
* Check for valid integer value when reading a monitor int. val.
* If fan-write test attempts to set speed higher than max.
   possible, then skip the test

Change-Id: I01ad0ab1f4caffdb0d2c26e9575f278c35a6b017


[ROCm/rocm_smi_lib commit: 52dfa4bcca]
2019-11-06 11:19:47 -05:00
Chris Freehill f1401048e2 Support rsmitst blacklisting by adding an exclude file
Change-Id: I9d581b8e24363a688b58a6ca59a6521c7be364d7


[ROCm/rocm_smi_lib commit: 3a26a7270c]
2019-10-17 13:47:02 -05:00
Chris Freehill f85d50583a Correct README Markdown formatting
Change-Id: Id63618fc7fa7fa7cdc68bcd451cbe89ef2c04469


[ROCm/rocm_smi_lib commit: ee13e85265]
2019-10-17 08:38:50 -05:00
Chris Freehill 9b707b1469 Support checking for specific device-getter api support
For device-getter functions, allow users to specify a nullptr
for the provided buffer. In those cases, the function will return
RSMI_STATUS_NOT_SUPPORTED if the hardware or system software does
not support the function. If the function is supported, then
RSMI_STATUS_INVALID_ARGS will be returned, unless a different
error is encountered.

Additionally, tests and documentation were updated to reflect
this change.

Change-Id: Ie7db3a4c8c66af97ebd7ee1e3b95cd331ace9d9c


[ROCm/rocm_smi_lib commit: 68d25e82fd]
2019-10-05 15:55:18 -05:00
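
The probing convention described in the commit above ("Support checking for specific device-getter api support") can be modeled in a few lines. This is a minimal sketch, not the library's code: `Status`, `make_getter`, and `probe` are hypothetical names, and Python's `None` stands in for a C `nullptr`.

```python
from enum import Enum


class Status(Enum):
    """Hypothetical stand-ins for the status codes named in the commit."""
    SUCCESS = 0
    INVALID_ARGS = 1
    NOT_SUPPORTED = 2


def make_getter(supported):
    """Build a toy device getter; `supported` simulates hardware/software support."""
    def getter(buffer):
        # `buffer` is a one-element list standing in for an output pointer,
        # or None standing in for nullptr.
        if not supported:
            return Status.NOT_SUPPORTED
        if buffer is None:
            # Supported, but there is nowhere to write the result.
            return Status.INVALID_ARGS
        buffer[0] = 42  # placeholder reading, not real device data
        return Status.SUCCESS
    return getter


def probe(getter):
    """Call a getter with a null buffer and interpret the returned status."""
    status = getter(None)
    if status is Status.NOT_SUPPORTED:
        return False
    if status is Status.INVALID_ARGS:
        return True
    # Any other status means a different error was encountered.
    raise RuntimeError(f"unexpected status while probing: {status}")
```

With this model, `probe(make_getter(True))` reports support without reading a value, which matches the intent stated in the commit message.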
Ori Messinger 1b39426034 Display GPU vram vendor
Add support and testing for reading the vram vendor associated with
the GPU. The vram vendor can be found as a separate sysfs file at:
/sys/class/drm/card[X]/device/mem_info_vram_vendor
The vram vendor is displayed as a string value.

Change-Id: I12c8e56e57f45aa08d7d6c25338c4e468ed1c7fc
Signed-off-by: Ori Messinger <Ori.Messinger@amd.com>


[ROCm/rocm_smi_lib commit: 2412dff6a2]
2019-10-04 11:51:30 -04:00
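
The sysfs file named in the commit above ("Display GPU vram vendor") can be read directly. The sketch below is illustrative only: `read_vram_vendor` and its `sysfs_root` parameter are hypothetical, and returning `None` when the file is unavailable is an assumption, not the library's behavior.

```python
from pathlib import Path


def read_vram_vendor(card_index, sysfs_root="/sys/class/drm"):
    """Return the VRAM vendor string for card<card_index>, or None.

    The path layout follows the commit message:
    <sysfs_root>/card[X]/device/mem_info_vram_vendor
    """
    path = (Path(sysfs_root) / f"card{card_index}" / "device"
            / "mem_info_vram_vendor")
    try:
        # The value is a plain string; strip the trailing newline.
        return path.read_text().strip()
    except OSError:
        # File missing or unreadable (for example, an older kernel).
        return None
```

Passing a different `sysfs_root` makes the function testable against a temporary directory instead of real hardware.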
Chris Freehill 8ea817f79e Add functions that tell what capabilities are supported
The new functions added in this commit allow a caller to tell up
front what functions, function variants and monitors are
supported.

Also,
* fixed a few documentation/formatting issues
* fixed a process_info test issue

Change-Id: I2184ab1a4a6898f847e791f273e2185d556e78e9


[ROCm/rocm_smi_lib commit: 551b15182b]
2019-09-23 13:30:47 -05:00
Chris Freehill 0bcdfe553d Make bdfid use 32 bit domain if possible
If the 32-bit domain is found in the kfd node properties for
a device, then it will be used when constructing the bdfid.
If it's not present, it will continue to use the 16 bit version.

Also, whether 32b or 16b is used for the domain, the
domain will now be placed in the upper 32b of the 64b bdfid.

* Fixed some unrelated doxygen issues

Change-Id: Icb5116daa1ab45ee305bdbe6cd5df5736dd3ffa3


[ROCm/rocm_smi_lib commit: 469af303d6]
2019-08-27 11:05:58 -04:00
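
The layout described in the commit above ("Make bdfid use 32 bit domain if possible") can be sketched as bit packing. Only the placement of the domain in the upper 32 bits comes from the commit message; the bus/device/function positions in the lower bits are an assumption based on the conventional PCI encoding, and `make_bdfid` and `bdfid_domain` are hypothetical helper names.

```python
def make_bdfid(domain, bus, device, function):
    """Pack PCI location fields into a 64-bit bdfid.

    The domain occupies the upper 32 bits (per the commit message).
    The lower-bit layout is assumed: bus in bits 8-15, device in
    bits 3-7, function in bits 0-2.
    """
    return (((domain & 0xFFFFFFFF) << 32)
            | ((bus & 0xFF) << 8)
            | ((device & 0x1F) << 3)
            | (function & 0x7))


def bdfid_domain(bdfid):
    """Extract the domain from the upper 32 bits of a bdfid."""
    return (bdfid >> 32) & 0xFFFFFFFF
```

A 16-bit domain fits the same field unchanged, so callers can extract the domain the same way regardless of which width the kfd node properties provided.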
Chris Freehill e35a20518c Fix issues with buffer length when getting brand name
* Specifically, address case when brand name is longer than buffer
provided

* Also, slightly modify prototype to match similar, existing APIs.

* Address some cpplint issues.

Change-Id: Iaf77304e23085123e88f301e4b33bc4e6be2a225


[ROCm/rocm_smi_lib commit: 01e0800741]
2019-08-26 07:21:02 -04:00
Ori Messinger d23acc96fd Display GPU brand name
Add support and testing for reading the brand name associated with
a specific GPU (such as mi25, mi50, mi60, etc). The brand name is
associated with the SKU of the GPU, and some brand names can be
mapped from multiple different SKUs.

Change-Id: I36eb95ca8e72efdd294ccd684841195925dfe820
Signed-off-by: Ori Messinger <Ori.Messinger@amd.com>


[ROCm/rocm_smi_lib commit: 7f2d970a80]
2019-08-22 12:24:29 -04:00
Chris Freehill 8f26b1b03e Fix building lib and test in non-automated (CI) env.
Also, use abbreviated ROCM_BUILD_ID environment variable for job
and build number, if it's available.

Change-Id: Ib5a721f5920f1008bb6382935f7b439429389de0


[ROCm/rocm_smi_lib commit: aa2db48237]
2019-08-14 23:18:15 -05:00
Chris Freehill c50dcc1461 Add build and job numbers to package version
Change-Id: I06baf23e09b3a63a24d0046046f7f22281e0ec93


[ROCm/rocm_smi_lib commit: dffa533e13]
2019-08-14 09:48:59 -05:00
Chris Freehill 76ec21c516 Conform versioning to uniform version standards
Library version will now only have major and minor. Package
version will now include number of commits since previous
package. Both SO and package versions rely on git tags to
determine the current build and the commits since the last
release.

Change-Id: If2bda74bf342930a9e07f5c91cb1380b6b7c64ca


[ROCm/rocm_smi_lib commit: fe738eaedb]
2019-08-12 08:59:09 -05:00
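
The commit above ("Conform versioning to uniform version standards") derives versions from git tags. A rough sketch of that idea follows, assuming `git describe --long` style input of the form `<tag>-<commits>-g<sha>`. The tag format (`v<major>.<minor>`) and the way the package version is composed are assumptions for illustration, not the project's actual scheme, and both function names are hypothetical.

```python
import re

# Matches `git describe --long` output: <tag>-<commits since tag>-g<abbrev sha>
_DESCRIBE_RE = re.compile(r"^(?P<tag>.+)-(?P<commits>\d+)-g(?P<sha>[0-9a-f]+)$")


def parse_describe(text):
    """Split a describe string into (tag, commits_since_tag, abbreviated_sha)."""
    match = _DESCRIBE_RE.match(text.strip())
    if match is None:
        raise ValueError(f"not a git-describe string: {text!r}")
    return match.group("tag"), int(match.group("commits")), match.group("sha")


def versions_from_describe(text):
    """Return (library_version, package_version) under the assumed scheme.

    Assumed: the tag looks like 'v<major>.<minor>'; the library version is
    '<major>.<minor>' and the package version appends the commit count.
    """
    tag, commits, _sha = parse_describe(text)
    major, minor = tag.lstrip("v").split(".")[:2]
    lib_version = f"{major}.{minor}"
    return lib_version, f"{lib_version}.{commits}"
```

Because both values come from the same describe string, the library and package versions stay consistent with the most recent tag.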
Chris Freehill b42ed59cca Adjust how we read ECC block counter status
This change corresponds to kernel changes.

Change-Id: Ibd977e8b3338349036cb16e55fb0b2c9c187726d


[ROCm/rocm_smi_lib commit: aaecfd6fff]
2019-08-09 16:06:43 -05:00
Kent Russell 9bc07064f2 Fix RAS change
RAS formatting changed, so get it to handle both types of sysfs output
until it's normalized
Change-Id: I56f2a2495af8ff4d01011bc614283376afb9ad0a


[ROCm/rocm_smi_lib commit: a34832f11e]
2019-08-08 12:09:18 -04:00
Chris Freehill 6f491f4948 Update docs for rsmi_dev_memory_reserved_pages_get()
Change-Id: I3cc479ea709bb8d9c23ff35d7339e329477ffe18


[ROCm/rocm_smi_lib commit: 0da1599c4f]
2019-08-06 16:57:09 -05:00
Chris Freehill d0e645fbf2 Add support for rsmi_dev_memory_reserved_pages_get()
Also, don't return an error for empty sysfs files. The reserved memory
page file will often have no lines. We don't want it to appear that
this function is not supported if the file is empty.

Change-Id: I1d28bb184ea587bb578fe71dd75adc2a812d09a8


[ROCm/rocm_smi_lib commit: 73c54e1fd0]
2019-08-06 11:42:03 -05:00
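
The distinction drawn in the commit above ("Add support for rsmi_dev_memory_reserved_pages_get()") is between a file that is empty and a file that is absent. A minimal sketch of that distinction, with a hypothetical helper name and return convention:

```python
def read_sysfs_lines(path):
    """Read non-blank lines from a sysfs-style file.

    Returns [] for an existing but empty file (supported, nothing to
    report) and None when the file does not exist (not supported).
    This return convention is an assumption for illustration.
    """
    try:
        with open(path) as handle:
            return [line.strip() for line in handle if line.strip()]
    except FileNotFoundError:
        return None
```

A caller can then treat `[]` as "zero reserved pages" and reserve the not-supported path for `None`.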
Chris Freehill 32645f25a6 Add rsmi_dev_serial_number_get()
Also correct whitespace issues

Change-Id: I7ffe23672304c31ed08d7148b04a19a7d4c3d7ef


[ROCm/rocm_smi_lib commit: cf13d6f4d8]
2019-07-22 07:09:53 -05:00
Chris Freehill d0ef228c6d Merge "Make git-describe find annotated and non-annotated tags" into amd-master
[ROCm/rocm_smi_lib commit: dea44dee54]
2019-07-12 21:11:59 -04:00
Harish Kasiviswanathan 0d25538efb Test rsmi_dev_drm_render_minor_get()
Change-Id: I5c0702efc8ed1bc155292e4c3a73d74e5c66204e
Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>


[ROCm/rocm_smi_lib commit: 904ea5fc27]
2019-07-11 13:13:03 -04:00
Harish Kasiviswanathan ce2f1a4e58 Add rsmi_dev_drm_render_minor_get()
Function to get the DRM minor number associated with a ROCm device

Change-Id: I9356b9ca75151882acbb075076bc072f08b73aae
Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>


[ROCm/rocm_smi_lib commit: 68cb303a44]
2019-07-11 13:12:34 -04:00
Chris Freehill b94ab55119 Make git-describe find annotated and non-annotated tags
Change-Id: I56f67e9a0a69fce3c825577160ab7380c297d243


[ROCm/rocm_smi_lib commit: 29353aa314]
2019-07-10 22:41:44 -05:00
Chris Freehill 93f23ef9b6 Add rsmi_dev_firmware_version_get()
Change-Id: Iba3e5f3eaa0eb031fc013fc168bded22bc249b5c


[ROCm/rocm_smi_lib commit: 31e02fdc61]
2019-07-09 22:50:44 -05:00
Chris Freehill e2c96d703a Add xgmi error_status and error_reset functions
Also, comment corrections and added check for invalid arguments

Change-Id: I891cbf9b37bfda629914a008811b840323872c02


[ROCm/rocm_smi_lib commit: 557e1f5704]
2019-07-09 09:55:05 -04:00
Chris Freehill fb87f41beb Add initial support for getting process information
Added implementation of and tests for
rsmi_dev_compute_process_info_by_pid_get() and
rsmi_dev_compute_process_info_get()

Change-Id: I4c4f5f39fe6701da37916c9ad41449b5d35ac7af


[ROCm/rocm_smi_lib commit: 9b93cbe21d]
2019-07-03 20:01:43 -05:00
Chris Freehill 4fa6f2f5bb Add rsmi_dev_memory_busy_percent_get()
Change-Id: Ide683b6c72870af547331f4502c5bb8c445d61b5


[ROCm/rocm_smi_lib commit: 1c5e090507]
2019-06-25 19:09:13 -05:00
Chris Freehill bdbd81c02a Event counter support
XGMI related events are supported

Change-Id: If17036fe890c8be45da3654353599821b5828c14


[ROCm/rocm_smi_lib commit: ea26baec20]
2019-06-24 17:40:01 -05:00
Kent Russell a22666e01f Merge "Add support for reading GPU's unique ID" into amd-master
[ROCm/rocm_smi_lib commit: 04479f0b44]
2019-06-24 13:01:50 -04:00
Chris Freehill 6205a232ce Revert "Event counter support"
This reverts commit e1b784a3b0.


[ROCm/rocm_smi_lib commit: 908f07cb3b]
2019-06-21 22:07:40 -05:00
Chris Freehill e1b784a3b0 Event counter support
XGMI related events are supported

Change-Id: Ic99d5a1847e8d28b22ad0b61cb9ea206eb878708


[ROCm/rocm_smi_lib commit: 075833e9a5]
2019-06-21 18:27:50 -05:00