Updated: rocm_smi.py
- Remove all else: clauses from functions where rsmi_ret_ok is part of the if clause, as requested.
- The rsmi_ret_ok() function already handles unsuccessful return codes gracefully.
- Updated check_runtime_status() function to sweep through /sys/class/drm to find active runtime_status.
- Updated the message to 'AMD GPU device(s) is/are in a low-power state. Check power control/runtime_status'
- This clarifies the status of the GPU and tells users where to check for more info.
Signed-off-by: Juan Castillo <juan.castillo@amd.com>
Co-authored-by: Juan Castillo <juan.castillo@amd.com>
Co-authored-by: Maisam Arif <Maisam.Arif@amd.com>
Co-authored-by: gabrpham <Gabriel.Pham@amd.com>
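For illustration only, the sweep described above might look like the following minimal sketch. It is not the code in rocm_smi.py: the function names `find_runtime_statuses` and `low_power_cards` are hypothetical, and the per-card path `card<N>/device/power/runtime_status` and the `suspended` value are assumptions about the usual Linux sysfs layout.

```python
import re
from pathlib import Path

# Assumption: each GPU appears as /sys/class/drm/card<N>, with its PCI
# runtime-PM state exposed at card<N>/device/power/runtime_status.
CARD_NAME = re.compile(r"card\d+")


def find_runtime_statuses(drm_root="/sys/class/drm"):
    """Return {card_name: runtime_status} for every card<N> under drm_root."""
    statuses = {}
    root = Path(drm_root)
    if not root.is_dir():
        return statuses
    for entry in sorted(root.iterdir()):
        # Skip connector entries such as card0-DP-1 and render nodes.
        if not CARD_NAME.fullmatch(entry.name):
            continue
        status_file = entry / "device" / "power" / "runtime_status"
        try:
            statuses[entry.name] = status_file.read_text().strip()
        except OSError:
            # No readable runtime_status for this card; leave it out.
            continue
    return statuses


def low_power_cards(statuses):
    """Names of cards that report a suspended (low-power) state."""
    return [name for name, status in statuses.items() if status == "suspended"]
```

A caller could print the low-power message quoted above whenever `low_power_cards(find_runtime_statuses())` is non-empty.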
* Added check in Queue::sync to verify that there is a callback for every dispatch
* Removed the new atomic; now using the get_balanced_signal_slots() atomic, with an initial value of NUM_SIGNALS, to verify that dispatches complete
* rocr: Fix Incorrect Assertion Check
The assertion statement checks the wrong variable. It should check the
value of paramEndLoc after it is modified by the call to find().
Signed-off-by: Sunday Clement <Sunday.Clement@amd.com>
* rocr: Fix Potential Undefined Behaviour
If the SvmProfileControl destructor runs while event == -1, the call to
close(event) is effectively close(-1), which is undefined behaviour.
close() is now called only on valid file descriptors.
Signed-off-by: Sunday Clement <Sunday.Clement@amd.com>
* rocr: Add Error Check on Bytes Read
If a read is incomplete, the call to copyTo() now returns an error.
Signed-off-by: Sunday Clement <Sunday.Clement@amd.com>
* rocr: Fix Exception Error
Destructors are implicitly noexcept(true) by default, so unless a
destructor is explicitly marked noexcept(false), any exception that
escapes it terminates the program.
Signed-off-by: Sunday Clement <Sunday.Clement@amd.com>
---------
Signed-off-by: Sunday Clement <Sunday.Clement@amd.com>
Co-authored-by: Sunday Clement <Sunday.Clement@amd.com>
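Two of the rocr fixes above (closing only valid file descriptors, and treating a short read as an error) follow a common pattern. The sketch below is a Python analogue, not the ROCr C++ code: `EventHandle`, `INVALID_FD`, and `read_exact` are hypothetical names introduced for illustration, and retrying until end-of-file is a design choice of this sketch rather than something the commits state.

```python
import os

INVALID_FD = -1  # sentinel meaning "no descriptor is open"


class EventHandle:
    """Hypothetical owner of a file descriptor (not the ROCr class)."""

    def __init__(self, fd=INVALID_FD):
        self.fd = fd

    def close(self):
        # Only valid descriptors reach os.close(); close(-1) is never issued.
        if self.fd >= 0:
            os.close(self.fd)
            self.fd = INVALID_FD


def read_exact(fd, size):
    """Read exactly `size` bytes from fd, raising OSError on a short read."""
    chunks = []
    remaining = size
    while remaining > 0:
        chunk = os.read(fd, remaining)
        if not chunk:  # end of file before `size` bytes arrived
            got = size - remaining
            raise OSError(f"incomplete read: wanted {size} bytes, got {got}")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)
```

Resetting `fd` to the sentinel after closing makes a second `close()` call a no-op, so the descriptor is never closed twice.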
* Use queries instead of views in summary.py
* Export queries when created
* Remove HIP and HSA from output
* Fix domain query
* Export summary queries in the main function
* Fix comments and variable names
* Change syntax for older Python versions
---------
Co-authored-by: Young Hui <young.hui@amd.com>
Legal Requirements:
For AMD software being released as open source, add copyright at the top of each new file.
Signed-off-by: Alysa Liu <Alysa.Liu@amd.com>
* Fix grid_group::group_dim to return grid_dim and not block_dim
* Add unit test for grid_group.group_dim()
* Fix unit test errors
* Skip group_dim() assertions for base_type test
* Added Fortran (amdflang) OpenMP tests using the openmp-vv project
---------
Signed-off-by: David Galiffi <David.Galiffi@amd.com>
Co-authored-by: David Galiffi <David.Galiffi@amd.com>
* add more derived metrics for navi4.
* address review comments
* address review comments, and add more derived counters.
* EOF.
* misc.
* remove duplicate counter.
* misc.
* Remove gfx12 architecture definition for ldslatency
* remove extra architectures for gfx12.
* use wgp for normalization
* move these changes to another PR.
---------
Co-authored-by: Venkateshwar Reddy Kandula <venkateshwar.kandula1306@gmail.com>
If a compressed changelog exists from a previous build, reconfiguring
the project fails with:
```
[rocm-core configure] CMake Error at utils.cmake:213 (message):
[rocm-core configure] Failed to compress: gzip:
[rocm-core configure] /home/ben/src/TheRock/build/base/rocm-core/build/DEBIAN/changelog.Debian.gz
[rocm-core configure] already exists; not overwritten
```
Add `-f` to the `gzip` invocation to force overwriting.
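
The behaviour can be reproduced outside CMake. The sketch below is a hypothetical Python helper (`gzip_file` is not part of utils.cmake) that assumes a `gzip` binary is on PATH. It shows that compressing a file whose `.gz` output already exists fails without `-f` and succeeds with it.

```python
import subprocess


def gzip_file(path, force=False):
    """Compress `path` with the gzip CLI; pass -f when force is True."""
    cmd = ["gzip"]
    if force:
        cmd.append("-f")  # overwrite an existing <path>.gz instead of failing
    cmd.append(path)
    # stdin is detached so gzip cannot prompt before overwriting.
    return subprocess.run(
        cmd, stdin=subprocess.DEVNULL, capture_output=True, text=True
    )
```

A non-zero `returncode` from `gzip_file(path)` when `path + ".gz"` already exists corresponds to the configure failure quoted above.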
* Remove config checks for stream and kernel rename data collection
* Updated CSV generation to check if kernel rename is on before calling get_kernel_name
* Update metadata to use kernel_rename bool argument
* Formatting + unconditionally store kernel name in rocpd
* Re-added kernel rename parameter after rebase
* Fixed rebase conflicts
* Updated comment in line with GitHub review comments
* Added check in rocpd csv.cpp to output kernel name if region name is empty
* Add test for kernel rename
---------
Co-authored-by: Ian Trowbridge <Ian.Trowbridge@amd.com>