The xgmi command was showing PCIe bit rate and bandwidth instead of XGMI values. Corrected the API to read XGMI data from the GPU metrics.
Added a Python API for amdsmi_get_link_metrics. Modified the amdsmi_link_metrics struct.
Added a check for non-zero partitions in the xgmi command.
---------
Signed-off-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
- When in json output mode the .rstrip function does not work due to dict obj type.
- The clk_value is now checked for dict instance before extracting the value.
- If clk_value is a dict then the .get() function is used to extract the value.
- Else it is a string obj which uses .split() to extract the value.
- If clk_value is < min_clk_value then deep_sleep is set to ENABLED
- Initialize clk_value and min_clk_value to 0 on each loop iteration.
- fix if/else for better readability
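The clk_value handling described above could look roughly like the following. This is a minimal sketch, not the actual CLI code: the function names `extract_clk_value` and `deep_sleep_state`, the `"value"` dict key, and the `"500 MHz"` string layout are assumptions made for illustration.

```python
def extract_clk_value(clk_value):
    """Return the numeric clock from a dict (JSON mode) or a string (text mode)."""
    if isinstance(clk_value, dict):
        # JSON mode: assume the number lives under a "value" key (hypothetical layout).
        return int(clk_value.get("value", 0))
    # Text mode: assume a string such as "500 MHz"; take the leading number.
    parts = str(clk_value).split()
    return int(parts[0]) if parts and parts[0].isdigit() else 0


def deep_sleep_state(clk_value, min_clk_value):
    """Report ENABLED when the current clock is below the minimum clock."""
    current = extract_clk_value(clk_value)
    minimum = extract_clk_value(min_clk_value)
    return "ENABLED" if current < minimum else "DISABLED"
```

Values that cannot be parsed (for example "N/A") fall back to 0 here, which mirrors the per-loop initialization to 0 mentioned above.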
---------
Signed-off-by: Juan Castillo <juan.castillo@amd.com>
* Add the API and CLI to show the board voltage.
---------
Change-Id: Icb25bd653bb1d004704b5a21b378ca31b2b242c7
Signed-off-by: Arif, Maisam <Maisam.Arif@amd.com>
Signed-off-by: AL Musaffar, Yazen <Yazen.ALMusaffar@amd.com>
* Added degree symbol and fixed power usage
* fixed default command
---------
Signed-off-by: gabrpham_amdeng <Gabriel.Pham@amd.com>
* Update to remove vram enum and instead use the string directly from the driver.
Signed-off-by: Narlo, Joseph <Joseph.Narlo@amd.com>
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Changes:
- Updated references in the codebase to rename `COMPUTE_PARTITION` to `ACCELERATOR_PARTITION`
- Moved around and rephrased several duplicated lines in the CHANGELOG.md file
Change-Id: Id6bc86a7133e952cca6ef0acb1616ad6251d19d4
Signed-off-by: Charis Poag <Charis.Poag@amd.com>
- Updated VCLK and DCLK min/max clock logic to populate N/A values.
- Updated VCLK and DCLK to show all available clocks.
- Updated deep_sleep logic using sys/fs clk_deep_sleep true/false.
- Added clarifying comments.
- Updated error output using e.get_error_info() instead of just error.
- Updated changelog
---------
Signed-off-by: Juan Castillo <juan.castillo@amd.com>
When all clocks are N/A, they are all filtered out. To
avoid confusion, a single N/A is shown instead.
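A minimal sketch of that behavior, assuming the clocks arrive as a flat list of strings. The helper name `collapse_na` and the list representation are hypothetical, not taken from the CLI source.

```python
def collapse_na(clocks):
    """Drop "N/A" entries, but keep a single "N/A" when nothing else remains."""
    if clocks and all(c == "N/A" for c in clocks):
        # Every entry is unavailable: show one placeholder instead of an empty field.
        return "N/A"
    # Otherwise keep only the clocks that carry a real value.
    return [c for c in clocks if c != "N/A"]
```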
Signed-off-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>
Changes:
- Removed libdrm/libdrm_amdgpu dependencies
- Added/updated new internal libdrm/libdrm_amdgpu/xf86drm APIs
to allow our APIs to reference before dynamic loading
the libdrm/libdrm_amdgpu libraries:
1. Updated amdgpu_drm.h to what's seen in mainline
2. Added xf86drm.h to match what's seen in mainline
- Modified internal DRM capabilities:
1. Require each API to independently connect to libdrm/libdrm_amdgpu
+ validate API handles responses accordingly
2. Initialization of AMD SMI no longer has as strong of a tie to
libdrm
- Updated internal implementations of several APIs which have
connections to libdrm/libdrm_amdgpu or APIs which have conflicts
with open libdrm/libdrm_amdgpu connections:
1. amdsmi_init()
2. amdsmi_get_gpu_vram_usage()
3. amdsmi_get_gpu_asic_info()
4. amdsmi_get_gpu_vram_info()
5. amdsmi_get_gpu_vbios_info()
6. amdsmi_get_gpu_driver_info()
7. amdsmi_get_gpu_virtualization_mode()
8. amdsmi_set_gpu_memory_partition()
9. amdsmi_set_gpu_memory_partition_mode()
- Cleaned up affected tests/APIs
Change-Id: I96e2cf1b06b0cfee1b01a5e991ccc6116c4245a8
* Do not raise an exception when CPER status is not found; keep iterating to the next GPU
* Use partition id and skip if non-zero
* Revert unneeded change
---------
Co-authored-by: Oosman Saeed <oossaeed@amd.com>
The N/A leaf filtering was removing clocks in static.
To avoid this, removed N/A filtering from the single tier.
Signed-off-by: Kanangot Balakrishnan, Bindhiya <Bindhiya.KanangotBalakrishnan@amd.com>
The 'amd-smi metric --clock' was listing values with N/A. Filtered these outputs to show only available values.
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Earlier, the amd-smi metric and static JSON output
was not valid JSON. Changes were made so that
the output is now valid JSON.
---------
Change-Id: I5576333269509f63b3c800f225c3d73127ce80cf
Signed-off-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Co-authored-by: Maisam Arif <Maisam.Arif@amd.com>
* Small Fixes
* CLI Help text and parser formatting updates
* Changed metavar for set partition
---------
Change-Id: Ia8809665f6fac670452cd4db4e5e8f9c7270faba
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Signed-off-by: Pham, Gabriel <Gabriel.Pham@amd.com>
Co-authored-by: Pham, Gabriel <Gabriel.Pham@amd.com>
* Reduced Load times for CLI in partition mode
* Change rsmi_dev_id_get() to use KFD, if KGD interface does not exist
* Make gpu_device_uuid fallback to rsmi_wrapper
* Moved Enumeration info calls in list for more speed
* Moved the group check so it is excluded from recursion
---------
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Signed-off-by: Charis Poag <Charis.Poag@amd.com>
Co-authored-by: Charis Poag <Charis.Poag@amd.com>
Co-authored-by: gabrpham_amdeng <Gabriel.Pham@amd.com>
(#280)
- Fixed potential issue with min/max values when only one frequency is available
- Improve error handling in GPU frequency range detection
- Refactor clock frequency range detection for better readability
- Added special handling for current frequency indicator (*) in DPM output
- Added comments explaining special case handling for current frequency
- Cleaned up incorrect definitions in hsmp metric table definition
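The frequency range handling above can be sketched as follows. This assumes DPM output in the usual sysfs style (`0: 500Mhz`, with `*` marking the current level). The function name `parse_dpm_levels` and the returned dict layout are hypothetical, and the real implementation in the library may differ.

```python
import re


def parse_dpm_levels(text):
    """Parse DPM level lines and return min, max, and current frequency in MHz."""
    freqs = []
    current = None
    for line in text.splitlines():
        # Expected shape: "<index>: <freq>Mhz" with an optional trailing "*".
        match = re.match(r"\s*\d+:\s*(\d+)\s*[Mm][Hh][Zz]\s*(\*)?", line)
        if not match:
            continue  # Skip lines that do not look like a DPM level.
        mhz = int(match.group(1))
        freqs.append(mhz)
        if match.group(2):
            current = mhz  # The "*" marks the level currently in use.
    if not freqs:
        return None  # Nothing parseable: let the caller report N/A.
    # With a single level, min and max are the same value.
    return {"min": min(freqs), "max": max(freqs), "current": current}
```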
---------
Signed-off-by: Juan Castillo <juan.castillo@amd.com>
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Co-authored-by: Maisam Arif <Maisam.Arif@amd.com>
* Changes:
- Updates to DRM renderD* / card* pathing for partition
- Now use KFD to discover AMD devices and populate accordingly
Device MUST have an accessible KFD node (via cgroups)
- Updated several AMD SMI CLI outputs to handle SYSFS files
which are not accessible on partition nodes
- Tests are updated to handle not supported features
- Added new method to help get card/drm info
(rsmi_dev_device_identifiers_get) from ROCm SMI
- Renamed device->get_card_id() & device->get_drm_render_minor()
These can now be used on internal AMD SMI calls.
- Removed warnings shown in build
Change-Id: Ice882fd9b97fb625a5bd4ef327f3ceaf247dc570
Signed-off-by: Charis Poag <Charis.Poag@amd.com>
Added power cap to display on amd-smi monitor -p.
Updated help and Changelog as well.
Signed-off-by: Kanangot Balakrishnan, Bindhiya <Bindhiya.KanangotBalakrishnan@amd.com>
Earlier amd-smi monitor was showing VRAM usage as used and total.
Modified it to display free VRAM and VRAM percentage. Updated
Changelog.
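As an illustration of the arithmetic involved, here is a minimal sketch that derives the two displayed values from used and total VRAM. The function name `vram_summary`, the key names, and the choice to report the percentage of VRAM in use are assumptions, not the CLI's actual code.

```python
def vram_summary(used_mb, total_mb):
    """Derive free VRAM and the percentage in use from used/total (in MB)."""
    free_mb = max(total_mb - used_mb, 0)
    # Guard against a zero total so the percentage never divides by zero.
    percent = round(100 * used_mb / total_mb, 2) if total_mb else 0.0
    return {"vram_free": free_mb, "vram_percent": percent}
```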
Signed-off-by: Kanangot Balakrishnan, Bindhiya <Bindhiya.KanangotBalakrishnan@amd.com>
When argparse parses multiple invalid arguments, the error
message displays only the last argument, which leads to
confusion. To avoid this, added a valid-command check
that runs before argparse and raises a new exception when
the first command is invalid.
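A pre-argparse check of this kind might look like the sketch below. The command set is an illustrative subset drawn from commands mentioned in this log, and `InvalidCommandError` and `check_first_command` are hypothetical names, not the CLI's real exception or function.

```python
VALID_COMMANDS = {"static", "metric", "monitor", "xgmi"}  # illustrative subset


class InvalidCommandError(Exception):
    """Raised when the first positional argument is not a known command."""


def check_first_command(argv):
    """Validate the first non-option argument before handing argv to argparse."""
    for arg in argv:
        if arg.startswith("-"):
            continue  # Options such as --help are left for argparse to handle.
        if arg not in VALID_COMMANDS:
            # Name the offending command explicitly, rather than the last bad argument.
            raise InvalidCommandError(f"Invalid command: '{arg}'")
        return arg
    return None  # No command given; argparse decides what to do.
```

Calling `check_first_command(sys.argv[1:])` before `parser.parse_args()` would then report the first bad command. This sketch does not handle options that take a separate value before the command.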
Signed-off-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>