23 Commits

Author SHA1 Message Date
Jatin Chaudhary 8e1aee62d0 make hip-tests compileable with TheRock (#1624)
## Motivation

Resolved: SWDEV-566226

The current implementation of agents in rocprof-systems keeps only the minimal set of information required to populate the `info_agent` table in the rocpd database. A significant amount of agent data is therefore left out of the database. This change stores the additional agent information as an `extdata` entry in the `info_agent` table.

## Technical Details

This PR adds a field to the `agent` structure that holds a JSON-formatted string with all the additional information we can acquire about a particular agent. This data is collected during the initial fetching of agents and is then written to the database.
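
The idea of carrying the extra agent properties as one JSON string can be sketched as follows. This is a minimal illustration, not the PR's code: the `agent` struct, `json_escape`, and `make_extdata` are hypothetical names, and the escaping covers only quotes and backslashes.

```cpp
#include <map>
#include <string>

// Hypothetical, simplified agent record. The real rocprofiler-systems
// structure has more fields and may name them differently.
struct agent {
    int         id = 0;
    std::string name;
    std::string extdata;  // JSON-formatted string with the additional properties
};

// Escape the two characters that would break a JSON string literal here.
// A complete implementation would also escape control characters.
inline std::string json_escape(const std::string& in) {
    std::string out;
    for (char c : in) {
        if (c == '"' || c == '\\') out += '\\';
        out += c;
    }
    return out;
}

// Serialize extra key/value properties as a flat JSON object of strings.
// std::map keeps the keys sorted, so the output order is deterministic.
inline std::string make_extdata(const std::map<std::string, std::string>& props) {
    std::string out = "{";
    bool first = true;
    for (const auto& kv : props) {
        if (!first) out += ',';
        first = false;
        out += '"' + json_escape(kv.first) + "\":\"" + json_escape(kv.second) + '"';
    }
    out += '}';
    return out;
}
```

In this sketch the string would be built once, when the agent is first discovered, and stored in `agent::extdata` until the record is written out.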

---------

Co-authored-by: David Galiffi <David.Galiffi@amd.com>

* SWDEV-557412 - Incorporate proper chunk offset when remapping virtual memory (#1848)

* SWDEV-557412 - Incorporate proper offset when remapping virtual memory

* Fix condition to check if VMHeap allocation address matches a chunk address

* Move offset calculation outside if/else block

---------

Co-authored-by: JeniferC99 <150404595+JeniferC99@users.noreply.github.com>

* SWDEV-567852 - Clean-up hip::init() (#1948)

* SWDEV-559267 - Use CLPrint to DevLogPrintf with Log Level - detail debug. (#1160)

* SWDEV-548892 - Stop using ocml isinf wrapper (#1854)

* SWDEV-562708 - change default maximum SVM size to 256GB (#1731)

* SWDEV-503089 - Fix and enable disabled HIP tests from math group (#1319)

* SWDEV-503089 - Fix and enable disabled HIP tests from math group

* SWDEV-503089 - Move single precision reduced run to a common function

* SWDEV-548892 - Stop using ockl steadyctr function (#1882)

Directly use the builtin

* Implement PTL support (#1957)

* Implement PTL support

Signed-off-by: adapryor <Adam.pryor@amd.com>
(cherry picked from commit 45bc31292e7940a3b8fca044ef7df22047b95733)

Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>

---------

Signed-off-by: adapryor <Adam.pryor@amd.com>
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Co-authored-by: Maisam Arif <Maisam.Arif@amd.com>

* SWDEV-558080 - Add recommended granularity (#1176)

* Add recommended granularity

* Improve granularity testing

* Update based on feedback

* Fix and enable VMM tests on cuda (#1855)

* Fix and enable VMM tests on cuda

* Minor syntax fixes

---------

Co-authored-by: Rahul Manocha <rmanocha@amd.com>

* [rocprofiler-systems] Add support for ompt_callback_thread_begin (#1681)

* Add thread_begin callback

* Make OMPT callbacks that are instant have start_ts = end_ts
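
One way to read the second item above: an instantaneous callback has no duration, so its record stores the same timestamp as both start and end. A minimal sketch, assuming a hypothetical `event_record` type and hypothetical `now_ns` and `make_instant_event` helpers (none of them are taken from rocprofiler-systems):

```cpp
#include <chrono>
#include <cstdint>

// Hypothetical trace record with a start and an end timestamp in nanoseconds.
struct event_record {
    const char*   name;
    std::uint64_t start_ts;
    std::uint64_t end_ts;
};

// Monotonic timestamp in nanoseconds.
inline std::uint64_t now_ns() {
    using namespace std::chrono;
    return static_cast<std::uint64_t>(
        duration_cast<nanoseconds>(steady_clock::now().time_since_epoch()).count());
}

// An instant event takes a single timestamp and uses it for both ends,
// so the recorded duration is exactly zero.
inline event_record make_instant_event(const char* name, std::uint64_t ts) {
    return event_record{name, ts, ts};
}
```

A callback such as thread-begin would then call `make_instant_event("thread_begin", now_ns())` once, instead of sampling the clock twice.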

* SWDEV-567514: Remove default stream wait (#1977)

- when virtual map command is called

- can create deadlock

Signed-off-by: sdashmiz <shadi.dashmiz@amd.com>

* Fix flaky test Unit_hipStreamAddCallback_StrmSyncTiming (#2022)

* Review comments

* skip the 3 failing tests to merge hip-tests rocm-systems PR

---------

Signed-off-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>
Signed-off-by: adapryor <Adam.pryor@amd.com>
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Signed-off-by: sdashmiz <shadi.dashmiz@amd.com>
Co-authored-by: GunaShekar <agunashe@amd.com>
Co-authored-by: agunashe <ajay.gunashekar@amd.com>
Co-authored-by: Ethan Trinh <Ethan.Trinh@amd.com>
Co-authored-by: JeniferC99 <150404595+JeniferC99@users.noreply.github.com>
Co-authored-by: Victor Zhang <111778801+victzhan@users.noreply.github.com>
Co-authored-by: German Andryeyev <56892148+gandryey@users.noreply.github.com>
Co-authored-by: usrihari123 <srihari.u@amd.com>
Co-authored-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>
Co-authored-by: anujshuk-amd <anujshuk@amd.com>
Co-authored-by: itrowbri <Ian.Trowbridge@amd.com>
Co-authored-by: marantic-amd <marantic@amd.com>
Co-authored-by: David Galiffi <David.Galiffi@amd.com>
Co-authored-by: cadolphe-amd <chris.adolphe@amd.com>
Co-authored-by: Karthik Jayaprakash <54370791+kjayapra-amd@users.noreply.github.com>
Co-authored-by: Matt Arsenault <Matthew.Arsenault@amd.com>
Co-authored-by: Todd tiantuo Li <88386084+lttamd@users.noreply.github.com>
Co-authored-by: amilanov-amd <Aleksandar.Milanov@amd.com>
Co-authored-by: Adam Pryor <61172547+adam360x@users.noreply.github.com>
Co-authored-by: Maisam Arif <Maisam.Arif@amd.com>
Co-authored-by: AidanBeltonS <abeltons@amd.com>
Co-authored-by: Rahul Manocha <153310294+manocharahul@users.noreply.github.com>
Co-authored-by: Rahul Manocha <rmanocha@amd.com>
Co-authored-by: Kian Cossettini <Kian.Cossettini@amd.com>
Co-authored-by: Shadi Dashmiz <94885391+shadidashmiz@users.noreply.github.com>
Co-authored-by: Ioannis Assiouras <38722728+iassiour@users.noreply.github.com>
Co-authored-by: Ajay GunaShekar <86270081+agunashe@users.noreply.github.com>
2025-12-03 08:53:17 -08:00
MachineTom 5f76cb916d SWDEV-555888 - Refactor Numa code (#1191)
1. Create a minimal set of NUMA interfaces.
On Linux, the interface is based on system calls rather than libnuma.
On Windows, the interface also works, but the policy class is a dummy.
Unlike Linux, Windows does not provide the numactl tool or a NUMA library to set up NUMA policy, so
the default policy is followed on Windows: pinned host memory in hipHostMalloc() is allocated on
the host NUMA node closest to the device.
To get the closest host NUMA node of a GPU device, query the new attribute
hipDeviceAttributeHostNumaId. You can then create a thread with CPU affinity on that NUMA node.
For an example, see the test in hip-tests/catch/perftests/memory/hipPerfHostNumaAllocWin.cc.

2. Remove pfnSetThreadGroupAffinity and pfnGetNumaNodeProcessorMaskEx because these functions have been available since Windows 7 and Windows Server 2008.

3. Other minor fixes.
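
The "thread with CPU affinity on the NUMA node" step from item 1 can be sketched for Linux as follows. This is an illustrative analogue, not the code of the referenced Windows test: the helper names are hypothetical, the node id is passed in as a parameter (in a HIP program it would come from querying hipDeviceAttributeHostNumaId), and the sysfs path is the standard Linux location for per-node CPU lists.

```cpp
#include <pthread.h>
#include <sched.h>

#include <fstream>
#include <sstream>
#include <string>
#include <vector>

// Parse a Linux cpulist string such as "0-3,8,10-11" into individual CPU ids.
inline std::vector<int> parse_cpulist(const std::string& list) {
    std::vector<int> cpus;
    std::stringstream ss(list);
    std::string range;
    while (std::getline(ss, range, ',')) {
        if (range.empty()) continue;
        const std::string::size_type dash = range.find('-');
        const int first = std::stoi(range.substr(0, dash));
        const int last =
            (dash == std::string::npos) ? first : std::stoi(range.substr(dash + 1));
        for (int cpu = first; cpu <= last; ++cpu) cpus.push_back(cpu);
    }
    return cpus;
}

// Read the CPUs that belong to a host NUMA node from sysfs (Linux only).
// Returns an empty vector if the node does not exist.
inline std::vector<int> cpus_of_numa_node(int node) {
    std::ifstream file("/sys/devices/system/node/node" + std::to_string(node) +
                       "/cpulist");
    std::string line;
    std::getline(file, line);
    return parse_cpulist(line);
}

// Restrict the calling thread to the CPUs of the given NUMA node.
// Returns true on success, false if the node has no usable CPUs.
inline bool pin_current_thread_to_node(int node) {
    cpu_set_t set;
    CPU_ZERO(&set);
    int count = 0;
    for (int cpu : cpus_of_numa_node(node)) {
        if (cpu >= 0 && cpu < CPU_SETSIZE) {
            CPU_SET(cpu, &set);
            ++count;
        }
    }
    if (count == 0) return false;
    return pthread_setaffinity_np(pthread_self(), sizeof(set), &set) == 0;
}
```

A worker thread would call `pin_current_thread_to_node(node)` before allocating pinned host memory, so that the allocation follows the thread's placement.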
2025-10-23 21:56:15 -04:00
SaleelK d0e622e978 hip-tests: Fix hipPerfBufferCopySpeed (#946)
* Fix formatting and buffer size
2025-09-23 23:11:20 -04:00
Pengda Xie b9fc643a56 SWDEV-538789 - Cleanup unused values in perftests (#789) 2025-09-03 09:13:29 -07:00
Danylo Lytovchenko 2ff2316227 Adjust clang format to the new versions, revert broken macro layout (#714) 2025-08-22 17:23:22 +02:00
Hadi Naeisseh b2857b5db9 SWDEV-543981 Part 2 This is a new branch to avoid the many errors in the previous PR due to migration (#672)
Co-authored-by: hnaeisse_amdeng <hadi.naeisseh@amd.com>
2025-08-21 09:06:57 -04:00
Danylo Lytovchenko f7338717ae SWDEV-470698 - fix formatting, add format check workflow (#657) 2025-08-20 19:58:06 +05:30
Naeisseh, Hadi 1d9c8b7f6d SWDEV-546485 Port and clean up for all tests in catch/perftests/memory folder. (#558)
* SWDEV-546485 Port and clean up for hipPerfBufferCopyRectSpeed

* SWDEV-546485 Port and clean up for hipPerfDevMemReadSpeed

* SWDEV-546485 Port and clean up for hipPerfDevMemWriteSpeed

* SWDEV-546485 Port and clean up for hipPerfHostNumaAlloc

* SWDEV-546485 Port and clean up for hipPerfMemcpy

* SWDEV-546485 Port and clean up for hipPerfMemMallocCpyFree

* SWDEV-546485 Port and clean up for hipPerfMemset

* SWDEV-546485 Port and clean up for hipPerfSampleRate

* SWDEV-546485 Port and clean up for hipPerfSharedMemReadSpeed

* SWDEV-546485 Ported and fixed up segfault for hipPerfMemFill

* SWDEV-545485 Returning to unedited stage

[ROCm/hip-tests commit: 04469c0cde]
2025-08-15 13:09:19 -07:00
Luo, Phoebe 17d12dff14 SWDEV-546504 - Improve Catch2 INFO Prints (#496)
SWDEV-546504 Added function to print output to terminal and a debug print that can be toggled

[ROCm/hip-tests commit: 0dde3ce589]
2025-08-11 13:53:04 -07:00
Gollamandala, Srinivasarao f05c8ddb2b SWDEV-532640-[catch2][dtest]-Prefetch all arguments and keep 0 hidden args if possible-PerfTest (#242)
* SWDEV-532640-[catch2][dtest]-Prefetch all arguments and keep 0 hidden args if possible-PerfTest

* SWDEV-532640-Addressed review comment

* SWDEV-532640-Fixed Neg clock time issue

* SWDEV-532640-Fixed Neg clock time issue

* SWDEV-532640-Addressed clang format issue

* SWDEV-532640-Fixed Clang Format issues

[ROCm/hip-tests commit: 2060125dfd]
2025-07-08 15:24:52 +05:30
Venkatesh, Anavena 882d176c50 SWDEV-526521 : Inter GPU copy performance improvements (#240)
* SWDEV-532641 Inter GPU copy performance improvements

* SWDEV-532641 changed source data pointer type to vector type

[ROCm/hip-tests commit: feaa82ac46]
2025-07-07 17:45:20 +05:30
Gollamandala, Srinivasarao ff14fb30bf SWDEV-513197-[catch2][dtest]PerfTest-Improve launch performance for Device Heap kernels (#159)
[ROCm/hip-tests commit: 327edf98b3]
2025-05-06 08:14:52 +05:30
Swargam, Rambabu 7e8e711087 SWDEV-515926 - [catch2][dtest] Tests for Memory Manager for memory pool performance (#155)
[ROCm/hip-tests commit: 1b60d60f5a]
2025-04-30 10:11:33 +05:30
Tao Sang a697edb15b SWDEV-505853 - Fix Unit_hipMemPoolApi_BasicAlloc in mgpu
Unit_hipMemPoolApi_BasicAlloc expects to run on device 0, but other
tests select non-zero devices in mgpu runs. This causes
Unit_hipMemPoolApi_BasicAlloc to hang. Fix by setting device 0 at the
start of Unit_hipMemPoolApi_BasicAlloc.

SWDEV-508872 - Fix Perf_hipPerfMemFill_test

When the memory size is 2G, the test is so slow that it appears to hang.
Limiting the maximum memory size to 1G lets the test pass in an acceptable time.

Change-Id: Ie26dbf597e5ba8cb898d1aae5ed5ecf0267c3228


[ROCm/hip-tests commit: 94eea4db59]
2025-03-07 14:52:10 -05:00
Aidan Belton e403f56351 SWDEV-475380 - fix perftests on cuda
Change-Id: Iae6fc6cfdc4c2e6cb07562a03ff4e055601ed463


[ROCm/hip-tests commit: 2053abc3b1]
2025-01-13 09:22:51 -05:00
taosang2 bb901bef4a SWDEV-475568 - Fix compiling issues
Fix compiling issues of "make perf_test" under
hip-tests.

Change-Id: Ib03328a2fb13375fa44626a42202b1eeb177b8b2


[ROCm/hip-tests commit: a2f37dfa3a]
2024-10-23 11:03:45 -04:00
Branislav Brzak 35c7d3e1c6 SWDEV-448163 - Fix Doxygen warnings
Change-Id: If72e312461a72920b6a482009c9aef4cf92f2e1b


[ROCm/hip-tests commit: 6c23e25c86]
2024-03-25 05:18:34 -04:00
taosang2 12dcaec75f SWDEV-438680 - Add copy perf test cases
Add copy perf test cases for all devices to
all devices.

Change-Id: I6fa9e2c111a9ef48ef63a721e7a64c54e7f2a72f


[ROCm/hip-tests commit: c745949ec1]
2024-03-14 14:28:19 -04:00
Tao Sang 19ac140e3c SWDEV-430760 Add hipPerfBufferCopySpeedP2P.cc
Change-Id: I606226705cd441c1742e0eac4841f7b189d69149


[ROCm/hip-tests commit: fbb6002829]
2024-01-19 02:33:05 -05:00
Rahul Manocha 9941cb9c48 SWDEV-431064 catch test perf_test target compilation fix
Change-Id: I7de355c0d8ffd60ae05851e94c6e1a08ad655fd8


[ROCm/hip-tests commit: c5fa5e683f]
2023-11-17 06:34:32 -05:00
Rahul Manocha e96f828db3 SWDEV-428567 Perf Catch Test for new hipMemcpyKind
Change-Id: I215d5465c6e538deecf99e735f6bcf67e159841b


[ROCm/hip-tests commit: 0f3750cf2c]
2023-11-17 06:28:17 -05:00
German Andryeyev 6596324b3d SWDEV-430748 - Fix/update hipPerfMemset test
Correct the size of allocated buffers.
Extend the number of executed tests
Make sure warm-up finishes, before starting the test
Use a non-blocking stream for Async tests
Align up the output with results

Change-Id: Ie107fd83c0a95dacb537d8bca0b534cf6a6d5032


[ROCm/hip-tests commit: 9971540ac8]
2023-11-07 09:47:27 -05:00
ROCm CI Service Account 101abc7b39 SWDEV-403471 - [catch2][dtest] Converting perftests-memory files from HIT to catch2 (#342)
Change-Id: I13d2513f31dffe0b280039c888a97cc0d7bba31f

[ROCm/hip-tests commit: cf174d5a47]
2023-08-14 21:17:55 +05:30