39 Commits

Author SHA1 Message Date
Jatin Chaudhary 8e1aee62d0 make hip-tests compileable with TheRock (#1624)
## Motivation

Resolved: SWDEV-566226

The current agent implementation in rocprof-systems keeps only the minimal set of information needed to populate the `info_agent` table in the rocpd database. A significant amount of agent data is therefore left out of the database. This change fixes that by storing the additional agent information as `extdata` in the `info_agent` table.

## Technical Details

This PR introduces an additional field in the `agent` structure that holds a JSON-formatted string of all the additional information we can acquire about a particular agent. This data is collected during the initial fetching of agents and is then written to the database.
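
A minimal sketch of the idea, not the PR's actual code: the `agent` struct below, its members, and the helpers `json_escape`, `to_extdata`, and `make_agent` are hypothetical names chosen for illustration. It only shows how extra key/value properties could be folded into one JSON string while an agent is first enumerated.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical stand-in for the profiler's agent record; the real structure
// has different members.
struct agent {
    std::uint64_t id = 0;
    std::string   name;
    std::string   extdata;  // JSON-formatted string with the extra properties
};

// Escapes only quotes and backslashes; control characters are not handled.
inline std::string json_escape(const std::string& in) {
    std::string out;
    for (char c : in) {
        if (c == '"' || c == '\\') out.push_back('\\');
        out.push_back(c);
    }
    return out;
}

// Serializes the extra properties into a flat JSON object of string values.
inline std::string to_extdata(const std::map<std::string, std::string>& props) {
    std::string json  = "{";
    bool        first = true;
    for (const auto& kv : props) {
        if (!first) json += ",";
        json += "\"" + json_escape(kv.first) + "\":\"" + json_escape(kv.second) + "\"";
        first = false;
    }
    return json + "}";
}

// Fills the record once, at the point where agents are first fetched.
inline agent make_agent(std::uint64_t id, const std::string& name,
                        const std::map<std::string, std::string>& props) {
    assert(!name.empty());
    agent a;
    a.id      = id;
    a.name    = name;
    a.extdata = to_extdata(props);
    return a;
}
```

Keeping the extra properties in a single string field means the database schema does not need a new column for every property that might be collected later.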

---------

Co-authored-by: David Galiffi <David.Galiffi@amd.com>

* SWDEV-557412 - Incorporate proper chunk offset when remapping virtual memory (#1848)

* SWDEV-557412 - Incorporate proper offset when remapping virtual memory

* Fix condition to check if VMHeap allocation address matches a chunk address

* Move offset calculation outside if/else block

---------

Co-authored-by: JeniferC99 <150404595+JeniferC99@users.noreply.github.com>

* SWDEV-567852 - Clean-up hip::init() (#1948)

* SWDEV-559267 - Use CLPrint to DevLogPrintf with Log Level - detail debug. (#1160)

* SWDEV-548892 - Stop using ocml isinf wrapper (#1854)

* SWDEV-562708 - change default maximum SVM size to 256GB (#1731)

* SWDEV-503089 - Fix and enable disabled HIP tests from math group (#1319)

* SWDEV-503089 - Fix and enable disabled HIP tests from math group

* SWDEV-503089 - Move single precision reduced run to a common function

* SWDEV-548892 - Stop using ockl steadyctr function (#1882)

Directly use the builtin

* Implement PTL support (#1957)

* Implement PTL support

Signed-off-by: adapryor <Adam.pryor@amd.com>
(cherry picked from commit 45bc31292e7940a3b8fca044ef7df22047b95733)

Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>

---------

Signed-off-by: adapryor <Adam.pryor@amd.com>
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Co-authored-by: Maisam Arif <Maisam.Arif@amd.com>

* SWDEV-558080 - Add recommended granularity (#1176)

* Add recommended granularity

* Improve granularity testing

* Update based on feedback

* Fix and enable VMM tests on cuda (#1855)

* Fix and enable VMM tests on cuda

* Minor syntax fixes

---------

Co-authored-by: Rahul Manocha <rmanocha@amd.com>

* [rocprofiler-systems] Add support for ompt_callback_thread_begin (#1681)

* Add thread_begin callback

* Make OMPT callbacks that are instant have start_ts = end_ts

* SWDEV-567514: Remove default stream wait (#1977)

- when virtual map command is called

- can create deadlock

Signed-off-by: sdashmiz <shadi.dashmiz@amd.com>

* Fix flaky test Unit_hipStreamAddCallback_StrmSyncTiming (#2022)

* Review comments

* skip the 3 failing tests to merge hip-tests rocm-systems PR

---------

Signed-off-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>
Signed-off-by: adapryor <Adam.pryor@amd.com>
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Signed-off-by: sdashmiz <shadi.dashmiz@amd.com>
Co-authored-by: GunaShekar <agunashe@amd.com>
Co-authored-by: agunashe <ajay.gunashekar@amd.com>
Co-authored-by: Ethan Trinh <Ethan.Trinh@amd.com>
Co-authored-by: JeniferC99 <150404595+JeniferC99@users.noreply.github.com>
Co-authored-by: Victor Zhang <111778801+victzhan@users.noreply.github.com>
Co-authored-by: German Andryeyev <56892148+gandryey@users.noreply.github.com>
Co-authored-by: usrihari123 <srihari.u@amd.com>
Co-authored-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>
Co-authored-by: anujshuk-amd <anujshuk@amd.com>
Co-authored-by: itrowbri <Ian.Trowbridge@amd.com>
Co-authored-by: marantic-amd <marantic@amd.com>
Co-authored-by: David Galiffi <David.Galiffi@amd.com>
Co-authored-by: cadolphe-amd <chris.adolphe@amd.com>
Co-authored-by: Karthik Jayaprakash <54370791+kjayapra-amd@users.noreply.github.com>
Co-authored-by: Matt Arsenault <Matthew.Arsenault@amd.com>
Co-authored-by: Todd tiantuo Li <88386084+lttamd@users.noreply.github.com>
Co-authored-by: amilanov-amd <Aleksandar.Milanov@amd.com>
Co-authored-by: Adam Pryor <61172547+adam360x@users.noreply.github.com>
Co-authored-by: Maisam Arif <Maisam.Arif@amd.com>
Co-authored-by: AidanBeltonS <abeltons@amd.com>
Co-authored-by: Rahul Manocha <153310294+manocharahul@users.noreply.github.com>
Co-authored-by: Rahul Manocha <rmanocha@amd.com>
Co-authored-by: Kian Cossettini <Kian.Cossettini@amd.com>
Co-authored-by: Shadi Dashmiz <94885391+shadidashmiz@users.noreply.github.com>
Co-authored-by: Ioannis Assiouras <38722728+iassiour@users.noreply.github.com>
Co-authored-by: Ajay GunaShekar <86270081+agunashe@users.noreply.github.com>
2025-12-03 08:53:17 -08:00
MachineTom 5f76cb916d SWDEV-555888 - Refactor Numa code (#1191)
1. Create a mini NUMA interface.
On Linux, the interface is based on system calls rather than libnuma.
On Windows, the interface also works, but the policy class is a dummy.
Unlike Linux, Windows doesn't provide a numactl tool or a NUMA library to set up a NUMA policy, so
the default policy is used on Windows: hipHostMalloc() allocates pinned host memory on the
closest host NUMA node.
To get the closest host NUMA node of a GPU device, query the new attribute
hipDeviceAttributeHostNumaId. You can then create a thread with CPU affinity on that NUMA node.
For an example, see the test in hip-tests/catch/perftests/memory/hipPerfHostNumaAllocWin.cc.

2. Remove pfnSetThreadGroupAffinity and pfnGetNumaNodeProcessorMaskEx, because the underlying functions have been available since Windows 7 and Windows Server 2008.

3. Other minor fixes.
2025-10-23 21:56:15 -04:00
Sourabh U Betigeri b24f922a24 SWDEV-552620 - Adds a new graph benchmark test for different topologies (#1073) 2025-09-25 09:50:10 -07:00
SaleelK d0e622e978 hip-tests: Fix hipPerfBufferCopySpeed (#946)
* Fix formatting and buffer size
2025-09-23 23:11:20 -04:00
Pengda Xie b9fc643a56 SWDEV-538789 - Cleanup unused values in perftests (#789) 2025-09-03 09:13:29 -07:00
systems-assistant[bot] 5f4e0dc889 SWDEV-538789 - Add multi stream kernel dispatch perf test (#556)
Co-authored-by: Pengda Xie <pengda.xie@amd.com>
2025-08-26 13:42:11 -07:00
Danylo Lytovchenko 2ff2316227 Adjust clang format to the new versions, revert broken macro layout (#714) 2025-08-22 17:23:22 +02:00
Hadi Naeisseh b2857b5db9 SWDEV-543981 Part 2 This is a new branch to avoid the many errors in the previous PR due to migration (#672)
Co-authored-by: hnaeisse_amdeng <hadi.naeisseh@amd.com>
2025-08-21 09:06:57 -04:00
Danylo Lytovchenko f7338717ae SWDEV-470698 - fix formatting, add format check workflow (#657) 2025-08-20 19:58:06 +05:30
Naeisseh, Hadi 1d9c8b7f6d SWDEV-546485 Port and clean up for all tests in catch/perftests/memory folder. (#558)
* SWDEV-546485 Port and clean up for hipPerfBufferCopyRectSpeed

* SWDEV-546485 Port and clean up for hipPerfDevMemReadSpeed

* SWDEV-546485 Port and clean up for hipPerfDevMemWriteSpeed

* SWDEV-546485 Port and clean up for hipPerfHostNumaAlloc

* SWDEV-546485 Port and clean up for hipPerfMemcpy

* SWDEV-546485 Port and clean up for hipPerfMemMallocCpyFree

* SWDEV-546485 Port and clean up for hipPerfMemset

* SWDEV-546485 Port and clean up for hipPerfSampleRate

* SWDEV-546485 Port and clean up for hipPerfSharedMemReadSpeed

* SWDEV-546485 Ported and fixed up segfault for hipPerfMemFill

* SWDEV-545485 Returning to unedited stage

[ROCm/hip-tests commit: 04469c0cde]
2025-08-15 13:09:19 -07:00
Luo, Phoebe 83d3897df9 SWDEV-546217 Complete hip-test Port to Catch2 Framework [Stream and Compute Folder] (#559)
* SWDEV-546498 hipPerfDeviceConcurrency

* SWDEV-546500 hipPerfStreamConcurrency

* SWDEV-546502 hipPerfStreamCreateCopyDestroy.c

* SWDEV-546479 hipPerfDotProduct

* SWDEV-546482 hipPerfMandelbrot

[ROCm/hip-tests commit: 9fdc9a98b7]
2025-08-15 12:38:33 -07:00
Luo, Phoebe 12a1235939 SWDEV-543981 - Performance Test Improvement for Dispatch Speed and Kernel Latency (#527)
* SWDEV-543981 new kernel latency test with different timing modes and taking multiple iterations of same test

* SWDEV-543981 cleanup

* SWDEV-543981 removed outdated HIT test

* SWDEV-543981 Updated timing kernel

[ROCm/hip-tests commit: d227a8110c]
2025-08-15 12:34:44 -07:00
Luo, Phoebe 17d12dff14 SWDEV-546504 - Improve Catch2 INFO Prints (#496)
SWDEV-546504 Added function to print output to terminal and a debug print that can be toggled

[ROCm/hip-tests commit: 0dde3ce589]
2025-08-11 13:53:04 -07:00
Dittakavi, Satyanvesh 5e6d4d0087 SWDEV-511852 - Adding performance test for Virtual Memory Management (#150)
Co-authored-by: idass <ian.dass@amd.com>

[ROCm/hip-tests commit: 94bb129ac1]
2025-07-25 12:19:05 +05:30
Gollamandala, Srinivasarao f05c8ddb2b SWDEV-532640-[catch2][dtest]-Prefetch all arguments and keep 0 hidden args if possible-PerfTest (#242)
* SWDEV-532640-[catch2][dtest]-Prefetch all arguments and keep 0 hidden args if possible-PerfTest

* SWDEV-532640-Addressed review comment

* SWDEV-532640-Fixed Neg clock time issue

* SWDEV-532640-Fixed Neg clock time issue

* SWDEV-532640-Addressed clang format issue

* SWDEV-532640-Fixed Clang Format issues

[ROCm/hip-tests commit: 2060125dfd]
2025-07-08 15:24:52 +05:30
Venkatesh, Anavena 882d176c50 SWDEV-526521 : Inter GPU copy performance improvements (#240)
* SWDEV-532641 Inter GPU copy performance improvements

* SWDEV-532641 changed source data pointer type to vector type

[ROCm/hip-tests commit: feaa82ac46]
2025-07-07 17:45:20 +05:30
Sang, Tao 438882ceb7 SWDEV-514141 - Fix zero clock rate issues (#4)
1. Remove clock functions from some tests that don't need them.
2. In some memory pool tests and coherency tests, timer-based kernel
delay isn't reliable; use pinned-host-based notification instead.
3. Add CHECK_PCIE_ATOMICS_SUPPORT before some tests.
4. Remove catch/unit/memory/hipMemoryAllocateCoherent.cc,
as it serves no purpose and was already excluded from the build.
5. Some tests can still pass even if the clock rate is 0, so they
  are kept as is.
6. Some logic and format improvements in some tests.

Change-Id: I6b3c6bf54c61cffd45cd6f17c75998f751b75725

[ROCm/hip-tests commit: ec8ff45a1d]
2025-06-11 21:11:25 +05:30
Gollamandala, Srinivasarao 9e2a9eba01 SWDEV-504650-[catch2][dtest]PerfTest-Reduce the lock scope for hipEventRecord and hipEventQuery (#158)
[ROCm/hip-tests commit: e3964c54d6]
2025-05-12 10:23:07 +05:30
Gollamandala, Srinivasarao ff14fb30bf SWDEV-513197-[catch2][dtest]PerfTest-Improve launch performance for Device Heap kernels (#159)
[ROCm/hip-tests commit: 327edf98b3]
2025-05-06 08:14:52 +05:30
Gollamandala, Srinivasarao f5398c7dde SWDEV-504658-[catch2][dtest]-Reduce the lock scope of the kernel object look-up (#73)
[ROCm/hip-tests commit: e540c3b94a]
2025-05-05 14:24:11 +05:30
Swargam, Rambabu 7e8e711087 SWDEV-515926 - [catch2][dtest] Tests for Memory Manager for memory pool performance (#155)
[ROCm/hip-tests commit: 1b60d60f5a]
2025-04-30 10:11:33 +05:30
Tao Sang a697edb15b SWDEV-505853 - Fix Unit_hipMemPoolApi_BasicAlloc in mgpu
Unit_hipMemPoolApi_BasicAlloc expects to run on device 0, but other
tests set non-zero devices in mgpu runs. This causes
Unit_hipMemPoolApi_BasicAlloc to hang. Fix by setting device 0 at the
start of Unit_hipMemPoolApi_BasicAlloc.

SWDEV-508872 - Fix Perf_hipPerfMemFill_test

When the mem size is 2G, the test is so slow that it appears to hang.
Setting the top mem size to 1G lets the test pass in an acceptable time.

Change-Id: Ie26dbf597e5ba8cb898d1aae5ed5ecf0267c3228


[ROCm/hip-tests commit: 94eea4db59]
2025-03-07 14:52:10 -05:00
Ioannis Assiouras 360b0b68cc SWDEV-490855 - Enabled Perf_KernelLaunchLatency_IncreasingNumberOfStreams under perftests
Change-Id: I2a494022d5cc113dce044faadd7d2462a2aece08


[ROCm/hip-tests commit: 472e0b7b20]
2025-02-04 13:20:47 -05:00
SrinivasaRao 36c4a39444 SWDEV-491360-[catch2][dtest]-Improve hipGraphLaunch parallelism tests-stream collision
Change-Id: I1ea60bfbf4b738ed83a3e11ceb80ba7dd1f21998


[ROCm/hip-tests commit: f24fc09ed4]
2025-01-21 22:43:43 -05:00
Michael Xie b2f415e866 SWDEV-494221 - Remove unnecessary hipEventRecord
- Removed unnecessary hipEventRecord and fixed time calculation in
  hipPerfDispatchSpeed.cc where it was off by a factor of 1,000.
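
The commit does not say which units were mixed up. A factor of 1,000 usually points to a millisecond/microsecond slip, so the sketch below assumes the elapsed time arrives in milliseconds (the unit `hipEventElapsedTime` reports) and the result is wanted in microseconds per dispatch. The helper name `microseconds_per_dispatch` is hypothetical and is not taken from the test.

```cpp
#include <cassert>

// Assumption for illustration: elapsed_ms is in milliseconds and the caller
// wants the average cost of one dispatch in microseconds.
inline double microseconds_per_dispatch(double elapsed_ms, int dispatches) {
    assert(dispatches > 0);
    const double elapsed_us = elapsed_ms * 1000.0;  // ms -> us
    return elapsed_us / dispatches;
}
```

Naming the unit in both the parameter and the function keeps the conversion visible at the call site.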

Change-Id: If538e1d236cf0e6d3c69caf7af53c9095d812ad6


[ROCm/hip-tests commit: b1f9f86543]
2025-01-16 11:45:20 -05:00
Aidan Belton e403f56351 SWDEV-475380 - fix perftests on cuda
Change-Id: Iae6fc6cfdc4c2e6cb07562a03ff4e055601ed463


[ROCm/hip-tests commit: 2053abc3b1]
2025-01-13 09:22:51 -05:00
Rambabu Swargam 80903662fc SWDEV-491360 - [catch2][dtest] hipGraphLaunch parallelism Tests
Tests for Alloc node detection optimization changes

Change-Id: I780553110b33887c65b7989490c9a72e796f1a62


[ROCm/hip-tests commit: 351ffa8378]
2025-01-09 00:02:07 -05:00
Ajay ffc418c7fd SWDEV-1 - avoid same stream target name for tests
Having same target name causes same includes to be called twice

Change-Id: I53469a07e6dee375ea4a4700ccac3c9487b79e4a


[ROCm/hip-tests commit: c03ad253fd]
2024-11-25 17:50:24 -05:00
Ioannis Assiouras 3be38db6a3 SWDEV-490855 - Add test cases to evaluate kernel launch performance with increasing number of idle streams
Change-Id: I466c57190259d4f5995b06974cc7f589580400b0


[ROCm/hip-tests commit: 22e27ec97f]
2024-11-20 04:19:23 -05:00
Saleel Kudchadker 3e10bf3e5e SWDEV-494221 - Fix hipPerfDispatchSpeed test
- Do a hipEventRecord on the null stream first; this creates the streams
  and avoids stream creation overhead when timing the core functionality
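
The same warm-up-before-timing pattern can be sketched on the host side without any HIP calls. This is an analogue under stated assumptions, not the test's code: `average_microseconds` is a hypothetical helper, and `op` stands in for whatever operation is being measured.

```cpp
#include <cassert>
#include <chrono>
#include <functional>

// Runs `warmup` untimed calls first so that one-time setup (lazy
// initialization, first-use allocation) is excluded, then returns the
// average duration of `iterations` timed calls in microseconds.
inline double average_microseconds(const std::function<void()>& op,
                                   int warmup, int iterations) {
    assert(iterations > 0);
    for (int i = 0; i < warmup; ++i) op();  // untimed warm-up

    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < iterations; ++i) op();
    const auto stop = std::chrono::steady_clock::now();

    const std::chrono::duration<double, std::micro> elapsed = stop - start;
    return elapsed.count() / iterations;
}
```

In the HIP test the warm-up step is the hipEventRecord on the null stream described above; here it is simply a few extra calls to `op` before the clock starts.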

Change-Id: I117dccc42c92836fa113214d31bf14da49deba77


[ROCm/hip-tests commit: fb5e1d33d9]
2024-11-12 13:02:39 -05:00
taosang2 bb901bef4a SWDEV-475568 - Fix compiling issues
Fix compiling issues of "make perf_test" under
hip-tests.

Change-Id: Ib03328a2fb13375fa44626a42202b1eeb177b8b2


[ROCm/hip-tests commit: a2f37dfa3a]
2024-10-23 11:03:45 -04:00
Branislav Brzak 35c7d3e1c6 SWDEV-448163 - Fix Doxygen warnings
Change-Id: If72e312461a72920b6a482009c9aef4cf92f2e1b


[ROCm/hip-tests commit: 6c23e25c86]
2024-03-25 05:18:34 -04:00
taosang2 12dcaec75f SWDEV-438680 - Add copy perf test cases
Add copy perf test cases for all devices to
all devices.

Change-Id: I6fa9e2c111a9ef48ef63a721e7a64c54e7f2a72f


[ROCm/hip-tests commit: c745949ec1]
2024-03-14 14:28:19 -04:00
Tao Sang 19ac140e3c SWDEV-430760 Add hipPerfBufferCopySpeedP2P.cc
Change-Id: I606226705cd441c1742e0eac4841f7b189d69149


[ROCm/hip-tests commit: fbb6002829]
2024-01-19 02:33:05 -05:00
Rahul Manocha 9941cb9c48 SWDEV-431064 catch test perf_test target compilation fix
Change-Id: I7de355c0d8ffd60ae05851e94c6e1a08ad655fd8


[ROCm/hip-tests commit: c5fa5e683f]
2023-11-17 06:34:32 -05:00
Rahul Manocha e96f828db3 SWDEV-428567 Perf Catch Test for new hipMemcpyKind
Change-Id: I215d5465c6e538deecf99e735f6bcf67e159841b


[ROCm/hip-tests commit: 0f3750cf2c]
2023-11-17 06:28:17 -05:00
mbhiutra 0e5e2ec2f7 SWDEV-403471 - [catch2][dtest] Conversion of perftests to catch2
Change-Id: I68cb780a71a6094dca86718e7d427806d3a0e67d


[ROCm/hip-tests commit: b0c4a4f70f]
2023-11-17 06:24:57 -05:00
German Andryeyev 6596324b3d SWDEV-430748 - Fix/update hipPerfMemset test
Correct the size of allocated buffers.
Extend the number of executed tests.
Make sure warm-up finishes before starting the test.
Use a non-blocking stream for Async tests.
Align the output with the results.

Change-Id: Ie107fd83c0a95dacb537d8bca0b534cf6a6d5032


[ROCm/hip-tests commit: 9971540ac8]
2023-11-07 09:47:27 -05:00
ROCm CI Service Account 101abc7b39 SWDEV-403471 - [catch2][dtest] Converting perftests-memory files from HIT to catch2 (#342)
Change-Id: I13d2513f31dffe0b280039c888a97cc0d7bba31f

[ROCm/hip-tests commit: cf174d5a47]
2023-08-14 21:17:55 +05:30