Commit Graph

52 Commits

Author SHA1 Message Date
German Andryeyev 7e12cf6318 SWDEV-257789 - Initial change to skip kernel arg copy
The optimization is controlled with ROCR_SKIP_KERNEL_ARG_COPY.
This is an initial check-in for experiments. Extra changes are
necessary for full support:
- handle graph capture with the original sysmem alloc
- avoid memobject references, otherwise there is a race condition with
reuse of the arg buffer
- Remove arg setup from hip

Change-Id: Ib0af710f93e79834711fa4049a7c66093711e68b
2021-10-28 20:35:35 -04:00
Vladislav Sytchenko d934612948 SWDEV-1 - Prepare for c++17 switch
std::mem_fun() and std::bind2nd() are removed in c++17. Switch to
simpler logic that does not require those functions.

Change-Id: I19a31f076e1813e367615bd377b424046ce144c7
2021-09-08 16:18:33 -04:00
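
The kind of rewrite described in d934612948 can be sketched as follows. This is a minimal illustration, not the code from the commit: `Command`, `countDone`, and `countIdsAbove` are hypothetical names, and lambdas are assumed as the "simpler logic" that replaces the removed adaptors.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in type; not a class from this repository.
struct Command {
  int id;
  bool done;
  bool isDone() const { return done; }
};

// Forms removed from the standard in C++17 (some standard libraries still
// accept them with a deprecation warning):
//   std::count_if(b, e, std::mem_fun(&Command::isDone));
//   std::count_if(b, e, std::bind2nd(std::greater<int>(), limit));

// Lambda in place of std::mem_fun: call the member through the pointer.
inline std::ptrdiff_t countDone(const std::vector<const Command*>& cmds) {
  return std::count_if(cmds.begin(), cmds.end(), [](const Command* c) {
    assert(c != nullptr);
    return c->isDone();
  });
}

// Lambda capture in place of std::bind2nd: `limit` is the bound second argument.
inline std::ptrdiff_t countIdsAbove(const std::vector<int>& ids, int limit) {
  return std::count_if(ids.begin(), ids.end(),
                       [limit](int id) { return id > limit; });
}
```

Both helpers build unchanged under C++11, C++14, and C++17, so the switch can be prepared before the compiler flag changes.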
German Andryeyev ff15c0893e SWDEV-292018 - Switch to internal signals for markers
Add ref counting to ProfilingSignal class to track the last release.
If a signal was used in the marker, then don't reuse it,
but create a new one for internal usage.
Don't rely on HSA callback for the command status update if there
are no pending dispatches.

Change-Id: I19f14ed9d80acfe79993b343b2187635f8428a20
2021-08-22 23:56:07 -07:00
German Andryeyev f34c1b9ff8 SWDEV-292820 - Add a new notify lock
HSA signal callback may occur during the actual marker submit. That
may cause a deadlock because of the shared lock_ object. Create the new
notify_lock_ field to protect the notification.

Change-Id: I9752af84e59895530620fac3932c6fc276de8658
2021-08-22 23:56:07 -07:00
agunashe d96481fb36 SWDEV-293742 - Update copyright end year VDI repo
Change-Id: I69d2fea4a7a43adf96ccea794270e4af991c5261
2021-08-22 23:56:07 -07:00
German Andryeyev ce8dad2ecc SWDEV-290160 - Switch to global HSA signals
Runtime can't assign internal HSA signals for HIP events, because
HIP application can destroy the HIP stream or signal reuse may
occur internally. Switch to global HSA signals for HIP events.

Change-Id: Ieaea2d6b039e492b2e7c5112782a8f4e601e50a1
2021-08-22 23:56:07 -07:00
Christophe Paquot 133287f31f SWDEV-240806 - Release resources in Command::terminate for HIP
We do not want to release resources during setStatus in HIP because of Graphs

Change-Id: Idc7b188ab5f8be6975ea91005dd2bbf177401f8c
2021-08-22 23:56:07 -07:00
German Andryeyev c49f1069ab SWDEV-290160 - Don't send notification for batch markers
Batch marker already has a barrier with HSA signal callback

Change-Id: I69fc63d72320c2e9cc2d2e59ebd3f07c0bd0e3b5
2021-08-22 23:56:07 -07:00
German Andryeyev 85c70a7495 SWDEV-284671 - Add HW event wait to improve hipDeviceSynchronize
If AMD event contains a reference to a HW event, then runtime
could check/wait for HW event. CPU status update will occur later
after HSA signal callback, but it's not important for the result.

Change-Id: I591391a953bbdba6a25ac07e2cd98aeb17cd4596
2021-08-22 23:56:07 -07:00
Saleel Kudchadker 9d0846e732 SWDEV-286092 - Enable handler for marker always
For DD, send a NOP packet so that we leverage the handler to indicate
completion.

Change-Id: Ie57ea0124a8497d39cc49da1c4575c2cd86b9319
2021-08-22 23:56:07 -07:00
German Andryeyev fa2e154a8b SWDEV-278894 - Use GPU waits for HIP events
Save HW events in amd::Event.
Use HW events for synchronization

Change-Id: I98cf9c2d0ec3c7fcaf254b749ac6c568d7270ae0
2021-05-25 13:41:15 -04:00
Anusha GodavarthySurya c9c6bed022 SWDEV-240806 - [hip-graph] Added functions updateEventWaitList and resetStatus
Change-Id: I6a753e9584bdacd39ee676466a884ec6b7859879
2021-04-20 09:43:40 -04:00
Saleel Kudchadker 9307ab43e4 SWDEV-278336 - Print time info only when profiling
Change-Id: Ic8d04e58cf4558fbfc5ed6db35f3ff2d788803f9
2021-04-09 13:17:31 -04:00
Satyanvesh Dittakavi a711a49881 SWDEV-264244 - Hide Notifications from HIP
This fixes hipStreamQuery returning hipErrorNotReady when idle
Change-Id: I3f77666a00bc6a7162b6c660d79e76c09669d94f
2021-03-16 06:30:55 -04:00
German Andryeyev 7f32d0b425 SWDEV-272496 - Detect callbacks and force AQL barrier
HIP tests require HIP callbacks to be processed in another thread.
This change will use a thread from HSA signal callbacks to make
sure a HIP callback was done asynchronously.
Also process the callback before changing the status of command

Change-Id: Icef85d0e0f808663882cf6881ff1be3e5eca29ac
2021-03-05 11:33:51 -05:00
Sarbojit Sarkar 14d54a7b29 SWDEV-254329 - Fix for profiler ON/OFF
Change-Id: Iea72ae96ebe7ed95322dfc39d785ac326b47f6dc
2021-03-02 02:16:14 -05:00
German Andryeyev 24299e25bd SWDEV-272496 - Fix multiple timing issues
- Don't notify if the batch is empty, because that means
the current command was processed already.
- Disable pinning optimization to avoid a race condition on stall.
- TS marker submission requires extra AQL barrier
to track the status.

Change-Id: I17eff4ad12ac66cfe1bb44048bebb1891805279d
2021-03-01 12:46:57 -05:00
German Andryeyev a9b0e20d26 SWDEV-272496 - Fix a regression in PAL
Skip notification for markers with direct dispatch only,
since they are always blocking

Change-Id: I6bb17650f73371dae6e29c59fd6bb2012cc062fd
2021-02-25 11:11:42 -05:00
Vladislav Sytchenko 184b2631d5 SWDEV-271964 - Revert "SWDEV-264244 Fix StreamSync"
This reverts commit a148a71075.

Change-Id: I870c8b71edeb31f587fffe2447762acba61a7938
2021-02-24 11:43:08 -05:00
German Andryeyev 0587fb7450 SWDEV-272496 - Disable notification for the previous notify
Direct dispatch doesn't insert extra barriers for Markers if
AQL barrier was the last issued command already.

Change-Id: I00fbc658547d83dd3ee64ec391ed50e5f8a08e30
2021-02-23 17:04:59 -05:00
German Andryeyev 6966d8098e SWDEV-269654 - Fix HIP stream busy query
- Avoid GPU wait on the marker submission and update the command
batch after HSA signal callback upon HSA barrier completion.

Change-Id: I5c1c97212aefc2ae4b99aa9e2a81627ee9a38c1c
2021-02-09 12:57:12 -05:00
Satyanvesh Dittakavi a148a71075 SWDEV-264244 Fix StreamSync
Change-Id: I3a46a607a77aaf46dcd1fcf08db7e926613fe8d1
2021-01-08 02:06:31 -05:00
Sarbojit Sarkar 0e4b4255b2 SWDEV-262857: minor fix for D2D
Change-Id: Ica3cb9108e7a0d40d6a910f318df0a2420145603
2020-12-16 23:13:15 -05:00
Saleel Kudchadker d0c35f1c40 Fix event reporting for AMD_DIRECT_DISPATCH
Change-Id: I2ff74b9470da976852228c30fefbd4abd8e1952b
2020-12-09 15:09:41 -05:00
Sarbojit Sarkar f403b1c079 [SWDEV-259635] explicit allow_access for hipMemcpy2D
Change-Id: Ia3206c08f92f417dc486c5f0dd40474f77b473d9
2020-12-09 01:09:53 -05:00
German Andryeyev 5b31c69a95 Add batch tracking for direct dispatch
Make sure the logic updates the command status when it's done in
HW, but not on submission.
Add the last command tracking, otherwise queue sync logic in the HIP
upper layer may skip synchronization, assuming the queue is empty.

Change-Id: I2d046792553e74df090a10f7d7a78914610f6df2
2020-12-04 10:16:17 -05:00
German Andryeyev 532f0ae951 Add direct dispatch simple hack for testing
The hack doesn't really track the command status. It may not be
necessary for HIP, but will cause early resource release.

Change-Id: I791ad36dd8abd3b6b3d2c9b16a210a555c08ca64
2020-11-13 10:36:23 -05:00
Sarbojit Sarkar 099f8d61dd SWDEV-258573 : fix for OCLP2PBuffer test failure
Change-Id: I363d4fb2bb94d9bc03e96844d31dec7ef9b2ce33
2020-11-13 02:25:53 -05:00
German Andryeyev bd340d8cbf Correct reported info in ROC profiler
OCL can't distinguish different copy types, but ROC profiler
expects SDMA transfer visibility. Add extra code to detect
a transfer with the host memory and substitute OCL command

Change-Id: I5290acd0e10bc082e00c1d4ae1474a075de7f165
2020-10-23 18:29:48 -04:00
Jason Tang c5184d61b4 SWDEV-252150 - No need to send a Marker if the event is completed in Event::notifyCmdQueue()
Change-Id: Iaa1c550ce0849c12298a24812604345dbf877a5e
2020-10-14 09:29:24 -04:00
Sarbojit Sarkar 4a025e1a87 [perf]hipMalloc performance optimization
Change-Id: I6e8a918cc1c4cafad197b09e10755cd180e11ead
2020-10-06 03:19:41 -04:00
Laurent Morichetti 5d4b6f74d3 Use std::atomic
Replace amd::Atomic with std::atomic. Remove make_atomic uses by
converting the variable to std::atomic and making sure the memory
order is relaxed when synchronizes-with is not needed.

Delete utils/atomic.hpp.

Change-Id: I0b36db8d604a8510ac6e36b32885fd16a1b8ccfa
2020-09-09 14:55:29 -04:00
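
The memory-order distinction mentioned in 5d4b6f74d3 can be illustrated with a small sketch. `SubmitCounter` and `Mailbox` are hypothetical classes, not taken from the repository. The first needs no synchronizes-with relation, so relaxed ordering is enough. The second publishes data to a reader and therefore needs release/acquire.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical statistics counter: readers only need the number itself,
// so no ordering with surrounding memory operations is required.
class SubmitCounter {
 public:
  void increment() { count_.fetch_add(1, std::memory_order_relaxed); }
  uint64_t value() const { return count_.load(std::memory_order_relaxed); }

 private:
  std::atomic<uint64_t> count_{0};
};

// Hypothetical one-shot mailbox: a reader that observes `ready_` must also
// observe `payload_`, so the store/load pair uses release/acquire.
class Mailbox {
 public:
  void publish(int value) {
    assert(!ready_.load(std::memory_order_relaxed));  // publish only once
    payload_ = value;
    ready_.store(true, std::memory_order_release);
  }

  bool tryRead(int& out) const {
    if (!ready_.load(std::memory_order_acquire)) return false;
    out = payload_;
    return true;
  }

 private:
  int payload_ = 0;
  std::atomic<bool> ready_{false};
};
```

Relaxed operations remain atomic. They only drop the ordering guarantees with respect to other memory accesses, which is why they suit counters and flags that carry no dependent data.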
Saleel Kudchadker 1c24072d13 Revert "SWDEV-241977 [ROCm QA] Random Soft hang observed while running TF and Caffe2 benchmarks"
This reverts commit ce038f3163.

Change-Id: Ib56493c92eca793f1dfb6f1cbefb32f0b4f65e89
2020-09-01 18:09:10 -04:00
Saleel Kudchadker ec73340348 Add Queue profiling param and toggle for HIP
Use signal timestamps if NDRange command takes forceProfile flag.

Change-Id: Ib7f187d781fd78a7346818afb3344a9378f4c104
2020-08-06 03:09:53 -04:00
Alex Xie ce038f3163 SWDEV-241977 [ROCm QA] Random Soft hang observed while running TF and Caffe2 benchmarks
Change-Id: I42016c11db15411b86e7b8130d6ba557bc22dbb7
2020-07-22 02:03:48 -04:00
Christophe Paquot 3d15a1e291 Make append and setLastQueuedCommand atomic
Two threads can enqueue to the same HostQueue (HostQueue::enqueue)
and result in last queued command being the first one reaching queue_.enqueue

NOTE: Temporarily make setLastQueuedCommand an empty function to pass the build

Change-Id: Id09c3a28d184986f52b2ec86a2f6a18c40df1f0b
2020-07-14 18:22:45 -04:00
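
The race described in 3d15a1e291 and one way to close it can be sketched as follows. This is an assumption-laden illustration: `HostQueueSketch` is a hypothetical class, commands are reduced to integer ids, and a single mutex is assumed as the mechanism that makes the two steps atomic. The actual change may use a different mechanism.

```cpp
#include <cassert>
#include <deque>
#include <mutex>

// Hypothetical simplified queue; commands are represented by integer ids.
class HostQueueSketch {
 public:
  // Append and last-command update happen under one lock, so no other
  // thread can append between the two steps.
  void enqueue(int commandId) {
    std::lock_guard<std::mutex> guard(lock_);
    queue_.push_back(commandId);
    lastQueued_ = commandId;
  }

  int lastQueuedCommand() const {
    std::lock_guard<std::mutex> guard(lock_);
    return lastQueued_;
  }

  // True when the recorded last command matches the tail of the queue.
  bool consistent() const {
    std::lock_guard<std::mutex> guard(lock_);
    return queue_.empty() ? lastQueued_ == -1 : queue_.back() == lastQueued_;
  }

 private:
  mutable std::mutex lock_;
  std::deque<int> queue_;
  int lastQueued_ = -1;  // -1 means nothing has been queued yet
};
```

Without the shared lock, thread A could append first while thread B records its command as the last one, or the reverse, leaving the recorded last command out of step with the queue tail.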
German Andryeyev 059832b526 Return always true for P2P validation under ROCr
Change-Id: Id32a5a94a642e708d1d042c5247af38501bec153
2020-07-04 11:38:04 -04:00
German Andryeyev 01c2727a3a Disable P2P emulation for HIP
Some apps use P2P transfer without any validation for peer access.
Report an error if runtime has found such a request.

Change-Id: I3bf728f1fc3969697ade97bb1d2f1dce294078e2
2020-06-16 11:21:54 -04:00
Vlad Sytchenko e50a9eec9d Fix -Wsequence-point warning
Change-Id: Ib6322e06f83887da4a29f8eafb99b743211e851d
2020-06-15 17:40:11 -04:00
German Andryeyev e4177b75bc Add missing return
Change-Id: Ibe9c1ccb377ce14ad69a0e9828ea70b707acba34
2020-06-12 17:35:45 -04:00
German Andryeyev c5afd5d412 Initial HMM support
- Expose ROCclr interfaces for HIP usage
- ROCr interfaces aren't available in staging, thus control the
build with AMD_HMM_SUPPORT define

Change-Id: Iadc2bcc230e78d3b0dc22b235189c8cc80843446
2020-06-12 09:06:07 -04:00
Vlad Sytchenko 6e985845b3 Take into account dynamic LDS size when validating the launch parameters.
Bottom layers don't error check this value, so we might end up writing a bad value to a register and cause the SPI to hang.

Change-Id: I6da4ae71c66a25c63ebb804da4afe4ca7fb831b7
2020-05-08 09:37:06 -04:00
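
A launch-parameter check of the kind described in 6e985845b3 might look like the sketch below. The struct, the function name, and the idea of passing the device limit as a parameter are all hypothetical. The commit itself does not show how the limit is obtained or where the check lives.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical launch description; field names are illustrative only.
struct LaunchInfo {
  uint32_t staticLdsBytes;   // LDS declared by the kernel at compile time
  uint32_t dynamicLdsBytes;  // LDS requested by the caller at launch time
};

// Returns true when static plus dynamic LDS fits within the device limit.
// The sum is computed in 64 bits so a huge dynamic request cannot wrap
// around and pass the check by accident.
inline bool validateLdsSize(const LaunchInfo& info, uint32_t deviceLdsLimitBytes) {
  const uint64_t total = static_cast<uint64_t>(info.staticLdsBytes) +
                         static_cast<uint64_t>(info.dynamicLdsBytes);
  return total <= deviceLdsLimitBytes;
}
```

Rejecting the launch at this level returns an error to the caller instead of passing an out-of-range value down to layers that do not check it.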
Michael LIAO 503ef06555 Clear executable permission.
Change-Id: Ia0d363b1ba89d7947e5b5a55cb67edba86f0515e
2020-05-07 10:38:58 -04:00
kjayapra-amd 7458bf9964 SWDEV-229840 - Improve error messages on ROCCLR Layer.
Change-Id: Iab7d9156cdc206db86385aa05023a0095ed40f92
2020-04-19 20:01:49 -04:00
Payam 1b6f21ad9a removing AMD emails per palamida scan
Change-Id: If7307f5b1f81a43f2725ec5abd3b8989cbddbcc5
2020-03-11 21:26:55 -04:00
German Andryeyev 288967eff4 SWDEV-193956 - Fix a regression in OCL for user events
- Check the queue for nullptr, since the user events may not have
a queue, associated with them

Change-Id: Ib969a052acc9108ca3fd0c063157fe4d47c5b244
2020-03-09 11:10:23 -04:00
German Andryeyev 0fc433e076 SWDEV-193956 - [hipclang-vdi-rocm][perf]
~45% to 50% of Performance drop on rocBLAS_int8 test

Add support for active waits without blocking the host thread.

Change-Id: Ie7bb48dcafcb4c93d448bf74749b829b626c3578
2020-03-04 17:02:15 -05:00
German Andryeyev a66d09f5a3 SWDEV-193956 - [hipclang-vdi-rocm][perf]
~45% to 50% of Performance drop on rocBLAS_int8 test

Use the last command in the queue for a wait.
Add extra print information about processed commands.
Add an option to disable file location printing.

Change-Id: I4187883e1a90e571fde3128af98368108fda8785
2020-02-21 15:21:15 -05:00
Christophe Paquot 566144edb2 Append before setting last command to avoid corner case
Change-Id: Iafe5f899427f0119e7f43e96af38e6e3a1dbfc93
2020-02-13 22:23:20 -05:00
Laurent Morichetti d9d9c69399 Replace cl_* integral types with standard types.
cl_bool -> bool
cl_int -> int32_t
cl_uint -> uint32_t
cl_long -> int64_t
cl_ulong -> uint64_t
cl_float -> float
cl_double -> double
cl_bitfield -> uint64_t

Change-Id: I840c8993b55f98f5b745d21e27f5f28233647a58
2020-02-12 13:16:06 -08:00
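
One consequence of the mapping in d9d9c69399 is worth a sketch: most replacements keep the same width, but cl_bool is a 32-bit unsigned integer in the OpenCL API, while C++ bool usually is not. The aliases and helper names below are stand-ins for illustration. Real code would take the cl_* types from the OpenCL headers, and the commit does not show whether it uses helpers like these.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in aliases for illustration only, mirroring the OpenCL API widths.
namespace api {
using cl_bool = uint32_t;
using cl_int = int32_t;
using cl_uint = uint32_t;
using cl_long = int64_t;
using cl_ulong = uint64_t;
using cl_bitfield = uint64_t;
}  // namespace api

// Width-preserving replacements need no conversion at the API boundary.
static_assert(sizeof(api::cl_int) == sizeof(int32_t), "cl_int is 32-bit");
static_assert(sizeof(api::cl_ulong) == sizeof(uint64_t), "cl_ulong is 64-bit");
static_assert(sizeof(api::cl_bitfield) == sizeof(uint64_t), "cl_bitfield is 64-bit");

// cl_bool -> bool may change the size, so convert explicitly whenever a
// value crosses the API boundary instead of reinterpreting memory.
inline bool fromApiBool(api::cl_bool value) { return value != 0; }
inline api::cl_bool toApiBool(bool value) { return value ? 1u : 0u; }
```

Internal code can then use plain bool, while query results written into caller-provided buffers are still produced as 32-bit values.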