Commit graph

76 commits

Author SHA1 Message Date
German Andryeyev 88bd851f72 Move returned last command under the lock
Change-Id: I4a2b29a6beacd56ea38d91a33b3c5f8b763be3c7
2020-12-11 15:19:06 -05:00
Saleel Kudchadker d0c35f1c40 Fix event reporting for AMD_DIRECT_DISPATCH
Change-Id: I2ff74b9470da976852228c30fefbd4abd8e1952b
2020-12-09 15:09:41 -05:00
German Andryeyev 1fde842703 Fix a deadlock in ROCr backend
When the OCL ROCr backend performs CL_MEM_COPY_HOST_PTR, it may attempt
to access the amd::Memory object it is currently creating, which is
not ready yet. The logic creates a temporary dummy object to perform
the copy transfer. The new change makes sure the runtime skips
allocating the same device::Memory object a second time.

Change-Id: I14c6a00a3941fdcaa6aea299e9f096e4c3f5cadf
2020-12-09 13:23:17 -05:00
Sarbojit Sarkar f403b1c079 [SWDEV-259635] explicit allow_access for hipMemcpy2D
Change-Id: Ia3206c08f92f417dc486c5f0dd40474f77b473d9
2020-12-09 01:09:53 -05:00
German Andryeyev 5b31c69a95 Add batch tracking for direct dispatch
Make sure the logic updates the command status when it's done in
HW, but not on submission.
Add the last command tracking, otherwise queue sync logic in the HIP
upper layer may skip synchronization, assuming the queue is empty.

Change-Id: I2d046792553e74df090a10f7d7a78914610f6df2
2020-12-04 10:16:17 -05:00
Saleel Kudchadker 59c6cb0268 Use barrier packets for event profiling
Use barrier packets for every profile marker that gets submitted
and use the completion signal to get the GPU timestamp. This gives the
most accurate dispatch time. Combine cache flushes with the profile
marker if there is a pending dispatch that needs a cache flush. This
optimization saves an extra barrier and helps wall time.

Change-Id: Ib62d6d7aabf4743827b561be6c9c5afa813203da
2020-12-03 13:45:14 -05:00
German Andryeyev e4f51e063b Disable worker thread creation for direct dispatch
Change-Id: I28f08ab9352310c9bf843fcb803a48f95ddf4676
2020-11-30 17:50:12 -05:00
German Andryeyev 08b846ae12 Remove obsolete terminate() method
Change-Id: I66b4a74f17977f1af320f402402a2f1b602e9911
2020-11-30 11:46:09 -05:00
German Andryeyev 089a5cc4ad Add image view allocation
If deferred allocation is disabled, then make sure the image view
is created without a delay. Also reset the allocation state, since
create() method isn't called for a view creation.

Change-Id: I7aa22a62bff18289ade83e56b5d3305ba68c715b
2020-11-18 09:37:30 -05:00
German Andryeyev 532f0ae951 Add direct dispatch simple hack for testing
The hack doesn't really track the command status. That may not be
necessary for HIP, but it will cause early resource release.

Change-Id: I791ad36dd8abd3b6b3d2c9b16a210a555c08ca64
2020-11-13 10:36:23 -05:00
Sarbojit Sarkar 099f8d61dd SWDEV-258573 : fix for OCLP2PBuffer test failure
Change-Id: I363d4fb2bb94d9bc03e96844d31dec7ef9b2ce33
2020-11-13 02:25:53 -05:00
Jason Tang d943cae31f Add CommandKindString to the log
Change-Id: Ie23123a85cff82b1732da85f5bffbff6958c02e5
2020-10-26 09:16:03 -04:00
German Andryeyev bd340d8cbf Correct reported info in ROC profiler
OCL can't distinguish different copy types, but ROC profiler
expects SDMA transfer visibility. Add extra code to detect
a transfer with the host memory and substitute OCL command

Change-Id: I5290acd0e10bc082e00c1d4ae1474a075de7f165
2020-10-23 18:29:48 -04:00
Evgeny 6f88bf5eac SWDEV-249623 : vdi: case string for marker activity
Change-Id: Id1767a5f33d821d649e15f4659e3520ee215c374
2020-10-22 04:04:00 -04:00
Jason Tang 25cc965c76 Change file mode 755 back to 644
Change-Id: I4ba5d66997ffd3331c56674d4bf805160dcdf049
2020-10-19 15:09:32 -04:00
Jason Tang c5184d61b4 SWDEV-252150 - No need to send a Marker if the event is completed in Event::notifyCmdQueue()
Change-Id: Iaa1c550ce0849c12298a24812604345dbf877a5e
2020-10-14 09:29:24 -04:00
agodavar ac72e50adc SWDEV-254185 - Added support to pass include headers to hipRTC
Change-Id: Ic7f2957b04e518c57e2fd3fc9d839de07232405e
2020-10-12 03:46:04 -04:00
Sarbojit Sarkar 4a025e1a87 [perf]hipMalloc performance optimization
Change-Id: I6e8a918cc1c4cafad197b09e10755cd180e11ead
2020-10-06 03:19:41 -04:00
kjayapra-amd 7462e39954 SWDEV-252542 - Fixing Win Compilation on SWDEV-241902.
Change-Id: If76f79002b265dccf6da4acef1ff9372d8b0a2ff
2020-09-18 12:11:56 -04:00
kjayapra-amd a66c56d641 SWDEV-241902 - Changes to pass file descriptor and offset to load code object.
Change-Id: I0243cccdeaa533b2a56fde42f12d5424c3b63a3b
2020-09-15 07:54:24 -04:00
Laurent Morichetti 5d4b6f74d3 Use std::atomic
Replace amd::Atomic with std::atomic. Remove make_atomic uses by
converting the variable to std::atomic and making sure the memory
order is relaxed when synchronizes-with is not needed.

Delete utils/atomic.hpp.

Change-Id: I0b36db8d604a8510ac6e36b32885fd16a1b8ccfa
2020-09-09 14:55:29 -04:00
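
The distinction this commit message draws between relaxed ordering and a synchronizes-with relationship can be sketched as follows. This is a minimal illustration, not code from the change itself: `g_nextId`, `g_ready`, `g_payload` and the helper functions are hypothetical names chosen for the example.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical names for illustration; none of these come from the ROCclr sources.
std::atomic<unsigned> g_nextId{0};  // plain counter, publishes no other data
std::atomic<bool> g_ready{false};   // flag that publishes g_payload
int g_payload = 0;

// Relaxed is enough here: callers only need a unique value, not any
// ordering with respect to other memory operations.
unsigned nextId() { return g_nextId.fetch_add(1, std::memory_order_relaxed); }

// The release store pairs with the acquire load in tryConsume(), so a
// reader that observes g_ready == true also observes the write to g_payload.
void publish(int value) {
  g_payload = value;
  g_ready.store(true, std::memory_order_release);
}

// Returns true and fills 'out' once the payload has been published.
bool tryConsume(int& out) {
  if (!g_ready.load(std::memory_order_acquire)) return false;
  out = g_payload;
  return true;
}

// Single-threaded walk-through of the two patterns.
void demo() {
  unsigned first = nextId();
  assert(nextId() == first + 1);
  publish(42);
  int v = 0;
  assert(tryConsume(v) && v == 42);
}
```

The counter uses relaxed ordering because nothing else depends on it, while the flag keeps release/acquire because it guards another variable.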
Jason Tang 8b4eb43a4a Call callback even if clBuildProgram is not successful
Change-Id: I3be1d500ecc712c738cfaf252eca83663cad6b77
2020-09-08 14:41:20 -04:00
Laurent Morichetti 080dcfe857 Improve queueLock and lastCmdLock
Reduce the size of the queueLock and lastCmdLock critical sections
to improve lock contention performance. The smaller the critical
sections are the better.

lastCmdLock is still needed to guarantee that getLastEnqueueCommand_
can retain the command before it is swapped out and released.

Change-Id: Id35d4a77c035b2da0de4c15568b153d49e958bb7
2020-09-01 18:09:31 -04:00
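
A rough sketch of the idea described in this commit message: keep the critical section down to the pointer swap, and take the reference under the lock so the command cannot be released in between. `SketchQueue` and `Command` are hypothetical stand-ins, and `std::shared_ptr` is used here in place of the runtime's own retain/release reference counting, which is an assumption of the example.

```cpp
#include <cassert>
#include <memory>
#include <mutex>
#include <utility>

// Hypothetical stand-ins; not the real ROCclr command or queue types.
struct Command {
  int id;
};

class SketchQueue {
 public:
  // Swap in the new last command. Only the swap happens under the lock.
  void setLast(std::shared_ptr<Command> cmd) {
    std::shared_ptr<Command> old;
    {
      std::lock_guard<std::mutex> guard(lastCmdLock_);
      old = std::move(last_);
      last_ = std::move(cmd);
    }
    // 'old' drops its reference when this function returns, outside the lock,
    // so a potentially expensive destruction never runs in the critical section.
  }

  // Copying the shared_ptr under the lock plays the role of "retain":
  // the caller gets its own reference before a concurrent swap can release it.
  std::shared_ptr<Command> getLast() {
    std::lock_guard<std::mutex> guard(lastCmdLock_);
    return last_;
  }

 private:
  std::mutex lastCmdLock_;
  std::shared_ptr<Command> last_;
};

// A previously retrieved command stays valid after it has been swapped out.
int demo() {
  SketchQueue q;
  q.setLast(std::make_shared<Command>(Command{1}));
  std::shared_ptr<Command> held = q.getLast();
  q.setLast(std::make_shared<Command>(Command{2}));
  assert(held->id == 1);
  return q.getLast()->id;
}
```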
Laurent Morichetti c95c613edc Fix indentation with clang-format
Change-Id: I7aeadef3c613d5efc31a98e666bfb819ae34bdf5
2020-09-01 18:09:19 -04:00
Saleel Kudchadker 1c24072d13 Revert "SWDEV-241977 [ROCm QA] Random Soft hang observed while running TF and Caffe2 benchmarks"
This reverts commit ce038f3163.

Change-Id: Ib56493c92eca793f1dfb6f1cbefb32f0b4f65e89
2020-09-01 18:09:10 -04:00
Tao Sang e986f5c820 Replace private libelf with elfio
Change-Id: I4c630d78f7bf23dda85ec8480bb2790864405657
2020-08-26 12:32:13 -04:00
Saleel Kudchadker ec73340348 Add Queue profiling param and toggle for HIP
Use signal timestamps if NDRange command takes forceProfile flag.

Change-Id: Ib7f187d781fd78a7346818afb3344a9378f4c104
2020-08-06 03:09:53 -04:00
Jason Tang 8ef5da00c7 SWDEV-246687 - Do not use std::vector reference as class member cuMask_
The current implementation creates a default reference on the stack and assigns it to the class member cuMask_, so whenever the contents of the stack change, cuMask_ changes too.

Change-Id: Iefab63c335d504b83c4ae90bd34ae76c6afb8f3c
2020-08-05 16:57:36 -04:00
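
The hazard described in this commit message can be illustrated with a small sketch. The `Queue` class below is hypothetical and only mirrors the member name `cuMask_`; the problematic variant is shown as a comment because using it would be undefined behavior.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical class for illustration; not the real ROCclr queue type.
//
// Problematic pattern (kept as a comment on purpose):
//   class Queue {
//     const std::vector<std::uint32_t>& cuMask_;   // reference member
//    public:
//     Queue(const std::vector<std::uint32_t>& m = {}) : cuMask_(m) {}
//   };
// The default argument is a temporary that is destroyed at the end of the
// full-expression containing the constructor call, so cuMask_ is left dangling.

class Queue {
 public:
  explicit Queue(const std::vector<std::uint32_t>& cuMask = {}) : cuMask_(cuMask) {}
  const std::vector<std::uint32_t>& cuMask() const { return cuMask_; }

 private:
  std::vector<std::uint32_t> cuMask_;  // owned copy, independent of the caller's storage
};

// Builds a queue from a local vector that goes out of scope on return.
Queue makeQueue() {
  std::vector<std::uint32_t> local = {0xFFu, 0x0Fu};
  return Queue(local);
}

// The mask survives the source vector, and the default mask is empty.
void demo() {
  Queue q = makeQueue();
  assert(q.cuMask().size() == 2 && q.cuMask()[0] == 0xFFu);
  assert(Queue().cuMask().empty());
}
```

Storing the vector by value costs one copy at construction, which is usually a small price for removing the lifetime dependency.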
Tao Sang fdef6f722f Apply constexpr on global constant variables
When HIP_ENABLE_DEFERRED_LOADING=0, many global variables are
referenced before they have been initialized. The patch uses
constexpr to initialize global constant variables at compile
time.

Change-Id: I9d538b7abc6a0ce700ec3332b97fc144db5fc1ef
2020-07-22 22:14:13 -04:00
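
A minimal sketch of why `constexpr` helps with early references, assuming hypothetical constants `kKi` and `kMi` and a stand-in type `EarlyUser` for code that runs before `main()`. Within a single translation unit the order is well defined anyway; the benefit shows up when the reader and the constant live in different translation units, where the order of dynamic initialization is unspecified.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical constants; the names are illustrative, not from ROCclr.
constexpr std::size_t kKi = 1024;
constexpr std::size_t kMi = kKi * kKi;

// Evaluated by the compiler, so there is no runtime initializer that
// could run "too late".
static_assert(kMi == 1048576u, "kMi must be a compile-time constant");

// Stand-in for code that runs during dynamic initialization, before main().
struct EarlyUser {
  std::size_t seen;
  EarlyUser() : seen(kMi) {}
};
EarlyUser g_early;  // dynamically initialized; kMi is already usable here

// The early reader observed the fully initialized constant.
void demo() { assert(g_early.seen == 1048576u); }
```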
Alex Xie ce038f3163 SWDEV-241977 [ROCm QA] Random Soft hang observed while running TF and Caffe2 benchmarks
Change-Id: I42016c11db15411b86e7b8130d6ba557bc22dbb7
2020-07-22 02:03:48 -04:00
Jatin Chaudhary 48690f29e9 Adding AnyOrder Flag
Change-Id: I6baaef42b98adfbc8cf2605e175ec007e008045f
2020-07-22 00:25:04 -04:00
Matt Arsenault 5577eabcea Fix -Wmissing-braces
Change-Id: I2394b6923c789f36e72242f4b196844cc0ee90ba
2020-07-15 16:51:03 -04:00
German Andryeyev af1c4a5794 Disable sysmem alloc for SVM memory
Device backend is responsible for memory allocation, including
possible HMM support.

Change-Id: I0e4e5ae3b9551790f4f85f0791cca63196cc896e
2020-07-15 12:04:23 -04:00
Christophe Paquot 3d15a1e291 Make append and setLastQueuedCommand atomic
Two threads can enqueue to the same HostQueue (HostQueue::enqueue),
which can result in the last queued command being the first one to reach queue_.enqueue.

NOTE: Temporarily make setLastQueuedCommand an empty function to pass the build

Change-Id: Id09c3a28d184986f52b2ec86a2f6a18c40df1f0b
2020-07-14 18:22:45 -04:00
kjayapra-amd e993bf9f47 SWDEV-243423 - Avoid repeated metadata processing if the unbundled binary_ptr is same.
Change-Id: I71e008021b728dec61187d9ff29483ad8c4cad5c
2020-07-10 10:35:16 -04:00
kjayapra-amd 16e6b65b5c SWDEV-240165 - Move all amd::MemObjMap_ reference to ROCclr and only allow base ptr to get ipc handle.
Change-Id: I9de10a0c4ba4dee3b3c8b972966840ab807001d8
2020-07-09 21:19:45 -04:00
Tao Sang f7bf882981 Fix static lib crash by setting top init_priority
Set top init_priority on the affected global variables so that
they are created first and destroyed last.

Change-Id: Ied59fbecab66ba8195c4a7a02b6bef9fa2fad3af
2020-07-06 16:54:10 -04:00
German Andryeyev 059832b526 Return always true for P2P validation under ROCr
Change-Id: Id32a5a94a642e708d1d042c5247af38501bec153
2020-07-04 11:38:04 -04:00
German Andryeyev c18892a590 Remove extra barriers
Don't flush current batch if the dependent wait is a nop

Change-Id: I8a8722b9011fe042c1a4ce195938290fc75e7c86
2020-06-22 12:41:02 -04:00
Tao Sang 53264a8a4a Support numa policy set by user
Add CL_MEM_FOLLOW_USER_NUMA_POLICY

Change-Id: I90a19dac7641827dff2ceb9ef8ae5f3467ed87a1
2020-06-19 18:16:47 -04:00
German Andryeyev 01c2727a3a Disable P2P emulation for HIP
Some apps use P2P transfer without any validation for peer access.
Report an error if runtime has found such a request.

Change-Id: I3bf728f1fc3969697ade97bb1d2f1dce294078e2
2020-06-16 11:21:54 -04:00
Vlad Sytchenko b835120dfa Fix typo from previous change
Change-Id: Ib8f3418a3460d86d75fc5529ed6270a164e9b10e
2020-06-16 11:12:33 -04:00
Vlad Sytchenko 5b9af8f28d Fix some -Wunused-but-set-variable warnings
Change-Id: I281583b5abdfc09d5dd8b7dfb20b8821581db193
2020-06-15 17:51:01 -04:00
Vlad Sytchenko e50a9eec9d Fix -Wsequence-point warning
Change-Id: Ib6322e06f83887da4a29f8eafb99b743211e851d
2020-06-15 17:40:11 -04:00
Tao Sang db10d42e50 Make hipHostMalloc() respect hipSetDevice()
Change-Id: Ibdb666fe8dd049735df2288878501a66f7eedc28
2020-06-12 18:32:10 -04:00
German Andryeyev e4177b75bc Add missing return
Change-Id: Ibe9c1ccb377ce14ad69a0e9828ea70b707acba34
2020-06-12 17:35:45 -04:00
German Andryeyev c5afd5d412 Initial HMM support
- Expose ROCclr interfaces for HIP usage
- ROCr interfaces aren't available in staging, thus control the
build with AMD_HMM_SUPPORT define

Change-Id: Iadc2bcc230e78d3b0dc22b235189c8cc80843446
2020-06-12 09:06:07 -04:00
Aryan Salmanpour b5552aa97f Add support for setting queue priority for ROCm backend
Change-Id: I67ed5a6868af79538f7f4522d8d11c043cdf3c1e
2020-06-04 20:16:32 -04:00
German Andryeyev 44bc0cb35d Revert "Avoid lock for last queued command"
This reverts commit dc4e09a63a.

Reason for revert: <INSERT REASONING HERE>

Change-Id: Ie10442c9447f010bb90c679b6cffca5b48b8d054
2020-06-04 18:08:17 -04:00
kjayapra-amd e9bd41bf1a SWDEV-234295 - Don't clear device programs during amd::program::build()
Change-Id: I87bc7e2c830edee783ee490bbb087492467f2704
2020-06-03 12:18:25 -04:00