Vlad Sytchenko
3a84fcd13e
Handle the option USE_COMGR_LIBRARY correctly
...
This is a follow-up to http://gerrit-git.amd.com/c/compute/ec/vdi/+/359563 . The setting is now either ON or OFF, never "yes".
Change-Id: I031d013a8d239dc72ef610da81bd31b8b78a3ba8
2020-06-03 17:25:47 -04:00
Tao Sang
fabfc42b68
Fix TC linux build issue due to previous Numa patch
...
Change-Id: I6068edaf38cac6fad187c8429707afdb727e8d41
2020-06-03 16:42:53 -04:00
Tao Sang
aedb9590be
Support Numa-aware cpu selection
...
Select the CPU with the smallest NUMA distance to a GPU device.
This improves the performance of hipMemcpy in
hipMemcpyHostToDevice or hipMemcpyDeviceToHost mode for small buffers.
Change-Id: I2860f1f83b79be0dff7bf5e64cf68ab4448db0a1
2020-06-01 21:01:24 -04:00
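The NUMA-aware selection described in the commit above can be illustrated with a minimal sketch. The `CpuNode` struct and `nearestCpuNode` function are hypothetical names introduced here for illustration, and the distance values are assumed to be supplied by the caller; this is not the code from the patch itself.

```cpp
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical description of one CPU NUMA node as seen from a GPU device.
struct CpuNode {
  int id;                  // NUMA node id (illustrative)
  std::uint32_t distance;  // NUMA distance from the GPU to this node
};

// Returns the id of the node with the smallest distance, or -1 if the
// list is empty. Ties resolve to the first node encountered.
int nearestCpuNode(const std::vector<CpuNode>& nodes) {
  int best = -1;
  std::uint32_t bestDistance = std::numeric_limits<std::uint32_t>::max();
  for (const CpuNode& node : nodes) {
    if (best == -1 || node.distance < bestDistance) {
      best = node.id;
      bestDistance = node.distance;
    }
  }
  return best;
}
```

A caller would pass one entry per CPU node and bind the host-side copy thread to the returned node; how the binding is done is outside the scope of this sketch.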
German Andryeyev
fb401bfe6d
Revert "Revert "Reenable cooperative groups""
...
This reverts commit abc115bda8 .
Reason for revert: <INSERT REASONING HERE>
Change-Id: I93c45fae27e0a08b199542d44fb0d65fc74ea13c
2020-05-25 14:11:58 -04:00
kjayapra-amd
3603272d24
SWDEV-237467 - Adding return type to comply with -Wall compiler flag.
...
Change-Id: I3c9935105ef262cdbf1c8ee293930b018be0197a
2020-05-23 14:22:54 -04:00
kjayapra-amd
ea0137fb22
FEAT-30761 - Fixing fall through in gfxip major/minor detection.
...
Change-Id: Ib97b3dbe993e01df3360cbeda6bd0d9d366535b6
2020-05-23 12:42:44 -04:00
kjayapra-amd
53a890b499
SWDEV-237467 - Return proper HIP error codes in case of ROCclr IPC API failures.
...
Change-Id: I1d018918ed71f6d80846b3017f7a15f4ab496554
2020-05-22 22:10:15 -04:00
kjayapra-amd
618d66b5fe
SWDEV-236110 - Fixing uninitialized variable.
...
Change-Id: I26a2a6826da643b57da9746e3ce888a46c4e78f4
2020-05-22 20:40:24 -04:00
kjayapra-amd
32043017ed
SWDEV-229840 - Remove false error messages.
...
Change-Id: I0346768a2a52913d5330bc2007a7706e2a439c47
2020-05-22 18:18:41 -04:00
Aryan Salmanpour
fec4adfd19
check for valid queue before accessing cuMask()
...
Change-Id: I8d4b0dbcd097c2ec5c31dea5a3d0060f0864a7e8
2020-05-20 16:23:09 -04:00
German Andryeyev
f56a052243
Add missing memory allocation in printf
...
Change-Id: I452b676612b54f70106e7ef1bcb5ce2baf7b3ffc
2020-05-20 14:49:59 -04:00
Alex Xie
966448c53b
Fix compile error in certain version of GCC
...
Change-Id: I27f021db908bf114a685427a47cd9f0d6b2e5693
2020-05-20 13:13:55 -04:00
German Andryeyev
2ce6bbebc4
Fix async mem clear
...
Optimization for the fence release removed a sync for mem fill.
Add simple const buffer management for the filled pattern to avoid
pattern overwriting with the async fills.
Change-Id: I63773ac09ceec31d5396d24570e4647ff096326b
2020-05-20 11:13:41 -04:00
Chauncey Hui
0af9c06968
Modified IpcDetach to return status instead of void.
...
Change-Id: I68ed94b93f0383babe25eb046b4047d249a0fdc1
2020-05-20 03:38:21 -04:00
Matt Arsenault
1d267c9c08
Remove include/ from #includes
...
These are unnecessary and an obstacle to producing a relocatable
package.
Change-Id: I0059bf7a2d11fcece0cd7ab47d7545d0df4d7099
2020-05-19 19:35:09 -04:00
Aakash Sudhanwa
abc115bda8
Revert "Reenable cooperative groups"
...
This reverts commit 82dc1a6343 .
Reason for revert: <INSERT REASONING HERE>
Change-Id: I8954b37c354382804a139d80e2551c381fd9b2ed
2020-05-19 18:21:48 -04:00
Jason Tang
cd2a713d63
Add major/minor/stepping to device layer
...
Change-Id: If82ea55a46b166b243a98089a6e9c40ccfdb479f
2020-05-17 12:57:34 -04:00
Jason Tang
9b5e1fec6c
Correct the way to get subtarget
...
Change-Id: I47805424c0bd69547cff0ab71c369552016052b5
2020-05-15 17:18:39 -04:00
Aryan Salmanpour
fed94b8604
Add support for setting CU mask on ROCclr for ROCm backend
...
Change-Id: I0dbe2eeb33467fc0f24b26929119c10e9b455da7
2020-05-15 14:23:43 -04:00
German Andryeyev
82dc1a6343
Reenable cooperative groups
...
Change-Id: Ia43049ef550bffa6d21704dbd306ddb9c1d56af0
2020-05-15 12:41:12 -04:00
Christophe Paquot
6a5af4056e
Use system scope for packet following sdma copies
...
SWDEV-234947
SWDEV-236298
Instead of forcing a barrier packet, just inject system scope on the next packet.
Change-Id: If9bcee23e08dfe5db731235e2fcb30582cbd4c1c
2020-05-15 12:20:06 -04:00
Matt Arsenault
3624b8df16
Fix missing target includes for GL/EGL headers
...
Change-Id: I9a31eae40cb7187dd0264ad5b9577fab96464b41
2020-05-14 16:56:34 -04:00
Matt Arsenault
3a7f2e3682
Improve usage of target_include_directories
...
Eliminates most of the global include_directories. The install header
paths are different from the build directory, so we have to separate
those for the exported target include paths.
Change-Id: I13e4c56c1218cb31c29a316422dc5fd1d09d8b1b
2020-05-13 17:25:58 -04:00
Vlad Sytchenko
614aaa8409
Load versioned comgr library
...
Change-Id: I4cc81f33e6889ac81a82747159bc210256f33c21
2020-05-13 16:46:35 -04:00
German Andryeyev
8904848abc
Set CPU access flag for SVM
...
Make sure all GPUs have CPU access flag for the fine grain buffer.
Change-Id: Ifc843c2807e70a271b269192ae7859205ff458f3
2020-05-13 16:05:46 -04:00
German Andryeyev
d2b9a57c4f
Disable cooperative groups support
...
Change-Id: I1b526f2228d083ecad7907a6eaf37c1dd4428277
2020-05-12 14:31:10 -04:00
Saleel Kudchadker
d10d691e76
Add env var to toggle large bar support in runtime
...
Use ROC_ENABLE_LARGE_BAR (0/1) to toggle. The support is
enabled by default.
Change-Id: I6cb93a46594cb6f5e90bf6057738330225efb553
2020-05-12 13:20:06 -04:00
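A 0/1 environment toggle of the kind described in the commit above could be read as in the sketch below. Only the variable name ROC_ENABLE_LARGE_BAR and its enabled-by-default behavior come from the commit message; `parseToggle` and `largeBarEnabled` are hypothetical helpers, and the runtime's real flag handling may differ.

```cpp
#include <cstdlib>

// Interprets a 0/1 style toggle string. A missing or empty value keeps
// the supplied default.
bool parseToggle(const char* value, bool defaultValue) {
  if (value == nullptr || *value == '\0') return defaultValue;
  // strtol yields 0 for "0" and for non-numeric text, so both cases
  // disable the feature in this sketch.
  return std::strtol(value, nullptr, 10) != 0;
}

// Reads ROC_ENABLE_LARGE_BAR; enabled by default, as the commit states.
bool largeBarEnabled() {
  return parseToggle(std::getenv("ROC_ENABLE_LARGE_BAR"), true);
}
```

Separating the parsing from the `getenv` call keeps the parsing logic testable without modifying the process environment.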
Jason Tang
b4f1239f34
device/rocm: split gfxVersion to major/minor/stepping
...
Change-Id: I1e437eaee30794147713d9516229211670f01d90
2020-05-12 12:17:13 -04:00
German Andryeyev
ae4aceb55e
Make sure the list of HSA agents is valid
...
If HIP_VISIBLE_DEVICES is active, then make sure the list of HSA
agents contains the valid agents
Change-Id: I584aad999a230ab7f88a0cfe20dcd0abe79c43a5
2020-05-11 15:49:30 -04:00
Christophe Paquot
3ed185307e
Fix cooperative flag for hsa_queue creation in case they're not available
...
SWDEV-233766
Change-Id: If410ecfed61f2b3bb50b847cf2ededc573139494
2020-05-11 13:40:50 -04:00
Christophe Paquot
2a02026696
Add gpu().hasPendingDispatch() in the SDMA path
...
SWDEV-234947
Change-Id: I8aa501f8755d136708b0d12ee3c30229c238660d
2020-05-08 18:19:51 -04:00
Michael LIAO
12fcfee41d
Fix build failure.
...
- Also fix `-Wreorder` warning. NFC.
Change-Id: I766fdc622c9107f901a55498bdc8fef3d821d1b7
2020-05-07 10:39:10 -04:00
Michael LIAO
503ef06555
Clear executable permission.
...
Change-Id: Ia0d363b1ba89d7947e5b5a55cb67edba86f0515e
2020-05-07 10:38:58 -04:00
German Andryeyev
3446d4e638
Switch PAL version to 592
...
Change-Id: I7e90b8fd55c57d8d49e4ec1273ab671f96197bae
2020-05-06 14:51:32 -04:00
Payam
d6100a9547
name change vdi to rocclr
...
Change-Id: I856d6ac1a9a83d89715d6e33dec4aa17abc2f2f2
2020-05-06 00:54:45 -04:00
Alex Xie
bfbc8cd09b
SWDEV-234684 - hipmemcpy optimization does not work in tests
...
Change-Id: I899d172c5b2af88c796fe9a36f97d15ac45caf94
2020-05-05 15:58:03 -04:00
Saleel Kudchadker
0fbc0a895b
Disable small copy optimization for now
...
Change-Id: Ib7a4aa676bb60940e067c985eb19070bd63b2fc2
2020-05-05 11:52:42 -04:00
German Andryeyev
7302ebcfbc
Optimize synch operations
...
- Stall the queue only for HSA copy operations
Change-Id: Ia3debcc0f36284c5f8cd2776d31674f3aeed04ea
2020-04-30 11:17:48 -04:00
Alex Xie
6c5a42b33c
SWDEV-232894 Port hipMemcpy optimizations from HCC to VDI
...
Apply the optimization change to OpenCL too.
Clean up some unnecessary checks.
Change-Id: I840261fe35baeeadeba7388e86779d482f509aad
2020-04-30 11:06:28 -04:00
Christophe Paquot
b54c3f7db9
Couple of cleanups.
...
Remove queue limitation since we loop through HW queues now.
Add a DevLogError if we fail to create the hsa_queue. A ticket showed a regression there.
Change-Id: I4f58e405f88e75600a762f6d6352838c969cdb5e
2020-04-29 09:18:07 -07:00
Saleel Kudchadker
5f64e6e7ad
Add a threshold for forcing ROCr to take blit path
...
This workaround avoids the performance penalty of the SDMA engine
taking a while to clock up from a lower DPM state. Add env var
GPU_FORCE_BLIT_COPY_SIZE (in KB; 1024 by default for HIP). Forcing
the Src and Dst agents to be amdgpu makes ROCr take the blit copy path
for what would otherwise have been an SDMA copy.
Change-Id: I222f687155f86000d17d66d25182e490b6710463
2020-04-28 17:11:24 -04:00
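The threshold check described in the commit above might look roughly like the following sketch. `shouldForceBlit` is a hypothetical helper; the inclusive comparison and the treatment of 0 as "never force" are assumptions, not behavior confirmed by the commit message.

```cpp
#include <cstddef>

// Hypothetical helper: true when a copy is small enough to be forced
// onto the blit path. thresholdKB mirrors GPU_FORCE_BLIT_COPY_SIZE,
// which is expressed in KB. A threshold of 0 is assumed to disable
// forcing entirely.
bool shouldForceBlit(std::size_t copyBytes, std::size_t thresholdKB) {
  if (thresholdKB == 0) return false;
  return copyBytes <= thresholdKB * 1024;
}
```

With the HIP default of 1024 KB, a 4 KB copy would take the blit path under these assumptions, while a copy larger than 1 MB would be left to SDMA.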
Vlad Sytchenko
2963d0d454
Add entry for another unannounced asic
...
Change-Id: I63c6ce6221e812a33e9427841be49840a8f48154
2020-04-28 14:23:57 -04:00
Vlad Sytchenko
63b90a32c4
Add entry for new device id
...
This is to accommodate upcoming Pal::AsicRevision changes.
Change-Id: Ic108b647f3548d34b7aa83d6077fb88452768998
2020-04-28 14:23:49 -04:00
agodavar
f149fe0803
P2P staging buffer allocation when P2P is not enabled between all GPUs
...
SWDEV-232580 & SWDEV-232580
Allocate a P2P staging buffer when full P2P access is not available between all devices.
The P2P staging buffer will be used only when required.
Change-Id: If8490ba7b1c52c432c1e942ae95421b9d2ec7097
2020-04-28 07:10:57 -04:00
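The allocation condition in the commit above (a staging buffer is needed whenever some pair of devices lacks direct P2P access) can be sketched as follows. The access matrix and the `needsP2PStagingBuffer` name are hypothetical; how the runtime actually queries peer access is not shown here.

```cpp
#include <cstddef>
#include <vector>

// access[i][j] is true when device i can directly access device j's
// memory (hypothetical representation). Returns true when at least one
// pair of distinct devices lacks direct access, in which case a staging
// buffer would be needed.
bool needsP2PStagingBuffer(const std::vector<std::vector<bool>>& access) {
  for (std::size_t i = 0; i < access.size(); ++i) {
    for (std::size_t j = 0; j < access[i].size(); ++j) {
      if (i != j && !access[i][j]) return true;
    }
  }
  return false;
}
```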
Alex Xie
009d0b5f55
SWDEV-232894 Port hipMemcpy optimizations from HCC to VDI
...
Change-Id: I6bebe9ac503a9f80d067aeea8a848409ad210338
2020-04-27 14:53:58 -04:00
German Andryeyev
082cbfa1f5
Don't attempt to reuse the cooperative queue
...
Change-Id: I0e98e292a562715a7b395118f899af859f3e42bb
2020-04-27 09:18:05 -04:00
Matt Arsenault
c60d7d860d
Add comgr macros to public definition export
...
This should allow the cmake build for the opencl runtime to work
without manually adding these definitions. The PAL build also adds
these as private defines in its build, so change rocm to match. This
should probably include these in a config header to benefit other
builds, but this will at least avoid some clutter in the opencl build
for now.
Change-Id: I1044984b87ba3fc72e280e255ceea2dd9e3337ff
2020-04-24 12:12:54 -04:00
Matt Arsenault
350d54e198
Don't use include_directories for ROCR includes
...
Use the modern cmake, target specified method.
Change-Id: Icd7196bfccb85f255bbc01bc87c6667d961bb236
2020-04-24 11:05:40 -04:00
Matt Arsenault
83455f36c5
Modernize cmake usage for finding amd_comgr
...
Don't use find_path on the header, it's redundant with the interface
include directories on the imported target. Use the target specific
forms for including and linking it.
Change-Id: I3923143c992888ee7d5ee1130084ac2e5eaa0f3a
2020-04-24 11:03:27 -04:00
Matt Arsenault
a36f19df51
Don't use CMAKE_SOURCE_DIR
...
This is almost never the correct thing to use since it breaks adding
this as a subproject build in a larger build. Switch to refer to
CMAKE_CURRENT_SOURCE_DIR, which is equivalent in a standalone build.
Change-Id: Ib8dbbc0668491f4227389b9a5b27da770b3bc5ce
2020-04-24 11:02:52 -04:00