Commit graph

3521 commits

Author SHA1 Message Date
Chauncey Hui 0981ba433b SWDEV-2 - Change OpenCL version number from 3136 to 3137 2020-05-06 03:00:03 -04:00
Payam d6100a9547 name change vdi to rocclr
Change-Id: I856d6ac1a9a83d89715d6e33dec4aa17abc2f2f2
2020-05-06 00:54:45 -04:00
Alex Xie bfbc8cd09b SWDEV-234684 - hipMemcpy optimization does not work in tests
Change-Id: I899d172c5b2af88c796fe9a36f97d15ac45caf94
2020-05-05 15:58:03 -04:00
Saleel Kudchadker 0fbc0a895b Disable small copy optimization for now
Change-Id: Ib7a4aa676bb60940e067c985eb19070bd63b2fc2
2020-05-05 11:52:42 -04:00
kjayapra-amd 8931ac106c SWDEV-209747 - Enable DevLogs on DEBUG or DEV_LOG_ENABLE Compiletime var
Change-Id: Ie5b7855c469f03947b680d4844c1657cbae55b11
2020-05-05 09:55:54 -04:00
Chauncey Hui 339a830bc0 SWDEV-2 - Change OpenCL version number from 3135 to 3136 2020-05-05 03:00:04 -04:00
kjayapra-amd 347e36e31b SWDEV-232464 - Memory Map modules loaded via file from hipModuleLoad
Change-Id: I0e644a161c8000abe1b07fbec72de09f1c0a4b18
2020-05-04 12:40:16 -04:00
Chauncey Hui 6d0fa49c5d SWDEV-2 - Change OpenCL version number from 3134 to 3135 2020-05-01 03:00:03 -04:00
German Andryeyev 7302ebcfbc Optimize synch operations
- Stall the queue only for HSA copy operations

Change-Id: Ia3debcc0f36284c5f8cd2776d31674f3aeed04ea
2020-04-30 11:17:48 -04:00
Alex Xie 6c5a42b33c SWDEV-232894 Port hipMemcpy optimizations from HCC to VDI
Apply the optimization to change for OpenCL too.
Clean up some unnecessary checks.

Change-Id: I840261fe35baeeadeba7388e86779d482f509aad
2020-04-30 11:06:28 -04:00
Chauncey Hui 1de8abd031 SWDEV-2 - Change OpenCL version number from 3133 to 3134 2020-04-30 03:00:03 -04:00
Laurent Morichetti 9e1964ddaa Make the device binary copy optional
Device binaries that are embedded inside the host binary do not
require a copy. Their lifetime is guaranteed to exceed that of the
loaded executable.

Add a 'make_copy' parameter to amd::Program::addDeviceProgram. If
make_copy is false the original image will be used and will not
get freed when the amd::Program is destroyed.

Change-Id: I7973bb0243f5a2d1b639b8a88445cfe6af919dd7
2020-04-29 18:39:57 -04:00
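The ownership rule described in the commit above can be sketched as follows. This is only an illustration under stated assumptions: `DeviceImage` and its members are hypothetical names, not the real `amd::Program` interface, and only the `make_copy` flag itself comes from the commit message.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>

// Hypothetical holder for a device binary image; not the real amd::Program.
class DeviceImage {
 public:
  // make_copy == true:  the bytes are duplicated and owned by this object.
  // make_copy == false: the caller's buffer is only referenced and is never
  //                     freed here, so it must outlive this object.
  DeviceImage(const void* image, std::size_t size, bool make_copy)
      : data_(static_cast<const unsigned char*>(image)), size_(size) {
    assert(image != nullptr || size == 0);
    if (make_copy && size != 0) {
      owned_ = std::make_unique<unsigned char[]>(size);
      std::memcpy(owned_.get(), image, size);
      data_ = owned_.get();
    }
  }

  const unsigned char* data() const { return data_; }
  std::size_t size() const { return size_; }
  bool ownsImage() const { return owned_ != nullptr; }

 private:
  const unsigned char* data_;
  std::size_t size_;
  std::unique_ptr<unsigned char[]> owned_;  // stays empty for a borrowed image
};
```

A binary embedded in the host executable would be passed with `make_copy` set to false, since its storage lives for the whole process. A buffer the caller intends to release would be passed with `make_copy` set to true.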
Christophe Paquot b54c3f7db9 Couple of cleanups.
Remove queue limitation since we loop through HW queues now.
Add a DevLogError if we fail to create the hsa_queue. A ticket showed a regression there.

Change-Id: I4f58e405f88e75600a762f6d6352838c969cdb5e
2020-04-29 09:18:07 -07:00
Chauncey Hui 860ba6f0a1 SWDEV-2 - Change OpenCL version number from 3132 to 3133 2020-04-29 03:00:03 -04:00
Saleel Kudchadker 5f64e6e7ad Add a threshold for forcing ROCr to take blit path
This workaround is to avoid performance penalty of SDMA engine
taking a while to clock up from a lower DPM state. Add env var
GPU_FORCE_BLIT_COPY_SIZE (1024 by default for HIP in KB). Forcing
Src and Dst agent to be amdgpu makes ROCr take blit copy path for
what otherwise should have been SDMA copy

Change-Id: I222f687155f86000d17d66d25182e490b6710463
2020-04-28 17:11:24 -04:00
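A minimal sketch of the size threshold described in the commit above. The environment variable name and the 1024 KB default come from the commit message. The function names, the `CopyPath` enum, and the choice to treat a copy exactly at the threshold as a blit copy are assumptions made for illustration, not the runtime's actual code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <limits>

enum class CopyPath { Blit, Sdma };

// Default from the commit message: 1024, expressed in KB.
constexpr std::size_t kDefaultForceBlitCopySizeKB = 1024;

// Reads the threshold (in KB) from GPU_FORCE_BLIT_COPY_SIZE and falls back to
// the default when the variable is unset or is not a plain decimal number.
std::size_t forceBlitCopySizeKB() {
  const char* value = std::getenv("GPU_FORCE_BLIT_COPY_SIZE");
  if (value == nullptr || *value == '\0') return kDefaultForceBlitCopySizeKB;
  char* end = nullptr;
  const unsigned long long parsed = std::strtoull(value, &end, 10);
  if (end == value || *end != '\0') return kDefaultForceBlitCopySizeKB;
  return static_cast<std::size_t>(parsed);
}

// Hypothetical policy: copies up to the threshold take the blit path, larger
// copies are left to the SDMA engine.
CopyPath chooseCopyPath(std::size_t bytes, std::size_t threshold_kb) {
  // Guard the KB-to-bytes conversion against overflow.
  assert(threshold_kb <= std::numeric_limits<std::size_t>::max() / 1024);
  return bytes <= threshold_kb * 1024 ? CopyPath::Blit : CopyPath::Sdma;
}
```

With the default threshold, `chooseCopyPath(4096, forceBlitCopySizeKB())` selects the blit path, while a copy of several megabytes selects SDMA.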
Matt Arsenault cba7a4d20e Avoid intermediate object library
Object libraries are weird, and producing a library by using the
target objects from them doesn't automatically import the interface
properties of the linked targets. These object libraries only have
single uses, so just directly create the final library from the
sources.

Leaves libelf as an object library, since there seems to be some cmake
oddity when trying to link an unexported target to an exported one.

Change-Id: Ic379612c89340c40085c9862cfe111fa4bbff425
2020-04-28 16:41:34 -04:00
Vlad Sytchenko 2963d0d454 Add entry for another unannounced asic
Change-Id: I63c6ce6221e812a33e9427841be49840a8f48154
2020-04-28 14:23:57 -04:00
Vlad Sytchenko 63b90a32c4 Add entry for new device id
This is to accommodate upcoming Pal::AsicRevision changes.

Change-Id: Ic108b647f3548d34b7aa83d6077fb88452768998
2020-04-28 14:23:49 -04:00
agodavar f149fe0803 P2P staging buffer allocation when P2P is not enabled between all GPUs
SWDEV-232580 & SWDEV-232580
Allocate p2p staging buffer when full P2P access is not available between all devices.
p2p staging buffer will eventually be used when required.

Change-Id: If8490ba7b1c52c432c1e942ae95421b9d2ec7097
2020-04-28 07:10:57 -04:00
Chauncey Hui 27bfd2a3ee SWDEV-2 - Change OpenCL version number from 3131 to 3132 2020-04-28 03:00:02 -04:00
Alex Xie 009d0b5f55 SWDEV-232894 Port hipMemcpy optimizations from HCC to VDI
Change-Id: I6bebe9ac503a9f80d067aeea8a848409ad210338
2020-04-27 14:53:58 -04:00
German Andryeyev 082cbfa1f5 Don't attempt to reuse the cooperative queue
Change-Id: I0e98e292a562715a7b395118f899af859f3e42bb
2020-04-27 09:18:05 -04:00
Chauncey Hui d2091cc266 SWDEV-2 - Change OpenCL version number from 3130 to 3131 2020-04-25 03:00:03 -04:00
Matt Arsenault e7d6a5e5a6 Prune some unused compile definitions
There's a lot of unnecessary system configuration junk here which
isn't used, and is already available through compiler predefines. This
is also blindly placed without really checking the host architecture.

-DLINUX is unused.

-D__AMD64__ is predefined by the compiler, and is also redundant with
 __x86_64__ and ATI_BITS_64.

__x86_64__ should also be removed. It's used in libelf, but I'm not
sure if msvc predefines this or not.

-DqLittleEndian is unused, and also doesn't follow macro naming
 conventions (plus compilers have their own predefines for checking
 this).

Change-Id: I89f6fc4c88e861623be7f32df41aecbb4e9009ab
2020-04-24 12:38:42 -04:00
Matt Arsenault c60d7d860d Add comgr macros to public definition export
This should allow the cmake build for the opencl runtime to work
without manually adding these definitions. The PAL build also adds
these as private defines in its build, so change rocm to match. This
should probably be including these in a config header to benefit other
builds, but this will at least avoid some clutter in the opencl build
for now.

Change-Id: I1044984b87ba3fc72e280e255ceea2dd9e3337ff
2020-04-24 12:12:54 -04:00
Matt Arsenault 350d54e198 Don't use include_directories for ROCR includes
Use the modern cmake, target specified method.

Change-Id: Icd7196bfccb85f255bbc01bc87c6667d961bb236
2020-04-24 11:05:40 -04:00
Matt Arsenault ff12016c7b Use target_compile_definitions for HSA vs. PAL device macros
Change-Id: I7e1240cb4d32ce86948814d727a516025ee976fa
2020-04-24 11:05:16 -04:00
Matt Arsenault 815198bec9 Cleanup libelf build
Use target specific forms for define/include. Don't set
CMAKE_CXX_FLAGS for the standard, which is already implied from the
parent build.

Change-Id: I4000893376d6685e9889b66ad8451fc493020272
2020-04-24 11:04:52 -04:00
Matt Arsenault ec62f9b8de Unscreamake some cmake functions
This was already using the new lowercase style in most places.

Change-Id: I7ed04a3652c932581a2897f2fee79d79aa732f8e
2020-04-24 11:04:21 -04:00
Matt Arsenault 3c2e0f6155 Remove leftover cmake debug printing
Change-Id: I886b21717eadab6b4365ddaff063fbcd37300aa8
2020-04-24 11:03:37 -04:00
Matt Arsenault 83455f36c5 Modernize cmake usage for finding amd_comgr
Don't use find_path on the header, it's redundant with the interface
include directories on the imported target. Use the target specific
forms for including and linking it.

Change-Id: I3923143c992888ee7d5ee1130084ac2e5eaa0f3a
2020-04-24 11:03:27 -04:00
Matt Arsenault a36f19df51 Don't use CMAKE_SOURCE_DIR
This is almost never the correct thing to use since it breaks adding
this as a subproject build in a larger build. Switch to refer to
CMAKE_CURRENT_SOURCE_DIR, which is equivalent in a standalone build.

Change-Id: Ib8dbbc0668491f4227389b9a5b27da770b3bc5ce
2020-04-24 11:02:52 -04:00
Chauncey Hui bb348a463d SWDEV-2 - Change OpenCL version number from 3129 to 3130 2020-04-24 03:00:03 -04:00
German Andryeyev 89133a7301 SWDEV-232807
[ROCm][TCT][HIP] cooperative stream test case is failing.

Make sure lockXfer() in the blit manager returns a valid value.
Port the latest PAL backend logic into the ROCr backend.
This change doesn't fix the issue reported in the ticket.

Change-Id: I54101a824f49a2dcfbbf5414cb5b3af41745306d
2020-04-23 15:01:02 -04:00
Chauncey Hui a3f163be6a SWDEV-2 - Change OpenCL version number from 3128 to 3129 2020-04-23 03:00:03 -04:00
Michael LIAO 97f55b5c7f [vdi] Add device assertion support.
- Once device assertion occurs, abort the host execution as well.
- TODO: This is the initial support. As we need to drain the hostcall queue
  to ensure the device assertion message is flushed out, the hostcall
  listener needs an interface to explicitly drain its queue.

Change-Id: I8a04400aa7109bfd054ae5777c41a4abbf0db4a9
2020-04-22 10:03:55 -04:00
Chauncey Hui 43a8e929a2 SWDEV-2 - Change OpenCL version number from 3127 to 3128 2020-04-22 03:00:03 -04:00
Matt Arsenault d4a447967e Use cmake features to set c++ version
Change-Id: I2649cdca5bc68298371f770a7a624a21db3f4137
2020-04-21 16:01:30 -04:00
Matt Arsenault 356c3bf5b8 Update cmake project name
Change-Id: If4bdf15ca3774148c0c415e8cd950efa310ef62f
2020-04-21 15:23:06 -04:00
Chauncey Hui a8150cfe39 SWDEV-2 - Change OpenCL version number from 3126 to 3127 2020-04-21 03:00:03 -04:00
Payam 0b0d9c27cb adding License file
Change-Id: I21062736fe9c9b6e6c0595108421e90e6f31d8d9
2020-04-20 16:12:32 -04:00
Chauncey Hui 54fc281d6c SWDEV-2 - Change OpenCL version number from 3125 to 3126 2020-04-20 03:00:03 -04:00
kjayapra-amd 7458bf9964 SWDEV-229840 - Improve error messages on ROCCLR Layer.
Change-Id: Iab7d9156cdc206db86385aa05023a0095ed40f92
2020-04-19 20:01:49 -04:00
Chauncey Hui 4e46da4fb0 SWDEV-2 - Change OpenCL version number from 3124 to 3125 2020-04-18 03:00:03 -04:00
Matt Arsenault 55cc77d7d1 Fix -Winconsistent-missing-override warnings
Change-Id: I67d4a853045197ed28e5d616a4afc86f1d6a1d7c
2020-04-17 15:24:39 -04:00
Matt Arsenault 72c435ea35 Fix several instances of -Wsizeof-array-div
e.g.:
 warning: expression does not compute the number of elements in this
 array; element type is '__cpu_mask' (aka 'unsigned long'), not
 'uint32_t' (aka 'unsigned int') [-Wsizeof-array-div]

for (uint i = 0; i < sizeof(mask_.__bits) / sizeof(uint32_t); ++i) {

__bits is a __cpu_mask, which is a 64-bit type. These were accessed
through uint32_t pointers so the loop bound should have been
correct. These operations can be done directly on the 64-bit type so
we can leave the array size pattern, and eliminate the casts.

The case in getNextSet should probably be rephrased in terms of
__cpu_mask to avoid the pointer casting, but this is trickier than the
other cases so I used the easy option to quiet the warning.

Change-Id: I1332584fad58439ccd9d369589519a9918e1678e
2020-04-17 15:24:33 -04:00
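The array-size pattern from the commit above, sketched on a stand-in type. `CpuMask`, `kWordCount`, `setBit` and `countSet` are hypothetical names used only for illustration. With 64-bit elements, dividing `sizeof` of the array by `sizeof(uint32_t)` yields twice the element count, which is what `-Wsizeof-array-div` flags. Dividing by the size of the array's own element gives the real count and lets the loop work on whole 64-bit words without pointer casts.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Stand-in for a CPU affinity mask stored as 64-bit words (hypothetical type).
struct CpuMask {
  std::uint64_t bits[16];
};

// Element count derived from the array's own element type, so the division
// stays correct if the element width ever changes.
constexpr std::size_t kWordCount =
    sizeof(CpuMask::bits) / sizeof(CpuMask::bits[0]);

// Sets one bit, addressing the mask directly as 64-bit words.
void setBit(CpuMask& mask, std::size_t index) {
  assert(index < kWordCount * 64);
  mask.bits[index / 64] |= std::uint64_t{1} << (index % 64);
}

// Counts set bits one 64-bit word at a time.
unsigned countSet(const CpuMask& mask) {
  unsigned total = 0;
  for (std::size_t i = 0; i < kWordCount; ++i) {
    // Clear the lowest set bit until the word is empty.
    for (std::uint64_t word = mask.bits[i]; word != 0; word &= word - 1) {
      ++total;
    }
  }
  return total;
}
```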
German Andryeyev e289ebdf20 SWDEV-231691
- Problem with CL_DEVICE_GLOBAL_FREE_MEMORY_AMD query.
Check if allocated memory exceeds the total size.

Change-Id: Ieed8829860663bac1acfa41d21309dff4d8772c7
2020-04-17 09:03:04 -04:00
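One way to read the check described in the commit above: with unsigned sizes, subtracting an allocated total that exceeds the reported total would wrap around to a huge "free" value. The helper below is a hypothetical sketch of such a guard, not the code that answers the actual CL_DEVICE_GLOBAL_FREE_MEMORY_AMD query.

```cpp
#include <cstdint>

// Hypothetical helper: free memory is total minus allocated, clamped at zero
// so that an over-committed allocation count cannot underflow the result.
std::uint64_t freeMemoryBytes(std::uint64_t total_bytes,
                              std::uint64_t allocated_bytes) {
  return allocated_bytes > total_bytes ? 0 : total_bytes - allocated_bytes;
}
```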
Chauncey Hui 0318fcccf6 SWDEV-2 - Change OpenCL version number from 3123 to 3124 2020-04-17 03:00:03 -04:00
Tao Sang 02cd18813f Hide ELF APIs for internal use only
Hide the ELF APIs so that they are not mixed with an external libelf

Change-Id: I2b3d25a8ab3b161f4dd969e31ad45e8aa627263b
2020-04-16 11:47:34 -04:00
Chauncey Hui 04563c4503 SWDEV-2 - Change OpenCL version number from 3122 to 3123 2020-04-16 03:00:03 -04:00