Tao Sang
fabfc42b68
Fix TC linux build issue due to previous Numa patch
...
Change-Id: I6068edaf38cac6fad187c8429707afdb727e8d41
2020-06-03 16:42:53 -04:00
Tao Sang
aedb9590be
Support Numa-aware cpu selection
...
Select the CPU with the smallest NUMA distance to a GPU device.
This improves the performance of hipMemcpy in hipMemcpyHostToDevice
or hipMemcpyDeviceToHost mode for small buffers.
Change-Id: I2860f1f83b79be0dff7bf5e64cf68ab4448db0a1
2020-06-01 21:01:24 -04:00
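The selection rule in the commit above (pick the CPU with the smallest NUMA distance to a GPU) can be sketched as below. This is only an illustration under stated assumptions, not the ROCclr code: `pickNearestCpu` is a hypothetical name, and the distances are assumed to arrive as a plain per-CPU-node table rather than being queried from the runtime.

```cpp
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical helper (not from the patch): distanceToGpu[i] is the assumed
// NUMA distance between CPU node i and one GPU device.
// Returns the index of the nearest CPU node, or -1 if the table is empty.
// Ties resolve to the lowest index.
int pickNearestCpu(const std::vector<int>& distanceToGpu) {
  int best = -1;
  int bestDistance = std::numeric_limits<int>::max();
  for (std::size_t i = 0; i < distanceToGpu.size(); ++i) {
    if (distanceToGpu[i] < bestDistance) {
      bestDistance = distanceToGpu[i];
      best = static_cast<int>(i);
    }
  }
  return best;
}
```

For a table such as `{20, 10, 40}`, the sketch picks CPU node 1, the one closest to the GPU.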
Aryan Salmanpour
fec4adfd19
check for valid queue before accessing cuMask()
...
Change-Id: I8d4b0dbcd097c2ec5c31dea5a3d0060f0864a7e8
2020-05-20 16:23:09 -04:00
Chauncey Hui
0af9c06968
Modified IpcDetach to return status instead of void.
...
Change-Id: I68ed94b93f0383babe25eb046b4047d249a0fdc1
2020-05-20 03:38:21 -04:00
Jason Tang
cd2a713d63
Add major/minor/stepping to device layer
...
Change-Id: If82ea55a46b166b243a98089a6e9c40ccfdb479f
2020-05-17 12:57:34 -04:00
Aryan Salmanpour
fed94b8604
Add support for setting CU mask on ROCclr for ROCm backend
...
Change-Id: I0dbe2eeb33467fc0f24b26929119c10e9b455da7
2020-05-15 14:23:43 -04:00
Saleel Kudchadker
d10d691e76
Add env var to toggle large bar support in runtime
...
Use ROC_ENABLE_LARGE_BAR (0/1) to toggle. The support is
enabled by default.
Change-Id: I6cb93a46594cb6f5e90bf6057738330225efb553
2020-05-12 13:20:06 -04:00
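A 0/1 toggle that is enabled by default, as described for ROC_ENABLE_LARGE_BAR above, might be read along these lines. This is a minimal sketch, not the runtime's implementation: `parseFlag` and `envFlag` are hypothetical helpers, and treating any value other than "0" as enabled is an assumption, since the commit only specifies 0 and 1.

```cpp
#include <cstdlib>
#include <cstring>

// Hypothetical helper: interpret a 0/1 setting. An unset or empty value
// falls back to defaultValue. Assumption: anything other than "0" enables.
bool parseFlag(const char* value, bool defaultValue) {
  if (value == nullptr || *value == '\0') return defaultValue;
  return std::strcmp(value, "0") != 0;
}

// Hypothetical helper: read the flag from the process environment.
bool envFlag(const char* name, bool defaultValue) {
  return parseFlag(std::getenv(name), defaultValue);
}
```

With this sketch, `envFlag("ROC_ENABLE_LARGE_BAR", true)` stays true unless the variable is explicitly set to 0.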
Jason Tang
b4f1239f34
device/rocm: split gfxVersion to major/minor/stepping
...
Change-Id: I1e437eaee30794147713d9516229211670f01d90
2020-05-12 12:17:13 -04:00
German Andryeyev
ae4aceb55e
Make sure the list of HSA agents is valid
...
If HIP_VISIBLE_DEVICES is active, then make sure the list of HSA
agents contains the valid agents
Change-Id: I584aad999a230ab7f88a0cfe20dcd0abe79c43a5
2020-05-11 15:49:30 -04:00
Christophe Paquot
3ed185307e
Fix cooperative flag for hsa_queue creation in case they're not available
...
SWDEV-233766
Change-Id: If410ecfed61f2b3bb50b847cf2ededc573139494
2020-05-11 13:40:50 -04:00
Michael LIAO
503ef06555
Clear executable permission.
...
Change-Id: Ia0d363b1ba89d7947e5b5a55cb67edba86f0515e
2020-05-07 10:38:58 -04:00
Alex Xie
bfbc8cd09b
SWDEV-234684 - hipmemcpy optimization does not work in tests
...
Change-Id: I899d172c5b2af88c796fe9a36f97d15ac45caf94
2020-05-05 15:58:03 -04:00
Saleel Kudchadker
0fbc0a895b
Disable small copy optimization for now
...
Change-Id: Ib7a4aa676bb60940e067c985eb19070bd63b2fc2
2020-05-05 11:52:42 -04:00
Alex Xie
6c5a42b33c
SWDEV-232894 Port hipMemcpy optimizations from HCC to VDI
...
Apply the optimization to OpenCL too.
Clean up some unnecessary checks.
Change-Id: I840261fe35baeeadeba7388e86779d482f509aad
2020-04-30 11:06:28 -04:00
Christophe Paquot
b54c3f7db9
Couple of cleanups.
...
Remove queue limitation since we loop through HW queues now.
Add a DevLogError if we fail to create the hsa_queue. A ticket showed a regression there.
Change-Id: I4f58e405f88e75600a762f6d6352838c969cdb5e
2020-04-29 09:18:07 -07:00
agodavar
f149fe0803
P2P staging buffer allocation when P2P is not enabled between all GPUs
...
SWDEV-232580
Allocate the P2P staging buffer when full P2P access is not available between all devices.
The P2P staging buffer will be used when required.
Change-Id: If8490ba7b1c52c432c1e942ae95421b9d2ec7097
2020-04-28 07:10:57 -04:00
Alex Xie
009d0b5f55
SWDEV-232894 Port hipMemcpy optimizations from HCC to VDI
...
Change-Id: I6bebe9ac503a9f80d067aeea8a848409ad210338
2020-04-27 14:53:58 -04:00
German Andryeyev
082cbfa1f5
Don't attempt to reuse the cooperative queue
...
Change-Id: I0e98e292a562715a7b395118f899af859f3e42bb
2020-04-27 09:18:05 -04:00
Michael LIAO
97f55b5c7f
[vdi] Add device assertion support.
...
- Once a device assertion occurs, abort the host execution as well.
- TODO: This is the initial support. As we need to drain the hostcall queue
to ensure the device assertion message is flushed out, the hostcall
listener needs an interface to explicitly drain its queue.
Change-Id: I8a04400aa7109bfd054ae5777c41a4abbf0db4a9
2020-04-22 10:03:55 -04:00
kjayapra-amd
7458bf9964
SWDEV-229840 - Improve error messages on ROCCLR Layer.
...
Change-Id: Iab7d9156cdc206db86385aa05023a0095ed40f92
2020-04-19 20:01:49 -04:00
German Andryeyev
481d526859
SWDEV-184709 - support hipLaunchCooperativeKernel()
...
- Enable cooperative groups support, based on ROCr capability
Change-Id: I975bcea0af7865009eaed24454ce71d897ea8fc4
2020-04-01 12:13:33 -04:00
German Andryeyev
7ef8dfdfe7
SWDEV-184709 - support hipLaunchCooperativeKernel()
...
Add ROCr cooperative queue allocation
Change-Id: I1384482692f4080d31255b09e0f68a21ccad3da8
2020-03-30 16:09:09 -04:00
Vladislav Sytchenko
52046e41b2
SWDEV-224023
...
Correct typo.
Change-Id: I72131a6e0210e7b961e586cd0ae18608d21fc529
2020-03-15 16:37:25 -04:00
Vladislav Sytchenko
e76d867740
SWDEV-224023
...
Each WGP consists of 2 CU, so the number of available SIMD units is doubled.
Change-Id: I43978a8a9139c33f5f776b344a36bee927cc187d
2020-03-06 13:43:36 -05:00
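The arithmetic behind the commit above can be written out as a small sketch. Only the factor of 2 CUs per WGP comes from the commit message. The function name and the `simdPerCu` parameter are hypothetical, and the actual per-CU SIMD count depends on the hardware.

```cpp
#include <cstdint>

// Stated in the commit message: each WGP consists of 2 CUs.
constexpr std::uint32_t kCusPerWgp = 2;

// Hypothetical helper: total SIMD units when the device reports WGPs.
// simdPerCu is an assumed, hardware-dependent input.
constexpr std::uint32_t simdUnits(std::uint32_t wgpCount, std::uint32_t simdPerCu) {
  return wgpCount * kCusPerWgp * simdPerCu;
}
```

Counting a WGP as a single CU would give `wgpCount * simdPerCu`, which is half of this value.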
Saleel Kudchadker
0730b39adb
Implement HIP_HIDDEN_FREE_MEM env var
...
Set value to 256 MB to reflect what HIP/HCC reserves
Change-Id: Icaadf79f60d3916965ac168da237d15b975b1fe4
2020-02-14 12:57:11 -05:00
Karthik Jayaprakash
7fb53890b8
SWDEV-210443 - For NUMA nodes, pick the CPU that has a memory pool.
...
Change-Id: If52852b6f12053e4dfe8a83b8aa5743137c3d6dc
2020-02-13 20:48:37 -05:00
Laurent Morichetti
d9d9c69399
Replace cl_* integral types with standard types.
...
cl_bool -> bool
cl_int -> int32_t
cl_uint -> uint32_t
cl_long -> int64_t
cl_ulong -> uint64_t
cl_float -> float
cl_double -> double
cl_bitfield -> uint64_t
Change-Id: I840c8993b55f98f5b745d21e27f5f28233647a58
2020-02-12 13:16:06 -08:00
Laurent Morichetti
b4c6143a2f
Update copyright info
...
Change-Id: Ia4f9ff0f5f873b4223a8cca154188bb0d2f1abba
2020-02-04 09:26:14 -08:00
Laurent Morichetti
20c7173849
Merge branch 'origin/pghafari/vdi-prototype' into lmoriche/amd-master
...
Change-Id: Id3b833d405596735becb3346f3b08c6da57033fe
2020-01-30 20:12:13 -08:00