Commit graph

29 Commits

Autor SHA1 Nachricht Datum
Tao Sang fabfc42b68 Fix TC linux build issue due to previous Numa patch
Change-Id: I6068edaf38cac6fad187c8429707afdb727e8d41
2020-06-03 16:42:53 -04:00
Tao Sang aedb9590be Support Numa-aware cpu selection
Select cpu in terms of the smallest Numa distance for a GPU device.
This will improve performance of hipMemcpy in the mode of
hipMemcpyHostToDevice or hipMemcpyDeviceToHost for small buffer.
`

Change-Id: I2860f1f83b79be0dff7bf5e64cf68ab4448db0a1
2020-06-01 21:01:24 -04:00
Aryan Salmanpour fec4adfd19 check for valid queue before accessing cuMask()
Change-Id: I8d4b0dbcd097c2ec5c31dea5a3d0060f0864a7e8
2020-05-20 16:23:09 -04:00
Chauncey Hui 0af9c06968 Modified IpcDetach to return status instead of void.
Change-Id: I68ed94b93f0383babe25eb046b4047d249a0fdc1
2020-05-20 03:38:21 -04:00
Jason Tang cd2a713d63 Add major/minor/stepping to device layer
Change-Id: If82ea55a46b166b243a98089a6e9c40ccfdb479f
2020-05-17 12:57:34 -04:00
Aryan Salmanpour fed94b8604 Add support for setting CU mask on ROCclr for ROCm backend
Change-Id: I0dbe2eeb33467fc0f24b26929119c10e9b455da7
2020-05-15 14:23:43 -04:00
Saleel Kudchadker d10d691e76 Add env var to toggle large bar support in runtime
Use ROC_ENABLE_LARGE_BAR (0/1) to toggle. The support is
enabled by default.

Change-Id: I6cb93a46594cb6f5e90bf6057738330225efb553
2020-05-12 13:20:06 -04:00
Jason Tang b4f1239f34 device/rocm: split gfxVersion to major/minor/stepping
Change-Id: I1e437eaee30794147713d9516229211670f01d90
2020-05-12 12:17:13 -04:00
German Andryeyev ae4aceb55e Make sure the list of HSA agents is valid
If HIP_VISIBLE_DEVICES is active, then make sure the list of HSA
agents contains the valid agents

Change-Id: I584aad999a230ab7f88a0cfe20dcd0abe79c43a5
2020-05-11 15:49:30 -04:00
Christophe Paquot 3ed185307e Fix cooperative flag for hsa_queue creation in case they're not available
SWDEV-233766

Change-Id: If410ecfed61f2b3bb50b847cf2ededc573139494
2020-05-11 13:40:50 -04:00
Michael LIAO 503ef06555 Clear executable permission.
Change-Id: Ia0d363b1ba89d7947e5b5a55cb67edba86f0515e
2020-05-07 10:38:58 -04:00
Alex Xie bfbc8cd09b SWDEV-234684 - hipmemcpy optimization does not work in tests
Change-Id: I899d172c5b2af88c796fe9a36f97d15ac45caf94
2020-05-05 15:58:03 -04:00
Saleel Kudchadker 0fbc0a895b Disable small copy optimization for now
Change-Id: Ib7a4aa676bb60940e067c985eb19070bd63b2fc2
2020-05-05 11:52:42 -04:00
Alex Xie 6c5a42b33c SWDEV-232894 Port hipMemcpy optimizations from HCC to VDI
Apply the optimization to change for OpenCL too.
Clean up some unnecessary checks.

Change-Id: I840261fe35baeeadeba7388e86779d482f509aad
2020-04-30 11:06:28 -04:00
Christophe Paquot b54c3f7db9 Couple of cleanups.
Remove queue limitation since we loop through HW queues now.
Add a DevLogError if we fail to create the hsa_queue. A ticket showed a regression there.

Change-Id: I4f58e405f88e75600a762f6d6352838c969cdb5e
2020-04-29 09:18:07 -07:00
agodavar f149fe0803 P2PStating buffer allocation when P2P is not enabled between all GPUs
SWDEV-232580 & SWDEV-232580
Allocate p2p statging buffer when full P2P access is not available between all devices.
p2p staging buffer will eventually be used when required.

Change-Id: If8490ba7b1c52c432c1e942ae95421b9d2ec7097
2020-04-28 07:10:57 -04:00
Alex Xie 009d0b5f55 SWDEV-232894 Port hipMemcpy optimizations from HCC to VDI
Change-Id: I6bebe9ac503a9f80d067aeea8a848409ad210338
2020-04-27 14:53:58 -04:00
German Andryeyev 082cbfa1f5 Don't attempt to reuse the cooperative queue
Change-Id: I0e98e292a562715a7b395118f899af859f3e42bb
2020-04-27 09:18:05 -04:00
Michael LIAO 97f55b5c7f [vdi] Add device assertion support.
- Once device assertion occurs, abort the host execution as well.
- TODO: This's the initial support. As we need to drain hostcall queue
  to ensure device assertion message being flushed out, hostcall
  listener needs an interface to explicitly drain its queue.

Change-Id: I8a04400aa7109bfd054ae5777c41a4abbf0db4a9
2020-04-22 10:03:55 -04:00
kjayapra-amd 7458bf9964 SWDEV-229840 - Improve error messages on ROCCLR Layer.
Change-Id: Iab7d9156cdc206db86385aa05023a0095ed40f92
2020-04-19 20:01:49 -04:00
German Andryeyev 481d526859 SWDEV-184709 - support hipLaunchCooperativeKernel()
- Enable cooperative groups support, based on ROCr capability

Change-Id: I975bcea0af7865009eaed24454ce71d897ea8fc4
2020-04-01 12:13:33 -04:00
German Andryeyev 7ef8dfdfe7 SWDEV-184709 - support hipLaunchCooperativeKernel()
Add ROCr cooperative queue allocation

Change-Id: I1384482692f4080d31255b09e0f68a21ccad3da8
2020-03-30 16:09:09 -04:00
Vladislav Sytchenko 52046e41b2 SWDEV-224023
Correct typo.

Change-Id: I72131a6e0210e7b961e586cd0ae18608d21fc529
2020-03-15 16:37:25 -04:00
Vladislav Sytchenko e76d867740 SWDEV-224023
Each WGP consists of 2 CU, so the number of available SIMD units is doubled.

Change-Id: I43978a8a9139c33f5f776b344a36bee927cc187d
2020-03-06 13:43:36 -05:00
Saleel Kudchadker 0730b39adb Implement HIP_HIDDEN_FREE_MEM env var
Set value to 256Mb to reflect what HIP/HCC reserves
Change-Id: Icaadf79f60d3916965ac168da237d15b975b1fe4
2020-02-14 12:57:11 -05:00
Karthik Jayaprakash 7fb53890b8 SWDEV-210443 - For Numa nodes pick up the CPU that has Memory pool.
Change-Id: If52852b6f12053e4dfe8a83b8aa5743137c3d6dc
2020-02-13 20:48:37 -05:00
Laurent Morichetti d9d9c69399 Replace cl_* integral types with standard types.
cl_bool -> bool
cl_int -> int32_t
cl_uint -> uint32_t
cl_long -> int64_t
cl_ulong -> uint64_t
cl_float -> float
cl_double -> double
cl_bitfield -> uint64_t

Change-Id: I840c8993b55f98f5b745d21e27f5f28233647a58
2020-02-12 13:16:06 -08:00
Laurent Morichetti b4c6143a2f Update copyright info
Change-Id: Ia4f9ff0f5f873b4223a8cca154188bb0d2f1abba
2020-02-04 09:26:14 -08:00
Laurent Morichetti 20c7173849 Merge branch 'origin/pghafari/vdi-prototype' into lmoriche/amd-master
Change-Id: Id3b833d405596735becb3346f3b08c6da57033fe
2020-01-30 20:12:13 -08:00