نمودار کامیت

42 کامیت‌ها

مولف SHA1 پیام تاریخ
German Andryeyev fca05eae5f Avoid null stream allocation in hipFree
- Add option to skip stream allocation on access.
- Avoid null stream allocation in ihipFree, so an inactive device
won't be initialized

Change-Id: Id24426640df59a5e7a08b2dd9dcd4d67758b84bf
2020-05-04 17:17:36 -04:00
kjayapra-amd 8d6478a051 SWDEV-231874 - Make hipMemset patten size 8byte aligned when possible
Change-Id: Ida98bd89212af9b00f3c9c7c5d22ae81f3b5396a
2020-05-01 23:04:04 -04:00
Vlad Sytchenko 518be44661 Handle copy cases where the uses pases a bad ptr
Change-Id: I4490b8519e4d0dde25b845f9ca7e81c1d80d4f28
2020-04-28 11:04:29 -04:00
kjayapra-amd 19c654e794 SWDEV-232008 - Handle cases where Memcpy size is less than Symbol Size.
Change-Id: I046bc70445dfd4f6fdfd06415f53d1fc06f425b4
2020-04-27 10:47:28 -04:00
Michael LIAO f8062c6fbd [vdi] Revise the symbol management.
- As different modules may have symbols with the same name, each symbol
  needs identifying with a pair of the module handle and the symbol
  name.

Change-Id: I85650a787d9a424545154cc40ebd59e706fa358f
2020-04-24 22:28:31 -04:00
German Andryeyev 5a7c094729 SWDEV-232918
hipEventRecord is much slower in hipclang/vdi

- Make sure default streams don't sync each other.
- Add null stream into the list of default streams.
- Code clean-up to simplify queue look-up.

Change-Id: I36e1fc8d86a600e3dce806694d95d146ed8afd03
2020-04-24 14:40:43 -04:00
kjayapra-amd 92c63384cb SWDEV - 231874 - Do Dword aligned memset if the total size aligns.
Change-Id: Id05db4cfd9c43b2cffa3cec8b02f1cd07f340dd6
2020-04-23 21:08:50 -04:00
German Andryeyev 21840ac6a0 SWDEV-231579 - [hipclang-vdi-rocm][perf]
- HIPPerfDispatchSpeed disparity between HIP/HCC vs HIP/VDI
Insert a wait marker command in the default stream only when
HIP has pending operations on other async streams

Change-Id: I68660a54867fab7571ba57eb1df5feb1bca1c61a
2020-04-21 10:05:26 -04:00
kjayapra-amd b3308f7674 SWDEV-229480 - Improve error messages in HIP Layer.
Change-Id: I054b979d3aa6cf6ed4ca14a9393bdcba757772ff
2020-04-20 21:23:52 -04:00
Vlad Sytchenko 4d6e394c73 Correctly calculate size of the copy region
Since we adjust we adjust the start of the region, amd::BufferRect::end_ is no longer the size, just the offset as to where the region ends.

The actual size of the region is (amd::BufferRect::end_ - amd::BufferRect::start_).

Change-Id: I8425d8bdfb20f485740863813e762e8923d9ee94
2020-04-14 12:25:17 -04:00
Payam 14010cb705 updated LOG_LEVEL prints to print pid and tid
Change-Id: I8a9212b26bb7e312408a222823efcfd00344094b
2020-04-06 16:58:25 -04:00
Vladislav Sytchenko a3613cc6da (SWDEV-228488)
These fixes address regressions caused by http://gerrit-git.amd.com/c/compute/ec/hip/+/337601

Currently we're converting a 1D offset into a 3D offset, which doesn't make much sense once you consider the fact that this offset is relative to a different origin than our current 3D offset.

I traced through our blit kernels in VDI - the copy buffer rect path is able to handle immediate offsets in the 3D buffer via the amd::BufferRect::start_ parameter.

Instead of adjusting the offset, simply adjust the start of the region.

Change-Id: Ic8797a2c8ac0ad106f246f61ff06ca1ca03d3058
2020-04-06 14:17:11 -04:00
Christophe Paquot b820c66c55 Default HostMalloc to uncached memory
Change-Id: I72e19c7f7820a77fd5afc09f09cfea9acd0b8e84
2020-04-03 19:19:33 -04:00
Michael LIAO 6fe3edc5a8 [vdi] Add hipFreeHost
Change-Id: I8a5b7ff3f0ab4f5674efd6723c18808ad6ef33f5
2020-04-03 16:34:28 -04:00
Vladislav Sytchenko c9084d0ad2 Take into an account the number of channels...
when querying the element size of an array.

Change-Id: Id57d3374b14d80a59230ec8286704f2fbabb0fae
2020-04-03 15:43:18 -04:00
Vladislav Sytchenko 6e0722a5d0 (SWDEV-229354)
This patch is a workaround to support user pitch for hipMemcpy{2D/3D}.

Historically OpenCL didn't support pitch with clEnqueueFillBuffer(), so neither did we in VDI. Adding it now will be slightly nontrivial, since the fill kernel and runtime in many places will need to be modified.

As a temporary workaround for cases when pitch > width, we can just enqueue a fill for each row separately. This implementation is slow, but it satisfies the correctness criteria.

Change-Id: Idfeca349288b51d6ff84a7cf001fb63c6a66818a
2020-03-31 18:12:56 -04:00
Michael LIAO a14695d4eb [vdi] Fix hipGetSymbol{Address|Size}
- Use symbol value as the qeury key. Compared to the symbol name, the
  symbol value is more robust as developers may use unqualified or
  qualified identifiers. It also removes the mangling and/or demangling
  requirement for the runtime API.

Change-Id: I9d4259f3842612c7cc98551269fc2092d8b5c19e
2020-03-31 00:26:53 -04:00
Christophe Paquot 47718cbf16 Do not retry to allocate when OOM. Shouldn't be needed since we idle on Free.
SWDEV-229214

Change-Id: I183006f409388e3c7981f2569649d01d6378be46
2020-03-30 12:49:48 -07:00
Vladislav Sytchenko 86c969fcea (SWDEV-228794)
Adjust the origin of the copy if the user passes a pointer that wasn't allocated by the runtime.

Change-Id: I0aeb20195ed730857a461a53f537626ec2573fd1
2020-03-27 16:33:16 -04:00
Vladislav Sytchenko 08729d0f43 (SWDEV-228794)
Add hipMallocHost()

Change-Id: Ia3c7c5ca94b39fe30f3a51d1b60782d3472259ff
2020-03-27 15:57:48 -04:00
Vladislav Sytchenko 111897dbea (SWDEV-228782)
The only requirment from hipMallocPitch() is that the returned pitch is aligned to the HW image pitch alignment. There is no restriction on the size of the allocation, since the memory might not be used for images.

Change-Id: I97438e5fe4012ca4721b14b85f514dbac803c17c
2020-03-27 15:52:17 -04:00
Vladislav Sytchenko 2b538b09e6 Add support for formating hipExtent objects
Change-Id: Iea54a510e81a856c0c450305b3e5a7179ee48295
2020-03-26 14:45:20 -04:00
Vladislav Sytchenko 06bb183558 Add initial entry points for mipmapped array API
Change-Id: Icd59cc7323ddcb6773da6105260415a1e6f4cdcb
2020-03-26 14:45:20 -04:00
Vladislav Sytchenko 23211f05d6 Enable initial sRGB support
Instead of using the sampler field force_degamma to perform sRGB->linear conversion during pixel sampling, we use an appropriate image format instead. The overhead of this is having to create an image view when creating a texture object from an array.

Change-Id: I1ca368c312c1fd4b6f784a3a1b35b5eeb28070ff
2020-03-26 14:45:20 -04:00
Vladislav Sytchenko af1c5a0015 Handle offsets for dptr <-> image copies
Change-Id: I7a4a56ee07a26a741d2aac35502446d248f720ad
2020-03-26 14:45:20 -04:00
Saleel Kudchadker 68df8efe90 Sync streams when freeing or destroying mem
Change-Id: I6932f225a8b932bb2adbd5e37880f7e604496809
2020-03-20 10:53:23 -07:00
Vladislav Sytchenko 78f7954765 Add support for creating typed buffers
What Cuda refers to "linear texture memory" is the OpenCL equivalent of CL_MEM_OBJECT_IMAGE1D_BUFFER. For these types of allocations we should create a typed buffer instead of an image.

Currently there is no check in the texture fetch functions as to what kind of SRD is written into the texture object, so any kind of incorrect programming will cause the TA to hang. Fortunately for us, every one writes correct code :)

Change-Id: I80dab85a992f2c0754ebf303d40ac6b5e045c7c1
2020-03-18 18:15:17 -04:00
Vladislav Sytchenko cd76eb7486 Add hipDrvMemcpy3D.
This is the equivalent of cuMemcpy3D.

Change-Id: Ib2e06dbd6f5093c931cdfd36c87617f32acffc2d
2020-03-09 16:11:25 -04:00
Vladislav Sytchenko 59fb85f94d Add hipMemcpy3DParams -> HIP_MEMCPY3D conversion
Change-Id: I66dd1ae722b8c0d7ddb2562c958b14854e6b86bc
2020-03-06 14:33:52 -05:00
Vladislav Sytchenko 801c70279f Correct logic in ihipMemcpyAtoA()
HIP assumes that image width is in bytes, but VDI assumes that image width in pixels. Need to perform byte -> pixel coversion before doing anything.

Change-Id: Ia9fd1f46d05db3fbe8049add10b4d7e5118a2b9a
2020-03-05 15:03:18 -05:00
Vladislav Sytchenko aa490c96a1 Fix typos in ihipMemcpy3D()
Change-Id: I8720f113642f00bb013cf46284e9b13cc932bf4a
2020-03-05 14:11:55 -05:00
agodavar f360ff6479 Enable tests that are passing on hip-vdi
Change-Id: I7de965f7e8bb7e4d0fa61cd584f0cd118c1e212e
2020-03-05 16:08:51 +05:30
Vladislav Sytchenko d28b77bf23 Fix hipMemcpy3d (partially)
Incoming changes from upstream split the struct hipMemcpy3DParms into two separate ones - hipMemcpy3DParms and HIP_MEMCPY3D, which are cudaMemcpy3DParms and CUDA_MEMCPY3D equivalents respectively.

Note that HIP_MEMCPY3D is missing half the members of CUDA_MEMCPY3D (this should be fixed in PR#1887). Work around this by using a substitute _HIP_MEMCPY3D struct for now.

Change-Id: Ic15134e6deb260189b662b3804d2309a9b8473e9
2020-03-01 13:52:05 -05:00
Christophe Paquot e9af4c8794 Blocking and default streams' sync:
Add hip::syncStreams(dev) to sync blocking streams on a given device.
hip::syncStreams(void) should only sync streams on the current device.

Change-Id: Ib6b0735215fa0ed12c646ebd029e9763ee3712ce
2020-02-26 08:54:00 -08:00
Saleel Kudchadker 1ccaea7ca8 Use the context variant of getNullStream
Do not create a new queue to call finish in hipFree if none was
created earlier elsewhere.
Change-Id: I87bb191e6b186ddbe607ab29d11e3ae5bc2ac8e6
2020-02-25 00:13:43 -08:00
Christophe Paquot 2bdfc73649 Fixed a few multithreaded potential issues
Also make D2H and H2D keep track of the chain of events
when we need to use a different HostQueue.

Change-Id: I1c5da6ea6104b37ad7aac00f0eb8ea9371e6ba1c
2020-02-24 20:14:10 -08:00
Vladislav Sytchenko f71817a342 HIP-VDI texture rework
The current texture implementation is based off the one for HIP-HCC. There's a lot of problems with it - only creating images from buffers, hard coding logic and ignoring user parameters. This leads to a whole lot of UB even with simple examples (as seen with RedShift's code).

This CL is aimed to bring the HIP-VDI texture implementation closer to what is described by Cuda.

hipMemcpyAtoA() - image to image copy.
hipMemcpyHtoA()/hipMemcpyDtoA() - buffer to image copy.
hipMemcpyAtoH()/hipMemcpyAtoD() - image to buffer copy.

hipArrayCreate()/hipArray3DCreate()/hipMallocArray()/hipMalloc3DArray() - creates 1D/2D/3D/1D Array/2D Array images.
hipCreateTextureObject() - creates sampler, (optional) creates 1D/2D image from buffer, (optional) creates image views.
hipBindTexture() - creates 1D image from buffer (should create a typed buffer, however this is not compatible with HIP-HCC).
hipBindTexture2D() - creates 2D image form buffer.
hipBindTextureToArray() - creates image view.
hipTexRefSetAddress() - creates 1D image from buffer (should create a typed buffer, however this is not compatible with HIP-HCC).
hipTexRefSetAddress2D() - creates 2D image from buffer.
hipTexRefSetArray() - creates image view.

There are still a lot of  TODOs in the code, here's a few important ones:
1. VDI doesn't support a lot of sampler flags.
2. VDI doesn't support device to image 2D/3D copy.
3. Mipmaps implementation is incomplete.
4. Image view implementation is incomplete.

Change-Id: Ia374ee27aa14f76451fee7667495036f4419a487
2020-02-24 15:23:45 -05:00
Christophe Paquot 13bf30569e SWDEV-223262
hipMemcpyWithStream is supposed to be synchronous.

Change-Id: Ie44e37ecc9246e26a6b315c01e88a279f9e42fd7
2020-02-19 14:08:12 -08:00
Christophe Paquot b4ad4262cc Introducing hip::Device which wraps around amd::Context and deviceId
Change-Id: Ie35a6edb65c001b35eb9f5d2af26e765dc41c00e
2020-02-18 17:18:56 -05:00
Christophe Paquot 0c6efd9678 SWDEV-220533 - HostMapped should use fine grained.
Change-Id: I4ad2064e8e5ea1cd4ed7df143c778ccb685c4f22
2020-02-10 16:53:06 -05:00
Laurent Morichetti 5b098e68b6 Update copyright info for VDI files
Change-Id: Ib160fbf89ec89a5895321f73402a33b4d344a68f
2020-02-04 08:47:10 -08:00
Laurent Morichetti 6f3e18a764 Merge HIP/VDI branch 'amd-staging' into lmoriche/amd-master-next
Change-Id: Iabaab4e72815ba483a1330ec6a1130f2b86676f0
2020-01-29 15:02:13 -08:00