- The driver code should not redefine `tex`, as it is already
defined in the kernel code. Eventually, the driver code should be
regular C++ code instead of HIP code.
Change-Id: I8c7cab204b98990619d6e7109b990d7089ea9261
These might contain garbage causing the runtime to incorrectly parse the state of the texture references.
Change-Id: I93c726fa30b580b3e14c50ac939f3c71b0d1c8d9
- HIPPerfDispatchSpeed disparity between HIP/HCC vs HIP/VDI
Insert a wait marker command in the default stream only when
HIP has pending operations on other async streams
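The condition above can be sketched as follows. This is a minimal illustration, not the runtime's code: `AsyncStream`, `pendingOps`, and `needsWaitMarker` are hypothetical names introduced only for this example.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for an async stream; not a HIP runtime type.
struct AsyncStream {
    int pendingOps;  // commands submitted but not yet completed
};

// Returns true only when at least one async stream still has work in flight.
// Under this sketch, the default stream would insert a wait marker only then.
bool needsWaitMarker(const std::vector<AsyncStream>& asyncStreams) {
    for (const AsyncStream& s : asyncStreams) {
        assert(s.pendingOps >= 0);  // a negative count would be a bookkeeping bug
        if (s.pendingOps > 0) {
            return true;
        }
    }
    return false;
}
```

When every async stream is idle, the check returns false and the marker is skipped on the dispatch path.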
Change-Id: I68660a54867fab7571ba57eb1df5feb1bca1c61a
1. Combine libamdhip64_static_base.a and libamdvdi_static.a into libamdhip64_static.a.
2. Let hipcc use -use-staticlib to link libamdhip64_static.a.
3. Add some samples for the static lib.
4. Fix the code object compilation failure.
Change-Id: Ic8c95228eb139058da8b5d66ba8439486154ca6f
This reverts commit 5210ee6ca5.
Reason for revert: it breaks the dkms-no-npi-hipclang build.
Keeping the dkms-no-npi-hipclang build working is a top priority; otherwise we lose track of regression analysis.
Revert the change for now and recommit it once it is fixed.
Change-Id: Ia5136e888baecb6148c6c18eedbf37066fcb1eaa
1. Combine libamdhip64_static_base.a and libamdvdi_static.a into libamdhip64_static.a.
2. Let hipcc use -use-staticlib to link libamdhip64_static.a.
3. Add some samples for the static lib.
4. Fix the code object compilation failure.
Change-Id: Ia2333622a8d05639b90974c4c5d3d85654ba0138
Since we adjust the start of the region, amd::BufferRect::end_ is no longer the size, just the offset at which the region ends.
The actual size of the region is (amd::BufferRect::end_ - amd::BufferRect::start_).
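A minimal sketch of the size computation, assuming a simplified stand-in for amd::BufferRect that models only the two offsets. `RectOffsets` and `regionSize` are hypothetical names, not part of VDI.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical, simplified stand-in for amd::BufferRect.
struct RectOffsets {
    std::size_t start_;  // offset at which the region begins
    std::size_t end_;    // offset at which the region ends (not a size)
};

// The size is the distance between the two offsets, not end_ itself.
std::size_t regionSize(const RectOffsets& r) {
    assert(r.end_ >= r.start_);  // the end must not precede the start
    return r.end_ - r.start_;
}
```

With a zero start the two values coincide, which is why treating end_ as the size only breaks once the start is adjusted.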
Change-Id: I8425d8bdfb20f485740863813e762e8923d9ee94
Two issues are fixed:
libamdhip64_static.a is not included in the package.
The cmake generated target files use the installation path of the libraries,
which is set when the libraries are built and installed.
The CI uses a customized installation directory which is not
the package installation directory, therefore the library location
in the cmake generated target files differs from the library location
installed from the package. This causes a rocPRIM build failure since
rocPRIM uses pkg-config, which checks the library location.
The fix is to correct the library location before adding the cmake
generated target files to the package.
Change-Id: I4aa2c6138f58df6d4a86301a5c0436edcb19ab70
This is a cherry-pick of 9ead991784
and https://github.com/ROCm-Developer-Tools/HIP/pull/2009
Fix cmake config file
Removed cmake target files under packaging directory.
Merged cmake config .in files for HIP-Clang and HCC into one.
Use cmake generated target files in both install and packaging.
This makes cmake config file consistent for make install and
make package.
Let device side malloc/free return nullptr and trap
Change-Id: I448f3ea2d4934648089bad371debc203f895cba6
VDI reports the limits in pixels, but user provides the size in bytes.
Make sure both values are in pixels before doing comparisons.
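A minimal sketch of the unit conversion, assuming the pixel size in bytes is known for the format in use. `bytesToPixels` and `fitsLimit` are hypothetical helpers, not runtime functions.

```cpp
#include <cassert>
#include <cstddef>

// Converts a user-provided width in bytes to a width in pixels.
std::size_t bytesToPixels(std::size_t widthInBytes, std::size_t bytesPerPixel) {
    assert(bytesPerPixel > 0);  // guards against division by zero
    return widthInBytes / bytesPerPixel;
}

// Compares against a device limit only after both values are in pixels.
bool fitsLimit(std::size_t widthInBytes, std::size_t bytesPerPixel,
               std::size_t maxWidthInPixels) {
    return bytesToPixels(widthInBytes, bytesPerPixel) <= maxWidthInPixels;
}
```

Comparing the raw byte count against a pixel limit would reject valid sizes for any format wider than one byte per pixel.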
Change-Id: I082c7175c9fa4383e0b0ee38ff8c047c26ff20b4
The following warnings are addressed:
comparison of different enumeration types in switch statement
Change-Id: I6cb3948aeab7287851c57ecc1d4b3a439ab14ec6
Latest llvm already includes the texture/surface rework, but appropriate runtime changes have not been submitted.
Disable all texture related tests until http://gerrit-git.amd.com/c/compute/ec/hip/+/342147 is submitted.
Change-Id: I359c2eac6becdd3ca5110f2140679bd29d8ae54b
Support hipLaunchCooperativeKernelMultiDevice()
- Add validation logic for MGPU launches to pass a cuda test
Change-Id: Iccca7fde43493fc3bc6685512d39202271ae3e92
Support hipLaunchCooperativeKernelMultiDevice()
- Add hipCooperativeLaunchMultiDeviceNoPreSync and
hipCooperativeLaunchMultiDeviceNoPostSync support to pass a cuda test
Change-Id: If518f11ef2636a2235e5df9e77f879d8ced68102
These fixes address regressions caused by http://gerrit-git.amd.com/c/compute/ec/hip/+/337601
Currently we're converting a 1D offset into a 3D offset, which doesn't make much sense once you consider the fact that this offset is relative to a different origin than our current 3D offset.
I traced through our blit kernels in VDI - the copy buffer rect path is able to handle immediate offsets in the 3D buffer via the amd::BufferRect::start_ parameter.
Instead of adjusting the offset, simply adjust the start of the region.
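The approach can be sketched as below, under stated assumptions: the start of a region is the linearized 3D origin, and the extra 1D offset is added to that start directly rather than being decomposed into 3D coordinates. `linearStart` and `shiftStart` are hypothetical helpers, not the amd::BufferRect interface.

```cpp
#include <cassert>
#include <cstddef>

// Linearizes a 3D origin (x, y, z) using the given pitches.
std::size_t linearStart(std::size_t x, std::size_t y, std::size_t z,
                        std::size_t rowPitch, std::size_t slicePitch) {
    assert(rowPitch > 0 && slicePitch >= rowPitch);  // assumed layout invariant
    return x + y * rowPitch + z * slicePitch;
}

// Applies a 1D offset by moving the start of the region; the 3D origin
// itself is left untouched.
std::size_t shiftStart(std::size_t start, std::size_t baseOffset) {
    return start + baseOffset;
}
```

The 1D offset never has to be expressed in the coordinate system of the 3D origin, which avoids mixing offsets that are relative to different origins.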
Change-Id: Ic8797a2c8ac0ad106f246f61ff06ca1ca03d3058