Currently we force inlining of everything for HIP. We would now like to enable
function call support. The first step is to remove the uses of
`-amdgpu-early-inline-all` in various places. This patch removes all of them from clr.
Change-Id: Ib0cad1f586714c9989778b00746aa4c47a4eec95
[ROCm/clr commit: a09204388a]
This is an initial change before we refactor the build/link paths for
kernel launches for HIP. The change is needed because the compiler was
setting up a dump file, which requires file system access and is slow on
NFS-mounted file systems.
Change-Id: I828f9bb04d789b4f8c05c1ed08767f325efeb47c
[ROCm/clr commit: 3f2f7252aa]
Add trap handler code into runtime and compile/load during
device initialization. The current interface for trap handler in
PAL is obsolete and the new interface will be provided later.
Change-Id: I1fa702c5d1f2e6731f781369c980d546cf422328
[ROCm/clr commit: e1d34cb24f]
This reverts commit 58e62063f3.
Reason for revert: There are currently some outstanding issues with the COMPILE_SOURCE_WITH_DEVICE_LIBS Comgr action (https://ontrack-internal.amd.com/browse/SWDEV-386072). Once these LLVM issues have been resolved, we can safely re-apply this patch.
Change-Id: I8501967af8496ea50d6e4a97399e45db51bbed1e
[ROCm/clr commit: 19526e46e6]
This is related to SWDEV-410182, but it's not enough to fix it.
Functions from device-libs are precompiled into llvm-ir in a "target agnostic" way
(in reality, it's not 100% target agnostic, which brings us many headaches).
When linking builtins (like device-libs) from the command line, we use the flag
-mlink-builtin-bitcode. The difference between regular linking of bitcode and
this flag is that the latter propagates target-specific attributes. If these
attributes are not propagated, we can end up with inconsistent target attributes.
Comgr provides the action AMD_COMGR_ACTION_COMPILE_SOURCE_WITH_DEVICE_LIBS_TO_BC
for this exact reason. The old action is currently deprecated and this one should
be used.
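
The attribute propagation described above can be illustrated with a toy model. This is only a sketch under stated assumptions: modules are plain dictionaries, `link_bitcode` and `is_consistent` are hypothetical helpers, and only a single `target-cpu` attribute is modelled. It does not call clang or Comgr.

```python
def link_bitcode(module, library, propagate_target_attrs):
    """Toy link step: copy library functions into the module's function table.

    When propagate_target_attrs is True, each linked function is stamped with
    the module's target-cpu, mimicking what builtin-bitcode linking is
    described as doing. Plain linking leaves the library attributes untouched.
    """
    linked = dict(module["functions"])
    for name, attrs in library["functions"].items():
        attrs = dict(attrs)  # copy so the library itself is not modified
        if propagate_target_attrs:
            attrs["target-cpu"] = module["target-cpu"]
        linked[name] = attrs
    return linked


def is_consistent(functions):
    """True when every function carries the same target-cpu attribute."""
    return len({attrs.get("target-cpu") for attrs in functions.values()}) == 1


# Hypothetical inputs: a kernel compiled for gfx90a and a library function
# that was precompiled without any target-cpu attribute.
kernel_module = {
    "target-cpu": "gfx90a",
    "functions": {"my_kernel": {"target-cpu": "gfx90a"}},
}
device_lib = {"functions": {"__ocml_sqrt_f32": {}}}

plain = link_bitcode(kernel_module, device_lib, propagate_target_attrs=False)
builtin = link_bitcode(kernel_module, device_lib, propagate_target_attrs=True)
```

In this model `is_consistent(plain)` is false because the library function has no `target-cpu`, while `is_consistent(builtin)` is true.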
Change-Id: I518415214debdf4fedf0b1d81456d6e9fb8a3d19
[ROCm/clr commit: f3dc04a50d]
This reverts commit 2e664d2492.
Reason for revert: Performance regressions and failures observed. Need to investigate those before re-applying the patch.
Change-Id: I42ba0605797f9bdcfb5d5102927dd01405cf05e3
[ROCm/clr commit: 8047d8e3e8]
Previously, we used the following approach and Comgr actions
for device lib linking:
  AMD_COMGR_ACTION_COMPILE_SOURCE_TO_BC (compile with clang driver)
  AMD_COMGR_ACTION_ADD_DEVICE_LIBRARIES (link in device libs with
  llvm-link API)
However, the clang driver can link in device libraries as part
of compilation, assuming a --rocm-path is set. In this context,
this is accomplished by using the following Comgr action instead:
  AMD_COMGR_ACTION_COMPILE_SOURCE_WITH_DEVICE_LIBS_TO_BC (compile and
  link in device libs with clang driver)
Change-Id: I661465865365afecc44aa15d4df91bfab361af8d
[ROCm/clr commit: a4c5c44008]
* Return the result by value and not through a pointer passed by
parameter
Change-Id: I8f872c95c4a330bebe299d486fb73f36660b469c
[ROCm/clr commit: 37849a0726]
This reverts commit 8da846db0a.
Reason for revert: Test failures with LuxMark, Blender, and IndigoBench. Need to investigate before re-applying.
Change-Id: I6b08273a8f9c8bcaa4e7a06cd42d15048e52ca2a
[ROCm/clr commit: 5168485d23]
The Comgr ADD_DEVICE_LIBRARIES action has been deprecated. In place
of the previous two-action approach:
  AMD_COMGR_ACTION_COMPILE_SOURCE_TO_BC
  AMD_COMGR_ACTION_ADD_DEVICE_LIBRARIES
we can now use a single combined action:
  AMD_COMGR_ACTION_COMPILE_SOURCE_WITH_DEVICE_LIBS_TO_BC
This new action more closely aligns with how device library
management is done by the clang driver.
Change-Id: Id844e9031a1896dedeacec453440b9babc4b111a
[ROCm/clr commit: 0969056f66]
Oddly, the `requiredDump` argument to linkLLVMBitcode was used to enable or disable
keeping the temporary bitcode files (those generated by -save-temps=all) after linking.
This patch removes the argument, as there is no obvious benefit from keeping it
(the user would rely only on -save-temps=all to control this).
Change-Id: I0c00486f95eb1d4e296b5247c488407c47f0b2d9
[ROCm/clr commit: 8ab3fd58cf]
The current code generates a _optimized.bc file unconditionally, so restore the original logic.
Change-Id: I3f84d10934b3e983f5f828af8d0943449a6e1d94
[ROCm/clr commit: 6647882773]
Disable devlib linking when the runtime links multiple objects from
the app. Otherwise devlibs will be linked twice, which may cause
undefined behavior with COv5.
Change-Id: I3b8640c64ff898893225fe3af5b4b4a32d42bf40
[ROCm/clr commit: c275d9b4b3]
Currently COMGR doesn't provide global variable sizes, so the runtime
parses the ELF binary directly. Avoid this parsing for HIP, which can
save 5% of hipModuleLoad() time.
Change-Id: I47540d1e957bdb0c2406b6b848222de2920b2504
[ROCm/clr commit: 2664d8cf9e]
The metadata in code object version 5 is an extension of that in CO3 and CO4.
Add detection of the new fields and program them during
the setup of the kernel arguments.
Change-Id: I27e58df77320ad00f4f16d35912668db803826af
[ROCm/clr commit: be6a06384e]
This patch allows substituting a binary for an OpenCL program. It is supposed to be used as follows:
1. Run the OpenCL program with -save-temps.
2. Open the cl temp file and find the following text in the program header:
Hash to override:
Source: 0xd66bcfa20e69e605
Source + clang options: 0x656a9dd8aedcbfb6
3. Create a config file (ASCII text) with one or more pairs:
<hash> <path_to_binary_to_substitute>
where hash is the hex value from step 2 (without the leading 0x). You can use either hash,
depending on what you want to match:
only the source text of the program, or the source text along with its clang options.
4. Set the env variable AMD_OCL_SUBST_OBJFILE to the path of your config file.
5. Rerun the OpenCL program.
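
The config file format from step 3 can be sketched as a small parser. This is a minimal illustration, not the runtime's implementation: `parse_subst_config` and `find_substitute` are hypothetical names, the paths are made up, and both the skipping of blank lines and the lookup order (source + options hash first) are assumptions that the description above does not specify.

```python
def parse_subst_config(text):
    """Parse '<hash> <path>' pairs, one per line, into a dict keyed by int hash.

    The hash is hex without a leading 0x, as described in step 3.
    """
    table = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # assumption: blank lines are ignored
        hash_hex, path = line.split(None, 1)
        table[int(hash_hex, 16)] = path.strip()
    return table


def find_substitute(table, source_hash, source_and_options_hash):
    """Return the substitute binary path matching either hash, or None.

    Assumption: the more specific 'source + clang options' hash is tried first.
    """
    for h in (source_and_options_hash, source_hash):
        if h in table:
            return table[h]
    return None


# Hashes taken from the example in step 2; the paths are hypothetical.
config = """d66bcfa20e69e605 /tmp/kernel_a.bin
656a9dd8aedcbfb6 /tmp/kernel_b.bin
"""
table = parse_subst_config(config)
```

With this table, a program whose source hash is 0xd66bcfa20e69e605 resolves to /tmp/kernel_a.bin unless its source + options hash also has an entry.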
Change-Id: I977c80fe529ea14458194918c6ddfbe2de6a8857
[ROCm/clr commit: 51cc9c2f8c]
This change makes HSAIL usage similar to that of Comgr. By default, the
runtime will statically link against it, however if HSAIL_DYN_DLL is
defined, then the runtime will try to dynamically load HSAIL.
For now, stick to statically linking HSAIL. The dynamic loading
behaviour will be enabled in a future patch.
Change-Id: I6a78a4375975cf847f236b200404c8cf941d012b
[ROCm/clr commit: c7b50bb890]
In addition to removing the HSAIL logic from the ROCm backend, guard all
of the HSAIL includes in the common layer behind the WITH_COMPILER_LIB
define. This is to avoid including HSAIL headers when building with
no support for it.
In common logic replace the use of the aclType enum with the new
Program::file_type_t enum. This is essentially a local copy of the HSAIL
enum to avoid including any HSAIL headers.
Change-Id: Ica0651d1b29dfccc255cc584eb82a5cb35e1b520
[ROCm/clr commit: cbeb372e46]
- Correct GSL path to report targets using the TargetID syntax.
- Correct GSL path to check compatibility of code objects when
loading.
- Add the concept of a device ISA and create a registry used by ROCm,
PAL and GSL.
- Support XNACK and SRAMECC target features consistently for PAL and ROCm.
- Correct logic for NullDevices and asserts to avoid memory corruption.
- Allow all NullDevices to be created for HIP.
- Numerous other code improvements.
Change-Id: I40abf3d2b22249c1492d1af5919665f8184f4e0e
[ROCm/clr commit: c7e8d91e14]
[PAL to KFD/ROCr][ROCr_Runtime][Vega10] OCLSeparateCompile subtest of
oclcompiler from ocltst test package is encountering clLinkProgram()
failed (chksum 0x00000001) error
If the runtime does not provide a dump file name to the ELF library,
the ELF library uses a temp file in the current folder.
The current folder may not be writable for several reasons:
1. The application's current folder might be a system folder for which
the user does not have write permission.
2. The current folder is on a read-only file system. This happens for
embedded customers.
Tested on Vega10. The issue is fixed.
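
One way to avoid depending on the current folder is to choose the dump file location explicitly. The sketch below is an assumption about what such a choice could look like, not the code from this change: `pick_dump_path` and the default file name are hypothetical, and falling back to the system temp directory is an illustrative policy.

```python
import os
import tempfile


def pick_dump_path(preferred_dir, name="program_dump.bin"):
    """Return a dump file path in preferred_dir if it is a writable directory.

    Otherwise fall back to the system temp directory, so that a read-only or
    permission-restricted current folder does not break linking.
    """
    if os.path.isdir(preferred_dir) and os.access(preferred_dir, os.W_OK):
        return os.path.join(preferred_dir, name)
    return os.path.join(tempfile.gettempdir(), name)
```

A caller would pass the resulting path to the ELF library instead of letting it default to the current folder.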
Change-Id: Ic0e9f040b7c7583914301673cce237ab28b0c0cb
[ROCm/clr commit: 6327dbc4cc]