Clarify the descriptions of some VGPR terms.
Fix incorrect query logic for availableVGPRs_ and
availableRegistersPerCU_ in device info.
Add hipDeviceAttributeMaxAvailableVgprsPerThread
attribute query.
Remove hardcoding of the following:
info_.vgprAllocGranularity_
info_.vgprsPerSimd_
[ROCm/clr commit: 397f303d97]
The compiler currently serializes the workgroup_processor_mode COMGR metadata boolean field as "0"/"1" instead of "false"/"true". Consider "1" a truthy value during parsing.
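A minimal sketch of the tolerant parsing described above. The helper name parseMetadataBool and its std::optional return type are assumptions made for illustration; this is not the runtime's actual parser.

```cpp
#include <cassert>
#include <optional>
#include <string_view>

// Hypothetical helper (not the runtime's actual code): interprets a
// serialized metadata boolean. Both spellings are accepted, "true"/"false"
// and "1"/"0". Any other text yields std::nullopt.
constexpr std::optional<bool> parseMetadataBool(std::string_view text) {
  if (text == "true" || text == "1") return true;
  if (text == "false" || text == "0") return false;
  return std::nullopt;
}

// Usage: both serializations of the same field yield the same value.
inline void exampleMetadataBoolUsage() {
  assert(parseMetadataBool("1") == parseMetadataBool("true"));
  assert(parseMetadataBool("0") == parseMetadataBool("false"));
  assert(!parseMetadataBool("yes").has_value());
}
```

Returning an empty optional for unknown text keeps malformed metadata distinguishable from an explicit "false"; a real parser might choose to default to false instead.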
[ROCm/clr commit: d020598a0f]
- Sometimes we want to mask out kernel names; use the right log level for
kernel logging.
Change-Id: Ideae9647c57b86ae390ff2f4131f6d8c6df5c086
[ROCm/clr commit: f1adecd186]
This reverts commit 1b05247a03.
Reason for revert: Waiting for staging results before finally merging it.
Change-Id: Iaabb510325f50147f368108e98531291217627c0
[ROCm/clr commit: 77be355fd9]
Relates to https://reviews.llvm.org/D150427.
Each printf call populates the buffer with the following data:
1. Control DWord - contains info regarding stream, format string constness and size of data frame
(see http://gerrit-git.amd.com/c/lightning/ec/device-libs/+/857722 for more info)
2. Hash of the format string (if constant) else the format string itself
3. Printf arguments (each aligned to an 8-byte boundary)
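The layout above can be sketched as follows. This is an illustration under stated assumptions, not the device-libs or runtime implementation: the bit positions in makeControlDword are assumed (the message only says which information the control DWord carries), and the helper names and the choice to measure offsets from the start of the argument area are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Round `offset` up to the next multiple of 8.
constexpr std::size_t alignTo8(std::size_t offset) {
  return (offset + 7) & ~static_cast<std::size_t>(7);
}

// ASSUMED bit layout, for illustration only:
//   bit 0     = output stream selector
//   bit 1     = format string is constant (a hash follows, not the string)
//   bits 2..  = size of the data frame in bytes
constexpr std::uint32_t makeControlDword(bool altStream, bool constFormat,
                                         std::uint32_t frameSize) {
  return (altStream ? 1u : 0u) | (constFormat ? 2u : 0u) | (frameSize << 2);
}

// Given the byte size of each printf argument, return the offset of each
// argument relative to the start of the argument area, with every argument
// placed on an 8-byte boundary. The final element is the padded total size.
inline std::vector<std::size_t> argumentOffsets(
    const std::vector<std::size_t>& argSizes) {
  std::vector<std::size_t> offsets;
  std::size_t cursor = 0;
  for (std::size_t size : argSizes) {
    cursor = alignTo8(cursor);
    offsets.push_back(cursor);
    cursor += size;
  }
  offsets.push_back(alignTo8(cursor));
  return offsets;
}

// Usage: a 4-byte, an 8-byte and a 1-byte argument each start on an
// 8-byte boundary.
inline void examplePrintfLayoutUsage() {
  const std::vector<std::size_t> offsets = argumentOffsets({4, 8, 1});
  assert(offsets[1] % 8 == 0 && offsets[2] % 8 == 0);
}
```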
Change-Id: I7e320deb343921b4b4cfaf08a2be2883e0bc1f65
[ROCm/clr commit: 7b6a8f1702]
This reverts commit dfa7790030.
Reason for revert: Deferred to a future release.
Change-Id: Ia66c37f0ab9734dee73c930d10d7469d5fd57254
[ROCm/clr commit: 5dc104b3ea]
Fix missing kernel attributes, including vec_type_hint,
work_group_size_hint and reqd_work_group_size.
Initialize WorkGroupInfo's meta attributes before the other parameters
are initialized.
This way workGroupInfo_'s compileSizeHint_, compileSize_ and
compileVecTypeHint_ will be valid when they are used to create the kernel
signature in Kernel::createSignature().
Fix the typo ".workgorup_size_hint".
Change-Id: I4a1ede2210a25596ad7a935cd4debb896e0147f8
[ROCm/clr commit: cb30ce4e06]
Metadata in code object version 5 (CO5) is an extension of CO3 and CO4.
Add the detection of the new fields and program them in
the setup of the kernel arguments.
Change-Id: I27e58df77320ad00f4f16d35912668db803826af
[ROCm/clr commit: be6a06384e]
This change makes HSAIL usage similar to that of Comgr. By default, the
runtime will statically link against it, however if HSAIL_DYN_DLL is
defined, then the runtime will try to dynamically load HSAIL.
For now, stick to statically linking to HSAIL. In a future patch the
dynamic loading behaviour will be enabled.
Change-Id: I6a78a4375975cf847f236b200404c8cf941d012b
[ROCm/clr commit: c7b50bb890]
In addition to removing the HSAIL logic from the ROCm backend, guard all
of the HSAIL includes in the common layer behind the WITH_COMPILER_LIB
define. This is to avoid including HSAIL headers when building without
support for it.
In common logic replace the use of the aclType enum with the new
Program::file_type_t enum. This is essentially a local copy of the HSAIL
enum to avoid including any HSAIL headers.
Change-Id: Ica0651d1b29dfccc255cc584eb82a5cb35e1b520
[ROCm/clr commit: cbeb372e46]
- Correct GSL path to report targets using the TargetID syntax.
- Correct GSL path to check compatibility of code objects when
loading.
- Add the concept of a device ISA and create a registry used by ROCm,
PAL and GSL.
- Support XNACK and SRAMECC target features consistently for PAL and ROCm.
- Correct logic for NullDevices and asserts to avoid memory corruption.
- Allow all NullDevices to be created for HIP.
- Numerous other code improvements.
Change-Id: I40abf3d2b22249c1492d1af5919665f8184f4e0e
[ROCm/clr commit: c7e8d91e14]
Rename functions that access devices to reflect the derived device
they return. This includes the base device::Device and the derived
gpu/pal/roc device classes in both NullDevice and Device forms. Change
to use the least derived versions to clarify what operations will be
available.
Change-Id: I1abb6bfed7efa24852bc8d0d49acaea357d8b5d0
[ROCm/clr commit: 001fd66cac]
When HIP_ENABLE_DEFERRED_LOADING=0, many global variables are
referenced before they have been initialized. This patch
uses constexpr to initialize global constant variables at compile
time.
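A minimal sketch of why constexpr helps here. All names below (Limits, kDefaultLimits, EarlyUser) are hypothetical and not taken from the runtime; the point is only that a constexpr global is constant-initialized, so code that runs during early dynamic initialization can read it safely.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical constants; the names are illustrative only.
struct Limits {
  std::size_t maxStreams;
  std::size_t pageSize;
};

constexpr std::size_t kKiB = 1024;

// constexpr forces constant initialization: the value is fixed at compile
// time, so it is valid before any dynamic initializer runs.
constexpr Limits kDefaultLimits{4, 4 * kKiB};

// The compiler itself checks the value.
static_assert(kDefaultLimits.pageSize == 4096, "evaluated at compile time");

// A global with a dynamic initializer that runs early. Reading the
// constexpr constant here is safe regardless of initialization order.
struct EarlyUser {
  std::size_t seen;
  EarlyUser() : seen(kDefaultLimits.pageSize) {}
};
static EarlyUser gEarlyUser;

// Usage: the early reader observed the fully initialized constant.
inline void exampleEarlyUserCheck() {
  assert(gEarlyUser.seen == 4096);
}
```

Had kDefaultLimits been an ordinary global initialized by a runtime function call in another translation unit, EarlyUser could have observed it before that initializer ran.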
Change-Id: I9d538b7abc6a0ce700ec3332b97fc144db5fc1ef
[ROCm/clr commit: fdef6f722f]
This seems to not actually have any function. The OpenCL API test
passes without it, and the way it's produced is problematic.
Change-Id: I384bfa01dee7023484348b184ddd1b2d44a91f7d
[ROCm/clr commit: cfed3f310d]