SWDEV-213031 - Check the functions_ map first; otherwise interpret the handle as a hip::Function for now. The function may not be a device function and may have been obtained via hipModuleGetFunction, in which case it is not in the functions_ map.
ReviewBoardURL = http://ocltc.amd.com/reviews/r/18388/diff/
Affected files ...
... //depot/stg/opencl/drivers/opencl/api/hip/hip_platform.cpp#57 edit
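Illustration for SWDEV-213031 (not the actual change): a minimal sketch of the lookup-then-fallback idea. The types sketch::Function and sketch::PlatformState and the method names are hypothetical stand-ins; only the functions_ name comes from the description above.

```cpp
#include <cassert>
#include <unordered_map>

namespace sketch {

// Hypothetical stand-in for hip::Function; the real class is not shown in this log.
struct Function {
  const char* name;
};

// Hypothetical platform state that maps host-side stub addresses to functions.
class PlatformState {
 public:
  void registerFunction(const void* hostStub, Function* f) { functions_[hostStub] = f; }

  // Check functions_ first; if the handle is not registered, assume it
  // already points at a Function (e.g. one returned by a module lookup).
  Function* resolve(const void* handle) const {
    assert(handle != nullptr);
    auto it = functions_.find(handle);
    if (it != functions_.end()) return it->second;
    return const_cast<Function*>(static_cast<const Function*>(handle));
  }

 private:
  std::unordered_map<const void*, Function*> functions_;
};

}  // namespace sketch
```

The fallback cast is only safe if every unregistered handle really is a Function, which is presumably why the description says "for now".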
SWDEV-213526 - [hip] OOM issue
Delay any access to the device layers until the app calls a HIP API.
This allows the app to fork the process first and then call HIP, which is legal.
Making HIP calls and then forking is neither legal nor supported by ROCm.
Affected files ...
... //depot/stg/opencl/drivers/opencl/api/hip/hip_context.cpp#25 edit
... //depot/stg/opencl/drivers/opencl/api/hip/hip_internal.hpp#49 edit
... //depot/stg/opencl/drivers/opencl/api/hip/hip_platform.cpp#55 edit
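Illustration for SWDEV-213526 (not the actual change): a minimal sketch of deferring device access until the first API call, assuming a function-local static is an acceptable lazy-init mechanism. DeviceLayer, deviceLayer() and apiEntryPoint() are hypothetical names; the mechanism used by the runtime is not shown in this log.

```cpp
#include <cassert>

namespace sketch {

// Hypothetical stand-in for the device layers; counts how often it is set up.
struct DeviceLayer {
  static int constructed;
  DeviceLayer() { ++constructed; }
};
int DeviceLayer::constructed = 0;

// Function-local static: constructed on the first call only, and that
// initialization is thread-safe since C++11. Nothing runs at load time.
inline DeviceLayer& deviceLayer() {
  static DeviceLayer instance;
  return instance;
}

// Hypothetical API entry point: the first call triggers the device setup.
inline int apiEntryPoint() {
  deviceLayer();
  assert(DeviceLayer::constructed == 1);  // set up exactly once
  return 0;
}

}  // namespace sketch
```

With this shape, a process that forks before its first apiEntryPoint() call has not touched the device layers in the parent.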
SWDEV-145570 - Adding back the lazy kernel changes because the OOM issue is caused by KFD/RocR.
Affected files ...
... //depot/stg/opencl/drivers/opencl/api/hip/hip_platform.cpp#53 edit
SWDEV-145570 - [HIP] Fix occupancy API prototype.
- They need to be C API, i.e. extern "C".
- Follow the current API and use `uint32_t` instead of `int`.
+ TODO: We need to revert them once the APIs are changed to be
compatible with CUDA.
Affected files ...
... //depot/stg/opencl/drivers/opencl/api/hip/hip_platform.cpp#46 edit
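Illustration for the occupancy prototype fix (not the actual change): a minimal sketch of a C-linkage entry point that uses `uint32_t`. The function name, parameter list, return codes and arithmetic are all hypothetical placeholders, not the HIP API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// extern "C" disables C++ name mangling, so C callers and export lists
// can refer to the symbol by its plain name.
extern "C" int sketchOccupancyMaxActiveBlocks(uint32_t* numBlocks, uint32_t blockSize,
                                              size_t dynSharedMemPerBlock) {
  (void)dynSharedMemPerBlock;  // unused in this placeholder
  if (numBlocks == nullptr || blockSize == 0) return 1;  // hypothetical error code
  // Placeholder arithmetic, not a real occupancy model:
  // assumes a made-up limit of 2048 resident threads.
  *numBlocks = 2048u / blockSize;
  return 0;  // hypothetical success code
}
```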
SWDEV-207449 - [HIP CQE][HIPonPAL][LNX][QR] 6 hiptests failed on all ASICs
hipTestHalf fails to build on Windows due to linker error "unresolved external symbol __gnu_h2f_ieee"
1. Expose __gnu_h2f_ieee() and __gnu_f2h_ieee() for Windows builds.
ReviewBoardURL = http://ocltc.amd.com/reviews/r/18127/diff/
Affected files ...
... //depot/stg/opencl/drivers/opencl/api/hip/hip_hcc.def.in#31 edit
... //depot/stg/opencl/drivers/opencl/api/hip/hip_platform.cpp#45 edit
SWDEV-144570 - Adding extern var support for dynamically loaded modules for Texture reference.
Affected files ...
... //depot/stg/opencl/drivers/opencl/api/hip/hip_internal.hpp#43 edit
... //depot/stg/opencl/drivers/opencl/api/hip/hip_module.cpp#42 edit
... //depot/stg/opencl/drivers/opencl/api/hip/hip_platform.cpp#43 edit
SWDEV-198194 - Making some code common between static and dynamically created module handling.
Affected files ...
... //depot/stg/opencl/drivers/opencl/api/hip/hip_internal.hpp#37 edit
... //depot/stg/opencl/drivers/opencl/api/hip/hip_module.cpp#34 edit
... //depot/stg/opencl/drivers/opencl/api/hip/hip_platform.cpp#38 edit
SWDEV-180872 - Runtime support changes for Cooperative Group Features
- Take SGPR usage into account when determining the block size
Affected files ...
... //depot/stg/opencl/drivers/opencl/api/hip/hip_platform.cpp#35 edit
SWDEV-145570 - Fix device name mismatch.
gfx906 is not the only device whose name can carry a suffix such as +xnack.
Other devices, e.g. gfx900, can have one too.
Make the previous fix more generic.
Affected files ...
... //depot/stg/opencl/drivers/opencl/api/hip/hip_platform.cpp#33 edit
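Illustration for the device name fix (not the actual change): a minimal sketch of generic prefix matching between a code object target and a device name. targetMatchesDevice is a hypothetical helper, and the device name strings in the usage note are examples only.

```cpp
#include <cassert>
#include <string>

namespace sketch {

// Returns true when deviceName starts with target, so a name that extends
// the target with a feature suffix such as "+xnack" still matches.
inline bool targetMatchesDevice(const std::string& target, const std::string& deviceName) {
  return !target.empty() && deviceName.compare(0, target.size(), target) == 0;
}

}  // namespace sketch
```

For example, targetMatchesDevice("gfx900", "gfx900+xnack") is true, while targetMatchesDevice("gfx906", "gfx900+xnack") is false. A plain prefix test also accepts any longer name that merely begins with the target, so a stricter check may be needed once target ids encode more features.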
SWDEV-180872 - Runtime support changes for Cooperative Group Features
- Initial implementation of the core functionality. Disabled by default. Use GPU_ENABLE_COOP_GROUPS=1 to enable the feature.
- The runtime uses a device queue for cooperative executions, with a synchronization on the launched queue.
- The current implementation is a pure runtime change and works only if a single app uses this feature. No ROCr/KFD support was added or tested.
- Only inline assembler was tested.
Affected files ...
... //depot/stg/opencl/drivers/opencl/api/hip/hip_device.cpp#20 edit
... //depot/stg/opencl/drivers/opencl/api/hip/hip_device_runtime.cpp#15 edit
... //depot/stg/opencl/drivers/opencl/api/hip/hip_hcc.def.in#15 edit
... //depot/stg/opencl/drivers/opencl/api/hip/hip_hcc.map.in#17 edit
... //depot/stg/opencl/drivers/opencl/api/hip/hip_module.cpp#28 edit
... //depot/stg/opencl/drivers/opencl/api/hip/hip_platform.cpp#32 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/device.hpp#338 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpudevice.cpp#606 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpudevice.hpp#171 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/palblit.cpp#31 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/palblit.hpp#9 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/paldevice.cpp#142 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/paldevice.hpp#39 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/palschedcl.cpp#6 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/palvirtual.cpp#135 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/palvirtual.hpp#61 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/rocm/rocblit.cpp#32 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/rocm/rocblit.hpp#12 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/rocm/rocdevice.cpp#127 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/rocm/rocdevice.hpp#37 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/rocm/rocschedcl.cpp#3 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/rocm/rocvirtual.cpp#75 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/rocm/rocvirtual.hpp#23 edit
... //depot/stg/opencl/drivers/opencl/runtime/platform/command.cpp#94 edit
... //depot/stg/opencl/drivers/opencl/runtime/platform/command.hpp#92 edit
... //depot/stg/opencl/drivers/opencl/runtime/utils/flags.hpp#311 edit
SWDEV-145570 - Support loading fat binary generated through --genco by hipModuleLoad.
hip-clang --genco generates a fat binary instead of a code object. To support that,
we need to extract the code object from the fat binary in hipModuleLoadData. This is
needed for hipRTC, since multiple GPU archs may be passed.
Affected files ...
... //depot/stg/opencl/drivers/opencl/api/hip/hip_module.cpp#27 edit
... //depot/stg/opencl/drivers/opencl/api/hip/hip_platform.cpp#31 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/devprogram.cpp#42 edit
... //depot/stg/opencl/drivers/opencl/runtime/utils/flags.hpp#308 edit
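Illustration for the fat binary support (not the actual change): a minimal sketch of pulling one code object out of a bundle. It assumes the binary layout written by clang-offload-bundler (a 24-byte magic string, a 64-bit entry count, then per entry a 64-bit offset, size and id length followed by the id) and little-endian fields. extractCodeObject and readU64 are hypothetical helpers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

namespace sketch {

const char kBundleMagic[] = "__CLANG_OFFLOAD_BUNDLE__";  // stored without terminator

// Reads a little-endian 64-bit value; the caller guarantees 8 readable bytes.
inline uint64_t readU64(const unsigned char* p) {
  uint64_t v = 0;
  for (int i = 7; i >= 0; --i) v = (v << 8) | p[i];
  return v;
}

// Returns the payload whose bundle id equals `id`, or an empty string if
// the blob is malformed or has no such entry.
inline std::string extractCodeObject(const std::string& blob, const std::string& id) {
  const size_t magicLen = sizeof(kBundleMagic) - 1;
  if (blob.size() < magicLen + 8 || blob.compare(0, magicLen, kBundleMagic) != 0) return "";
  const unsigned char* base = reinterpret_cast<const unsigned char*>(blob.data());
  size_t pos = magicLen;
  const uint64_t count = readU64(base + pos);
  pos += 8;
  for (uint64_t i = 0; i < count; ++i) {
    if (blob.size() - pos < 24) return "";  // truncated entry header
    const uint64_t offset = readU64(base + pos);
    const uint64_t size = readU64(base + pos + 8);
    const uint64_t idLen = readU64(base + pos + 16);
    pos += 24;
    if (idLen > blob.size() - pos) return "";  // truncated id
    const std::string entryId = blob.substr(pos, idLen);
    pos += idLen;
    if (entryId == id) {
      if (offset > blob.size() || size > blob.size() - offset) return "";
      return blob.substr(offset, size);
    }
  }
  return "";
}

}  // namespace sketch
```

A loader handling several GPU archs would call this once per candidate bundle id and use the first non-empty result.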
SWDEV-145570 - Fix device name mismatch for gfx906.
For now hip-clang can only emit gfx906 ISA with conservative configurations, i.e. with ecc on and xnack on, so the target is always gfx906. It is still under discussion how to encode the target id for xnack off or ecc off.
Therefore, the reasonable solution for now is to allow a code object marked as gfx906 to be loaded on any device whose name starts with gfx906. We will have more detailed control once hip-clang is able to emit code objects for gfx906 with ecc off or xnack off.
Affected files ...
... //depot/stg/opencl/drivers/opencl/api/hip/hip_platform.cpp#30 edit
SWDEV-144570 - Fix build failure after switching to gcc-7
- Hex representation of float needs gnu++11. We'd better not rely on
that. Change the float in hex format into an alternative representation.
RBT: http://ocltc.amd.com/reviews/r/17300/
Affected files ...
... //depot/stg/opencl/drivers/opencl/api/hip/hip_platform.cpp#29 edit
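Illustration for the gcc-7 build fix (not the actual change): a minimal sketch of replacing a hexadecimal float literal with a portable expression. The constant 2^-14 is an assumed example value; the literal that was actually changed is not shown in this log.

```cpp
#include <cassert>
#include <cmath>

namespace sketch {

// A literal such as 0x1.0p-14f is standard C++ only since C++17 and is a
// GNU extension in earlier modes. std::ldexp(1.0f, -14) computes the same
// value, 1.0 * 2^-14, without needing hex float syntax.
inline float twoToMinus14() { return std::ldexp(1.0f, -14); }

}  // namespace sketch
```

A decimal literal (here 6.103515625e-05f) is another option when the value is exactly representable.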
SWDEV-145570 - Workaround for mismatch of device name and bundle id for gfx906.
Affected files ...
... //depot/stg/opencl/drivers/opencl/api/hip/hip_platform.cpp#25 edit
SWDEV-145570 - [HIP] - Multithreading issues
Add a lock per function so kernel parameters don't get overwritten
Make execStack_ thread local and remove global lock use for it:
The compiler uses the same thread to set it up and launch the function
Affected files ...
... //depot/stg/opencl/drivers/opencl/api/hip/hip_internal.hpp#16 edit
... //depot/stg/opencl/drivers/opencl/api/hip/hip_module.cpp#18 edit
... //depot/stg/opencl/drivers/opencl/api/hip/hip_platform.cpp#19 edit
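Illustration for the multithreading fix (not the actual change): a minimal sketch of a per-function lock plus a thread-local execution stack. LaunchConfig, Function, launchWith and execStack() are hypothetical stand-ins; only the execStack_ idea and the lock-per-function idea come from the description above.

```cpp
#include <cassert>
#include <mutex>
#include <vector>

namespace sketch {

// Hypothetical launch configuration pushed before a kernel launch.
struct LaunchConfig {
  unsigned gridX;
  unsigned blockX;
};

// One stack per thread. No global lock is needed because, per the
// description, the same thread sets up the configuration and launches.
inline std::vector<LaunchConfig>& execStack() {
  thread_local std::vector<LaunchConfig> stack;
  return stack;
}

// Hypothetical function object with its own lock guarding its parameters.
struct Function {
  std::mutex lock;
  std::vector<int> params;

  // Stores the arguments and returns a snapshot while holding the lock, so
  // a concurrent launch of the same function cannot overwrite them midway.
  std::vector<int> launchWith(const std::vector<int>& args) {
    std::lock_guard<std::mutex> guard(lock);
    params = args;
    return params;
  }
};

}  // namespace sketch
```

Launches of different functions do not contend with each other, since each Function owns its own mutex.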
SWDEV-145570 - [HIP] Refactored some g_* stuff
Refactored g_functions into a platform state.
Added a _vars for registered variables.
Added an execution stack similar to Hcc-clang.
Affected files ...
... //depot/stg/opencl/drivers/opencl/api/hip/hip_platform.cpp#14 edit
SWDEV-145570 - [HIP] Change fat binary magic number and clang-offload-bundler target name to match clang
Affected files ...
... //depot/stg/opencl/drivers/opencl/api/hip/hip_platform.cpp#11 edit