SWDEV-197836 - Drop the use of llvm header files in opencl runtime
- Fix compilation error with configurations where COMGR is disabled.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/devprogram.cpp#76 edit
SWDEV-197836 - Drop the use of llvm header files in opencl runtime
- Fix compilation error with configurations where COMGR is disabled.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/devprogram.cpp#75 edit
SWDEV-198863 - Options for hip-clang-vdi path to provide the chicken bits, or functional equivalents to HCC_DB (phase 1)
1. The log macros are turned off for release builds, so the log functions have zero impact on release builds.
2. The log macros have level, mask, and condition controls, so we have more control to avoid log flooding.
I also adjusted some existing logs to use the new log functions:
1. To exercise and test the new log functions.
2. To improve performance slightly.
3. The change is mainly for HIP-ROCM; we can move more over in later phases for PAL or ORCA.
4. I made these log features unavailable for release builds. We can revert to the old log functions for release builds on a case-by-case basis.
Tests:
1. http://ocltc.amd.com:8111/viewModification.html?modId=128289&personal=true&tab=vcsModificationBuilds
   http://ocltc.amd.com:8111/viewModification.html?modId=128358&personal=true&tab=vcsModificationBuilds
2. release build, run hip program: there were no logs.
3. fastdebug build, run hip program,
export LOG_LEVEL=3
export GPU_LOG_MASK=4294967295
There were a lot of logs.
4. fastdebug build, run hip program,
export LOG_LEVEL=2
export GPU_LOG_MASK=4294967295
There were no logs.
5. fastdebug build, run hip program,
export LOG_LEVEL=3
export GPU_LOG_MASK=4294967294
There were far fewer logs.
6. fastdebug build, run hip program,
export LOG_LEVEL=3
export GPU_LOG_MASK=47102
There were even fewer logs. The logs were as expected according to the mask.
7. Tested steps 2 to 6 similarly on Windows and Linux.
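The level and mask behavior exercised in the tests above can be sketched as follows. This is a minimal illustration, not the code in debug.hpp: every identifier (LogConfig, readLogConfig, shouldLog, SKETCH_LOG, SKETCH_RELEASE_BUILD) is hypothetical, and the rule "emit when the message level is at or below LOG_LEVEL and its category bit is set in GPU_LOG_MASK" is an assumption consistent with the test results. For reference, 4294967295 is 0xFFFFFFFF (all bits set), 4294967294 is 0xFFFFFFFE (bit 0 cleared), and 47102 is 0xB7FE.

```cpp
#include <cstdint>
#include <cstdio>
#include <cstdlib>

// All names here are hypothetical; they are not the identifiers used in debug.hpp.
struct LogConfig {
  int level;           // value of LOG_LEVEL
  std::uint32_t mask;  // value of GPU_LOG_MASK
};

// Read both environment variables; unset variables leave logging off (assumed default).
inline LogConfig readLogConfig() {
  LogConfig cfg{0, 0};
  if (const char* s = std::getenv("LOG_LEVEL")) cfg.level = std::atoi(s);
  if (const char* s = std::getenv("GPU_LOG_MASK"))
    cfg.mask = static_cast<std::uint32_t>(std::strtoul(s, nullptr, 0));
  return cfg;
}

// Assumed rule: a message is emitted only if the caller's condition holds,
// its level is enabled, and its category bit is present in the mask.
inline bool shouldLog(const LogConfig& cfg, int level, std::uint32_t category,
                      bool cond = true) {
  return cond && level <= cfg.level && (cfg.mask & category) != 0;
}

// In a release build the macro expands to nothing, so call sites cost nothing.
#if defined(SKETCH_RELEASE_BUILD)
#define SKETCH_LOG(cfg, level, category, msg) ((void)0)
#else
#define SKETCH_LOG(cfg, level, category, msg)                    \
  do {                                                           \
    if (shouldLog((cfg), (level), (category)))                   \
      std::fprintf(stderr, "%s\n", (msg));                       \
  } while (0)
#endif
```

Under these assumptions, a level-3 message in category bit 0 is printed with LOG_LEVEL=3 and GPU_LOG_MASK=4294967295, but suppressed with LOG_LEVEL=2 or with GPU_LOG_MASK=4294967294.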
ReviewBoard: http://ocltc.amd.com/reviews/r/18215
Affected files ...
... //depot/stg/opencl/drivers/opencl/api/hip/hip_internal.hpp#46 edit
... //depot/stg/opencl/drivers/opencl/api/hip/hip_memory.cpp#82 edit
... //depot/stg/opencl/drivers/opencl/api/hip/hip_stream.cpp#26 edit
... //depot/stg/opencl/drivers/opencl/api/hip/hiprtc_internal.hpp#2 edit
... //depot/stg/opencl/drivers/opencl/api/opencl/amdocl/cl_svm.cpp#29 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/comgrctx.cpp#6 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/devkernel.cpp#29 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/devprogram.cpp#68 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/rocm/rocdevice.cpp#137 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/rocm/rocvirtual.cpp#91 edit
... //depot/stg/opencl/drivers/opencl/runtime/platform/command.cpp#100 edit
... //depot/stg/opencl/drivers/opencl/runtime/platform/commandqueue.cpp#32 edit
... //depot/stg/opencl/drivers/opencl/runtime/platform/runtime.cpp#40 edit
... //depot/stg/opencl/drivers/opencl/runtime/utils/debug.hpp#10 edit
... //depot/stg/opencl/drivers/opencl/runtime/utils/flags.hpp#323 edit
SWDEV-208424 - ROCr language runtime should not free code object until executable destroy
- Reshuffle the code to make sure HSA runtime can keep the pointer to the code object
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/devprogram.cpp#67 edit
SWDEV-79445 - OCL generic changes and code clean-up
- Fix memory leaks in COMGR path. Don't create binaryData, since it will be overwritten with action_data_get_data() call.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/devprogram.cpp#65 edit
SWDEV-204999 - [hipclang-vdi-rocm] TF unit test tracking.util_xla_test_gpu fails to run
- Change the HSACO detection logic to use e_machine
- Allow loading a binary that contains no kernels.
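A minimal sketch of e_machine-based detection, for illustration only. The helper name looksLikeHsaco is hypothetical and this is not the code in elf.hpp or devprogram.cpp; it assumes a little-endian ELF image and relies on EM_AMDGPU being 224 and on e_machine sitting at byte offset 18 of the ELF header in both ELF32 and ELF64.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// EM_AMDGPU: the ELF machine value registered for AMD GPU code objects.
constexpr std::uint16_t kEmAmdGpu = 224;

// Hypothetical helper: returns true if `data` begins with an ELF header whose
// e_machine field equals EM_AMDGPU. No kernel symbols are inspected, so a
// binary without any kernel still passes.
inline bool looksLikeHsaco(const void* data, std::size_t size) {
  static const unsigned char kMagic[4] = {0x7f, 'E', 'L', 'F'};
  if (data == nullptr || size < 20) return false;  // need bytes 0..19
  const unsigned char* p = static_cast<const unsigned char*>(data);
  if (std::memcmp(p, kMagic, 4) != 0) return false;
  // e_machine is a 16-bit little-endian field at offset 18.
  const std::uint16_t machine = static_cast<std::uint16_t>(p[18] | (p[19] << 8));
  return machine == kEmAmdGpu;
}
```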
Affected files ...
... //depot/stg/opencl/drivers/opencl/compiler/lib/loaders/elf/elf.hpp#27 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/devprogram.cpp#63 edit
SWDEV-188177 - Using the context from owner() in device::Program functions.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/devprogram.cpp#61 edit
SWDEV-145570 - Rename OCL_DUMP_CODE_OBJECT to GPU_DUMP_CODE_OBJECT.
This flag is used by both OCL and HIP, so rename it to avoid confusion.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/devprogram.cpp#59 edit
... //depot/stg/opencl/drivers/opencl/runtime/utils/flags.hpp#315 edit
SWDEV-144570 - Use Comgr feature flag around COMGR related API usage.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/devprogram.cpp#52 edit
SWDEV-161424 - Fix broken option handling in Comgr path
Introduces another potential bug if any options in Options::llvmOptions contain spaces, but this existed before the switch to Comgr.
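To illustrate the kind of bug described, here is a sketch that assumes the options are joined into one string and later split on whitespace. That mechanism is an assumption, and splitOnSpaces is a hypothetical helper, not a function from devprogram.cpp.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: split a flat option string on whitespace.
// Any single option that itself contains a space is broken into pieces.
inline std::vector<std::string> splitOnSpaces(const std::string& options) {
  std::vector<std::string> out;
  std::istringstream in(options);
  for (std::string tok; in >> tok;) out.push_back(tok);
  return out;
}
```

With this approach, an input holding two options, `-O2` and `-DNAME="a b"`, comes back as three tokens because the second option is split at its embedded space.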
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/devprogram.cpp#50 edit
SWDEV-187169 - Hotel Lobby scene takes long time to compile
Patch authored by Valery Pykhtin.
Remove " -mllvm -amdgpu-early-inline-all" from the options passed
to the compiler; the option interferes with function call support.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/devprogram.cpp#44 edit
SWDEV-162389 - OpenCL Support for COMgr
- direct the COMgr log to buildLog_ buffer
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/devprogram.cpp#43 edit
SWDEV-145570 - Support loading fat binary generated through --genco by hipModuleLoad.
hip-clang --genco generates fat binary instead of code object. To support that
we need to extract code object from fat binary in hipModuleLoadData. This is
needed for hipRTC since multiple GPU archs may be passed.
Affected files ...
... //depot/stg/opencl/drivers/opencl/api/hip/hip_module.cpp#27 edit
... //depot/stg/opencl/drivers/opencl/api/hip/hip_platform.cpp#31 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/devprogram.cpp#42 edit
... //depot/stg/opencl/drivers/opencl/runtime/utils/flags.hpp#308 edit
SWDEV-165259 - Update OpenCL runtime to support MsgPack metadata
- Fixed the missing support of Printf for CO v3
- Added back the flag to disable CO v3 for the non-COMGR environment
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/devkernel.cpp#20 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/devprogram.cpp#40 edit
SWDEV-79445 - OCL generic changes and code clean-up
- More changes to make sure runtime and LC could be built separately
Affected files ...
... //depot/stg/opencl/drivers/opencl/Makefile#71 edit
... //depot/stg/opencl/drivers/opencl/compiler/Makefile#73 edit
... //depot/stg/opencl/drivers/opencl/library/build/Makefile.library#78 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/devprogram.cpp#38 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/palprogram.cpp#88 edit
SWDEV-132899 - [OCL][GFX10] add "wavefrontsize64" to the linkOptions if it had previously been added to the compile options
ReviewRequestURL = http://ocltc.amd.com/reviews/r/16966/
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/devprogram.cpp#35 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/palsettings.cpp#71 edit
SWDEV-132899 - [OCL][GFX10] passing "force-wgp-mode" option to Finalizer to enable WGP mode by default on gfx10+
and allow GPU_ENABLE_WGP_MODE to control the WGP/CU mode for HSAIL/SC path as well.
- Also, for Ariel (Navi10Lite), wave32 should be disabled in LC, but allow GPU_ENABLE_WAVE32_MODE to control it for testing if needed.
ReviewRequestURL = http://ocltc.amd.com/reviews/r/16926/
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/devprogram.cpp#34 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/palsettings.cpp#70 edit
SWDEV-176282 - FP16_MatrixTranspose is failing on NAVI10/VEGA10 PAL/LC path
- add COMGR logging support to show the build log
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/devprogram.cpp#28 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/devprogram.hpp#16 edit
SWDEV-132899 - [OCL][GFX10] report number of WGP by default on gfx10 ASICs
Both HSAIL/SC and LC compilers use WGP mode by default on gfx10 ASICs (i.e., COMPUTE_PGM_RSRC1.WGP_MODE is set to 1 by both compilers) therefore runtime should report number of WGP (i.e., CU/2) on gfx10 ASICs by default.
The new environment variable (GPU_ENABLE_WGP_MODE = 0) can be used to force CU mode on LC (i.e., the -mcumode option) if it's needed (HSAIL/SC doesn't have any compiler option for forcing CU mode).
Also, use the new environment variable (GPU_ENABLE_WAVE32_MODE) to control the wave32 mode on gfx10+.
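The reporting rule described above (number of WGPs, i.e. CU/2, on gfx10 unless CU mode is forced) can be sketched as below. The function name and parameters are hypothetical and do not come from paldevice.cpp or palsettings.cpp; wgpModeEnabled stands in for the effect of GPU_ENABLE_WGP_MODE.

```cpp
#include <cstdint>

// Hypothetical helper: number of compute units the runtime would report.
// On gfx10 with WGP mode enabled, each reported unit is a WGP (two CUs).
inline std::uint32_t reportedComputeUnits(std::uint32_t physicalCUs, bool isGfx10,
                                          bool wgpModeEnabled) {
  return (isGfx10 && wgpModeEnabled) ? physicalCUs / 2 : physicalCUs;
}
```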
ReviewRequestURL = http://ocltc.amd.com/reviews/r/16435/diff/
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/device.hpp#329 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/devprogram.cpp#27 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/paldevice.cpp#121 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/palsettings.cpp#65 edit
... //depot/stg/opencl/drivers/opencl/runtime/utils/flags.hpp#301 edit