SWDEV-2 - Change OpenCL version number from 3033 to 3034.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#2781 edit
SWDEV-198859 - Options for hip-clang-vdi path to provide the chicken bits, or functional equivalents to HCC_DB
This change caused regressions in the ocltst test.
Back out changelist 2026859
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/paldevice.cpp#176 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/palvirtual.cpp#156 edit
... //depot/stg/opencl/drivers/opencl/runtime/utils/debug.hpp#13 edit
SWDEV-198863 - Options for hip-clang-vdi path to provide the chicken bits, or functional equivalents to HCC_DB (phase 3)
Use ClPrint to implement other log functions.
Move some functions to use the new log functions.
This is the final change for this JIRA.
Tests:
1. Linux HIP ROCM platform. VEGA10. Driver is release build.
1.1 export LOG_LEVEL=3
./hipModule
There are many logs.
1.2 export GPU_LOG_MASK=0
./hipModule
There is no log
2. Windows HIP PAL platform. VEGA10, Driver is release build.
2.1 set LOG_LEVEL=3
run test hipPrintfKernel
There are many logs
2.2 set GPU_LOG_MASK=0
run test hipPrintfKernel
There is no log
3. http://ocltc.amd.com:8111/viewModification.html?modId=128490&personal=true&tab=vcsModificationBuilds
ReviewBoard: http://ocltc.amd.com/reviews/r/18247/
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/paldevice.cpp#175 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/palvirtual.cpp#155 edit
... //depot/stg/opencl/drivers/opencl/runtime/utils/debug.hpp#12 edit
SWDEV-2 - Change OpenCL version number from 3032 to 3033.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#2780 edit
SWDEV-79445 - OCL generic changes and code clean-up
- Fix the detection of all devices for P2P. The previous logic worked only if GPU_ENABLE_PAL was forced to 1.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/paldevice.cpp#174 edit
SWDEV-198863 - Options for hip-clang-vdi path to provide the chicken bits, or functional equivalents to HCC_DB (phase 2)
Enable the log functions for release build.
Tests:
1. Linux HIP ROCM platform. VEGA10. Driver is release build.
1.1 export LOG_LEVEL=3
./hipModule
There are many logs.
1.2 export GPU_LOG_MASK=0
./hipModule
There is no log
2. Windows HIP PAL platform. VEGA10, Driver is release or fastdbg build.
2.1 set LOG_LEVEL=3
run test hipPrintfKernel
There are many logs
2.2 set GPU_LOG_MASK=0
run test hipPrintfKernel
There is no log
3. http://ocltc.amd.com:8111/viewModification.html?modId=128481&personal=true&tab=vcsModificationBuilds
ReviewBoard: http://ocltc.amd.com/reviews/r/18240/
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/debug.hpp#11 edit
... //depot/stg/opencl/drivers/opencl/runtime/utils/flags.hpp#324 edit
SWDEV-2 - Change OpenCL version number from 3031 to 3032.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#2779 edit
SWDEV-209969 - SQTT missing event instrumenting token in OpenCL trace
- SQTT trace reports memory clears if program load occurs during the capture. Avoid memory clears with GPU if CPU backing store is available
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/palprogram.cpp#100 edit
SWDEV-198863 - Options for hip-clang-vdi path to provide the chicken bits, or functional equivalents to HCC_DB (phase 1)
1. The log macros are turned off for release builds, so the log functions have zero impact on release builds.
2. The log macros have level, mask, and condition controls, which gives more control to avoid log flooding.
I also adjusted some existing logs to use the new log functions.
1. To exercise and test the new log functions.
2. To improve performance slightly.
3. The change is mainly for HIP-ROCM; more logs can be moved in the next phases for PAL or ORCA.
4. I made these log features unavailable for release builds. We can revert to the old log functions for release builds on a case-by-case basis.
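The level and mask control described above can be sketched as follows. This is a minimal illustration, not the code in debug.hpp: the names LogConfig, shouldLog, envOr, SKETCH_LOG and the kLog* bit assignments are hypothetical. It only assumes that a message is printed when its level does not exceed LOG_LEVEL and its mask bit is set in GPU_LOG_MASK, which matches the test results listed in this change.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <cstdlib>

// Hypothetical mask bits; the real bit assignments live in the runtime headers.
enum LogMask : uint32_t {
  kLogApi = 0x1,
  kLogCmd = 0x2,
  kLogMem = 0x4,
};

struct LogConfig {
  int level;      // messages with a level above this value are dropped
  uint32_t mask;  // a message is printed only if one of its mask bits is set here
};

// Read an unsigned value from the environment, falling back to a default.
inline uint32_t envOr(const char* name, uint32_t fallback) {
  const char* value = std::getenv(name);
  return value ? static_cast<uint32_t>(std::strtoul(value, nullptr, 0)) : fallback;
}

// Level and mask must both pass before anything is formatted.
inline bool shouldLog(const LogConfig& cfg, int level, uint32_t mask) {
  return level <= cfg.level && (cfg.mask & mask) != 0;
}

// The format arguments are not evaluated when the check fails,
// so a filtered-out message costs only the comparison.
#define SKETCH_LOG(cfg, level, mask, ...)    \
  do {                                       \
    if (shouldLog((cfg), (level), (mask))) { \
      std::fprintf(stderr, __VA_ARGS__);     \
      std::fputc('\n', stderr);              \
    }                                        \
  } while (0)
```

A caller would build the configuration once, for example `LogConfig cfg{static_cast<int>(envOr("LOG_LEVEL", 0)), envOr("GPU_LOG_MASK", 0)};`, and then write `SKETCH_LOG(cfg, 3, kLogCmd, "submit %d", id);`. The mask 4294967294 used in the tests is 0xFFFFFFFE, that is, every bit except bit 0.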
Tests:
1. http://ocltc.amd.com:8111/viewModification.html?modId=128289&personal=true&tab=vcsModificationBuilds
http://ocltc.amd.com:8111/viewModification.html?modId=128358&personal=true&tab=vcsModificationBuilds
2. release build, run hip program, there is no log
3. fastdebug build, run hip program,
export LOG_LEVEL=3
export GPU_LOG_MASK=4294967295
There were a lot of logs.
4. fastdebug build, run hip program,
export LOG_LEVEL=2
export GPU_LOG_MASK=4294967295
There were no logs.
5. fastdebug build, run hip program,
export LOG_LEVEL=3
export GPU_LOG_MASK=4294967294
There were far fewer logs.
6. fastdebug build, run hip program,
export LOG_LEVEL=3
export GPU_LOG_MASK=47102
There were even fewer logs. The logs matched what the mask should allow.
7. Tested steps 2 to 6 similarly on Windows and Linux
ReviewBoard: http://ocltc.amd.com/reviews/r/18215
Affected files ...
... //depot/stg/opencl/drivers/opencl/api/hip/hip_internal.hpp#46 edit
... //depot/stg/opencl/drivers/opencl/api/hip/hip_memory.cpp#82 edit
... //depot/stg/opencl/drivers/opencl/api/hip/hip_stream.cpp#26 edit
... //depot/stg/opencl/drivers/opencl/api/hip/hiprtc_internal.hpp#2 edit
... //depot/stg/opencl/drivers/opencl/api/opencl/amdocl/cl_svm.cpp#29 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/comgrctx.cpp#6 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/devkernel.cpp#29 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/devprogram.cpp#68 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/rocm/rocdevice.cpp#137 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/rocm/rocvirtual.cpp#91 edit
... //depot/stg/opencl/drivers/opencl/runtime/platform/command.cpp#100 edit
... //depot/stg/opencl/drivers/opencl/runtime/platform/commandqueue.cpp#32 edit
... //depot/stg/opencl/drivers/opencl/runtime/platform/runtime.cpp#40 edit
... //depot/stg/opencl/drivers/opencl/runtime/utils/debug.hpp#10 edit
... //depot/stg/opencl/drivers/opencl/runtime/utils/flags.hpp#323 edit
SWDEV-2 - Change OpenCL version number from 3030 to 3031.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#2778 edit
SWDEV-2 - Change OpenCL version number from 3029 to 3030.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#2777 edit
SWDEV-200688 - Correct some misalignment between DeviceInfo and CALtarget, so that VEGAM can be correctly recognized as gfx804 instead of gfx902
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gslbe/src/rt/caltarget.h#10 edit
SWDEV-2 - Change OpenCL version number from 3028 to 3029.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#2776 edit
SWDEV-208424 - ROCr language runtime should not free code object until executable destroy
- Reshuffle the code to make sure HSA runtime can keep the pointer to the code object
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/devprogram.cpp#67 edit
SWDEV-2 - Change OpenCL version number from 3027 to 3028.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#2775 edit
SWDEV-2 - Change OpenCL version number from 3026 to 3027.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#2774 edit
SWDEV-79445 - OCL generic changes and code clean-up
- Fix memory leaks in COMGR path. Don't create binaryData, since it will be overwritten with action_data_get_data() call.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/devprogram.cpp#65 edit
SWDEV-204995 - Linux Pro: Houdini18 Application hang is seen with pyro sample on OpenCL selection.
The new Houdini application has about 286.6 KB of TLS. On Linux, the TLS resides in the thread stack. TLS is allocated and initialized during pthread_create.
If the command queue thread stack size is only 256 KB, pthread_create fails with return value EINVAL.
The above information was verified by this test:
I printed the address of a __thread variable and the address of a local variable. I confirmed that both variables are in the same memory segment according to /proc/<pid>/maps. This memory segment has the same size as CQ_THREAD_STACK_SIZE and changes when this environment variable changes.
The __thread variable is 286.6 KB away from the bottom of the stack but still inside the stack.
I added a printf to verify that the function guessTlsSize guesses tlsSize correctly, and that pthread_create succeeds on the first invocation with the TLS size adjustment.
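The stack size adjustment can be sketched as below. This is only an illustration under stated assumptions: adjustedStackSize and createThreadWithTls are hypothetical names, the TLS size is passed in by the caller rather than guessed, and the real guessTlsSize logic in os_posix.cpp is not reproduced here. The only library calls used are the standard pthread attribute functions and sysconf.

```cpp
#include <cassert>
#include <cstddef>
#include <pthread.h>
#include <unistd.h>

// Hypothetical helper: grow the requested stack so it also covers the TLS block,
// then round up to a whole number of pages.
inline size_t adjustedStackSize(size_t requested, size_t tlsSize) {
  const long pageSize = sysconf(_SC_PAGESIZE);
  assert(pageSize > 0);
  const size_t page = static_cast<size_t>(pageSize);
  const size_t total = requested + tlsSize;
  return (total + page - 1) / page * page;
}

inline void* sketchThreadMain(void*) { return nullptr; }

// Create and join a thread whose stack is large enough for stack plus TLS.
// Returns 0 on success, otherwise the pthread error code (for example EINVAL).
inline int createThreadWithTls(size_t requested, size_t tlsSize) {
  pthread_attr_t attr;
  int rc = pthread_attr_init(&attr);
  if (rc != 0) return rc;
  rc = pthread_attr_setstacksize(&attr, adjustedStackSize(requested, tlsSize));
  pthread_t thread;
  if (rc == 0) rc = pthread_create(&thread, &attr, sketchThreadMain, nullptr);
  pthread_attr_destroy(&attr);
  if (rc == 0) rc = pthread_join(thread, nullptr);
  return rc;
}
```

With the numbers from this change, `createThreadWithTls(256 * 1024, 287 * 1024)` asks for a stack of at least 543 KB instead of 256 KB, so the TLS block no longer consumes the whole stack.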
Tests:
1. Test houdini - PASS
2. http://ocltc.amd.com:8111/viewModification.html?modId=128021&personal=true&tab=vcsModificationBuilds
ReviewBoard: http://ocltc.amd.com/reviews/r/18175
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/os/os_posix.cpp#47 edit
SWDEV-2 - Change OpenCL version number from 3025 to 3026.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#2773 edit
SWDEV-208424 - ROCr language runtime should not free code object until executable destroy
- Keep the code object reader alive until the program destruction. Update HSAIL path only, since LC path already handles it correctly.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/rocm/rocprogram.cpp#107 edit
SWDEV-207662 - [EURI][OPENCL][Forum 244452]: Multiple printf statements inside kernel producing strange output on Vega on Windows
- Correct the printf arguments parsing logic. Don't use local PrintfInfo info, because it can contain some stale data after the first iteration
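The class of bug described above can be illustrated with a small sketch. It is a guess at the pattern, not the code in devkernel.cpp: PrintfInfoSketch, parseReusingLocal, and parseFresh are hypothetical names, and the per-printf input is reduced to a list of argument sizes.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the runtime's PrintfInfo.
struct PrintfInfoSketch {
  std::vector<unsigned> argSizes;
};

// Problem pattern: one local object is reused for every printf,
// so arguments from earlier iterations are still in it.
inline std::vector<PrintfInfoSketch> parseReusingLocal(
    const std::vector<std::vector<unsigned>>& raw) {
  std::vector<PrintfInfoSketch> out;
  PrintfInfoSketch info;  // lives across all iterations
  for (const auto& sizes : raw) {
    for (unsigned s : sizes) info.argSizes.push_back(s);  // never cleared
    out.push_back(info);
  }
  return out;
}

// Fixed pattern: fill the destination element directly, one per printf.
inline std::vector<PrintfInfoSketch> parseFresh(
    const std::vector<std::vector<unsigned>>& raw) {
  std::vector<PrintfInfoSketch> out(raw.size());
  for (size_t i = 0; i < raw.size(); ++i) {
    out[i].argSizes = raw[i];
  }
  return out;
}
```

For two printf calls with one and two arguments, the first variant reports three arguments for the second call, while the second variant reports two.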
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/devkernel.cpp#27 edit
SWDEV-2 - Change OpenCL version number from 3024 to 3025.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#2772 edit
SWDEV-2 - Change OpenCL version number from 3023 to 3024.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#2771 edit
SWDEV-2 - Change OpenCL version number from 3022 to 3023.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#2770 edit
SWDEV-204511 - [NV14 XTM] OpenCL Conformance Test Fails
- Handle different ABI versions for LC and HSAIL if a single context with multiple devices was used. LC changed the location of hidden arguments and the HSAIL path requires patching
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/devkernel.cpp#26 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/palkernel.cpp#83 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/palkernel.hpp#30 edit
... //depot/stg/opencl/drivers/opencl/runtime/platform/kernel.hpp#26 edit
... //depot/stg/opencl/drivers/opencl/runtime/platform/program.cpp#105 edit
SWDEV-2 - Change OpenCL version number from 3021 to 3022.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#2769 edit
SWDEV-2 - Change OpenCL version number from 3020 to 3021.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#2768 edit
SWDEV-184710 - Support hipLaunchCooperativeKernelMultiDevice()
- Add support for multi grid launch in hip
- Detect the new hidden argument and pass the required information for the kernel launch
- Memory for synchronization is allocated as a single object and then the offset for each GPU is found
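The single-allocation layout mentioned in the last item can be sketched as follows. The names SyncLayout and layoutSyncBuffer, the per-GPU size, and the alignment are assumptions for illustration; the runtime's actual sizes and allocation calls are not shown in this change description.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Round value up to the next multiple of alignment.
inline size_t alignUp(size_t value, size_t alignment) {
  return (value + alignment - 1) / alignment * alignment;
}

struct SyncLayout {
  size_t totalBytes;            // size of the single allocation
  std::vector<size_t> offsets;  // offsets[i] is where GPU i finds its sync state
};

// Hypothetical layout: one aligned slot per GPU inside one buffer.
inline SyncLayout layoutSyncBuffer(size_t numGpus, size_t bytesPerGpu, size_t alignment) {
  assert(alignment != 0);
  const size_t stride = alignUp(bytesPerGpu, alignment);
  SyncLayout layout{stride * numGpus, {}};
  for (size_t i = 0; i < numGpus; ++i) {
    layout.offsets.push_back(i * stride);
  }
  return layout;
}
```

Each GPU then receives the base address of the shared object plus its own offset, so only one allocation has to be created and released per launch.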
Affected files ...
... //depot/stg/opencl/drivers/opencl/api/hip/hip_module.cpp#44 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/device.hpp#343 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/devkernel.cpp#25 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/devkernel.hpp#17 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/palkernel.cpp#82 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/rocm/rocdevice.cpp#136 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/rocm/rocdevice.hpp#42 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/rocm/rocvirtual.cpp#90 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/rocm/rocvirtual.hpp#30 edit
... //depot/stg/opencl/drivers/opencl/runtime/platform/command.cpp#99 edit
... //depot/stg/opencl/drivers/opencl/runtime/platform/command.hpp#97 edit
SWDEV-2 - Change OpenCL version number from 3019 to 3020.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#2767 edit
SWDEV-2 - Change OpenCL version number from 3018 to 3019.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#2766 edit
SWDEV-2 - Change OpenCL version number from 3017 to 3018.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#2765 edit
SWDEV-79445 - OCL generic changes and code clean-up
- Restore xnack support for Navi1x HW (requires COMGR support). Only Navi2x should have a fix in HW
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/paldefs.hpp#64 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/paldevice.cpp#173 edit
SWDEV-2 - Change OpenCL version number from 3016 to 3017.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#2764 edit
SWDEV-198862 - Options for hip-clang-vdi path to provide the chicken bits, or functional equivalents to HCC_OPT_FLUSH
Add the HCC_OPT_FLUSH flag to use agent fence scope when possible for HIP VDI. The flag defaults to on, similar to HIP HCC.
Add AMD_OCL_OPT_FLUSH to use agent fence scope when possible for OpenCL. This was tested on Windows with PAL. The default is off.
This flag can be used for future OpenCL tests.
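The effect of such a flag can be sketched as below. This is an assumption-laden illustration: FenceScope, flagEnabled, and chooseFenceScope are hypothetical names, the runtime defines its flags through flags.hpp rather than reading the environment this way, and the condition for "when possible" is modeled as a single boolean supplied by the caller.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>

enum class FenceScope { Agent, System };

// Read a boolean flag from the environment; an unset or empty variable
// yields the default, and "0" disables the flag.
inline bool flagEnabled(const char* name, bool defaultValue) {
  const char* value = std::getenv(name);
  if (value == nullptr || *value == '\0') return defaultValue;
  return std::strcmp(value, "0") != 0;
}

// Hypothetical policy: narrow the fence to agent scope only when the flag
// is on and the command does not need system-wide visibility.
inline FenceScope chooseFenceScope(bool optFlush, bool needsSystemScope) {
  return (optFlush && !needsSystemScope) ? FenceScope::Agent : FenceScope::System;
}
```

Under these assumptions, HIP would call `flagEnabled("HCC_OPT_FLUSH", true)` and OpenCL would call `flagEnabled("AMD_OCL_OPT_FLUSH", false)`, which mirrors the defaults stated above.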
Tests:
1. http://ocltc.amd.com:8111/viewModification.html?modId=127189&personal=true&tab=vcsModificationBuilds
The teamcity test includes HIP - VDI - Rocm tests.
2. VEGA10 , Windows, HIP, 110 hiptests PASS.
3. VEGA10 , Linux AMDGPU PRO, HIP - PAL, 110 hiptests PASS.
Newer:
http://ocltc.amd.com:8111/viewModification.html?modId=127193&personal=true&tab=vcsModificationBuilds
Reviewboard: http://ocltc.amd.com/reviews/r/18092/
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/device.cpp#247 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/device.hpp#342 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/rocm/rocvirtual.cpp#89 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/rocm/rocvirtual.hpp#29 edit
... //depot/stg/opencl/drivers/opencl/runtime/utils/flags.hpp#321 edit
SWDEV-205994 - [CQE OCL][NAVI10][DTB-BLOCKER] ~ 10% -50% performance drop observed while running IndigoBench Benchmark | Faulty CL#2007647
- PAL changed the value reported in numAvailableVgprs on Navi10. Runtime has to switch to vgprsPerSimd for scratch buffer size calculation.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/paldevice.cpp#172 edit
SWDEV-2 - Change OpenCL version number from 3015 to 3016.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#2763 edit
SWDEV-193973 - Update perfcounter info to accommodate PAL interface changes
Gfx103 added perf counters for three new blocks - GeDist, GeSe and Df
1. Update the blockIdToIndexSelect array to reflect these changes.
ReviewBoardURL = http://ocltc.amd.com/reviews/r/18063/diff/
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/palcounters.cpp#25 edit
SWDEV-204782 - store extra information per HSA queue
The new struct QueueInfo is used to store metadata about each HSA
queue. For hostcall, this structure will eventually contain a pointer to
the hostcall buffer allocated to each HSA queue.
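A per-queue metadata table of this kind can be sketched as follows. Only the name QueueInfo and the planned hostcall buffer pointer come from the description above; HsaQueueSketch, QueueTableSketch, and the refCount field are hypothetical and stand in for whatever the ROCm device code actually stores.

```cpp
#include <cassert>
#include <cstddef>
#include <map>

// Stand-in for the HSA queue type; the real code would use the HSA runtime's queue.
struct HsaQueueSketch {
  int id;
};

struct QueueInfo {
  int refCount = 0;                // hypothetical field, for illustration only
  void* hostcallBuffer = nullptr;  // the pointer the description says will be added later
};

// Hypothetical table mapping each queue to its metadata.
class QueueTableSketch {
 public:
  // Returns the metadata for a queue, creating a default entry on first use.
  QueueInfo& acquire(HsaQueueSketch* queue) {
    QueueInfo& info = table_[queue];
    ++info.refCount;
    return info;
  }

  size_t size() const { return table_.size(); }

 private:
  std::map<HsaQueueSketch*, QueueInfo> table_;
};
```

Keeping the metadata in a struct means later additions, such as the hostcall buffer, extend QueueInfo without changing how queues are looked up.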
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/rocm/rocdevice.cpp#135 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/rocm/rocdevice.hpp#41 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/rocm/rocvirtual.cpp#88 edit
SWDEV-2 - Change OpenCL version number from 3014 to 3015.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#2762 edit
SWDEV-2 - Change OpenCL version number from 3013 to 3014.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#2761 edit
SWDEV-189650 - [HIP-CLANG][HIP/VDI/PAL] Hangs on test hip_threadfence_system
1. In HIP + VDI + ROCm, allow SVM atomics on VEGA10 and later ASICs. GFX8 (Tonga) was already enabled.
2. In the HIP + VDI + PAL Linux driver, allow SVM atomics on VEGA10 and later ASICs.
Tests:
1. In HIP + VDI + ROCm, hip_threadfence_system test passed.
2. In HIP + VDI + PAL + Linux , hip_threadfence_system test passed.
3. OpenCL + PAL, clinfo and ocltest runtime test pass.
4. OpenCL + ROCM, clinfo and ocltest runtime test pass.
5. Windows 10, VEGA 10, clinfo and ocltest runtime test pass. hip_threadfence_system test passed by skipping the test.
Teamcity presubmission test:
http://ocltc.amd.com:8111/viewModification.html?modId=127083&personal=true&tab=vcsModificationBuilds
http://ocltc.amd.com:8111/viewModification.html?modId=127076&personal=true&tab=vcsModificationBuilds
ReviewBoard: http://ocltc.amd.com/reviews/r/18077/
Affected files ...
... //depot/stg/opencl/drivers/opencl/api/hip/hip_memory.cpp#73 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/paldevice.cpp#171 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/palresource.cpp#80 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/palresource.hpp#31 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/rocm/rocdevice.cpp#134 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/rocm/rocmemory.cpp#44 edit
... //depot/stg/opencl/drivers/opencl/runtime/utils/flags.hpp#320 edit