SWDEV-79445 - OCL generic changes and code clean-up
- Fix memory leaks in the COMGR path. Don't create binaryData, since it will be overwritten by the action_data_get_data() call.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/devprogram.cpp#65 edit
[ROCm/clr commit: e22fe683e7]
SWDEV-204995 - Linux Pro: Houdini18 Application hang is seen with pyro sample on OpenCL selection.
The new Houdini application uses about 286.6 KB of TLS. On Linux, TLS resides in the thread stack and is allocated and initialized during pthread_create.
If the command queue thread stack size is only 256 KB, pthread_create fails with return value EINVAL.
The above was verified with this test:
I printed the address of a __thread variable and the address of a local variable, and confirmed that both are in the same memory segment according to /proc/id/maps. That segment is the same size as CQ_THREAD_STACK_SIZE and changes with this environment variable.
The __thread variable is 286.6 KB away from the bottom of the stack but still inside the stack.
I added a printf to verify that guessTlsSize guesses the TLS size correctly. With the TLS size adjustment, pthread_create succeeds on the first invocation.
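A minimal sketch of the stack size adjustment described above. The helper names (alignUp, adjustedStackSize, createThreadWithTlsRoom) are hypothetical, the TLS size is supplied by the caller rather than guessed, and this is not the code in os_posix.cpp or a reproduction of guessTlsSize:

```cpp
#include <pthread.h>
#include <unistd.h>

#include <cassert>
#include <cstddef>

// Hypothetical helper: round size up to a multiple of align (a power of two).
static std::size_t alignUp(std::size_t size, std::size_t align) {
  assert(align != 0 && (align & (align - 1)) == 0);
  return (size + align - 1) & ~(align - 1);
}

// Hypothetical helper: stack size to request so that tlsSize bytes of TLS
// fit in addition to the requested usable stack.
std::size_t adjustedStackSize(std::size_t requested, std::size_t tlsSize,
                              std::size_t pageSize) {
  return alignUp(requested + tlsSize, pageSize);
}

static void* threadBody(void*) { return nullptr; }

// Creates and joins one thread with the adjusted stack size.
// Returns 0 on success or the pthread error code.
int createThreadWithTlsRoom(std::size_t requested, std::size_t tlsSize) {
  const std::size_t pageSize = static_cast<std::size_t>(sysconf(_SC_PAGESIZE));
  pthread_attr_t attr;
  int rc = pthread_attr_init(&attr);
  if (rc != 0) return rc;
  rc = pthread_attr_setstacksize(
      &attr, adjustedStackSize(requested, tlsSize, pageSize));
  pthread_t thread;
  if (rc == 0) rc = pthread_create(&thread, &attr, threadBody, nullptr);
  pthread_attr_destroy(&attr);
  if (rc == 0) rc = pthread_join(thread, nullptr);
  return rc;
}
```

With a 256 KB request and roughly 286.6 KB of TLS, the sketch asks for the sum of the two rounded up to a page boundary, so the TLS block no longer consumes the whole stack.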
Tests:
1. Test houdini - PASS
2. http://ocltc.amd.com:8111/viewModification.html?modId=128021&personal=true&tab=vcsModificationBuilds
ReviewBoard: http://ocltc.amd.com/reviews/r/18175
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/os/os_posix.cpp#47 edit
[ROCm/clr commit: be8023429a]
SWDEV-2 - Change OpenCL version number from 3025 to 3026.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#2773 edit
[ROCm/clr commit: 81c59e130f]
SWDEV-208424 - ROCr language runtime should not free code object until executable destroy
- Keep the code object reader alive until program destruction. Update the HSAIL path only, since the LC path already handles it correctly.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/rocm/rocprogram.cpp#107 edit
[ROCm/clr commit: a1c86d10b5]
SWDEV-207662 - [EURI][OPENCL][Forum 244452]: Multiple printf statements inside kernel producing strange output on Vega on Windows
- Correct the printf argument parsing logic. Don't use the local PrintfInfo info, because it can contain stale data after the first iteration.
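
The failure mode named above, a local record reused across loop iterations, can be illustrated with a small sketch. PrintfInfoSketch, Record, and both function names are hypothetical stand-ins; the real PrintfInfo fields and the parsing code in devkernel.cpp are not shown in this log and may differ:

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the runtime's PrintfInfo.
struct PrintfInfoSketch {
  std::string fmt;
  std::vector<unsigned> argSizes;
};

using Record = std::pair<std::string, std::vector<unsigned>>;

// Problematic pattern: one local object is shared by every iteration, so
// argument sizes from earlier printf calls leak into later ones.
std::vector<PrintfInfoSketch> parseWithSharedLocal(
    const std::vector<Record>& records) {
  std::vector<PrintfInfoSketch> out;
  PrintfInfoSketch info;  // declared once, never reset
  for (const auto& rec : records) {
    info.fmt = rec.first;
    for (unsigned size : rec.second) info.argSizes.push_back(size);
    out.push_back(info);
  }
  return out;
}

// Safer pattern: fill each destination entry directly, so no state
// survives from one iteration to the next.
std::vector<PrintfInfoSketch> parseInPlace(const std::vector<Record>& records) {
  std::vector<PrintfInfoSketch> out(records.size());
  for (std::size_t i = 0; i < records.size(); ++i) {
    out[i].fmt = records[i].first;
    out[i].argSizes = records[i].second;
  }
  return out;
}
```

For two records with two and one arguments, the shared-local version reports three argument sizes for the second printf, while the in-place version reports one.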
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/devkernel.cpp#27 edit
[ROCm/clr commit: 7ad6787328]
SWDEV-2 - Change OpenCL version number from 3024 to 3025.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#2772 edit
[ROCm/clr commit: 3991264b0f]
SWDEV-2 - Change OpenCL version number from 3023 to 3024.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#2771 edit
[ROCm/clr commit: 707020264f]
SWDEV-2 - Change OpenCL version number from 3022 to 3023.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#2770 edit
[ROCm/clr commit: 59d9f16f77]
SWDEV-204511 - [NV14 XTM] OpenCL Conformance Test Fails
- Handle different ABI versions for LC and HSAIL when a single context with multiple devices is used. LC changed the location of hidden arguments, and the HSAIL path requires patching.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/devkernel.cpp#26 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/palkernel.cpp#83 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/palkernel.hpp#30 edit
... //depot/stg/opencl/drivers/opencl/runtime/platform/kernel.hpp#26 edit
... //depot/stg/opencl/drivers/opencl/runtime/platform/program.cpp#105 edit
[ROCm/clr commit: 50ad4d1a7f]
SWDEV-2 - Change OpenCL version number from 3021 to 3022.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#2769 edit
[ROCm/clr commit: f351cfb96b]
SWDEV-2 - Change OpenCL version number from 3020 to 3021.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#2768 edit
[ROCm/clr commit: 09382fcc6a]
SWDEV-184710 - Support hipLaunchCooperativeKernelMultiDevice()
- Add support for multi-grid launch in HIP
- Detect the new hidden argument and pass the required information for the kernel launch
- Memory for synchronization is allocated as a single object and then the offset for each GPU is found
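
One way to picture the single synchronization allocation with per-GPU offsets is the sketch below. The fixed slot size and back-to-back packing are assumptions made for illustration, and syncBufferSize and syncSlotOffset are hypothetical names; the actual layout used by the runtime is not described in this log:

```cpp
#include <cassert>
#include <cstddef>

// Assumed layout: numDevices slots of slotSize bytes, packed back to back
// inside one allocation.
std::size_t syncBufferSize(std::size_t numDevices, std::size_t slotSize) {
  return numDevices * slotSize;
}

// Byte offset of the slot that belongs to the device at deviceIndex.
std::size_t syncSlotOffset(std::size_t deviceIndex, std::size_t numDevices,
                           std::size_t slotSize) {
  assert(deviceIndex < numDevices);
  return deviceIndex * slotSize;
}
```

Each GPU in the launch would then receive the base address of the shared object plus its own offset.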
Affected files ...
... //depot/stg/opencl/drivers/opencl/api/hip/hip_module.cpp#44 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/device.hpp#343 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/devkernel.cpp#25 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/devkernel.hpp#17 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/palkernel.cpp#82 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/rocm/rocdevice.cpp#136 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/rocm/rocdevice.hpp#42 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/rocm/rocvirtual.cpp#90 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/rocm/rocvirtual.hpp#30 edit
... //depot/stg/opencl/drivers/opencl/runtime/platform/command.cpp#99 edit
... //depot/stg/opencl/drivers/opencl/runtime/platform/command.hpp#97 edit
[ROCm/clr commit: 6e7e97987f]
SWDEV-2 - Change OpenCL version number from 3019 to 3020.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#2767 edit
[ROCm/clr commit: 4bb7f81c62]
SWDEV-2 - Change OpenCL version number from 3018 to 3019.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#2766 edit
[ROCm/clr commit: 1b6971999d]
SWDEV-2 - Change OpenCL version number from 3017 to 3018.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#2765 edit
[ROCm/clr commit: f4cc3fe53d]
SWDEV-79445 - OCL generic changes and code clean-up
- Restore xnack support for Navi1x HW (requires COMGR support). Only Navi2x should have a fix in HW.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/paldefs.hpp#64 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/paldevice.cpp#173 edit
[ROCm/clr commit: cac8628fe2]
SWDEV-2 - Change OpenCL version number from 3016 to 3017.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#2764 edit
[ROCm/clr commit: 8dfb60f2c7]
SWDEV-198862 - Options for hip-clang-vdi path to provide the chicken bits, or functional equivalents to HCC_OPT_FLUSH
Add the HCC_OPT_FLUSH flag to use fence scope agent when possible for HIP VDI. The flag defaults to on, similar to HIP HCC.
Add AMD_OCL_OPT_FLUSH to use fence scope agent when possible for OpenCL. This was tested on Windows and PAL. Default is off.
This flag can be used for future OpenCL tests.
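A rough sketch of how such a flag could select the fence scope. The readFlag helper, the FenceScope enum, and the reading of "when possible" as "the host does not need to observe the results" are all assumptions; the runtime's own flag machinery in flags.hpp and the real conditions in rocvirtual.cpp are not shown in this log:

```cpp
#include <cstdlib>
#include <cstring>

enum class FenceScope { Agent, System };

// Hypothetical flag reader: unset -> defaultValue, "0" -> false, else true.
bool readFlag(const char* name, bool defaultValue) {
  const char* value = std::getenv(name);
  if (value == nullptr) return defaultValue;
  return std::strcmp(value, "0") != 0;
}

// Assumed policy: use the narrower agent scope only when the optimization
// is enabled and the host does not need to see the results.
FenceScope releaseScope(bool optFlush, bool hostMayObserve) {
  if (optFlush && !hostMayObserve) return FenceScope::Agent;
  return FenceScope::System;
}
```

Under these assumptions, HIP would call readFlag("HCC_OPT_FLUSH", true) and OpenCL would call readFlag("AMD_OCL_OPT_FLUSH", false), matching the defaults stated above.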
Tests:
1. http://ocltc.amd.com:8111/viewModification.html?modId=127189&personal=true&tab=vcsModificationBuilds
The teamcity test includes HIP - VDI - Rocm tests.
2. VEGA10, Windows, HIP, 110 hiptests PASS.
3. VEGA10, Linux AMDGPU PRO, HIP - PAL, 110 hiptests PASS.
Newer:
http://ocltc.amd.com:8111/viewModification.html?modId=127193&personal=true&tab=vcsModificationBuilds
Reviewboard: http://ocltc.amd.com/reviews/r/18092/
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/device.cpp#247 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/device.hpp#342 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/rocm/rocvirtual.cpp#89 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/rocm/rocvirtual.hpp#29 edit
... //depot/stg/opencl/drivers/opencl/runtime/utils/flags.hpp#321 edit
[ROCm/clr commit: d43f2b6372]
SWDEV-205994 - [CQE OCL][NAVI10][DTB-BLOCKER] ~ 10% -50% performance drop observed while running IndigoBench Benchmark | Faulty CL#2007647
- PAL changed the value reported in numAvailableVgprs on Navi10. Runtime has to switch to vgprsPerSimd for scratch buffer size calculation.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/paldevice.cpp#172 edit
[ROCm/clr commit: a65bdb6d6d]
SWDEV-2 - Change OpenCL version number from 3015 to 3016.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#2763 edit
[ROCm/clr commit: dcaa953e8d]
SWDEV-193973 - Update perfcounter info to accommodate PAL interface changes
Gfx103 added perf counters for three new blocks - GeDist, GeSe and Df
1. Update the blockIdToIndexSelect array to reflect these changes.
ReviewBoardURL = http://ocltc.amd.com/reviews/r/18063/diff/
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/palcounters.cpp#25 edit
[ROCm/clr commit: 5fc5006853]
SWDEV-204782 - store extra information per HSA queue
The new struct QueueInfo is used to store metadata about each HSA
queue. For hostcall, this structure will eventually contain a pointer to
the hostcall buffer allocated to each HSA queue.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/rocm/rocdevice.cpp#135 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/rocm/rocdevice.hpp#41 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/rocm/rocvirtual.cpp#88 edit
[ROCm/clr commit: 1820fe21cf]
SWDEV-2 - Change OpenCL version number from 3014 to 3015.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#2762 edit
[ROCm/clr commit: 35a4ac7c6b]
SWDEV-2 - Change OpenCL version number from 3013 to 3014.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#2761 edit
[ROCm/clr commit: dd44634dba]
SWDEV-189650 - [HIP-CLANG][HIP/VDI/PAL] Hangs on test hip_threadfence_system
1. In HIP + VDI + ROCm, allow SVM atomics on VEGA10 and later ASICs. GFX8 (Tonga) was enabled before.
2. In the HIP + VDI + PAL Linux driver, allow SVM atomics on VEGA10 and later ASICs.
Tests:
1. In HIP + VDI + ROCm, hip_threadfence_system test passed.
2. In HIP + VDI + PAL + Linux , hip_threadfence_system test passed.
3. OpenCL + PAL, clinfo and ocltest runtime test pass.
4. OpenCL + ROCM, clinfo and ocltest runtime test pass.
5. Windows 10, VEGA 10, clinfo and ocltest runtime test pass. hip_threadfence_system test passed by skipping the test.
Teamcity presubmission test:
http://ocltc.amd.com:8111/viewModification.html?modId=127083&personal=true&tab=vcsModificationBuilds
http://ocltc.amd.com:8111/viewModification.html?modId=127076&personal=true&tab=vcsModificationBuilds
ReviewBoard: http://ocltc.amd.com/reviews/r/18077/
Affected files ...
... //depot/stg/opencl/drivers/opencl/api/hip/hip_memory.cpp#73 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/paldevice.cpp#171 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/palresource.cpp#80 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/palresource.hpp#31 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/rocm/rocdevice.cpp#134 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/rocm/rocmemory.cpp#44 edit
... //depot/stg/opencl/drivers/opencl/runtime/utils/flags.hpp#320 edit
[ROCm/clr commit: d3b6a9731c]
SWDEV-2 - Change OpenCL version number from 3012 to 3013.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#2760 edit
[ROCm/clr commit: 3bfe682c3f]
SWDEV-204999 - [hipclang-vdi-rocm] TF unit test tracking.util_xla_test_gpu fails to run
- Fix a regression with 32bit binaries in HSAIL mode
Affected files ...
... //depot/stg/opencl/drivers/opencl/compiler/lib/loaders/elf/elf.hpp#28 edit
[ROCm/clr commit: 371a872e7c]
SWDEV-204999 - [hipclang-vdi-rocm] TF unit test tracking.util_xla_test_gpu fails to run
- Change the HSACO detection logic to use e_machine
- Allow loading a binary without any kernels.
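
A minimal sketch of detection based on e_machine. It assumes the check compares the ELF header's e_machine field against EM_AMDGPU (value 224 in the ELF machine registry); looksLikeAmdGpuElf is a hypothetical name, and the actual logic in elf.hpp and devprogram.cpp may test more than this:

```cpp
#include <cstddef>
#include <cstdint>

// EM_AMDGPU from the ELF machine registry; defined locally because older
// system headers may not provide it.
constexpr std::uint16_t kEmAmdGpu = 224;

// Returns true if data starts with an ELF header whose e_machine field
// equals EM_AMDGPU. e_machine sits at byte offset 18 in both the 32-bit
// and 64-bit ELF headers.
bool looksLikeAmdGpuElf(const unsigned char* data, std::size_t size) {
  if (data == nullptr || size < 20) return false;
  if (data[0] != 0x7f || data[1] != 'E' || data[2] != 'L' || data[3] != 'F') {
    return false;
  }
  const bool littleEndian = (data[5] == 1);  // EI_DATA: 1 = ELFDATA2LSB
  const unsigned lo = littleEndian ? data[18] : data[19];
  const unsigned hi = littleEndian ? data[19] : data[18];
  const std::uint16_t machine = static_cast<std::uint16_t>(lo | (hi << 8));
  return machine == kEmAmdGpu;
}
```

Because the check reads only the header, it does not depend on the binary containing any kernel symbols.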
Affected files ...
... //depot/stg/opencl/drivers/opencl/compiler/lib/loaders/elf/elf.hpp#27 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/devprogram.cpp#63 edit
[ROCm/clr commit: 02fbea29d6]
SWDEV-2 - Change OpenCL version number from 3011 to 3012.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#2759 edit
[ROCm/clr commit: fefeaab2b2]
SWDEV-86035 - Integrate PAL from //depot/stg/pal_prm/...
- Adjust Gfx9PlusSubDeviceInfo for the new defines
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/paldefs.hpp#62 edit
[ROCm/clr commit: fa12de9ce1]
SWDEV-2 - Change OpenCL version number from 3010 to 3011.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#2758 edit
[ROCm/clr commit: 370a6a881a]
SWDEV-2 - Change OpenCL version number from 3009 to 3010.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#2757 edit
[ROCm/clr commit: 33342675de]
SWDEV-192384 - [HIP CQE][HIPonPAL][19.40] hipBindTexRef1DFetch, hipTextureRef2D are failed on all ASICs for both Win/Lnx
1. Don't ignore the PAL_ALWAYS_RESIDENT flag for HIP.
ReviewBoardURL = http://ocltc.amd.com/reviews/r/18061/diff/
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/palsettings.cpp#101 edit
[ROCm/clr commit: d6be390e0a]
SWDEV-192384 - [HIP CQE][HIPonPAL][19.40] hipBindTexRef1DFetch, hipTextureRef2D are failed on all ASICs for both Win/Lnx
Add undefined memory object in PAL process memory objects.
http://ocltc.amd.com/reviews/r/18055/
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/devprogram.hpp#33 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/palvirtual.cpp#154 edit
[ROCm/clr commit: 73b57f8bae]
SWDEV-2 - Change OpenCL version number from 3008 to 3009.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#2756 edit
[ROCm/clr commit: 411f8194e5]
SWDEV-192384 - [HIP CQE][HIPonPAL][19.40] hipBindTexRef1DFetch, hipTextureRef2D are failed on all ASICs for both Win/Lnx
The runtime cannot trivially determine all the resources that will be used by a kernel, thus it can fail to make all of them resident.
1. Add new runtime flag PAL_ALWAYS_RESIDENT. Enabling this setting will cause resources to become resident at allocation time.
2. Set the default value of the above flag to true for HIP and false for OCL.
ReviewBoardURL = http://ocltc.amd.com/reviews/r/18054/diff/
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/palresource.cpp#79 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/palresource.hpp#30 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/palsettings.cpp#100 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/palsettings.hpp#27 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/palvirtual.cpp#153 edit
... //depot/stg/opencl/drivers/opencl/runtime/utils/flags.hpp#319 edit
[ROCm/clr commit: 2404ade2ef]
SWDEV-192384 - [HIP CQE][HIPonPAL][19.40] hipBindTexRef1DFetch, hipTextureRef2D are failed on all ASICs for both Win/Lnx
1. Correctly set the image type for textures created from arrays.
2. Allow creating any kind of image from a buffer.
ReviewBoardURL = http://ocltc.amd.com/reviews/r/18051/diff/
Affected files ...
... //depot/stg/opencl/drivers/opencl/api/hip/hip_texture.cpp#19 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/paldevice.cpp#166 edit
[ROCm/clr commit: bf668d922c]
SWDEV-2 - Change OpenCL version number from 3007 to 3008.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#2755 edit
[ROCm/clr commit: ab126af6c7]