SWDEV-216705 - [hipclang-vdi-rocm][FBA-80]Test crash when all GPUs are hidden by ROCR_VISIBLE_DEVICES
Return an error instead of dereferencing a null pointer. This should address the issue described
in the ticket, but more places need fixing in the runtime to avoid crashes for corner cases.
Affected files ...
... //depot/stg/opencl/drivers/opencl/api/hip/hip_context.cpp#26 edit
... //depot/stg/opencl/drivers/opencl/api/hip/hip_device_runtime.cpp#23 edit
... //depot/stg/opencl/drivers/opencl/api/hip/hip_memory.cpp#92 edit
... //depot/stg/opencl/drivers/opencl/api/hip/hip_platform.cpp#58 edit
[ROCm/hip commit: f6d38a725c]
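Note on SWDEV-216705 above: a minimal sketch of the "return an error instead of dereferencing a null pointer" pattern, assuming a list of visible devices that is empty when every GPU is hidden. `Status`, `Device` and `currentDevice` are hypothetical names for illustration only; this is not the submitted code.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-ins for the runtime's types; not the real HIP definitions.
enum Status { kSuccess = 0, kErrorNoDevice = 1, kErrorInvalidValue = 2 };
struct Device { int id; };

// Returns the selected device through `out`, or an error when none is usable.
Status currentDevice(const std::vector<Device*>& visible, int index, Device** out) {
  assert(out != nullptr);  // passing no output slot is a programming error
  *out = nullptr;
  // With every GPU hidden the list is empty, so there is nothing to dereference.
  if (visible.empty()) return kErrorNoDevice;
  if (index < 0 || index >= static_cast<int>(visible.size())) return kErrorInvalidValue;
  Device* dev = visible[index];
  if (dev == nullptr) return kErrorNoDevice;
  *out = dev;
  return kSuccess;
}
```

Callers check the returned status before touching `*out`, so an empty device list surfaces as an error code rather than a crash.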
SWDEV-213031 - Check the functions_ map first; otherwise interpret the pointer as a hip::Function for now. The function may not be a device function and may have been obtained via hipModuleGetFunction, in which case it is not in the functions_ map.
ReviewBoardURL = http://ocltc.amd.com/reviews/r/18388/diff/
Affected files ...
... //depot/stg/opencl/drivers/opencl/api/hip/hip_platform.cpp#57 edit
[ROCm/hip commit: 455c3a91ef]
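Note on SWDEV-213031 above: a rough sketch of the lookup-then-fallback logic, assuming a map keyed by host function address. `Function`, `FunctionMap` and `resolveFunction` are hypothetical stand-ins, not the actual hip_platform.cpp code.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for hip::Function.
struct Function { std::string name; };

// Hypothetical registry keyed by host-side function address.
using FunctionMap = std::unordered_map<const void*, Function*>;

// Look the handle up first; if absent, treat the handle itself as a Function*.
Function* resolveFunction(const FunctionMap& functions, const void* handle) {
  assert(handle != nullptr);
  auto it = functions.find(handle);
  if (it != functions.end()) return it->second;
  // Fallback: assume the caller passed a Function obtained from a module lookup.
  return static_cast<Function*>(const_cast<void*>(handle));
}
```

The fallback trusts the caller, which is why the commit message qualifies it with "for now".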
SWDEV-145570 - [hip] special case const char* for logs in case it's a null pointer.
Affected files ...
... //depot/stg/opencl/drivers/opencl/api/hip/trace_helper.h#3 edit
[ROCm/hip commit: de46a0e205]
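Note on the trace_helper.h change above: streaming a null `const char*` into a `std::ostream` is undefined behaviour, so a dedicated overload is one way to special-case it. The sketch below assumes a helper named `toLogString` and a `<null>` placeholder, both of which are illustrative and not taken from the actual header.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Generic case: stream the argument as-is.
template <typename T>
std::string toLogString(const T& value) {
  std::ostringstream os;
  os << value;
  return os.str();
}

// Special case: a null const char* must not be streamed,
// so print a placeholder instead.
inline std::string toLogString(const char* value) {
  return value ? std::string(value) : std::string("<null>");
}
```

Overload resolution prefers the non-template overload for `const char*` arguments, so only string pointers take the null-safe path.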
SWDEV-213526 - [hip] OOM issue
Delay any access to the device layers until the app calls a HIP API.
This allows the app to fork the process first and then call HIP, which is legal.
Making HIP calls and then forking is neither legal nor supported by ROCm.
Affected files ...
... //depot/stg/opencl/drivers/opencl/api/hip/hip_context.cpp#25 edit
... //depot/stg/opencl/drivers/opencl/api/hip/hip_internal.hpp#49 edit
... //depot/stg/opencl/drivers/opencl/api/hip/hip_platform.cpp#55 edit
[ROCm/hip commit: 921fa13d81]
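Note on SWDEV-213526 above: one common way to delay device access until the first API call is a function-local static, which C++11 initialises exactly once on first use. The sketch assumes hypothetical names (`DeviceLayers`, `deviceLayers`, `apiGetDeviceCount`) and a placeholder device count; the real runtime may well use a different mechanism.

```cpp
#include <cassert>

int g_initCount = 0;  // how many times the device layers were opened

// Hypothetical holder for state that must not exist before the first API call.
struct DeviceLayers {
  DeviceLayers() { ++g_initCount; }  // placeholder for opening the devices
  int deviceCount = 1;               // placeholder value
};

// The first caller constructs the state; later callers reuse it.
DeviceLayers& deviceLayers() {
  static DeviceLayers layers;
  return layers;
}

// Every API entry point goes through deviceLayers() instead of a global object.
int apiGetDeviceCount() { return deviceLayers().deviceCount; }
```

Because nothing touches the devices at library load time, a process that forks before its first API call initialises the device layers only in the process that actually makes the call.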
SWDEV-214490 - Update HIP RT for texture3D in HIP/PAL on Windows
-Update hipTexRefSetArray
http://ocltc.amd.com/reviews/r/18356/
Affected files ...
... //depot/stg/opencl/drivers/opencl/api/hip/hip_texture.cpp#29 edit
[ROCm/hip commit: f4df28905e]
SWDEV-214490 - Update HIP RT for texture3D in HIP/PAL on Windows
-Update hipMemcpy3D function
http://ocltc.amd.com/reviews/r/18346/
Affected files ...
... //depot/stg/opencl/drivers/opencl/api/hip/hip_memory.cpp#90 edit
[ROCm/hip commit: 188a357527]
SWDEV-145570 - Adding back the lazy kernel changes, since the OOM issue is caused by KFD/RocR.
Affected files ...
... //depot/stg/opencl/drivers/opencl/api/hip/hip_platform.cpp#53 edit
[ROCm/hip commit: a077140ae2]
SWDEV-214490 - Update HIP RT for texture3D in HIP/PAL on Windows
- Update function hipMemcpy3D for Texture Array
- Add hipArrayCubemap support in hipMalloc3DArray
http://ocltc.amd.com/reviews/r/18328/
Affected files ...
... //depot/stg/opencl/drivers/opencl/api/hip/hip_memory.cpp#88 edit
[ROCm/hip commit: 5349bd8036]
SWDEV-213526 - pytorch tests fail with hipErrorOutOfMemory
There's a bug in ROCr when loading a lot of kernels without syncing.
So for now, if an allocation fails, sync the devices and retry before
returning a hipErrorOutOfMemory error.
Affected files ...
... //depot/stg/opencl/drivers/opencl/api/hip/hip_memory.cpp#86 edit
[ROCm/hip commit: 5c5588bf20]
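Note on the retry workaround above: a minimal sketch of "allocate, and on failure sync and retry once". The two callbacks and the `Status` values are hypothetical stand-ins for runtime internals and do not reflect the actual hip_memory.cpp code.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>

enum Status { kSuccess = 0, kErrorOutOfMemory = 2 };  // illustrative values

// `tryAlloc` and `syncAllDevices` are hypothetical callbacks.
Status allocWithRetry(std::size_t bytes, void** ptr,
                      const std::function<void*(std::size_t)>& tryAlloc,
                      const std::function<void()>& syncAllDevices) {
  assert(ptr != nullptr);
  *ptr = tryAlloc(bytes);
  if (*ptr == nullptr) {
    syncAllDevices();        // let pending work finish so memory can be released
    *ptr = tryAlloc(bytes);  // one retry only
  }
  return *ptr ? kSuccess : kErrorOutOfMemory;
}
```

The retry is bounded to a single attempt, so a genuine out-of-memory condition still reports an error promptly.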
SWDEV-212440 - [HIP] Memory access fault observed on Pytorch while running performance tests with Microbenchmarking script
We need to loop through all the default streams and sync them, in case
the app calls hipFree on a different current stream while another stream
is still using the memory.
Affected files ...
... //depot/stg/opencl/drivers/opencl/api/hip/hip_memory.cpp#85 edit
[ROCm/hip commit: 9529795fab]
SWDEV-206239 - [HIP] Return hipErrorMemoryAllocation for fine grained VRAM for now
Affected files ...
... //depot/stg/opencl/drivers/opencl/api/hip/hip_memory.cpp#83 edit
[ROCm/hip commit: 571b8d625d]
SWDEV-198863 - Options for hip-clang-vdi path to provide the chicken bits, or functional equivalents to HCC_DB (phase 1)
1. The log macros are turned off for release builds, so the log functions have zero impact on release builds.
2. The log macros have level, mask, and condition controls, so we have more control to avoid log flooding.
I also adjusted some existing logs to use the new log functions:
1. To exercise and test the new log functions.
2. To improve performance slightly.
3. The change is mainly for HIP-ROCM; we can move more in the next phases for PAL or ORCA.
4. I made these log features unavailable for release builds. We can revert to the old log functions for release builds on a case-by-case basis.
Tests:
1. http://ocltc.amd.com:8111/viewModification.html?modId=128289&personal=true&tab=vcsModificationBuilds
   http://ocltc.amd.com:8111/viewModification.html?modId=128358&personal=true&tab=vcsModificationBuilds
2. release build, run hip program, there is no log
3. fastdebug build, run hip program,
export LOG_LEVEL=3
export GPU_LOG_MASK=4294967295
There were a lot of logs.
4. fastdebug build, run hip program,
export LOG_LEVEL=2
export GPU_LOG_MASK=4294967295
There were no logs.
5. fastdebug build, run hip program,
export LOG_LEVEL=3
export GPU_LOG_MASK=4294967294
There were far fewer logs.
6. fastdebug build, run hip program,
export LOG_LEVEL=3
export GPU_LOG_MASK=47102
There were fewer logs still. The logs were as expected according to the mask.
7. Tested steps 2 to 6 similarly on Windows and Linux
ReviewBoard: http://ocltc.amd.com/reviews/r/18215
Affected files ...
... //depot/stg/opencl/drivers/opencl/api/hip/hip_internal.hpp#46 edit
... //depot/stg/opencl/drivers/opencl/api/hip/hip_memory.cpp#82 edit
... //depot/stg/opencl/drivers/opencl/api/hip/hip_stream.cpp#26 edit
... //depot/stg/opencl/drivers/opencl/api/hip/hiprtc_internal.hpp#2 edit
... //depot/stg/opencl/drivers/opencl/api/opencl/amdocl/cl_svm.cpp#29 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/comgrctx.cpp#6 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/devkernel.cpp#29 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/devprogram.cpp#68 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/rocm/rocdevice.cpp#137 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/rocm/rocvirtual.cpp#91 edit
... //depot/stg/opencl/drivers/opencl/runtime/platform/command.cpp#100 edit
... //depot/stg/opencl/drivers/opencl/runtime/platform/commandqueue.cpp#32 edit
... //depot/stg/opencl/drivers/opencl/runtime/platform/runtime.cpp#40 edit
... //depot/stg/opencl/drivers/opencl/runtime/utils/debug.hpp#10 edit
... //depot/stg/opencl/drivers/opencl/runtime/utils/flags.hpp#323 edit
[ROCm/hip commit: 007687bf53]
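Note on the level and mask controls above: the test steps suggest a message is printed only when its level is within LOG_LEVEL and its category bit is set in GPU_LOG_MASK. The sketch below encodes that reading as an assumption; `LogConfig`, `shouldLog` and the category bits are hypothetical and are not the macros added by this change.

```cpp
#include <cassert>
#include <cstdint>

// Assumed configuration, e.g. parsed from LOG_LEVEL and GPU_LOG_MASK.
struct LogConfig {
  int level;
  std::uint32_t mask;
};

// Assumed rule: emit when the message level does not exceed the configured
// level and the message's category bit is present in the mask.
constexpr bool shouldLog(const LogConfig& cfg, int msgLevel, std::uint32_t category) {
  return msgLevel <= cfg.level && (cfg.mask & category) != 0;
}
```

Under this reading, LOG_LEVEL=2 silences level-3 messages regardless of the mask, and clearing bit 0 (mask 4294967294) drops only the messages in that one category.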
SWDEV-145570 - [HIP] Fix occupancy API prototype.
- They need to be C API, i.e. extern "C".
- Follow the current API and use `uint32_t` instead of `int`.
+ TODO: We need to revert them once those APIs are changed to be
compatible with CUDA.
Affected files ...
... //depot/stg/opencl/drivers/opencl/api/hip/hip_platform.cpp#46 edit
[ROCm/hip commit: a0f8995e3a]
SWDEV-197289 - VDI tracing API integration in rocTracer
- Change the names of the functions according to the new interface
Affected files ...
... //depot/stg/opencl/drivers/opencl/api/hip/hip_activity.cpp#2 edit
... //depot/stg/opencl/drivers/opencl/api/hip/hip_hcc.def.in#34 edit
... //depot/stg/opencl/drivers/opencl/api/hip/hip_hcc.map.in#32 edit
[ROCm/hip commit: 63a26884aa]
SWDEV-207366 - [HIP] 'hipErrorInvalidValue' (1011) with hipMemcpy3D
We need to divide WidthInBytes by sizeByte, not multiply, to get the width in pixels.
Affected files ...
... //depot/stg/opencl/drivers/opencl/api/hip/hip_memory.cpp#79 edit
[ROCm/hip commit: 4ec9d181e0]
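Note on SWDEV-207366 above: the arithmetic in question, sketched with hypothetical names. `elementSize` stands for the size of one pixel in bytes (the "sizeByte" of the message); the function is illustrative and not the hipMemcpy3D code.

```cpp
#include <cassert>
#include <cstddef>

// Converts a row width given in bytes to a width in elements (pixels).
inline std::size_t widthInPixels(std::size_t widthInBytes, std::size_t elementSize) {
  // The byte width is expected to be a whole number of elements.
  assert(elementSize != 0 && widthInBytes % elementSize == 0);
  return widthInBytes / elementSize;
}
```

For example, a 1024-byte row of 4-byte pixels is 256 pixels wide; multiplying instead would give 4096, which overshoots the real extent.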
SWDEV-207100 - [HIP CQE][HIPonPAL][WIN][QR] 5 hiptests failed in 19H1 Windows on all ASICs
1. Reshuffle locations of the hipMemset functions to make them all next to each other.
2. Update the declarations of hipMemsetD8, hipMemsetD8Async, hipMemsetD16, hipMemsetD16Async. These functions are type aware and take in as their third argument the number of elements in the buffer, not the buffer size. Change the name of this argument from sizeBytes to count to align with the above description. Changes for the header are tracked here https://github.com/ROCm-Developer-Tools/HIP/pull/1544
3. Add the actual implementation of hipMemsetD8, hipMemsetD8Async, hipMemsetD16, hipMemsetD16Async.
4. Remove ihipMemset2D() as it is essentially a copy of ihipMemset(). Change hipMemset2D()/hipMemset2DAsync() to use ihipMemset().
5. Implement hipMemset3DAsync().
6. Update the test script to pick up the updated command line options for hipMemset and hipMemset3D.
Affected files ...
... //depot/stg/opencl/drivers/opencl/api/hip/hip_hcc.def.in#32 edit
... //depot/stg/opencl/drivers/opencl/api/hip/hip_hcc.map.in#30 edit
... //depot/stg/opencl/drivers/opencl/api/hip/hip_memory.cpp#78 edit
... //depot/stg/opencl/drivers/opencl/make/hip.git/tests/scripts/hip_runtimeapi_tests.txt#13 edit
[ROCm/hip commit: 238a71c4ca]
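Note on item 2 of SWDEV-207100 above: a host-side illustration of why the third argument is an element count rather than a byte size. `memsetD16` here is a hypothetical plain-C++ loop and not the HIP implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Fills `count` 16-bit elements (not bytes) starting at `dst`.
inline void memsetD16(std::uint16_t* dst, std::uint16_t value, std::size_t count) {
  assert(dst != nullptr || count == 0);
  for (std::size_t i = 0; i < count; ++i) {
    dst[i] = value;
  }
}
```

Passing a byte size where a count is expected would write twice as many 16-bit elements as the buffer holds, which is the kind of mismatch the rename from sizeBytes to count is meant to prevent.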
SWDEV-207449 - [HIP CQE][HIPonPAL][LNX][QR] 6 hiptests failed on all ASICs
hipTestHalf fails to build on Windows due to linker error "unresolved external symbol __gnu_h2f_ieee"
1. Expose __gnu_h2f_ieee() and __gnu_f2h_ieee() for Windows builds.
ReviewBoardURL = http://ocltc.amd.com/reviews/r/18127/diff/
Affected files ...
... //depot/stg/opencl/drivers/opencl/api/hip/hip_hcc.def.in#31 edit
... //depot/stg/opencl/drivers/opencl/api/hip/hip_platform.cpp#45 edit
[ROCm/hip commit: d43f0eedcc]
SWDEV-205925 - Update HIP texture APIs for issue in hipTexRefSetAddress in HIP/PAL on Windows
- Remove the nullptr possibility
http://ocltc.amd.com/reviews/r/18121/
Affected files ...
... //depot/stg/opencl/drivers/opencl/api/hip/hip_texture.cpp#23 edit
[ROCm/hip commit: ef14b8b361]
SWDEV-184710 - Support hipLaunchCooperativeKernelMultiDevice()
- Add support for multi grid launch in hip
- Detect the new hidden argument and pass the required information for the kernel launch
- Memory for synchronization is allocated as a single object and then the offset for each GPU is found
Affected files ...
... //depot/stg/opencl/drivers/opencl/api/hip/hip_module.cpp#44 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/device.hpp#343 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/devkernel.cpp#25 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/devkernel.hpp#17 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/palkernel.cpp#82 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/rocm/rocdevice.cpp#136 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/rocm/rocdevice.hpp#42 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/rocm/rocvirtual.cpp#90 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/rocm/rocvirtual.hpp#30 edit
... //depot/stg/opencl/drivers/opencl/runtime/platform/command.cpp#99 edit
... //depot/stg/opencl/drivers/opencl/runtime/platform/command.hpp#97 edit
[ROCm/hip commit: 70a52b9cd7]
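Note on the synchronization memory described in SWDEV-184710 above: a sketch of carving per-GPU regions out of a single allocation by offset. The equal-sized, back-to-back layout and the names `SyncBuffer`, `offsetFor` and `slotFor` are assumptions for illustration; the actual layout used by the runtime is not stated in the message.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Assumed layout: one buffer holding `slotSize` bytes per GPU, back to back.
struct SyncBuffer {
  std::vector<unsigned char> storage;
  std::size_t slotSize;

  SyncBuffer(std::size_t numDevices, std::size_t bytesPerDevice)
      : storage(numDevices * bytesPerDevice, 0), slotSize(bytesPerDevice) {}

  // Byte offset of the region that belongs to `deviceIndex`.
  std::size_t offsetFor(std::size_t deviceIndex) const {
    assert((deviceIndex + 1) * slotSize <= storage.size());
    return deviceIndex * slotSize;
  }

  // Pointer to the start of that region.
  unsigned char* slotFor(std::size_t deviceIndex) {
    return storage.data() + offsetFor(deviceIndex);
  }
};
```

A single allocation keeps the lifetime of all per-GPU regions tied together, and each launch only needs its own offset.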
SWDEV-189650 - [HIP-CLANG][HIP/VDI/PAL] Hangs on test hip_threadfence_system
1. In HIP + VDI + ROCm, allow SVM atomics on VEGA10 and later ASICs. GFX8 (Tonga) was already enabled.
2. In the HIP + VDI + PAL Linux driver, allow SVM atomics on VEGA10 and later ASICs.
Tests:
1. In HIP + VDI + ROCm, hip_threadfence_system test passed.
2. In HIP + VDI + PAL + Linux , hip_threadfence_system test passed.
3. OpenCL + PAL, clinfo and ocltest runtime test pass.
4. OpenCL + ROCM, clinfo and ocltest runtime test pass.
5. Windows 10, VEGA 10, clinfo and ocltest runtime test pass. hip_threadfence_system test passed by skipping the test.
Teamcity presubmission test:
http://ocltc.amd.com:8111/viewModification.html?modId=127083&personal=true&tab=vcsModificationBuilds
http://ocltc.amd.com:8111/viewModification.html?modId=127076&personal=true&tab=vcsModificationBuilds
ReviewBoard: http://ocltc.amd.com/reviews/r/18077/
Affected files ...
... //depot/stg/opencl/drivers/opencl/api/hip/hip_memory.cpp#73 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/paldevice.cpp#171 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/palresource.cpp#80 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/palresource.hpp#31 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/rocm/rocdevice.cpp#134 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/rocm/rocmemory.cpp#44 edit
... //depot/stg/opencl/drivers/opencl/runtime/utils/flags.hpp#320 edit
[ROCm/hip commit: 5db4c83423]