Belton-Schure, Aidan
3cd3b3ffc5
SWDEV-527851 SWDEV-527890 SWDEV-529456 - Make HIP vector simple struct ( #356 )
...
* Make HIP vector simple struct
Change-Id: I8442c2cc9af26b2a3c7d6719e3348df1593e83b3
* Update make_vector_type
Change-Id: Ic5060994a08baa4262c2a4b09fcbe6bc74276720
2025-05-21 21:30:55 +05:30
Dittakavi, Satyanvesh
e9dbd7c99d
SWDEV-418904 - Remove hiprtc symbols from hip library ( #370 )
2025-05-21 21:22:47 +05:30
Assiouras, Ioannis
0b64eec921
SWDEV-508962 - [6.4 Preview] Update hipPointerGetAttributes to match CUDA > 11.0 behavior ( #120 ) ( #363 )
2025-05-21 21:18:05 +05:30
Assiouras, Ioannis
968f5599a8
SWDEV-508965 - [6.4 Preview] Remove HIP_MEMSET_NODE_PARAMS struct ( #121 ) ( #364 )
2025-05-21 21:16:42 +05:30
Andryeyev, German
58df22546b
SWDEV-532868 - Disable implicit wait in hipFree for async allocations ( #323 )
...
* SWDEV-532868 - Disable implicit wait in hipFree for async allocations
* Fix compilation error
2025-05-21 21:14:38 +05:30
Brzak, Branislav
7a357800dd
SWDEV-508979 - Match hipModuleLoad negative return with CUDA ( #326 )
2025-05-21 21:09:07 +05:30
Brzak, Branislav
a46ed60e2e
SWDEV-508970 - Match hipBindTextureToArray negative return with CUDA ( #325 )
2025-05-21 20:57:09 +05:30
Sang, Tao
78f3caba0b
SWDEV-519346 - Fix __clock64() compiling issue in SPIRV ( #207 )
2025-05-20 16:57:17 -04:00
Jayaprakash, Karthik
12131de4a9
SWDEV-529929 - hipMemGetHandleForAddressRange implementation. ( #245 )
2025-05-20 15:56:04 -04:00
Jayaprakash, Karthik
bed454caa1
SWDEV-457749 - Use size of handle for range instead of actual size for physmem. ( #342 )
2025-05-20 15:24:49 -04:00
Kudchadker, Saleel
1b0ea080e4
SWDEV-523279 - Use preferred engine mask for SDMA ( #317 )
...
- ROCr now reports preferred engine for copy status. We can leverage
this for max bandwidth for inter-GPU copies
- Cleanup logging
2025-05-19 16:04:51 -07:00
Kudchadker, Saleel
5712944c7c
SWDEV-531518 - Fix offset accumulation ( #333 )
...
srcAddress/dstAddress were advanced cumulatively, which shouldn't be
done when the offset is also incremented.
2025-05-19 18:03:06 +05:30
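The offset accumulation fix above (SWDEV-531518) can be illustrated with a minimal host-side sketch. The Chunk struct and both split functions are hypothetical names used only for illustration; they are not taken from the runtime, and the real copy path is assumed to be more involved.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical chunk descriptor, for illustration only.
struct Chunk {
  std::size_t src;
  std::size_t dst;
  std::size_t size;
};

// Buggy pattern: the base addresses are advanced by the running offset on
// every iteration, so the offset is effectively applied more than once.
std::vector<Chunk> splitBuggy(std::size_t src, std::size_t dst,
                              std::size_t total, std::size_t chunkSize) {
  assert(chunkSize > 0);
  std::vector<Chunk> chunks;
  for (std::size_t offset = 0; offset < total;) {
    const std::size_t n = std::min(chunkSize, total - offset);
    src += offset;  // cumulative update of the base address
    dst += offset;
    chunks.push_back({src, dst, n});
    offset += n;
  }
  return chunks;
}

// Fixed pattern: the base addresses stay constant and only the offset moves.
std::vector<Chunk> splitFixed(std::size_t src, std::size_t dst,
                              std::size_t total, std::size_t chunkSize) {
  assert(chunkSize > 0);
  std::vector<Chunk> chunks;
  for (std::size_t offset = 0; offset < total;) {
    const std::size_t n = std::min(chunkSize, total - offset);
    chunks.push_back({src + offset, dst + offset, n});
    offset += n;
  }
  return chunks;
}
```

With three chunks of 100 bytes starting at source address 1000, the fixed version yields 1000, 1100, 1200, while the buggy version drifts to 1300 on the third chunk.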
Jayaprakash, Karthik
bb7750a946
Revert "SWDEV-522707 - Set phys_mem_handle type to sizeof(size_t) to avoid blocking address range. ( #105 )" ( #348 )
...
This reverts commit 6811fd90b8 .
2025-05-19 15:19:16 +05:30
Jayaprakash, Karthik
f5b8db33f1
SWDEV-531711 - Report correct error code based on device failure. ( #286 )
2025-05-17 06:33:13 -04:00
Brzak, Branislav
7698d799ce
SWDEV-508742 - Make clCreatePipe spec compliant ( #80 )
2025-05-16 15:18:35 +05:30
Belton-Schure, Aidan
c50610b44d
Add __syncwarp operation ( #160 )
...
Change-Id: I6a3783beafdbb9f11a3b37333f4ff3f5be27ea54
2025-05-15 14:20:13 +05:30
Patel, Jaydeepkumar
32eb6a5d89
SWDEV-530803 - Use current device id while cloning graph node. ( #313 )
2025-05-15 09:06:15 +05:30
Andryeyev, German
bddb8f14d1
SWDEV-345024 - Retain the program on Fini kernel execution ( #307 )
...
The Fini kernel is executed during the invocation of the amd::Program destructor,
but the dispatch logic can retain/release the reference counter and
cause a double free. Avoid the double free with an extra retain() call.
2025-05-14 21:21:26 +05:30
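The double free described above (SWDEV-345024) can be modelled with a simplified reference-counting sketch. ProgramModel and its members are hypothetical; the real amd::Program and dispatch logic are assumed to differ. Teardown is counted rather than actually freeing memory, so the effect can be observed safely.

```cpp
#include <cassert>

// Simplified, hypothetical model of a reference-counted program object.
struct ProgramModel {
  int refs = 1;            // the owner holds one reference
  int teardownCount = 0;   // how many times "destruction" ran
  bool extraRetain;        // whether the fix is applied

  explicit ProgramModel(bool applyFix) : extraRetain(applyFix) {}

  void retain() { ++refs; }

  void release() {
    if (--refs == 0) teardown();
  }

  // The dispatch path takes a reference for the duration of the launch
  // and drops it afterwards.
  void dispatchFiniKernel() {
    retain();
    release();
  }

  void teardown() {
    ++teardownCount;
    if (teardownCount > 1) return;  // second entry models the double free
    if (extraRetain) retain();      // keeps the count above zero during Fini
    dispatchFiniKernel();
  }
};

// Returns how many times teardown ran after the owner drops its reference.
int teardownsAfterRelease(bool applyFix) {
  ProgramModel program(applyFix);
  program.release();
  return program.teardownCount;
}
```

Without the extra retain, the release inside the dispatch path brings the counter back to zero and re-enters teardown; with it, teardown runs exactly once.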
Xie, Pengda
0457b634f8
SWDEV-527781 - Remove Stream Validation in HIP APIs
2025-05-13 13:45:27 -07:00
Assiouras, Ioannis
f7482ef0a6
SWDEV-529449 - Bug fix when retrieving a memobj from the IPC mem handle
2025-05-13 19:18:22 +01:00
Hernandez, Gerardo
5606debd8e
SWDEV-491314 - Re-enable cross-lane sync builtins ( #94 )
...
* Enables warp sync builtins by default
* Removes HIP_ENABLE_WARP_SYNC_BUILTINS; that macro will no longer have an effect. Instead, we will now be able to disable the builtins with the macro: HIP_DISABLE_WARP_SYNC_BUILTINS
2025-05-13 16:35:58 +01:00
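The switch from opt-in to opt-out described above (SWDEV-491314) might look roughly like the following preprocessor sketch. Only the macro names HIP_ENABLE_WARP_SYNC_BUILTINS and HIP_DISABLE_WARP_SYNC_BUILTINS come from the commit message; SKETCH_WARP_SYNC_AVAILABLE and warpSyncBuiltinsEnabled are hypothetical, and the actual header guards are assumed, not confirmed.

```cpp
#include <cassert>

// Opt-out model: the builtins are available unless the user defines
// HIP_DISABLE_WARP_SYNC_BUILTINS before including the header.
// HIP_ENABLE_WARP_SYNC_BUILTINS is intentionally not consulted, which
// mirrors the statement that it no longer has an effect.
#if !defined(HIP_DISABLE_WARP_SYNC_BUILTINS)
#define SKETCH_WARP_SYNC_AVAILABLE 1  // hypothetical internal flag
#else
#define SKETCH_WARP_SYNC_AVAILABLE 0
#endif

// Hypothetical helper that reports the compile-time configuration.
constexpr bool warpSyncBuiltinsEnabled() {
  return SKETCH_WARP_SYNC_AVAILABLE == 1;
}
```

Under this assumption, a build that needs the old default would pass -DHIP_DISABLE_WARP_SYNC_BUILTINS on the compiler command line.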
Hila, Nino
29df3ae6e9
Update palamida.yml ( #266 )
...
* Add palamida.yml - removing url
2025-05-12 21:39:21 -07:00
Jayaprakash, Karthik
876de49b11
SWDEV-506467 - Fixing compilation issue seen on clang compilation for ASAN. ( #253 )
2025-05-12 17:16:56 -04:00
Brzak, Branislav
f9199ac205
SWDEV-528683 - Hardcode valid wavefront compile time options ( #306 )
2025-05-12 19:29:39 +02:00
Andryeyev, German
da198ac5b2
SWDEV-531678 - Remove split path from the dispatch ( #283 )
...
The split path for blit kernels is no longer necessary, since the new blit kernels
don't use the copy size as the global workload
2025-05-12 12:50:32 -04:00
Jayaprakash, Karthik
acb1f7e8d5
SWDEV-526855 - Modify the SIMDPerCU calculation for gfx1250/1. ( #275 )
2025-05-12 11:09:03 -04:00
Arandjelovic, Marko
c5ced8c3a2
SWDEV-512344 - Unmap all subbuffers ( #214 )
2025-05-12 16:56:10 +02:00
Arandjelovic, Marko
a7492c516d
SWDEV-511204 - Mapped virtual memory should use device instead of host context ( #213 )
...
Since the sub-buffer (virtual memory that is mapped to device memory) is associated with device memory, it should use the device context instead of the host context. The original implementation caused hipMemcpyPeer to not take the P2P path, as the memory object was treated as host memory.
2025-05-12 16:55:25 +02:00
Patel, Jaydeepkumar
6858b0fca1
SWDEV-521135 - Make common way to set/parse UUID bytes from PAL props. ( #63 )
2025-05-12 17:00:30 +05:30
Six, Lancelot
c35e9643ec
SWDEV-517078: Fix gfx11 trap handler ( #212 )
...
Fix incorrect edits done when porting the 2nd level trap handler from
the hsa-runtime.
Change-Id: I7bc5160be47b8f669efe05c4d194bc3c47fc0661
2025-05-11 01:12:28 +01:00
Xie, AlexBin
faac50c77a
SWDEV-528860 - reserve some memory in visible frame buffer ( #251 )
2025-05-09 20:08:23 -04:00
Huang, AnZhong
b434fbe2bd
SWDEV-527299 - Support HIP_POINTER_ATTRIBUTE_CONTEXT ( #180 )
...
* SWDEV-527299 - Support HIP_POINTER_ATTRIBUTE_CONTEXT
As HIP enables UVA by default, it seems we can simply expose the context to support this feature.
2025-05-09 17:34:16 +08:00
Chaudhary, Jatin Jaikishan
2f73e1385b
SWDEV-525933 - add constexpr operators for fp16/bf16 ( #199 )
2025-05-09 09:53:58 +01:00
Xie, Jiabao(Jimbo)
a320a3f214
SWDEV-528913 - support gfx950 in rocsetting ( #217 )
...
* SWDEV-528913 - support gfx950 in rocsetting
---------
Co-authored-by: Jimbo Xie <jiabaxie@amd.com>
2025-05-07 15:44:49 -04:00
Lambert, Jacob
6b12154583
SWDEV-518221 - Don't link against libamd_comgr.so at runtime
...
Convention is to always link against .so.* at runtime.
Having it link against .so will break on systems that package
the .so files in their dev/devel package.
This issue was found when building ROCm 6.4 for Fedora.
Committing on behalf of GitHub user Mystro256
2025-05-07 11:56:41 -07:00
Zhang, Victor
f960433dcd
SWDEV-528142 - add error check for KernelParameters::capture ( #276 )
...
* SWDEV-528142 - add error check for KernelParameters::capture
* Update kernel.cpp
---------
Co-authored-by: victzhan <victzhan@amd.com>
2025-05-07 09:52:09 -04:00
Jayaprakash, Karthik
fa55557f46
SWDEV-493805 - Cleaning up launch parameters arguments. ( #241 )
2025-05-06 15:06:13 -04:00
Dittakavi, Satyanvesh
607f8f26fd
SWDEV-529831 - Return error if the program is empty ( #257 )
2025-05-06 15:12:12 +05:30
Chaudhary, Jatin Jaikishan
a71c6eb1a0
SWDEV-529854 - __hmax/__hmin should handle NaNs ( #246 )
2025-05-06 09:42:15 +01:00
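The commit above (SWDEV-529854) does not spell out which NaN semantics __hmax/__hmin adopt. A minimal host-side sketch, assuming fmax-style behavior (a NaN operand is ignored in favor of the numeric one) and using float in place of the half type, shows why a plain comparison is not enough. Both function names are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Naive maximum: every comparison with NaN is false, so the result
// depends on operand order when one input is NaN.
float maxNaive(float a, float b) {
  return a > b ? a : b;
}

// NaN-aware maximum: std::fmax returns the non-NaN operand when exactly
// one input is NaN. This is an assumed model of the intended behavior,
// not the actual device implementation.
float maxNanAware(float a, float b) {
  return std::fmax(a, b);
}
```

Here maxNaive(1.0f, NAN) yields NaN while maxNaive(NAN, 1.0f) yields 1.0f; maxNanAware returns 1.0f in both orders.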
Chaudhary, Jatin Jaikishan
b1ebf33850
SWDEV-529927 - add missing operations for fp16/bf16 ( #238 )
2025-05-06 09:41:21 +01:00
Andryeyev, German
65a0181a7c
SWDEV-528808 - Release all HW queues even if only one is idle ( #240 )
...
PyTorch may not explicitly idle each queue. Thus, some queues can be considered busy
even though they are actually idle
2025-05-05 19:09:01 -04:00
Guan, Zichuan
3775298655
Disable HIP_PLATFORM auto-detect if already defined ( #254 )
...
Co-authored-by: Stella Laurenzo <stellaraccident@gmail.com>
2025-05-05 15:37:53 -04:00
Arsenault, Matthew
1db9a7d48b
SWDEV-1 - Stop using ocml rounding functions ( #228 )
...
Directly use the builtins. Use the elementwise versions since there's
no implied errno, regardless of -f[no]-math-errno.
I didn't change the cases unnecessarily casting. The bfloat and vector
cases should work directly.
2025-05-05 19:35:12 +02:00
Andryeyev, German
9b018165ce
SWDEV-528808 - Disable dynamic queue by default ( #256 )
...
Dynamic queue management will be disabled by default and
the original sort logic is restored
2025-05-05 10:56:35 -04:00
Searles, Mark
cd9bc61559
Fix typos in warning msgs ( #231 )
2025-05-02 14:31:42 -07:00
Chaudhary, Jatin Jaikishan
12febe6782
SWDEV-514560 - add fp6 header implementation ( #54 )
...
Co-authored-by: rahul manocha <rmanocha_amdeng>
2025-05-01 15:17:38 +01:00
Assiouras, Ioannis
9d6a0d1a4d
SWDEV-521011 - Fix alignment in PalResource::CreateSvm
2025-05-01 02:22:49 +01:00
Andryeyev, German
84a4f293f4
SWDEV-526836 - add PipelineStageBlt flag ( #229 )
...
CP sync requires PipelineStageBlt flag.
2025-04-30 14:27:41 -04:00
Assiouras, Ioannis
d3fb8eda8b
SWDEV-525593, SWDEV-527293 - Acquire active queue after xferQueue is created ( #165 )
...
For xferQueue VirtualGPU::create is called after ProfilingBegin
so the active queue needs to be acquired.
2025-04-30 09:21:11 +01:00
Godavarthy Surya, Anusha
2538d7f02b
SWDEV-522841 - Graph nodes must be created/launched on device where they are captured/created ( #108 )
2025-04-29 22:20:39 +05:30