Commit graph

13557 commits

Author SHA1 Message Date
Belton-Schure, Aidan 3cd3b3ffc5 SWDEV-527851 SWDEV-527890 SWDEV-529456 - Make HIP vector simple struct (#356)
* Make HIP vector simple struct

Change-Id: I8442c2cc9af26b2a3c7d6719e3348df1593e83b3

* Update make_vector_type

Change-Id: Ic5060994a08baa4262c2a4b09fcbe6bc74276720
2025-05-21 21:30:55 +05:30
Dittakavi, Satyanvesh e9dbd7c99d SWDEV-418904 - Remove hiprtc symbols from hip library (#370) 2025-05-21 21:22:47 +05:30
Assiouras, Ioannis 0b64eec921 SWDEV-508962 - [6.4 Preview] Update hipPointerGetAttributes to match CUDA > 11.0 behavior (#120) (#363) 2025-05-21 21:18:05 +05:30
Assiouras, Ioannis 968f5599a8 SWDEV-508965 - [6.4 Preview] Remove HIP_MEMSET_NODE_PARAMS struct (#121) (#364) 2025-05-21 21:16:42 +05:30
Andryeyev, German 58df22546b SWDEV-532868 - Disable implicit wait in hipFree for async allocations (#323)
* SWDEV-532868 - Disable implicit wait in hipFree for async allocations

* Fix compilation error
2025-05-21 21:14:38 +05:30
Brzak, Branislav 7a357800dd SWDEV-508979 - Match hipModuleLoad negative return with Cuda (#326) 2025-05-21 21:09:07 +05:30
Brzak, Branislav a46ed60e2e SWDEV-508970 - Match hipBindTextureToArray negative return with Cuda (#325) 2025-05-21 20:57:09 +05:30
Sang, Tao 78f3caba0b SWDEV-519346 - Fix __clock64() compiling issue in SPIRV (#207) 2025-05-20 16:57:17 -04:00
Jayaprakash, Karthik 12131de4a9 SWDEV-529929 - hipMemGetHandleForAddressRange implementation. (#245) 2025-05-20 15:56:04 -04:00
Jayaprakash, Karthik bed454caa1 SWDEV-457749 - Use size of handle for range instead of actual size for physmem. (#342) 2025-05-20 15:24:49 -04:00
Kudchadker, Saleel 1b0ea080e4 SWDEV-523279 - Use preferred engine mask for SDMA (#317)
- ROCr now reports preferred engine for copy status. We can leverage
this for max bandwidth for inter-GPU copies
- Cleanup logging
2025-05-19 16:04:51 -07:00
Kudchadker, Saleel 5712944c7c SWDEV-531518 - Fix offset accumulation (#333)
srcAddress/dstAddress accumulation was cumulative, which shouldn't be
done if we increment offset.
2025-05-19 18:03:06 +05:30
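The offset-accumulation problem described in 5712944c7c can be illustrated with a minimal standalone sketch. This is not the runtime's code: `chunkStarts`, its parameters, and the `buggy` switch are hypothetical names introduced only to show why adding a growing offset to an already advanced address double-counts.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the HIP/ROCclr sources): returns the start
// address of each chunk when a copy of `total` bytes is split into pieces of
// `chunk` bytes beginning at `base`.
std::vector<std::size_t> chunkStarts(std::size_t base, std::size_t total,
                                     std::size_t chunk, bool buggy) {
  assert(chunk > 0);
  std::vector<std::size_t> starts;
  std::size_t addr = base;
  for (std::size_t offset = 0; offset < total; offset += chunk) {
    if (buggy) {
      // Cumulative form: the offset already grows every iteration, so adding
      // it to an address that was advanced earlier counts old chunks twice.
      addr += offset;
      starts.push_back(addr);
    } else {
      // Corrected form: derive each address from the fixed base.
      starts.push_back(base + offset);
    }
  }
  return starts;
}
```

With a base of 1000 and four 100-byte chunks, the corrected form yields evenly spaced addresses, while the cumulative form drifts further from the expected address with each iteration.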
Jayaprakash, Karthik bb7750a946 Revert "SWDEV-522707 - Set phys_mem_handle type to sizeof(size_t) to avoid blocking address range. (#105)" (#348)
This reverts commit 6811fd90b8.
2025-05-19 15:19:16 +05:30
Jayaprakash, Karthik f5b8db33f1 SWDEV-531711 - Report correct error code based on device failure. (#286) 2025-05-17 06:33:13 -04:00
Brzak, Branislav 7698d799ce SWDEV-508742 - Make clCreatePipe spec compliant (#80) 2025-05-16 15:18:35 +05:30
Belton-Schure, Aidan c50610b44d Add __syncwarp operation (#160)
Change-Id: I6a3783beafdbb9f11a3b37333f4ff3f5be27ea54
2025-05-15 14:20:13 +05:30
Patel, Jaydeepkumar 32eb6a5d89 SWDEV-530803 - Use current device id while cloning graph node. (#313) 2025-05-15 09:06:15 +05:30
Andryeyev, German bddb8f14d1 SWDEV-345024 - Retain the program on Fini kernel execution (#307)
Fini kernel is executed during the invocation of amd::Program destructor,
but the dispatch logic can retain/release the reference counter and
cause double free. Avoid double free with an extra retain() call
2025-05-14 21:21:26 +05:30
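The double-free scenario in bddb8f14d1 can be modeled with a small reference-counting sketch. It is an assumption-laden illustration, not the `amd::Program` implementation: `RefCounted`, `dispatchFini`, `teardown`, and the `guard` flag are hypothetical, and the sketch counts teardown runs instead of actually freeing memory.

```cpp
#include <cassert>

// Hypothetical model of an intrusively reference-counted object whose
// teardown dispatches a "fini" step that itself retains and releases it.
struct RefCounted {
  int refs = 1;       // one owner at construction
  int teardowns = 0;  // how many times teardown ran (more than 1 models a double free)
  bool guard;         // whether teardown takes the extra retain()

  explicit RefCounted(bool g) : guard(g) {}

  void retain() { ++refs; }

  void release() {
    assert(refs > 0);
    if (--refs == 0) teardown();
  }

  // Stand-in for dispatch logic that briefly holds a reference.
  void dispatchFini() {
    retain();
    release();
  }

  void teardown() {
    ++teardowns;
    if (teardowns > 1) return;  // stop the recursion in this sketch
    if (guard) retain();        // extra reference keeps the count above zero
    dispatchFini();
  }
};
```

Without the guard, the release inside `dispatchFini` drops the count back to zero and re-enters `teardown`. With the extra `retain()`, the count never reaches zero a second time, so teardown runs once.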
Xie, Pengda 0457b634f8 SWDEV-527781 - Remove Stream Validation in HIP APIs 2025-05-13 13:45:27 -07:00
Assiouras, Ioannis f7482ef0a6 SWDEV-529449 - Bug fix when retrieving a memobj from the IPC mem handle 2025-05-13 19:18:22 +01:00
Hernandez, Gerardo 5606debd8e SWDEV-491314 - Re-enable cross-lane sync builtins (#94)
* Enables warp sync builtins by default

* Removes HIP_ENABLE_WARP_SYNC_BUILTINS; that macro will no longer have an effect. Instead, we will now be able to disable the builtins with the macro: HIP_DISABLE_WARP_SYNC_BUILTINS
2025-05-13 16:35:58 +01:00
Hila, Nino 29df3ae6e9 Update palamida.yml (#266)
* Add palamida.yml - removing url
2025-05-12 21:39:21 -07:00
Jayaprakash, Karthik 876de49b11 SWDEV-506467 - Fixing compilation issue seen on clang compilation for ASAN. (#253) 2025-05-12 17:16:56 -04:00
Brzak, Branislav f9199ac205 SWDEV-528683 - Hardcode valid wavefront compile time options (#306) 2025-05-12 19:29:39 +02:00
Andryeyev, German da198ac5b2 SWDEV-531678 - Remove split path from the dispatch (#283)
The split path for blit kernels is no longer necessary, since the new blit kernels
don't use the copy size as the global workload
2025-05-12 12:50:32 -04:00
Jayaprakash, Karthik acb1f7e8d5 SWDEV-526855 - Modify the SIMDPerCU calculation for gfx1250/1. (#275) 2025-05-12 11:09:03 -04:00
Arandjelovic, Marko c5ced8c3a2 SWDEV-512344 - Unmap all subbuffers (#214) 2025-05-12 16:56:10 +02:00
Arandjelovic, Marko a7492c516d SWDEV-511204 - Mapped virtual memory should use device instead of host context (#213)
Since the sub-buffer(virtual memory that is mapped to device memory) is associated with device memory, it should utilize the device context instead of the host context. The original implementation caused hipMemcpyPeer to not take the P2P path, as the memory object was treated as host memory.
2025-05-12 16:55:25 +02:00
Patel, Jaydeepkumar 6858b0fca1 SWDEV-521135 - Make common way to set/parse UUID bytes from PAL props. (#63) 2025-05-12 17:00:30 +05:30
Six, Lancelot c35e9643ec SWDEV-517078: Fix gfx11 trap handler (#212)
Fix incorrect edits done when porting the 2nd level trap handler from
the hsa-runtime.

Change-Id: I7bc5160be47b8f669efe05c4d194bc3c47fc0661
2025-05-11 01:12:28 +01:00
Xie, AlexBin faac50c77a SWDEV-528860 - reserve some memory in visible frame buffer (#251) 2025-05-09 20:08:23 -04:00
Huang, AnZhong b434fbe2bd SWDEV-527299 - Support HIP_POINTER_ATTRIBUTE_CONTEXT (#180)
* SWDEV-527299 - Support HIP_POINTER_ATTRIBUTE_CONTEXT

As HIP enables UVA by default, it seems we can simply expose the context to support this feature.
2025-05-09 17:34:16 +08:00
Chaudhary, Jatin Jaikishan 2f73e1385b SWDEV-525933 - add constexpr operators for fp16/bf16 (#199) 2025-05-09 09:53:58 +01:00
Xie, Jiabao(Jimbo) a320a3f214 SWDEV-528913 - support gfx950 in rocsetting (#217)
* SWDEV-528913 - support gfx950 in rocsetting

---------

Co-authored-by: Jimbo Xie <jiabaxie@amd.com>
2025-05-07 15:44:49 -04:00
Lambert, Jacob 6b12154583 SWDEV-518221 - Don't link against libamd_comgr.so at runtime
Convention is to always link against .so.* at runtime.
Having it link against .so will break on systems that package
the .so files in their dev/devel package.

This issue was found when building ROCm 6.4 for Fedora.

Committing on behalf of GitHub user Mystro256
2025-05-07 11:56:41 -07:00
Zhang, Victor f960433dcd SWDEV-528142 - add error check for KernelParameters::capture (#276)
* SWDEV-528142 - add error check for KernelParameters::capture

* Update kernel.cpp

---------

Co-authored-by: victzhan <victzhan@amd.com>
2025-05-07 09:52:09 -04:00
Jayaprakash, Karthik fa55557f46 SWDEV-493805 - Cleaning up launch parameters arguments. (#241) 2025-05-06 15:06:13 -04:00
Dittakavi, Satyanvesh 607f8f26fd SWDEV-529831 - Return error if the program is empty (#257) 2025-05-06 15:12:12 +05:30
Chaudhary, Jatin Jaikishan a71c6eb1a0 SWDEV-529854 - __hmax/__hmin should handle nan's (#246) 2025-05-06 09:42:15 +01:00
Chaudhary, Jatin Jaikishan b1ebf33850 SWDEV-529927 - add missing operations for fp16/bf16 (#238) 2025-05-06 09:41:21 +01:00
Andryeyev, German 65a0181a7c SWDEV-528808 - Release all HW queues even if only one is idle (#240)
PyTorch may not explicitly idle each queue. Thus, some queues can be considered busy
while actually being idle
2025-05-05 19:09:01 -04:00
Guan, Zichuan 3775298655 Disable HIP_PLATFORM auto-detect if already defined (#254)
Co-authored-by: Stella Laurenzo <stellaraccident@gmail.com>
2025-05-05 15:37:53 -04:00
Arsenault, Matthew 1db9a7d48b SWDEV-1 - Stop using ocml rounding functions (#228)
Directly use the builtins. Use the elementwise versions since there's
no implied errno, regardless of -f[no]-math-errno.

I didn't change the cases unnecessarily casting. The bfloat and vector
cases should work directly.
2025-05-05 19:35:12 +02:00
Andryeyev, German 9b018165ce SWDEV-528808 - Disable dynamic queue by default (#256)
Dynamic queue management will be disabled by default and
the original sort logic is restored
2025-05-05 10:56:35 -04:00
Searles, Mark cd9bc61559 Fix typos in warning msgs (#231) 2025-05-02 14:31:42 -07:00
Chaudhary, Jatin Jaikishan 12febe6782 SWDEV-514560 - add fp6 header implementation (#54)
Co-authored-by: rahul manocha <rmanocha_amdeng>
2025-05-01 15:17:38 +01:00
Assiouras, Ioannis 9d6a0d1a4d SWDEV-521011 - Fix alignment in PalResource::CreateSvm 2025-05-01 02:22:49 +01:00
Andryeyev, German 84a4f293f4 SWDEV-526836 - add PipelineStageBlt flag (#229)
CP sync requires PipelineStageBlt flag.
2025-04-30 14:27:41 -04:00
Assiouras, Ioannis d3fb8eda8b SWDEV-525593, SWDEV-527293 - Acquire active queue after xferQueue is created (#165)
For xferQueue VirtualGPU::create is called after ProfilingBegin
so the active queue needs to be acquired.
2025-04-30 09:21:11 +01:00
Godavarthy Surya, Anusha 2538d7f02b SWDEV-522841 - Graph nodes must be created/launched on device where they are captured/created (#108) 2025-04-29 22:20:39 +05:30