Commit graph

12571 Commits

Author SHA1 Message Date
Anusha GodavarthySurya 4288640f69 SWDEV-469422 - Derive GraphExec from Graph and ChildGraphNode from GraphExec
Change-Id: I54d67a1665355579bc249d8ff4f9806e9ee14588


[ROCm/clr commit: 13e2e797c0]
2024-12-16 00:43:57 -05:00
Istvan Kiss 349dacc1d9 SWDEV-502543 - Update doxygen to surface functions and Coop Group API
Change-Id: Id4df63b8ae64a1113f85d89aa250ac9f7cc8b9bb


[ROCm/clr commit: 3c863dad91]
2024-12-14 14:11:37 -05:00
Todd tiantuo Li 7081be99ae SWDEV-496037 - add Strix and Strix Halo to ocltst runtime test
Change-Id: Ia21afddf5223ecd132a06f37bb430961fb7a9341


[ROCm/clr commit: 8ffb1430dd]
2024-12-13 20:19:48 -05:00
Sourabh Betigeri 750313dfed SWDEV-421020 - Adds hipGraphAddBatchMemOp, SetGetParams and execSetParams APIs
Change-Id: Ieccecfe6173cc68fd3c01f86c99f7cc09fe194a3


[ROCm/clr commit: f1c05e9026]
2024-12-13 06:23:39 +00:00
Saleel Kudchadker 5255fd1fa3 SWDEV-301667 - Clear dispatch indicator signal flag
Change-Id: I9028df0bb73289791d169e7f064a1d0f615236a5


[ROCm/clr commit: 93f1e8ff60]
2024-12-12 21:20:05 -05:00
Sourabh Betigeri 7261404002 SWDEV-440866 - [hip-roclr] Adds support to batch memory operations APIs
Change-Id: I5ac63a6626af8c2b4ac382c52dfe1aaf0b3716b8


[ROCm/clr commit: 03dbcd8ca7]
2024-12-12 19:29:24 -05:00
Tao Sang fb76b9620c SWDEV-496667 - Support gfx9-4-generic target
Support gfx9-4-generic target to cover mi3XX.
Support features sramecc and xnack in generic target.
Improve some code formats.
Add more log on compiler.

Change-Id: I6b3c6af55c60cffd43ce6f17b75998f751b75713


[ROCm/clr commit: 3ad8f1b811]
2024-12-12 14:43:39 -05:00
Saleel Kudchadker 52a3751389 SWDEV-301667 - Improve logging for hip_memory
Change-Id: Id624b2c91e6b701bc0ee561a0c193f2c66654890


[ROCm/clr commit: 537f2fffc9]
2024-12-12 14:42:00 -05:00
Michael Xie 945ae82918 SWDEV-499997 - Unify ManagedBuffer and KernelArg buffer implementation
Change-Id: I95421c87904dd62d7ee214539a57c7bda1097ff4


[ROCm/clr commit: cfcc743824]
2024-12-12 12:56:23 -05:00
Anusha GodavarthySurya acfcd1098f SWDEV-469422 - Refactor childgraph node
Remove static functions in graph

Change-Id: I4df94915f81f250acaea60398aea32ef0ed658e2


[ROCm/clr commit: 28cbf2bc4f]
2024-12-12 12:38:24 -05:00
Jaydeep Patel 704be8ab01 SWDEV-496544 - Sync with stream if it is different than srcMemory stream.
Change-Id: I9c0f94a9531555278a51202ec7203961e1344c2e


[ROCm/clr commit: c8afd7109d]
2024-12-12 06:17:39 -05:00
Saleel Kudchadker 954c43d798 SWDEV-503761 - Reintroduce save-temps path for OpenCL
Change-Id: I5e111047242aed7d982b7a25c11ab52293af639d


[ROCm/clr commit: 23c21d5181]
2024-12-11 13:01:29 -05:00
Maneesh Gupta e3bf8acd82 Revert "SWDEV-491314 - enable _sync() functions with 64-bit mask argument"
This reverts commit d05c0310d8.

Reason for revert: Introduces regression SWDEV-503319

Change-Id: I888c5b95d904146e4782e8c57d736878fcdde678


[ROCm/clr commit: 072f94c204]
2024-12-11 11:16:48 -05:00
German Andryeyev 185e95ae81 SWDEV-501757 - Clean-up signal creation
Use hsa_amd_signal_create() if settings.system_scope_signal_ is true.

Change-Id: I6d440155dfbcd5bf03658583a93827cb1c56537c


[ROCm/clr commit: 14f58fc74d]
2024-12-11 09:57:50 -05:00
Jatin Chaudhary 78ec2a66af SWDEV-503299 - Do not use operator to check for nan
Some libs use __HIP_NO_HALF_OPERATORS__ and __HIP_NO_HALF_CONVERSIONS__
which results in operators being hidden and can cause errors.

Change-Id: I83c194d7d727cba30b46d7c296f7d396549f5fca


[ROCm/clr commit: 98b33886cd]
2024-12-11 00:33:44 -05:00
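A minimal sketch of the idea behind the commit above: a NaN test on a half-precision value that inspects the raw bit pattern and so needs no comparison operators on the half type. The helper name `half_bits_is_nan` and the use of a plain `std::uint16_t` as the storage type are assumptions made for illustration; this is not the code from the commit.

```cpp
#include <cstdint>

// Hypothetical helper (not from the HIP headers): returns true when the raw
// IEEE 754 binary16 bit pattern encodes a NaN.
// Layout: 1 sign bit, 5 exponent bits, 10 mantissa bits.
// A NaN has all exponent bits set (0x7C00) and a non-zero mantissa, so after
// masking off the sign bit any value strictly above 0x7C00 is a NaN.
constexpr bool half_bits_is_nan(std::uint16_t bits) {
  return (bits & 0x7FFFu) > 0x7C00u;
}
```

Because the check works on integers only, it still compiles when __HIP_NO_HALF_OPERATORS__ or __HIP_NO_HALF_CONVERSIONS__ hides the half operators.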
Todd tiantuo Li 3fc8b99a79 SWDEV-489099 - Deprecate AMDGPU_TARGETS in lieu of supporting GPU_TARGETS
Change-Id: I184cd18e47b1618bcad0fadc8984de54d2a00a9b


[ROCm/clr commit: def369a010]
2024-12-10 12:38:37 -05:00
Gerardo Hernandez d96c631de0 SWDEV-496392 - Remove references to ASIC Polaris22 in pal::Settings and pal::Device (it is being retired)
Change-Id: I6318abf3e46ed250d087a3d2266d2ae3d4c8c000


[ROCm/clr commit: 03e4057fce]
2024-12-10 06:01:32 -05:00
Ranjith Ramakrishnan 87b59eade6 SWDEV-498728 - Add backward compatibility for deprecated package rocm-opencl-icd-loader
Change-Id: Ic659639e3bb55bd90bd50acf28d8079ff7b084bc


[ROCm/clr commit: 7c9c7a6332]
2024-12-09 16:27:43 -05:00
German Andryeyev 6604accdb3 SWDEV-501757 - Use signals without interrupts
In active wait mode use signals without interrupts by default and switch
to the interrupts only if a callback is required.

Change-Id: Ibcde8f7d44c70f8fb8fa5e0a7fdd8b08a2982a8e


[ROCm/clr commit: f4b9d3b7bd]
2024-12-09 15:16:15 -05:00
Ajay 7346f3bd29 SWDEV-498474 - APU: when total allocated is greater than 75% of invisible
Only when memory type is Local and the invisible memory is +ve
Should also fix SWDEV-490991

Change-Id: I78a4925a234ba90c63909bde5b7dc217568b4de3


[ROCm/clr commit: 7d763fb803]
2024-12-09 12:06:57 -05:00
Branislav Brzak 92b1136755 SWDEV-490860 - Do signal_is_required detection post graph schedule
Change-Id: Iaf1067a811aeac3d16c08de954036e219b545e07


[ROCm/clr commit: 89dfdc4dbe]
2024-12-09 03:57:44 -05:00
Jaydeep Patel a7406bbd5d SWDEV-502532 - Exit graph launch in case of empty graph.
Change-Id: Ifb6ab14ca6810cbc1c9e38c59d1d9e7d367358d9


[ROCm/clr commit: 0d4823ff88]
2024-12-07 12:27:53 -05:00
Ioannis Assiouras dc3ca8aab7 SWDEV-497759 - Fix memObj offset computation for hipHostRegister on Windows mgpu
On Windows, hipHostRegister may add a single object in the MemObjMap
that maps to memory that is allocated on different devices.
This change ensures that the offset that is returned from
getMemoryObject() is computed relative to the memory that is allocated
on the current device.

Change-Id: I5fd3af200bf6f4926fdeaea12dcb9d0154d3a843


[ROCm/clr commit: e80442fdbf]
2024-12-05 16:18:10 -05:00
Marko Arandjelovic 12dc02b4f8 SWDEV-495609 - Change include path for rocclr/utils
- Header files inside rocclr/utils when included from hipamd or opencl should be included as #include "rocclr/utils/xxx.h" instead of "utils/xxx.h"

Change-Id: Ic0760c33b9d091f5620dec67e5482c9698d22093


[ROCm/clr commit: 78f62d3230]
2024-12-05 11:44:20 -05:00
Jatin Chaudhary 163f810941 SWDEV-501779 - add correct function qualifier for fp16 functions
Some functions were __device__ only, but should be __host__ and
__device__, changed them to __HOST_DEVICE__.
Some functions were __HOST_DEVICE__ but were using ockl functions,
changed them to __device__ only.

Change-Id: Ife9e7abe60415bda68f5f9a101e6e7c39ad51064


[ROCm/clr commit: 5122b8c999]
2024-12-05 10:17:18 -05:00
Branislav Brzak cf25aaddcf SWDEV-490860 - Include signal_is_required in dot file dumps
Change-Id: Iec4b433b11fbecb71a4ce68beb7d6f681d25b8e6


[ROCm/clr commit: f2dba978f5]
2024-12-05 04:51:59 -05:00
Shane Xiao dabe311bd8 SWDEV-492049 - Remove the handle of Phy Mem from Memobj
The hipGraph will use VMM by default when allocating memory.
However, the handle of Phy mem has been added to Memobj by default.
Since the Memobj will track the whole address range from handle to
handle + size, this needs the system to reserve the whole address
range. If the range has not been reserved by the system, then
clr may find the Memobj incorrectly.

This patch removes the handle from the Memobj to fix this potential
issue.

Change-Id: I2da38e6b2d11d0d48e1afe66c46899500c290624


[ROCm/clr commit: 231b2410a0]
2024-12-04 19:39:52 -05:00
Saleel Kudchadker 7d7aa8b69c SWDEV-497145 - Use rocr copyOnEngine API for staged copies
- Refactor blit code and clean ASAN instrumentation
- Use unified function for rocr copy
- Enable shader copy path for unpinned writeBuffer/readBuffer paths
- Set GPU_FORCE_BLIT_COPY_SIZE=16 which means we will use BLIT copy for
  pinned copies or unpinned H2D/D2H copies < 16KB

Change-Id: I42045cca79234b340dbf53dafb93044199736ae4


[ROCm/clr commit: 7863eb92dc]
2024-12-04 13:38:13 -05:00
German Andryeyev 6933aa7c29 SWDEV-501403 - Switch to std::shared_mutex for streamSetLock
A shared mutex allows multiple threads to access the list of streams
at the same time.

Change-Id: Ibee64b846cde03321d5b17dbee2829c0bab7e7d6


[ROCm/clr commit: efd3ea4b30]
2024-12-04 12:06:51 -05:00
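A rough sketch of the pattern the commit above describes, assuming C++17: readers take a `std::shared_lock` and can inspect the stream list concurrently, while writers take a `std::unique_lock` for exclusive access. The class `StreamSet`, its members, and the use of `int` stream ids are hypothetical and only illustrate the locking scheme; they are not the clr implementation.

```cpp
#include <algorithm>
#include <cstddef>
#include <mutex>
#include <shared_mutex>
#include <vector>

// Hypothetical container standing in for the runtime's list of streams.
class StreamSet {
 public:
  // Writers need exclusive access, so they take a unique_lock.
  void add(int id) {
    std::unique_lock<std::shared_mutex> lock(mutex_);
    streams_.push_back(id);
  }

  void remove(int id) {
    std::unique_lock<std::shared_mutex> lock(mutex_);
    streams_.erase(std::remove(streams_.begin(), streams_.end(), id),
                   streams_.end());
  }

  // Readers take a shared_lock, so several threads can query at once.
  bool contains(int id) const {
    std::shared_lock<std::shared_mutex> lock(mutex_);
    return std::find(streams_.begin(), streams_.end(), id) != streams_.end();
  }

  std::size_t size() const {
    std::shared_lock<std::shared_mutex> lock(mutex_);
    return streams_.size();
  }

 private:
  mutable std::shared_mutex mutex_;  // mutable so const readers can lock it
  std::vector<int> streams_;
};
```

The benefit shows up when lookups greatly outnumber insertions and removals, since readers no longer serialize behind one another.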
Jatin Chaudhary 5bc1cfa2d9 SWDEV-485945 - use union to convert values
this shows up in some compilers as warnings.

Change-Id: I862cd6baf2edb8161757adc54abb787530489481


[ROCm/clr commit: 063f7ef32a]
2024-12-04 11:15:03 -05:00
Jimbo Xie 91a8ee1a96 SWDEV-485672 - LOG_INFO corrected to LOG_ERROR for errors
Change-Id: I8ab5f2117dfd7725bd4ed8b178e370096aa31018


[ROCm/clr commit: 6c755a4116]
2024-12-04 01:18:01 -05:00
Anusha GodavarthySurya b383c1e443 SWDEV-469422 - Always schedule graph nodes
Change-Id: Icc636527fa19e7bf3eb111bc4b1bb9a5f9acff73


[ROCm/clr commit: b89977d518]
2024-12-03 23:44:23 -05:00
Saleel Kudchadker 957b460014 SWDEV-494149 - Improve hipGet/Set Device
Change-Id: If8975687a3ba9caadafc48a0066f19a4ebaab9e2


[ROCm/clr commit: 6611cc015d]
2024-12-03 13:36:38 -05:00
Sourabh Betigeri 1712acdd2e Revert "SWDEV-440866 - [hip-roclr] Adds support to batch memory operations APIs"
This reverts commit ab0ff9163d.

Reason for revert: hipInfo fails on windows. Updating llvm amd-mainline-closed

Change-Id: I57e1fa1945188b0bc0a799c4f3d540f2b7713003


[ROCm/clr commit: 2ca644cf22]
2024-12-02 16:46:12 -05:00
Marko Arandjelovic 36ba236426 SWDEV-499794 - Update AQL packet after updating GraphNode
Change-Id: I332d70bdf42a276894a548a02d636e370c2ca08c


[ROCm/clr commit: 08aee16573]
2024-12-02 12:29:35 -05:00
Aidan Belton-Schure 838cfe1d29 SWDEV-485827 release hostcall listener memory regardless of thread status
The early return if the thread is not alive causes memory leaks.
Neither doorbell_ nor urilocator is released if the thread is not alive.

This change alters the logic so regardless of the thread status the
HostcallListener releases its memory.

Change-Id: Ie912360ec0e2ee257de9937b1a8d7375e6aebd83


[ROCm/clr commit: f0063ba8da]
2024-12-02 04:42:56 -05:00
Sameer Sahasrabuddhe d05c0310d8 SWDEV-491314 - enable _sync() functions with 64-bit mask argument
Change-Id: Ieb13a9e1b2fc49ff225a05a51056d1212d95ae57


[ROCm/clr commit: 4e2fd192eb]
2024-12-01 10:16:59 -05:00
Sourabh Betigeri ab0ff9163d SWDEV-440866 - [hip-roclr] Adds support to batch memory operations APIs
Change-Id: I449ffca44bbb04d13348d112e896d603c70fd485


[ROCm/clr commit: bd5d8e9baf]
2024-11-30 17:54:32 -05:00
Anusha GodavarthySurya a3c4a5a19c SWDEV-469422 - Cleanup graph code remove parallellists and nodewaitlists
Change-Id: I00c7b2894333bd13d47b913d3fcdd6e1ffcb741f


[ROCm/clr commit: c47f9dda58]
2024-11-30 04:40:51 -05:00
taosang2 f3e3d8178b SWDEV-447973 - Support generic targets
Change-Id: I32db83843e45e0f013591493aafd7a532c881e16


[ROCm/clr commit: f1f4f40c5b]
2024-11-29 10:12:10 -05:00
Vladana Stojiljkovic 3e8d5599d4 SWDEV-494612 - Add capture support for hipLaunchCooperativeKernel
Change-Id: I6b3c6af55c60cffd43ce6f47b75998f750b75703


[ROCm/clr commit: b75b0d9a53]
2024-11-29 08:17:41 -05:00
Anusha GodavarthySurya ac927dd94e SWDEV-489084 - Update max streams for graph
Change-Id: I6d0992b2e80ebf3184911593a4f3574327b2e9c3


[ROCm/clr commit: fb7ad8361c]
2024-11-29 08:16:16 -05:00
Anusha GodavarthySurya c34f55babb SWDEV-489084 - Avoid using queue colliding with the graph launch stream
Change-Id: I3ecaf8836c8e0883441275139041c702aba0937e


[ROCm/clr commit: 06e6561eb5]
2024-11-29 08:15:58 -05:00
Sebastian Luzynski f421f02546 SWDEV-465085 - replace asserts inside API calls
This change replaces some asserts, that were only available in debug
mode, with standard error handling.

Signed-off-by: Sebastian Luzynski <Sebastian.Luzynski@amd.com>
Change-Id: I112f9e56f921abd72daf0d11e4ecdcb7b1a9f9e6


[ROCm/clr commit: 019abdc3bd]
2024-11-29 04:11:39 -05:00
Marko Arandjelovic ae5bebeb5f SWDEV-489617 - Make any host to any host memcpy synchronous
Change-Id: I2a29d1a433508f9b4b67b48c47bb4a4eebac0cb3


[ROCm/clr commit: e94d9b1763]
2024-11-29 03:48:28 -05:00
Aidan Belton-Schure c59a9b3253 SWDEV-485827 release initial_heap_buffer_
This PR adds the initialization and release of initial_heap_buffer_
to prevent memory leaks.

Change-Id: I4ab8721b439a1a3a6f6e53d63d870e572f7c984a


[ROCm/clr commit: f42a87dc2f]
2024-11-28 10:31:26 -05:00
Satyanvesh Dittakavi 5a16db0cd5 SWDEV-477584 - Match hipGetLastError behavior with CUDA using env var
Change-Id: I4c5acff180ae904028f7c5fdf4e109ffd1f0c4ef


[ROCm/clr commit: e3b8754448]
2024-11-28 01:33:52 -05:00
Anusha GodavarthySurya bfc89974e0 SWDEV-472840 SWDEV-461980 - Fix null stream sync performance
=> If null stream is not created during sync skip nullstrm creation
=> Do cpu wait on blocking & null stream if it exists

Change-Id: I90d6ced6a2dd1782ba58f3fed4e3608fc0efa55a


[ROCm/clr commit: 17e7b7c2ef]
2024-11-27 10:29:15 -05:00
Aidan Belton-Schure 0cb25faf88 SWDEV-436099 Use new amdgcn_ballot builtin
Change-Id: I024fabc6c5b3f39c66885eb7615953f4d0432e9a


[ROCm/clr commit: 9652d69575]
2024-11-27 04:34:50 -05:00
Satyanvesh Dittakavi 04af42368b SWDEV-494808 - Do not allow hipMallocAsync/hipFreeAsync when another stream is capturing
hipMallocAsync/hipFreeAsync APIs should return error stating
operation is not supported, if a stream is actively capturing
and is different from the passed stream

Change-Id: I2a1b8260c5eb22d99a936ac529d6788a83f81a17


[ROCm/clr commit: 70b20857e9]
2024-11-26 12:12:56 -05:00