Commit Graph

12372 Commits

Author SHA1 Message Date
ksankisa 3bcd901f06 [SWDEV-469495] Compile blit kernels with -fsanitize=address when asan is enabled.
Change-Id: I96e1abef43317cd58329c4a159f807878bc48cf4


[ROCm/clr commit: e76bf653fb]
2024-08-27 01:27:31 -04:00
Sameer Sahasrabuddhe aacf75f480 SWDEV-480725: missing __ockl_wfall __ockl_wfany in amd_hip_bf16.h
Change-Id: Iff4aeec411bfeaf4cc187c515e2da3d5898f89cb


[ROCm/clr commit: 6df2da65cd]
2024-08-25 22:49:14 -04:00
kjayapra-amd f370cced08 SWDEV-479620 - Change argument type to size_t from uint64_t in nonTemporalMemcpy function.
Change-Id: I31f8a2b00685789b027d78be40a9f82c235f51b9


[ROCm/clr commit: 00eb038eec]
2024-08-24 07:42:37 -04:00
Ajay c774a3470d SWDEV-478881 - Fix log AMD_LOG file corruption
hiprtc and hip APIs use the same file.
Append to the file instead of writing from the start of the file.

Change-Id: I2703f9bb67f0c51b557a058daab129679a0b5dd9


[ROCm/clr commit: e07172ff57]
2024-08-23 11:19:48 -04:00
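
The append-versus-overwrite behaviour described in commit c774a3470d can be illustrated with a minimal sketch. This is not the runtime's logging code; `appendLogLine` and the file path are hypothetical names chosen for illustration. It only shows that opening a shared file in append mode keeps what an earlier writer left there.

```cpp
#include <cassert>
#include <fstream>
#include <string>

// Hypothetical helper: appends one line to a log file shared by several writers.
// std::ios::app moves the write position to the end of the file before every
// write, so lines written earlier are preserved instead of being overwritten.
bool appendLogLine(const std::string& path, const std::string& line) {
  std::ofstream out(path, std::ios::out | std::ios::app);
  if (!out) {
    return false;  // the file could not be opened
  }
  out << line << '\n';
  return static_cast<bool>(out);
}
```

Two callers that each open the same path with this helper add to the file in turn, rather than each starting again at offset zero.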
kjayapra-amd 2af78c954b SWDEV-478097 - Check for parents size in case of VA Mem object.
Change-Id: Icfdeabeb178c0dcc8c3a4bc48eec40067985794e


[ROCm/clr commit: d7b097c994]
2024-08-22 14:18:51 -04:00
Julia Jiang eac092348a SWDEV-479940 - Update the changelog for 6.3
Change-Id: I2b465d297466b9c4884e30649bd2ea12a4c4229c


[ROCm/clr commit: 6576be5602]
2024-08-22 11:28:46 -04:00
Shane Xiao 06912065d9 [SWDEV-479204] Fix the hipGraph AQL package fill issue
This patch fixes a potential issue where the AQL header is filled before
the AQL body. The HSA spec states: "Packet processors may
process AQL packets after the packet format field is updated, but
before the doorbell is signaled."
However, the hipGraph AQL packet was given a valid header
before its body was filled, so the CP could
receive an invalid AQL body.

Change-Id: I84af798c19ee2b8805ba19732b0eabdea2958a96


[ROCm/clr commit: 3959b5be1e]
2024-08-21 21:49:11 -04:00
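
The ordering requirement from commit 06912065d9 can be sketched as follows. This is a simplified illustration, not the clr implementation: the `Packet` layout, the header constants, and `publishPacket` are hypothetical, and the constant values are placeholders rather than the real HSA encoding. The point is only that the body is written first and the header is stored last with release ordering.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Placeholder header values for this sketch, not the real HSA encoding.
constexpr uint16_t kInvalidHeader = 0;
constexpr uint16_t kDispatchHeader = 2;

// Hypothetical, heavily simplified packet slot in a ring buffer.
struct Packet {
  std::atomic<uint16_t> header{kInvalidHeader};
  uint32_t gridSizeX = 0;
  uint64_t kernelObject = 0;
};

// Fill the body while the header still marks the slot invalid, then publish
// the header last. The release store keeps the body writes from being
// reordered after the header store, so a consumer that observes a valid
// header (with an acquire load) also observes the complete body.
void publishPacket(Packet& slot, uint32_t gridSizeX, uint64_t kernelObject,
                   uint16_t header) {
  slot.gridSizeX = gridSizeX;
  slot.kernelObject = kernelObject;
  slot.header.store(header, std::memory_order_release);
}
```

Writing the header first would open a window in which a packet processor could see a valid header paired with a stale body, which is the situation the commit message describes.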
Rahul Manocha 1b14058283 SWDEV-474617 SWDEV-464679 - Fix segfault in palvirtual due to peer memory access
Change-Id: Ib8b641712d78acf8bc073ca5705dea97af6f944a


[ROCm/clr commit: 432bdd7bf2]
2024-08-21 11:34:15 -04:00
Sourabh Betigeri d1d6c448c9 SWDEV-462192 SWDEV-459056 - Fixes corruption
SPT is destroyed with hipDeviceReset(). If a
stream is created right after the reset, the same
object id could be reused. Later, the SPT destructor
incorrectly concludes that the stream is valid
because it refers to the reused object id, which causes the
corruption.

Change-Id: I3b1f7ffdf8bab874dca7b8fde22318162997b8f6


[ROCm/clr commit: f6a68b3c2e]
2024-08-21 11:33:44 -04:00
Ioannis Assiouras b5acdd6fdc SWDEV-470612 - Added fixes in optimized multistream path for graph execution
This change adds fixes in the optimized multistream path for childGraph use cases.

1) For childgraph nodes, rely on runNodes() only to process
   the childgraph and skip calls to createCommand and enqueueCommands.
   This ensures that the start/end markers are enqueued correctly
   with respect to the childGraph commands.
   In addition, the runNodes() for the childgraph should be called after
   the dependency walkthrough to make sure that the subgraph is executed once.

2) Nodes with no outgoing edges should be marked
   as leaves regardless of which stream they are assigned to.
   This is to ensure that marker dependencies from nodes
   that run on non-zero stream to subgraph leafs that run on zero stream
   are still set up correctly.

Change-Id: I4a5f4f3b0e0d01e515cdcb045b46c2798f291255


[ROCm/clr commit: 464b99373b]
2024-08-21 10:11:24 -04:00
Anusha GodavarthySurya c2a4062392 SWDEV-470612 - Add stream id to DOT print when DEBUG_HIP_GRAPH_DOT_PRINT is enabled
Change-Id: Iec3630ba6fb2206925653ea939770bb9820d7c52


[ROCm/clr commit: 19bf971134]
2024-08-21 00:37:41 -04:00
taosang2 785d6e7d01 SWDEV-475144 - Fix random language string
Fix random language string that leads to a compilation failure
of the trap handler and a TDR of hipMemset() on a VM in release
mode of hip-rt

Change-Id: Ie1d874742b804f62ceda68064fa54f5d39c092b8


[ROCm/clr commit: 857d0d60b9]
2024-08-20 17:42:31 -04:00
kjayapra-amd 457e46551d SWDEV-439234 - set access for vmm memory on graph/mempool path.
Change-Id: Idfb740dcfe6c7fe0f18231de3074a81d06e6886e


[ROCm/clr commit: e72d5a4443]
2024-08-19 13:16:30 -04:00
kjayapra-amd 30609c2e65 SWDEV-465509 - Save the handle type during import function.
Change-Id: If069abb6cd474a7b071617757041402b53575414


[ROCm/clr commit: 8ddb023512]
2024-08-16 10:55:36 -04:00
Ioannis Assiouras 1110a2f345 SWDEV-470372 - Un-deprecate hipHostAlloc, comply with cuda and introduce hipHostAlloc flags
Change-Id: I8165342825dfe07b6e9edc492d0166d0a03be62d


[ROCm/clr commit: 1e4c60f286]
2024-08-15 18:25:22 -04:00
Ioannis Assiouras 3cc948278e SWDEV-449052 - Fix hipMemcpyParam2D when source or destination pitch is set to zero
When the source or destination pitch is set to zero in the hip_Memcpy2D struct,
it should default to WidthInBytes + [src/dst]XInBytes.

Change-Id: Id57b53cab40ba72ced231258da9356554c4868c3


[ROCm/clr commit: 7a1e818c82]
2024-08-14 04:46:41 -04:00
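
The defaulting rule stated in commit 3cc948278e can be written as a small helper. This is only a sketch of the rule as the commit message words it; `effectivePitch` and its parameter names are hypothetical and are not taken from the HIP sources.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: returns the pitch to use for one side of a 2D copy.
// A pitch of zero means "not specified", in which case the rule from the
// commit message applies: WidthInBytes + [src/dst]XInBytes.
std::size_t effectivePitch(std::size_t pitch, std::size_t widthInBytes,
                           std::size_t xInBytes) {
  return pitch != 0 ? pitch : widthInBytes + xInBytes;
}
```

For example, a zero pitch with a width of 256 bytes and an X offset of 16 bytes resolves to 272 bytes, while an explicit pitch is passed through unchanged.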
amd-jmacaran 5dac731f29 SWDEV-458516 - External CI: Align with branch naming convention.
Change-Id: Ie1d874742b804f02ceda68064fa54f5d59c092b7


[ROCm/clr commit: cd4ed0916b]
2024-08-13 11:47:11 -04:00
Satyanvesh Dittakavi 6907974f90 SWDEV-473942 - SWDEV-431367 - Correct atomicMax(_system) and atomicMin(_system)
- Fixes the -0.0 and +0.0 comparison. For atomicMax, if the value at the
address is -0.0 and val is +0.0, gfx90a's unsafe atomics will swap
them. The CAS loop should behave consistently with this.

- The _system variants of atomicMax and atomicMin produce
incorrect output. Updated these to use an implementation similar to
atomicMax and atomicMin.

Change-Id: I20df36ee29ae0434a6b564f2ba71193fe41cfa59


[ROCm/clr commit: d69cc35750]
2024-08-13 10:38:50 -04:00
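
The signed-zero case from commit 6907974f90 can be illustrated on the host with a compare-and-swap loop. This is a sketch under stated assumptions, not the HIP device implementation: `orderedBefore` and `atomicMaxFloat` are hypothetical names, and it uses `std::atomic<float>` rather than device atomics. It treats -0.0 as smaller than +0.0, so the loop swaps them in the way the commit message describes for the hardware atomics.

```cpp
#include <atomic>
#include <cassert>
#include <cmath>

// Returns true when a should be replaced by b in a max operation.
// Plain `a < b` is false for -0.0 versus +0.0 because they compare equal,
// so equal values are additionally ordered by sign bit: -0.0 before +0.0.
bool orderedBefore(float a, float b) {
  if (a == b) {
    return std::signbit(a) && !std::signbit(b);
  }
  return a < b;  // false whenever either operand is NaN
}

// Hypothetical CAS-loop max; returns the value that was stored before the call.
float atomicMaxFloat(std::atomic<float>& target, float val) {
  float cur = target.load(std::memory_order_relaxed);
  while (orderedBefore(cur, val)) {
    // On failure, compare_exchange_weak reloads the current value into cur
    // and the condition is checked again.
    if (target.compare_exchange_weak(cur, val, std::memory_order_relaxed)) {
      break;
    }
  }
  return cur;
}
```

A loop that used only `cur < val` would leave -0.0 in place when asked to take the maximum with +0.0, which is the inconsistency the commit addresses.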
Satyanvesh Dittakavi 6bd51db0b1 SWDEV-475185 - Handle device id for hipStreamLegacy
Change-Id: Ib56e6edb77a923f3f9738df64cb9d9ef0b4ba564


[ROCm/clr commit: aa6d07518f]
2024-08-12 09:59:17 -04:00
Ajay 3d22e51806 SWDEV-471863 - APU: device allocation greater than invisible memory
Change-Id: I37f1769873ac7dcbb3cfa51fd815ee1e2123aeae


[ROCm/clr commit: ec0971dd08]
2024-08-09 14:29:18 -04:00
Rahul Manocha 436271e407 SWDEV-468039 - FP8 host only conversion support on mi200
Change-Id: I0891f42d1b7c0d94d099fe26df5db3eff64ba564


[ROCm/clr commit: 39bbc0341d]
2024-08-07 20:51:00 -04:00
Jaydeep Patel c1f83df84c SWDEV-474937 - Fix race condition between main and work thread on windows.
Change-Id: I4d6b9de41d0e5a39094eb3babe47dffde72e0587


[ROCm/clr commit: 912de7ab44]
2024-08-07 14:29:14 -04:00
Alex Xie a381538161 SWDEV-444098 remove "rocm-ocl-icd" package
This is the first step to remove rocm-ocl-icd.
We don't build amd icd after this commit.
We still need to remove header files usage in future steps.

Change-Id: Ic4ac5476180f9ef2ce87b62891c08b28d6c9bfd2


[ROCm/clr commit: 5f775b8b7f]
2024-08-07 11:29:41 -04:00
Jaydeep Patel 12eea11370 SWDEV-457316 - Release graph exec before stream gets deleted.
Release the graph exec after the wait completes and before deleting the hip::stream object
during stream destroy.

Change-Id: I1d68aa8d844f7d3af330c6d09c44af07f8553551


[ROCm/clr commit: 8e80429b87]
2024-08-06 00:39:37 -04:00
Jaydeep Patel 82474ca1db SWDEV-465220 - Validate stream on which Kernel is planned to be launched.
Change-Id: I34c679bd888c275584c11ad3e8346d4d542976f9


[ROCm/clr commit: b0047d690a]
2024-08-06 00:31:22 -04:00
Jaydeep Patel 2aafd5a30c SWDEV-457316 - Multiple graph exec can be for given stream.
Change-Id: I0f1b184eb63e0432119d62f094637d375a3d4e55


[ROCm/clr commit: d954eb64db]
2024-08-06 00:31:04 -04:00
Jaydeep Patel c51153f759 SWDEV-470886 - Add maybe_undef attribute for shfl device function due to not all lanes of wave define var and compiler needs to know about this.
Change-Id: I3a683887e033305ac55362f356838b491a6d50f2


[ROCm/clr commit: 6344ddb2f3]
2024-08-05 00:53:13 -04:00
German Andryeyev 35c7a87014 SWDEV-470612 - Add the optimized multistream path
- Added the optimized multi stream path in graph execution. It uses a fixed number of async streams in the execution
- Optimize the launch latency, where command
creation and execution are done at the same time
- Optimize the scheduling to use fewer barriers and waiting signals if
the same queue can be detected
- The new path is controlled by the DEBUG_HIP_FORCE_GRAPH_QUEUES
environment variable, where 0 will use the original path and any other
value will force the number of asynchronous queues for execution
- DEBUG_HIP_FORCE_ASYNC_QUEUE can force single-queue async
execution in graphs (applicable to Navi families only)

Change-Id: I7eb40bc15c45f508d6911868a6f6d4c3598d380e


[ROCm/clr commit: 9db52f9a46]
2024-08-02 14:19:44 -04:00
Anusha GodavarthySurya 31927fefd6 SWDEV-468424 - Add support to capture multiple AQL Packets
=> Added support to capture multiple AQL Packets.
=> Added an interface for rocclr to call back into the hip runtime to allocate
kernel args from the graph kernel arg pool.
=> Enabled Support to capture memset node.

Change-Id: I7e1c2ba06927459e024653058af142bd82192c43


[ROCm/clr commit: bd3a35bde1]
2024-08-01 23:55:51 -04:00
Julia Jiang 6fa0563ed6 SWDEV-436608 - Un-deprecate hipHostAlloc()
Change-Id: I71393e6f536ba84b4e172acf54ba4f72350e2ae8


[ROCm/clr commit: e988e5e448]
2024-08-01 11:45:20 -04:00
Marko Arandjelovic 7cd2515908 SWDEV-465204 - Fix hipModuleLaunchKernel data validation
Change-Id: I129f265a5eb79d0a13da4f12e78e06ba307b17ee


[ROCm/clr commit: be5f097e8e]
2024-08-01 05:09:23 -04:00
Vladana Stojiljkovic 91800f18ea SWDEV-465142 - Copy memAllocNodePtrs_ when cloning graph
Change-Id: I5a0907e59397e71b44db59c44b551b74a6e59ba0


[ROCm/clr commit: d62c1dea72]
2024-08-01 11:05:06 +02:00
Vladana Stojiljkovic 6fcb2c655f SWDEV-475127 - Check if hipBindTextureToArray parameters are null before dereferencing them
Change-Id: Id0173faff0a385d1665194c9033083ef9b2c48b5


[ROCm/clr commit: d7b07b94a0]
2024-08-01 05:01:55 -04:00
Ioannis Assiouras 42e8d3c894 SWDEV-476460 - Fix for a race condition in SysmemPool::Alloc
Change-Id: Ia94709e68b236c9460589963c0f09ec1f481c306


[ROCm/clr commit: 8e137e8702]
2024-08-01 04:22:26 -04:00
Marko Arandjelovic 92ddb8e242 SWDEV-475114 - Prevent segfaults in hipBindTexture
Change-Id: I050f36a5c74a5d4542155040ccce043fee6b73ad


[ROCm/clr commit: b3153a5f41]
2024-07-31 16:57:57 -04:00
Marko Arandjelovic 662ba4701c SWDEV-461791 make memcpy synchronous for D2D if src&dst ptrs have SYNC_MEMOPS attribute
Change-Id: I603081d21e5eb3c73111845e350d8fa2ba5a7733


[ROCm/clr commit: 7d0ff387e9]
2024-07-30 11:46:55 -04:00
Ioannis Assiouras 19d16561a4 SWDEV-472309 - Ensure static maps are destroyed after __hipUnregisterFatBinary
hipDeviceSynchronize called from __hipUnregisterFatBinary
accesses static maps and monitors. This change ensures these objects
are not destroyed before __hipUnregisterFatBinary is called.
Additionally, it disables the teardown process for static builds.

Change-Id: I46b58641d60efcf6637a8e99cdd786ffe9e2c77d


[ROCm/clr commit: 9b33db9b24]
2024-07-30 10:26:59 -04:00
Sourabh Betigeri f518648183 SWDEV-365151 - Fix the fns32 to do 32 bit computation and adds a wrapper to ease porting from CUDA
Change-Id: I0b5a9ca11c98f8c1c40cfba7f4e057bfda2d756e


[ROCm/clr commit: 7298b80112]
2024-07-26 11:20:14 -04:00
Saleel Kudchadker abfe135e4f SWDEV-475341 - Fix stream resolution for graphs launches
This issue was caused by incorrect usage of the getStream call.
If we get the null stream first, typecast it, and then call
getStream again, we lose the advantage of simply passing "nullptr" to
indicate the NULL stream. We then enter the waitActiveStream call and add
barriers to sync across streams.

Change-Id: I94dc4e3ec927295b9e1ab6dee4b37d7d3e00b0cc


[ROCm/clr commit: cda4b7db1c]
2024-07-25 19:38:23 -04:00
Sourabh Betigeri 5b0cc86295 SWDEV-475394 - Fix for the return type to be in-line with CUDA
Change-Id: I7c833571d47b4e86a86e4a0095b61947d16ecab6


[ROCm/clr commit: 4fbd7abbb2]
2024-07-25 16:33:33 -04:00
Saleel Kudchadker 16920809d7 SWDEV-301667 - Refactor Blit force env var
Change-Id: I5344ac2e6442cd8f526118e688f1b1412cc5b45a


[ROCm/clr commit: d379f4efd0]
2024-07-25 15:15:10 -04:00
Rahul Manocha de67a2a1dc SWDEV-468039 - FP8 OCP headers
Change-Id: Iecd32c5a0357781da07395d32f894415954b7b22


[ROCm/clr commit: 353f15afa6]
2024-07-25 12:42:23 -04:00
German Andryeyev fffd8d8190 SWDEV-470612 - Avoid copying a vector on the stack
Change-Id: Ia5fc7d1f77d2519dedeedb2c82c26efebb03d1d3


[ROCm/clr commit: c6b8d69158]
2024-07-25 10:09:19 -04:00
German Andryeyev 9d1d3a6493 SWDEV-470612 - Avoid processing internal signals
If only external signals were provided, then just process it
without adding internal signals

Change-Id: Iaefd65d0f8b0a64b9f6a864a9bd73de20a29dfa4


[ROCm/clr commit: 18187cd8fe]
2024-07-25 10:08:16 -04:00
taosang2 47dcfbae6b SWDEV-458943 - make new AMD_MONITOR on
Make DEBUG_CLR_USE_STDMUTEX_IN_AMD_MONITOR default to true

Change-Id: I1d21378ff462478d3238d71e4e2a1a7d6b9167ac


[ROCm/clr commit: f8598dabb0]
2024-07-24 14:29:27 -04:00
Maneesh Gupta c17cffba63 SWDEV-459583 - Fix codeowners file
Change-Id: Ib03328a7fb13375fa44626a40202b1eeb177b8b5


[ROCm/clr commit: 66943288a5]
2024-07-24 08:20:37 +00:00
German Andryeyev cd3b700351 SWDEV-469602 - Force unaligned memory mode with RDP
Change-Id: I770f3dc8dde49d8e4ecdf5c38819e44df3960bce


[ROCm/clr commit: 1bac09ea20]
2024-07-23 18:31:52 -04:00
cadolphe a82e0fe333 SWDEV-462404 - Fix num_mip_levels for 1D Buffer
Update the num_mip_levels field to better align with the OpenCL specification, which states that mip-mapped images cannot be created for CL_MEM_OBJECT_IMAGE1D_BUFFER images. Added a check for the miplevels value used for the clCreateImage call.

Change-Id: I82a25b83ef0637a877409572b7976d9e4413dfac


[ROCm/clr commit: 21a1c9075a]
2024-07-23 11:16:38 -04:00
Jatin Chaudhary 0193d66679 SWDEV-466747 - add shfl functions in bfloat16
Change-Id: Ide7d7e1d449783cced8867abf43ff45f5bce113a


[ROCm/clr commit: e43176bde9]
2024-07-23 10:51:02 -04:00
taosang2 f33637b1c6 SWDEV-474091 – Fix sporadic crash in streamcallback test
Also in the scope of SWDEV-467540.
Fix sporadic crash in Unit_hipStreamAddCallback_MultipleThreads by
deferring release() of block_command.
The test runs 1000 threads on the same stream, so the original
code could free block_command too early.
By deferring release() of block_command we make sure block_command
remains valid while block_command->notifyCmdQueue() is called.

Change-Id: I31555ee18e6958e34b89f04181867fa4e932a38c


[ROCm/clr commit: e3ef19e22a]
2024-07-23 10:24:10 -04:00
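
The idea of deferring release() from commit f33637b1c6 can be sketched with a generic reference-counted object. This is an illustration only: `Command`, `runCallback`, and the logging vector are hypothetical and do not reproduce the clr classes. It shows that when the callback path drops its reference only after its last use, the object is still alive during the notify call even if another owner has already released its own reference.

```cpp
#include <atomic>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a reference-counted command object.
class Command {
 public:
  explicit Command(std::vector<std::string>& log) : log_(log) {}

  void retain() { refs_.fetch_add(1, std::memory_order_relaxed); }

  // Drops one reference; the object deletes itself when the last one is gone.
  void release() {
    if (refs_.fetch_sub(1, std::memory_order_acq_rel) == 1) {
      log_.push_back("destroyed");
      delete this;
    }
  }

  void notifyCmdQueue() { log_.push_back("notified"); }

 private:
  ~Command() = default;  // only release() may destroy the object
  std::vector<std::string>& log_;
  std::atomic<int> refs_{1};  // the creator holds the first reference
};

// The callback path keeps its own reference until after the last use.
void runCallback(Command* cmd) {
  cmd->notifyCmdQueue();  // last use of the object on this path
  cmd->release();         // deferred: dropped only after the notify
}
```

In a usage such as creating the object, calling retain() on behalf of the callback, releasing the creator's reference, and then invoking `runCallback`, the notify step runs before the object is destroyed.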