Commit Graph

12351 Commits

Author SHA1 Message Date
Jaydeep Patel c1f83df84c SWDEV-474937 - Fix race condition between main and work thread on Windows.
Change-Id: I4d6b9de41d0e5a39094eb3babe47dffde72e0587


[ROCm/clr commit: 912de7ab44]
2024-08-07 14:29:14 -04:00
Alex Xie a381538161 SWDEV-444098 remove "rocm-ocl-icd" package
This is the first step to remove rocm-ocl-icd.
We don't build amd icd after this commit.
We still need to remove header files usage in future steps.

Change-Id: Ic4ac5476180f9ef2ce87b62891c08b28d6c9bfd2


[ROCm/clr commit: 5f775b8b7f]
2024-08-07 11:29:41 -04:00
Jaydeep Patel 12eea11370 SWDEV-457316 - Release graph exec before stream gets deleted.
Release graph exec after the wait completes and before deleting the hip::stream object
during stream destroy.

Change-Id: I1d68aa8d844f7d3af330c6d09c44af07f8553551


[ROCm/clr commit: 8e80429b87]
2024-08-06 00:39:37 -04:00
Jaydeep Patel 82474ca1db SWDEV-465220 - Validate stream on which Kernel is planned to be launched.
Change-Id: I34c679bd888c275584c11ad3e8346d4d542976f9


[ROCm/clr commit: b0047d690a]
2024-08-06 00:31:22 -04:00
Jaydeep Patel 2aafd5a30c SWDEV-457316 - Multiple graph execs can exist for a given stream.
Change-Id: I0f1b184eb63e0432119d62f094637d375a3d4e55


[ROCm/clr commit: d954eb64db]
2024-08-06 00:31:04 -04:00
Jaydeep Patel c51153f759 SWDEV-470886 - Add maybe_undef attribute for shfl device functions, because not all lanes of a wave define var and the compiler needs to know about this.
Change-Id: I3a683887e033305ac55362f356838b491a6d50f2


[ROCm/clr commit: 6344ddb2f3]
2024-08-05 00:53:13 -04:00
German Andryeyev 35c7a87014 SWDEV-470612 - Add the optimized multistream path
- Added the optimized multistream path in graph execution. It uses a fixed number of async streams in the execution
- Optimize the launch latency: command creation
and execution are done at the same time
- Optimize the scheduling to use fewer barriers and waiting signals if
the same queue can be detected
- The new path is controlled by the DEBUG_HIP_FORCE_GRAPH_QUEUES
environment variable, where 0 will use the original path and any other
value will force the number of asynchronous queues for execution
- DEBUG_HIP_FORCE_ASYNC_QUEUE can force single-queue async
execution in graphs (applicable for Navi families only)

Change-Id: I7eb40bc15c45f508d6911868a6f6d4c3598d380e


[ROCm/clr commit: 9db52f9a46]
2024-08-02 14:19:44 -04:00
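
The selection rule described in the commit message above (0 keeps the original path, any other value forces that many asynchronous queues) can be modelled as a small sketch. This is not the runtime's code: `select_graph_path` and the returned labels are hypothetical names used only for illustration, and the behaviour when the variable is unset is not stated in the commit message, so the sketch returns `None` for that case rather than guess a default.

```python
from typing import Mapping, Optional, Tuple

ENV_VAR = "DEBUG_HIP_FORCE_GRAPH_QUEUES"  # name taken from the commit message


def select_graph_path(env: Mapping[str, str]) -> Optional[Tuple[str, Optional[int]]]:
    """Model of the rule in the commit message (hypothetical helper).

    Returns None when the variable is unset, because the commit message
    does not say what the default is.
    """
    raw = env.get(ENV_VAR)
    if raw is None:
        return None
    value = int(raw)
    if value == 0:
        # 0 selects the original graph execution path.
        return ("original", None)
    # Any other value forces that number of asynchronous queues.
    return ("multistream", value)
```

For example, `select_graph_path({"DEBUG_HIP_FORCE_GRAPH_QUEUES": "4"})` yields `("multistream", 4)` in this model.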
Anusha GodavarthySurya 31927fefd6 SWDEV-468424 - Add support to capture multiple AQL Packets
=> Added support to capture multiple AQL Packets.
=> Added Interface to callback to hip runtime from rocclr to allocate
kernel args from the graph kernel arg pool.
=> Enabled Support to capture memset node.

Change-Id: I7e1c2ba06927459e024653058af142bd82192c43


[ROCm/clr commit: bd3a35bde1]
2024-08-01 23:55:51 -04:00
Julia Jiang 6fa0563ed6 SWDEV-436608 - Un-deprecate hipHostAlloc()
Change-Id: I71393e6f536ba84b4e172acf54ba4f72350e2ae8


[ROCm/clr commit: e988e5e448]
2024-08-01 11:45:20 -04:00
Marko Arandjelovic 7cd2515908 SWDEV-465204 - Fix hipModuleLaunchKernel data validation
Change-Id: I129f265a5eb79d0a13da4f12e78e06ba307b17ee


[ROCm/clr commit: be5f097e8e]
2024-08-01 05:09:23 -04:00
Vladana Stojiljkovic 91800f18ea SWDEV-465142 - Copy memAllocNodePtrs_ when cloning graph
Change-Id: I5a0907e59397e71b44db59c44b551b74a6e59ba0


[ROCm/clr commit: d62c1dea72]
2024-08-01 11:05:06 +02:00
Vladana Stojiljkovic 6fcb2c655f SWDEV-475127 - Check if hipBindTextureToArray parameters are null before dereferencing them
Change-Id: Id0173faff0a385d1665194c9033083ef9b2c48b5


[ROCm/clr commit: d7b07b94a0]
2024-08-01 05:01:55 -04:00
Ioannis Assiouras 42e8d3c894 SWDEV-476460 - Fix for a race condition in SysmemPool::Alloc
Change-Id: Ia94709e68b236c9460589963c0f09ec1f481c306


[ROCm/clr commit: 8e137e8702]
2024-08-01 04:22:26 -04:00
Marko Arandjelovic 92ddb8e242 SWDEV-475114 - Prevent segfaults in hipBindTexture
Change-Id: I050f36a5c74a5d4542155040ccce043fee6b73ad


[ROCm/clr commit: b3153a5f41]
2024-07-31 16:57:57 -04:00
Marko Arandjelovic 662ba4701c SWDEV-461791 make memcpy synchronous for D2D if src&dst ptrs have SYNC_MEMOPS attribute
Change-Id: I603081d21e5eb3c73111845e350d8fa2ba5a7733


[ROCm/clr commit: 7d0ff387e9]
2024-07-30 11:46:55 -04:00
Ioannis Assiouras 19d16561a4 SWDEV-472309 - Ensure static maps are destroyed after __hipUnregisterFatBinary
hipDeviceSynchronize called from __hipUnregisterFatBinary
accesses static maps and monitors. This change ensures these objects
are not destroyed before __hipUnregisterFatBinary is called.
Additionally it disables the teardown process for static build.

Change-Id: I46b58641d60efcf6637a8e99cdd786ffe9e2c77d


[ROCm/clr commit: 9b33db9b24]
2024-07-30 10:26:59 -04:00
Sourabh Betigeri f518648183 SWDEV-365151 - Fix fns32 to do 32-bit computation and add a wrapper to ease porting from CUDA
Change-Id: I0b5a9ca11c98f8c1c40cfba7f4e057bfda2d756e


[ROCm/clr commit: 7298b80112]
2024-07-26 11:20:14 -04:00
Saleel Kudchadker abfe135e4f SWDEV-475341 - Fix stream resolution for graphs launches
This issue was caused by incorrect usage of the getStream call.
If we get the null stream first, then typecast it and call
getStream again, we lose the advantage of simply passing "nullptr" to
indicate the NULL stream. Thus we enter the waitActiveStream call and add
barriers to sync across streams.

Change-Id: I94dc4e3ec927295b9e1ab6dee4b37d7d3e00b0cc


[ROCm/clr commit: cda4b7db1c]
2024-07-25 19:38:23 -04:00
Sourabh Betigeri 5b0cc86295 SWDEV-475394 - Fix the return type to be in line with CUDA
Change-Id: I7c833571d47b4e86a86e4a0095b61947d16ecab6


[ROCm/clr commit: 4fbd7abbb2]
2024-07-25 16:33:33 -04:00
Saleel Kudchadker 16920809d7 SWDEV-301667 - Refactor Blit force env var
Change-Id: I5344ac2e6442cd8f526118e688f1b1412cc5b45a


[ROCm/clr commit: d379f4efd0]
2024-07-25 15:15:10 -04:00
Rahul Manocha de67a2a1dc SWDEV-468039 - FP8 OCP headers
Change-Id: Iecd32c5a0357781da07395d32f894415954b7b22


[ROCm/clr commit: 353f15afa6]
2024-07-25 12:42:23 -04:00
German Andryeyev fffd8d8190 SWDEV-470612 - Avoid copying a vector on the stack
Change-Id: Ia5fc7d1f77d2519dedeedb2c82c26efebb03d1d3


[ROCm/clr commit: c6b8d69158]
2024-07-25 10:09:19 -04:00
German Andryeyev 9d1d3a6493 SWDEV-470612 - Avoid processing internal signals
If only external signals were provided, then just process them
without adding internal signals

Change-Id: Iaefd65d0f8b0a64b9f6a864a9bd73de20a29dfa4


[ROCm/clr commit: 18187cd8fe]
2024-07-25 10:08:16 -04:00
taosang2 47dcfbae6b SWDEV-458943 - Turn the new AMD_MONITOR on
Set DEBUG_CLR_USE_STDMUTEX_IN_AMD_MONITOR to true

Change-Id: I1d21378ff462478d3238d71e4e2a1a7d6b9167ac


[ROCm/clr commit: f8598dabb0]
2024-07-24 14:29:27 -04:00
Maneesh Gupta c17cffba63 SWDEV-459583 - Fix codeowners file
Change-Id: Ib03328a7fb13375fa44626a40202b1eeb177b8b5


[ROCm/clr commit: 66943288a5]
2024-07-24 08:20:37 +00:00
German Andryeyev cd3b700351 SWDEV-469602 - Force unaligned memory mode with RDP
Change-Id: I770f3dc8dde49d8e4ecdf5c38819e44df3960bce


[ROCm/clr commit: 1bac09ea20]
2024-07-23 18:31:52 -04:00
cadolphe a82e0fe333 SWDEV-462404 - Fix num_mip_levels for 1D Buffer
Update the num_mip_levels field to better align with the OpenCL specification, which states that mip-mapped images cannot be created for CL_MEM_OBJECT_IMAGE1D_BUFFER images. Added a check for the miplevels value used in the clCreateImage call.

Change-Id: I82a25b83ef0637a877409572b7976d9e4413dfac


[ROCm/clr commit: 21a1c9075a]
2024-07-23 11:16:38 -04:00
Jatin Chaudhary 0193d66679 SWDEV-466747 - add shfl functions in bfloat16
Change-Id: Ide7d7e1d449783cced8867abf43ff45f5bce113a


[ROCm/clr commit: e43176bde9]
2024-07-23 10:51:02 -04:00
taosang2 f33637b1c6 SWDEV-474091 - Fix sporadic crash in streamcallback test
Also in the scope of SWDEV-467540.
Fix sporadic crash in Unit_hipStreamAddCallback_MultipleThreads by
deferring release() of block_command.
The test invokes 1000 threads on the same stream, so there
is a chance of freeing block_command too early in the original code.
By deferring release() of block_command we can make sure block_command
is always valid while calling block_command->notifyCmdQueue().

Change-Id: I31555ee18e6958e34b89f04181867fa4e932a38c


[ROCm/clr commit: e3ef19e22a]
2024-07-23 10:24:10 -04:00
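
The ordering issue described in the commit message above can be illustrated with a toy reference-counting model. This is only a single-threaded sketch of the idea (release after the last use instead of before it), not the actual fix: the `BlockCommand` class, its fields, and both callback functions are hypothetical stand-ins, and the real crash involved many threads sharing one stream.

```python
class BlockCommand:
    """Toy stand-in for a reference-counted command (hypothetical)."""

    def __init__(self) -> None:
        self.refs = 1          # creation holds one reference
        self.notified = False

    def release(self) -> None:
        self.refs -= 1

    def notify_cmd_queue(self) -> None:
        # In this model, touching the object after the last reference
        # is dropped stands in for a use-after-free.
        if self.refs <= 0:
            raise RuntimeError("block_command used after release")
        self.notified = True


def callback_early_release(cmd: BlockCommand) -> None:
    cmd.release()              # released too early
    cmd.notify_cmd_queue()     # object may no longer be valid


def callback_deferred_release(cmd: BlockCommand) -> None:
    cmd.notify_cmd_queue()     # last use happens while still referenced
    cmd.release()              # release deferred until after the use
```

In this model `callback_early_release` raises, while `callback_deferred_release` completes with the notification delivered and the reference count back at zero.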
kjayapra-amd 30a4b9e316 SWDEV-460948 - Remove dflock, since kernel arguments are part of command now.
Change-Id: I6b5a229307b41bd24ffa0bc172c64ad1154df474


[ROCm/clr commit: 9c03f85f46]
2024-07-22 16:02:01 -04:00
Anusha GodavarthySurya da132a2e28 SWDEV-468424 - Fix kernelArg mgr release and clear commands after capture
Creation of ReferenceCountedObject will increase reference count by 1.
Clear the commands from Node after capture so that they won't be referenced later.

Change-Id: I1cc4085939cf65218ec2aa2e25ab6d737f7cacd3


[ROCm/clr commit: 6ae5d6896c]
2024-07-22 05:16:12 +00:00
Anusha GodavarthySurya 7985a72073 SWDEV-468424 - hipgraph capture memset node
Capture AQL packets during GraphInstantiation and enqueue AQL packets during graph launch.

Added support to capture single graph memset node.
Capture support for memset node is currently disabled.
Memset capture will be enabled when capture of multiple packets is supported.

Change-Id: I14dfbc41731025cc3a548a730558915def3fa384


[ROCm/clr commit: 346da4bb40]
2024-07-19 23:52:50 -04:00
German Andryeyev 7363b984c1 SWDEV-470585 - Disable double copy in HIP
- HIP path doesn't support resource tracking. Thus, double copy can't be enabled,
because it requires resource tracking.

Change-Id: I0f9c4e185b5b2d2b1abde041fca21bb099db9ccd


[ROCm/clr commit: 4c763e45a1]
2024-07-19 18:32:34 -04:00
kjayapra-amd cf28e2b27a SWDEV-439234 - Implement Set/Get Access APIs in PAL/Windows.
Change-Id: I997c330880da70c5128b187e1ef4d9c449218880


[ROCm/clr commit: 11817b4405]
2024-07-19 10:42:41 -04:00
Jatin Chaudhary 6240b203dc SWDEV-466747 - optimize conversions for bfloat16 operations
Since we made the members public, we can optimize some operations which
do not require redundant conversions to half_raw types.

Change-Id: I31555ef18e695d8e24b89f0418187fa4e932a38a


[ROCm/clr commit: 6a655a77e7]
2024-07-17 18:37:25 -04:00
Maneesh Gupta d007896179 SWDEV-472433 - Update year in license
Change-Id: I61a8cf5f361504989a754ed44247c6c02e857a89


[ROCm/clr commit: 375089876a]
2024-07-17 05:14:20 -04:00
Jatin Chaudhary 2418a0aa68 SWDEV-467414 - add sharedMemPerBlockOptin = sharedMemPerBlock
On some platforms a user can ask for extended shared memory for a
particular kernel in some cases. This feature does not exist in HIP at
the moment, so we are setting it to sharedMemPerBlock, which is the
maximum a user can expect for their kernels.

Change-Id: I81005cf0d1c9fb941e77d34fb8385241ffe5bdd0


[ROCm/clr commit: 4b95e7bc87]
2024-07-16 11:00:29 -04:00
kjayapra-amd 573dfa21e1 SWDEV-460113 - Remove the ufd print.
Change-Id: If0d64ea4b6662493784c040aa1ceffafc8efa1c3


[ROCm/clr commit: a5664fc93f]
2024-07-16 10:39:16 -04:00
kjayapra-amd a064de92b8 SWDEV-464828 - Initial implementation of VMM IPC on PAL/Windows.
Change-Id: I3d5e148fad9105704db6724b00df06bef4fc9d2f


[ROCm/clr commit: e7a7feb273]
2024-07-16 10:38:35 -04:00
Satyanvesh Dittakavi da5bff9464 SWDEV-471935 - Destroy hsa queues with cumask set
Fixes the memory leak with the hipExtStreamCreateWithCUMask API.
hsa queues with cumask set are not reused; they are created
every time the API is called. But these queues were not being
destroyed during hipStreamDestroy, causing a memory leak.

Change-Id: Ibfbe019bbd73604e98eca80461efe53fa64bb701


[ROCm/clr commit: 191869b252]
2024-07-16 10:02:42 -04:00
Anusha GodavarthySurya b6d82323e9 SWDEV-468424 - Refactor kernel arg
To refactor childGraph to have its own graphExec,
kernelArgs needs to be separated from the graphExec object.
All the childNodes that are part of a graph should share the same kernelArg pool.
Otherwise we end up creating multiple device kernel arg memory chunks
for a single graphExec.

Change-Id: I4029a46ebc1fa112d87df64ab1fecbf288fabe5e


[ROCm/clr commit: 35079e834e]
2024-07-16 08:38:44 -04:00
Marko Arandjelovic 2e7581a69a SWDEV-441296 - Align hipTexObjectCreate error handling to CUDA
Change-Id: I9ff01c22f14344e0e82e473104d6930e9fa5ff77


[ROCm/clr commit: 7d3c0c5e10]
2024-07-15 15:51:41 -04:00
Julia Jiang 3c7ae28776 SWDEV-472710 - Add gitattributes and remove trailing spaces
Change-Id: Ic8ad2071745f0ffe6a2e120bfebb6d90bf270f87


[ROCm/clr commit: dd30e0e893]
2024-07-15 12:39:56 -04:00
Julia Jiang 3623e54842 SWDEV-472908 - Fix oclConfWimpyfull test failure
Change-Id: I44fddb88353e86a2f37e3ac870ba84cf6cace197


[ROCm/clr commit: 1e0565cc01]
2024-07-12 13:40:48 -04:00
Ioannis Assiouras f3a77127b4 SWDEV-472309 - Check if vmm support exists before enabling vm in mempool
Change-Id: I6ae2fb18a306595e0f3a56e144658a4a720e7a37


[ROCm/clr commit: 0053584aac]
2024-07-12 10:11:03 -04:00
Marko Arandjelovic 6159b0eba0 SWDEV-472345 - Fix coalesced group size
When the tile size is greater than the number of active threads,
the coalesced group size should be equal to the number of active threads.

Change-Id: I1d41322f2428a07862a590cb5d34b01243383b7c


[ROCm/clr commit: 152f343124]
2024-07-12 04:29:53 -04:00
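
The rule stated in the commit message above can be written as a one-line clamp. The sketch below is a model only: `coalesced_group_size` is a hypothetical helper name, and returning the tile size when it does not exceed the number of active threads is an assumption, since the commit message only spells out the other case.

```python
def coalesced_group_size(tile_size: int, active_threads: int) -> int:
    """Model of the rule in the commit message (hypothetical helper)."""
    if tile_size > active_threads:
        # Case stated in the commit message: clamp to the active threads.
        return active_threads
    # Assumption: otherwise the group keeps the requested tile size.
    return tile_size
```

With a tile size of 32 and only 20 active threads, this model gives a group size of 20.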
Jaydeep Patel ca6d126f81 SWDEV-471298 - Use same context during child creation as parent's context.
Change-Id: I41e534b6194cef9aa8e96b28b8e811906cb362f0


[ROCm/clr commit: fb2b87db56]
2024-07-11 23:15:41 -04:00
pghafari e50ce19519 SWDEV-444447 - log print pid/tid only in verbose mode
Change-Id: I2bbe9085d607e9d8d5acda1ed43e3245335d239f


[ROCm/clr commit: 9e6e77b7dd]
2024-07-11 15:39:13 -04:00
Satyanvesh Dittakavi 64c8d338a0 SWDEV-472010 - Add error message reporting unknown kernel arg metadata
Change-Id: I18e45592e58e5766b4c00f758966771f06205ba8


[ROCm/clr commit: dc8259e71e]
2024-07-11 13:56:58 -04:00
Jatin Chaudhary 7425b0e1a4 SWDEV-470698 - add common .clang-format inside main folder
Remove the redundant copies inside sub folders. These were useful when
the projects were independent, but now that they are merged they
should share a single .clang-format file.

Change-Id: I60510d7b78b129c761e84f13403492bd0c5d941a


[ROCm/clr commit: b5b1f639c0]
2024-07-11 11:39:16 -04:00