Commit graph

13360 commits

Author SHA1 Message Date
kjayapra-amd 712987ed08 SWDEV-509280 - Combine multiple definitions of callbackQueue into a single function.
Change-Id: Ibbb56136bec2beed71c202d75e8aec9e82640a4e


[ROCm/clr commit: 0324014710]
2025-01-30 15:58:11 -05:00
Jatin Chaudhary f8421ce480 SWDEV-508617 - There is no NaN for E4M3 and FNUZ
Change-Id: I330b041019990231c098073f94d9d40a3c13ba76


[ROCm/clr commit: 1fdbf35d14]
2025-01-30 11:48:34 -05:00
Saleel Kudchadker d0656c944b SWDEV-504494 - Resolve signal dependencies
- Resolve signal dependencies for the barrier value packet if there is more
  than one dependent signal. The barrier value packet accounts for only one
  dependent signal.
- Better log

Change-Id: Ia506ad5d80b91d598f92e7b539f41756e9b4b64b


[ROCm/clr commit: 2d450e8b06]
2025-01-29 19:49:02 +00:00
Jatin Chaudhary 992b5fd009 SWDEV-507817 - fix the return type of one of the atomicMin variants
Change-Id: I9915eb174d5677e21adbabae5819c9e306338ab3


[ROCm/clr commit: e6fb89190a]
2025-01-29 11:52:19 -05:00
Jimbo Xie 0a30936c67 SWDEV-510869 - add gfx1153 id
Change-Id: I36d39a1db2392990ad9b01d70676c3c986435707


[ROCm/clr commit: 4abedf2a0e]
2025-01-28 18:15:46 -05:00
Saleel Kudchadker 21ae9ef25e SWDEV-508225 - Improve fat binary handling
Change-Id: I78a9951f2f4c4c743c1205b1e40aac215054e27d


[ROCm/clr commit: 08af3eb484]
2025-01-28 14:38:21 -05:00
German Andryeyev ae379965dd SWDEV-459826 - Add a crash dump for a failed queue
The logic can analyze the AQL queue state and
find a failed AQL packet with the kernel's name

Change-Id: I1a478fa2c25462cd07a194784958bdf22454b897


[ROCm/clr commit: ea0b092af8]
2025-01-28 14:27:46 -05:00
Tao Sang 7803594aea SWDEV-458943 - Add fast path in wait()
wait() is redesigned with two paths:
fast path: Use a spinlock to wait for the notify signal. If the
 signal hasn't been received after some loops, go to the slow path.
slow path: Use condition_variable's wait().

Improve monitor wrapper for better performance.

Fix some bugs left from name removing patch.

Change-Id: I893a8353121a25d11e37c8e631caf31cc1fc1f24


[ROCm/clr commit: f2ff56af9c]
2025-01-28 12:19:55 -05:00
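The two-path wait() described in the SWDEV-458943 entry above can be illustrated with a minimal sketch. This is not the CLR monitor code: the class name `TwoPathSignal`, the `spin_limit` parameter, and the use of a polled atomic flag in place of a spinlock are all assumptions made for illustration.

```cpp
#include <atomic>
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical two-path waiter (illustration only, not the CLR Monitor class).
class TwoPathSignal {
 public:
  // Returns true if the fast (polling) path observed the signal,
  // false if the caller had to fall back to the slow path.
  bool wait(int spin_limit = 1000) {
    // Fast path: poll the flag a bounded number of times.
    for (int i = 0; i < spin_limit; ++i) {
      if (signaled_.load(std::memory_order_acquire)) return true;
      std::this_thread::yield();
    }
    // Slow path: block on the condition variable until notified.
    std::unique_lock<std::mutex> lock(mutex_);
    cv_.wait(lock, [this] { return signaled_.load(std::memory_order_acquire); });
    return false;
  }

  void notify() {
    {
      // Set the flag under the mutex so a slow-path waiter cannot miss it.
      std::lock_guard<std::mutex> lock(mutex_);
      signaled_.store(true, std::memory_order_release);
    }
    cv_.notify_all();
  }

 private:
  std::atomic<bool> signaled_{false};
  std::mutex mutex_;
  std::condition_variable cv_;
};
```

The bounded spin avoids a kernel-level block when the signal arrives quickly, while the condition variable keeps a long wait from burning CPU. A suitable spin bound depends on the workload and would need measuring.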
Saleel Kudchadker c6eef97e3e SWDEV-504494 - Set active engine for SDMA
Change-Id: I4cec84e71903c5813a7063e8b9ff1ea4473f4720


[ROCm/clr commit: d208e8052f]
2025-01-27 17:54:36 -05:00
Gerardo Hernandez 6ea8c53990 SWDEV-510589 - Use libgcc1 package (on Debian 10 only)
Change-Id: Ibe945e366468a84fd717e0e425cfaf7dab5a99c4


[ROCm/clr commit: b073063612]
2025-01-27 11:02:30 -05:00
Marko Arandjelovic 8b956db8db SWDEV-489619 - Added checks for memcpy capture path
Change-Id: I0e156099282f0b6393bcbcee2e9b96c31034a851


[ROCm/clr commit: 269ec54252]
2025-01-27 03:51:34 -05:00
Jacob Lambert d7371306b0 SWDEV-360440 - Prepare CLR CMake for Comgr V3 transition
Change-Id: Ia279928fd3549a45bae561d0d2d8fcf110d8c245


[ROCm/clr commit: 1fc7c6bb9a]
2025-01-27 01:09:23 -05:00
Ioannis Assiouras 8169855390 SWDEV-510319 - Fixed random segfaults in graph tests
This change fixes random segfaults in graph tests that
are seen after the change that made internal callbacks non-blocking.
The callback thread that decreases the GraphExec ref count
may now run after the runtime shutdown. This can cause a segfault
because the hip::device that is accessed in the GraphExec destructor
is already destroyed during runtime shutdown. This patch ensures
that the hip::device object stays alive until after the
callback thread completes.

Change-Id: I75a6ac01f27a0b2250bbd10ed389ebfb322927af


[ROCm/clr commit: 21c223f8df]
2025-01-25 09:54:15 -05:00
Sourabh Betigeri e3c4a81b69 SWDEV-502219 - Adds validity checks for negative parameters passed
Change-Id: Ib8a531533306a27143d74b81c074de81051eb896


[ROCm/clr commit: c460b0541b]
2025-01-24 16:32:29 -05:00
Saleel Kudchadker ae7e2ecb85 SWDEV-510186 - Improve logging of kernel names
- Demangle kernel names in logs

Change-Id: I9aa58e8c109becb45ef7fc747d991bd657c4190a


[ROCm/clr commit: 9b7e0ad48a]
2025-01-24 11:43:02 -05:00
zichguan-amd 4e84f2182e SWDEV-509518 - Allow LLVM_ROOT and Clang_ROOT to be used with find_program
Fixes #123. find_program doesn't follow CMP0074 and thus ignores LLVM_ROOT and Clang_ROOT. This change adds LLVM_ROOT and Clang_ROOT to the search path of find_program for llvm-mc and clang in hiprtc, to mimic the previous find_package behaviour.
Caveat: CMake-specific variables like CMAKE_PREFIX_PATH take precedence over paths specified with HINTS for find_program. There is no way to change the ordering unless we skip CMake-specific variables altogether using NO_CMAKE_PATH and NO_CMAKE_ENVIRONMENT_PATH.

Change-Id: I1fedb60cda09744416e19b3c6e3e0c5c9045f8e7


[ROCm/clr commit: 272ef9a7bf]
2025-01-23 11:50:36 -05:00
taosang2 590465543d SWDEV-507969 - Fix wrong VGPRs for some devices
Change-Id: Ia8fc19564272e2c7171d991376bf896a99085a97


[ROCm/clr commit: 799e54aa0d]
2025-01-22 10:11:47 -05:00
Jaydeep Patel f674ba58f0 SWDEV-508982 - [6.4 Preview] - Handle hipMemPoolCreate, hipMemPoolDestroy & hipDeviceSetMemPool during stream capture.
Change-Id: Ia195442041803896df814798c3d2053c0ba7770c


[ROCm/clr commit: 57df1b348f]
2025-01-22 05:28:47 -05:00
Jatin Chaudhary 1fb66c3e1e SWDEV-491248 - Fix build_mask
thread_rank() gives thread index in a block. Limit the range to the
current warp size.

Change-Id: Ib5c9831236096485cf99ba7ab0b911a3b10de31c


[ROCm/clr commit: bd7d40a4d8]
2025-01-22 04:46:01 -05:00
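The SWDEV-491248 entry above notes that thread_rank() is a block-wide index and must be limited to the warp size before it is used to build a lane mask. The host-side sketch below shows one way such a mask could be computed. The function name `tileMask`, the tile-based shape of the mask, and the parameters are assumptions for illustration, not the HIP header code.

```cpp
#include <cstdint>

// Hypothetical host-side illustration (not the HIP cooperative_groups code).
// Builds a 64-bit lane mask covering the tile that contains the given thread.
// Preconditions: 1 <= tile_size <= 64, tile_size divides warp_size,
// and warp_size <= 64.
std::uint64_t tileMask(unsigned thread_rank, unsigned tile_size, unsigned warp_size) {
  // tile_size consecutive low bits set.
  const std::uint64_t base = ~0ull >> (64 - tile_size);
  // Reduce the block-wide rank to a lane index inside the warp. Without this
  // step the shift below could reach or exceed 64 bits for ranks beyond the
  // first warp, which is undefined behaviour in C++.
  const unsigned lane = thread_rank % warp_size;
  // Shift the tile-sized mask to the start of the tile that owns this lane.
  return base << ((lane / tile_size) * tile_size);
}
```

For example, with a warp size of 64 and tiles of 4 threads, block rank 70 maps to lane 6, which falls in the second tile, so the mask is `0xF0`.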
Jaydeep Patel 1d7b7cde76 SWDEV-457316 - Use phy memory obj stored in user data instead of querying from memObjs.
Change-Id: Id837eb00195d88b50904441f01cf8153fa752ecd


[ROCm/clr commit: b4df9fb6ec]
2025-01-21 22:05:14 -05:00
Pengda Xie 86921cd750 SWDEV-508590 - Fix segfault issue with hipModuleLoad
- Ensuring devProgram pointer isn't nullptr

Change-Id: Ia5786d0a2441f3a512d79b4998eb314beb98b35e


[ROCm/clr commit: f76733e5b8]
2025-01-21 11:38:37 -05:00
Sourabh Betigeri ac32b2e77e SWDEV-507104 - Removes alignment requirement for Semaphore class to resolve runtime misaligned memory issues
Change-Id: I1be3eb6e9fdcf12e995c8fe8ee30592c94f7f97a


[ROCm/clr commit: e4ba0b6262]
2025-01-20 11:27:47 -05:00
Konstantin Zhuravlyov 20b9b5a08c SWDEV-341212 - HIP header changes for supporting SPIR-V
This removes almost all uses of the deprecated
__AMDGCN_WAVEFRONT_SIZE macro, which is unavailable
when targeting SPIR-V, and adds a SPIR-V compatible
formulation of warpSize (which should end up as the
sole definition of warpSize once we remove support
for treating it as a compile time constant). It
is incomplete in that the cooperative_groups
implementation will need additional surgery.

Squashed commit of the following:

commit 6840826c3fec8516857dc4f2092d84358550f588
Author: Alex Voicu <alexandru.voicu@amd.com>
Date:   Fri Dec 6 23:36:32 2024 +0000

    Add deprecation warning for constexpr uses of `warpSize`.

commit a72307a7353034c2de53fd164e016967945fd0d1
Author: Alex Voicu <alexandru.voicu@amd.com>
Date:   Fri Dec 6 23:12:14 2024 +0000

    Prepare HIP RT for SPIR-V.

commit 5e40dd746ac4f8c93b521ef048ff9d494905ba95
Author: Alex Voicu <alexandru.voicu@amd.com>
Date:   Fri Dec 6 22:46:05 2024 +0000

    Revert stale change.

commit 231fe91c53dba4cabd832fc84eaa6ddb402271a0
Merge: a48905ec9 12dc02b4f
Author: Alex Voicu <alexandru.voicu@amd.com>
Date:   Fri Dec 6 22:37:24 2024 +0000

    Merge branch 'amd-staging' of https://github.com/ROCm/clr into amd-staging

commit a48905ec9cfe0e017cc64943195be82b530117d7
Author: Alex Voicu <alexandru.voicu@amd.com>
Date:   Tue Sep 17 03:14:56 2024 +0100

    Add scaffolding for SPIR-V support.

Change-Id: I2e84bbe90df58a5f9a8709b619905f04fa5b96dc


[ROCm/clr commit: dd4378611a]
2025-01-20 08:42:24 -05:00
Jatin Chaudhary 0795f00a14 SWDEV-341217 - Initial work to use SPIRV in HIP
Change-Id: If5c09b5e86b498e7ac5eb05adf28cb7a1fac8101


[ROCm/clr commit: 6a5d19059d]
2025-01-20 03:54:23 -05:00
Jaydeep Patel 5536076dc6 SWDEV-509664 - Specify type explicitly.
Change-Id: Ia0c0478682fa15eae7a31a2360310f08151716d4


[ROCm/clr commit: 1aa8383b09]
2025-01-17 13:09:48 -05:00
Julia Jiang e729df2f23 SWDEV-509295 - Update changelog with newly added HIP APIs for 6.4
Change-Id: I6052e3bf4f17d1fec23e6cc835aa2526ee0bc48c


[ROCm/clr commit: fe5c68d8a3]
2025-01-17 10:13:11 -05:00
Julia Jiang a07131b3d9 SWDEV-509295 - Merging changelog from 6.3.2 into amd-staging for 6.4
Change-Id: I0cd56b44402e82499616e961212abd2b3569c164


[ROCm/clr commit: f497b79111]
2025-01-17 10:12:44 -05:00
Branislav Brzak 05057b2a88 SWDEV-508743 - [6.4 Preview] Add ROCm 7.0 breaking change fields
Change-Id: I07bff42731e74a4c409505cf8981342e22ce26be


[ROCm/clr commit: 3fd46a3783]
2025-01-17 06:25:27 -05:00
Vladana Stojiljkovic 4b786b5207 SWDEV-498061 - Add capture support for hipModuleLaunchCooperativeKernel
Change-Id: I5ed188e046c680c2785b3952391f59ed1d0c21b8


[ROCm/clr commit: 30cb2d0e67]
2025-01-16 10:54:30 -05:00
Marko Arandjelovic cf8eeabfe2 SWDEV-489619 - Fix memcpy tests with capture stream enabled
- Added missing validation, as a graph node should not be created
  if parameters are invalid
- Fix conversion of input params to graphNode params

Change-Id: I37ab04942b5fb2eb07386850cb7dbbf26f9ca967


[ROCm/clr commit: db8527f655]
2025-01-16 10:31:04 -05:00
Marko Arandjelovic 8647bb483b SWDEV-504084 - Make hipModuleGetFunction use the device the module is loaded on
If a module is loaded on one device, hipModuleGetFunction and other similar APIs should be able to run successfully from another device.

Change-Id: I96084cbd6c6dcf2a81019779a6ab1842ef2f35d1


[ROCm/clr commit: c46f843b99]
2025-01-16 10:16:42 -05:00
Ioannis Assiouras 67c93c3bad SWDEV-505503 - Use internal device synchronize function in __hipUnregisterFatBinary
This is to avoid calling the HIP_INIT macro during the shutdown process.

Change-Id: I2e65f6e10491918a17445ee1e8ddd08286070358


[ROCm/clr commit: 5e3a29078d]
2025-01-15 18:57:34 -05:00
Sourabh Betigeri c95bb38b2b SWDEV-507960 - Return an error code on an attempt to destroy a stream of type hipStreamLegacy
Change-Id: Iee7ada6a5a905b44360a7e4049fc8b1a45c80db0


[ROCm/clr commit: 9d8d35ae40]
2025-01-13 18:17:37 -05:00
Ioannis Assiouras 161d590405 SWDEV-503760 - Only consider allocations that are less than X% larger in a mempool request
Change-Id: I94acbca606fd4c575e2e1a9e34959ce650571867


[ROCm/clr commit: 44b6b6813d]
2025-01-13 16:57:26 -05:00
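The SWDEV-503760 entry above limits which free allocations a mempool request may reuse. A minimal sketch of that kind of filter follows. The helper name `pickFreeBlock`, its parameters, the best-fit selection, and the inclusive treatment of the boundary are assumptions for illustration. The commit does not state the value of X or how the pool is searched.

```cpp
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical helper (not the CLR mempool code): returns the index of the
// smallest free block that can hold `request` bytes without exceeding it by
// more than `max_slack_percent` percent, or std::nullopt if none qualifies.
std::optional<std::size_t> pickFreeBlock(const std::vector<std::size_t>& free_sizes,
                                         std::size_t request,
                                         std::size_t max_slack_percent) {
  std::optional<std::size_t> best;
  for (std::size_t i = 0; i < free_sizes.size(); ++i) {
    const std::size_t size = free_sizes[i];
    if (size < request) continue;  // block is too small
    // Reject blocks that would waste more than the allowed slack.
    // Overflow for extremely large sizes is ignored in this sketch.
    if ((size - request) * 100 > request * max_slack_percent) continue;
    // Keep the tightest fit seen so far.
    if (!best || size < free_sizes[*best]) best = i;
  }
  return best;
}
```

With a 20% limit, a 1000-byte request can reuse a free 1100-byte block but not a 4096-byte one, so a small request does not tie up a much larger allocation.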
Rahul Manocha 0a9315f5e0 SWDEV-497288 - Add validation checks to hipGraphExecNodeSetParams
Change-Id: If8cde47bc8e62414333768e01064298d8a3d80ee


[ROCm/clr commit: 2b32d9aada]
2025-01-13 16:49:45 -05:00
Rahul Manocha 17ad1834d4 SWDEV-492165 - Add support for 16 bit atomicAdd and atomicCAS
Change-Id: I7f1e4876fe2960dff1775d27cb6443a89e146c86


[ROCm/clr commit: 93cff75928]
2025-01-13 13:35:40 -05:00
Jacob Lambert aa918756e8 SWDEV-477039 - Fix spelling typos
Change-Id: Ia46e07ec0a001b7e19bec999597028acd4ee7077


[ROCm/clr commit: 805462a77b]
2025-01-13 12:17:46 -05:00
Aidan Belton-Schure dc2fa93f37 SWDEV-482851 - Do not release last suballocator chunk
Change-Id: Ib28dc9df68e454ee0c0c699c1ff17588fd55f802


[ROCm/clr commit: 451b0ce768]
2025-01-13 10:14:40 -05:00
Daniel Livingston 50387f6eb5 SWDEV-489003 - [Ubertrace] OCL/HIP profiles are missing event instrumentation
Adds UberTrace support for pre-dispatch markers and barrier begin/end markers.

Moves shared definitions out of palgpuopen.hpp into shared header
palcapturemgr.hpp.

Change-Id: I9f464c689e7ff12c54eca043fc1ad65e1836a64f


[ROCm/clr commit: 541c449ce2]
2025-01-10 11:28:52 -05:00
Julia Jiang 60f9ab6fcd SWDEV-507699 - Update CLR license date
Change-Id: I51b641c58b1e9b8c84637af2d22f905bcdab8f56


[ROCm/clr commit: c6e25b2be7]
2025-01-10 11:17:19 -05:00
Julia Jiang 8c070f11f9 SWDEV-497634 - Update change log for hipMalloc allocation fix on Windows
Change-Id: If4351cb4f75141661538e1d26c96e600df3d0b39


[ROCm/clr commit: bdf48bdbf9]
2025-01-10 10:12:39 -05:00
Anusha GodavarthySurya 08c92f4793 SWDEV-480209 - Make internal callbacks non-blocking
Change-Id: Ic918d08f341abfd9a7c167d09f9c723cdc43157f


[ROCm/clr commit: 683a942364]
2025-01-10 02:16:11 -05:00
Saleel Kudchadker 16f14e4b00 SWDEV-504494 - Use system scope for D2H
- When using shader copy, make sure to use release scope for the AQL
  packet. This is a potential bug, but it is hidden because hipMemcpyAsync
  always needs synchronization (which inserts a barrier with release scope).
  For hipMemcpy we use a barrier packet to make sure it is blocking. Either
  way, a barrier is always used, which hides the potential bug.

Change-Id: I57fb7f769c3179e76d712471c0905104c801d7ba


[ROCm/clr commit: c9dd95bf6c]
2025-01-10 00:34:08 -05:00
Saleel Kudchadker 6a416008aa SWDEV-508004 - Improve hipEventRecord
- Resolve stream once for event record. We should avoid calling
  getStream again in addMarker

Change-Id: I78448c4f151ae10a5c8e8c248b2f4078b84191cb


[ROCm/clr commit: a22c45d635]
2025-01-09 16:47:46 -05:00
German Andryeyev 5a76761960 SWDEV-507019 - Change the function lock to the module lock
Multiple functions can be located in the same module.

Change-Id: Ia4ca3db64fe5b0822584059d3770c91103665c63


[ROCm/clr commit: 45a12208b6]
2025-01-09 12:35:29 -05:00
Evgenii Averin ef2a812d0a SWDEV-505769 - Fix typo
Change-Id: I2d3f65ed68157718c4439a9da7d2dcdfcbb9f93d


[ROCm/clr commit: b62995ce1a]
2025-01-08 21:31:17 -05:00
Sourabh Betigeri 36f3d7647c SWDEV-505971 - Fix size mismatch of count type to uint32_t
Change-Id: Ie526f828f816e6681ef1735d5edb2db895dace57


[ROCm/clr commit: f5b2516f5d]
2025-01-08 12:47:36 -05:00
Saleel Kudchadker d4594531ef SWDEV-506251 - Disable blit copy threshold for OpenCL
Change-Id: Id0ca43b13d5792791a42da263f6aa4496382cea6


[ROCm/clr commit: 39801b5750]
2025-01-08 02:46:01 +00:00
Rahul Manocha b1ef5972d6 SWDEV-504215 - Fix rocALUTION perf drop by disabling CPU wait
Change-Id: I878f3420073b05cc6241f524ac428e47c0ce823d


[ROCm/clr commit: 05baf9ff22]
2025-01-07 17:02:24 -05:00
Saleel Kudchadker a18f2c549c SWDEV-504494 - Flush to systemscope when copying non-coherent mem
- When we use blit (compute) copies, two subsequent copies may read from
  the same source buffer. The buffer may be modified by the host in
  between, and if the source buffer was allocated with the non-coherent
  flag, the device may simply use a stale value from a previous cacheline
  fetch. This is a corner case.

Change-Id: I2ce261c6f6fa4e5bb608f116548e5cc711ae6f3c


[ROCm/clr commit: b63005d550]
2025-01-07 12:49:22 -05:00