Commit Graph

12482 commits

Author SHA1 Message Date
Tao Sang 5fe3dc5bf9 SWDEV-487356 - Fix AMD LOG issue in Win32
Change-Id: Ia1c19cf4ea24188cdb2d374b01f975f794e02dbf


[ROCm/clr commit: 802cacf3e9]
2024-11-01 08:26:25 -04:00
Satyanvesh Dittakavi b606e3dca7 SWDEV-489570 - Update AQL packet in hipDrvGraphExecMemsetNodeSetParams
After setting the new params in hipDrvGraphExecMemsetNodeSetParams, we
need to update the AQL packet as well, otherwise during the graph launch
it still dispatches the packet which has the original params and not the
updated one.

Change-Id: Ie49a641ba3f66c8085a29f92d88ac6ea6a1c0534


[ROCm/clr commit: ba2ebb3b99]
2024-11-01 07:01:10 -04:00
Jaydeep Patel 0f77eeaace SWDEV-491149 - OCL does not need to update scratch because, unlike HIP with hipDeviceSetLimit, it cannot change the stack size through an API.
For HIP, the update should happen only if the compiler notifies use of stack size.

Change-Id: Ic781bcac6fcf586da39ec4aafd4809da3652ede3


[ROCm/clr commit: 4aa52155ee]
2024-11-01 01:05:07 -04:00
Vladana Stojiljkovic 30571de816 SWDEV-492768 - Match hipStreamAddCallback capture behavior with nvidia
Change-Id: I7a084d8eeffe8b5095f7eb9969a565a40e76bb4b


[ROCm/clr commit: f6c8bbf4dc]
2024-10-31 12:42:17 -04:00
Vladana Stojiljkovic 116effa83c SWDEV-491452 - Allow hipMemAdvise capturing only in relaxed mode
Change-Id: I1ca5e050ff869b486e3a0a41d7f06390a88e1110


[ROCm/clr commit: 02bbe11e56]
2024-10-31 12:41:47 -04:00
Vladana Stojiljkovic 7ff9aa117d SWDEV-493526 - Create kernel node when hipLaunchByPtr is captured
Change-Id: Id3493485dfdb468436ab33e6d7cb19b6b0066fd4


[ROCm/clr commit: e08df57502]
2024-10-31 12:41:31 -04:00
Vladana Stojiljkovic d655fa8c66 SWDEV-489571 - Fix ihipGraphAddMemsetNode to allow memset of 3d portions of an array
* When hipMemset3dAsync is captured, a 3d extent can be set as a parameter (depth > 1). That worked on nvidia, but on amd the wrong portion of the array was filled because, when creating the Memset3D command, the extent dimensions were used to create pitchedPtr instead of the original array width and height.
* Also, when capturing hipMemset3dAsync, nvidia allows any of the extent dimensions to be 0, and in that case, no work should be done.

Change-Id: I46a605bf9ae801cd3348e98d528c21263a8eefce


[ROCm/clr commit: ec60bb1aed]
2024-10-31 10:29:54 -04:00
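The pitch problem described in d655fa8c66 can be illustrated on the host with plain C++. This is a minimal sketch and not the runtime's code: `fillSubVolume` and all of its parameters are hypothetical names. It assumes a tightly packed byte array and shows why the row and slice strides must come from the full array's width and height rather than from the extent being filled, and that a zero extent dimension results in no work.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical host-side illustration; not the ROCm/clr implementation.
// Fills a (w x h x d) sub-volume at the origin of a larger byte array
// whose full dimensions are (arrayW x arrayH x arrayD).
inline void fillSubVolume(std::vector<std::uint8_t>& array,
                          std::size_t arrayW, std::size_t arrayH, std::size_t arrayD,
                          std::size_t w, std::size_t h, std::size_t d,
                          std::uint8_t value) {
  assert(array.size() == arrayW * arrayH * arrayD);
  assert(w <= arrayW && h <= arrayH && d <= arrayD);
  // Any zero extent dimension means there is nothing to do.
  if (w == 0 || h == 0 || d == 0) return;
  // Strides are taken from the array, not from the extent.
  const std::size_t rowPitch = arrayW;
  const std::size_t slicePitch = arrayW * arrayH;
  for (std::size_t z = 0; z < d; ++z) {
    for (std::size_t y = 0; y < h; ++y) {
      for (std::size_t x = 0; x < w; ++x) {
        array[z * slicePitch + y * rowPitch + x] = value;
      }
    }
  }
}
```

If `rowPitch` were set to `w` and `slicePitch` to `w * h`, the writes would land in a contiguous run at the start of the array instead of the intended corner block, which matches the symptom the commit message describes.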
Alex Xie e751f00ca2 SWDEV-489468 - make resource cache bigger for APU
Change-Id: I065c712acd06c273a0b194fe792ec4f876fa9c46


[ROCm/clr commit: f8c56f6bac]
2024-10-31 09:55:01 -04:00
Tao Sang 2cce18fb38 SWDEV-492563 - Fix Ocl issues
1. Fix LDSSize type to be uint32_t.
2. Prevent clWaitForEvents running on complete events whose
   HostQueue have been destructed.

Change-Id: I829e915f56b37db2ba76bb876c9656166534f154


[ROCm/clr commit: 82dff9a67d]
2024-10-30 19:15:59 -04:00
Saleel Kudchadker a2b25be61c SWDEV-491375 - Improve MemObjMap perf
- Create bins, each with its own map and lock. This helps cases
where the hashes of two VAs fall into different bins, so there is
no lock contention between them
- Use STL shared mutexes, that way we can unique_lock for map updates
vs simple reads which can use shared_lock

Change-Id: I118818be65c6373700f5e511045babb6a398938a


[ROCm/clr commit: e23ff0520b]
2024-10-30 05:37:33 +00:00
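The binning and shared-mutex idea from a2b25be61c can be sketched in standard C++17. This is only an illustration under assumptions: `BinnedMap`, its methods, the bin count, and the use of `std::hash` on the address are hypothetical choices and are not taken from the MemObjMap source. Each bin owns its own map and `std::shared_mutex`; updates take a `std::unique_lock`, while lookups take a `std::shared_lock`.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <mutex>
#include <optional>
#include <shared_mutex>
#include <unordered_map>
#include <utility>

// Hypothetical sketch; class and member names are not from ROCm/clr.
template <typename Value, std::size_t NumBins = 16>
class BinnedMap {
  static_assert(NumBins > 0, "at least one bin is required");

 public:
  void insert(std::uintptr_t key, Value value) {
    Bin& bin = binFor(key);
    std::unique_lock<std::shared_mutex> lock(bin.mutex);  // exclusive: map update
    bin.map[key] = std::move(value);
  }

  bool erase(std::uintptr_t key) {
    Bin& bin = binFor(key);
    std::unique_lock<std::shared_mutex> lock(bin.mutex);  // exclusive: map update
    return bin.map.erase(key) > 0;
  }

  std::optional<Value> find(std::uintptr_t key) const {
    const Bin& bin = binFor(key);
    std::shared_lock<std::shared_mutex> lock(bin.mutex);  // shared: read only
    auto it = bin.map.find(key);
    if (it == bin.map.end()) return std::nullopt;
    return it->second;
  }

 private:
  struct Bin {
    mutable std::shared_mutex mutex;
    std::unordered_map<std::uintptr_t, Value> map;
  };

  static std::size_t indexOf(std::uintptr_t key) {
    return std::hash<std::uintptr_t>{}(key) % NumBins;
  }
  Bin& binFor(std::uintptr_t key) { return bins_[indexOf(key)]; }
  const Bin& binFor(std::uintptr_t key) const { return bins_[indexOf(key)]; }

  std::array<Bin, NumBins> bins_;
};
```

Two threads touching keys that hash to different bins never wait on each other, and concurrent readers of the same bin do not block one another either.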
German Andryeyev 3191f8e942 SWDEV-486602 - Add tracking of HSA handlers
Add an atomic counter to track the outstanding HSA handlers.
Wait on CPU for the callbacks if the number exceeds the value
in DEBUG_HIP_BLOCK_SYNC env variable.

Change-Id: I95dc8c4bf0258c7e59411b7504220709ed6898c5


[ROCm/clr commit: 403f624bf8]
2024-10-25 15:20:50 -04:00
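The tracking described in 3191f8e942 can be pictured as an atomic counter with a limit. The following is a hedged sketch in standard C++: `HandlerTracker` and its methods are hypothetical names, the limit is passed in directly rather than read from DEBUG_HIP_BLOCK_SYNC, and the wait is a simple yield loop chosen for brevity.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <thread>

// Hypothetical sketch; names are not from ROCm/clr.
class HandlerTracker {
 public:
  explicit HandlerTracker(std::uint32_t limit) : limit_(limit) {}

  // Called when a handler is registered. The caller waits on the CPU
  // while the number of outstanding handlers exceeds the limit.
  void onSubmit() {
    outstanding_.fetch_add(1, std::memory_order_acq_rel);
    while (outstanding_.load(std::memory_order_acquire) > limit_) {
      std::this_thread::yield();
    }
  }

  // Called from the handler once its callback has finished.
  void onComplete() {
    const std::uint32_t previous =
        outstanding_.fetch_sub(1, std::memory_order_acq_rel);
    assert(previous > 0);
    (void)previous;  // silence unused warning when asserts are disabled
  }

  std::uint32_t outstanding() const {
    return outstanding_.load(std::memory_order_acquire);
  }

 private:
  const std::uint32_t limit_;
  std::atomic<std::uint32_t> outstanding_{0};
};
```

In this sketch `onSubmit` only blocks if another thread is expected to call `onComplete`; a production version would more likely wait on a signal or condition variable than spin.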
Sameer Sahasrabuddhe ccb09057d8 SWDEV-490198: _sync() will be enabled by default in 6.4
Change-Id: Id029424a9c0f6b144a7aa0e96fe8acc4a138ec51


[ROCm/clr commit: 556390f9c7]
2024-10-25 09:54:40 -04:00
Julia Jiang ec7ae95977 SWDEV-488396,489257 - Fixed the regression in CTS pipes sub-test failure
Change-Id: Id4004f0d6da5754b12c9a21038de50472cb1fee5


[ROCm/clr commit: 9f2f6a8aa7]
2024-10-25 05:58:46 -04:00
German Andryeyev 584c9c1ee1 SWDEV-440746 - Fix a typo with GPU_PINNED_XFER_SIZE
Change-Id: I8fdbfb4e1c6b1274206c28a529eee9ebeaaa26fb


[ROCm/clr commit: dceb320ba7]
2024-10-24 18:33:14 -04:00
Sourabh Betigeri 78d068f153 SWDEV-450052 - Return if numDevices is more than device count on the platform
Change-Id: I538106d1b02084df9cd06b41427629207312e76f


[ROCm/clr commit: 64e1b15551]
2024-10-24 17:07:11 -04:00
Julia Jiang 5e2b090e68 SWDEV-479940 - Updating the format of changelog
Change-Id: I8aedb47b0de3ed656993bbcf9d7bc0fe3720f391


[ROCm/clr commit: 6f30ae102c]
2024-10-23 11:32:40 -04:00
Anusha GodavarthySurya 1a472b9703 SWDEV-480209 - Handle GraphExec object release
=> The GraphExec instance was destroyed before the async launch completed;
destroy it only after all pending graph launches
=> Remove GraphExec destroy during the next sync point (hipStreamSync,
hipDeviceSync, etc.)

Change-Id: I4df682aae5787fd6e5240a7be936ce50361345d0


[ROCm/clr commit: f9f995c6d0]
2024-10-22 12:30:46 -04:00
David c4afce2abc Changes needed for hipcc/hipconfig rename and cleanup
- HIPCC, on Linux, will be removing high-level perl scripts (hipcc/hipconfig) in ROCm 6.3
  - removes renaming hipcc.bin/hipconfig.bin logic

SWDEV-467478 - HIPCC Clean up Perl

Change-Id: I829e915d56b37cb2ba76bb876c6656166534f15c


[ROCm/clr commit: 05d6f75830]
2024-10-22 04:46:33 -04:00
Anusha GodavarthySurya 2dccb30f6f SWDEV-485904 - propagate hsa_amd_vmem_address_free error to hip API
Unit_hipMemSetAccess_GrowVMM test fails with
HSA_STATUS_ERROR_RESOURCE_FREE silently

Change-Id: I7a78410e432de4a2e877062782abf8761645f392


[ROCm/clr commit: b498103f9b]
2024-10-21 10:12:32 -04:00
Jaydeep Patel d65afe707c SWDEV-482751 - Use ocl-icd-devel package for SLES.
Change-Id: I30e6243d697dc984a42051c20e336551d50d8e94


[ROCm/clr commit: 1f55a707b4]
2024-10-20 23:55:02 -04:00
German Andryeyev 4a2687a450 SWDEV-486602 - Fix Windows 32 bit build
Windows aligns fields to 8 bytes even with 32bit builds.
Add BUG_CLR_SYSMEM_POOL to control sysmempool.

Change-Id: I8622aabc9f7391ed7dd8583b252ce9eb41d62293


[ROCm/clr commit: 6bb7d1afdc]
2024-10-18 11:35:54 -04:00
Vladana Stojiljkovic 830c72e286 SWDEV-490474 - Allow hipMallocManaged capturing only in relaxed mode
Change-Id: I02dccc6c45e39082ef925509a28bbe3c2a0fb7c6


[ROCm/clr commit: 6deecf1bfe]
2024-10-18 04:52:01 -04:00
Saleel Kudchadker 5c03a593e7 SWDEV-491375 - Optimize multithreaded dispatches
- Fix typo

Change-Id: If4c68455dcfa03fee18cb4720e8b5b438642703c


[ROCm/clr commit: 0f2342bc13]
2024-10-17 17:02:23 -04:00
German Andryeyev 640ba99b75 SWDEV-486602 - Change SysmemPool implementation
- Remove the list of all chunks and use embedded chunk
information in each allocation. That simplifies Free() logic,
avoiding expensive loop if for some reason the number of
outstanding allocations significantly grew.

Change-Id: I9ea84d314320ce356ed24dd3180f262e2116c59b


[ROCm/clr commit: ad18146d8f]
2024-10-17 12:39:39 -04:00
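The idea in 640ba99b75 of embedding chunk information in each allocation can be sketched as follows. This is an assumption-laden illustration, not the SysmemPool code: `Chunk`, `AllocHeader`, `poolAlloc`, and `poolFree` are hypothetical names, and the chunk is a simple bump allocator. Each allocation is preceded by a small header that points back to its owning chunk, so freeing needs no search through a list of chunks.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <new>

// Hypothetical sketch; not the ROCm/clr SysmemPool.
struct Chunk {
  explicit Chunk(std::size_t bytes)
      : storage(new std::byte[bytes]), capacity(bytes) {}
  std::unique_ptr<std::byte[]> storage;
  std::size_t capacity;
  std::size_t offset = 0;  // bump pointer into storage
  std::size_t live = 0;    // outstanding allocations in this chunk
};

// Header placed directly in front of every allocation.
struct alignas(std::max_align_t) AllocHeader {
  Chunk* owner;
};

inline void* poolAlloc(Chunk& chunk, std::size_t bytes) {
  constexpr std::size_t align = alignof(std::max_align_t);
  const std::size_t payload = (bytes + align - 1) / align * align;
  const std::size_t total = sizeof(AllocHeader) + payload;
  if (chunk.capacity - chunk.offset < total) return nullptr;  // chunk is full
  std::byte* base = chunk.storage.get() + chunk.offset;
  new (base) AllocHeader{&chunk};  // embed the owner in the allocation
  chunk.offset += total;
  ++chunk.live;
  return base + sizeof(AllocHeader);
}

// Constant-time free: the owning chunk is read from the header.
inline Chunk* poolFree(void* ptr) {
  auto* header = reinterpret_cast<AllocHeader*>(
      static_cast<std::byte*>(ptr) - sizeof(AllocHeader));
  Chunk* chunk = header->owner;
  assert(chunk->live > 0);
  if (--chunk->live == 0) chunk->offset = 0;  // whole chunk is reusable again
  return chunk;
}
```

The cost of `poolFree` here does not depend on how many allocations or chunks are outstanding, which is the property the commit message highlights.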
Rahul Manocha b8a5d8396d SWDEV-468039,SWDEV-482579 - Enable FP8 SW Conversions on pre gfx940 archs
1) SW Conversions for ocp and fnuz are enabled on pre mi300 archs
2) for mi300 only fnuz is enabled
3) for gfx1200 only ocp is enabled

Change-Id: I90373752a2d15eff20d5deec874ed396ba4e1788


[ROCm/clr commit: e729f08704]
2024-10-17 11:49:22 -04:00
German Andryeyev 0a03665a3f SWDEV-491375 - Limit the SW batch size
Applications may submit commands without waits
for GPU. That causes a growth of SW unreleased commands.
Make sure runtime flushes SW queue, if it grows over some
threshold, controlled by DEBUG_CLR_MAX_BATCH_SIZE.

Change-Id: Ia4d85c24210ef91c394f638ab6b53b14323a0396


[ROCm/clr commit: 8657a77029]
2024-10-17 10:53:57 -04:00
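The batch limit from 0a03665a3f can be modeled with a small queue that flushes itself once it grows past a threshold. This is a sketch under assumptions: `BatchedQueue` and its members are hypothetical names, commands are plain callables, and the threshold is a constructor argument rather than the DEBUG_CLR_MAX_BATCH_SIZE setting.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical sketch; names are not from ROCm/clr.
class BatchedQueue {
 public:
  explicit BatchedQueue(std::size_t maxBatch) : maxBatch_(maxBatch) {
    assert(maxBatch > 0);
  }

  // Queues a command and flushes if the batch grew over the threshold.
  void submit(std::function<void()> command) {
    pending_.push_back(std::move(command));
    if (pending_.size() > maxBatch_) flush();
  }

  // Runs and releases every queued command.
  void flush() {
    if (pending_.empty()) return;
    for (auto& command : pending_) command();
    pending_.clear();
    ++flushCount_;
  }

  std::size_t pending() const { return pending_.size(); }
  std::size_t flushCount() const { return flushCount_; }

 private:
  std::size_t maxBatch_;
  std::size_t flushCount_ = 0;
  std::vector<std::function<void()>> pending_;
};
```

Without the size check in `submit`, an application that never waits would let `pending_` grow without bound, which is the growth of unreleased commands the commit message refers to.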
Alex Xie 2a6792ec25 SWDEV-482751 - Depends on distro opencl icd loader
Since we don't distribute icd loader, we need to install distro icd loader.

Change-Id: I1ea86bcf7c642a034c53f71130b15de1fa27e31e


[ROCm/clr commit: df9ae754a4]
2024-10-16 16:21:58 -04:00
Ajay b747d0986f SWDEV-482751 - add distro path to find package AMD_ICD
Change-Id: I0d21f6ba6ade3ed932b134da503f639fd5d0d552


[ROCm/clr commit: ff306ce9d8]
2024-10-14 15:27:34 -07:00
German Andryeyev faea40cbb3 SWDEV-486602 - Optimize HSA callback performance
- Don't generate callbacks for HIP events
- Don't process profiling info in the callback for HIP events
- Wait for CPU status update of the submitted commands
every 50 calls. That will allow to drain the commands and
destroy HSA signals.

Change-Id: Ib601a350e7e7c2b6c6209a172385389baccf73a9


[ROCm/clr commit: 364dfb0ed1]
2024-10-11 14:50:25 -04:00
Ioannis Assiouras 043271a3e6 SWDEV-490323 - Fix validateMemAccess in hipMemset
Changed the validation to occur on the sub-object rather than the parent.

Change-Id: I87bf5ef3526d0db9304099ef9ac1a5494e9a01a9


[ROCm/clr commit: 5da72f9d52]
2024-10-10 18:08:28 -04:00
Todd tiantuo Li 170e45b879 SWDEV-472357 - support Rect copy with staging buffer for 2D & 3D memcpy in PAL
Change-Id: Ie32f3e5a6fa077f6b2db20fc1ab1e2e0da8344cb


[ROCm/clr commit: 41dc4545fc]
2024-10-10 18:00:19 -04:00
kjayapra-amd 55945b16c0 SWDEV-486510 - Delete hip::Function object, in case compiler passes duplicate hostFunction ptr.
Change-Id: Ic8714eb9022a0f2150b2ea5dc008cecd7a9fae27


[ROCm/clr commit: e7c0e06b5e]
2024-10-10 12:45:58 -04:00
Vladana Stojiljkovic af3e9cb9e2 SWDEV-489823 - Fix hipStreamEndCapture leak when capture is invalidated
Change-Id: If8f5163d70e04d34a75fd0a7ba6c0a15ea59bb8b


[ROCm/clr commit: 6f2bad3998]
2024-10-10 04:38:06 -04:00
Jaydeep Patel 7983801b0c SWDEV-485866 - Return OOM if stream creation fails due to insufficient memory.
Change-Id: I4e57ecc81921bde274bb6a4e0890f0fc6a17955a


[ROCm/clr commit: 5ccc140e1b]
2024-10-10 00:44:54 -04:00
Jatin Chaudhary da59165313 SWDEV-486137 - match behavior of int variants of hadd/uhadd/rhadd/urhadd
Match cases and handle cases where it can overflow.

Change-Id: I3d6f802686af230a622ef9891a844135ad3d1ae5


[ROCm/clr commit: b977101893]
2024-10-09 13:47:33 -04:00
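The overflow concern in da59165313 comes from computing an average as `(a + b) >> 1`, where the intermediate sum may not fit in 32 bits. One well-known way to avoid this uses bitwise identities. The helpers below are a host-side sketch with hypothetical names; they are not the HIP device implementations, and the signed variants assume that right-shifting a negative value is an arithmetic shift (guaranteed from C++20, and the usual behavior on earlier compilers).

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical host-side helpers; not the HIP device implementations.
// Identity used: a + b == 2 * (a & b) + (a ^ b) == 2 * (a | b) - (a ^ b).

// Average rounded down, unsigned.
constexpr std::uint32_t uhaddSketch(std::uint32_t a, std::uint32_t b) {
  return (a & b) + ((a ^ b) >> 1);
}

// Average rounded up, unsigned.
constexpr std::uint32_t urhaddSketch(std::uint32_t a, std::uint32_t b) {
  return (a | b) - ((a ^ b) >> 1);
}

// Average rounded toward negative infinity, signed (arithmetic shift assumed).
constexpr std::int32_t haddSketch(std::int32_t a, std::int32_t b) {
  return (a & b) + ((a ^ b) >> 1);
}

// Average rounded toward positive infinity, signed (arithmetic shift assumed).
constexpr std::int32_t rhaddSketch(std::int32_t a, std::int32_t b) {
  return (a | b) - ((a ^ b) >> 1);
}

// Usage example: inputs whose plain sum would overflow 32 bits.
inline void checkHalvingAdds() {
  constexpr auto umax = std::numeric_limits<std::uint32_t>::max();
  constexpr auto imax = std::numeric_limits<std::int32_t>::max();
  assert(uhaddSketch(umax, umax) == umax);
  assert(urhaddSketch(umax, umax - 1) == umax);
  assert(haddSketch(imax, imax) == imax);
  assert(rhaddSketch(imax, imax - 1) == imax);
}
```

Neither form ever materializes `a + b`, so the result stays in range for every pair of inputs.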
kjayapra-amd 3a2d835272 SWDEV-486573 - Check the return type of commit memory.
Change-Id: Id158cd7a0dff37b382b858cf7113aa4cf326300a


[ROCm/clr commit: 74ebbe17e9]
2024-10-09 05:10:03 -04:00
Julia Jiang 5d7b767788 SWDEV-479940 - Correct changelog in staging for 6.2.1
Change-Id: I3f35a85b9834841d27fa35abc52b9838d6f1c9e7


[ROCm/clr commit: d6bcabdc2c]
2024-10-08 17:04:43 -04:00
Ioannis Assiouras d452c4ad28 SWDEV-483134 - Deprecate hipHostMalloc and hipHostFree APIs
Change-Id: I230ab2de2e4bdfdd9bfb0a3e59c6130a25b8b0cd


[ROCm/clr commit: 80043d38f4]
2024-10-08 15:58:25 -04:00
Satyanvesh Dittakavi 815b9ffc36 SWDEV-489280 - Add missing hipGraphNodeSetParams API in dispatch table
Change-Id: I41dfd045fa4e29b49e605b8d583ec9f51dd6a6cc


[ROCm/clr commit: 15ecf834a1]
2024-10-08 13:56:02 -04:00
Jaydeep Patel 566984676e SWDEV-487988 - Reserve event flag in hip::Event.
Don't create a new hip::Function if it is already registered.

Change-Id: I3ecd5d61146659be6ba434717b0f21d3fc04cfc9


[ROCm/clr commit: a6c5c6a95a]
2024-10-08 05:29:32 -04:00
Jaydeep Patel b31bf885a3 SWDEV-482692, SWDEV-485802, SWDEV-485489 - Handle refcounts owned by graph for user objects.
Change-Id: Ic739ab1ec5d3dc3143e3ae70f9591922bc0e3d9f


[ROCm/clr commit: e74ac6f580]
2024-10-08 03:44:44 -04:00
Jaydeep Patel 3130b4639f SWDEV-487905 - device_ptr_ is being removed and its amd::Memory obj is being deleted during ihipFree in hip::StatCO::removeFatBinary.
Change-Id: I89d9fdeb53dc4ce0699f1f445a28486917a36e72


[ROCm/clr commit: 164cbcc531]
2024-10-08 03:38:15 -04:00
Branislav Brzak 491d3828dd SWDEV-482130 - Fix release of virtual mem obj
Change-Id: I893a8353aa1a25d00e36c8e601caf31cc0fc1f22


[ROCm/clr commit: 43fcac1739]
2024-10-08 01:37:39 -04:00
Satyanvesh Dittakavi 57c5264937 SWDEV-483241 - Add a compile option to avoid including default hiprtc header
Change-Id: Ic23b41395588e6183abac36cb7543da02b0aba29


[ROCm/clr commit: 522ae8ead4]
2024-10-07 07:56:29 -04:00
Saleel Kudchadker b9497ea70e SWDEV-301667 - Enable ROCr logging
- Use AMD_LOG_LEVEL=5 to dump AQL packets in ROCr

Change-Id: I2c044a5304c4eaf3d3af20e62d1f54c98d4fbaa4


[ROCm/clr commit: e36666e536]
2024-10-04 19:22:12 -04:00
Saleel Kudchadker 375ed9d848 SWDEV-478065 - Revert "SWDEV-478065 - Embed host thread in shared_ptr"
This reverts commit 274fd2628f.

Reason for revert: This blocks multithreaded callbacks

Change-Id: I9944417e4fb63c9eea2b286c828c7dfa621c4fe8


[ROCm/clr commit: d3d0ca5fc6]
2024-10-04 19:19:28 -04:00
Branislav Brzak 169423798f SWDEV-476542 - Unable to link to hipGraphExecGetFlags
Change-Id: I572baaeee31c6a73e533f9ef956bf111e9d2e688


[ROCm/clr commit: d29ebea7ac]
2024-10-04 13:39:06 -04:00
Saleel Kudchadker 5296c77138 SWDEV-301667 - Logging upgrades
- Use AMD_LOG_LEVEL_SIZE in MBs to set log file size truncation; the default is 2048 MB

Change-Id: Ia2f87e8c6b94148e30edfb602b279f93630817c3


[ROCm/clr commit: 35e03ea0d0]
2024-10-04 13:26:25 -04:00
Jaydeep Patel 91b343c758 SWDEV-471422 - Free memory being double deducted on APUs because the system_total_alloced var holds local memory.
Change-Id: I3fbbc8f8aaa156881ff95cad6a4f82fd3df651d1


[ROCm/clr commit: 292842ad28]
2024-10-04 04:49:20 -04:00
pghafari 38ad03660a SWDEV-467263 - Allow hipMalloc to use sys memory
PAL supports allocating from system memory once device memory is used up
or allocation is larger than the device memory.

Change-Id: Iccd3377e95a6cc6d23e45d4738a17af8b9ee32d7


[ROCm/clr commit: b07178618c]
2024-10-03 11:14:08 -04:00