Rahul Manocha
e729f08704
SWDEV-468039,SWDEV-482579 - Enable FP8 SW Conversions on pre gfx940 archs
...
1) SW conversions for OCP and FNUZ are enabled on pre-MI300 archs.
2) For MI300, only FNUZ is enabled.
3) For gfx1200, only OCP is enabled.
Change-Id: I90373752a2d15eff20d5deec874ed396ba4e1788
2024-10-17 11:49:22 -04:00
German Andryeyev
8657a77029
SWDEV-491375 - Limit the SW batch size
...
Applications may submit commands without waiting
for the GPU, which makes the number of unreleased SW commands grow.
Make sure the runtime flushes the SW queue if it grows beyond a
threshold controlled by DEBUG_CLR_MAX_BATCH_SIZE.
Change-Id: Ia4d85c24210ef91c394f638ab6b53b14323a0396
2024-10-17 10:53:57 -04:00
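A minimal sketch of the threshold-triggered flush described in the commit above. The BatchQueue class and its members are hypothetical stand-ins, not the runtime's own types; only the idea of a batch-size limit (set in the runtime through DEBUG_CLR_MAX_BATCH_SIZE) comes from the commit message.

```cpp
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical stand-in for a software command queue; not the runtime's class.
class BatchQueue {
 public:
  // max_batch plays the role of the DEBUG_CLR_MAX_BATCH_SIZE threshold.
  explicit BatchQueue(std::size_t max_batch) : max_batch_(max_batch) {}

  // Queue a command; flush once the pending count grows over the threshold.
  void submit(std::function<void()> cmd) {
    pending_.push_back(std::move(cmd));
    if (pending_.size() > max_batch_) flush();
  }

  // Run and release every command that is still pending.
  void flush() {
    for (auto& cmd : pending_) cmd();
    pending_.clear();
    ++flush_count_;
  }

  std::size_t pending() const { return pending_.size(); }
  std::size_t flush_count() const { return flush_count_; }

 private:
  std::size_t max_batch_;
  std::size_t flush_count_ = 0;
  std::vector<std::function<void()>> pending_;
};
```

With a threshold of 3, the fourth submit() without an explicit flush drains all four commands, so the pending list cannot grow without bound even if the application never waits.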
Alex Xie
df9ae754a4
SWDEV-482751 - Depends on distro opencl icd loader
...
Since we don't distribute the ICD loader, we need to install the distro's ICD loader.
Change-Id: I1ea86bcf7c642a034c53f71130b15de1fa27e31e
2024-10-16 16:21:58 -04:00
Ajay
ff306ce9d8
SWDEV-482751 - add distro path to find package AMD_ICD
...
Change-Id: I0d21f6ba6ade3ed932b134da503f639fd5d0d552
2024-10-14 15:27:34 -07:00
German Andryeyev
364dfb0ed1
SWDEV-486602 - Optimize HSA callback performance
...
- Don't generate callbacks for HIP events
- Don't process profiling info in the callback for HIP events
- Wait for CPU status update of the submitted commands
every 50 calls. That allows the runtime to drain the commands and
destroy HSA signals.
Change-Id: Ib601a350e7e7c2b6c6209a172385389baccf73a9
2024-10-11 14:50:25 -04:00
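The "every 50 calls" policy in the commit above can be pictured as a simple call counter. This is a sketch under assumptions: PeriodicDrain and on_submit are hypothetical names, and the real runtime would wait for the CPU status update where this helper only reports that a wait is due.

```cpp
#include <cstdint>

// Hypothetical helper: reports when a periodic drain is due.
class PeriodicDrain {
 public:
  explicit PeriodicDrain(std::uint32_t interval) : interval_(interval) {}

  // Call once per submission; returns true on every interval-th call.
  bool on_submit() {
    if (++calls_ < interval_) return false;
    calls_ = 0;  // start counting towards the next drain
    return true;
  }

 private:
  std::uint32_t interval_;
  std::uint32_t calls_ = 0;
};
```

A caller would construct PeriodicDrain with 50 and perform the blocking wait only when on_submit() returns true, so the cost of the wait is spread across many submissions.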
Ioannis Assiouras
5da72f9d52
SWDEV-490323 - Fix validateMemAccess in hipMemset
...
Changed the validation to occur on the sub-object rather than the parent.
Change-Id: I87bf5ef3526d0db9304099ef9ac1a5494e9a01a9
2024-10-10 18:08:28 -04:00
Todd tiantuo Li
41dc4545fc
SWDEV-472357 - support Rect copy with staging buffer for 2D & 3D memcpy in PAL
...
Change-Id: Ie32f3e5a6fa077f6b2db20fc1ab1e2e0da8344cb
2024-10-10 18:00:19 -04:00
kjayapra-amd
e7c0e06b5e
SWDEV-486510 - Delete hip::Function object, in case compiler passes duplicate hostFunction ptr.
...
Change-Id: Ic8714eb9022a0f2150b2ea5dc008cecd7a9fae27
2024-10-10 12:45:58 -04:00
Vladana Stojiljkovic
6f2bad3998
SWDEV-489823 - Fix hipStreamEndCapture leak when capture is invalidated
...
Change-Id: If8f5163d70e04d34a75fd0a7ba6c0a15ea59bb8b
2024-10-10 04:38:06 -04:00
Jaydeep Patel
5ccc140e1b
SWDEV-485866 - Return OOM if stream creation fails due to insufficient memory.
...
Change-Id: I4e57ecc81921bde274bb6a4e0890f0fc6a17955a
2024-10-10 00:44:54 -04:00
Jatin Chaudhary
b977101893
SWDEV-486137 - match behavior of int variants of hadd/uhadd/rhadd/urhadd
...
Match the expected behavior and handle cases where the intermediate sum can overflow.
Change-Id: I3d6f802686af230a622ef9891a844135ad3d1ae5
2024-10-09 13:47:33 -04:00
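For context on the overflow concern in the commit above, here is a well-known bit trick that computes halving adds without ever forming the full sum. It is a sketch only and is not claimed to be the code this commit introduced. The function names mirror the intrinsics in the commit title but are plain host functions here, and the signed variants assume that right shift of a negative value is arithmetic (guaranteed from C++20, and the behavior of mainstream compilers before that).

```cpp
#include <cstdint>

// Floor average of two signed ints, (a + b) >> 1, without computing a + b.
// Uses the identity a + b == 2 * (a & b) + (a ^ b).
inline std::int32_t hadd(std::int32_t a, std::int32_t b) {
  return (a & b) + ((a ^ b) >> 1);
}

// Rounded-up average of two signed ints, (a + b + 1) >> 1.
// Uses the identity a + b == 2 * (a | b) - (a ^ b).
inline std::int32_t rhadd(std::int32_t a, std::int32_t b) {
  return (a | b) - ((a ^ b) >> 1);
}

// Unsigned floor average.
inline std::uint32_t uhadd(std::uint32_t a, std::uint32_t b) {
  return (a & b) + ((a ^ b) >> 1);
}

// Unsigned rounded-up average.
inline std::uint32_t urhadd(std::uint32_t a, std::uint32_t b) {
  return (a | b) - ((a ^ b) >> 1);
}
```

A naive (a + b) / 2 overflows when both inputs are near the type's limits; the forms above stay in range because each result equals the exact average, which always lies between the two inputs.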
kjayapra-amd
74ebbe17e9
SWDEV-486573 - Check the return type of commit memory.
...
Change-Id: Id158cd7a0dff37b382b858cf7113aa4cf326300a
2024-10-09 05:10:03 -04:00
Julia Jiang
d6bcabdc2c
SWDEV-479940 - Correct changelog in staging for 6.2.1
...
Change-Id: I3f35a85b9834841d27fa35abc52b9838d6f1c9e7
2024-10-08 17:04:43 -04:00
Ioannis Assiouras
80043d38f4
SWDEV-483134 - Deprecate hipHostMalloc and hipHostFree APIs
...
Change-Id: I230ab2de2e4bdfdd9bfb0a3e59c6130a25b8b0cd
2024-10-08 15:58:25 -04:00
Satyanvesh Dittakavi
15ecf834a1
SWDEV-489280 - Add missing hipGraphNodeSetParams API in dispatch table
...
Change-Id: I41dfd045fa4e29b49e605b8d583ec9f51dd6a6cc
2024-10-08 13:56:02 -04:00
Jaydeep Patel
a6c5c6a95a
SWDEV-487988 - Reserve event flag in hip::Event.
...
Don't create a new hip::Function if it is already registered.
Change-Id: I3ecd5d61146659be6ba434717b0f21d3fc04cfc9
2024-10-08 05:29:32 -04:00
Jaydeep Patel
e74ac6f580
SWDEV-482692, SWDEV-485802, SWDEV-485489 - Handle refcounts owned by graph for user objects.
...
Change-Id: Ic739ab1ec5d3dc3143e3ae70f9591922bc0e3d9f
2024-10-08 03:44:44 -04:00
Jaydeep Patel
164cbcc531
SWDEV-487905 - device_ptr_ is being removed and its amd::Memory obj is being deleted during ihipFree in hip::StatCO::removeFatBinary.
...
Change-Id: I89d9fdeb53dc4ce0699f1f445a28486917a36e72
2024-10-08 03:38:15 -04:00
Branislav Brzak
43fcac1739
SWDEV-482130 - Fix release of virtual mem obj
...
Change-Id: I893a8353aa1a25d00e36c8e601caf31cc0fc1f22
2024-10-08 01:37:39 -04:00
Satyanvesh Dittakavi
522ae8ead4
SWDEV-483241 - Add a compile option to avoid including default hiprtc header
...
Change-Id: Ic23b41395588e6183abac36cb7543da02b0aba29
2024-10-07 07:56:29 -04:00
Saleel Kudchadker
e36666e536
SWDEV-301667 - Enable ROCr logging
...
- Use AMD_LOG_LEVEL=5 to dump AQL packets in ROCr
Change-Id: I2c044a5304c4eaf3d3af20e62d1f54c98d4fbaa4
2024-10-04 19:22:12 -04:00
Saleel Kudchadker
d3d0ca5fc6
SWDEV-478065 - Revert "SWDEV-478065 - Embed host thread in shared_ptr"
...
This reverts commit 4b03017e8a.
Reason for revert: This blocks multithreaded callbacks
Change-Id: I9944417e4fb63c9eea2b286c828c7dfa621c4fe8
2024-10-04 19:19:28 -04:00
Branislav Brzak
d29ebea7ac
SWDEV-476542 - Unable to link to hipGraphExecGetFlags
...
Change-Id: I572baaeee31c6a73e533f9ef956bf111e9d2e688
2024-10-04 13:39:06 -04:00
Saleel Kudchadker
35e03ea0d0
SWDEV-301667 - Logging upgrades
...
- Use AMD_LOG_LEVEL_SIZE, in MB, to set the log file size truncation limit; the default is 2048 MB
Change-Id: Ia2f87e8c6b94148e30edfb602b279f93630817c3
2024-10-04 13:26:25 -04:00
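A sketch of how a size-in-MB environment variable with a default could be read. The variable name AMD_LOG_LEVEL_SIZE and the 2048 MB default come from the commit above; the helper names and the parsing rules (reject empty, non-numeric, or zero values) are assumptions for illustration and may not match what the runtime does.

```cpp
#include <cctype>
#include <cstdint>
#include <cstdlib>

// Hypothetical helper: parse a size in MB, falling back to a default
// when the text is missing, not a plain decimal number, or zero.
inline std::uint64_t parse_size_mb(const char* text, std::uint64_t fallback_mb) {
  if (text == nullptr || !std::isdigit(static_cast<unsigned char>(text[0]))) {
    return fallback_mb;
  }
  char* end = nullptr;
  const unsigned long long value = std::strtoull(text, &end, 10);
  if (*end != '\0' || value == 0) return fallback_mb;  // trailing garbage or zero
  return value;
}

// Hypothetical helper: log size limit in bytes, 2048 MB when the variable is unset.
inline std::uint64_t log_size_limit_bytes() {
  const std::uint64_t mb = parse_size_mb(std::getenv("AMD_LOG_LEVEL_SIZE"), 2048);
  return mb * 1024 * 1024;
}
```

Keeping the parsing separate from the getenv call makes the fallback behavior easy to check without touching the process environment.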
Jaydeep Patel
292842ad28
SWDEV-471422 - Free memory is double deducted on APUs because the system_total_alloced var holds local memory.
...
Change-Id: I3fbbc8f8aaa156881ff95cad6a4f82fd3df651d1
2024-10-04 04:49:20 -04:00
pghafari
b07178618c
SWDEV-467263 - Allow hipMalloc to use sys memory
...
PAL supports allocating from system memory once device memory is used up
or allocation is larger than the device memory.
Change-Id: Iccd3377e95a6cc6d23e45d4738a17af8b9ee32d7
2024-10-03 11:14:08 -04:00
Ioannis Assiouras
07bcc283f9
SWDEV-488851 - Correctly remove the queue from the active set on windows
...
Change-Id: I4d21743ecf7a44636121f85566f898e62ff61e97
2024-10-02 12:06:59 +01:00
Satyanvesh Dittakavi
ade1954015
SWDEV-478708 - Remove forced wait of 10us in hipEventQuery
...
Change-Id: I868aae14311c3cdfc09aa03252ac324c4b79b864
2024-10-01 06:27:42 -04:00
Rahul Manocha
9da90fe848
SWDEV-487903 - Fix for Empty Kernel Segfault in PAL
...
Change-Id: Ia1c19cf4ea24188cdb2d374b07f975f794e02dba
2024-09-30 13:00:15 -04:00
Jaydeep Patel
614b00c20b
SWDEV-487905 - Managed vars are registered in __hipRegisterManagedVar however not freed.
...
Change-Id: Ic5a72ac4d64a9f7f5a3a7a88e1ed813e6dcc1f57
2024-09-30 11:54:31 -04:00
Julia Jiang
17c8b9f855
SWDEV-412099 - Fix CTS clFillImage sub-tests failures
...
Change-Id: I082476837c539e6ccf93cba6b1e97aae2509e65c
2024-09-30 11:13:52 -04:00
Branislav Brzak
939c788779
SWDEV-478034 - Unable to link to hipGraphExecNodeSetParams
...
Change-Id: I0b6b8d1a4281ecda3c1789d8829ade9771aed741
2024-09-30 02:13:43 -04:00
Anusha GodavarthySurya
742b0210d3
SWDEV-477324 - Capture Memcpy1D pinned H2D D2H
...
Change-Id: I1f4744f20a9caeed005ec68da44e5fde737e09f7
2024-09-30 01:01:30 -04:00
Vladana Stojiljkovic
da5f1a6146
SWDEV-482086 - Fix hipGraphInstantiate leak
...
* In a scenario where a kernel is launched with hipExtLaunchKernelGGL and a stop event is used, hipGraphInstantiate leaks. Since a stop event is used, profiling is enabled and a Timestamp (ReferenceCountedObject) is created, but it doesn't get released.
* The idea behind this solution is that profiling should be disabled when a command is captured, so the timestamp should not be created. Because information about capturing isn't available when the kernel command is created, the packet capturing state is used to determine whether to create a timestamp.
Change-Id: Ia23adac4592ded4fb5e236acf99e12e729f63692
2024-09-29 11:36:53 -04:00
Jaydeep Patel
d6193a2f23
SWDEV-483436 - Use spt stream as default with -fgpu-default-stream=per-thread for hipMemsetAsync.
...
Change-Id: Ia85c2b4c40fc9250754d3b64fb9fd1c615362572
2024-09-29 01:42:33 -04:00
Rahul Manocha
0d20383ef9
[SWDEV-467733] - Add Param checking for SetCacheConfig APIs
...
Change-Id: I9e777fa0fae6791ebab539e49346e6956a6ff196
2024-09-27 11:32:58 -04:00
Jonathan R. Madsen
07c9c7fe56
Fix HIP API trace versioning
...
Change-Id: I33f2be4668c96e2225d4ca9a253e61ec2dc65102
2024-09-25 10:32:14 -04:00
Ajay
7a288ea8bf
SWDEV-486816 - RenderOpDispatch usage in pal client
...
Change-Id: I11cae3e625b287b998c9500c547efdacf1034a2b
2024-09-24 14:28:16 -04:00
pghafari
0a918c8f96
SWDEV-479260,SWDEV-483599 - Check griddim Y,Z <= 65536
...
Gfx12 has 16 bits for grid dim Y/Z. Detect gfxIp and return an error if dim Y/Z exceeds 16 bits.
Change-Id: I43dd14affc9e4073d0b1232e7523967f0180fa31
2024-09-23 11:36:13 -04:00
Jatin Chaudhary
f8beeede22
SWDEV-466747 - call device sync once while unregistering
...
Basically embed hipDeviceSync in std::call_once.
Change-Id: I29ca926d61ed80e21acba5c388a8256d913487e4
2024-09-23 08:00:10 -04:00
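The std::call_once pattern mentioned in the commit above can be sketched as follows. The stub names are hypothetical, and a counter stands in for the device synchronization that the runtime would perform.

```cpp
#include <atomic>
#include <mutex>

// Hypothetical stand-ins: the counter replaces the real device sync.
static std::atomic<int> g_sync_calls{0};
static void device_sync_stub() { ++g_sync_calls; }

// One flag shared by all unregister calls in the process.
static std::once_flag g_unregister_sync_once;

// Called once per unregistered binary; the sync itself runs only the first time.
static void unregister_fat_binary_stub() {
  std::call_once(g_unregister_sync_once, device_sync_stub);
  // ... per-binary cleanup would follow here ...
}
```

std::call_once also guarantees that concurrent callers block until the first invocation has finished, so no thread proceeds to cleanup before the single sync completes.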
German Andryeyev
29cc678d8d
SWDEV-483586 - Unblock staging H2D transfers
...
Although unpinned copies require synchronization
in HIP, the runtime can avoid syncs for H2D copies with
a staging buffer.
Change-Id: If2203c6bc0cbd89742823688dc8e89e9acd873b2
2024-09-21 10:25:27 -04:00
Maneesh Gupta
2d1c6ee23e
SWDEV-485179 - Revert "SWDEV-459254 - Overwrite cacheline size to 256 for gfx12, as it is used for kernarg alignment."
...
This reverts commit 1f63650bf96e01e48f879aa58b80e2130dd4a567.
Reason for revert: <INSERT REASONING HERE>
Change-Id: I6d7ed87c09d9b77116548dce1f30ac4711c2c09d
2024-09-20 11:33:34 -04:00
Anusha GodavarthySurya
870842201d
SWDEV-485904 - Fix virtual,physical mem obj leaks
...
Change-Id: Ie0456b5dcfec206ae54a6aabfc2a15a620cac693
2024-09-19 23:04:20 -04:00
Saleel Kudchadker
8c84a20b01
SWDEV-301667 - Improve logging
...
Change-Id: I3fa06791b7ac73d84b8a9586e6b3435fa8858d25
2024-09-19 15:09:03 -04:00
kjayapra-amd
12a39fbf22
SWDEV-480772 - Remove name variable from amd::Monitor class.
...
Change-Id: Ie2a4fa44f485786227230f8a892e090e718aa30e
2024-09-19 11:55:01 -04:00
Marko Arandjelovic
cfdc9dfc36
Revert "SWDEV-441296 - Allign hipTexObjectCreate error handling to CUDA"
...
This reverts commit 7d3c0c5e10.
Changing the error code is considered a breaking change,
so it should be done in major releases only.
The other reason for reverting the commit is that the change itself
is incorrect. CUDA behaves the same way as HIP when
pResDesc or pTexDesc are nullptr.
Change-Id: I3abee6b79279b81ab01c7f8466c7f8e3776c4109
2024-09-18 16:38:16 -04:00
Rahul Manocha
4d1ded9eaf
SWDEV-479575 - Add marker to parent graph dependencies in childgraph node
...
1) Child graph nodes need to have parent graph dependencies in the waitlist.
2) A marker is placed on the base stream with the parent graph waitlist.
Change-Id: Iec65a0171ea387be05b0733abcc708fb630e4be4
2024-09-18 15:12:50 -04:00
pghafari
365ffd4805
SWDEV-444447 - Fix regression for verbose printing for AMD_LOG_LEVEL=4
...
Change-Id: Id245caef711b7ccdf4e999e934993beb43d7c3d5
2024-09-18 13:08:10 -04:00
Rahul Manocha
07261002b1
SWDEV-439234 - Fix for Segfault in ValidateMemAccess
...
Change-Id: I251d277eb5af16ba5c0de85ffd142a5f64fa469d
2024-09-18 10:52:32 -04:00
Daniel Livingston
e550032d25
SWDEV-77148 - Add UberTrace support to PAL device
...
This PR adds UberTrace-based tracing support to ROCclr's PAL device class.
Legacy RGP-based tracing is still available and is the default.
If UberTrace support is enabled tool-side, this new code path will activate.
Change-Id: I268b2dcef70e850a50e2caef8355f38bf51d4641
2024-09-17 16:06:37 -04:00