Commit Graph

6768 Commits

Author SHA1 Message Date
Vladana Stojiljkovic f6c8bbf4dc SWDEV-492768 - Match hipStreamAddCallback capture behavior with nvidia
Change-Id: I7a084d8eeffe8b5095f7eb9969a565a40e76bb4b
2024-10-31 12:42:17 -04:00
Vladana Stojiljkovic 02bbe11e56 SWDEV-491452 - Allow hipMemAdvise capturing only in relaxed mode
Change-Id: I1ca5e050ff869b486e3a0a41d7f06390a88e1110
2024-10-31 12:41:47 -04:00
Vladana Stojiljkovic e08df57502 SWDEV-493526 - Create kernel node when hipLaunchByPtr is captured
Change-Id: Id3493485dfdb468436ab33e6d7cb19b6b0066fd4
2024-10-31 12:41:31 -04:00
Vladana Stojiljkovic ec60bb1aed SWDEV-489571 - Fix ihipGraphAddMemsetNode to allow memset of 3d portions of an array
* When hipMemset3dAsync is captured, a 3d extent can be set as a parameter (depth > 1). That worked on nvidia, but on amd the wrong portion of the array was filled: when creating the Memset3D command, the extent dimensions were used to create pitchedPtr instead of the original array width and height.
* Also, when capturing hipMemset3dAsync, nvidia allows any of the extent dimensions to be 0, and in that case no work should be done.

Change-Id: I46a605bf9ae801cd3348e98d528c21263a8eefce
2024-10-31 10:29:54 -04:00
German Andryeyev 403f624bf8 SWDEV-486602 - Add tracking of HSA handlers
Add an atomic counter to track the outstanding HSA handlers.
Wait on CPU for the callbacks if the number exceeds the value
in DEBUG_HIP_BLOCK_SYNC env variable.

Change-Id: I95dc8c4bf0258c7e59411b7504220709ed6898c5
2024-10-25 15:20:50 -04:00
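The handler tracking described in the SWDEV-486602 entry above can be pictured with a minimal sketch. The class name `HandlerTracker`, its methods, and the way the limit is passed in are assumptions made for illustration; the commit only states that an atomic counter tracks outstanding HSA handlers and that the CPU waits once the count exceeds the value of DEBUG_HIP_BLOCK_SYNC.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical tracker, not the runtime's actual class.
class HandlerTracker {
 public:
  // `limit` plays the role of the DEBUG_HIP_BLOCK_SYNC value (assumption).
  explicit HandlerTracker(std::uint32_t limit) : limit_(limit) {}

  // Called when an asynchronous handler is registered.
  void on_submit() { outstanding_.fetch_add(1, std::memory_order_relaxed); }

  // Called by the handler once it has finished running.
  void on_complete() {
    const std::uint32_t before =
        outstanding_.fetch_sub(1, std::memory_order_release);
    assert(before > 0);  // a completion must match an earlier submit
    (void)before;        // silence unused warning when NDEBUG is defined
  }

  // True when the submitting thread should block until handlers drain.
  bool should_wait() const {
    return outstanding_.load(std::memory_order_acquire) > limit_;
  }

  std::uint32_t outstanding() const {
    return outstanding_.load(std::memory_order_acquire);
  }

 private:
  std::atomic<std::uint32_t> outstanding_{0};
  std::uint32_t limit_;
};
```

A submitting thread would call `on_submit()` before registering a handler and consult `should_wait()` afterwards; how the real runtime performs the wait is not described in the commit message.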
Sameer Sahasrabuddhe 556390f9c7 SWDEV-490198: _sync() will be enabled by default in 6.4
Change-Id: Id029424a9c0f6b144a7aa0e96fe8acc4a138ec51
2024-10-25 09:54:40 -04:00
Sourabh Betigeri 64e1b15551 SWDEV-450052 - Return if numDevices is more than device count on the platform
Change-Id: I538106d1b02084df9cd06b41427629207312e76f
2024-10-24 17:07:11 -04:00
Anusha GodavarthySurya f9f995c6d0 SWDEV-480209 - Handle GraphExec object release
=> If the GraphExec instance is destroyed before an async launch completes,
destroy it after all pending graph launches
=> Remove GraphExec destroy during the next sync point (hipStreamSync,
hipDeviceSync, etc.)

Change-Id: I4df682aae5787fd6e5240a7be936ce50361345d0
2024-10-22 12:30:46 -04:00
David 05d6f75830 Changes needed for hipcc/hipconfig rename and cleanup
- HIPCC, on Linux, will be removing high-level perl scripts (hipcc/hipconfig) in ROCm 6.3
  - removes renaming hipcc.bin/hipconfig.bin logic

SWDEV-467478 - HIPCC Clean up Perl

Change-Id: I829e915d56b37cb2ba76bb876c6656166534f15c
2024-10-22 04:46:33 -04:00
Anusha GodavarthySurya b498103f9b SWDEV-485904 - propagate hsa_amd_vmem_address_free error to hip API
Unit_hipMemSetAccess_GrowVMM test fails with
HSA_STATUS_ERROR_RESOURCE_FREE silently

Change-Id: I7a78410e432de4a2e877062782abf8761645f392
2024-10-21 10:12:32 -04:00
Vladana Stojiljkovic 6deecf1bfe SWDEV-490474 - Allow hipMallocManaged capturing only in relaxed mode
Change-Id: I02dccc6c45e39082ef925509a28bbe3c2a0fb7c6
2024-10-18 04:52:01 -04:00
Saleel Kudchadker 0f2342bc13 SWDEV-491375 - Optimize multithreaded dispatches
- Fix typo

Change-Id: If4c68455dcfa03fee18cb4720e8b5b438642703c
2024-10-17 17:02:23 -04:00
Rahul Manocha e729f08704 SWDEV-468039,SWDEV-482579 - Enable FP8 SW Conversions on pre gfx940 archs
1) SW Conversions for ocp and fnuz are enabled on pre mi300 archs
2) for mi300 only fnuz is enabled
3) for gfx1200 only ocp is enabled

Change-Id: I90373752a2d15eff20d5deec874ed396ba4e1788
2024-10-17 11:49:22 -04:00
German Andryeyev 364dfb0ed1 SWDEV-486602 - Optimize HSA callback performance
- Don't generate callbacks for HIP events
- Don't process profiling info in the callback for HIP events
- Wait for CPU status update of the submitted commands
every 50 calls. That will allow to drain the commands and
destroy HSA signals.

Change-Id: Ib601a350e7e7c2b6c6209a172385389baccf73a9
2024-10-11 14:50:25 -04:00
Ioannis Assiouras 5da72f9d52 SWDEV-490323 - Fix validateMemAccess in hipMemset
Changed the validation to occur on the sub-object rather than the parent.

Change-Id: I87bf5ef3526d0db9304099ef9ac1a5494e9a01a9
2024-10-10 18:08:28 -04:00
Todd tiantuo Li 41dc4545fc SWDEV-472357 - support Rect copy with staging buffer for 2D & 3D memcpy in PAL
Change-Id: Ie32f3e5a6fa077f6b2db20fc1ab1e2e0da8344cb
2024-10-10 18:00:19 -04:00
kjayapra-amd e7c0e06b5e SWDEV-486510 - Delete hip::Function object, in case compiler passes duplicate hostFunction ptr.
Change-Id: Ic8714eb9022a0f2150b2ea5dc008cecd7a9fae27
2024-10-10 12:45:58 -04:00
Vladana Stojiljkovic 6f2bad3998 SWDEV-489823 - Fix hipStreamEndCapture leak when capture is invalidated
Change-Id: If8f5163d70e04d34a75fd0a7ba6c0a15ea59bb8b
2024-10-10 04:38:06 -04:00
Jaydeep Patel 5ccc140e1b SWDEV-485866 - Return OOM if stream creation fails due to insufficient memory.
Change-Id: I4e57ecc81921bde274bb6a4e0890f0fc6a17955a
2024-10-10 00:44:54 -04:00
Jatin Chaudhary b977101893 SWDEV-486137 - match behavior of int variants of hadd/uhadd/rhadd/urhadd
Match cases and handle cases where it can overflow.

Change-Id: I3d6f802686af230a622ef9891a844135ad3d1ae5
2024-10-09 13:47:33 -04:00
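For context on the overflow case mentioned in the SWDEV-486137 entry above: a halving add must not form the full sum in the operand width, or the carry is lost. The sketch below shows one well-known overflow-free formulation for the unsigned variants only. The helper names `uhadd_ref`, `urhadd_ref`, and `check_pair` are hypothetical, and this is not claimed to be the code the commit introduced.

```cpp
#include <cassert>
#include <cstdint>

// floor((x + y) / 2) without ever forming the 33-bit sum:
// shared bits count fully, differing bits count half.
constexpr std::uint32_t uhadd_ref(std::uint32_t x, std::uint32_t y) {
  return (x & y) + ((x ^ y) >> 1);
}

// Rounded-up average, (x + y + 1) >> 1, again without a wide sum.
constexpr std::uint32_t urhadd_ref(std::uint32_t x, std::uint32_t y) {
  return (x | y) - ((x ^ y) >> 1);
}

// Cross-checks both helpers against a 64-bit reference computation.
inline void check_pair(std::uint32_t x, std::uint32_t y) {
  const std::uint64_t wide = static_cast<std::uint64_t>(x) + y;
  assert(uhadd_ref(x, y) == static_cast<std::uint32_t>(wide >> 1));
  assert(urhadd_ref(x, y) == static_cast<std::uint32_t>((wide + 1) >> 1));
}
```

A naive `(x + y) >> 1` in 32 bits would return a wrong result when both operands are near the maximum, which is the kind of input `check_pair` is meant to exercise.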
Satyanvesh Dittakavi 15ecf834a1 SWDEV-489280 - Add missing hipGraphNodeSetParams API in dispatch table
Change-Id: I41dfd045fa4e29b49e605b8d583ec9f51dd6a6cc
2024-10-08 13:56:02 -04:00
Jaydeep Patel a6c5c6a95a SWDEV-487988 - Reserve event flag in hip::Event.
Don't create a new hip::Function if it is already registered.

Change-Id: I3ecd5d61146659be6ba434717b0f21d3fc04cfc9
2024-10-08 05:29:32 -04:00
Jaydeep Patel e74ac6f580 SWDEV-482692, SWDEV-485802, SWDEV-485489 - Handle refcounts owned by graph for user objects.
Change-Id: Ic739ab1ec5d3dc3143e3ae70f9591922bc0e3d9f
2024-10-08 03:44:44 -04:00
Jaydeep Patel 164cbcc531 SWDEV-487905 - device_ptr_ is being removed and its amd::Memory obj is being deleted during ihipFree in hip::StatCO::removeFatBinary.
Change-Id: I89d9fdeb53dc4ce0699f1f445a28486917a36e72
2024-10-08 03:38:15 -04:00
Branislav Brzak 43fcac1739 SWDEV-482130 - Fix release of virtual mem obj
Change-Id: I893a8353aa1a25d00e36c8e601caf31cc0fc1f22
2024-10-08 01:37:39 -04:00
Satyanvesh Dittakavi 522ae8ead4 SWDEV-483241 - Add a compile option to avoid including default hiprtc header
Change-Id: Ic23b41395588e6183abac36cb7543da02b0aba29
2024-10-07 07:56:29 -04:00
Branislav Brzak d29ebea7ac SWDEV-476542 - Unable to link to hipGraphExecGetFlags
Change-Id: I572baaeee31c6a73e533f9ef956bf111e9d2e688
2024-10-04 13:39:06 -04:00
Saleel Kudchadker 35e03ea0d0 SWDEV-301667 - Logging upgrades
- Use AMD_LOG_LEVEL_SIZE in MBs to set log file size truncation; the default is 2048 MB

Change-Id: Ia2f87e8c6b94148e30edfb602b279f93630817c3
2024-10-04 13:26:25 -04:00
pghafari b07178618c SWDEV-467263 - Allow hipMalloc to use sys memory
PAL supports allocating from system memory once device memory is used up
or allocation is larger than the device memory.

Change-Id: Iccd3377e95a6cc6d23e45d4738a17af8b9ee32d7
2024-10-03 11:14:08 -04:00
Satyanvesh Dittakavi ade1954015 SWDEV-478708 - Remove forced wait of 10us in hipEventQuery
Change-Id: I868aae14311c3cdfc09aa03252ac324c4b79b864
2024-10-01 06:27:42 -04:00
Jaydeep Patel 614b00c20b SWDEV-487905 - Managed vars are registered in __hipRegisterManagedVar however not freed.
Change-Id: Ic5a72ac4d64a9f7f5a3a7a88e1ed813e6dcc1f57
2024-09-30 11:54:31 -04:00
Branislav Brzak 939c788779 SWDEV-478034 - Unable to link to hipGraphExecNodeSetParams
Change-Id: I0b6b8d1a4281ecda3c1789d8829ade9771aed741
2024-09-30 02:13:43 -04:00
Anusha GodavarthySurya 742b0210d3 SWDEV-477324 - Capture Memcpy1D pinned H2D D2H
Change-Id: I1f4744f20a9caeed005ec68da44e5fde737e09f7
2024-09-30 01:01:30 -04:00
Vladana Stojiljkovic da5f1a6146 SWDEV-482086 - Fix hipGraphInstantiate leak
* In a scenario where a kernel is launched with hipExtLaunchKernelGGL and a stop event is used, hipGraphInstantiate leaks. Since a stop event is used, profiling is enabled and a Timestamp (ReferencedCountedObject) is created, but it doesn't get released.
* The idea behind this solution is that profiling should be disabled when a command is captured, hence the timestamp should not be created. Because information about capturing isn't available when the kernel command is created, the packet capturing state is used to determine whether to create a timestamp or not.

Change-Id: Ia23adac4592ded4fb5e236acf99e12e729f63692
2024-09-29 11:36:53 -04:00
Jaydeep Patel d6193a2f23 SWDEV-483436 - Use spt stream as default with -fgpu-default-stream=per-thread for hipMemsetAsync.
Change-Id: Ia85c2b4c40fc9250754d3b64fb9fd1c615362572
2024-09-29 01:42:33 -04:00
Rahul Manocha 0d20383ef9 [SWDEV-467733] - Add Param checking for SetCacheConfig APIs
Change-Id: I9e777fa0fae6791ebab539e49346e6956a6ff196
2024-09-27 11:32:58 -04:00
Jonathan R. Madsen 07c9c7fe56 Fix HIP API trace versioning
Change-Id: I33f2be4668c96e2225d4ca9a253e61ec2dc65102
2024-09-25 10:32:14 -04:00
pghafari 0a918c8f96 SWDEV-479260,SWDEV-483599 - Check griddim Y,Z <= 65536
Gfx12 has 16 bits for grid dim Y/Z. Detect gfxIp and return error if dim y/z > 16 bits

Change-Id: I43dd14affc9e4073d0b1232e7523967f0180fa31
2024-09-23 11:36:13 -04:00
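The range check described in the SWDEV-479260 entry above could look roughly like the following sketch. `Dim3`, `kMax16Bit`, and `grid_yz_fits_16_bits` are hypothetical names, and the bound is an assumption: the commit body says "16 bits", which this sketch reads as a maximum of 65535, while the commit title writes the limit as 65536. The exact comparison used by the runtime is not shown in the message.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the launch dimension type.
struct Dim3 {
  std::uint32_t x, y, z;
};

// Largest value representable in 16 bits (assumed bound, see lead-in).
constexpr std::uint32_t kMax16Bit = 0xFFFFu;

// Returns true when the Y and Z dimensions fit in 16 bits.
// X is deliberately not limited here; the commit only mentions Y and Z.
constexpr bool grid_yz_fits_16_bits(const Dim3& grid) {
  return grid.y <= kMax16Bit && grid.z <= kMax16Bit;
}

// Small usage example: a large X is accepted, an oversized Y is not.
inline void grid_check_example() {
  assert(grid_yz_fits_16_bits(Dim3{1u << 20, 65535u, 1u}));
  assert(!grid_yz_fits_16_bits(Dim3{1u, 65536u, 1u}));
}
```

In a runtime this check would presumably be applied only on hardware with the 16-bit limitation, with an error returned to the caller when it fails.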
Jatin Chaudhary f8beeede22 SWDEV-466747 - call device sync once while unregistering
Basically embed hipDeviceSync in std::call_once.

Change-Id: I29ca926d61ed80e21acba5c388a8256d913487e4
2024-09-23 08:00:10 -04:00
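The std::call_once pattern named in the SWDEV-466747 entry above can be sketched as follows. Everything in the `sketch` namespace is hypothetical: `device_sync_stub` merely counts calls in place of a real device synchronization, and `unregister_module` stands in for whatever unregister path the runtime uses.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>

namespace sketch {

std::once_flag sync_once;        // guards the one-time synchronization
std::atomic<int> sync_calls{0};  // counts how often the stub actually ran

// Hypothetical stand-in for the real device synchronization call.
inline void device_sync_stub() { sync_calls.fetch_add(1); }

// Hypothetical unregister path: may be invoked once per module,
// but the (expensive) sync runs only on the first invocation.
inline void unregister_module() {
  std::call_once(sync_once, device_sync_stub);
  // call_once returns only after the callable has completed once,
  // so the counter is exactly 1 from here on.
  assert(sync_calls.load() == 1);
  // ... per-module cleanup would follow here ...
}

}  // namespace sketch
```

The benefit of this shape is that every unregister call still observes a completed synchronization, while repeated calls skip the cost.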
German Andryeyev 29cc678d8d SWDEV-483586 - Unblock staging H2D transfers
Although unpinned copies require synchronizations
in HIP, runtime can avoid syncs for H2D copies with
a staging buffer

Change-Id: If2203c6bc0cbd89742823688dc8e89e9acd873b2
2024-09-21 10:25:27 -04:00
Anusha GodavarthySurya 870842201d SWDEV-485904 - Fix virtual,physical mem obj leaks
Change-Id: Ie0456b5dcfec206ae54a6aabfc2a15a620cac693
2024-09-19 23:04:20 -04:00
Saleel Kudchadker 8c84a20b01 SWDEV-301667 - Improve logging
Change-Id: I3fa06791b7ac73d84b8a9586e6b3435fa8858d25
2024-09-19 15:09:03 -04:00
kjayapra-amd 12a39fbf22 SWDEV-480772 - Remove name variable from amd::Monitor class.
Change-Id: Ie2a4fa44f485786227230f8a892e090e718aa30e
2024-09-19 11:55:01 -04:00
Marko Arandjelovic cfdc9dfc36 Revert "SWDEV-441296 - Allign hipTexObjectCreate error handling to CUDA"
This reverts commit 7d3c0c5e10.

Changing the error code is considered a breaking change,
so it should be done in major releases only.

The other reason for reverting the commit is that this change itself
is incorrect. CUDA behaves in the same way as HIP when
pResDesc or pTexDesc is nullptr.

Change-Id: I3abee6b79279b81ab01c7f8466c7f8e3776c4109
2024-09-18 16:38:16 -04:00
Rahul Manocha 4d1ded9eaf SWDEV-479575 - Add marker to parent graph dependencies in childgraph node
1) Child Graph nodes need to have parent graph dependencies in waitlist.
2) Marker is placed on base stream with parent graph waitlist

Change-Id: Iec65a0171ea387be05b0733abcc708fb630e4be4
2024-09-18 15:12:50 -04:00
Satyanvesh Dittakavi 8ee065c5bd SWDEV-478776 - Fix segfault with streamsync using hipStreamLegacy
Change-Id: Ifb412d0bcfa33bc1130b47b757ee276ca9bc1c3a
2024-09-17 15:02:18 -04:00
Jaydeep Patel 2494992695 SWDEV-478049 - Clear packets list as it is being added back later during submitKernelInternal while setting params for graph node.
Change-Id: I7451ffda93d94eeda5e1be05bb87558ae86d2a19
2024-09-16 23:10:32 -04:00
Ioannis Assiouras bcc545e6b8 SWDEV-476929 - Introduce an activeQueues set
The new set tracks only the queues that have a command
submitted to them. This allows for fast iteration
in waitActiveStreams.

Change-Id: I2c832eefa01280d9a87a5f57874d36d2e9441de7
2024-09-16 15:53:49 -04:00
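The idea in the SWDEV-476929 entry above, tracking only queues that have had a command submitted, can be illustrated with a small sketch. `QueuePool`, `submit`, and `wait_active` are hypothetical names, and resetting a counter stands in for a real wait; the commit does not show the runtime's actual data structures.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_set>
#include <vector>

// Hypothetical pool of queues identified by index.
class QueuePool {
 public:
  explicit QueuePool(std::size_t count) : pending_(count, 0) {}

  // Records a submitted command and marks the queue as active.
  void submit(std::size_t queue) {
    assert(queue < pending_.size());
    ++pending_[queue];
    active_.insert(queue);
  }

  // Visits only the active queues instead of scanning every queue.
  // Returns how many queues were visited.
  std::size_t wait_active() {
    std::size_t visited = 0;
    for (std::size_t queue : active_) {
      pending_[queue] = 0;  // stand-in for waiting on the queue
      ++visited;
    }
    active_.clear();  // nothing is outstanding after the wait
    return visited;
  }

 private:
  std::vector<int> pending_;                // commands pending per queue
  std::unordered_set<std::size_t> active_;  // queues with pending work
};
```

With 100 queues and work on only two of them, `wait_active()` touches two entries rather than 100, which is the fast iteration the commit message refers to.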
Ranjith Ramakrishnan 4d0b815d06 SWDEV-437189 - Provide option to enable/disable CPACK_SET_DESTDIR
The variable is already set as a cache variable so that the user can override it,
but the hard-coded setting was preventing the override. Removed the hard-coded setting.

Change-Id: I2aecc18ce4f1d1b523ba267ef1c8ef4ea1168d9c
2024-09-13 15:49:53 -04:00
Rahul Manocha eb1089593e SWDEV-480536 - Disable cpu wait in device synchronize
1) Currently cpu wait is set to true, which makes the host wait for the last
command in the queue to finish even if the kernel execution has already
finished, causing a delay in the device sync call.
2) Device sync only needs to await completion when the hw event
is not ready.


Change-Id: I91e3e89d39a1193ae06abac822cea8ae651493a5
2024-09-13 15:31:32 -04:00