Commit Graph

292 Commits

Author SHA1 Message Date
Kudchadker, Saleel 072fb0804e SWDEV-521647 - Fix tracking of hw_event (#206)
- When a command may have two packets (like the device heap
  initializer) and there is no signal on the main kernel packet, the
  tracking was broken because it marked the HW event of the command as
  the first packet's signal.
- If no completion signal is attached to the second packet, clear the
  HW event for the command.
2025-04-25 08:46:44 -07:00
Sang, Tao 1113eff3f9 SWDEV-493275 - Support scratch limit (#20)
Support programmatic query and change of scratch limit on
AMD devices.

Change-Id: Id5da355a77366f97868e462847f3916e87fd2af6
2025-04-24 17:15:25 -04:00
Yao, Longlong 0de73eeaf8 SWDEV-518966 - Avoid creating Arena Memobj for VMM pointer (#39)
Change-Id: I69c6c0a1464d01e674ac929de34ab10047012f1a

Signed-off-by: Longlong Yao <Longlong.Yao@amd.com>
2025-04-11 16:55:53 +05:30
Patel, Jaydeepkumar b5c9cbc236 SWDEV-508632 - Align address to 2 MBs for hidden heap allocation. (#29) 2025-04-02 16:33:29 +05:30
Andryeyev, German 28967982b2 SWDEV-517481 - Add dynamic queue management (#37)
Enabled by default. DEBUG_HIP_DYNAMIC_QUEUES controls the feature.
2025-03-19 11:22:50 -04:00
Branislav Brzak c2d1776ebd SWDEV-516564 - SWDEV-512817 - Remove mentions of gfx940 and gfx941
Change-Id: Ia069fcb9c6948c3fc9a00961593c9dcc59609375
2025-03-05 04:26:07 -05:00
Saleel Kudchadker 37d606d193 SWDEV-513197 - Unify getBuffer implementation
- Use getBuffer/releaseBuffer in BlitManager
- Cleanup XferBuffer as we use ManagedBuffer for both reads/writes

Change-Id: I2661b85dd012763b17a38a743fec1b1d79125f67
2025-02-28 12:47:51 -05:00
German Andryeyev ba8e740be4 SWDEV-515356 - Make the round-robin queue selection
- Add custom compare to the map of queues, which will help with
 the round-robin selection

Change-Id: Ie67a820bfb1a5b484a1b3edced967eed94228bb8
2025-02-20 11:09:54 -05:00
Vladana Stojiljkovic 061c5d877f SWDEV-510059 - Format CU mask properly
Change-Id: I80e94b4f3ea25f6988fc06d83aeb398e81ccddd1
2025-02-13 11:02:56 -05:00
kjayapra-amd cc62a82347 SWDEV-488290 - Remove Stream to Engine logic and rely on engine query status HSA API.
Change-Id: I469ab6679360c8ee8d4ee515678a8aa8d4578ebf
2025-02-04 13:00:16 -05:00
Jimbo Xie 4872b420c9 SWDEV-504383 - Cleaned up kForcedTimeout10us and removed IsHwEventReadyForcedWait
Also removed active_wait_timeout

Change-Id: I7a429f003c09a4df267b5c0983050704260094c6
2025-01-31 14:40:18 -05:00
kjayapra-amd 0324014710 SWDEV-509280 - Combine multiple definitions of callbackQueue into a single function.
Change-Id: Ibbb56136bec2beed71c202d75e8aec9e82640a4e
2025-01-30 15:58:11 -05:00
German Andryeyev ea0b092af8 SWDEV-459826 - Add a crash dump for a failed queue
The logic can analyze the AQL queue state and
find a failed AQL packet with the kernel's name

Change-Id: I1a478fa2c25462cd07a194784958bdf22454b897
2025-01-28 14:27:46 -05:00
taosang2 799e54aa0d SWDEV-507969 - Fix wrong VGPRs for some devices
Change-Id: Ia8fc19564272e2c7171d991376bf896a99085a97
2025-01-22 10:11:47 -05:00
Evgenii Averin b62995ce1a SWDEV-505769 - Fix typo
Change-Id: I2d3f65ed68157718c4439a9da7d2dcdfcbb9f93d
2025-01-08 21:31:17 -05:00
taosang2 d82d6a78cf SWDEV-479958 - Support different address mode
Support different address modes in X, Y, Z directions

Change-Id: If1db5a8af33c92dd14b48968c3e8eceb97daea6c
2024-12-23 16:39:54 -05:00
Saleel Kudchadker 7863eb92dc SWDEV-497145 - Use rocr copyOnEngine API for staged copies
- Refactor blit code and clean ASAN instrumentation
- Use unified function for rocr copy
- Enable shader copy path for unpinned writeBuffer/readBuffer paths
- Set GPU_FORCE_BLIT_COPY_SIZE=16 which means we will use BLIT copy for
  pinned copies or unpinned H2D/D2H copies < 16KB

Change-Id: I42045cca79234b340dbf53dafb93044199736ae4
2024-12-04 13:38:13 -05:00
Satyanvesh Dittakavi a26dc29eb9 SWDEV-491967 - Add the right VGPRs per SIMD and VGPR Granularity for gfx12
- Without the right values defined for gfx12 ASICs, default values are
assigned, causing the occupancy calculation to go wrong
- Also added these values for gfx1105

Change-Id: I611cc3a8ed8c57f2def637310ce1c3a48c16a574
2024-11-01 12:47:23 -04:00
Konstantin Zhuravlyov 3387f48b56 SWDEV-428601 - Don't enforce 1 isa per device in rocm backend
- Device can have multiple isas as per HSA spec
  - First isa is most specific one, so this change is sort of a NOP

Change-Id: Ib332af21745f2e6a7c25db8986bf7717501059bc
2024-11-01 11:01:02 -04:00
Anusha GodavarthySurya b498103f9b SWDEV-485904 - propagate hsa_amd_vmem_address_free error to hip API
Unit_hipMemSetAccess_GrowVMM test fails with
HSA_STATUS_ERROR_RESOURCE_FREE silently

Change-Id: I7a78410e432de4a2e877062782abf8761645f392
2024-10-21 10:12:32 -04:00
Saleel Kudchadker e36666e536 SWDEV-301667 - Enable ROCr logging
- Use AMD_LOG_LEVEL=5 to dump AQL packets in ROCr

Change-Id: I2c044a5304c4eaf3d3af20e62d1f54c98d4fbaa4
2024-10-04 19:22:12 -04:00
German Andryeyev 29cc678d8d SWDEV-483586 - Unblock staging H2D transfers
Although unpinned copies require synchronizations
in HIP, runtime can avoid syncs for H2D copies with
a staging buffer

Change-Id: If2203c6bc0cbd89742823688dc8e89e9acd873b2
2024-09-21 10:25:27 -04:00
Maneesh Gupta 2d1c6ee23e SWDEV-485179 - Revert "SWDEV-459254 - Overwrite cacheline size to 256 for gfx12, as it is used for kernarg alignment."
This reverts commit 1f63650bf96e01e48f879aa58b80e2130dd4a567.

Reason for revert: <INSERT REASONING HERE>

Change-Id: I6d7ed87c09d9b77116548dce1f30ac4711c2c09d
2024-09-20 11:33:34 -04:00
kjayapra-amd 12a39fbf22 SWDEV-480772 - Remove name variable from amd::Monitor class.
Change-Id: Ie2a4fa44f485786227230f8a892e090e718aa30e
2024-09-19 11:55:01 -04:00
kjayapra-amd 6211037f63 SWDEV-439234 - Access check before memcpy and kernel operations.
Change-Id: I7057125c03460db205409e19980145298c190fe2
2024-09-06 14:30:00 -04:00
Rahul Manocha ddbd7039b0 SWDEV-478921 - Destroy Queue created by Coop Launch
Change-Id: I7f31ce05421479ff1de138cae26aafa071e956e2
2024-09-02 02:35:08 -04:00
kjayapra-amd e7a7feb273 SWDEV-464828 - Initial implementation of VMM IPC on PAL/Windows.
Change-Id: I3d5e148fad9105704db6724b00df06bef4fc9d2f
2024-07-16 10:38:35 -04:00
Satyanvesh Dittakavi 191869b252 SWDEV-471935 - Destroy hsa queues with cumask set
Fixes the memory leak with the hipExtStreamCreateWithCUMask API.
HSA queues with a CU mask set are not reused and are created
every time the API is called, but these queues were not being
destroyed during hipStreamDestroy, causing a memory leak.

Change-Id: Ibfbe019bbd73604e98eca80461efe53fa64bb701
2024-07-16 10:02:42 -04:00
Julia Jiang dd30e0e893 SWDEV-472710 - Adding gitattributes and remove trailing spaces
Change-Id: Ic8ad2071745f0ffe6a2e120bfebb6d90bf270f87
2024-07-15 12:39:56 -04:00
Ioannis Assiouras 0053584aac SWDEV-472309 - Check if vmm support exists before enabling vm in mempool
Change-Id: I6ae2fb18a306595e0f3a56e144658a4a720e7a37
2024-07-12 10:11:03 -04:00
taosang2 544c45364f SWDEV-467540 - Fix reference of freed locks
1. Move global amd::monitor listenerLock before global
class runtime_tear_down as it will be referenced in
~RuntimeTearDown() after main(). It should be freed
later than runtime_tear_down.

2. Update Device::~Device() to SVM free coopHostcallBuffer_
before context_ is released and freed.

Change-Id: I1d21378ff463477d3238d71e5e2a1a7d6b9147ad
2024-06-18 13:58:36 -04:00
Anusha GodavarthySurya 57156c524d SWDEV-467102 - Hidden heap init for graph capture
If the graph has kernels that do device-side allocation, the heap is allocated during
packet capture, because the heap pointer has to be added to the AQL packet, and is
initialized during graph launch.

Handle race with wait when 2 kernels with device heap are enqueued on multiple streams.

Change-Id: I45933b77fcaf7bc8fdf1bc906462e32b5d8d3688
2024-06-17 02:07:25 -04:00
Satyanvesh Dittakavi 1815fc808d SWDEV-464927 - Update the Get by PCI BusId logic and Hop count
- Update the intra socket weight for partitions within single socket as
it is changed to 13 by the driver.
- Use the PCIe function to distinguish the partitions of the same device
such as TPX mode in gfx942.

Change-Id: I8e64023d44e37c2dbb105cbb343441a48021ba7b
2024-06-10 04:46:50 -04:00
Ioannis Assiouras 8f42ad6aa3 SWDEV-464648 - code and comment cleanups
Change-Id: I5ba3f1bff500b3cd5903c2f441017735e688f83f
2024-06-07 22:38:09 +01:00
Ioannis Assiouras 775dc204aa SWDEV-463865 - changed device,roc and pal namespaces to be nested under amd
Change-Id: Icad342843c039c634e249a13a7aa31400730b1dd
2024-06-07 12:23:06 -04:00
kjayapra-amd 1590b39f9e SWDEV-464455 - Init Segment flags and check for valid segment before passing to hsa APIs for allocation.
Change-Id: Ibe640093acdb7856115b6a4109bcf010adf20353
2024-06-07 10:40:57 -04:00
Ioannis Assiouras b8c2ac4de4 SWDEV-463865 - symbol renamings to prevent conflicts in static build
Change-Id: Id7fbb638c1088c23df52fee877cd790d637b1ffb
2024-06-06 04:05:55 -04:00
Lang Yu a0127c9eea SWDEV-461525 - Add vgprAllocGranularity_ and vgprsPerSimd_ for gfx1150/1
These were missing for gfx1150/1.

Change-Id: I03d997e451d15a01a961e6597f805f634e5c3ae7
Signed-off-by: Lang Yu <lang.yu@amd.com>
2024-05-31 21:53:25 -04:00
Alex Xie 80011685b2 SWDEV-462635 - 256 byte image memory alignment
Change-Id: I1d21368ff460477d3238d71e4e2a0a7d6b9167ac
2024-05-29 10:37:27 -04:00
Ajay 6ec5074d74 SWDEV-439581 - hip event flags clean up
Change-Id: I2197762d912da41a8b53b32b3446f0a958c988a6
2024-05-28 06:31:10 +00:00
Ajay a5a4b78606 SWDEV-439581 - hipEventBlockingSync flag for hip events
Change-Id: I0d7785a568f8007f82f999776a7ad23d0acc81b7
2024-05-28 06:31:10 +00:00
Vladana Stojiljkovic fdaa7141af SWDEV-452364 - Check if no GPUs are available when hsa_init fails
* When no GPUs are available, hsa_init fails with HSA_STATUS_ERROR_OUT_OF_RESOURCES, and device and runtime initialization fail. For the NoGpu tests to pass, true needs to be returned, which causes HIP_INIT_API to return the proper error hipErrorNoDevice instead of hipErrorInvalidDevice.

Change-Id: I982d4416c92ed1b36893354d8b10d73df34f2478
2024-05-28 06:31:10 +00:00
kjayapra-amd dd1dd86fd7 SWDEV-459254 - Overwrite cacheline size to 256 for gfx12, as it is used for kernarg alignment.
Change-Id: Ia6acf312ee84f6dde1c830fc21f10d3a8a9de5ee
2024-05-28 06:28:17 +00:00
Jaydeep Patel 1d48f2a1ab SWDEV-456279 - Adding new hip flag to access contiguous memory and pass the flag to HSA API.
Change-Id: I1bafeaa3096395c729723af958d609bc41e7845c
2024-04-30 05:25:38 -04:00
Ioannis Assiouras bf74ef4025 SWDEV-451594 - Implement Readback and Avoid HDP Flush workaround for device kernel args
Change-Id: I6d41a089a17f55306e7ff402588a1e831b20a7a7
2024-04-19 09:29:20 -04:00
kjayapra-amd 56ebf5157a SWDEV-413997 - VMM IPC implementation for Linux.
Change-Id: Icfeb83ca51e96be35abb67a94d6e3e1a1ca5a934
2024-04-18 11:28:13 -04:00
German Andryeyev c95a75a2bf SWDEV-444670 - Enable teardown class
Force implicit runtime teardown with a global destructor.

Change-Id: Iabe63dedf5b94fefc98668585c45a61607120669
2024-04-16 12:00:06 -04:00
Rakesh Roy 52db98edd9 SWDEV-453180 - Add UUID support for HIP_VISIBLE_DEVICES on Linux
- The UUID is an ASCII string of at most 21 chars that uniquely identifies a GPU
- Convert a UUID set in HIP_VISIBLE_DEVICES to a device index internally
- Then use the existing device index logic for HIP_VISIBLE_DEVICES

Change-Id: I8cab4fe42459f8209b97f909300789e6e687b9ac
2024-04-13 22:07:19 -04:00
kjayapra-amd d52d16c8e6 SWDEV-413997 - Fixing multiple device cases.
Change-Id: I10ad3fbfca887e92cd81f68392fa1acf753cbd2b
2024-04-13 06:14:03 -04:00
kjayapra-amd 2b8634bada SWDEV-446298 - Adding error code to the logs on p2p hsa api failure.
Change-Id: Ic41b1ad1b64cca0e31986337a83a5146d52a7328
2024-04-10 06:00:00 -04:00