Commit Graph

273 Commits

Author SHA1 Message Date
Anusha GodavarthySurya b498103f9b SWDEV-485904 - propagate hsa_amd_vmem_address_free error to hip API
Unit_hipMemSetAccess_GrowVMM test fails with
HSA_STATUS_ERROR_RESOURCE_FREE silently

Change-Id: I7a78410e432de4a2e877062782abf8761645f392
2024-10-21 10:12:32 -04:00
Saleel Kudchadker e36666e536 SWDEV-301667 - Enable ROCr logging
- Use AMD_LOG_LEVEL=5 to dump AQL packets in ROCr

Change-Id: I2c044a5304c4eaf3d3af20e62d1f54c98d4fbaa4
2024-10-04 19:22:12 -04:00
German Andryeyev 29cc678d8d SWDEV-483586 - Unblock staging H2D transfers
Although unpinned copies require synchronization
in HIP, the runtime can avoid syncs for H2D copies with
a staging buffer.

Change-Id: If2203c6bc0cbd89742823688dc8e89e9acd873b2
2024-09-21 10:25:27 -04:00
Maneesh Gupta 2d1c6ee23e SWDEV-485179 - Revert "SWDEV-459254 - Overwrite cacheline size to 256 for gfx12, as it is used for kernarg alignment."
This reverts commit 1f63650bf96e01e48f879aa58b80e2130dd4a567.

Reason for revert: <INSERT REASONING HERE>

Change-Id: I6d7ed87c09d9b77116548dce1f30ac4711c2c09d
2024-09-20 11:33:34 -04:00
kjayapra-amd 12a39fbf22 SWDEV-480772 - Remove name variable from amd::Monitor class.
Change-Id: Ie2a4fa44f485786227230f8a892e090e718aa30e
2024-09-19 11:55:01 -04:00
kjayapra-amd 6211037f63 SWDEV-439234 - Access check before memcpy and kernel operations.
Change-Id: I7057125c03460db205409e19980145298c190fe2
2024-09-06 14:30:00 -04:00
Rahul Manocha ddbd7039b0 SWDEV-478921 - Destroy Queue created by Coop Launch
Change-Id: I7f31ce05421479ff1de138cae26aafa071e956e2
2024-09-02 02:35:08 -04:00
kjayapra-amd e7a7feb273 SWDEV-464828 - Initial implementation of VMM IPC on PAL/Windows.
Change-Id: I3d5e148fad9105704db6724b00df06bef4fc9d2f
2024-07-16 10:38:35 -04:00
Satyanvesh Dittakavi 191869b252 SWDEV-471935 - Destroy hsa queues with cumask set
Fixes the memory leak with the hipExtStreamCreateWithCUMask API.
HSA queues with a CU mask set are not reused; a new queue is created
every time the API is called. These queues were not being
destroyed during hipStreamDestroy, causing a memory leak.

Change-Id: Ibfbe019bbd73604e98eca80461efe53fa64bb701
2024-07-16 10:02:42 -04:00
Julia Jiang dd30e0e893 SWDEV-472710 - Adding gitattributes and remove trailing spaces
Change-Id: Ic8ad2071745f0ffe6a2e120bfebb6d90bf270f87
2024-07-15 12:39:56 -04:00
Ioannis Assiouras 0053584aac SWDEV-472309 - Check if vmm support exists before enabling vm in mempool
Change-Id: I6ae2fb18a306595e0f3a56e144658a4a720e7a37
2024-07-12 10:11:03 -04:00
taosang2 544c45364f SWDEV-467540 - Fix reference of freed locks
1. Move global amd::monitor listenerLock before global
class runtime_tear_down, as it will be referenced in
~RuntimeTearDown() after main(). It should be freed
later than runtime_tear_down.

2. Update Device::~Device() to SVM free coopHostcallBuffer_
before context_ is released and freed.

Change-Id: I1d21378ff463477d3238d71e5e2a1a7d6b9147ad
2024-06-18 13:58:36 -04:00
Anusha GodavarthySurya 57156c524d SWDEV-467102 - Hidden heap init for graph capture
If the graph has kernels that do device-side allocation, the heap is allocated during
packet capture, because the heap pointer has to be added to the AQL packet, and is
initialized during graph launch.

Handle race with wait when 2 kernels with device heap are enqueued on multiple streams.

Change-Id: I45933b77fcaf7bc8fdf1bc906462e32b5d8d3688
2024-06-17 02:07:25 -04:00
Satyanvesh Dittakavi 1815fc808d SWDEV-464927 - Update the Get by PCI BusId logic and Hop count
- Update the intra-socket weight for partitions within a single socket, as
it was changed to 13 by the driver.
- Use the PCIe function to distinguish the partitions of the same device,
such as in TPX mode on gfx942.

Change-Id: I8e64023d44e37c2dbb105cbb343441a48021ba7b
2024-06-10 04:46:50 -04:00
Ioannis Assiouras 8f42ad6aa3 SWDEV-464648 - code and comment cleanups
Change-Id: I5ba3f1bff500b3cd5903c2f441017735e688f83f
2024-06-07 22:38:09 +01:00
Ioannis Assiouras 775dc204aa SWDEV-463865 - changed device,roc and pal namespaces to be nested under amd
Change-Id: Icad342843c039c634e249a13a7aa31400730b1dd
2024-06-07 12:23:06 -04:00
kjayapra-amd 1590b39f9e SWDEV-464455 - Init Segment flags and check for valid segment before passing to hsa APIs for allocation.
Change-Id: Ibe640093acdb7856115b6a4109bcf010adf20353
2024-06-07 10:40:57 -04:00
Ioannis Assiouras b8c2ac4de4 SWDEV-463865 - symbol renamings to prevent conflicts in static build
Change-Id: Id7fbb638c1088c23df52fee877cd790d637b1ffb
2024-06-06 04:05:55 -04:00
Lang Yu a0127c9eea SWDEV-461525 - Add vgprAllocGranularity_ and vgprsPerSimd_ for gfx1150/1
These were missing for gfx1150/1.

Change-Id: I03d997e451d15a01a961e6597f805f634e5c3ae7
Signed-off-by: Lang Yu <lang.yu@amd.com>
2024-05-31 21:53:25 -04:00
Alex Xie 80011685b2 SWDEV-462635 - 256 byte image memory alignment
Change-Id: I1d21368ff460477d3238d71e4e2a0a7d6b9167ac
2024-05-29 10:37:27 -04:00
Ajay 6ec5074d74 SWDEV-439581 - hip event flags clean up
Change-Id: I2197762d912da41a8b53b32b3446f0a958c988a6
2024-05-28 06:31:10 +00:00
Ajay a5a4b78606 SWDEV-439581 - hipEventBlockingSync flag for hip events
Change-Id: I0d7785a568f8007f82f999776a7ad23d0acc81b7
2024-05-28 06:31:10 +00:00
Vladana Stojiljkovic fdaa7141af SWDEV-452364 - Check if no GPUs are available when hsa_init fails
* When no GPUs are available, hsa_init fails with HSA_STATUS_ERROR_OUT_OF_RESOURCES, and device and runtime initialization fails. For the NoGpu tests to pass, true needs to be returned, which causes HIP_INIT_API to return the proper error, hipErrorNoDevice, instead of hipErrorInvalidDevice.

Change-Id: I982d4416c92ed1b36893354d8b10d73df34f2478
2024-05-28 06:31:10 +00:00
kjayapra-amd dd1dd86fd7 SWDEV-459254 - Overwrite cacheline size to 256 for gfx12, as it is used for kernarg alignment.
Change-Id: Ia6acf312ee84f6dde1c830fc21f10d3a8a9de5ee
2024-05-28 06:28:17 +00:00
Jaydeep Patel 1d48f2a1ab SWDEV-456279 - Adding new hip flag to access contiguous memory and pass the flag to HSA API.
Change-Id: I1bafeaa3096395c729723af958d609bc41e7845c
2024-04-30 05:25:38 -04:00
Ioannis Assiouras bf74ef4025 SWDEV-451594 - Implement Readback and Avoid HDP Flush workaround for device kernel args
Change-Id: I6d41a089a17f55306e7ff402588a1e831b20a7a7
2024-04-19 09:29:20 -04:00
kjayapra-amd 56ebf5157a SWDEV-413997 - VMM IPC implementation for Linux.
Change-Id: Icfeb83ca51e96be35abb67a94d6e3e1a1ca5a934
2024-04-18 11:28:13 -04:00
German Andryeyev c95a75a2bf SWDEV-444670 - Enable teardown class
Force implicit runtime teardown with a global destructor.

Change-Id: Iabe63dedf5b94fefc98668585c45a61607120669
2024-04-16 12:00:06 -04:00
Rakesh Roy 52db98edd9 SWDEV-453180 - Add UUID support for HIP_VISIBLE_DEVICES on Linux
- The UUID is an ASCII string with a maximum of 21 chars that uniquely identifies a GPU
- Convert a UUID set in HIP_VISIBLE_DEVICES to a device index internally
- Then use the existing device index logic for HIP_VISIBLE_DEVICES

Change-Id: I8cab4fe42459f8209b97f909300789e6e687b9ac
2024-04-13 22:07:19 -04:00
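The UUID-to-index conversion described in this commit can be pictured with a small sketch. It is not the runtime's code: `resolveToken`, `resolveVisibleDevices`, the `uuids` table, and the policy of stopping at the first unrecognized entry are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// Hypothetical helper: resolve one HIP_VISIBLE_DEVICES entry to a device index.
// uuids[i] is assumed to hold the UUID string of physical device i.
std::optional<std::size_t> resolveToken(const std::string& token,
                                        const std::vector<std::string>& uuids) {
  // UUID path: translate a matching UUID to its device index.
  for (std::size_t i = 0; i < uuids.size(); ++i) {
    if (uuids[i] == token) return i;
  }
  // Index path: accept only a short, purely decimal token that is in range.
  if (token.empty() || token.size() > 9 ||
      token.find_first_not_of("0123456789") != std::string::npos) {
    return std::nullopt;
  }
  const std::size_t index = std::stoul(token);
  if (index >= uuids.size()) return std::nullopt;
  return index;
}

// Hypothetical helper: turn the comma-separated value into device indices, so
// that the existing index-based logic can take over afterwards.
std::vector<std::size_t> resolveVisibleDevices(const std::string& value,
                                               const std::vector<std::string>& uuids) {
  std::vector<std::size_t> indices;
  std::size_t begin = 0;
  while (begin <= value.size()) {
    std::size_t end = value.find(',', begin);
    if (end == std::string::npos) end = value.size();
    const auto index = resolveToken(value.substr(begin, end - begin), uuids);
    if (!index) break;  // assumed policy: stop at the first unrecognized entry
    indices.push_back(*index);
    begin = end + 1;
  }
  return indices;
}
```

With a made-up table such as `{"GPU-aaaa", "GPU-bbbb", "GPU-cccc"}`, the value `GPU-cccc,0` would resolve to indices 2 and 0, after which nothing downstream needs to know that a UUID was used.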
kjayapra-amd d52d16c8e6 SWDEV-413997 - Fixing multiple device cases.
Change-Id: I10ad3fbfca887e92cd81f68392fa1acf753cbd2b
2024-04-13 06:14:03 -04:00
kjayapra-amd 2b8634bada SWDEV-446298 - Adding error code to the logs on p2p hsa api failure.
Change-Id: Ic41b1ad1b64cca0e31986337a83a5146d52a7328
2024-04-10 06:00:00 -04:00
Saleel Kudchadker 3f0bcf7834 SWDEV-301667 - Fix SDMA mask reuse
If we are using the mask returned by getLastUsedSdmaEngine() then we
need to apply the SDMA Read/Write mask to it before using with HSA
copy_on_engine API.

Change-Id: I6e5dc6c187eeb3c61ee159e9d2a0fa7b4737c06e
2024-04-08 15:42:52 -04:00
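The masking step this commit describes amounts to a bitwise filter. The sketch below is illustrative only: `selectSdmaEngine` and its parameters are hypothetical names, and reducing the result to a single engine bit is an assumption about what the copy call expects.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the fix: combine the cached "last used" engine
// mask with the read or write mask permitted for the current copy direction.
// Returns a single engine bit, or 0 when no permitted engine remains and the
// caller should fall back to letting the runtime choose an engine.
std::uint32_t selectSdmaEngine(std::uint32_t lastUsedMask, std::uint32_t allowedMask) {
  // Keep only the engines that are valid for this direction.
  const std::uint32_t masked = lastUsedMask & allowedMask;
  // Isolate the lowest set bit so that exactly one engine is selected.
  return masked & (~masked + 1u);
}
```

Without the filter, a cached engine that is valid for reads could be handed to a write copy (or the reverse), which is the reuse problem the commit title refers to.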
Sourabh Betigeri dbac2976e4 SWDEV-451964 - Limit gpu single allocation percentage for gfx940 only
Change-Id: Iadcdadd734e7aeeb23742e426353defa972d3ad5
2024-04-05 09:43:42 -04:00
kjayapra-amd 5cbd74b554 SWDEV-413997 - Save hsa_handle as ptr in hipMemCreate path.
Change-Id: Ica32017ef7b00326dfb6d1f604e126d40ad5b786
2024-03-26 10:24:29 -04:00
Ioannis Assiouras 96f5c44851 SWDEV-451166 - Disable kernel args for non-XGMI if HDP flush register is invalid
Change-Id: I227e046e2b9cb25476a50240f5d070adbd558f21
2024-03-15 05:27:52 -04:00
kjayapra-amd f5ca620baa SWDEV-423835 - Fixing kernel launch issues on Virtual Memory Management path.
Change-Id: I9f5e8a3d83af3809b2c50b21a10697e26113dd23
2024-03-12 17:22:07 -04:00
Saleel Kudchadker 984c86f407 SWDEV-301667 - Better log
- Print SWq for AQL packets; this helps correlate a stream with the
mapped HWq

Change-Id: I610430c0872a1abc6636027c00163ec46983cd65
2024-03-01 16:43:06 -05:00
Ioannis Assiouras 1f6d416684 SWDEV-446399 - Fixed segfault in hipMemSetAccess
Change-Id: Ia1200d9bee03e8abade211287505f081e635ceec
2024-02-20 18:51:05 -05:00
kjayapra-amd 7d5b4a8f7a SWDEV-437832 - Changes to update host unified memory and iommuv2 flags.
Change-Id: I88998cf57c21fc446fa28e250f826c607923670b
2024-02-07 06:27:47 -05:00
Saleel Kudchadker 0567c3b720 SWDEV-301667 - Better log
Display the queue base pointer in the log. This can be correlated with AQL
packets.

Change-Id: I544f9b6db6ae01c85e57e4b3f0b3fffefcd7c2ed
2024-02-05 05:08:11 +00:00
kjayapra-amd b366a7c992 SWDEV-437832 - Adding device property to check if the device is accelerator.
Change-Id: I8349e99c03422c268bbb60a8c143bd492d9cec09
2024-02-05 05:08:11 +00:00
Satyanvesh Dittakavi 755eb2962c SWDEV-434846 - Limit the gpu single allocation percentage for all MI300 versions
Change-Id: I33dea3eaab249ce3f9a624d38267489f99cd530c
2024-01-03 23:47:44 -05:00
German Andryeyev fb3dfcf889 SWDEV-436859 - Enable pitch for COPY_HOST_PTR
The original logic didn't use pitch because the abstraction layer had
a sysmem copy without pitch. Since the extra sysmem copy was
disabled, the code has to accept pitch values from the app.

Change-Id: Ia9fba7b33ddff4e9109b4e63d0d6afa52f501c8f
2023-12-13 16:50:16 -05:00
Satyanvesh Dittakavi b2102fe939 SWDEV-434846 - Correct the vgprs per simd for MI300
Change-Id: Id4862da7611f64392bfc1538fb644801ec0a9e7f
2023-12-13 03:06:21 -05:00
German Andryeyev f1dc81f427 SWDEV-432174 - Change the fillBuffer kernel
- Add the new fillBuffer kernel, which allows launching a limited
number of workgroups for memory fill operations
- Switch fill memory to 16-byte writes by default
- Allow limiting the workgroups with DEBUG_CLR_LIMIT_BLIT_WG

Change-Id: Ibad1822f2d42b2fc71bcfc1917c31409c0623e8e
2023-11-16 14:25:55 -04:00
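One common way to fill a buffer with a capped number of workgroups is a strided loop, where each work-item handles several 16-byte elements. The CPU-side simulation below sketches that idea only; it is not the kernel from this commit, and `Chunk16`, `fillBufferLimited`, and `maxWorkgroups` (standing in for whatever limit DEBUG_CLR_LIMIT_BLIT_WG supplies) are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// One 16-byte element, matching the 16-byte write granularity.
struct Chunk16 {
  std::uint32_t words[4];
};

// Simulates a fill with a capped workgroup count. Each simulated work-item
// starts at its own index and strides by the total number of work-items, so
// the whole buffer is covered even when fewer workgroups are launched than
// there are elements. Returns the number of workgroups "launched".
std::size_t fillBufferLimited(std::vector<Chunk16>& buffer, std::uint32_t pattern,
                              std::size_t workgroupSize, std::size_t maxWorkgroups) {
  assert(workgroupSize > 0 && maxWorkgroups > 0);
  // Workgroups needed for one element per work-item, then capped.
  const std::size_t needed = (buffer.size() + workgroupSize - 1) / workgroupSize;
  const std::size_t workgroups = std::min(needed, maxWorkgroups);
  const std::size_t totalItems = workgroups * workgroupSize;
  for (std::size_t item = 0; item < totalItems; ++item) {
    for (std::size_t i = item; i < buffer.size(); i += totalItems) {
      buffer[i] = Chunk16{{pattern, pattern, pattern, pattern}};
    }
  }
  return workgroups;
}
```

Capping the workgroup count trades peak fill throughput for leaving compute units free for other work, which is presumably why the limit is exposed as a debug setting.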
Jatin Chaudhary ce27581465 SWDEV-431399 - fix first set of memory leaks in clr, found in rtc tests
change constexpr variable names to match the C++ style we follow.

Change-Id: Ibc59a65d8ff2ca765da7bf5e653c0650fb3714c4
2023-11-14 20:39:45 -05:00
Saleel Kudchadker f06368fd04 SWDEV-301667 - Add error logging
Change-Id: I814399dc0e7083bb7fb0ed8bf46dd96bdf664965
2023-11-10 11:55:54 -05:00
Alex Xie 4fb9f03f9e SWDEV-430062 - Support GPU_MAX_HEAP_SIZE flag in ROCm
Change-Id: Ibfe82b3524e09c61879b988f23512f394d725024
2023-11-07 10:07:24 -05:00
German a49d633883 SWDEV-429529 - Allocate glb_ctx_ even for one device
Move context allocation into Device::init() method to simplify the logic and handle
HIP_VISIBLE_DEVICES properly

Change-Id: I0fc6f37c7ae39bedbdad0290295d6794c66d6c54
2023-10-27 15:00:15 -04:00
Saleel Kudchadker 5662d4037c SWDEV-408180 - Address possible corner cases
- Address corner cases that can arise with the new
hipMemcpyDeviceToDeviceNoCU enum
- Better log

Change-Id: I6035b901f8d616741054b7a5ff4f67956329ac57
2023-10-23 16:54:08 -04:00