Commit Graph

56 Commits

Author SHA1 Message Date
German Andryeyev dceb320ba7 SWDEV-440746 - Fix a typo with GPU_PINNED_XFER_SIZE
Change-Id: I8fdbfb4e1c6b1274206c28a529eee9ebeaaa26fb
2024-10-24 18:33:14 -04:00
Saleel Kudchadker 9de6d4d46c SWDEV-478624 - Use readback workaround to ensure kernel arg coherence
Use env var DEBUG_CLR_KERNARG_HDP_FLUSH_WA=1 to fall back to the HDP flush
workaround. The default is 0.

Change-Id: I7bdb9be61da60c30d15ac9991b7cd27351e1831c
2024-09-11 14:53:15 -04:00
Ioannis Assiouras 8f42ad6aa3 SWDEV-464648 - code and comment cleanups
Change-Id: I5ba3f1bff500b3cd5903c2f441017735e688f83f
2024-06-07 22:38:09 +01:00
Ioannis Assiouras 775dc204aa SWDEV-463865 - changed device, roc and pal namespaces to be nested under amd
Change-Id: Icad342843c039c634e249a13a7aa31400730b1dd
2024-06-07 12:23:06 -04:00
German Andryeyev a4dbc97bd7 SWDEV-451594 - Correct preMI100 detection
Change-Id: I4f1570a64cebf1ff73b4d189c17b7d7db095009c
2024-05-28 06:31:10 +00:00
kjayapra-amd 931431fc38 SWDEV-417091 - Disable GWS Init for PAL/Windows side.
Change-Id: Ib6295f063daa835c1f33f21f50c083241a9026ff
2024-05-28 06:31:10 +00:00
Ioannis Assiouras 2b746de6de SWDEV-451594 - Fallback to host kernel args on older devices
On gfx8, on gfx9 devices before MI100, and on gfx10.0 or gfx10.1,
none of the memory ordering workarounds for device kernel arguments
can be applied. Use host kernel arguments on these devices.

Change-Id: I9be6fbfe4b3986eb7d9f83998334df5f03fd4124
2024-05-28 06:28:17 +00:00
Ioannis Assiouras 6cb7b6ec6b SWDEV-451594 - Change device kernel args to use HDP flush by default
The Readback and Avoid HDP Flush memory ordering workaround is
used as a fallback solution only when the HDP flush register is invalid.

Change-Id: Ic284eba1f95ed22b0270d3abeb904fb902015b1a
2024-05-02 19:35:13 +00:00
Ioannis Assiouras bf74ef4025 SWDEV-451594 - Implement Readback and Avoid HDP Flush workaround for device kernel args
Change-Id: I6d41a089a17f55306e7ff402588a1e831b20a7a7
2024-04-19 09:29:20 -04:00
German Andryeyev f0c7ecf617 SWDEV-455254 - Add kernel arg optimization
Add kernel argument optimization to the blit path.
Enabled by default on MI300.

Change-Id: I2694a81b90d48ad07d86dfe4c0c64fe187bada8e
2024-04-10 18:08:37 -04:00
Ioannis Assiouras 96f5c44851 SWDEV-451166 - Disable kernel args for non-XGMI if HDP flush register is invalid
Change-Id: I227e046e2b9cb25476a50240f5d070adbd558f21
2024-03-15 05:27:52 -04:00
Saleel Kudchadker 68f40f78dd SWDEV-443760 - Enable device kernel args for MI300
- Enable Device kernel args for MI300* for now.
- Fix a perf issue which impacts graph instantiate when dev kernel args
are enabled.

Change-Id: I962e58fd9d8dd1a8db95e601cb03a8e9c7bac97f
2024-02-28 19:10:04 -05:00
Saleel Kudchadker f138e0d113 SWDEV-443760 - Enable device kern args
- Implement a workaround to ensure HDP writes are done by writing and
reading the HDP MMIO register.
- Implement the same workaround for graphs; we no longer need the sentinel
write/readback.

Change-Id: I0d3027b46a1f61131ec62e3c8c669ff5184fa6b2
2024-02-20 02:03:14 -05:00
German 31101c6219 SWDEV-440746 - Limit WG for compute P2P
Use only 16 workgroups for compute P2P copies.
That should be enough to utilize XGMI bandwidth.

Change-Id: I60dfe019279bb95f93c8874244c1738aad1896d8
2024-01-12 14:56:29 -05:00
German Andryeyev f1dc81f427 SWDEV-432174 - Change the fillBuffer kernel
- Add the new fillBuffer kernel, which allows launching a limited
number of workgroups for memory fill operations
- Switch fill memory to 16-byte writes by default
- Allow to limit the workgroups with DEBUG_CLR_LIMIT_BLIT_WG

Change-Id: Ibad1822f2d42b2fc71bcfc1917c31409c0623e8e
2023-11-16 14:25:55 -04:00
Saleel Kudchadker 5c591b5877 SWDEV-301667 - Support device kernel args for PCIE
Change-Id: I5e51602bea5a68734227fd62e11ab68eb1ad81c1
2023-11-15 14:37:41 -05:00
kjayapra-amd 6a8bc3c718 SWDEV-419688 - Do not run GWS init kernel for targets > gfx12 and MI300.
Change-Id: I8e7441268978be71ab8a5a33e7f8bcf69660e500
(cherry picked from commit 36d37ef614909c0f215512aac0c133408d787080)
2023-10-05 14:57:56 -04:00
Sourabh Betigeri 7dc78d234d SWDEV-418855 - Limits the 'no GWS' approach to gfx940, gfx11 and gfx12
Change-Id: Iab2d34d3142902517124cec7ef3461cf7aa4b98c
2023-08-30 23:48:02 -04:00
German 077311153a SWDEV-407533 - [ABI Break] Purge unused env vars
Change-Id: I627950e8ebb6299affc602754a20d442dbe42b14
2023-08-24 14:11:40 -04:00
Maneesh Gupta 5dc104b3ea SWDEV-368235 - Revert "Remove obsolete env variables"
This reverts commit 7b50c935f8.

Reason for revert: Deferred to a future release.

Change-Id: Ia66c37f0ab9734dee73c930d10d7469d5fd57254
2023-02-15 07:25:00 +00:00
German 7b50c935f8 SWDEV-368235 - Remove obsolete env variables
Change-Id: I7e14d53297e79e2f68b3a6cc40251ad7db9eb5ab
2023-02-03 13:44:24 -05:00
Saleel Kudchadker 4f64d89026 SWDEV-371123 - Use barrier value packet for event records
Change-Id: I5e5e5e89e0d96a2430b4682d168b76848fa5b94e
2022-12-07 17:57:36 -05:00
Sourabh Betigeri 5d7f3f9f3c SWDEV-305894 - Cooperative groups grid and multi grid sync support for gfx940+
Change-Id: I35d72f1cb50c3a96eee56a612b72d641852b145f
2022-12-05 16:30:30 -05:00
German 79d12df147 SWDEV-363074 - Adjust staging copy limits in Windows
Pinned copy can cause big performance drops, because pinning is slow under Windows.
Use up to 128MB for staging transfers. Change staging buffer size to 4MB.
The Linux path should still have the old defaults.

Change-Id: I954edceb3ec89e8e670be116aa2d0a9564c8b11c
2022-11-17 14:48:16 -05:00
German Andryeyev d8e4a289b3 SWDEV-344280 - Use coarse grain sysmem for kernel arg on MI200
Change-Id: I9596f0e8b88699538ec271b3a4345e5f75b968e3
2022-06-29 13:04:46 -04:00
German Andryeyev 525a1bbf1a SWDEV-286150 - Remove GSL backend
Change-Id: Iba9a997ee7d5ff6ac00d5888ff189a4514958fe9
2022-02-09 17:16:39 -05:00
Satyanvesh Dittakavi e20dd61932 SWDEV-306939 - Fix vdi errors/warnings by CppCheck
Change-Id: I56d910f8363787f1050d5d7e8064ed553c5827fd
2022-01-12 00:22:16 -05:00
Saleel Kudchadker 3f82b99f5d SWDEV-308843 - Increase MaxPinnedXferSize to 128
This allows experimenting with env var GPU_PINNED_XFER_SIZE which is
still at a default of 32MB

Change-Id: I85ade700ed58d498eba29d1737601dc74d4c26a4
2021-12-01 20:37:56 -05:00
Saleel Kudchadker e29b9c00ee SWDEV-301667 - Kern arg placement
Add an env var ROC_USE_FGS_KERNARG to toggle kernel arg placement.
By default it's in the fine-grain kernel arg segment for supported ASICs.

Change-Id: I3d57ed69a1a4db2b392b0438ead499f3ddca4716
2021-09-02 12:36:49 -04:00
Jason Tang f165737096 SWDEV-296911 - Enable clgl interop for both MesaGL and OrcaGL
Change-Id: Ie3ad85a8335b1fc751812c09bb0cd30aad38dcae
2021-08-22 23:56:08 -07:00
agunashe d96481fb36 SWDEV-293742 - Update copyright end year VDI repo
Change-Id: I69d2fea4a7a43adf96ccea794270e4af991c5261
2021-08-22 23:56:07 -07:00
German Andryeyev ea3dba0832 SWDEV-240804 - Enable HMM build by default
Change-Id: Ia6175dff8eda8c18b7a7bb4ca87a90c1f3e4e6fb
2021-04-26 17:36:53 -04:00
Saleel Kudchadker aa38af8c96 SWDEV-276120 - Remove support for barrier sync
ROC_BARRIER_SYNC will not work with direct dispatch.
Remove and clean up.

Change-Id: I81368b2e65039477bd0343bb92708dab48867db6
2021-04-07 17:08:39 -04:00
German Andryeyev fbde61de7f SWDEV-274199 - Enable SVM tracking
ROCr/KFD doesn't validate memory pointers. Enable validation inside
ROCclr, using SVM tracking mechanism.

Change-Id: I581e32ff37187f9ed8d9a302e8fd9f6ca935bdd7
2021-03-03 13:18:56 -05:00
Jason Tang 54a7170e40 SWDEV-198364 - Only enable clgl sharing in ROCm path when building LinuxPro
Change-Id: Ie4d87e252519d090a62b930f7ebb315d3477b690
2021-02-23 14:15:04 -05:00
German Andryeyev dbc7abaecf SWDEV-257787 - Add engine tracking per signal
- The logic will trace compute, sdma read/write operations and
apply signals when necessary
- ROC_CPU_WAIT_FOR_SIGNAL, ROC_SYSTEM_SCOPE_SIGNAL
and ROC_SKIP_COPY_SYNC were added to control the tracking

Change-Id: I9e8e6174c63bf7784f7ab00964e2918c8667d364
2021-01-25 12:34:45 -05:00
Tony Tye c7e8d91e14 Update code object handling for GSL, PAL and ROCm
- Correct GSL path to report targets using the TargetID syntax.

- Correct GSL path to check compatibility of code objects when
  loading.

- Add the concept of a device ISA and create a registry used by ROCm,
  PAL and GSL.

- Support XNACK and SRAMECC target features consistently for PAL and ROCm.

- Correct logic for NullDevices and asserts to avoid memory corruption.

- Allow all NullDevices to be created for HIP.

- Numerous other code improvements.

Change-Id: I40abf3d2b22249c1492d1af5919665f8184f4e0e
2021-01-14 11:11:51 -05:00
German Andryeyev 4af8b53846 Enable GPU memory in HMM by default
Change-Id: Ifec4733dc7a932163d921ebe1ae9fbd594ea1ef2
2020-11-30 12:39:18 -05:00
Jason Tang 25cc965c76 Change file mode 755 back to 644
Change-Id: I4ba5d66997ffd3331c56674d4bf805160dcdf049
2020-10-19 15:09:32 -04:00
German Andryeyev d9397590de Add option to skip AQL barrier
The change reuses HSA signals for dispatches as a wait signal.
Skipping the barrier requires disabling the L2 cache for sysmem
allocations and extra tracking for HDP access with the large BAR.
ROC_BARRIER_SYNC=0 activates the new logic. Barrier sync is
still used by default.
ROC_ACTIVE_WAIT=1 enables unconditional active wait in ROCr.
The change also consolidated the ROCr wait logic under a single function.

Change-Id: I6bd1be30aa88258da1b1f9de319ef5a45852afd8
2020-10-06 08:37:12 -04:00
kjayapra-amd 18352d189b SWDEV-253063 - Code changes to apply the Image Buffer Workaround only to gfx10.1 targets
Change-Id: I17044a1c0775f427b9ba712eb3fd5ab21ed88b0e
2020-09-23 11:07:15 -04:00
Jason Tang db5a2d4c2d SWDEV-239502 - Fix image test regression
Change-Id: Iea35fb0f1964d09a35131b4a20ac8f6f82850a8e
2020-08-13 11:58:20 -04:00
German Andryeyev 6e69258b69 Enable prefetch async functionality
Fix a typo in the define name, due to which compilation wasn't enabled.
Force CPU prefetch if system was forced at runtime.

Change-Id: Id4b578f9fa44a45426fdb5d8ecb1da803aa42313
2020-08-13 11:09:10 -04:00
German Andryeyev 059832b526 Always return true for P2P validation under ROCr
Change-Id: Id32a5a94a642e708d1d042c5247af38501bec153
2020-07-04 11:38:04 -04:00
German Andryeyev c5afd5d412 Initial HMM support
- Expose ROCclr interfaces for HIP usage
- ROCr interfaces aren't available in staging, thus control the
build with AMD_HMM_SUPPORT define

Change-Id: Iadc2bcc230e78d3b0dc22b235189c8cc80843446
2020-06-12 09:06:07 -04:00
German Andryeyev fb401bfe6d Revert "Revert "Reenable cooperative groups""
This reverts commit abc115bda8.

Reason for revert: <INSERT REASONING HERE>

Change-Id: I93c45fae27e0a08b199542d44fb0d65fc74ea13c
2020-05-25 14:11:58 -04:00
Aakash Sudhanwa abc115bda8 Revert "Reenable cooperative groups"
This reverts commit 82dc1a6343.

Reason for revert: <INSERT REASONING HERE>

Change-Id: I8954b37c354382804a139d80e2551c381fd9b2ed
2020-05-19 18:21:48 -04:00
German Andryeyev 82dc1a6343 Reenable cooperative groups
Change-Id: Ia43049ef550bffa6d21704dbd306ddb9c1d56af0
2020-05-15 12:41:12 -04:00
German Andryeyev d2b9a57c4f Disable cooperative groups support
Change-Id: I1b526f2228d083ecad7907a6eaf37c1dd4428277
2020-05-12 14:31:10 -04:00
Jason Tang b4f1239f34 device/rocm: split gfxVersion to major/minor/stepping
Change-Id: I1e437eaee30794147713d9516229211670f01d90
2020-05-12 12:17:13 -04:00
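
For reference, the device rule stated in commit 2b746de6de (host kernel arguments on gfx8, on gfx9 before MI100, and on gfx10.0 or gfx10.1) can be sketched as below. This is only an illustration and not the ROCclr code: GfxVersion and needs_host_kernel_args are hypothetical names, the major/minor/stepping triple follows the split mentioned in b4f1239f34, and the sketch assumes MI100 corresponds to gfx908.

```python
from typing import NamedTuple


class GfxVersion(NamedTuple):
    """Hypothetical (major, minor, stepping) triple, e.g. gfx908 -> (9, 0, 8)."""
    major: int
    minor: int
    stepping: int


# Assumption: MI100 is gfx908, so "gfx9 before MI100" means any gfx9
# version that compares lower than (9, 0, 8).
MI100 = GfxVersion(9, 0, 8)


def needs_host_kernel_args(version: GfxVersion) -> bool:
    """Return True if the device should fall back to host kernel arguments.

    Mirrors the rule in the 2b746de6de commit message: gfx8, gfx9 devices
    before MI100, and gfx10.0 or gfx10.1.
    """
    if version.major == 8:
        return True
    if version.major == 9:
        # NamedTuple instances compare field by field, like plain tuples.
        return version < MI100
    if version.major == 10:
        return version.minor in (0, 1)
    return False
```

Under these assumptions, GfxVersion(9, 0, 6) falls back to host kernel arguments, while GfxVersion(9, 0, 8) and anything newer in the gfx9 family do not.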