rocm-systems

Author	SHA1	Message	Date
SaleelK	340f3aa887	clr: Implement dynamic stream to HWq logic (#1958 ) * clr: Implement dynamic stream to HW queue assignment This change implements dynamic stream to hardware queue (HWq) mapping with the following features: * Queue depth heuristics with weights for optimal HWq assignment * Make last used queue sticky for better locality * Use pipe HWq to pipe mapping - gfx9 follows a round-robin queue to pipe mapping based on creation order (single process per device only, as pipe ID is statically assigned by runtime) * More aggressive heuristic usage for better queue distribution * Extend dynamic queues support for all stream priorities Environment variables: * DEBUG_HIP_DYNAMIC_QUEUE: 0 - disabled, 1 - Depth heuristics 2 - Depth+Pipe heuristics * DEBUG_HIP_IGNORE_STREAM_PRIORITY=1: ignore priority stream creation * clr: Clean up last_used_queue_	2026-01-23 10:40:54 -08:00
Tao Sang	163e44d0a8	SWDEV-555889 - Support mipmap on rocr (#2082 ) * SWDEV-555889 - Support mipmap on rocr Support mipmap in hip-rt on rocr backend. Enable all mipmap tests in Windows. Some other minor improvements. Add some SRD logs that will be removed finally. * Add sampler.mipFilter to fix sampler issues on mipmap in rocr. Fix format issues of view of leveled image and mipmap image in blit kernel in rocr. Enabled disabled mipmap tests. * Rewrite view logic * Set word4.f.PITCH = 0 for mipmap SRD on navi31 to fix unstable test issues. Reset last error in negative tests. * Remove SRD dump log from hip-rt Let Rocr mipmap log be in condition. * minor format change * Exclude mipmap tests for mi200+ which don't support mipmap.	2026-01-21 09:10:29 -08:00
Jin Jung	deaf8ab38a	SWDEV-567119 - Windows GL Interop Support (#1892 )	2025-12-08 11:03:59 -05:00
Pengda Xie	a4bbd73dc6	SWDEV-556684 - Remove HSAIL support (#1183 )	2025-10-23 11:21:49 -07:00
Ajay GunaShekar	f2ad8d6d5e	SWDEV-553099 - remove WITHOUT_HSA_BACKEND usage (#831 )	2025-09-03 08:40:25 -07:00
Danylo Lytovchenko	2ff2316227	Adjust clang format to the new versions, revert broken macro layout (#714 )	2025-08-22 17:23:22 +02:00
Danylo Lytovchenko	f7338717ae	SWDEV-470698 - fix formatting, add format check workflow (#657 )	2025-08-20 19:58:06 +05:30
Andryeyev, German	6df9a49437	SWDEV-465041 - Add support for user events with DD (#321 ) * SWDEV-465041 - Add support for user events with DD User events can be replaced with HSA signals. Add the interface to allocate HSA signal for user events and update the status on CL_COMPLETE. Force pinned path with DD to avoid blocking calls. Pinned memory can be released only when the command is complete. Simplify device enqueue path to use generic kernel arg buffer and signals * Fix notifyCmdQueue() logic for OCL * Avoid blocking calls in OCL with DD * Add event destruction in case of failure. [ROCm/clr commit: `2305f8ae56`]	2025-08-12 19:04:36 -04:00
Xie, Jiabao(Jimbo)	e1d2194b75	SWDEV-528913 - support gfx950 in rocsetting (#217 ) * SWDEV-528913 - support gfx950 in rocsetting --------- Co-authored-by: Jimbo Xie <jiabaxie@amd.com> [ROCm/clr commit: `a320a3f214`]	2025-05-07 15:44:49 -04:00
Andryeyev, German	5c7c86f66d	SWDEV-517481 - Add dynamic queue management (#37 ) Enabled by default. DEBUG_HIP_DYNAMIC_QUEUES controls the feature [ROCm/clr commit: `28967982b2`]	2025-03-19 11:22:50 -04:00
Saleel Kudchadker	d0a7ae02cf	SWDEV-513197 - Unify getBuffer implementation - Use getBuffer/releaseBuffer in BlitManager - Cleanup XferBuffer as we use ManagedBuffer for both reads/writes Change-Id: I2661b85dd012763b17a38a743fec1b1d79125f67 [ROCm/clr commit: `37d606d193`]	2025-02-28 12:47:51 -05:00
Rahul Manocha	90337103ac	SWDEV-510849 - Restore pinned memory copy path 1) Create getBuffer method to return pinned host memory or staging buffer 2) for D2H path use managed buffer instead of static buffer 3) use staging buffer copy for 16KB < size < 1MB 4) use pinned memory copy for size > 1MB Change-Id: I13d4d6ab60691bc6c7724239db1e11e23f0f3dc2 [ROCm/clr commit: `4bf634dfca`]	2025-02-26 11:25:02 -05:00
taosang2	40df900647	SWDEV-501963 - Add missing codes for gfx950 Cherry-pick https://gerrit-git.amd.com/c/compute/ec/clr/+/1162997 Change-Id: I6b3c6bf55c61cffd43cd6f17b75998f751b75723 [ROCm/clr commit: `32daa8f384`]	2025-01-31 14:34:49 -05:00
German Andryeyev	584c9c1ee1	SWDEV-440746 - Fix a typo with GPU_PINNED_XFER_SIZE Change-Id: I8fdbfb4e1c6b1274206c28a529eee9ebeaaa26fb [ROCm/clr commit: `dceb320ba7`]	2024-10-24 18:33:14 -04:00
Saleel Kudchadker	343bdf3187	SWDEV-478624 - Use readback workaround to ensure kernel arg coherence Use env var DEBUG_CLR_KERNARG_HDP_FLUSH_WA=1 to fall back to HDP flush workaround. The default is 0 Change-Id: I7bdb9be61da60c30d15ac9991b7cd27351e1831c [ROCm/clr commit: `9de6d4d46c`]	2024-09-11 14:53:15 -04:00
Ioannis Assiouras	75104df3b2	SWDEV-464648 - code and comment cleanups Change-Id: I5ba3f1bff500b3cd5903c2f441017735e688f83f [ROCm/clr commit: `8f42ad6aa3`]	2024-06-07 22:38:09 +01:00
Ioannis Assiouras	407d1346f2	SWDEV-463865 - changed device,roc and pal namespaces to be nested under amd Change-Id: Icad342843c039c634e249a13a7aa31400730b1dd [ROCm/clr commit: `775dc204aa`]	2024-06-07 12:23:06 -04:00
German Andryeyev	ad24101e5e	SWDEV-451594 - Correct preMI100 detection Change-Id: I4f1570a64cebf1ff73b4d189c17b7d7db095009c [ROCm/clr commit: `a4dbc97bd7`]	2024-05-28 06:31:10 +00:00
kjayapra-amd	27bc1632f1	SWDEV-417091 - Disable GWS Init for PAL/Windows side. Change-Id: Ib6295f063daa835c1f33f21f50c083241a9026ff [ROCm/clr commit: `931431fc38`]	2024-05-28 06:31:10 +00:00
Ioannis Assiouras	6a0f554fa6	SWDEV-451594 - Fallback to host kernel args on older devices On gfx8, gfx9 devices before MI100 and gfx10.0 or gfx10.1 none of the memory ordering workarounds for device kernel arguments can be applied. Use host kernel arguments on these devices. Change-Id: I9be6fbfe4b3986eb7d9f83998334df5f03fd4124 [ROCm/clr commit: `2b746de6de`]	2024-05-28 06:28:17 +00:00
Ioannis Assiouras	a21913a0bd	SWDEV-451594 - Change device kernel args to use HDP flush by default The Readback and Avoid HDP Flush memory ordering workaround is used as a fallback solution only when HDP flush register is invalid Change-Id: Ic284eba1f95ed22b0270d3abeb904fb902015b1a [ROCm/clr commit: `6cb7b6ec6b`]	2024-05-02 19:35:13 +00:00
Ioannis Assiouras	2f430138c5	SWDEV-451594 - Implement Readback and Avoid HDP Flush workaround for device kernel args Change-Id: I6d41a089a17f55306e7ff402588a1e831b20a7a7 [ROCm/clr commit: `bf74ef4025`]	2024-04-19 09:29:20 -04:00
German Andryeyev	f29d608ca3	SWDEV-455254 - Add kernel arg optimization Add kernel arguments optimization into blit path. Enabled by default on MI300. Change-Id: I2694a81b90d48ad07d86dfe4c0c64fe187bada8e [ROCm/clr commit: `f0c7ecf617`]	2024-04-10 18:08:37 -04:00
Ioannis Assiouras	b46d3c0f8d	SWDEV-451166 - Disable kernel args for non-XGMI if HDP flush register is invalid Change-Id: I227e046e2b9cb25476a50240f5d070adbd558f21 [ROCm/clr commit: `96f5c44851`]	2024-03-15 05:27:52 -04:00
Saleel Kudchadker	ce7b62d15c	SWDEV-443760 - Enable device kernel args for MI300 - Enable Device kernel args for MI300* for now. - Fix a perf issue which impacts graph instantiate when dev kernel args are enabled. Change-Id: I962e58fd9d8dd1a8db95e601cb03a8e9c7bac97f [ROCm/clr commit: `68f40f78dd`]	2024-02-28 19:10:04 -05:00
Saleel Kudchadker	ec59b1bc3e	SWDEV-443760 - Enable device kern args - Implement workaround to ensure HDP writes are done by writing and reading the HDP MMIO register. - Implement the same workaround for graphs, we no longer need sentinel write/readback Change-Id: I0d3027b46a1f61131ec62e3c8c669ff5184fa6b2 [ROCm/clr commit: `f138e0d113`]	2024-02-20 02:03:14 -05:00
German	339523c475	SWDEV-440746 - Limit WG for compute P2P Use only 16 workgroups for compute P2P copies. That should be enough to utilize XGMI bandwidth. Change-Id: I60dfe019279bb95f93c8874244c1738aad1896d8 [ROCm/clr commit: `31101c6219`]	2024-01-12 14:56:29 -05:00
German Andryeyev	e390ec044f	SWDEV-432174 - Change the fillBuffer kernel - Add the new fillBuffer kernel, which allows to launch a limited number of workgroups for memory fill operation - Switch fill memory to 16 bytes write by default - Allow to limit the workgroups with DEBUG_CLR_LIMIT_BLIT_WG Change-Id: Ibad1822f2d42b2fc71bcfc1917c31409c0623e8e [ROCm/clr commit: `f1dc81f427`]	2023-11-16 14:25:55 -04:00
Saleel Kudchadker	153bb15f46	SWDEV-301667 - Support device kernel args for PCIE Change-Id: I5e51602bea5a68734227fd62e11ab68eb1ad81c1 [ROCm/clr commit: `5c591b5877`]	2023-11-15 14:37:41 -05:00
kjayapra-amd	96580585c3	SWDEV-419688 - Do not run GWS init kernel for targets > gfx12 and MI300. Change-Id: I8e7441268978be71ab8a5a33e7f8bcf69660e500 (cherry picked from commit 36d37ef614909c0f215512aac0c133408d787080) [ROCm/clr commit: `6a8bc3c718`]	2023-10-05 14:57:56 -04:00
Sourabh Betigeri	22f367a172	SWDEV-418855 - Limits the 'no GWS' approach to gfx940, gfx11 and gfx12 Change-Id: Iab2d34d3142902517124cec7ef3461cf7aa4b98c [ROCm/clr commit: `7dc78d234d`]	2023-08-30 23:48:02 -04:00
German	3f4bbcfdba	SWDEV-407533 - [ABI Break]Purge unused env vars Change-Id: I627950e8ebb6299affc602754a20d442dbe42b14 [ROCm/clr commit: `077311153a`]	2023-08-24 14:11:40 -04:00
Maneesh Gupta	d7fdd9fcb8	SWDEV-368235 - Revert "Remove obsolete env variables" This reverts commit `dfa7790030`. Reason for revert: Deferred to a future release. Change-Id: Ia66c37f0ab9734dee73c930d10d7469d5fd57254 [ROCm/clr commit: `5dc104b3ea`]	2023-02-15 07:25:00 +00:00
German	dfa7790030	SWDEV-368235 - Remove obsolete env variables Change-Id: I7e14d53297e79e2f68b3a6cc40251ad7db9eb5ab [ROCm/clr commit: `7b50c935f8`]	2023-02-03 13:44:24 -05:00
Saleel Kudchadker	7ba49616e9	SWDEV-371123 - Use barrier value packet for event records Change-Id: I5e5e5e89e0d96a2430b4682d168b76848fa5b94e [ROCm/clr commit: `4f64d89026`]	2022-12-07 17:57:36 -05:00
Sourabh Betigeri	7aa958a8f7	SWDEV-305894 - Cooperative groups grid and multi grid sync support for gfx940+ Change-Id: I35d72f1cb50c3a96eee56a612b72d641852b145f [ROCm/clr commit: `5d7f3f9f3c`]	2022-12-05 16:30:30 -05:00
German	4b6a6ba8e8	SWDEV-363074 - Adjust staging copy limits in Windows Pinned copy can cause big performance drops because of slow pinning under Windows. Use up to 128MB for staging transfers. Change staging buffer size to 4MB. Linux path should still have the old defaults. Change-Id: I954edceb3ec89e8e670be116aa2d0a9564c8b11c [ROCm/clr commit: `79d12df147`]	2022-11-17 14:48:16 -05:00
German Andryeyev	34ed734a66	SWDEV-344280 - Use coarse grain sysmem for kernel arg on MI200 Change-Id: I9596f0e8b88699538ec271b3a4345e5f75b968e3 [ROCm/clr commit: `d8e4a289b3`]	2022-06-29 13:04:46 -04:00
German Andryeyev	3c4f97f66c	SWDEV-286150 - Remove GSL backend Change-Id: Iba9a997ee7d5ff6ac00d5888ff189a4514958fe9 [ROCm/clr commit: `525a1bbf1a`]	2022-02-09 17:16:39 -05:00
Satyanvesh Dittakavi	85c2cac111	SWDEV-306939 - Fix vdi errors/warnings by CppCheck Change-Id: I56d910f8363787f1050d5d7e8064ed553c5827fd [ROCm/clr commit: `e20dd61932`]	2022-01-12 00:22:16 -05:00
Saleel Kudchadker	97456a157b	SWDEV-308843 - Increase MaxPinnedXferSize to 128 This allows experimenting with env var GPU_PINNED_XFER_SIZE which is still at a default of 32MB Change-Id: I85ade700ed58d498eba29d1737601dc74d4c26a4 [ROCm/clr commit: `3f82b99f5d`]	2021-12-01 20:37:56 -05:00
Saleel Kudchadker	1bf9b39cf8	SWDEV-301667 - Kern arg placement Add an env var ROC_USE_FGS_KERNARG to toggle kernel arg placement By default it's in Fine Grain Kernel arg segment for supported asics. Change-Id: I3d57ed69a1a4db2b392b0438ead499f3ddca4716 [ROCm/clr commit: `e29b9c00ee`]	2021-09-02 12:36:49 -04:00
Jason Tang	8235cb4462	SWDEV-296911 - Enable clgl interop for both MesaGL and OrcaGL Change-Id: Ie3ad85a8335b1fc751812c09bb0cd30aad38dcae [ROCm/clr commit: `f165737096`]	2021-08-22 23:56:08 -07:00
agunashe	49f0546637	SWDEV-293742 - Update copyright end year VDI repo Change-Id: I69d2fea4a7a43adf96ccea794270e4af991c5261 [ROCm/clr commit: `d96481fb36`]	2021-08-22 23:56:07 -07:00
German Andryeyev	5e70450a24	SWDEV-240804 - Enable HMM build by default Change-Id: Ia6175dff8eda8c18b7a7bb4ca87a90c1f3e4e6fb [ROCm/clr commit: `ea3dba0832`]	2021-04-26 17:36:53 -04:00
Saleel Kudchadker	6c304e4027	SWDEV-276120 - Remove support for barrier sync ROC_BARRIER_SYNC will not work with direct dispatch. Remove and cleanup. Change-Id: I81368b2e65039477bd0343bb92708dab48867db6 [ROCm/clr commit: `aa38af8c96`]	2021-04-07 17:08:39 -04:00
German Andryeyev	e8b1e484f5	SWDEV-274199 - Enable SVM tracking ROCr/KFD doesn't validate memory pointers. Enable validation inside ROCclr, using SVM tracking mechanism. Change-Id: I581e32ff37187f9ed8d9a302e8fd9f6ca935bdd7 [ROCm/clr commit: `fbde61de7f`]	2021-03-03 13:18:56 -05:00
Jason Tang	09259cd49f	SWDEV-198364 - Only enable clgl sharing in ROCm path when building LinuxPro Change-Id: Ie4d87e252519d090a62b930f7ebb315d3477b690 [ROCm/clr commit: `54a7170e40`]	2021-02-23 14:15:04 -05:00
German Andryeyev	f96e973378	SWDEV-257787 - Add engine tracking per signal - The logic will trace compute, sdma read/write operations and apply signals when necessary - ROC_CPU_WAIT_FOR_SIGNAL, ROC_SYSTEM_SCOPE_SIGNAL and ROC_SKIP_COPY_SYNC were added to control the tracking Change-Id: I9e8e6174c63bf7784f7ab00964e2918c8667d364 [ROCm/clr commit: `dbc7abaecf`]	2021-01-25 12:34:45 -05:00
Tony Tye	902cf1a239	Update code object handling for GSL, PAL and ROCm - Correct GSL path to report targets using the TargetID syntax. - Correct GSL path to check compatibility of code objects when loading. - Add concept of a device isa and create a registry used by ROCm, PAL and GSL. - Support XNACK and SRAMECC target features consistently for PAL and ROCm. - Correct logic for NullDevices and asserts to avoid memory corruption. - Allow all NullDevices to be created for HIP. - Numerous other code improvements. Change-Id: I40abf3d2b22249c1492d1af5919665f8184f4e0e [ROCm/clr commit: `c7e8d91e14`]	2021-01-14 11:11:51 -05:00

69 Commits