rocm-systems

Author	SHA1	Message	Date
Xie, AlexBin	e22c9b457e	SWDEV-576718 - provide option to limit memory cache usage (#2810 ) * SWDEV-576718 - provide option to limit memory cache usage * SWDEV-576718 - Use MiB instead of MB in description	2026-01-26 11:35:01 -05:00
SaleelK	340f3aa887	clr: Implement dynamic stream to HWq logic (#1958 ) * clr: Implement dynamic stream to HW queue assignment This change implements dynamic stream to hardware queue (HWq) mapping with the following features: * Queue depth heuristics with weights for optimal HWq assignment * Make last used queue sticky for better locality * Use pipe HWq to pipe mapping - gfx9 follows a round-robin queue to pipe mapping based on creation order (single process per device only, as pipe ID is statically assigned by runtime) * More aggressive heuristic usage for better queue distribution * Extend dynamic queues support for all stream priorities Environment variables: * DEBUG_HIP_DYNAMIC_QUEUE: 0 - disabled, 1 - Depth heuristics 2 - Depth+Pipe heuristics * DEBUG_HIP_IGNORE_STREAM_PRIORITY=1: ignore priority stream creation * clr: Clean up last_used_queue_	2026-01-23 10:40:54 -08:00
Fábio Mestre	61325db1c8	Fix AMD_LOG_LEVEL_SIZE env variable (#2463 ) AMD_LOG_LEVEL_SIZE is being used in a global variable. This always uses the default value of 2048 because the HIP runtime doesn't have the opportunity to load environment variables at the point where global variables are initialized. The solution is to use AMD_LOG_LEVEL_SIZE inside truncate_log_file() function.	2026-01-13 09:57:49 +00:00
Godavarthy Surya, Anusha	1ef6a86ee3	SWDEV-549711 - Improve graph DEBUG dot print for segments (#2205 ) Co-authored-by: Anusha GodavarthySurya<agodavar@amd.com>	2026-01-07 14:07:49 +05:30
SaleelK	c105dcd05b	clr: Use graph segment scheduling to process HIP Graphs (#1372 ) * clr: Use graph segment scheduling to process HIP Graphs * Add a broader path to use capture packet capture for all topologies * Refactor code * Use DEBUG_HIP_GRAPH_SEGMENT_SCHEDULING to toggle new vs classic path, Enabled by default * clr: Few fixes and improvements * clr: Detect complex graphs to take classic path * Use DEBUG_HIP_GRAPH_SEGMENT_SCHEDULING=2 to force segment scheduling path * clr: Fix a corner-case stack corruption * clr: Track commands of segments instead of snapshots * clr: Fix Batch dispatch logic * Track fence_dirty_ flag for command of other streams * Dependency resolution markers can now accommodate dirty fence on cross streams --------- Co-authored-by: Ioannis Assiouras <Ioannis.Assiouras@amd.com> Co-authored-by: Godavarthy Surya, Anusha <agodavar@amd.com>	2025-12-01 12:49:26 -08:00
Todd tiantuo Li	ee48f6221d	SWDEV-562708 - change default maximum SVM size to 256GB (#1731 )	2025-11-25 23:59:39 -08:00
Karthik Jayaprakash	740a06d567	SWDEV-559267 - Use CLPrint to DevLogPrintf with Log Level - detail debug. (#1160 )	2025-11-25 19:25:32 -05:00
German Andryeyev	2c5754844f	SWDEV-465041 - Enable direct dispatch under Linux by default. (#1934 )	2025-11-25 11:30:32 -05:00
SaleelK	738bb19835	clr: Increase kernelArg/managedBuffer size (#1586 ) * Increase the buffer to 4MB. That can help kernel launches limited by a deep kernel pipeline Co-authored-by: JeniferC99 <150404595+JeniferC99@users.noreply.github.com>	2025-11-08 18:32:43 -08:00
SaleelK	f301053740	clr: Improve logging (#1457 )	2025-10-25 15:55:27 -07:00
Pengda Xie	a4bbd73dc6	SWDEV-556684 - Remove HSAIL support (#1183 )	2025-10-23 11:21:49 -07:00
SaleelK	149dc17c90	clr: Optimize doorbell ring (#1030 ) Lay foundation to batch packets efficiently for graphs Dynamically copy packets with max threshold set with DEBUG_HIP_GRAPH_BATCH_SIZE, if not stagger packet copy with pow2 Default threshold for DEBUG_HIP_GRAPH_BATCH_SIZE is 256 If TS are not collected for a signal for reuse, create a new signal. This can potentially increase signal footprint if the handler doesn't run fast enough.	2025-09-18 15:02:10 -07:00
SaleelK	c4537e8050	SWDEV-553126 - Improve logging (#835 ) * Ability to mask COPY api usage in logs * Show total graph nodes in logs * Add another log level for detailed debug	2025-09-04 10:08:41 -07:00
Danylo Lytovchenko	2ff2316227	Adjust clang format to the new versions, revert broken macro layout (#714 )	2025-08-22 17:23:22 +02:00
Danylo Lytovchenko	f7338717ae	SWDEV-470698 - fix formatting, add format check workflow (#657 )	2025-08-20 19:58:06 +05:30
GunaShekar, Ajay	5c412edcd1	SWDEV-532576 - clr_logs_<pid>.txt default AMD_LOG_LEVEL_FILE (#480 ) avoids app crash and uses default AMD_LOG_LEVEL_FILE if invalid name is passed [ROCm/clr commit: `76637d7ebe`]	2025-08-13 20:27:42 -07:00
Kudchadker, Saleel	3a849c6962	SWDEV-538195 - Introduce threshold for handler submission (#723 ) - When doing device/stream sync, we can submit a handler which may introduce some host side delays. Use DEBUG_CLR_BATCH_CPU_SYNC_SIZE to batch commands for host wait. Default for HIP is 8 commands. - Investigation is underway in ROCr but need to address this for now in HIP runtime. [ROCm/clr commit: `9b045922a8`]	2025-08-06 20:34:42 -07:00
Xie, Pengda	b7d8cb56d1	SWDEV-505833 - Remove DEBUG_CLR_SKIP_RELEASE_SCOPE flag (#735 ) Cleanup debug flag DEBUG_CLR_SKIP_RELEASE_SCOPE [ROCm/clr commit: `4121a860bf`]	2025-08-05 08:31:55 -07:00
Belton-Schure, Aidan	88c1717658	SWDEV-515426 - Remove HIP_USE_RUNTIME_UNBUNDLER (#205 ) * remove HIP_USE_RUNTIME_UNBUNDLER * clang-format * Generic to use comgr * Remove HIP_ALWAYS_USE_NEW_COMGR_UNBUNDLING_ACTION flag * Removes runtime unbundling unused and debug Code * Removes stale functions [ROCm/clr commit: `81238db679`]	2025-07-08 21:45:31 +05:30
Lin, Qun	3b44884a57	SWDEV-508869 - Fix Linux build error for HIP on PAL (#176 ) [ROCm/clr commit: `9699cc3864`]	2025-06-27 07:51:22 +08:00
Harrymanoharan, Jessey	1868e4e595	SWDEV-531711 - Enable skipping host side abort when GPU crashes. (#380 ) Co-authored-by: kjayapra-amd <karthik.jayaprakash@amd.com> [ROCm/clr commit: `3930ae2524`]	2025-05-26 17:52:02 +05:30
Dittakavi, Satyanvesh	1cc35da9be	SWDEV-438790 - Remove DEBUG_HIP_7_PREVIEW env var keeping the hipGetLastError changes by default (#337 ) [ROCm/clr commit: `664bf232dd`]	2025-05-21 22:12:45 +05:30
Andryeyev, German	c512258e45	SWDEV-528808 - Disable dynamic queue by default (#256 ) Dynamic queue management will be disabled by default and the original sort logic is restored [ROCm/clr commit: `9b018165ce`]	2025-05-05 10:56:35 -04:00
Sang, Tao	68deb3d10a	SWDEV-520352 - Remove HostThread and legacy monitor (#230 ) * SWDEV-520352 - Remove HostThread and legacy monitor Remove HostThread, semaphore and legacy monitor. Make the original logic of thread and command queue stricter. Add more comments to make the logic clearer. Some other minor improvements. Also part of SWDEV-458943. [ROCm/clr commit: `96cadbc9e9`]	2025-04-29 09:55:24 -04:00
Jayaprakash, Karthik	49a527c826	SWDEV-506467 - Skip Abort in case of crash from the device. (#60 ) Change-Id: I964b2f2647d068202e9c38fcddb1337da754df8d [ROCm/clr commit: `b2388dfb88`]	2025-04-29 11:19:02 +05:30
Kudchadker, Saleel	1b1d6b841e	SWDEV-510186 - Improve logging (#220 ) - Print all arguments for logs, this is useful for debug [ROCm/clr commit: `ce24936970`]	2025-04-25 08:40:31 -07:00
Andryeyev, German	f8344154a0	SWDEV-497841 - Enable memory manager by default (#149 ) [ROCm/clr commit: `a5c860f3b0`]	2025-04-22 21:20:37 +05:30
Chaudhary, Jatin Jaikishan	e9e207d7b0	SWDEV-517941 - use device bitcode before spirv (#95 ) Also add flag: HIP_FORCE_SPIRV_CODEOBJECT to allow override to force use SPIRV. * use cache for already compiled code objects * address review comments and use the two spirv isa names [ROCm/clr commit: `07e57a1f0d`]	2025-04-14 23:40:52 +01:00
Andryeyev, German	5c7c86f66d	SWDEV-517481 - Add dynamic queue management (#37 ) Enabled by default. DEBUG_HIP_DYNAMIC_QUEUES controls the feature [ROCm/clr commit: `28967982b2`]	2025-03-19 11:22:50 -04:00
German Andryeyev	77840f1cb9	SWDEV-518474 - Add comgr debug mask Move prints from CO processing under COMGR debug mask. Change-Id: I2a417e42a1f4e2922a34eb104c69e4db10b5f1c6 [ROCm/clr commit: `cece301fd4`]	2025-03-04 14:37:08 -05:00
German Andryeyev	f9d9b2c441	SWDEV-497841 - Add virtual memory heap Add initial implementation of virtual memory heap with dynamic virtual memory mapping support for memory pools. DEBUG_HIP_MEM_POOL_VMHEAP controls the new method. Change-Id: I8dc5be2e0f34ab472f1800f43bb6243639a5e500 [ROCm/clr commit: `296dce5570`]	2025-02-20 10:55:49 -05:00
Tao Sang	7803594aea	SWDEV-458943 - Add fast path in wait() wait() is redesigned with two paths: fast path: Use spinlock to wait for notify signal. If the signal hasn't been received for some loops, go to slow path. slow path: Use condition_variable's wait(). Improve monitor wrapper for better performance. Fix some bugs left from name removing patch. Change-Id: I893a8353121a25d11e37c8e631caf31cc1fc1f24 [ROCm/clr commit: `f2ff56af9c`]	2025-01-28 12:19:55 -05:00
Branislav Brzak	05057b2a88	SWDEV-508743 - [6.4 Preview] Add ROCm 7.0 breaking change fields Change-Id: I07bff42731e74a4c409505cf8981342e22ce26be [ROCm/clr commit: `3fd46a3783`]	2025-01-17 06:25:27 -05:00
Saleel Kudchadker	d4594531ef	SWDEV-506251 - Disable blit copy threshold for OpenCL Change-Id: Id0ca43b13d5792791a42da263f6aa4496382cea6 [ROCm/clr commit: `39801b5750`]	2025-01-08 02:46:01 +00:00
Pengda Xie	612ae28524	SWDEV-505833 - Provide functionality to avoid L2 flush for CPX mode for dispatch packets - Added DEBUG_CLR_SKIP_RELEASE_SCOPE flag to force release scope to SCOPE_NONE in AQL packet header Change-Id: Ife02cddb9d5cd4749103ce585d3d5fe9024c6868 [ROCm/clr commit: `8155943c5f`]	2025-01-03 17:28:21 -05:00
Ioannis Assiouras	2c8805e536	SWDEV-483134 - Remove hipExtHostAlloc API Change-Id: I60777ef5c56b60dd8100d0d794ca10fb3b96a555 [ROCm/clr commit: `e8b2fdab96`]	2024-12-16 17:13:49 -05:00
Saleel Kudchadker	7d7aa8b69c	SWDEV-497145 - Use rocr copyOnEngine API for staged copies - Refactor blit code and clean ASAN instrumentation - Use unified function for rocr copy - Enable shader copy path for unpinned writeBuffer/readBuffer paths - Set GPU_FORCE_BLIT_COPY_SIZE=16 which means we will use BLIT copy for pinned copies or unpinned H2D/D2H copies < 16KB Change-Id: I42045cca79234b340dbf53dafb93044199736ae4 [ROCm/clr commit: `7863eb92dc`]	2024-12-04 13:38:13 -05:00
Satyanvesh Dittakavi	5a16db0cd5	SWDEV-477584 - Match hipGetLastError behavior with CUDA using env var Change-Id: I4c5acff180ae904028f7c5fdf4e109ffd1f0c4ef [ROCm/clr commit: `e3b8754448`]	2024-11-28 01:33:52 -05:00
German Andryeyev	a9daa4c8f4	SWDEV-486602 - Disable sysmem pool Currently amd::Monitor can work in FILO mode for the active waits and cause a delay in wakeup of some threads. That may have a problem with the current sysmem pool design. Change-Id: I145081478d1e0b282d8838855c5718f09cf54b69 [ROCm/clr commit: `9473f143c2`]	2024-11-20 11:35:28 -05:00
taosang2	7169a92488	SWDEV-487356 - Fix AMD LOG compiling warning Change-Id: I757185f9c7c12f736e266219b67daf5836d2a125 [ROCm/clr commit: `cc25c5d646`]	2024-11-09 12:57:22 -05:00
Saleel Kudchadker	672e4fa835	SWDEV-446123 - Revert "Match hipGetLastError behavior with CUDA using env var" This reverts commit `941cfd5b36`. Reason for revert: <INSERT REASONING HERE> Change-Id: I11a456655393bcf4b82d749ce7259bc1b78d1424 [ROCm/clr commit: `582dc7dd6d`]	2024-11-08 20:35:13 -05:00
Satyanvesh Dittakavi	941cfd5b36	SWDEV-446123 - Match hipGetLastError behavior with CUDA using env var Change-Id: Iaec697c1304d746376ecf2bfe2ad683b15ee189f [ROCm/clr commit: `5f477900a3`]	2024-11-07 12:02:34 -05:00
Tao Sang	5fe3dc5bf9	SWDEV-487356 - Fix AMD LOG issue in Win32 Change-Id: Ia1c19cf4ea24188cdb2d374b01f975f794e02dbf [ROCm/clr commit: `802cacf3e9`]	2024-11-01 08:26:25 -04:00
German Andryeyev	4a2687a450	SWDEV-486602 - Fix Windows 32 bit build Windows aligns fields to 8 bytes even with 32bit builds. Add BUG_CLR_SYSMEM_POOL to control sysmempool. Change-Id: I8622aabc9f7391ed7dd8583b252ce9eb41d62293 [ROCm/clr commit: `6bb7d1afdc`]	2024-10-18 11:35:54 -04:00
German Andryeyev	0a03665a3f	SWDEV-491375 - Limit the SW batch size Applications may submit commands without waits for GPU. That causes a growth of SW unreleased commands. Make sure runtime flushes SW queue, if it grows over some threshold, controlled by DEBUG_CLR_MAX_BATCH_SIZE. Change-Id: Ia4d85c24210ef91c394f638ab6b53b14323a0396 [ROCm/clr commit: `8657a77029`]	2024-10-17 10:53:57 -04:00
German Andryeyev	faea40cbb3	SWDEV-486602 - Optimize HSA callback performance - Don't generate callbacks for HIP events - Don't process profiling info in the callback for HIP events - Wait for CPU status update of the submitted commands every 50 calls. That will allow to drain the commands and destroy HSA signals. Change-Id: Ib601a350e7e7c2b6c6209a172385389baccf73a9 [ROCm/clr commit: `364dfb0ed1`]	2024-10-11 14:50:25 -04:00
Saleel Kudchadker	b9497ea70e	SWDEV-301667 - Enable ROCr logging - Use AMD_LOG_LEVEL=5 to dump AQL packets in ROCr Change-Id: I2c044a5304c4eaf3d3af20e62d1f54c98d4fbaa4 [ROCm/clr commit: `e36666e536`]	2024-10-04 19:22:12 -04:00
Saleel Kudchadker	5296c77138	SWDEV-301667 - Logging upgrades - Use AMD_LOG_LEVEL_SIZE in MBs to set log file size truncation, by default it's 2048 MB Change-Id: Ia2f87e8c6b94148e30edfb602b279f93630817c3 [ROCm/clr commit: `35e03ea0d0`]	2024-10-04 13:26:25 -04:00
pghafari	3fc58e93b3	SWDEV-444447 - Fix regression for verbose printing for AMD_LOG_LEVEL=4 Change-Id: Id245caef711b7ccdf4e999e934993beb43d7c3d5 [ROCm/clr commit: `365ffd4805`]	2024-09-18 13:08:10 -04:00
Saleel Kudchadker	343bdf3187	SWDEV-478624 - Use readback workaround to ensure kernel arg coherence Use env var DEBUG_CLR_KERNARG_HDP_FLUSH_WA=1 to fall back to HDP flush workaround. The default is 0 Change-Id: I7bdb9be61da60c30d15ac9991b7cd27351e1831c [ROCm/clr commit: `9de6d4d46c`]	2024-09-11 14:53:15 -04:00


668 commits