rocm-systems

Author	SHA1	Message	Date
Andryeyev, German	9b018165ce	SWDEV-528808 - Disable dynamic queue by default (#256) Dynamic queue management will be disabled by default and the original sort logic is restored	2025-05-05 10:56:35 -04:00
Sang, Tao	96cadbc9e9	SWDEV-520352 - Remove HostThread and legacy monitor (#230) Remove HostThread, semaphore and legacy monitor. Make the original thread and command queue logic stricter. Add more comments to make the logic clearer. Some other minor improvements. Also part of SWDEV-458943.	2025-04-29 09:55:24 -04:00
Jayaprakash, Karthik	b2388dfb88	SWDEV-506467 - Skip Abort in case of crash from the device. (#60) Change-Id: I964b2f2647d068202e9c38fcddb1337da754df8d	2025-04-29 11:19:02 +05:30
Kudchadker, Saleel	ce24936970	SWDEV-510186 - Improve logging (#220) - Print all arguments for logs, this is useful for debug	2025-04-25 08:40:31 -07:00
Andryeyev, German	a5c860f3b0	SWDEV-497841 - Enable memory manager by default (#149)	2025-04-22 21:20:37 +05:30
Chaudhary, Jatin Jaikishan	07e57a1f0d	SWDEV-517941 - use device bitcode before spirv (#95) Also add flag: HIP_FORCE_SPIRV_CODEOBJECT to allow override to force use SPIRV. * use cache for already compiled code objects * address review comments and use the two spirv isa names	2025-04-14 23:40:52 +01:00
Andryeyev, German	28967982b2	SWDEV-517481 - Add dynamic queue management (#37) Enabled by default. DEBUG_HIP_DYNAMIC_QUEUES controls the feature	2025-03-19 11:22:50 -04:00
German Andryeyev	cece301fd4	SWDEV-518474 - Add comgr debug mask Move prints from CO processing under COMGR debug mask. Change-Id: I2a417e42a1f4e2922a34eb104c69e4db10b5f1c6	2025-03-04 14:37:08 -05:00
German Andryeyev	296dce5570	SWDEV-497841 - Add virtual memory heap Add initial implementation of virtual memory heap with dynamic virtual memory mapping support for memory pools. DEBUG_HIP_MEM_POOL_VMHEAP controls the new method. Change-Id: I8dc5be2e0f34ab472f1800f43bb6243639a5e500	2025-02-20 10:55:49 -05:00
Tao Sang	f2ff56af9c	SWDEV-458943 - Add fast path in wait() wait() is redesigned with two paths: fast path: Use spinlock to wait for notify signal. If the signal hasn't been received for some loops, go to slow path. slow path: Use condition_variable's wait(). Improve monitor wrapper for better performance. Fix some bugs left from name removing patch. Change-Id: I893a8353121a25d11e37c8e631caf31cc1fc1f24	2025-01-28 12:19:55 -05:00
Branislav Brzak	3fd46a3783	SWDEV-508743 - [6.4 Preview] Add ROCm 7.0 breaking change fields Change-Id: I07bff42731e74a4c409505cf8981342e22ce26be	2025-01-17 06:25:27 -05:00
Saleel Kudchadker	39801b5750	SWDEV-506251 - Disable blit copy threshold for OpenCL Change-Id: Id0ca43b13d5792791a42da263f6aa4496382cea6	2025-01-08 02:46:01 +00:00
Pengda Xie	8155943c5f	SWDEV-505833 - Provide functionality to avoid L2 flush for CPX mode for dispatch packets - Added DEBUG_CLR_SKIP_RELEASE_SCOPE flag to force release scope to SCOPE_NONE in AQL packet header Change-Id: Ife02cddb9d5cd4749103ce585d3d5fe9024c6868	2025-01-03 17:28:21 -05:00
Ioannis Assiouras	e8b2fdab96	SWDEV-483134 - Remove hipExtHostAlloc API Change-Id: I60777ef5c56b60dd8100d0d794ca10fb3b96a555	2024-12-16 17:13:49 -05:00
Saleel Kudchadker	7863eb92dc	SWDEV-497145 - Use rocr copyOnEngine API for staged copies - Refactor blit code and clean ASAN instrumentation - Use unified function for rocr copy - Enable shader copy path for unpinned writeBuffer/readBuffer paths - Set GPU_FORCE_BLIT_COPY_SIZE=16 which means we will use BLIT copy for pinned copies or unpinned H2D/D2H copies < 16KB Change-Id: I42045cca79234b340dbf53dafb93044199736ae4	2024-12-04 13:38:13 -05:00
Satyanvesh Dittakavi	e3b8754448	SWDEV-477584 - Match hipGetLastError behavior with CUDA using env var Change-Id: I4c5acff180ae904028f7c5fdf4e109ffd1f0c4ef	2024-11-28 01:33:52 -05:00
German Andryeyev	9473f143c2	SWDEV-486602 - Disable sysmem pool Currently amd::Monitor can work in FILO mode for the active waits and cause a delay in wakeup of some threads. That may have a problem with the current sysmem pool design. Change-Id: I145081478d1e0b282d8838855c5718f09cf54b69	2024-11-20 11:35:28 -05:00
taosang2	cc25c5d646	SWDEV-487356 - Fix AMD LOG compiling warning Change-Id: I757185f9c7c12f736e266219b67daf5836d2a125	2024-11-09 12:57:22 -05:00
Saleel Kudchadker	582dc7dd6d	SWDEV-446123 - Revert "Match hipGetLastError behavior with CUDA using env var" This reverts commit `5f477900a3`. Reason for revert: <INSERT REASONING HERE> Change-Id: I11a456655393bcf4b82d749ce7259bc1b78d1424	2024-11-08 20:35:13 -05:00
Satyanvesh Dittakavi	5f477900a3	SWDEV-446123 - Match hipGetLastError behavior with CUDA using env var Change-Id: Iaec697c1304d746376ecf2bfe2ad683b15ee189f	2024-11-07 12:02:34 -05:00
Tao Sang	802cacf3e9	SWDEV-487356 - Fix AMD LOG issue in Win32 Change-Id: Ia1c19cf4ea24188cdb2d374b01f975f794e02dbf	2024-11-01 08:26:25 -04:00
German Andryeyev	6bb7d1afdc	SWDEV-486602 - Fix Windows 32 bit build Windows aligns fields to 8 bytes even with 32bit builds. Add BUG_CLR_SYSMEM_POOL to control sysmempool. Change-Id: I8622aabc9f7391ed7dd8583b252ce9eb41d62293	2024-10-18 11:35:54 -04:00
German Andryeyev	8657a77029	SWDEV-491375 - Limit the SW batch size Applications may submit commands without waiting for the GPU. That causes a growth of SW unreleased commands. Make sure runtime flushes SW queue, if it grows over some threshold, controlled by DEBUG_CLR_MAX_BATCH_SIZE. Change-Id: Ia4d85c24210ef91c394f638ab6b53b14323a0396	2024-10-17 10:53:57 -04:00
German Andryeyev	364dfb0ed1	SWDEV-486602 - Optimize HSA callback performance - Don't generate callbacks for HIP events - Don't process profiling info in the callback for HIP events - Wait for CPU status update of the submitted commands every 50 calls. That will allow to drain the commands and destroy HSA signals. Change-Id: Ib601a350e7e7c2b6c6209a172385389baccf73a9	2024-10-11 14:50:25 -04:00
Saleel Kudchadker	e36666e536	SWDEV-301667 - Enable ROCr logging - Use AMD_LOG_LEVEL=5 to dump AQL packets in ROCr Change-Id: I2c044a5304c4eaf3d3af20e62d1f54c98d4fbaa4	2024-10-04 19:22:12 -04:00
Saleel Kudchadker	35e03ea0d0	SWDEV-301667 - Logging upgrades - Use AMD_LOG_LEVEL_SIZE in MBs to set log file size truncation, by default it's 2048 MB Change-Id: Ia2f87e8c6b94148e30edfb602b279f93630817c3	2024-10-04 13:26:25 -04:00
pghafari	365ffd4805	SWDEV-444447 - Fix regression for verbose printing for AMD_LOG_LEVEL=4 Change-Id: Id245caef711b7ccdf4e999e934993beb43d7c3d5	2024-09-18 13:08:10 -04:00
Saleel Kudchadker	9de6d4d46c	SWDEV-478624 - Use readback workaround to ensure kernel arg coherence Use env var DEBUG_CLR_KERNARG_HDP_FLUSH_WA=1 to fall back to HDP flush workaround. The default is 0 Change-Id: I7bdb9be61da60c30d15ac9991b7cd27351e1831c	2024-09-11 14:53:15 -04:00
victzhan	7a01db98e9	Revert "SWDEV-458943 - make new AMD_MONITOR on" This reverts commit `f8598dabb0`. Change-Id: I2a7ddb2d4340224f43749a2ea91a894a8a95b83b	2024-09-05 10:10:50 -04:00
Ioannis Assiouras	2c84211b58	SWDEV-470372 - Added hipExtHostAlloc API This change adds a new HIP API `hipExtHostAlloc` which preserves the functionality of `hipHostMalloc`. Change-Id: I13504c6fc13465ddd7aed329795bb4f2fef1baff	2024-08-27 08:26:03 -04:00
Ajay	e07172ff57	SWDEV-478881 - Fix log AMD_LOG file corruption hiprtc and hip APIs use the same file. Append to file instead of start of file Change-Id: I2703f9bb67f0c51b557a058daab129679a0b5dd9	2024-08-23 11:19:48 -04:00
German Andryeyev	9db52f9a46	SWDEV-470612 - Add the optimized multistream path - Added the optimized multi stream path in graph execution. It uses a fixed number of async streams in the execution - Optimize the launch latency, where commands creation and execution is done at the same time - Optimize the scheduling to use less barriers and waiting signals if the same queue can be detected - The new path is controlled by DEBUG_HIP_FORCE_GRAPH_QUEUES environment variable, where 0 will use the original path and any other value will force the number of asynchronous queues for execution - DEBUG_HIP_FORCE_ASYNC_QUEUE can force single queue async execution in graphs (applicable for Navi families only) Change-Id: I7eb40bc15c45f508d6911868a6f6d4c3598d380e	2024-08-02 14:19:44 -04:00
Saleel Kudchadker	d379f4efd0	SWDEV-301667 - Refactor Blit force env var Change-Id: I5344ac2e6442cd8f526118e688f1b1412cc5b45a	2024-07-25 15:15:10 -04:00
taosang2	f8598dabb0	SWDEV-458943 - make new AMD_MONITOR on make DEBUG_CLR_USE_STDMUTEX_IN_AMD_MONITOR be true Change-Id: I1d21378ff462478d3238d71e4e2a1a7d6b9167ac	2024-07-24 14:29:27 -04:00
pghafari	9e6e77b7dd	SWDEV-444447 - log print pid/tid only in verbose mode Change-Id: I2bbe9085d607e9d8d5acda1ed43e3245335d239f	2024-07-11 15:39:13 -04:00
Tao Sang	73c02041e1	SWDEV-458943 - Implement std::mutex based monitor Implement std::mutex based monitor that has much simpler logic than legacy monitor. Create DEBUG_CLR_USE_STDMUTEX_IN_AMD_MONITOR to toggle them. If DEBUG_CLR_USE_STDMUTEX_IN_AMD_MONITOR = false (by default), use legacy monitor; If DEBUG_CLR_USE_STDMUTEX_IN_AMD_MONITOR = true, use std::mutex based monitor. If there is no perf drop with the std::mutex based monitor, the legacy one will be removed later. Change-Id: I1d21368ff462477d3238d71e4e2a1a7d6b9167ad	2024-07-04 11:50:46 -04:00
Ioannis Assiouras	fa07c33cba	SWDEV-470787 - Fixed undefined symbols for flags in static build Change-Id: I7812c8924396d0df9ab331f9a1844aabbf5a9211	2024-07-04 02:57:22 -04:00
Ioannis Assiouras	3edf1501cc	SWDEV-463865 - namespace changes to prevent symbol conflicts in static builds Change-Id: I09ceb5962b7aa19156909f47167c87d6887c9cd1	2024-06-12 16:22:27 -04:00
kjayapra-amd	892071aeb2	SWDEV-460948 - Changes to alloc, set, capture under single function. Change-Id: I7b2d40e99e812b97c53535c5e63c41ad64a8f543	2024-06-06 16:57:53 -04:00
Ioannis Assiouras	b8c2ac4de4	SWDEV-463865 - symbol renamings to prevent conflicts in static build Change-Id: Id7fbb638c1088c23df52fee877cd790d637b1ffb	2024-06-06 04:05:55 -04:00
Tao Sang	d0050ce309	SWDEV-433371 - Support new comgr unbundling action Support new comgr unbundling action api to extract code objects in compressed and uncompressed modes. Create HIP_ALWAYS_USE_NEW_COMGR_UNBUNDLING_ACTION ENV to toggle new path and old path. If HIP_ALWAYS_USE_NEW_COMGR_UNBUNDLING_ACTION=false (default), uncompressed code objects will go old path for better perf, compressed code objects will go new path. If HIP_ALWAYS_USE_NEW_COMGR_UNBUNDLING_ACTION=true, both uncompressed and compressed code objects will go new path. Add comgr wrapper for amd_comgr_action_info_set_bundle_entry_ids() Change-Id: I79952f132fe21249296685ee12cae05a4f9aec32	2024-05-28 06:31:10 +00:00
Tao Sang	a1350fe8c1	Revert "SWDEV-433371 - use comgr to unbundle code objects" This reverts commit `e53df57ffe`. Reason for revert: New comgr unbundling action leads to perf drop for uncompressed code objects. Will create a new patch to use old path for uncompressed, new unbundling api for compressed. Change-Id: I41ef53b71fc9f7aaa8cf231d4d70945f1117db52	2024-05-28 06:31:10 +00:00
Alex Xie	2eb30376ba	SWDEV-451945 - Remove ShouldLoadPlatform function Change-Id: Iabb4071bb77201576bc2c0488a04f4fa188815df	2024-05-06 10:42:59 -04:00
taosang2	e53df57ffe	SWDEV-433371 - use comgr to unbundle code objects 1. Make runtime use comgr to unbundle code objects 2. Support compressed/uncompressed modes 3. Remove HIP_USE_RUNTIME_UNBUNDLER and HIPRTC_USE_RUNTIME_UNBUNDLER to simplify logic 4. Add comgr wrapper for amd_comgr_action_info_set_bundle_entry_ids() Change-Id: Ic41b1ad1b64cca1e31986437983a5146d52a7329	2024-05-01 16:09:12 -04:00
Saleel Kudchadker	948ca5a931	SWDEV-301667 - Add LOG_TS mask - Add LOG_TS mask for printing signal times - Read raw ticks from signals Change-Id: Ibdd0bf06c790729f6c65083a4784c97a3c3219e0	2024-04-30 12:24:48 -04:00
German Andryeyev	7a371503b2	SWDEV-311271 - Enable mempools under Linux Change-Id: I7fda94e61121f9d3a30f4ad185b8a97712922f3c	2024-04-29 18:06:34 -04:00
taosang2	35c80dd482	SWDEV-424956 - Fix half vector printf issue Refactor PrintfDbg::outputArgument() to remove potential risk. Fix half vector printf issue on all devices. Fix FEAT-56794 as well. Change-Id: Iae39359d2128588def2e43d77fe58e868b8e71ff	2024-04-12 14:25:44 -04:00
German Andryeyev	f0c7ecf617	SWDEV-455254 - Add kernel arg optimization Add kernel arguments optimization into blit path. Enabled by default on MI300. Change-Id: I2694a81b90d48ad07d86dfe4c0c64fe187bada8e	2024-04-10 18:08:37 -04:00
Saleel Kudchadker	c157bfb202	SWDEV-301667 - Create TS for each node recorded in graph - Create a vector to allow multiple TS to be stored in Command. - This would mean we don't wait for entire batch in Accumulate command to finish when we exhaust signals. - Reduce the number of signals created at init to 64. This min value may still need to be tuned but the KFD allows max of 4094 interrupt signals per device. - Store kernel names whenever they are available and not just when profiling. If we dynamically enable profiling like for Torch, a crash can happen if hipGraphInstantiate wasn't included in Torch profile scope because we previously entered kernel names only when profiler is attached. Change-Id: I34e7881a25bbc763f82fdeb3408a8ea58e1ec006	2024-03-26 14:47:24 -04:00
German Andryeyev	0f3391b93e	SWDEV-311271 - Enable mempool under Windows Change-Id: Ifa4cac4a8d52e031d63f62515439ca09efe7b4cb	2024-03-11 10:45:51 -04:00

646 Commits