rocm-systems

Author	SHA1	Message	Date
German Andryeyev	6bb7d1afdc	SWDEV-486602 - Fix Windows 32 bit build Windows aligns fields to 8 bytes even with 32bit builds. Add BUG_CLR_SYSMEM_POOL to control sysmempool. Change-Id: I8622aabc9f7391ed7dd8583b252ce9eb41d62293	2024-10-18 11:35:54 -04:00
German Andryeyev	8657a77029	SWDEV-491375 - Limit the SW batch size Applications may submit commands without waits for GPU. That causes a growth of SW unreleased commands. Make sure runtime flushes SW queue, if it grows over some threshold, controlled by DEBUG_CLR_MAX_BATCH_SIZE. Change-Id: Ia4d85c24210ef91c394f638ab6b53b14323a0396	2024-10-17 10:53:57 -04:00
German Andryeyev	364dfb0ed1	SWDEV-486602 - Optimize HSA callback performance - Don't generate callbacks for HIP events - Don't process profiling info in the callback for HIP events - Wait for CPU status update of the submitted commands every 50 calls. That will allow to drain the commands and destroy HSA signals. Change-Id: Ib601a350e7e7c2b6c6209a172385389baccf73a9	2024-10-11 14:50:25 -04:00
Saleel Kudchadker	e36666e536	SWDEV-301667 - Enable ROCr logging - Use AMD_LOG_LEVEL=5 to dump AQL packets in ROCr Change-Id: I2c044a5304c4eaf3d3af20e62d1f54c98d4fbaa4	2024-10-04 19:22:12 -04:00
Saleel Kudchadker	35e03ea0d0	SWDEV-301667 - Logging upgrades - Use AMD_LOG_LEVEL_SIZE in MBs to set log file size truncation, by default it's 2048 MB Change-Id: Ia2f87e8c6b94148e30edfb602b279f93630817c3	2024-10-04 13:26:25 -04:00
pghafari	365ffd4805	SWDEV-444447 - Fix regression for verbose printing for AMD_LOG_LEVEL=4 Change-Id: Id245caef711b7ccdf4e999e934993beb43d7c3d5	2024-09-18 13:08:10 -04:00
Saleel Kudchadker	9de6d4d46c	SWDEV-478624 - Use readback workaround to ensure kernel arg coherence Use env var DEBUG_CLR_KERNARG_HDP_FLUSH_WA=1 to fall back to HDP flush workaround. The default is 0 Change-Id: I7bdb9be61da60c30d15ac9991b7cd27351e1831c	2024-09-11 14:53:15 -04:00
victzhan	7a01db98e9	Revert "SWDEV-458943 - make new AMD_MONITOR on" This reverts commit `f8598dabb0`. Change-Id: I2a7ddb2d4340224f43749a2ea91a894a8a95b83b	2024-09-05 10:10:50 -04:00
Ioannis Assiouras	2c84211b58	SWDEV-470372 - Added hipExtHostAlloc API This change adds a new HIP API `hipExtHostAlloc` which preserves the functionality of `hipHostMalloc`. Change-Id: I13504c6fc13465ddd7aed329795bb4f2fef1baff	2024-08-27 08:26:03 -04:00
Ajay	e07172ff57	SWDEV-478881 - Fix log AMD_LOG file corruption hiprtc and hip APIs use the same file. Append to file instead of start of file Change-Id: I2703f9bb67f0c51b557a058daab129679a0b5dd9	2024-08-23 11:19:48 -04:00
German Andryeyev	9db52f9a46	SWDEV-470612 - Add the optimized multistream path - Added the optimized multi stream path in graph execution. It uses a fixed number of async streams in the execution - Optimize the launch latency, where commands creation and execution is done at the same time - Optimize the scheduling to use less barriers and waiting signals if the same queue can be detected - The new path is controlled by DEBUG_HIP_FORCE_GRAPH_QUEUES environment variable, where 0 will use the original path and any other value will force the number of asynchronous queues for execution - DEBUG_HIP_FORCE_ASYNC_QUEUE can force single queue async execution in graphs(applicable for Navi families only) Change-Id: I7eb40bc15c45f508d6911868a6f6d4c3598d380e	2024-08-02 14:19:44 -04:00
Saleel Kudchadker	d379f4efd0	SWDEV-301667 - Refactor Blit force env var Change-Id: I5344ac2e6442cd8f526118e688f1b1412cc5b45a	2024-07-25 15:15:10 -04:00
taosang2	f8598dabb0	SWDEV-458943 - make new AMD_MONITOR on make DEBUG_CLR_USE_STDMUTEX_IN_AMD_MONITOR be true Change-Id: I1d21378ff462478d3238d71e4e2a1a7d6b9167ac	2024-07-24 14:29:27 -04:00
pghafari	9e6e77b7dd	SWDEV-444447 - log print pid/tid only in verbose mode Change-Id: I2bbe9085d607e9d8d5acda1ed43e3245335d239f	2024-07-11 15:39:13 -04:00
Tao Sang	73c02041e1	SWDEV-458943 - Implement std::mutex based monitor Implement std::mutex based monitor that has much simpler logic than legacy monitor. Create DEBUG_CLR_USE_STDMUTEX_IN_AMD_MONITOR to toggle them. If DEBUG_CLR_USE_STDMUTEX_IN_AMD_MONITOR = false (by default), use legacy monitor; If DEBUG_CLR_USE_STDMUTEX_IN_AMD_MONITOR = true, use std::mutex based monitor. If no perf drop of std::mutex based monitor, legacy one will be removed later. Change-Id: I1d21368ff462477d3238d71e4e2a1a7d6b9167ad	2024-07-04 11:50:46 -04:00
Ioannis Assiouras	fa07c33cba	SWDEV-470787 - Fixed undefined symbols for flags in static build Change-Id: I7812c8924396d0df9ab331f9a1844aabbf5a9211	2024-07-04 02:57:22 -04:00
Ioannis Assiouras	3edf1501cc	SWDEV-463865 - namespace changes to prevent symbol conflicts in static builds Change-Id: I09ceb5962b7aa19156909f47167c87d6887c9cd1	2024-06-12 16:22:27 -04:00
kjayapra-amd	892071aeb2	SWDEV-460948 - Changes to alloc, set, capture under single function. Change-Id: I7b2d40e99e812b97c53535c5e63c41ad64a8f543	2024-06-06 16:57:53 -04:00
Ioannis Assiouras	b8c2ac4de4	SWDEV-463865 - symbol renamings to prevent conflicts in static build Change-Id: Id7fbb638c1088c23df52fee877cd790d637b1ffb	2024-06-06 04:05:55 -04:00
Tao Sang	d0050ce309	SWDEV-433371 - Support new comgr unbundling action Support new comgr unbundling action api to extract code objects in compressed and uncompressed modes. Create HIP_ALWAYS_USE_NEW_COMGR_UNBUNDLING_ACTION ENV to toggle new path and old path. If HIP_ALWAYS_USE_NEW_COMGR_UNBUNDLING_ACTION=false (default), uncompressed code objects will go old path for better perf, compressed code objects will go new path. If HIP_ALWAYS_USE_NEW_COMGR_UNBUNDLING_ACTION=true, both uncompressed and compressed code objects will go new path. Add comgr wrapper for amd_comgr_action_info_set_bundle_entry_ids() Change-Id: I79952f132fe21249296685ee12cae05a4f9aec32	2024-05-28 06:31:10 +00:00
Tao Sang	a1350fe8c1	Revert "SWDEV-433371 - use comgr to unbundle code objects" This reverts commit `e53df57ffe`. Reason for revert: <INSERT REASONING HERE> New comgr unbundling action leads to perf drop for uncompressed code object. Will create a new patch to use old path for uncompressed, new unbundling api for compressed. Change-Id: I41ef53b71fc9f7aaa8cf231d4d70945f1117db52	2024-05-28 06:31:10 +00:00
Alex Xie	2eb30376ba	SWDEV-451945 - Remove ShouldLoadPlatform function Change-Id: Iabb4071bb77201576bc2c0488a04f4fa188815df	2024-05-06 10:42:59 -04:00
taosang2	e53df57ffe	SWDEV-433371 - use comgr to unbundle code objects 1.Make runtime use comgr to unbundle code objects 2.Support compressed/uncompressed modes 3.Remove HIP_USE_RUNTIME_UNBUNDLER and HIPRTC_USE_RUNTIME_UNBUNDLER to simplify logics 4.Add comgr wrapper for amd_comgr_action_info_set_bundle_entry_ids() Change-Id: Ic41b1ad1b64cca1e31986437983a5146d52a7329	2024-05-01 16:09:12 -04:00
Saleel Kudchadker	948ca5a931	SWDEV-301667 - Add LOG_TS mask - Add LOG_TS mask for printing signal times - Read raw ticks from signals Change-Id: Ibdd0bf06c790729f6c65083a4784c97a3c3219e0	2024-04-30 12:24:48 -04:00
German Andryeyev	7a371503b2	SWDEV-311271 - Enable mempools under Linux Change-Id: I7fda94e61121f9d3a30f4ad185b8a97712922f3c	2024-04-29 18:06:34 -04:00
taosang2	35c80dd482	SWDEV-424956 - Fix half vector printf issue Refactor PrintfDbg::outputArgument() to remove potential risk. Fix half vector printf issue on all devices. Fix FEAT-56794 as well. Change-Id: Iae39359d2128588def2e43d77fe58e868b8e71ff	2024-04-12 14:25:44 -04:00
German Andryeyev	f0c7ecf617	SWDEV-455254 - Add kernel arg optimization Add kernel arguments optimization into blit path. Enabled by default on MI300. Change-Id: I2694a81b90d48ad07d86dfe4c0c64fe187bada8e	2024-04-10 18:08:37 -04:00
Saleel Kudchadker	c157bfb202	SWDEV-301667 - Create TS for each node recorded in graph - Create a vector to allow multiple TS to be stored in Command. - This would mean we don't wait for entire batch in Accumulate command to finish when we exhaust signals. - Reduce the number of signals created at init to 64. This min value may still need to be tuned but the KFD allows max of 4094 interrupt signals per device. - Store kernel names whenever they are available and not just when profiling. If we dynamically enable profiling like for Torch, a crash can happen if hipGraphInstantiate wasn't included in Torch profile scope because we previously entered kernel names only when profiler is attached. Change-Id: I34e7881a25bbc763f82fdeb3408a8ea58e1ec006	2024-03-26 14:47:24 -04:00
German Andryeyev	0f3391b93e	SWDEV-311271 - Enable mempool under Windows Change-Id: Ifa4cac4a8d52e031d63f62515439ca09efe7b4cb	2024-03-11 10:45:51 -04:00
Vikram	6f390f5af9	SWDEV-424956 - Fix OpenCL printf bug while printing vectors of half type OpenCL printf handling did not process vector of half precision floats properly (mainly because compiler packs 2 halfs into a dword and runtime failed to extract the individual parts). This patch fixes the issue. Change-Id: Ia1f15ccfb5db52b71c43cfd588dd38f551ee5277	2024-03-04 03:53:18 -05:00
Saleel Kudchadker	94c7004df8	SWDEV-301667 - Increase default signal pool to 4096 Change-Id: I4ab23b0f87e295b40ab76ad6e96249d11b8ad04d	2024-02-29 22:52:02 +00:00
Saleel Kudchadker	68f40f78dd	SWDEV-443760 - Enable device kernel args for MI300 - Enable Device kernel args for MI300* for now. - Fix a perf issue which impacts graph instantiate when dev kernel args are enabled. Change-Id: I962e58fd9d8dd1a8db95e601cb03a8e9c7bac97f	2024-02-28 19:10:04 -05:00
Rahul Garg	b954d0d6e0	SWDEV-443760 - Disable HIP_FORCE_DEV_KERNARG by default Change-Id: I8c3d8e65aa954bd28499eebefbc532d1177445dc	2024-02-22 04:37:51 -05:00
Todd tiantuo Li	7bfee3481b	SWDEV-333557 - Enable PAL_HIP_IPC_FLAG by default Change-Id: Ibb2ca0b9521aff4eca190e4817dcc5f8d697b172	2024-02-20 18:45:25 -05:00
Saleel Kudchadker	f138e0d113	SWDEV-443760 - Enable device kern args - Implement workaround to ensure HDP writes are done by writing and reading the HDP MMIO register. - Implement the same workaround for graphs, we no longer need sentinel write/readback Change-Id: I0d3027b46a1f61131ec62e3c8c669ff5184fa6b2	2024-02-20 02:03:14 -05:00
Anusha GodavarthySurya	ae0368d12d	SWDEV-422207 - Enable DEBUG_CLR_GRAPH_PACKET_CAPTURE environment variable Change-Id: I9bf72b9c1a56980352109bd4d42b54ecb2d1b8f9	2024-02-05 05:08:11 +00:00
Anusha GodavarthySurya	0a055f874b	SWDEV-422207 - Added debug env to dump graph during Instantiation Change-Id: Ibde2ae5b8d240f3986bcd168facc513a319c0f17	2024-02-05 05:08:11 +00:00
pghafari	0cff14c9e1	SWDEV-441258 - remove full path for HIP LOG windows Change-Id: Ibad6e9542c0cede38f5a114dcd352356ddedf019	2024-01-16 15:26:06 -05:00
German	7d661bc7df	SWDEV-404889 - Enable debugger interface in PAL Add GPU_DEBUG_ENABLE to control ttpm behavior. If enabled, then HW will collect more debug info at some perf cost Change-Id: Icee0686b903a7b1bd483710b9d611877cd43c6aa	2024-01-02 11:51:42 -05:00
kjayapra-amd	e05923b139	SWDEV-413997 - Enable Virtual Mem support by default. Change-Id: Ia3db3919701708cf95574692e1d47375ca99d7fd	2023-12-20 12:49:16 -05:00
German Andryeyev	f1dc81f427	SWDEV-432174 - Change the fillBuffer kernel - Add the new fillBuffer kernel, which allows to launch a limited number of workgroups for memory fill operation - Switch fill memory to 16 bytes write by default - Allow to limit the workgroups with DEBUG_CLR_LIMIT_BLIT_WG Change-Id: Ibad1822f2d42b2fc71bcfc1917c31409c0623e8e	2023-11-16 14:25:55 -04:00
Ioannis Assiouras	7868876db7	SWDEV-428244 - Set PARAMETERS_MIN_ALIGNMENT to the native alignment Change-Id: I14d8a0db4e575d6fa816754c52df405de88d9200	2023-10-21 17:26:46 -04:00
kjayapra-amd	3ef829939a	SWDEV-413997 - Initial VMM changes for ROCm path. Change-Id: I4405fd7b53182eb4c4622835c811c0dc08461537	2023-10-16 11:29:16 -04:00
jiabaxie	28f0daa34f	SWDEV-405983 - adding in HIP_LAUNCH_BLOCKING Change-Id: I3f9c8a745099aab05155ebe910e727693961a02f	2023-10-10 21:11:13 -04:00
Anusha GodavarthySurya	e63c280d4d	SWDEV-422207 - Capture AQL Packets for graph Kernel nodes during graph Inst. And enqueue AQL packet during launch Change-Id: I1e5f7f9e2a70bd500d190193cb6ba0867f5a63e7	2023-10-05 00:34:29 -04:00
German	7be3a5e33e	SWDEV-407533 - [ABI Break]Remove Wavelimiter Change-Id: I6a2f6fb5a0c3acea93fa0200a69679783e76f5bd	2023-09-07 09:58:41 -04:00
Ioannis Assiouras	1302d6f119	SWDEV-420328 - Initialize AMD_LOG_MASK with decimals instead of hex Change-Id: Id25510863c51206bca2e50fc93d6e1e1c5cbbfea	2023-09-07 03:04:37 -04:00
kjayapra-amd	6a0f80a03d	SWDEV-381625 - Parse compiler and linker options from environment variable. Change-Id: Id5a012b678e5973c4b64dff84444a909aefae006	2023-08-29 20:24:27 -04:00
German	077311153a	SWDEV-407533 - [ABI Break]Purge unused env vars Change-Id: I627950e8ebb6299affc602754a20d442dbe42b14	2023-08-24 14:11:40 -04:00
Saleel Kudchadker	aa6eb555e2	SWDEV-384557 - Enable SDMA query Change-Id: Ibb0a8d131f799985a4d4adbf753261e58c04157f	2023-08-01 18:41:23 -04:00

625 Commits