rocm-systems

Author	SHA1	Message	Date
German Andryeyev	6bb7d1afdc	SWDEV-486602 - Fix Windows 32 bit build. Windows aligns fields to 8 bytes even with 32-bit builds. Add BUG_CLR_SYSMEM_POOL to control sysmempool. Change-Id: I8622aabc9f7391ed7dd8583b252ce9eb41d62293	2024-10-18 11:35:54 -04:00
German Andryeyev	8657a77029	SWDEV-491375 - Limit the SW batch size. Applications may submit commands without waiting for the GPU. That causes growth of unreleased SW commands. Make sure the runtime flushes the SW queue if it grows over some threshold, controlled by DEBUG_CLR_MAX_BATCH_SIZE. Change-Id: Ia4d85c24210ef91c394f638ab6b53b14323a0396	2024-10-17 10:53:57 -04:00
German Andryeyev	364dfb0ed1	SWDEV-486602 - Optimize HSA callback performance - Don't generate callbacks for HIP events - Don't process profiling info in the callback for HIP events - Wait for CPU status update of the submitted commands every 50 calls. That will allow to drain the commands and destroy HSA signals. Change-Id: Ib601a350e7e7c2b6c6209a172385389baccf73a9	2024-10-11 14:50:25 -04:00
Saleel Kudchadker	35e03ea0d0	SWDEV-301667 - Logging upgrades - Use AMD_LOG_LEVEL_SIZE in MBs to set log file size truncation; the default is 2048 MB. Change-Id: Ia2f87e8c6b94148e30edfb602b279f93630817c3	2024-10-04 13:26:25 -04:00
Saleel Kudchadker	9de6d4d46c	SWDEV-478624 - Use readback workaround to ensure kernel arg coherence Use env var DEBUG_CLR_KERNARG_HDP_FLUSH_WA=1 to fall back to HDP flush workaround. The default is 0 Change-Id: I7bdb9be61da60c30d15ac9991b7cd27351e1831c	2024-09-11 14:53:15 -04:00
victzhan	7a01db98e9	Revert "SWDEV-458943 - make new AMD_MONITOR on" This reverts commit `f8598dabb0`. Change-Id: I2a7ddb2d4340224f43749a2ea91a894a8a95b83b	2024-09-05 10:10:50 -04:00
Ioannis Assiouras	2c84211b58	SWDEV-470372 - Added hipExtHostAlloc API This change adds a new HIP API `hipExtHostAlloc` which preserves the functionality of `hipHostMalloc`. Change-Id: I13504c6fc13465ddd7aed329795bb4f2fef1baff	2024-08-27 08:26:03 -04:00
German Andryeyev	9db52f9a46	SWDEV-470612 - Add the optimized multistream path - Added the optimized multi stream path in graph execution. It uses a fixed number of async streams in the execution - Optimize the launch latency, where command creation and execution are done at the same time - Optimize the scheduling to use fewer barriers and waiting signals if the same queue can be detected - The new path is controlled by the DEBUG_HIP_FORCE_GRAPH_QUEUES environment variable, where 0 will use the original path and any other value will force the number of asynchronous queues for execution - DEBUG_HIP_FORCE_ASYNC_QUEUE can force single queue async execution in graphs (applicable for Navi families only) Change-Id: I7eb40bc15c45f508d6911868a6f6d4c3598d380e	2024-08-02 14:19:44 -04:00
Saleel Kudchadker	d379f4efd0	SWDEV-301667 - Refactor Blit force env var Change-Id: I5344ac2e6442cd8f526118e688f1b1412cc5b45a	2024-07-25 15:15:10 -04:00
taosang2	f8598dabb0	SWDEV-458943 - make new AMD_MONITOR on make DEBUG_CLR_USE_STDMUTEX_IN_AMD_MONITOR be true Change-Id: I1d21378ff462478d3238d71e4e2a1a7d6b9167ac	2024-07-24 14:29:27 -04:00
Tao Sang	73c02041e1	SWDEV-458943 - Implement std::mutex based monitor. Implement a std::mutex based monitor that has much simpler logic than the legacy monitor. Create DEBUG_CLR_USE_STDMUTEX_IN_AMD_MONITOR to toggle between them. If DEBUG_CLR_USE_STDMUTEX_IN_AMD_MONITOR = false (the default), use the legacy monitor; if DEBUG_CLR_USE_STDMUTEX_IN_AMD_MONITOR = true, use the std::mutex based monitor. If the std::mutex based monitor shows no perf drop, the legacy one will be removed later. Change-Id: I1d21368ff462477d3238d71e4e2a1a7d6b9167ad	2024-07-04 11:50:46 -04:00
Ioannis Assiouras	fa07c33cba	SWDEV-470787 - Fixed undefined symbols for flags in static build Change-Id: I7812c8924396d0df9ab331f9a1844aabbf5a9211	2024-07-04 02:57:22 -04:00
Ioannis Assiouras	3edf1501cc	SWDEV-463865 - namespace changes to prevent symbol conflicts in static builds Change-Id: I09ceb5962b7aa19156909f47167c87d6887c9cd1	2024-06-12 16:22:27 -04:00
kjayapra-amd	892071aeb2	SWDEV-460948 - Changes to alloc, set, capture under single function. Change-Id: I7b2d40e99e812b97c53535c5e63c41ad64a8f543	2024-06-06 16:57:53 -04:00
Tao Sang	d0050ce309	SWDEV-433371 - Support new comgr unbundling action. Support the new comgr unbundling action API to extract code objects in compressed and uncompressed modes. Create the HIP_ALWAYS_USE_NEW_COMGR_UNBUNDLING_ACTION env var to toggle between the new path and the old path. If HIP_ALWAYS_USE_NEW_COMGR_UNBUNDLING_ACTION=false (default), uncompressed code objects will go through the old path for better perf, and compressed code objects will go through the new path. If HIP_ALWAYS_USE_NEW_COMGR_UNBUNDLING_ACTION=true, both uncompressed and compressed code objects will go through the new path. Add comgr wrapper for amd_comgr_action_info_set_bundle_entry_ids() Change-Id: I79952f132fe21249296685ee12cae05a4f9aec32	2024-05-28 06:31:10 +00:00
Tao Sang	a1350fe8c1	Revert "SWDEV-433371 - use comgr to unbundle code objects" This reverts commit `e53df57ffe`. Reason for revert: New comgr unbundling action leads to perf drop for uncompressed code objects. Will create a new patch to use the old path for uncompressed and the new unbundling API for compressed. Change-Id: I41ef53b71fc9f7aaa8cf231d4d70945f1117db52	2024-05-28 06:31:10 +00:00
taosang2	e53df57ffe	SWDEV-433371 - use comgr to unbundle code objects 1. Make runtime use comgr to unbundle code objects 2. Support compressed/uncompressed modes 3. Remove HIP_USE_RUNTIME_UNBUNDLER and HIPRTC_USE_RUNTIME_UNBUNDLER to simplify logic 4. Add comgr wrapper for amd_comgr_action_info_set_bundle_entry_ids() Change-Id: Ic41b1ad1b64cca1e31986437983a5146d52a7329	2024-05-01 16:09:12 -04:00
German Andryeyev	7a371503b2	SWDEV-311271 - Enable mempools under Linux Change-Id: I7fda94e61121f9d3a30f4ad185b8a97712922f3c	2024-04-29 18:06:34 -04:00
German Andryeyev	f0c7ecf617	SWDEV-455254 - Add kernel arg optimization Add kernel arguments optimization into blit path. Enabled by default on MI300. Change-Id: I2694a81b90d48ad07d86dfe4c0c64fe187bada8e	2024-04-10 18:08:37 -04:00
Saleel Kudchadker	c157bfb202	SWDEV-301667 - Create TS for each node recorded in graph - Create a vector to allow multiple TS to be stored in Command. - This means we don't wait for the entire batch in the Accumulate command to finish when we exhaust signals. - Reduce the number of signals created at init to 64. This min value may still need to be tuned, but the KFD allows a max of 4094 interrupt signals per device. - Store kernel names whenever they are available and not just when profiling. If we dynamically enable profiling, as for Torch, a crash can happen if hipGraphInstantiate wasn't included in the Torch profile scope, because we previously entered kernel names only when a profiler was attached. Change-Id: I34e7881a25bbc763f82fdeb3408a8ea58e1ec006	2024-03-26 14:47:24 -04:00
German Andryeyev	0f3391b93e	SWDEV-311271 - Enable mempool under Windows Change-Id: Ifa4cac4a8d52e031d63f62515439ca09efe7b4cb	2024-03-11 10:45:51 -04:00
Saleel Kudchadker	94c7004df8	SWDEV-301667 - Increase default signal pool to 4096 Change-Id: I4ab23b0f87e295b40ab76ad6e96249d11b8ad04d	2024-02-29 22:52:02 +00:00
Saleel Kudchadker	68f40f78dd	SWDEV-443760 - Enable device kernel args for MI300 - Enable Device kernel args for MI300* for now. - Fix a perf issue which impacts graph instantiate when dev kernel args are enabled. Change-Id: I962e58fd9d8dd1a8db95e601cb03a8e9c7bac97f	2024-02-28 19:10:04 -05:00
Rahul Garg	b954d0d6e0	SWDEV-443760 - Disable HIP_FORCE_DEV_KERNARG by default Change-Id: I8c3d8e65aa954bd28499eebefbc532d1177445dc	2024-02-22 04:37:51 -05:00
Todd tiantuo Li	7bfee3481b	SWDEV-333557 - Enable PAL_HIP_IPC_FLAG by default Change-Id: Ibb2ca0b9521aff4eca190e4817dcc5f8d697b172	2024-02-20 18:45:25 -05:00
Saleel Kudchadker	f138e0d113	SWDEV-443760 - Enable device kern args - Implement workaround to ensure HDP writes are done by writing and reading the HDP MMIO register. - Implement the same workaround for graphs, we no longer need sentinel write/readback Change-Id: I0d3027b46a1f61131ec62e3c8c669ff5184fa6b2	2024-02-20 02:03:14 -05:00
Anusha GodavarthySurya	ae0368d12d	SWDEV-422207 - Enable DEBUG_CLR_GRAPH_PACKET_CAPTURE environment variable Change-Id: I9bf72b9c1a56980352109bd4d42b54ecb2d1b8f9	2024-02-05 05:08:11 +00:00
Anusha GodavarthySurya	0a055f874b	SWDEV-422207 - Added debug env to dump graph during Instantiation Change-Id: Ibde2ae5b8d240f3986bcd168facc513a319c0f17	2024-02-05 05:08:11 +00:00
German	7d661bc7df	SWDEV-404889 - Enable debugger interface in PAL Add GPU_DEBUG_ENABLE to control ttpm behavior. If enabled, then HW will collect more debug info at some perf cost Change-Id: Icee0686b903a7b1bd483710b9d611877cd43c6aa	2024-01-02 11:51:42 -05:00
kjayapra-amd	e05923b139	SWDEV-413997 - Enable Virtual Mem support by default. Change-Id: Ia3db3919701708cf95574692e1d47375ca99d7fd	2023-12-20 12:49:16 -05:00
German Andryeyev	f1dc81f427	SWDEV-432174 - Change the fillBuffer kernel - Add the new fillBuffer kernel, which allows launching a limited number of workgroups for the memory fill operation - Switch fill memory to 16-byte writes by default - Allow limiting the workgroups with DEBUG_CLR_LIMIT_BLIT_WG Change-Id: Ibad1822f2d42b2fc71bcfc1917c31409c0623e8e	2023-11-16 14:25:55 -04:00
Ioannis Assiouras	7868876db7	SWDEV-428244 - Set PARAMETERS_MIN_ALIGNMENT to the native alignment Change-Id: I14d8a0db4e575d6fa816754c52df405de88d9200	2023-10-21 17:26:46 -04:00
kjayapra-amd	3ef829939a	SWDEV-413997 - Initial VMM changes for ROCm path. Change-Id: I4405fd7b53182eb4c4622835c811c0dc08461537	2023-10-16 11:29:16 -04:00
jiabaxie	28f0daa34f	SWDEV-405983 - adding in HIP_LAUNCH_BLOCKING Change-Id: I3f9c8a745099aab05155ebe910e727693961a02f	2023-10-10 21:11:13 -04:00
Anusha GodavarthySurya	e63c280d4d	SWDEV-422207 - Capture AQL Packets for graph Kernel nodes during graph Inst. And enqueue AQL packet during launch Change-Id: I1e5f7f9e2a70bd500d190193cb6ba0867f5a63e7	2023-10-05 00:34:29 -04:00
German	7be3a5e33e	SWDEV-407533 - [ABI Break]Remove Wavelimiter Change-Id: I6a2f6fb5a0c3acea93fa0200a69679783e76f5bd	2023-09-07 09:58:41 -04:00
kjayapra-amd	6a0f80a03d	SWDEV-381625 - Parse compiler and linker options from environment variable. Change-Id: Id5a012b678e5973c4b64dff84444a909aefae006	2023-08-29 20:24:27 -04:00
German	077311153a	SWDEV-407533 - [ABI Break]Purge unused env vars Change-Id: I627950e8ebb6299affc602754a20d442dbe42b14	2023-08-24 14:11:40 -04:00
Saleel Kudchadker	aa6eb555e2	SWDEV-384557 - Enable SDMA query Change-Id: Ibb0a8d131f799985a4d4adbf753261e58c04157f	2023-08-01 18:41:23 -04:00
Todd tiantuo Li	04b9ab49eb	SWDEV-333557 - add PAL_HIP_IPC_FLAG for PAL HIP device allocations Change-Id: I9017f4e3b03d4817bf233c788e30775fb2297589	2023-07-17 08:10:25 -04:00
Anusha GodavarthySurya	b0e6f99ad7	SWDEV-392732 - Initial commit for graph doorbell optimization(AQL Buffering) Change-Id: I451725006c54c249dc530c55d2af2a31594bf49b	2023-07-16 07:56:00 -04:00
Saleel Kudchadker	770b2a4711	SWDEV-384557 - Rename env var - Rename HIP_USE_SDMA_QUERY to DEBUG_CLR_USE_SDMA_QUERY as this is supposed to be a temporary env var for debug purposes only. Change-Id: If6ebd52ab87624375a3df24ceccdcc05c60a65af	2023-06-29 13:54:55 -04:00
Ioannis Assiouras	4add0e6563	SWDEV-405182 - Revert min alignment for abstract parameters stack to 16 bytes Change-Id: I9e6ace281468e8ef11b011c58f5971ce8907f3c6	2023-06-23 04:39:51 -04:00
Saleel Kudchadker	8d193c32bb	SWDEV-384557 - Use toggle for SDMA query - Use HIP_USE_SDMA_QUERY env var toggle for new API use. Env var is 0 by default Change-Id: If725a0c41e15f78a1a6c3f47942954fe9240b4db	2023-06-15 01:02:24 -04:00
Jacob Lambert	443f912c7f	SWDEV-375055 - Re-enable Comgr unbundler With recent upstream changes (D145770), we can now use the Comgr unbundler without requiring an env field in the supplied targetID. For users, this is consistent with previous legacy unbundler behavior. Change-Id: I5f085b0fa1ad352bbbb282b75367c206b75f279f	2023-05-31 16:14:08 -04:00
Saleel Kudchadker	5436d362b1	SWDEV-301667 - Add a flag for gpuvm kernargs HIP_FORCE_DEV_KERNARG=1 will create a device allocation for kernel arg segment. Flag is 0 by default. Change-Id: Iaaf5a149f3be8596568878d5d272268baf067c60	2023-05-22 11:23:48 -04:00
Alex Voicu	06df9e2efd	SWDEV-301667 - Kernelarg gpuvm Add aligned, nontemporal `memcpy` for kernarg. Change-Id: I5d8ac76904feaf793b45ec2ea5fbd1069be20068	2023-05-22 11:21:14 -04:00
German	04b696abee	SWDEV-353281 - VM support in mempool for graphs. The change enables VM support in graphs on Windows. That avoids caching all allocations, at the cost of map/unmap overhead during memory create/destroy. Change-Id: I792be00fba099e5e5d3cd44a963e1dfd6976a86d	2023-05-05 15:31:26 -04:00
Maneesh Gupta	5dc104b3ea	SWDEV-368235 - Revert "Remove obsolete env variables" This reverts commit `7b50c935f8`. Reason for revert: Deferred to a future release. Change-Id: Ia66c37f0ab9734dee73c930d10d7469d5fd57254	2023-02-15 07:25:00 +00:00
German	7b50c935f8	SWDEV-368235 - Remove obsolete env variables Change-Id: I7e14d53297e79e2f68b3a6cc40251ad7db9eb5ab	2023-02-03 13:44:24 -05:00

108 Commits