Commit Graph

212 commits

Author SHA1 Message Date
German ad33a021cb SWDEV-352197 - Destroy virtual device in thread destructor
Windows kills threads on exit without any notification. However,
runtime can still destroy VirtualGPU object from the host thread with
HostQueue destruction.
This change also forces RGP trace transfer on the last capture without
any delays.

Change-Id: I768e87e99e1d23a021e63c12f36e450817743759
2023-01-31 10:53:48 -05:00
German 53a10c9039 SWDEV-377991 - Remove liquidflash support
Change-Id: Iba6455e5c0210c3223a06fec332404cd9f489154
2023-01-20 09:57:06 -05:00
German 3690ae8464 SWDEV-377991 - Remove liquidflash support
Change-Id: I8ea0feb6067387f1b545a7492b6bcb55e82ec8b0
2023-01-19 11:39:10 -05:00
German 5a42279a65 SWDEV-377991 - Remove Liquidflash logic
Untie extra dependency on opencl repo

Change-Id: I8069cd8337214043d3c1453e3dfb0a0a47a83251
2023-01-19 09:58:31 -05:00
German c8927cd84e SWDEV-377991 - Remove Liquidflash extension
Initial check-in to untie dependencies with HIP and OCL repos

Change-Id: I363b63954c3f118f40a6ed893545d6a4ac44144c
2023-01-18 13:16:20 -05:00
Jaydeep Patel d97b4e8c74 SWDEV-374360 - Handle free for external memory.
Change-Id: I4a1ede2210a255960d7a935cd4debb806e0147f6
2023-01-16 13:06:36 -05:00
Ajay ecea27eb2d SWDEV-372757 - thread check workaround for windows hang
Change-Id: Ie9f87b88dd0f3078ad1919edc336f297f6b40373
2023-01-13 04:05:35 -05:00
kjayapra-amd e56a611b92 SWDEV-371904 - Adding pseudo fine grain flag to hsa memory allocation for device fine grained memory.
Change-Id: I8cada90f0e3880dfbc5bf5a3fac4554e7a0cb08e
2022-12-11 08:15:17 -05:00
Ioannis Assiouras 72b45e2a1f SWDEV-369581 - Convey copy API metadata to ROCclr
Change-Id: I569462d6d268700d419510255e201bf7d80d6714
2022-12-09 00:27:15 -05:00
Tao Sang f29d3bc3ac SWDEV-370659 - Add lock for HSAIL only
Add lock for HSAIL only in order to fix test failures
in math brute force and integer_ops tests.

Change-Id: I5f14cdcaa4ee9867fdae63fff197a0f21ee5f1d4
2022-12-06 15:50:04 -05:00
Sourabh Betigeri 5d7f3f9f3c SWDEV-305894 - Cooperative groups grid and multi grid sync support for gfx940+
Change-Id: I35d72f1cb50c3a96eee56a612b72d641852b145f
2022-12-05 16:30:30 -05:00
Tao Sang 3b2a8f3c8b SWDEV-306410 - Remove program lock
Remove global program lock in order to fix excessive
kernel launch overhead with multiple threads
on MGPUs.
This patch depends on a compiler patch that makes
LC thread safe.
Change-Id: Ic8a7374d19112764d6de5d483ec5d07a56661d1b
2022-11-20 14:42:24 -05:00
Juan Manuel MARTINEZ CAAMAÑO bab23480d3 SWDEV-286150 - [NFC] Avoid copying the entire devicePrograms map
Change-Id: I059f979d9bcdf6604aa3630b40fd47475b75fc30
2022-11-17 03:15:55 -05:00
kjayapra-amd 7f1fb925ff SWDEV-361374 - Adding support for hipPointerSetAttributes
Change-Id: I3ec9627f43b3cbe0aa299c8aa9cd96f8fbd74925
2022-11-11 12:07:26 -05:00
Juan Manuel MARTINEZ CAAMAÑO 40f75306d5 SWDEV-286150 - [NFC] Refactor repeated option parsing code into function
Change-Id: I606dc1cd48d880974142e523d16f5d9ac6f3aff1
2022-11-08 10:29:13 -05:00
German e223b0f678 SWDEV-352487 - Don't add notifications as the last command
Change-Id: Ifed34485839ef2c9491e8e8f6bb3569932160b1c
2022-10-24 09:39:03 -04:00
Laurent Morichetti 9a82118c85 SWDEV-362046 - Report HIP_OPS activities using the ROCr driver_node_id instead of the device's index
The ROCclr assigns zero-based IDs to GPUs in the order they are
discovered. That zero-based ID is what is used to identify the GPU
on which the HIP_OPS activity took place.

When multiple ranks are used, each rank's first logical device always
has GPU ID 0, regardless of which physical device is selected with
CUDA_VISIBLE_DEVICES. Because of this, when merging trace files from
multiple ranks, GPU IDs from different processes may overlap.

The long term solution is to use the KFD's gpu_id which is stable
across APIs and processes. Unfortunately the gpu_id is not yet exposed
by the ROCr, so for now use the driver's node id.

Change-Id: Ib78854527d600d175bb76e2df0747c33f898c615
2022-10-20 12:31:30 -04:00
Sourabh Betigeri 84fbb30b7c SWDEV-357246 - Adds a missing return statement
Change-Id: I2216f71f4d4fb6dd3766023b0c821cb3d35d7849
2022-10-05 16:29:32 -04:00
Laurent Morichetti e00965df50 SWDEV-351980 - Add FillBuffer byte count to the record
Change-Id: I90c791f5810b8a3f6b1d6a9e81c165b1a7515c92
2022-09-30 21:20:14 -07:00
Saleel Kudchadker 9b5cbd37a2 SWDEV-352001 - Store last scopes for dispatch
- Store last fence scopes and use the last value to determine if we need a cache flush again. This helps cases where hipExtLaunchKernel API is
used.
- Purge code for ROC_EVENT_NO_FLUSH

Change-Id: I531cf9c9c60d5e2b3a9e265d0f52f79ed2fa8a8c
2022-09-22 11:34:10 -04:00
Laurent Morichetti 52eb28930a SWDEV-351980 - Consolidate registration tables in the roctracer library
Remove the activity_prof::CallbacksTable. The table was redundant with
the information already stored in the roctracer library. Instead use a
single callback into the roctracer library to query whether the activity
is enabled, and to report it.

Change-Id: I2e05b0881bb4a1953c14361d00ea310d02eb6e0c
2022-09-21 05:54:09 -04:00
Laurent Morichetti e713b5c7d0 SWDEV-351980 - Enable profiling for commands reporting activities
Profiling should be enabled for any command reporting activities as the
activity record captures the profilingInfo's start and end timestamps.

Since IS_PROFILER_ON is only used to determine whether API tracing is
enabled, there is no need to expose it globally, it should be a property
of the activity_prof::CallbacksTable.

Change-Id: I44a0d19ed2862606cfbc9a98c1a07a336ab7e26c
2022-09-21 05:53:59 -04:00
Laurent Morichetti 4fbae91468 SWDEV-351980 - Move activity_ to the ProfilingInfo
The activity_ is only instantiated if profiling is enabled.

Remove the HIP private global record ID. Instead, use the correlation ID
stored in the hip_api_data_t by the profiler while the last HIP function
is in scope.

For NDRange and Copy commands, store the kernel name and byte size
(respectively) in the record.

General cleanups to improve the code's readability.

Change-Id: I01907484b0d9611eb9440c3a7c4865479dc42289
2022-09-21 05:53:47 -04:00
Ajay 373a7d1195 SWDEV-347670 - GPU StreamWait and StreamWrite support in Windows PAL backend
Change-Id: Ic4881305b6332e217f3d3127dce7e9d9d0a7df11
2022-09-15 13:57:40 -04:00
Joseph Greathouse 6b956f7627 SWDEV-330307 - Avoid releasing command before last use
The fix for SWDEV-329789 moved down the last use of a
command object pointer in order to prevent a race condition.
However, the previous patch did not move down the release of
that command. By releasing the command early, another thread
could get a command with the same pointer. That second thread
could later submit work to the queue using that new command.
The first thread could then perform a comparison against the
queue's last command using its own now-stale pointer. This
could eventually allow the second thread to skip synchronizing
on the queue. This would result in host synchronizations
completing before their device work was actually complete.

Change-Id: I292b7b369743251ceafe453a4c5cae14a6d01046
2022-08-31 16:07:49 -04:00
Jason Tang d92b3a2d90 SWDEV-333471 - Add GPU_FORCE_QUEUE_PROFILING
To support both hip and ocl. HIP_FORCE_QUEUE_PROFILING will be replaced with this later on.

Change-Id: I6d3514b1568ff049584ed9fd74bbdb3e4f4bf0c3
2022-08-19 10:51:41 -04:00
Anusha Godavarthy Surya 7b1c6d06d5 SWDEV-345683 - Fix HIP out of memory
If a handler is not submitted for every eventRecord,
memory is not released during hipFree, which leads to OOM.

Change-Id: I19b61a0c523502e9e1a3564ce8b791f3e2cea02c
2022-07-28 07:36:38 -04:00
Saleel Kudchadker faaa41aab8 SWDEV-335626 - Use ROCr copy for IPC
Detect IPC buffer and use ROCr copy api instead of blit

Change-Id: Ie6bdd6fc45dbd7457611011d81570b53d5fd5276
2022-07-08 13:32:19 -04:00
German Andryeyev 9e74f1c7f8 SWDEV-329789 - Avoid a race condition with the last command
Runtime can reset the last command only if it didn't change
since the query at the beginning of finish()

Change-Id: I629f2d788e9bbaa17ca4e96b1a753f8131e32463
2022-07-07 10:17:07 -04:00
Ajay 236178d0d4 SWDEV-337331 - command queue logs for debugging option
Change-Id: I198aecc5fd12369d87d4acc9910acc9435c1967a
2022-06-22 19:41:38 +00:00
Jaydeep ea0590d1fe SWDEV-332607 - If pitch returned from hipMallocPitch is equal to pitch passed to hipMemset2D then height passed to hipMemset2D must be less than or equal to height passed to hipMallocPitch.
Change-Id: I8d9b0938fb592170008aaec9cedd519bf40c6201
2022-06-17 10:35:22 -04:00
Sarbojit Sarkar 356e22f910 SWDEV-325379 - Fix for remote copy crash
Change-Id: I22152c0b3538cf7cfc80f82505bc255c01d98f7b
2022-06-16 23:59:11 -04:00
Saleel Kudchadker 5df34a2f7a SWDEV-335780 - Indicate if handler is queued
Maintain status of handler callback. For event records we no longer
submit callbacks to reduce the load on the async handler thread. However
without a callback we leak command memory/decrement refcounts. Indicate
status of the handler which we can use to queue a callback when
finish is called.

Change-Id: I89fd02f3d047a0e8162664ee17581a14795f1928
2022-06-14 20:55:06 -04:00
neqochan ebfa343827 SWDEV-1 - Fix illegal atomic initialization
See https://stackoverflow.com/a/21710850 for an extensive discussion.

This is a cherry-pick from a github pull request:
https://github.com/ROCm-Developer-Tools/ROCclr/pull/29

Change-Id: I87a58548d2995ab51a7cd6e684b5442e5b300923
2022-05-31 09:51:44 -04:00
German Andryeyev 07c1b9a998 SWDEV-336024 - Clear device heap to 0
This reverts commit 04bfd93569.

Reason for revert: Fix regressions

Change-Id: I7d883e1c3cbd27bb64b581ec800243ad7dfe24fd
2022-05-19 09:10:08 -04:00
German Andryeyev 04bfd93569 SWDEV-336024 - Clear device heap to 0
The heap must be cleared once per device, but ROCclr doesn't
create a queue per device in HIP. Hence, the clear operation will
be performed during the first queue creation.

Change-Id: I52ceb06d67d11cde6d019c5ab510059f426a9bfb
2022-05-11 11:03:56 -04:00
Rakesh Roy ac2c3b5cad SWDEV-333598 - Add flags field in amd::Memory UserData
Change-Id: Ie4d59fa34486679fde1027dd113573bda3e7c65c
2022-05-05 12:24:53 -04:00
Saleel Kudchadker 02566677cf SWDEV-334152 - Set release as systemscope
Set release scope as system for dispatch AQL when events are passed to
hip*LaunchKernelGGL*

Change-Id: I93b91591e0ab023f1ecc5247f7905eca26147358
2022-04-29 13:19:29 -04:00
Sarbojit Sarkar 2f973fb38b SWDEV-330649 - Fix for QCD app crash
Change-Id: If85eb06083d2f7dbe69cde6fbd5ac54979d25693
2022-04-29 05:37:33 -04:00
Saleel Kudchadker fa76f03654 SWDEV-334150 - Force callback to cycle commands
Enqueue a handler callback for hipEventRecords (aka marker_ts_) every
64 submits. This recycles the memory if we don't end up calling
synchronize for a long time.

Change-Id: I3d39fe76d52a5d81387927edd85b5663b563682c
2022-04-28 12:30:23 -04:00
Christophe Paquot 67657d6099 SWDEV-322620 - Virtual Memory Management
Implement map/unmap for PAL backend
Create commands since PAL uses the IQueue to map/unmap

Change-Id: I97e26a7d28ae5e10774c9ca65307153100945621
2022-04-22 18:09:26 -04:00
Saleel Kudchadker ddfd919a62 SWDEV-333237 - Release command before queuing a marker
Change-Id: I5343c4b7ade2dc68efa7454a919a6657726c45d3
2022-04-22 12:58:58 -04:00
Alex (Bin) Xie 3d514c85b9 SWDEV-329646 - MicroStation app crash upon closing
Change-Id: Ie3422788c80b233c836e319c355214ca076e5d4f
2022-04-20 14:34:44 -04:00
Saleel Kudchadker b6cbfaf499 SWDEV-301667 - Separate scope from marker_ts_
Change-Id: I19f4d394e898bfb8c9d9a2c2edf9d5bf5def3b08
2022-04-16 19:26:31 -04:00
Christophe Paquot b5f555f9ec SWDEV-322620 - Virtual Memory Management
Adding virtual memory management APIs to rocclr.
The HIP layer will handle virtual allocs on devices.

Change-Id: Ia978f105c2c3fed3959c77580ba228e845105754
2022-04-15 00:10:02 -04:00
Saleel Kudchadker 8eeaa998c0 SWDEV-301667 - Add cache state for a device
- Add a global cache state for a device to indicate scopes of submitted
AQL packets
- Remove scopes for TS marker if hipEventReleaseToDevice is passed. Set
env ROC_EVENT_NO_FLUSH=1 to use NOP AQL for event records.
It would flush caches by default with system scope release.
- Calling finish() should check whether caches are flushed; if not, queue a
marker

Change-Id: Ibbbdbb1cd7ac61cb35649169212142545be159e0
2022-04-12 12:27:31 -04:00
haoyuan2 1fbc01a812 SWDEV-328274 - Move DLLMain from VDI layer to HIP/OCL layers
Change-Id: Idc84eb0db92d21a5ced8769fa1eae064b86c31b0
2022-04-11 16:55:59 -04:00
Maxime Chambonnet d45794e985 SWDEV-1 - ROC CLR typos
This is cherry-picked from this github issue:
https://github.com/ROCm-Developer-Tools/ROCclr/issues/28

Change-Id: I236f4f25a2dabe05883159af0fab0bad06ab0fd0
2022-04-11 14:24:39 -04:00
Christophe Paquot 867346520f SWDEV-322620 - Virtual Memory Management Part 1
Adding opaque data handle to memory. This is used to look up the HIP object associated with it.

Change-Id: I1bbb14a915bed79c6c3593a29a627778c7aaf13a
2022-03-31 21:12:26 -04:00
German Andryeyev 28597ec5b5 SWDEV-328670 - Enable arena for ROCr interops
Add ROCR memory detection and enable arena mem object for possible
access in HIP

Change-Id: Icf86ac789176bfee4ea8d36b0970a817d4c6a2f7
2022-03-30 16:46:36 -04:00