Commit Graph

556 Commits

Author SHA1 Message Date
Sean Keely 9e53cab613 Add agent info query for HSA_AMD_AGENT_INFO_SVM_DIRECT_HOST_ACCESS.
Allows determining if the host can directly access HMM memory that
is physically resident in vram.

Change-Id: Ie452eedd0e27fe1b511afd416f5a1cd01b3d84e8
2021-06-17 03:45:26 -04:00
Sean Keely 8adbda1c18 Allocate any size vram request through the fragment allocator.
Enables the fragment allocator to handle >2MB allocations, maintaining
good TLB alignment.  Prior code contained a bug that caused the effective
API granule for vram allocations >2MB to be bumped to 2MB.

Also adjusts the block cache's block retention heuristic to not
count discarded blocks as in use.  This will reduce block retention
when a significant amount of large blocks or IPC is in use.

Change-Id: I30bd85eb87951df822211f799d9cfe579ab109c6
2021-06-10 19:30:54 -05:00
Sean Keely 981c6bd8c3 Remove unused GpuAgent.local_region_ member.
Change-Id: I99526e6b1f64e810f7fed5d922c540d252a46d80
2021-06-07 19:59:58 -04:00
Sean Keely bd59789f0b Add debugging checks for packet type in the scratch handler.
Change-Id: I84a6f18548ac39349595e3a1c8a5a9ff27d4e178
2021-06-07 15:36:18 -04:00
Sean Keely 3323e18f3e Limit reporting of GPU_ONLY signal waits from host.
Such waits must spin but are functionally correct.

Change-Id: I4992852f04da788495c6f566c46a3dffaf38397c
2021-06-03 15:26:40 -05:00
Sean Keely ca8387768e Allow limiting debug warning messages.
Add macro debug_warning_n to stop printing a message after
N instances.

Change-Id: Id5f84b11eb63b3a20bd2bcb2ea8f10a066b457ef
2021-06-03 15:25:55 -05:00
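
The commit above does not show the macro body, so the following is only a sketch of how a "stop printing after N instances" warning could be written. `DEBUG_WARNING_N` and `warn_limited` are hypothetical names chosen for illustration and are not taken from the ROCr sources.

```cpp
#include <atomic>
#include <cstdio>

// Hypothetical sketch, not ROCr's actual macro: print a warning at most
// `limit` times for a given counter, then stay silent.
// Returns true if the message was printed.
inline bool warn_limited(std::atomic<unsigned>& count, unsigned limit,
                         const char* msg, const char* file, int line) {
  // Cheap early exit once the limit is reached; this also keeps the
  // counter from growing without bound and wrapping around.
  if (count.load(std::memory_order_relaxed) >= limit) return false;
  // fetch_add returns the previous value, so only the first `limit`
  // callers get past this check.
  if (count.fetch_add(1, std::memory_order_relaxed) >= limit) return false;
  std::fprintf(stderr, "Warning: %s in %s:%d\n", msg, file, line);
  return true;
}

// Each expansion owns a static counter, so the limit applies per call site.
#define DEBUG_WARNING_N(msg, limit)                           \
  do {                                                        \
    static std::atomic<unsigned> count_{0};                   \
    warn_limited(count_, (limit), (msg), __FILE__, __LINE__); \
  } while (false)
```

Calling `DEBUG_WARNING_N("GPU_ONLY signal waited on from host", 5);` inside a hot loop would then emit at most five lines from that call site.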
Sean Keely 6b398eb72c Improve async handler performance.
Under high async handler load, signal retention and event sorting
become bottlenecks.  This change processes more handlers in a
single pass to amortize wait_any overheads.

Change-Id: I8b276e102db647e3858e120547aa0c6fca85ab4c
2021-06-02 23:52:07 -05:00
Sean Keely f6c2aa1c78 Add Read Mostly attribute support.
Change-Id: Ia7c60edacb892cbf14bdb50350c0a0a627e53964
2021-06-01 23:39:12 -05:00
Sean Keely 7361fc18ee Recognize gfx1034 in image device family id.
Change-Id: I2a529b5e91fae9f3697ddbccaaf0e97c87d59837
2021-05-25 16:43:20 -05:00
Chris Freehill 8cb686fdc5 Add gfx1034 support
Change-Id: I2d4bfcb9012704daf7de10739c966827bd2a09e2
2021-05-25 16:43:16 -05:00
Mike (Tianxin) Li 36c54c63f7 Revert "Get the size of VGPR and SGPR register file"
This reverts commit 344ed757e0.

Change-Id: I9988218ad1d2b6182d92aad09d18a95e77e46c01
2021-05-18 15:01:30 -04:00
Mike Li 344ed757e0 Get the size of VGPR and SGPR register file
Signed-off-by: Mike Li <Tianxinmike.Li@amd.com>
Change-Id: Ifa515ad7e1df1dd27f25f1e919b0053049531063
2021-05-13 11:54:41 -04:00
Sean Keely 5f0e39df63 Update README.md
Remove reference to finalizer and images libs.

Change-Id: Ic673da77bb13dea77b477d7bfe799fc2c028ab2a
2021-05-10 17:53:19 -05:00
Sean Keely 0439dc90cd Correct merge error.
Old memory properties info name used after removing branches.
This caused the CPU coarse grain pool to initialize with random
bits.

Change-Id: I397bc5ecf09fab69bdf1d7fafadcf54d71b64070
2021-05-06 18:40:56 -05:00
Sean Keely c9ce27a640 Add exception forwarding to tools API callbacks.
Prevents poorly written tools which throw in tools interface
callbacks from causing ROCr to catch and return a generic error
code.

Change-Id: I2f5bf7104dc7d4ee688eb48423c7ffdb06bd7702
2021-05-04 02:14:20 -05:00
Sean Keely 0b7d9db964 Correct scratch in use computation.
Old logic did not consider memory held in the scratch cache to be
free when deciding whether or not to reclaim.

Change-Id: I7f7c7549c72d743edbf7c53489fe9a453dc4177a
2021-04-22 20:07:25 -04:00
Sean Keely ee8b1b64ad Report HMM driver support status.
Implements HSA_AMD_SYSTEM_INFO_SVM_SUPPORTED.

Change-Id: If5182edcc1fa067fa514aa2c1bd326c4c42d1b64
2021-04-21 21:44:42 -05:00
Sean Keely 77046a1aaa Revert "Revert SVM and XNACK support."
This reverts commit 5bd153974d.

Conflicts:
	opensrc/hsa-runtime/core/util/flag.h

Change-Id: I16daf41588e6139126d66af54b0693de2e7e39f3
2021-04-21 14:49:43 -05:00
Sean Keely 3127d1ffdc Ensure ROCr created threads have no CPU affinity.
Change-Id: I53828dbaf055b65b61bdd11f0eadfcc806596821
2021-04-19 19:47:06 -05:00
Konstantin Zhuravlyov 1bdc2f6854 Update documentation of hsa_ven_amd_loader_iterate_executables
Clarify behavior of hsa_ven_amd_loader_iterate_executables during
concurrent calls of executable creation and destruction.

Change-Id: Idc3e3981d4fcc0d58d9f1b7a7578deed20aa490b
2021-04-16 20:51:48 -04:00
Konstantin Zhuravlyov 15e54d684d Expose iterator for executables
Change-Id: I0c5d39fc33c15a6eb8ee10ff181c2dcf2e042675
2021-04-16 20:51:48 -04:00
Konstantin Zhuravlyov e826c365ea Remove loaders.c/hpp
Change-Id: Ida507c2dd2de9172f250172f9c45a639953cb412
2021-04-16 20:51:48 -04:00
Mike Li d077606e22 Get GPU cache information from KFD
Signed-off-by: Mike Li <Tianxinmike.Li@amd.com>
Change-Id: I8dc8c97ae81c3747b7cd88cf2cdb7a9e4694a88d
2021-04-13 10:29:34 -04:00
Tony Tye a97c14abea Add support for gfx909 and gfx90c
Change-Id: I88158789cdda44a173e3ca26d2c96b8e0ea0e221
2021-04-08 22:37:30 +00:00
Sean Keely 243e29ba8e Remove emulator SRAMECC override controls.
Change-Id: Iea9e7870dbf517032f34cebec673c90226b96960
2021-04-02 02:11:05 -04:00
Sean Keely 5bd153974d Revert SVM and XNACK support.
KFD is not ready yet.

Change-Id: I61deb292ddb92185d33504c2115169888d56e211
2021-04-02 02:10:59 -04:00
Ramesh Errabolu 25f3dc305f Override Cpu-Gpu link-weight for Aldebaran until a proper fix is available
Change-Id: I1fbc38b788f71cc9c9fc62295223286004689bf9
2021-04-02 02:10:54 -04:00
Sean Keely 7333c77e22 Squash merge of cfreehil/amd-temp-gfx90a onto amd-staging.
Includes some workarounds and HMM.
Conflicts:
	opensrc/hsa-runtime/core/runtime/amd_topology.cpp
	opensrc/hsa-runtime/core/util/flag.h

Change-Id: I22976f07964a43dbb228a6231777dbd599112b8d
2021-04-02 02:10:15 -04:00
Sean Keely 4197461b7f Correct hsa_agent_iterate_isas return code for CPUs.
When no ISAs are available, no callbacks should be invoked.  This
is not an error and should return success.

Change-Id: Ie4048aa8cbe5c3fdf5431f6a865021549ecf8a13
2021-04-01 00:08:22 -04:00
Sean Keely 45fbe5b192 Block ROCm 4.1+ running against 4.0 and prior kfd.
Sramecc is misreported in kfd 4.0 and prior.  To prevent possible
corruption due to d16 instructions, deny use of gfx906 with older
kfds and correct misreport for gfx908.  Denial of gfx906 may be
overridden by setting HSA_IGNORE_SRAMECC_MISREPORT=1.

Change-Id: I7d5c3a716fad01c348f8b88cd508cedbf914c989
2021-04-01 00:03:32 -04:00
Cole Nelson 72fa4a17fa hsa-runtime: add ENABLE_LDCONFIG to support multi-version install
Depends-On: I58fdf1d0b4e864b5a61ffe8e335d430d424811ab
Change-Id: I0cb6f8711ea5033e84b7e45ce20e7e23d84005c3
Signed-off-by: Cole Nelson <cole.nelson@amd.com>
2021-03-26 18:37:04 -04:00
Laurent Morichetti ea6ee0aa81 New trap handler ABI (v5)
Park the wave, if it is stopped, to avoid halting it at an s_endpgm
instruction if the architecture does not support it.

Free ttmp6 by converting the dispatch_ptr into a queue packet index
(25-bit) and storing it in ttmp7[24:0].

Save the exception PC in ttmp11[22:7] ttmp6[31:0].

Change-Id: Iaa3c5baf5b488c0b534044d338f12bffa63ddce2
2021-03-04 21:44:14 -05:00
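
For readers decoding these registers from a tool, the bit layout described above could be read back roughly as follows. This is a sketch under assumptions: the helper names are hypothetical, and it assumes that ttmp11[22:7] holds the upper 16 bits of the exception PC while ttmp6[31:0] holds the lower 32 bits, which is one plausible reading of the commit message rather than something it states outright.

```cpp
#include <cstdint>

// Field positions taken from the commit message; names are hypothetical.
constexpr std::uint32_t kPacketIndexMask = (1u << 25) - 1;  // ttmp7[24:0]
constexpr unsigned kPcHiShift = 7;                          // ttmp11[22:7]
constexpr std::uint32_t kPcHiMask = 0xFFFFu;                // 16-bit field

// Extract the 25-bit queue packet index from ttmp7.
constexpr std::uint32_t packet_index(std::uint32_t ttmp7) {
  return ttmp7 & kPacketIndexMask;
}

// Reassemble the saved exception PC.
// Assumption: ttmp11[22:7] is PC[47:32] and ttmp6 is PC[31:0].
constexpr std::uint64_t exception_pc(std::uint32_t ttmp11,
                                     std::uint32_t ttmp6) {
  const std::uint64_t hi = (ttmp11 >> kPcHiShift) & kPcHiMask;
  return (hi << 32) | ttmp6;
}
```

Since the index is only 25 bits wide, a consumer would presumably have to combine it with the queue's read and write indices to recover the full packet position; the commit message does not spell that out.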
Laurent Morichetti 7e0f391a08 Correct the trap handler
ttmp11 no longer has an "excp_raised" field.

Change-Id: I8e673ca404c2b802470bbc9f76e7925782076c5a
2021-03-04 21:21:26 -05:00
Sean Keely 191664cd20 Insert scratch memory into scratch cache on full profile systems.
Scratch cache was not updated for IOMMUv2 systems previously.
This both negates the cache and causes a segfault during scratch
release.

Change-Id: I71e81d6b642d65ca135868ff7225ea173529d458
2021-03-03 21:30:16 -05:00
Mike Li 93609fd3d4 Support for Custom Pitch for gfx103x
Signed-off-by: Mike Li <Tianxinmike.Li@amd.com>
Change-Id: Ica83dff8bb382637010396781190f585754bd150
2021-02-22 22:05:25 -05:00
Jason Tang ec22afb8a8 Correct GetIsa() typo
Change-Id: Ia6b5a86bd035fb077f0da9d52160ec8d12987b87
2021-02-17 11:57:58 -05:00
Sean Keely 34ac62274a Correct legacy copy path.
Legacy p2p copy path incorrectly transferred in whole pages rather than
the requested size.

Change-Id: I9aa7337754f9e32f587a0cc5305f8ffeb6196f10
2021-02-10 19:53:02 -05:00
Sean Keely 01f42dbe46 Add hsa_amd_signal_value_pointer.
Enables partial signal interop with non-HSA devices.

Change-Id: Ic39bca84ed1709cbd2cc24b1eb0f4fc6cccb39cf
2021-02-10 18:47:54 -05:00
Laurent Morichetti 9ca79d072a New trap handler ABI (v4)
Replace the stop reasons ttmp11.trap_raised and ttmp11.excp_raised
with ttmp11.wave_stopped which indicates that the trap handler has
halted the wave as the result of an event (trap, single-step or
exception).

If the wave is stopped because of a trap, also record the trap_id in
ttmp11.saved_trap_id[7:0].

Save status.halt in ttmp11.saved_status_halt, so that it can be
restored when resuming a wave (changing a wave's state from stopped to
running or single-stepping).

Change-Id: I7322f59b60e8cc1b92bf5f067dba606a3109ef49
2021-02-05 09:56:01 -08:00
Evgeny c5aae30d08 adding gfx1030 blocks
Change-Id: Ide2576939c5321dbe928183a8d9984d5ef87a61b
2021-01-29 08:50:10 -06:00
Huang Rui feeb2f62e2 Add gfx10.3.3 ISA support for Van Gogh
This patch is to let ROCr recognize new gfx10.3.3 ISA.

Change-Id: Ied23eee2752e14c19c8c0a6d7789fded9940e31e
Signed-off-by: Huang Rui <ray.huang@amd.com>
2021-01-22 04:22:15 -05:00
Laurent Morichetti 8aec53969f Don't terminate waves halted at s_endpgm
To support single stepping the instruction preceding an s_endpgm,
unwind the PC by 8 bytes and set ttmp11[9] to notify the debugger
that the wave is halted with a modified PC.

Bump the debug r_version for this new trap handler ABI.

Change-Id: I55e4e0d65576f92da14a336266c31c513baab547
2021-01-21 20:51:38 -08:00
Laurent Morichetti 8808ed3177 Correct gfx10.3+ trap handler.
Change-Id: I77d2b41c8882014a430d741ecd777718a1f61561
2021-01-21 09:24:20 -08:00
Tony Tye 26fe26e415 Correct isa lookup for targets that do not support a target feature
Change-Id: I130070a53162e5d9fcc6a64a4bdda7869179be82
2021-01-18 15:47:19 +00:00
Chris Freehill 09bc75bf0d Correct some target ID strings for gfx908
Change-Id: I7833b561447b9928447cf49472cfe1ca1867e71d
2021-01-15 14:56:38 -06:00
Sean Keely 7bc6aac5d2 Correct computation of scratch slot requirements.
Each SE must be assigned equal numbers of slots and slots
must be assigned in units of whole groups.

Change-Id: I8f3677237fa6f2e2d25e3e78210c5a7a0ad792f3
2021-01-13 15:09:00 -05:00
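
The two constraints in this commit message can be illustrated with a small rounding sketch. It is not the ROCr implementation: `scratch_slots`, `num_se` and `slots_per_group` are hypothetical names, and it assumes "whole groups" means each shader engine's share is rounded up to a multiple of a fixed group size.

```cpp
#include <cstdint>

// Round `value` up to the next multiple of `multiple` (multiple > 0).
constexpr std::uint64_t round_up(std::uint64_t value, std::uint64_t multiple) {
  return (value + multiple - 1) / multiple * multiple;
}

// Hypothetical sketch: total slots to reserve so that every SE receives the
// same number of slots and that number is a whole multiple of the group size.
constexpr std::uint64_t scratch_slots(std::uint64_t requested_slots,
                                      std::uint64_t num_se,
                                      std::uint64_t slots_per_group) {
  // Divide the request across SEs, rounding up so no SE is short.
  const std::uint64_t per_se = (requested_slots + num_se - 1) / num_se;
  // Round each SE's share up to whole groups, then scale back to all SEs.
  return round_up(per_se, slots_per_group) * num_se;
}
```

With 4 SEs and 16 slots per group, a request for 100 slots gives 25 per SE, which rounds up to 32 per SE and 128 slots in total.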
Sean Keely 9fe8ccc3ee Revert "Revert "Cache scratch allocations.""
This reverts commit 7e2ba23566.

Change-Id: I3f3c257270016559f8b2e70151664f0931db28d2
2021-01-13 15:08:53 -05:00
Tony Tye 6bbf6b1c9c Improve Isa class
- Use consistent naming in Isa class.
- Remove unused Isa methods.
- Simplify Isa methods.

Change-Id: I7c4045d08fbfe0d94b3181db8ebc5e5ed8c8cc82
2021-01-10 18:23:54 +00:00
Tony 853ccc762e Store target ID in isa registry
Store target ID string in isa registry and use for returning agent and
isa name.

Change-Id: I72a20d8ff963c73d86392158aff3853e4c9bfdbd
2021-01-10 18:23:54 +00:00
Tony 12eb2764cd Correct code object V2 support
- Remove gfx800, gfx804 and gfx901 as they do not exist.
- Map the V2 note record of "AMD:AMDGPU:8:0:0" to gfx802 as they are
  the same target just connected to a different motherboard.
- Correct typo for supporting gfx902:xnack+.
- Support agent names with a minor or stepping version greater than 9.

Change-Id: Ife933449f60ab4687e2aaab9baf4c9fc5b86339d
2021-01-10 18:23:54 +00:00