Allows determining if the host can directly access HMM memory that
is physically resident in vram.
Change-Id: Ie452eedd0e27fe1b511afd416f5a1cd01b3d84e8
Enables the fragment allocator to handle >2MB allocations, maintaining
good TLB alignment. Prior code contained a bug that caused the effective
API granule for vram allocations >2MB to be bumped to 2MB.
Also adjusts the block cache's retention heuristic so that discarded
blocks are not counted as in use. This reduces block retention when a
significant number of large blocks or IPC are in use.
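A minimal sketch of the intended sizing rule, assuming a 4 KiB API
granule and a 2 MiB TLB-friendly alignment. The names PlanAllocation,
kApiGranule and kTlbAlign are hypothetical and are not taken from the
ROCr sources:

```cpp
#include <cassert>
#include <cstddef>

// Assumed values, for illustration only.
constexpr std::size_t kApiGranule = 4096;             // size granule seen by the API
constexpr std::size_t kTlbAlign = 2 * 1024 * 1024;    // 2MB fragment alignment

constexpr std::size_t AlignUp(std::size_t v, std::size_t a) {
  return (v + a - 1) / a * a;
}

struct Plan {
  std::size_t size;       // bytes actually reserved
  std::size_t alignment;  // base address alignment
};

// Large requests get a 2MB-aligned base, but their size is still rounded
// to the API granule only, not bumped to a multiple of 2MB.
Plan PlanAllocation(std::size_t bytes) {
  assert(bytes > 0);
  Plan p;
  p.size = AlignUp(bytes, kApiGranule);
  p.alignment = (p.size >= kTlbAlign) ? kTlbAlign : kApiGranule;
  return p;
}
```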
Change-Id: I30bd85eb87951df822211f799d9cfe579ab109c6
Under high async handler load, signal retention and event sorting
become bottlenecks. This change processes more handlers in a
single pass to amortize wait_any overheads.
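The batching idea can be sketched as follows. This is an illustration
only: the Handler struct and DrainReady are hypothetical stand-ins, and
the real async handler loop is more involved. After one wake-up, every
handler whose signal is already satisfied is collected, rather than one
handler per wait:

```cpp
#include <vector>

// Hypothetical stand-in for a registered async handler.
struct Handler {
  bool signaled;  // true if the handler's wait condition is already met
  int id;
};

// Collects every ready handler in a single pass over the pending list and
// leaves only the not-yet-ready handlers behind.
std::vector<int> DrainReady(std::vector<Handler>& pending) {
  std::vector<int> fired;
  std::vector<Handler> still_pending;
  for (const Handler& h : pending) {
    if (h.signaled) {
      fired.push_back(h.id);
    } else {
      still_pending.push_back(h);
    }
  }
  pending.swap(still_pending);
  return fired;
}
```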
Change-Id: I8b276e102db647e3858e120547aa0c6fca85ab4c
The old memory properties info name was still used after branches were
removed. This caused the CPU coarse grain pool to be initialized with
random bits.
Change-Id: I397bc5ecf09fab69bdf1d7fafadcf54d71b64070
Prevents poorly written tools that throw exceptions from tools
interface callbacks from causing ROCr to catch the exception and
return a generic error code.
Change-Id: I2f5bf7104dc7d4ee688eb48423c7ffdb06bd7702
Old logic did not consider memory held in the scratch cache to be
free when deciding whether or not to reclaim.
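A sketch of the corrected accounting, using hypothetical names
(ScratchState, NeedsReclaim) that do not come from the ROCr sources.
Memory sitting in the scratch cache is reusable, so it counts as free:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical view of the scratch pool, in bytes.
struct ScratchState {
  std::size_t total;   // size of the scratch pool
  std::size_t active;  // scratch currently assigned to queues
  std::size_t cached;  // scratch held in the cache, not assigned to any queue
};

// Cached memory is deliberately left out of the subtraction: the old logic
// effectively computed total - (active + cached) and reclaimed too eagerly.
bool NeedsReclaim(const ScratchState& s, std::size_t request) {
  assert(s.active + s.cached <= s.total);
  const std::size_t free_bytes = s.total - s.active;
  return request > free_bytes;
}
```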
Change-Id: I7f7c7549c72d743edbf7c53489fe9a453dc4177a
Clarify behavior of hsa_ven_amd_loader_iterate_executables during
concurrent calls of executable creation and destruction.
Change-Id: Idc3e3981d4fcc0d58d9f1b7a7578deed20aa490b
Includes some workarounds and HMM.
Conflicts:
opensrc/hsa-runtime/core/runtime/amd_topology.cpp
opensrc/hsa-runtime/core/util/flag.h
Change-Id: I22976f07964a43dbb228a6231777dbd599112b8d
When no ISAs are available, no callbacks should be invoked. This
is not an error and should return success.
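The expected behavior can be sketched as below. Status and IterateIsas
are hypothetical stand-ins (the real API uses hsa_status_t and HSA
handle types); the point is only that an empty list falls through to
success without invoking the callback:

```cpp
#include <vector>

// Hypothetical stand-in for hsa_status_t.
enum Status { kSuccess = 0, kError = 1 };

// Invokes cb once per ISA and stops at the first non-success result.
// With an empty list the loop body never runs and success is returned.
template <typename Callback>
Status IterateIsas(const std::vector<int>& isas, Callback cb) {
  for (int isa : isas) {
    const Status s = cb(isa);
    if (s != kSuccess) return s;
  }
  return kSuccess;
}
```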
Change-Id: Ie4048aa8cbe5c3fdf5431f6a865021549ecf8a13
Sramecc is misreported in kfd 4.0 and prior. To prevent possible
corruption due to d16 instructions, deny use of gfx906 with older
kfds and correct misreport for gfx908. Denial of gfx906 may be
overridden by setting HSA_IGNORE_SRAMECC_MISREPORT=1.
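A sketch of how such an override check might look. Only the variable
name HSA_IGNORE_SRAMECC_MISREPORT comes from this change; the function
names and the exact parsing of the value are assumptions:

```cpp
#include <cstdlib>
#include <cstring>

// Pure decision: gfx906 is allowed unless the KFD is known to misreport
// sramecc and the user has not opted in with the value "1".
bool AllowGfx906(bool kfd_misreports_sramecc, const char* override_value) {
  const bool ignore =
      override_value != nullptr && std::strcmp(override_value, "1") == 0;
  return !kfd_misreports_sramecc || ignore;
}

// Convenience wrapper that reads the override from the environment.
bool AllowGfx906FromEnv(bool kfd_misreports_sramecc) {
  return AllowGfx906(kfd_misreports_sramecc,
                     std::getenv("HSA_IGNORE_SRAMECC_MISREPORT"));
}
```

For example, running an application as
HSA_IGNORE_SRAMECC_MISREPORT=1 ./app would take the override path.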
Change-Id: I7d5c3a716fad01c348f8b88cd508cedbf914c989
Park the wave, if it is stopped, to avoid halting it at an s_endpgm
instruction if the architecture does not support it.
Free ttmp6 by converting the dispatch_ptr into a queue packet index
(25-bit) and storing it in ttmp7[24:0].
Save the exception PC in ttmp11[22:7] and ttmp6[31:0].
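The register layout above can be modeled on the host side as plain bit
manipulation. This is a sketch, not the trap handler itself (which is
written in shader assembly). It assumes the upper 16 bits of a 48-bit
PC go in ttmp11[22:7] and the lower 32 bits in ttmp6; all names are
hypothetical:

```cpp
#include <cstdint>

constexpr std::uint32_t kPacketIndexMask = (1u << 25) - 1;  // ttmp7[24:0]
constexpr std::uint32_t kPcHiShift = 7;
constexpr std::uint32_t kPcHiMask = 0xFFFFu << kPcHiShift;  // ttmp11[22:7]

// Stores a 25-bit queue packet index, preserving ttmp7[31:25].
constexpr std::uint32_t SetPacketIndex(std::uint32_t ttmp7, std::uint32_t index) {
  return (ttmp7 & ~kPacketIndexMask) | (index & kPacketIndexMask);
}

struct Ttmps {
  std::uint32_t ttmp6;
  std::uint32_t ttmp11;
};

// Splits the PC across the two registers (assumed split, see lead-in).
Ttmps SavePc(Ttmps t, std::uint64_t pc) {
  t.ttmp6 = static_cast<std::uint32_t>(pc);
  const std::uint32_t hi = static_cast<std::uint32_t>(pc >> 32) & 0xFFFFu;
  t.ttmp11 = (t.ttmp11 & ~kPcHiMask) | (hi << kPcHiShift);
  return t;
}

// Reassembles the PC as a debugger might.
std::uint64_t LoadPc(Ttmps t) {
  const std::uint64_t hi = (t.ttmp11 & kPcHiMask) >> kPcHiShift;
  return (hi << 32) | t.ttmp6;
}
```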
Change-Id: Iaa3c5baf5b488c0b534044d338f12bffa63ddce2
The scratch cache was not previously updated on IOMMUv2 systems.
This both negated the cache and caused a segfault during scratch
release.
Change-Id: I71e81d6b642d65ca135868ff7225ea173529d458
Replace the stop reasons ttmp11.trap_raised and ttmp11.excp_raised
with ttmp11.wave_stopped which indicates that the trap handler has
halted the wave as the result of an event (trap, single-step or
exception).
If the wave is stopped because of a trap, also record the trap_id in
ttmp11.saved_trap_id[7:0].
Save status.halt in ttmp11.saved_status_halt, so that it can be
restored when resuming a wave (changing a wave's state from stopped to
running or single-stepping).
Change-Id: I7322f59b60e8cc1b92bf5f067dba606a3109ef49
Let ROCr recognize the new gfx10.3.3 ISA.
Change-Id: Ied23eee2752e14c19c8c0a6d7789fded9940e31e
Signed-off-by: Huang Rui <ray.huang@amd.com>
To support single stepping the instruction preceding an s_endpgm,
unwind the PC by 8 bytes and set ttmp11[9] to notify the debugger
that the wave is halted with a modified PC.
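A host-side model of this convention, as a sketch. The bit position and
the 8-byte adjustment come from the description above; the function
names and the debugger-side reconstruction are assumptions:

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint32_t kPcUnwoundBit = 1u << 9;  // ttmp11[9]
constexpr std::uint64_t kUnwindBytes = 8;

// Trap handler side: move the PC back by 8 bytes and flag the change.
void UnwindBeforeHalt(std::uint64_t& pc, std::uint32_t& ttmp11) {
  assert(pc >= kUnwindBytes);
  pc -= kUnwindBytes;
  ttmp11 |= kPcUnwoundBit;
}

// Debugger side (assumed): recover the PC the wave actually reached.
std::uint64_t ReportedPc(std::uint64_t saved_pc, std::uint32_t ttmp11) {
  return (ttmp11 & kPcUnwoundBit) ? saved_pc + kUnwindBytes : saved_pc;
}
```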
Bump the debug r_version for this new trap handler ABI.
Change-Id: I55e4e0d65576f92da14a336266c31c513baab547
Each SE must be assigned equal numbers of slots and slots
must be assigned in units of whole groups.
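Both constraints hold when the total slot count is a multiple of
(number of SEs x group size). A sketch with hypothetical parameter
names, rounding a requested count down to the nearest valid value:

```cpp
#include <cassert>
#include <cstdint>

// Returns the largest slot count <= requested that gives every SE the same
// number of slots, with each SE's share made of whole groups.
std::uint32_t RoundSlots(std::uint32_t requested, std::uint32_t num_se,
                         std::uint32_t group_size) {
  assert(num_se > 0 && group_size > 0);
  const std::uint32_t unit = num_se * group_size;
  return requested / unit * unit;
}
```

For example, with 4 SEs and groups of 8 slots, a request for 100 slots
rounds down to 96, which is 24 slots (3 groups) per SE.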
Change-Id: I8f3677237fa6f2e2d25e3e78210c5a7a0ad792f3
- Remove gfx800, gfx804 and gfx901 as they do not exist.
- Map the V2 note record of "AMD:AMDGPU:8:0:0" to gfx802 as they are
the same target just connected to a different motherboard.
- Correct typo for supporting gfx902:xnack+.
- Support agent names with a minor or stepping version greater than 9.
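For the last item, a sketch of name formatting that stays unambiguous
past 9. It assumes the LLVM AMDGPU naming convention, where the major
version is decimal and the minor and stepping are one hexadecimal digit
each (for example, 9.0.10 is gfx90a). AgentName is a hypothetical
helper, not the ROCr implementation:

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Formats "gfx" + major (decimal) + minor (hex digit) + stepping (hex digit).
std::string AgentName(unsigned major, unsigned minor, unsigned stepping) {
  assert(minor < 16 && stepping < 16);  // one hex digit each
  char buf[32];
  std::snprintf(buf, sizeof(buf), "gfx%u%x%x", major, minor, stepping);
  return std::string(buf);
}
```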
Change-Id: Ife933449f60ab4687e2aaab9baf4c9fc5b86339d