The trap handler should read the PERF_SNAPSHOT_DATA after all of
PERF_SNAPSHOT_DATA, PERF_SNAPSHOT_PC_LO and PERF_SNAPSHOT_PC_HI. This
patch fixes the read order.
Change-Id: I7f78e16d7a0d8bfebb34906b4dff73c2eaeb5658
[ROCm/ROCR-Runtime commit: 6a4785f650]
Make sure to clear the HOST_TRAP and PERF_SNAPSHOT bits before returning
from the second level trap handler. As those bits are sticky, this
ensures future re-entry to the trap handler (for context save for
example) will not be confused with a sampling trap.
Change-Id: I05e5e58779a650b324ac6e30d574dc6931340f13
Signed-off-by: Lancelot SIX <lancelot.six@amd.com>
[ROCm/ROCR-Runtime commit: eece210a5c]
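The sticky-bit clearing described in the entry above can be modelled on the host as a plain mask operation. This is only a sketch: the bit positions and the names kHostTrapBit, kPerfSnapshotBit, ClearSamplingBits and IsSamplingTrap are hypothetical and are not taken from the trap handler source, which is written in shader assembly.

```cpp
#include <cstdint>

// Hypothetical bit positions, chosen only for illustration. The real
// layout is defined by the hardware and is not given in this log.
constexpr std::uint32_t kHostTrapBit     = 1u << 0;
constexpr std::uint32_t kPerfSnapshotBit = 1u << 1;

// Returns the status word with both sticky sampling bits cleared and
// every other bit left untouched.
constexpr std::uint32_t ClearSamplingBits(std::uint32_t status) {
  return status & ~(kHostTrapBit | kPerfSnapshotBit);
}

// True if either sampling bit is still set, which is what would make a
// later trap entry look like a sampling trap.
constexpr bool IsSamplingTrap(std::uint32_t status) {
  return (status & (kHostTrapBit | kPerfSnapshotBit)) != 0;
}
```

Under these assumptions, a later entry for context save sees IsSamplingTrap() return false as long as the previous exit path went through ClearSamplingBits().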
Detect whether the loaded driver is the upstream or the DKMS version
and add a filter for the tests that fail with the upstream driver.
Signed-off-by: Apurv Mishra <Apurv.Mishra@amd.com>
[ROCm/ROCR-Runtime commit: 10530fa2a7]
Adding a general stage for agents to release their resources on
shutdown. This avoids a circular dependency during shutdown because
we have to delete allocated resources before deleting memory pools, but
we also have to delete memory pools before destroying agents.
[ROCm/ROCR-Runtime commit: 947391deac]
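The three-stage shutdown order described in the entry above can be sketched as follows. RuntimeModel and its members are hypothetical stand-ins, not the ROCR classes; the asserts only encode the two ordering constraints stated in the commit message.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical model of the runtime state. Not the ROCR implementation.
struct RuntimeModel {
  bool resources_live = true;
  bool pools_live = true;
  bool agents_live = true;
  std::vector<std::string> log;

  // Stage 1: agents free the allocations they own.
  void ReleaseAgentResources() {
    resources_live = false;
    log.push_back("release agent resources");
  }
  // Stage 2: pools may go away only once nothing is allocated from them.
  void DeleteMemoryPools() {
    assert(!resources_live && "allocations must be freed before their pools");
    pools_live = false;
    log.push_back("delete memory pools");
  }
  // Stage 3: agents are destroyed last.
  void DestroyAgents() {
    assert(!pools_live && "pools must be deleted before agents");
    agents_live = false;
    log.push_back("destroy agents");
  }
};

// Runs the stages in the only order that satisfies both constraints.
inline std::vector<std::string> Shutdown(RuntimeModel& rt) {
  rt.ReleaseAgentResources();
  rt.DeleteMemoryPools();
  rt.DestroyAgents();
  return rt.log;
}
```

Without the separate release stage, freeing allocations would have to happen inside agent destruction, which runs after the pools are already gone.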
The initial call to Refresh() in the constructor is
unnecessary as it's handled in Runtime::Load().
Signed-off-by: lyndonli <Lyndon.Li@amd.com>
[ROCm/ROCR-Runtime commit: c34a2798ce]
The debugger override will set the initial request mask to the
previously set request mask so use a different mask to assert
enablement.
Traps on wave start and end also run back to back, so fix the
previous override mask check as well.
In addition, unlike instruction traps, trap on wave start and end
will not require a rewind of the program counter on wave exit.
[ROCm/ROCR-Runtime commit: c710a06ee0]
In PcSamplingCreateFromId, convert number of bytes into number of
dwords because DmaFill expects a count of 32-bit words, not raw bytes.
This prevents OOB writes on large sampling buffers.
[ROCm/ROCR-Runtime commit: 2ae70735e8]
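The unit conversion described in the entry above amounts to dividing the byte size by the width of a 32-bit word. The helper name BytesToDwords is hypothetical, and the sketch assumes the buffer size is a multiple of four bytes; it rounds down so that a fill can never run past the end of the buffer.

```cpp
#include <cstddef>
#include <cstdint>

// Converts a buffer size in bytes into the number of whole 32-bit
// words it contains. Rounds down, so any trailing 1-3 bytes are not
// counted.
constexpr std::size_t BytesToDwords(std::size_t num_bytes) {
  return num_bytes / sizeof(std::uint32_t);
}
```

For a 4096-byte buffer this yields 1024. Passing 4096 as a dword count instead would ask the fill to cover 16384 bytes, four times the allocation.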
The overarching goal is to provide an API that pre-silicon models can latch onto for software bring-up.
[ROCm/ROCR-Runtime commit: d4b85b6bf5]
Poll the dependent signals twice on all gfx9.0 GPUs except gfx90a.
This is needed as a work-around for a rare issue where SDMA_POLL_REGMEM
may return before the memory is actually cleared.
[ROCm/ROCR-Runtime commit: 6903a41b1d]
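The target selection described in the entry above can be sketched as a predicate on the ISA version. This assumes that "gfx9.0" means major version 9 and minor version 0, and that gfx90a is stepping 10 (0xa) of that family; NeedsDoublePoll and PollsPerSignal are hypothetical names, not functions from the runtime.

```cpp
#include <cstdint>

// True for gfx9.0 targets other than gfx90a (9.0.10), which is the
// set the work-around is described as covering.
constexpr bool NeedsDoublePoll(std::uint32_t major, std::uint32_t minor,
                               std::uint32_t stepping) {
  return major == 9 && minor == 0 && stepping != 10;
}

// Number of poll commands to emit for each dependent signal.
constexpr int PollsPerSignal(std::uint32_t major, std::uint32_t minor,
                             std::uint32_t stepping) {
  return NeedsDoublePoll(major, minor, stepping) ? 2 : 1;
}
```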
Reset event_age when signals move. Prior to this PR, event_age
could fall out of sync with hsa_event, causing hangs if event_age
exceeded the true hsa_event age.
[ROCm/ROCR-Runtime commit: d2a89a467b]
If the parent runs ahead of the child and the child has not yet called the
second raise(SIGSTOP), the parent's "waitpid(childPid, &childStatus, 0)"
returns with childStatus set to 0x137f, which means the child was stopped
by SIGSTOP.
Signed-off-by: Emily Deng <Emily.Deng@amd.com>
[ROCm/ROCR-Runtime commit: 42f79776cd]
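The status value 0x137f from the entry above can be decoded with the standard wait macros. On Linux the low byte 0x7f marks a stopped child and the next byte carries the stop signal; 0x13 is 19, which is SIGSTOP on x86 and Arm Linux. The helper names StoppedBy and StopSignal are hypothetical and are not part of the test code.

```cpp
#include <cassert>
#include <csignal>
#include <sys/wait.h>

// True if a waitpid() status reports that the child was stopped by
// the signal `sig`.
inline bool StoppedBy(int status, int sig) {
  return WIFSTOPPED(status) && WSTOPSIG(status) == sig;
}

// Returns the signal that stopped the child. Only meaningful when the
// status describes a stopped child, which the assert enforces.
inline int StopSignal(int status) {
  assert(WIFSTOPPED(status));
  return WSTOPSIG(status);
}
```

A parent that calls waitpid() in a loop could use StoppedBy(childStatus, SIGSTOP) to tell an early stop notification apart from a normal exit status.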
If the child reaches the second raise(SIGSTOP) and the parent sends
PTRACE_CONT, the child then exits. The parent then asserts in
DeviceSnapshot because kfd_ioctl cannot get the mm from the child pid.
Signed-off-by: Emily Deng <Emily.Deng@amd.com>
[ROCm/ROCR-Runtime commit: 91ef44d3ec]
Reduce the memory allocated for GFX VRAM, as the KFD Evict test
hit intermittent page faults, which may be caused by the
larger GFX CS BO size.
[ROCm/ROCR-Runtime commit: 85c4b0020a]
Blacklist KFDNegativeTest.BasicPipeReset from gfx950 until MEC can
support pipe reset on GC 9.5.0.
Signed-off-by: Amber Lin <Amber.Lin@amd.com>
[ROCm/ROCR-Runtime commit: fcf3f91379]
Remove hard assertions for signal validation on hsa_amd_signal_wait_* operations. Instead, ignore 0/NULL/invalid signals in the dependency condition evaluation to align with the HSA specs for barrier-AND and barrier-OR packets.
Signed-off-by: zichguan-amd <zichuan.guan@amd.com>
[ROCm/ROCR-Runtime commit: e4d027191c]
The scratch_backing_memory_byte_size is not used by CP, but it is
currently used by rocgdb. Putting the field back, but we need to find a
solution for alt_scratch_backing_memory_byte_size.
Also, completely disable alternate scratch for now, as it needs some
changes to support the debugger.
[ROCm/ROCR-Runtime commit: 02b38d0614]
This is primarily used for debug and negative testing for SDMA queue
reset and shouldn't be used for normal run cases.
[ROCm/ROCR-Runtime commit: d047708317]
We cannot guarantee system-scope coherency on systems with only PCIe
connections, so do not expose extended fine-grain memory pool on these
systems.
[ROCm/ROCR-Runtime commit: 6dac90c89a]