This patch makes the get_block_properties() function work on the gfx1033 platform.
Change-Id: Ie5be7dfb38575eec8b39b91f3ee5b3a31abe8bd1
Signed-off-by: Chen Gong <curry.gong@amd.com>
[ROCm/ROCR-Runtime commit: 4cf50fdeaa]
This patch adds Van Gogh support to the thunk.
Change-Id: I75819329b865e4c38c097e83e3a0cb4e4f566fa2
Signed-off-by: Huang Rui <ray.huang@amd.com>
[ROCm/ROCR-Runtime commit: 9600760ff7]
The memory tests for iommuv2 and dgpu_fallback are different, so the
two cases need to be distinguished.
Change-Id: Icc64e9ae0fc1638c3d148795a5f247d9e5e8e503
Signed-off-by: changzhu <Changfeng.Zhu@amd.com>
[ROCm/ROCR-Runtime commit: 39386c03bf]
The default kfdtest timeout is not enough for certain platforms, and
tests are failing.
Change-Id: I2027eadcbeb12a2fbbc9c55f92f31869fa13dbcb
Signed-off-by: Philip Cox <Philip.Cox@amd.com>
[ROCm/ROCR-Runtime commit: 4bbfbe7789]
Check GPU peer accessibility using the p2p_links in the system.
Signed-off-by: Gang Ba <gaba@amd.com>
Change-Id: I026f16564303b687811d6648f0b7f84be6819979
[ROCm/ROCR-Runtime commit: 8e94dde685]
Add missing target names and make all parts consistent with which
targets are supported.
- Add gfx805 as a supported target.
- Add all ELF targets to generic code.
- Make offline loader match supported targets.
Change-Id: Idab4d69edc71645aecaa83aa55e29c1aeee4c1d6
[ROCm/ROCR-Runtime commit: b443397bcc]
Now that symlinks aren't necessarily guaranteed, use "find" to try to
find rocm-smi, and clarify the error message if it is not found.
Also tie in a fix for parsing the output now that the output has changed.
Change-Id: I2081442a71731c186c3ad00585a2ba6e8a8e5a28
[ROCm/ROCR-Runtime commit: 2651ce37d8]
Kernel argument size and alignment queries are not supported on
code object v3.
Change-Id: I1bdd34e2e62132f912ac39d80355efd3456df87c
[ROCm/ROCR-Runtime commit: 6182abf5e9]
Code object V2 had the ability to support the following queries:
- HSA_CODE_SYMBOL_INFO_KERNEL_KERNARG_SEGMENT_SIZE
- HSA_EXECUTABLE_SYMBOL_INFO_KERNEL_KERNARG_SEGMENT_SIZE
- HSA_CODE_SYMBOL_INFO_KERNEL_KERNARG_SEGMENT_ALIGNMENT
- HSA_EXECUTABLE_SYMBOL_INFO_KERNEL_KERNARG_SEGMENT_ALIGNMENT
However, code object V3 onwards cannot support these because the kernel
descriptor changed. These queries need to be deprecated.
Until then return more reasonable values:
- For kernarg alignment return 16 which is the minimum alignment
required by the HSA standard.
- For kernarg size return the field from the kernel descriptor which
is a hint. If it is 0 then the compiler is not specifying the kernarg
size, or the kernel has no kernarg.
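
As an illustration only, the interim behaviour described above might
look like the sketch below. The struct and function names are
hypothetical and are not taken from the ROCr sources; only the values
(alignment 16, size read from the kernel descriptor hint) come from
this message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the part of a code object V3 kernel
 * descriptor that carries the kernarg size hint. */
typedef struct {
    uint32_t kernarg_size; /* 0: not specified by compiler, or no kernarg */
} example_kernel_descriptor_t;

/* Minimum kernarg alignment named in the commit message. */
#define EXAMPLE_KERNARG_MIN_ALIGN 16u

/* Alignment query: always report the minimum alignment. */
uint32_t example_kernarg_alignment(void)
{
    return EXAMPLE_KERNARG_MIN_ALIGN;
}

/* Size query: pass through the descriptor hint unchanged. */
uint32_t example_kernarg_size(const example_kernel_descriptor_t *kd)
{
    assert(kd != NULL);
    return kd->kernarg_size;
}
```

A caller that sees a size of 0 cannot tell "no kernarg" from "size not
specified", which is why the value is only a hint.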
Change-Id: I19ce6cd0f3658a2bf62277492f39100ea5ab4256
[ROCm/ROCR-Runtime commit: ef755e4c82]
Avoids calling into KFD to map/unmap scratch allocations for
every dispatch that uses large scratch.
Change-Id: I9fab5705251ec82b03e4f2f2ca6da7cdccabefb9
[ROCm/ROCR-Runtime commit: 27e044ae4d]
Improves HIP event performance in directed benchmarks where
clock sync latency is significant.
Change-Id: I78b724a14a8f5b6a9a2b9f4d85afe9d8b81808a6
[ROCm/ROCR-Runtime commit: 32d0fcafa9]
The modern meaning of the construct if( NOT ON ) was added in CMake 2.8,
but when cmake_minimum_required is not set in user code and no policy
level is set in the CMake config, then CMake 2.8 features cannot be
used. In old CMake (the default), ON is interpreted as a variable, and
because it is not defined, it is considered false. The same is true of
OFF.
This change sets a variable as ON, so that old CMake interpretation is
correct, and the if works as expected regardless of policy version.
Change-Id: I67d7ed4ceaf8248eeb5a1c7f54009d72313f3f5d
[ROCm/ROCR-Runtime commit: 4a35f560f6]
Package names tested and confirmed good:
hsa-rocr-dev_1.2.0.30900-crdnnv.415_amd64.deb
hsa-rocr-dev-1.2.0.30900-crdnnv.415.el7.x86_64.rpm
hsa-rocr-dev-1.2.0.30900-crdnnv.sles151.415.x86_64.rpm
http://confluence.amd.com/display/GPUCPT/Package+File+Naming
Note: rpm requires 'devel' instead of 'dev', to be a subsequent
patchset.
Change-Id: Id6a422f3c335448b52c70c77ed39c9041114b80f
Signed-off-by: Cole Nelson <cole.nelson@amd.com>
[ROCm/ROCR-Runtime commit: 90f2dd5b1b]
1. Create P2P links
2. Determine FRAMEBUFFER_PUBLIC/PRIVATE based only on
host-accessibility, not peer-accessibility
Signed-off-by: Gang Ba <gaba@amd.com>
Change-Id: I15fccdc60386b453e2a47849a16df15157324b21
[ROCm/ROCR-Runtime commit: bedecc5957]
RPM needs _REQUIRES at the end, not _DEPENDS, and also requires a space
before the version of the required package.
Change-Id: I9dd70bd92fc2407b7e8b31e4d46df43c52438a65
Signed-off-by: Kent Russell <kent.russell@amd.com>
[ROCm/ROCR-Runtime commit: 089fdeb1fe]
This reverts commit 35b07e1e28.
Reason for revert: This commit caused a regression in the rocrtst memory
subtest: Maximum Single Allocation in Memory Pools failed.
Change-Id: I15330625603f893200a08cd8b5b097f9bf95361f
[ROCm/ROCR-Runtime commit: e515fd818b]
This fixes a build issue with kfdtest and the amdgpu pro driver build.
This was requested as kfdtest is needed for regular testing due to the
inclusion of the ROCr/KFD stack in the amdgpu pro driver (OSGSUP-199)
Change-Id: I224d2e9ee3f02065596890b4d8226484f4fac04f
Signed-off-by: Jeremy Newton <Jeremy.Newton@amd.com>
[ROCm/ROCR-Runtime commit: 8026ba250c]
We want to compare bits, not check if a defined value is true
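
The kind of mistake being fixed can be sketched as follows. The flag
name and value are hypothetical; the identifiers actually touched by
this change are not shown in this message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define EXAMPLE_FLAG_COHERENT (1u << 2) /* hypothetical flag bit */

/* Wrong: tests whether the constant itself is non-zero, so the result
 * is always true and the caller's flags are ignored. */
bool example_flag_check_wrong(uint32_t flags)
{
    (void)flags;
    return EXAMPLE_FLAG_COHERENT ? true : false;
}

/* Right: tests whether the bit is set in the caller's flags. */
bool example_flag_check_right(uint32_t flags)
{
    return (flags & EXAMPLE_FLAG_COHERENT) != 0;
}
```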
Change-Id: Ie51ede96d18eae01aff6677d852a056ee12bd9c6
[ROCm/ROCR-Runtime commit: e34dfa8ebd]
There is no default case, and we were missing a few types defined from
hsakmttypes.h. This was found via clang
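
A minimal sketch of the pattern, assuming a made-up enum rather than
the real types from hsakmttypes.h: every enumerator gets a case, and a
default catches values added later.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical enum standing in for a type list from a header. */
typedef enum {
    EXAMPLE_TYPE_A,
    EXAMPLE_TYPE_B,
    EXAMPLE_TYPE_C
} example_type_t;

/* Map each enumerator to a name; unknown values fall to the default. */
const char *example_type_name(example_type_t t)
{
    switch (t) {
    case EXAMPLE_TYPE_A:
        return "A";
    case EXAMPLE_TYPE_B:
        return "B";
    case EXAMPLE_TYPE_C:
        return "C";
    default:
        return "unknown";
    }
}
```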
Change-Id: I26193cb111a9d8220b1eff21c7313fe060288f36
[ROCm/ROCR-Runtime commit: 761d9d84d2]
While the ternary is nice to read, strlen in general is an expensive
call, so call it once and check if the value is greater than our maximum
allowable string length and adjust accordingly.
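
For illustration, the two forms might look like this. The function
names and the maximum length are hypothetical, and the ternary shown is
only a guess at the shape of the original code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define EXAMPLE_MAX_LEN 63u /* hypothetical maximum string length */

/* Ternary form: reads well, but may walk the string twice. */
size_t example_clamped_len_ternary(const char *s)
{
    return strlen(s) > EXAMPLE_MAX_LEN ? EXAMPLE_MAX_LEN : strlen(s);
}

/* Single-call form: compute the length once, then clamp it. */
size_t example_clamped_len_once(const char *s)
{
    assert(s != NULL);
    size_t len = strlen(s);
    if (len > EXAMPLE_MAX_LEN)
        len = EXAMPLE_MAX_LEN;
    return len;
}
```

Both functions return the same value; the second only avoids the
repeated scan.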
Signed-off-by: Kent Russell <kent.russell@amd.com>
Change-Id: Id744f2ba0eb81bb2b3c52eb69f38a615398a655d
[ROCm/ROCR-Runtime commit: 025036a662]
Don't update the vm_object if GPU mapping failed. Print an error message
to help diagnose underlying problems.
Change-Id: I801ab6fe6c155bd25e6c0358007c106a4a019480
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
[ROCm/ROCR-Runtime commit: c7e6f5a274]
Use MAP_POPULATE when allocating anonymous system memory for later
GPU mapping as a userptr. This can speed up large allocations by
more than a factor of 2. I suspect populating pages in this way is more
efficient than the CPU page fault code path triggered by
get_user_pages in the kernel.
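
A minimal Linux-only sketch of such an allocation, assuming a
hypothetical helper name; the thunk's real allocation path is not
reproduced here.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* exposes MAP_ANONYMOUS and MAP_POPULATE on glibc */
#endif
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Allocate anonymous memory and ask the kernel to pre-fault its pages,
 * so a later userptr registration does not have to fault them in.
 * Returns NULL on failure. */
void *example_alloc_populated(size_t size)
{
    assert(size > 0);
    void *p = mmap(NULL, size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_POPULATE, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}
```

The mapping is released with munmap(ptr, size) as usual.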
Change-Id: I188bbc1462ccb650d48cbfb1080dbb8eb7ada8b5
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
[ROCm/ROCR-Runtime commit: 8f26c0c40c]
Temporary workaround while language and compiler teams sort out
handling both modes.
Change-Id: I5d676cd546382dba05ec0b62bb885baa854614f6
[ROCm/ROCR-Runtime commit: a09ba8bcc8]
On gfx9, the maximum number of wavefronts per queue is the minimum of
40 waves per compute unit and 512 waves per shader engine. On gfx10,
there can only be 32 waves per compute unit.
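
The rule above can be sketched as below. The function name and
parameters are hypothetical. Whether the per-shader-engine cap also
applies on gfx10 is not stated in this message, so the sketch assumes
it does not.

```c
#include <assert.h>
#include <stdint.h>

/* Maximum wavefronts per queue for a given gfx major version.
 * gfx9:  min(40 per compute unit, 512 per shader engine).
 * gfx10: 32 per compute unit (assumption: no shader-engine cap). */
uint32_t example_max_waves_per_queue(unsigned gfx_major,
                                     uint32_t num_cu,
                                     uint32_t num_se)
{
    assert(gfx_major == 9 || gfx_major == 10);
    if (gfx_major == 10)
        return num_cu * 32u;

    uint32_t by_cu = num_cu * 40u;
    uint32_t by_se = num_se * 512u;
    return by_cu < by_se ? by_cu : by_se;
}
```

With 64 compute units and 4 shader engines on gfx9, the shader-engine
term is the smaller one (2048 against 2560).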
Signed-off-by: Laurent Morichetti <laurent.morichetti@amd.com>
Change-Id: I148d1a4fe6c07cdbfaa1f77939eb29311c81c008
[ROCm/ROCR-Runtime commit: 783e346777]
Fix an issue where rocrtst could not be built if the out directory
was outside the src (WORK_ROOT) directory due to hard-coded
relative path for OPENCL_INC_DIR.
Change-Id: Icb93de2266d568e9c2437166e34c88ec526fb45c
[ROCm/ROCR-Runtime commit: 8d00f1aa59]
Reserve some space in the context save area for the debugger's
use. There should be 32 bytes per wave for a given queue.
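
As a rough sketch of the sizing, with a hypothetical function name and
the wave count taken as an input:

```c
#include <assert.h>
#include <stdint.h>

#define EXAMPLE_DEBUGGER_BYTES_PER_WAVE 32u /* from the commit message */

/* Bytes to reserve for the debugger in a queue's context save area. */
uint32_t example_debugger_reserve_bytes(uint32_t waves_per_queue)
{
    /* Guard against 32-bit overflow of the product. */
    assert(waves_per_queue <= UINT32_MAX / EXAMPLE_DEBUGGER_BYTES_PER_WAVE);
    return waves_per_queue * EXAMPLE_DEBUGGER_BYTES_PER_WAVE;
}
```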
Change-Id: I65ddb6123d0f6afd3149844617ad19023009101d
[ROCm/ROCR-Runtime commit: 2ed2e46b9b]
The queue control stack size cannot exceed 0x7000 on ASICs
gfx1010 through gfx1031. The lower limit is not achievable
with AQL so this should have no practical effect.
Fixes control stack size overflow on large ASICs.
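
One way to express the cap, as a sketch only. The version encoding
macro and function name are hypothetical; only the 0x7000 limit and the
gfx1010 to gfx1031 range come from this message.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical encoding of a gfx version as major/minor/stepping. */
#define EXAMPLE_GFX_VER(major, minor, step) \
    (((uint32_t)(major) << 16) | ((uint32_t)(minor) << 8) | (uint32_t)(step))

#define EXAMPLE_CTL_STACK_MAX 0x7000u /* limit from the commit message */

/* Clamp the control stack size on gfx1010 through gfx1031 (inclusive);
 * other ASICs are returned unchanged. */
uint32_t example_clamp_ctl_stack_size(uint32_t gfx_ver, uint32_t size)
{
    assert(size > 0);
    if (gfx_ver >= EXAMPLE_GFX_VER(10, 1, 0) &&
        gfx_ver <= EXAMPLE_GFX_VER(10, 3, 1) &&
        size > EXAMPLE_CTL_STACK_MAX)
        return EXAMPLE_CTL_STACK_MAX;
    return size;
}
```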
Change-Id: Ib78cf6e4c5f096044bf8de24debe211689891caa
Signed-off-by: Jay Cornwall <jay.cornwall@amd.com>
[ROCm/ROCR-Runtime commit: 44f80d170d]
The error checking macro IS_OPEN returns an hsa_signal_t.
This conflicts with the return type of uint32_t.
Add an assert and rely on spurious return rule to return zero
when rocr is not initialized.
Change-Id: Ifc9bb75e22ecdd675273de59b31e5026a69c62e0
[ROCm/ROCR-Runtime commit: a3c4aaf95a]
1. Add KFDEvictTest.* for gfx90c based on CI test results
2. Remove SDMA blacklist based on SDMA issue fixed:
Change-Id: I86910fc98a5141f29959b35248a900f0c098a6e8
[ROCm/ROCR-Runtime commit: 36249ddc0e]