When no isa's are available no callbacks should be invoked. This
is not an error and should return success.
Change-Id: Ie4048aa8cbe5c3fdf5431f6a865021549ecf8a13
[ROCm/ROCR-Runtime commit: 4197461b7f]
Sramecc is misreported in kfd 4.0 and prior. To prevent possible
corruption due to d16 instructions, deny use of gfx906 with older
kfds and correct misreport for gfx908. Denial of gfx906 may be
overridden by setting HSA_IGNORE_SRAMECC_MISREPORT=1.
Change-Id: I7d5c3a716fad01c348f8b88cd508cedbf914c989
[ROCm/ROCR-Runtime commit: 45fbe5b192]
1. As we cannot ganrantee that 100% apu vram are free to be allocated, limit
the allocation size be no more than 3/4 of vram size.
2. Keep the old 1GB allocation limit for dGPU case.
3. Add the alignment check for alloc_size.
Affected tests:
rocrtstStress.Memory_Concurrent_Allocate_Test
rocrtstStress.Memory_Concurrent_Free_Test
Change-Id: Id0023de132024d02f80980ae4237d9d74d9e27d3
Signed-off-by: Mengbing Wang <mengbing.wang@amd.com>
[ROCm/ROCR-Runtime commit: d5855c1658]
The old bit was deprecated, because old buggy user mode depends on it
being always 0. The correct value is now reported in a new bit. New user
mode handles the reported EDC setting correctly, so we can report the
correct value in a new bit.
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Change-Id: Ib5d5ed2519810e650458c6b69c97670dab435ddb
[ROCm/ROCR-Runtime commit: d287c60246]
This reverts commit 07b0758bee.
SVM is not ready yet. This was merged by accident.
Change-Id: I8901594a72e785ba5d25a6448718a570e76fe117
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
[ROCm/ROCR-Runtime commit: 41cd7aea2f]
This reverts commit 08e65a397a.
SVM is not ready yet. This was merged by accident.
Change-Id: I1bee102823e7e612be8e8f2e0f50580e8692cc80
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
[ROCm/ROCR-Runtime commit: 5edd00136d]
This reverts commit a247255a6a.
SVM is not ready yet. This was merged by accident.
Change-Id: I372f7d293fd38429ec570bc0e0add7e612871594
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
[ROCm/ROCR-Runtime commit: 4ebda913cd]
This reverts commit 2c1c2cfdf8.
SVM is not ready yet. This was merged by accident.
Change-Id: I7c0d835a0d3a448f2ac1094f818601e5d6363045
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
[ROCm/ROCR-Runtime commit: a8f4c43fef]
BLACKLIST_ALL_ASICS has to the first in the list otherwise "-" negative
flag won't be inserted
Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Change-Id: I9ee150d7f793809641b16012929c4e157595d37f
[ROCm/ROCR-Runtime commit: 0f16e6f35f]
Park the wave, if it is stopped, to avoid halting it at an s_endpgm
instruction if the architecture does not support it.
Free ttmp6 by converting the dispatch_ptr into a queue packet index
(25-bit) and storing it in ttmp7[24:0].
Save the exception PC in ttmp11[22:7] ttmp6[31:0].
Change-Id: Iaa3c5baf5b488c0b534044d338f12bffa63ddce2
[ROCm/ROCR-Runtime commit: ea6ee0aa81]
Scratch cache was not updated for IOMMUv2 systems previously.
This both negates the cache and causes segfault during scratch
release.
Change-Id: I71e81d6b642d65ca135868ff7225ea173529d458
[ROCm/ROCR-Runtime commit: 191664cd20]
hsa_amd_agent_iterate_memory_pools return HSA_STATUS_SUCCESS even if
no memory pool is found. Add a memory pool check.
jenkins@jenkins-System-Product-Name:~/rocrtst_tests/gfx902$ ./rocrtst64 --gtest_filter=rocrtstFunc.MemoryAccessTests
Note: Google Test filter = rocrtstFunc.MemoryAccessTests
[==========] Running 1 test from 1 test case.
[----------] Global test environment set-up.
[----------] 1 test from rocrtstFunc
[ RUN ] rocrtstFunc.MemoryAccessTests
#### TEST NAME ####
RocR Memory Access Tests
#### TEST DESCRIPTION ####
This series of tests check memory allocationon GPU and CPU, i.e. GPU access
to system memory and CPU access to GPU memory.
#### TEST SETUP ####
The gpu device name is gfx902
Target HW Profile is HSA_PROFILE_FULL
Test can run on any profile. OK.
#### TEST EXECUTION ####
*** Memory Subtest: CPUAccessToGPUMemoryTest in Memory Pools ***
Segmentation fault (core dumped)
Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>
Change-Id: Ic335c4c98990b43f5d4842ab6d74855859a9048a
Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>
[ROCm/ROCR-Runtime commit: 27ae854cda]
Possibly because of moving to gart table for vram access from Kernel.
This test failure shouldn't be a blocker. Temporarily blacklist till a
solution is found.
Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Change-Id: I99725f368aced863188e30f619288ad4d033b9a6
[ROCm/ROCR-Runtime commit: e35778ed4d]
Set the KFD_IOC_ALLOC_MEM_FLAGS_COHERENT flag and
KFD_IOC_ALLOC_MEM_FLAGS_UNCACHED flag to allocate
uncached coherent memory when HSA_DISABLE_CACHE
environment variable is set. At KFD driver,
Single KFD_IOC_ALLOC_MEM_FLAGS_UNCACHED flag is
not sufficient to allocate uncached memory. We
have to use both two flags to allocate uncached
memory.
Change-Id: Ie490f37b2e696314e60048f5b1b57442431696e9
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
[ROCm/ROCR-Runtime commit: ae0e74095e]
This will create a deb and an rpm for rocrtst to make installing and
running it easier for non-ROCr devs.
Signed-off-by: Kent Russell <kent.russell@amd.com>
Change-Id: I506baedc1471482e5808139cab5c28ae07ac8fb1
[ROCm/ROCR-Runtime commit: 9311789398]
APU doesn't have non-KERNARG memory pool for cpu agent or
a global memory pool for gpu agent. Current setup check
fails as below. Change to a APU specific check method.
[==========] Running 45 tests from 5 test cases.
[----------] Global test environment set-up.
[----------] 1 test from rocrtst
[ RUN ] rocrtst.Test_Example
#### TEST NAME ####
Test Case Example
#### TEST DESCRIPTION ####
Put a description of the test case here. Line breaks will be taken care of
on output, not here.
#### TEST SETUP ####
The gpu device name is gfx902
Target HW Profile is HSA_PROFILE_FULL
Test can run on any profile. OK.
/home/jenkins/hsa/runtime/rocrtst/common/base_rocr_utils.cc:180: Failure
Value of: rocrtst::ProcessIterateError(err)
Actual: 4096
Expected: HSA_STATUS_SUCCESS
Which is: 0
HSA_STATUS_ERROR: A generic error has occurred.
/home/jenkins/hsa/runtime/rocrtst/suites/test_common/test_case_template.cc:195: Failure
Value of: HSA_STATUS_SUCCESS
Actual: 0
Expected: err
Which is: 4096
rocrtst64: /home/jenkins/hsa/runtime/rocrtst/common/base_rocr_utils.cc:416: hsa_kernel_dispatch_packet_t* rocrtst::WriteAQLToQueue(rocrtst::BaseRocR*, uint64_t*): Assertion `test->main_queue()' failed.
../shunit2: line 977: 1382 Aborted (core dumped) ./rocrtst$ROCRTST_BLD_BITS "$ROCRTST_ARGS" --gtest_output=xml:"$gtest_xml"
failed (failed to run rocrtst)
Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>
Change-Id: I03691bd4171b6e622231baf3dce4db2211eb47e7
[ROCm/ROCR-Runtime commit: 5977eb554f]
Three kfd subtests are added to verify new XGMI connection with
cache coherence HW link on A+A.
Signed-off-by: Eric Huang <JinhuiEric.Huang@amd.com>
Change-Id: I6960ec91cbfb696c4e6acb3b79fd83107003acdd
[ROCm/ROCR-Runtime commit: 9aa521d1ff]
In A+A all system memory is mapped as NC. So add a new function
gfx9_PollNCMemory which will support NC memory.
Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Change-Id: I097b95fb156f73d6f480cd4fd262cc6fa5933f69
[ROCm/ROCR-Runtime commit: 085005f07b]
destBuf is mapped as cached, the intruction flat_atomic_add
operates on cache that cause test failed. Adding scc modifier
in the instruction will fix the issue.
Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
Change-Id: I8e138f93ae4f5e23020e3ac1549ef924968a74c5
[ROCm/ROCR-Runtime commit: f7759df6e0]
The three kfdtests have been fixed, so remove them from
filter list.
Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
Change-Id: I101a72476970a9d105e8c0b5c022847757fdd316
[ROCm/ROCR-Runtime commit: 3a378fcf0b]
Before gfx90a, coherent memory is uncached. So it was reasonable
when environment variable HSA_DISABLE_CACHE is set, memory is mapped
as coherent. On gfx90a, coherent memory can be cached, so mapping
memory as coherent can't guarantee memory is uncached. When
HSA_DISABLE_CACHE is set, we have to map memory as uncached.
Change-Id: Ia5ed4cf0ad6aef5644dc8c9e6632b52d606f06f4
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
[ROCm/ROCR-Runtime commit: f132fb2cd0]
Refer to commit "Mark buffers accessed by CP as UC"
A+A buffers are mapped as NC. CP (PM4Writes) need ReleaseMem function to
ensure the write go through to the memory
Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Change-Id: I4ee55a6e40fba078f5950d95c8fee7ee076260bf
[ROCm/ROCR-Runtime commit: 57f46b53ec]
Refer to commit: " Mark buffers accessed by CP as UC"
Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Change-Id: I1816e035dbb3178f28f5e34b050c20ecca282060
[ROCm/ROCR-Runtime commit: 0e8500b886]
This change might be redundant if ROCr takes care of it
Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Change-Id: I7b67143a8ad21baa61b7eda7b8e5fe0ac1e33830
[ROCm/ROCR-Runtime commit: 10674916e4]
This change is for the A+A bring-up branch as it needs to made more
generic to handle all ASICs.
For A+A all the system buffers are mapped as NC (non coherent) unless
explicitly marked as UC (uncached). The coherency is then expected to be
handled by shader by explicitly using acquire/release instructions.
However, CP doesn't have same feature. The buffers used by CP thus have
to UC. For now queue buffer and Signal handler memory is marked as UC.
This change shouldn't affect other ASICs since Uncached flag is not used
in those. However, this change still need to be made more generic.
Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Change-Id: I56c37a809913f7f08c94d01b0572d0f4864939aa
[ROCm/ROCR-Runtime commit: 7c05c5240f]
On gfx9, the maximum number of wavefronts per queue is the minimum of
40 waves per compute units, or 512 waves per shader engine. On gfx10,
there can only be 32 waves per compute units.
Signed-off-by: Laurent Morichetti <laurent.morichetti@amd.com>
Change-Id: I148d1a4fe6c07cdbfaa1f77939eb29311c81c008
[ROCm/ROCR-Runtime commit: 4cf11c3a7e]
Reserve some space in the context save area for the debugger's
use. There should be 32 bytes per wave for a given queue.
Change-Id: I65ddb6123d0f6afd3149844617ad19023009101d
[ROCm/ROCR-Runtime commit: a83f9b67ce]
Temporarily blacklist some tests on gfx90a until they are solved.
Signed-off-by: Amber Lin <Amber.Lin@amd.com>
Change-Id: I87cc3a996ea7d55ed8f20f5b4eecfd8bb691effd
[ROCm/ROCR-Runtime commit: e342c9c890]
On every new asic with new stepping, we need to manually relax this
checking. This check is not very helpful. Delete it.
Change-Id: I11f813023ca2566d82f6d11121d4be38c296674b
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
[ROCm/ROCR-Runtime commit: 1f05b54dc9]
Due to cache coherence change, the remote vram mapping is changed
to cached, the written value by remote shader will not be read by
local shader. So the test will fail.
Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
Change-Id: I2b64e8a30bed0066e159bad9bb7febae5ebe84aa
[ROCm/ROCR-Runtime commit: ec7ba38b23]
XNACK API for GPUs that support this mode. This API
makes calls to amdgpu driver to configure xnack mode.
It supports set xnack mode and query the current mode used.
Change-Id: If865fd0e3f900f008243dc49504e1a0694e1791a
Signed-off-by: Alex Sierra <alex.sierra@amd.com>
[ROCm/ROCR-Runtime commit: 3f45f602d4]
Add function definitions to support SVM (shared virtual memory)
and xnack set.
Change-Id: Ia97ad9d0c449d8d500d799f702e1a58e87d65a56
Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
[ROCm/ROCR-Runtime commit: a352639df5]
Add svm (shared virtual memory) range and xnack mode
APIs.
Change-Id: Ibd8d7fe566dc200730da0c892caa71aad7589ebd
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Signed-off-by: Alex Sierra <alex.sierra@amd.com>
[ROCm/ROCR-Runtime commit: 5ae49f2321]
strlen(src) should not be used as the length in strncpy. Use memcpy
since we know the length of the string, and ensure that we
NULL-terminate regardless of length
Signed-off-by: Kent Russell <kent.russell@amd.com>
Change-Id: I21cc6d106510c69464e7ac9d3fc7da3a1e6d1a68
[ROCm/ROCR-Runtime commit: 731a06c704]
The largebar check will exit exceptionally from test
when destination node is not set.
Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
Change-Id: I8bf0fed613250cc71468208e645fc562fb1a8757
[ROCm/ROCR-Runtime commit: 18ead8815c]
Update build script and CMakeLists_sp3.txt file as SP3 directory
structure has changed.
The SP3 source code with gfx90a suport is merged into a
new branch mukjoshi/sp3_gfx90a.
Please make sure to checkout this branch before using the
build script to generate the static library.
Change-Id: I2bf0ade8b2d254cd7648cc8a6d69a83ee51344cd
[ROCm/ROCR-Runtime commit: da3abfb0f8]