Commit graph

710 Commits

Author SHA1 Message Date
Konstantin Zhuravlyov b095fec147 Expose iterator for executables
Change-Id: I0c5d39fc33c15a6eb8ee10ff181c2dcf2e042675


[ROCm/ROCR-Runtime commit: 15e54d684d]
2021-04-16 20:51:48 -04:00
Konstantin Zhuravlyov 1c7abea61a Remove loaders.c/hpp
Change-Id: Ida507c2dd2de9172f250172f9c45a639953cb412


[ROCm/ROCR-Runtime commit: e826c365ea]
2021-04-16 20:51:48 -04:00
Mengbing Wang a69a3946c9 Add allocation size limit of 1/2 vram size in rocrtstPerf.Memory_Async_Copy test.
Add a hard limit on the allocation size of 1/2 of the available vram
to avoid allocation failure when the allocation size equals the vram size.

Print the block size in each round to report progress for long-running
tests.

Add the skipped block sizes to the result table (if any tests were skipped).

Affected test:
rocrtstPerf.Memory_Async_Copy

Data Size    Avg Time(us)      Avg BW(GB/s)    MinTime(us)       Peak BW(GB/s)
128M         638759.570200     0.195692        637569.991000     0.196057
256M         1270058.822400    0.196841        1268425.758000    0.197095
Notice: Data Size larger than 512M is skipped due to hard limit of 1/2 vram size

Signed-off-by: Mengbing Wang <mengbing.wang@amd.com>
Change-Id: I4c4cea74a608272cc29d222b9399af26b34d7473


[ROCm/ROCR-Runtime commit: cf10c3bc35]
2021-04-16 02:23:48 -04:00
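The half-vram cap described in a69a3946c9 above can be sketched roughly as follows. This is an illustrative Python sketch, not the rocrtst code: `plan_copy_sizes` and its parameters are hypothetical names, and the 1 GiB vram figure in the usage lines is an assumption chosen only so that the cap lands on 512M.

```python
def plan_copy_sizes(candidate_sizes, vram_bytes):
    """Split candidate block sizes into those to run and those to skip.

    Hypothetical helper: a size is run only if it does not exceed half of
    the available vram, mirroring the hard limit described in the commit.
    """
    limit = vram_bytes // 2
    run = [s for s in candidate_sizes if s <= limit]
    skipped = [s for s in candidate_sizes if s > limit]
    return run, skipped


MiB = 1024 * 1024
sizes = [128 * MiB, 256 * MiB, 512 * MiB, 1024 * MiB]
# Assumed 1 GiB of vram, so the cap is 512 MiB.
run, skipped = plan_copy_sizes(sizes, vram_bytes=1024 * MiB)
for s in skipped:
    print(f"Notice: data size {s // MiB}M skipped due to 1/2 vram limit")
```

Reporting the skipped sizes separately, rather than silently dropping them, matches the commit's intent of making the result table explain why rows are missing.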
Mike Li 3258d72d3b Get GPU cache information from KFD
Signed-off-by: Mike Li <Tianxinmike.Li@amd.com>
Change-Id: I8dc8c97ae81c3747b7cd88cf2cdb7a9e4694a88d


[ROCm/ROCR-Runtime commit: d077606e22]
2021-04-13 10:29:34 -04:00
Tony Tye e20cccb6e4 Add support for gfx909 and gfx90c
Change-Id: I88158789cdda44a173e3ca26d2c96b8e0ea0e221


[ROCm/ROCR-Runtime commit: a97c14abea]
2021-04-08 22:37:30 +00:00
Sean Keely 2b25548eb0 Remove emulator SRAMECC override controls.
Change-Id: Iea9e7870dbf517032f34cebec673c90226b96960


[ROCm/ROCR-Runtime commit: 243e29ba8e]
2021-04-02 02:11:05 -04:00
Sean Keely da41352a93 Revert SVM and XNACK support.
KFD is not ready yet.

Change-Id: I61deb292ddb92185d33504c2115169888d56e211


[ROCm/ROCR-Runtime commit: 5bd153974d]
2021-04-02 02:10:59 -04:00
Ramesh Errabolu 29fa097a82 Override Cpu-Gpu link-weight for Aldebaran until a proper fix is available
Change-Id: I1fbc38b788f71cc9c9fc62295223286004689bf9


[ROCm/ROCR-Runtime commit: 25f3dc305f]
2021-04-02 02:10:54 -04:00
Sean Keely dd42ca6dbe Squash merge of cfreehil/amd-temp-gfx90a onto amd-staging.
Includes some workarounds and HMM.
Conflicts:
	opensrc/hsa-runtime/core/runtime/amd_topology.cpp
	opensrc/hsa-runtime/core/util/flag.h

Change-Id: I22976f07964a43dbb228a6231777dbd599112b8d


[ROCm/ROCR-Runtime commit: 7333c77e22]
2021-04-02 02:10:15 -04:00
Sean Keely ea1f545fcc Correct hsa_agent_iterate_isas return code for CPUs.
When no ISAs are available, no callbacks should be invoked.  This
is not an error and should return success.

Change-Id: Ie4048aa8cbe5c3fdf5431f6a865021549ecf8a13


[ROCm/ROCR-Runtime commit: 4197461b7f]
2021-04-01 00:08:22 -04:00
Sean Keely 465ada0234 Block ROCm 4.1+ running against 4.0 and prior kfd.
Sramecc is misreported in kfd 4.0 and prior.  To prevent possible
corruption due to d16 instructions, deny use of gfx906 with older
kfds and correct misreport for gfx908.  Denial of gfx906 may be
overridden by setting HSA_IGNORE_SRAMECC_MISREPORT=1.

Change-Id: I7d5c3a716fad01c348f8b88cd508cedbf914c989


[ROCm/ROCR-Runtime commit: 45fbe5b192]
2021-04-01 00:03:32 -04:00
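The gating rule in 465ada0234 above (deny gfx906 on older kfds unless the override is set) might look like this in outline. Only the environment variable name `HSA_IGNORE_SRAMECC_MISREPORT` comes from the commit message; the function name and the `kfd_is_4_0_or_older` flag are hypothetical, and how the runtime actually detects the kfd version is not stated in the log.

```python
import os


def gfx906_allowed(kfd_is_4_0_or_older, env=None):
    """Decide whether a gfx906 agent may be used.

    kfd_is_4_0_or_older is a hypothetical flag; the log does not say how
    the kfd version is detected. env defaults to the process environment.
    """
    env = os.environ if env is None else env
    if not kfd_is_4_0_or_older:
        return True  # newer kfd reports sramecc correctly
    # Older kfd: deny unless the user explicitly overrides the check.
    return env.get("HSA_IGNORE_SRAMECC_MISREPORT") == "1"
```

Passing `env` explicitly keeps the sketch easy to exercise without touching the real environment.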
Cole Nelson 7cd0a8435b hsa-runtime: add ENABLE_LDCONFIG to support multi-version install
Depends-On: I58fdf1d0b4e864b5a61ffe8e335d430d424811ab
Change-Id: I0cb6f8711ea5033e84b7e45ce20e7e23d84005c3
Signed-off-by: Cole Nelson <cole.nelson@amd.com>


[ROCm/ROCR-Runtime commit: 72fa4a17fa]
2021-03-26 18:37:04 -04:00
Mengbing Wang 97918cbd7c limit the memory allocation on vram to 3/4 of vram size.
1. As we cannot guarantee that 100% of APU vram is free to be allocated, limit
the allocation size to no more than 3/4 of vram size.
2. Keep the old 1GB allocation limit for dGPU case.
3. Add the alignment check for alloc_size.

Affected tests:

rocrtstStress.Memory_Concurrent_Allocate_Test
rocrtstStress.Memory_Concurrent_Free_Test

Change-Id: Id0023de132024d02f80980ae4237d9d74d9e27d3
Signed-off-by: Mengbing Wang <mengbing.wang@amd.com>


[ROCm/ROCR-Runtime commit: d5855c1658]
2021-03-23 18:59:42 +08:00
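The three points of 97918cbd7c above can be illustrated with a small sketch. The names are hypothetical, and rounding the limit down to the alignment is an assumption about what the "alignment check" does; the commit only says that a check was added.

```python
GiB = 1 << 30


def max_alloc_size(vram_bytes, is_apu, alignment):
    """Return an upper bound for a single test allocation.

    APU: no more than 3/4 of vram. dGPU: the old fixed 1GB limit.
    The result is rounded down to a multiple of `alignment` (assumed).
    """
    if alignment <= 0:
        raise ValueError("alignment must be positive")
    limit = (vram_bytes * 3) // 4 if is_apu else GiB
    return limit - (limit % alignment)
```

For example, `max_alloc_size(2 * GiB, is_apu=True, alignment=4096)` yields 1.5 GiB, while any dGPU call yields the fixed 1 GiB cap.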
Chris Freehill 72dcc0520e Don't overwrite default CMAKE_CXX_FLAGS in tests & samples
Change-Id: I4a2bb0bcc320fb0645e9fc5447775e6a878b960b


[ROCm/ROCR-Runtime commit: 82b2dbe495]
2021-03-17 21:25:18 -05:00
Laurent Morichetti 023947a6de New trap handler ABI (v5)
Park the wave, if it is stopped, to avoid halting it at an s_endpgm
instruction if the architecture does not support it.

Free ttmp6 by converting the dispatch_ptr into a queue packet index
(25-bit) and storing it in ttmp7[24:0].

Save the exception PC in ttmp11[22:7] ttmp6[31:0].

Change-Id: Iaa3c5baf5b488c0b534044d338f12bffa63ddce2


[ROCm/ROCR-Runtime commit: ea6ee0aa81]
2021-03-04 21:44:14 -05:00
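The register layout named in 023947a6de above (a 25-bit packet index in ttmp7[24:0], and the exception PC split across ttmp11[22:7] and ttmp6[31:0]) can be modelled with plain bit-field helpers. This is a Python model for illustration, not trap handler code; the helper names are hypothetical, and it assumes the notation means ttmp11[22:7] holds the upper 16 bits of the PC and ttmp6 the lower 32.

```python
MASK32 = 0xFFFFFFFF


def set_bits(reg, hi, lo, value):
    """Return 32-bit reg with bits [hi:lo] replaced by value."""
    mask = (1 << (hi - lo + 1)) - 1
    if value & ~mask:
        raise ValueError("value does not fit in the field")
    return (reg & ~(mask << lo) & MASK32) | (value << lo)


def get_bits(reg, hi, lo):
    """Extract bits [hi:lo] of reg."""
    return (reg >> lo) & ((1 << (hi - lo + 1)) - 1)


def save_exception_pc(ttmp11, pc):
    """Split a 48-bit PC into (ttmp6, ttmp11), keeping other ttmp11 bits."""
    ttmp6 = pc & MASK32
    ttmp11 = set_bits(ttmp11, 22, 7, (pc >> 32) & 0xFFFF)
    return ttmp6, ttmp11


def load_exception_pc(ttmp6, ttmp11):
    """Reassemble the PC from the two registers."""
    return (get_bits(ttmp11, 22, 7) << 32) | ttmp6
```

Storing the packet index would then be `set_bits(ttmp7, 24, 0, index)`, which leaves ttmp7[31:25] untouched.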
Laurent Morichetti 188570928f Correct the trap handler
ttmp11 no longer has an "excp_raised" field.

Change-Id: I8e673ca404c2b802470bbc9f76e7925782076c5a


[ROCm/ROCR-Runtime commit: 7e0f391a08]
2021-03-04 21:21:26 -05:00
Sean Keely e2aca270b7 Insert scratch memory into scratch cache on full profile systems.
Scratch cache was not updated for IOMMUv2 systems previously.
This both negates the cache and causes segfault during scratch
release.

Change-Id: I71e81d6b642d65ca135868ff7225ea173529d458


[ROCm/ROCR-Runtime commit: 191664cd20]
2021-03-03 21:30:16 -05:00
Yifan Zhang 519b7d4642 rocrtst: check gpu_pool value after hsa_amd_agent_iterate_memory_pools.
hsa_amd_agent_iterate_memory_pools returns HSA_STATUS_SUCCESS even if
no memory pool is found. Add a memory pool check.

jenkins@jenkins-System-Product-Name:~/rocrtst_tests/gfx902$ ./rocrtst64 --gtest_filter=rocrtstFunc.MemoryAccessTests
Note: Google Test filter = rocrtstFunc.MemoryAccessTests
[==========] Running 1 test from 1 test case.
[----------] Global test environment set-up.
[----------] 1 test from rocrtstFunc
[ RUN      ] rocrtstFunc.MemoryAccessTests

	#### TEST NAME ####
RocR Memory Access Tests

	#### TEST DESCRIPTION ####
This series of tests check memory allocationon GPU and CPU, i.e. GPU access
to system memory and CPU access to GPU memory.

	#### TEST SETUP ####
The gpu device name is gfx902
Target HW Profile is HSA_PROFILE_FULL
Test can run on any profile. OK.

	#### TEST EXECUTION ####
  *** Memory Subtest: CPUAccessToGPUMemoryTest in Memory Pools ***
Segmentation fault (core dumped)

Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>
Change-Id: Ic335c4c98990b43f5d4842ab6d74855859a9048a


[ROCm/ROCR-Runtime commit: 27ae854cda]
2021-03-03 20:16:26 -05:00
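The pitfall fixed in 519b7d4642 above is that an iterate-style call can report success without ever invoking the callback, so the caller has to check whether anything was found. A minimal Python model of that pattern follows; `iterate_memory_pools` is a stand-in written for this sketch, not the HSA function, and `find_gpu_pool` is a hypothetical name.

```python
HSA_STATUS_SUCCESS = 0  # matches "Which is: 0" in the logs on this page


def iterate_memory_pools(pools, callback, data):
    """Stand-in iterator: returns success even when `pools` is empty."""
    for pool in pools:
        callback(pool, data)
    return HSA_STATUS_SUCCESS


def find_gpu_pool(pools):
    """Return the first pool seen, or None if the callback never ran."""
    found = {"pool": None}

    def pick_first(pool, data):
        if data["pool"] is None:
            data["pool"] = pool

    status = iterate_memory_pools(pools, pick_first, found)
    if status != HSA_STATUS_SUCCESS:
        raise RuntimeError(f"iteration failed with status {status}")
    return found["pool"]  # the caller must handle None before using it
```

A test built on this would skip or fail cleanly when `find_gpu_pool` returns `None`, instead of dereferencing an unset pool.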
Kent Russell 70b6691f55 rocrtst: Add packaging information to CMakeLists
This will create a deb and an rpm for rocrtst to make installing and
running it easier for non-ROCr devs.

Signed-off-by: Kent Russell <kent.russell@amd.com>
Change-Id: I506baedc1471482e5808139cab5c28ae07ac8fb1


[ROCm/ROCR-Runtime commit: 9311789398]
2021-02-24 12:22:31 -05:00
Yifan Zhang 12433d7a87 rocrtst: fix a test case setup issue in iommuv2 for APU
APU doesn't have non-KERNARG memory pool for cpu agent or
a global memory pool for gpu agent. Current setup check
fails as below. Change to an APU-specific check method.

[==========] Running 45 tests from 5 test cases.
[----------] Global test environment set-up.
[----------] 1 test from rocrtst
[ RUN      ] rocrtst.Test_Example

    #### TEST NAME ####
Test Case Example

    #### TEST DESCRIPTION ####
Put a description of the test case here. Line breaks will be taken care of
on output, not here.

    #### TEST SETUP ####
The gpu device name is gfx902
Target HW Profile is HSA_PROFILE_FULL
Test can run on any profile. OK.
/home/jenkins/hsa/runtime/rocrtst/common/base_rocr_utils.cc:180: Failure
Value of: rocrtst::ProcessIterateError(err)
  Actual: 4096
Expected: HSA_STATUS_SUCCESS
Which is: 0
HSA_STATUS_ERROR: A generic error has occurred.
/home/jenkins/hsa/runtime/rocrtst/suites/test_common/test_case_template.cc:195: Failure
Value of: HSA_STATUS_SUCCESS
  Actual: 0
Expected: err
Which is: 4096
rocrtst64: /home/jenkins/hsa/runtime/rocrtst/common/base_rocr_utils.cc:416: hsa_kernel_dispatch_packet_t* rocrtst::WriteAQLToQueue(rocrtst::BaseRocR*, uint64_t*): Assertion `test->main_queue()' failed.
../shunit2: line 977:  1382 Aborted                 (core dumped) ./rocrtst$ROCRTST_BLD_BITS "$ROCRTST_ARGS" --gtest_output=xml:"$gtest_xml"
failed (failed to run rocrtst)

Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>
Change-Id: I03691bd4171b6e622231baf3dce4db2211eb47e7


[ROCm/ROCR-Runtime commit: 5977eb554f]
2021-02-23 20:14:40 -05:00
Mike Li c2bed10739 Support for Custom Pitch for gfx103x
Signed-off-by: Mike Li <Tianxinmike.Li@amd.com>
Change-Id: Ica83dff8bb382637010396781190f585754bd150


[ROCm/ROCR-Runtime commit: 93609fd3d4]
2021-02-22 22:05:25 -05:00
Jason Tang 562258ec93 Correct GetIsa() typo
Change-Id: Ia6b5a86bd035fb077f0da9d52160ec8d12987b87


[ROCm/ROCR-Runtime commit: ec22afb8a8]
2021-02-17 11:57:58 -05:00
Sean Keely 622dc89e98 Correct legacy copy path.
Legacy p2p copy path incorrectly transferred in whole pages rather than
the requested size.

Change-Id: I9aa7337754f9e32f587a0cc5305f8ffeb6196f10


[ROCm/ROCR-Runtime commit: 34ac62274a]
2021-02-10 19:53:02 -05:00
Sean Keely 4047b1c3a8 Add hsa_amd_signal_value_pointer.
Enables partial signal interop with non-HSA devices.

Change-Id: Ic39bca84ed1709cbd2cc24b1eb0f4fc6cccb39cf


[ROCm/ROCR-Runtime commit: 01f42dbe46]
2021-02-10 18:47:54 -05:00
Laurent Morichetti b3dc12024b New trap handler ABI (v4)
Replace the stop reasons ttmp11.trap_raised and ttmp11.excp_raised
with ttmp11.wave_stopped which indicates that the trap handler has
halted the wave as the result of an event (trap, single-step or
exception).

If the wave is stopped because of a trap, also record the trap_id in
ttmp11.saved_trap_id[7:0].

Save status.halt in ttmp11.saved_status_halt, so that it can be
restored when resuming a wave (changing a wave's state from stopped to
running or single-stepping).

Change-Id: I7322f59b60e8cc1b92bf5f067dba606a3109ef49


[ROCm/ROCR-Runtime commit: 9ca79d072a]
2021-02-05 09:56:01 -08:00
Evgeny 95ee562f1a adding gfx1030 blocks
Change-Id: Ide2576939c5321dbe928183a8d9984d5ef87a61b


[ROCm/ROCR-Runtime commit: c5aae30d08]
2021-01-29 08:50:10 -06:00
Huang Rui 36d1285c53 Add gfx 10.3.3 into rocrtst list
Change-Id: I854e5092236175e47a2134d703f154885cae8c3e
Signed-off-by: Huang Rui <ray.huang@amd.com>


[ROCm/ROCR-Runtime commit: 554ed5e76d]
2021-01-22 04:22:15 -05:00
Huang Rui cb0f788b9b Add gfx10.3.3 ISA support for Van Gogh
This patch is to let ROCr recognize new gfx10.3.3 ISA.

Change-Id: Ied23eee2752e14c19c8c0a6d7789fded9940e31e
Signed-off-by: Huang Rui <ray.huang@amd.com>


[ROCm/ROCR-Runtime commit: feeb2f62e2]
2021-01-22 04:22:15 -05:00
Laurent Morichetti 062d313530 Don't terminate waves halted at s_endpgm
To support single stepping the instruction preceding an s_endpgm,
unwind the PC by 8 bytes and set ttmp11[9] to notify the debugger
that the wave is halted with a modified PC.

Bump the debug r_version for this new trap handler ABI.

Change-Id: I55e4e0d65576f92da14a336266c31c513baab547


[ROCm/ROCR-Runtime commit: 8aec53969f]
2021-01-21 20:51:38 -08:00
Laurent Morichetti 3eaae50cc6 Correct gfx10.3+ trap handler.
Change-Id: I77d2b41c8882014a430d741ecd777718a1f61561


[ROCm/ROCR-Runtime commit: 8808ed3177]
2021-01-21 09:24:20 -08:00
Tony Tye 0aa0ebe2ee Correct isa lookup for targets that do not support a target feature
Change-Id: I130070a53162e5d9fcc6a64a4bdda7869179be82


[ROCm/ROCR-Runtime commit: 26fe26e415]
2021-01-18 15:47:19 +00:00
Chris Freehill 33438e7adc Correct some target ID strings for gfx908
Change-Id: I7833b561447b9928447cf49472cfe1ca1867e71d


[ROCm/ROCR-Runtime commit: 09bc75bf0d]
2021-01-15 14:56:38 -06:00
Sean Keely 343684f84d Correct computation of scratch slot requirements.
Each SE must be assigned equal numbers of slots and slots
must be assigned in units of whole groups.

Change-Id: I8f3677237fa6f2e2d25e3e78210c5a7a0ad792f3


[ROCm/ROCR-Runtime commit: 7bc6aac5d2]
2021-01-13 15:09:00 -05:00
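The two constraints stated in 343684f84d above (equal slots per SE, whole groups only) amount to two round-up steps. The sketch below is one plausible reading, not the runtime's actual computation: the function name, the parameters, and the interpretation of "group" as a fixed number of slots are all assumptions.

```python
def scratch_slots(required_slots, num_shader_engines, slots_per_group):
    """Round a slot requirement up so both constraints hold.

    1. Every SE gets the same number of slots.
    2. Each SE's share is a whole number of groups (assumed meaning).
    """
    per_se = -(-required_slots // num_shader_engines)          # ceiling division
    per_se = -(-per_se // slots_per_group) * slots_per_group   # round up to a group
    return per_se * num_shader_engines
```

With 4 SEs and 8 slots per group, a request for 100 slots becomes 25 per SE, then 32 per SE after group rounding, for 128 in total.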
Sean Keely ddfed66eec Revert "Revert "Cache scratch allocations.""
This reverts commit 4502bb94c9.

Change-Id: I3f3c257270016559f8b2e70151664f0931db28d2


[ROCm/ROCR-Runtime commit: 9fe8ccc3ee]
2021-01-13 15:08:53 -05:00
Tony Tye fdfedaf0d2 Improve Isa class
- Use consistent naming in Isa class.
- Remove unused Isa methods.
- Simplify Isa methods.

Change-Id: I7c4045d08fbfe0d94b3181db8ebc5e5ed8c8cc82


[ROCm/ROCR-Runtime commit: 6bbf6b1c9c]
2021-01-10 18:23:54 +00:00
Tony 3fdfbc56e4 Store target ID in isa registry
Store target ID string in isa registry and use for returning agent and
isa name.

Change-Id: I72a20d8ff963c73d86392158aff3853e4c9bfdbd


[ROCm/ROCR-Runtime commit: 853ccc762e]
2021-01-10 18:23:54 +00:00
Tony bc565f6c69 Correct code object V2 support
- Remove gfx800, gfx804 and gfx901 as they do not exist.
- Map the V2 note record of "AMD:AMDGPU:8:0:0" to gfx802 as they are
  the same target just connected to a different motherboard.
- Correct typo for supporting gfx902:xnack+.
- Support agent names with a minor or stepping version greater than 9.

Change-Id: Ife933449f60ab4687e2aaab9baf4c9fc5b86339d


[ROCm/ROCR-Runtime commit: 12eb2764cd]
2021-01-10 18:23:54 +00:00
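Two of the points in bc565f6c69 above lend themselves to a short illustration: the special-case mapping of the V2 note "AMD:AMDGPU:8:0:0" to gfx802, and agent names whose minor or stepping version exceeds 9. The sketch assumes such values are written as single hexadecimal digits, which is consistent with names like gfx90a and gfx90c elsewhere in this log but is not stated in the commit; all function names are hypothetical.

```python
# Special case taken from the commit message.
V2_NOTE_OVERRIDES = {(8, 0, 0): "gfx802"}


def gfx_name(major, minor, stepping):
    """Build a gfx name, writing minor and stepping as hex digits (assumed)."""
    if not (0 <= minor < 16 and 0 <= stepping < 16):
        raise ValueError("minor and stepping must fit in one hex digit")
    return f"gfx{major}{minor:x}{stepping:x}"


def name_from_v2_note(note):
    """Map a V2 note such as 'AMD:AMDGPU:9:0:6' to a gfx name."""
    _vendor, _arch, major, minor, stepping = note.split(":")
    version = (int(major), int(minor), int(stepping))
    return V2_NOTE_OVERRIDES.get(version) or gfx_name(*version)
```

Under these assumptions a stepping of 10 renders as `a`, so version 9.0.10 becomes `gfx90a`.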
Sean Keely 4502bb94c9 Revert "Cache scratch allocations."
This reverts commit ce4de85616.

Change-Id: I698b33bacb2be3de6c8185fe89597a60a79521c5


[ROCm/ROCR-Runtime commit: 7e2ba23566]
2021-01-08 11:57:40 -06:00
Sean Keely 0639b53e31 Add support for gfx1032.
Change-Id: I36f93a6b61e74cf17aac1a05d7c1d4ba6369fcc9


[ROCm/ROCR-Runtime commit: d39ae13420]
2021-01-05 17:28:19 -06:00
Sean Keely 14dd324d2f Cleanup warnings when using clang.
Change-Id: I09f72831e29bccdb4170c54e203872412e2f0b59


[ROCm/ROCR-Runtime commit: bd63a2b690]
2020-12-04 22:18:14 -06:00
Tony b36aad204e Make supported targets consistent
Add missing target names and make all parts consistent with which
targets are supported.

- Add gfx805 as a supported target.

- Add all ELF targets to generic code.

- Make offline loader match supported targets.

Change-Id: Idab4d69edc71645aecaa83aa55e29c1aeee4c1d6


[ROCm/ROCR-Runtime commit: b443397bcc]
2020-11-24 03:14:31 +00:00
Sean Keely 62138712cf Add asserts and minimum values for kernarg alignment and utility functions.
Kernel argument size and alignment queries are not supported on
code object v3.

Change-Id: I1bdd34e2e62132f912ac39d80355efd3456df87c


[ROCm/ROCR-Runtime commit: 6182abf5e9]
2020-11-21 21:39:49 -06:00
Tony e1734526fc Update code object V3 kernarg queries
Code object V2 had the ability to support the following queries:

- HSA_CODE_SYMBOL_INFO_KERNEL_KERNARG_SEGMENT_SIZE
- HSA_EXECUTABLE_SYMBOL_INFO_KERNEL_KERNARG_SEGMENT_SIZE
- HSA_CODE_SYMBOL_INFO_KERNEL_KERNARG_SEGMENT_ALIGNMENT
- HSA_EXECUTABLE_SYMBOL_INFO_KERNEL_KERNARG_SEGMENT_ALIGNMENT

However code object V3 onwards cannot support these as the kernel
descriptor changed. These queries need to be deprecated.

Until then return more reasonable values:

- For kernarg alignment return 16 which is the minimum alignment
  required by the HSA standard.

- For kernarg size return the field from the kernel descriptor which
  is a hint. If it is 0 then the compiler is not specifying the kernarg
  size, or the kernel has no kernarg.

Change-Id: I19ce6cd0f3658a2bf62277492f39100ea5ab4256


[ROCm/ROCR-Runtime commit: ef755e4c82]
2020-11-20 21:39:18 -05:00
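The interim behaviour described in e1734526fc above (a fixed alignment of 16, and a size taken from a descriptor hint that may be 0) can be summarised as below. This is a model of the stated rules only; `kernarg_query` and its argument are hypothetical names, and reading the kernel descriptor itself is out of scope.

```python
MIN_KERNARG_ALIGNMENT = 16  # minimum the commit attributes to the HSA standard


def kernarg_query(descriptor_kernarg_size):
    """Return (size, alignment, size_is_known) for a code object V3+ kernel.

    descriptor_kernarg_size stands for the hint field in the kernel
    descriptor. 0 means the compiler did not specify a size, or the
    kernel has no kernarg, so callers should not rely on it.
    """
    size_is_known = descriptor_kernarg_size != 0
    return descriptor_kernarg_size, MIN_KERNARG_ALIGNMENT, size_is_known
```

Returning the extra flag makes the "0 is ambiguous" caveat explicit to callers instead of leaving it in a comment.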
Sean Keely ce4de85616 Cache scratch allocations.
Avoids calling into KFD to map/unmap scratch allocations for
every dispatch that uses large scratch.

Change-Id: I9fab5705251ec82b03e4f2f2ca6da7cdccabefb9


[ROCm/ROCR-Runtime commit: 27e044ae4d]
2020-11-20 15:07:01 -05:00
Sean Keely 20f9fbd7f2 Limit clock synchronization to 16Hz.
Improves HIP event performance in directed benchmarks where
clock sync latency is significant.

Change-Id: I78b724a14a8f5b6a9a2b9f4d85afe9d8b81808a6


[ROCm/ROCR-Runtime commit: 32d0fcafa9]
2020-11-20 15:06:13 -05:00
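Capping clock synchronization at 16Hz, as in 20f9fbd7f2 above, is a form of rate limiting: reuse the last result unless at least 1/16 s has passed. A generic sketch of that idea follows; the class and parameter names are hypothetical and nothing here reflects how the runtime actually reads or correlates clocks.

```python
import time


class ClockSync:
    """Re-read the clocks at most `max_rate_hz` times per second."""

    def __init__(self, read_clocks, max_rate_hz=16, now=time.monotonic):
        self._read_clocks = read_clocks      # expensive sync operation
        self._min_interval = 1.0 / max_rate_hz
        self._now = now                      # injectable for testing
        self._last_time = None
        self._last_value = None

    def get(self):
        t = self._now()
        stale = (self._last_time is None
                 or t - self._last_time >= self._min_interval)
        if stale:
            self._last_value = self._read_clocks()
            self._last_time = t
        return self._last_value
```

Callers that query timestamps in a tight loop then pay the sync latency at most 16 times per second, which is the effect the commit describes for HIP event benchmarks.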
Sean Keely eacc927741 Style update for SDMA enable flag.
Updated to match xnack flag's style.

Change-Id: I6115c0b53660d789e698de1606a9388ae1789866


[ROCm/ROCR-Runtime commit: b51f68b535]
2020-11-20 15:06:02 -05:00
Cordell Bloor 63953d98e1 Fix CMake configure error due to CMP0012
The modern meaning of the construct if( NOT ON ) was added in CMake 2.8,
but when cmake_minimum_required is not set in user code and no policy
level is set in the CMake config, then CMake 2.8 features cannot be
used. In old CMake (the default), ON is interpreted as a variable, and
because it is not defined, it is considered false. The same is true of
OFF.

This change sets a variable as ON, so that old CMake interpretation is
correct, and the if works as expected regardless of policy version.

Change-Id: I67d7ed4ceaf8248eeb5a1c7f54009d72313f3f5d


[ROCm/ROCR-Runtime commit: 4a35f560f6]
2020-11-20 15:04:41 -05:00
Cole Nelson 5dd453a265 opensrc/hsa-runtime/CMakeLists.txt: conformant package names
Names tested OK:
hsa-rocr-dev_1.2.0.30900-crdnnv.415_amd64.deb
hsa-rocr-dev-1.2.0.30900-crdnnv.415.el7.x86_64.rpm
hsa-rocr-dev-1.2.0.30900-crdnnv.sles151.415.x86_64.rpm

http://confluence.amd.com/display/GPUCPT/Package+File+Naming

Note: rpm requires 'devel' instead of 'dev', to be a subsequent
patchset.

Change-Id: Id6a422f3c335448b52c70c77ed39c9041114b80f
Signed-off-by: Cole Nelson <cole.nelson@amd.com>


[ROCm/ROCR-Runtime commit: 90f2dd5b1b]
2020-11-18 14:56:24 -05:00
Pruthvi Madugundu f254139e48 Fix for uninstallation problem of hsa-rocr-dev
- /opt/rocm-xx/hsa/include directory wasn't deleted after
  debian package uninstallation

Signed-off-by: Pruthvi Madugundu <pruthvi.madugundu@amd.com>
Change-Id: I213439d73f6533ff3a55e2b0071061d970cf56d4


[ROCm/ROCR-Runtime commit: 87955f8551]
2020-11-11 12:32:11 -08:00
Konstantin Zhuravlyov ba667661c5 Implement Target ID Proposal
Changes from Konstantin Zhuravlyov, Tony Tye

Change-Id: I532801193afa9d5b8ac2a877b5497eab661f0597


[ROCm/ROCR-Runtime commit: 3a08d0964e]
2020-11-10 13:42:35 -05:00