rocm-systems

Author	SHA1	Message	Date
Cole Nelson	72fa4a17fa	hsa-runtime: add ENABLE_LDCONFIG to support multi-version install Depends-On: I58fdf1d0b4e864b5a61ffe8e335d430d424811ab Change-Id: I0cb6f8711ea5033e84b7e45ce20e7e23d84005c3 Signed-off-by: Cole Nelson <cole.nelson@amd.com>	2021-03-26 18:37:04 -04:00
Mengbing Wang	d5855c1658	limit the memory allocation on vram to 3/4 of vram size. 1. As we cannot guarantee that 100% of apu vram is free to be allocated, limit the allocation size to no more than 3/4 of vram size. 2. Keep the old 1GB allocation limit for dGPU case. 3. Add the alignment check for alloc_size. Affected tests: rocrtstStress.Memory_Concurrent_Allocate_Test rocrtstStress.Memory_Concurrent_Free_Test Change-Id: Id0023de132024d02f80980ae4237d9d74d9e27d3 Signed-off-by: Mengbing Wang <mengbing.wang@amd.com>	2021-03-23 18:59:42 +08:00
Chris Freehill	82b2dbe495	Don't overwrite default CMAKE_CXX_FLAGS in tests & samples Change-Id: I4a2bb0bcc320fb0645e9fc5447775e6a878b960b	2021-03-17 21:25:18 -05:00
Laurent Morichetti	ea6ee0aa81	New trap handler ABI (v5) Park the wave, if it is stopped, to avoid halting it at an s_endpgm instruction if the architecture does not support it. Free ttmp6 by converting the dispatch_ptr into a queue packet index (25-bit) and storing it in ttmp7[24:0]. Save the exception PC in ttmp11[22:7] ttmp6[31:0]. Change-Id: Iaa3c5baf5b488c0b534044d338f12bffa63ddce2	2021-03-04 21:44:14 -05:00
Laurent Morichetti	7e0f391a08	Correct the trap handler ttmp11 no longer has an "excp_raised" field. Change-Id: I8e673ca404c2b802470bbc9f76e7925782076c5a	2021-03-04 21:21:26 -05:00
Sean Keely	191664cd20	Insert scratch memory into scratch cache on full profile systems. Scratch cache was not updated for IOMMUv2 systems previously. This both negates the cache and causes segfault during scratch release. Change-Id: I71e81d6b642d65ca135868ff7225ea173529d458	2021-03-03 21:30:16 -05:00
Yifan Zhang	27ae854cda	rocrtst: check gpu_pool value after hsa_amd_agent_iterate_memory_pools. hsa_amd_agent_iterate_memory_pools returns HSA_STATUS_SUCCESS even if no memory pool is found. Add a memory pool check. jenkins@jenkins-System-Product-Name:~/rocrtst_tests/gfx902$ ./rocrtst64 --gtest_filter=rocrtstFunc.MemoryAccessTests Note: Google Test filter = rocrtstFunc.MemoryAccessTests [==========] Running 1 test from 1 test case. [----------] Global test environment set-up. [----------] 1 test from rocrtstFunc [ RUN ] rocrtstFunc.MemoryAccessTests #### TEST NAME #### RocR Memory Access Tests #### TEST DESCRIPTION #### This series of tests check memory allocationon GPU and CPU, i.e. GPU access to system memory and CPU access to GPU memory. #### TEST SETUP #### The gpu device name is gfx902 Target HW Profile is HSA_PROFILE_FULL Test can run on any profile. OK. #### TEST EXECUTION #### * Memory Subtest: CPUAccessToGPUMemoryTest in Memory Pools * Segmentation fault (core dumped) Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Change-Id: Ic335c4c98990b43f5d4842ab6d74855859a9048a Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>	2021-03-03 20:16:26 -05:00
Kent Russell	9311789398	rocrtst: Add packaging information to CMakeLists This will create a deb and an rpm for rocrtst to make installing and running it easier for non-ROCr devs. Signed-off-by: Kent Russell <kent.russell@amd.com> Change-Id: I506baedc1471482e5808139cab5c28ae07ac8fb1	2021-02-24 12:22:31 -05:00
Yifan Zhang	5977eb554f	rocrtst: fix a test case setup issue in iommuv2 for APU APU doesn't have non-KERNARG memory pool for cpu agent or a global memory pool for gpu agent. Current setup check fails as below. Change to an APU-specific check method. [==========] Running 45 tests from 5 test cases. [----------] Global test environment set-up. [----------] 1 test from rocrtst [ RUN ] rocrtst.Test_Example #### TEST NAME #### Test Case Example #### TEST DESCRIPTION #### Put a description of the test case here. Line breaks will be taken care of on output, not here. #### TEST SETUP #### The gpu device name is gfx902 Target HW Profile is HSA_PROFILE_FULL Test can run on any profile. OK. /home/jenkins/hsa/runtime/rocrtst/common/base_rocr_utils.cc:180: Failure Value of: rocrtst::ProcessIterateError(err) Actual: 4096 Expected: HSA_STATUS_SUCCESS Which is: 0 HSA_STATUS_ERROR: A generic error has occurred. /home/jenkins/hsa/runtime/rocrtst/suites/test_common/test_case_template.cc:195: Failure Value of: HSA_STATUS_SUCCESS Actual: 0 Expected: err Which is: 4096 rocrtst64: /home/jenkins/hsa/runtime/rocrtst/common/base_rocr_utils.cc:416: hsa_kernel_dispatch_packet_t* rocrtst::WriteAQLToQueue(rocrtst::BaseRocR, uint64_t): Assertion `test->main_queue()' failed. ../shunit2: line 977: 1382 Aborted (core dumped) ./rocrtst$ROCRTST_BLD_BITS "$ROCRTST_ARGS" --gtest_output=xml:"$gtest_xml" failed (failed to run rocrtst) Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Change-Id: I03691bd4171b6e622231baf3dce4db2211eb47e7	2021-02-23 20:14:40 -05:00
Mike Li	93609fd3d4	Support for Custom Pitch for gfx103x Signed-off-by: Mike Li <Tianxinmike.Li@amd.com> Change-Id: Ica83dff8bb382637010396781190f585754bd150	2021-02-22 22:05:25 -05:00
Jason Tang	ec22afb8a8	Correct GetIsa() typo Change-Id: Ia6b5a86bd035fb077f0da9d52160ec8d12987b87	2021-02-17 11:57:58 -05:00
Sean Keely	34ac62274a	Correct legacy copy path. Legacy p2p copy path incorrectly transferred in whole pages rather than the requested size. Change-Id: I9aa7337754f9e32f587a0cc5305f8ffeb6196f10	2021-02-10 19:53:02 -05:00
Sean Keely	01f42dbe46	Add hsa_amd_signal_value_pointer. Enables partial signal interop with non-HSA devices. Change-Id: Ic39bca84ed1709cbd2cc24b1eb0f4fc6cccb39cf	2021-02-10 18:47:54 -05:00
Laurent Morichetti	9ca79d072a	New trap handler ABI (v4) Replace the stop reasons ttmp11.trap_raised and ttmp11.excp_raised with ttmp11.wave_stopped which indicates that the trap handler has halted the wave as the result of an event (trap, single-step or exception). If the wave is stopped because of a trap, also record the trap_id in ttmp11.saved_trap_id[7:0]. Save status.halt in ttmp11.saved_status_halt, so that it can be restored when resuming a wave (changing a wave's state from stopped to running or single-stepping). Change-Id: I7322f59b60e8cc1b92bf5f067dba606a3109ef49	2021-02-05 09:56:01 -08:00
Evgeny	c5aae30d08	adding gfx1030 blocks Change-Id: Ide2576939c5321dbe928183a8d9984d5ef87a61b	2021-01-29 08:50:10 -06:00
Huang Rui	554ed5e76d	Add gfx 10.3.3 into rocrtst list Change-Id: I854e5092236175e47a2134d703f154885cae8c3e Signed-off-by: Huang Rui <ray.huang@amd.com>	2021-01-22 04:22:15 -05:00
Huang Rui	feeb2f62e2	Add gfx10.3.3 ISA support for Van Gogh This patch is to let ROCr recognize new gfx10.3.3 ISA. Change-Id: Ied23eee2752e14c19c8c0a6d7789fded9940e31e Signed-off-by: Huang Rui <ray.huang@amd.com>	2021-01-22 04:22:15 -05:00
Laurent Morichetti	8aec53969f	Don't terminate waves halted at s_endpgm To support single stepping the instruction preceding an s_endpgm, unwind the PC by 8 bytes and set ttmp11[9] to notify the debugger that the wave is halted with a modified PC. Bump the debug r_version for this new trap handler ABI. Change-Id: I55e4e0d65576f92da14a336266c31c513baab547	2021-01-21 20:51:38 -08:00
Laurent Morichetti	8808ed3177	Correct gfx10.3+ trap handler. Change-Id: I77d2b41c8882014a430d741ecd777718a1f61561	2021-01-21 09:24:20 -08:00
Tony Tye	26fe26e415	Correct isa lookup for targets that do not support a target feature Change-Id: I130070a53162e5d9fcc6a64a4bdda7869179be82	2021-01-18 15:47:19 +00:00
Chris Freehill	09bc75bf0d	Correct some target ID strings for gfx908 Change-Id: I7833b561447b9928447cf49472cfe1ca1867e71d	2021-01-15 14:56:38 -06:00
Sean Keely	7bc6aac5d2	Correct computation of scratch slot requirements. Each SE must be assigned equal numbers of slots and slots must be assigned in units of whole groups. Change-Id: I8f3677237fa6f2e2d25e3e78210c5a7a0ad792f3	2021-01-13 15:09:00 -05:00
Sean Keely	9fe8ccc3ee	Revert "Revert "Cache scratch allocations."" This reverts commit `7e2ba23566`. Change-Id: I3f3c257270016559f8b2e70151664f0931db28d2	2021-01-13 15:08:53 -05:00
Tony Tye	6bbf6b1c9c	Improve Isa class - Use consistent naming in Isa class. - Remove unused Isa methods. - Simplify Isa methods. Change-Id: I7c4045d08fbfe0d94b3181db8ebc5e5ed8c8cc82	2021-01-10 18:23:54 +00:00
Tony	853ccc762e	Store target ID in isa registry Store target ID string in isa registry and use for returning agent and isa name. Change-Id: I72a20d8ff963c73d86392158aff3853e4c9bfdbd	2021-01-10 18:23:54 +00:00
Tony	12eb2764cd	Correct code object V2 support - Remove gfx800, gfx804 and gfx901 as they do not exist. - Map the V2 note record of "AMD:AMDGPU:8:0:0" to gfx802 as they are the same target just connected to a different motherboard. - Correct typo for supporting gfx902:xnack+. - Support agent names with a minor or stepping version greater than 9. Change-Id: Ife933449f60ab4687e2aaab9baf4c9fc5b86339d	2021-01-10 18:23:54 +00:00
Sean Keely	7e2ba23566	Revert "Cache scratch allocations." This reverts commit `27e044ae4d`. Change-Id: I698b33bacb2be3de6c8185fe89597a60a79521c5	2021-01-08 11:57:40 -06:00
Sean Keely	d39ae13420	Add support for gfx1032. Change-Id: I36f93a6b61e74cf17aac1a05d7c1d4ba6369fcc9	2021-01-05 17:28:19 -06:00
Sean Keely	bd63a2b690	Cleanup warnings when using clang. Change-Id: I09f72831e29bccdb4170c54e203872412e2f0b59	2020-12-04 22:18:14 -06:00
Tony	b443397bcc	Make supported targets consistent Add missing target names and make all parts consistent with which targets are supported. - Add gfx805 as a supported target. - Add all ELF targets to generic code. - Make offline loader match supported targets. Change-Id: Idab4d69edc71645aecaa83aa55e29c1aeee4c1d6	2020-11-24 03:14:31 +00:00
Sean Keely	6182abf5e9	Add asserts and minimum values for kernarg alignment and utility functions. Kernel argument size and alignment queries are not supported on code object v3. Change-Id: I1bdd34e2e62132f912ac39d80355efd3456df87c	2020-11-21 21:39:49 -06:00
Tony	ef755e4c82	Update code object V3 kernarg queries Code object V2 had the ability to support the following queries: - HSA_CODE_SYMBOL_INFO_KERNEL_KERNARG_SEGMENT_SIZE - HSA_EXECUTABLE_SYMBOL_INFO_KERNEL_KERNARG_SEGMENT_SIZE - HSA_CODE_SYMBOL_INFO_KERNEL_KERNARG_SEGMENT_ALIGNMENT - HSA_EXECUTABLE_SYMBOL_INFO_KERNEL_KERNARG_SEGMENT_ALIGNMENT However code object V3 onwards cannot support these as the kernel descriptor changed. These queries need to be deprecated. Until then return more reasonable values: - For kernarg alignment return 16 which is the minimum alignment required by the HSA standard. - For kernarg size return the field from the kernel descriptor which is a hint. If it is 0 then the compiler is not specifying the kernarg size, or the kernel has no kernarg. Change-Id: I19ce6cd0f3658a2bf62277492f39100ea5ab4256	2020-11-20 21:39:18 -05:00
Sean Keely	27e044ae4d	Cache scratch allocations. Avoids calling to KFD to map/unmap scratch allocations for every large scratch using dispatch. Change-Id: I9fab5705251ec82b03e4f2f2ca6da7cdccabefb9	2020-11-20 15:07:01 -05:00
Sean Keely	32d0fcafa9	Limit clock synchronization to 16Hz. Improves HIP event performance in directed benchmarks where clock sync latency is significant. Change-Id: I78b724a14a8f5b6a9a2b9f4d85afe9d8b81808a6	2020-11-20 15:06:13 -05:00
Sean Keely	b51f68b535	Style update for SDMA enable flag. Updated to match xnack flag's style. Change-Id: I6115c0b53660d789e698de1606a9388ae1789866	2020-11-20 15:06:02 -05:00
Cordell Bloor	4a35f560f6	Fix CMake configure error due to CMP0012 The modern meaning of the construct if( NOT ON ) was added in CMake 2.8, but when cmake_minimum_required is not set in user code and no policy level is set in the CMake config, CMake 2.8 features cannot be used. In old CMake (the default), ON is interpreted as a variable, and because it is not defined, it is considered false. The same is true of OFF. This change sets a variable as ON, so that the old CMake interpretation is correct, and the if works as expected regardless of policy version. Change-Id: I67d7ed4ceaf8248eeb5a1c7f54009d72313f3f5d	2020-11-20 15:04:41 -05:00
Cole Nelson	90f2dd5b1b	opensrc/hsa-runtime/CMakeLists.txt: conformant package names Names test good: hsa-rocr-dev_1.2.0.30900-crdnnv.415_amd64.deb hsa-rocr-dev-1.2.0.30900-crdnnv.415.el7.x86_64.rpm hsa-rocr-dev-1.2.0.30900-crdnnv.sles151.415.x86_64.rpm http://confluence.amd.com/display/GPUCPT/Package+File+Naming Note: rpm requires 'devel' instead of 'dev', to be a subsequent patchset. Change-Id: Id6a422f3c335448b52c70c77ed39c9041114b80f Signed-off-by: Cole Nelson <cole.nelson@amd.com>	2020-11-18 14:56:24 -05:00
Pruthvi Madugundu	87955f8551	Fix for uninstallation problem of hsa-rocr-dev - /opt/rocm-xx/hsa/include directory wasn't deleted after debian package uninstallation Signed-off-by: Pruthvi Madugundu <pruthvi.madugundu@amd.com> Change-Id: I213439d73f6533ff3a55e2b0071061d970cf56d4	2020-11-11 12:32:11 -08:00
Konstantin Zhuravlyov	3a08d0964e	Implement Target ID Proposal Changes from Konstantin Zhuravlyov, Tony Tye Change-Id: I532801193afa9d5b8ac2a877b5497eab661f0597	2020-11-10 13:42:35 -05:00
Sean Keely	a09ba8bcc8	Disable sram ecc reporting. Temporary workaround while language and compiler teams sort out handling both modes. Change-Id: I5d676cd546382dba05ec0b62bb885baa854614f6	2020-10-20 17:06:30 -05:00
Arun Sunil	8d00f1aa59	CMakeLists.txt: Fix issue with rocrtst Fix for issue where rocrtst could not be built if out directory was outside the src (WORK_ROOT) directory due to hard-coded relative path for OPENCL_INC_DIR. Change-Id: Icb93de2266d568e9c2437166e34c88ec526fb45c	2020-10-13 18:14:26 -04:00
Evgeny	0d1e5cbcb6	aqlprofile: adding counters DISABLE get-info id Change-Id: I90d0f6ae96b0d80c481648eecf907301fc13ab74	2020-10-12 17:12:25 -05:00
Sean Keely	9192dfe1b0	Initialize intercept queue packets properly. Change-Id: I0ff1540940665409a9ade3a517dd576a8f334c7b	2020-10-08 15:33:43 -05:00
Chris Freehill	2b41fb9fdc	Add README for rocrtst Change-Id: Icd43a243ccfc9caf5ade3cd0e7ffc00e251fc0a2	2020-09-28 20:27:53 -05:00
Sean Keely	a3c4aaf95a	Correct return type error in hsa_amd_signal_wait_any. The error checking macro IS_OPEN returns an hsa_signal_t. This conflicts with the return type of uint32_t. Add an assert and rely on spurious return rule to return zero when rocr is not initialized. Change-Id: Ifc9bb75e22ecdd675273de59b31e5026a69c62e0	2020-09-25 21:33:23 -04:00
Sean Keely	248904ab26	Add try/catch blocks to image APIs. Change-Id: I724dcc8015ac556649278dd6cdf1ad4097aaa846	2020-09-22 19:49:36 -04:00
Sean Keely	33a57ddf72	Correct image limits tables to SI limits. Limits remain unchanged through gfx1030. Change-Id: Ibdd39b7b97101ea0133af6cebdf295aeef81ac45	2020-09-22 19:49:08 -04:00
Chris Freehill	4944c74189	Add gfx1031 support Change-Id: I855f7fe8d096331d0c1da10b10adf6b1e75a527f	2020-09-10 11:06:58 -04:00
Sean Keely	2a0c6774fb	Use SDMA for small copies in VRAM. For small copies cache flush latency is larger than data transfer latency in local VRAM. Select SDMA for small copies. Environment key HSA_FORCE_SDMA_SIZE is added for easy adjustment of the small copy size. This may be removed after tuning is done. Change-Id: I733fa0ae01c616617c5de50e71226b51fd589ef2	2020-09-03 03:11:57 -05:00
Sean Keely	9c20f0e649	Correct memory release function. l_name is populated by strdup which requires using free rather than delete. Change-Id: I9d9bdcfaa3ef095502270f332b95a0ee5c0bbcfc	2020-08-26 18:22:59 -05:00


701 Commits