Commit Graph

526 Commits

Author SHA1 Message Date
Cole Nelson 72fa4a17fa hsa-runtime: add ENABLE_LDCONFIG to support multi-version install
Depends-On: I58fdf1d0b4e864b5a61ffe8e335d430d424811ab
Change-Id: I0cb6f8711ea5033e84b7e45ce20e7e23d84005c3
Signed-off-by: Cole Nelson <cole.nelson@amd.com>
2021-03-26 18:37:04 -04:00
Laurent Morichetti ea6ee0aa81 New trap handler ABI (v5)
Park the wave, if it is stopped, to avoid halting it at an s_endpgm
instruction if the architecture does not support it.

Free ttmp6 by converting the dispatch_ptr into a queue packet index
(25-bit) and storing it in ttmp7[24:0].

Save the exception PC in ttmp11[22:7] ttmp6[31:0].

Change-Id: Iaa3c5baf5b488c0b534044d338f12bffa63ddce2
2021-03-04 21:44:14 -05:00
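The register layout described in commit ea6ee0aa81 can be illustrated with a small host-side sketch. The real trap handler is GPU code; the `Ttmps` struct and the helper names below are hypothetical, and reading "ttmp11[22:7] ttmp6[31:0]" as a 48-bit PC (upper 16 bits in ttmp11, lower 32 bits in ttmp6) is an assumption based on the commit message alone.

```cpp
#include <cstdint>

// Hypothetical host-side model of the three trap temporaries mentioned in
// the commit message. This is not the trap handler itself.
struct Ttmps {
  uint32_t ttmp6;
  uint32_t ttmp7;
  uint32_t ttmp11;
};

constexpr uint32_t kPacketIndexMask = (1u << 25) - 1;     // ttmp7[24:0]
constexpr uint32_t kPcHiShift = 7;                        // ttmp11[22:7] starts at bit 7
constexpr uint32_t kPcHiMask = 0xFFFFu << kPcHiShift;     // 16 bits wide

// Store a 25-bit queue packet index without disturbing ttmp7[31:25].
inline void SavePacketIndex(Ttmps& t, uint32_t index) {
  t.ttmp7 = (t.ttmp7 & ~kPacketIndexMask) | (index & kPacketIndexMask);
}

// Assumed split: PC[31:0] goes to ttmp6, PC[47:32] goes to ttmp11[22:7].
inline void SaveExceptionPc(Ttmps& t, uint64_t pc) {
  t.ttmp6 = static_cast<uint32_t>(pc);
  const uint32_t hi = static_cast<uint32_t>(pc >> 32) & 0xFFFFu;
  t.ttmp11 = (t.ttmp11 & ~kPcHiMask) | (hi << kPcHiShift);
}

// Reassemble the 48-bit PC from the two fields.
inline uint64_t LoadExceptionPc(const Ttmps& t) {
  const uint64_t hi = (t.ttmp11 & kPcHiMask) >> kPcHiShift;
  return (hi << 32) | t.ttmp6;
}
```

The masks keep the neighboring bits of ttmp7 and ttmp11 intact, which matters because other fields share those registers.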
Laurent Morichetti 7e0f391a08 Correct the trap handler
ttmp11 no longer has an "excp_raised" field.

Change-Id: I8e673ca404c2b802470bbc9f76e7925782076c5a
2021-03-04 21:21:26 -05:00
Sean Keely 191664cd20 Insert scratch memory into scratch cache on full profile systems.
Scratch cache was not updated for IOMMUv2 systems previously.
This both negates the cache and causes a segfault during scratch
release.

Change-Id: I71e81d6b642d65ca135868ff7225ea173529d458
2021-03-03 21:30:16 -05:00
Mike Li 93609fd3d4 Support for Custom Pitch for gfx103x
Signed-off-by: Mike Li <Tianxinmike.Li@amd.com>
Change-Id: Ica83dff8bb382637010396781190f585754bd150
2021-02-22 22:05:25 -05:00
Jason Tang ec22afb8a8 Correct GetIsa() typo
Change-Id: Ia6b5a86bd035fb077f0da9d52160ec8d12987b87
2021-02-17 11:57:58 -05:00
Sean Keely 34ac62274a Correct legacy copy path.
Legacy p2p copy path incorrectly transferred in whole pages rather than
the requested size.

Change-Id: I9aa7337754f9e32f587a0cc5305f8ffeb6196f10
2021-02-10 19:53:02 -05:00
Sean Keely 01f42dbe46 Add hsa_amd_signal_value_pointer.
Enables partial signal interop with non-HSA devices.

Change-Id: Ic39bca84ed1709cbd2cc24b1eb0f4fc6cccb39cf
2021-02-10 18:47:54 -05:00
Laurent Morichetti 9ca79d072a New trap handler ABI (v4)
Replace the stop reasons ttmp11.trap_raised and ttmp11.excp_raised
with ttmp11.wave_stopped which indicates that the trap handler has
halted the wave as the result of an event (trap, single-step or
exception).

If the wave is stopped because of a trap, also record the trap_id in
ttmp11.saved_trap_id[7:0].

Save status.halt in ttmp11.saved_status_halt, so that it can be
restored when resuming a wave (changing a wave's state from stopped to
running or single-stepping).

Change-Id: I7322f59b60e8cc1b92bf5f067dba606a3109ef49
2021-02-05 09:56:01 -08:00
Evgeny c5aae30d08 adding gfx1030 blocks
Change-Id: Ide2576939c5321dbe928183a8d9984d5ef87a61b
2021-01-29 08:50:10 -06:00
Huang Rui feeb2f62e2 Add gfx10.3.3 ISA support for Van Gogh
This patch is to let ROCr recognize new gfx10.3.3 ISA.

Change-Id: Ied23eee2752e14c19c8c0a6d7789fded9940e31e
Signed-off-by: Huang Rui <ray.huang@amd.com>
2021-01-22 04:22:15 -05:00
Laurent Morichetti 8aec53969f Don't terminate waves halted at s_endpgm
To support single stepping the instruction preceding an s_endpgm,
unwind the PC by 8 bytes and set ttmp11[9] to notify the debugger
that the wave is halted with a modified PC.

Bump the debug r_version for this new trap handler ABI.

Change-Id: I55e4e0d65576f92da14a336266c31c513baab547
2021-01-21 20:51:38 -08:00
Laurent Morichetti 8808ed3177 Correct gfx10.3+ trap handler.
Change-Id: I77d2b41c8882014a430d741ecd777718a1f61561
2021-01-21 09:24:20 -08:00
Tony Tye 26fe26e415 Correct isa lookup for targets that do not support a target feature
Change-Id: I130070a53162e5d9fcc6a64a4bdda7869179be82
2021-01-18 15:47:19 +00:00
Chris Freehill 09bc75bf0d Correct some target ID strings for gfx908
Change-Id: I7833b561447b9928447cf49472cfe1ca1867e71d
2021-01-15 14:56:38 -06:00
Sean Keely 7bc6aac5d2 Correct computation of scratch slot requirements.
Each SE must be assigned equal numbers of slots and slots
must be assigned in units of whole groups.

Change-Id: I8f3677237fa6f2e2d25e3e78210c5a7a0ad792f3
2021-01-13 15:09:00 -05:00
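The two constraints named in commit 7bc6aac5d2 (equal slots per SE, whole groups only) can be sketched as a rounding step. This is a minimal illustration, not the ROCr computation: the function name, the parameters, and the choice to round up rather than down are all assumptions.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: round a requested slot count so that every shader
// engine (SE) gets the same number of slots and each SE's share is a whole
// number of groups. Rounding up is an assumption of this sketch.
inline uint64_t RoundScratchSlots(uint64_t requested_slots,
                                  uint64_t num_se,
                                  uint64_t slots_per_group) {
  assert(num_se > 0 && slots_per_group > 0);
  // Split the request evenly across SEs, rounding up.
  uint64_t per_se = (requested_slots + num_se - 1) / num_se;
  // Round each SE's share up to a whole number of groups.
  per_se = (per_se + slots_per_group - 1) / slots_per_group * slots_per_group;
  return per_se * num_se;
}
```

For example, a request for 100 slots on 4 SEs with groups of 8 gives 25 slots per SE, which rounds to 32, for 128 slots in total.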
Sean Keely 9fe8ccc3ee Revert "Revert "Cache scratch allocations.""
This reverts commit 7e2ba23566.

Change-Id: I3f3c257270016559f8b2e70151664f0931db28d2
2021-01-13 15:08:53 -05:00
Tony Tye 6bbf6b1c9c Improve Isa class
- Use consistent naming in Isa class.
- Remove unused Isa methods.
- Simplify Isa methods.

Change-Id: I7c4045d08fbfe0d94b3181db8ebc5e5ed8c8cc82
2021-01-10 18:23:54 +00:00
Tony 853ccc762e Store target ID in isa registry
Store target ID string in isa registry and use for returning agent and
isa name.

Change-Id: I72a20d8ff963c73d86392158aff3853e4c9bfdbd
2021-01-10 18:23:54 +00:00
Tony 12eb2764cd Correct code object V2 support
- Remove gfx800, gfx804 and gfx901 as they do not exist.
- Map the V2 note record of "AMD:AMDGPU:8:0:0" to gfx802 as they are
  the same target just connected to a different motherboard.
- Correct typo for supporting gfx902:xnack+.
- Support agent names with a minor or stepping version greater than 9.

Change-Id: Ife933449f60ab4687e2aaab9baf4c9fc5b86339d
2021-01-10 18:23:54 +00:00
Sean Keely 7e2ba23566 Revert "Cache scratch allocations."
This reverts commit 27e044ae4d.

Change-Id: I698b33bacb2be3de6c8185fe89597a60a79521c5
2021-01-08 11:57:40 -06:00
Sean Keely d39ae13420 Add support for gfx1032.
Change-Id: I36f93a6b61e74cf17aac1a05d7c1d4ba6369fcc9
2021-01-05 17:28:19 -06:00
Tony b443397bcc Make supported targets consistent
Add missing target names and make all parts consistent with which
targets are supported.

- Add gfx805 as a supported target.

- Add all ELF targets to generic code.

- Make offline loader match supported targets.

Change-Id: Idab4d69edc71645aecaa83aa55e29c1aeee4c1d6
2020-11-24 03:14:31 +00:00
Tony ef755e4c82 Update code object V3 kernarg queries
Code object V2 had the ability to support the following queries:

- HSA_CODE_SYMBOL_INFO_KERNEL_KERNARG_SEGMENT_SIZE
- HSA_EXECUTABLE_SYMBOL_INFO_KERNEL_KERNARG_SEGMENT_SIZE
- HSA_CODE_SYMBOL_INFO_KERNEL_KERNARG_SEGMENT_ALIGNMENT
- HSA_EXECUTABLE_SYMBOL_INFO_KERNEL_KERNARG_SEGMENT_ALIGNMENT

However code object V3 onwards cannot support these as the kernel
descriptor changed. These queries need to be deprecated.

Until then return more reasonable values:

- For kernarg alignment return 16 which is the minimum alignment
  required by the HSA standard.

- For kernarg size return the field from the kernel descriptor which
  is a hint. If it is 0 then the compiler is not specifying the kernarg
  size, or the kernel has no kernarg.

Change-Id: I19ce6cd0f3658a2bf62277492f39100ea5ab4256
2020-11-20 21:39:18 -05:00
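The interim behavior described in commit ef755e4c82 can be sketched as follows. The `KernelDescriptorView` struct and both function names are hypothetical stand-ins, not the runtime's types; only the constant 16 and the "0 means unspecified or no kernarg" rule come from the commit message.

```cpp
#include <cstdint>

// Hypothetical view of the one kernel descriptor field this sketch needs.
struct KernelDescriptorView {
  uint32_t kernarg_size;  // a hint; 0 means unspecified or no kernarg
};

// Minimum kernarg alignment required by the HSA standard, per the commit.
constexpr uint32_t kMinKernargAlignment = 16;

// Alignment query: always report the minimum.
inline uint32_t KernargSegmentAlignment() { return kMinKernargAlignment; }

// Size query: pass the descriptor hint through unchanged. Callers cannot
// tell "no kernarg" from "compiler did not specify" when this returns 0.
inline uint32_t KernargSegmentSizeHint(const KernelDescriptorView& kd) {
  return kd.kernarg_size;
}
```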
Sean Keely 27e044ae4d Cache scratch allocations.
Avoids calling into KFD to map/unmap scratch allocations for
every dispatch that uses large scratch.

Change-Id: I9fab5705251ec82b03e4f2f2ca6da7cdccabefb9
2020-11-20 15:07:01 -05:00
Sean Keely 32d0fcafa9 Limit clock synchronization to 16Hz.
Improves HIP event performance in directed benchmarks where
clock sync latency is significant.

Change-Id: I78b724a14a8f5b6a9a2b9f4d85afe9d8b81808a6
2020-11-20 15:06:13 -05:00
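Capping an operation at 16Hz, as commit 32d0fcafa9 does for clock synchronization, amounts to allowing it at most once every 62.5 ms. The sketch below shows one way to express such a cap; the `SyncLimiter` class is hypothetical and is not taken from the runtime.

```cpp
#include <chrono>

// 1/16 s, the shortest interval allowed between two synchronizations.
constexpr std::chrono::microseconds kMinSyncInterval{62500};

// Hypothetical limiter: the caller passes the current time so the logic
// stays deterministic and easy to exercise.
class SyncLimiter {
 public:
  using Clock = std::chrono::steady_clock;

  // Returns true when a synchronization may run now, and records it.
  bool ShouldSync(Clock::time_point now) {
    if (has_synced_ && now - last_sync_ < kMinSyncInterval) return false;
    last_sync_ = now;
    has_synced_ = true;
    return true;
  }

 private:
  Clock::time_point last_sync_{};
  bool has_synced_ = false;
};
```

A caller would typically pass `SyncLimiter::Clock::now()` and reuse the previous clock correlation whenever `ShouldSync` returns false.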
Sean Keely b51f68b535 Style update for SDMA enable flag.
Updated to match xnack flag's style.

Change-Id: I6115c0b53660d789e698de1606a9388ae1789866
2020-11-20 15:06:02 -05:00
Cordell Bloor 4a35f560f6 Fix CMake configure error due to CMP0012
The modern meaning of the construct if( NOT ON ) was added in CMake 2.8,
but when cmake_minimum_required is not set in user code and no policy
level is set in the CMake config, then CMake 2.8 features cannot be
used. In old CMake (the default), ON is interpreted as a variable, and
because it is not defined, it is considered false. The same is true of
OFF.

This change sets a variable as ON, so that the old CMake interpretation
is correct, and the if works as expected regardless of policy version.

Change-Id: I67d7ed4ceaf8248eeb5a1c7f54009d72313f3f5d
2020-11-20 15:04:41 -05:00
Cole Nelson 90f2dd5b1b opensrc/hsa-runtime/CMakeLists.txt: conformant package names
Names test good:
hsa-rocr-dev_1.2.0.30900-crdnnv.415_amd64.deb
hsa-rocr-dev-1.2.0.30900-crdnnv.415.el7.x86_64.rpm
hsa-rocr-dev-1.2.0.30900-crdnnv.sles151.415.x86_64.rpm

http://confluence.amd.com/display/GPUCPT/Package+File+Naming

Note: rpm requires 'devel' instead of 'dev', to be a subsequent
patchset.

Change-Id: Id6a422f3c335448b52c70c77ed39c9041114b80f
Signed-off-by: Cole Nelson <cole.nelson@amd.com>
2020-11-18 14:56:24 -05:00
Pruthvi Madugundu 87955f8551 Fix for uninstallation problem of hsa-rocr-dev
- /opt/rocm-xx/hsa/include directory wasn't deleted after
  debian package uninstallation

Signed-off-by: Pruthvi Madugundu <pruthvi.madugundu@amd.com>
Change-Id: I213439d73f6533ff3a55e2b0071061d970cf56d4
2020-11-11 12:32:11 -08:00
Konstantin Zhuravlyov 3a08d0964e Implement Target ID Proposal
Changes from Konstantin Zhuravlyov, Tony Tye

Change-Id: I532801193afa9d5b8ac2a877b5497eab661f0597
2020-11-10 13:42:35 -05:00
Sean Keely a09ba8bcc8 Disable sram ecc reporting.
Temporary workaround while language and compiler teams sort out
handling both modes.

Change-Id: I5d676cd546382dba05ec0b62bb885baa854614f6
2020-10-20 17:06:30 -05:00
Evgeny 0d1e5cbcb6 aqlprofile: adding counters DISABLE get-info id
Change-Id: I90d0f6ae96b0d80c481648eecf907301fc13ab74
2020-10-12 17:12:25 -05:00
Sean Keely 9192dfe1b0 Initialize intercept queue packets properly.
Change-Id: I0ff1540940665409a9ade3a517dd576a8f334c7b
2020-10-08 15:33:43 -05:00
Sean Keely a3c4aaf95a Correct return type error in hsa_amd_signal_wait_any.
The error checking macro IS_OPEN returns an hsa_signal_t.
This conflicts with the return type of uint32_t.

Add an assert and rely on spurious return rule to return zero
when rocr is not initialized.

Change-Id: Ifc9bb75e22ecdd675273de59b31e5026a69c62e0
2020-09-25 21:33:23 -04:00
Sean Keely 248904ab26 Add try/catch blocks to image APIs.
Change-Id: I724dcc8015ac556649278dd6cdf1ad4097aaa846
2020-09-22 19:49:36 -04:00
Sean Keely 33a57ddf72 Correct image limits tables to SI limits.
Limits remain unchanged through gfx1030.

Change-Id: Ibdd39b7b97101ea0133af6cebdf295aeef81ac45
2020-09-22 19:49:08 -04:00
Chris Freehill 4944c74189 Add gfx1031 support
Change-Id: I855f7fe8d096331d0c1da10b10adf6b1e75a527f
2020-09-10 11:06:58 -04:00
Sean Keely 2a0c6774fb Use SDMA for small copies in VRAM.
For small copies cache flush latency is larger than data transfer
latency in local VRAM.  Select SDMA for small copies.

Environment key HSA_FORCE_SDMA_SIZE is added for easy adjustment
of the small copy size.  This may be removed after tuning is done.

Change-Id: I733fa0ae01c616617c5de50e71226b51fd589ef2
2020-09-03 03:11:57 -05:00
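The size-based selection described in commit 2a0c6774fb can be sketched as a threshold check. Only the environment key `HSA_FORCE_SDMA_SIZE` comes from the commit message; the function names, the enum, the fallback value supplied by the caller, and the `<=` comparison are assumptions of this sketch.

```cpp
#include <cstdint>
#include <cstdlib>

enum class CopyEngine { kSdma, kDefaultPath };

// Parse a decimal byte count; fall back when the text is missing or malformed.
inline uint64_t ParseSmallCopySize(const char* text, uint64_t fallback) {
  if (text == nullptr || *text == '\0') return fallback;
  char* end = nullptr;
  const unsigned long long parsed = std::strtoull(text, &end, 10);
  if (end == text || *end != '\0') return fallback;
  return static_cast<uint64_t>(parsed);
}

// Read the tuning key named in the commit message.
inline uint64_t SmallCopySizeFromEnv(uint64_t fallback) {
  return ParseSmallCopySize(std::getenv("HSA_FORCE_SDMA_SIZE"), fallback);
}

// Copies at or below the threshold go to SDMA (inclusive bound assumed).
inline CopyEngine SelectEngine(uint64_t copy_bytes, uint64_t small_copy_size) {
  return copy_bytes <= small_copy_size ? CopyEngine::kSdma
                                       : CopyEngine::kDefaultPath;
}
```

Keeping the parsing separate from the environment lookup makes the threshold logic testable without touching process state.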
Sean Keely 9c20f0e649 Correct memory release function.
l_name is populated by strdup which requires using free rather
than delete.

Change-Id: I9d9bdcfaa3ef095502270f332b95a0ee5c0bbcfc
2020-08-26 18:22:59 -05:00
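The rule behind commit 9c20f0e649 is that memory returned by `strdup` comes from `malloc`, so it must be released with `free`, never `delete`. One way to make that pairing hard to get wrong is a `std::unique_ptr` with a custom deleter; the names `FreeDeleter`, `CStringPtr`, and `DuplicateName` below are hypothetical and do not appear in the runtime.

```cpp
#include <cassert>
#include <cstdlib>
#include <memory>
#include <string.h>  // strdup (POSIX)

// Deleter that releases malloc-family memory with free().
struct FreeDeleter {
  void operator()(char* p) const { std::free(p); }
};

using CStringPtr = std::unique_ptr<char, FreeDeleter>;

// Duplicate a C string; the result is freed automatically with free().
inline CStringPtr DuplicateName(const char* name) {
  assert(name != nullptr);
  return CStringPtr(strdup(name));
}
```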
Sean Keely 5f43778a51 Convert from double to uint64_t in two steps.
We want wraparound behavior here but we don't want to trigger sanitizer
warnings.  Converting to int64_t and then wrapping around by cast to
uint64_t avoids the UB issue that triggers the sanitizer warning.

Change-Id: I9400b988dce7899e9ba42cab3e35c7ffedec8fe1
2020-08-25 20:12:52 -05:00
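The two-step conversion from commit 5f43778a51 can be sketched as below. Converting a negative `double` directly to `uint64_t` is undefined behavior, whereas converting an in-range value to `int64_t` is defined, and the following signed-to-unsigned cast wraps modulo 2^64. The helper name `WrapToU64` is hypothetical, and the sketch assumes the input fits in the `int64_t` range.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, not the runtime's code.
inline uint64_t WrapToU64(double value) {
  // Precondition of this sketch: the truncated value fits in int64_t.
  assert(value >= -9223372036854775808.0 && value < 9223372036854775808.0);
  // Step 1: truncate toward zero into a signed integer (defined behavior).
  const int64_t as_signed = static_cast<int64_t>(value);
  // Step 2: signed-to-unsigned conversion wraps modulo 2^64 (defined behavior).
  return static_cast<uint64_t>(as_signed);
}
```

With this helper, -1.0 maps to the largest `uint64_t` value, which is the wraparound the commit message asks for.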
Cole Nelson 24bad55dc7 packaging: set arch, field separators, vendor info
Enables standards compliant package naming for debian and rpm.

Change-Id: Iad86bf942b4e2938516ef46cda6fa2e4bb3744cc
Signed-off-by: Cole Nelson <cole.nelson@amd.com>
2020-08-21 11:33:05 -04:00
Sean Keely 1d919adc75 Add gfx1030 to image blit kernel build list.
Change-Id: I2ddb6a595bb7ca5f6a94f38f8ecc2e40831c52fd
2020-08-12 16:38:39 -05:00
Sean Keely 78e5c06ea8 Switch to release e_flags id for gfx1030.
Change-Id: I51c9ecdf78d6ec56ccc70ca5777bb011db35fda3
2020-08-12 16:38:16 -05:00
Sean Keely dc7e5e7e46 Add xnack isa recognition to gfx1030.
Change-Id: I99301a62f1952b6a3cc548272f4129ad8c0542da
2020-08-12 16:34:17 -05:00
Sean Keely ddfe07871a Add ELF types for gfx1030.
Change-Id: If875534d698da9840e47c380d5630b6dd742ab0c
2020-08-12 16:34:17 -05:00
Chris Freehill e702531b40 Add gfx1030 support
Change-Id: I4bccc731ba802480925f98c6c42593503bf9b98d
2020-08-12 16:34:10 -05:00
Sean Keely f4fe7ddf47 Make explicit reference between init modules.
Make explicit reference to hsa_api_trace.cpp from
initialization of hsa_table_interface.cpp.  Breaks
the ability to use hsa_table_interface.cpp in plugins.

Change-Id: I22a42d3a132512b0d9ec7a1ca629b169e7f8eba7
2020-07-15 16:02:15 -04:00
Aaron Enye Shi d23b26f760 Update to use new bitcode library structure
Rather than manually linking to the device libraries, the compiler
can now handle linking with them. Allow the build to continue using
old layout if the build system still uses it. Therefore maintain
compatibility with ROCm 3.7 and earlier.

Change-Id: Ida81775da3d0f7c2c67386a71cb057ede31a1545
2020-07-14 15:55:08 -04:00
Sean Keely f6e6eae86d Remove unnecessary HSA_API declarations.
The excess declarations mark implementation functions as default
visibility.  Normally this is not an issue since our linker script
will specify which visible symbols will be permitted into the dynamic
symbol table.  However, for static linking methods which apply linker
directives during incremental linking symbol visibility must be correct
in the (non-dynamic) symbol table.

Change-Id: I13dc8dd1019368e8943920d36335a91f0c555a92
2020-07-07 16:41:34 -04:00