Граф коммитов

58 Коммитов

Автор SHA1 Сообщение Дата
Laurent Morichetti 9ca79d072a New trap handler ABI (v4)
Replace the stop reasons ttmp11.trap_raised and ttmp11.excp_raised
with ttmp11.wave_stopped which indicates that the trap handler has
halted the wave as the result of an event (trap, single-step or
exception).

If the wave is stopped because of a trap, also record the trap_id in
ttmp11.saved_trap_id[7:0].

Save status.halt in ttmp11.saved_status_halt, so that it can be
restored when resuming a wave (changing a wave's state from stopped to
running or single-stepping).

Change-Id: I7322f59b60e8cc1b92bf5f067dba606a3109ef49
2021-02-05 09:56:01 -08:00
Huang Rui feeb2f62e2 Add gfx10.3.3 ISA support for Van Gogh
This patch is to let ROCr recognize new gfx10.3.3 ISA.

Change-Id: Ied23eee2752e14c19c8c0a6d7789fded9940e31e
Signed-off-by: Huang Rui <ray.huang@amd.com>
2021-01-22 04:22:15 -05:00
Laurent Morichetti 8aec53969f Don't terminate waves halted at s_endpgm
To support single stepping the instruction preceding an s_endpgm,
unwind the PC by 8 bytes and set ttmp11[9] to notify the debugger
that the wave is halted with a modified PC.

Bump the debug r_version for this new trap handler ABI.

Change-Id: I55e4e0d65576f92da14a336266c31c513baab547
2021-01-21 20:51:38 -08:00
Tony 12eb2764cd Correct code object V2 support
- Remove gfx800, gfx804 and gfx901 as they do not exist.
- Map the V2 note record of "AMD:AMDGPU:8:0:0" to gfx802 as they are
  the same target just connected to a differnt motherboard.
- Correct typo for supporting gfx902:xnack+.
- Support agent names with a minor or stepping version greater than 9.

Change-Id: Ife933449f60ab4687e2aaab9baf4c9fc5b86339d
2021-01-10 18:23:54 +00:00
Sean Keely d39ae13420 Add support for gfx1032.
Change-Id: I36f93a6b61e74cf17aac1a05d7c1d4ba6369fcc9
2021-01-05 17:28:19 -06:00
Tony b443397bcc Make supported targets consistent
Add missing target names and make all parts consistent with which
targets are supported.

- Add gfx805 as a supported target.

- Add all ELF targets to genric code.

- Make offline loader match supported targets.

Change-Id: Idab4d69edc71645aecaa83aa55e29c1aeee4c1d6
2020-11-24 03:14:31 +00:00
Tony ef755e4c82 Update code object V3 kernarg queries
Code object V2 had the ability to support the following queries:

- HSA_CODE_SYMBOL_INFO_KERNEL_KERNARG_SEGMENT_SIZE
- HSA_EXECUTABLE_SYMBOL_INFO_KERNEL_KERNARG_SEGMENT_SIZE
- HSA_CODE_SYMBOL_INFO_KERNEL_KERNARG_SEGMENT_ALIGNMENT
- HSA_EXECUTABLE_SYMBOL_INFO_KERNEL_KERNARG_SEGMENT_ALIGNMENT

However code object V3 onwards cannot support these as the kernel
descriptor changed. These queries need to be deprecated.

Until then return more reasonable values:

- For kernarg alignment return 16 which is the minimum alignment
  required by the HSA standard.

- For kernarg size return the field from the kernel descriptor which
  is a hint. If it is 0 then the compiler is not specifying the kernarg
  size, or the kernel has no kernarg.

Change-Id: I19ce6cd0f3658a2bf62277492f39100ea5ab4256
2020-11-20 21:39:18 -05:00
Konstantin Zhuravlyov 3a08d0964e Implement Target ID Proposal
Changes from Konstantin Zhuravlyov, Tony Tye

Change-Id: I532801193afa9d5b8ac2a877b5497eab661f0597
2020-11-10 13:42:35 -05:00
Chris Freehill 4944c74189 Add gfx1031 support
Change-Id: I855f7fe8d096331d0c1da10b10adf6b1e75a527f
2020-09-10 11:06:58 -04:00
Sean Keely 9c20f0e649 Correct memory release function.
l_name is populated by strdup which requires using free rather
than delete.

Change-Id: I9d9bdcfaa3ef095502270f332b95a0ee5c0bbcfc
2020-08-26 18:22:59 -05:00
Chris Freehill e702531b40 Add gfx1030 support
Change-Id: I4bccc731ba802480925f98c6c42593503bf9b98d
2020-08-12 16:34:10 -05:00
Sean Keely 3646465429 Remove legacy build file.
Change-Id: I827664a5bb11bcea3176449cbf766b6cb3b93696
2020-06-19 22:35:05 -04:00
Ramesh Errabolu fa13208698 Add rocr namespace to core header and impl files
Change-Id: I1e1b33f9bba1078d049bc19797889988c3e43360
2020-06-19 22:34:21 -04:00
Sean Keely ce19721c88 Update copyright date.
Change-Id: If4bf4c20cf051878bfe759080bb7345d884dd53d
2020-06-19 22:34:01 -04:00
Konstantin Zhuravlyov 9eb735ec24 Add support for code object URI to ROCr
Adds the following:
	- New factory method to create a code object reader from
          file with offset and size.
	- A pair of queries on a loaded code object to get the URI name/length.
	- A bump to the AMD vendor loader extension API and its associated table.

Change-Id: I17c83e9c2447d29a43c438459395365f786a3611
2020-06-01 11:07:50 -04:00
Laurent Morichetti 00da82f951 Add debugger support for wave halted at launch
New trap handler ABI: Record in ttmp11[8:7] the event that caused the
trap handler to be entered. We currently record 2 events, trap_raised
if an s_trap instruction was executed, or excp_raised if an exception
(MEM_VIOL or ILLEGAL_INST) was raised.

Change-Id: Ie278c8277437b3b67c2737dcd1a12fe6511df428
2020-04-29 19:29:56 -04:00
Laurent Morichetti 5f783494f1 Return a file URI for elf images in shared objects
Iterate the loaded shared objects to see if the given elf image binary
is part of a loaded segment.

Change-Id: I074cacd99eb5b59f883f4ce2bd901e0e35a660b8
2020-04-14 15:22:43 -04:00
Ramesh Errabolu 627991b1c1 Update how code references publicly available ROCr headers
Change-Id: I357c51eb713a23704d4fee71081be46a73a71806
2020-02-21 20:01:11 -05:00
Saleel Kudchadker c57f3da1dc Reset link_map map in the constructor
Change-Id: I8a6ad3bc0fca790dec2992cacf9288068b3bcaa3
2020-02-19 15:29:35 -08:00
Sean Keely 3e9aca0f34 Support stripped binaries and remove unneeded attributes.
Attribute optimize(0) doesn't appear to be helpful helpful.  This
prevents optimization in the function but not at call sites to the
function.  The function may still be inlined since it has no side
effect (in some cases that we currently don't support).

Having a side effect prevents a call site optimization that allows
removal of a noinline function call with no side effect.  Call site
optimization should only happen (in GCC at least) when using whole
program optimization so this may be stronger than we strictly need.

Also added _amdgpu_r_debug to the exported symbol list (global) and
switched to the standard macro for an exported symbol (HSA_API).
Without being in the global list the debugger will not find this
symbol if the binary has been stripped.

Change-Id: Ieb00175ccc55fda4491deee44711cd55b3f24aeb
2020-01-21 20:08:02 -05:00
Laurent Morichetti 19e1fb3a4e Fix a build error when compiling with clang
Check __clang__ before __GNUC__ as clang defines both.

Change-Id: I9963f8e0665efb4cb08bd3886fb38fee42dd9861
2020-01-15 18:52:53 -08:00
Qingchuan Shi d63886190f fix optimize(0) for clang.
Change-Id: I83bc57d42815f37445ae97bf6950147e3358ac45
2020-01-13 20:53:40 -05:00
Qingchuan Shi 16a20cfb8c Adding code object list in loader.
Change-Id: Iab3541287bd56276fd32615ee59fcd590de84ca0
2019-10-30 20:31:51 -04:00
Chris Freehill 6ebdad5896 Initial support for gfx1010, gfx1011, gfx1012
Change-Id: I9ec398070c85db08aea72947557c6e1b5f7d541d
2019-09-12 20:24:30 -05:00
Konstantin Zhuravlyov 2275c74695 Loader: add basic logging abilities
- Enabled with env var LOADER_ENABLE_LOGGING=1

Change-Id: Ibdbb1b55ffddb7dc9c63e52fc9db3013409376a4
2019-08-21 13:29:15 -04:00
Chris Freehill 447a30e985 Initial gfx908 updates
Change-Id: I3d6307d6613a38861a95561b9ac68abaa5964b48
2019-07-22 17:25:06 -04:00
Sean Keely 465a8eb40b PR from github user DiamondLovesYou.
Allow user specified profiles if the HSAIL note is not found.

Konstantin reviewed and approved.  HSAIL note is not generated by LLVM.

Change-Id: I40fbfbaedd6787b6a716507918f698d02007afe1
2019-07-16 13:55:38 -05:00
Sean Keely e89f9807f1 Patch from github.
At the moment it is not possible to build ROCr with Clang. This is
a spurious limitation. The present PR addresses it by guarding GCC
only flags and by fixing some additional warnings that Clang triggers;
one of said warnings did outline a rather interesting issue with math
being done on void*s. - AlexVlx

Void ptr arithmetic had already been fixed in amd-master branch.

Change-Id: I5ee97e20b5c40b10dd73facecabe75f02ba46462
2019-04-29 16:17:24 -04:00
Konstantin Zhuravlyov 7001134757 Process symbols with 0 address
Change-Id: I9ed943a8ccd3b103edd6aba8264c009d8cda29fa
2019-03-30 02:14:43 -04:00
Konstantin Zhuravlyov 8bee6e4976 Loader: update symbol processing for v2+
- Skip symbols that are STB_LOCAL and not STT_AMDGPU_HSA_KERNEL

Change-Id: I68567f58de9bf3f07dbd8020ef63f47667c86367
2019-01-18 15:42:28 -05:00
Konstantin Zhuravlyov c1ad82a6b7 Loader updates for code object v3
- Fix loading in some cases
  - Fix symbol kind

Change-Id: I721b4a35972b6d2a6d0ac733ab770b096cc74e17
2019-01-18 15:41:01 -05:00
Konstantin Zhuravlyov a447d79430 Fix dynamic relocations:
- Process dynamic relocation even if there is
    no symbol associated to it.

Change-Id: Iaefee682ee52f5acda8280e5764e6d5fd992774a
2018-11-14 15:25:41 -05:00
Konstantin Zhuravlyov 509bb777e0 Loader: Update license for AMDHSAKernelDescriptor.h
Change-Id: I3a48b595ba089ca8a25f878c056b04a417a2364f
2018-10-12 14:51:05 -04:00
Konstantin Zhuravlyov 386874da55 Loader: Add support for v3 object code.
Change-Id: I7215bd0c1277c2036bf0fadf5b23cb57fdf7f665
2018-10-06 14:01:59 -04:00
Scott Linder 47f0e6f7d3 Apply dynamic relocations for STT_FUNC symbols
Required to support function calls through GOT table.

Change-Id: I174a0269fdd67369d38fe41855b7bd01f350b839
2018-09-23 21:42:32 -04:00
Konstantin Zhuravlyov 7ef70f7eaa Bring naming on par with the spec (hsa-runtime)
Change-Id: Ie1903c90a195cf95b186eb5552131a20af408adf
2018-04-10 09:15:02 -04:00
Wilkin 8e3d26c617 ROCm Runtime Support for respecting target xnack setting
This includes the changes provided by Konstantin, "Add xnack from elf header" (Change 136389).

Change-Id: I95e51141caa0d7c21903b09212c02e4906ec54a3
2018-03-20 16:57:15 -04:00
Konstantin Zhuravlyov b7915e9248 Bring loader in sync with stg sc.
Change-Id: Ib4d9231ca61048557acdad8eb8f632688c4aadd8
2018-03-12 15:00:50 -04:00
Tony Tye d472b24d05 Add support for R_AMDGPU_RELATIVE64
- Add support for R_AMDGPU_RELATIVE64 relocation record.
- Return status error if any unsupported relocation record encountered.

Change-Id: Icbb5dcb81109a70c1f2195412a0df58a11be9da1
2018-01-30 18:20:26 -05:00
Chris Freehill 651ae1bf70 Device ID/family corrections for gfx9xx
Change-Id: Icb25fbbaeb99ce886a2852b48d02875ee0f197a2
2017-11-16 07:27:54 -05:00
Qingchuan Shi ce6aee01ed Add APIs to support debugging vm fault
1. Add hsa ext api hsa_amd_register_vmfault_handler for debugger to register callback in case of VM fault.
2. Extend hsa_ven_amd_loader API to:
   (1) iterate loaded code objects in executable:
       hsa_ven_amd_loader_executable_iterate_loaded_code_objects
   (2) get loaded code object info:
       hsa_ven_amd_loader_loaded_code_object_get_info
3. Make the id of hsa_queue the same as the one used in communication with thunk (for amd_aql_queue)

Change-Id: I68910809e59e24297350d262606f00e96c14bcbd
2017-10-28 21:48:26 -04:00
Chris Freehill a7cbe78366 Use Major/Minor/Step device numbers to differentiate gfx devices
Change-Id: I0901871971a5b33018917ada6c0e69ac7aa91944
2017-10-13 16:18:24 -04:00
Konstantin Zhuravlyov 9887c26113 Bring loader in sync with stg/sc
Change-Id: Iccce07b8fa03d37c4267a2a9bd343e6614dc43e7
2017-02-10 11:21:15 -05:00
Kent Russell 5162a76616 Revert " Add image 1.1 API changes to current code base"
Currently HSA HW Profiler is failing to build due to this patch

This reverts commit e0ce8855dd.

Change-Id: Iabb2b958f33ba614a24b61bb370905b3b7362708
2017-01-12 06:43:54 -05:00
Christophe Paquot e0ce8855dd Add image 1.1 API changes to current code base
Initial work to import the latest (1.1) hsa_ext_image extension.

Change-Id: I4d55adb09ba4d4dbd43d47a4bc54077d4bc531d2
2017-01-11 17:31:37 -05:00
Konstantin Zhuravlyov 08aded148a Revert "Bring loader in sync with stg/sc"
This reverts commit c798c60343.

Change-Id: If99e8cc9e2afb525f690e49eb6538d8e950a5615
2016-12-14 15:14:36 -05:00
Konstantin Zhuravlyov c798c60343 Bring loader in sync with stg/sc
Change-Id: I684522c442de0872007a7e4da8919067fc7b42b3
2016-12-13 16:30:25 -05:00
Ramesh Errabolu eb2efb83d1 Initial set of changes for ThreadTrace
Change-Id: I07ce31f9b4f508cef0fc9ca6dadcf26b6c90361e
2016-11-21 23:40:56 -06:00
Konstantin Zhuravlyov 4b86843409 Remove load_legacy parameter + change prefix for some loaded code object queries back to AMD
Change-Id: I74e905abd77dab3a7a00b5ced94cd9b5130365c5
2016-11-20 13:46:17 -05:00
Konstantin Zhuravlyov 54245e064c Allocate only one segment for code object v2+
Change-Id: I7cd03b5c205d3ea5735f8f29820867ca90ac081b
2016-11-03 09:51:11 -04:00