Commit Graph

67 Commits

Author SHA1 Message Date
pghoshamd 637b0d71f0 SWDEV-569319 Replace ScopedAcquire with stdcpp wrappers (#2146)
* SWDEV-569319 Replace ScopedAcquire with stdcpp wrappers

* Replace KernelMutex and KernelSharedMutex abstractions with std::mutex and std::shared_mutex

* Replaced unique_locks with lock_guards

* More changes

* Replace new and deletes with smart pointers

* Replaced some more with shared ptrs

* Replacements with smart pointers - pt 2

* missed change
2026-01-06 10:59:34 -05:00
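The kind of replacement described in 637b0d71f0 can be sketched as below. This is a minimal illustration under assumptions, not code from ROCr: `SymbolTable` and its members are hypothetical names, and only the standard-library types (`std::shared_mutex`, `std::lock_guard`, `std::shared_lock`, `std::unique_ptr`) come from the commit message or the C++ standard.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <mutex>
#include <shared_mutex>
#include <string>

// Hypothetical class for illustration only; it is not taken from ROCr.
class SymbolTable {
 public:
  // Writers take the mutex exclusively. std::lock_guard releases it when
  // the scope ends, the role a hand-written scoped-acquire helper would
  // otherwise play.
  void Insert(const std::string& name, int value) {
    std::lock_guard<std::shared_mutex> guard(lock_);
    entries_[name] = std::make_unique<int>(value);  // no manual new/delete
  }

  // Readers take the mutex in shared mode, so concurrent lookups do not
  // block each other.
  bool Lookup(const std::string& name, int* out) const {
    assert(out != nullptr);
    std::shared_lock<std::shared_mutex> guard(lock_);
    auto it = entries_.find(name);
    if (it == entries_.end()) return false;
    *out = *it->second;
    return true;
  }

 private:
  mutable std::shared_mutex lock_;
  std::map<std::string, std::unique_ptr<int>> entries_;
};
```

`std::lock_guard` fits where the lock is held for the whole scope; `std::unique_lock` is only needed when the lock must be released early or handed to a condition variable.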
Mario Limonciello bc5d48e76c Run pre-commit's whitespace related hooks on projects/rocr-runtime (#2130)
* Run pre-commit's whitespace related hooks on projects/rocr-runtime

In order for pre-commit to be useful, everything needs to meet a common
baseline.

Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>

* Add missing semicolon which would block compilation on big endian CPUs

Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>

---------

Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
2025-12-08 07:56:50 -06:00
systems-assistant[bot] bebe65f104 rocr: fix nullptr dereference (#262)
* rocr: fix nullptr dereference

Return early in the case that malloc fails to avoid dereferencing of a
null pointer on eventDescrp.

Signed-off-by: Sunday Clement <Sunday.Clement@amd.com>

* rocr: Fix potential nullptr dereference

Return early if sym->section() fails to properly acquire the object.

Signed-off-by: Sunday Clement <Sunday.Clement@amd.com>

---------

Signed-off-by: Sunday Clement <Sunday.Clement@amd.com>
Co-authored-by: Sunday Clement <Sunday.Clement@amd.com>
2025-10-21 13:49:01 -04:00
German Andryeyev 913743d433 Add windows build support into ROCr (#912)
Make sure ROCR can be compiled under Windows. Extra setup for the Windows build environment is required. The change should not cause any functional changes under Linux.
2025-09-19 10:10:17 -04:00
David Yat Sin 1474a6c774 rocr: Remove gfx940 and gfx941 support
[ROCm/ROCR-Runtime commit: 13c591d250]
2025-02-19 12:16:24 -05:00
Shane Xiao 6e0b1642b3 rocr: Fix missed read lock in ExecutableImpl::FindHostAddress
Change-Id: Ide9b5cc3aa235d3768ebbfd8dc1560bf70fd0743
Signed-off-by: Shane Xiao <shane.xiao@amd.com>
Reviewed-by: Qiang Yu <qiang.yu@amd.com>


[ROCm/ROCR-Runtime commit: 2d40493c31]
2024-12-30 06:43:25 -05:00
David Yat Sin 91a28fce54 rocr: Move _loader_debug_state to rocr namespace
This avoids exposing the symbol to the default namespace

Change-Id: I2fe5fbab4b59f271effacab93eeb2d95c236ae02


[ROCm/ROCR-Runtime commit: 147abb6ca0]
2024-11-29 10:44:23 -05:00
Chris Freehill b617b05c2a rocr: Ensure globals are initialized at first use
When ROCr is built as a static library, global variables
were often not initialized to valid values at their first
use. This change addresses that problem.

Change-Id: I550fa41feb3bc04b9cc686bcfb4acf2a7b651a88


[ROCm/ROCR-Runtime commit: 9b13bcd0ac]
2024-10-16 23:19:48 -04:00
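The commit message for b617b05c2a does not say which technique the change uses. One common way to address this class of problem is the construct-on-first-use idiom, sketched below. `Registry`, `GetRegistry` and `AutoRegister` are hypothetical names chosen for the example.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical registry type; the name and contents are illustrative.
struct Registry {
  std::vector<std::string> names;
};

// Construct-on-first-use: the function-local static is initialized the
// first time control reaches it (thread-safe since C++11), so code in
// other translation units never observes it uninitialized.
Registry& GetRegistry() {
  static Registry instance;
  return instance;
}

// An object whose constructor runs during static initialization can
// safely reach the registry through the accessor.
struct AutoRegister {
  explicit AutoRegister(const std::string& name) {
    assert(!name.empty());
    GetRegistry().names.push_back(name);
  }
};

static AutoRegister register_loader("loader");
```

A plain namespace-scope `Registry` object would instead depend on the unspecified initialization order across translation units, which becomes more visible when everything is linked into one static image.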
James Xu 94ee49bdbd rocr: Add nullptr check in IterateExecutables
When an entry is deleted from the array, it's set to nullptr
but not removed. Most other functions that
iterate over the array check if the entry is nullptr
but this loop in IterateExecutables did not.

Change-Id: I763b361eea59f6df201bb86ead0234e95f2cf79c


[ROCm/ROCR-Runtime commit: f3664fd124]
2024-09-19 19:44:53 +00:00
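The situation described in 94ee49bdbd can be sketched as follows: deleted entries are cleared to nullptr but stay in the array, so every loop over it has to skip them. The `Executable` struct and both function names are hypothetical stand-ins, not the ROCr types.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <vector>

struct Executable {  // hypothetical stand-in for the real object
  int id;
};

// Deleting an entry clears its slot but does not erase it, mirroring the
// behaviour the commit message describes.
void Remove(std::vector<std::unique_ptr<Executable>>& table,
            std::size_t index) {
  assert(index < table.size());
  table[index].reset();
}

// Visits live entries only and returns how many were visited.
int IterateLive(const std::vector<std::unique_ptr<Executable>>& table,
                const std::function<void(const Executable&)>& visit) {
  int visited = 0;
  for (const auto& slot : table) {
    if (slot == nullptr) continue;  // skip slots left behind by Remove
    visit(*slot);
    ++visited;
  }
  return visited;
}
```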
Konstantin Zhuravlyov 45eafcf4ea loader: allow but skip static relocations for code object v2+
Change-Id: I4ae14cb5e740d7d45810b75038b15a0b94d2bf0b


[ROCm/ROCR-Runtime commit: 08c94463de]
2024-04-09 11:39:18 -04:00
Konstantin Zhuravlyov ae24ca1528 Switch to per-executable contexts in the loader
- Per-executable contexts should be used from now on
  - Global contexts are left as is for now for backwards
    compatibility and will be phased out in follow up
    patches.

Change-Id: I6291abf865c7ed24ee71f5065e539afc23f5ce64


[ROCm/ROCR-Runtime commit: b983c19729]
2024-04-09 10:31:51 -04:00
Konstantin Zhuravlyov 2a7fb7a808 Add R_AMDGPU_ABS32 support
Change-Id: I0ee0302d919ede44765adf02eab15015573efef2


[ROCm/ROCR-Runtime commit: 9e8f185397]
2024-03-26 18:47:29 -04:00
Konstantin Zhuravlyov 853ccdecbb Add dynamic relocation types (NFC)
Change-Id: I1b443003077ba241f34444da293e362266c2ae92


[ROCm/ROCR-Runtime commit: c5e74b7d0a]
2024-03-26 18:47:05 -04:00
Konstantin Zhuravlyov ec66509986 Rename existing relocation types to legacy/v1 (NFC)
Change-Id: Ided7f656c34131b8067a19c0d3b2955fc8823628


[ROCm/ROCR-Runtime commit: b2c32ad6cb]
2024-03-26 18:46:50 -04:00
pvanhout 8e43aaab04 [libamdhsacode] Support COV6/Generic Targets
Change-Id: I4680577eb56dc436fbc134b169f172dd476bff37


[ROCm/ROCR-Runtime commit: a93c18dc90]
2024-03-12 07:37:32 -04:00
Lancelot SIX 7f763d499a trap_handler: Set status.skip_export when halting a wave
When inspecting waves on architectures where SPI may not initialize TTMP
registers, the debugger cannot reliably know if the trap handler was
entered and if it saved valuable information in TTMP registers.

This patch uses the status.skip_export bit (unused by the compute
shaders) to indicate that it got executed before halting a wave.
This is done except for gfx940, where ttmp11[31] can be used (as long as
TTMP registers are always initialized by SPI for this architecture).  It
could be possible to be more selective, as architectures that always
initialize TTMP registers do not require this step, but always doing
it makes maintenance simpler.

Change-Id: I5c4148c78062f7ffa049ac7856c2edc82dbc77d1


[ROCm/ROCR-Runtime commit: 5d3f6a63f1]
2024-03-05 09:28:33 -05:00
Joseph Huber 3f872d8c97 Add executable symbol info for the wavefront size
The wavefront size is currently only exposed as an agent level
attribute. This is not correct, because while the agent has a default
wavefront size that is usually correct, it can easily be overridden via
options like -mwavefrontsize64 on various ISAs. The wavefrontsize
attribute is actually more of a calling convention that is consistent
within a call graph. Because the root of each call graph is a kernel in
this architecture, we need to be able to query this on a per-kernel
basis. This information is already available in the kernel descriptor
packet, but it wasn't exported.

This patch adds HSA_CODE_SYMBOL_INFO_KERNEL_WAVEFRONT_SIZE as a new
option to query on the executable symbol.

Change-Id: I744815c89cc9d4c82f25479bdd48ae1f32e859ff


[ROCm/ROCR-Runtime commit: 9e26cbac14]
2024-02-09 15:55:30 +00:00
Lancelot SIX 9317d0fbc0 Revert "trap_handler: Set status.skip_export when halting a wave"
This reverts commit 4c8a849772.  This
change is required for the runtime to generate reliable core dump files,
but this feature has been disabled for now by
816b46868a.  Until it is needed, revert
the ABI change in the trap handler to maintain compatibility with older
debuggers.

Change-Id: I77a1562dc7962befe2bf88442df858e2d2b1c5ab


[ROCm/ROCR-Runtime commit: 6f828d8609]
2024-01-16 15:55:59 +00:00
Lancelot SIX 4c8a849772 trap_handler: Set status.skip_export when halting a wave
When inspecting waves on architectures where SPI may not initialize TTMP
registers, the debugger cannot reliably know if the trap handler was
entered and if it saved valuable information in TTMP registers.

This patch uses the status.skip_export bit (unused by the compute
shaders) to indicate that it got executed before halting a wave.
This is done except for gfx940, where ttmp11[31] can be used (as long as
TTMP registers are always initialized by SPI for this architecture).  It
could be possible to be more selective, as architectures that always
initialize TTMP registers do not require this step, but always doing
it makes maintenance simpler.

Change-Id: I314db6b37772f7daa8bd405e6662a86658d3f5e0


[ROCm/ROCR-Runtime commit: c5db063b2f]
2023-12-06 21:20:03 -05:00
Lancelot SIX 09589e5929 Park waves for gfx11 and bump abi version to 9
On gfx11, with a sequence such as

  s_trap 2
  s_sendmsg sendmsg(MSG_DEALLOC_VGPRS)
  s_endpgm

the s_sendmsg does deallocate registers while the wave is supposed to be
stopped.  As a result, the wave cannot perform the expected context save
operations.

To avoid this problem, park the wave in the trap handler for gfx11.

Note that gfx11 has implemented an instruction cache prefetch.  When
parked, the prefetch tries to access memory past the end of trap handler
which causes memory violation exceptions to be reported.  To avoid this,
we need to add padding at the end of the trap handler.  The padding
consists of `s_code_end` instructions.  Given that the trap handler is
loaded at a 0x1000 aligned address the maximum prefetch amount (in
bytes) is given by `256 - (trap_handler_size % 64)`.

Change-Id: I5446da54a965a64f21cb0fd3ce3caa4b6137a933


[ROCm/ROCR-Runtime commit: 2f2ba050f6]
2023-07-15 09:44:50 -04:00
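The formula quoted in 09589e5929 can be worked through in a few lines. This is a sketch only: the function names are hypothetical, and `PaddedSize` assumes the padding appended equals the maximum prefetch amount, which the commit message implies but does not state outright.

```cpp
#include <cassert>
#include <cstddef>

// Formula from the commit message: maximum prefetch amount in bytes for
// a trap handler of the given size.
constexpr std::size_t MaxPrefetchBytes(std::size_t trap_handler_size) {
  return 256 - (trap_handler_size % 64);
}

// Handler size after appending that much padding (assumption: the padding
// is exactly the maximum prefetch amount).
inline std::size_t PaddedSize(std::size_t trap_handler_size) {
  const std::size_t padded =
      trap_handler_size + MaxPrefetchBytes(trap_handler_size);
  assert(padded % 64 == 0);  // the formula always lands on a 64-byte boundary
  return padded;
}
```

For a 1000-byte handler, `1000 % 64` is 40, so the maximum prefetch amount is 216 bytes and the padded size is 1216 bytes, which is 19 times 64.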
Laurent Morichetti 3603303bc7 Update the trap handler for gfx940
gfx940 uses ttmp11 to hold the queue packet index so the first level
trap handler uses ttmp13 instead to save ib_sts.

Repurpose ttmp11[31] to mean that the ttmps are initialized. The issue
was that the debugger could not tell whether ttmp6 was written by the
trap handler when determining the stop reason.

If ttmp11[31]=0, then the trap handler has not been executed and ttmp6
should be assumed to be 0.  If ttmp11[31]=1, then ttmp6 holds the
trap_id, if an s_trap instruction caused the exception.

Signed-off-by: Laurent Morichetti <laurent.morichetti@amd.com>
Signed-off-by: Lancelot Six <lancelot.six@amd.com>

Change-Id: I9af903abae044b9ec530306229caf3b883f3ee46


[ROCm/ROCR-Runtime commit: f31b312611]
2023-04-27 16:15:14 -04:00
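The debugger-side rule described in 3603303bc7 can be sketched as below. The function and constant names are hypothetical; only the meaning of ttmp11[31] and the treatment of ttmp6 come from the commit message.

```cpp
#include <cassert>
#include <cstdint>

// ttmp11[31]: set once the trap handler has initialized the ttmps.
constexpr std::uint32_t kTtmpsInitializedBit = 1u << 31;

inline bool TtmpsInitialized(std::uint32_t ttmp11) {
  return (ttmp11 & kTtmpsInitializedBit) != 0;
}

// If the trap handler has not run, whatever is in ttmp6 is not meaningful
// and is treated as 0; otherwise the stored value is used as is.
inline std::uint32_t EffectiveTtmp6(std::uint32_t ttmp11,
                                    std::uint32_t ttmp6) {
  return TtmpsInitialized(ttmp11) ? ttmp6 : 0u;
}
```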
Konstantin Zhuravlyov 91448848c6 Add support for the following kernel symbol query:
- HSA_EXECUTABLE_SYMBOL_INFO_KERNEL_DYNAMIC_CALLSTACK

Change-Id: Idff5c1a2ce2a3e2d65bcc9cf1f66a68d37cd41ef


[ROCm/ROCR-Runtime commit: d962fc39bb]
2022-07-29 15:15:24 -04:00
Konstantin Zhuravlyov 625b1c99b3 Add code object v5 support
Change-Id: I03522765056e99ed49e6c5e213ee3753852de27b


[ROCm/ROCR-Runtime commit: 9265409f08]
2022-04-12 08:53:27 -04:00
Jay Cornwall b65eb065c3 Report union of wave errors as a bitmask in trap handler
Also fix incorrect PC increment on host trap.

Change-Id: Ic8bbf2b90f9f879ba62b558b909d010a8939a663


[ROCm/ROCR-Runtime commit: f3d942b67f]
2021-07-16 18:03:26 -05:00
Jay Cornwall 06cc198b57 Add new trap handler, bump debug API version
Also fix hsaKmtRuntimeEnable error handling. Continue if ioctl fails.

Change-Id: I754ccba5910ccfef6f1ada1415593ef89ce33aba


[ROCm/ROCR-Runtime commit: 7e4088309d]
2021-07-16 18:03:26 -05:00
Laurent Morichetti 023947a6de New trap handler ABI (v5)
Park the wave, if it is stopped, to avoid halting it at an s_endpgm
instruction if the architecture does not support it.

Free ttmp6 by converting the dispatch_ptr into a queue packet index
(25-bit) and storing it in ttmp7[24:0].

Save the exception PC in ttmp11[22:7] ttmp6[31:0].

Change-Id: Iaa3c5baf5b488c0b534044d338f12bffa63ddce2


[ROCm/ROCR-Runtime commit: ea6ee0aa81]
2021-03-04 21:44:14 -05:00
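The register layout in 023947a6de can be illustrated with plain bit arithmetic. The sketch assumes that "ttmp11[22:7] ttmp6[31:0]" means the upper 16 bits of a 48-bit PC go to ttmp11[22:7] and the lower 32 bits to ttmp6; the struct and function names are hypothetical, and the real trap handler is written in GPU assembly, not C++.

```cpp
#include <cassert>
#include <cstdint>

struct SavedTtmps {  // hypothetical host-side view of three ttmp registers
  std::uint32_t ttmp6;
  std::uint32_t ttmp7;
  std::uint32_t ttmp11;
};

constexpr std::uint32_t kPcHiShift = 7;
constexpr std::uint32_t kPcHiMask = 0xFFFFu << kPcHiShift;  // ttmp11[22:7]
constexpr std::uint32_t kPacketIndexMask = (1u << 25) - 1;  // ttmp7[24:0]

// Splits a PC (assumed to fit in 48 bits) across ttmp11[22:7] and ttmp6,
// leaving the other ttmp11 bits untouched.
inline void SavePc(SavedTtmps& t, std::uint64_t pc) {
  assert((pc >> 48) == 0);
  t.ttmp6 = static_cast<std::uint32_t>(pc);
  const std::uint32_t hi = static_cast<std::uint32_t>(pc >> 32) & 0xFFFFu;
  t.ttmp11 = (t.ttmp11 & ~kPcHiMask) | (hi << kPcHiShift);
}

inline std::uint64_t LoadPc(const SavedTtmps& t) {
  const std::uint64_t hi = (t.ttmp11 & kPcHiMask) >> kPcHiShift;
  return (hi << 32) | t.ttmp6;
}

// Stores a 25-bit queue packet index in ttmp7[24:0].
inline void SavePacketIndex(SavedTtmps& t, std::uint32_t index) {
  assert(index <= kPacketIndexMask);
  t.ttmp7 = (t.ttmp7 & ~kPacketIndexMask) | index;
}

inline std::uint32_t LoadPacketIndex(const SavedTtmps& t) {
  return t.ttmp7 & kPacketIndexMask;
}
```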
Laurent Morichetti b3dc12024b New trap handler ABI (v4)
Replace the stop reasons ttmp11.trap_raised and ttmp11.excp_raised
with ttmp11.wave_stopped which indicates that the trap handler has
halted the wave as the result of an event (trap, single-step or
exception).

If the wave is stopped because of a trap, also record the trap_id in
ttmp11.saved_trap_id[7:0].

Save status.halt in ttmp11.saved_status_halt, so that it can be
restored when resuming a wave (changing a wave's state from stopped to
running or single-stepping).

Change-Id: I7322f59b60e8cc1b92bf5f067dba606a3109ef49


[ROCm/ROCR-Runtime commit: 9ca79d072a]
2021-02-05 09:56:01 -08:00
Laurent Morichetti 062d313530 Don't terminate waves halted at s_endpgm
To support single stepping the instruction preceding an s_endpgm,
unwind the PC by 8 bytes and set ttmp11[9] to notify the debugger
that the wave is halted with a modified PC.

Bump the debug r_version for this new trap handler ABI.

Change-Id: I55e4e0d65576f92da14a336266c31c513baab547


[ROCm/ROCR-Runtime commit: 8aec53969f]
2021-01-21 20:51:38 -08:00
Tony e1734526fc Update code object V3 kernarg queries
Code object V2 had the ability to support the following queries:

- HSA_CODE_SYMBOL_INFO_KERNEL_KERNARG_SEGMENT_SIZE
- HSA_EXECUTABLE_SYMBOL_INFO_KERNEL_KERNARG_SEGMENT_SIZE
- HSA_CODE_SYMBOL_INFO_KERNEL_KERNARG_SEGMENT_ALIGNMENT
- HSA_EXECUTABLE_SYMBOL_INFO_KERNEL_KERNARG_SEGMENT_ALIGNMENT

However code object V3 onwards cannot support these as the kernel
descriptor changed. These queries need to be deprecated.

Until then return more reasonable values:

- For kernarg alignment return 16 which is the minimum alignment
  required by the HSA standard.

- For kernarg size return the field from the kernel descriptor which
  is a hint. If it is 0 then the compiler is not specifying the kernarg
  size, or the kernel has no kernarg.

Change-Id: I19ce6cd0f3658a2bf62277492f39100ea5ab4256


[ROCm/ROCR-Runtime commit: ef755e4c82]
2020-11-20 21:39:18 -05:00
Konstantin Zhuravlyov ba667661c5 Implement Target ID Proposal
Changes from Konstantin Zhuravlyov, Tony Tye

Change-Id: I532801193afa9d5b8ac2a877b5497eab661f0597


[ROCm/ROCR-Runtime commit: 3a08d0964e]
2020-11-10 13:42:35 -05:00
Sean Keely 700dca7dd4 Correct memory release function.
l_name is populated by strdup which requires using free rather
than delete.

Change-Id: I9d9bdcfaa3ef095502270f332b95a0ee5c0bbcfc


[ROCm/ROCR-Runtime commit: 9c20f0e649]
2020-08-26 18:22:59 -05:00
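The rule behind 700dca7dd4 is that memory from `strdup` comes from the malloc family and must be released with `free`. One way to make the pairing hard to get wrong is a smart pointer with a custom deleter, sketched below. `FreeDeleter`, `CString` and `DuplicateName` are hypothetical names, and `strdup` is a POSIX function, so the sketch assumes a POSIX platform.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>
#include <memory>

// Deleter that pairs malloc-family allocations with std::free.
struct FreeDeleter {
  void operator()(char* p) const noexcept { std::free(p); }
};

using CString = std::unique_ptr<char, FreeDeleter>;

// strdup allocates with malloc, so the copy is owned by a pointer that
// calls free, never delete.
inline CString DuplicateName(const char* name) {
  assert(name != nullptr);
  return CString(strdup(name));
}
```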
Ramesh Errabolu b84e4987da Add rocr namespace to core header and impl files
Change-Id: I1e1b33f9bba1078d049bc19797889988c3e43360


[ROCm/ROCR-Runtime commit: fa13208698]
2020-06-19 22:34:21 -04:00
Sean Keely abd712f33f Update copyright date.
Change-Id: If4bf4c20cf051878bfe759080bb7345d884dd53d


[ROCm/ROCR-Runtime commit: ce19721c88]
2020-06-19 22:34:01 -04:00
Konstantin Zhuravlyov b1f050524b Add support for code object URI to ROCr
Adds the following:
  - New factory method to create a code object reader from
    file with offset and size.
  - A pair of queries on a loaded code object to get the URI name/length.
  - A bump to the AMD vendor loader extension API and its associated table.

Change-Id: I17c83e9c2447d29a43c438459395365f786a3611


[ROCm/ROCR-Runtime commit: 9eb735ec24]
2020-06-01 11:07:50 -04:00
Laurent Morichetti 3ead90a027 Add debugger support for wave halted at launch
New trap handler ABI: Record in ttmp11[8:7] the event that caused the
trap handler to be entered. We currently record 2 events, trap_raised
if an s_trap instruction was executed, or excp_raised if an exception
(MEM_VIOL or ILLEGAL_INST) was raised.

Change-Id: Ie278c8277437b3b67c2737dcd1a12fe6511df428


[ROCm/ROCR-Runtime commit: 00da82f951]
2020-04-29 19:29:56 -04:00
Laurent Morichetti 124a7e0e0c Return a file URI for elf images in shared objects
Iterate the loaded shared objects to see if the given elf image binary
is part of a loaded segment.

Change-Id: I074cacd99eb5b59f883f4ce2bd901e0e35a660b8


[ROCm/ROCR-Runtime commit: 5f783494f1]
2020-04-14 15:22:43 -04:00
Ramesh Errabolu 38747b8fec Update how code references publicly available ROCr headers
Change-Id: I357c51eb713a23704d4fee71081be46a73a71806


[ROCm/ROCR-Runtime commit: 627991b1c1]
2020-02-21 20:01:11 -05:00
Saleel Kudchadker 7c5a08073f Reset link_map map in the constructor
Change-Id: I8a6ad3bc0fca790dec2992cacf9288068b3bcaa3


[ROCm/ROCR-Runtime commit: c57f3da1dc]
2020-02-19 15:29:35 -08:00
Sean Keely d7d1f6e2e3 Support stripped binaries and remove unneeded attributes.
Attribute optimize(0) doesn't appear to be helpful.  This
prevents optimization in the function but not at call sites to the
function.  The function may still be inlined since it has no side
effect (in some cases that we currently don't support).

Having a side effect prevents a call site optimization that allows
removal of a noinline function call with no side effect.  Call site
optimization should only happen (in GCC at least) when using whole
program optimization so this may be stronger than we strictly need.

Also added _amdgpu_r_debug to the exported symbol list (global) and
switched to the standard macro for an exported symbol (HSA_API).
Without being in the global list the debugger will not find this
symbol if the binary has been stripped.

Change-Id: Ieb00175ccc55fda4491deee44711cd55b3f24aeb


[ROCm/ROCR-Runtime commit: 3e9aca0f34]
2020-01-21 20:08:02 -05:00
Laurent Morichetti 74cd6e1197 Fix a build error when compiling with clang
Check __clang__ before __GNUC__ as clang defines both.

Change-Id: I9963f8e0665efb4cb08bd3886fb38fee42dd9861


[ROCm/ROCR-Runtime commit: 19e1fb3a4e]
2020-01-15 18:52:53 -08:00
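The ordering issue fixed in 74cd6e1197 can be shown with a small detection chain. Clang defines `__GNUC__` as well as `__clang__`, so the clang branch has to come first or it is never reached. The `DETECTED_COMPILER` macro and `IsCompiler` function are hypothetical names for this sketch.

```cpp
#include <cassert>
#include <cstring>

// Test __clang__ first: clang also defines __GNUC__, so a chain that
// starts with __GNUC__ would classify clang as GCC.
#if defined(__clang__)
#define DETECTED_COMPILER "clang"
#elif defined(__GNUC__)
#define DETECTED_COMPILER "gcc"
#elif defined(_MSC_VER)
#define DETECTED_COMPILER "msvc"
#else
#define DETECTED_COMPILER "unknown"
#endif

// Returns true if the compiler selected above matches the given name.
inline bool IsCompiler(const char* expected) {
  assert(expected != nullptr);
  return std::strcmp(DETECTED_COMPILER, expected) == 0;
}
```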
Qingchuan Shi a9208ef64f fix optimize(0) for clang.
Change-Id: I83bc57d42815f37445ae97bf6950147e3358ac45


[ROCm/ROCR-Runtime commit: d63886190f]
2020-01-13 20:53:40 -05:00
Qingchuan Shi 2ab9ce6d5c Adding code object list in loader.
Change-Id: Iab3541287bd56276fd32615ee59fcd590de84ca0


[ROCm/ROCR-Runtime commit: 16a20cfb8c]
2019-10-30 20:31:51 -04:00
Konstantin Zhuravlyov 2b9e13a56c Loader: add basic logging abilities
- Enabled with env var LOADER_ENABLE_LOGGING=1

Change-Id: Ibdbb1b55ffddb7dc9c63e52fc9db3013409376a4


[ROCm/ROCR-Runtime commit: 2275c74695]
2019-08-21 13:29:15 -04:00
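An environment-variable switch like the one in 2b9e13a56c is typically read with `std::getenv`. The sketch below assumes that only the exact value `1` enables logging, as in the commit message; how the loader treats other values is not stated there. Both function names are hypothetical.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>

// Pure helper so the decision can be checked without touching the
// process environment. Assumption: only the exact string "1" enables.
inline bool IsEnabledValue(const char* value) {
  return value != nullptr && std::strcmp(value, "1") == 0;
}

// Reads the variable named in the commit message.
inline bool LoaderLoggingEnabled() {
  return IsEnabledValue(std::getenv("LOADER_ENABLE_LOGGING"));
}
```

Under this assumption, running a program as `LOADER_ENABLE_LOGGING=1 ./app` would make `LoaderLoggingEnabled()` return true, and leaving the variable unset would make it return false.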
Sean Keely 49e70a3ef5 PR from github user DiamondLovesYou.
Allow user specified profiles if the HSAIL note is not found.

Konstantin reviewed and approved.  HSAIL note is not generated by LLVM.

Change-Id: I40fbfbaedd6787b6a716507918f698d02007afe1


[ROCm/ROCR-Runtime commit: 465a8eb40b]
2019-07-16 13:55:38 -05:00
Konstantin Zhuravlyov dde11e307d Process symbols with 0 address
Change-Id: I9ed943a8ccd3b103edd6aba8264c009d8cda29fa


[ROCm/ROCR-Runtime commit: 7001134757]
2019-03-30 02:14:43 -04:00
Konstantin Zhuravlyov a506e18fd2 Loader: update symbol processing for v2+
- Skip symbols that are STB_LOCAL and not STT_AMDGPU_HSA_KERNEL

Change-Id: I68567f58de9bf3f07dbd8020ef63f47667c86367


[ROCm/ROCR-Runtime commit: 8bee6e4976]
2019-01-18 15:42:28 -05:00
Konstantin Zhuravlyov 564ac4b348 Loader updates for code object v3
- Fix loading in some cases
- Fix symbol kind

Change-Id: I721b4a35972b6d2a6d0ac733ab770b096cc74e17


[ROCm/ROCR-Runtime commit: c1ad82a6b7]
2019-01-18 15:41:01 -05:00
Konstantin Zhuravlyov fde14b8588 Fix dynamic relocations:
- Process dynamic relocation even if there is
    no symbol associated to it.

Change-Id: Iaefee682ee52f5acda8280e5764e6d5fd992774a


[ROCm/ROCR-Runtime commit: a447d79430]
2018-11-14 15:25:41 -05:00
Konstantin Zhuravlyov dd2ab28ddb Loader: Add support for v3 object code.
Change-Id: I7215bd0c1277c2036bf0fadf5b23cb57fdf7f665


[ROCm/ROCR-Runtime commit: 386874da55]
2018-10-06 14:01:59 -04:00
Scott Linder 42d4d4ebcf Apply dynamic relocations for STT_FUNC symbols
Required to support function calls through GOT table.

Change-Id: I174a0269fdd67369d38fe41855b7bd01f350b839


[ROCm/ROCR-Runtime commit: 47f0e6f7d3]
2018-09-23 21:42:32 -04:00