Commit graph

575 Commits

Author SHA1 Message Date
Tony fbb4e4c2c3 Make HSA_QUEUE_TYPE_COOPERATIVE a queue type value
- Correct definition of HSA_QUEUE_TYPE_COOPERATIVE to be a queue type
  and not a bit mask.
- Correct implementation of hsa_queue_type_t to treat it as an
  enumeration type and not a bit mask. In particular
  HSA_QUEUE_TYPE_COOPERATIVE is a distinct queue type that uses the
  multi-producer protocol, and is not a bit set value.

Change-Id: I9415be8853671e5511e16e306caf16020e8c84af


[ROCm/ROCR-Runtime commit: bccb25fc33]
2020-05-07 19:24:19 -04:00
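
To illustrate the distinction drawn in the HSA_QUEUE_TYPE_COOPERATIVE commit above, here is a minimal Python sketch that models a queue type as an enumeration and compares whole values instead of testing bits. The numeric values and the helper name `uses_multi_producer_protocol` are assumptions made for this example, not taken from the HSA headers.

```python
from enum import IntEnum


class QueueType(IntEnum):
    # Assumed values for illustration only; check hsa.h for the real ones.
    MULTI = 0
    SINGLE = 1
    COOPERATIVE = 2


def uses_multi_producer_protocol(queue_type: int) -> bool:
    """Hypothetical helper: compare whole values, never individual bits."""
    kind = QueueType(queue_type)  # raises ValueError for an unknown type
    return kind in (QueueType.MULTI, QueueType.COOPERATIVE)
```

Under these assumed values a mask test such as `queue_type & QueueType.MULTI` is always zero, which is one way a bit-mask reading of the type goes wrong.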
Matt Arsenault b5f6574895 Do not use llvm-dis to pick the triple
There are a couple problems with this. First, llvm-dis is an unstable
llvm development tool and 3rd party users should generally not rely on
it. The text format is unstable, and the regex here isn't even
explicitly looking for the target triple field, so it could
accidentally find something else. Second, picking the target to
compile based on the library you are linking is a fundamentally
backwards decision. The target you're compiling for changes the
library you would want to link. The device libraries are only ever
compiled with amdgcn-amd-amdhsa. If we had a second triple, the build
should explicitly target each triple it cares about.

Change-Id: I3bae8398f60f78df61ab2177aa9e83f47ec6dea4


[ROCm/ROCR-Runtime commit: 96d4140609]
2020-05-06 13:28:39 -04:00
Laurent Morichetti ed6147a506 Check all s_endpgm instructions
The ROCR trap handler should check for all end program instructions
and not halt on them. Mask off the imm16 before comparing the
instruction to the s_endpgm opcode.

Change-Id: I669ffc7f5b699d7daf0c8ec5761ed7bb193f07a7


[ROCm/ROCR-Runtime commit: df03a377f5]
2020-05-04 19:52:53 -04:00
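
The masking step described in the s_endpgm commit above can be sketched in Python as follows. The opcode constant assumes the 32-bit SOPP encoding in which the low 16 bits hold the immediate; both that value and the function name `is_s_endpgm` are assumptions for illustration, not code from the trap handler.

```python
# Assumed encoding of s_endpgm with a zero immediate (SOPP format).
S_ENDPGM_OPCODE = 0xBF810000
# The low 16 bits carry the immediate operand (imm16).
IMM16_MASK = 0x0000FFFF


def is_s_endpgm(instruction: int) -> bool:
    """Return True for any s_endpgm, whatever its imm16 field contains."""
    opcode_bits = instruction & ~IMM16_MASK & 0xFFFFFFFF
    return opcode_bits == S_ENDPGM_OPCODE
```

Comparing the raw instruction word directly against the opcode would only match the variant whose immediate is zero.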
Sean Keely 3b0c3e83a3 Update addrlib with latest Mesa source.
Change-Id: Idd8cdaac9ad370397d62f6a32687ca7bc7d7462b


[ROCm/ROCR-Runtime commit: 3da81968cb]
2020-05-01 20:33:09 -04:00
Sean Keely 2269234579 Remove dead code from image_manager_xx.cpp
Image swizzle mode will be set by the preferred surface info
function.

Change-Id: I41e639be53cafbb4db6cf15c159aa2bd457ec5be


[ROCm/ROCR-Runtime commit: 1440da3e15]
2020-05-01 20:32:45 -04:00
Sean Keely 1fc7f2dec7 Move Images code to hsa-runtime folder
Change-Id: I53c1845d985ac3e9708d952865009c0021f3bb4f


[ROCm/ROCR-Runtime commit: 7e3db20826]
2020-04-30 19:35:57 -05:00
Ramesh Errabolu bd5ef0eff8 Update Image code base to use addrlib from mesa
Change-Id: I31355d7fc3db423c16772cf105e9b6b59a3a6307


[ROCm/ROCR-Runtime commit: 1a3ee2fd03]
2020-04-30 19:35:56 -05:00
Laurent Morichetti 3ead90a027 Add debugger support for wave halted at launch
New trap handler ABI: Record in ttmp11[8:7] the event that caused the
trap handler to be entered. We currently record 2 events, trap_raised
if an s_trap instruction was executed, or excp_raised if an exception
(MEM_VIOL or ILLEGAL_INST) was raised.

Change-Id: Ie278c8277437b3b67c2737dcd1a12fe6511df428


[ROCm/ROCR-Runtime commit: 00da82f951]
2020-04-29 19:29:56 -04:00
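
As a rough illustration of the ttmp11[8:7] field mentioned in the trap handler ABI commit above, the Python sketch below reads and writes a two-bit field at bits 8:7 of a 32-bit register value. The event codes `TRAP_RAISED` and `EXCP_RAISED` are hypothetical placeholders; the commit message does not give the actual encodings.

```python
EVENT_SHIFT = 7   # field occupies bits 8:7
EVENT_MASK = 0x3  # two bits wide

# Hypothetical event codes; the real encodings are not stated in the commit.
TRAP_RAISED = 1
EXCP_RAISED = 2


def trap_event(ttmp11: int) -> int:
    """Extract the event field from a 32-bit ttmp11 value."""
    return (ttmp11 >> EVENT_SHIFT) & EVENT_MASK


def with_trap_event(ttmp11: int, event: int) -> int:
    """Return ttmp11 with the event field replaced, other bits untouched."""
    cleared = ttmp11 & ~(EVENT_MASK << EVENT_SHIFT) & 0xFFFFFFFF
    return cleared | ((event & EVENT_MASK) << EVENT_SHIFT)
```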
Matt Arsenault 15e5e6364d Use -nogpulib in another place
Change-Id: I9cc1daa7db7d1f2ff07a0dbfb403dbf41f4bbffb


[ROCm/ROCR-Runtime commit: 2e73d52ac6]
2020-04-28 13:46:01 -04:00
Matt Arsenault 70c54eba7a Use -nogpulib as a quick build fix
Change-Id: I28ca7d53c76f0829719079dfb67b6314f5ff27cc


[ROCm/ROCR-Runtime commit: 0d84b66b1e]
2020-04-28 10:08:37 -04:00
Kent Russell d94a09bc46 CMakeLists: Support static building of hsa-runtime
Remove the hard-coding of "SHARED" as the lib type, and move any
SO-specific linking to only happen if the .so exists in the first place

Change-Id: I3f0bfd5c03f19b2425423b4dc8eed8fd87acc1d6


[ROCm/ROCR-Runtime commit: 33133ebd07]
2020-04-27 20:52:07 -04:00
Sean Keely 7936a4b3bf Adapt to new LLVM location in repo build.
This will reenable incremental PSDB builds.

Change-Id: I2311c124b06b544202f7c1db31b6607f2580194e


[ROCm/ROCR-Runtime commit: 675f73cda9]
2020-04-27 17:58:35 -04:00
Austin Kerbow 3e9e830351 Update IsaRegistry for backend changes
Changes in the compiler are being made to add controls for XNACK and SRAM ECC
for all targets which can support these features. By default the conservatively
correct settings of XNACK on and SRAM ECC on will be used. This change is to
facilitate these backend updates.

Change-Id: I2fd6b6bc1d32937737e7f56d8e08c70fe781c745


[ROCm/ROCR-Runtime commit: 87202d4408]
2020-04-25 04:45:28 -04:00
Sean Keely 9319b029f2 Correct IPC fragment validation.
IPC create must only be used on whole ROCr allocations.
Fragments were allowing handle creation with offsets.

Change-Id: I1faa96d36bc7a6199bdc2e3ff1b8871d1a36a2fa


[ROCm/ROCR-Runtime commit: 7712c7e743]
2020-04-24 00:08:53 -04:00
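
The rule stated in the IPC fragment validation commit above (handles only for whole allocations, never for an offset into one) can be sketched in Python like this. The `Allocation` record and the function `can_create_ipc_handle` are hypothetical names for the example; the actual runtime check may look different.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Allocation:
    base: int  # start address of the allocation
    size: int  # size of the allocation in bytes


def can_create_ipc_handle(alloc: Allocation, ptr: int, length: int) -> bool:
    """Accept only a request that covers the entire allocation.

    A pointer with an offset into the allocation, or a shorter length,
    describes a fragment and is rejected.
    """
    return ptr == alloc.base and length == alloc.size
```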
Sean Keely 71f3dbe6eb Correct capture of PoolInfo::allocable_size_.
Change-Id: I80757bb36048bc15b928220aca0a1eb5d898ab22


[ROCm/ROCR-Runtime commit: b90bf473c1]
2020-04-21 19:03:24 -05:00
Sean Keely 9eb712762e Suppress Finalizer loading attempts.
This has been the default mode for a while now since we don't
distribute or build the finalizer.  Removing the attempt cleans
up debug mode messages that are causing confusion.

Change-Id: I8162c95abd5bbedaa22b90191f7a384a34c388ae


[ROCm/ROCR-Runtime commit: 3fe891d5da]
2020-04-18 00:06:42 -04:00
Sean Keely 358c091a13 Remove references to finalizer header.
Change-Id: I6608c95268ab4bc66053d889cf7d5a30cd8fccab


[ROCm/ROCR-Runtime commit: e25ae1263b]
2020-04-17 23:50:23 -04:00
Sean Keely c354858217 Correct rocrtst numa awareness.
Pool size was being used where alloc_max_size should be.
Changes are necessary on NUMA systems where not all nodes have
installed memory.

Change-Id: If8f507cae50a8dfeae8572d4e39df757abe28599


[ROCm/ROCR-Runtime commit: a9470e3563]
2020-04-17 23:43:38 -04:00
Sean Keely 9989d79543 Don't lock KFD allocated system memory.
Lock API succeeds but the GPU still faults on the address.
This should be fixed in Thunk and/or KFD as well.

Change-Id: I8b2fbcae61ab181e4fe7f0b64e43a5f0772efb24


[ROCm/ROCR-Runtime commit: 9fe44ed675]
2020-04-17 21:45:01 -04:00
Ramesh Errabolu 9ebd4f7163 Extend RocrTst to query UUID of ROCm devices
Change-Id: I6c9fa9751c893ba119e8b7a2808a8ab2aebeba3b


[ROCm/ROCR-Runtime commit: 7434fa14d4]
2020-04-16 21:29:32 -04:00
Ramesh Errabolu bad458e27b Stop building and packaging Tools library
Change-Id: Iee430c24e32ea7412f21564fe8970749e4954b91


[ROCm/ROCR-Runtime commit: 30f46e4e24]
2020-04-15 13:58:33 -04:00
Laurent Morichetti 124a7e0e0c Return a file URI for elf images in shared objects
Iterate the loaded shared objects to see if the given elf image binary
is part of a loaded segment.

Change-Id: I074cacd99eb5b59f883f4ce2bd901e0e35a660b8


[ROCm/ROCR-Runtime commit: 5f783494f1]
2020-04-14 15:22:43 -04:00
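
The lookup described in the file URI commit above can be sketched in Python under some assumptions: the list of loaded segments is supplied by the caller, and the URI takes the form `file://<path>#offset=<n>&size=<m>`. The `LoadedSegment` record, the function name, and that URI layout are assumptions for illustration, not confirmed by the commit message.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class LoadedSegment:
    path: str         # shared object backing the segment
    start: int        # address at which the segment is mapped
    size: int         # mapped size in bytes
    file_offset: int  # offset of the segment within the file


def elf_image_uri(segments: Iterable[LoadedSegment],
                  image_addr: int, image_size: int) -> Optional[str]:
    """Return a file URI if the image lies fully inside one loaded segment."""
    for seg in segments:
        inside = (seg.start <= image_addr
                  and image_addr + image_size <= seg.start + seg.size)
        if inside:
            # Translate the in-memory address back to an offset in the file.
            offset = seg.file_offset + (image_addr - seg.start)
            return f"file://{seg.path}#offset={offset}&size={image_size}"
    return None  # not part of any loaded shared object
```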
Nathan O cb264557f5 Fix hsa_amd_agents_allow_access documentation
- Update the documentation comment in hsa_ext_amd.h, which contained
   contradictory and incorrect information about an argument to the
   hsa_amd_agents_allow_access function.

Change-Id: I60b0dbbdc761078cd81906bc2c63a27d7e6b53e1


[ROCm/ROCR-Runtime commit: 6d5781bb14]
2020-04-10 18:26:13 -04:00
Ramesh Errabolu ccd4e85fc9 Extend Rocr Visible Devices functionality to include UUIDs
Change-Id: Ia2892e4033717556a422fe33dec0294fe2ca9e28


[ROCm/ROCR-Runtime commit: 89f7ef224c]
2020-04-09 00:42:53 -05:00
Ramesh Errabolu e8f4f2d9e2 Extend ROCr to surface UUID of GPU devices that support
Change-Id: I478db68d69a01578770403fa695f9e6391637573


[ROCm/ROCR-Runtime commit: 45958c727d]
2020-04-08 19:19:22 -05:00
Pruthvi Madugundu ae435f8253 Updating the hsa include directory symlink creation
- Symlink creation is corrected only for deb packages
- It is a follow-up to http://git.amd.com:8080/c/hsa/ec/hsa-runtime/+/334403
- configure_file() is called to update the scripts with proper cmake variable values

Signed-off-by: Pruthvi Madugundu <pruthvi.madugundu@amd.com>
Change-Id: I0e833ead265166411e83593fd57265a9ab356904


[ROCm/ROCR-Runtime commit: 241cdfdd01]
2020-04-01 02:17:11 -07:00
Sean Keely f4843faeec Fix Debian package build on CPack 3.10.
CPack now incorrectly adds two copies of directory symlinks when
building Debian packages.  This causes dpkg to see a file conflict
and fail installing.

The correct long term solution is to remove the symlink and use a
flat directory structure.  This patch adds the symlink in the post
install script as a workaround until we can switch to flat layout.

Change-Id: I879b6cbc2661c19df3db639cb42fba0972fddb93


[ROCm/ROCR-Runtime commit: f3b532b42d]
2020-03-20 21:35:32 -05:00
Sean Keely 34d4ed2ac5 Update asserts and comments for pointer info.
Checks for an IPC memory error and updates comments relevant
to rocr_visible_devices.

Change-Id: I9d2f2dd27f3fa04881d17387cce2692bc046edb2


[ROCm/ROCR-Runtime commit: a1c2439213]
2020-02-24 09:08:48 -05:00
Sean Keely 9e62ba8b96 Report HDP registers at all times.
HDP will now be used for coarse grain kernarg so needs to be
reported without consideration of fine grain vram over pcie.

Change-Id: I648167299faa583876a3d8685c3b3c4d8d31ebf9


[ROCm/ROCR-Runtime commit: 9c35780836]
2020-02-24 09:08:17 -05:00
Ramesh Errabolu 38747b8fec Update how code references publicly available ROCr headers
Change-Id: I357c51eb713a23704d4fee71081be46a73a71806


[ROCm/ROCR-Runtime commit: 627991b1c1]
2020-02-21 20:01:11 -05:00
Sean Keely 302c21ac31 Add env key HSA_NO_SCRATCH_THREAD_LIMITER.
Setting to 1 prevents the scratch handler from reducing peak occupancy.
Scratch allocations that would normally reduce peak occupancy will
instead fail.

Diagnostic for TF and PyTorch.

Change-Id: I2d7ea47077eb5cf708251c8aa3fd183ad4261be0


[ROCm/ROCR-Runtime commit: dc165c92bc]
2020-02-21 17:09:26 -05:00
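
The behaviour described in the HSA_NO_SCRATCH_THREAD_LIMITER commit above can be sketched in Python as a small decision function. Only the environment key name and the "set to 1" convention come from the commit message; the function names and the returned labels are hypothetical, and how the runtime actually parses the variable is not shown here.

```python
import os

ENV_KEY = "HSA_NO_SCRATCH_THREAD_LIMITER"


def limiter_disabled(environ=os.environ) -> bool:
    """True when the key is set to "1" (assumed parsing rule)."""
    return environ.get(ENV_KEY) == "1"


def handle_scratch_request(would_reduce_occupancy: bool,
                           environ=os.environ) -> str:
    """Hypothetical outcome labels for a scratch allocation request."""
    if not would_reduce_occupancy:
        return "allocate"
    # With the limiter disabled, a request that would otherwise be
    # satisfied at reduced peak occupancy fails instead.
    return "fail" if limiter_disabled(environ) else "allocate_reduced"
```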
Sean Keely b6b3140ae7 Correct scratch retry logic.
scratch_used_large_ was uninitialized leading to the observed hang.
DynamicScratchHandler would wait for a large scratch release despite no
large scratch having yet been allocated.  Fixes .

The patch also removes a potential race between AddScratchNotifier and
ReleaseQueueScratch.  The race condition does not exist today since both
scratch alloc and release run on the same thread.  The changes will
prevent this potential race from manifesting if the async event handler
is ever updated to use multiple threads.

Also enhances scratch occupancy reduction reporting.  Reporting now
prints the initial request size as well as the allocated size and the
effect on occupancy this has.  Occupancy is computed in terms of the
requesting dispatch grid size so may be >100%.

Change-Id: I0fc5ee01467ff4c29bdd25d545177c97862c3bd9


[ROCm/ROCR-Runtime commit: 6c556002d8]
2020-02-21 17:09:26 -05:00
Sean Keely b21dcb7913 Insert zero sized pool on CPU agents without attached memory.
Ensures that all CPU agents will have a pool handle to allocate
system memory.  These pools will have no numa binding since the
node their owning Agent represents has no installed memory.

Change-Id: I9f72b455d633646839753c6719ff7f6a4c41f7c4


[ROCm/ROCR-Runtime commit: d53fe07687]
2020-02-21 17:05:10 -05:00
Saleel Kudchadker 7c5a08073f Reset link_map map in the constructor
Change-Id: I8a6ad3bc0fca790dec2992cacf9288068b3bcaa3


[ROCm/ROCR-Runtime commit: c57f3da1dc]
2020-02-19 15:29:35 -08:00
Chris Freehill cdda497901 By default, don't collect rsmi monitor values
Change-Id: I4946efadcb9c5ececead1b4c40b73adc5ceca957


[ROCm/ROCR-Runtime commit: f5e86a8f14]
2020-02-14 18:06:19 -05:00
Pruthvi Madugundu 9638a946ae Adding RUNPATH to find libhsakmt.so for Centos and SLES
- This new path is required when libhsaruntime.so is referred
from the top level ROCm lib directory.
- Once the ROCm stack lib/lib64 structure is flattened, RUNPATH in all
the libraries needs to be updated.

Change-Id: I369131ce93e14958ec57a54701671f2bfd8d522a
Signed-off-by: Pruthvi Madugundu <pruthvi.madugundu@amd.com>


[ROCm/ROCR-Runtime commit: e931fd424b]
2020-02-14 17:43:29 -05:00
Sean Keely d7d1f6e2e3 Support stripped binaries and remove unneeded attributes.
Attribute optimize(0) doesn't appear to be helpful.  This
prevents optimization in the function but not at call sites to the
function.  The function may still be inlined since it has no side
effect (in some cases that we currently don't support).

Having a side effect prevents a call site optimization that allows
removal of a noinline function call with no side effect.  Call site
optimization should only happen (in GCC at least) when using whole
program optimization so this may be stronger than we strictly need.

Also added _amdgpu_r_debug to the exported symbol list (global) and
switched to the standard macro for an exported symbol (HSA_API).
Without being in the global list the debugger will not find this
symbol if the binary has been stripped.

Change-Id: Ieb00175ccc55fda4491deee44711cd55b3f24aeb


[ROCm/ROCR-Runtime commit: 3e9aca0f34]
2020-01-21 20:08:02 -05:00
Freddy Paul ac2ab48aa0 ADD libhsakmt.so RUNPATH for SLES
For SLES libhsakmt.so is located in <ROCm install>/lib64

Change-Id: I038dd80b65b4a493ac37981649b02f1b35caea88


[ROCm/ROCR-Runtime commit: 4b2256ac21]
2020-01-16 22:20:14 -05:00
Laurent Morichetti 74cd6e1197 Fix a build error when compiling with clang
Check __clang__ before __GNUC__ as clang defines both.

Change-Id: I9963f8e0665efb4cb08bd3886fb38fee42dd9861


[ROCm/ROCR-Runtime commit: 19e1fb3a4e]
2020-01-15 18:52:53 -08:00
Srinivasan Subramanian b53a8a6377 Avoid shared library conflicts across multiple ROCM versions
Adding patch number based on ROCM build/release to have unique
file name for libraries across multiple versions of ROCM.

Signed-off-by: Pruthvi Madugundu <pruthvi.madugundu@amd.com>

Change-Id: I58d665b0e7d577b5bd7a6000d1202a0242672727


[ROCm/ROCR-Runtime commit: 54d94d02bd]
2020-01-15 01:08:23 -05:00
Qingchuan Shi a9208ef64f fix optimize(0) for clang.
Change-Id: I83bc57d42815f37445ae97bf6950147e3358ac45


[ROCm/ROCR-Runtime commit: d63886190f]
2020-01-13 20:53:40 -05:00
Ramesh Errabolu 8306bf2d94 Cpack rule encoding dependencies was broken
Change-Id: I2b8df7d255cdffd6b42713f0b59df2aeef83a607


[ROCm/ROCR-Runtime commit: be5782e198]
2019-12-23 10:51:55 -06:00
Sean Keely 47c4c7bacf Disable SDMA on gfx10.
Lack of cache controls only allows operating SDMA at
agent scope.  All copy APIs are defined at system scope, so SDMA
copies may result in data errors.

Change-Id: I9cd10007defddcbf8feb14a2e3daa1ba17c0489f


[ROCm/ROCR-Runtime commit: 22a601292d]
2019-12-20 17:25:47 -06:00
Ramesh Errabolu 44fb1be462 Support the building and packaging of legacy ROCr tools by itself
Change-Id: I2247bf7a46ee93495340f7b2603b09dc6b667443


[ROCm/ROCR-Runtime commit: d015d78de3]
2019-12-18 19:20:03 -06:00
Konstantin Zhuravlyov 1c4e072260 Switch to llvm monorepo
Change-Id: Ibfe045afd811d36521486573168aecd06279ccb6


[ROCm/ROCR-Runtime commit: 096e715629]
2019-12-17 22:55:20 -05:00
Chris Freehill ce2b8ab35c Import PAL version of addrlib with initial gfx1xxx support
Change-Id: I439930a5cbf5b13a359ec164e75c6828af8c668d


[ROCm/ROCR-Runtime commit: 4c22962024]
2019-12-12 21:38:01 -06:00
Sean Keely 3bcde37b58 Initial GWS queue support.
Queues should transition to ref counting for all queues eventually.
That cleanup will be part of shared queue pooling support.

Change-Id: I217ff5d573156678b9559da6fb81baa8cd31c617


[ROCm/ROCR-Runtime commit: 0a43a107b1]
2019-12-09 21:21:17 -05:00
Ramesh Errabolu a161850fcf Add package dependencies for Images and Tools
Change-Id: Id77ba0e81d3b3e872153cdd7680338dd70319026


[ROCm/ROCR-Runtime commit: 144017e148]
2019-12-06 16:38:21 -06:00
Sean Keely 511b86a55e Allow disabling scratch reclaim.
Debug and RCCL NPI assist feature.

Change-Id: I2cb76f0a086fa341465df3ede26965ab713bc3b4


[ROCm/ROCR-Runtime commit: d2a50a0048]
2019-11-20 02:41:58 -05:00
Sean Keely c6592f7757 Raise large scratch allocation limit for RCCL.
Temporary workaround for 2.10 release.  RCCL, compiler, or firmware
must be corrected and this code reverted before another ASIC release.

Change-Id: I27851353289b93df9acb72d28b8c6ccb9f7f7d7a


[ROCm/ROCR-Runtime commit: 35c1ffa863]
2019-11-20 02:41:27 -05:00