Commit Graph

552 Commits

Author SHA1 Message Date
Ramesh Errabolu ccd4e85fc9 Extend Rocr Visible Devices functionality to include UUIDs
Change-Id: Ia2892e4033717556a422fe33dec0294fe2ca9e28


[ROCm/ROCR-Runtime commit: 89f7ef224c]
2020-04-09 00:42:53 -05:00
Ramesh Errabolu e8f4f2d9e2 Extend ROCr to surface UUID of GPU devices that support
Change-Id: I478db68d69a01578770403fa695f9e6391637573


[ROCm/ROCR-Runtime commit: 45958c727d]
2020-04-08 19:19:22 -05:00
Pruthvi Madugundu ae435f8253 Updating the hsa include directory symlink creation
- Symlink creation is corrected only for deb packages
- It is a follow-up to http://git.amd.com:8080/c/hsa/ec/hsa-runtime/+/334403
- configure_file() is called to update the scripts with proper cmake variable values

Signed-off-by: Pruthvi Madugundu <pruthvi.madugundu@amd.com>
Change-Id: I0e833ead265166411e83593fd57265a9ab356904


[ROCm/ROCR-Runtime commit: 241cdfdd01]
2020-04-01 02:17:11 -07:00
Sean Keely f4843faeec Fix Debian package build on CPack 3.10.
CPack now incorrectly adds two copies of directory symlinks when
building Debian packages.  This causes dpkg to see a file conflict
and fail installing.

The correct long term solution is to remove the symlink and use a
flat directory structure.  This patch adds the symlink in the post
install script as a workaround until we can switch to flat layout.

Change-Id: I879b6cbc2661c19df3db639cb42fba0972fddb93


[ROCm/ROCR-Runtime commit: f3b532b42d]
2020-03-20 21:35:32 -05:00
Sean Keely 34d4ed2ac5 Update asserts and comments for pointer info.
Checks for an IPC memory error and updates comments relevant
to rocr_visible_devices.

Change-Id: I9d2f2dd27f3fa04881d17387cce2692bc046edb2


[ROCm/ROCR-Runtime commit: a1c2439213]
2020-02-24 09:08:48 -05:00
Sean Keely 9e62ba8b96 Report HDP registers at all times.
HDP will now be used for coarse grain kernarg so needs to be
reported without consideration of fine grain vram over pcie.

Change-Id: I648167299faa583876a3d8685c3b3c4d8d31ebf9


[ROCm/ROCR-Runtime commit: 9c35780836]
2020-02-24 09:08:17 -05:00
Ramesh Errabolu 38747b8fec Update how code references publicly available ROCr headers
Change-Id: I357c51eb713a23704d4fee71081be46a73a71806


[ROCm/ROCR-Runtime commit: 627991b1c1]
2020-02-21 20:01:11 -05:00
Sean Keely 302c21ac31 Add env key HSA_NO_SCRATCH_THREAD_LIMITER.
Setting to 1 prevents the scratch handler from reducing peak occupancy.
Scratch allocations that would normally reduce peak occupancy will
instead fail.

Diagnostic for TF and PyTorch.
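
The behavior described above can be sketched roughly as follows. This is only an illustration under stated assumptions: `NoScratchThreadLimiter`, `DecideScratch`, and `ScratchOutcome` are hypothetical names that do not come from the ROCr sources; only the environment key `HSA_NO_SCRATCH_THREAD_LIMITER` and its value `1` are taken from the commit message.

```cpp
#include <cstdlib>
#include <cstring>

// Hypothetical outcome of a scratch request (not a ROCr type).
enum class ScratchOutcome { kGranted, kGrantedReducedOccupancy, kFailed };

// Returns true when HSA_NO_SCRATCH_THREAD_LIMITER is set to exactly "1".
inline bool NoScratchThreadLimiter() {
  const char* value = std::getenv("HSA_NO_SCRATCH_THREAD_LIMITER");
  return value != nullptr && std::strcmp(value, "1") == 0;
}

// Assumed policy: a request that fits at peak occupancy is always granted.
// Otherwise it fails when the limiter is disabled, and is granted with
// reduced occupancy when the limiter is active.
constexpr ScratchOutcome DecideScratch(bool fits_at_peak_occupancy,
                                       bool no_thread_limiter) {
  return fits_at_peak_occupancy ? ScratchOutcome::kGranted
         : no_thread_limiter    ? ScratchOutcome::kFailed
                                : ScratchOutcome::kGrantedReducedOccupancy;
}
```

A caller would evaluate `DecideScratch(fits, NoScratchThreadLimiter())`, so that a framework run with the key set surfaces an allocation failure instead of a silent occupancy drop.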

Change-Id: I2d7ea47077eb5cf708251c8aa3fd183ad4261be0


[ROCm/ROCR-Runtime commit: dc165c92bc]
2020-02-21 17:09:26 -05:00
Sean Keely b6b3140ae7 Correct scratch retry logic.
scratch_used_large_ was uninitialized leading to the observed hang.
DynamicScratchHandler would wait for a large scratch release despite no
large scratch having yet been allocated.  Fixes .

The patch also removes a potential race between AddScratchNotifier and
ReleaseQueueScratch.  The race condition does not exist today since both
scratch alloc and release run on the same thread.  The changes will
prevent this potential race from manifesting if the async event handler
is ever updated to use multiple threads.

Also enhances scratch occupancy reduction reporting.  Reporting now
prints the initial request size as well as the allocated size and the
effect on occupancy this has.  Occupancy is computed in terms of the
requesting dispatch grid size so may be >100%.

Change-Id: I0fc5ee01467ff4c29bdd25d545177c97862c3bd9


[ROCm/ROCR-Runtime commit: 6c556002d8]
2020-02-21 17:09:26 -05:00
Sean Keely b21dcb7913 Insert zero sized pool on CPU agents without attached memory.
Ensures that all CPU agents will have a pool handle to allocate
system memory.  These pools will have no numa binding since the
node their owning Agent represents has no installed memory.

Change-Id: I9f72b455d633646839753c6719ff7f6a4c41f7c4


[ROCm/ROCR-Runtime commit: d53fe07687]
2020-02-21 17:05:10 -05:00
Saleel Kudchadker 7c5a08073f Reset link_map map in the constructor
Change-Id: I8a6ad3bc0fca790dec2992cacf9288068b3bcaa3


[ROCm/ROCR-Runtime commit: c57f3da1dc]
2020-02-19 15:29:35 -08:00
Chris Freehill cdda497901 By default, don't collect rsmi monitor values
Change-Id: I4946efadcb9c5ececead1b4c40b73adc5ceca957


[ROCm/ROCR-Runtime commit: f5e86a8f14]
2020-02-14 18:06:19 -05:00
Pruthvi Madugundu 9638a946ae Adding RUNPATH to find libhsakmt.so for Centos and SLES
- This new path is required when libhsaruntime.so is referred
from the top level ROCm lib directory.
- Once the ROCm stack lib/lib64 structure is flattened, RUNPATH in all
the libraries needs to be updated.

Change-Id: I369131ce93e14958ec57a54701671f2bfd8d522a
Signed-off-by: Pruthvi Madugundu <pruthvi.madugundu@amd.com>


[ROCm/ROCR-Runtime commit: e931fd424b]
2020-02-14 17:43:29 -05:00
Sean Keely d7d1f6e2e3 Support stripped binaries and remove unneeded attributes.
Attribute optimize(0) doesn't appear to be helpful.  This
prevents optimization in the function but not at call sites to the
function.  The function may still be inlined since it has no side
effect (in some cases that we currently don't support).

Having a side effect prevents a call site optimization that allows
removal of a noinline function call with no side effect.  Call site
optimization should only happen (in GCC at least) when using whole
program optimization so this may be stronger than we strictly need.

Also added _amdgpu_r_debug to the exported symbol list (global) and
switched to the standard macro for an exported symbol (HSA_API).
Without being in the global list the debugger will not find this
symbol if the binary has been stripped.

Change-Id: Ieb00175ccc55fda4491deee44711cd55b3f24aeb


[ROCm/ROCR-Runtime commit: 3e9aca0f34]
2020-01-21 20:08:02 -05:00
Freddy Paul ac2ab48aa0 ADD libhsakmt.so RUNPATH for SLES
For SLES libhsakmt.so is located in <ROCm install>/lib64

Change-Id: I038dd80b65b4a493ac37981649b02f1b35caea88


[ROCm/ROCR-Runtime commit: 4b2256ac21]
2020-01-16 22:20:14 -05:00
Laurent Morichetti 74cd6e1197 Fix a build error when compiling with clang
Check __clang__ before __GNUC__ as clang defines both.
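
A minimal sketch of the ordering issue, assuming a hypothetical macro `EXAMPLE_COMPILER_NAME` and helper `CompilerName` (neither is from the ROCr sources). Because clang defines `__GNUC__` as well as `__clang__`, the clang branch has to be tested first or it is never reached:

```cpp
// clang defines both __clang__ and __GNUC__, so __clang__ is checked first.
#if defined(__clang__)
#define EXAMPLE_COMPILER_NAME "clang"
#elif defined(__GNUC__)
#define EXAMPLE_COMPILER_NAME "gcc"
#else
#define EXAMPLE_COMPILER_NAME "unknown"
#endif

// Hypothetical helper that exposes the detected compiler name.
constexpr const char* CompilerName() { return EXAMPLE_COMPILER_NAME; }
```

With the two checks swapped, a clang build would select the `__GNUC__` branch and pick up GCC-only constructs.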

Change-Id: I9963f8e0665efb4cb08bd3886fb38fee42dd9861


[ROCm/ROCR-Runtime commit: 19e1fb3a4e]
2020-01-15 18:52:53 -08:00
Srinivasan Subramanian b53a8a6377 Avoid shared library conflicts across multiple ROCM versions
Adding patch number based on ROCM build/release to have unique
file name for libraries across multiple versions of ROCM.

Signed-off-by: Pruthvi Madugundu <pruthvi.madugundu@amd.com>

Change-Id: I58d665b0e7d577b5bd7a6000d1202a0242672727


[ROCm/ROCR-Runtime commit: 54d94d02bd]
2020-01-15 01:08:23 -05:00
Qingchuan Shi a9208ef64f fix optimize(0) for clang.
Change-Id: I83bc57d42815f37445ae97bf6950147e3358ac45


[ROCm/ROCR-Runtime commit: d63886190f]
2020-01-13 20:53:40 -05:00
Ramesh Errabolu 8306bf2d94 CPack rule encoding dependencies was broken
Change-Id: I2b8df7d255cdffd6b42713f0b59df2aeef83a607


[ROCm/ROCR-Runtime commit: be5782e198]
2019-12-23 10:51:55 -06:00
Sean Keely 47c4c7bacf Disable SDMA on gfx10.
Lack of cache controls only allows operating SDMA at
agent scope.  All copy APIs are defined at system scope so may
result in data errors.

Change-Id: I9cd10007defddcbf8feb14a2e3daa1ba17c0489f


[ROCm/ROCR-Runtime commit: 22a601292d]
2019-12-20 17:25:47 -06:00
Ramesh Errabolu 44fb1be462 Support the building and packaging of legacy ROCr tools by itself
Change-Id: I2247bf7a46ee93495340f7b2603b09dc6b667443


[ROCm/ROCR-Runtime commit: d015d78de3]
2019-12-18 19:20:03 -06:00
Konstantin Zhuravlyov 1c4e072260 Switch to llvm monorepo
Change-Id: Ibfe045afd811d36521486573168aecd06279ccb6


[ROCm/ROCR-Runtime commit: 096e715629]
2019-12-17 22:55:20 -05:00
Chris Freehill ce2b8ab35c Import PAL version of addrlib with initial gfx1xxx support
Change-Id: I439930a5cbf5b13a359ec164e75c6828af8c668d


[ROCm/ROCR-Runtime commit: 4c22962024]
2019-12-12 21:38:01 -06:00
Sean Keely 3bcde37b58 Initial GWS queue support.
Queues should transition to ref counting for all queues eventually.
That cleanup will be part of shared queue pooling support.

Change-Id: I217ff5d573156678b9559da6fb81baa8cd31c617


[ROCm/ROCR-Runtime commit: 0a43a107b1]
2019-12-09 21:21:17 -05:00
Ramesh Errabolu a161850fcf Add package dependencies for Images and Tools
Change-Id: Id77ba0e81d3b3e872153cdd7680338dd70319026


[ROCm/ROCR-Runtime commit: 144017e148]
2019-12-06 16:38:21 -06:00
Sean Keely 511b86a55e Allow disabling scratch reclaim.
Debug and RCCL NPI assist feature.

Change-Id: I2cb76f0a086fa341465df3ede26965ab713bc3b4


[ROCm/ROCR-Runtime commit: d2a50a0048]
2019-11-20 02:41:58 -05:00
Sean Keely c6592f7757 Raise large scratch allocation limit for RCCL.
Temporary workaround for 2.10 release.  RCCL, compiler, or firmware
must be corrected and this code reverted before another ASIC release.

Change-Id: I27851353289b93df9acb72d28b8c6ccb9f7f7d7a


[ROCm/ROCR-Runtime commit: 35c1ffa863]
2019-11-20 02:41:27 -05:00
Sean Keely 7af8b533c2 Correct clang paths for incremental builds.
We should be using bin/clang, not the build/lightining/bin/clang since
build/ is the project's internal build directory.  This patch corrects
this where possible.

However, lightning does not install all its end user files under out/
so some headers can not be found anywhere in out/ in an incremental build.
This header (opencl-c.h) is fetched out of the lightning source tree if
necessary.

Change-Id: I083d8b27bb39dd615fba3bb0711a789318f95e77


[ROCm/ROCR-Runtime commit: f62017f4a5]
2019-11-16 03:10:28 -06:00
Laurent Morichetti 6e3347ed25 Fix a typo in INSN_S_ENDPGM_OPCODE encoding.
The correct s_endpgm instruction encoding is 0xBF810000.
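
As a sanity check on that value, the word can be rebuilt from its fields. This sketch assumes the SOPP layout as I understand it from AMD's public ISA documentation (bits [31:23] fixed to 0b101111111, bits [22:16] the opcode, bits [15:0] SIMM16) and an `s_endpgm` opcode of 1. `EncodeSopp` and the constant names are hypothetical, not the ROCr definitions:

```cpp
#include <cstdint>

// Assumed SOPP encoding marker in bits [31:23]: 0b101111111 (0x17F).
constexpr std::uint32_t kSoppEncoding = 0x17Fu << 23;  // 0xBF800000

// Builds a SOPP instruction word from a 7-bit opcode and a 16-bit immediate.
constexpr std::uint32_t EncodeSopp(std::uint32_t opcode, std::uint32_t simm16) {
  return kSoppEncoding | ((opcode & 0x7Fu) << 16) | (simm16 & 0xFFFFu);
}

// Assumption: s_endpgm is SOPP opcode 1 with a zero immediate.
constexpr std::uint32_t kSEndpgmOpcode = 1;
constexpr std::uint32_t kSEndpgmWord = EncodeSopp(kSEndpgmOpcode, 0);

static_assert(kSEndpgmWord == 0xBF810000u,
              "matches the encoding quoted in the commit message");
```

Deriving the constant from its fields and pinning it with a `static_assert` is one way to catch a mistyped digit at compile time.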

Change-Id: I03f304762dcaced5bf3fa4f069da7a0b287d1cd2


[ROCm/ROCR-Runtime commit: 5774d9162b]
2019-11-12 11:54:17 -08:00
Ramesh Errabolu 1798d1386b Fix Aql Header initialization
Change-Id: If23c12b28d22fb62b673038890f40bb0f3c3948b


[ROCm/ROCR-Runtime commit: 086b4b686c]
2019-11-06 16:13:58 -05:00
Qingchuan Shi 2ab9ce6d5c Adding code object list in loader.
Change-Id: Iab3541287bd56276fd32615ee59fcd590de84ca0


[ROCm/ROCR-Runtime commit: 16a20cfb8c]
2019-10-30 20:31:51 -04:00
Jay Cornwall 698ddfed09 Merge debugger trap handler into ROCr trap handler
Debugger path is taken for (trap_id >= 3) and single step exceptions.
Other traps/exceptions behave as before.
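
The routing rule can be written down as a small predicate. The real handler is shader code, so this C++ is only a model of the condition stated above; `TrapPath` and `SelectTrapPath` are hypothetical names:

```cpp
#include <cstdint>

// Hypothetical label for the two paths through the merged handler.
enum class TrapPath { kDebugger, kDefault };

// Debugger path for trap_id >= 3 or a single-step exception; everything
// else keeps the pre-existing behavior.
constexpr TrapPath SelectTrapPath(std::uint32_t trap_id,
                                  bool single_step_exception) {
  return (trap_id >= 3 || single_step_exception) ? TrapPath::kDebugger
                                                 : TrapPath::kDefault;
}
```

For example, `SelectTrapPath(2, false)` yields `TrapPath::kDefault`, while `SelectTrapPath(2, true)` and `SelectTrapPath(3, false)` both yield `TrapPath::kDebugger`.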

Change-Id: I276c0eb69953709968353a57717ee017d22348a2


[ROCm/ROCR-Runtime commit: 78e754935c]
2019-10-30 13:56:06 -04:00
Sean Keely 37f96146f5 Correct strip command.
Strip should only apply to the output target library.  Symlinks
with .so endings which will be relocated during install will cause
strip to fail, aborting the build.

Change-Id: Ieb598c2cec5277d9d14c8afa88b91ca2c7f4412d


[ROCm/ROCR-Runtime commit: 851ee799c4]
2019-10-30 01:24:43 -05:00
Sean Keely afc463c817 Adapt to new versioning.
Using branch point for count since last change since we don't
have questions answered on tags yet.

Removed unused CMake files.
Restructured CMake to use the cache rather than only commandline
and be ccmake & cmake-gui friendly.
Dependency search paths are added for the Repo tree layout.

Search paths still needed for install paths.

Simplified packaging.  hsa-ext-rocr-dev package and contents now
build from the package CMake rather than being 3 separate projects.

Not applying new version number or new install paths!

Change-Id: Ibea50dc8a6ab091e91857f78833f5379a4511547


[ROCm/ROCR-Runtime commit: 6c3acda664]
2019-10-29 04:21:17 -05:00
Chris Freehill 6687f73732 Add some gfx1xxx targets
This is to fix 

Change-Id: I69a87884d8174733905e4c007cf0f19b5103482a


[ROCm/ROCR-Runtime commit: 53228ad819]
2019-10-16 09:01:22 -04:00
Jay Cornwall 27ea6107f8 Disable SDMA HDP flush on gfx10
Not currently functional, triggering SRBM write protection.

Change-Id: Ib0b832357e3df5a6a0d0b46648515ec9bd70f017


[ROCm/ROCR-Runtime commit: 906cd84186]
2019-09-14 14:08:47 -04:00
Jay Cornwall fe6a31ee4e Set MTYPE field in SDMA fence command on gfx10
This is the only SDMA command with an MTYPE field.

Change-Id: Ice146ace9c3e8e7aff038e1e004be73c070f48fe


[ROCm/ROCR-Runtime commit: e0358d7dc2]
2019-09-14 14:07:57 -04:00
Jay Cornwall d597f80b04 Add gfx1010, gfx1011, gfx1012 ELF types to loader
Change-Id: I23a1159fb10f60881ea6830ba13ee73bd373bfc9


[ROCm/ROCR-Runtime commit: 32a9a5dbb0]
2019-09-14 14:07:16 -04:00
Jay Cornwall 933f052033 Implement code cache (SQC I$, SQC K$, TCP, GL1, GL2) invalidation for gfx10
Change-Id: I8b2a59118094fbb55e3f575fa9f79959d3725d7d


[ROCm/ROCR-Runtime commit: 5b64fbd0e5]
2019-09-14 14:06:31 -04:00
Jay Cornwall e729948e41 Add binary shaders for gfx10
Change-Id: Iaf586a15a2f2aebc266da5148aa8637b092c1002


[ROCm/ROCR-Runtime commit: d1c5a079cd]
2019-09-14 14:05:35 -04:00
Chris Freehill e44fecc07c Add gfx10,11,12 old to new name format conversion
Change-Id: I792c840d8d819d1d48f95fc4167b2e25c6beec23


[ROCm/ROCR-Runtime commit: 0afe6618a6]
2019-09-14 10:37:19 -04:00
Jay Cornwall b25eda2db7 Support wave32/wave64 scratch allocations on gfx10
- Use new buffer resource descriptor layout
- Handle wave32 scratch allocation error from CP
- Make wavefront size a property of scratch allocation requests
- Repurpose wave64-specific amd_queue_t.scratch_workitem_byte_size field
- Clear index_stride field in V# on gfx10, calculated per-dispatch by CP

Change-Id: If2acdf6430772abd4d6a8c792fc8c11260764dda


[ROCm/ROCR-Runtime commit: f8d0ccd159]
2019-09-13 17:22:59 -04:00
Chris Freehill 5fbd73af1d Update addrlib to pull in gfx10
This is mostly un-edited from Perforce. We will make other required
edits in future commits.

Change-Id: I55a809f2f23f03d60e4dcd1fb947ad558e737027


[ROCm/ROCR-Runtime commit: 08841faf4c]
2019-09-13 11:44:23 -04:00
Chris Freehill aad11979eb Make gfx10 use OSS defined packet fields
Change-Id: Icf622c22a17005aaeafb24f80a414319bebb891f


[ROCm/ROCR-Runtime commit: 0ec781478d]
2019-09-13 08:14:24 -04:00
Chris Freehill 547f41e83a Add gfx10 as a target ID
Change-Id: Ib9a78776af9f26ff9278a06b059cb8b7ee216ee2


[ROCm/ROCR-Runtime commit: b104031628]
2019-09-12 20:24:40 -05:00
Chris Freehill f2023220fd Initial support for gfx1010, gfx1011, gfx1012
Change-Id: I9ec398070c85db08aea72947557c6e1b5f7d541d


[ROCm/ROCR-Runtime commit: 6ebdad5896]
2019-09-12 20:24:30 -05:00
Sean Keely 286cf8f732 Enable trap handler on APUs.
Change-Id: Ifdc8c2782498b3fbe238d773120d378c47918d07


[ROCm/ROCR-Runtime commit: f2599fccb6]
2019-09-06 18:10:20 -04:00
Sean Keely 9c6f904413 Correct doorbell_queue_map allocation.
doorbell_queue_map should always be allocated or we will need to
add branches around all accesses.

Change-Id: I994c0eaf4be62c1a4a37bd06894272dba1fc1da6


[ROCm/ROCR-Runtime commit: f9d3796db8]
2019-09-06 18:10:20 -04:00
Christian Sigg c28aadf5a8 Add missing include to lazy_ptr.h
Change-Id: I5b061692a4ec6def631d7c3182e5b644b6b9c519


[ROCm/ROCR-Runtime commit: 00b0ee15b3]
2019-09-05 02:44:27 -04:00
Christian Sigg e17c7e24d6 Change #include of libelf.h from quote to angle.
Change-Id: Ie940ed0f78e95224e42978381c552861e6d58ee4


[ROCm/ROCR-Runtime commit: 1f177cf9c2]
2019-09-05 02:43:54 -04:00