Commit Graph

2959 Commits

Author SHA1 Message Date
Laurent Morichetti 00da82f951 Add debugger support for wave halted at launch
New trap handler ABI: Record in ttmp11[8:7] the event that caused the
trap handler to be entered. We currently record 2 events, trap_raised
if an s_trap instruction was executed, or excp_raised if an exception
(MEM_VIOL or ILLEGAL_INST) was raised.

Change-Id: Ie278c8277437b3b67c2737dcd1a12fe6511df428
2020-04-29 19:29:56 -04:00
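
To illustrate the ttmp11[8:7] field described in commit 00da82f951 above, here is a minimal Python sketch of extracting a two-bit field from a 32-bit register value. The bit positions come from the commit message; the numeric encodings for trap_raised and excp_raised are not stated there, so the values in EVENT_NAMES are hypothetical placeholders, as are all function names.

```python
# Bit positions taken from the commit message: ttmp11[8:7].
TRAP_EVENT_SHIFT = 7
TRAP_EVENT_MASK = 0b11  # the field is two bits wide

# HYPOTHETICAL encodings: the commit message names the two events
# but does not say which bit pattern maps to which event.
EVENT_NAMES = {1: "trap_raised", 2: "excp_raised"}


def trap_event_field(ttmp11: int) -> int:
    """Return the raw two-bit value stored in bits 8:7 of ttmp11."""
    return (ttmp11 >> TRAP_EVENT_SHIFT) & TRAP_EVENT_MASK


def describe_trap_event(ttmp11: int) -> str:
    """Map the raw field to a name using the assumed encodings above."""
    return EVENT_NAMES.get(trap_event_field(ttmp11), "unknown")
```

A debugger-side consumer would read ttmp11 from the halted wave and pass it to trap_event_field; only the shift and mask are grounded in the commit text.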
Matt Arsenault 2e73d52ac6 Use -nogpulib in another place
Change-Id: I9cc1daa7db7d1f2ff07a0dbfb403dbf41f4bbffb
2020-04-28 13:46:01 -04:00
Matt Arsenault 0d84b66b1e Use -nogpulib as a quick build fix
Change-Id: I28ca7d53c76f0829719079dfb67b6314f5ff27cc
2020-04-28 10:08:37 -04:00
Kent Russell 33133ebd07 CMakeLists: Support static building of hsa-runtime
Remove the hard-coding of "SHARED" as the lib type, and move any
SO-specific linking to only happen if the .so exists in the first place

Change-Id: I3f0bfd5c03f19b2425423b4dc8eed8fd87acc1d6
2020-04-27 20:52:07 -04:00
Sean Keely 675f73cda9 Adapt to new LLVM location in repo build.
This will reenable incremental PSDB builds.

Change-Id: I2311c124b06b544202f7c1db31b6607f2580194e
2020-04-27 17:58:35 -04:00
Kent Russell ddd38deab7 Make thunk lib type be defined by cmake
Make the hsakmt library take the value from CMake regarding
static/shared
KFDTest automatically grabs the right one due to it checking the normal
shared folders. Tested locally (and via automation by the time that this
is merged)
Also set the default to building SOs if BUILD_SHARED_LIBS is not defined

Change-Id: I7f8b76a7e60f3b41e5981f472b388301ae09e2c6
Signed-off-by: Kent Russell <kent.russell@amd.com>
2020-04-27 13:52:42 -04:00
Austin Kerbow 87202d4408 Update IsaRegistry for backend changes
Changes in the compiler are being made to add controls for XNACK and SRAM ECC
for all targets which can support these features. By default the conservatively
correct settings of XNACK on and SRAM ECC on will be used. This change is to
facilitate these backend updates.

Change-Id: I2fd6b6bc1d32937737e7f56d8e08c70fe781c745
2020-04-25 04:45:28 -04:00
Kent Russell b72bbeac3e Fix naming conventions again for -dev package
Using the building OS isn't guaranteed, as we can theoretically build
RPMs in Ubuntu or DEBs in CentOS. Use CPack's DEB/RPM-specific variables
to get around this issue

Change-Id: I404246c070eac2c74b45ae4b763c612891d66de1
Signed-off-by: Kent Russell <kent.russell@amd.com>
2020-04-24 08:06:38 -04:00
Sean Keely 7712c7e743 Correct IPC fragment validation.
IPC create must only be used on whole ROCr allocations.
Fragments were allowing handle creation with offsets.

Change-Id: I1faa96d36bc7a6199bdc2e3ff1b8871d1a36a2fa
2020-04-24 00:08:53 -04:00
Sean Keely b90bf473c1 Correct capture of PoolInfo::allocable_size_.
Change-Id: I80757bb36048bc15b928220aca0a1eb5d898ab22
2020-04-21 19:03:24 -05:00
Sean Keely 3fe891d5da Suppress Finalizer loading attempts.
This has been the default mode for a while now since we don't
distribute or build the finalizer.  Removing the attempt cleans
up debug mode messages that are causing confusion.

Change-Id: I8162c95abd5bbedaa22b90191f7a384a34c388ae
2020-04-18 00:06:42 -04:00
Sean Keely e25ae1263b Remove references to finalizer header.
Change-Id: I6608c95268ab4bc66053d889cf7d5a30cd8fccab
2020-04-17 23:50:23 -04:00
Sean Keely a9470e3563 Correct rocrtst numa awareness.
Pool size was being used where alloc_max_size should be.
Changes are necessary on NUMA systems where not all nodes have
installed memory.

Change-Id: If8f507cae50a8dfeae8572d4e39df757abe28599
2020-04-17 23:43:38 -04:00
Jonathan Kim af249159ee kfdtest: do not request host accessible memory for P2P tests
Do not request host accessible memory otherwise small-bar XGMI fails.

Change-Id: I6b1e750839ae66a34c85405fa8d0a4aa455399ef
Signed-off-by: Jonathan Kim <Jonathan.Kim@amd.com>
2020-04-17 23:42:10 -04:00
Sean Keely 9fe44ed675 Don't lock KFD allocated system memory.
Lock API succeeds but the GPU still faults on the address.
This should be fixed in Thunk and/or KFD as well.

Change-Id: I8b2fbcae61ab181e4fe7f0b64e43a5f0772efb24
2020-04-17 21:45:01 -04:00
Ramesh Errabolu 7434fa14d4 Extend RocrTst to query UUID of ROCm devices
Change-Id: I6c9fa9751c893ba119e8b7a2808a8ab2aebeba3b
2020-04-16 21:29:32 -04:00
Felix Kuehling 8ee763d94a kfdtest: Fix problems finding kfdtest.exclude
When running run_kfdtest.sh through a wrapper script that sources
run_kfdtest.sh, kfdtest.exclude isn't found because $0 points to the
location of the wrapper script. Use $BASH_SOURCE instead of $0 to
find the location of the correct run_kfdtest.sh script.

Change-Id: I0ae7899e527e6d98bb8651197484e5ee03a5fd7b
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
2020-04-16 18:23:33 -04:00
Kent Russell 7ee9e01587 KFDTest: Use SMI for HIGH clocks, if possible
Some systems don't support coarse-grained DPM, so performance level will
fail. Remove the compute_utils.sh references, and just use the SMI if we
request clocks be high, without throwing errors if it fails.

Change-Id: Ic5beda9921128be36ac2d58cae3f0608618a8e21
2020-04-16 07:59:33 -04:00
Srinivasan Subramanian 5e35364838 libhsakmt: check ret and errno for EBADF
Change-Id: I9fcbf955d8b7b01ff1025534a8c2eaa8e6790565
Signed-off-by: Srinivasan Subramanian <srinivasan.subramanian@amd.com>
2020-04-15 20:55:40 -04:00
Ramesh Errabolu 30f46e4e24 Stop building and packaging Tools library
Change-Id: Iee430c24e32ea7412f21564fe8970749e4954b91
2020-04-15 13:58:33 -04:00
Laurent Morichetti 5f783494f1 Return a file URI for elf images in shared objects
Iterate the loaded shared objects to see if the given elf image binary
is part of a loaded segment.

Change-Id: I074cacd99eb5b59f883f4ce2bd901e0e35a660b8
2020-04-14 15:22:43 -04:00
Nathan O 6d5781bb14 Fix hsa_amd_agents_allow_access documentation
- Update the documentation comment in hsa_ext_amd.h, which contained
   contradictory and incorrect information about an argument to the
   hsa_amd_agents_allow_access function.

Change-Id: I60b0dbbdc761078cd81906bc2c63a27d7e6b53e1
2020-04-10 18:26:13 -04:00
Ramesh Errabolu 89f7ef224c Extend Rocr Visible Devices functionality to include UUIDs
Change-Id: Ia2892e4033717556a422fe33dec0294fe2ca9e28
2020-04-09 00:42:53 -05:00
Ramesh Errabolu 45958c727d Extend ROCr to surface UUID of GPU devices that support
Change-Id: I478db68d69a01578770403fa695f9e6391637573
2020-04-08 19:19:22 -05:00
Sean Keely 884fed4f04 Correct initial kfd_open_count increment.
Don't set kfd_open_count=1 unless hsaKmtOpenKFD actually succeeds.
This prevents returning HSAKMT_STATUS_KERNEL_ALREADY_OPENED in
subsequent calls when KFD is actually closed.

Signed-off-by: Sean Keely <Sean.Keely@amd.com>
Change-Id: Ia870b5faa8626826a6c8795aa10784d376cf2e80
2020-04-03 21:05:07 -04:00
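
The fix described in commit 884fed4f04 above follows a common pattern: update the open counter only after the underlying open has succeeded. The Python sketch below models that pattern under stated assumptions. The class, the string status values, and the open_device callable are all hypothetical stand-ins, not the libhsakmt implementation; incrementing the count on repeated opens is also an assumption.

```python
class KfdHandleSketch:
    """Toy model of a reference-counted device open (hypothetical)."""

    def __init__(self, open_device):
        # open_device: hypothetical callable that returns a file
        # descriptor or raises OSError when the device cannot be opened.
        self._open_device = open_device
        self.open_count = 0
        self.fd = None

    def open(self) -> str:
        if self.open_count > 0:
            # Stand-in for HSAKMT_STATUS_KERNEL_ALREADY_OPENED.
            self.open_count += 1
            return "ALREADY_OPENED"
        try:
            fd = self._open_device()
        except OSError:
            # The count stays at 0, so a later call retries the open
            # instead of reporting that the device is already open.
            return "ERROR"
        self.fd = fd
        self.open_count = 1  # set only after the open succeeded
        return "SUCCESS"
```

With the counter set before the open attempt, a failed first call would leave open_count at 1 and every later call would report "ALREADY_OPENED" for a device that was never opened, which is the behavior the commit message says it prevents.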
Pruthvi Madugundu 241cdfdd01 Updating the hsa include directory symlink creation
- Symlink creation is corrected only for deb packages
- It is a follow-up to http://git.amd.com:8080/c/hsa/ec/hsa-runtime/+/334403
- configure_file() is called to update the scripts with proper cmake variable values

Signed-off-by: Pruthvi Madugundu <pruthvi.madugundu@amd.com>
Change-Id: I0e833ead265166411e83593fd57265a9ab356904
2020-04-01 02:17:11 -07:00
Sean Keely f3b532b42d Fix Debian package build on CPack 3.10.
CPack now incorrectly adds two copies of directory symlinks when
building Debian packages.  This causes dpkg to see a file conflict
and fail installing.

The correct long term solution is to remove the symlink and use a
flat directory structure.  This patch adds the symlink in the post
install script as a workaround until we can switch to flat layout.

Change-Id: I879b6cbc2661c19df3db639cb42fba0972fddb93
2020-03-20 21:35:32 -05:00
Jon Chesterfield 0a1718b753 Replace libpci with new parser.
libpci was only used to find a marketing string for a device.
This patch looks for a pci.ids on disk and parses it to extract the
same string, using 'Device xxxx' as the fallback on file i/o error
or missing data from the text file. Tested by checking every vendor/
device pair against the values returned from libpci.

Change-Id: I21af3157472c1824d57fcee31393c6ee8ce07330
Signed-off-by: Jon Chesterfield <Jonathan.Chesterfield@amd.com>
2020-03-20 17:50:47 +00:00
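
The lookup described in commit 0a1718b753 above can be sketched in Python as follows. This is not the parser from the patch; it is a minimal illustration of the pci.ids text layout (vendor lines at column 0, device lines indented by one tab, subsystem lines by two tabs, comments starting with '#'). The function names are hypothetical, and the lowercase four-digit hex form of the 'Device xxxx' fallback is an assumption.

```python
def lookup_device_name(lines, vendor_id: int, device_id: int) -> str:
    """Find a device name in pci.ids-style text, else 'Device xxxx'."""
    fallback = f"Device {device_id:04x}"  # assumed fallback formatting
    vendor_hex = f"{vendor_id:04x}"
    device_hex = f"{device_id:04x}"
    in_vendor = False
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line.strip() or line.lstrip().startswith("#"):
            continue  # blank line or comment
        if line.startswith("\t\t"):
            continue  # subsystem entry, not needed for the device name
        if line.startswith("\t"):
            if in_vendor:
                parts = line.split(None, 1)
                if parts[0].lower() == device_hex:
                    name = parts[1].strip() if len(parts) > 1 else ""
                    return name or fallback
            continue
        # Column-0 line: a vendor entry or a class entry ("C xx ...").
        if in_vendor:
            return fallback  # left the vendor block without a match
        in_vendor = line.split(None, 1)[0].lower() == vendor_hex
    return fallback


def device_name_from_file(path, vendor_id: int, device_id: int) -> str:
    """Same lookup, falling back on any file I/O error."""
    try:
        with open(path, encoding="utf-8", errors="replace") as f:
            return lookup_device_name(f, vendor_id, device_id)
    except OSError:
        return f"Device {device_id:04x}"
```

The lookup accepts any iterable of lines, so it can be exercised on an in-memory sample without a pci.ids file on disk.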
Jeffrey Poznanovic 5dcd49f726 Added CentOS-6 mods to support manylinux2010
Change-Id: I8c303ccfdc7d314d1b4609ed6181d46795ada621
2020-03-20 08:47:25 -04:00
Divya Shikre 96259b5830 kfdtest: Provide Unique ID information.
Signed-off-by: Divya Shikre <DivyaUday.Shikre@amd.com>
Change-Id: I9a837a3d1ab38f5ad05673406874d862c9e97541
2020-03-18 15:55:59 -04:00
Sean Keely 9efefe6d52 Handle EBADF when KFD file handle is still open.
Signed-off-by: Sean Keely <Sean.Keely@amd.com>
Change-Id: I23d6c87d5729f57c261030c6baeff4c977eef934
2020-03-11 18:52:19 -05:00
Divya Shikre ebe7de1f99 libhsakmt: Expose device Unique Id
Read device unique id from sysfs and expose it in HsaNodeProperties.
For devices not supported the value will be 0

Signed-off-by: Divya Shikre <DivyaUday.Shikre@amd.com>
Change-Id: I97b8689dfa090971c6876de6feaa97652e28c03d
2020-03-10 10:06:11 -05:00
Yong Zhao 4e7b2f2e27 kfdtest: Print a message when there is no GPU
This helps the user to troubleshoot the problem.

Change-Id: If6cf42c488097011285252a6c722d3d74c0f7ce7
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
2020-03-04 15:00:47 -05:00
Yong Zhao 0e5c4d83e6 kfdtest: Delete MULTI_GPU usage in run_kfdtest.sh
It is obsolete.

Change-Id: Ifd137ce1ce8d9133cfa5c8bfd46aaeea461b5aa7
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
2020-03-04 15:00:47 -05:00
Sean Keely a1c2439213 Update asserts and comments for pointer info.
Checks for an IPC memory error and updates comments relevant
to rocr_visible_devices.

Change-Id: I9d2f2dd27f3fa04881d17387cce2692bc046edb2
2020-02-24 09:08:48 -05:00
Sean Keely 9c35780836 Report HDP registers at all times.
HDP will now be used for coarse grain kernarg so needs to be
reported without consideration of fine grain vram over pcie.

Change-Id: I648167299faa583876a3d8685c3b3c4d8d31ebf9
2020-02-24 09:08:17 -05:00
Ramesh Errabolu 627991b1c1 Update how code references publicly available ROCr headers
Change-Id: I357c51eb713a23704d4fee71081be46a73a71806
2020-02-21 20:01:11 -05:00
Sean Keely dc165c92bc Add env key HSA_NO_SCRATCH_THREAD_LIMITER.
Setting to 1 prevents the scratch handler from reducing peak occupancy.
Scratch allocations that would normally reduce peak occupancy will
instead fail.

Diagnostic for TF and PyTorch.

Change-Id: I2d7ea47077eb5cf708251c8aa3fd183ad4261be0
2020-02-21 17:09:26 -05:00
Sean Keely 6c556002d8 Correct scratch retry logic.
scratch_used_large_ was uninitialized leading to the observed hang.
DynamicScratchHandler would wait for a large scratch release despite no
large scratch having yet been allocated.  Fixes .

The patch also removes a potential race between AddScratchNotifier and
ReleaseQueueScratch.  The race condition does not exist today since both
scratch alloc and release run on the same thread.  The changes will
prevent this potential race from manifesting if the async event handler
is ever updated to use multiple threads.

Also enhances scratch occupancy reduction reporting.  Reporting now
prints the initial request size as well as the allocated size and the
effect on occupancy this has.  Occupancy is computed in terms of the
requesting dispatch grid size so may be >100%.

Change-Id: I0fc5ee01467ff4c29bdd25d545177c97862c3bd9
2020-02-21 17:09:26 -05:00
Sean Keely d53fe07687 Insert zero sized pool on CPU agents without attached memory.
Ensures that all CPU agents will have a pool handle to allocate
system memory.  These pools will have no numa binding since the
node their owning Agent represents has no installed memory.

Change-Id: I9f72b455d633646839753c6719ff7f6a4c41f7c4
2020-02-21 17:05:10 -05:00
Saleel Kudchadker c57f3da1dc Reset link_map map in the constructor
Change-Id: I8a6ad3bc0fca790dec2992cacf9288068b3bcaa3
2020-02-19 15:29:35 -08:00
Chris Freehill f5e86a8f14 By default, don't collect rsmi monitor values
Change-Id: I4946efadcb9c5ececead1b4c40b73adc5ceca957
2020-02-14 18:06:19 -05:00
Pruthvi Madugundu e931fd424b Adding RUNPATH to find libhsakmt.so for Centos and SLES
- This new path is required when libhsaruntime.so is referred
from the top level ROCm lib directory.
- Once ROCm stack lib/lib64 structure is flattened, RUNPATH in all
the libraries needs to be updated.

Change-Id: I369131ce93e14958ec57a54701671f2bfd8d522a
Signed-off-by: Pruthvi Madugundu <pruthvi.madugundu@amd.com>
2020-02-14 17:43:29 -05:00
Yong Zhao 7a852be42e kfdtest: Clean up KFDEvictTest
Move the shader code before the test case code so that all test case
code is consecutive.

Rectify the print messages and avoid calling GetSysMemSize() repeatedly.

Change-Id: I1c4aa5552de4d74163717fe66ad9759fb09e1316
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
2020-02-13 15:42:04 -05:00
Yong Zhao 21cda69ba9 kfdtest: Adapt the CWSR test for emulators
The original test takes forever to run on emulators because emulators
are much slower than ASICs. So intelligently detect the emulator scenarios
and reduce the run time by cutting the iteration count.

Change-Id: I087f43c04c2b23b5ab2ecaad07533b767c337e94
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
2020-02-13 11:35:37 -05:00
Kent Russell 0d4c209552 Remove RegisterForeignDeviceMem test code
Use an "if 0" to remove the code, as it is not working as expected. The
test is supposed to test mapping from a foreign device (like a NIC), so
it uses a separate GPU to map, but this mapping can be evicted and thus
the test can fail unexpectedly. Remove the code until the test can be
reworked

Change-Id: Ie4a15c2a018bbd8e931b06b6700d10b3be86e410
2020-02-13 10:04:01 -05:00
Sean Keely e66818e4d3 Update analysis_memory_exception to recognize shared memory.
Add type HSA_POINTER_REGISTERED_SHARED printing.

Change-Id: Ic0400a097ebabde4f035b57fbca4cca12428fc97
2020-02-12 21:51:53 -05:00
Kent Russell a360c68b0c Add DEB/RPM packaging for KFDTest
This will allow it to be installed with the ROCm suite,
and centralize things a little bit more
Also update run_kfdtest.sh to reflect the changes
Lastly, remove "die" reference as compute_utils.sh
may not be packaged with KFDTest

Change-Id: I4c30cd29979192496419e71e3685937d7417f739
2020-02-11 13:53:09 -05:00
Harish Kasiviswanathan 31530da7c6 libhsakmt: Child process can reacquire system props
If child process explicitly calls hsaKmtReleaseSystemProperties(), it
fails. Allow child process to release and acquire system properties.

Change-Id: I649a4600212711b2ad4474f605f3ca39a4003d03
Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
2020-02-06 15:09:39 -05:00
Felix Kuehling 6b8095184f kfdtest: List source files explicitly
Tests run in the order in which they are linked. Currently that order
is non-deterministic. Listing source files in the Makefile explicitly
makes the order deterministic.

The order chosen runs basic tests before more advanced tests before
stress tests.

Change-Id: I5bc032bcd589f92a51db36acb518bb4d8ef778d3
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
2020-02-05 14:34:53 -05:00