Commit Graph

479 Commits

Author SHA1 Message Date
Chris Freehill 290dfd785f Make build_rocrtst.sh build all target kernels by default
This will allow the default target list to be branch
specific.

Change-Id: If8ecc14e2b7fb5ed2eb25ab447480308d539b248


[ROCm/ROCR-Runtime commit: d699039284]
2019-07-05 19:30:07 -04:00
Jay Cornwall 60da601be4 Handle traps, illegal instruction, memory violations through queue signal
Report traps and fatal exceptions through a wavefront's
amd_queue_t.queue_inactive_signal. Previously, only traps were
reported and required the compiler to pass in the signal pointer
in s[0:1].

The signal is obtained through a mapping from doorbell index to
amd_queue_t*. The doorbell is fetched within a wavefront through
the gfx9+ S_SENDMSG(MSG_GET_DOORBELL) instruction.

Change-Id: I319b45f2e15dfcfe4db8f4065da1136e9539a42b


[ROCm/ROCR-Runtime commit: ff8f439112]
2019-07-01 22:59:41 -04:00
Jay Cornwall 822d838eae Replace gfx9 SP3 trap handler with LLVM, fix IB_STS restore
Assembler toolchains are moving from SP3 to LLVM. Replace trap handler
source code with LLVM equivalent.

Fix a trap issue with SQ_WAVE_IB_STS restore. Mostly harmless as all
traps are currently considered fatal to the wavefront.

Change-Id: Iacecd9dd31a1d96a083c8b8327f442f33c861f9f


[ROCm/ROCR-Runtime commit: 6ed686ee29]
2019-07-01 22:59:27 -04:00
Chris Freehill 970cca3731 Temporarily disable Debug test
Change-Id: Iabb238fcd78b9c2eb0c085b19ab93b8c9e538140


[ROCm/ROCR-Runtime commit: 8caa6c0b01]
2019-06-29 04:55:35 -04:00
Sean Keely 872c359ba2 Initial support for deallocation callbacks.
Adds hsa_amd_register_deallocation_callback and hsa_amd_deregister_deallocation_callback
to notify when HSA memory has been released.

Change-Id: I1f33cee250ca890e5c2e7fddfa4479aa5874651d


[ROCm/ROCR-Runtime commit: 299874f17d]
2019-06-26 04:12:17 -05:00
Chris Freehill 9d70b6a420 rocrtst fixes for hsa_signal cleanup and aql packet dispatch
In several places aql packets were written to queue all at once
instead of doing the header atomically. These cases have been
fixed.

There were a few hsa_signal leaked that have been addressed.

There was some duplication of code that has been addressed.

Addresses ROCMOPS-456

Change-Id: Ia1869bc370f92e49ac560301df47741d5f76978e


[ROCm/ROCR-Runtime commit: 081a2cc875]
2019-06-21 17:34:10 -05:00
Evgeny 87cdf00d09 aqlprofile api fix
Change-Id: I2a710040422c7853ece5472ea776442b25d69dcb


[ROCm/ROCR-Runtime commit: 6c0aaa2773]
2019-06-19 23:14:27 -04:00
Sean Keely 904723af7c Fix IPC related hangs/faults in rocrtst.
IPC was failing due to calling fork when HSA was open.  The fix
was correcting incomplete cleanup in several other tests.

TestBase::Close (via CommonCleanUp) now checks that HSA is properly
closed between tests.

rocrtstPerf.Memory_Async_Copy uses hwloc which uses OpenCL which
has no shutdown routine.  Consequently this test cannot clean up
properly.  I added a hack to force HSA refcount to the value
it should have if OpenCL were cleaning up but this leaks resources
and potentially puts hwloc & OpenCL in a bad state.

OpenCL loads LLVM which installs some exit handlers.  Those handlers
can't execute in a child process and can't be removed since OpenCL
doesn't clean up.  IPC hacks around this by aborting rather than exiting
in the child process.

Change-Id: I92326a73d7b11632208717d99728e6dafdc7d3ca


[ROCm/ROCR-Runtime commit: bb980462e7]
2019-06-19 01:03:52 -04:00
Sean Keely 5d5d40fcf9 PTHREAD_STACK_MIN may differ from system parameters.
Restrict stack adjustment to non-default stack requests and allow
stack growth within reason (20MB cutoff).

Change-Id: I320280c711402ac29683e94c7246b7c32c797611


[ROCm/ROCR-Runtime commit: 0c0e634458]
2019-06-17 21:04:17 -05:00
Sean Keely ca44cbb3d9 Revert to SystemClockCounter for HSA system time.
CPUClockCounter is not NTP adjusted (CLOCK_MONOTONIC_RAW) so should be 
better for measurements.  However, it is implemented with syscall while
CLOCK_MONOTONIC is implemented via vDSO.  The latency increase becomes
significant when language layers make corresponding clock measurements.
Reverting to CLOCK_MONOTONIC will reduce latency and allow small
duration events to be measured at the cost of incorporating NTP
frequency skew errors.  NTP may adjust frequency by 500ppm so limits us
to ~3 decimals in elapsed time.

Change-Id: I920b9f707f47109d80d6c256c475638c03fb8d76


[ROCm/ROCR-Runtime commit: 4b22d24346]
2019-06-17 21:07:26 -04:00
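The "~3 decimals" figure in the SystemClockCounter commit above follows from simple arithmetic on the 500 ppm skew it quotes. A minimal sketch of that arithmetic; the function names are hypothetical and are not part of ROCr:

```python
import math

# Maximum NTP frequency adjustment quoted in the commit message.
NTP_MAX_SKEW_PPM = 500


def relative_error(skew_ppm=NTP_MAX_SKEW_PPM):
    """Worst-case relative error of an elapsed time under the given skew."""
    return skew_ppm / 1_000_000


def worst_case_error(elapsed_seconds, skew_ppm=NTP_MAX_SKEW_PPM):
    """Upper bound on the absolute error of an elapsed-time measurement."""
    return elapsed_seconds * relative_error(skew_ppm)


def trustworthy_digits(skew_ppm=NTP_MAX_SKEW_PPM):
    """Number of significant decimal digits unaffected by the skew."""
    return math.floor(-math.log10(relative_error(skew_ppm)))
```

Under these assumptions a 1 second interval can be off by up to 0.5 ms, which leaves about three significant digits of the measured duration reliable.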
Chris Freehill 74eb2440c3 Temporarily disable some failing tests
Change-Id: Iee713bb963db812c36ce2568aee2a4f8409c52e5


[ROCm/ROCR-Runtime commit: 259a1bac18]
2019-06-14 08:36:11 -05:00
Sean Keely ba3ec88220 Fix description of HSA_AMD_MEMORY_POOL_INFO_ACCESSIBLE_BY_ALL.
Description was inconsistent with itself and code.  Existing behavior
returns HSA_AMD_MEMORY_POOL_INFO_ACCESSIBLE_BY_ALL == true for system
memory pools only and system memory pools do require hsa_amd_agents_allow_access.

Change-Id: I64b287bff9fdb21688aa169296e410edf1b209b5


[ROCm/ROCR-Runtime commit: bbb90bdfc9]
2019-06-11 01:45:22 -04:00
Evgeny e07cc81005 aqlprofile API: sdma blocks
Change-Id: I619af8adc17706f808644180cdd5a5c785e052ec


[ROCm/ROCR-Runtime commit: a06d96cef8]
2019-06-05 18:54:08 -05:00
Evgeny f3b7848904 adding new trace API
Change-Id: I6c83b5789f5a6cdbb574d041c40d5a47229c7f1a


[ROCm/ROCR-Runtime commit: 1be9298f72]
2019-06-01 14:33:59 -04:00
Matt Arsenault 1379fea626 Don't check VERSION_BUILD is defined
Check if it is true or not. The string() call would define this to an
empty string, which would pass. This would then leave a trailing -
in the version string, which dpkg would error on during package
installation.

Change-Id: Ifb5fc15f5dde506e96bff7881a5d3f22d983406e


[ROCm/ROCR-Runtime commit: 0016c6ce5b]
2019-05-29 11:09:31 -04:00
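The VERSION_BUILD commit above fixes a build-script check, but the difference between "is defined" and "is true" can be shown in any language. A Python analogue as a rough sketch; `version_string` and its parameters are hypothetical and do not mirror the actual CMake code:

```python
def version_string_buggy(base, build=None):
    """Tests only whether a build value was supplied, so "" slips through."""
    if build is not None:
        return f"{base}-{build}"
    return base


def version_string(base, build=None):
    """Tests whether the build value is non-empty before appending it."""
    if build:
        return f"{base}-{build}"
    return base
```

With an empty build value the first variant yields a string ending in `-`, which is the kind of version string the commit says dpkg rejects; the second variant returns the base version unchanged.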
Sean Keely b754622b33 Allow hsa_status_string when HSA is closed.
API is a stateless lookup of RO data and needed to interpret
hsa_init error codes.

Change-Id: If80cba2f697843d08e529da0f790acf3c37127a7


[ROCm/ROCR-Runtime commit: 22de0e7fb9]
2019-05-24 22:40:03 -04:00
Sean Keely b9c2754101 Add exception and error safety for CreateThread.
Change-Id: I82aaf64e039ca9614b4948deec1f87147f56279a


[ROCm/ROCR-Runtime commit: 9f81bdfbe1]
2019-05-24 22:39:55 -04:00
Matt Arsenault 0bf3b480ee Change include flag order
Search the local src directories first. If using a system
installed hsakmt, this would pick the installed hsa headers.

Change-Id: I9746d6e9db1749a130e4d93e024556754a537083


[ROCm/ROCR-Runtime commit: 22d29b55a4]
2019-05-22 16:43:18 -07:00
Sean Keely f336a19a0f Correct pthread join/detach handling.
Joined threads can not be joined more than once nor can they be detached.
Thread library wait and close allows multiple waits and separate close so
this fixes the pthread implementation.

Change-Id: I0019271a438f11ed4c6c11854011f5c4f6e16b65


[ROCm/ROCR-Runtime commit: a913549190]
2019-05-16 12:14:06 -05:00
Sean Keely bdda9b4f0e Correlate errors for time stamps which predate process start.
Small times may be given to time conversion if GPU clocks are used to
accumulate elapsed time.  Because HSA APIs deal in absolute time this
leads to large conversion offsets of order system uptime.  Variation
in relative clock ratio estimation may be amplified in this case,
destroying elapsed time measurements.

This patch fixes the relative clock ratio used for times which predate
the call to hsa_init.  This correlates errors in such times allowing
the elapsed time to be correctly computed.

The effective maximum system uptime before elapsed time conversion becomes
inaccurate is ~3.5 months.  GPU event timestamps are good for process uptime
of ~3.5 months.  These are limited by double's mantissa precision.

Change-Id: I48752ff354920439d91016d6f2b0c8ddfa60b445


[ROCm/ROCR-Runtime commit: 6e2a056e1b]
2019-05-14 17:35:06 -04:00
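One plausible reading of the ~3.5 month limit in the timestamp commit above: a double holds integers exactly only up to 2^53, so if timestamps are counted in nanoseconds (an assumption, the commit does not state the unit), exactness runs out after roughly that long. A minimal sketch of the arithmetic, with hypothetical names:

```python
# An IEEE 754 double has a 53-bit significand (52 stored bits plus one implicit).
DOUBLE_MANTISSA_BITS = 53
SECONDS_PER_DAY = 86_400
DAYS_PER_MONTH = 30.44  # average month length, used only for a rough figure


def exact_range_days(ticks_per_second=1_000_000_000):
    """Days until a tick counter exceeds the exactly representable integers."""
    max_exact_ticks = 2 ** DOUBLE_MANTISSA_BITS
    return max_exact_ticks / ticks_per_second / SECONDS_PER_DAY


def exact_range_months(ticks_per_second=1_000_000_000):
    """The same range expressed in average months."""
    return exact_range_days(ticks_per_second) / DAYS_PER_MONTH


def loses_precision(ticks):
    """True if converting the integer tick count to a double changes its value."""
    return int(float(ticks)) != ticks
```

With nanosecond ticks this gives a little over 104 days, or about 3.4 months, which is consistent with the figure in the commit message.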
Sean Keely ec39134408 Expose HDP flush registers.
Exposed via agent info query.  Only valid if fine grain PCIe memory is enabled.

Change-Id: Ib4770901592ec047276458926a947737f9b93bb5


[ROCm/ROCR-Runtime commit: 06376e726b]
2019-05-11 00:04:47 -04:00
Sean Keely 5b71bc65b7 Patch from github.
At the moment it is not possible to build ROCr with Clang. This is
a spurious limitation. The present PR addresses it by guarding GCC
only flags and by fixing some additional warnings that Clang triggers;
one of said warnings did outline a rather interesting issue with math
being done on void*s. - AlexVlx

Void ptr arithmetic had already been fixed in amd-master branch.

Change-Id: I5ee97e20b5c40b10dd73facecabe75f02ba46462


[ROCm/ROCR-Runtime commit: e89f9807f1]
2019-04-29 16:17:24 -04:00
Felix Kuehling d810b66917 Use non-paged memory for IPC signals
Non-paged memory can be IPC-shared even when HSA_USERPTR_FOR_PAGED_MEM
is enabled.

Change-Id: I8b1fa6d7a4a9327c78a77b3679697fbf55397093
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>


[ROCm/ROCR-Runtime commit: 0c6b9532d4]
2019-04-29 09:20:11 -04:00
Sean Keely bdf4b84f82 Don't create blits when copy profiling is enabled.
Change-Id: I879827133957ee610c3381ea30c536ec7d10ffab


[ROCm/ROCR-Runtime commit: 1251842900]
2019-04-18 20:00:02 -05:00
Jay Cornwall 2c3c92208d Detect memory event through Flags field instead of Failure
KFD no longer reports MemoryAccessFault.Failure with retry fault
implementation. ROCr ignores the memory event when Failure = 0.

Use the Flags field instead, which will be non-zero when the
event is triggered.

Change-Id: Ie90799a303b0b2f1b476b20ffafdde79ae137182


[ROCm/ROCR-Runtime commit: 56f280c8a7]
2019-04-15 19:16:07 -05:00
Ramesh Errabolu fd05ee66a7 Remove instantiation of MemoryRegion for heap type SVM surfaced by ROCt
Change-Id: Ib4ff7e7cabe9aacb811888aeb74f652dcb57f9e0


[ROCm/ROCR-Runtime commit: ba029ebe21]
2019-04-10 18:33:07 -05:00
Konstantin Zhuravlyov dde11e307d Process symbols with 0 address
Change-Id: I9ed943a8ccd3b103edd6aba8264c009d8cda29fa


[ROCm/ROCR-Runtime commit: 7001134757]
2019-03-30 02:14:43 -04:00
Sean Keely 59e91f0be8 Add hsa_amd_memory_lock_to_pool.
Makes malloc memory accessible to GPUs so that the memory has the
capabilities of the pool it is locked to.
This admits fine grained locked memory and reserves API space for any future
special CPU pools.

Change-Id: If8c3dd8582a43f19d3d36b3763c1a688cc419ef0


[ROCm/ROCR-Runtime commit: a535e18cc1]
2019-03-29 01:09:21 -05:00
Sean Keely f819304f49 Remove legacy memory fault event name.
Change-Id: I3ad240482523409e1152548009aecf127e63bbfa


[ROCm/ROCR-Runtime commit: 9f7df6d6fe]
2019-03-28 15:25:25 -05:00
Sean Keely 6121ae4f6b Fix void* arithmetic.
GCC allows arithmetic on void*, treating void as char.  Clang and
the language spec do not.

Change-Id: I939f2432f276979bb81881406e10528597ac6001


[ROCm/ROCR-Runtime commit: e5de33dd9a]
2019-03-28 12:49:19 -05:00
Sean Keely cdd2c26ac4 Disable sram-ecc reporting via ISA until HCC is fixed.
Change-Id: I0382825884b727173385f04da9f2088650c3ba1d


[ROCm/ROCR-Runtime commit: 7ea0cd688f]
2019-03-21 17:46:56 -04:00
Chris Freehill 30f23c3ff4 Re-enable RSMI call using updated API prototype
Change-Id: Ifc8fa35708fea05cbc8a9bea727a6d4c9d2ecea7


[ROCm/ROCR-Runtime commit: 68c202de1f]
2019-03-18 23:18:42 -05:00
Chris Freehill 79ad9eebf6 Temporary disable of rsmi call due to api change
Change-Id: If73f31c5fbe4bcd34f8e52a5109a6fbfff70b5e1


[ROCm/ROCR-Runtime commit: 7b3537cf44]
2019-03-16 21:05:47 -05:00
Sean Keely 906c36278a Do not strip release builds.
Customer request.

Change-Id: Id77dcdc0b6908c7a5e460edfd7d9468a1691e351


[ROCm/ROCR-Runtime commit: fd9fb77e28]
2019-03-07 14:04:14 -06:00
Sean Keely 953355e0f7 Report SRAM ECC errors through the system event handler.
Modify the system event handler to support multiple users.
Name memory fault reason codes.

Change-Id: I1b5979b36ab15637eb2be59a61e2d57e76d0a70e


[ROCm/ROCR-Runtime commit: 67376e06ab]
2019-02-27 18:08:07 -05:00
Sean Keely 8b90c223ac Loader support for SRAM ECC.
Change-Id: I0c6791c356d9186cc8dabae9fd698b1d4de19b09


[ROCm/ROCR-Runtime commit: 3c3db0243e]
2019-02-25 18:30:05 -05:00
Sean Keely f7dbaf103b Add fine grain vram pool.
Part 1 of 2.
Enables fine grain vram over PCIe based on env flag.
Part 2 will extend to XGMI.

Change-Id: I8ad506e004b398d56d462b0200274eae2293a461


[ROCm/ROCR-Runtime commit: c56d86100b]
2019-02-21 13:08:11 -05:00
Sean Keely c66675d192 Suppress exception reporting for well defined invalid signal handles.
hsa_exceptions with empty what() strings will not report in debug builds.

Change-Id: I0d424d3b1d3044808ece1720a460a57d68bf878e


[ROCm/ROCR-Runtime commit: 344d964f9f]
2019-02-15 19:35:57 -05:00
Sean Keely 3183fea597 Stop using ROCm release tags for library version numbers.
Version is now a fixed string that matches previous internal builds.
This also matches released DEB/RPM builds (but not github versions).

Change-Id: Id4819b9de8c855250aadf1a1cebb187b5c031721


[ROCm/ROCR-Runtime commit: 400304aa10]
2019-02-06 19:22:53 -05:00
Ramesh Errabolu 30363b9dbb Allows users, via env ROCR_VISIBLE_DEVICES, to surface a subset of GPU devices
Change-Id: I5662639d5d70f054831969669f9d30dec356dd5a

Update per review comments

Change-Id: I18c7d7cb00b261493b61c2cf5454d486166f40d8


[ROCm/ROCR-Runtime commit: 3fbf03af76]
2019-02-06 02:02:29 -06:00
Chris Freehill 4661538a0a Fix boolean semantic error
Change-Id: Ic927370d5874af3f33105fca6ee0b581ebc6fa08


[ROCm/ROCR-Runtime commit: 014945310a]
2019-01-31 14:03:48 -06:00
Chris Freehill 8f1a9494cd Fix async memory test; temporarily disable NUMA memory test
Change-Id: I1c0618f5dba513c1cf8fafb5fc64e5c811df8454


[ROCm/ROCR-Runtime commit: 626f13a88b]
2019-01-31 09:14:54 -05:00
Sean Keely f39ed0dc23 Unify APU and dGPU initial queue scratch allocation.
Both support dynamic scratch allocation so there is no reason
to preemptively allocate on APUs.

Change-Id: I22eaec01a83a091ee9dc1f594a1a9106e8dd81fc


[ROCm/ROCR-Runtime commit: 65d39cc476]
2019-01-25 02:11:39 -05:00
Jay Cornwall 268ae92794 Remove legacy microcode version check in GpuAgent::InvalidateCodeCaches
Fixes instruction cache invalidation when using microcode branches.

Change-Id: I932676e683983145f5c807204e592fb5e530c8af


[ROCm/ROCR-Runtime commit: 079eadd71b]
2019-01-22 16:39:52 -06:00
Konstantin Zhuravlyov a506e18fd2 Loader: update symbol processing for v2+
- Skip symbols that are STB_LOCAL and not STT_AMDGPU_HSA_KERNEL

Change-Id: I68567f58de9bf3f07dbd8020ef63f47667c86367


[ROCm/ROCR-Runtime commit: 8bee6e4976]
2019-01-18 15:42:28 -05:00
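The skip rule in the loader commit above can be expressed as a predicate over an ELF symbol's `st_info` byte, whose high four bits hold the binding and low four bits the type. A minimal sketch, not the loader's actual code; `should_skip` is a hypothetical name and the value 10 for STT_AMDGPU_HSA_KERNEL is an assumption taken from the OS-specific symbol type range:

```python
STB_LOCAL = 0               # standard ELF local binding
STT_AMDGPU_HSA_KERNEL = 10  # assumed value, in the OS-specific type range


def elf_st_bind(st_info):
    """Binding is stored in the high four bits of st_info."""
    return st_info >> 4


def elf_st_type(st_info):
    """Type is stored in the low four bits of st_info."""
    return st_info & 0xF


def should_skip(st_info):
    """Skip symbols that are local and are not HSA kernel symbols."""
    return (elf_st_bind(st_info) == STB_LOCAL
            and elf_st_type(st_info) != STT_AMDGPU_HSA_KERNEL)
```

For example, a local function symbol (`st_info = 0x02`) would be skipped, while a local kernel symbol (`0x0A`) or any global symbol (`0x12`) would still be processed.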
Konstantin Zhuravlyov 564ac4b348 Loader updates for code object v3
- Fix loading in some cases
  - Fix symbol kind

Change-Id: I721b4a35972b6d2a6d0ac733ab770b096cc74e17


[ROCm/ROCR-Runtime commit: c1ad82a6b7]
2019-01-18 15:41:01 -05:00
Chris Freehill c848f8a365 Decrease test size for emulation runs
Decrease number of iterations and array sizes in some cases.

Change-Id: I1a0a43faa907b28662ff3a44c172950ed7b1500e


[ROCm/ROCR-Runtime commit: 6bca866e6c]
2019-01-14 21:23:04 -05:00
Ramesh Errabolu efc2ac9024 Initialize queue buffer with Invalid Pkt Headers
Change-Id: I4166f1359746ee6829b730bac2db358af72ab16e


[ROCm/ROCR-Runtime commit: 28c3f9a269]
2018-11-21 19:09:10 -05:00
Mark Searles 508124a012 Force object code v2 until v3 is supported
Change-Id: I4c2a64bf9bd515686d1f1d90aece2a9ac40e5685


[ROCm/ROCR-Runtime commit: 8ea836017a]
2018-11-21 10:06:08 -08:00
Sean Keely d79cd9abf3 Check max wave scratch limits.
HW has limited bits for wave scratch base address stride.  Enforcement
prevents programs with larger than supported scratch allocations from
running and clobbering neighboring scratch space.

Change-Id: I574da888e9d1d5e290a9c0025ba13b5ef9f1e5c0


[ROCm/ROCR-Runtime commit: 8e4177382a]
2018-11-16 20:59:20 -05:00