Commit Graph

2930 Commits

Author SHA1 Message Date
Amber Lin 2dec7b1d74 Search VM object by range
Add vm_find_object_by_userptr_range so that QueryPointerInfo can also
find the object when the pointer is not the starting address but lies
inside the memory range. Also rename the vm_find_object_xxx functions to
_by_address and _by_address_range for consistency.

Change-Id: I5c2b3a05b41493e32b7fd9154665bf078b043606


[ROCm/ROCR-Runtime commit: 4911c91389]
2016-09-13 12:44:29 -04:00
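The range lookup described in commit 2dec7b1d74 can be illustrated with a minimal sketch. It assumes a flat array of tracked allocations and a linear scan; the struct and function names below are hypothetical and are not the thunk's actual types or its vm_find_object_* implementation.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record of one tracked allocation; not the thunk's real type. */
struct vm_object_sketch {
    uintptr_t start; /* first byte of the allocation */
    size_t size;     /* length in bytes */
};

/* Return the object whose [start, start + size) range contains addr,
 * or NULL if no tracked object covers it. Linear scan for clarity. */
const struct vm_object_sketch *
find_object_by_address_range(const struct vm_object_sketch *objs, size_t count,
                             uintptr_t addr)
{
    for (size_t i = 0; i < count; i++) {
        /* Comparing addr - start against size avoids overflow in start + size. */
        if (addr >= objs[i].start && addr - objs[i].start < objs[i].size)
            return &objs[i];
    }
    return NULL;
}
```

A lookup by exact start address is the special case where addr equals start; the range form also accepts any interior pointer.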
Christophe Paquot 5544a9e5e5 Add tiling code
Introduce a tiling format for images; still using LINEAR for now.
Use the new KFD/Thunk API hsaKmtGetTileConfig for the address library.

Change-Id: Ic0677429dd320eef09ab62dddaf9b2dd94c4f904


[ROCm/ROCR-Runtime commit: 538736a660]
2016-09-13 11:42:10 -04:00
Amber Lin 4b17993791 Pointer attributes on APU
Add a CPUVM aperture to keep track of memory allocations that are not
known to the GPU driver. Together with GPUVM, this patch adds pointer
attributes support on APUs.

Change-Id: If13f9cf01ff8b9f709b99b66661e7505246adf4c


[ROCm/ROCR-Runtime commit: 19f2676ea7]
2016-09-12 11:32:26 -04:00
Ramesh Errabolu 85aa61f011 Print Debug Mesg if private segment memory request is illegal
Change-Id: I46351651b6b2bf14e26645440a4321bc941900b2


[ROCm/ROCR-Runtime commit: c54304fe38]
2016-09-08 11:16:09 -04:00
Amber Lin 8a1cef5fbb Add pointer attributes API
Add two pointer attributes APIs:
hsaKmtQueryPointerInfo - allow the user to query the memory information
    using a pointer. This pointer can point to any address inside the
    range known to HSA.
hsaKmtSetMemoryUserData - allow the user to attach data to a pointer to
    add memory tracking information. This pointer must match the start
    address of a memory allocation or registration.
TODO: This patch implements support on dGPU only. APU support still
needs to be added.

Change-Id: I4711809274248434901f0794f50ebfa13a7371a8


[ROCm/ROCR-Runtime commit: 51e4d27c37]
2016-09-07 17:24:46 -04:00
Sean Keely 3dd28a4064 Add missing export of hsa_signal_group_wait_any_relaxed.
Change-Id: Ia043c72234534c1ac7c0a0518b64e244fc116157


[ROCm/ROCR-Runtime commit: 2008af1899]
2016-09-07 15:03:33 -05:00
Yong Zhao cba37c251c Implement hsaKmtGetTileConfig in thunk
Change-Id: Iba8d8efa46e3c268a03442d3db568e1b19230e94


[ROCm/ROCR-Runtime commit: 8351b3d2e8]
2016-09-06 16:24:29 -04:00
Ramesh Errabolu 87018b14b4 Read name of Hsa Agent from a new field of Node Properties
Change-Id: I3abca521a904c40cb84d90800a16363b1ad64768


[ROCm/ROCR-Runtime commit: 5ab5396529]
2016-09-05 14:33:08 -04:00
Sean Keely 1306432ae1 Switch atomic_helpers.h from C11 atomics to GCC __atomic builtins.
C11 atomics are not statically guaranteed to be lock free and so
may not be atomic with respect to atomic operations originating
outside the standard library, such as platform atomics.

C11 macros to statically discover always lock free operations
(ATOMIC_*_LOCK_FREE) do not cover uint64_t in GCC and
std::atomic<uint64_t> is not a type alias of any covered type.

All use of __atomic by atomic_helpers.h is statically checked to be
always lock free.

GCC builtin fencing does not appear to be strong enough for WC memory.
Added an option (enabled) to enforce consistency for WC memory on x64.

__sync builtins were not used as they were declared legacy by GCC.

Added a strongly conservative option (ALWAYS_CONSERVATIVE) to enable
use of full memory fences in place of partial fences and compiler
driven processor specific optimization.

Change-Id: Id7aaaca626144070f58759f6a348cbee4612bbc0


[ROCm/ROCR-Runtime commit: 1bc15bbf79]
2016-09-03 06:22:42 -04:00
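The static lock-free check mentioned in commit 1306432ae1 can be sketched as follows. This is an illustration only, assuming GCC or Clang on a 64-bit target; the helper names are hypothetical and are not taken from atomic_helpers.h. `__atomic_always_lock_free`, `__atomic_load_n`, and `__atomic_store_n` are GCC builtins.

```c
#include <stdint.h>

/* Fail the build if 64-bit atomics are not always lock free on this target.
 * The second argument 0 means "assume typical alignment for the type". */
_Static_assert(__atomic_always_lock_free(sizeof(uint64_t), 0),
               "uint64_t atomics must be lock free");

/* Hypothetical helper: acquire load through the GCC __atomic builtin. */
uint64_t load_acquire_u64(const uint64_t *ptr)
{
    return __atomic_load_n(ptr, __ATOMIC_ACQUIRE);
}

/* Hypothetical helper: release store through the GCC __atomic builtin. */
void store_release_u64(uint64_t *ptr, uint64_t value)
{
    __atomic_store_n(ptr, value, __ATOMIC_RELEASE);
}
```

Because the builtins operate on plain integer objects, the same memory can also be accessed by atomics that originate outside the standard library, which is the concern the commit message raises.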
Lan Xiao a015256f33 libhsakmt: Marketing Name and AMDName support for APUs
For APUs, use /proc/cpuinfo to get Marketing name.



Change-Id: I4a17516d26a092683f36631032be00ad44f7e7fe
Signed-off-by: Lan Xiao <Lan.Xiao@amd.com>


[ROCm/ROCR-Runtime commit: df593aa076]
2016-09-02 15:16:18 -04:00
Konstantin Zhuravlyov 2d64d36223 Use memcpy instead of hsa_memory_copy
Change hsa_code_object_serialize and hsa_code_object_deserialize to use
memcpy instead of hsa_memory_copy, since it is a system-to-system copy.

Change-Id: I329e270ae4e2fc25e177dc8080d93662ffb261ab


[ROCm/ROCR-Runtime commit: 73ed2116d5]
2016-09-01 01:56:07 -04:00
Konstantin Zhuravlyov a6d80953ba Purge unused variables (to silence warnings)
Change-Id: Ifc11c4bc4725f4c70d6be75208b6906d163754b4


[ROCm/ROCR-Runtime commit: 518da7d4e7]
2016-08-31 14:54:20 -04:00
Andres Rodriguez 996bf3f9ca LICENSE: add X11/MIT license file
Change-Id: I2e95af843046896708bb7a116f7b03a0fa30a255


[ROCm/ROCR-Runtime commit: b1d2867b60]
2016-08-25 16:27:46 -04:00
Andres Rodriguez 45e0ca4e91 Makefile: remove 32bit thunk compilation by default
Compiling in 32-bit mode is broken, and we have no intention of
restoring compatibility with 32-bit apps.

Change-Id: I5524b5b63fe62e6026aa04d84c4510e290a86106


[ROCm/ROCR-Runtime commit: e0c77a38cb]
2016-08-25 16:27:19 -04:00
Jay Cornwall d5ecfae62f Refactor: Consolidate calls to hsaKmtAllocMemory
Route all device-visible system memory allocations through system_allocator.

Change-Id: I5e90a1bf491e432678a6d8ab1f9f3770734cbda1


[ROCm/ROCR-Runtime commit: 74f5aca93d]
2016-08-24 23:57:19 -04:00
Lan Xiao 742354161b libhsakmt: Add MarketingName and AMDName for all nodes - CPU & GPU
The HSA thunk API currently reports the engineering name as
MarketingName and returns NULL when queried for AMDName.

-Change current name reporting from MarketingName to AMDName.
-Use libpci to get MarketingName



Change-Id: I819a6de7b067a2e724a6695e7d800274b83a71f8
Signed-off-by: Lan Xiao <Lan.Xiao@amd.com>


[ROCm/ROCR-Runtime commit: 9cbbf30be7]
2016-08-23 10:49:27 -04:00
Sean Keely 7e2179da7b Update clang-format file to clang-format v3.8.
Format HSA v1.1 core updates.

Change-Id: I540b5c0e5b3ec7522b09c2e070167812b3f17769


[ROCm/ROCR-Runtime commit: 54f1311e01]
2016-08-23 05:50:28 -05:00
Kent Russell 2d604a8498 queues.c: Enforce CUMaskCount being a multiple of 32
The thunk spec requires that CUMaskCount be divisible by 32. Check this
and return INVALID_PARAMETER if it is not.

Change-Id: I4e0c8502d996d3da31224b817a5d4ff2c6054e13


[ROCm/ROCR-Runtime commit: 70b1b5b17e]
2016-08-23 06:16:39 -04:00
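The check described in commit 2d604a8498 amounts to a divisibility test. A minimal sketch, assuming hypothetical status codes and a hypothetical function name (the thunk defines its own status values and performs this check inside queues.c):

```c
#include <stdint.h>

/* Hypothetical status codes for this sketch; the thunk defines its own. */
enum sketch_status { SKETCH_SUCCESS = 0, SKETCH_INVALID_PARAMETER = 1 };

/* Reject any CU mask count that is not a multiple of 32.
 * This sketch checks divisibility only; a real implementation may
 * apply further validation, for example on a count of zero. */
enum sketch_status check_cu_mask_count(uint32_t cu_mask_count)
{
    if (cu_mask_count % 32 != 0)
        return SKETCH_INVALID_PARAMETER;
    return SKETCH_SUCCESS;
}
```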
Jay Cornwall 53cd59e689 Propagate errors from hsaKmtOpenKFD back to hsa_init
Errors are otherwise silently ignored and hsa_init completes successfully.

Change-Id: Ib1b7dbd7a65d40f869cdbb2792fa97132873d3d7


[ROCm/ROCR-Runtime commit: 0c9f96cfa4]
2016-08-22 17:28:48 -05:00
Ramesh Errabolu 99998f1c87 Fix Image Create Func Decl in Hsa Api Table
Change-Id: I3862b3c78231fe24b6361167a78c6a8c7ad6ce0b


[ROCm/ROCR-Runtime commit: 31f64cdaab]
2016-08-22 15:43:48 -04:00
Konstantin Zhuravlyov e10cd184ef Update code object/isa/loader to hsa v1.1
- Includes Sean's latest changes
- Cleanups/improvements
- Fixes for a few bugs that crept in from previous releases

Change-Id: I839dc4895bf13ebd0afc8843424387a9fef667b0


[ROCm/ROCR-Runtime commit: c2c993e0d8]
2016-08-22 15:03:23 -04:00
Jay Cornwall a9dff11965 Temporary fix for gfx801 hang with microcode #685
The PM4 IB must have executable permission.

A second part of this fix concerns robustness when this is not the case.
This remains under investigation.

This fix will shortly be cleaned up in a refactoring pass to consolidate
calls to hsaKmtAllocMemory.

Change-Id: I326fe01949a77669e0b07c3cadc9fd44b8065055


[ROCm/ROCR-Runtime commit: f71de56c79]
2016-08-19 18:05:40 -05:00
Yong Zhao f3e009e431 Fix a bug when mmap fails
EventId is needed when calling hsaKmtDestroyEvent() after mmap fails,
so assign it before the mmap call.

Change-Id: I5f4288b953611799a02b0e988d6b2e48104466a0


[ROCm/ROCR-Runtime commit: 9c9bfa30c0]
2016-08-18 14:30:03 -04:00
James Edwards 21fe24647f Update Brig.h file.
Change-Id: Id74e0c6c0c1863c15bc9a47501dd7d156a9cfc99


[ROCm/ROCR-Runtime commit: 26c704a4ae]
2016-08-18 11:35:55 -05:00
Amber Lin 9bc0411400 Add performance counters for gfx803
Counter IDs in SQ_PERFCOUNTER0_SELECT are identical on gfx803 10 and
gfx803 11.

Change-Id: I5cfefd44b52989efd1d89311cf8c70c84ea2b230


[ROCm/ROCR-Runtime commit: 0b5c65a903]
2016-08-08 18:10:51 -04:00
Amber Lin 194f2083d2 Add gfx803 support
Add gfx803 and gfx80311 device IDs to the supported device list.

Change-Id: I16220fd811db102c02e5e0c5b82e40ec299877af


[ROCm/ROCR-Runtime commit: 876384305b]
2016-08-08 11:30:57 -04:00
Konstantin Zhuravlyov 700347b6bf [NFC] Cosmetic improvements:
- Doxygenify comments
- Match order of implementation with order of declaration

Change-Id: I3c7e486c4dd3616f4b10b2f3e69532a4b5fb9e8e


[ROCm/ROCR-Runtime commit: 01dc3a8ff3]
2016-08-06 14:41:08 -04:00
Jay Cornwall eb8ab3730b Invalidate caches after allocating a code object
Due to a misinterpretation of the HSA specification the microcode has,
until now, been responsible for ensuring a coherent view of the
amd_kernel_code_t object when acquire_fence_scope is set to agent or system.
To correct this the runtime must instead assume this responsibility.

Introduce GpuAgentInt::InvalidateCodeCaches to perform this operation
on-demand. Invoke this after code object allocation. Extend the Queue
implementations to support PM4 command submission, through which the
PM4 command ACQUIRE_MEM can be submitted to perform cache invalidation.
Submit through a runtime-managed queue shared with the blit implementation.

This change depends on microcode support and this is checked against the
running version. Older microcode builds will perform cache invalidation
themselves, so it is acceptable for this change to do nothing in that case.

Change-Id: I268dd2b83af3decdd9ad07430a81df8a2ecb6bd2


[ROCm/ROCR-Runtime commit: f76577ae43]
2016-08-02 13:30:55 -04:00
Jay Cornwall a25aee96ca Add -O0 to CMake debug build configuration
The default optimization level may interfere with debugging.

Change-Id: Ie694ef35b05e4cf2bf4f68bc346e8d60a2d27bc8


[ROCm/ROCR-Runtime commit: d2a4629c55]
2016-07-31 19:28:13 -04:00
Jay Cornwall 70e07dcdfd Enable VM fault message by default on Linux
This option was disabled by default to address issues writing to stderr
in Windows applications. The lack of an error message for memory access
faults is confusing to users, however.

Enable the error message by default on Linux only.

Change-Id: I1f44ba42362f8874abdc7c8e63ddd54a855b5394


[ROCm/ROCR-Runtime commit: acc5f15e4c]
2016-07-30 10:10:14 -04:00
Jay Cornwall 4723abd67d Separate blit compute interface from queue creation
The runtime needs a queue on which to submit cache management commands.
Device-to-device blit copy already creates a queue unconditionally.
We can share this queue for both purposes.

This change restructures the BlitKernel interface to accept, rather than
create, a queue. GpuAgent creates queues as needed for both cache
management and blit compute.

Fix queue full detection in AcquireWriteIndex (<= vs <).

Change-Id: I61d0c6b9d04f2dba74872f0676ad791435778ba4


[ROCm/ROCR-Runtime commit: f7ab361347]
2016-07-29 09:20:25 -04:00
Amber Lin fd3b0ef0e5 Shorten the device list in PerfCounter
get_block_properties uses the complete DID to identify the GPU. This list
grows too long as more devices are added. Reading the 12 most
significant digits is good enough to identify the GPU.


Change-Id: Ieebb05402bbe08af12eb7289dfeb5bbf1f515b0f


[ROCm/ROCR-Runtime commit: 6c4d19a9d2]
2016-07-27 17:21:31 -04:00
Ramesh Errabolu b81d34bcdf Refactor Trap Handler Code
Change-Id: Iefdc2706bace3e7d907e8e59b9f554affdd0f613


[ROCm/ROCR-Runtime commit: 570301ffd0]
2016-07-27 16:11:53 -04:00
Ramesh Errabolu 4d4616c7dd Fix computation of max_wave_id property
Change-Id: I2ab145d301c92f39bbdb911e48aecccbd64ac82b


[ROCm/ROCR-Runtime commit: da52417c14]
2016-07-20 11:16:17 -04:00
Konstantin Zhuravlyov 83d9c273a5 Reorder loader extension functions to maintain backwards compatibility
Change-Id: I93f0899cdece4bab167290085da67d1a1770eb9b


[ROCm/ROCR-Runtime commit: 49a6a39724]
2016-07-20 08:58:11 -04:00
Jay Cornwall 8fe807f2a9 Replace SP3 dynamic assembly with pre-assembled binaries
This is the first part of transitioning to the LLVM-based assembler.
SP3 is deprecated and all references to the library are removed.
Pending LLVM support, relevant shaders have been precompiled.

Change-Id: I7d44cef5ded1836c4a74b77881af5bea8803d2c1


[ROCm/ROCR-Runtime commit: 712ea75377]
2016-07-16 16:38:32 -05:00
James Edwards da234099e1 Add the hsa_ven_amd_loader.h to the hsa-rocr-dev package and remove hsa_ven_amd_loaded_code_object.h
Change-Id: I6f55e7a98b1f49306d41f13e38190b20d326d5c2


[ROCm/ROCR-Runtime commit: aba3046bb6]
2016-07-15 15:20:24 -04:00
James Edwards a4ab641b90 Add libhsa-ext-image64 library to the rocr extensions packages
Change-Id: Ic3e4570918559f7bb413b8c2e37822b317d92f1f


[ROCm/ROCR-Runtime commit: 0543757148]
2016-07-15 12:55:31 -04:00
Jay Cornwall a1f109afe7 Recognize all CPU nodes in hsa_signal_create consumer list
On multi-node systems only the first CPU node was recognized in the
signal consumer list, causing fallback to non-interrupt signals.

Change-Id: I9bd0706bafbe046be9d7f210d05fa4cf1fcd16fa


[ROCm/ROCR-Runtime commit: b44417043b]
2016-07-09 18:40:39 -05:00
Konstantin Zhuravlyov a03e79c3c0 Remove loaded code object api
Change-Id: If20a6a3d15e25658b9e0aaf9ef8f3f33b2e0dd5c


[ROCm/ROCR-Runtime commit: 93ac77979c]
2016-07-07 13:09:30 -04:00
Ramesh Errabolu d260c22467 Export Amd Extension APIs including support for Version Control
Change-Id: I8c03cbd4049e8115ae00d51f193b9c31ac941f21


[ROCm/ROCR-Runtime commit: 95dc97da7b]
2016-07-06 13:50:18 -05:00
Fan Cao 0bbc303295 Query device name from KFD
Before this change, the runtime hard-coded the device name. This commit
queries the name from KFD instead. codecvt will be used for the UTF-16
to UTF-8 conversion once GCC supports it.



Change-Id: I7c4dc32ef857296296c810d083888c5ba1c808b6


[ROCm/ROCR-Runtime commit: 88708b8e5a]
2016-07-06 09:49:17 -04:00
James Edwards 5faed35221 Updates to finalizer CMakeLists.txt file.
Change-Id: I30ab1969ce76a4c1060257e0ebe62763378dc65c


[ROCm/ROCR-Runtime commit: d0d13c34fc]
2016-07-05 16:23:09 -05:00
James Edwards 1cd10b316e Add the finalizer makefile to the open source directory.
Change-Id: I381f27e774573085c81d0dc4e1cbcb11768b3780


[ROCm/ROCR-Runtime commit: 029fe2403e]
2016-07-01 17:27:49 -04:00
Konstantin Zhuravlyov 068dfb7e2f Update p4 makefiles to build new load map api
Change-Id: Ic77560d050bed2a2a8e9b83feaa000da640e437a


[ROCm/ROCR-Runtime commit: 5129ae1d61]
2016-06-29 18:59:39 -04:00
Konstantin Zhuravlyov b43cdae36b Implement new load map API.
Change-Id: I5f148fe66f899b2fa6a2e75430afa988f38db58d

[ROCm/ROCR-Runtime commit: 0e4cab3001]
2016-06-28 11:32:19 -04:00
Christophe Paquot 25ff46fbb8 Handle alternate_va==0
Have amd::MemoryRegion::Lock use the host_ptr instead of asserting when
the alternate_va is null. When the src/dst memory pointer is allocated
via KFD, the host_ptr is already a GPUVA.

Change-Id: If44368cc2854d4c0c477ae56e4eeabc37e54c1a5


[ROCm/ROCR-Runtime commit: 4e93bdc99c]
2016-06-23 14:51:25 -07:00
Jay Cornwall 125816f08f Share blit queue for device-to-device and device-to-host copies
Reduces the number of blit queues from 3 to 2, when SDMA is unavailable,
improving the availability of queue slots for applications.

Change-Id: I8860d2b6c6d6527494b9fc35d164099e1313886a


[ROCm/ROCR-Runtime commit: 38fddca9fe]
2016-06-21 16:59:36 -05:00
Christophe Paquot e0df393c14 Updated blit kernel code to use device accessible memory
for the kernel args.
Most image-related HSA conformance tests now pass, and many more
ocltst/oclperf image tests pass too.

Change-Id: I3f28d4ee7369f0ebc7af5128d3ffe1390957db98

[ROCm/ROCR-Runtime commit: c64f646711]
2016-06-14 17:03:49 -04:00
Besar Wicaksono 2e60df69e4 Add interrupt signal support to SDMA
Change-Id: Ie2b192f3351a0c3bf1eb36ba9704825b18e6059b


[ROCm/ROCR-Runtime commit: aee8ab6ef0]
2016-06-14 14:26:25 -04:00