Commit graph

65 Commits

Author SHA1 Message Date
Ramesh Errabolu 31f64cdaab Fix Image Create Func Decl in Hsa Api Table
Change-Id: I3862b3c78231fe24b6361167a78c6a8c7ad6ce0b
2016-08-22 15:43:48 -04:00
Konstantin Zhuravlyov c2c993e0d8 Update code object/isa/loader to hsa v1.1
- Includes Sean's latest changes
- Cleanups/improvements
- Fixes for a few bugs that crept in from previous releases

Change-Id: I839dc4895bf13ebd0afc8843424387a9fef667b0
2016-08-22 15:03:23 -04:00
Jay Cornwall f71de56c79 Temporary fix for gfx801 hang with microcode #685
The PM4 IB must have executable permission.

A second part of this fix concerns robustness when this is not the case.
This remains under investigation.

This fix will shortly be cleaned up in a refactoring pass to consolidate
calls to hsaKmtAllocMemory.

Change-Id: I326fe01949a77669e0b07c3cadc9fd44b8065055
2016-08-19 18:05:40 -05:00
James Edwards 26c704a4ae Update Brig.h file.
Change-Id: Id74e0c6c0c1863c15bc9a47501dd7d156a9cfc99
2016-08-18 11:35:55 -05:00
Konstantin Zhuravlyov 01dc3a8ff3 [NFC] Cosmetic improvements:
- Doxygenify comments
- Match order of implementation with order of declaration

Change-Id: I3c7e486c4dd3616f4b10b2f3e69532a4b5fb9e8e
2016-08-06 14:41:08 -04:00
Jay Cornwall f76577ae43 Invalidate caches after allocating a code object
Due to a misinterpretation of the HSA specification the microcode has,
until now, been responsible for ensuring a coherent view of the
amd_kernel_code_t object when acquire_fence_scope is set to agent or system.
To correct this the runtime must instead assume this responsibility.

Introduce GpuAgentInt::InvalidateCodeCaches to perform this operation
on-demand. Invoke this after code object allocation. Extend the Queue
implementations to support PM4 command submission, through which the
PM4 command ACQUIRE_MEM can be submitted to perform cache invalidation.
Submit through a runtime-managed queue shared with the blit implementation.

This change depends on microcode support and this is checked against the
running version. Older microcode builds will perform cache invalidation
themselves, so it is acceptable for this change to do nothing in that case.

Change-Id: I268dd2b83af3decdd9ad07430a81df8a2ecb6bd2
2016-08-02 13:30:55 -04:00
Jay Cornwall d2a4629c55 Add -O0 to CMake debug build configuration
The default optimization level may interfere with debugging.

Change-Id: Ie694ef35b05e4cf2bf4f68bc346e8d60a2d27bc8
2016-07-31 19:28:13 -04:00
Jay Cornwall acc5f15e4c Enable VM fault message by default on Linux
This option was disabled by default to address issues writing to stderr
in Windows applications. The lack of an error message for memory access
faults is confusing to users, however.

Enable the error message by default on Linux only.

Change-Id: I1f44ba42362f8874abdc7c8e63ddd54a855b5394
2016-07-30 10:10:14 -04:00
Jay Cornwall f7ab361347 Separate blit compute interface from queue creation
The runtime needs a queue on which to submit cache management commands.
Device-to-device blit copy already creates a queue unconditionally.
We can share this queue for both purposes.

This change restructures the BlitKernel interface to accept, rather than
create, a queue. GpuAgent creates queues as needed for both cache
management and blit compute.

Fix queue full detection in AcquireWriteIndex (<= vs <).

Change-Id: I61d0c6b9d04f2dba74872f0676ad791435778ba4
2016-07-29 09:20:25 -04:00
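The queue-full fix in f7ab361347 is an off-by-one in a boundary comparison. The commit does not show the code or say which direction the error went, so the following is only a minimal sketch of the general pattern, assuming a ring buffer with monotonically increasing read and write indices. The `RingQueue` class and its method names are hypothetical and are not the runtime's actual interface.

```python
class RingQueue:
    """Toy ring buffer with monotonically increasing read/write indices."""

    def __init__(self, size):
        self.size = size
        self.read_index = 0   # advanced by the consumer
        self.write_index = 0  # advanced by producers

    def acquire_write_index(self, num_slots):
        """Reserve num_slots slots; return the first index, or None if full."""
        in_flight = self.write_index - self.read_index
        # Inclusive bound: a request that fills the queue exactly still fits.
        # Writing `<` here would reject that request and waste one slot.
        if in_flight + num_slots <= self.size:
            first = self.write_index
            self.write_index += num_slots
            return first
        return None

    def release(self, num_slots):
        """Consumer marks num_slots slots as processed."""
        assert self.read_index + num_slots <= self.write_index
        self.read_index += num_slots


queue = RingQueue(4)
first = queue.acquire_write_index(4)    # fills the queue exactly
rejected = queue.acquire_write_index(1) # no room left
```

With the comparison in this form, a reservation succeeds up to and including the point where the queue becomes exactly full, and fails for anything beyond that.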
Ramesh Errabolu 570301ffd0 Refactor Trap Handler Code
Change-Id: Iefdc2706bace3e7d907e8e59b9f554affdd0f613
2016-07-27 16:11:53 -04:00
Ramesh Errabolu da52417c14 Fix computation of max_wave_id property
Change-Id: I2ab145d301c92f39bbdb911e48aecccbd64ac82b
2016-07-20 11:16:17 -04:00
Konstantin Zhuravlyov 49a6a39724 Reorder loader extension functions to maintain backwards compatibility
Change-Id: I93f0899cdece4bab167290085da67d1a1770eb9b
2016-07-20 08:58:11 -04:00
Jay Cornwall 712ea75377 Replace SP3 dynamic assembly with pre-assembled binaries
This is the first part of transitioning to the LLVM-based assembler.
SP3 is deprecated and all references to the library are removed.
Pending LLVM support, relevant shaders have been precompiled.

Change-Id: I7d44cef5ded1836c4a74b77881af5bea8803d2c1
2016-07-16 16:38:32 -05:00
James Edwards aba3046bb6 Add the hsa_ven_amd_loader.h to the hsa-rocr-dev package and remove hsa_ven_amd_loaded_code_object.h
Change-Id: I6f55e7a98b1f49306d41f13e38190b20d326d5c2
2016-07-15 15:20:24 -04:00
James Edwards 0543757148 Add libhsa-ext-image64 library to the rocr extensions packages
Change-Id: Ic3e4570918559f7bb413b8c2e37822b317d92f1f
2016-07-15 12:55:31 -04:00
Jay Cornwall b44417043b Recognize all CPU nodes in hsa_signal_create consumer list
On multi-node systems only the first CPU node was recognized in the
signal consumer list, causing fallback to non-interrupt signals.

Change-Id: I9bd0706bafbe046be9d7f210d05fa4cf1fcd16fa
2016-07-09 18:40:39 -05:00
Konstantin Zhuravlyov 93ac77979c Remove loaded code object api
Change-Id: If20a6a3d15e25658b9e0aaf9ef8f3f33b2e0dd5c
2016-07-07 13:09:30 -04:00
Ramesh Errabolu 95dc97da7b Export Amd Extension APIs including support for Version Control
Change-Id: I8c03cbd4049e8115ae00d51f193b9c31ac941f21
2016-07-06 13:50:18 -05:00
Fan Cao 88708b8e5a Query device name from KFD
Before this change, the runtime hard-coded the device name. This commit
queries the name from KFD instead. The UTF-16 to UTF-8 conversion will
use codecvt once GCC supports it.

Change-Id: I7c4dc32ef857296296c810d083888c5ba1c808b6
2016-07-06 09:49:17 -04:00
James Edwards d0d13c34fc Updates to finalizer CMakeLists.txt file.
Change-Id: I30ab1969ce76a4c1060257e0ebe62763378dc65c
2016-07-05 16:23:09 -05:00
James Edwards 029fe2403e Add the finalizer makefile to the open source directory.
Change-Id: I381f27e774573085c81d0dc4e1cbcb11768b3780
2016-07-01 17:27:49 -04:00
Konstantin Zhuravlyov 5129ae1d61 Update p4 makefiles to build new load map api
Change-Id: Ic77560d050bed2a2a8e9b83feaa000da640e437a
2016-06-29 18:59:39 -04:00
Konstantin Zhuravlyov 0e4cab3001 Implement new load map API.
Change-Id: I5f148fe66f899b2fa6a2e75430afa988f38db58d
2016-06-28 11:32:19 -04:00
Christophe Paquot 4e93bdc99c Handle alternate_va==0
Make amd::MemoryRegion::Lock use the host_ptr instead of asserting
when the alternate_va is null. When the src/dst memory pointer is
allocated via KFD, the host_ptr is already a GPUVA.

Change-Id: If44368cc2854d4c0c477ae56e4eeabc37e54c1a5
2016-06-23 14:51:25 -07:00
Jay Cornwall 38fddca9fe Share blit queue for device-to-device and device-to-host copies
Reduces the number of blit queues from 3 to 2, when SDMA is unavailable,
improving the availability of queue slots for applications.

Change-Id: I8860d2b6c6d6527494b9fc35d164099e1313886a
2016-06-21 16:59:36 -05:00
Christophe Paquot c64f646711 Updated blit kernel code to use device-accessible memory for the kernel args.
Most image-related HSA conformance tests now pass.
Many more ocltst/oclperf image tests pass too.

Change-Id: I3f28d4ee7369f0ebc7af5128d3ffe1390957db98
2016-06-14 17:03:49 -04:00
Besar Wicaksono aee8ab6ef0 Add interrupt signal support to SDMA
Change-Id: Ie2b192f3351a0c3bf1eb36ba9704825b18e6059b
2016-06-14 14:26:25 -04:00
Besar Wicaksono a2ebd9a825 Fix closed-source build for tools library
Change-Id: Id0265b186ac2fbc5385ff70e3d34947055788c21
2016-06-06 21:08:21 -04:00
Besar Wicaksono 103cd04236 Blit SDMA support for gfx70x
Change-Id: Ie6f215890553ef41c3f36b349fc9cc39c2d38747
2016-06-02 06:18:36 -04:00
James Edwards f49ddad0a1 Modify runtime cmake files to use HSA_CLOSED_SOURCE_ROOT.
Change-Id: I416f8608cfb793eac9065c1f63a85da2d3c3a816
2016-05-31 14:08:10 -05:00
Konstantin Zhuravlyov 5a14d496ab Add support for dynamic relocations (code object v2.1)
Change-Id: Ic19be97d3ea78b53f5aa814787515b587d0be21b
2016-05-26 14:09:07 -04:00
Besar Wicaksono a8b00680b6 Add profiling support to DMA copy function
Change-Id: Iadeefa2692f35d9305ac1b242284a6220d5830a7
2016-05-26 11:29:29 -04:00
James Edwards 50339c12f1 Correct minor issues in License text and sample code for hsa-rocr-dev package.
Change-Id: If1c4387794de3cb707a8ba8281a40a1123130c95
2016-05-26 09:42:24 -04:00
Ramesh Errabolu 383ed6983f Refactor Scratch Memory Descriptor Initialization
Change-Id: Ib4a136c266646cc5d5f5afb98f4aaf9266d02072
2016-05-25 22:17:43 -04:00
James Edwards ec6478e693 Add hsa-rocr-dev packaging CMakeList.txt file.
Change-Id: I1f6a0d4ad44aa7f20f43d43942719f668b620c36
2016-05-25 17:04:27 -04:00
James Edwards 72cb6dd33f Add hsa-ext-rocr-dev automatic packaging.
Change-Id: Ieb0d179b4e1a398a9400bd80037a46d0513582bc
2016-05-23 10:10:44 -04:00
Besar Wicaksono bc589048a9 Use lazy initialization to create Blit objects
Change-Id: I388865030dc2538c5c881c055e38af52a57f6d87
2016-05-20 14:26:06 -04:00
James Edwards ceab9a3eb0 Update hsa-ext-image CMakeList.txt file to include static lib compiler options
Change-Id: I06cff984d3dc169cdb30832bf0115bc7d821eadf
2016-05-19 15:48:42 -05:00
Jay Cornwall 90ab72cd66 Implement optimized blit/fill kernels
Replace HSAIL kernels with SP3 shaders.
Support all alignment variations efficiently.

Change-Id: Icf7f5471f3ba68389f55484d82f2805dd9bc3827
2016-05-10 21:51:57 -05:00
James Edwards 023b302fae Add image and tools cmake files to the opensrc directory.
Change-Id: I9e95d391992fa6ad7d13b500cd28eb0fb93dda1d
2016-05-03 17:01:14 -05:00
bwicakso 6ea42ae333 hsa_amd_agent_memory_pool_get_info gives wrong results for gfx803
Root cause: missing break statement when querying the num hop attribute.

Other changes:
- Cap the reported num hop at 1, since the runtime does not have enough information about each hop.
- Clarify the comment about the HSA_AMD_AGENT_MEMORY_POOL_INFO_NUM_LINK_HOPS attribute in the header file.

Change-Id: I5d868eb457666e1377d5308f6145e76176bbfaf7
2016-05-03 12:52:38 -04:00
James Edwards 24714cb769 Remove whitespace from comments in CMakeLists.txt
Change-Id: I9a94a6f224a5cbd5fb1f8b57ed0c369339e23228
2016-04-28 11:24:02 -05:00
Shi, Aaron (en ye) (xN/A) TO ad21f0606e HSA Finalizer: Promote SC PRM -> Finalizer (HSA tree) up to CL 1258514
[git-p4: depot-paths = "//depot/stg/hsa/drivers/hsa/runtime/": change = 1259784]
2016-04-19 15:31:52 -05:00
Jay Cornwall (xN/A) UK 1d4a257225 Fix SDMA fill for >=4MB regions
max_single_fill_size_ overflowed the packet field size. Reduce by one dword.

[git-p4: depot-paths = "//depot/stg/hsa/drivers/hsa/runtime/": change = 1259263]
2016-04-18 16:05:13 -05:00
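The fix in 1d4a257225 concerns a per-packet size limit that no longer fit its packet field. As a rough sketch of why dropping one dword helps, assume a hypothetical 22-bit byte-count field (the commit does not state the field width): a value of exactly 4 MB needs 23 bits, while 4 MB minus one dword still fits. The constants and the `split_fill` helper below are illustrative only.

```python
DWORD = 4                 # bytes per dword
COUNT_FIELD_BITS = 22     # hypothetical width of the packet's byte-count field

MAX_FIELD_VALUE = (1 << COUNT_FIELD_BITS) - 1
# Largest dword-aligned size that still fits in the field: 4 MB minus one dword.
MAX_SINGLE_FILL_SIZE = (1 << COUNT_FIELD_BITS) - DWORD


def split_fill(total_bytes):
    """Split a dword-aligned fill into (offset, size) chunks that fit the field."""
    if total_bytes < 0 or total_bytes % DWORD != 0:
        raise ValueError("fill size must be a non-negative multiple of a dword")
    chunks = []
    offset = 0
    while offset < total_bytes:
        size = min(MAX_SINGLE_FILL_SIZE, total_bytes - offset)
        chunks.append((offset, size))
        offset += size
    return chunks


chunks_4mb = split_fill(4 * 1024 * 1024)
```

Under this assumption, a 4 MB fill is issued as two packets rather than one whose size field would wrap.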
Besar Wicaksono (xN/A) TX [TEXT] 5a584fa1ab Fix query HSA_AMD_AGENT_MEMORY_POOL_INFO_LINK_INFO
Querying HSA_AMD_AGENT_MEMORY_POOL_INFO_LINK_INFO between a GPU agent
and its own local memory pool returns incorrect information.
Fix: return a link with a hop count of 0.

[git-p4: depot-paths = "//depot/stg/hsa/drivers/hsa/runtime/": change = 1257544]
2016-04-13 12:39:25 -05:00
Hari Thangirala 0545761aa9 ROCR Build ID support
Fix dirty-tree status. Thanks to Fan for fixing the issue.

[git-p4: depot-paths = "//depot/stg/hsa/drivers/hsa/runtime/": change = 1256716]
2016-04-11 18:48:29 -05:00
Besar Wicaksono (xN/A) TX [TEXT] ea67bb8374 SDMA wraparound optimization.
Remove mutex and just make the thread spin again if the queue is wrapping.
Remove the wait for the queue to finish wrapping, and just check if there is enough space to recycle when reserving queue space.

[git-p4: depot-paths = "//depot/stg/hsa/drivers/hsa/runtime/": change = 1256713]
2016-04-11 18:31:45 -05:00
James Edwards (xN/A) TX 871412adff Remove ENV variables from CMakeLists.txt files.
[git-p4: depot-paths = "//depot/stg/hsa/drivers/hsa/runtime/": change = 1256687]
2016-04-11 17:18:01 -05:00
Hari Thangirala a148fd0b68 ROCR Build ID support
Build system/Package maintainer:
- BUILDID is specified at cmake time.
- Usage: cmake -DBUILDID=<ID> ../src

For developer builds, which typically don't provide BUILDID, cmake will:
- Determine the last git commit when this tree was synced
- Determine the build date
- Check whether the tree is clean when built

The idea of this embedded string is that later, when you get a ROCR build, you can get some idea of the build's origin by using: strings libhsa-runtime.so.1 | grep "ROCR BUILD ID"

For example:
- If it's Jenkins build 25, it returns: "ROCR BUILD ID: 25"
- If it's a developer build synced at 06f5f2a with modifications, it returns: "ROCR BUILD ID: 06f5f2a-2016-04-11-0"
[git-p4: depot-paths = "//depot/stg/hsa/drivers/hsa/runtime/": change = 1256588]
2016-04-11 15:03:06 -05:00
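The strings/grep lookup described in a148fd0b68 can also be done programmatically. The following is a minimal sketch that scans raw bytes for the "ROCR BUILD ID: <id>" marker quoted in the commit message. The helper names are hypothetical, and the assumption that the ID is a run of printable, non-space ASCII characters is inferred from the two examples rather than stated by the commit.

```python
import re

# Marker text taken from the commit message; the ID is assumed to be a run of
# printable, non-space ASCII characters. One or more spaces are tolerated
# after "ROCR".
_BUILD_ID = re.compile(rb"ROCR +BUILD ID: ([\x21-\x7e]+)")


def find_build_ids(blob):
    """Return every build ID embedded in a bytes object, in order."""
    return [m.group(1).decode("ascii") for m in _BUILD_ID.finditer(blob)]


def find_build_ids_in_file(path):
    """Read a binary (for example a shared library) and scan it for build IDs."""
    with open(path, "rb") as f:
        return find_build_ids(f.read())


sample = b"\x00\x7fELF\x00ROCR BUILD ID: 06f5f2a-2016-04-11-0\x00tail"
ids = find_build_ids(sample)
```

Calling `find_build_ids_in_file` on a copy of libhsa-runtime.so.1 would be the equivalent of the shell pipeline, provided the library embeds the string in the form shown in the examples.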
Zhuravlyov, Konstantin (x21446) MA 503fd728dd Fail gracefully if memory allocation did not succeed
Testing: precheckin (http://ocltc.amd.com:8111/viewModification.html?modId=69427&personal=true&tab=vcsModificationBuilds)

[git-p4: depot-paths = "//depot/stg/hsa/drivers/hsa/runtime/": change = 1256179]
2016-04-09 16:40:24 -05:00