Commit Graph

2925 Commits

Author SHA1 Message Date
Ramesh Errabolu 03668e6e06 Removing Old Commandwriter used by Profiler and Debugger
Change-Id: I77d2172eef1724d1d4aed6d8e7a9df6cdbeb0648


[ROCm/ROCR-Runtime commit: f0263d8198]
2017-01-13 11:51:21 -05:00
Christophe Paquot 321fdd26bf Add image 1.1 API changes to current code base
Initial work to import the latest (1.1) hsa_ext_image extension.

Change-Id: I51d70ef26f97250c884b3def2088be0d7eb04eb3


[ROCm/ROCR-Runtime commit: 31d379c821]
2017-01-12 14:49:54 -08:00
Harish Kasiviswanathan 6f2b43d96e Add fork support
If fork() is called, clear all duplicated data that is invalid in the
child process.

Change-Id: I4e27198060db593c630c6337b7071dfbd0d80b83
Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>


[ROCm/ROCR-Runtime commit: f1f62d863c]
2017-01-12 14:38:30 -05:00
Kent Russell 56096ff555 Revert "Add image 1.1 API changes to current code base"
Currently HSA HW Profiler is failing to build due to this patch

This reverts commit 91866d4bbb.

Change-Id: Iabb2b958f33ba614a24b61bb370905b3b7362708


[ROCm/ROCR-Runtime commit: 5162a76616]
2017-01-12 06:43:54 -05:00
Christophe Paquot 91866d4bbb Add image 1.1 API changes to current code base
Initial work to import the latest (1.1) hsa_ext_image extension.

Change-Id: I4d55adb09ba4d4dbd43d47a4bc54077d4bc531d2


[ROCm/ROCR-Runtime commit: e0ce8855dd]
2017-01-11 17:31:37 -05:00
Felix Kuehling f07e9a5606 Allocate CWSR buffer in system memory
CWSR buffers can be large on dGPUs (~21MB on gfx803). Allocating them
in VRAM limits the number of queues that can be created unnecessarily.

Also make freeing of per-queue buffers symmetric with allocation. All
buffers are now allocated with allocate_exec_aligned_memory on dGPUs
and APUs, so use free_exec_aligned_memory to free them.

Change-Id: I45e8cb1801857d0268750202cdd422426611e457


[ROCm/ROCR-Runtime commit: 4181b408fc]
2017-01-04 16:07:56 -05:00
Sean Keely b75a8d9ed0 Allow reducing max occupancy (max scratch waves) when applications request large amounts of scratch.
Also emit error messages to stderr if no async queue error callback was registered and queue fault messages are enabled (on by default).
Queue fault messages are controlled with env key HSA_ENABLE_QUEUE_FAULT_MESSAGE.

Change-Id: I496487b8d048b83aa95b9784e92928211f167b17


[ROCm/ROCR-Runtime commit: 0e17cc2887]
2016-12-20 16:52:59 -06:00
James Edwards e0576546f9 Readme and comment updates to ROCm 1.4
Change-Id: I2864a9c475b9ceb2fa08bfc35999c7e0e043b26d


[ROCm/ROCR-Runtime commit: 433a3bcde8]
2016-12-16 11:43:36 -06:00
Chris Freehill 1df3375ec4 HSA Enabled IPC support
Uncommented HSA IPC code.
Changed hsa_amd_ipc_memory_t to be 8 uint32_t's instead of 9 to
match spec

Change-Id: Id1523125e9b876a23c3743df1be29c98b47f6725


[ROCm/ROCR-Runtime commit: 160f8c5880]
2016-12-15 19:16:29 -05:00
Konstantin Zhuravlyov 578c17681b Revert "Bring loader in sync with stg/sc"
This reverts commit cdcc5ec921.

Change-Id: If99e8cc9e2afb525f690e49eb6538d8e950a5615


[ROCm/ROCR-Runtime commit: 08aded148a]
2016-12-14 15:14:36 -05:00
Konstantin Zhuravlyov cdcc5ec921 Bring loader in sync with stg/sc
Change-Id: I684522c442de0872007a7e4da8919067fc7b42b3


[ROCm/ROCR-Runtime commit: c798c60343]
2016-12-13 16:30:25 -05:00
James Edwards 046929d5ac Make copyright and README changes for ROCm
Change-Id: Ic9717fcbc57d4552dddf8374b34fc7c34e44268a


[ROCm/ROCR-Runtime commit: 25ed746314]
2016-12-06 16:11:00 -06:00
Harish Kasiviswanathan 5d2336f1ca Add API entrypoints for IPC functionality
Implement three new APIs for IPC buffer sharing:
	-hsaKmtShareMemory()
	-hsaKmtRegisterSharedHandle()
	-hsaKmtRegisterSharedHandleToNodes()

Add new ioctls necessary for the above APIs.

Change-Id: Ia2b4d0dc91ec64bff959395d11c0536467404792
Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>


[ROCm/ROCR-Runtime commit: 559e31d6ff]
2016-11-28 16:19:22 -05:00
Ramesh Errabolu a4c8e0fdca Per review comments
Change-Id: I44585a6dcf3c4f0ce10f0e895270c113790e0652


[ROCm/ROCR-Runtime commit: 935c1b0ba9]
2016-11-28 12:41:32 -06:00
Harish Kasiviswanathan bd81b76c5a kfd ioctl. Remove unused definitions
Sync the file with Kernel header file

Change-Id: I52d52a38fb38bd4b37d8210ce79775b88c8a985d


[ROCm/ROCR-Runtime commit: 4838f6e740]
2016-11-22 09:44:22 -05:00
Ramesh Errabolu e3d8eac639 Initial set of changes for ThreadTrace
Change-Id: I07ce31f9b4f508cef0fc9ca6dadcf26b6c90361e


[ROCm/ROCR-Runtime commit: eb2efb83d1]
2016-11-21 23:40:56 -06:00
Konstantin Zhuravlyov 05a7b5a9e4 Remove load_legacy parameter + change prefix for some loaded code object queries back to AMD
Change-Id: I74e905abd77dab3a7a00b5ced94cd9b5130365c5


[ROCm/ROCR-Runtime commit: 4b86843409]
2016-11-20 13:46:17 -05:00
Amber Lin 7ea1a24220 Allow a memory to be registered multiple times
A memory region is allowed to be registered multiple times when the memory
is specified by a user pointer. If it's registered with the same user
pointer but with different sizes, the registrations are treated as different
instances and multiple VM objects are created with different GPU addresses.


Change-Id: I49627111bb5db36d18f1133b252fb62a611f06a4


[ROCm/ROCR-Runtime commit: 2a50ebba98]
2016-11-18 17:46:12 -05:00
Yong Zhao 1a2ed0ec64 Making the code more robust by checking the NULL pointer
Change-Id: I36b9f73eadd7547c71fe3641ac131c7408b14816
Signed-off-by: Yong Zhao <yong.zhao@amd.com>


[ROCm/ROCR-Runtime commit: a1f417715b]
2016-11-16 11:35:26 -05:00
Sean Keely 2e8c176e11 Add InterProcess memory sharing support.
Support is disabled pending KFD / Thunk readiness.

Change-Id: I55def748e3d56cbfcfa6e24983a0ab78567aa81d


[ROCm/ROCR-Runtime commit: 8081758a55]
2016-11-15 18:58:29 -06:00
Sean Keely c9e78a451b Add pointer info support.
Change-Id: I3edcc0bfddbf12465065c9bc3b6565288faff1b8


[ROCm/ROCR-Runtime commit: 9dd76dbeda]
2016-11-11 18:40:16 -06:00
Sean Keely b5d60f56dd Restrict stack usage in interop map to a more reasonable level.
Change-Id: I663f262a391d1e7f8a6fc3028ea9acbe37d8bcf0


[ROCm/ROCR-Runtime commit: e01c43578c]
2016-11-10 16:55:55 -06:00
Sean Keely 6ba6e14e9a Set the default value of userdata in pointer info to NULL.
Change-Id: Ie0d94b921bbce880d9548d5a014a2d7c33062f7e


[ROCm/ROCR-Runtime commit: c54c63fd56]
2016-11-09 21:15:07 -06:00
Konstantin Zhuravlyov ee3b926381 Allocate only one segment for code object v2+
Change-Id: I7cd03b5c205d3ea5735f8f29820867ca90ac081b


[ROCm/ROCR-Runtime commit: 54245e064c]
2016-11-03 09:51:11 -04:00
Andres Rodriguez c589186700 Fix hsaKmtOpen incorrectly doing nothing in some fork scenarios
Currently, if a process' parent called hsaKmtOpen, the child will be
unable to open a connection to KFD, since kfd_open_count will be > 0.

When forking, the refcount should be reset, in order to allow the child
to re-open /dev/kfd.

Change-Id: Ia4b78f6bacc4f82e8ac724e5f488a3eff5084007


[ROCm/ROCR-Runtime commit: 0de39b6724]
2016-11-01 15:54:17 -04:00
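The refcount reset described in the commit above can be sketched as follows. This is a minimal illustration, not the library's code: `open_count`, `demo_open`, and `child_open_count_after_fork` are hypothetical names, and using `pthread_atfork` to hook the child side of `fork()` is an assumption about one way to implement the reset.

```c
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for the library's open refcount. */
static unsigned long open_count = 0;
static int atfork_installed = 0;

/* Runs in the child right after fork(): forget the parent's opens. */
static void reset_in_child(void) { open_count = 0; }

/* Hypothetical open call: installs the fork handler once, then counts. */
int demo_open(void) {
    if (!atfork_installed) {
        if (pthread_atfork(NULL, NULL, reset_in_child) != 0)
            return -1;
        atfork_installed = 1;
    }
    open_count++;
    return 0;
}

/* Forks and reports the refcount the child sees, via its exit status.
 * Returns -1 on failure. */
int child_open_count_after_fork(void) {
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit((int)open_count);
    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

With the handler installed, a child forked after the parent has called `demo_open()` starts with a count of zero, so its own first open is not mistaken for a nested one.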
Chris Freehill 522cffc4b6 Added support for agent attribute HSA_AMD_AGENT_INFO_MAX_WAVES_PER_CU
Change-Id: I2b90e7165384c4dce928a620a1782395267b35b0


[ROCm/ROCR-Runtime commit: 30f1ec2691]
2016-10-28 11:24:21 -04:00
Jay Cornwall c6bfeda697 Fix miscellaneous warnings flagged by Clang
Change-Id: I85a45cb3b44e4379b31bcc56af061fd1571f2af5


[ROCm/ROCR-Runtime commit: c30c25bd30]
2016-10-26 19:26:16 -05:00
Jay Cornwall 1b76d435b4 Insert explicit memory fence before submitting doorbell
Ensure that the write index and ring buffer contents are visible
to the HW before sending the doorbell. The latter is a write-combined
MMIO store and must be ordered with prior cacheable non-MMIO stores.

Also be more explicit about memory semantics for doorbell stores.

Change-Id: Ie4d96a7ee2a507237a8dbe7705fdf234d62ce9ba


[ROCm/ROCR-Runtime commit: d5b4078072]
2016-10-17 10:01:47 -04:00
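The ordering requirement in the commit above can be sketched with C11 atomics. This is an illustration under stated assumptions: `demo_queue` and `demo_submit` are hypothetical, the doorbell is modeled as an ordinary atomic variable rather than a real write-combined MMIO register, and the value written to it is illustrative. Real hardware may need a platform-specific store fence in place of `atomic_thread_fence`.

```c
#include <stdatomic.h>
#include <stdint.h>

#define RING_SLOTS 8u /* power of two, illustrative size */

/* Hypothetical queue; 'doorbell' stands in for an MMIO register. */
struct demo_queue {
    uint64_t ring[RING_SLOTS];
    _Atomic uint64_t write_index;
    _Atomic uint64_t doorbell;
};

void demo_submit(struct demo_queue *q, uint64_t packet) {
    uint64_t idx = atomic_load_explicit(&q->write_index, memory_order_relaxed);

    /* 1. Publish the packet and the new write index. */
    q->ring[idx % RING_SLOTS] = packet;
    atomic_store_explicit(&q->write_index, idx + 1, memory_order_relaxed);

    /* 2. Explicit fence: the stores above must not be reordered
     *    after the doorbell store below. */
    atomic_thread_fence(memory_order_release);

    /* 3. Ring the doorbell with explicit (relaxed) store semantics;
     *    ordering comes from the fence, not from this store. */
    atomic_store_explicit(&q->doorbell, idx, memory_order_relaxed);
}
```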
Jay Cornwall 5467ef0a59 Fix AQL packet lookup in dynamic scratch event handler
Read index was not wrapped correctly.

Change-Id: I25c901b61e4760990871e22468ffd0391abef244


[ROCm/ROCR-Runtime commit: c83846cd45]
2016-10-17 10:01:14 -04:00
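For context on the wrapping fix above: HSA queue indices increase monotonically, and the ring size is a power of two, so a slot lookup has to reduce the index modulo the ring size. A minimal sketch, where `ring_slot` is a hypothetical helper and not the runtime's function:

```c
#include <stdint.h>

/* Map a monotonically increasing index to a ring slot.
 * Assumes ring_size is a nonzero power of two. */
static inline uint64_t ring_slot(uint64_t index, uint64_t ring_size) {
    return index & (ring_size - 1);
}
```

Indexing the ring with the raw read index works only until the index first exceeds the ring size, which is why this kind of bug can go unnoticed in short tests.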
James Edwards d93ce4adc3 Fix hsa-runtime-tools library dependencies.
Change-Id: I649ebde06ab5c4b1892968c22e44c2bd5f53e49b


[ROCm/ROCR-Runtime commit: d66af9718e]
2016-10-15 14:35:44 -05:00
James Edwards 171802e188 Fix library namespace to ROCR_1.
Change-Id: Ie341203bad8e17673be86529bfed3c4fa98e9343


[ROCm/ROCR-Runtime commit: 4b6feb8498]
2016-10-13 16:26:46 -05:00
Konstantin Zhuravlyov 65e3c8c179 Initialize symbol's agent member for agent allocation symbols
Change-Id: I0ee0e07e4132ca13b3ecf7469c59ca327ff3c76d


[ROCm/ROCR-Runtime commit: 6f216f30c2]
2016-10-13 12:43:19 -04:00
Jay Cornwall 469f3a5b5d Disable GPUVM-mapped doorbell on gfx802
gfx802 requires a workaround for a VM TLB bug in which lookups use
the ACTIVE bit of the 8th PTE within any aligned group of 8 PTEs.
Until this is fixed in amdgpu the GPUVM doorbell logic will fail.

Change-Id: I5ec7b1fcd8b7677011a141d27cfc486c45d9a415
Signed-off-by: Jay Cornwall <Jay.Cornwall@amd.com>


[ROCm/ROCR-Runtime commit: 5493ae420b]
2016-10-10 18:39:31 -05:00
James Edwards b86f6d176e Modify image libraries CMakeLists.txt file to use HSAKMT variables.
Change-Id: I19f1c4a312b108774f2e3e9c6697db07e6731556


[ROCm/ROCR-Runtime commit: 070be8823f]
2016-09-30 23:05:38 -05:00
James Edwards 6f2c876923 Fix CMakeLists.txt file to use correct compile options. Fix compilation errors.
Change-Id: I6229a83d0823ee7a123cdaa9efd782108aa3a03c


[ROCm/ROCR-Runtime commit: b170e0ad8c]
2016-09-30 16:36:01 -05:00
James Edwards 0a7bf6fe14 Update required cmake version to 3.5.1 to support rpm packages.
Change-Id: Ic8d0d96b131937407de2578a653025a44843d330


[ROCm/ROCR-Runtime commit: cfc949c1ee]
2016-09-30 16:04:02 -05:00
James Edwards a6b81aa2e9 Remove unneeded CMakeLists.txt file. Move link and def files.
Change-Id: I219317496aa564bde488a8e56e7d83808ebddb66


[ROCm/ROCR-Runtime commit: 1b30c322f0]
2016-09-30 13:02:30 -05:00
James Edwards d35b7f118b Fix Teamcity build break, part 1.
Change-Id: I2019b502700e5fda2175b258dcfd3681ab93bc77


[ROCm/ROCR-Runtime commit: 41432ee6ca]
2016-09-29 18:41:54 -05:00
James Edwards b9b6f69860 Update runtime CMakeLists.txt and utility files to support top level CMakeLists.txt build.
Change-Id: I4a0eb512af82908a24f2d1964b201c28023ccae5


[ROCm/ROCR-Runtime commit: 809356e0b5]
2016-09-27 11:58:31 -04:00
Christophe Paquot 3ebe9fab51 Sync needed when wrapping around the kernel arg buffer
If we issue too many copy commands without syncing and wrapping happens,
we need to wait for the blits to be done before moving forward otherwise
we will overwrite the kernel args of the blits in flight.

Change-Id: I9a21e31ce07f8e8157ca38e96dc264ff47fd3639


[ROCm/ROCR-Runtime commit: 5519c96b74]
2016-09-26 14:14:33 -04:00
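The wrap-around rule from the commit above can be sketched as a slot allocator that synchronizes before reusing the start of the buffer. All names here (`demo_arg_ring`, `demo_wait_for_blits`, `demo_acquire_slot`) are hypothetical, and the wait is simulated by a counter rather than a real signal wait.

```c
#include <stddef.h>

/* Hypothetical kernel-argument ring shared by queued blit commands. */
struct demo_arg_ring {
    size_t capacity; /* number of kernarg slots */
    size_t next;     /* next slot to hand out */
    unsigned waits;  /* how many times a sync was needed (illustration) */
};

/* Placeholder for "block until all in-flight blits have completed". */
static void demo_wait_for_blits(struct demo_arg_ring *r) { r->waits++; }

/* Hand out the next slot. On wrap-around, sync first so the arguments
 * of blits still in flight are not overwritten. */
size_t demo_acquire_slot(struct demo_arg_ring *r) {
    if (r->next == r->capacity) {
        demo_wait_for_blits(r);
        r->next = 0;
    }
    return r->next++;
}
```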
James Edwards a7dd25b98f Update rocr packaging CMakeLists.txt files to PACKAGE and LIBRARY versioning. Also, fix support for rpm packaging.
Change-Id: Iff41df41ea78b7d1248164ce3f587ad34a8865a5


[ROCm/ROCR-Runtime commit: fddd5246f3]
2016-09-23 19:12:19 -04:00
Ramesh Errabolu 26495af3cd Support a new property of Rocr Agents called Product Name
Change-Id: Ia7217094223bfed908d9aa9ccdaa590e785503cb


[ROCm/ROCR-Runtime commit: 5d972064c1]
2016-09-21 17:00:41 -04:00
James Edwards 4ee6e7c69a Add libhsakmt cmake build and packaging files.
Change-Id: Ic7fa22d5b266480aa0c62628022f39da4e043d23


[ROCm/ROCR-Runtime commit: 7511631f08]
2016-09-20 17:48:36 -04:00
James Edwards 22d3ba4068 ADD copy_targets targets to ALL target
Change-Id: I3d4b28f873f09a9e866d9f27f0bdfd3c65494e6f


[ROCm/ROCR-Runtime commit: b4fedf6785]
2016-09-20 13:50:06 -05:00
Felix Kuehling fee7a91fb9 Allocate and map doorbells in SVM for discrete GPUs
Allocate doorbells for dGPUs in the SVM aperture and map them for
GPU access. This is necessary to allow GPU-initiated submissions to
user mode queues.

Depends on new doorbell BO allocation flag in KFD.

Change-Id: I0737bef4a4764bb4a66c43846707ead2108f6601


[ROCm/ROCR-Runtime commit: 2e0a6eb371]
2016-09-16 16:04:27 -04:00
Amber Lin 6b33ada07b Disable CPU cache info in non-x86
CPU cache information reported by Thunk topology is obtained from cpuid
instruction. This instruction only applies to X86 systems. It can cause
compile errors on non-X86 platforms. This patch temporarily disables CPU
cache functions in topology for non-X86 platforms in order to compile.

Change-Id: If86671817b0d036cb324eebf3f354682bfb75856


[ROCm/ROCR-Runtime commit: 660a6ebbd4]
2016-09-14 17:30:50 -04:00
Amber Lin 2dec7b1d74 Search VM object by range
Add vm_find_object_by_userptr_range so QueryPointerInfo can find the
object as well when the pointer is not the starting address but it's
inside the memory range. Also rename vm_find_object_xxx functions to
_by_address and _by_address_range to be consistent.

Change-Id: I5c2b3a05b41493e32b7fd9154665bf078b043606


[ROCm/ROCR-Runtime commit: 4911c91389]
2016-09-13 12:44:29 -04:00
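The range lookup described in the commit above amounts to a containment test instead of an exact match on the starting address. A minimal sketch, assuming a flat array of objects; `demo_vm_object` and `demo_find_by_address_range` are hypothetical and do not reflect the Thunk's actual data structures:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record for one registered memory range. */
struct demo_vm_object {
    uintptr_t start;
    size_t size;
};

/* Return the object whose [start, start + size) range contains addr,
 * or NULL if no object covers it. */
const struct demo_vm_object *
demo_find_by_address_range(const struct demo_vm_object *objs, size_t count,
                           uintptr_t addr) {
    for (size_t i = 0; i < count; i++) {
        /* Written as a subtraction so start + size cannot overflow. */
        if (addr >= objs[i].start && addr - objs[i].start < objs[i].size)
            return &objs[i];
    }
    return NULL;
}
```

A pointer into the middle of an allocation resolves to the same object as its base address, while the one-past-the-end address resolves to nothing.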
Christophe Paquot 5544a9e5e5 Add tiling code
Introducing tiling format for images, still using LINEAR for now.
Using the new KFD/Thunk API hsaKmtGetTileConfig for the address library.

Change-Id: Ic0677429dd320eef09ab62dddaf9b2dd94c4f904


[ROCm/ROCR-Runtime commit: 538736a660]
2016-09-13 11:42:10 -04:00
Amber Lin 4b17993791 Pointer attributes on APU
Add CPUVM aperture to keep track of memory allocation that is not known
to GPU driver. Together with GPUVM, this patch adds the pointer attributes
support to APU.

Change-Id: If13f9cf01ff8b9f709b99b66661e7505246adf4c


[ROCm/ROCR-Runtime commit: 19f2676ea7]
2016-09-12 11:32:26 -04:00
Ramesh Errabolu 85aa61f011 Print debug message if private segment memory request is illegal
Change-Id: I46351651b6b2bf14e26645440a4321bc941900b2


[ROCm/ROCR-Runtime commit: c54304fe38]
2016-09-08 11:16:09 -04:00