Commit graph

2959 Commits

Author SHA1 Message Date
Felix Kuehling 2915d521a1 Remove redundant dev package build
No need to build the package in the build-dev target. This is taken
care of by package-dev. Removing the redundant packaging command
allows install-dev to work without building a package unnecessarily.

Also moved the rm command into the package-dev target.

Change-Id: I044871be03ebc5673146b44e4291b48b112f4440
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
2018-05-04 14:04:19 -04:00
Evgeny 0e0be791ec Change tool load failure report to an unconditional print because it is already controlled by the env var
Change-Id: I91b400ba94575a32005e825e6b41bda05c55b357
2018-05-03 22:31:17 -05:00
Oak Zeng dc1bbccc39 Use SVM aperture for device memory allocation on gfx902 and later APUs
Change-Id: Ib1d822adde30138a016e010bf581220465a087b9
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
2018-05-03 12:03:22 -04:00
Sean Keely a20cceb425 Relax large scratch cutoff.
Optimize for few queues rather than max.

Change-Id: I6531427319d3b2109b70d98fdb51daee7ffe4403
2018-04-30 07:25:22 -04:00
Sean Keely 5f25619bb7 Enable large scratch on GFX8.
Ensure system release fence is set on GFX8 large scratch using packets.

Change-Id: I13cfdcd35969482ea6e95e0b352f5cb3a0454b86
2018-04-30 07:24:53 -04:00
Evgeny 0fcd2fa56e aqlprofile: get version API
Change-Id: I3a85c088bfff3f54d8829e17cdafd7dfcdfb0c1d
2018-04-27 23:11:16 -04:00
Chris Freehill 11e13704ea Added rocm-smi-lib calls with updated interface
Change-Id: I62b59dca5135ec012f11b249c78b04e5e8e2dd9a
2018-04-27 18:36:54 -04:00
Evgeny b37027e347 aqlprofile: read API
Change-Id: I896b1fbf1c19608197ac0a99b9d467d8c1bee775
2018-04-27 11:00:08 -05:00
Sean Keely 7cd6e366ed Workaround SDMA poll packet preemption.
Use async. signal handler to satisfy dependencies for SDMA blits.

Change-Id: Ifa8d3ee6810509f400a568ca2387ac6ab3ab7c36
2018-04-26 02:00:33 -05:00
srinivas Charupally 8c8cd2dbd0 Adding concurrent init and init shutdown tests
Change-Id: Ifdbda16ae6c93a86373557f26eb414e40775d343
2018-04-24 11:56:50 +05:30
rohit pathania b1b036acb8 Memory Allocation Negative tests
Change-Id: Icdd355f2351968dd76a3bc466636e223573cfb16
2018-04-23 01:49:43 -04:00
Qingchuan Shi 49d2175c74 Debug support for queue error.
1/ Revised debug event handler to handle different events.
2/ Added queue error handler using the callback in queue create, which will print out wave info when the queue is in an error state.
3/ Preempt the queue instead of destroying it when the queue is in an error state.

Change-Id: Ib727d208de9caf1c72c76d42268483b24aaebde8
2018-04-20 14:25:16 -04:00
Sean Keely b66764e4c6 Disable large scratch on GFX8.
Temporary pending firmware fix.

Change-Id: Id1b1ecef421bc97327fd0d2e6225549a6e81dba0
2018-04-19 20:26:12 -05:00
Evgeny 5a6f47c475 aqlprofile API: adding SQTT SE_MASK parameter
Change-Id: I0149692c2249c6d84ca710ce64e7346784ae593f
2018-04-16 16:39:42 -05:00
Hari Thangirala 3e0cd85d69 Allow HSA_ENABLE_SDMA to override runtime defaults.
Change-Id: I2305304228010157bfb589c365f4a998577231cd
2018-04-10 12:56:48 -04:00
Konstantin Zhuravlyov 7ef70f7eaa Bring naming on par with the spec (hsa-runtime)
Change-Id: Ie1903c90a195cf95b186eb5552131a20af408adf
2018-04-10 09:15:02 -04:00
Chris Freehill 0d9e71a63a Revert "Re-enable rocm-smi with new c api"
This reverts commit 2e81d33395.

Change-Id: I9866610597a6de97a3c06ef9646f0afc85f149f4
2018-04-07 19:59:13 -04:00
Chris Freehill 2e81d33395 Re-enable rocm-smi with new c api
Change-Id: Idf393f31522bac8ac0c3c03a930ef66d97ce5fa2
2018-04-07 18:55:30 -04:00
Ramesh Errabolu f25d59cab2 Compute size of command buffer based on support for HDP Flush
Change-Id: I4987a262c191a91cd845fe18002c314a95a9ed8c
2018-04-07 13:36:09 -04:00
Sean Keely 7caf9633f6 Support large scratch allocations and reclaim.
Also improve small_heap used for scratch region allocation.

Change-Id: Ib7311b663b38968d88ebc355b81e12c0863dc541
2018-04-05 21:51:56 -04:00
Jay Cornwall df964343a3 Support new first-level trap handler ABI
- Ignore exceptions passed to the second-level handler
- Restore SQ_WAVE_IB_STS and SQ_WAVE_STATUS before exiting trap

Change-Id: I872c111c030d94eae644ae073df3c2e508f42f45
2018-04-04 11:01:14 -05:00
Sean Keely b6f0248f53 Respect new memory model requirements at queue destroy.
Spec requires GPU release fences and CPU acquire fences at queue destroy.
Also update the recognized status codes.

Change-Id: If9166f5149f65417c7057ff7c0f69f6ac094d6ab
2018-04-04 08:13:00 -04:00
Sean Keely 6df9ba97ce Sequence queue error callbacks with queue destroy.
HSA v1.2 update.

Change-Id: I13975e71b2c1ea5b7738236f5d02df84312ad00c
2018-04-04 08:12:58 -04:00
Konstantin Zhuravlyov c93584e725 ROCRTST: Add missing hidden arguments
Change-Id: Idd5d58749f4dd740c96299c40e87d83840b6fb2b
2018-04-02 18:19:24 -04:00
Ramesh Errabolu 987f3f97aa Enable sDMA packet HDP Flush on Gfx9 and later devices
Change-Id: I85922e5266883ef7e9eed3565e2c3b209009d294
2018-04-02 11:47:59 -04:00
Shaoyun Liu aa28484583 Thunk: Add gfx904 support on libthunk
Change-Id: I78bc623f6b86293e2bf9fbe00a646d152faafdc4
Signed-off-by: Shaoyun Liu <Shaoyun.Liu@amd.com>
2018-03-29 18:21:02 -04:00
Konstantin Zhuravlyov e4c6a3ae28 Convert old target names to old new target names.
Change-Id: I701651d74f353e234556e4bf5d50d63c598e5f15
2018-03-26 11:59:13 -04:00
Sean Keely cd46954cc4 Cleanup in blit kernel management code.
Remove unused function (FenceRelease), add comments to barrier packet settings,
correct profiling controls to work with queue wrappers.

Change-Id: I45bb26227bcc2b78edb8ad5dc497603c33234e18
2018-03-20 22:19:54 -04:00
Sean Keely f4521ce782 Revert "Reduce to only one internal compute queue."
This reverts commit 0eb534e3cf.

Change-Id: Ifcc5e148457243a6cf9ef277da7ab7c4e10f6fc9
2018-03-20 22:19:44 -04:00
Wilkin 8e3d26c617 ROCm Runtime Support for respecting target xnack setting
This includes the changes provided by Konstantin, "Add xnack from elf header" (Change 136389).

Change-Id: I95e51141caa0d7c21903b09212c02e4906ec54a3
2018-03-20 16:57:15 -04:00
Felix Kuehling 8ac2150e81 Let KFD use VM from DRM render node
Move opening of DRM render nodes from topology to FMM aperture
initialization. Keep the same FDs open for the lifetime of the
process to match how KFD uses the VMs in the FDs. Call acquire_vm
ioctl during aperture initialization to let KFD use the VMs from
the render nodes.

Change-Id: Ie07d57788cbe685b1841cccc00820c12894a0356
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
2018-03-20 15:42:45 -04:00
Philip Yang 1bf93d4e89 Export microcode version of sDMA
Change-Id: I86fa5da5e72af13a2e76e6e3be4667a7220923d5
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
2018-03-19 08:38:50 -04:00
Konstantin Zhuravlyov b7915e9248 Bring loader in sync with stg sc.
Change-Id: Ib4d9231ca61048557acdad8eb8f632688c4aadd8
2018-03-12 15:00:50 -04:00
Felix Kuehling 19dacdecd3 Update kfd_ioctl.h from kernel
This adds new acquire_vm ioctl.

Change-Id: Ia6794bfd291706cecdb2d06f4902b324b48577df
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
2018-03-09 16:36:52 -05:00
Felix Kuehling 85e1a9bf5e Rework SVM aperture initialization
Query GPUVM aperture limits of all dGPUs to determine SVM aperture
base and limit. This depends on a recent KFD change that reports
the GPUVM aperture limits for dGPUs in the
AMDKFD_IOC_GET_PROCESS_APERTURES_NEW ioctl (drm/amdkfd: Simplify
dGPU SVM aperture handling).

Only initialize SVM aperture once, instead of once per GPU.

Don't call AMDKFD_IOC_SET_PROCESS_DGPU_APERTURE. It's not needed any
more and will not be upstreamed.

Change-Id: Ib3389e8ba18505ba15fc33f45fe8a57e690a565d
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
2018-03-09 16:36:49 -05:00
Felix Kuehling c5cfb7e25b Move dGPU memory aperture initialization
Define dgpu_mem_init before it's used and keep the code close to the
rest of the aperture initialization code.

Change-Id: I14ad11a364524a15affee9186b1298ba7d56d2c9
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
2018-03-09 15:00:12 -05:00
Sean Keely ac5ccb45b7 Use atomic variable for Runtime ref_count_.
Change-Id: Ic4d0ad9ff93d0cc52cfe2df006ee3436d5960b07
2018-03-06 03:45:14 -06:00
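The title of ac5ccb45b7 does not show the change itself. As a rough illustration of the general idea only, the sketch below keeps a reference count in an atomic variable so that concurrent acquire and release calls cannot lose an update. It is written in C11 and is not the runtime's code; `ref_count`, `acquire_ref` and `release_ref` are hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a runtime-wide reference count. */
static atomic_int ref_count = 0;

/* Returns true for the caller that takes the first reference,
 * i.e. the one that would be responsible for initialization. */
bool acquire_ref(void)
{
    /* atomic_fetch_add returns the value held before the increment. */
    return atomic_fetch_add(&ref_count, 1) == 0;
}

/* Returns true for the caller that drops the last reference,
 * i.e. the one that would be responsible for teardown. */
bool release_ref(void)
{
    int previous = atomic_fetch_sub(&ref_count, 1);
    assert(previous > 0); /* releasing without a matching acquire is a bug */
    return previous == 1;
}
```

The read-modify-write happens in one atomic step, so exactly one caller sees the transition from zero and exactly one sees the transition back to zero.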
Sean Keely 31c05d2fc7 Add exception safety to Runtime::Acquire.
Change-Id: Ia2a9baf08bb56971412f1ac3914592612de5f134
2018-02-28 05:21:07 -06:00
Yong Zhao 15e525af45 Add pkg config support in the hsakmt-roct-dev package
Change-Id: Ida6b3083bfc9405ef9b6b8e426dc7dc51d61a811
Signed-off-by: Yong Zhao <yong.zhao@amd.com>
2018-02-27 11:21:38 -05:00
Yong Zhao 2c426a026a Turn off the verbose building message
Change-Id: If4ebdb6f87fde9c3cc76b16c57e862bfb972ed5e
Signed-off-by: Yong Zhao <yong.zhao@amd.com>
2018-02-26 18:17:13 -05:00
Philip Yang 105291849f Close shmem file handle, to fix file handle leak
kfdtest hsaKmtOpenKFD failed after 1019 loops when run with --gtest_loop=-1,
because the default limit on open file handles is 1024. The lsof output
showed that the shmem file handle was not closed.

Change-Id: I474de2bae6c03e879a219dedf5f18639118b73e5
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
2018-02-23 10:50:52 -05:00
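The leak described in 105291849f follows a common pattern: a file is opened and mapped, but the descriptor is never closed, so each iteration uses up one slot of the per-process limit. The sketch below is not taken from the thunk sources. It assumes a POSIX system, and `map_shared_file` is a hypothetical helper that closes the descriptor as soon as the mapping exists.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE
#endif
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: maps `size` bytes of the file at `path` as shared
 * memory. Returns NULL on failure. */
void *map_shared_file(const char *path, size_t size)
{
    assert(path != NULL && size > 0);

    int fd = open(path, O_RDWR | O_CREAT, 0600);
    if (fd < 0)
        return NULL;

    /* Make sure the file is large enough to back the mapping. */
    if (ftruncate(fd, (off_t)size) != 0) {
        close(fd);
        return NULL;
    }

    void *addr = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);

    /* A mapping stays valid after its descriptor is closed, so the
     * descriptor is released on every path, success or failure. */
    close(fd);

    return addr == MAP_FAILED ? NULL : addr;
}
```

With the descriptor closed inside the helper, calling it in a loop (and unmapping each result with `munmap`) no longer approaches the open-file limit.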
Jay Cornwall e2c353dc0d Allocate EOP queue local to GPU
On discrete GPUs place the EOP queue in VRAM. The reader/writer of this
queue is the CP and the size is small. Dispatch latency improves
through lower read latency in AQL completion phase.

Change-Id: Id8351dcddbd21fd7c7d699803c96434c9132db71
Signed-off-by: Jay Cornwall <Jay.Cornwall@amd.com>
2018-02-22 18:14:05 -05:00
Oak Zeng 25170c3c57 Support ptrace access to invisible VRAM
Invisible device memory is mmapped as PROT_NONE.
Normal CPU access to the memory is still not allowed, but
a struct vm_area_struct will be created for the memory address
so ptrace can access the memory via the VMA.

Change-Id: I07c69208716c920ccce33e6b494b610b61a0a7c1
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
2018-02-20 14:13:00 -05:00
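The PROT_NONE mapping mentioned in 25170c3c57 can be illustrated in user space. The sketch below assumes Linux and uses an anonymous mapping as a stand-in for device memory; the actual change maps device memory, which is not shown in the commit message. `reserve_inaccessible` is a hypothetical helper name.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE
#endif
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical helper: creates a mapping that occupies address space
 * (and therefore has a VMA in the kernel) but grants no access rights.
 * Ordinary loads and stores to the range fault. Returns NULL on failure. */
void *reserve_inaccessible(size_t size)
{
    assert(size > 0);

    /* Anonymous memory stands in for device memory here (assumption). */
    void *addr = mmap(NULL, size, PROT_NONE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

    return addr == MAP_FAILED ? NULL : addr;
}
```

The range returned by the helper appears in `/proc/self/maps` with permissions `---p`, which shows that the kernel tracks the region even though the process cannot read or write it directly.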
Chris Freehill 81c923b26f Reenable Memory_Max_Mem test
Change-Id: I2da50f886cd8d28d0f9ad8b8f77cfa13d392bf01
2018-02-13 17:37:25 -06:00
Chris Freehill 146b3871df Completely disable rocm-smi (take 2)
Change-Id: I68f403c539163bfe00ee2b59dbd36d1c6d7669f1
2018-02-12 06:43:47 -06:00
Chris Freehill bd0c4efc34 Completely disable rocm-smi from rocrtst until rocm-smi-lib is updated
Change-Id: I5cce06a2bbde7a3a48e391022c793a462794c6d1
2018-02-11 21:42:25 -06:00
Sean Keely 95c926059d Improve fragment map reporting format.
Change-Id: I85d09d085b08de46271ec902c766a8609a4b921a
2018-02-09 14:03:03 -05:00
Sean Keely 9212e7a09f Emit fragment map and thunk ptr info with VM faults.
Change-Id: If1302f674df7a636529c64bf66dfdda755a70c32
2018-02-09 14:02:26 -05:00
Sean Keely 0eb534e3cf Reduce to only one internal compute queue.
Change-Id: Ie42ecb3b242077624d74caeabfcd418dbbd9ff3e
2018-02-09 14:02:15 -05:00
Sean Keely bd5dd47ca1 Defer creation of internal queues and blits until first needed.
Change-Id: I2e61d7e102f38389d806d9eb24beda910573157b
2018-02-09 14:02:07 -05:00