Commit Graph

196 Commits

Author SHA1 Message Date
Jay Cornwall 469f3a5b5d Disable GPUVM-mapped doorbell on gfx802
gfx802 requires a workaround for a VM TLB bug in which lookups use
the ACTIVE bit of the 8th PTE within any aligned group of 8 PTEs.
Until this is fixed in amdgpu the GPUVM doorbell logic will fail.

Change-Id: I5ec7b1fcd8b7677011a141d27cfc486c45d9a415
Signed-off-by: Jay Cornwall <Jay.Cornwall@amd.com>


[ROCm/ROCR-Runtime commit: 5493ae420b]
2016-10-10 18:39:31 -05:00
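As a rough illustration of the gfx802 TLB behavior described in 469f3a5b5d, the index of the PTE whose ACTIVE bit is consulted can be derived from any PTE index in the same aligned group. This sketch assumes zero-based PTE indices, so the "8th PTE" is the entry at offset 7 within its group. The helper name is hypothetical and is not taken from the thunk sources.

```c
#include <stdint.h>

/* Number of PTEs in one aligned group, as stated in the commit message. */
#define PTE_GROUP_SIZE 8u

/*
 * Hypothetical helper: returns the zero-based index of the 8th PTE in the
 * aligned group of 8 that contains pte_index. Per the commit message, this
 * is the entry whose ACTIVE bit the buggy lookup uses.
 */
uint64_t consulted_pte_index(uint64_t pte_index)
{
    /* Setting the low three bits selects the last entry of the group. */
    return pte_index | (PTE_GROUP_SIZE - 1u);
}
```

Under this reading, a mapping that covers only part of a group can be affected by the state of an entry it does not own, which is why a workaround is needed.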
James Edwards 6f2c876923 Fix CMakeLists.txt file to use correct compile options. Fix compilation errors.
Change-Id: I6229a83d0823ee7a123cdaa9efd782108aa3a03c


[ROCm/ROCR-Runtime commit: b170e0ad8c]
2016-09-30 16:36:01 -05:00
James Edwards 4ee6e7c69a Add libhsakmt cmake build and packaging files.
Change-Id: Ic7fa22d5b266480aa0c62628022f39da4e043d23


[ROCm/ROCR-Runtime commit: 7511631f08]
2016-09-20 17:48:36 -04:00
Felix Kuehling fee7a91fb9 Allocate and map doorbells in SVM for discrete GPUs
Allocate doorbells for dGPUs in the SVM aperture and map them for
GPU access. This is necessary to allow GPU-initiated submissions to
user mode queues.

Depends on new doorbell BO allocation flag in KFD.

Change-Id: I0737bef4a4764bb4a66c43846707ead2108f6601


[ROCm/ROCR-Runtime commit: 2e0a6eb371]
2016-09-16 16:04:27 -04:00
Amber Lin 6b33ada07b Disable CPU cache info in non-x86
CPU cache information reported by Thunk topology is obtained from cpuid
instruction. This instruction only applies to X86 systems. It can cause
compile errors on non-X86 platforms. This patch temporarily disables CPU
cache functions in topology for non-X86 platforms in order to compile.

Change-Id: If86671817b0d036cb324eebf3f354682bfb75856


[ROCm/ROCR-Runtime commit: 660a6ebbd4]
2016-09-14 17:30:50 -04:00
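One common way to keep cpuid-based code from breaking non-x86 builds, as 6b33ada07b describes, is to guard it with compiler-defined architecture macros. The sketch below is an assumption about how such a guard might look, not the code from the commit. It relies on the GCC/Clang `<cpuid.h>` header, assumes cpuid leaf 4 as the source of cache information, and uses the hypothetical names `SKETCH_HAVE_CPUID` and `count_cpu_caches`.

```c
#if defined(__i386__) || defined(__x86_64__)
#include <cpuid.h>          /* GCC/Clang wrappers around the cpuid instruction */
#define SKETCH_HAVE_CPUID 1
#else
#define SKETCH_HAVE_CPUID 0
#endif

/*
 * Counts the cache entries reported by cpuid leaf 4.
 * Returns 0 on non-x86 builds, where the cpuid code is compiled out, and on
 * x86 CPUs that do not report caches through leaf 4.
 */
int count_cpu_caches(void)
{
#if SKETCH_HAVE_CPUID
    unsigned int eax, ebx, ecx, edx, sig;
    int count = 0;

    /* Leaf 4 is only valid if the highest basic leaf is at least 4. */
    if (__get_cpuid_max(0, &sig) < 4)
        return 0;

    for (unsigned int sub = 0; sub < 32; sub++) {
        __cpuid_count(4, sub, eax, ebx, ecx, edx);
        (void)ebx; (void)ecx; (void)edx;
        if ((eax & 0x1f) == 0)      /* cache type 0 means no more caches */
            break;
        count++;
    }
    return count;
#else
    return 0;
#endif
}
```

Callers see the same function on every platform, so the rest of the topology code does not need its own architecture checks.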
Amber Lin 2dec7b1d74 Search VM object by range
Add vm_find_object_by_userptr_range so that QueryPointerInfo can also find
the object when the pointer is not its starting address but lies inside
its memory range. Also rename the vm_find_object_xxx functions to
_by_address and _by_address_range for consistency.

Change-Id: I5c2b3a05b41493e32b7fd9154665bf078b043606


[ROCm/ROCR-Runtime commit: 4911c91389]
2016-09-13 12:44:29 -04:00
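The difference between an exact-address lookup and the range lookup described in 2dec7b1d74 can be sketched as follows. The struct, the function names, and the linear scan are illustrative assumptions. The actual thunk data structure is not shown in the commit message.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record of one tracked allocation. */
typedef struct {
    uintptr_t start;   /* first byte of the allocation */
    size_t size;       /* length in bytes */
} vm_object_sketch_t;

/* Matches only when addr is exactly the start of an allocation. */
const vm_object_sketch_t *find_by_address(const vm_object_sketch_t *objs,
                                          size_t count, uintptr_t addr)
{
    for (size_t i = 0; i < count; i++)
        if (objs[i].start == addr)
            return &objs[i];
    return NULL;
}

/* Matches when addr falls anywhere inside [start, start + size). */
const vm_object_sketch_t *find_by_address_range(const vm_object_sketch_t *objs,
                                                size_t count, uintptr_t addr)
{
    for (size_t i = 0; i < count; i++)
        if (addr >= objs[i].start && addr - objs[i].start < objs[i].size)
            return &objs[i];
    return NULL;
}
```

The range test subtracts before comparing so that `start + size` cannot overflow for allocations near the top of the address space.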
Amber Lin 4b17993791 Pointer attributes on APU
Add a CPUVM aperture to keep track of memory allocations that are not known
to the GPU driver. Together with GPUVM, this patch adds pointer attributes
support to APUs.

Change-Id: If13f9cf01ff8b9f709b99b66661e7505246adf4c


[ROCm/ROCR-Runtime commit: 19f2676ea7]
2016-09-12 11:32:26 -04:00
Amber Lin 8a1cef5fbb Add pointer attributes API
Add two pointer attributes APIs:
hsaKmtQueryPointerInfo - allow the user to query the memory information
    using a pointer. This pointer can point to any address inside the
    range known to HSA.
hsaKmtSetMemoryUserData - allow the user to attach data to a pointer to
    add memory tracking information. This pointer must match the start
    address of a memory allocation or registration.
TODO: This patch implements support on dGPU only. APU support still needs to be added.

Change-Id: I4711809274248434901f0794f50ebfa13a7371a8


[ROCm/ROCR-Runtime commit: 51e4d27c37]
2016-09-07 17:24:46 -04:00
Yong Zhao cba37c251c Implement hsaKmtGetTileConfig in thunk
Change-Id: Iba8d8efa46e3c268a03442d3db568e1b19230e94


[ROCm/ROCR-Runtime commit: 8351b3d2e8]
2016-09-06 16:24:29 -04:00
Lan Xiao a015256f33 libhsakmt: Marketing Name and AMDName support for APUs
For APUs, use /proc/cpuinfo to get Marketing name.



Change-Id: I4a17516d26a092683f36631032be00ad44f7e7fe
Signed-off-by: Lan Xiao <Lan.Xiao@amd.com>


[ROCm/ROCR-Runtime commit: df593aa076]
2016-09-02 15:16:18 -04:00
Andres Rodriguez 996bf3f9ca LICENSE: add X11/MIT license file
Change-Id: I2e95af843046896708bb7a116f7b03a0fa30a255


[ROCm/ROCR-Runtime commit: b1d2867b60]
2016-08-25 16:27:46 -04:00
Andres Rodriguez 45e0ca4e91 Makefile: remove 32bit thunk compilation by default
Compiling in 32-bit mode is broken, and we do not intend to restore
compatibility with 32-bit apps.

Change-Id: I5524b5b63fe62e6026aa04d84c4510e290a86106


[ROCm/ROCR-Runtime commit: e0c77a38cb]
2016-08-25 16:27:19 -04:00
Lan Xiao 742354161b libhsakmt: Add MarketingName and AMDName for all nodes - CPU & GPU
The HSA thunk API currently reports the engineering name as MarketingName
and returns NULL when queried for AMDName.

- Report the current (engineering) name as AMDName instead of MarketingName.
- Use libpci to get MarketingName.



Change-Id: I819a6de7b067a2e724a6695e7d800274b83a71f8
Signed-off-by: Lan Xiao <Lan.Xiao@amd.com>


[ROCm/ROCR-Runtime commit: 9cbbf30be7]
2016-08-23 10:49:27 -04:00
Kent Russell 2d604a8498 queues.c: Enforce CUMaskCount being a multiple of 32
The thunk spec requires that CUMaskCount be divisible by 32. Check this
and return INVALID_PARAMETER if it is not.

Change-Id: I4e0c8502d996d3da31224b817a5d4ff2c6054e13


[ROCm/ROCR-Runtime commit: 70b1b5b17e]
2016-08-23 06:16:39 -04:00
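The check described in 2d604a8498 could look roughly like the sketch below. The status enum and function name are hypothetical stand-ins. The real thunk returns its own status type, which is not shown in the commit message.

```c
#include <stdint.h>

/* Hypothetical status codes standing in for the thunk's own status type. */
typedef enum {
    SKETCH_STATUS_SUCCESS = 0,
    SKETCH_STATUS_INVALID_PARAMETER = 1
} sketch_status_t;

/* Rejects any CU mask bit count that is not a multiple of 32. */
sketch_status_t check_cu_mask_count(uint32_t cu_mask_count)
{
    if (cu_mask_count % 32 != 0)
        return SKETCH_STATUS_INVALID_PARAMETER;
    return SKETCH_STATUS_SUCCESS;
}
```

Note that this sketch accepts a count of zero, since zero is divisible by 32. Whether zero should be rejected separately is not stated in the commit message.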
Yong Zhao f3e009e431 Fix a bug when mmap fails
EventId is needed to call hsaKmtDestroyEvent() when mmap fails,
so it should be set before the mmap call.

Change-Id: I5f4288b953611799a02b0e988d6b2e48104466a0


[ROCm/ROCR-Runtime commit: 9c9bfa30c0]
2016-08-18 14:30:03 -04:00
Amber Lin 9bc0411400 Add performance counters for gfx803
Counter IDs in SQ_PERFCOUNTER0_SELECT are identical on gfx803 10 and
gfx803 11.

Change-Id: I5cfefd44b52989efd1d89311cf8c70c84ea2b230


[ROCm/ROCR-Runtime commit: 0b5c65a903]
2016-08-08 18:10:51 -04:00
Amber Lin 194f2083d2 Add gfx803 support
Add gfx803 and gfx80311 device IDs to the list of supported devices.

Change-Id: I16220fd811db102c02e5e0c5b82e40ec299877af


[ROCm/ROCR-Runtime commit: 876384305b]
2016-08-08 11:30:57 -04:00
Amber Lin fd3b0ef0e5 Shorten the device list in PerfCounter
get_block_properties uses the complete DID to identify the GPU. The device
list grows too long as more devices are added. Reading the 12 most
significant digits is good enough to identify the GPU.


Change-Id: Ieebb05402bbe08af12eb7289dfeb5bbf1f515b0f


[ROCm/ROCR-Runtime commit: 6c4d19a9d2]
2016-07-27 17:21:31 -04:00
shaoyunl 18373473c3 libhsakmt: Compute context save area size depends on CU num
Change-Id: Iaf35ddeee9fe5a6367097483f67c4adaa0796d7d
Signed-off-by: shaoyunl <Shaoyun.Liu@amd.com>


[ROCm/ROCR-Runtime commit: bf16caa75f]
2016-06-10 10:19:40 -04:00
Amber Lin de82e820c2 Add performance counters for gfx70x
Add performance counters for gfx70x. The reference is the gfx7 register spec.
The register being looked at is SQ_PERFCOUNTER0_SELECT.

Change-Id: I344bfb7452f6148f4dc268163d12c553c6be8424


[ROCm/ROCR-Runtime commit: 6d21c4e753]
2016-05-20 16:24:36 -04:00
shaoyunl 62d4e557ed libhsakmt: Add new device id for virtualized function of gfx803
Signed-off-by: Shaoyun Liu <Shaoyun.liu@amd.com>
Change-Id: I90b0bdaeaed8e9e80375e5a7a142205f2a542289


[ROCm/ROCR-Runtime commit: 16d5aa0d83]
2016-05-12 13:25:01 -04:00
Felix Kuehling a6b5c17133 Report gfx70x engine ID as 7.0.1
Stepping 1 indicates higher double-precision float performance and
potentially other runtime workarounds needed for lack of PCIe atomics
on gfx70x.

Change-Id: I97185c1233e7d24caaf20a1eadea931d5a2bc664


[ROCm/ROCR-Runtime commit: fa102f3b8b]
2016-05-04 13:53:24 -04:00
Amber Lin 19b4a16ead Correct NumCaches against the CPU node
In a NUMA system, topology should report NumCaches as the number of caches
within the node, but the current code reports the total number of caches in
the system. This patch fixes that error. It also uses cpuid to get cache
information instead of reading sysfs files. See "Intel Corporation, Intel 64
and IA-32 Architectures Software Developer's Manual Volume 2 (2A, 2B & 2C):
Instruction Set Reference", 3-179, for the cpuid instruction features used in this patch.


Change-Id: I8ecece6c2b230741822620b44e66ddc201ff5112


[ROCm/ROCR-Runtime commit: 73ad0a1942]
2016-05-03 11:39:33 -04:00
Felix Kuehling 87bd249ed5 Add gfx70x support
Change-Id: I400adb62b5225ef3a42da279d067fb0a62907089


[ROCm/ROCR-Runtime commit: 97e51ce33d]
2016-04-25 14:27:44 -04:00
Andres Rodriguez e8d96eac7a package: rename to hsathk-rocm-dev
Since we include headers and not just a library anymore, we should be
considered a -dev package and not a lib package.

Change-Id: I220465ea4ffc8d66d8d76e6716e6c6c50cdacea1


[ROCm/ROCR-Runtime commit: 44572965f6]
2016-04-13 19:39:54 -04:00
Andres Rodriguez ade12f4ec1 Adopt new ROCm packaging guidelines
All files should go into /opt/rocm/$component

For developer convenience, a single include directory is created through
symlinks, from the component include directory to /opt/rocm/include.

Similarly, a unified linked directory is present in /opt/rocm/lib

The component lib directory should not include linker names (library
names without version numbers).

This commit also fixes 'make rpm' running correctly without the need for
sourcing build/envsetup.sh

Change-Id: I95a680f6d3e3bd1ae688d0694934a0577dbd007c


[ROCm/ROCR-Runtime commit: 9f355b78a0]
2016-04-11 18:30:54 -04:00
Felix Kuehling f0af6eceed Fix 4GB and larger system memory allocations
An intermediate size was stored in a 32-bit variable. This caused
4GB allocations to fail in KFD because the size was truncated to 0.
Larger allocations would allocate the wrong amount of memory.

Change-Id: If19dedf64952f1d2edd813793241e12c0362d220


[ROCm/ROCR-Runtime commit: 82b3fad320]
2016-04-11 11:17:06 -04:00
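The failure mode described in f0af6eceed is ordinary integer narrowing. The sketch below reproduces it in isolation. The function names are hypothetical and the code is not taken from the thunk.

```c
#include <stdint.h>

/* Buggy pattern: the byte count passes through a 32-bit intermediate. */
uint64_t size_via_32bit(uint64_t size_in_bytes)
{
    uint32_t intermediate = (uint32_t)size_in_bytes;  /* keeps only the low 32 bits */
    return intermediate;
}

/* Fixed pattern: the intermediate is as wide as the input. */
uint64_t size_via_64bit(uint64_t size_in_bytes)
{
    uint64_t intermediate = size_in_bytes;
    return intermediate;
}
```

With the 32-bit intermediate, a request of exactly 4 GiB (2^32 bytes) becomes 0, and a request of 4 GiB plus 4096 bytes becomes 4096, which matches the two symptoms in the commit message.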
Andres Rodriguez e4f1d95ef2 package: change install directory to /opt/rocm
Align with the rest of the driver stack on the new installation path
/opt/rocm/*

This mechanism for generating packages should be replaced with something
nicer and more standards-compliant in the future.

Change-Id: Ic31409b0d0b8f6ee4b25296d2580982a76aab564


[ROCm/ROCR-Runtime commit: 31861c838e]
2016-04-08 11:41:49 -04:00
David Ogbeide 9abf85c06b libhsakmt: get CPU model name from proc/cpuinfo
The HSA thunk is currently only aware of GPU node
model info; CPU names are NULL.



Signed-off-by: David Ogbeide <davidboyowa.ogbeide@amd.com>
Change-Id: I3c2adbb8566a5048b44c39fff4fd8228912468ff


[ROCm/ROCR-Runtime commit: 682776d89a]
2016-03-23 11:11:18 -04:00
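Reading the CPU model name from /proc/cpuinfo, as in 9abf85c06b, amounts to finding the first `model name` line and copying the text after the colon. The following is a minimal sketch under that assumption. The function names and the 256-byte line buffer are illustrative choices, not the thunk's implementation.

```c
#include <stdio.h>
#include <string.h>

/*
 * Parses one cpuinfo line. If it starts with "model name", copies the value
 * after the colon into out (truncating if needed) and returns 0.
 * Returns -1 for any other line.
 */
int parse_model_name(const char *line, char *out, size_t out_len)
{
    static const char key[] = "model name";
    const char *p;
    size_t n;

    if (out_len == 0 || strncmp(line, key, sizeof(key) - 1) != 0)
        return -1;
    p = strchr(line, ':');
    if (p == NULL)
        return -1;
    p++;
    while (*p == ' ' || *p == '\t')   /* skip padding after the colon */
        p++;
    n = strcspn(p, "\n");             /* stop at the end of the line */
    if (n >= out_len)
        n = out_len - 1;
    memcpy(out, p, n);
    out[n] = '\0';
    return 0;
}

/* Scans the file at path and returns 0 once a model name has been found. */
int read_cpu_model_name(const char *path, char *out, size_t out_len)
{
    char line[256];
    int ret = -1;
    FILE *fp = fopen(path, "r");

    if (fp == NULL)
        return -1;
    while (ret != 0 && fgets(line, sizeof(line), fp) != NULL)
        ret = parse_model_name(line, out, out_len);
    fclose(fp);
    return ret;
}
```

A caller would typically pass "/proc/cpuinfo" as the path. Splitting the line parser from the file reader keeps the parsing logic testable without access to procfs.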
Felix Kuehling a8a5960095 Add environment variable to disable GPU caching
This option may help debug synchronization or coherency issues
involving the GPU caches. It works only on dGPUs, by changing the
cache policy of the GPUVM default aperture to "coherent", which is
implemented as non-cached on current dGPU hardware.

Change-Id: I544ac9cc5c0cf1fa5c4e30f67aa42b3b5e44ae67


[ROCm/ROCR-Runtime commit: 06d391c6c9]
2016-03-17 18:51:47 -04:00
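The commit message for a8a5960095 does not name the environment variable, so the sketch below uses a clearly hypothetical name, `EXAMPLE_DISABLE_GPU_CACHE`, along with hypothetical policy constants. It only illustrates the general pattern of selecting a cache policy from an environment setting. Treating any non-empty value other than "0" as enabled is also an assumption.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical cache policies for the default GPUVM aperture. */
typedef enum {
    SKETCH_CACHE_DEFAULT,    /* normal cached behavior */
    SKETCH_CACHE_COHERENT    /* per the commit, non-cached on current dGPUs */
} sketch_cache_policy_t;

/* Maps the raw environment value to a policy. NULL means "not set". */
sketch_cache_policy_t policy_from_value(const char *value)
{
    if (value != NULL && value[0] != '\0' && strcmp(value, "0") != 0)
        return SKETCH_CACHE_COHERENT;
    return SKETCH_CACHE_DEFAULT;
}

/* Reads the hypothetical variable; the real variable name is not given. */
sketch_cache_policy_t policy_from_environment(void)
{
    return policy_from_value(getenv("EXAMPLE_DISABLE_GPU_CACHE"));
}
```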
Harish Kasiviswanathan 718e3600b8 Add QPI or HT io_links
Create QPI or HT links among all NUMA nodes. For now, assume all the
NUMA nodes are interconnected with the same Weight (=1).

Change-Id: Id48ba95b9d75515a186f7dc5006b19bd92743ae3


[ROCm/ROCR-Runtime commit: f1fbacca15]
2016-03-15 21:10:53 -04:00
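Connecting every NUMA node to every other node with a uniform weight, as 718e3600b8 describes, is a fully connected directed graph. A minimal sketch follows. The link struct and function name are hypothetical, and the thunk's own io_link properties carry more fields than shown here.

```c
#include <stddef.h>

/* Hypothetical, reduced io_link record. */
typedef struct {
    unsigned int from;
    unsigned int to;
    unsigned int weight;
} sketch_io_link_t;

/*
 * Writes one directed link for each ordered pair of distinct NUMA nodes,
 * all with weight 1. Returns the number of links written, or 0 if the
 * output array is too small (or if there is nothing to connect).
 */
size_t connect_numa_nodes(unsigned int num_nodes, sketch_io_link_t *links,
                          size_t capacity)
{
    size_t needed = num_nodes > 0 ? (size_t)num_nodes * (num_nodes - 1) : 0;
    size_t n = 0;

    if (needed > capacity)
        return 0;
    for (unsigned int from = 0; from < num_nodes; from++)
        for (unsigned int to = 0; to < num_nodes; to++)
            if (from != to)
                links[n++] = (sketch_io_link_t){ from, to, 1 };
    return n;
}
```

For N nodes this produces N * (N - 1) links, so three NUMA nodes yield six directed links.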
Harish Kasiviswanathan 14e60b6ab3 Get processor vendor from /proc/cpuinfo
Change-Id: I9039385d268ef1693fab121cbf1caf442129a12e


[ROCm/ROCR-Runtime commit: ee1dd5d9c2]
2016-03-15 15:37:52 -04:00
shaoyunl 0c6a45ca49 Add Imprecise flag for memory access fault
KFD may not be able to provide the precise VM fault address and status.
This flag indicates whether the event data has the fault details.

Change-Id: I15ffd5c25f555003c6450cc0700efb769418f76b


[ROCm/ROCR-Runtime commit: 79077811f5]
2016-03-14 15:17:17 -04:00
Felix Kuehling a31106ee4c Report SVM heap in topology
The Runtime requested this information so they can tell easily
whether a pointer is part of HSA shared address space or not.


Change-Id: If2041ed34031636677d692bc2dc6625634027ed4


[ROCm/ROCR-Runtime commit: 0ed29f5191]
2016-03-14 11:52:36 -04:00
Harish Kasiviswanathan 2f015053b2 Sync IOLINK defines to thunk spec
Current thunk spec v1.07 dated Feb 1, 2016

Change-Id: Ie1821f7f1903ac48b76cb68d452a6073d3a3c8d9


[ROCm/ROCR-Runtime commit: 1c1bc32477]
2016-03-11 18:59:57 -05:00
Harish Kasiviswanathan dbe8c8faba Fix indirect io_links
Connect only (Peer-to-Peer) GPUs that belong to the same NUMA node. Without
this additional check, non-direct GPUs would also get connected.

Change-Id: I9a5ed19b8f06cd0527854cbbdb51ede99eade28b


[ROCm/ROCR-Runtime commit: 8ff2bcd48d]
2016-03-11 18:54:32 -05:00
Felix Kuehling 68f1b37518 Fix lstopo
Lstopo doesn't have system memory mappings at low addresses. Make
sure we leave enough GPUVM address space for kernel allocations
(currently only CWSR) before the start of the user-managed SVM
aperture.

Change-Id: Ic197f7bd5a3cfb150a0da2bfdbc848664e7869be


[ROCm/ROCR-Runtime commit: cac0c08496]
2016-03-11 11:01:12 -05:00
Harish Kasiviswanathan 6409b00f8f Add indirect io_links
Connect (Peer-to-Peer) GPUs that belong to same NUMA node.
Connect all [GPU] <--> [Non Parent NUMA] node

Change-Id: Ib4b08a6545d28b7dce4c9b1a90378bfc51bed07e


[ROCm/ROCR-Runtime commit: 7042292c60]
2016-03-10 15:11:17 -05:00
Harish Kasiviswanathan 35e6692134 Allocate memory for indirect io_links
To simplify, allocate maximum needed memory for node_t->link array.
No need for realloc when indirect links are added. Trade off - for some
nodes more memory than required will be allocated.

This means the loop to compute the number of direct (reverse) io_links
for a CPU node is not necessary.

Change-Id: I2b2559142cbec3b262d0b4ea5fdebfd8f36c28fc


[ROCm/ROCR-Runtime commit: 1e729510d2]
2016-03-10 15:10:48 -05:00
Felix Kuehling 029002d073 Add support for hsaKmtRegisterGraphicsHandleToNodes
Change-Id: I6fd7154dea78188480d5cb89ac237bad572356c4


[ROCm/ROCR-Runtime commit: 61ec3df2f9]
2016-03-10 11:16:02 -05:00
Ben Goz c32a504b59 Support MapMemoryToGPUNodes on APU
Change-Id: Ie77a2eb23cd9fe6671ff9e0630977220218e55dd
Signed-off-by: Ben Goz <ben.goz@amd.com>


[ROCm/ROCR-Runtime commit: b1393f8224]
2016-03-09 21:31:52 -05:00
Felix Kuehling e2d2d6bd32 Update kfd_ioctl.h from kernel
Change-Id: I9852ef2e33e1f3b24343747e3c1c09b0050ffdc1


[ROCm/ROCR-Runtime commit: cb0315d31d]
2016-03-09 10:55:12 -05:00
Felix Kuehling f171fef754 Clean up GPUVM aperture management
Non-canonical GPUVM aperture doesn't exist on dGPUs. Remove comments
and code that say otherwise.

Fix alignment of GPUVM aperture for gfx801. Requires the same workaround
as gfx802. It's not used for anything on gfx801 yet, but will be soon.

Change-Id: I88607fe7b340081cc0715b85f28fdbf5f1bb0ad7


[ROCm/ROCR-Runtime commit: b837c3e7b0]
2016-03-09 10:55:12 -05:00
Yair Shachar 4c543389c7 name unnamed struct within HsaMemMapFlags union
For aligning with RT definitions

Change-Id: I4dca0c5818fdcea6c596a48c7516835fc595a289
Signed-off-by: Yair Shachar <Yair.Shachar@amd.com>


[ROCm/ROCR-Runtime commit: c42ec0b82c]
2016-03-07 18:43:03 +02:00
Harish Kasiviswanathan ac547f8cb2 Add reverse direct io_links
The Kernel only creates one way direct link -
	GPU(PCI_BUS) --> [Parent NUMA Node]

Create the reverse direct io_link here -
	[Parent NUMA Node] -->  GPU(PCI_BUS)

Change-Id: I829a1b1b7f34bda42871ede3472d60915e88418c


[ROCm/ROCR-Runtime commit: 1d1c30db7c]
2016-03-04 15:54:03 -05:00
Harish Kasiviswanathan 5fc05ab059 Add free_nodes() helper function
Change-Id: I18ae0ac91b05275d7ad9d93175bae06870080844


[ROCm/ROCR-Runtime commit: a80d2f2303]
2016-03-03 18:33:59 -05:00
Andres Rodriguez 682178796f README: spelling and date fixes
Change-Id: I51fa196971b91ea71fd8b0abe169fe23502ebb96


[ROCm/ROCR-Runtime commit: 7c376247b5]
2016-03-02 18:42:01 -05:00
Andres Rodriguez 08c4246009 readme: add an initial README.md file
This is a simple README.md since most of the details should be in the
ROCK project.

Change-Id: I3175e2a5ade0f9ecb913076a4842b528f14947f0


[ROCm/ROCR-Runtime commit: 35e8fc6b15]
2016-03-02 18:42:01 -05:00
Ben Goz 8baf22651d Align hsaKmtMapMemoryToGPUNodes with the thunk spec
Change-Id: I507ba5c6029ca5e7088c25930d46f5221679ace4
Signed-off-by: Ben Goz <ben.goz@amd.com>


[ROCm/ROCR-Runtime commit: e2fb4bc312]
2016-03-02 16:12:03 +02:00
shaoyunl 8067849931 Export libKmtSetTrapHandler symbol as global
Change-Id: I065dbecd05e992bc528128d893edaf636c1beff7


[ROCm/ROCR-Runtime commit: fea5ab9114]
2016-03-01 10:30:02 -05:00