Commit Graph

2925 Commits

Author SHA1 Message Date
shaoyunl 0c6a45ca49 Add Imprecise flag for memory access fault
KFD may not be able to provide the precise VM fault address and status.
This flag indicates whether the event data has the fault details.

Change-Id: I15ffd5c25f555003c6450cc0700efb769418f76b


[ROCm/ROCR-Runtime commit: 79077811f5]
2016-03-14 15:17:17 -04:00
Felix Kuehling a31106ee4c Report SVM heap in topology
The Runtime requested this information so they can tell easily
whether a pointer is part of HSA shared address space or not.


Change-Id: If2041ed34031636677d692bc2dc6625634027ed4


[ROCm/ROCR-Runtime commit: 0ed29f5191]
2016-03-14 11:52:36 -04:00
Harish Kasiviswanathan 2f015053b2 Sync IOLINK defines to thunk spec
Current thunk spec v1.07 dated Feb 1, 2016

Change-Id: Ie1821f7f1903ac48b76cb68d452a6073d3a3c8d9


[ROCm/ROCR-Runtime commit: 1c1bc32477]
2016-03-11 18:59:57 -05:00
Harish Kasiviswanathan dbe8c8faba Fix indirect io_links
Connect only (Peer-to-Peer) GPUs that belong to the same NUMA node. Without
this additional check, non-direct GPUs would also get connected.

Change-Id: I9a5ed19b8f06cd0527854cbbdb51ede99eade28b


[ROCm/ROCR-Runtime commit: 8ff2bcd48d]
2016-03-11 18:54:32 -05:00
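The rule from dbe8c8faba (peer-to-peer links only between GPUs under the same NUMA node) could be sketched in C roughly as follows. The `gpu_t` type, its `parent_numa` field, and both helper functions are hypothetical names chosen for illustration; they are not taken from the thunk sources.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified GPU record: only the parent NUMA node matters here. */
typedef struct {
    int node_id;      /* topology node id of the GPU */
    int parent_numa;  /* NUMA node the GPU's direct io_link points at */
} gpu_t;

/* Two distinct GPUs get an indirect peer-to-peer link only when they
 * hang off the same NUMA node. This is the additional check the commit
 * message describes. */
static bool should_link_peers(const gpu_t *a, const gpu_t *b)
{
    return a->node_id != b->node_id && a->parent_numa == b->parent_numa;
}

/* Count the directed peer links that would be created for a set of GPUs. */
static size_t count_peer_links(const gpu_t *gpus, size_t n)
{
    size_t links = 0;
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < n; j++)
            if (should_link_peers(&gpus[i], &gpus[j]))
                links++;  /* one directed link i -> j */
    return links;
}
```

With two GPUs on NUMA node 0 and one on NUMA node 1, only the first two would be linked to each other (one link in each direction).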
Felix Kuehling 68f1b37518 Fix lstopo
Lstopo doesn't have system memory mappings at low addresses. Make
sure we leave enough GPUVM address space for kernel allocations
(currently only CWSR) before the start of the user-managed SVM
aperture.

Change-Id: Ic197f7bd5a3cfb150a0da2bfdbc848664e7869be


[ROCm/ROCR-Runtime commit: cac0c08496]
2016-03-11 11:01:12 -05:00
Harish Kasiviswanathan 6409b00f8f Add indirect io_links
Connect (Peer-to-Peer) GPUs that belong to same NUMA node.
Connect all [GPU] <--> [Non Parent NUMA] node

Change-Id: Ib4b08a6545d28b7dce4c9b1a90378bfc51bed07e


[ROCm/ROCR-Runtime commit: 7042292c60]
2016-03-10 15:11:17 -05:00
Harish Kasiviswanathan 35e6692134 Allocate memory for indirect io_links
To simplify, allocate maximum needed memory for node_t->link array.
No need for realloc when indirect links are added. Trade off - for some
nodes more memory than required will be allocated.

This means the loop to compute the number of direct (reverse) io_links
for a CPU node is not necessary.

Change-Id: I2b2559142cbec3b262d0b4ea5fdebfd8f36c28fc


[ROCm/ROCR-Runtime commit: 1e729510d2]
2016-03-10 15:10:48 -05:00
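The trade-off described in 35e6692134 (size the link array once for the worst case so that adding indirect links never needs a realloc) might look like the C sketch below. The `link_t` and `node_t` layouts, the helper names, and the upper bound of one link to every other node are assumptions made for illustration only.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the thunk's link and node records. */
typedef struct { uint32_t from, to; } link_t;
typedef struct {
    link_t  *link;       /* link array, sized once for the worst case */
    uint32_t num_links;  /* links actually in use */
    uint32_t capacity;   /* slots allocated */
} node_t;

/* Assumption: a node links to at most every other node in the system,
 * so (num_nodes - 1) slots is an upper bound. Some nodes will end up
 * with more memory than they need. */
static int node_alloc_links(node_t *node, uint32_t num_nodes)
{
    uint32_t cap = num_nodes > 0 ? num_nodes - 1 : 0;

    node->link = cap ? calloc(cap, sizeof(*node->link)) : NULL;
    if (cap && !node->link)
        return -1;
    node->num_links = 0;
    node->capacity = cap;
    return 0;
}

/* Appending never reallocates; it only fails if the bound is exceeded. */
static int node_add_link(node_t *node, uint32_t from, uint32_t to)
{
    if (node->num_links >= node->capacity)
        return -1;
    node->link[node->num_links++] = (link_t){ from, to };
    return 0;
}
```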
Felix Kuehling 029002d073 Add support for hsaKmtRegisterGraphicsHandleToNodes
Change-Id: I6fd7154dea78188480d5cb89ac237bad572356c4


[ROCm/ROCR-Runtime commit: 61ec3df2f9]
2016-03-10 11:16:02 -05:00
Ben Goz c32a504b59 Support MapMemoryToGPUNodes on APU
Change-Id: Ie77a2eb23cd9fe6671ff9e0630977220218e55dd
Signed-off-by: Ben Goz <ben.goz@amd.com>


[ROCm/ROCR-Runtime commit: b1393f8224]
2016-03-09 21:31:52 -05:00
Felix Kuehling e2d2d6bd32 Update kfd_ioctl.h from kernel
Change-Id: I9852ef2e33e1f3b24343747e3c1c09b0050ffdc1


[ROCm/ROCR-Runtime commit: cb0315d31d]
2016-03-09 10:55:12 -05:00
Felix Kuehling f171fef754 Clean up GPUVM aperture management
Non-canonical GPUVM aperture doesn't exist on dGPUs. Remove comments
and code that say otherwise.

Fix alignment of GPUVM aperture for gfx801. Requires the same workaround
as gfx802. It's not used for anything on gfx801 yet, but will be soon.

Change-Id: I88607fe7b340081cc0715b85f28fdbf5f1bb0ad7


[ROCm/ROCR-Runtime commit: b837c3e7b0]
2016-03-09 10:55:12 -05:00
Yair Shachar 4c543389c7 Name unnamed struct within HsaMemMapFlags union
For aligning with RT definitions

Change-Id: I4dca0c5818fdcea6c596a48c7516835fc595a289
Signed-off-by: Yair Shachar <Yair.Shachar@amd.com>


[ROCm/ROCR-Runtime commit: c42ec0b82c]
2016-03-07 18:43:03 +02:00
Harish Kasiviswanathan ac547f8cb2 Add reverse direct io_links
The Kernel only creates one way direct link -
	GPU(PCI_BUS) --> [Parent NUMA Node]

Create the reverse direct io_link here -
	[Parent NUMA Node] -->  GPU(PCI_BUS)

Change-Id: I829a1b1b7f34bda42871ede3472d60915e88418c


[ROCm/ROCR-Runtime commit: 1d1c30db7c]
2016-03-04 15:54:03 -05:00
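The mirroring step described in ac547f8cb2 can be illustrated with a small C sketch. The `io_link_t` record and the `make_reverse_links` helper are hypothetical; the real code works on the topology structures read from sysfs.

```c
#include <stddef.h>

/* Hypothetical directed link between two topology nodes. */
typedef struct { int from; int to; } io_link_t;

/* For every kernel-reported GPU -> parent NUMA link in fwd[0..n), write
 * the mirrored parent NUMA -> GPU link into rev[0..n).
 * Returns the number of reverse links written. */
static size_t make_reverse_links(const io_link_t *fwd, size_t n, io_link_t *rev)
{
    for (size_t i = 0; i < n; i++) {
        rev[i].from = fwd[i].to;
        rev[i].to   = fwd[i].from;
    }
    return n;
}
```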
Harish Kasiviswanathan 5fc05ab059 Add free_nodes() helper function
Change-Id: I18ae0ac91b05275d7ad9d93175bae06870080844


[ROCm/ROCR-Runtime commit: a80d2f2303]
2016-03-03 18:33:59 -05:00
Andres Rodriguez 682178796f README: spelling and date fixes
Change-Id: I51fa196971b91ea71fd8b0abe169fe23502ebb96


[ROCm/ROCR-Runtime commit: 7c376247b5]
2016-03-02 18:42:01 -05:00
Andres Rodriguez 08c4246009 readme: add an initial README.md file
This is a simple README.md since most of the details should be in the
ROCK project.

Change-Id: I3175e2a5ade0f9ecb913076a4842b528f14947f0


[ROCm/ROCR-Runtime commit: 35e8fc6b15]
2016-03-02 18:42:01 -05:00
Ben Goz 8baf22651d Align hsaKmtMapMemoryToGPUNodes according to thunk spec
Change-Id: I507ba5c6029ca5e7088c25930d46f5221679ace4
Signed-off-by: Ben Goz <ben.goz@amd.com>


[ROCm/ROCR-Runtime commit: e2fb4bc312]
2016-03-02 16:12:03 +02:00
shaoyunl 8067849931 Export libKmtSetTrapHandler symbol as global
Change-Id: I065dbecd05e992bc528128d893edaf636c1beff7


[ROCm/ROCR-Runtime commit: fea5ab9114]
2016-03-01 10:30:02 -05:00
Harish Kasiviswanathan 268045084d Fix io_links sysfs directory name typo
Change-Id: I4f6fb43c4a038b94c0f94f66ee383e83ad0ffa62


[ROCm/ROCR-Runtime commit: bf03058112]
2016-02-29 11:15:29 -05:00
Jay Cornwall 537f217f11 Fix race in dGPU event page setup
events_page is unprotected from multiple allocation. The first event
creation ioctl is unprotected from a race with args.event_page_offset
being set (for page setup) and null (all subsequent invocations).

Change-Id: I40ba712a17e9eff257785f90c553a74ad09c661d
Signed-off-by: Yair Shachar <Yair.Shachar@amd.com>


[ROCm/ROCR-Runtime commit: 3a662ac712]
2016-02-28 07:14:23 -05:00
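The kind of race fixed in 537f217f11 is commonly closed by guarding the one-time setup with a mutex. The C sketch below shows that general pattern and is not the actual patch; `events_lock`, `setup_events_page`, `get_events_page`, and `setup_calls` are hypothetical names, and the static buffer stands in for the real allocation and first event-creation ioctl.

```c
#include <pthread.h>
#include <stddef.h>

static pthread_mutex_t events_lock = PTHREAD_MUTEX_INITIALIZER;
static void *events_page;   /* shared page, must be set up at most once */
static int setup_calls;     /* counts how often the setup path ran */

/* Hypothetical stand-in for allocating the page and issuing the first
 * event-creation ioctl (the one that carries the page offset). */
static void *setup_events_page(void)
{
    static char page[4096];
    setup_calls++;
    return page;
}

/* Check-and-set happens under the lock, so two threads can never both
 * see a NULL events_page and allocate it twice. */
static void *get_events_page(void)
{
    void *page;

    pthread_mutex_lock(&events_lock);
    if (events_page == NULL)
        events_page = setup_events_page();
    page = events_page;
    pthread_mutex_unlock(&events_lock);
    return page;
}
```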
Felix Kuehling d56931260a Fix address space leak in __fmm_release
Use the object size when freeing address space, instead of the
parameter passed in by the caller. The parameter may be incorrect
due to app or runtime bugs, or when the buffer is an AQL ring
buffer with double mapping workaround.


Change-Id: I00bb31d4520ef969a49d6d5ea723e8a33418acc3


[ROCm/ROCR-Runtime commit: 006f3ee41b]
2016-02-26 09:19:21 -05:00
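The idea behind d56931260a (release the size recorded in the object rather than the size the caller passes in) can be shown with a toy accounting model in C. The `aperture_t` and `vm_object_t` types and both helpers are hypothetical and much simpler than the real fmm.c bookkeeping.

```c
#include <stdint.h>

/* Toy model: an aperture that only tracks how many bytes are in use. */
typedef struct { uint64_t used; } aperture_t;

/* Hypothetical object record that remembers its own size. */
typedef struct { uint64_t size; } vm_object_t;

static void aperture_alloc(aperture_t *ap, vm_object_t *obj, uint64_t size)
{
    obj->size = size;   /* remember the real size at allocation time */
    ap->used += size;
}

/* caller_size is deliberately ignored: it may be wrong because of app or
 * runtime bugs, so the recorded object size is the one that is released. */
static void aperture_release(aperture_t *ap, vm_object_t *obj,
                             uint64_t caller_size)
{
    (void)caller_size;
    ap->used -= obj->size;
    obj->size = 0;
}
```

In this model, releasing an 8192-byte object with a caller-supplied size of 4096 still returns all 8192 bytes to the aperture.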
Felix Kuehling 8ae4e547bc Use aligned size for looking up userptr object after allocation
The alignment performed in vm_find_object_by_address isn't sufficient
because it doesn't take into account the offset from the start of the
page.

This fixes a bug where certain unaligned userpointers and sizes fail
to register correctly.

Change-Id: I17872e264467a619f5e1bedb7e1ed3d994a856bf


[ROCm/ROCR-Runtime commit: 8a0161d6bb]
2016-02-25 19:47:05 -05:00
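The alignment issue described in 8ae4e547bc comes down to including the pointer's offset within its first page before rounding up. A minimal C sketch, assuming a 4 KiB page size; `SKETCH_PAGE_SIZE` and `userptr_aligned_size` are illustrative names, not identifiers from the thunk:

```c
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u  /* assumed page size for this sketch */

/* Number of bytes, rounded up to whole pages, needed to cover
 * [addr, addr + size). The offset of addr inside its first page is
 * added before rounding; rounding size alone can miss the last page. */
static uint64_t userptr_aligned_size(uint64_t addr, uint64_t size)
{
    uint64_t offset = addr & (SKETCH_PAGE_SIZE - 1);

    return (offset + size + SKETCH_PAGE_SIZE - 1) &
           ~(uint64_t)(SKETCH_PAGE_SIZE - 1);
}
```

For example, a 32-byte buffer starting 16 bytes before a page boundary spans two pages, although rounding the size alone would give only one.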
Ramesh Errabolu (xN/A) TX 4fb3765f77 Configure AQL packet header with System Scope for flush
[git-p4: depot-paths = "//depot/stg/hsa/drivers/hsa/runtime/": change = 1240170]


[ROCm/ROCR-Runtime commit: f7693cf777]
2016-02-24 14:08:35 -05:00
Ben Goz 18953cfa9a Mapping public VRAM BO to cpu
Change-Id: I2ff62ff0784f8ce556ad80739a177b90d866f1b4
Signed-off-by: Ben Goz <ben.goz@amd.com>


[ROCm/ROCR-Runtime commit: 3f02a3cf0b]
2016-02-24 17:30:15 +02:00
Felix Kuehling 6d856ebbff Fix memory leaks due to stale CPU mappings
Use the aligned size of the buffer objects for CPU unmapping in
__fmm_release instead of relying on the unaligned size passed in by
the caller.

Change-Id: If986ec24e9a05d32981549fddbf143221fc40bac


[ROCm/ROCR-Runtime commit: 7a383f9d88]
2016-02-16 18:12:05 -05:00
Felix Kuehling 99325bf7c4 Add support for register/deregister memory for dGPU
Allocate SVM address space for the registered memory and use new
userptr support in KFD to create a system memory BO associated with
the given user pointer. Map this BO at the SVM address for CPU
access.

MapMemoryToGPU can be used with the registered user pointer and
will return the SVM address as alternate GPUVA.

Change-Id: I4886e193c51fb6870a567878870c36bf8b5c3748


[ROCm/ROCR-Runtime commit: 85f9efb1a0]
2016-02-16 18:12:05 -05:00
Ben Goz 89905c0cd7 Align gpu-id-array size to multiple of sizeof(uint32_t)
Change-Id: I9f46b6a331a8d928ef570b420fb60b99b2edfdd1
Signed-off-by: Ben Goz <ben.goz@amd.com>


[ROCm/ROCR-Runtime commit: 00386734b1]
2016-02-16 11:27:06 -05:00
Besar Wicaksono (xN/A) TX [TEXT] 62ea8e12e3 Modify MatrixMultiplication sample to use memory pool API
[git-p4: depot-paths = "//depot/stg/hsa/drivers/hsa/runtime/": change = 1237420]


[ROCm/ROCR-Runtime commit: bbe0be05d4]
2016-02-16 11:12:25 -05:00
Besar Wicaksono (xN/A) TX [TEXT] f166016d8a Add sample application to use the new memory pool API.
Details:
- add HsaGetInfo program that prints out all available CPU, GPU and their respective memory pools.

[git-p4: depot-paths = "//depot/stg/hsa/drivers/hsa/runtime/": change = 1237219]


[ROCm/ROCR-Runtime commit: c494af9d49]
2016-02-15 18:11:44 -05:00
Harish Kasiviswanathan e8327090b9 gfx803: Add performance counter information
Change-Id: Id81b43e90029306f03c84752cef06dc336e3a4a9


[ROCm/ROCR-Runtime commit: 04b92b8e05]
2016-02-12 16:39:39 -05:00
Harish Kasiviswanathan f4f0ffc8cb Adding missing performance counters for gfx801
A few more counters are now available in the GFX8 register specs, so add
them. Also report the correct number of SQ perf counter slots for gfx700 and gfx801.

Change-Id: I9e6b4b10238230aabeccbfaa5e491a28b5e54f2d


[ROCm/ROCR-Runtime commit: 1a0f915957]
2016-02-12 16:37:21 -05:00
Ramesh Errabolu (xN/A) TX b33f9613fa Populate Cpu and Gpu nodes into different agent lists
[git-p4: depot-paths = "//depot/stg/hsa/drivers/hsa/runtime/": change = 1236865]


[ROCm/ROCR-Runtime commit: 2280190f70]
2016-02-12 16:14:39 -05:00
Ben Goz 0dc374e1a4 Fix double free issue and pointer alignment
Change-Id: Id5bab454d53d404883a92282168b3f6cbc468cbb
Signed-off-by: Ben Goz <ben.goz@amd.com>


[ROCm/ROCR-Runtime commit: b37f99a01e]
2016-02-12 11:21:32 -05:00
Kent Russell 1b6994a2dc Fix build location for thunk RPM
Change-Id: I4f5c7688a3e9b4dd31d8d72cae3adf9a796e38f9


[ROCm/ROCR-Runtime commit: cd6d75880f]
2016-02-12 08:29:52 -05:00
Felix Kuehling 03720306b9 Make hsaKmtAllocMemory more compliant with the Thunk spec
Allocations from GPU nodes will return VRAM, not system memory.
Only non-paged allocation from GPU nodes is supported. System
memory can only be allocated from CPU nodes (usually node 0).

The HostAccess flag is no longer used to distinguish the memory
type. It only indicates whether the memory is mapped for CPU
access.

Maintain compatibility with broken KfdTests by returning system
memory for paged-memory requested from GPU nodes.

Change-Id: I514defede735f55e6de436f41944125b6f2c4ccf


[ROCm/ROCR-Runtime commit: 887b32fe86]
2016-02-10 10:29:54 -05:00
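The allocation policy described in 03720306b9 can be summarized as a small decision function. This C sketch only restates the commit message; the enum, the function name, and the boolean parameters are hypothetical and do not mirror the real hsaKmtAllocMemory signature or flag structure.

```c
#include <stdbool.h>

typedef enum { MEM_SYSTEM, MEM_VRAM } mem_type_t;  /* hypothetical */

/* Pick the memory type from the node kind and the paging request.
 * host_access is accepted but ignored here on purpose: per the commit
 * message it only says whether the memory gets a CPU mapping. */
static mem_type_t pick_mem_type(bool is_gpu_node, bool non_paged,
                                bool host_access)
{
    (void)host_access;

    if (!is_gpu_node)
        return MEM_SYSTEM;  /* CPU nodes (usually node 0): system memory */
    if (non_paged)
        return MEM_VRAM;    /* GPU node, non-paged: VRAM */
    return MEM_SYSTEM;      /* compatibility path for paged requests
                               from GPU nodes (broken KfdTests) */
}
```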
Yair Shachar 8359dc3119 Disable scratch Host allocation - via debug registration flags.
Change-Id: Ia6e5f86ec3979c4a49800f7af4509442a4e5be27
Signed-off-by: Yair Shachar <Yair.Shachar@amd.com>


[ROCm/ROCR-Runtime commit: a815a4337f]
2016-02-10 07:52:32 -05:00
Ben Goz 18aab410cc Adding support to hsaKmtMapMemoryToGPUNodes
Change-Id: Iab6222402a43c3cd31b0efc5a316a6482986258e
Signed-off-by: Ben Goz <ben.goz@amd.com>


[ROCm/ROCR-Runtime commit: 7070f7ec5e]
2016-02-09 17:34:29 +02:00
Ding, Wei (xN/A) TX a1837859ef Changes 5 hsail apps for supporting gfx803.
[git-p4: depot-paths = "//depot/stg/hsa/drivers/hsa/runtime/": change = 1235366]


[ROCm/ROCR-Runtime commit: df99562905]
2016-02-08 15:39:18 -05:00
shaoyunl 60bbf00fb1 libhsaKmt: Add CWSR support on dGPU
This is the thunk part of the CWSR support.
1. SDMA queues don't support CWSR, so there is no need to allocate the context save/restore memory for them
2. Allocate the context save/restore memory in the local frame buffer for dGPU

Change-Id: Ie83506f0cced2a5a537c49d68125796d831c2764


[ROCm/ROCR-Runtime commit: 4e6c25e55b]
2016-02-04 15:00:58 -05:00
shaoyunl 4c5a3ca774 libhsakmt: Use GPU ID instead of Node ID in set_process_dgpu_aperture
Change-Id: I0e66ca4a018c15c009a3516d250f0044a4407878


[ROCm/ROCR-Runtime commit: 7e40877e81]
2016-02-04 10:32:23 -05:00
Andres Rodriguez cd849bc3e9 Bump version for bugfix release 1.8.1
Change-Id: I06701905592594221d26c075a8fe370b4cc92aff


[ROCm/ROCR-Runtime commit: 3797b56ec9]
2016-02-02 01:29:51 -05:00
Ben Goz 07a0c70dd5 Adding HsaMemMapFlags struct
Change-Id: Ib0ee6dede1169582fd58bfca648347c3f8aa0b54
Signed-off-by: Ben Goz <ben.goz@amd.com>


[ROCm/ROCR-Runtime commit: e37863d7f2]
2016-01-31 05:16:53 -05:00
Felix Kuehling 61039bcd36 Remove gfx802 page size workaround on gfx803
All tonga page size alignment is done in the memory management
functions in fmm.c. All other code only specifies the minimum
alignment it needs and lets fmm.c handle the HW-specific
alignment.

Clean up aligned-exec memory allocation in queue.c to remove
hard-coded TONGA_PAGE_SIZE alignments and remove code duplication.
Make sure alignments are consistent between allocate and free.

Change-Id: Ia8923448173d1cef315af24cebff12adef385cb0


[ROCm/ROCR-Runtime commit: cc9fc386bd]
2016-01-28 16:05:18 -05:00
David Ogbeide 8fce9f7026 libhsakmt: Add marketing names for GPU nodes
HSA thunk API returns null when querying for GPU node marketing
names due to empty system topology file.

- Add marketing names to device GFX IP data structs.
- Modify name retrieval to pull from data structs instead of file.

Signed-off-by: David Ogbeide <davidboyowa.ogbeide@amd.com>

Change-Id: I30ea04111be7e0df2e93894f801fbeb414ffa790


[ROCm/ROCR-Runtime commit: 4e4a881940]
2016-01-25 11:03:54 -05:00
Felix Kuehling 8ea4e037c8 Add simple test for unloading and reloading Thunk
Change-Id: I4ca95dee8a180023d1de5f69161607dd368164de


[ROCm/ROCR-Runtime commit: 641bfd2cd5]
2016-01-22 18:41:53 -05:00
Felix Kuehling db5b6fd35a Link libhsakmt with -z nodelete
This prevents the library from being unloaded at runtime, even when
dlclose is called. This preserves global variables, such as state
about the SVM address space and avoids catastrophic leaks on dlclose.

Change-Id: I34f1d19a450835200e9d4815458e8d1b3045053c


[ROCm/ROCR-Runtime commit: cc7491ec71]
2016-01-22 18:08:19 -05:00
Amber Lin 07500db1df Revert "Free resources when dlclose is called"
This reverts commit 4dd9dbb128.

Conflicts:
	src/fmm.c
	src/perfctr.c

Change-Id: Ib6113c2dd3962c72100c7f74cdef6897e1df40b3


[ROCm/ROCR-Runtime commit: 7416805a44]
2016-01-22 17:58:33 -05:00
Serguei Sagalovitch f5bebcf875 Fixed logic to return data back to user
Change-Id: I324d07c38e8d7eb202d4dccfed6e62006cf9cd29
Signed-off-by: Serguei Sagalovitch <Serguei.Sagalovitch@amd.com>


[ROCm/ROCR-Runtime commit: f44982a7ca]
2016-01-22 14:49:18 -05:00
Serguei Sagalovitch b10380d783 Skeleton for RDMA unit test v4
Added application and driver to serve as the starting point for RDMA
unit test utility.

v2: Added initial mmap support
v3: Fixed logic to find correct ioctl handler
v4: Fixed logic in mmap to find correct pages table

Change-Id: Iaf97c0eb2acef2160d542c71afed58cf400414f7
Signed-off-by: Serguei Sagalovitch <Serguei.Sagalovitch@amd.com>


[ROCm/ROCR-Runtime commit: 47cef87a34]
2016-01-21 15:20:24 -05:00
Harish Kasiviswanathan b687eaf2c2 Don't limit number of supported HSA Nodes
Remove #define MAX_NODES 8

Change-Id: I756cadc652543dd17ea48a1c956adc08c3d2631a


[ROCm/ROCR-Runtime commit: 5e53205b9e]
2016-01-15 17:27:43 -05:00