rocm-systems

Author	SHA1	Message	Date
Felix Kuehling	97e51ce33d	Add gfx70x support Change-Id: I400adb62b5225ef3a42da279d067fb0a62907089	2016-04-25 14:27:44 -04:00
Andres Rodriguez	44572965f6	package: rename to hsathk-rocm-dev Since we include headers and not just a library anymore, we should be considered a -dev package and not a lib package. Change-Id: I220465ea4ffc8d66d8d76e6716e6c6c50cdacea1	2016-04-13 19:39:54 -04:00
Andres Rodriguez	9f355b78a0	Adopt new ROCm packaging guidelines All files should go into /opt/rocm/$component For developer convenience, a single include directory is created through symlinks, from the component include directory to /opt/rocm/include. Similarly, a unified linked directory is present in /opt/rocm/lib The component lib directory should not include linker names (library names without version numbers). This commit also fixes 'make rpm' running correctly without the need for sourcing build/envsetup.sh Change-Id: I95a680f6d3e3bd1ae688d0694934a0577dbd007c	2016-04-11 18:30:54 -04:00
Felix Kuehling	82b3fad320	Fix 4GB and larger system memory allocations Intermediate size was stored in a 32-bit variable. This caused 4GB allocations to fail in KFD due to 0 size. Larger allocations would allocate the wrong amount of memory. Change-Id: If19dedf64952f1d2edd813793241e12c0362d220	2016-04-11 11:17:06 -04:00
Andres Rodriguez	31861c838e	package: change install directory to /opt/rocm Align with the rest of the driver stack on the new installation path /opt/rocm/* This mechanism for generating packages should be changed for something nicer and more standards compliant in the future. Change-Id: Ic31409b0d0b8f6ee4b25296d2580982a76aab564	2016-04-08 11:41:49 -04:00
David Ogbeide	682776d89a	libhsakmt: get CPU model name from proc/cpuinfo HSA thunk is currently only aware of GPU node model info, CPU names are NULL. Signed-off-by: David Ogbeide <davidboyowa.ogbeide@amd.com> Change-Id: I3c2adbb8566a5048b44c39fff4fd8228912468ff	2016-03-23 11:11:18 -04:00
Felix Kuehling	06d391c6c9	Add environment variable to disable GPU caching This option may help debug synchronization or coherency issues involving the GPU caches. It works only on dGPUs, by changing the cache policy of the GPUVM default aperture to "coherent", which is implemented as non-cached on current dGPU hardware. Change-Id: I544ac9cc5c0cf1fa5c4e30f67aa42b3b5e44ae67	2016-03-17 18:51:47 -04:00
Harish Kasiviswanathan	f1fbacca15	Add QPI or HT io_links Create QPI or HT links among all NUMA nodes. For now, assume all the NUMA nodes are interconnected with same Weight (=1). Change-Id: Id48ba95b9d75515a186f7dc5006b19bd92743ae3	2016-03-15 21:10:53 -04:00
Harish Kasiviswanathan	ee1dd5d9c2	Get processor vendor from /proc/cpuinfo Change-Id: I9039385d268ef1693fab121cbf1caf442129a12e	2016-03-15 15:37:52 -04:00
shaoyunl	79077811f5	Add Imprecise flag for memory access fault KFD may not be able to provide the precise VM fault address and status. This flag will indicate whether the event data has the fault details Change-Id: I15ffd5c25f555003c6450cc0700efb769418f76b	2016-03-14 15:17:17 -04:00
Felix Kuehling	0ed29f5191	Report SVM heap in topology The Runtime requested this information so they can tell easily whether a pointer is part of HSA shared address space or not. Change-Id: If2041ed34031636677d692bc2dc6625634027ed4	2016-03-14 11:52:36 -04:00
Harish Kasiviswanathan	1c1bc32477	Sync IOLINK defines to thunk spec Current thunk spec v1.07 dated Feb 1, 2016 Change-Id: Ie1821f7f1903ac48b76cb68d452a6073d3a3c8d9	2016-03-11 18:59:57 -05:00
Harish Kasiviswanathan	8ff2bcd48d	Fix indirect io_links Connect only (Peer-to-Peer) GPUs that belong to same NUMA node. Without this additional check non direct GPUs would also get connected. Change-Id: I9a5ed19b8f06cd0527854cbbdb51ede99eade28b	2016-03-11 18:54:32 -05:00
Felix Kuehling	cac0c08496	Fix lstopo Lstopo doesn't have system memory mappings at low addresses. Make sure we leave enough GPUVM address space for kernel allocations (currently only CWSR) before the start of the user-managed SVM aperture. Change-Id: Ic197f7bd5a3cfb150a0da2bfdbc848664e7869be	2016-03-11 11:01:12 -05:00
Harish Kasiviswanathan	7042292c60	Add indirect io_links Connect (Peer-to-Peer) GPUs that belong to same NUMA node. Connect all [GPU] <--> [Non Parent NUMA] node Change-Id: Ib4b08a6545d28b7dce4c9b1a90378bfc51bed07e	2016-03-10 15:11:17 -05:00
Harish Kasiviswanathan	1e729510d2	Allocate memory for indirect io_links To simplify, allocate maximum needed memory for node_t->link array. No need for realloc when indirect links are added. Trade off - for some nodes more memory than required will be allocated. This means the loop to compute the number of direct (reverse) io_links for a CPU node is not necessary. Change-Id: I2b2559142cbec3b262d0b4ea5fdebfd8f36c28fc	2016-03-10 15:10:48 -05:00
Felix Kuehling	61ec3df2f9	Add support for hsaKmtRegisterGraphicsHandleToNodes Change-Id: I6fd7154dea78188480d5cb89ac237bad572356c4	2016-03-10 11:16:02 -05:00
Ben Goz	b1393f8224	Support MapMemoryToGPUNodes on APU Change-Id: Ie77a2eb23cd9fe6671ff9e0630977220218e55dd Signed-off-by: Ben Goz <ben.goz@amd.com>	2016-03-09 21:31:52 -05:00
Felix Kuehling	cb0315d31d	Update kfd_ioctl.h from kernel Change-Id: I9852ef2e33e1f3b24343747e3c1c09b0050ffdc1	2016-03-09 10:55:12 -05:00
Felix Kuehling	b837c3e7b0	Clean up GPUVM aperture management Non-canonical GPUVM aperture doesn't exist on dGPUs. Remove comments and code that say otherwise. Fix alignment of GPUVM aperture for gfx801. Requires the same workaround as gfx802. It's not used for anything on gfx801 yet, but will be soon. Change-Id: I88607fe7b340081cc0715b85f28fdbf5f1bb0ad7	2016-03-09 10:55:12 -05:00
Yair Shachar	c42ec0b82c	name unnamed struct within HsaMemMapFlagd union For aligning with RT definitions Change-Id: I4dca0c5818fdcea6c596a48c7516835fc595a289 Signed-off-by: Yair Shachar <Yair.Shachar@amd.com>	2016-03-07 18:43:03 +02:00
Harish Kasiviswanathan	1d1c30db7c	Add reverse direct io_links The Kernel only creates one way direct link - GPU(PCI_BUS) --> [Parent NUMA Node] Create the reverse direct io_link here - [Parent NUMA Node] --> GPU(PCI_BUS) Change-Id: I829a1b1b7f34bda42871ede3472d60915e88418c	2016-03-04 15:54:03 -05:00
Harish Kasiviswanathan	a80d2f2303	Add free_nodes() helper function Change-Id: I18ae0ac91b05275d7ad9d93175bae06870080844	2016-03-03 18:33:59 -05:00
Andres Rodriguez	7c376247b5	README: spelling and date fixes Change-Id: I51fa196971b91ea71fd8b0abe169fe23502ebb96	2016-03-02 18:42:01 -05:00
Andres Rodriguez	35e8fc6b15	readme: add an initial README.md file This is a simple README.md since most of the details should be in the ROCK project. Change-Id: I3175e2a5ade0f9ecb913076a4842b528f14947f0	2016-03-02 18:42:01 -05:00
Ben Goz	e2fb4bc312	Align hsaKmtMapMemoryToGPUNodes according to thunk spec Change-Id: I507ba5c6029ca5e7088c25930d46f5221679ace4 Signed-off-by: Ben Goz <ben.goz@amd.com>	2016-03-02 16:12:03 +02:00
shaoyunl	fea5ab9114	Export libKmtSetTrapHandler symbol as global Change-Id: I065dbecd05e992bc528128d893edaf636c1beff7	2016-03-01 10:30:02 -05:00
Harish Kasiviswanathan	bf03058112	Fix io_links sysfs directory name typo Change-Id: I4f6fb43c4a038b94c0f94f66ee383e83ad0ffa62	2016-02-29 11:15:29 -05:00
Jay Cornwall	3a662ac712	Fix race in dGPU event page setup events_page is unprotected from multiple allocation. The first event creation ioctl is unprotected from a race with args.event_page_offset being set (for page setup) and null (all subsequent invocations). Change-Id: I40ba712a17e9eff257785f90c553a74ad09c661d Signed-off-by: Yair Shachar <Yair.Shachar@amd.com>	2016-02-28 07:14:23 -05:00
Felix Kuehling	006f3ee41b	Fix address space leak in __fmm_release Use the object size when freeing address space, instead of the parameter passed in by the caller. The parameter may be incorrect due to app or runtime bugs, or when the buffer is an AQL ring buffer with double mapping workaround. Change-Id: I00bb31d4520ef969a49d6d5ea723e8a33418acc3	2016-02-26 09:19:21 -05:00
Felix Kuehling	8a0161d6bb	Use aligned size for looking up userptr object after allocation The alignment performed in vm_find_object_by_address isn't sufficient because it doesn't take into account the offset from the start of the page. This fixes a bug where certain unaligned userpointers and sizes fail to register correctly. Change-Id: I17872e264467a619f5e1bedb7e1ed3d994a856bf	2016-02-25 19:47:05 -05:00
Ben Goz	3f02a3cf0b	Mapping public VRAM BO to cpu Change-Id: I2ff62ff0784f8ce556ad80739a177b90d866f1b4 Signed-off-by: Ben Goz <ben.goz@amd.com>	2016-02-24 17:30:15 +02:00
Felix Kuehling	7a383f9d88	Fix memory leaks due to stale CPU mappings Use the aligned size of the buffer objects for CPU unmapping in __fmm_release instead of relying on the unaligned size passed in by the caller. Change-Id: If986ec24e9a05d32981549fddbf143221fc40bac	2016-02-16 18:12:05 -05:00
Felix Kuehling	85f9efb1a0	Add support for register/deregister memory for dGPU Allocate SVM address space for the registered memory and use new userptr support in KFD to create a system memory BO associated with the given user pointer. Map this BO at the SVM address for CPU access. MapMemoryToGPU can be used with the registered user pointer and will return the SVM address as alternate GPUVA. Change-Id: I4886e193c51fb6870a567878870c36bf8b5c3748	2016-02-16 18:12:05 -05:00
Ben Goz	00386734b1	Align gpu-id-array size to multiple of sizeof(uint32_t) Change-Id: I9f46b6a331a8d928ef570b420fb60b99b2edfdd1 Signed-off-by: Ben Goz <ben.goz@amd.com>	2016-02-16 11:27:06 -05:00
Harish Kasiviswanathan	04b92b8e05	gfx803: Add performance counter information Change-Id: Id81b43e90029306f03c84752cef06dc336e3a4a9	2016-02-12 16:39:39 -05:00
Harish Kasiviswanathan	1a0f915957	Adding missing performance counters for gfx801 Few more counters are now available in GFX8 register specs. So adding them. Also for gfx700 and gfx801 report correct number of SQ perf counter slots Change-Id: I9e6b4b10238230aabeccbfaa5e491a28b5e54f2d	2016-02-12 16:37:21 -05:00
Ben Goz	b37f99a01e	Fix double free issue and pointer alignment Change-Id: Id5bab454d53d404883a92282168b3f6cbc468cbb Signed-off-by: Ben Goz <ben.goz@amd.com>	2016-02-12 11:21:32 -05:00
Kent Russell	cd6d75880f	Fix build location for thunk RPM Change-Id: I4f5c7688a3e9b4dd31d8d72cae3adf9a796e38f9	2016-02-12 08:29:52 -05:00
Felix Kuehling	887b32fe86	Make hsaKmtAllocMemory more compliant with the Thunk spec Allocations from GPU nodes will return VRAM, not system memory. Only non-paged allocation from GPU nodes is supported. System memory can only be allocated from CPU nodes (usually node 0). The HostAccess flag is no longer used to distinguish the memory type. It only indicates, whether the memory is mapped for CPU access. Maintain compatibility with broken KfdTests by returning system memory for paged-memory requested from GPU nodes. Change-Id: I514defede735f55e6de436f41944125b6f2c4ccf	2016-02-10 10:29:54 -05:00
Yair Shachar	a815a4337f	Disable scratch Host allocation - via debug registration flags. Change-Id: Ia6e5f86ec3979c4a49800f7af4509442a4e5be27 Signed-off-by: Yair Shachar <Yair.Shachar@amd.com>	2016-02-10 07:52:32 -05:00
Ben Goz	7070f7ec5e	Adding support to hsaKmtMapMemoryToGPUNodes Change-Id: Iab6222402a43c3cd31b0efc5a316a6482986258e Signed-off-by: Ben Goz <ben.goz@amd.com>	2016-02-09 17:34:29 +02:00
shaoyunl	4e6c25e55b	libhsaKmt: Add CWSR support on dGPU This is the thunk part of the CWSR support. 1. SDMA queues don't support CWSR, so there is no need to allocate context save/restore memory for them 2. Allocate the context save/restore memory in the local frame buffer for dGPU Change-Id: Ie83506f0cced2a5a537c49d68125796d831c2764	2016-02-04 15:00:58 -05:00
shaoyunl	7e40877e81	libhsakmt: Use GPU ID instead of Node ID in set_process_dgpu_aperture Change-Id: I0e66ca4a018c15c009a3516d250f0044a4407878	2016-02-04 10:32:23 -05:00
Andres Rodriguez	3797b56ec9	Bump version for bugfix release 1.8.1 Change-Id: I06701905592594221d26c075a8fe370b4cc92aff	2016-02-02 01:29:51 -05:00
Ben Goz	e37863d7f2	Adding HsaMemMapFlags struct Change-Id: Ib0ee6dede1169582fd58bfca648347c3f8aa0b54 Signed-off-by: Ben Goz <ben.goz@amd.com>	2016-01-31 05:16:53 -05:00
Felix Kuehling	cc9fc386bd	Remove gfx802 page size workaround on gfx803 All tonga page size alignment is done in the memory management functions in fmm.c. All other code only specifies the minimum alignment it needs and lets fmm.c handle the HW-specific alignment. Clean up aligned-exec memory allocation in queue.c to remove hard-coded TONGA_PAGE_SIZE alignments and remove code duplication. Make sure alignments are consistent between allocate and free. Change-Id: Ia8923448173d1cef315af24cebff12adef385cb0	2016-01-28 16:05:18 -05:00
David Ogbeide	4e4a881940	libhsakmt: Add marketing names for GPU nodes HSA thunk API returns null when querying for GPU node marketing names due to empty system topology file. - Add marketing names to device GFX IP data structs. - Modify name retrieval to pull from data structs instead of file. Signed-off by: David Ogbeide <davidboyowa.ogbeide@amd.com> Change-Id: I30ea04111be7e0df2e93894f801fbeb414ffa790	2016-01-25 11:03:54 -05:00
Felix Kuehling	641bfd2cd5	Add simple test for unloading and reloading Thunk Change-Id: I4ca95dee8a180023d1de5f69161607dd368164de	2016-01-22 18:41:53 -05:00
Felix Kuehling	cc7491ec71	Link libhsakmt with -z nodelete This prevents the library from being unloaded at runtime, even when dlclose is called. This preserves global variables, such as state about the SVM address space and avoids catastrophic leaks on dlclose. Change-Id: I34f1d19a450835200e9d4815458e8d1b3045053c	2016-01-22 18:08:19 -05:00
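
Commit 82b3fad320 in the list above describes an intermediate size that was stored in a 32-bit variable, so a 4GB request reached KFD with size 0 and larger requests were sized incorrectly. The following is a minimal sketch of that failure mode, modeled in Python with explicit masks. The function names, the 4096-byte page size, and the page-alignment step are assumptions made for illustration and are not taken from the libhsakmt source.

```python
# Illustrative model only: all names here are hypothetical, not libhsakmt code.
PAGE_SIZE = 4096            # assumed page size for the sketch
U32_MASK = 0xFFFFFFFF
U64_MASK = 0xFFFFFFFFFFFFFFFF
GIB = 1 << 30


def page_align(size: int) -> int:
    """Round size up to the next multiple of PAGE_SIZE."""
    return (size + PAGE_SIZE - 1) & ~(PAGE_SIZE - 1)


def aligned_size_32bit(size: int) -> int:
    """Buggy variant: the intermediate result is truncated to 32 bits."""
    return page_align(size) & U32_MASK


def aligned_size_64bit(size: int) -> int:
    """Fixed variant: the intermediate result keeps all 64 bits."""
    return page_align(size) & U64_MASK


if __name__ == "__main__":
    # Compare the truncated and full-width sizes for a few request sizes.
    for request in (2 * GIB, 4 * GIB, 6 * GIB):
        print(request, aligned_size_32bit(request), aligned_size_64bit(request))
```

In this model a 4 GiB request truncates to 0 and a 6 GiB request truncates to 2 GiB, which matches the two symptoms the commit message reports; keeping the intermediate value 64 bits wide preserves the requested size.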


423 commits