rocm-systems

Author	SHA1	Message	Date
James Edwards	b170e0ad8c	Fix CMakeList.txt file to use correct compile options. Fix compilation errors. Change-Id: I6229a83d0823ee7a123cdaa9efd782108aa3a03c	2016-09-30 16:36:01 -05:00
James Edwards	7511631f08	Add libhsakmt cmake build and packaging files. Change-Id: Ic7fa22d5b266480aa0c62628022f39da4e043d23	2016-09-20 17:48:36 -04:00
Felix Kuehling	2e0a6eb371	Allocate and map doorbells in SVM for discrete GPUs Allocate doorbells for dGPUs in the SVM aperture and map them for GPU access. This is necessary to allow GPU-initiated submissions to user mode queues. Depends on new doorbell BO allocation flag in KFD. Change-Id: I0737bef4a4764bb4a66c43846707ead2108f6601	2016-09-16 16:04:27 -04:00
Amber Lin	660a6ebbd4	Disable CPU cache info in non-x86 CPU cache information reported by Thunk topology is obtained from cpuid instruction. This instruction only applies to X86 systems. It can cause compile errors on non-X86 platforms. This patch temporarily disables CPU cache functions in topology for non-X86 platforms in order to compile. Change-Id: If86671817b0d036cb324eebf3f354682bfb75856	2016-09-14 17:30:50 -04:00
Amber Lin	4911c91389	Search VM object by range Add vm_find_object_by_userptr_range so QueryPointerInfo can find the object as well when the pointer is not the starting address but it's inside the memory range. Also rename vm_find_object_xxx functions to _by_address and _by_address_range to be consistent. Change-Id: I5c2b3a05b41493e32b7fd9154665bf078b043606	2016-09-13 12:44:29 -04:00
Amber Lin	19f2676ea7	Pointer attributes on APU Add CPUVM aperture to keep track of memory allocation that is not known to GPU driver. Together with GPUVM, this patch adds the pointer attributes support to APU. Change-Id: If13f9cf01ff8b9f709b99b66661e7505246adf4c	2016-09-12 11:32:26 -04:00
Amber Lin	51e4d27c37	Add pointer attributes API Add two pointer attributes APIs: hsaKmtQueryPointerInfo - allow the user to query the memory information using a pointer. This pointer can point to any address inside the range known to HSA. hsaKmtSetMemoryUserData - allow the user to attach data to a pointer to add memory tracking information. This pointer must match the start address of a memory allocation or registration. TODO: This patch implements support on dGPU. Needs to add APU. Change-Id: I4711809274248434901f0794f50ebfa13a7371a8	2016-09-07 17:24:46 -04:00
Yong Zhao	8351b3d2e8	Implement hsaKmtGetTileConfig in thunk Change-Id: Iba8d8efa46e3c268a03442d3db568e1b19230e94	2016-09-06 16:24:29 -04:00
Lan Xiao	df593aa076	libhsakmt: Marketing Name and AMDName support for APUs For APUs, use /proc/cpuinfo to get Marketing name. Change-Id: I4a17516d26a092683f36631032be00ad44f7e7fe Signed-off-by: Lan Xiao <Lan.Xiao@amd.com>	2016-09-02 15:16:18 -04:00
Andres Rodriguez	b1d2867b60	LICENSE: add X11/MIT license file Change-Id: I2e95af843046896708bb7a116f7b03a0fa30a255	2016-08-25 16:27:46 -04:00
Andres Rodriguez	e0c77a38cb	Makefile: remove 32bit thunk compilation by default Compiling in 32bit mode is broken, and we don't have an intention on restarting compatibility with 32bit apps. Change-Id: I5524b5b63fe62e6026aa04d84c4510e290a86106	2016-08-25 16:27:19 -04:00
Lan Xiao	9cbbf30be7	libhsakmt: Add MarketingName and AMDName for all nodes - CPU & GPU HSA thunk API is currently reporting engineering name to MarketingName and returning NULL when querying for AMDName. -Change current name reporting from MarketingName to AMDName. -Use libpci to get MarketingName Change-Id: I819a6de7b067a2e724a6695e7d800274b83a71f8 Signed-off-by: Lan Xiao <Lan.Xiao@amd.com>	2016-08-23 10:49:27 -04:00
Kent Russell	70b1b5b17e	queues.c: Enforce CUMaskCount being a multiple of 32 The thunk spec requires that CUMaskCount be divisible by 32. Check this and return INVALID_PARAMETER if it is not. Change-Id: I4e0c8502d996d3da31224b817a5d4ff2c6054e13	2016-08-23 06:16:39 -04:00
Yong Zhao	9c9bfa30c0	Fix a bug when mmap fails EventId is needed in calling hsaKmtDestroyEvent() when mmap failed, so we should move it ahead of mmap call. Change-Id: I5f4288b953611799a02b0e988d6b2e48104466a0	2016-08-18 14:30:03 -04:00
Amber Lin	0b5c65a903	Add performance counters for gfx803 Counter IDs in SQ_PERFCOUNTER0_SELECT are identical on gfx803 10 and gfx803 11. Change-Id: I5cfefd44b52989efd1d89311cf8c70c84ea2b230	2016-08-08 18:10:51 -04:00
Amber Lin	876384305b	Add gfx803 support Add gfx803 and gfx80311 device IDs to the support Change-Id: I16220fd811db102c02e5e0c5b82e40ec299877af	2016-08-08 11:30:57 -04:00
Amber Lin	6c4d19a9d2	Shorten the device list in PerfCounter get_block_properties uses the complete DID to identify the GPU. This list is getting too long when more devices are added. Reading the 12 most significant digits is good enough to identify the GPU. Change-Id: Ieebb05402bbe08af12eb7289dfeb5bbf1f515b0f	2016-07-27 17:21:31 -04:00
shaoyunl	bf16caa75f	libhsakmt: Compute context save area size depends on CU num Change-Id: Iaf35ddeee9fe5a6367097483f67c4adaa0796d7d Signed-off-by: shaoyunl <Shaoyun.Liu@amd.com>	2016-06-10 10:19:40 -04:00
Amber Lin	6d21c4e753	Add performance counters for gfx70x Add performance counters for gfx70x. The reference is the gfx7 register spec. The register being looked at is SQ_PERFCOUNTER0_SELECT. Change-Id: I344bfb7452f6148f4dc268163d12c553c6be8424	2016-05-20 16:24:36 -04:00
shaoyunl	16d5aa0d83	libhsakmt: Add new device id for virtualized function of gfx803 Signed-off-by: Shaoyun Liu <Shaoyun.liu@amd.com> Change-Id: I90b0bdaeaed8e9e80375e5a7a142205f2a542289	2016-05-12 13:25:01 -04:00
Felix Kuehling	fa102f3b8b	Report gfx70x engine ID as 7.0.1 Stepping 1 indicates higher double-precision float performance and potentially other runtime workarounds needed for lack of PCIe atomics on gfx70x. Change-Id: I97185c1233e7d24caaf20a1eadea931d5a2bc664	2016-05-04 13:53:24 -04:00
Amber Lin	73ad0a1942	Correct NumCaches against the CPU node In a NUMA system, topology should report NumCaches as the number of caches within the node but current code reports the total caches in the system. This patch fixes the error. This patch also uses cpuid to get cache information instead of reading from sysfs files. See "Intel Corporation, Intel 64 and IA-32 Architectures Software Developer's Manual Volume 2(2A, 2B & 2C) Instruction Set Reference" 3-179 for cpuid instruction features used in this patch. Change-Id: I8ecece6c2b230741822620b44e66ddc201ff5112	2016-05-03 11:39:33 -04:00
Felix Kuehling	97e51ce33d	Add gfx70x support Change-Id: I400adb62b5225ef3a42da279d067fb0a62907089	2016-04-25 14:27:44 -04:00
Andres Rodriguez	44572965f6	package: rename to hsathk-rocm-dev Since we include headers and not just a library anymore, we should be considered a -dev package and not a lib package. Change-Id: I220465ea4ffc8d66d8d76e6716e6c6c50cdacea1	2016-04-13 19:39:54 -04:00
Andres Rodriguez	9f355b78a0	Adopt new ROCm packaging guidelines All files should go into /opt/rocm/$component For developer convenience, a single include directory is created through symlinks, from the component include directory to /opt/rocm/include. Similarly, a unified linked directory is present in /opt/rocm/lib The component lib directory should not include linker names (library names without version numbers). This commit also fixes 'make rpm' running correctly without the need for sourcing build/envsetup.sh Change-Id: I95a680f6d3e3bd1ae688d0694934a0577dbd007c	2016-04-11 18:30:54 -04:00
Felix Kuehling	82b3fad320	Fix 4GB and larger system memory allocations Intermediate size was stored in a 32-bit variable. This resulted in 4GB allocations to fail in KFD due to 0 size. Larger allocations would allocate the wrong amount of memory. Change-Id: If19dedf64952f1d2edd813793241e12c0362d220	2016-04-11 11:17:06 -04:00
Andres Rodriguez	31861c838e	package: change install directory to /opt/rocm Align with the rest of the driver stack on the new installation path /opt/rocm/* This mechanism for generating packages should be changed for something nicer and more standards compliant in the future. Change-Id: Ic31409b0d0b8f6ee4b25296d2580982a76aab564	2016-04-08 11:41:49 -04:00
David Ogbeide	682776d89a	libhsakmt: get CPU model name from proc/cpuinfo HSA thunk is currently only aware of GPU node model info, CPU names are NULL. Signed-off-by: David Ogbeide <davidboyowa.ogbeide@amd.com> Change-Id: I3c2adbb8566a5048b44c39fff4fd8228912468ff	2016-03-23 11:11:18 -04:00
Felix Kuehling	06d391c6c9	Add environment variable to disable GPU caching This option may help debug synchronization or coherency issues involving the GPU caches. It works only on dGPUs, by changing the cache policy of the GPUVM default aperture to "coherent", which is implemented as non-cached on current dGPU hardware. Change-Id: I544ac9cc5c0cf1fa5c4e30f67aa42b3b5e44ae67	2016-03-17 18:51:47 -04:00
Harish Kasiviswanathan	f1fbacca15	Add QPI or HT io_links Create QPI or HT links among all NUMA nodes. For now, assume all the NUMA nodes are interconnected with same Weight (=1). Change-Id: Id48ba95b9d75515a186f7dc5006b19bd92743ae3	2016-03-15 21:10:53 -04:00
Harish Kasiviswanathan	ee1dd5d9c2	Get processor vendor from /proc/cpuinfo Change-Id: I9039385d268ef1693fab121cbf1caf442129a12e	2016-03-15 15:37:52 -04:00
shaoyunl	79077811f5	Add Imprecise flag for memory access fault KFD may not be able to provide the precise VM fault address and status. This flag will indicate whether the event data has the fault details Change-Id: I15ffd5c25f555003c6450cc0700efb769418f76b	2016-03-14 15:17:17 -04:00
Felix Kuehling	0ed29f5191	Report SVM heap in topology The Runtime requested this information so they can tell easily whether a pointer is part of HSA shared address space or not. Change-Id: If2041ed34031636677d692bc2dc6625634027ed4	2016-03-14 11:52:36 -04:00
Harish Kasiviswanathan	1c1bc32477	Sync IOLINK defines to thunk spec Current thunk spec v1.07 dated Feb 1, 2016 Change-Id: Ie1821f7f1903ac48b76cb68d452a6073d3a3c8d9	2016-03-11 18:59:57 -05:00
Harish Kasiviswanathan	8ff2bcd48d	Fix indirect io_links Connect only (Peer-to-Peer) GPUs that belong to same NUMA node. Without this additional check non direct GPUs would also get connected. Change-Id: I9a5ed19b8f06cd0527854cbbdb51ede99eade28b	2016-03-11 18:54:32 -05:00
Felix Kuehling	cac0c08496	Fix lstopo Lstopo doesn't have system memory mappings at low addresses. Make sure we leave enough GPUVM address space for kernel allocations (currently only CWSR) before the start of the user-managed SVM aperture. Change-Id: Ic197f7bd5a3cfb150a0da2bfdbc848664e7869be	2016-03-11 11:01:12 -05:00
Harish Kasiviswanathan	7042292c60	Add indirect io_links Connect (Peer-to-Peer) GPUs that belong to same NUMA node. Connect all [GPU] <--> [Non Parent NUMA] node Change-Id: Ib4b08a6545d28b7dce4c9b1a90378bfc51bed07e	2016-03-10 15:11:17 -05:00
Harish Kasiviswanathan	1e729510d2	Allocate memory for indirect io_links To simplify, allocate maximum needed memory for node_t->link array. No need for realloc when indirect links are added. Trade off - for some nodes more memory than required will be allocated. This means the loop to compute the number of direct (reverse) io_links for a CPU node is not necessary. Change-Id: I2b2559142cbec3b262d0b4ea5fdebfd8f36c28fc	2016-03-10 15:10:48 -05:00
Felix Kuehling	61ec3df2f9	Add support for hsaKmtRegisterGraphicsHandleToNodes Change-Id: I6fd7154dea78188480d5cb89ac237bad572356c4	2016-03-10 11:16:02 -05:00
Ben Goz	b1393f8224	Support MapMemoryToGPUNodes on APU Change-Id: Ie77a2eb23cd9fe6671ff9e0630977220218e55dd Signed-off-by: Ben Goz <ben.goz@amd.com>	2016-03-09 21:31:52 -05:00
Felix Kuehling	cb0315d31d	Update kfd_ioctl.h from kernel Change-Id: I9852ef2e33e1f3b24343747e3c1c09b0050ffdc1	2016-03-09 10:55:12 -05:00
Felix Kuehling	b837c3e7b0	Clean up GPUVM aperture management Non-canonical GPUVM aperture doesn't exist on dGPUs. Remove comments and code that say otherwise. Fix alignment of GPUVM aperture for gfx801. Requires the same workaround as gfx802. It's not used for anything on gfx801 yet, but will be soon. Change-Id: I88607fe7b340081cc0715b85f28fdbf5f1bb0ad7	2016-03-09 10:55:12 -05:00
Yair Shachar	c42ec0b82c	name unnamed struct within HsaMemMapFlagd union For aligning with RT definitions Change-Id: I4dca0c5818fdcea6c596a48c7516835fc595a289 Signed-off-by: Yair Shachar <Yair.Shachar@amd.com>	2016-03-07 18:43:03 +02:00
Harish Kasiviswanathan	1d1c30db7c	Add reverse direct io_links The Kernel only creates one way direct link - GPU(PCI_BUS) --> [Parent NUMA Node] Create the reverse direct io_link here - [Parent NUMA Node] --> GPU(PCI_BUS) Change-Id: I829a1b1b7f34bda42871ede3472d60915e88418c	2016-03-04 15:54:03 -05:00
Harish Kasiviswanathan	a80d2f2303	Add free_nodes() helper function Change-Id: I18ae0ac91b05275d7ad9d93175bae06870080844	2016-03-03 18:33:59 -05:00
Andres Rodriguez	7c376247b5	README: spelling and date fixes Change-Id: I51fa196971b91ea71fd8b0abe169fe23502ebb96	2016-03-02 18:42:01 -05:00
Andres Rodriguez	35e8fc6b15	readme: add an initial README.md file This is a simple README.md since most of the details should be in the ROCK project. Change-Id: I3175e2a5ade0f9ecb913076a4842b528f14947f0	2016-03-02 18:42:01 -05:00
Ben Goz	e2fb4bc312	Align hsaKmtMapMemoryToGPUNodes according thunk spec Change-Id: I507ba5c6029ca5e7088c25930d46f5221679ace4 Signed-off-by: Ben Goz <ben.goz@amd.com>	2016-03-02 16:12:03 +02:00
shaoyunl	fea5ab9114	Export libKmtSetTrapHandler symbol as global Change-Id: I065dbecd05e992bc528128d893edaf636c1beff7	2016-03-01 10:30:02 -05:00
Harish Kasiviswanathan	bf03058112	Fix io_links sysfs directory name typo Change-Id: I4f6fb43c4a038b94c0f94f66ee383e83ad0ffa62	2016-02-29 11:15:29 -05:00
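
Commit 82b3fad320 above describes an intermediate size held in a 32-bit variable, so that a 4GB request reached KFD as size 0 and larger requests came out with the wrong size. The sketch below only illustrates that class of bug; the function names are hypothetical and are not taken from the libhsakmt source.

```c
#include <stdint.h>

/* Hypothetical helper showing the buggy pattern: the size passes
 * through a 32-bit intermediate, which keeps only the low 32 bits. */
static inline uint64_t size_via_32bit_intermediate(uint64_t size_in_bytes)
{
    uint32_t intermediate = (uint32_t)size_in_bytes; /* truncates here */
    return intermediate;
}

/* Hypothetical helper showing the fixed pattern: the intermediate is
 * as wide as the requested size, so nothing is lost. */
static inline uint64_t size_via_64bit_intermediate(uint64_t size_in_bytes)
{
    uint64_t intermediate = size_in_bytes;
    return intermediate;
}
```

With the 32-bit intermediate, a request of exactly 4GB (2^32 bytes) becomes 0 and a 5GB request becomes 1GB, which matches the two symptoms named in the commit message. The 64-bit version returns the requested size unchanged.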


195 commits