rocm-systems

Author	SHA1	Message	Date
Kent Russell	1b6994a2dc	Fix build location for thunk RPM Change-Id: I4f5c7688a3e9b4dd31d8d72cae3adf9a796e38f9 [ROCm/ROCR-Runtime commit: `cd6d75880f`]	2016-02-12 08:29:52 -05:00
Felix Kuehling	03720306b9	Make hsaKmtAllocMemory more compliant with the Thunk spec Allocations from GPU nodes will return VRAM, not system memory. Only non-paged allocation from GPU nodes is supported. System memory can only be allocated from CPU nodes (usually node 0). The HostAccess flag is no longer used to distinguish the memory type. It only indicates, whether the memory is mapped for CPU access. Maintain compatibility with broken KfdTests by returning system memory for paged-memory requested from GPU nodes. Change-Id: I514defede735f55e6de436f41944125b6f2c4ccf [ROCm/ROCR-Runtime commit: `887b32fe86`]	2016-02-10 10:29:54 -05:00
Yair Shachar	8359dc3119	Disable scratch Host allocation - via debug registration flags. Change-Id: Ia6e5f86ec3979c4a49800f7af4509442a4e5be27 Signed-off-by: Yair Shachar <Yair.Shachar@amd.com> [ROCm/ROCR-Runtime commit: `a815a4337f`]	2016-02-10 07:52:32 -05:00
Ben Goz	18aab410cc	Adding support to hsaKmtMapMemoryToGPUNodes Change-Id: Iab6222402a43c3cd31b0efc5a316a6482986258e Signed-off-by: Ben Goz <ben.goz@amd.com> [ROCm/ROCR-Runtime commit: `7070f7ec5e`]	2016-02-09 17:34:29 +02:00
shaoyunl	60bbf00fb1	libhsaKmt: Add CWSR support on dGPU This is the thunk part of the CWSR support. 1. SDMA queues don't support CWSR, so there is no need to allocate the context save/restore memory for them. 2. Allocate the context save/restore memory in the local frame buffer for dGPU. Change-Id: Ie83506f0cced2a5a537c49d68125796d831c2764 [ROCm/ROCR-Runtime commit: `4e6c25e55b`]	2016-02-04 15:00:58 -05:00
shaoyunl	4c5a3ca774	libhsakmt: Use GPU ID instead of Node ID in set_process_dgpu_aperture Change-Id: I0e66ca4a018c15c009a3516d250f0044a4407878 [ROCm/ROCR-Runtime commit: `7e40877e81`]	2016-02-04 10:32:23 -05:00
Andres Rodriguez	cd849bc3e9	Bump version for bugfix release 1.8.1 Change-Id: I06701905592594221d26c075a8fe370b4cc92aff [ROCm/ROCR-Runtime commit: `3797b56ec9`]	2016-02-02 01:29:51 -05:00
Ben Goz	07a0c70dd5	Adding HsaMemMapFlags struct Change-Id: Ib0ee6dede1169582fd58bfca648347c3f8aa0b54 Signed-off-by: Ben Goz <ben.goz@amd.com> [ROCm/ROCR-Runtime commit: `e37863d7f2`]	2016-01-31 05:16:53 -05:00
Felix Kuehling	61039bcd36	Remove gfx802 page size workaround on gfx803 All tonga page size alignment is done in the memory management functions in fmm.c. All other code only specifies the minimum alignment it needs and lets fmm.c handle the HW-specific alignment. Clean up aligned-exec memory allocation in queue.c to remove hard-coded TONGA_PAGE_SIZE alignments and remove code duplication. Make sure alignments are consistent between allocate and free. Change-Id: Ia8923448173d1cef315af24cebff12adef385cb0 [ROCm/ROCR-Runtime commit: `cc9fc386bd`]	2016-01-28 16:05:18 -05:00
David Ogbeide	8fce9f7026	libhsakmt: Add marketing names for GPU nodes HSA thunk API returns null when querying for GPU node marketing names due to empty system topology file. - Add marketing names to device GFX IP data structs. - Modify name retrieval to pull from data structs instead of file. Signed-off-by: David Ogbeide <davidboyowa.ogbeide@amd.com> Change-Id: I30ea04111be7e0df2e93894f801fbeb414ffa790 [ROCm/ROCR-Runtime commit: `4e4a881940`]	2016-01-25 11:03:54 -05:00
Felix Kuehling	8ea4e037c8	Add simple test for unloading and reloading Thunk Change-Id: I4ca95dee8a180023d1de5f69161607dd368164de [ROCm/ROCR-Runtime commit: `641bfd2cd5`]	2016-01-22 18:41:53 -05:00
Felix Kuehling	db5b6fd35a	Link libhsakmt with -z nodelete This prevents the library from being unloaded at runtime, even when dlclose is called. This preserves global variables, such as state about the SVM address space and avoids catastrophic leaks on dlclose. Change-Id: I34f1d19a450835200e9d4815458e8d1b3045053c [ROCm/ROCR-Runtime commit: `cc7491ec71`]	2016-01-22 18:08:19 -05:00
Amber Lin	07500db1df	Revert "Free resources when dlclose is called" This reverts commit `4dd9dbb128`. Conflicts: src/fmm.c src/perfctr.c Change-Id: Ib6113c2dd3962c72100c7f74cdef6897e1df40b3 [ROCm/ROCR-Runtime commit: `7416805a44`]	2016-01-22 17:58:33 -05:00
Serguei Sagalovitch	f5bebcf875	Fixed logic to return data back to user Change-Id: I324d07c38e8d7eb202d4dccfed6e62006cf9cd29 Signed-off-by: Serguei Sagalovitch <Serguei.Sagalovitch@amd.com> [ROCm/ROCR-Runtime commit: `f44982a7ca`]	2016-01-22 14:49:18 -05:00
Serguei Sagalovitch	b10380d783	Skeleton for RDMA unit test v4 Added application and driver to serve as the starting point for RDMA unit test utility. v2: Added initial mmap support v3: Fixed logic to find correct ioctl handler v4: Fixed logic in mmap to find correct pages table Change-Id: Iaf97c0eb2acef2160d542c71afed58cf400414f7 Signed-off-by: Serguei Sagalovitch <Serguei.Sagalovitch@amd.com> [ROCm/ROCR-Runtime commit: `47cef87a34`]	2016-01-21 15:20:24 -05:00
Harish Kasiviswanathan	b687eaf2c2	Don't limit number of supported HSA Nodes Remove #define MAX_NODES 8 Change-Id: I756cadc652543dd17ea48a1c956adc08c3d2631a [ROCm/ROCR-Runtime commit: `5e53205b9e`]	2016-01-15 17:27:43 -05:00
Harish Kasiviswanathan	14358ee07f	Don't limit number of supported GPUs Stop using NUM_OF_SUPPORTED_GPUS. For now the definitions itself cannot be removed as ioctl code is in upstream Kernel. Change-Id: If846625a8ad5062d5483e762850c793d3c00b9d0 [ROCm/ROCR-Runtime commit: `ce83dc623f`]	2016-01-15 11:44:42 -05:00
Harish Kasiviswanathan	add443f1ef	Use new ioctl for getting process apertures Change-Id: I73678744ad73942edec442ad9c6d38637f7e1235 [ROCm/ROCR-Runtime commit: `e7e1361c3d`]	2016-01-12 12:09:25 -05:00
Felix Kuehling	c89d3124d9	Implement hsaKmtRegisterMemoryToNodes Fix hsaKmtRegisterMemory to be a no-op for now and move the multi-GPU implementation to hsaKmtRegisterMemoryToNodes. Make GPU memory mappings of host memory visible to all GPUs by default. Device memory is still visible to the allocating GPU only by default (but can be overridden with hsaKmtRegisterMemoryToNodes for experimenting with P2P). Change-Id: I73408afbe3b10c8dad2ab3a780f58413249692e6 [ROCm/ROCR-Runtime commit: `063ad3ad9e`]	2016-01-08 16:00:23 -05:00
Ben Goz	2fa7eef572	Adding support for mGPU Change-Id: I5ed184e6a58b38d9dde48867f14513d161cf41a9 Signed-off-by: Ben Goz <ben.goz@amd.com> [ROCm/ROCR-Runtime commit: `ea0f9d2a0b`]	2016-01-04 15:35:15 +02:00
Ben Goz	d874bcd8b3	Fix AQL Double buffer allocation mode Change-Id: I5162ffd89416d317fd0ca0fc51da523298488922 Signed-off-by: Ben Goz <ben.goz@amd.com> [ROCm/ROCR-Runtime commit: `53b208adf2`]	2016-01-04 15:34:53 +02:00
Yair Shachar	63f646d050	Add support for scratch GPUVM on host memory This is required when we have a debug session Change-Id: If9d6d2d23a9016b6ca9562e02a91fc16e0354ee4 Signed-off-by: Yair Shachar <Yair.Shachar@amd.com> [ROCm/ROCR-Runtime commit: `681f4dcecc`]	2015-12-20 15:50:50 +02:00
Harish Kasiviswanathan	8bf76bdf67	Fix node_id in gpu_mem[] array Change-Id: I4897623612e1749e275fb97ce1603dc5130fc9ce [ROCm/ROCR-Runtime commit: `39bf9c6611`]	2015-12-14 16:25:18 -05:00
Amber Lin	4dd9dbb128	Free resources when dlclose is called When the Thunk is initialized multiple times in the lifetime of a single process, some global resources are leaked. This can happen when dlopen and dlclose are used to load the library at runtime, rather than linking the runtime against the Thunk. This patch adds the destructor to release global resources when dlclose is called. Change-Id: Ia00da0d41f095d0b2706f98c0e75effedd596f49 [ROCm/ROCR-Runtime commit: `582b70f9c3`]	2015-12-11 16:32:41 -05:00
Yair Shachar	f01386b61c	Add support for per device debug register state tracking Change-Id: I8d51670f5de8d379ead898d484f668a8034f9878 Signed-off-by: Yair Shachar <Yair.Shachar@amd.com> [ROCm/ROCR-Runtime commit: `8f529e3c72`]	2015-12-07 21:11:21 +02:00
Harish Kasiviswanathan	419117eff9	Remove unused parameter gpu_id from few functions This will also fix out of bound access in functions fmm_get_aperture_base_and_limit and fmm_release Change-Id: Icf064c46647e69a069126171dbacdf3d5b27f972 Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> [ROCm/ROCR-Runtime commit: `a4cf02d797`]	2015-11-30 11:51:44 -05:00
Harish Kasiviswanathan	f34b407728	Use same VM range for all dGPUs dgpu_aperture and dgpu_alt_aperture will be shared by all dGPUs. Change-Id: I814495e43b51acabdc6266cfa8d83db5a062e20d Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> [ROCm/ROCR-Runtime commit: `2903a610e1`]	2015-11-26 15:07:29 -05:00
Harish Kasiviswanathan	87ddd7732e	Fix dgpu_vm_limit Break from the for-loop once dgpu VM range is found, otherwise the length is reduced by half Change-Id: Ie602054c16ea69ea1cbb75e804ead551bc3615c0 Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> [ROCm/ROCR-Runtime commit: `5a55383baf`]	2015-11-23 11:51:39 -05:00
Amber Lin	4c84c85252	Fix sibling map in CPU cache properties Previous code only works for systems where shared_cpu_map lists 32 or less bits. Some systems list more than 32 bits and express them as XXXXXXXX,XXXXXXXX,.... This patch adds that calculation. Also increase MAX_CPU_CORES and MAX_CACHES to accommodate more advanced systems. Change-Id: Ia5c7041866456a6aa3b66f8f0f951022d7c51028 [ROCm/ROCR-Runtime commit: `a5bc8360e8`]	2015-11-12 08:31:51 -05:00
Felix Kuehling	1ab2c3341a	Reserve address space with PROT_NONE Access to reserved address space that has not been allocated should result in a segfault. Use PROT_NONE to ensure that. Change-Id: Ic5da9392fabbe78c9ec14f98e8b7b47e5267a98a [ROCm/ROCR-Runtime commit: `62337b6c0a`]	2015-11-10 18:19:56 -05:00
Kent Russell	650232b83b	Use OUT_DIR for thunkroot variable Pick up the thunk from the correct location. It is no longer inside THUNK_ROOT, but instead part of the OUT folder. Change-Id: I41dd7dae243e66270d0ea7182f1ba119b18a1cfb [ROCm/ROCR-Runtime commit: `3786e18d99`]	2015-11-09 16:21:49 -05:00
Kent Russell	63c43d3404	Fix variable for RPM build Certain versions of rpmbuild need the variable to be outside of curly braces. This addresses that issue in that situation. Change-Id: Iff7200b332b9d8e41a4d7676ca14c5a32c075beb [ROCm/ROCR-Runtime commit: `4e4d4a81e1`]	2015-11-09 11:05:32 -05:00
Amber Lin	403eb13050	Add CPU cache information Fill up cache properties of CPU node by reading data from /proc/cpuinfo and /sys/devices/system/cpu/cpuX/cache/indexY Change-Id: I0a96760575e504e38962554f192c3fe66bea3c15 [ROCm/ROCR-Runtime commit: `b6f65f9849`]	2015-11-09 07:16:24 -05:00
Kent Russell	67d98aa280	Add option to create release build for Thunk By adding REL=1 to the make command line (e.g. make REL=1 deb), we can create a release build of the Thunk. This will not affect existing functionality, and will only have an effect if REL=1 is specified on the command line, or in the build_thunk.sh script. Change-Id: Iedc3b6094e70a4ebd726499eda56013cc254b83d [ROCm/ROCR-Runtime commit: `cb3a664065`]	2015-10-30 14:05:40 -04:00
Kent Russell	39d2152a3f	Cleanup RPM build of thunk Change-Id: Ib437a3ec7be9f5aa7d3ef9e53c13e3c5e7b7382e [ROCm/ROCR-Runtime commit: `cabbcbabff`]	2015-10-30 08:42:16 -04:00
Felix Kuehling	b900df9215	Use correct aperture for _fmm_unmap_from_gpu_scratch Passing in the wrong aperture resulted in failure to unmap scratch. Change-Id: Icd7423abfb1bcc773b33becffcbefc233f4ff340 [ROCm/ROCR-Runtime commit: `bd93eecc64`]	2015-10-29 18:26:15 -04:00
Philip Cox	782cea350c	Add SDMA IOCTL type to Create Queue function. Change-Id: I7e31507b761ca388b2cac93f994f6106de962f17 [ROCm/ROCR-Runtime commit: `0c234c7ef3`]	2015-10-29 10:25:41 -04:00
Kent Russell	f4889d439d	libhsakmt - Add make option to package thunk as RPM Add an option to libhsakmt to allow the thunk to be packaged as an RPM. The default will remain being built as-is, but this can now be packaged as an RPM by using "make src rpm". build_thunk.sh will be modified to reflect this new option. Change-Id: I38e03d10cfb5035bdf0a87635a784c47a709a5b6 [ROCm/ROCR-Runtime commit: `6ceed7def3`]	2015-10-29 07:49:13 -04:00
Harish Kasiviswanathan	595f51899f	Remove erroneous and redundant memory banks reported hsaKmtGetNodeMemoryProperties: - Return only HSA_HEAPTYPE_SYSTEM memory for CPU only node. - For dGPU remove redundant HSA_HEAPTYPE_FRAME_BUFFER_PRIVATE entry. Change-Id: I0349be39b8409a0fd64a038b8b2956191356d937 [ROCm/ROCR-Runtime commit: `f885e551aa`]	2015-10-23 18:43:46 -04:00
Harish Kasiviswanathan	71dc59b245	Correct parameter name for topology_is_dgpu() The function expects device_id and not gpu_id. Change-Id: I79794fd4e58e6e6adb26659da30f3e4d8e108434 [ROCm/ROCR-Runtime commit: `69662da3dc`]	2015-10-23 18:43:45 -04:00
Harish Kasiviswanathan	d7589c62e1	Unify fmm_get_aperture_xxx functions Unify fmm_get_aperture_base and fmm_get_aperture_limit into one function. Make the return value to HSAKMT_STATUS. Change-Id: I0b3f563ffb268947ab891f4935f61788d0af0e01 [ROCm/ROCR-Runtime commit: `cb53548c89`]	2015-10-23 18:43:34 -04:00
Felix Kuehling	29561cc13e	Implement flat scratch support for dGPU hsaKmtAllocMemory only allocates aligned address space and sets up the scratch_physical aperture to match the allocated address space. Actual allocation of backing memory happens in hsaKmtMapMemoryToGPU. Change-Id: Ie709815ab9bedb3d682e096b4005fdfb5e94d3a7 [ROCm/ROCR-Runtime commit: `5131ab4e64`]	2015-10-22 20:40:22 -04:00
Felix Kuehling	17a31f1cce	Allow address space allocations with specific alignment Change-Id: I4bf7f7ac53c3921dd330b9dc7a40582611f88b69 [ROCm/ROCR-Runtime commit: `149261ba09`]	2015-10-22 20:27:49 -04:00
Ben Goz	4ec82c7edd	Casting local memory size to uint64_t Change-Id: I5c2010056b84ac01bb65361210d2a693e437050a Signed-off-by: Ben Goz <ben.goz@amd.com> [ROCm/ROCR-Runtime commit: `55b1a5dc43`]	2015-10-22 09:05:34 -04:00
Ben Goz	a511b7c4f7	Adding support for new AQL Queue Memory allocation Change-Id: If84fc4b961627dbdd0b77b1c509a3c9a4c709b9f Signed-off-by: Ben Goz <ben.goz@amd.com> [ROCm/ROCR-Runtime commit: `e61500c46e`]	2015-10-22 13:13:54 +03:00
Felix Kuehling	55bd82cd89	Fix node 0 system memory allocation for dGPU This is a hack to allow the Runtime to allocate system memory with PreferredNode=0 on a dGPU system. We allocated it from Node 1 instead so that the node 1 GPU can map the memory. A proper fix will be implemented together with multi-GPU support. Change-Id: Ieb52599e5275781c04ee34405ea850bf782c523a [ROCm/ROCR-Runtime commit: `590c8e522c`]	2015-10-21 20:00:01 -04:00
Felix Kuehling	ebf6ec1806	Reserve more SVM process address space Try to reserve as much SVM address space as GPUVM can address. Implement a fallback scheme to smaller sizes if larger allocations fail or are not addressable by the GPU, down to an (arbitrary) minimum of 4GB. Change-Id: I770177834cc9e6ddd6ef4f20d789eab63c8055cb [ROCm/ROCR-Runtime commit: `39bde26c9b`]	2015-10-19 17:44:23 -04:00
Andres Rodriguez	f6eba4d367	make: add 'deb' target for creating deb packages When 'make deb' is run, create a libhsakmt.deb archive that installs libhsakmt into the appropriate folder on the target where the dynamic linker can find it. Change-Id: I32de7198975f7831e509a67371e78456982b5c42 [ROCm/ROCR-Runtime commit: `0df346aaf9`]	2015-10-16 19:13:51 -04:00
Harish Kasiviswanathan	d38a3f1438	Fix init process apertures Kernel ioctl AMDKFD_IOC_GET_PROCESS_APERTURES returns process apertures only for GPU nodes. The current implementation assumed that this list of GPU nodes returned by the ioctl has a one-to-one correspondence to sysfs topology nodes. This fails when non-GPU nodes exist in topology, as in the case of Intel + gfx802. Fix this by using gpu_id (./sys/.../kfd/topology/nodes/1/gpu_id) to map information obtained from kernel ioctl call. Change-Id: I4ab8ae5354f12cf0b6609fc4b24182b82eb3677f [ROCm/ROCR-Runtime commit: `5cc56a2647`]	2015-10-15 15:38:14 -04:00
Harish Kasiviswanathan	462a775ec3	Fix hard-coded usage of Node 0 Use appropriate NodeId instead Change-Id: I46af93b76978fea7bedb34457fcc0864ed4fe2d4 [ROCm/ROCR-Runtime commit: `b6c6f79143`]	2015-10-14 17:27:38 -04:00
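
The "Fix sibling map in CPU cache properties" entry above notes that shared_cpu_map can exceed 32 bits and is then written as comma-separated groups (XXXXXXXX,XXXXXXXX,...). As a rough illustration only, and not the code from that commit, the Python sketch below decodes such a mask into CPU indices. The function name `parse_shared_cpu_map` is hypothetical, and the sketch assumes each group is a hexadecimal 32-bit word with the most significant word first.

```python
def parse_shared_cpu_map(mask):
    """Return the sorted CPU indices set in a sysfs-style CPU mask string.

    Assumption: `mask` is comma-separated hex groups, each group holding
    32 bits, with the most significant group first.
    """
    value = 0
    for group in mask.strip().split(","):
        # Shift in each 32-bit group, most significant group first.
        value = (value << 32) | int(group, 16)

    cpus = []
    bit = 0
    while value:
        if value & 1:
            cpus.append(bit)
        value >>= 1
        bit += 1
    return cpus


if __name__ == "__main__":
    # Two groups: bit 0 of the high word is CPU 32.
    print(parse_shared_cpu_map("00000001,00000003"))
```

A single-group mask takes the same path, so a caller does not need a separate case for systems with 32 or fewer CPUs.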


135 commits