rocm-systems

Author	SHA1	Message	Date
Amber Lin	07500db1df	Revert "Free resources when dlclose is called" This reverts commit `4dd9dbb128`. Conflicts: src/fmm.c src/perfctr.c Change-Id: Ib6113c2dd3962c72100c7f74cdef6897e1df40b3 [ROCm/ROCR-Runtime commit: `7416805a44`]	2016-01-22 17:58:33 -05:00
Serguei Sagalovitch	f5bebcf875	Fixed logic to return data back to user Change-Id: I324d07c38e8d7eb202d4dccfed6e62006cf9cd29 Signed-off-by: Serguei Sagalovitch <Serguei.Sagalovitch@amd.com> [ROCm/ROCR-Runtime commit: `f44982a7ca`]	2016-01-22 14:49:18 -05:00
Serguei Sagalovitch	b10380d783	Skeleton for RDMA unit test v4 Added application and driver to serve as the starting point for RDMA unit test utility. v2: Added initial mmap support v3: Fixed logic to find correct ioctl handler v4: Fixed logic in mmap to find correct pages table Change-Id: Iaf97c0eb2acef2160d542c71afed58cf400414f7 Signed-off-by: Serguei Sagalovitch <Serguei.Sagalovitch@amd.com> [ROCm/ROCR-Runtime commit: `47cef87a34`]	2016-01-21 15:20:24 -05:00
Harish Kasiviswanathan	b687eaf2c2	Don't limit number of supported HSA Nodes Remove #define MAX_NODES 8 Change-Id: I756cadc652543dd17ea48a1c956adc08c3d2631a [ROCm/ROCR-Runtime commit: `5e53205b9e`]	2016-01-15 17:27:43 -05:00
Harish Kasiviswanathan	14358ee07f	Don't limit number of supported GPUs Stop using NUM_OF_SUPPORTED_GPUS. For now the definitions itself cannot be removed as ioctl code is in upstream Kernel. Change-Id: If846625a8ad5062d5483e762850c793d3c00b9d0 [ROCm/ROCR-Runtime commit: `ce83dc623f`]	2016-01-15 11:44:42 -05:00
Harish Kasiviswanathan	add443f1ef	Use new ioctl for getting process apertures Change-Id: I73678744ad73942edec442ad9c6d38637f7e1235 [ROCm/ROCR-Runtime commit: `e7e1361c3d`]	2016-01-12 12:09:25 -05:00
Felix Kuehling	c89d3124d9	Implement hsaKmtRegisterMemoryToNodes Fix hsaKmtRegisterMemory to be a no-op for now and move the multi-GPU implementation to hsaKmtRegisterMemoryToNodes. Make GPU memory mappings of host memory visible to all GPUs by default. Device memory is still visible to the allocating GPU only by default (but can be overridden with hsaKmtRegisterMemoryToNodes for experimenting with P2P). Change-Id: I73408afbe3b10c8dad2ab3a780f58413249692e6 [ROCm/ROCR-Runtime commit: `063ad3ad9e`]	2016-01-08 16:00:23 -05:00
Ben Goz	2fa7eef572	Adding support for mGPU Change-Id: I5ed184e6a58b38d9dde48867f14513d161cf41a9 Signed-off-by: Ben Goz <ben.goz@amd.com> [ROCm/ROCR-Runtime commit: `ea0f9d2a0b`]	2016-01-04 15:35:15 +02:00
Ben Goz	d874bcd8b3	Fix AQL Double buffer allocation mode Change-Id: I5162ffd89416d317fd0ca0fc51da523298488922 Signed-off-by: Ben Goz <ben.goz@amd.com> [ROCm/ROCR-Runtime commit: `53b208adf2`]	2016-01-04 15:34:53 +02:00
Nikolay Haustov [TEXT]	275cb22707	Split libHSAIL and libHSAIL-AMD (HSA Changes) [git-p4: depot-paths = "//depot/stg/hsa/drivers/hsa/runtime/": change = 1223723] [ROCm/ROCR-Runtime commit: `d8e67d962b`]	2015-12-28 10:00:43 -05:00
Yair Shachar	63f646d050	Add support for scratch GPUVM on host memory This is required when we have a debug session Change-Id: If9d6d2d23a9016b6ca9562e02a91fc16e0354ee4 Signed-off-by: Yair Shachar <Yair.Shachar@amd.com> [ROCm/ROCR-Runtime commit: `681f4dcecc`]	2015-12-20 15:50:50 +02:00
Harish Kasiviswanathan	8bf76bdf67	Fix node_id in gpu_mem[] array Change-Id: I4897623612e1749e275fb97ce1603dc5130fc9ce [ROCm/ROCR-Runtime commit: `39bf9c6611`]	2015-12-14 16:25:18 -05:00
Amber Lin	4dd9dbb128	Free resources when dlclose is called When the Thunk is initialized multiple times in the lifetime of a single process, some global resources are leaked. This can happen when dlopen and dlclose are used to load the library at runtime, rather than linking the runtime against the Thunk. This patch adds the destructor to release global resources when dlclose is called. Change-Id: Ia00da0d41f095d0b2706f98c0e75effedd596f49 [ROCm/ROCR-Runtime commit: `582b70f9c3`]	2015-12-11 16:32:41 -05:00
Yair Shachar	f01386b61c	Add support for per device debug register state tracking Change-Id: I8d51670f5de8d379ead898d484f668a8034f9878 Signed-off-by: Yair Shachar <Yair.Shachar@amd.com> [ROCm/ROCR-Runtime commit: `8f529e3c72`]	2015-12-07 21:11:21 +02:00
Harish Kasiviswanathan	419117eff9	Remove unused parameter gpu_id from few functions This will also fix out of bound access in functions fmm_get_aperture_base_and_limit and fmm_release Change-Id: Icf064c46647e69a069126171dbacdf3d5b27f972 Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> [ROCm/ROCR-Runtime commit: `a4cf02d797`]	2015-11-30 11:51:44 -05:00
Harish Kasiviswanathan	f34b407728	Use same VM range for all dGPUs dgpu_aperture and dgpu_alt_aperture will be shared by all dGPUs. Change-Id: I814495e43b51acabdc6266cfa8d83db5a062e20d Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> [ROCm/ROCR-Runtime commit: `2903a610e1`]	2015-11-26 15:07:29 -05:00
Harish Kasiviswanathan	87ddd7732e	Fix dgpu_vm_limit Break from the for-loop once dgpu VM range is found, otherwise the length is reduced by half Change-Id: Ie602054c16ea69ea1cbb75e804ead551bc3615c0 Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> [ROCm/ROCR-Runtime commit: `5a55383baf`]	2015-11-23 11:51:39 -05:00
Amber Lin	4c84c85252	Fix sibling map in CPU cache properties Previous code only works for systems where shared_cpu_map lists 32 or less bits. Some systems list more than 32 bits and express them as XXXXXXXX,XXXXXXXX,.... This patch adds that calculation. Also increase MAX_CPU_CORES and MAX_CACHES to accommodate more advanced systems. Change-Id: Ia5c7041866456a6aa3b66f8f0f951022d7c51028 [ROCm/ROCR-Runtime commit: `a5bc8360e8`]	2015-11-12 08:31:51 -05:00
Felix Kuehling	1ab2c3341a	Reserve address space with PROT_NONE Access to reserved address space that has not been allocated should result in a segfault. Use PROT_NONE to ensure that. Change-Id: Ic5da9392fabbe78c9ec14f98e8b7b47e5267a98a [ROCm/ROCR-Runtime commit: `62337b6c0a`]	2015-11-10 18:19:56 -05:00
Kent Russell	650232b83b	Use OUT_DIR for thunkroot variable Pick up the thunk from the correct location. It is no longer inside THUNK_ROOT, but instead part of the OUT folder. Change-Id: I41dd7dae243e66270d0ea7182f1ba119b18a1cfb [ROCm/ROCR-Runtime commit: `3786e18d99`]	2015-11-09 16:21:49 -05:00
Kent Russell	63c43d3404	Fix variable for RPM build Certain versions of rpmbuild need the variable to be outside of curly braces. This addresses that issue in that situation. Change-Id: Iff7200b332b9d8e41a4d7676ca14c5a32c075beb [ROCm/ROCR-Runtime commit: `4e4d4a81e1`]	2015-11-09 11:05:32 -05:00
Amber Lin	403eb13050	Add CPU cache information Fill up cache properties of CPU node by reading data from /proc/cpuinfo and /sys/devices/system/cpu/cpuX/cache/indexY Change-Id: I0a96760575e504e38962554f192c3fe66bea3c15 [ROCm/ROCR-Runtime commit: `b6f65f9849`]	2015-11-09 07:16:24 -05:00
Ramesh Errabolu (xN/A) TX	6445dcac80	Update Binarysearch and BlackScholes Hsa Sample to support FULL and BASE Profiles [git-p4: depot-paths = "//depot/stg/hsa/drivers/hsa/runtime/": change = 1206169] [ROCm/ROCR-Runtime commit: `2f0425d354`]	2015-10-30 17:53:25 -05:00
Kent Russell	67d98aa280	Add option to create release build for Thunk By adding REL=1 to the make command line (e.g. make REL=1 deb), we can create a release build of the Thunk. This will not affect existing functionality, and will only have an effect if REL=1 is specified on the command line, or in the build_thunk.sh script. Change-Id: Iedc3b6094e70a4ebd726499eda56013cc254b83d [ROCm/ROCR-Runtime commit: `cb3a664065`]	2015-10-30 14:05:40 -04:00
Kent Russell	39d2152a3f	Cleanup RPM build of thunk Change-Id: Ib437a3ec7be9f5aa7d3ef9e53c13e3c5e7b7382e [ROCm/ROCR-Runtime commit: `cabbcbabff`]	2015-10-30 08:42:16 -04:00
Felix Kuehling	b900df9215	Use correct aperture for _fmm_unmap_from_gpu_scratch Passing in the wrong aperture resulted in failure to unmap scratch. Change-Id: Icd7423abfb1bcc773b33becffcbefc233f4ff340 [ROCm/ROCR-Runtime commit: `bd93eecc64`]	2015-10-29 18:26:15 -04:00
Philip Cox	782cea350c	Add SDMA IOCTL type to Create Queue function. Change-Id: I7e31507b761ca388b2cac93f994f6106de962f17 [ROCm/ROCR-Runtime commit: `0c234c7ef3`]	2015-10-29 10:25:41 -04:00
Kent Russell	f4889d439d	libhsakmt - Add make option to package thunk as RPM Add an option to libhsakmt to allow the thunk to be packaged as an RPM. The default will remain being built as-is, but this can now be packaged as an RPM by using "make src rpm" . build_thunk.sh will be modified to reflect this new option. Change-Id: I38e03d10cfb5035bdf0a87635a784c47a709a5b6 [ROCm/ROCR-Runtime commit: `6ceed7def3`]	2015-10-29 07:49:13 -04:00
Harish Kasiviswanathan	595f51899f	Remove erroneous and redundant memory banks reported hsaKmtGetNodeMemoryProperties - - Return only HSA_HEAPTYPE_SYSTEM memory for CPU only node. - For dGPU remove redundant HSA_HEAPTYPE_FRAME_BUFFER_PRIVATE entry. Change-Id: I0349be39b8409a0fd64a038b8b2956191356d937 [ROCm/ROCR-Runtime commit: `f885e551aa`]	2015-10-23 18:43:46 -04:00
Harish Kasiviswanathan	71dc59b245	Correct parameter name for topology_is_dgpu() The function expects device_id and not gpu_id. Change-Id: I79794fd4e58e6e6adb26659da30f3e4d8e108434 [ROCm/ROCR-Runtime commit: `69662da3dc`]	2015-10-23 18:43:45 -04:00
Harish Kasiviswanathan	d7589c62e1	Unify fmm_get_aperture_xxx functions Unify fmm_get_aperture_base and fmm_get_aperture_limit into one function. Make the return value to HSAKMT_STATUS. Change-Id: I0b3f563ffb268947ab891f4935f61788d0af0e01 [ROCm/ROCR-Runtime commit: `cb53548c89`]	2015-10-23 18:43:34 -04:00
Felix Kuehling	29561cc13e	Implement flat scratch support for dGPU hsaKmtAllocMemory only allocates aligned address space and sets up the scratch_physical aperture to match the allocated address space. Actual allocation of backing memory happens in hsaKmtMapMemoryToGPU. Change-Id: Ie709815ab9bedb3d682e096b4005fdfb5e94d3a7 [ROCm/ROCR-Runtime commit: `5131ab4e64`]	2015-10-22 20:40:22 -04:00
Felix Kuehling	17a31f1cce	Allow address space allocations with specific alignment Change-Id: I4bf7f7ac53c3921dd330b9dc7a40582611f88b69 [ROCm/ROCR-Runtime commit: `149261ba09`]	2015-10-22 20:27:49 -04:00
Ben Goz	4ec82c7edd	Casting local memory size to uint64_t Change-Id: I5c2010056b84ac01bb65361210d2a693e437050a Signed-off-by: Ben Goz <ben.goz@amd.com> [ROCm/ROCR-Runtime commit: `55b1a5dc43`]	2015-10-22 09:05:34 -04:00
Ben Goz	a511b7c4f7	Adding support for new AQL Queue Memory allocation Change-Id: If84fc4b961627dbdd0b77b1c509a3c9a4c709b9f Signed-off-by: Ben Goz <ben.goz@amd.com> [ROCm/ROCR-Runtime commit: `e61500c46e`]	2015-10-22 13:13:54 +03:00
Felix Kuehling	55bd82cd89	Fix node 0 system memory allocation for dGPU This is a hack to allow the Runtime to allocate system memory with PreferredNode=0 on a dGPU system. We allocated it from Node 1 instead so that the node 1 GPU can map the memory. A proper fix will be implemented together with multi-GPU support. Change-Id: Ieb52599e5275781c04ee34405ea850bf782c523a [ROCm/ROCR-Runtime commit: `590c8e522c`]	2015-10-21 20:00:01 -04:00
Felix Kuehling	ebf6ec1806	Reserve more SVM process address space Try to reserve as much SVM address space as GPUVM can address. Implement a fallback scheme to smaller sizes if larger allocations fail or are not addressable by the GPU, down to an (arbitrary) minimum of 4GB. Change-Id: I770177834cc9e6ddd6ef4f20d789eab63c8055cb [ROCm/ROCR-Runtime commit: `39bde26c9b`]	2015-10-19 17:44:23 -04:00
Andres Rodriguez	f6eba4d367	make: add 'deb' target for creating deb packages When 'make deb' is run, create a libhsakmt.deb archive that installs libhsakmt into the appropriate folder on the target where the dynamic linker can find it. Change-Id: I32de7198975f7831e509a67371e78456982b5c42 [ROCm/ROCR-Runtime commit: `0df346aaf9`]	2015-10-16 19:13:51 -04:00
Harish Kasiviswanathan	d38a3f1438	Fix init process apertures Kernel ioctl AMDKFD_IOC_GET_PROCESS_APERTURES returns process apertures only for GPU nodes. The current implementation assumed that this list of GPU nodes returned by the ioctl has one to one correspondence to sysfs topology nodes. This fails when non-GPU nodes exist in topology as in case of Intel + gfx802. Fix this by using gpu_id (./sys/.../kfd/topology/nodes/1/gpu_id) to map information obtained from kernel ioctl call. Change-Id: I4ab8ae5354f12cf0b6609fc4b24182b82eb3677f [ROCm/ROCR-Runtime commit: `5cc56a2647`]	2015-10-15 15:38:14 -04:00
Harish Kasiviswanathan	462a775ec3	Fix hard-coded usage of Node 0 Use appropriate NodeId instead Change-Id: I46af93b76978fea7bedb34457fcc0864ed4fe2d4 [ROCm/ROCR-Runtime commit: `b6c6f79143`]	2015-10-14 17:27:38 -04:00
Felix Kuehling	574fcdd340	Fix various dgpu memory management issues Fix TONGA_PAGE_SIZE value and move it to libhsakmt.h for using it consistently in all places that require the same alignment for the same reason. Create a generic alignment helper macro to replace some incorrect hand-coded size alignments. Move virtual address and size alignments down into aperture management functions. Alignment is a per-aperture property that is set during fmm_init_process_apertures. Doing the alignment there ensures that all allocations in the same aperture are aligned the same way. Finding objects by size and address can take the alignment into account. Also align the size of physical allocations to back aligned virtual address allocations. CPU mappings do not need to be aligned. Map anonymous pages over released memory mappings to allow the backing pages to be released, while keeping the address space reserved. Add alignment parameter to free_exec_aligned_memory_gpu to match the interface of allocate_exec_aligned_memory_cpu. It doesn't make sense to allow an alignment parameter in one but assume a specific alignment in the other. Change-Id: I74226ca6938f4948f643e5aee1d474720cd89e78 [ROCm/ROCR-Runtime commit: `6a5ca4bc5a`]	2015-10-13 19:14:56 -04:00
Felix Kuehling	a4c4170906	Add support for gfx803 Create new device_info and add device ID. Add helper macros to identify chip families (VI, discrete). For now gfx803 behaves like gfx802. But if necessary we can have gfx802 or gfx803-specific code paths or workarounds in the future. Change-Id: I61b4ffef7dd7796bb34cb01fbff0089bd49507bb [ROCm/ROCR-Runtime commit: `0fc0a5b526`]	2015-10-09 17:40:54 -04:00
Harish Kasiviswanathan	72cc7c2234	Fix assert failure for CPU only node hsa_gfxip_table lists only (supported) GPUs. So assert fail only when a non-supported GPU is detected. Change-Id: I6207dc7cd55860c8b3348b6a4ca6102131975722 [ROCm/ROCR-Runtime commit: `758824db17`]	2015-10-08 11:52:59 -04:00
Harish Kasiviswanathan	ee891bed05	Refactor hsa_gfxip_table lookup Also fix some formatting Change-Id: Ia04d7a9cd3972cc4d283c576161de639027aac6d [ROCm/ROCR-Runtime commit: `f2a46101d3`]	2015-10-08 11:52:59 -04:00
Felix Kuehling	8f0b7e6a76	Update HsaMemFlags.ui32.CoarseGrain comment As advised by Paul Blinzer Change-Id: Icabf4acd94866ddbbe53faf48a71e1113f0c76b6 [ROCm/ROCR-Runtime commit: `b94ae66c62`]	2015-10-05 16:48:50 -04:00
Felix Kuehling	f09c6b84af	Setup APE1 on dGPU for coherent access The default is non-coherent access for better performance on dGPU. Disabled hsaKmtSetMemoryPolicy function on dGPU to prevent app from overriding the APE1 settings at runtime. Fixed dGPU VM aperture limit to be inclusive. Change-Id: I378ff74a654f533572775c0c97c19779a56bc6d9 [ROCm/ROCR-Runtime commit: `8e836f8183`]	2015-10-02 17:20:33 -04:00
Felix Kuehling	c3a1263604	Add all gfx802 device IDs to supported_devices Without this, queue creation segfaults on unknown devices. Change-Id: Ieea0bc4783e7313b3dcdabf03ab1269e3670b217 [ROCm/ROCR-Runtime commit: `7505893cc7`]	2015-10-02 15:33:37 -04:00
Felix Kuehling	9dd2664db2	Fix returning of base and limit on dgpu_mem_init reinitialization Change-Id: I1d1500ee57c3b85fc39c224d233a62097f981719 [ROCm/ROCR-Runtime commit: `f3aaba0621`]	2015-09-30 18:07:04 -04:00
Felix Kuehling	048207626a	Add CoarseGrain memory flag Change-Id: If8ac0339ae8c809c6e6a4f56592a4061d110ea94 [ROCm/ROCR-Runtime commit: `f2f45cc0e4`]	2015-09-30 18:07:04 -04:00
shaoyunl	d30daaba5d	Initial support for CWSR on thunk 1. Add IOCTL defines to set trap handler 2. Add control stack size information on create queue argument. 3. Increase the total save&restore area size for carrizo to include the control stack size. Signed-off-by: Shaoyun Liu <Shaoyun.liu@amd.com> Change-Id: Iccf15e073b7db2519e96e7f7b46a89d57ab9a4df [ROCm/ROCR-Runtime commit: `2d63ee7b8f`]	2015-09-25 15:12:25 -04:00
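
Commit 4c84c85252 in the log above notes that shared_cpu_map can list more than 32 bits, written as comma-separated groups (XXXXXXXX,XXXXXXXX,...). The sketch below shows one way such a string could be parsed. It is not the Thunk's code: parse_cpumask and cpumask_test_cpu are hypothetical names, and it assumes the usual sysfs layout in which each group is a 32-bit hexadecimal word and the most significant group is printed first.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper (illustration only, not from libhsakmt).
 * Parses a mask such as "00000003,ffffffff\n" into 32-bit words,
 * with words[0] holding CPUs 0..31. Returns the number of words
 * parsed, or -1 on malformed input or if max_words is too small. */
int parse_cpumask(const char *mask, uint32_t *words, int max_words)
{
    int n = 0;
    const char *p = mask;

    assert(mask != NULL && words != NULL);

    for (;;) {
        char *end;
        unsigned long group = strtoul(p, &end, 16);

        if (end == p || n == max_words)
            return -1;              /* no hex digits, or buffer full */
        words[n++] = (uint32_t)group;
        if (*end != ',')
            break;                  /* end of string or trailing newline */
        p = end + 1;
    }

    /* Assumed layout: most significant group first, so reverse the
     * array to put the lowest-numbered CPUs at index 0. */
    for (int i = 0, j = n - 1; i < j; i++, j--) {
        uint32_t tmp = words[i];
        words[i] = words[j];
        words[j] = tmp;
    }
    return n;
}

/* Returns 1 if the given CPU index is set in the parsed mask, else 0. */
int cpumask_test_cpu(const uint32_t *words, int n, int cpu)
{
    int idx = cpu / 32;

    if (cpu < 0 || idx >= n)
        return 0;
    return (int)((words[idx] >> (cpu % 32)) & 1u);
}
```

With this layout a two-group mask describes up to 64 CPUs, which a single 32-bit integer cannot hold; that is the limitation the commit message describes.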


2930 Commits