events_page is unprotected from multiple allocation. The first event
creation ioctl is unprotected from a race with args.event_page_offset
being set (for page setup) and null (all subsequent invocations).
Change-Id: I40ba712a17e9eff257785f90c553a74ad09c661d
Signed-off-by: Yair Shachar <Yair.Shachar@amd.com>
Use the object size when freeing address space, instead of the
parameter passed in by the caller. The parameter may be incorrect
due to app or runtime bugs, or when the buffers is an AQL ring
buffer with double mapping workaround.
Change-Id: I00bb31d4520ef969a49d6d5ea723e8a33418acc3
The alignment performed in vm_find_object_by_address isn't sufficient
because it doesn't take into account the offset from the start of the
page.
This fixes a bug where certain unaligned userpointers and sizes fail
to register correctly.
Change-Id: I17872e264467a619f5e1bedb7e1ed3d994a856bf
Use the aligned size of the buffer objects for CPU unmapping in
__fmm_release instead of relying on the unaligned size passed in by
the caller.
Change-Id: If986ec24e9a05d32981549fddbf143221fc40bac
Allocate SVM address space for the registered memory and use new
userptr support in KFD to create a system memory BO associated with
the given user pointer. Map this BO at the SVM address for CPU
access.
MapMemoryToGPU can be used with the registered user pointer and
will return the SVM address as alternate GPUVA.
Change-Id: I4886e193c51fb6870a567878870c36bf8b5c3748
Details:
- add HsaGetInfo program that prints out all available CPU, GPU and their respective memory pools.
[git-p4: depot-paths = "//depot/stg/hsa/drivers/hsa/runtime/": change = 1237219]
Few more counters are now available in GFX8 register specs. So adding
them. Also for gfx700 and gfx801 report correct number of SQ perf counter slots
Change-Id: I9e6b4b10238230aabeccbfaa5e491a28b5e54f2d
Allocations from GPU nodes will return VRAM, not system memory.
Only non-paged allocation from GPU nodes is supported. System
memory can only be allocated from CPU nodes (usually node 0).
The HostAccess flag is no longer used to distinguish the memory
type. It only indicates, whether the memory is mapped for CPU
access.
Maintain compatibility with broken KfdTests by returning system
memory for paged-memory requested from GPU nodes.
Change-Id: I514defede735f55e6de436f41944125b6f2c4ccf
This is thunk part of the CWSR support.
1. SDMA queue don't support CWSR , no necessary to allocate the context save/restore memory
2. Allocate the context save/restore memory in local frame buffer for dGPU
Change-Id: Ie83506f0cced2a5a537c49d68125796d831c2764
All tonga page size alignment is done in the memory management
functions in fmm.c. All other code only specifies the minimum
alignment it needs and lets fmm.c handle the HW-specific
alignment.
Clean up aligned-exec memory allocation in queue.c to remove
hard-coded TONGA_PAGE_SIZE alignments and remove code duplication.
Make sure alignments are consistent between allocate and free.
Change-Id: Ia8923448173d1cef315af24cebff12adef385cb0
HSA thunk API returns null when querying for GPU node marketing
names due to empty system topology file.
- Add marketing names to device GFX IP data structs.
- Modify name retrieval to pull from data structs instead of file.
Signed-off by: David Ogbeide <davidboyowa.ogbeide@amd.com>
Change-Id: I30ea04111be7e0df2e93894f801fbeb414ffa790
This prevents the library from being unloaded at runtime, even when
dlclose is called. This preserves global variables, such as state
about the SVM address space and avoids catastrophic leaks on dlclose.
Change-Id: I34f1d19a450835200e9d4815458e8d1b3045053c
Added application and driver to serve as the starting point for RDMA
unit test uility.
v2: Added initial mmap support
v3: Fixed logic to find correct ioctl handler
v4: Fixed logic in mmap to find correct pages table
Change-Id: Iaf97c0eb2acef2160d542c71afed58cf400414f7
Signed-off-by: Serguei Sagalovitch <Serguei.Sagalovitch@amd.com>
Stop using NUM_OF_SUPPORTED_GPUS. For now the definitions itself cannot
be removed as ioctl code is in upstream Kernel.
Change-Id: If846625a8ad5062d5483e762850c793d3c00b9d0
Fix hsaKmtRegisterMemory to be a no-op for now and move the multi-GPU
implementation to hsaKmtRegisterMemoryToNodes. Make GPU memory mappings
of host memory visible to all GPUs by default. Device memory is still
visible to the allocating GPU only by default (but can be overridden
with hsaKmtRegisterMemoryToNodes for experimenting with P2P).
Change-Id: I73408afbe3b10c8dad2ab3a780f58413249692e6
When the Thunk is initialized multiple times in the lifetime of a single process
, some global resources are leaked. This can happen when dlopen and dlclose are
used to load the library at runtime, rather than linking the runtime against
the Thunk. This patch adds the destructor to release global resources when
dlclose is called.
Change-Id: Ia00da0d41f095d0b2706f98c0e75effedd596f49
This will also fix out of bound access in functions
fmm_get_aperture_base_and_limit and fmm_release
Change-Id: Icf064c46647e69a069126171dbacdf3d5b27f972
Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
dgpu_aperture and dgpu_alt_aperture will be shared by all dGPUs.
Change-Id: I814495e43b51acabdc6266cfa8d83db5a062e20d
Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Break from the for-loop once dgpu VM range is found, otherwise the
length is reduced by half
Change-Id: Ie602054c16ea69ea1cbb75e804ead551bc3615c0
Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Previous code only works for systems where shared_cpu_map lists 32 or less
bits. Some systems list more than 32 bits and express them as
XXXXXXXX,XXXXXXXX,.... This patch adds that calculation. Also increase
MAX_CPU_CORES and MAX_CACHES to accommodate more advanced systems.
Change-Id: Ia5c7041866456a6aa3b66f8f0f951022d7c51028
Access to reserved address space that has not been allocated should
result in a segfault. Use PROT_NONE to ensure that.
Change-Id: Ic5da9392fabbe78c9ec14f98e8b7b47e5267a98a
Pick up the thunk from the correct location. It is no longer inside
THUNK_ROOT, but instead part of the OUT folder.
Change-Id: I41dd7dae243e66270d0ea7182f1ba119b18a1cfb
Certain versions of rpmbuild need the variable to be outside of curly
braces. This addresses that issue in that situation.
Change-Id: Iff7200b332b9d8e41a4d7676ca14c5a32c075beb