Track pointer info for sub-2MB fragment allocations in allocation_map_.
Add fragment support to IPC.
Change-Id: I00cfc2e2fa289aac90a4718c392f9bb056a61a87
Blocks inside the HsaCounterProperties structure are not a fixed size.
The size varies with the number of counters in the block: the Counters
array in HsaCounterBlockProperties has a different length in every
block. The current implementation assumes a fixed size, so the next
block overwrites the previous block's Counters. This patch changes the
array implementation to use a pointer so that the next block is placed
at the correct position.
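
A minimal sketch of walking variable-sized blocks with a pointer
instead of indexing a fixed-size array. The struct and function names
below are hypothetical stand-ins, not the real Thunk definitions; only
the idea (advance by the actual size of each block) comes from the
description above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the real Thunk types; field names are
 * illustrative only. */
struct counter {
    uint32_t id;
};

struct counter_block {
    uint32_t num_counters;
    struct counter counters[];  /* flexible array: length varies per block */
};

/* Bytes occupied by one block, including its trailing counters. */
static size_t block_size(const struct counter_block *b)
{
    return sizeof(*b) + (size_t)b->num_counters * sizeof(struct counter);
}

/* Return the block that follows b in a packed buffer. Indexing with
 * blocks[i + 1] would be wrong here, because that assumes every block
 * has the same size. */
static struct counter_block *next_block(struct counter_block *b)
{
    assert(b != NULL);
    return (struct counter_block *)((uint8_t *)b + block_size(b));
}
```

With this layout, a block holding three counters is followed by the
next block 16 bytes later, so writing the second block no longer
touches the first block's counters.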
Change-Id: I72800f4db5f2a68215fba477a61ca07ec99054bf
Signed-off-by: Amber Lin <Amber.Lin@amd.com>
Applications may try to allocate lots of host memory and reach the mmap
limit (/proc/sys/vm/max_map_count). When an application fails to
allocate memory and calls hsaKmtFreeMemory to release the memory, Thunk
fails to reduce the map count, so the following hsaKmtAllocMemory calls
continue to fail, which doesn't make sense to the application. This
patch checks the return value of the mmap to NORESERVE. If it fails and
the error number is ENOMEM, reduce the map count with munmap and map it
again immediately.
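
The retry described above can be sketched roughly as follows. The
helper name release_range and its return convention are assumptions
made for illustration; only the mmap/munmap calls and the ENOMEM check
follow the commit text.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical helper: hand the pages of [addr, addr + size) back to
 * the kernel while keeping the address range reserved.
 * Returns 0 on success, -1 on failure. */
static int release_range(void *addr, size_t size)
{
    const int flags = MAP_ANONYMOUS | MAP_NORESERVE | MAP_PRIVATE | MAP_FIXED;
    void *ret;

    assert(addr != NULL && size > 0);

    ret = mmap(addr, size, PROT_NONE, flags, -1, 0);
    if (ret == MAP_FAILED && errno == ENOMEM) {
        /* Possibly at the max_map_count limit: drop the old mapping
         * first to lower the count, then reserve the range again. */
        munmap(addr, size);
        ret = mmap(addr, size, PROT_NONE, flags, -1, 0);
    }
    return ret == MAP_FAILED ? -1 : 0;
}
```

Note that between the munmap and the second mmap the range is briefly
unreserved, which is why the remap is done immediately.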
Change-Id: I127cb479dfd86b199172eef269d59426f23859ea
Signed-off-by: Amber Lin <Amber.Lin@amd.com>
Support all fragment sizes up to 2MB by aligning buffers according
to their size.
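
One way to align buffers according to their size, as a sketch only. The
commit text does not state the exact policy; this example assumes the
alignment is the smallest power of two that covers the buffer, with an
assumed 4KB minimum and the 2MB cap from the description. All names are
hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define MIN_ALIGN_4K  ((size_t)4096)     /* assumed minimum alignment */
#define FRAGMENT_MAX  ((size_t)2 << 20)  /* 2MB, from the commit text */

/* Smallest power-of-two alignment that covers size, clamped to the
 * range [4KB, 2MB]. */
static size_t fragment_alignment(size_t size)
{
    size_t align = MIN_ALIGN_4K;

    assert(size > 0);
    while (align < size && align < FRAGMENT_MAX)
        align <<= 1;
    return align;
}

/* Round addr up to a multiple of align (align must be a power of two). */
static size_t align_up(size_t addr, size_t align)
{
    return (addr + align - 1) & ~(align - 1);
}
```

Under these assumptions a 64KB buffer is placed on a 64KB boundary,
while anything of 2MB or more is aligned to 2MB.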
Change-Id: I82b7ef8be6f1507d941e5c97edb6618adf8c66de
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Due to the max_map_count issue, disable guard pages by default.
Change-Id: Ic9dfe69b621733e9fac86831b008a122994a67e7
Signed-off-by: Amber Lin <Amber.Lin@amd.com>
Use the mapped_device_id_array size when allocating temp_node_id_array
for unmapping queues in fmm_map_to_gpu_nodes. registered_device_id_array
size may be 0. Also, this temporary array is small enough to allocate it
on the stack. Malloc and free are overkill here.
Fix potential memory leak when registering the same device ID array
multiple times.
Change-Id: I83f09fd0925d9de7cf11bf72ba0ebb77273f587d
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
The IOMMU path in sysfs was amd_iommu. After multiple-device support
was implemented, the path was replaced with amd_iommu_<index>. The
current Thunk spec is not clear about how to support multiple instances
in one block, and there are no products with multiple IOMMUs yet at
this point. This patch changes the path to support both amd_iommu and
amd_iommu_0 for Carrizo.
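
A rough sketch of accepting either directory name. The function name
and the base-directory parameter are assumptions for illustration; the
actual sysfs base path is not given in this message, so it is passed in
by the caller.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/stat.h>

/* Look for "<base>/amd_iommu", then "<base>/amd_iommu_0". On success,
 * leave the matching path in out and return 0; return -1 if neither
 * directory exists. */
static int find_iommu_dir(const char *base, char *out, size_t out_len)
{
    static const char *const names[] = { "amd_iommu", "amd_iommu_0" };
    struct stat st;
    size_t i;

    assert(base != NULL && out != NULL);

    for (i = 0; i < sizeof(names) / sizeof(names[0]); i++) {
        int n = snprintf(out, out_len, "%s/%s", base, names[i]);

        if (n < 0 || (size_t)n >= out_len)
            continue;  /* path would not fit in the buffer */
        if (stat(out, &st) == 0 && S_ISDIR(st.st_mode))
            return 0;
    }
    return -1;
}
```

Trying the legacy name first keeps older kernels working, while the
indexed name covers kernels with multiple-device support.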
Change-Id: I3beea2fc78d96296232226191501a02ccf20d6b1
Signed-off-by: Amber Lin <Amber.Lin@amd.com>
Add pr_debug to all memory APIs and pr_err to some failure cases.
Change-Id: I8b519a1228cc19e6c04118fd87432e7f48f3cbf9
Signed-off-by: Amber Lin <Amber.Lin@amd.com>
Simplify fmm_map_to_gpu_nodes code. Also fix a memory leak in this change.
Change-Id: I3487338b78c915de44588d0206bac4c53e728c60
Signed-off-by: Amber Lin <Amber.Lin@amd.com>
Added an API for creating signals with attributes.
Added two APIs for IPC operations on signals.
Initial use of exceptions for error handling.
Add ref counting to signals.
Removed spin loops from signal destructors.
Signals are no longer to be destroyed with delete; use DeleteSignal instead.
Added delete safety to doorbells.
Added secondary hsa_signal_t -> Signal* translation path for IPC enabled signals.
Change-Id: Id59065d002f0c2566b0a9425694da2ed27cb7d7f
Fall back to older apertures API and old events page size if the new APIs
fail. This allows running on current upstream kernels (with only minor
fixes) on gfx801 and enables testing of further changes during upstreaming.
Change-Id: I9d86d4f576e52fcbb5bc158d80f1bf41261e4e87
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Removed -Werror from CFLAGS for older versions of gcc. There will be
some warning messages on older gcc versions, but the build is OK.
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Change-Id: Icf556625cb870c4ad73e1d89f3d4ade3a96e821f
Non-paged system memory is allocated with node id 0. However, since a
gpu node is required for allocating system memory via KFD, the first
dgpu is used. In hsaKmtShareMemory(), if the memory is system memory,
use the same (first) dgpu.
Change-Id: I85789a89a4e4f7888e3826826401ea89ce4d1718
Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Topology uses cpuid to get CPU cache information. However, when running
under Valgrind, the data returned from cpuid is not from the processor
we set affinity to. Instead, it all comes from one specific processor.
As a quick workaround so other teams can continue their work, this
patch reports the CPU cache from that specific processor and ignores
the others.
Change-Id: I5cfac2329dac277f3dbde1be92fa26e085465401
Signed-off-by: Amber Lin <Amber.Lin@amd.com>