The kernel file was changed recently, so update the corresponding file in the Thunk.
Change-Id: I359a389fa9d91641114c7fb75f420ee6b16f467a
Signed-off-by: Yong Zhao <yong.zhao@amd.com>
[ROCm/ROCR-Runtime commit: 8126ddc77e]
one or both directions. Users can enumerate the pools reported
by the system to specify which pools serve as source / destination
Change-Id: I8e6d0adb3743b3328dd3ce9152762ca840ea613b
[ROCm/ROCR-Runtime commit: c2caa5ae2c]
Since access may only be manipulated on whole pages, suballocator fragments must cooperate to set the page's access.
Since the KFD does not migrate memory on access changes, this implementation makes agent access sticky across the requests in a fragmented page.
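
The sticky behavior can be pictured with a minimal sketch. It assumes agent access is a bitmask kept per page; PageAccessTable, AgentMask, Allow and Current are hypothetical names for illustration, not the runtime's code. Every request from any fragment is OR-ed into the page's mask, and nothing clears a bit.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>

// Hypothetical illustration only: none of these names come from ROCR-Runtime.
using AgentMask = std::uint32_t;  // one bit per agent

class PageAccessTable {
 public:
  explicit PageAccessTable(std::uintptr_t page_size) : page_size_(page_size) {
    assert(page_size != 0);
  }

  // A fragment at fragment_addr requests access for `agents`. The page keeps
  // the union of every request made by any fragment on it (sticky access).
  // Returns the mask that would have to be applied to the whole page.
  AgentMask Allow(std::uintptr_t fragment_addr, AgentMask agents) {
    AgentMask& mask = page_mask_[fragment_addr / page_size_];
    mask |= agents;
    return mask;
  }

  // Access currently recorded for the page containing addr (0 if none).
  AgentMask Current(std::uintptr_t addr) const {
    auto it = page_mask_.find(addr / page_size_);
    return it == page_mask_.end() ? 0 : it->second;
  }

 private:
  std::uintptr_t page_size_;
  std::unordered_map<std::uintptr_t, AgentMask> page_mask_;
};
```

With a 4096-byte page, Allow(0x1000, 0x1) followed by Allow(0x1800, 0x2) leaves both agents with access to the page shared by the two fragments.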
Change-Id: I88479ed45fb40e9782b704526a7b8ffb22e7bd76
[ROCm/ROCR-Runtime commit: e9a6f2c3e6]
GCC can't reasonably be told that the lock ptr isn't null. Adding a private bool
allows the branch to be eliminated, along with the bool.
Change-Id: I0605d69474d6a6e6951be93c0af1d8caf3f77124
[ROCm/ROCR-Runtime commit: 9dfdce5b3c]
Track pointer info for sub 2MB fragment allocations in allocation_map_.
Add fragment support to IPC.
Change-Id: I00cfc2e2fa289aac90a4718c392f9bb056a61a87
[ROCm/ROCR-Runtime commit: 117be0b55a]
Blocks inside the HsaCounterProperties structure are not a fixed size.
The size varies with the number of counters in the block: the size of
Counters in HsaCounterBlockProperties differs from block to block. The
current implementation assumes a fixed size, so the next block
overwrites the previous block's Counters. This patch changes the array
implementation to use a pointer, so the next block is placed at the
correct position.
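
A minimal sketch of walking variable-size blocks with a pointer rather than a fixed-size array index. BlockHeader and Counter are simplified stand-ins, and their layout is an assumption, not the real HsaCounterBlockProperties definition.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Simplified stand-ins for the Thunk structures; the layout is an assumption.
struct Counter {
  std::uint32_t id;
};

struct BlockHeader {
  std::uint32_t num_counters;
  // Followed in memory by num_counters Counter entries.
};

// Size in bytes of one block holding n counters.
inline std::size_t BlockSize(std::uint32_t n) {
  return sizeof(BlockHeader) + n * sizeof(Counter);
}

// First counter of a block (stored directly after the header).
inline Counter* CountersOf(BlockHeader* block) {
  return reinterpret_cast<Counter*>(block + 1);
}

// Advance by the block's real size instead of assuming a fixed stride.
inline BlockHeader* NextBlock(BlockHeader* block) {
  return reinterpret_cast<BlockHeader*>(
      reinterpret_cast<unsigned char*>(block) + BlockSize(block->num_counters));
}

// Packs blocks back to back; counter ids are numbered consecutively.
inline std::vector<unsigned char> PackBlocks(
    const std::vector<std::uint32_t>& counts) {
  std::size_t total = 0;
  for (std::uint32_t n : counts) total += BlockSize(n);

  std::vector<unsigned char> buf(total);
  BlockHeader* block = reinterpret_cast<BlockHeader*>(buf.data());
  std::uint32_t next_id = 0;
  for (std::uint32_t n : counts) {
    block->num_counters = n;
    for (std::uint32_t i = 0; i < n; ++i) CountersOf(block)[i].id = next_id++;
    block = NextBlock(block);  // lands exactly after this block's counters
  }
  return buf;
}
```

Indexing the blocks as a fixed-size array would place the second header inside the first block's counters, which is the overwrite described above.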
Change-Id: I72800f4db5f2a68215fba477a61ca07ec99054bf
Signed-off-by: Amber Lin <Amber.Lin@amd.com>
[ROCm/ROCR-Runtime commit: 117fa5034b]
Applications may try to allocate a lot of host memory and reach the mmap
limit (/proc/sys/vm/max_map_count). When an application fails to allocate
memory and calls hsaKmtFreeMemory to release the memory, the Thunk fails
to reduce the map count, so subsequent hsaKmtAllocMemory calls continue
to fail, which makes no sense to the application. This patch checks
the return value of the mmap to NORESERVE. If it fails and the error
number is ENOMEM, reduce the map count with munmap and map the range
again immediately.
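
A minimal sketch of the fallback described above, for Linux. ResetToNoReserve is a hypothetical helper name, and the protection and flag choices are assumptions rather than the Thunk's actual code.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <sys/mman.h>

// Hypothetical helper, not the Thunk's function: puts a range back into a
// reserved, inaccessible state and retries once if the map limit is hit.
bool ResetToNoReserve(void* addr, std::size_t size) {
  assert(addr != nullptr && size != 0);
  const int prot = PROT_NONE;
  const int flags = MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE | MAP_FIXED;

  if (mmap(addr, size, prot, flags, -1, 0) != MAP_FAILED) return true;
  if (errno != ENOMEM) return false;

  // At the map limit the remap itself can fail. Unmapping the range first
  // lowers the map count; the range is then mapped again right away.
  if (munmap(addr, size) != 0) return false;
  return mmap(addr, size, prot, flags, -1, 0) != MAP_FAILED;
}
```

Between the munmap and the second mmap the range is briefly unreserved, so in a multithreaded process another mapping could land there; the sketch does not guard against that.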
Change-Id: I127cb479dfd86b199172eef269d59426f23859ea
Signed-off-by: Amber Lin <Amber.Lin@amd.com>
[ROCm/ROCR-Runtime commit: a81b29890c]
Support all fragment sizes up to 2MB by aligning buffers according
to their size.
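
One way to read "aligning buffers according to their size" is sketched below. It assumes the alignment is the smallest power of two that covers the buffer, capped at 2MB; that rule and the names kMaxFragment, AlignmentFor and AlignUp are assumptions, not taken from the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kMaxFragment = std::size_t{2} << 20;  // 2MB

// Assumed rule: smallest power of two >= size, never more than 2MB.
inline std::size_t AlignmentFor(std::size_t size) {
  std::size_t alignment = 1;
  while (alignment < size && alignment < kMaxFragment) alignment <<= 1;
  return alignment;
}

// Round addr up to the next multiple of a power-of-two alignment.
inline std::uintptr_t AlignUp(std::uintptr_t addr, std::size_t alignment) {
  assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
  const std::uintptr_t mask = static_cast<std::uintptr_t>(alignment) - 1;
  return (addr + mask) & ~mask;
}
```

Under this rule a 5000-byte buffer is placed on an 8192-byte boundary, and anything of 2MB or more is aligned to 2MB.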
Change-Id: I82b7ef8be6f1507d941e5c97edb6618adf8c66de
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
[ROCm/ROCR-Runtime commit: 52598cf37e]
Due to the max_map_count issue, disable guard pages by default.
Change-Id: Ic9dfe69b621733e9fac86831b008a122994a67e7
Signed-off-by: Amber Lin <Amber.Lin@amd.com>
[ROCm/ROCR-Runtime commit: 65d680c035]
Use the mapped_device_id_array size when allocating temp_node_id_array
for unmapping queues in fmm_map_to_gpu_nodes. registered_device_id_array
size may be 0. Also, this temporary array is small enough to allocate it
on the stack. Malloc and free are overkill here.
Fix potential memory leak when registering the same device ID array
multiple times.
Change-Id: I83f09fd0925d9de7cf11bf72ba0ebb77273f587d
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
[ROCm/ROCR-Runtime commit: 395ecaa985]
The IOMMU path in sysfs was amd_iommu. After multiple-device support was
implemented, the path was replaced with amd_iommu_<index>. The current
Thunk spec is not clear about how to support multiple instances in one
block, and no products have multiple IOMMUs at this point. This patch
changes the path to support both amd_iommu and amd_iommu_0 for Carrizo.
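
A minimal sketch of accepting either directory name. FindIommuDir and PathExists are hypothetical helpers, and the base directory is left to the caller because the full sysfs path is not given here. The existence check is injectable so that the lookup order can be exercised without a real sysfs tree.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <unistd.h>

// Default existence check using POSIX access().
inline bool PathExists(const std::string& path) {
  return access(path.c_str(), F_OK) == 0;
}

// Hypothetical helper: returns the first IOMMU directory found under base,
// trying the legacy name and then the indexed name, or "" if neither exists.
inline std::string FindIommuDir(
    const std::string& base,
    const std::function<bool(const std::string&)>& exists = PathExists) {
  assert(!base.empty());
  for (const char* name : {"amd_iommu", "amd_iommu_0"}) {
    std::string candidate = base + "/" + name;
    if (exists(candidate)) return candidate;
  }
  return std::string();
}
```

Trying the legacy name first is an arbitrary choice in this sketch; only one of the two names is expected to exist on a given kernel.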
Change-Id: I3beea2fc78d96296232226191501a02ccf20d6b1
Signed-off-by: Amber Lin <Amber.Lin@amd.com>
[ROCm/ROCR-Runtime commit: 369902bf5b]
Add pr_debug to all memory APIs and pr_err to some failure cases.
Change-Id: I8b519a1228cc19e6c04118fd87432e7f48f3cbf9
Signed-off-by: Amber Lin <Amber.Lin@amd.com>
[ROCm/ROCR-Runtime commit: 73707766ef]
Simplify fmm_map_to_gpu_nodes code. Also fix a memory leak in this change.
Change-Id: I3487338b78c915de44588d0206bac4c53e728c60
Signed-off-by: Amber Lin <Amber.Lin@amd.com>
[ROCm/ROCR-Runtime commit: f1a5248cf2]
Added an API for creating signals with attributes.
Added two APIs for IPC operations on signals.
Initial use of exceptions for error handling.
Added ref counting to signals.
Removed spin loops from signal destructors.
Signals are no longer to be destroyed with delete; use DeleteSignal instead.
Added delete safety to doorbells.
Added secondary hsa_signal_t -> Signal* translation path for IPC enabled signals.
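
A minimal sketch of how ref counting and a DeleteSignal entry point can replace direct delete. The class below is a hypothetical illustration and not the runtime's Signal; the live counter exists only to make object lifetime observable.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical illustration, not the ROCR-Runtime Signal class.
class Signal {
 public:
  // The creator holds the initial reference.
  static Signal* Create(int* live) { return new Signal(live); }

  void Retain() { refs_.fetch_add(1, std::memory_order_relaxed); }

  // Frees the object when the last reference is dropped.
  void Release() {
    if (refs_.fetch_sub(1, std::memory_order_acq_rel) == 1) delete this;
  }

  // Replaces `delete signal`: drops the creator's reference. The object
  // stays alive while other users still hold references.
  static void DeleteSignal(Signal* signal) { signal->Release(); }

 private:
  explicit Signal(int* live) : live_(live) { ++*live_; }
  // Private destructor: `delete signal` outside the class does not compile.
  ~Signal() { --*live_; }

  std::atomic<int> refs_{1};
  int* live_;  // demo-only count of live objects
};
```

Because the object outlives DeleteSignal while a user still holds a reference, the destructor has no need to spin waiting for users to finish.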
Change-Id: Id59065d002f0c2566b0a9425694da2ed27cb7d7f
[ROCm/ROCR-Runtime commit: c9642cf7af]
Also separate signal ABI block allocations from the runtime interface object.
Change-Id: If16763338db664f29163a1348f8f4c38cf0597b2
[ROCm/ROCR-Runtime commit: 2732b18092]