Commit graph

336 commits

Author SHA1 Message Date
Philip Yang 1bf93d4e89 Export microcode version of sDMA
Change-Id: I86fa5da5e72af13a2e76e6e3be4667a7220923d5
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
2018-03-19 08:38:50 -04:00
Felix Kuehling 19dacdecd3 Update kfd_ioctl.h from kernel
This adds new acquire_vm ioctl.

Change-Id: Ia6794bfd291706cecdb2d06f4902b324b48577df
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
2018-03-09 16:36:52 -05:00
Felix Kuehling 85e1a9bf5e Rework SVM aperture initialization
Query GPUVM aperture limits of all dGPUs to determine SVM aperture
base and limit. This depends on a recent KFD change that reports
the GPUVM aperture limits for dGPUs in the
AMDKFD_IOC_GET_PROCESS_APERTURES_NEW ioctl (drm/amdkfd: Simplify
dGPU SVM aperture handling).

Only initialize SVM aperture once, instead of once per GPU.

Don't call AMDKFD_IOC_SET_PROCESS_DGPU_APERTURE. It's not needed any
more and will not be upstreamed.

Change-Id: Ib3389e8ba18505ba15fc33f45fe8a57e690a565d
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
2018-03-09 16:36:49 -05:00
Felix Kuehling c5cfb7e25b Move dGPU memory aperture initialization
Define dgpu_mem_init before it's used and keep the code close to the
rest of the aperture initialization code.

Change-Id: I14ad11a364524a15affee9186b1298ba7d56d2c9
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
2018-03-09 15:00:12 -05:00
Yong Zhao 15e525af45 Add pkg config support in the hsakmt-roct-dev package
Change-Id: Ida6b3083bfc9405ef9b6b8e426dc7dc51d61a811
Signed-off-by: Yong Zhao <yong.zhao@amd.com>
2018-02-27 11:21:38 -05:00
Yong Zhao 2c426a026a Turn off the verbose building message
Change-Id: If4ebdb6f87fde9c3cc76b16c57e862bfb972ed5e
Signed-off-by: Yong Zhao <yong.zhao@amd.com>
2018-02-26 18:17:13 -05:00
Philip Yang 105291849f Close shmem file handle, to fix file handle leak
kfdtest hsaKmtOpenKFD failed after 1019 loops when using --gtest_loop=-1,
because the default limit on open file handles is 1024. lsof output showed
that the shmem file handle was not closed.

Change-Id: I474de2bae6c03e879a219dedf5f18639118b73e5
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
2018-02-23 10:50:52 -05:00
Jay Cornwall e2c353dc0d Allocate EOP queue local to GPU
On discrete GPUs place the EOP queue in VRAM. The reader/writer of this
queue is the CP and the size is small. Dispatch latency improves
through lower read latency in AQL completion phase.

Change-Id: Id8351dcddbd21fd7c7d699803c96434c9132db71
Signed-off-by: Jay Cornwall <Jay.Cornwall@amd.com>
2018-02-22 18:14:05 -05:00
Oak Zeng 25170c3c57 Support ptrace access invisible vram
Invisible device memory is mmapped as PROT_NONE.
Normal CPU access to the memory is still not allowed but
struct vm_area_struct will be created for the memory address
so ptrace can access the memory via the vma.

Change-Id: I07c69208716c920ccce33e6b494b610b61a0a7c1
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
2018-02-20 14:13:00 -05:00
Harish Kasiviswanathan 7de0199e99 CMA: Initialize SizeCopied return parameter
UCX test cases are reporting uninitialized values when CMA fails. The
application should ideally ignore SizeCopied when the function fails but
it doesn't. This is leading to wrong diagnosis.

v2: Fill in partial SizeCopied in case of failure

Change-Id: I6b7e1c19a8b702ec91ca64201a3dda27bd897877
Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
2018-02-08 12:46:40 -05:00
Yong Zhao 55bb61ff9c Revert "Workaround: make mmap memory resident for gfx902"
This reverts commit 716755b1de.

Change-Id: I9f4f0b6b426aeae4cb652b33cf0d4c0f57270ca5
Signed-off-by: Yong Zhao <yong.zhao@amd.com>
2018-02-02 12:31:06 -05:00
Laurent Morichetti 056ddbbc82 Silence Valgrind warnings
Change-Id: I8803f3d310fccd69d0d04b2464b00dccc40270e3
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
2018-01-25 16:48:17 -05:00
Amber Lin 7031a77428 Update README to reflect cmake change
New CMakeLists.txt sets a default module search so -DCMAKE_MODULE_PATH is
no longer required in the command.

Change-Id: I95189ce2f36016b7c4929239d0e512851bec5ef6
Signed-off-by: Amber Lin <Amber.Lin@amd.com>
2018-01-04 15:55:20 -05:00
Amber Lin 8bc83e1e9b Update README to include new requirement
The latest Thunk requires the user to belong to the video group. Add this
statement to README.md to notify external users on GitHub.

Change-Id: Id9843abf09de5b63a3b7c3f7b322bc9099c6ff1a
Signed-off-by: Amber Lin <Amber.Lin@amd.com>
2017-12-18 12:10:24 -05:00
Yong Zhao 716755b1de Workaround: make mmap memory resident for gfx902
Change-Id: I5f90f316740f7995d54cb083a6d7e05bc4e2966e
Signed-off-by: Yong Zhao <yong.zhao@amd.com>
2017-12-14 15:11:01 -05:00
Yong Zhao 0f83774635 Report gfx902 as GFX 9.0.2
This change is needed to match other higher level components.

Change-Id: I45114d23f2ed428dfbbb836061b3020c5ab166ec
Signed-off-by: Yong Zhao <yong.zhao@amd.com>
2017-12-07 16:08:10 -05:00
Oak Zeng c2dc301792 Revert "Revert "More cleanup of fmm.c""
This reverts commit 52f6a61970.

Change-Id: I31afe4889794df8cf1e96f5f18771bed75a213d9
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
2017-12-04 15:48:11 -05:00
Oak Zeng 786e470241 Revert "Revert "Cleanup fmm.c""
This reverts commit f7689d4fef,
plus a bug fix to the patch "Cleanup fmm.c":
Call id_in_array with the correct parameter. The third parameter
of id_in_array is the size of the array in bytes, not the number
of array items. Call it correctly.

Change-Id: I72d8e2fcc0df32af76c72967386e92c1be18c159
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
2017-12-04 15:48:11 -05:00
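The bytes-versus-items mix-up described in this commit can be sketched as follows. This is a minimal illustration, not the library's code: the signature of `id_in_array` is assumed from the commit message alone, and `id_in_list` is a hypothetical wrapper added here for callers that track an element count.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the helper named in the commit message.
 * Assumption: the third parameter is the array size in BYTES. */
static bool id_in_array(uint32_t id, const uint32_t *ids, size_t size_in_bytes)
{
    size_t count = size_in_bytes / sizeof(ids[0]);

    for (size_t i = 0; i < count; i++) {
        if (ids[i] == id)
            return true;
    }
    return false;
}

/* Hypothetical wrapper: converts an element count to bytes in one place. */
static bool id_in_list(uint32_t id, const uint32_t *ids, size_t count)
{
    assert(ids != NULL || count == 0);
    return id_in_array(id, ids, count * sizeof(ids[0]));
}
```

Passing an element count where bytes are expected makes the search stop early: with four 32-bit IDs, a size of 4 covers only the first element, so a lookup for the last ID fails.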
Felix Kuehling 587d4f4bdf Rename fmm_allocate_memory_in_device
to fmm_allocate_memory_object. This function name was confusingly
similar to fmm_allocate_device and __fmm_allocate_device. The new name
reflects its function better: allocate the VM object and the kernel
mode buffer object.

Change-Id: I6604d228004b4d41e871d4de784786823608b5d6
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
2017-12-04 10:23:55 -05:00
Oak Zeng f7689d4fef Revert "Cleanup fmm.c"
This reverts commit b4c89c1ea7.
This change caused a regression ()
Revert temporarily

Change-Id: Ic3829264151e37d1f8c6927c6f464006234ba17f
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
2017-11-29 09:43:11 -05:00
Oak Zeng 52f6a61970 Revert "More cleanup of fmm.c"
This reverts commit 019f7cbd20.
This change caused a regression ()
Revert temporarily

Change-Id: I5af59d319afeb7f0b03e5a09e8397e3853b8b37b
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
2017-11-29 09:42:19 -05:00
Oak Zeng cce57cec26 Cosmetic changes in events.c
Change-Id: Idecb8eede8811020b3af51cbc71da74849029c82
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
2017-11-28 15:20:51 -05:00
Oak Zeng 019f7cbd20 More cleanup of fmm.c
1. Renamed _fmm_map_to_gpu to _fmm_map_to_apu_local
   to reflect the real semantics of this function
2. Renamed _fmm_map_to_gpu_gtt to _fmm_map_to_gpu
   because this function is used to map both gtt
   and local memory
3. Call _fmm_map_to_gpu in _fmm_map_to_apu_local
   to get rid of duplicated code

Change-Id: Id8e3ebfffe0a3c27ebdcac8a8f4dc3738d67d10a
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
2017-11-27 18:47:35 -05:00
Oak Zeng b4c89c1ea7 Cleanup fmm.c
1. Initialize pointers to NULL in vm_create_and_init_object
2. Added helper function to add/remove device ids to/from mapped array
3. Only map nodes that are not currently mapped
4. Remove unnecessary condition check on object frees

Change-Id: I7aed6d40c7464be0d168d5796229af55451e0f34
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
2017-11-27 18:47:23 -05:00
Amber Lin 6f7b55f2d8 Add debug message in PMC trace
Print data in the PMC trace when the debug level is set to 7 (pr_debug).

Change-Id: I9abbb8f6c3f7962fb637528578c1a58b7784042d
Signed-off-by: Amber Lin <Amber.Lin@amd.com>
2017-11-22 10:09:49 -05:00
Oak Zeng 061db45fe2 Fix unconditional unmap in fmm_map_to_gpu_nodes
fmm_map_to_gpu_nodes calls _fmm_unmap_from_gpu to unmap the buffer
from nodes that it is currently mapped to but that are not in the
new list of nodes to map. Previously, the unmap was called
unconditionally, even when the size of the array to unmap was 0.
This fixes the issue by calling the unmap function only when the
unmap array size is not 0.

Also release the fmm_mutex on error returns.

Change-Id: Iadd8383caf7ebb92f02618798c5efd138a352aaa
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
2017-11-21 15:16:39 -05:00
Oak Zeng f06e887725 Properly control lifecycle of ptr info objects
Buffer mapping to devices and buffer registration to
devices can change between two pointer info queries.
Thus update the buffer mapping info and registration info
only when the mapping or registration has changed. This is
done by freeing mapped_node_id_array on mapping to a new
device, freeing registered_node_id_array on registration,
and re-allocating them on the next ptr info query.

Also use fmm_mutex to avoid race conditions when
hsaKmtQueryPointerInfo is called concurrently with
buffer mapping or registration.

Change-Id: Ibc2e20be1fc0147066f873dfa44b21f5015104b7
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
2017-11-21 15:16:29 -05:00
Sean Keely e81d04e11b Use a default module search path if not already specified.
Change-Id: I782f0b758dc908c25abeb7f3536418cb5a48ac5e
2017-11-20 14:12:27 -05:00
Amber Lin 61d1c6ffac Correct command in make package
cmake command in making packages was not updated.

Signed-off-by: Amber Lin <Amber.Lin@amd.com>

Change-Id: Iafd3d9f4941d782bd77cfd0efafe48a02221b002
2017-11-20 10:28:25 -05:00
Oak Zeng d4e6ec0ff0 Added "-g" to CFLAGS for debug build
Previously, -O2 was used even for debug builds,
so the debug build contained no debug information.

Change-Id: I6334474e007480eb2db191bdfec5a71677c26a52
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
2017-11-09 07:43:36 -05:00
Oak Zeng 07110fbd38 Correctly handle max_map_count limit after failed memory allocation
Also split out a function for removing the CPU mapping
and reserving the address, as a code refactoring.

Change-Id: I1feb85b0b2ec942487f899ec3192c7c47dd7c7d5
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
2017-11-08 10:05:04 -05:00
Kent Russell c704ff60b3 CMakeLists: Make roct-dev dependent on roct
Change-Id: Ib7d2927087dcd53da7916951de9d6a71ae6bb21b
2017-11-07 06:43:37 -05:00
Amber Lin 310e3d7b8b Use absolute path on cmake parameter
Update build instructions in README.md to use absolute path on cmake
parameter, CMAKE_MODULE_PATH. Relative path causes build error. Tested
on cmake 3.5.1 and cmake 3.5.2.

Change-Id: I1b8e8deb9f4941580580be8087a94655ae155d02
Signed-off-by: Amber Lin <Amber.Lin@amd.com>
2017-11-02 17:34:46 -04:00
Oak Zeng 68a2d286ca Use drm render device to map kfd BOs
Previously the kfd device was used to map memory for CPU access.
However, this is not compatible with how TTM handles CPU mappings
on eviction: memory won't be unmapped and remapped on restore.
This fixes the issue by mmapping memory using the DRM render device.

This patch requires a coordinated kernel driver change to work.
To stay compatible with the old kernel driver, some temporary code
is included. Once the coordinated kernel driver change is checked in,
the temporary code can be removed.



Change-Id: Ie7b304c4a82b7e8d5ab703acb81d66430af4f0bc
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
2017-11-02 09:06:26 -04:00
Shaoyun Liu 55fc06dac3 Add asic id for gfx906 on emulator
At the thunk level, gfx906 works the same as the gfx900 chip.

Change-Id: I727bd904284616f3b1b9b911e41ad0f19318b3ee
Signed-off-by: Shaoyun Liu <Shaoyun.Liu@amd.com>
2017-10-31 14:58:09 -04:00
Philip Yang 3501b2f40d Fix double free on fork after hsaKmtCloseKFD
A child process's hsaKmtOpenKFD() call must re-initialize the global
variables copied from the parent process. This includes closing all file
handles and freeing dynamically allocated buffers. The double free occurs
because destroy_device_debugging_memory() frees the memory in the parent
process's hsaKmtCloseKFD() but doesn't reset the pointer to NULL, so the
child process frees it again. Likewise, kfd_fd is closed in the parent
process but not reset to 0, so the child process closes it again.

Fix: reset kfd_fd to 0 after close, and reset the is_device_debugged
pointer to 0 after free.



Change-Id: I421b3decbcaa4111298b8e599aa16940d851a58c
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
2017-10-26 15:36:15 -04:00
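The reset-after-release pattern behind this fix can be sketched roughly as below. All names (`example_fd`, `example_buf`, `example_open`, `example_close`) are hypothetical and do not come from the library. The sketch also uses -1 as the "closed" sentinel for the file descriptor, whereas the commit message says kfd_fd is reset to 0.

```c
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical module state; not the library's real globals. */
static int example_fd = -1;
static void *example_buf = NULL;

/* Release resources and reset the globals so a second call is a no-op. */
static void example_close(void)
{
    if (example_fd >= 0) {
        close(example_fd);
        example_fd = -1;
    }
    free(example_buf);   /* free(NULL) is a no-op */
    example_buf = NULL;
}

/* (Re)initialize; safe to call again, e.g. in a child after fork(). */
static int example_open(void)
{
    example_close();     /* drop any stale state first */
    example_fd = open("/dev/null", O_RDONLY);
    example_buf = malloc(64);
    if (example_fd < 0 || example_buf == NULL) {
        example_close();
        return -1;
    }
    return 0;
}
```

Because every release path resets the global it just released, calling `example_close()` twice in a row neither double-frees the buffer nor closes an unrelated descriptor that happens to reuse the same number.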
Sean Keely aece2f8fc2 Remove make build file.
Change-Id: I86abae4c44b6c606fb850eff6d44cdbf30cf59f5
2017-10-26 01:12:31 -04:00
Jan Vesely d8a8f88737 cmake: make sure there are no undefined symbols
Change-Id: Id5a268d7e512f71c1a65af598543eb60ae6c3596
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
2017-10-25 17:46:58 -04:00
Jan Vesely 383f275aa7 cmake: Use pkg-config to find libpci
Change-Id: I1ab4397d88a2bd48ce0d6f2d3c33efcf47bc442f
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
2017-10-25 17:46:58 -04:00
Yong Zhao 3b852b4437 Add SVM aperture on gfx902
Because of HW design change, GPUVM aperture is no longer needed on GFX9
APUs. However, on APUs some functionalities still depend on GPUVM
aperture, so we choose to use SVM aperture instead to assume
the functionality of previous GPUVM aperture.

Change-Id: Ife7f0d598dd7989f2bcf7cdf3466d5a68703ca60
Signed-off-by: Yong Zhao <yong.zhao@amd.com>
2017-10-16 15:06:09 -04:00
Felix Kuehling cb4814eadc Make system memory allocations NUMA aware
Use mbind to specify the NUMA node for system memory allocation. This
only works with HSA_USERPTR_FOR_PAGED_MEM=1.

Change-Id: I88e7815d5a5aefcc4c22358c1a4a1635d7677ef3
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
2017-10-13 16:25:56 -04:00
Tom Stellard e2ed9cf79a Don't mark heap memory as executable v3
Marking heap memory as executable using mprotect() is not allowed
by SELinux.  mprotect() calls that try to do this will fail on systems
with SELinux enabled.  This is also a security risk, so it should be
fixed even on systems that allow this.

Any memory we want to mark as executable must be allocated using mmap().
See https://www.akkadia.org/drepper/selinux-mem.html

The two places where we try to mark heap memory as executable both use
posix_memalign() to allocate the heap memory.  In both cases, the
alignment value passed into this function is always equal to PAGE_SIZE,
which means that they are safe to replace with mmap(), which guarantees
alignment to PAGE_SIZE.  In this case PAGE_SIZE has been set to
sysconf(_SC_PAGESIZE);

v2:
  - Use MAP_PRIVATE instead of MAP_SHARED.  This matches the behavior
    of memory allocated by posix_memalign()
  - Ignore alignment hints instead of returning error when we can't
    accommodate them.
  - Drop alignment parameter of allocate_exec_aligned_memory() since
    the only alignment supported is sysconf(_SC_PAGESIZE).
  - Remove extra parameter from fmm_release().
  - Add error path to fmm_allocate_host_cpu() for when mmap fails.

v3:
  - Avoid use after free.

Change-Id: I7d51279790d9700bc3fa761c44bfde1c1936019b
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
2017-10-13 14:05:58 -04:00
Tom Stellard 4815d9b887 Build fixes for gcc 7.2 v2
src/perfctr.c: In function ‘destroy_shared_region’:
src/perfctr.c:154:10: error: logical ‘and’ of equal expressions [-Werror=logical-op]
  if (sem && sem != SEM_FAILED) {
          ^~
src/perfctr.c: In function ‘update_block_slots’:
src/perfctr.c:323:11: error: logical ‘or’ of equal expressions [-Werror=logical-op]
  if (!sem || sem == SEM_FAILED)
           ^~
v2:
  - Initialize and reset sem to SEM_FAILED.

Change-Id: Id70361079b715c4946b13e4460e4fd85d9542c46
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
2017-10-13 11:37:38 -04:00
Amber Lin 5815d9de9b Fix endless loop
Fix a while loop that can run forever when the cpuid instruction
doesn't work properly.

Change-Id: Iefa49d23b40c994eb4369621974a7d3c4067e47a
Signed-off-by: Amber Lin <Amber.Lin@amd.com>
2017-10-12 14:22:42 -04:00
Yong Zhao 8126ddc77e Update kfd_ioctl.h
Kernel file has been changed recently, so we update the file in thunk.

Change-Id: I359a389fa9d91641114c7fb75f420ee6b16f467a
Signed-off-by: Yong Zhao <yong.zhao@amd.com>
2017-10-06 14:49:11 -04:00
Yong Zhao df4d8a0010 Revise gfx902 GFX version to 9.0.3
Change-Id: I6c16726ac9d096dc4ab127fb266eed105a4f9c87
Signed-off-by: Yong Zhao <yong.zhao@amd.com>
2017-10-06 14:48:38 -04:00
oak zeng d9e71260c4 Print debug message for GPU vm fault analysis
Change-Id: Ia6dac9d3f5c35a7d0e41de9b54c06596d00c7946
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
2017-10-04 15:17:55 -04:00
Amber Lin 117fa5034b Fix PmcGetCounterProperties
The blocks inside the HsaCounterProperties structure are not a fixed size.
The size varies with the number of counters in the block: the size of
Counters in HsaCounterBlockProperties is different in every block. The
current implementation assumes a fixed size, so the next block overwrites
the previous block's Counters. This patch changes the array implementation
to use a pointer so that it moves the next block to the correct position.

Change-Id: I72800f4db5f2a68215fba477a61ca07ec99054bf
Signed-off-by: Amber Lin <Amber.Lin@amd.com>
2017-09-01 17:58:15 -04:00
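The idea of stepping through variable-size blocks with a pointer, rather than indexing a fixed-size array, can be sketched as below. The struct is a deliberately simplified, hypothetical layout and is not the real HsaCounterBlockProperties; only the pointer arithmetic is the point.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical simplified layout; NOT the real HsaCounterBlockProperties. */
struct example_block {
    uint32_t num_counters;
    uint32_t counters[];   /* flexible array: length varies per block */
};

/* Size in bytes of one block holding num_counters entries. */
static size_t example_block_size(uint32_t num_counters)
{
    return sizeof(struct example_block) +
           (size_t)num_counters * sizeof(uint32_t);
}

/* Advance past the current block using its own counter count, so the
 * next block starts after the last counter instead of on top of it. */
static struct example_block *example_next_block(struct example_block *b)
{
    return (struct example_block *)
           ((char *)b + example_block_size(b->num_counters));
}
```

Indexing such blocks as `blocks[i]` would assume every block has the same size, which is the overwrite the commit message describes. Computing each block's end from its own counter count avoids that.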
Amber Lin a74f6896ea Revert "Set guard page as disabled as default"
This reverts commit 65d680c035.

Change-Id: I09b7e7915ec4759cab57d5863089a2c4a44dfacd
Signed-off-by: Amber Lin <Amber.Lin@amd.com>
2017-08-30 10:48:23 -04:00
Amber Lin a81b29890c Fix mmap when the count reaches the max
Applications may try to allocate lots of host memory and reach the mmap
limit (/proc/sys/vm/max_map_count). When an application fails to allocate
memory and calls hsaKmtFreeMemory to release the memory, Thunk fails to
reduce the map count, so subsequent hsaKmtAllocMemory calls continue
to fail, which doesn't make sense to the application. This patch checks
the return value of the mmap to NORESERVE. If it fails and the error number
is ENOMEM, reduce the map count by munmap and map it again immediately.



Change-Id: I127cb479dfd86b199172eef269d59426f23859ea
Signed-off-by: Amber Lin <Amber.Lin@amd.com>
2017-08-29 11:47:29 -04:00