Commit Graph

2959 Commits

Author SHA1 Message Date
Chris Freehill 473be763ff Re-enable IPC test
Fix for  fixes this.

Change-Id: I63a8d1a16d5029f240f075bb97ab6a1156b5cab2
2017-11-08 10:02:51 -05:00
Kent Russell b29d3f63e2 Revert "aqlprofile API: _start() sets buffers sizes with NULL ptr; block counters reg number / block name info"
This reverts commit 3daa85fad8.

Change-Id: Ie90b091df772bf9391494c773d63858aafbc1176
2017-11-08 06:59:33 -04:00
Evgeny 3daa85fad8 aqlprofile API: _start() sets buffers sizes with NULL ptr; block counters reg number / block name info
Change-Id: I3cb93453b683c55bf5ec26271648232306a5d140
2017-11-07 15:05:47 -05:00
Kent Russell c704ff60b3 CMakeLists: Make roct-dev dependent on roct
Change-Id: Ib7d2927087dcd53da7916951de9d6a71ae6bb21b
2017-11-07 06:43:37 -05:00
Amber Lin 310e3d7b8b Use absolute path on cmake parameter
Update build instructions in README.md to use absolute path on cmake
parameter, CMAKE_MODULE_PATH. Relative path causes build error. Tested
on cmake 3.5.1 and cmake 3.5.2.

Change-Id: I1b8e8deb9f4941580580be8087a94655ae155d02
Signed-off-by: Amber Lin <Amber.Lin@amd.com>
2017-11-02 17:34:46 -04:00
Oak Zeng 68a2d286ca Use drm render device to map kfd BOs
Previously, the kfd device was used to map memory for CPU access.
However, this is not compatible with how TTM handles CPU mapping
on eviction - memory won't be unmapped and remapped on restore.
This fixes the issue by mmapping memory using the DRM render device.

This patch requires a coordinated kernel driver change to work.
To keep it compatible with the old kernel driver, some temporary code
is included. Once the coordinated kernel driver is checked in,
the temporary code can be removed.



Change-Id: Ie7b304c4a82b7e8d5ab703acb81d66430af4f0bc
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
2017-11-02 09:06:26 -04:00
Shaoyun Liu 55fc06dac3 Add asic id for gfx906 on emulator
At the thunk level, gfx906 works the same as the gfx900 chip

Change-Id: I727bd904284616f3b1b9b911e41ad0f19318b3ee
Signed-off-by: Shaoyun Liu <Shaoyun.Liu@amd.com>
2017-10-31 14:58:09 -04:00
Sean Keely d93f92f42d Make HostQueue::queue_count_ a portable atomic type.
Also make lint happy.

Change-Id: I0f965df6a76fd959df9eb411d1f1b11847159790
2017-10-31 02:38:25 -05:00
Qingchuan Shi ce6aee01ed Add APIs to support debugging vm fault
1. Add hsa ext api hsa_amd_register_vmfault_handler for debugger to register callback in case of VM fault.
2. Extend hsa_ven_amd_loader API to:
   (1) iterate loaded code objects in executable:
       hsa_ven_amd_loader_executable_iterate_loaded_code_objects
   (2) get loaded code object info:
       hsa_ven_amd_loader_loaded_code_object_get_info
3. Make the id of hsa_queue the same as the one used in communication with thunk (for amd_aql_queue)

Change-Id: I68910809e59e24297350d262606f00e96c14bcbd
2017-10-28 21:48:26 -04:00
Philip Yang 3501b2f40d Fix double free on fork after hsaKmtCloseKFD
A child process's hsaKmtOpenKFD() call must re-initialize the global variables
copied from the parent process. This includes closing all file handles and
freeing dynamically allocated buffers. The double free occurs because
destroy_device_debugging_memory() frees the memory during the parent process's
hsaKmtCloseKFD() but does not reset the pointer to null, so the child process
frees it again. Likewise, kfd_fd is closed in the parent process but not reset
to 0, so the child process closes it again.

Fix: reset kfd_fd to 0 after close, and reset the is_device_debugged pointer
to 0 after free.



Change-Id: I421b3decbcaa4111298b8e599aa16940d851a58c
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
2017-10-26 15:36:15 -04:00
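
The reset-after-release pattern this commit describes can be sketched in C as follows. This is only an illustration, not the thunk's code: `example_fd`, `example_buf`, `example_open()`, and `example_close()` are hypothetical names, and the sketch uses -1 as the "closed" sentinel for the file descriptor, whereas the commit message says the actual fix resets kfd_fd to 0.

```c
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical stand-ins for library globals that a forked child inherits. */
static int example_fd = -1;
static void *example_buf = NULL;

/* Acquire both resources; returns 0 on success, -1 on failure. */
static int example_open(void)
{
    example_fd = open("/dev/null", O_RDONLY);
    if (example_fd < 0) {
        example_fd = -1;
        return -1;
    }
    example_buf = malloc(64);
    if (example_buf == NULL) {
        close(example_fd);
        example_fd = -1;
        return -1;
    }
    return 0;
}

/* Release both resources and reset the globals, so that a second call
 * (for example from a child process after fork) is a harmless no-op. */
static void example_close(void)
{
    if (example_fd >= 0) {
        close(example_fd);
        example_fd = -1;
    }
    free(example_buf);   /* free(NULL) is defined to do nothing */
    example_buf = NULL;
}
```

Calling `example_close()` twice in a row is safe here because both globals are returned to their "not held" values after the first call.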
Sean Keely aece2f8fc2 Remove make build file.
Change-Id: I86abae4c44b6c606fb850eff6d44cdbf30cf59f5
2017-10-26 01:12:31 -04:00
Jan Vesely d8a8f88737 cmake: make sure there are no undefined symbols
Change-Id: Id5a268d7e512f71c1a65af598543eb60ae6c3596
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
2017-10-25 17:46:58 -04:00
Jan Vesely 383f275aa7 cmake: Use pkg-config to find libpci
Change-Id: I1ab4397d88a2bd48ce0d6f2d3c33efcf47bc442f
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
2017-10-25 17:46:58 -04:00
Sean Keely 6ee2ccb08b Fix error message description.
Change-Id: I32efed68e970a4882aca9decbbcda3fcd5c5cb43
2017-10-24 21:52:21 -05:00
Sean Keely 5a4ab91be1 Set doorbell kind code for gfx9+ device enqueue.
Change-Id: I93c4cea677ae51f97ac768614333743fb26b2f54
2017-10-21 11:08:44 -04:00
Sean Keely a8d818a6bc Improve build system handling of non-default directory layouts.
Adds the thunk include and lib paths to the cache, removes paths
to indicator files from the cache, uses the cached path directory
(if any) as a search hint for indicator files.

Change-Id: I0859faa8d229a97abfaacb408d2c831e317aed5f
2017-10-21 11:08:15 -04:00
Sean Keely 3cef9b1a04 Improve unhandled exception error reporting in debug builds.
Change-Id: Ia92d1a93163105d817a2147d96f2edd399e2b70d
2017-10-21 11:08:01 -04:00
Sean Keely 737966eb25 Fix memory leak in exception path.
Change-Id: Iad5f035cd1909be4a8f1a1f5dd7ca5abec0694b4
2017-10-21 11:08:00 -04:00
Chris Freehill 5a3230af66 Undo temporary namespace change
Change-Id: I7f4c06f7037713db855b51367256cf4c7ba41860
2017-10-20 20:02:13 -05:00
Chris Freehill 50a3d9402a Temporary change to namespaces to adjust for smi change
Change-Id: Ic91bfb678912a82214f0a462a4b57e531f12977a
2017-10-20 13:12:06 -04:00
Ramesh Errabolu dccbc9f2af Changes to support surfacing of link weight as part of link info
Change-Id: I1c0705a9374af1245f0419c51beded0d7ee10639
2017-10-20 12:09:31 -04:00
Yong Zhao 3b852b4437 Add SVM aperture on gfx902
Because of a HW design change, the GPUVM aperture is no longer needed on GFX9
APUs. However, some functionality on APUs still depends on the GPUVM
aperture, so we use the SVM aperture instead to take over
the functionality of the previous GPUVM aperture.

Change-Id: Ife7f0d598dd7989f2bcf7cdf3466d5a68703ca60
Signed-off-by: Yong Zhao <yong.zhao@amd.com>
2017-10-16 15:06:09 -04:00
Felix Kuehling cb4814eadc Make system memory allocations NUMA aware
Use mbind to specify the NUMA node for system memory allocation. This
only works with HSA_USERPTR_FOR_PAGED_MEM=1.

Change-Id: I88e7815d5a5aefcc4c22358c1a4a1635d7677ef3
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
2017-10-13 16:25:56 -04:00
Chris Freehill a7cbe78366 Use Major/Minor/Step device numbers to differentiate gfx devices
Change-Id: I0901871971a5b33018917ada6c0e69ac7aa91944
2017-10-13 16:18:24 -04:00
Tom Stellard e2ed9cf79a Don't mark heap memory as executable v3
Marking heap memory as executable using mprotect() is not allowed
by SELinux.  mprotect() calls that try to do this will fail on systems
with SELinux enabled.  This is also a security risk, so it should be
fixed even on systems that allow this.

Any memory we want to mark as executable must be allocated using mmap().
See https://www.akkadia.org/drepper/selinux-mem.html

The two places where we try to mark heap memory as executable both use
posix_memalign() to allocate the heap memory.  In both cases, the
alignment value passed into this function is always equal to PAGE_SIZE,
which means that they are safe to replace with mmap(), which guarantees
alignment to PAGE_SIZE.  In this case PAGE_SIZE has been set to
sysconf(_SC_PAGESIZE);

v2:
  - Use MAP_PRIVATE instead of MAP_SHARED.  This matches the behavior
    of memory allocated by posix_memalign()
  - Ignore alignment hints instead of returning error when we can't
    accommodate them.
  - Drop alignment parameter of allocate_exec_aligned_memory() since
    the only alignment supported is sysconf(_SC_PAGESIZE).
  - Remove extra parameter from fmm_release().
  - Add error path to fmm_allocate_host_cpu() for when mmap fails.

v3:
  - Avoid use after free.

Change-Id: I7d51279790d9700bc3fa761c44bfde1c1936019b
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
2017-10-13 14:05:58 -04:00
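
A minimal sketch of the approach described in this commit, assuming a Linux/POSIX system: page-aligned memory that must be executable is obtained directly from mmap() rather than from posix_memalign() followed by mprotect(). `alloc_exec_pages()` and `free_exec_pages()` are hypothetical names, not functions from the thunk, and a hardened system may still refuse a writable and executable mapping, in which case the helper returns NULL.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE  /* exposes MAP_ANONYMOUS on glibc in strict modes */
#endif
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: map anonymous, private, executable memory.
 * mmap() returns page-aligned addresses, so no alignment parameter
 * is needed. Returns NULL on failure. */
static void *alloc_exec_pages(size_t size)
{
    void *p = mmap(NULL, size, PROT_READ | PROT_WRITE | PROT_EXEC,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/* Hypothetical helper: release memory obtained from alloc_exec_pages().
 * The caller must pass the same size that was used for the allocation. */
static void free_exec_pages(void *p, size_t size)
{
    if (p != NULL)
        munmap(p, size);
}
```

MAP_PRIVATE is used here to mirror the v2 note above about matching the behavior of memory allocated by posix_memalign().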
Tom Stellard 4815d9b887 Build fixes for gcc 7.2 v2
src/perfctr.c: In function ‘destroy_shared_region’:
src/perfctr.c:154:10: error: logical ‘and’ of equal expressions [-Werror=logical-op]
  if (sem && sem != SEM_FAILED) {
          ^~
src/perfctr.c: In function ‘update_block_slots’:
src/perfctr.c:323:11: error: logical ‘or’ of equal expressions [-Werror=logical-op]
  if (!sem || sem == SEM_FAILED)
           ^~
v2:
  - Initialize and reset sem to SEM_FAILED.

Change-Id: Id70361079b715c4946b13e4460e4fd85d9542c46
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
2017-10-13 11:37:38 -04:00
Sean Keely 4f299a9909 Capture more memory allocation types with the 2MB allocator.
TensorFlow was running out of VRAM due to padding up allocations
from legacy memory APIs.  These allocations have been added to
the fragment allocator to improve VRAM utilization.

Change-Id: Ic680fff576a0434b3b17a4c91746da44e09957fa
2017-10-12 23:22:10 -04:00
Amber Lin 5815d9de9b Fix endless loop
Fix a while loop that can run forever when the cpuid instruction
doesn't work properly.

Change-Id: Iefa49d23b40c994eb4369621974a7d3c4067e47a
Signed-off-by: Amber Lin <Amber.Lin@amd.com>
2017-10-12 14:22:42 -04:00
Ramesh Errabolu 703b1466c1 Update Copy requests involving all pools i.e. options -a or -A
Change-Id: I0c8d8fbb39f43cd6a1f84ae6ae32337fa9b1f5e2
2017-10-10 13:01:46 -04:00
Evgeny fd99e909ff aqlprofile API: enabling privilege memory related counters
Change-Id: I28a24ad1a3ce78c5d8a6319635ae1ffd392ab690
2017-10-09 17:34:54 -05:00
Yong Zhao 8126ddc77e Update kfd_ioctl.h
Kernel file has been changed recently, so we update the file in thunk.

Change-Id: I359a389fa9d91641114c7fb75f420ee6b16f467a
Signed-off-by: Yong Zhao <yong.zhao@amd.com>
2017-10-06 14:49:11 -04:00
Yong Zhao df4d8a0010 Revise gfx902 GFX version to 9.0.3
Change-Id: I6c16726ac9d096dc4ab127fb266eed105a4f9c87
Signed-off-by: Yong Zhao <yong.zhao@amd.com>
2017-10-06 14:48:38 -04:00
Chris Freehill 75eb3316aa Make rocrtst use new rocm_smi library
Change-Id: Id688d6d6d5ff106a23f5b55eaca4e723c39433a3
2017-10-06 09:41:10 -05:00
Ramesh Errabolu c2caa5ae2c Benchmark copy of data from one pool to another pool either in
one or both directions. Users can enumerate the pools reported
by the system to specify which pools serve as source / destination

Change-Id: I8e6d0adb3743b3328dd3ce9152762ca840ea613b
2017-10-04 20:53:25 -04:00
Ramesh Errabolu 34602f7e95 Adding kernels to read / write buffers
Change-Id: Icad95c084e0fcd0bd9f86154e23ac8f54c24afbe
2017-10-04 20:33:48 -04:00
oak zeng d9e71260c4 Print debug message for GPU vm fault analysis
Change-Id: Ia6dac9d3f5c35a7d0e41de9b54c06596d00c7946
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
2017-10-04 15:17:55 -04:00
Sean Keely 9ba83d83f7 Release cached memory blocks when memory allocation fails and retry.
Change-Id: I6d96e136e147d8ffe9ff7baec26b4b5a678b739f
2017-10-03 20:12:05 -04:00
Ramesh Errabolu 2cab8307b9 Cmake project to build Rocm Code Objects
Change-Id: If3a631615316c203318bb5ae1df328a66e2919b1
2017-10-03 17:53:56 -04:00
Chris Freehill 0adfe5a18e Temporarily disable Max Memory alloc. test
Change-Id: I13cfc77bd5b823354e60a3023356255c72c1fd6c
2017-10-03 00:19:08 -04:00
Chris Freehill c8b92c5087 Add IPC test to rocrtst
Change-Id: I6a40375790a184df11afc88b863cafc3d244e92a
2017-10-01 11:01:31 -04:00
Chris Freehill 3fa0b7e5b4 Fix build error in release version
Change-Id: I5b8378e4e771369ff2b2cc64ddfb44dde38d8d44
2017-09-28 23:47:51 -04:00
Sean Keely e9a6f2c3e6 Support hsa_amd_agents_allow_access on page fragments.
Since access may only be manipulated on whole pages, suballocator fragments must cooperate to set the page's access.
Since the KFD does not migrate memory on access changes this implementation makes agent access sticky across the requests in a fragmented page.

Change-Id: I88479ed45fb40e9782b704526a7b8ffb22e7bd76
2017-09-27 19:04:04 -05:00
Evgeny 0e88414f5c removing graphics specific block RMI, enabling memory related blocks
Change-Id: I477adc49b9ee3c8593c193bdc69c0deb4a9726e1
2017-09-25 10:49:22 -05:00
Sean Keely 476c8e36bf Fix assert in simple_heap.
Also add comments to clarify pointer info constraints.

Change-Id: I8d07831a0e953d667c84c96fe53ed07c18ba115c
2017-09-21 00:47:18 -04:00
Evgeny fcaecfee80 adding hsa_ven_amd_aqlprofile.h to the packaging
Change-Id: I3b69396e3cea129106d47be53218213e29de9843
2017-09-20 14:40:49 -04:00
Sean Keely 30fce248c6 Enable use of CLOCK_MONOTONIC_RAW for post 4.4 kernels.
Change-Id: I3c1f27c7e639df5128c36d81f715fa16e6c1cf13
2017-09-20 14:28:23 -04:00
Chris Freehill 7d46a02df4 Use relative dir. instead of abs. (2nd instance)
Change-Id: I778a59e94efdd0845249473d92eaedd172429a48
2017-09-19 21:38:38 -04:00
Chris Freehill 7d84190c4e Use rel. dir instead of abs. in CMake; Have a default number of iterations
Change-Id: I097fd229338ed520196cc4ed1ef1d00fe538e50c
2017-09-19 14:13:49 -04:00
Chris Freehill 2d58324ac8 Use relative path for symlink instead of absolute
Change-Id: I165f38df43afd554f022bb3bac54546c7bc5e806
2017-09-19 09:25:43 -04:00
Sean Keely 9dfdce5b3c Improve branch elimination in ScopedAcquire.
GCC can't reasonably be told that the lock ptr isn't null.  Adding a private bool
allows the branch to be eliminated, along with the bool.

Change-Id: I0605d69474d6a6e6951be93c0af1d8caf3f77124
2017-09-19 06:08:36 -04:00