Commit graph

2959 commits

Author SHA1 Message Date
James Edwards 4d7d50feba Add tools headers and library back to packaging.
Change-Id: If6c9befe50fc111eb154bd5b4eb5c7858f5d510b
2018-07-16 16:51:12 -04:00
Sean Keely 35a270ef7e Do not initialize runtime internal queues based on mapping memory to a GPU.
Conserves VMIDs when multiple processes are in use and memory operations
are not GPU specific.  For instance, the HIP API hipHostMalloc does not accept
a target GPU, so when used with one process per GPU (i.e. GPU == MPI rank) we can
quickly exceed the available VMID slots if every process consumes a VMID on
every GPU.

Change-Id: Ib6fa051290089f71581029c09f9a44b9992237d1
2018-07-13 19:58:45 -04:00
Chris Freehill 65c3cf27f5 Use the new name of the rocm_smi library
Change-Id: I7358b7b819826f1d3d3b0ca99fc5fd1a4e6d9536
2018-07-13 11:46:49 -04:00
Chris Freehill 3cca09ccca Fix NUMA async copy test
Change-Id: I64b5bd1ac5bf9b58d86c3dfc170bcf06a39abee4
2018-07-11 19:20:13 -04:00
Sean Keely c6cf161125 Fix git describe command to retrieve version tags correctly.
Change-Id: I904f5ccdb88c1e28d5eeffd104174fcd57626ee7
2018-07-10 20:19:04 -05:00
Sean Keely 63f2a0d280 Fix git describe command to retrieve version tags correctly.
Change-Id: I33282e8130d092e2f56b2f5947946d3c0ee22c60
2018-07-10 19:49:00 -05:00
Chris Freehill 06759fed5f Undo temporary work-around for RSMI change
Change-Id: I9bf144add951c95e4eebc8647bffb71d13f4f612
2018-07-09 08:46:57 -05:00
xinhui pan ab9017715f use rbtree instead of vm_objects list
Simple test of mapping a large amount of system memory to the GPU.
before
[ RUN      ] KFDMemoryTest.MMap
[          ] Using ISA for GFXIP 9.0
[          ] successfully register/map 32GB system memory to gpu
[       OK ] KFDMemoryTest.MMap (36932 ms)

after
[ RUN      ] KFDMemoryTest.MMap
[          ] Using ISA for GFXIP 9.0
[          ] successfully register/map 32GB system memory to gpu
[       OK ] KFDMemoryTest.MMap (11441 ms)

So the run time drops from 36.9 s to 11.4 s.

Looks like we can do something similar with vm_area too.

Change-Id: I0349aacdeddec3534016d28176f0fabf632c61fc
Signed-off-by: xinhui pan <xinhui.pan@amd.com>
2018-07-08 22:38:22 -04:00
Chris Freehill ae0c852074 Temporary work-around for RSMI change
Change-Id: If4913d5d0cdb0415569c75ab312c39b4253cd4fa
2018-07-07 22:57:42 -05:00
Wilkin 170e2a142f OpenCL BLIT for Image library
- include support for gfx702

Change-Id: If681a4eef9bd076e25300e1c1bca55b4f7c92b46
2018-07-06 10:35:44 -04:00
Felix Kuehling d3228f363e Fix wrong loop termination condition
Compare with gpu_mem_count instead of deprecated NUM_OF_SUPPORTED_GPUS
to prevent overflows in case no dGPUs are present.

Change-Id: I71fcb7503ba4c20bffadbdb04cefc4e4027a7df7
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
2018-07-05 17:04:40 -04:00
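The loop-bound fix described in this commit can be sketched roughly as follows. Only the names gpu_mem_count and NUM_OF_SUPPORTED_GPUS come from the commit message; the struct layout and the find_gpu_mem helper are assumptions made for illustration and are not taken from the actual source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-GPU record; the real structure is not shown in the log. */
struct gpu_mem_entry {
    uint32_t gpu_id;
};

/* Look up a GPU by id. The loop is bounded by gpu_mem_count, the number of
 * entries actually allocated, rather than by a compile-time maximum such as
 * NUM_OF_SUPPORTED_GPUS, so it never reads past the end of the array. */
static int find_gpu_mem(const struct gpu_mem_entry *gpu_mem,
                        unsigned int gpu_mem_count, uint32_t gpu_id)
{
    assert(gpu_mem != NULL || gpu_mem_count == 0);

    for (unsigned int i = 0; i < gpu_mem_count; i++) {
        if (gpu_mem[i].gpu_id == gpu_id)
            return (int)i;
    }
    return -1; /* not found; also the result when no GPUs are present */
}
```

With gpu_mem_count equal to zero (for example, no dGPUs present) the loop body never runs, which is the case the commit message calls out.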
Yong Zhao 4839882fc8 Set the write permission according to the flag when allocating host cpu mem
Change-Id: I758c2b5b1799e968fa852646e1494fabb68c782d
Signed-off-by: Yong Zhao <yong.zhao@amd.com>
2018-07-03 20:39:01 -04:00
James Edwards 58a411dd36 Change packaging for rocr-dev and rocr-ext.
Change-Id: Ia096a2d31ddd7bef2e05bb3d6c58e94d8c339598
2018-07-02 13:40:45 -05:00
Slava Grigorev 89e35574e3 Fix 'strncpy' truncating warnings when compiling with gcc 8
Change-Id: Ib145bab9450281da05f70dea34433b83438a756b
Signed-off-by: Slava Grigorev <slava.grigorev@amd.com>
2018-06-29 17:06:08 -04:00
Yong Zhao 4eaaf9694d Simplify if else logic for hsaKmtAllocMemory()
The new logic is easier to follow.

Change-Id: I69759a45c5dedaefeff831a2367253d3a4486bd3
Signed-off-by: Yong Zhao <yong.zhao@amd.com>
2018-06-29 14:39:52 -04:00
Yong Zhao 5972fac417 Rename two variable names in doorbells structure
There were two doorbells, one embedded in the other, which is very confusing.
Change the member variable name to mapping to differentiate them. Also,
rename doorbells_mutex to just mutex for brevity.

Change-Id: Iaa14a1a3ee09449a9089fc1fb39c916fdf32fb44
Signed-off-by: Yong Zhao <yong.zhao@amd.com>
2018-06-28 16:04:35 -04:00
Yong Zhao 77ec699460 Fix a bug that fmm_init_process_apertures() returns incorrect value
If opening drm render device fails (usually when the user is not a member
of video group), fmm_init_process_apertures() still returns success,
resulting in a confusing segfault at a later stage.

Change-Id: Ifbde4481629988944ad7f384d59753c88e287fa9
Signed-off-by: Yong Zhao <yong.zhao@amd.com>
2018-06-28 16:03:07 -04:00
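The bug class fixed here (an initialization routine that swallows a failed device open) can be illustrated with a minimal sketch. All names below (init_status, open_render_node, init_apertures) are hypothetical and are not the thunk's real functions; the sketch only shows the failure being passed up to the caller instead of being dropped.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical status codes for this sketch. */
enum init_status { INIT_OK = 0, INIT_ERR_OPEN = -1 };

/* Hypothetical helper: open a device node and report failure explicitly. */
static enum init_status open_render_node(const char *path, int *fd_out)
{
    assert(path != NULL && fd_out != NULL);

    int fd = open(path, O_RDWR);
    if (fd < 0)
        return INIT_ERR_OPEN; /* e.g. missing node or insufficient permissions */

    *fd_out = fd;
    return INIT_OK;
}

/* Hypothetical init routine: returns the open error instead of success. */
static enum init_status init_apertures(const char *render_path)
{
    int fd = -1;
    enum init_status st = open_render_node(render_path, &fd);

    if (st != INIT_OK)
        return st; /* propagate, so the caller can stop early */

    /* ... aperture setup would use fd here ... */
    close(fd);
    return INIT_OK;
}
```

Returning the error at this point lets the caller fail with a clear status rather than crashing later on state that was never set up.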
Felix Kuehling fb551a44af Fix compiler warning on Fedora 28
Avoid warnings of the type
    error: 'strncpy' specified bound 64 equals destination size

With the destination being 0-initialized, subtracting 1 from the
destination buffer size will ensure that the destination will be a
0-terminated string, even when it's truncated.

Change-Id: I7c3a90482065ce4d020db215e3e41348de51a083
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
2018-06-25 14:36:49 -04:00
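The pattern described in this commit message, copying at most size minus one bytes into a zero-initialized buffer, can be sketched as below. The named_item struct and set_name helper are hypothetical; only the 64-byte bound comes from the quoted warning.

```c
#include <assert.h>
#include <string.h>

#define NAME_SIZE 64 /* matches the bound in the quoted warning */

/* Hypothetical record with a fixed-size name field. */
struct named_item {
    char name[NAME_SIZE];
};

static void set_name(struct named_item *item, const char *src)
{
    assert(item != NULL && src != NULL);

    /* Zero-initialize the destination first. */
    memset(item, 0, sizeof(*item));

    /* Copy at most NAME_SIZE - 1 bytes. The last byte stays 0 from the
     * memset, so the result is 0-terminated even if src is truncated. */
    strncpy(item->name, src, sizeof(item->name) - 1);
}
```

A source string longer than 63 characters is truncated to 63 characters plus the terminator; shorter strings are copied whole.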
Felix Kuehling 4e766615d7 Fixup previous commit
Add back missing pthread_mutex_lock.

Handle all error cases in fmm_release.

Change-Id: I8efa561ddadfd769cede5bf86300215ba3fb3dd1
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
2018-06-25 14:24:23 -04:00
xinhui pan 8ee5647814 THUNK: fix deregister memory issues
__fmm_release actually fails to find the object if the address is not
page-size aligned.  The caller did not notice this because __fmm_release
returns no error code.

To fix this, move the object lookup into the caller and use the vm-object
instead. fmm_release will also pass up the error code.

Change-Id: Ib8ea1ea5ae844844fd20e8e01f0fdb841d218f2c
Signed-off-by: xinhui pan <xinhui.pan@amd.com>
2018-06-25 14:12:26 -04:00
Felix Kuehling 9434223752 Clean up cmake install and package
* Use GNUInstallDirs
* Install headers in $prefix/include directly, drop symlink
* Install libraries in $prefix/lib directly, drop symlink
* Move LICENSE.md from hsakmt-roct-dev to hsakmt-roct

Change-Id: I43562f15cc03029be53e9ec18c337824d8116659
Signed-off-by: Slava Grigorev <slava.grigorev@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
2018-06-20 11:32:53 -04:00
Jay Cornwall e388a23344 Add hsa_amd_queue_set_priority extension function
Controls dispatch and wavefront scheduling arbitration across queues.

Change-Id: I498f4898b544f79b8fb8514bf7e789ca9da29462
2018-06-19 19:41:28 -05:00
rohit pathania 6df6ef778d Kernel group memory dynamic allocation, basic allocation and free test
Change-Id: I17fdb77f17567ac1b429d9a571cac70ac1e64dd4
2018-06-15 10:49:10 +05:30
Sean Keely 3e3aa37750 Enable SDMA use without platform atomic support.
SDMA will use atomic completion fences if KFD reports 64bit atomic support.
Otherwise it will fall back to store completion fences.

Change-Id: I12b76f8a74ec3ee96372c250f9824d846051536e
2018-06-12 15:38:44 -04:00
Yong Zhao 7a8566dc03 Improve the return value for hsaKmtOpenKFD()
When KFD is already opened, opening it again should return
HSAKMT_STATUS_KERNEL_ALREADY_OPENED to align with the specification.

Change-Id: Ib10a2d2c48781600bea7d072557d03ccb1a2bc19
Signed-off-by: Yong Zhao <yong.zhao@amd.com>
2018-06-11 14:08:57 -04:00
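The return-value behavior described in this commit might look roughly like the sketch below. It does not reproduce hsaKmtOpenKFD(); the enum, the open_count counter, and both functions are hypothetical stand-ins, and the reference counting on repeated opens is an assumption rather than something stated in the commit message.

```c
#include <assert.h>

/* Hypothetical status codes standing in for the real HSAKMT_STATUS values. */
enum sketch_status { SKETCH_SUCCESS, SKETCH_ALREADY_OPENED };

/* Hypothetical count of outstanding opens (assumed, not from the source). */
static unsigned int open_count;

static enum sketch_status sketch_open(void)
{
    if (open_count > 0) {
        /* Already open: count the extra reference and tell the caller. */
        open_count++;
        return SKETCH_ALREADY_OPENED;
    }

    /* ... the real device open would happen here ... */
    open_count = 1;
    return SKETCH_SUCCESS;
}

static void sketch_close(void)
{
    assert(open_count > 0);
    open_count--;
}
```

A caller can then tell a first open apart from a repeated one, while a later open after all references are closed reports plain success again.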
rohit pathania d8e47ba115 Modified memory atomics on non Large bar system and pool info test memcmp issue fix
Change-Id: I951fdb6c91508f43b1c51f7eb92870543fc58e53
2018-06-11 18:49:29 +05:30
Chris Freehill 12a81ae96f More emulator friendly tests/examples
Change-Id: I27ab26add14743dfb065238129c14b48913d9df8
2018-06-08 17:58:37 -04:00
Chris Freehill 8a6f0d6b24 Disable Signal tests
They are breaking Jenkins builds.

Change-Id: I1647049abee0ebc2a4751e66d9ceed56cadc4c3e
2018-06-08 15:47:09 -04:00
rohit pathania c2ddd11979 Build failure issue in rocmaster 8386
Change-Id: I413abe0c9fbe16ab2e722cf3f7567aa2853e585b
2018-06-07 13:05:39 +05:30
srinivas Charupally f0a1b310fd Adding Signal Kernel tests
Change-Id: Ie34de41741a7c4731a0ff3761e940971b6f08745
2018-06-06 16:25:18 +05:30
Felix Kuehling 5f25d024a8 Prepare for hsakmt build system cleanup
These fixes are needed to find the hsakmt headers and libraries with
an upcoming hsakmt build system cleanup. It should continue to work
with the original hsakmt build system.

Change-Id: I6b3fcea8f2588698c130c9ec50952c66712afa6c
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
2018-06-05 16:01:03 -04:00
rohit pathania d0f6da277d Concurrent testing for queue write index
Change-Id: If5b60b943a861d8f97d01b7fd8f757fdb36845c6
2018-06-05 11:57:38 +05:30
Chris Freehill 2c8cbf61c3 Make emulator friendly
Disable some tests that rely on features not typically available
in emulator and use smaller data and iteration sets

Change-Id: I587bf83162b114719e0361109ed44c6bf2adf34c
2018-06-04 18:51:09 -05:00
Chris Freehill 845a539478 Disable Aql_Barrier_* tests
Change-Id: Ibe08b88c101a60e4c6f0c61cda756e2cb5857d7d
2018-06-04 11:46:08 -04:00
Felix Kuehling 0462744965 Add fallback for GPUVM doorbell mapping
Upstream KFD doesn't support mapping doorbells to GPUVM yet. Fall
back to the old method.

Change-Id: I452a6fc59b88329b833844e3914c480c2f13c82d
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
2018-05-31 13:17:27 -04:00
Felix Kuehling 7495e74257 Cosmetic changes to kfd_ioctl.h
Make it more similar to upstream.

Change-Id: I982ccfd4045d96e3c30bc84d38d0e03db8de9b08
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
2018-05-31 13:17:27 -04:00
Felix Kuehling 571e2cf7e4 Update KFD-Thunk ioctl ABI to match upstream
- Clean up and renumber scratch memory ioctl
- Renumber get_tile_config ioctl
- Renumber set_trap_handler ioctl
- Update KFD_IOC_ALLOC_MEM_FLAGS
- Renumber GPUVM memory management ioctls
- Remove unused SEP_PROCESS_DGPU_APERTURE ioctl
- Update memory management ioctls
    Replace device_ids_array_size (in bytes) with n_devices. Fix error
    handling and use n_success to update device_id arrays in objects.

This commit breaks the ABI and requires a corresponding KFD change.

Change-Id: Ibf0af5a5188e817c886eab388d1533130fc18293
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
2018-05-31 13:17:27 -04:00
srinivas Charupally a632bfddb2 Adding Aql tests
Change-Id: Id22dcafbf0ea0b346f3a03d4acef27350b706f36
2018-05-31 15:29:37 +05:30
srinivas Charupally 2c551d38b6 Adding Signal tests
Change-Id: I1815267a0e19614a84013e797bd3df3e77ee8179
2018-05-30 01:40:31 -04:00
Sean Keely f09eb2e8c7 Move SDMA dependencies back to hardware.
SDMA poll packet preemption has been fixed.

Change-Id: I3da878c433d4594a169e3bc8f173d3651448fd2d
2018-05-29 23:32:47 -04:00
Sean Keely c593dfc6bf Enable SDMA conditionally based on link atomic support.
Avoids using non-atomic SDMA fences by default since that path can duplicate fences.
If HSA_ENABLE_SDMA is set this will override copy path selection and may use
non-atomic fences.

Change-Id: I4747e9a766f7f649d21ddf6bfded047ac26fd60e
2018-05-29 23:32:34 -04:00
Shaoyun Liu 93d07cf916 Thunk: Add gfx906 support on thunk
Signed-off-by: Shaoyun Liu <Shaoyun.Liu@amd.com>

Conflicts:
	src/topology.c

Change-Id: I692d9295a954d4eda08eba301312014f7b3969cb
2018-05-29 15:38:26 -04:00
rohit pathania 044fb8dc27 Different Atomic operation tests on GPU and system memory
Change-Id: I04154b588086d49142a64c8fe4826d041ded2991
2018-05-28 22:18:48 +05:30
rohit pathania 08a253684b Queue validation tests and memory alignment tests
Change-Id: I96d8c2898795240288517bdcbc2b48ff2cc04f66
2018-05-28 14:26:05 +05:30
srinivas Charupally 2c1919c681 Adding concurrent shutdown, reference count and max reference count tests
Change-Id: Ib6f40585bf1ab2b1d6f33bbb1675e13545a23a4e
2018-05-28 00:51:05 -04:00
Qingchuan Shi 3a46556dcc Add debug trap rocrtst.
Change-Id: I73682d7a2ad51eed9988075e012478a1afc76c7c
2018-05-22 13:31:45 -04:00
Yong Zhao ec440fb428 Stop allocating eop buffer for SDMA queues
Change-Id: I9a4eaee05588292a797eb424503dd7b793c1408c
Signed-off-by: Yong Zhao <yong.zhao@amd.com>
2018-05-16 15:30:23 -04:00
Yong Zhao 43f119bcbc Improve the code readability
The main point is to move update_ctx_save_restore_size() out of if()
condition.

Change-Id: I58a1a4f3edca2d1c510fdd0e31e59b5c41e92a14
Signed-off-by: Yong Zhao <yong.zhao@amd.com>
2018-05-16 14:55:55 -04:00
rohit pathania 47af1d673e Memory Concurrent tests for pool Memory allocate, Memory free and get pool info
Change-Id: I6a1343348e400fe466e041d651adaa67be561a21
2018-05-14 01:30:54 -04:00
Jay Cornwall 536823482b Handle llvm.trap only in gfx9 trap handler
llvm.debugtrap and other trap IDs are reserved and should not place
the queue into an error state.

Change-Id: I98193a35ac7da94c4a42ee75d87754ee552ebea0
2018-05-04 13:15:23 -05:00