Commit graph

2959 Commits

Author SHA1 Message Date
Harish Kasiviswanathan 7de0199e99 CMA: Initialize SizeCopied return parameter
UCX test cases are reporting uninitialized values when CMA fails. The
application should ideally ignore SizeCopied when the function fails but
it doesn't. This is leading to wrong diagnosis.

v2: Fill in partial SizeCopied in case of failure

Change-Id: I6b7e1c19a8b702ec91ca64201a3dda27bd897877
Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
2018-02-08 12:46:40 -05:00
James Edwards b913795c31 : Fix compilation errors with gcc 7.2.1 for hsa runtime.
Change-Id: I3356388753ca78cc0f1e0c3188220d7f3f60283d
2018-02-07 09:22:39 -06:00
Evgeny 8a8d7ad814 ExecutePM4 queue full check fix
Change-Id: Id56ece6d3f5eab1ef3a2758922022f0996c1efe4
2018-02-05 19:35:39 -06:00
Chris Freehill 3449f7dea6 Don't support platform atomics for gfx9XX
Change-Id: I302c862494e221ae2b6b3e1a843f06586b0b28ba
2018-02-02 18:21:16 -05:00
Yong Zhao 55bb61ff9c Revert "Workaround: make mmap memory resident for gfx902"
This reverts commit 716755b1de.

Change-Id: I9f4f0b6b426aeae4cb652b33cf0d4c0f57270ca5
Signed-off-by: Yong Zhao <yong.zhao@amd.com>
2018-02-02 12:31:06 -05:00
Sean Keely f59b001c75 Guard against use of IPC signals when profiling copy APIs.
Also update IPC signal API text to allow single process profiling with IPC signals.

Change-Id: I90b246623129d57183acb4ba1789beec360547c3
2018-01-31 19:05:32 -05:00
Sean Keely 91f559802d Revert CRAT table workaround.
Change-Id: Ic2bf9e1fb1d00c5a31d52560e0eb37e0ae1ab08a
2018-01-30 18:26:53 -06:00
Tony Tye d472b24d05 Add support for R_AMDGPU_RELATIVE64
- Add support for R_AMDGPU_RELATIVE64 relocation record.
- Return status error if any unsupported relocation record encountered.

Change-Id: Icbb5dcb81109a70c1f2195412a0df58a11be9da1
2018-01-30 18:20:26 -05:00
Chris Freehill 8bf85cc668 Temporarily disable rocm-smi to integrate with new rocm-smi
Change-Id: I06701cd4ac80bb4f3a9ae48d5374b7d4a788f8a4
2018-01-26 06:44:01 -06:00
Laurent Morichetti 056ddbbc82 Silence Valgrind warnings
Change-Id: I8803f3d310fccd69d0d04b2464b00dccc40270e3
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
2018-01-25 16:48:17 -05:00
rohit pathania ee917eca68 Added CPU to GPU and GPU to CPU MemoryAccess Tests, Added enqueue latency Tests
Change-Id: I18643d283101b792fa25705c8149ddc5a9eefe73
2018-01-08 04:11:32 -05:00
Amber Lin 7031a77428 Update README to reflect cmake change
New CMakeLists.txt sets a default module search so -DCMAKE_MODULE_PATH is
no longer required in the command.

Change-Id: I95189ce2f36016b7c4929239d0e512851bec5ef6
Signed-off-by: Amber Lin <Amber.Lin@amd.com>
2018-01-04 15:55:20 -05:00
Chris Freehill 563581223c Report physical memory instead of virtual memory
Change-Id: I18105e3982a96aea40e05cd78521c0c3acf75de4
2017-12-20 22:11:50 -04:00
Sean Keely fe1763848a Merge system heap info.
Workaround pending thunk spec clarification.

Change-Id: I9d96227efde3a551157733cf4050d474d1e658f2
2017-12-19 18:57:29 -06:00
Amber Lin 8bc83e1e9b Update README to include new requirement
Latest Thunk requires the user to belong to video group. Add this
statement to README.md to notify external users on Github.

Change-Id: Id9843abf09de5b63a3b7c3f7b322bc9099c6ff1a
Signed-off-by: Amber Lin <Amber.Lin@amd.com>
2017-12-18 12:10:24 -05:00
rohit pathania 7310312291 Resubmitted added and modified common utilities functions for rocrtest with failed-to-open-file fix
Change-Id: Ie45668df1a15c1be7e8bdb10b967b98fb3024252
2017-12-18 05:06:22 -04:00
Kent Russell c3a880db7d Revert "added and modified common utilities functions for rocrtest"
This reverts commit 7e46704abb.

Change-Id: I825b210ce4fc831f8a978faf1c7d83d54408efa4
2017-12-15 06:04:50 -05:00
Sean Keely b49e5b4917 Remove region/pool size limits for 902.
Temporary measure. Must be reverted once CRAT tables have been fixed.

Change-Id: Id2f2673edbf7b6fc5752f8d871042b4bf4de653c
2017-12-14 16:02:05 -05:00
Yong Zhao 716755b1de Workaround: make mmap memory resident for gfx902
Change-Id: I5f90f316740f7995d54cb083a6d7e05bc4e2966e
Signed-off-by: Yong Zhao <yong.zhao@amd.com>
2017-12-14 15:11:01 -05:00
rohit pathania 7e46704abb added and modified common utilities functions for rocrtest
Change-Id: I80afa33a46b3d95058be306869e7ed54b2b7df64
2017-12-14 12:01:16 -05:00
Sean Keely 1addb5e684 Don't use double mappings on GFX9 APUs.
Change-Id: I1225696211d4eac9ce982243ea0a1a9e8b2a318f
2017-12-08 20:18:02 -05:00
Yong Zhao 0f83774635 Report gfx902 as GFX 9.0.2
This change is needed to match other higher level components.

Change-Id: I45114d23f2ed428dfbbb836061b3020c5ab166ec
Signed-off-by: Yong Zhao <yong.zhao@amd.com>
2017-12-07 16:08:10 -05:00
Sean Keely ca4c884306 Report library load errors in debug builds.
Change-Id: I24e63b15ad74fb86ecfe839f543800c2140c09d9
2017-12-05 18:49:33 -05:00
Oak Zeng c2dc301792 Revert "Revert "More cleanup of fmm.c""
This reverts commit 52f6a61970.

Change-Id: I31afe4889794df8cf1e96f5f18771bed75a213d9
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
2017-12-04 15:48:11 -05:00
Oak Zeng 786e470241 Revert "Revert "Cleanup fmm.c""
This reverts commit f7689d4fef,
Plus a bug fix to patch "Cleanup fmm.c":
Call id_in_array with correct parameter. The third parameter
of id_in_array is size in byte of the array, not the number
of array items. Call it correctly.

Change-Id: I72d8e2fcc0df32af76c72967386e92c1be18c159
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
2017-12-04 15:48:11 -05:00
Felix Kuehling 587d4f4bdf Rename fmm_allocate_memory_in_device
to fmm_allocate_memory_object. This function name was confusingly
similar to fmm_allocate_device and __fmm_allocate_device. The new name
reflects its function better: allocate the VM object and the kernel
mode buffer object.

Change-Id: I6604d228004b4d41e871d4de784786823608b5d6
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
2017-12-04 10:23:55 -05:00
Oak Zeng f7689d4fef Revert "Cleanup fmm.c"
This reverts commit b4c89c1ea7.
This change caused a regression ()
Revert temporarily

Change-Id: Ic3829264151e37d1f8c6927c6f464006234ba17f
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
2017-11-29 09:43:11 -05:00
Oak Zeng 52f6a61970 Revert "More cleanup of fmm.c"
This reverts commit 019f7cbd20.
This change caused a regression ()
Revert temporarily

Change-Id: I5af59d319afeb7f0b03e5a09e8397e3853b8b37b
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
2017-11-29 09:42:19 -05:00
Sean Keely d2e70bb999 Cleanup Signal interfaces for doorbells.
Create an interface for doorbell signals to reduce code duplication.
No functional changes.

Change-Id: I101a8997dd582ff99e1537758c804b21fe3bb6af
2017-11-28 22:12:19 -06:00
Sean Keely b93ffafdc7 Pull from github (tstellar):
Prefer using memfd_create() for the ring buffer.

We were using /dev/shm, but this won't work on systems that
either don't have /dev/shm or have mounted it with noexec, because
for everything other than gfx700 we map the ring buffer with PROT_EXEC.

memfd_create() is Linux specific and was added in Linux 3.17, so we
will fallback to using /dev/shm on systems where memfd_create() is
not available.

Change-Id: I58fb533eebc362f6d29dc3e316a80801014d50e8
2017-11-28 20:47:12 -05:00
Sean Keely 4b603e803d Improve loop variables.
Derived from github pull request by folklore1984.

Change-Id: I70cd3da131691543fed8bf913d6245d41c49280d
2017-11-28 20:36:22 -05:00
Sean Keely 5872b618de Pull from github (pmargheritta):
Corrected semantics used in hsa_queue_load_write_index_relaxed.

The semantics that was used in hsa_queue_load_write_index_relaxed
didn't seem to match the name of the function.
I also removed a useless return keyword.

Change-Id: If3819d38fb367f122fc382edf8ee3771a23279ae
2017-11-28 20:35:50 -05:00
Oak Zeng cce57cec26 Cosmetic changes in events.c
Change-Id: Idecb8eede8811020b3af51cbc71da74849029c82
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
2017-11-28 15:20:51 -05:00
Oak Zeng 019f7cbd20 More cleanup of fmm.c
1. Renamed _fmm_map_to_gpu to _fmm_map_to_apu_local
   to reflect the real semantics of this function
2. Renamed _fmm_map_to_gpu_gtt to _fmm_map_to_gpu
   because this function is used to map both gtt
   and local memory
3. Call _fmm_map_to_gpu in _fmm_map_to_apu_local
   to get rid of duplicated codes

Change-Id: Id8e3ebfffe0a3c27ebdcac8a8f4dc3738d67d10a
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
2017-11-27 18:47:35 -05:00
Oak Zeng b4c89c1ea7 Cleanup fmm.c
1. Initialize pointers to NULL in vm_create_and_init_object
2. Added helper function to add/remove device ids to/from mapped array
3. Only map nodes that were not mapped currently
4. Remove unnecessary condition check on object frees

Change-Id: I7aed6d40c7464be0d168d5796229af55451e0f34
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
2017-11-27 18:47:23 -05:00
Amber Lin 6f7b55f2d8 Add debug message in PMC trace
Print data in PMC trace when the debug level is set to 7(pr_debug).

Change-Id: I9abbb8f6c3f7962fb637528578c1a58b7784042d
Signed-off-by: Amber Lin <Amber.Lin@amd.com>
2017-11-22 10:09:49 -05:00
Oak Zeng 061db45fe2 Fix unconditional unmap in fmm_map_to_gpu_nodes
_fmm_unmap_from_gpu is called in fmm_map_to_gpu_nodes
to unmap buffer from nodes that is already mapped to
but not in the new map nodes list. Previously, the unmap
was called unconditionally even though the size of the
array to unmap is 0. This fixes the issue by calling
the unmap func only when the unmap array size is not 0.

Also releases the fmm_mutex on error returns

Change-Id: Iadd8383caf7ebb92f02618798c5efd138a352aaa
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
2017-11-21 15:16:39 -05:00
Oak Zeng f06e887725 Properly control lifecycle of ptr info objects
Buffer mapping to devices and buffer registration to
devices can be changed b/t two pointer info queries.
Thus update buffer mapping info and registration info
only when mapping and registration changed. This is
done by free mapped_node_id_array on mapping to new
device and free registered_node_id_array on registration
and re-allocate them on next ptr info query.

Also uses fmm_mutex to avoid race conditions in case
of calling hsaKmtQueryPointerInfo concurrently with
calling of buffer mapping or registration

Change-Id: Ibc2e20be1fc0147066f873dfa44b21f5015104b7
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
2017-11-21 15:16:29 -05:00
Evgeny 86939368d1 _aqlprofile_start() API migration
Change-Id: I7c8c7a6fc6f9b20cc2e4074dde38fb19440927f1
2017-11-20 17:32:19 -05:00
Sean Keely e81d04e11b Use a default module search path if not already specified.
Change-Id: I782f0b758dc908c25abeb7f3536418cb5a48ac5e
2017-11-20 14:12:27 -05:00
Amber Lin 61d1c6ffac Correct command in make package
cmake command in making packages was not updated.

Signed-off-by: Amber Lin <Amber.Lin@amd.com>

Change-Id: Iafd3d9f4941d782bd77cfd0efafe48a02221b002
2017-11-20 10:28:25 -05:00
Chris Freehill 651ae1bf70 Device ID/family corrections for gfx9xx
Change-Id: Icb25fbbaeb99ce886a2852b48d02875ee0f197a2
2017-11-16 07:27:54 -05:00
Evgeny 6e1b9288f6 aqlprofile API: removing from HSA hsa_api_trace/hsa_ext_interface
Change-Id: I12fac55ea9ccfdb119899bf9d000e3c8b0bf4bbb
2017-11-11 10:01:12 -06:00
Evgeny bb8eaf3ac8 aqlprofile API: _aqlprofile_start() returns required profile buffer sizes if undersized
Change-Id: Ib14b2cb2e7e2026c3af0b7bd2f08f51e48e598b2
2017-11-09 20:03:55 -06:00
Oak Zeng d4e6ec0ff0 Added "-g" to CFLAGS for debug build
Previously even for debug build, -O2 is used.
So there wasn't debug information in the debug build.

Change-Id: I6334474e007480eb2db191bdfec5a71677c26a52
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
2017-11-09 07:43:36 -05:00
Sean Keely 6455a69b03 Fix bad casts in tools.
Also virtualize queue profiling enable.

Change-Id: I761b41269be3df7eb64a5914ee9951ed6b51bb04
2017-11-08 15:50:02 -05:00
Sean Keely a6d8a48cbc Add callback exception forwarding.
Modified callbacks for intercept queue, queue error, iterate agent and
iterate region.

Change-Id: I8bdd67f2312510ea7eb9caec93babca244938b40
2017-11-08 15:50:02 -05:00
Sean Keely f312a7386e Exception support for Queue.
Remove "zombie" queue state and report queue creation failure via
exceptions.  Make Shared object a final container and support array
objects with Shared.  Add message printing to hsa_exception in
debug builds.

Change-Id: I459f38c80846018acbf45538874e95f91dd6b195
2017-11-08 15:50:02 -05:00
Sean Keely 0c7dde2d1f Add queue intercept support to the runtime.
Queue intercept is exposed as two tools-only APIs via the API
intercept table.

Change-Id: Iac9602ed3143974d85c3569e9092295ad18037f8
2017-11-08 15:50:01 -05:00
Oak Zeng 07110fbd38 Correctly handle max_map_count limit after failed memory allocation
Also separated a function for removing CPU mapping
and reserving address, as a refactoring of codes

Change-Id: I1feb85b0b2ec942487f899ec3192c7c47dd7c7d5
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
2017-11-08 10:05:04 -05:00
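
Commit b93ffafdc7 above describes preferring memfd_create() for the ring buffer and falling back to /dev/shm where memfd_create() is unavailable. The sketch below illustrates that fallback pattern only; it is not the runtime's code. The function names (open_ring_buffer_fd, map_ring_buffer), the "ring_buffer" label, and the extra fallback to the default temporary directory are assumptions made for illustration, and the mapping here is read/write rather than PROT_EXEC.

```python
import mmap
import os
import tempfile


def _open_unlinked_temp_fd() -> int:
    """Fallback path: create a temp file, preferring /dev/shm, and unlink it."""
    # None means "the default temporary directory"; that second choice is an
    # assumption of this sketch, not something the commit message mentions.
    for directory in ("/dev/shm", None):
        try:
            fd, path = tempfile.mkstemp(prefix="ring_buffer_", dir=directory)
        except OSError:
            continue  # directory missing or not writable, try the next one
        os.unlink(path)  # the open descriptor keeps the storage alive
        return fd
    raise OSError("no usable location for the ring buffer backing file")


def open_ring_buffer_fd(size: int) -> int:
    """Return a descriptor for `size` bytes of memory-backed storage."""
    fd = -1
    # os.memfd_create only exists where Python was built with support for it.
    memfd_create = getattr(os, "memfd_create", None)
    if memfd_create is not None:
        try:
            fd = memfd_create("ring_buffer")
        except OSError:
            fd = -1  # e.g. a kernel without the system call
    if fd < 0:
        fd = _open_unlinked_temp_fd()
    os.ftruncate(fd, size)
    return fd


def map_ring_buffer(fd: int, size: int) -> mmap.mmap:
    """Map the descriptor as shared, readable and writable memory."""
    return mmap.mmap(fd, size)
```

A caller would obtain the descriptor with open_ring_buffer_fd(4096), map it with map_ring_buffer(fd, 4096), and close both when finished. Trying memfd_create() first avoids depending on how /dev/shm is mounted, which is the motivation given in the commit message.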