Commit Graph

2959 Commits

Author SHA1 Message Date
Sean Keely 4275661682 Add env key to disable 2MB suballocation.
Change-Id: Icca3041c3578aa180a656c01aae62f2ad6e8b583
2017-09-19 06:08:36 -04:00
Sean Keely 117be0b55a Add suballocator for ordinary VRAM allocations smaller than 2MB.
Track pointer info for sub 2MB fragment allocations in allocation_map_.

Add fragment support to IPC.

Change-Id: I00cfc2e2fa289aac90a4718c392f9bb056a61a87
2017-09-19 06:08:36 -04:00
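The suballocation idea in the commit above (carving allocations smaller than 2MB out of larger blocks and tracking each fragment in an allocation map) can be sketched roughly as follows. This is an illustrative Python sketch, not the runtime's C++ code: the class name `SubAllocator`, the `alloc_block` callback, the 4KB granule, and the first-fit policy are all assumptions made for the example.

```python
BLOCK_SIZE = 2 * 1024 * 1024  # 2MB threshold named in the commit message
GRANULE = 4096                # assumed fragment granularity (hypothetical)


class SubAllocator:
    """First-fit suballocator over 2MB blocks (illustrative only)."""

    def __init__(self, alloc_block):
        # alloc_block: hypothetical callable returning the base address
        # of a freshly allocated 2MB block.
        self._alloc_block = alloc_block
        self._free = {}            # block base -> sorted [(offset, size)]
        self.allocation_map = {}   # fragment address -> (block base, size)

    def allocate(self, size):
        if size <= 0 or size >= BLOCK_SIZE:
            raise ValueError("only sizes in (0, 2MB) are suballocated")
        # Round the request up to a whole number of granules.
        size = (size + GRANULE - 1) // GRANULE * GRANULE
        for base, ranges in self._free.items():
            for i, (_, length) in enumerate(ranges):
                if length >= size:
                    return self._carve(base, ranges, i, size)
        # No existing block has room: obtain a new 2MB block.
        base = self._alloc_block()
        self._free[base] = [(0, BLOCK_SIZE)]
        return self._carve(base, self._free[base], 0, size)

    def _carve(self, base, ranges, i, size):
        off, length = ranges[i]
        if length == size:
            del ranges[i]
        else:
            ranges[i] = (off + size, length - size)
        addr = base + off
        self.allocation_map[addr] = (base, size)
        return addr

    def free(self, addr):
        base, size = self.allocation_map.pop(addr)
        ranges = self._free[base]
        ranges.append((addr - base, size))
        ranges.sort()
        # Coalesce adjacent free ranges.
        merged = [ranges[0]]
        for off, length in ranges[1:]:
            last_off, last_len = merged[-1]
            if last_off + last_len == off:
                merged[-1] = (last_off, last_len + length)
            else:
                merged.append((off, length))
        self._free[base] = merged
```

Keeping the per-fragment record in `allocation_map` is what lets a pointer query resolve a fragment address back to its owning block and size.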
Chris Freehill ae4a9c4d91 Remove reliance on env. variables to build rocrtst
Change-Id: I451dc9da4e810db51a4ec19e17a9b5206d09a224
2017-09-16 23:58:20 -04:00
hthangir 87d2df3da3 Use non-RAW version in clock_getres to work around bug in older kernels
Change-Id: Ice0606a42cd7054f0804baf4af3521ffae3b7d50
2017-09-14 13:56:15 -05:00
Chris Freehill b4f2e8d8f1 Remove use of tools library
Change-Id: I80eb95987e1e91c67bd6c3e4b12df934860940f1
2017-09-13 13:49:56 -05:00
Chris Freehill 0cb7db7d7e Adjust CMakefile to use defines instead of env. vars.
Change-Id: If5e97269774416eb65ab2d6d3f9e299b950c63a4
2017-09-13 09:33:18 -04:00
Chris Freehill 80a7bdf66b Modified to build within Jenkins
Change-Id: I70c9c6b690f198c41641432478343d3714e26ab0
2017-09-08 12:57:23 -05:00
Amber Lin 117fa5034b Fix PmcGetCounterProperties
Blocks inside the HsaCounterProperties structure are not a fixed size. The
size varies with the number of counters in the block: the size of Counters in
HsaCounterBlockProperties is different in every block. The current
implementation assumes a fixed size, so the next block overwrites the
previous block's Counters. This patch changes the array implementation to
use a pointer so that the next block is placed at the correct position.

Change-Id: I72800f4db5f2a68215fba477a61ca07ec99054bf
Signed-off-by: Amber Lin <Amber.Lin@amd.com>
2017-09-01 17:58:15 -04:00
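The problem described in the commit above is the general one of laying out variable-length records back to back: the position of each block has to be derived from the actual size of the block before it, not from a fixed stride. A minimal Python sketch of that idea follows. The record layout (a 32-bit counter count followed by 32-bit counter ids) and the function names are hypothetical and do not reflect the real HsaCounterBlockProperties layout.

```python
import struct

HEADER = struct.Struct("<I")   # hypothetical header: number of counters
COUNTER = struct.Struct("<I")  # hypothetical counter entry: one 32-bit id


def pack_blocks(blocks):
    """Serialize a list of counter lists as variable-length records."""
    out = bytearray()
    for counters in blocks:
        out += HEADER.pack(len(counters))
        for counter_id in counters:
            out += COUNTER.pack(counter_id)
    return bytes(out)


def unpack_blocks(buf):
    """Walk the buffer, advancing by each block's real size."""
    blocks = []
    offset = 0
    while offset < len(buf):
        (count,) = HEADER.unpack_from(buf, offset)
        offset += HEADER.size
        counters = [
            COUNTER.unpack_from(buf, offset + i * COUNTER.size)[0]
            for i in range(count)
        ]
        # Move past exactly the counters this block contains.
        offset += count * COUNTER.size
        blocks.append(counters)
    return blocks
```

Advancing `offset` by `count * COUNTER.size` plays the role of the moving pointer: a fixed stride would land inside a longer block or skip past a shorter one.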
Sean Keely f1a661dedb Report tools library load failures in debug builds.
Change-Id: Ie1ff313e929fc46134e58730a1d370c5d7ace8db
2017-08-31 21:32:48 -04:00
Laurent Morichetti 0f05ef73ac Include <errno.h> for EBUSY
Change-Id: I9fa3417445866f3ce37af2169f623afa8e92e873
2017-08-31 07:32:51 -07:00
Chris Freehill 92e46584f8 Async mem. copy test with NUMA awareness
Change-Id: If655ac4c087be2d379e868aad83812f2437d78b9
2017-08-30 21:35:37 -04:00
Amber Lin a74f6896ea Revert "Set guard page as disabled as default"
This reverts commit 65d680c035.

Change-Id: I09b7e7915ec4759cab57d5863089a2c4a44dfacd
Signed-off-by: Amber Lin <Amber.Lin@amd.com>
2017-08-30 10:48:23 -04:00
Amber Lin a81b29890c Fix mmap when the count reaches the max
Applications may try to allocate a lot of host memory and reach the mmap
limit (/proc/sys/vm/max_map_count). When an application fails to allocate
memory and calls hsaKmtFreeMemory to release the memory, Thunk fails to
reduce the map count, so subsequent hsaKmtAllocMemory calls continue
to fail, which doesn't make sense to the application. This patch checks
the return value of the mmap to NORESERVE. If it fails and the error number
is ENOMEM, reduce the map count by munmap and map it again immediately.



Change-Id: I127cb479dfd86b199172eef269d59426f23859ea
Signed-off-by: Amber Lin <Amber.Lin@amd.com>
2017-08-29 11:47:29 -04:00
Felix Kuehling 52598cf37e Generalize size-dependent virtual address alignment
Support all fragment sizes up to 2MB by aligning buffers according
to their size.

Change-Id: I82b7ef8be6f1507d941e5c97edb6618adf8c66de
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
2017-08-28 15:01:33 -04:00
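Aligning a buffer according to its size, as the commit above describes, might look like the following Python sketch. The specific rule used here (the largest power of two not exceeding the buffer size, with a 4KB floor and the 2MB cap from the commit message) and the helper names are assumptions for illustration. The Thunk's actual rule may differ.

```python
PAGE_SIZE = 4 * 1024             # assumed minimum alignment (one 4KB page)
MAX_FRAGMENT = 2 * 1024 * 1024   # 2MB cap named in the commit message


def alignment_for_size(size: int) -> int:
    """Return a power-of-two alignment chosen from the buffer size."""
    if size <= 0:
        raise ValueError("size must be positive")
    # Largest power of two that does not exceed size.
    largest_pow2 = 1 << (size.bit_length() - 1)
    return max(PAGE_SIZE, min(largest_pow2, MAX_FRAGMENT))


def align_up(address: int, alignment: int) -> int:
    """Round address up to the next multiple of a power-of-two alignment."""
    return (address + alignment - 1) & ~(alignment - 1)
```

For example, under these assumptions a 96KB buffer would be placed on a 64KB boundary, while any buffer of 2MB or more would be placed on a 2MB boundary.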
Ramesh Errabolu 3980b4268b Add Iommu Perf Cntrs
Change-Id: I1cf3f00a959a923462634a62263707a267ae18af
2017-08-28 12:57:11 -05:00
Amber Lin 65d680c035 Set guard page as disabled as default
Due to the max_map_count issue, disable guard pages by default.


Change-Id: Ic9dfe69b621733e9fac86831b008a122994a67e7
Signed-off-by: Amber Lin <Amber.Lin@amd.com>
2017-08-24 16:27:52 -04:00
Felix Kuehling 395ecaa985 Fix two more device ID array bugs
Use the mapped_device_id_array size when allocating temp_node_id_array
for unmapping queues in fmm_map_to_gpu_nodes. registered_device_id_array
size may be 0. Also, this temporary array is small enough to allocate it
on the stack. Malloc and free are overkill here.

Fix potential memory leak when registering the same device ID array
multiple times.

Change-Id: I83f09fd0925d9de7cf11bf72ba0ebb77273f587d
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
2017-08-24 13:31:25 -04:00
Evgeny 09c732c2f9 aqlprofile api: adding gfx8 mc counters
Change-Id: I84dc06c24b7961dfe665cf7e2ae6cc9ae3b7326b
2017-08-23 15:23:05 -05:00
Amber Lin 369902bf5b IOMMU path in sysfs is renamed
The IOMMU path in sysfs was amd_iommu. After multiple-device support was
implemented, the path was replaced with amd_iommu_<index>. The current Thunk
spec is not clear about how to support multiple instances in one block. There
are no products with multiple IOMMUs yet at this point. This patch
changes the path to support both amd_iommu and amd_iommu_0 for Carrizo.

Change-Id: I3beea2fc78d96296232226191501a02ccf20d6b1
Signed-off-by: Amber Lin <Amber.Lin@amd.com>
2017-08-23 14:00:49 -04:00
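Supporting both directory names, as the commit above describes, amounts to probing a short list of candidates in order. A minimal Python sketch follows. The function name `find_iommu_dir` and the `base_dir` parameter are hypothetical, the parent sysfs directory is left to the caller, and the probing order is an assumption. Only the two directory names come from the commit message.

```python
import os
from typing import Optional

# Names from the commit message: the original name first, then the
# indexed name introduced with multiple-device support.
IOMMU_DIR_CANDIDATES = ("amd_iommu", "amd_iommu_0")


def find_iommu_dir(base_dir: str) -> Optional[str]:
    """Return the first existing IOMMU directory under base_dir, or None."""
    for name in IOMMU_DIR_CANDIDATES:
        path = os.path.join(base_dir, name)
        if os.path.isdir(path):
            return path
    return None
```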
Amber Lin 73707766ef Add debugging message on memory
Add pr_debug to all memory APIs and pr_err to some failure cases.

Change-Id: I8b519a1228cc19e6c04118fd87432e7f48f3cbf9
Signed-off-by: Amber Lin <Amber.Lin@amd.com>
2017-08-22 09:55:20 -04:00
Sean Keely 0cb1e8cb35 Correct vm_fault signal cleanup.
Change-Id: Id2f14b911e3991a76771425bc09f38a613280e6b
2017-08-18 22:12:38 -04:00
Harish Kasiviswanathan 00250f7686 Fix hsaKmtQueryPointerInfo for scratch memory
Change-Id: I35a0f1a81f8b0ac6e99c2e1572829eb32d3bc95b
Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
2017-08-18 17:01:31 -04:00
Chris Freehill c85322d93d Remove dependency on hsa-runtime-tools
Change-Id: Ic4ce2bcbf1176e7eb859db39f21e7185691837e1
2017-08-17 15:39:35 -04:00
Amber Lin f1a5248cf2 Simplify memory maps
Simplify fmm_map_to_gpu_nodes code. Also fix a memory leak in this change.

Change-Id: I3487338b78c915de44588d0206bac4c53e728c60
Signed-off-by: Amber Lin <Amber.Lin@amd.com>
2017-08-15 17:34:30 -04:00
Evgeny fb4afca8c3 AqlProfile API, commenting out hanging GFX9 blocks, ATC, GCEA, RPB
Change-Id: I411fb33f77c9538ca236b9b6b09c7dfe75220c02
2017-08-15 12:44:00 -05:00
Sean Keely dec5c52e07 Simplify pointer info version check.
Change-Id: I0ed363f1261ffc041547f313970ca67298ace56c
2017-08-12 03:14:39 -04:00
Sean Keely c9642cf7af Initial IPC signal support.
Added an API for creating signals with attributes.
Added two APIs for IPC operations on signals.
Initial use of exceptions for error handling.

Add ref counting to signals.
Removed spin loops from signal destructors.
Signals are no longer to be destroyed with delete, use DeleteSignal instead.
Added delete safety to doorbells.
Added secondary hsa_signal_t -> Signal* translation path for IPC enabled signals.

Change-Id: Id59065d002f0c2566b0a9425694da2ed27cb7d7f
2017-08-11 18:41:34 -05:00
Sean Keely 2732b18092 Initial exception support for signals.
Also separate signal ABI block allocations from the runtime interface object.

Change-Id: If16763338db664f29163a1348f8f4c38cf0597b2
2017-08-11 18:41:34 -05:00
Sean Keely d6acd0edfc Update ipc test to use IPC signals.
Change-Id: Id5984093de45b08261d3196cc6fc3d597324edf4
2017-08-11 17:29:55 -04:00
Evgeny 287afd3a52 adding aqlprofile member to HsaApiTable
Change-Id: Id674186dfa2e83295a51f770ccc0400f1cb51a98
2017-08-09 16:09:39 -05:00
Felix Kuehling 78e683acf4 Changes to run on old kernels
Fall back to older apertures API and old events page size if the new APIs
fail. This allows running on current upstream kernels (with only minor
fixes) on gfx801 and enables testing of further changes during upstreaming.

Change-Id: I9d86d4f576e52fcbb5bc158d80f1bf41261e4e87
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
2017-08-09 16:12:54 -04:00
Evgeny 4824a2db0b Adding HSA_API macro to the API method declarations to be consistent with other HSA header files, TCS removing
Change-Id: Ic217d3b2bdbb22d3600c5ecaacb7ab53bf26096a
2017-08-08 10:46:12 -04:00
Chris Freehill 783a28b68c Remove build of non-existent project
Change-Id: I6b2c59e67c2d2a320e705b725f8c779b9913759a
2017-08-08 10:03:36 -04:00
Evgeny 47322942b3 aqlprofile block list, explicit numbers assigned, IA removed
Change-Id: I9f9358f8e03e13eb81845de2e33dd5f3da27811a
2017-08-03 11:39:21 -04:00
Yong Zhao d0e2872011 Add gfx902 support
Change-Id: Iefc6d1bea0d1d2ea8768867c53f16cdf1279d38f
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
2017-08-03 10:27:56 -04:00
Evgeny c66f68041c aql-profile api: reducing blocks list to compute only and new gfx9 blocks
Change-Id: Ib506b82ea407afec4f5d4bcad755d4d57b92e34b
2017-08-02 12:21:24 -04:00
ozeng 8176830577 CMakeLists changes to make thunk buildable on CentOS 6.9
Removed the Werror CFLAGS for lower versions of gcc. There
will be some warning messages on lower gcc versions, but the build
is OK.

Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>

Change-Id: Icf556625cb870c4ad73e1d89f3d4ade3a96e821f
2017-08-01 12:06:05 -04:00
Chris Freehill ab2248132a Clearer/more concise variable names
Change-Id: Ib92211977066b728f19b2a7fe40639160a8262b3
2017-08-01 10:38:26 -05:00
Chris Freehill cf24f7bb78 Added max. single mem. allocation test.
Change-Id: Ie81c6af0502fde56225b1e197801cf04b474feb2
2017-07-31 12:04:55 -05:00
James Edwards 2c2de075a9 Add back GNU Makefiles.
Change-Id: I4a367655a905a85d4c29980aa2da8ac28db73d10
2017-07-30 08:21:35 -05:00
Chris Freehill bddc89e703 Reorganize tests
Change-Id: I45f92d61070b325bcb57bd72e4a68e7d6495463c
2017-07-28 11:32:20 -04:00
hthangir 9ee0108e58 Fix compilation issue reported with GLIBC 2.12 (RHEL 6.9)
Change-Id: I770b72ba1d61475a76aa72d0c52ebfb380db6019
2017-07-28 11:11:01 -04:00
Chris Freehill a055531eb4 Update tests to use rocm-smi
Change-Id: Ia4692019460f4ba42a12ecba1f9e59576561c73e
2017-07-28 08:34:27 -04:00
Harish Kasiviswanathan 186527d0b7 Support IPC sharing of non paged system memory
Non-paged system memory is allocated with node id 0. However, since a
gpu node is required for allocating system memory via KFD, the first
dgpu is used. In hsaKmtShareMemory(), if the memory is system memory,
use the same (first) dgpu.

Change-Id: I85789a89a4e4f7888e3826826401ea89ce4d1718
Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
2017-07-26 10:17:07 -04:00
Harish Kasiviswanathan 20f0de71df Fix inconsistent calling of validate_nodeid
Change-Id: I3e8e65a5629059abdde89832b619cd8bf1f2b36c
Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
2017-07-26 10:17:07 -04:00
Chris Freehill 8424fd6f23 Add rocm-smi c++ utility classes
Change-Id: I4362151abf84f89942bf2895b45fca498a28dfc9
2017-07-25 00:42:34 -04:00
Amber Lin e46743b1dd Workaround cpuid issue under Valgrind
Topology uses cpuid to get CPU cache information. However when running
under Valgrind, data returned from cpuid are not from the processor we set
affinity to. Instead they are all from one specific processor. For a quick
workaround so other teams can continue their work, this patch will report
CPU cache from that specific processor and ignore others.

Change-Id: I5cfac2329dac277f3dbde1be92fa26e085465401
Signed-off-by: Amber Lin <Amber.Lin@amd.com>
2017-07-24 12:04:17 -04:00
Felix Kuehling d563e2cb1d Update image alignment to 256KB
Needed for some tiling formats.

Change-Id: Icd460edaa77ccbeb3c98bc74b574ca5517db22af
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
2017-07-20 21:03:31 -04:00
James Edwards ee22d80760 Update README.md to include new build instructions.
Change-Id: I72ca67d3016c99682cfe745bfd74c722ea181a61
2017-07-20 09:17:54 -05:00
James Edwards e93d3de0a1 Final changes to roct CMakeLists.txt file for devel package.
Change-Id: Ie0ce0c5cd8e7811f67e92439d1df1612eabefdfa
2017-07-19 17:16:17 -05:00