Commit graph

2930 commits

Author SHA1 Message Date
Evgeny ce72ce0c92 adding aqlprofile member to HsaApiTable
Change-Id: Id674186dfa2e83295a51f770ccc0400f1cb51a98


[ROCm/ROCR-Runtime commit: 287afd3a52]
2017-08-09 16:09:39 -05:00
Felix Kuehling 3f12312b4f Changes to run on old kernels
Fall back to older apertures API and old events page size if the new APIs
fail. This allows running on current upstream kernels (with only minor
fixes) on gfx801 and enables testing of further changes during upstreaming.

Change-Id: I9d86d4f576e52fcbb5bc158d80f1bf41261e4e87
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>


[ROCm/ROCR-Runtime commit: 78e683acf4]
2017-08-09 16:12:54 -04:00
Evgeny 200067f1be Adding HSA_API macro to the API method declarations to be consistent with other HSA header files, TCS removing
Change-Id: Ic217d3b2bdbb22d3600c5ecaacb7ab53bf26096a


[ROCm/ROCR-Runtime commit: 4824a2db0b]
2017-08-08 10:46:12 -04:00
Chris Freehill e16e5fb048 Remove build of non-existent project
Change-Id: I6b2c59e67c2d2a320e705b725f8c779b9913759a


[ROCm/ROCR-Runtime commit: 783a28b68c]
2017-08-08 10:03:36 -04:00
Evgeny c0c32288ac aqlprofile block list, explicit numbers assigned, IA removed
Change-Id: I9f9358f8e03e13eb81845de2e33dd5f3da27811a


[ROCm/ROCR-Runtime commit: 47322942b3]
2017-08-03 11:39:21 -04:00
Yong Zhao 40cdf91bad Add gfx902 support
Change-Id: Iefc6d1bea0d1d2ea8768867c53f16cdf1279d38f
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>


[ROCm/ROCR-Runtime commit: d0e2872011]
2017-08-03 10:27:56 -04:00
Evgeny d224dc274f aql-profile api: reducing blocks list to compute only and new gfx9 blocks
Change-Id: Ib506b82ea407afec4f5d4bcad755d4d57b92e34b


[ROCm/ROCR-Runtime commit: c66f68041c]
2017-08-02 12:21:24 -04:00
ozeng 2b80ccdd65 CMakeLists changes to make thunk buildable on CentOS 6.9
Removed the Werror CFLAGS for older versions of gcc. Some
warning messages will appear on older gcc versions, but the build
succeeds.

Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>

Change-Id: Icf556625cb870c4ad73e1d89f3d4ade3a96e821f


[ROCm/ROCR-Runtime commit: 8176830577]
2017-08-01 12:06:05 -04:00
Chris Freehill aca0282c61 Clearer/more concise variable names
Change-Id: Ib92211977066b728f19b2a7fe40639160a8262b3


[ROCm/ROCR-Runtime commit: ab2248132a]
2017-08-01 10:38:26 -05:00
Chris Freehill 9aae431f6d Added max. single mem. allocation test.
Change-Id: Ie81c6af0502fde56225b1e197801cf04b474feb2


[ROCm/ROCR-Runtime commit: cf24f7bb78]
2017-07-31 12:04:55 -05:00
James Edwards 21f920f6f4 Add back GNU Makefiles.
Change-Id: I4a367655a905a85d4c29980aa2da8ac28db73d10


[ROCm/ROCR-Runtime commit: 2c2de075a9]
2017-07-30 08:21:35 -05:00
Chris Freehill b39089e54c Reorganize tests
Change-Id: I45f92d61070b325bcb57bd72e4a68e7d6495463c


[ROCm/ROCR-Runtime commit: bddc89e703]
2017-07-28 11:32:20 -04:00
hthangir ea88523fc7 Fix compilation issue reported with GLIBC 2.12 (RHEL 6.9)
Change-Id: I770b72ba1d61475a76aa72d0c52ebfb380db6019


[ROCm/ROCR-Runtime commit: 9ee0108e58]
2017-07-28 11:11:01 -04:00
Chris Freehill 9443b47a53 Update tests to use rocm-smi
Change-Id: Ia4692019460f4ba42a12ecba1f9e59576561c73e


[ROCm/ROCR-Runtime commit: a055531eb4]
2017-07-28 08:34:27 -04:00
Harish Kasiviswanathan faa5102340 Support IPC sharing of non paged system memory
Non paged system memory is allocated with node id 0. However, since a
gpu node is required for allocating system memory via KFD, the first
dgpu is used. In hsaKmtShareMemory(), if the memory is system memory,
use the same (first) dgpu.

Change-Id: I85789a89a4e4f7888e3826826401ea89ce4d1718
Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>


[ROCm/ROCR-Runtime commit: 186527d0b7]
2017-07-26 10:17:07 -04:00
Harish Kasiviswanathan 8581bb285d Fix inconsistent calling of validate_nodeid
Change-Id: I3e8e65a5629059abdde89832b619cd8bf1f2b36c
Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>


[ROCm/ROCR-Runtime commit: 20f0de71df]
2017-07-26 10:17:07 -04:00
Chris Freehill 3e458ed991 Add rocm-smi c++ utility classes
Change-Id: I4362151abf84f89942bf2895b45fca498a28dfc9


[ROCm/ROCR-Runtime commit: 8424fd6f23]
2017-07-25 00:42:34 -04:00
Amber Lin 391d9ee403 Workaround cpuid issue under Valgrind
Topology uses cpuid to get CPU cache information. However when running
under Valgrind, data returned from cpuid are not from the processor we set
affinity to. Instead they are all from one specific processor. For a quick
workaround so other teams can continue their work, this patch will report
CPU cache from that specific processor and ignore others.

Change-Id: I5cfac2329dac277f3dbde1be92fa26e085465401
Signed-off-by: Amber Lin <Amber.Lin@amd.com>


[ROCm/ROCR-Runtime commit: e46743b1dd]
2017-07-24 12:04:17 -04:00
Felix Kuehling 4a241a9d5f Update image alignment to 256KB
Needed for some tiling formats.

Change-Id: Icd460edaa77ccbeb3c98bc74b574ca5517db22af
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>


[ROCm/ROCR-Runtime commit: d563e2cb1d]
2017-07-20 21:03:31 -04:00
James Edwards 40efa7d948 Update README.md to include new build instructions.
Change-Id: I72ca67d3016c99682cfe745bfd74c722ea181a61


[ROCm/ROCR-Runtime commit: ee22d80760]
2017-07-20 09:17:54 -05:00
James Edwards c310a3c99b Final changes to roct CMakeLists.txt file for devel package.
Change-Id: Ie0ce0c5cd8e7811f67e92439d1df1612eabefdfa


[ROCm/ROCR-Runtime commit: e93d3de0a1]
2017-07-19 17:16:17 -05:00
James Edwards 954975f452 Update the ROCt CMakeList.txt files to build both runtime and devel packages.
Change-Id: I01b6e4e5db91dd5f56ffea54c548e10f1f4aae5d


[ROCm/ROCR-Runtime commit: a1353acd85]
2017-07-19 01:16:13 -04:00
Felix Kuehling 0d145b951a Make HSA_DISABLE_CACHE work on gfx900
Change-Id: I624390bfa70b2ff4cefd1bbdf8d960b7121f22bb
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>


[ROCm/ROCR-Runtime commit: c48ff6b482]
2017-07-17 15:13:52 -04:00
Felix Kuehling 0c811652ff Align large buffers to BigK or huge-page boundary
This should allow us to take advantage of BigK fragments and huge pages
and improve TLB efficiency for VRAM allocations. Huge pages only work
with 4-level page tables (gfx900 and up). BigK fragments work on older
GPUs.

Change-Id: I02e1fbf74de554e16fdaf44e44d03b47df45c3b0
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>


[ROCm/ROCR-Runtime commit: c7bd7733e5]
2017-07-17 12:04:05 -04:00
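
The alignment choice described in the "Align large buffers to BigK or huge-page boundary" commit can be sketched roughly as follows. This is an illustration only, not the Thunk's actual code: the function names, the 64 KB BigK size, and the 2 MB huge-page size are assumptions, since the commit message does not state them.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed sizes; the commit message does not give the actual values. */
#define BIGK_SIZE      UINT64_C(0x10000)   /* 64 KB fragment */
#define HUGE_PAGE_SIZE UINT64_C(0x200000)  /* 2 MB huge page */

/* Round value up to the next multiple of a power-of-two alignment. */
uint64_t align_up(uint64_t value, uint64_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (value + alignment - 1) & ~(alignment - 1);
}

/*
 * Pick an alignment for a VRAM buffer (hypothetical helper).
 * Huge pages are only considered when the GPU has 4-level page tables
 * (gfx900 and up, per the commit message); BigK applies to older GPUs too.
 * Small buffers keep the ordinary page alignment.
 */
uint64_t pick_alignment(uint64_t size, bool has_4_level_page_tables,
                        uint64_t page_size)
{
    if (has_4_level_page_tables && size >= HUGE_PAGE_SIZE)
        return HUGE_PAGE_SIZE;
    if (size >= BIGK_SIZE)
        return BIGK_SIZE;
    return page_size;
}
```

A caller would compute `align_up(address, pick_alignment(size, ...))` before reserving the virtual address range, so that large buffers start on a boundary the TLB can cover with a single large entry.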
Chris Freehill bcd0bd4e38 Added dispatch time, async copy and test template rocrtst tests
Change-Id: I57a844ee65c36bd61616ee6d60d358303f51db56


[ROCm/ROCR-Runtime commit: a12c5628ea]
2017-07-17 10:30:26 -05:00
Felix Kuehling 838f639b73 Align imported graphics buffers
Imported graphics buffers are most likely images. Align them for
tiled image access. 64KB seems to do the trick.

This fixes VM faults with OpenCL graphics interop.

Change-Id: I7f60e205d93fff9407e0d00d3dbb02cc4990b863
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>


[ROCm/ROCR-Runtime commit: dc2c52be78]
2017-07-17 11:19:08 -04:00
Evgeny 4648eb66fd moving hsa-amd-aqlprofile to ssh://gerritgit/hsa/ec/aqlprofile
Change-Id: Ic42752ca41f877db02aa5a5d8d617cd67cce8956


[ROCm/ROCR-Runtime commit: 08d5efe29d]
2017-07-14 14:59:42 -05:00
Evgeny b6d71a8fe6 hsa_ven_amd_aqlprofile.h: include <hsa.h> fix
Change-Id: Idfd2fdde112d502d4b4a3365512ec601f7e56a5b


[ROCm/ROCR-Runtime commit: ab67b8511b]
2017-07-12 15:43:58 -05:00
Sean Keely dd8804d7ad Remove use of anonymous member in C builds.
Tools/CodeXL will retain older versions of structs if they need them.

Change-Id: I568d7b445778dd575ef71000b4b839300572288e


[ROCm/ROCR-Runtime commit: a0a3587345]
2017-07-12 16:40:00 -04:00
Sean Keely 41ab59b1e7 Fix queue interception in tools.
1. Correct amd::AqlQueue::ExecutePM4 to support interception.
2. Minor fixes to AqlPacket and SoftCP.
3. Minimal change to disable interception of runtime internal queues.

Change-Id: I103fece2ebf9a188d27f01e61221c737405d7253


[ROCm/ROCR-Runtime commit: bc0bd00746]
2017-07-12 16:39:43 -04:00
Sean Keely 17d0e450cb Correct handling of slow clocks under linux.
Change-Id: I9a1b08d5457caa6739220603bbd37b00febc64d7


[ROCm/ROCR-Runtime commit: 29b5b5c029]
2017-07-12 12:49:49 -04:00
Sean Keely 19f96afee1 Properly order signal copy agent tagging with copy operation.
Change-Id: Ic428c958551279fbea1b2449afba930b82804ede


[ROCm/ROCR-Runtime commit: 3e50adc7ce]
2017-07-11 13:10:00 -04:00
Sean Keely cafbebc2a5 Decrement hsa_init ref counter when init fails.
Change-Id: If9376344d4b559e601932d070731132c8450104e


[ROCm/ROCR-Runtime commit: c9f0427cb0]
2017-07-07 21:21:03 -05:00
Amber Lin 5cb0798d6a Replace lock file with shared memory
Performance counters have limited slots for concurrent profiling. We
need a mechanism to synchronize the slots access across different
processes. A lock file was first used for this access control. It
revealed a RedHat bug: /var/lock, a symbolic link to /run/lock, is
not writable by others. To avoid this bug and to simplify the code,
POSIX shared memory is created to replace the lock file usage. Access
of the shared memory is controlled by semaphores.

Change-Id: I1e13c17f0e042fdfe6657afe8b3c88db7e84d292
Signed-off-by: Amber Lin <Amber.Lin@amd.com>


[ROCm/ROCR-Runtime commit: ac468f676c]
2017-07-06 13:48:34 -04:00
Evgeny ce82829fc1 hsa-runtime integration
Change-Id: I48968966ffe164218ebff88d0e3a1268e96bf1dd


[ROCm/ROCR-Runtime commit: 4174f07fd1]
2017-07-05 10:55:30 -04:00
Jay Cornwall b0e2719436 Always allocate space for control stack at beginning of save area
Hardware block testing is done with the workgroup state offset
initialized to the control stack size on all ASICs. MEC microcode
assumes this space is available when the workgroup state offset is
reset after a context restore event.

Fixes context save area overrun when the full save area is used.

Change-Id: I8eeb62f97140c6fe409fe78b4497d833584feea8
Signed-off-by: Jay Cornwall <Jay.Cornwall@amd.com>


[ROCm/ROCR-Runtime commit: 4fbffcdd9c]
2017-07-04 10:12:02 -04:00
Harish Kasiviswanathan 45a97666b7 Fix fd leak if application forks
If the application forks, close the fd inherited from the parent.

Change-Id: I48e4157d5f0d6f04d07ecb23b719a23934687cdb
Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>


[ROCm/ROCR-Runtime commit: dc6ece67fd]
2017-06-30 15:59:41 -04:00
Harish Kasiviswanathan 7a12251015 Honor ReadOnly bit in HsaMemFlags
Change-Id: I456cde81384bf0f4bf055711d94b731179706d28
Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>


[ROCm/ROCR-Runtime commit: 4d0697bf65]
2017-06-28 15:02:21 -04:00
Amber Lin 75c3086af9 Replace printf/fprintf with pr_xxx
Libraries normally don't print messages. We use pr_err, pr_warn,
pr_info, and pr_debug to print messages to stderr when prints are
enabled for debugging.

Change-Id: I9caf719343aa618c88e7b500f9737a46702e424a
Signed-off-by: Amber Lin <Amber.Lin@amd.com>


[ROCm/ROCR-Runtime commit: 897c4e2fff]
2017-06-28 10:47:35 -04:00
Amber Lin 29124d3af3 Introduce debug level to Thunk
Existing Thunk has printf/fprintf in the code while normally libraries
don't print any message. This patch introduces a print mechanism similar to
how the Linux kernel prints to console based on the log level. The default
is not to print any message, but setting HSAKMT_DEBUG_LEVEL will enable the
prints.

Change-Id: Ic071e122d35a82260218e9914cde4815e69df742
Signed-off-by: Amber Lin <Amber.Lin@amd.com>


[ROCm/ROCR-Runtime commit: ccfe739929]
2017-06-26 16:33:17 -04:00
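
The log-level gating described in the "Introduce debug level to Thunk" commit might look something like the sketch below. Only the environment variable name HSAKMT_DEBUG_LEVEL and the pr_err/pr_warn/pr_info/pr_debug macro names come from the commit messages; the level values (modelled on Linux kernel log levels), the helper functions, and the parsing rules are assumptions made for illustration.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical level values, chosen to mirror Linux kernel log levels. */
enum log_level {
    LOG_LEVEL_OFF = -1,  /* default: print nothing */
    LOG_LEVEL_ERR = 3,
    LOG_LEVEL_WARN = 4,
    LOG_LEVEL_INFO = 6,
    LOG_LEVEL_DEBUG = 7
};

static int current_level = LOG_LEVEL_OFF;

/* Convert the variable's text into a level; unparsable text means "off". */
int parse_log_level(const char *text)
{
    char *end;
    long value;

    if (text == NULL || *text == '\0')
        return LOG_LEVEL_OFF;
    value = strtol(text, &end, 10);
    if (*end != '\0' || value < LOG_LEVEL_OFF || value > LOG_LEVEL_DEBUG)
        return LOG_LEVEL_OFF;
    return (int)value;
}

/* Explicit setter, mainly useful for tests. */
void set_log_level(int level)
{
    current_level = level;
}

/* Read HSAKMT_DEBUG_LEVEL once, for example when the library is opened. */
void init_log_level(void)
{
    set_log_level(parse_log_level(getenv("HSAKMT_DEBUG_LEVEL")));
}

/* A message is printed only if its level is at or below the configured one. */
int log_enabled(int level)
{
    return level <= current_level;
}

#define pr_err(...) \
    do { if (log_enabled(LOG_LEVEL_ERR)) fprintf(stderr, __VA_ARGS__); } while (0)
#define pr_warn(...) \
    do { if (log_enabled(LOG_LEVEL_WARN)) fprintf(stderr, __VA_ARGS__); } while (0)
#define pr_info(...) \
    do { if (log_enabled(LOG_LEVEL_INFO)) fprintf(stderr, __VA_ARGS__); } while (0)
#define pr_debug(...) \
    do { if (log_enabled(LOG_LEVEL_DEBUG)) fprintf(stderr, __VA_ARGS__); } while (0)
```

With this layout the library stays silent unless the variable is set, and a call such as `pr_warn("falling back to old API\n")` reaches stderr only when the configured level is 4 or higher.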
Evgeny 7892cc861c Block list extending
Change-Id: Id17efde25fce287296e80f2b37c77b15aa59b561


[ROCm/ROCR-Runtime commit: c533229bc1]
2017-06-23 16:37:02 -04:00
Amber Lin 29f0578061 Use environment variable to force gfx version
For experimental purposes, we need an option to change compute capability
by forcing the GfxIp version. This patch allows the environment
variable HSA_OVERRIDE_GFX_VERSION=major.minor.stepping to replace the
default version. For example:
export HSA_OVERRIDE_GFX_VERSION=9.0.1

Change-Id: I90cfbd43619d9d3aebf53321d4e058f01bcd7088
Signed-off-by: Amber Lin <Amber.Lin@amd.com>


[ROCm/ROCR-Runtime commit: 13aadde56e]
2017-06-22 18:15:17 -04:00
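
A minimal sketch of how a major.minor.stepping override such as HSA_OVERRIDE_GFX_VERSION could be parsed is shown below. The struct, the function names, and the decision to reject trailing characters are assumptions for illustration; the commit message only specifies the variable name and its format.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical holder for a GfxIp version; not the Thunk's actual type. */
struct gfx_version {
    unsigned major;
    unsigned minor;
    unsigned stepping;
};

/*
 * Parse "major.minor.stepping" into *out.
 * Returns 0 on success and -1 if the text is missing or malformed,
 * in which case *out is left untouched.
 */
int parse_gfx_version(const char *text, struct gfx_version *out)
{
    unsigned major, minor, stepping;
    char trailing;

    assert(out != NULL);
    if (text == NULL)
        return -1;
    /* The extra %c only matches if junk follows the stepping number. */
    if (sscanf(text, "%u.%u.%u%c", &major, &minor, &stepping, &trailing) != 3)
        return -1;

    out->major = major;
    out->minor = minor;
    out->stepping = stepping;
    return 0;
}

/* Apply the override if the variable is set; keep the default otherwise. */
int gfx_version_from_env(struct gfx_version *out)
{
    return parse_gfx_version(getenv("HSA_OVERRIDE_GFX_VERSION"), out);
}
```

A caller would fill `out` with the detected version first and then call `gfx_version_from_env`, so an unset or malformed variable leaves the default in place.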
Evgeny 9725267667 minor fixes, debug output, comments, using env vars, dead code
Change-Id: I08ad73b561709c1818d78a9191c96d6ad141a609


[ROCm/ROCR-Runtime commit: 8618bf7e2c]
2017-06-22 18:04:26 -04:00
Evan Quan 17056d07bb Revert "Change gfx900 compute capability to 9.0.1"
This reverts commit 35dcb1c392.


Change-Id: Id9c4f43462820bf09f25674fa30e6eb04098808e
Signed-off-by: Evan Quan <evan.quan@amd.com>


[ROCm/ROCR-Runtime commit: 5b3c9f0b31]
2017-06-20 15:36:09 +08:00
Amber Lin 79eaea7358 Free control stack correctly
ctl_stack_copy is allocated from malloc. It should be freed by free.

Change-Id: Ib924da20200d91f52f106fe173464d47862759a8
Signed-off-by: Amber Lin <Amber.Lin@amd.com>


[ROCm/ROCR-Runtime commit: 6e113e2634]
2017-06-19 09:01:35 -04:00
Ramesh Errabolu 1351f204d4 Support Perf Cntrs (PMC) and Thread Trace (SQTT) over AQL queues
Change-Id: I716b722895d90b46914c31377e791ad602acecc1


[ROCm/ROCR-Runtime commit: 08e0bca567]
2017-06-15 12:58:31 -04:00
Kenny Ho 415027b89f Revert "Implement memory fault analysis through context save area"
This reverts commit 498f3a7188.

Change-Id: Ibf11b764b383b9be291f3009a30550e1a1e2d115


[ROCm/ROCR-Runtime commit: 5b4df54b10]
2017-06-14 14:21:53 -04:00
Evgeny 231d7e8608 GFX8 API
Change-Id: I9d0c430e4199f043226c8897f3320a7973cbdeda


[ROCm/ROCR-Runtime commit: 35b376e2ee]
2017-06-14 12:24:28 -04:00
Jay Cornwall 498f3a7188 Implement memory fault analysis through context save area
When a fatal memory fault occurs the scheduler context-saves all queues
in the process and notifies the runtime through the memory event. The
saved state contains all GPR/LDS data at the moment of the fault.

Retrieve this state and present it to the user if HSA_DEBUG_FAULT is set
to "analyze" and the wavefront caused the fault. If an amdgcn-capable objdump
is in the PATH, invoke it to disassemble code around the PC.

Queue lifetime is now managed by the runtime to allow querying the
context save state for all active queues.

Change-Id: I6fee662fad1c4f9aa125bf5c53d7d0ea1ab32f95


[ROCm/ROCR-Runtime commit: 75c9506f9d]
2017-06-13 23:12:28 -04:00
Amber Lin 35dcb1c392 Change gfx900 compute capability to 9.0.1
9.0.1 is the XNACK-enabled gfx900 compile target. The compiler must generate
ISA that is XNACK-enabled.



Change-Id: Ic4987132ef9f8d06d9e2bcdb8f7eeb875cdd2b44
Signed-off-by: Amber Lin <Amber.Lin@amd.com>


[ROCm/ROCR-Runtime commit: 5114a9368b]
2017-06-13 17:04:42 -04:00