Commit graph

2930 commits

Author SHA1 Message Date
Christophe Paquot a96dc6e41b Update addrlib for gfx900.
Change-Id: I2b7b6093406c5498e9a551327701ad8973f1cf3a


[ROCm/ROCR-Runtime commit: 29894df0b5]
2017-03-07 14:41:16 -08:00
Amber Lin 68315e3157 Support profiling on gfx900
Add gfx900 to PMC support. This patch lists SQ counters.

Change-Id: Ia1e60e76ff71ab2e38d9d5de12ac9d527b3e8c6a


[ROCm/ROCR-Runtime commit: 2c2b1e0db2]
2017-03-07 14:30:40 -05:00
Amber Lin ad5c8c4844 Don't duplicate PMC tables
Many devices have the same counter IDs for a given hardware block. Devices
in the same GFX generation usually have the same block counters, so there is
no need to list each device individually. Instead, have one table shared by
all devices that have the same counter IDs, and separate tables for
devices that don't.

Change-Id: I857056edc6f491f61af6e9598580e5dc7d372f94


[ROCm/ROCR-Runtime commit: 9e32cdb113]
2017-03-07 11:31:23 -05:00
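
The table sharing described in the "Don't duplicate PMC tables" commit above could look roughly like the following C sketch. Everything in it is an assumption made for illustration: the device names (`DEV_A`, `DEV_B`, `DEV_C`), the struct layout, the function name `counter_table_for`, and the counter IDs are hypothetical and are not taken from the thunk sources.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical device list: DEV_A and DEV_B stand for two devices of the
 * same GFX generation, DEV_C for a device with different counter IDs. */
enum device { DEV_A, DEV_B, DEV_C, DEV_COUNT };

struct counter_table {
    const uint32_t *ids;
    size_t num_ids;
};

/* Placeholder counter IDs, invented for illustration only. */
static const uint32_t shared_ids[] = { 0, 1, 2 };
static const uint32_t dev_c_ids[] = { 0, 1, 2, 3, 4 };

static const struct counter_table shared_table = {
    shared_ids, ARRAY_LEN(shared_ids)
};
static const struct counter_table dev_c_table = {
    dev_c_ids, ARRAY_LEN(dev_c_ids)
};

/* Each device points at a table; devices with identical IDs share one. */
static const struct counter_table *const device_tables[DEV_COUNT] = {
    [DEV_A] = &shared_table,
    [DEV_B] = &shared_table,
    [DEV_C] = &dev_c_table,
};

const struct counter_table *counter_table_for(enum device dev)
{
    assert((unsigned)dev < DEV_COUNT);
    return device_tables[dev];
}
```

With this layout, adding another device that reuses existing counter IDs only takes one more pointer entry, not a copy of the table.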
Amber Lin 0b074112f2 Add gfx803 DID
Add 0x67D0 to gfx803 support list.

Change-Id: Ifdb1fad4a3c42bea54856f6d5248c00ed546ad85


[ROCm/ROCR-Runtime commit: b3b6367cb8]
2017-03-07 07:25:49 -05:00
Amber Lin 4ddf99f4a9 Unify the device ID list
Consolidate the supported device ID lists, previously spread across topology,
queue, and pmc, into one place: topology.

Change-Id: If035cf8e4a6fc6caff6c94ec627647cfb11c3d79


[ROCm/ROCR-Runtime commit: 4827b09119]
2017-03-06 16:26:51 -05:00
Amber Lin 9b6439a5bb Make the lock file writable by others
Though the S_IWOTH flag is set in the open() call, the lock file is not
created as accessible by others, so others fail to open the file with
O_RDWR. This is because the default umask masks off S_IWOTH. This patch
changes the umask to S_IXOTH, since others don't need that permission, which
opens up S_IWOTH. The umask is restored to its original value after the file
is opened.

Change-Id: I8a239e1566ce0b0b18821913385f239db7c3588e


[ROCm/ROCR-Runtime commit: 1a8a9cb57b]
2017-03-03 11:05:13 -05:00
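
The umask handling described in the lock-file commit above can be sketched in C as follows. This is a minimal illustration, not the actual thunk code: the function name `open_shared_lock_file` and the exact mode bits passed to open() are assumptions.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Open (creating if needed) a lock file that other users can open O_RDWR.
 * Returns the file descriptor, or -1 on error. Hypothetical helper. */
int open_shared_lock_file(const char *path)
{
    assert(path != NULL);

    /* Mask off only "execute for others", so S_IWOTH is no longer masked. */
    mode_t old_mask = umask(S_IXOTH);

    int fd = open(path, O_CREAT | O_RDWR,
                  S_IRUSR | S_IWUSR | S_IRGRP | S_IWGRP | S_IROTH | S_IWOTH);

    /* Restore the caller's umask whether or not open() succeeded. */
    umask(old_mask);
    return fd;
}
```

Note that umask() is process-wide, so in a multithreaded process another thread creating a file in the short window between the two umask() calls would see the temporary mask.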
Amber Lin 492c9623eb Implement Start/Stop/Query Trace
StartTrace and StopTrace send ioctl requests to enable/disable performance
counters. QueryTrace reads the counter from the perf_event fd.

Change-Id: Ibf79675bc23fcf129371bfd100f8e262121bc684


[ROCm/ROCR-Runtime commit: e17c67f049]
2017-03-02 14:00:25 -05:00
James Edwards e7d887e83b Fix permissions on hsakmt include files.
Change-Id: I1d428e60268e6d2de6776ff5f16d03503d00ddcc


[ROCm/ROCR-Runtime commit: ec84fbe264]
2017-03-02 12:00:09 -06:00
Ramesh Errabolu a9a54e2cc3 Extend Rocr Samples to allow collection of Perf Cntrs
Change-Id: I9c7e75128fca28b23ec54efab00bf5d32c95a877


[ROCm/ROCR-Runtime commit: 315ae6439b]
2017-02-28 20:29:24 -06:00
Kent Russell 30e1e20ede fmm.c: Disable userptr for paged mem by default
Unless HSA_USERPTR_FOR_PAGED_MEM is explicitly set, don't use userptr
for all paged memory. This will also allow us to work around some 4.9
issues, and then we can explicitly set HSA_USERPTR_FOR_PAGED_MEM for
all usage once those issues are resolved.

Change-Id: I25ce22b73ae6e93f1567f2318d9d2b47d4a44e69


[ROCm/ROCR-Runtime commit: c991951288]
2017-02-28 16:09:27 -05:00
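
The opt-in behaviour described in the fmm.c commit above amounts to reading an environment variable and defaulting to "off" when it is absent. A minimal C sketch follows; the helper name `env_flag_enabled` and the parsing rule (unset, empty, or "0" means disabled) are assumptions, since the commit message does not say how the thunk interprets the value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Returns true only if the variable is set to a non-empty value other
 * than "0". This parsing rule is an assumption made for illustration. */
bool env_flag_enabled(const char *name)
{
    assert(name != NULL);

    const char *value = getenv(name);
    if (value == NULL || value[0] == '\0')
        return false;
    return strcmp(value, "0") != 0;
}
```

A caller would evaluate `env_flag_enabled("HSA_USERPTR_FOR_PAGED_MEM")` once at initialization and use userptr for paged memory only when it returns true.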
shaoyun.liu d394e88cbf Thunk: Don't allocate extra control stack memory for gfx900
The control stack memory for CWSR is allocated in the kernel together with
the MQD allocation.

Change-Id: Ib1c0ab9402df3431e9555649394320380d6c6dd8
Signed-off-by: shaoyun.liu <shaoyun.liu@amd.com>


[ROCm/ROCR-Runtime commit: 116e5c5e8b]
2017-02-27 10:39:05 -05:00
Felix Kuehling a6c90a894a gfx900: Allow doorbell allocation independent of queue ID
On SOC15 chips, the ABI for the create_queue ioctl is changed to
allow doorbell allocation independent of the queue ID. This is
necessary to accommodate doorbell routing to specific engines in
the BIF.

Change-Id: Ie98d0a758758149dd5fc09ae088afccc29904124
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>


[ROCm/ROCR-Runtime commit: 7de66d149b]
2017-02-27 10:39:05 -05:00
Felix Kuehling ff41927a88 Allocate 64-bit for doorbells and write pointers
On gfx900 we need 64-bit for all doorbells and SDMA WPTRs.

Change-Id: I9b922e16442e967599ae3c928308451d5cc470b3
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>


[ROCm/ROCR-Runtime commit: d7063dd102]
2017-02-27 10:39:05 -05:00
Felix Kuehling adb4539c99 Use KFD_IOC_ALLOC_MEM_FLAGS_COHERENT for fine-grained memory
Use KFD_IOC_ALLOC_MEM_FLAGS_COHERENT when allocating fine-grained
memory and doorbell BOs so that they will be mapped with MTYPE_UC
on GFX9 hardware.

Change-Id: I51adf45b13105f479e6bcdaf54955b467920ee9a
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Shaoyun Liu <Shaoyun.Liu@amd.com>


[ROCm/ROCR-Runtime commit: 8cb89b6926]
2017-02-27 10:39:05 -05:00
Felix Kuehling 3d88b3571b Update kfd_ioctl.h
Copied from kernel repository.

Change-Id: I9ed021cfb5b297d9a91dce93ed6355c95fb1127b
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Shaoyun Liu <Shaoyun.Liu@amd.com>


[ROCm/ROCR-Runtime commit: e5dd2f88c6]
2017-02-27 10:39:05 -05:00
Felix Kuehling 3ca06e0e94 Make doorbell-size ASIC specific
This is in preparation for gfx900, which uses 64-bit doorbells. We
maintain the same number of doorbells per process by making the
doorbell page size bigger.

KFD will need to implement the same rule.

Change-Id: I3c4110869b191b83943b5a390a48edfc94d941d8


[ROCm/ROCR-Runtime commit: 48207af92a]
2017-02-27 10:39:05 -05:00
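
The rule in the "Make doorbell-size ASIC specific" commit above is simple arithmetic: if the number of doorbells per process stays fixed, the doorbell page size scales with the doorbell width. The C sketch below illustrates this; the count of 1024 doorbells per process and the function name `doorbell_page_size` are assumptions chosen for the example, not values stated in the commit.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed for illustration: a fixed number of doorbells per process. */
#define DOORBELLS_PER_PROCESS 1024u

/* Doorbell page size in bytes for a doorbell width of 4 or 8 bytes. */
size_t doorbell_page_size(size_t doorbell_bytes)
{
    assert(doorbell_bytes == 4 || doorbell_bytes == 8);
    return (size_t)DOORBELLS_PER_PROCESS * doorbell_bytes;
}
```

Under these assumptions, 32-bit doorbells give a 4096-byte page and 64-bit doorbells give an 8192-byte page, with the same doorbell count in both cases.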
Amber Lin 77b6cca404 Add gfx900 support
Add the gfx900 device to the supported device list.

Change-Id: I71f30ef43e5e0ef0e7b5f18205b6cc4767d9d861


[ROCm/ROCR-Runtime commit: 9ba2b68fdb]
2017-02-27 10:39:05 -05:00
Amber Lin 39ad1ba008 Implement PMC AcquireTrace
Existing code uses lockf to ensure exclusive PMC access for one process and
one TraceId. However, the Thunk spec allows hsaKmtPmcAcquireTraceAccess to
get exclusive access to the defined set of counters, not exclusive to one
process or one TraceId. Multiple counter sets of multiple TraceIds are
allowed if they meet the concurrent access limit evaluated by the
hardware/driver.

Change-Id: I59cacb855a707fe326a4070452fcbbd3c95ac223


[ROCm/ROCR-Runtime commit: 1025579c0b]
2017-02-27 09:33:58 -05:00
Felix Kuehling c28c344f78 Avoid COW after fork for API-allocated system memory
Change-Id: I5c7175114c4e6411d3beb5557e16cb71ddb01189
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>


[ROCm/ROCR-Runtime commit: 64104fc8d9]
2017-02-23 10:28:45 -05:00
Amber Lin 3bafcdb328 Support multiple blocks in RegisterTrace
Existing code assumes all counters sent to hsaKmtPmcRegisterTrace belong
to one PMC block and that this block is SQ. This patch handles cases where
counters are in different blocks, and removes the hard-coded SQ. As a
matter of fact, SQ is non-privileged, so the user shouldn't even use SQ
counters to register/release a trace. This patch also ignores
non-privileged blocks, as the HSA Thunk spec describes.

This patch also records counters information in trace structure so
AcquireTrace can get counters information using that TraceId.

Change-Id: Ifa5741050553d4615baab01f7485a9e09435b019
Signed-off-by: Amber Lin <Amber.Lin@amd.com>


[ROCm/ROCR-Runtime commit: cb60c5f18a]
2017-02-21 15:32:18 -05:00
James Edwards ccbb47d79d Update readme regarding CMAKE_PREFIX_PATH.
Change-Id: I322789f38b1984b2527554c10cb0f3be886d3e91


[ROCm/ROCR-Runtime commit: 470750cc3c]
2017-02-20 14:33:53 -06:00
Harish Kasiviswanathan b7f6ed08ee Add API entrypoints for Cross Memory Attach
Implement two new APIs for cross memory read and write operations.
 - hsaKmtProcessVMRead
 - hsaKmtProcessVMWrite

Add the new ioctls necessary for the above APIs.

Change-Id: I0c153e3b4e1f32b7a8b102ad5c774d9ae9bfc2fa
Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>


[ROCm/ROCR-Runtime commit: e79521b556]
2017-02-17 16:59:51 -05:00
James Edwards 1bb65fb587 Modify packaging cmake files to use BUILD_VERSION* instead of RELEASE_VERSION*.
Change-Id: I6f1b83c9faf0d40c1ac27d8998f4651341971b1b


[ROCm/ROCR-Runtime commit: 57ac399652]
2017-02-16 16:40:20 -06:00
ozeng b6c8f1f4c9 libkmt: Change files mode back to 644
events.c and queues.c were accidentally changed to 755 by change
fc70f0c30976f4021f7d763bfc10d76a76029553. Change them back.

Change-Id: If51c0b91139afc23e9051cf94c83d61fc20297e6
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>


[ROCm/ROCR-Runtime commit: b3c3f7bae1]
2017-02-16 15:09:26 -05:00
Ramesh Errabolu 513004fb28 Enable code for Perf Cntrs for gfx900 - AI family device
Change-Id: I4659da1a8db17392016fc90c8ea19b5805b5d3aa


[ROCm/ROCR-Runtime commit: 42e751519b]
2017-02-15 21:50:23 -06:00
James Edwards 212d7c3f8f Fixes to HSA CMakeList.txt files
Change-Id: Idd176d24dfd22bd9a6a8860ab035fd5d1aca756d


[ROCm/ROCR-Runtime commit: 900f272622]
2017-02-15 08:26:30 -06:00
James Edwards fca0ec886c Fix TeamCity builds using utils.cmake.
Change-Id: Id15a911dad06643d9457cc4d8c907fc5796772ee


[ROCm/ROCR-Runtime commit: b7e06b471c]
2017-02-14 15:36:21 -06:00
James Edwards edac49b876 Add --dirty tag to utils.cmake
Change-Id: Ib2487eade8d88530df34dfb8e9b442e547e9f52d


[ROCm/ROCR-Runtime commit: acb1f0b689]
2017-02-14 11:19:33 -06:00
James Edwards f851f5adf6 Modify hsa CMakeList.txt file to use PREFIX_PATH and git describe versioning.
Change-Id: I08668df07725369ecf8a2f35e74dd7d64c8ca94b


[ROCm/ROCR-Runtime commit: 73e942cd8a]
2017-02-14 08:39:16 -05:00
James Edwards e716770c85 Add libpci to CMakeLists.txt for thunk.
Change-Id: I0228035ce769feaf54cbca75f076f73614cbb9cc


[ROCm/ROCR-Runtime commit: 9e81f0f5e2]
2017-02-11 16:24:54 -06:00
Konstantin Zhuravlyov 2a071e1456 Bring loader in sync with stg/sc
Change-Id: Iccce07b8fa03d37c4267a2a9bd343e6614dc43e7


[ROCm/ROCR-Runtime commit: 9887c26113]
2017-02-10 11:21:15 -05:00
James Edwards 290cae8d87 Fix file stripping for release builds.
Change-Id: I538c366f0992980ffff1ef337807035b9378845c


[ROCm/ROCR-Runtime commit: a5a30e8199]
2017-02-09 14:42:14 -06:00
Jay Cornwall 16d8e2caa9 Implement code cache invalidation for Gfx9
When a new enough microcode build is running, use a vendor AQL packet
to submit the PM4 IB.

Change-Id: Icd3e2b322c418477420ba4a29f4455ce340ef0d2


[ROCm/ROCR-Runtime commit: 4d62b9482a]
2017-02-09 14:15:21 -05:00
James Edwards 163ae6aa9c Update libhsakmt CMakeList.txt file for tag based versioning and proper packaging
Change-Id: I63e82deefa8377ced810d99b5b2f0457299048a6


[ROCm/ROCR-Runtime commit: bc44715be2]
2017-02-08 14:42:29 -06:00
Ramesh Errabolu 34327e7167 Support Gfx9 and Pre-Gfx9 Thread Trace Drivers
Change-Id: Ic3fea4006d76d1e3f58dde6f64c343a1261abe39


[ROCm/ROCR-Runtime commit: e91970a39b]
2017-02-07 15:39:59 -06:00
Sean Keely de0fa51022 Fix Api table copy operation and tools version checking.
Change-Id: Ia76d16f3ea6d0abb931813f90bc3bc2119da5999


[ROCm/ROCR-Runtime commit: 505d722b7d]
2017-02-07 14:26:20 -05:00
Felix Kuehling 26a4ba78d4 Free BOs before munmapping
This avoids unnecessary evictions and failed restores due to the
munmap of userptr BOs that are just about to be freed.

Change-Id: Icf2f0b73991455556a201c54c05ea7e20af80f47
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>


[ROCm/ROCR-Runtime commit: 74ebfca9f0]
2017-02-07 10:45:37 -05:00
Sean Keely d984194fa5 Use fixed size type for queue type arguments.
Change-Id: I81b605c9cc9b18bcef043a4f0292212241ce5987


[ROCm/ROCR-Runtime commit: bc43f97964]
2017-02-07 01:22:30 -05:00
ozeng 6c1bd8034c libkmt: Misc fixes in thunk
1. Translate thunk queue priority to kfd priority
2. Initialize event SyncVar
3. Added HSAint32 data type


Change-Id: I7decc1be7cbe9c84cb670d9a7c99050b62ba98f3


[ROCm/ROCR-Runtime commit: cb0f851560]
2017-02-06 17:19:40 -05:00
Ramesh Errabolu 835866b042 Refactor Thread Trace Service as an independent library
Change-Id: Ia7579bc16626f3e21c8df50f8a35cb4b82f6bda9


[ROCm/ROCR-Runtime commit: 74e3a49b20]
2017-02-06 17:04:07 -05:00
Jay Cornwall dce2a864ba Fix Gfx9 write pointer setup
Should point directly to amd_queue_t.write_dispatch_id. Only noticeable
with HWS enabled which is not yet stable.

Change-Id: I169906d45225379a3ca2729ff04d298fdbb9a9fb


[ROCm/ROCR-Runtime commit: 28f51d5808]
2017-02-06 14:06:12 -05:00
Amber Lin 00e295eff3 Add IOMMU to performance counter table
Add IOMMUv2 to blocks returned by hsaKmtPmcGetCounterProperties(). IOMMU
information is read from sysfs.

Change-Id: I3a1c6f902f947913570a78700fc0ffc444e1dd72
Signed-off-by: Amber Lin <Amber.Lin@amd.com>


[ROCm/ROCR-Runtime commit: 9dadac6dc9]
2017-02-03 14:35:27 -05:00
Ramesh Errabolu 4a70f68afb Refactor Device Id to Asic Type Mapping Service
Change-Id: I8969b41f7c4de9fdeee5131e2049053a486f64fb


[ROCm/ROCR-Runtime commit: 7755bd5487]
2017-02-03 14:22:32 -05:00
Amber Lin 67049d7175 Replace spaces with tabs
Thunk follows Linux kernel coding convention to use tabs instead of
spaces.

Change-Id: I4eddcfa9a0513f16c869d9cc63f9f1dae0c39f83
Signed-off-by: Amber Lin <Amber.Lin@amd.com>


[ROCm/ROCR-Runtime commit: d4dbf562a9]
2017-02-03 10:13:24 -05:00
Felix Kuehling 73c7c197af Allocate paged system memory as userptr
Change-Id: I0864e678681788df37eccd9d7ebc70086e1f93bf


[ROCm/ROCR-Runtime commit: a90abcb317]
2017-02-02 10:32:53 -05:00
Amber Lin e4e6d49bfa Sync up gfx803 DIDs with KFD
Add gfx803 10/11 device IDs that were recently added to KFD.

Change-Id: Id40b117ae47bacedefa6e333fdfdf58dea92cd2d
Signed-off-by: Amber Lin <Amber.Lin@amd.com>


[ROCm/ROCR-Runtime commit: 72b842a6dc]
2017-02-01 12:05:24 -05:00
Jay Cornwall 7b45ebfe7f Add support for ARM
- Build system fixes
- No user-mode high-precision timer by default, use clock_gettime
- Use C11 aligned_alloc pending C++17 std::aligned_alloc

Change-Id: I268365bdfd11d1e817a89584b9e086ee5b86e1dc


[ROCm/ROCR-Runtime commit: 9e575ea96a]
2017-01-31 16:43:49 -08:00
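
Two of the portability points in the "Add support for ARM" commit above, falling back to clock_gettime for timing and using C11 aligned_alloc, can be sketched in C as shown below. The function names `timestamp_ns` and `alloc_aligned`, the choice of CLOCK_MONOTONIC, and the size rounding are assumptions for illustration; the commit does not say which clock or wrapper the runtime uses.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <time.h>

/* Timestamp in nanoseconds read from the OS clock. CLOCK_MONOTONIC is an
 * assumed choice; the commit only names clock_gettime. */
uint64_t timestamp_ns(void)
{
    struct timespec ts;
    int rc = clock_gettime(CLOCK_MONOTONIC, &ts);

    assert(rc == 0);
    (void)rc; /* silence "unused" warnings when NDEBUG disables assert */
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Allocate with C11 aligned_alloc. C11 asks for the size to be a multiple
 * of the alignment, so the size is rounded up here. The alignment must be
 * a power of two. Release the memory with free(). */
void *alloc_aligned(size_t alignment, size_t size)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);

    size_t rounded = (size + alignment - 1) / alignment * alignment;
    return aligned_alloc(alignment, rounded);
}
```

Memory returned by `alloc_aligned` is released with plain `free()`, which keeps call sites identical across architectures.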
Jay Cornwall 7f38f6e297 Add support for gfx900
- Route AQL doorbell directly to HW doorbell
- Reuse precompiled Gfx8 shaders on Gfx9 (ISA is compatible)
- Add a warning for unimplemented code cache invalidation

Change-Id: I92096584a1404e35779c96ae6bdc3e0e7fd04721


[ROCm/ROCR-Runtime commit: 7e0a5f9c00]
2017-01-30 16:36:28 -06:00
Sean Keely 2755a4e0a7 Remove old names from API table interfaces.
Change-Id: I41ca38b596e1dac85e871f583e3ffe7078b790e7


[ROCm/ROCR-Runtime commit: 796d31d94d]
2017-01-27 17:45:26 -06:00
Christophe Paquot 2217e709da Add the new APIs to the .def files and fix a couple of things
Change-Id: I247a60b8cdbd4acfed72fb6d78ac7faf69d8a556


[ROCm/ROCR-Runtime commit: 9ae1e15750]
2017-01-27 16:51:50 -05:00