Commit Graph

1261 Commits

Author SHA1 Message Date
Sean Keely 7ea0cd688f Disable sram-ecc reporting via ISA until HCC is fixed.
Change-Id: I0382825884b727173385f04da9f2088650c3ba1d
2019-03-21 17:46:56 -04:00
Sean Keely fd9fb77e28 Do not strip release builds.
Customer request.

Change-Id: Id77dcdc0b6908c7a5e460edfd7d9468a1691e351
2019-03-07 14:04:14 -06:00
Sean Keely 67376e06ab Report SRAM ECC errors through the system event handler.
Modify the system event handler to support multiple users.
Name memory fault reason codes.

Change-Id: I1b5979b36ab15637eb2be59a61e2d57e76d0a70e
2019-02-27 18:08:07 -05:00
Sean Keely 3c3db0243e Loader support for SRAM ECC.
Change-Id: I0c6791c356d9186cc8dabae9fd698b1d4de19b09
2019-02-25 18:30:05 -05:00
Sean Keely c56d86100b Add fine grain vram pool.
Part 1 of 2.
Enables fine grain vram over PCIe based on env flag.
Part 2 will extend to XGMI.

Change-Id: I8ad506e004b398d56d462b0200274eae2293a461
2019-02-21 13:08:11 -05:00
Sean Keely 344d964f9f Suppress exception reporting for well defined invalid signal handles.
hsa_exceptions with empty what() strings will not report in debug builds.

Change-Id: I0d424d3b1d3044808ece1720a460a57d68bf878e
2019-02-15 19:35:57 -05:00
Sean Keely 400304aa10 Stop using ROCm release tags for library version numbers.
Version is now a fixed string that matches previous internal builds.
This also matches released DEB/RPM builds (but not github versions).

Change-Id: Id4819b9de8c855250aadf1a1cebb187b5c031721
2019-02-06 19:22:53 -05:00
Ramesh Errabolu 3fbf03af76 Allows users, via env ROCR_VISIBLE_DEVICES, to surface a subset of GPU devices
Change-Id: I5662639d5d70f054831969669f9d30dec356dd5a

Update per review comments

Change-Id: I18c7d7cb00b261493b61c2cf5454d486166f40d8
2019-02-06 02:02:29 -06:00
Sean Keely 65d39cc476 Unify APU and dGPU initial queue scratch allocation.
Both support dynamic scratch allocation so there is no reason
to preemptively allocate on APUs.

Change-Id: I22eaec01a83a091ee9dc1f594a1a9106e8dd81fc
2019-01-25 02:11:39 -05:00
Jay Cornwall 079eadd71b Remove legacy microcode version check in GpuAgent::InvalidateCodeCaches
Fixes instruction cache invalidation when using microcode branches.

Change-Id: I932676e683983145f5c807204e592fb5e530c8af
2019-01-22 16:39:52 -06:00
Konstantin Zhuravlyov 8bee6e4976 Loader: update symbol processing for v2+
- Skip symbols that are STB_LOCAL and not STT_AMDGPU_HSA_KERNEL

Change-Id: I68567f58de9bf3f07dbd8020ef63f47667c86367
2019-01-18 15:42:28 -05:00
Konstantin Zhuravlyov c1ad82a6b7 Loader updates for code object v3
- Fix loading in some cases
  - Fix symbol kind

Change-Id: I721b4a35972b6d2a6d0ac733ab770b096cc74e17
2019-01-18 15:41:01 -05:00
Ramesh Errabolu 28c3f9a269 Initialize queue buffer with Invalid Pkt Headers
Change-Id: I4166f1359746ee6829b730bac2db358af72ab16e
2018-11-21 19:09:10 -05:00
Sean Keely 8e4177382a Check max wave scratch limits.
HW has limited bits for wave scratch base address stride.  Enforcement
prevents programs with larger than supported scratch allocations from
running and clobbering neighboring scratch space.

Change-Id: I574da888e9d1d5e290a9c0025ba13b5ef9f1e5c0
2018-11-16 20:59:20 -05:00
Sean Keely 269be0be2e Disable forced explicit selection of public vs internal HSA interfaces.
Temporary to reenable OCL builds on TC.

Change-Id: Ia81f2f9a9dd10ae8ce9627313247a586a8711584
2018-11-16 15:26:26 -06:00
Konstantin Zhuravlyov a447d79430 Fix dynamic relocations:
- Process dynamic relocation even if there is
    no symbol associated to it.

Change-Id: Iaefee682ee52f5acda8280e5764e6d5fd992774a
2018-11-14 15:25:41 -05:00
Sean Keely 4e8597681b Cache KFD Events used by user allocated InterruptSignals.
Change-Id: I7f102f880fea9c78febe28cd262f93ee77f03184
2018-11-12 22:37:42 -06:00
Sean Keely 8323b2e1d7 Add pooling for Signal ABI blocks (SharedSignal).
Makes better use of memory and greatly reduces mmap count.

Change-Id: Ib444cd1ccd144986adbcc7cec297a966e2c08bc7
2018-11-12 22:37:28 -06:00
Sean Keely 936ecd1885 Remove legacy SVM region concept.
Also rename blit_agent to region_gpu and add comments to clarify
its role in deprecated region API support rather than to do blits.

Change-Id: I80b1043db2e1c5d40a58fc801eef70a688ea9169
2018-11-09 06:27:53 -06:00
Sean Keely dda9c17b45 Move VM fault handler init to after all devices are registered.
During registration we must not call any function that depends on registered
data as the lists are not yet complete.  This includes signal allocation since
allocating shared GPU mapped memory depends on the list of GPUs.

Change-Id: I94d59e847802c546c2a5a0d9f55fe5ac3fd1d878
2018-11-09 03:10:08 -06:00
Sean Keely 9ec37b5103 Ensure runtime cleanup when hsa_init ref count reaches 0.
Delete the runtime object when the last hsa_shut_down occurs.

Change-Id: I2005d52d06702eaef166714fd5e471cc277924db
2018-10-22 19:32:00 -05:00
Evgeny d788a53972 aqlprofile extension version check
Change-Id: If824764f199eca15a0341cdf6177d8d6353e29f3
2018-10-22 15:36:57 -04:00
Sean Keely 757502ccd6 Report internal queue creation to tools.
Debug agent requires handles to internal queues for single step debugging.
Added tools-only API hsa_amd_runtime_queue_create_register for reporting.

hsa_amd_runtime_queue_create_register sets a callback which is invoked
when internal queues are created.

Change-Id: Ia5190ae724fadba686c15f25b2cd085350eeff0e
2018-10-20 23:12:27 -04:00
Sean Keely 5975c465ad Fully initialize GPU agents before loading tools.
Required because the debug agent needs the copy API and trap handler to be initialized
prior to loading.  Existing tools do not make use of internal queue or scratch
memory intercept which is what PostToolsInit allows.

PostToolsInit() will be removed in a following cleanup change.

Change-Id: If43377843808e3eff0defd9204910a67a852902f
2018-10-20 23:12:14 -04:00
Sean Keely 6852282a07 Refactor of Runtime::CopyMemory()
Change-Id: I32a7cb24d00660ff4471d121ef7b3c2eec8fced2
2018-10-20 14:38:50 -04:00
Konstantin Zhuravlyov 509bb777e0 Loader: Update license for AMDHSAKernelDescriptor.h
Change-Id: I3a48b595ba089ca8a25f878c056b04a417a2364f
2018-10-12 14:51:05 -04:00
Sean Keely 1e0d690948 Use ptrinfo rather than apertures in hsa_memory_copy
Apertures now overlap with the change to 48bit addressing which
precludes using aperture checks to discover buffer ownership.
Switches to ptrinfo to decide which device a buffer is owned by.

This corrects faults in the legacy hsa_memory_copy api.

Change-Id: I5c7ce0216e1cdc96f836fc6fec9c3defdf4b9d90
2018-10-11 13:34:53 -04:00
Konstantin Zhuravlyov 386874da55 Loader: Add support for v3 object code.
Change-Id: I7215bd0c1277c2036bf0fadf5b23cb57fdf7f665
2018-10-06 14:01:59 -04:00
Jay Cornwall f1ffbc3286 Revert "Extend SDMA disable list until firmware stability resolved"
This reverts commit 5e1ccdc4a9.

Change-Id: I17b379e4d0e49a79dc8d4a60f01ea424fda24f02
2018-10-05 15:17:27 -04:00
Kent Russell ed9baefd75 Only remove ldconf on uninstall
On update, the removal will occur AFTER the new package is installed,
due to some stupidity with how yum/rpm does things. Only remove it if
we're doing a pure uninstall

Change-Id: I4982610828d8bc1f2d8691b1e4ee1718c89413cc
2018-10-03 08:10:06 -04:00
Evgeny fdbe277f2a hsa_ven_amd_aqlprofile_pfn_t alias
Change-Id: Ia4a67ef0d2f8975f0e541e85c215afec76e9de5f
2018-09-26 14:10:21 -04:00
Scott Linder 47f0e6f7d3 Apply dynamic relocations for STT_FUNC symbols
Required to support function calls through GOT table.

Change-Id: I174a0269fdd67369d38fe41855b7bd01f350b839
2018-09-23 21:42:32 -04:00
Ramesh Errabolu 01eea21d6c Capture number of Numa Nodes present on system
Change-Id: Ic789a6b9da8e316cb483e50b0fe9faa03798f97c
2018-09-18 16:27:30 -05:00
Ramesh Errabolu f007870792 ROCr changes to enable small BAR P2P over xGMI
Change-Id: I6aaa3fe2565cdf7e15d58a7484d6bd5916ffff64
2018-09-17 22:54:40 -04:00
Evgeny 81532bb6f5 VERSION_MINOR macro typo fix
Adds aqlprofile info ENABLE_CMD enum.

Change-Id: I7b19082144d2bd0bf7af7ddc282358168b225759
2018-09-17 20:49:47 -04:00
Sean Keely 3357cadeec Check fill addresses for alignment.
Check was documented but missing.

Change-Id: I97951635d794fd22e20c25d20e9d0e35035254af
2018-09-05 16:34:19 -04:00
Sean Keely 2843988dd7 Remove redundant initialization.
LinkInfo is already initialized to zero in its default constructor.

Change-Id: Ifa4fb886cce9b474c6879c9c82744044ab394082
2018-08-29 19:36:07 -04:00
Sean Keely 56ed5c8904 Refactor blocking sdma commands.
Remove fence pool and use two signals.  Two signals allows overlapped
submission and copy while reducing thread busy polling.

Change-Id: Idb5f8e4c7f482a596ffce9e7799191fdd785a216
2018-08-29 19:13:23 -04:00
Sean Keely e0839ab27e Implement SDMA copy rect for gfx9.
Fix pitch overflow due to small element detection.
Add wide pitch 2D copy handling.
Cleanup code duplication.

Change-Id: I93b1584aba8e5964957eb7ab3544df806ca3e2f9
2018-08-29 19:13:07 -04:00
Sean Keely aca00b7238 Add debug checking of time stamps validity.
Can only check that the signal has some time stamp, can't check if
the translating agent matches the last used agent or not.

Change-Id: I62943a864318808059c617280bb65a269dfadd1b
2018-08-26 12:36:35 -04:00
Sean Keely cd8e5c1da8 Expose ROCr build ID.
Adds HSA_AMD_SYSTEM_INFO_BUILD_VERSION=0x200 to hsa_system_info_t.
This returns a const char* pointing at the build string (git describe).

Change-Id: I73e6612482bf6ffc4037fd365808eb9211a650ad
2018-08-20 20:44:32 -05:00
Sean Keely 6c47780620 Experimental flag to swap copy agent for async copy APIs.
Adds env flag HSA_REV_COPY_DIR.  If set to 1 async copy will
copy from dst device to src device rather than from src to dst.

Change-Id: I3095642066fa026dc112c2eac06db9393341cd7e
2018-08-09 10:58:14 -04:00
Jay Cornwall 5e1ccdc4a9 Extend SDMA disable list until firmware stability resolved
Change-Id: I5e21cb761ae970ba2b68edd97b1564b36ca1f0f4
2018-08-08 11:20:14 -05:00
James Edwards 4d7d50feba Add tools headers and library back to packaging.
Change-Id: If6c9befe50fc111eb154bd5b4eb5c7858f5d510b
2018-07-16 16:51:12 -04:00
Sean Keely 35a270ef7e Do not initialize runtime internal queues based on mapping memory to a GPU.
Conserves VMIDs when multiple processes are in use and memory operations
are not GPU specific.  For instance HIP API hipHostMalloc does not accept
a target GPU so when used with one process per GPU (i.e. GPU == MPI rank) we can
quickly exceed the available VMID slots if every process consumes a VMID on
every GPU.

Change-Id: Ib6fa051290089f71581029c09f9a44b9992237d1
2018-07-13 19:58:45 -04:00
Sean Keely c6cf161125 Fix git describe command to retrieve version tags correctly.
Change-Id: I904f5ccdb88c1e28d5eeffd104174fcd57626ee7
2018-07-10 20:19:04 -05:00
Wilkin 170e2a142f OpenCL BLIT for Image library
- include support for gfx702

Change-Id: If681a4eef9bd076e25300e1c1bca55b4f7c92b46
2018-07-06 10:35:44 -04:00
James Edwards 58a411dd36 Change packaging for rocr-dev and rocr-ext.
Change-Id: Ia096a2d31ddd7bef2e05bb3d6c58e94d8c339598
2018-07-02 13:40:45 -05:00
Jay Cornwall e388a23344 Add hsa_amd_queue_set_priority extension function
Controls dispatch and wavefront scheduling arbitration across queues.

Change-Id: I498f4898b544f79b8fb8514bf7e789ca9da29462
2018-06-19 19:41:28 -05:00
Sean Keely 3e3aa37750 Enable SDMA use without platform atomic support.
SDMA will use atomic completion fences if KFD reports 64bit atomic support.
Otherwise it will fall back to store completion fences.

Change-Id: I12b76f8a74ec3ee96372c250f9824d846051536e
2018-06-12 15:38:44 -04:00