Commit Graph

443 Commits

Author SHA1 Message Date
Sean Keely f7dbaf103b Add fine grain vram pool.
Part 1 of 2.
Enables fine grain vram over PCIe based on env flag.
Part 2 will extend to XGMI.

Change-Id: I8ad506e004b398d56d462b0200274eae2293a461


[ROCm/ROCR-Runtime commit: c56d86100b]
2019-02-21 13:08:11 -05:00
Sean Keely c66675d192 Suppress exception reporting for well defined invalid signal handles.
hsa_exceptions with empty what() strings will not report in debug builds.

Change-Id: I0d424d3b1d3044808ece1720a460a57d68bf878e


[ROCm/ROCR-Runtime commit: 344d964f9f]
2019-02-15 19:35:57 -05:00
Sean Keely 3183fea597 Stop using ROCm release tags for library version numbers.
Version is now a fixed string that matches previous internal builds.
This also matches released DEB/RPM builds (but not github versions).

Change-Id: Id4819b9de8c855250aadf1a1cebb187b5c031721


[ROCm/ROCR-Runtime commit: 400304aa10]
2019-02-06 19:22:53 -05:00
Ramesh Errabolu 30363b9dbb Allows users, via env ROCR_VISIBLE_DEVICES, to surface a subset of GPU devices
Change-Id: I5662639d5d70f054831969669f9d30dec356dd5a

Update per review comments

Change-Id: I18c7d7cb00b261493b61c2cf5454d486166f40d8


[ROCm/ROCR-Runtime commit: 3fbf03af76]
2019-02-06 02:02:29 -06:00
Chris Freehill 4661538a0a Fix boolean semantic error
Change-Id: Ic927370d5874af3f33105fca6ee0b581ebc6fa08


[ROCm/ROCR-Runtime commit: 014945310a]
2019-01-31 14:03:48 -06:00
Chris Freehill 8f1a9494cd Fix async memory test; temporarily disable NUMA memory test
Change-Id: I1c0618f5dba513c1cf8fafb5fc64e5c811df8454


[ROCm/ROCR-Runtime commit: 626f13a88b]
2019-01-31 09:14:54 -05:00
Sean Keely f39ed0dc23 Unify APU and dGPU initial queue scratch allocation.
Both support dynamic scratch allocation so there is no reason
to preemptively allocate on APUs.

Change-Id: I22eaec01a83a091ee9dc1f594a1a9106e8dd81fc


[ROCm/ROCR-Runtime commit: 65d39cc476]
2019-01-25 02:11:39 -05:00
Jay Cornwall 268ae92794 Remove legacy microcode version check in GpuAgent::InvalidateCodeCaches
Fixes instruction cache invalidation when using microcode branches.

Change-Id: I932676e683983145f5c807204e592fb5e530c8af


[ROCm/ROCR-Runtime commit: 079eadd71b]
2019-01-22 16:39:52 -06:00
Konstantin Zhuravlyov a506e18fd2 Loader: update symbol processing for v2+
- Skip symbols that are STB_LOCAL and not STT_AMDGPU_HSA_KERNEL

Change-Id: I68567f58de9bf3f07dbd8020ef63f47667c86367


[ROCm/ROCR-Runtime commit: 8bee6e4976]
2019-01-18 15:42:28 -05:00
Konstantin Zhuravlyov 564ac4b348 Loader updates for code object v3
- Fix loading in some cases
  - Fix symbol kind

Change-Id: I721b4a35972b6d2a6d0ac733ab770b096cc74e17


[ROCm/ROCR-Runtime commit: c1ad82a6b7]
2019-01-18 15:41:01 -05:00
Chris Freehill c848f8a365 Decrease test size for emulation runs
Decrease number of iterations and array sizes in some cases.

Change-Id: I1a0a43faa907b28662ff3a44c172950ed7b1500e


[ROCm/ROCR-Runtime commit: 6bca866e6c]
2019-01-14 21:23:04 -05:00
Ramesh Errabolu efc2ac9024 Initialize queue buffer with Invalid Pkt Headers
Change-Id: I4166f1359746ee6829b730bac2db358af72ab16e


[ROCm/ROCR-Runtime commit: 28c3f9a269]
2018-11-21 19:09:10 -05:00
Mark Searles 508124a012 Force object code v2 until v3 is supported
Change-Id: I4c2a64bf9bd515686d1f1d90aece2a9ac40e5685


[ROCm/ROCR-Runtime commit: 8ea836017a]
2018-11-21 10:06:08 -08:00
Sean Keely d79cd9abf3 Check max wave scratch limits.
HW has limited bits for wave scratch base address stride.  Enforcement
prevents programs with larger than supported scratch allocations from
running and clobbering neighboring scratch space.

Change-Id: I574da888e9d1d5e290a9c0025ba13b5ef9f1e5c0


[ROCm/ROCR-Runtime commit: 8e4177382a]
2018-11-16 20:59:20 -05:00
Sean Keely d5c5f476fb Disable forced explicit selection of public vs internal HSA interfaces.
Temporary to reenable OCL builds on TC.

Change-Id: Ia81f2f9a9dd10ae8ce9627313247a586a8711584


[ROCm/ROCR-Runtime commit: 269be0be2e]
2018-11-16 15:26:26 -06:00
Konstantin Zhuravlyov fde14b8588 Fix dynamic relocations:
- Process dynamic relocation even if there is
    no symbol associated to it.

Change-Id: Iaefee682ee52f5acda8280e5764e6d5fd992774a


[ROCm/ROCR-Runtime commit: a447d79430]
2018-11-14 15:25:41 -05:00
Sean Keely 799e40f3b9 Cache KFD Events used by user allocated InterruptSignals.
Change-Id: I7f102f880fea9c78febe28cd262f93ee77f03184


[ROCm/ROCR-Runtime commit: 4e8597681b]
2018-11-12 22:37:42 -06:00
Sean Keely ed18ee7f38 Add pooling for Signal ABI blocks (SharedSignal).
Makes better use of memory and greatly reduces mmap count.

Change-Id: Ib444cd1ccd144986adbcc7cec297a966e2c08bc7


[ROCm/ROCR-Runtime commit: 8323b2e1d7]
2018-11-12 22:37:28 -06:00
Sean Keely 9652ba6de2 Remove legacy SVM region concept.
Also rename blit_agent to region_gpu and add comments to clarify
its role in deprecated region API support rather than to do blits.

Change-Id: I80b1043db2e1c5d40a58fc801eef70a688ea9169


[ROCm/ROCR-Runtime commit: 936ecd1885]
2018-11-09 06:27:53 -06:00
Sean Keely 3f55198dd5 Move VM fault handler init to after all devices are registered.
During registration we must not call any function that depends on registered
data as the lists are not yet complete.  This includes signal allocation since
allocating shared GPU mapped memory depends on the list of GPUs.

Change-Id: I94d59e847802c546c2a5a0d9f55fe5ac3fd1d878


[ROCm/ROCR-Runtime commit: dda9c17b45]
2018-11-09 03:10:08 -06:00
Sean Keely 37aead15c7 Ensure runtime cleanup when hsa_init ref count reaches 0.
Delete the runtime object when the last hsa_shut_down occurs.

Change-Id: I2005d52d06702eaef166714fd5e471cc277924db


[ROCm/ROCR-Runtime commit: 9ec37b5103]
2018-10-22 19:32:00 -05:00
Evgeny e53c7c63c0 aqlprofile extension version check
Change-Id: If824764f199eca15a0341cdf6177d8d6353e29f3


[ROCm/ROCR-Runtime commit: d788a53972]
2018-10-22 15:36:57 -04:00
Sean Keely b8de13150b Report internal queue creation to tools.
Debug agent requires handles to internal queues for single step debugging.
Added tools only API hsa_amd_runtime_queue_create_register for reporting.

hsa_amd_runtime_queue_create_register sets a callback which is invoked
when internal queues are created.

Change-Id: Ia5190ae724fadba686c15f25b2cd085350eeff0e


[ROCm/ROCR-Runtime commit: 757502ccd6]
2018-10-20 23:12:27 -04:00
Sean Keely 5aa7af4280 Fully initialize GPU agents before loading tools.
Required because the debug agent needs the copy API and trap handler to be initialized
prior to loading.  Existing tools do not make use of internal queue or scratch
memory intercept which is what PostToolsInit allows.

PostToolsInit() will be removed in a following cleanup change.

Change-Id: If43377843808e3eff0defd9204910a67a852902f


[ROCm/ROCR-Runtime commit: 5975c465ad]
2018-10-20 23:12:14 -04:00
Sean Keely b0013a3e4d Refactor of Runtime::CopyMemory()
Change-Id: I32a7cb24d00660ff4471d121ef7b3c2eec8fced2


[ROCm/ROCR-Runtime commit: 6852282a07]
2018-10-20 14:38:50 -04:00
Konstantin Zhuravlyov 6d5b1f0bde Loader: Update license for AMDHSAKernelDescriptor.h
Change-Id: I3a48b595ba089ca8a25f878c056b04a417a2364f


[ROCm/ROCR-Runtime commit: 509bb777e0]
2018-10-12 14:51:05 -04:00
Sean Keely 5f454d102d Use ptrinfo rather than apertures in hsa_memory_copy
Apertures now overlap with the change to 48bit addressing which
precludes using aperture checks to discover buffer ownership.
Switches to ptrinfo to decide which device owns a buffer.

This corrects faults in the legacy hsa_memory_copy api.

Change-Id: I5c7ce0216e1cdc96f836fc6fec9c3defdf4b9d90


[ROCm/ROCR-Runtime commit: 1e0d690948]
2018-10-11 13:34:53 -04:00
Konstantin Zhuravlyov dd2ab28ddb Loader: Add support for v3 object code.
Change-Id: I7215bd0c1277c2036bf0fadf5b23cb57fdf7f665


[ROCm/ROCR-Runtime commit: 386874da55]
2018-10-06 14:01:59 -04:00
Jay Cornwall e2454b084b Revert "Extend SDMA disable list until firmware stability resolved"
This reverts commit 795fd231b0.

Change-Id: I17b379e4d0e49a79dc8d4a60f01ea424fda24f02


[ROCm/ROCR-Runtime commit: f1ffbc3286]
2018-10-05 15:17:27 -04:00
Kent Russell 61249bc910 Only remove ldconf on uninstall
On update, the removal will occur AFTER the new package is installed,
due to some stupidity with how yum/rpm does things. Only remove it if
we're doing a pure uninstall

Change-Id: I4982610828d8bc1f2d8691b1e4ee1718c89413cc


[ROCm/ROCR-Runtime commit: ed9baefd75]
2018-10-03 08:10:06 -04:00
Evgeny 54428e93aa hsa_ven_amd_aqlprofile_pfn_t alias
Change-Id: Ia4a67ef0d2f8975f0e541e85c215afec76e9de5f


[ROCm/ROCR-Runtime commit: fdbe277f2a]
2018-09-26 14:10:21 -04:00
Scott Linder 42d4d4ebcf Apply dynamic relocations for STT_FUNC symbols
Required to support function calls through GOT table.

Change-Id: I174a0269fdd67369d38fe41855b7bd01f350b839


[ROCm/ROCR-Runtime commit: 47f0e6f7d3]
2018-09-23 21:42:32 -04:00
Ramesh Errabolu f4d18d5256 Capture number of Numa Nodes present on system
Change-Id: Ic789a6b9da8e316cb483e50b0fe9faa03798f97c


[ROCm/ROCR-Runtime commit: 01eea21d6c]
2018-09-18 16:27:30 -05:00
Ramesh Errabolu 14767d0f4c ROCr changes to enable small BAR P2P over xGMI
Change-Id: I6aaa3fe2565cdf7e15d58a7484d6bd5916ffff64


[ROCm/ROCR-Runtime commit: f007870792]
2018-09-17 22:54:40 -04:00
Evgeny f53ee46725 VERSION_MINOR macro typo fix
Add ENABLE_CMD enum to aqlprofile info.

Change-Id: I7b19082144d2bd0bf7af7ddc282358168b225759


[ROCm/ROCR-Runtime commit: 81532bb6f5]
2018-09-17 20:49:47 -04:00
Sean Keely 01b35916c7 Check fill addresses for alignment.
Check was documented but missing.

Change-Id: I97951635d794fd22e20c25d20e9d0e35035254af


[ROCm/ROCR-Runtime commit: 3357cadeec]
2018-09-05 16:34:19 -04:00
Sean Keely a550bf2687 Remove redundant initialization.
LinkInfo is already initialized to zero in its default constructor.

Change-Id: Ifa4fb886cce9b474c6879c9c82744044ab394082


[ROCm/ROCR-Runtime commit: 2843988dd7]
2018-08-29 19:36:07 -04:00
Sean Keely 0af87e4a02 Refactor blocking sdma commands.
Remove fence pool and use two signals.  Two signals allows overlapped
submission and copy while reducing thread busy polling.

Change-Id: Idb5f8e4c7f482a596ffce9e7799191fdd785a216


[ROCm/ROCR-Runtime commit: 56ed5c8904]
2018-08-29 19:13:23 -04:00
Sean Keely 61b53915d7 Implement SDMA copy rect for gfx9.
Fix pitch overflow due to small element detection.
Add wide pitch 2D copy handling.
Cleanup code duplication.

Change-Id: I93b1584aba8e5964957eb7ab3544df806ca3e2f9


[ROCm/ROCR-Runtime commit: e0839ab27e]
2018-08-29 19:13:07 -04:00
Sean Keely 94f2f17cb0 Add debug checking of time stamps validity.
Can only check that the signal has some time stamp, can't check if
the translating agent matches the last used agent or not.

Change-Id: I62943a864318808059c617280bb65a269dfadd1b


[ROCm/ROCR-Runtime commit: aca00b7238]
2018-08-26 12:36:35 -04:00
Sean Keely fe44e57a6c Expose ROCr build ID.
Adds HSA_AMD_SYSTEM_INFO_BUILD_VERSION=0x200 to hsa_system_info_t.
This returns a const char* pointing at the build string (git describe).

Change-Id: I73e6612482bf6ffc4037fd365808eb9211a650ad


[ROCm/ROCR-Runtime commit: cd8e5c1da8]
2018-08-20 20:44:32 -05:00
Chris Freehill 2a1b236843 Use 64 suffix for rocm_smi lib name
Change-Id: Idab0f5576f830657afb6bf26e1d88b18244431cb


[ROCm/ROCR-Runtime commit: c1fbd8aa54]
2018-08-20 08:05:31 -05:00
Sean Keely 896a035951 Experimental flag to swap copy agent for async copy APIs.
Adds env flag HSA_REV_COPY_DIR.  If set to 1 async copy will
copy from dst device to src device rather than from src to dst.

Change-Id: I3095642066fa026dc112c2eac06db9393341cd7e


[ROCm/ROCR-Runtime commit: 6c47780620]
2018-08-09 10:58:14 -04:00
Jay Cornwall 795fd231b0 Extend SDMA disable list until firmware stability resolved
Change-Id: I5e21cb761ae970ba2b68edd97b1564b36ca1f0f4


[ROCm/ROCR-Runtime commit: 5e1ccdc4a9]
2018-08-08 11:20:14 -05:00
James Edwards 9a71f7634e Add tools headers and library back to packaging.
Change-Id: If6c9befe50fc111eb154bd5b4eb5c7858f5d510b


[ROCm/ROCR-Runtime commit: 4d7d50feba]
2018-07-16 16:51:12 -04:00
Sean Keely 9751587239 Do not initialize runtime internal queues based on mapping memory to a GPU.
Conserves VMIDs when multiple processes are in use and memory operations
are not GPU specific.  For instance HIP API hipHostMalloc does not accept
a target GPU so when used with one process per GPU (ie GPU == MPI rank) we can
quickly exceed the available VMID slots if every process consumes a VMID on
every GPU.

Change-Id: Ib6fa051290089f71581029c09f9a44b9992237d1


[ROCm/ROCR-Runtime commit: 35a270ef7e]
2018-07-13 19:58:45 -04:00
Chris Freehill 6a51ad6aff Use the new name of the rocm_smi library
Change-Id: I7358b7b819826f1d3d3b0ca99fc5fd1a4e6d9536


[ROCm/ROCR-Runtime commit: 65c3cf27f5]
2018-07-13 11:46:49 -04:00
Chris Freehill 183d68a407 Fix NUMA async copy test
Change-Id: I64b5bd1ac5bf9b58d86c3dfc170bcf06a39abee4


[ROCm/ROCR-Runtime commit: 3cca09ccca]
2018-07-11 19:20:13 -04:00
Sean Keely f902a37ae3 Fix git describe command to retrieve version tags correctly.
Change-Id: I904f5ccdb88c1e28d5eeffd104174fcd57626ee7


[ROCm/ROCR-Runtime commit: c6cf161125]
2018-07-10 20:19:04 -05:00
Chris Freehill 3dca2b343f Undo temporary work-around for RSMI change
Change-Id: I9bf144add951c95e4eebc8647bffb71d13f4f612


[ROCm/ROCR-Runtime commit: 06759fed5f]
2018-07-09 08:46:57 -05:00