Commit graph

867 commits

Author SHA1 Message Date
Graham Sider ff52cbb201 Make queue memory allocation non-paged
Non-paged allocation for queue memory necessary for binding wptr to
GART. Required to support usermode queue oversubscription with MES for
GFX11.

Adds AllocateNonPaged entry to MemoryRegion::AllocateEnum for clarity;
aliases AllocateIPC.

Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Change-Id: I1a97a1820da26cf2433d9c237b2e6d2b0b8628b4


[ROCm/ROCR-Runtime commit: 061aa04147]
2022-08-04 11:21:00 -04:00
Graham Sider c4ae784f4b Clean up includes in queue.h
Formatting.

Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Change-Id: I141c8308d6b283b376035e21344629dc665289bb


[ROCm/ROCR-Runtime commit: db1a13aa05]
2022-08-03 10:57:17 -04:00
David Yat Sin 63b4fe36dd Add new ImageManager for GFX11
Adding new ImageManager class for GFX11 GPUs

ImageManagerGfx11 functions copied from ImageManagerNv.
Register descriptions in resource_gfx11.h updated for gfx11.

Signed-off-by: David Yat Sin <david.yatsin@amd.com>
Change-Id: I48b39f6a633aef14aa829f7240a43fe0feb1c290


[ROCm/ROCR-Runtime commit: 907e05c1b3]
2022-08-03 10:57:09 -04:00
David Yat Sin 1b06817f57 Add gfx1102 support
Change-Id: I39cbda81a7a999aa2ecfad7a3e720000f7ca3408
Signed-off-by: David Yat Sin <David.YatSin@amd.com>


[ROCm/ROCR-Runtime commit: cc3bd31591]
2022-08-03 10:56:54 -04:00
Graham Sider d67faa5e1f Add gfx1100 support
Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Change-Id: Ic5d5559e43df5c73409ba900a42c6901aabae661


[ROCm/ROCR-Runtime commit: 446c5e9672]
2022-08-03 10:56:49 -04:00
Jay Cornwall 3d6da5d16a Add gfx11 blit/trap shaders
David Yat Sin:
   Rebased to amd-staging branch
   Changed MSG_GET_DOORBELL to MSG_RTN_GET_DOORBELL

Change-Id: I6015e54c4d8897f4c796f58c7fbc298758c6d76d


[ROCm/ROCR-Runtime commit: 710adcc252]
2022-08-03 10:56:41 -04:00
Jonathan Kim cae4ed0056 Fix GPU destruction when user disabled
GPUs excluded by RVD are not expected to have scratch, memory, trap
handling nor memory regions set up.  Now that these GPUs are added to
a new list, early return on agent destruction to prevent bad function
calls on destroy.

Also fix up broken memory releases between the gpu lists and ugly braces.

Change-Id: I52fc6e86ceba0a0383cedc63310eb409515eaf9f


[ROCm/ROCR-Runtime commit: 9d2fe1ac2a]
2022-08-02 14:18:43 -04:00
jie1zhan da0ca94219 Free the executable memory when it is not used
Fix the rocrtst test failure: The runtime failed to allocate the necessary resources

Change-Id: Ie4ffeb939fb322db068f3132a7973a359c204176


[ROCm/ROCR-Runtime commit: 8a0fe6a832]
2022-07-29 15:16:37 -04:00
skhatri 23bd10b0ce Enabled allocation of pseudo fine grain memory where memory ordering is per point to point connection
Atomic memory operations on these memory buffers are not guaranteed
to be visible at system scope

Change-Id: I4cccde114632071a000384502a83bc191e77e85b


[ROCm/ROCR-Runtime commit: 364715cbc6]
2022-07-29 15:15:56 -04:00
Konstantin Zhuravlyov 91448848c6 Add support for the following kernel symbol query:
- HSA_EXECUTABLE_SYMBOL_INFO_KERNEL_DYNAMIC_CALLSTACK

Change-Id: Idff5c1a2ce2a3e2d65bcc9cf1f66a68d37cd41ef


[ROCm/ROCR-Runtime commit: d962fc39bb]
2022-07-29 15:15:24 -04:00
Konstantin Zhuravlyov 2ac93924c2 Bring AMDHSAKernelDescriptor.h in sync with llvm
Change-Id: Icd35100ad4d7eb8638786d306ecfbbb1c8842db1


[ROCm/ROCR-Runtime commit: 5a49b4d17f]
2022-07-29 15:14:39 -04:00
David Yat Sin b39ab88348 Temporarily disable CU Masking test
Disabling CU Masking test until it is fixed

Change-Id: I58fa2ec760ac5c942eb017108dbe832be4dc8f77


[ROCm/ROCR-Runtime commit: d77cc854ff]
2022-07-22 09:42:38 -04:00
Ashutosh Mishra da87e16464 Removing package dependency on thunk
The current state of hsa-rocr does
NOT require the thunk lib as a dependency.
Installing rocr pulls in the thunk package
unnecessarily. This patch corrects
that.

Change-Id: Id98ede8b66ffd9aaf4a47da96ba2f981f4c3da73


[ROCm/ROCR-Runtime commit: a229f5c320]
2022-07-22 09:42:38 -04:00
Sean Keely 00b4273d5c Add missing query on CPU agents.
Adds HSA_AMD_AGENT_INFO_SVM_DIRECT_HOST_ACCESS.

Change-Id: I317d7b451ed2910cdf2290b196fd89e3bf0be435


[ROCm/ROCR-Runtime commit: c2b9abaa1d]
2022-07-22 09:42:38 -04:00
Ashutosh Mishra 66a5ec2ffc Adding Maintainer DL
Maintainer distribution list field had wrong information.
Adding the newly formed DL by the component team.

Change-Id: I61651e429375cdc512d0fe4b0768f917506b5392


[ROCm/ROCR-Runtime commit: 23f908708a]
2022-07-22 09:42:28 -04:00
Jonathan Kim 0edaa45b8a Only allow pairwise CU enable for devices with WGPs
A work group processor (WGP) requires both of its CUs to be enabled
in order to be enabled.

The KFD will round robin distribute by even-indexed pairs so
enforce this requirement for runtime set mask calls.

Change-Id: Ic46661b01f398aa1fe24d96b5c9c31f122f967a3


[ROCm/ROCR-Runtime commit: f600687537]
2022-07-07 12:50:24 -04:00
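
The pairwise rule described in the commit above can be sketched as a small mask check. This is an illustration only, not the ROCr implementation: the helper names are hypothetical, and it assumes that bits 2k and 2k+1 of each 32-bit mask word are the two CUs of one WGP.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of ROCr): true when every even-indexed
// pair of bits (0-1, 2-3, ...) in the word is fully set or fully clear.
// Assumption: bits 2k and 2k+1 are the two CUs of one WGP.
constexpr bool IsPairwiseWord(uint32_t word) {
  // XOR each bit with its upper neighbour; a set even bit in the result
  // means the two CUs of that pair disagree.
  return ((word ^ (word >> 1)) & 0x55555555u) == 0;
}

// Checks a whole CU mask given as an array of 32-bit words.
inline bool IsPairwiseCuMask(const std::vector<uint32_t>& mask) {
  for (uint32_t word : mask) {
    if (!IsPairwiseWord(word)) return false;
  }
  return true;
}
```

Under these assumptions, a mask word such as 0x3 or 0xC passes the check, while 0x1 or 0x2 enables only half of a WGP and is rejected.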
Sean Keely cf6775fbc5 Fix IPC copy agent lookup.
Discovered agent handles should only apply to copy routing, not to
copy device selection.  The user may not have mapped all allocations
to all GPUs so we must ensure that the copying device is one passed
by the user.

Change-Id: I2532e66d30e6842624e594f235dd144a186220d4


[ROCm/ROCR-Runtime commit: a8603b9397]
2022-07-05 22:51:26 -05:00
Sean Keely 966f6309f4 Report nominal GPU wallclock frequency.
Adds agent info query HSA_AMD_AGENT_INFO_TIMESTAMP_FREQUENCY.

Change-Id: Ib9108d51f9df89f8566291258aab3d1b87243441


[ROCm/ROCR-Runtime commit: dec37625ed]
2022-06-28 11:25:18 -04:00
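
A reported clock frequency is typically used to turn raw tick counts into time. The following is a minimal sketch of that arithmetic, assuming the frequency is expressed in Hz; the function name is hypothetical, and the sketch does not call the HSA API.

```cpp
#include <cstdint>

// Hypothetical helper: converts a tick count to nanoseconds.
// Assumptions: frequency_hz is non-zero, expressed in Hz, and below
// roughly 18 GHz so the remainder term cannot overflow 64 bits.
constexpr uint64_t TicksToNanoseconds(uint64_t ticks, uint64_t frequency_hz) {
  // Split into whole seconds and a sub-second remainder to avoid
  // overflowing ticks * 1e9 for large tick counts.
  return (ticks / frequency_hz) * 1000000000ull +
         ((ticks % frequency_hz) * 1000000000ull) / frequency_hz;
}
```

Splitting the conversion into whole seconds and a remainder keeps the intermediate products inside 64 bits for long-running counters.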
Sean Keely edfc08e30a Add hwloc5 dev headers to rocrtst.
Allows easy building on platforms without native hwloc v1 support.

Change-Id: I20d711f914d176decb1b64381fd4b51ccc4262b5


[ROCm/ROCR-Runtime commit: 33e8919743]
2022-06-28 11:23:43 -04:00
Sean Keely 9969ed9f0d Add cu masking test.
Change-Id: I8b62ebd60f2edde3ea0b298f0353381855163fea


[ROCm/ROCR-Runtime commit: d27d4545e2]
2022-06-28 11:22:42 -04:00
Sean Keely 2c54cc6419 Basic SVM profiler.
Mostly a demo at this point.  Logs SVM (aka HMM) info to
HSA_SVM_PROFILE if set.

Example: HSA_SVM_PROFILE=log.txt SomeApp

Change-Id: Ib6fd688f661a21b2c695f586b833be93662a15f4


[ROCm/ROCR-Runtime commit: 965df6eef7]
2022-06-23 19:30:06 -05:00
skhatri 6b88a15fdc Adding support for rocrtracer tools loading without environment variable
During HSA initialization, ROCr now searches all loaded libraries
for the symbol "HSA_AMD_TOOL_PRIORITY" and adds all those libraries to
the tools library init list.  Tools libraries listed in the HSA_TOOLS_LIB
env variable are also loaded in the given order and take priority
over HSA_AMD_TOOL_PRIORITY.

Change-Id: I739af42bbd777c44a9152c11e17dd69979b65e82


[ROCm/ROCR-Runtime commit: e7fc301aa7]
2022-06-23 20:08:30 -04:00
Sean Keely 44278554de Add format script.
Adds a script to run clang-format on the latest patch so we don't
need to remember the command line.

Also applies missing formatting to the prior commit,
"Add API for available GPU memory".

Change-Id: Ida51aedc38af229f6a26e275072654860748fa93


[ROCm/ROCR-Runtime commit: e7152c8b16]
2022-06-23 20:08:30 -04:00
Ranjith Ramakrishnan 8127554371 : Use GNUInstallDirs
Use GNUInstallDirs variables in post install scripts

Change-Id: Id0e3e37d412a30521d9846082d025a9e19a43942


[ROCm/ROCR-Runtime commit: 52bea549e3]
2022-06-22 16:28:06 -04:00
David Yat Sin a3f395eacb Add API for available GPU memory
Add support for AMD Agent to return amount of memory available

Change-Id: I5c32e2cebbaa2993b044250aefe434e4cc02d8c2
Signed-off-by: David Yat Sin <david.yatsin@amd.com>


[ROCm/ROCR-Runtime commit: 4ac840269c]
2022-06-07 10:33:18 -04:00
Sean Keely bb5bb604c5 Lookup copy agent when blit is selected.
Disallow passing agent 0 to avoid any API change.

Change-Id: I704fb2e04cec50500fac41a405c8a7e83a3c9fb5


[ROCm/ROCR-Runtime commit: dd671b49e5]
2022-05-14 18:08:57 -05:00
Sean Keely a1f04fe1ed Add experimental option to force discovery of all copy agents.
Discards all user provided async copy agent info and relies on
pointer info discovery.

Change-Id: Ife3e708a49ffccbede4983ab47d5ed0032970857


[ROCm/ROCR-Runtime commit: 3ebe99f96d]
2022-05-14 18:08:57 -05:00
Sean Keely 3fd1f5696e Use block pointer info in async copy.
Only block info can return an agent which is disabled in the
process.

Change-Id: I34cb1f9eea9217e10a484726c90d930e3414e769


[ROCm/ROCR-Runtime commit: 13a0cdfa77]
2022-05-14 18:08:57 -05:00
Sean Keely b757b209ad Report owning agent with pointer info block information.
Physical owning agent may not be visible to the current process
due to RVD.

Change-Id: Ib463336a5ed73a479f3aa74eb140932b9e0435fb


[ROCm/ROCR-Runtime commit: 247606c455]
2022-05-14 18:08:57 -05:00
Sean Keely 289a86785b Allow zero agent handle in AsyncCopy APIs.
IPC use cases with RVD set can't convey proper agent handles.
Runtime discovery is required to properly route the copy in this
case.

Change-Id: I4c97e132fb4b6ac1040de1cb17fe5a3e36d6be48


[ROCm/ROCR-Runtime commit: c289a43e88]
2022-05-14 18:08:49 -05:00
Sean Keely 14c6bd37fd Report pointer info queries to released fragments as type UNKNOWN.
We should not leak suballocation info to users.

Change-Id: I13b2a22bf5517b523ba04ddc039b49da8378b55f


[ROCm/ROCR-Runtime commit: ace0599c69]
2022-05-09 13:46:16 -05:00
Sean Keely 588e124c4e Ensure IPC imports always create an allocation map entry.
Simplifies behavior.  A memory type now either always generates an
entry or never does.

Change-Id: Ie98cddea01e801308ac0ba650795fdef92b7e47d


[ROCm/ROCR-Runtime commit: 0ba9b162db]
2022-05-09 13:46:16 -05:00
Sean Keely c96272841b Adjust include paths for new header locations.
Thunk and rocm_smi_lib paths have been updated.

Change-Id: If2948172f8064dd992cbccbc2a80f9161ad4d457


[ROCm/ROCR-Runtime commit: 752cfd5ffd]
2022-05-09 14:44:32 -04:00
Ranjith Ramakrishnan 416074aaac File Reorganization changes with backward compatibility
Wrapper header files and library soft links for backward compatibility
Install interface updated with /opt/rocm/include

Change-Id: If772b24320f9d1de90f9be0930b1f2aa1d073777


[ROCm/ROCR-Runtime commit: bb4da8545a]
2022-05-06 19:12:14 -04:00
Sean Keely 35ae610c0c Drop build dependency on DeviceLibs.
DeviceLibs is still needed but is found and included by clang now.

Change-Id: I03ff7dc91c028d2ee6747aa1779d223a9ba13915


[ROCm/ROCR-Runtime commit: 7f370dd84c]
2022-05-06 01:01:05 -04:00
Sean Keely 2b8c129efb Switch to CLOCK_BOOTTIME for HSA system clock.
This is consistent with KFD and has significantly better latency.
KFD is taking this as the definition of the SystemClockCounter.

Change-Id: I4c1b3bc58c738206265c55ebefd41356c013bfe5


[ROCm/ROCR-Runtime commit: 0ee82742a7]
2022-05-05 15:27:29 -04:00
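
For reference, reading CLOCK_BOOTTIME on Linux looks roughly like the sketch below. The function name is hypothetical and this is not the ROCr code; it only shows the standard clock_gettime call with the clock ID that the commit above names.

```cpp
#include <time.h>

#include <cassert>
#include <cstdint>

// Hypothetical helper: reads CLOCK_BOOTTIME (Linux-specific) and returns
// the value in nanoseconds, or 0 if the clock cannot be read.
inline uint64_t ReadBootTimeNs() {
  timespec ts{};
  if (clock_gettime(CLOCK_BOOTTIME, &ts) != 0) return 0;
  return static_cast<uint64_t>(ts.tv_sec) * 1000000000ull +
         static_cast<uint64_t>(ts.tv_nsec);
}
```

Unlike CLOCK_MONOTONIC, CLOCK_BOOTTIME keeps counting while the system is suspended, so two consecutive reads never go backwards across a suspend and resume cycle.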
David Yat Sin be1d3bef2d Remove unused variable
Change-Id: Ie29eb1cabef38c259280237c32d83aaa126e3b7a


[ROCm/ROCR-Runtime commit: cd0788938c]
2022-05-04 13:32:06 -04:00
Yifan Zhang a57d706974 add gfx1036 support
Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>
Change-Id: Ifc1b3cf2e46cf753f57470ebc6b034c1a349d3d2


[ROCm/ROCR-Runtime commit: 54c8b7900d]
2022-04-29 17:52:22 -04:00
Shweta Khatri 2a635aa54d Assemble trap handler at build time.
Eliminates the need for manually assembling the source of the
second level trap handler to produce the shader binary.  Also
separated blit shaders' binary source and version one second
level trap handler binary sources into different header files.

Change-Id: If29a18ee06dc083ec880ea962f234c6b5cac806a


[ROCm/ROCR-Runtime commit: 1b0440e7b3]
2022-04-28 20:14:14 -04:00
Jonathan Kim 495a3f233f Bypass HDP flush during SDMA copies on A+A GPU-CPU xGMI connections
Host to device SDMA copies do not require an HDP cache flush when
connected by xGMI since data copies over the data fabric and not HDP.

Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Reviewed-by: Sean Keely <sean.keely@amd.com>
Change-Id: I78d73a47edcc1a9c0ba59f33cf91485f13f1c45b


[ROCm/ROCR-Runtime commit: 658b053943]
2022-04-27 21:45:26 -04:00
Sean Keely cdf734c771 Minor typo fixes.
Declare the type of HSA_AMD_AGENT_INFO_COOPERATIVE_COMPUTE_UNIT_COUNT
and add a missing break statement.

Change-Id: I86ce8a2e620438e046b60cee991ce1fbe07a3e88


[ROCm/ROCR-Runtime commit: 64dae113b1]
2022-04-26 15:51:22 -04:00
Sean Keely 761653fa00 Handle scratch interleave per SE for gfx10+
On gfx10+ we need to issue a minimum count of active lanes or
groups before ADC moves on.  Ensure that scratch allocations
attempt to reach this limit.

Occupancy throttling due to OOM condition may still drop below this
limit.

Change-Id: I0edf2e40fbe1a95e9a262564cebd2b6a82501a0b


[ROCm/ROCR-Runtime commit: 2eedf953f3]
2022-04-26 15:32:03 -04:00
Shweta Khatri 4effeb8f9f Fix heap-buffer-overflow error in Memory access test. Also reverted most of the changes of the first array element from 0 to 1.
Change-Id: I62dee9bab379210a322848132e2846dc153724d9


[ROCm/ROCR-Runtime commit: 539ec6a87d]
2022-04-21 12:09:58 -04:00
Jeremy Newton 9e346a1c58 Drop some unnecessary definitions
__x86_64__ and __AMD64__ should be already defined by the compiler to
specify the compilation target and shouldn't be defined manually.

I fixed two x86_64 checks to include VS variables, as removing this
might cause it to fail to compile on that compiler.

Signed-off-by: Jeremy Newton <Jeremy.Newton@amd.com>
Change-Id: I600ff449af85bf7d83ecab167d97933922e2d917


[ROCm/ROCR-Runtime commit: 178a7a5cfa]
2022-04-19 12:22:42 -04:00
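
A target check that relies on compiler-provided macros, including the Visual Studio one, could look like the sketch below. The macro EXAMPLE_ARCH_X86_64 and the function name are hypothetical and are not taken from the ROCr sources; `__x86_64__` is predefined by GCC and Clang, and `_M_X64` by MSVC.

```cpp
// Hypothetical macro, for illustration only. The compiler predefines the
// target macros, so nothing here defines __x86_64__ manually.
#if defined(__x86_64__) || defined(_M_X64)
#define EXAMPLE_ARCH_X86_64 1
#else
#define EXAMPLE_ARCH_X86_64 0
#endif

// Compile-time query usable in ordinary C++ code.
constexpr bool IsX86_64Build() { return EXAMPLE_ARCH_X86_64 != 0; }
```

Testing the predefined macros rather than defining them by hand keeps the check accurate when the same sources are built for another architecture.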
Jeremy Newton feba682013 Use CMAKE_INSTALL_*
Instead of installing to lib or include, use CMAKE_INSTALL_LIBDIR and
CMAKE_INSTALL_INCLUDEDIR to allow the builder to override if desired.

The default LIBDIR should be "lib" to avoid breaking ROCm packaging, but
using GNUInstallDirs would use lib64 on RHEL. By setting a default value
prior to including GNUInstallDirs, we can always use "lib" unless the
builder explicitly overrides it via "-DCMAKE_INSTALL_LIBDIR", which is
typical in most distro scripts.

Signed-off-by: Jeremy Newton <Jeremy.Newton@amd.com>
Change-Id: I135f21bcfeb02b6849f6e8ca403b39c029a02d5c


[ROCm/ROCR-Runtime commit: ddf4edcafc]
2022-04-19 12:22:42 -04:00
Jeremy Newton 3d0b0fd774 Only default IMAGE_SUPPORT=ON for x86
Image support does not compile on other architectures, since it relies on
the x86-only header "x86intrin.h".

Signed-off-by: Jeremy Newton <Jeremy.Newton@amd.com>
Change-Id: I120d15870e74e20bd618e6f5da8c05e28fb1203b


[ROCm/ROCR-Runtime commit: a0931f4a3c]
2022-04-12 09:24:45 -04:00
Konstantin Zhuravlyov 625b1c99b3 Add code object v5 support
Change-Id: I03522765056e99ed49e6c5e213ee3753852de27b


[ROCm/ROCR-Runtime commit: 9265409f08]
2022-04-12 08:53:27 -04:00
Sean Keely 6622fe0163 Revert "Release host buffers after segment freeze."
This reverts commit cf3f441625.

Change-Id: Idc7e568b2b54a226dbe4d189b25a78be3bd16eea


[ROCm/ROCR-Runtime commit: b3caf6782b]
2022-04-11 20:43:07 -05:00
Sean Keely 16efad0cdc Correct inf loop defect in fast clock init.
Each time delay is grown we need to reset elapsed.  We want to take
the most accurate sample from the set at fixed delay.

Without this we will hang if there is ever an insufficiently accurate,
high unit clock read.

Change-Id: Ic65f364067789ac85a6572d67af2d77528e265bb


[ROCm/ROCR-Runtime commit: 4e9849034d]
2022-04-01 16:15:37 -04:00
Sean Keely cf3f441625 Release host buffers after segment freeze.
Release staging buffers after loading has completed.  The debugger
no longer uses this copy.

Change-Id: I46f36b50033bebe5a9ebc648b291d46f1d09b21d


[ROCm/ROCR-Runtime commit: 03a52655a8]
2022-03-23 23:53:02 -05:00