Commit Graph

2959 Commits

Author SHA1 Message Date
David Yat Sin a742b7e830 Update addrLib to support gfx11
This library was taken from public MESA library:
https://gitlab.freedesktop.org/mesa/mesa/-/tree/main/src/amd/addrlib

with top commit:
2866ae32da0348caf71ad2d11c353321df626ff4

Removing macros.h as it is no longer used by addrlib

Change-Id: I0fdabfe48b74c259b4d29d81beae89604bbc141a
2022-08-04 11:23:28 -04:00
David Yat Sin c2a60a4d5d Fix scratch memory alignment on GFX11
GFX11 requires scratch memory alignment of 256 Bytes instead of 1024.

Change-Id: I103de1c12f3a4877d7d36f13254301166c66e11f
2022-08-04 11:23:28 -04:00
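
The alignment change in c2a60a4d5d can be illustrated with a small rounding helper. This is a minimal sketch, not the runtime's code: `ScratchAlignment` and `AlignUp` are hypothetical names, and mapping every non-GFX11 major version to 1024 bytes is an assumption made only for illustration.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: scratch alignment in bytes for a gfx major version.
// Per the commit message, GFX11 uses 256 bytes instead of 1024. Treating all
// other versions as 1024-byte aligned is an assumption of this sketch.
constexpr std::size_t ScratchAlignment(std::uint32_t gfx_major) {
  return gfx_major == 11 ? 256 : 1024;
}

// Round size up to the next multiple of alignment.
// alignment must be a non-zero power of two.
constexpr std::size_t AlignUp(std::size_t size, std::size_t alignment) {
  return (size + alignment - 1) & ~(alignment - 1);
}

// A 300-byte request rounds to 512 bytes on GFX11 and to 1024 bytes otherwise.
static_assert(AlignUp(300, ScratchAlignment(11)) == 512, "GFX11 alignment");
static_assert(AlignUp(300, ScratchAlignment(10)) == 1024, "pre-GFX11 alignment");
```

Keeping the alignment behind one lookup means callers round sizes the same way on every ASIC and only the constant differs.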
David Yat Sin 90322899fe Update scratch register definitions for GFX11
Update scratch register definitions for GFX11 ASICs.

Change-Id: I6195e04b0a099fe84d1015c2f34ca3756a8175ef
2022-08-04 11:23:28 -04:00
Graham Sider 061aa04147 Make queue memory allocation non-paged
Non-paged allocation for queue memory is necessary for binding wptr to
GART. Required to support usermode queue oversubscription with MES for
GFX11.

Adds AllocateNonPaged entry to MemoryRegion::AllocateEnum for clarity;
aliases AllocateIPC.

Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Change-Id: I1a97a1820da26cf2433d9c237b2e6d2b0b8628b4
2022-08-04 11:21:00 -04:00
Graham Sider db1a13aa05 Clean up includes in queue.h
Formatting.

Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Change-Id: I141c8308d6b283b376035e21344629dc665289bb
2022-08-03 10:57:17 -04:00
David Yat Sin 907e05c1b3 Add new ImageManager for GFX11
Adding new ImageManager class for GFX11 GPUs

ImageManagerGfx11 functions copied from ImageManagerNv.
Register descriptions in resource_gfx11.h updated for gfx11.

Signed-off-by: David Yat Sin <david.yatsin@amd.com>
Change-Id: I48b39f6a633aef14aa829f7240a43fe0feb1c290
2022-08-03 10:57:09 -04:00
David Yat Sin cc3bd31591 Add gfx1102 support
Change-Id: I39cbda81a7a999aa2ecfad7a3e720000f7ca3408
Signed-off-by: David Yat Sin <David.YatSin@amd.com>
2022-08-03 10:56:54 -04:00
Graham Sider 446c5e9672 Add gfx1100 support
Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Change-Id: Ic5d5559e43df5c73409ba900a42c6901aabae661
2022-08-03 10:56:49 -04:00
Jay Cornwall 710adcc252 Add gfx11 blit/trap shaders
David Yat Sin:
   Rebased to amd-staging branch
   Changed MSG_GET_DOORBELL to MSG_RTN_GET_DOORBELL

Change-Id: I6015e54c4d8897f4c796f58c7fbc298758c6d76d
2022-08-03 10:56:41 -04:00
Jonathan Kim 9d2fe1ac2a Fix GPU destruction when user disabled
GPUs excluded by RVD are not expected to have scratch, memory, trap
handling nor memory regions set up.  Now that these GPUs are added to
a new list, early return on agent destruction to prevent bad function
calls on destroy.

Also fix up broken memory releases between the gpu lists and ugly braces.

Change-Id: I52fc6e86ceba0a0383cedc63310eb409515eaf9f
2022-08-02 14:18:43 -04:00
Kent Russell 90ada94141 kfdtest: Add return statement for ReadSMIEventThread
This didn't return anything, so add a "return 0" at the end, since the
function is declared to return an int value.

Change-Id: I17c398e431b2ce4571e6ca4abe6d567f110ea2a7
2022-08-02 09:22:49 -04:00
jie1zhan 17fb40f1f6 Fix allocate memory failed in VRAM
The kernel driver aligns VRAM allocations to 2MB instead of 4KB.
Change-Id: Iea9d8c0f02999b9ea5fd931da82240a33f7bcc69
2022-07-30 01:18:50 -04:00
jie1zhan 8a0fe6a832 Free the executable memory when it is not used
Fixes the rocrtst test failure in which the runtime failed to allocate the necessary resources.

Change-Id: Ie4ffeb939fb322db068f3132a7973a359c204176
2022-07-29 15:16:37 -04:00
skhatri 364715cbc6 Enabled allocation of pseudo fine grain memory where memory ordering is per point to point connection
Atomic memory operations on these memory buffers are not guaranteed
to be visible at system scope

Change-Id: I4cccde114632071a000384502a83bc191e77e85b
2022-07-29 15:15:56 -04:00
Konstantin Zhuravlyov d962fc39bb Add support for the following kernel symbol query:
- HSA_EXECUTABLE_SYMBOL_INFO_KERNEL_DYNAMIC_CALLSTACK

Change-Id: Idff5c1a2ce2a3e2d65bcc9cf1f66a68d37cd41ef
2022-07-29 15:15:24 -04:00
Konstantin Zhuravlyov 5a49b4d17f Bring AMDHSAKernelDescriptor.h in sync with llvm
Change-Id: Icd35100ad4d7eb8638786d306ecfbbb1c8842db1
2022-07-29 15:14:39 -04:00
Felix Kuehling deb7a20c92 libhsakmt: Make CWSR area executable
The debugger depends on the CWSR area being executable. Set the right
flag when registering SVM memory.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Tested-by: Laurent Morichetti <Laurent.Morichetti@amd.com>
Change-Id: I7441e214d1a4da8324d775e777976fabd1c81a6f
2022-07-25 17:43:52 -04:00
David Yat Sin d77cc854ff Temporarily disable CU Masking test
Disabling CU Masking test until it is fixed

Change-Id: I58fa2ec760ac5c942eb017108dbe832be4dc8f77
2022-07-22 09:42:38 -04:00
Ashutosh Mishra a229f5c320 Removing package dependency to thunk
In its current state, hsa-rocr does NOT require the thunk lib
as a dependency. Installing rocr unnecessarily pulls in the
thunk package. This patch corrects that.

Change-Id: Id98ede8b66ffd9aaf4a47da96ba2f981f4c3da73
2022-07-22 09:42:38 -04:00
Sean Keely c2b9abaa1d Add missing query on CPU agents.
Adds HSA_AMD_AGENT_INFO_SVM_DIRECT_HOST_ACCESS.

Change-Id: I317d7b451ed2910cdf2290b196fd89e3bf0be435
2022-07-22 09:42:38 -04:00
Ashutosh Mishra 23f908708a Adding Maintainer DL
The maintainer distribution list (DL) field had wrong information.
Adding the DL newly formed by the component team.

Change-Id: I61651e429375cdc512d0fe4b0768f917506b5392
2022-07-22 09:42:28 -04:00
Felix Kuehling 9d33827a84 kfdtest: Disable host access for VRAM
KFDExceptionTest.SdmaQueueException allocates VRAM with host access. This
fails on small-BAR GPUs. This error was incorrectly ignored before
412b24137e ("kfdtest: Full TearDown and SetUp in child process").

The test doesn't really need host access to the memory. Therefore the fix
is to disable the HostAccess flag.


Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Change-Id: Ifec279eeb6c1ecb1160db9b692e6dc8816d761a3
2022-07-21 16:08:19 -04:00
Felix Kuehling 9b2b81e555 libhsakmt: Remove CMA implementation
The CMA feature is deprecated and about to be removed from the DKMS
branch. It was never supported upstream. Leave dummy functions in
place for now.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Change-Id: I9e51403d753cb91630553aff4f19e931af509740
2022-07-21 16:08:19 -04:00
Felix Kuehling cdaaf8236a kfdtest: Remove CMA tests
The CMA feature is deprecated and about to be removed from the DKMS
branch. It was never supported upstream.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Change-Id: I64b6213eb3adbdc550542e51181cd8ba6ca4cb45
2022-07-21 16:08:19 -04:00
Felix Kuehling 9ac2c75171 libhsakmt: Map VRAM only on supported peer GPUs
hsaKmtMapMemoryToGPU should not try to map VRAM on peer GPUs that don't
have an IO-Link to the memory. The new P2P mapping code in KFD will
fail otherwise.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Change-Id: I6d59b55651b98756865a0f69eafef3e386372cf3
2022-07-19 21:21:59 -04:00
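
The rule described in 9ac2c75171 amounts to filtering the requested GPU list by link availability. The following is a minimal sketch under a simplified topology model: `IoLink`, `HasIoLink`, and `FilterMappableGpus` are hypothetical names, not libhsakmt's API, and links are modeled as directed (from, to) node-ID pairs.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical, simplified model of a directed IO-Link between two nodes.
struct IoLink {
  std::uint32_t from;
  std::uint32_t to;
};

// True if the topology contains a link from node `from` to node `to`.
bool HasIoLink(const std::vector<IoLink>& links, std::uint32_t from,
               std::uint32_t to) {
  return std::any_of(links.begin(), links.end(), [=](const IoLink& link) {
    return link.from == from && link.to == to;
  });
}

// Keep only the GPUs that can reach the VRAM owned by `owner`: the owner
// itself, plus any peer that has an IO-Link to the owner.
std::vector<std::uint32_t> FilterMappableGpus(
    const std::vector<IoLink>& links, std::uint32_t owner,
    const std::vector<std::uint32_t>& requested) {
  std::vector<std::uint32_t> mappable;
  for (std::uint32_t gpu : requested) {
    if (gpu == owner || HasIoLink(links, gpu, owner)) {
      mappable.push_back(gpu);
    }
  }
  return mappable;
}
```

With a single link from node 1 to node 2 and node 2 owning the VRAM, a request for nodes 1, 2 and 3 keeps nodes 1 and 2 and drops node 3.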
Felix Kuehling 87aca673e8 libhsakmt: Init apertures in AcquireSystemProperties
This allows init_process_apertures to use the whole consistent topology
instead of taking its own partial snapshot.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Change-Id: Ia13e7aa7fcd090ea8d6cacd4babb29a27c20207f
2022-07-19 21:21:59 -04:00
Felix Kuehling 412b24137e kfdtest: Full TearDown and SetUp in child process
With the next patch, child processes need to fully reinitialize the
topology in order to recreate the process apertures. Just calling
hsaKmtOpenKFD is no longer sufficient. Tests based on
KFDMultiProcessTest already did this correctly (KFDHWSTest, eviction
tests). This patch fixes KFDExceptionTest and KFDIPCTest.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Change-Id: Iaad24e88ddd29c1105bf791a77891cc55a6072ff
2022-07-19 21:21:59 -04:00
Graham Sider c6da7d1353 kfdtest: Remove fixed tests from GFX11 blacklist
Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Change-Id: I2dc785ea88a1a9222c6b7886fb75b6c7d699036a
2022-07-18 14:35:17 -04:00
Jeremy Newton aa25cb1acc Fix numa linking
We should link against numa without hardcoding the path to it.
CMake should determine how to link numa automatically, similar to how rt
and pthread are linked.

Fixes 

Change-Id: Ifb9ac30e200c66cbd7f1cf80d25fffef1dcf8d2f
Signed-off-by: Jeremy Newton <Jeremy.Newton@amd.com>
2022-07-14 11:04:09 -04:00
David Francis 39e8a85aac kfdtest: Remove strange load from LoopIsa
LoopIsa is a shader that performs a variety of intensive
calculations in a loop. It is used by tests such as
KFDQMTest.QueuePriorityOn*

It contained a scalar load, despite not having any buffer to
read from. This load causes page faults on GFX11. It is
unclear why it did not cause page faults on earlier ASICs.

Remove the load.

Signed-off-by: David Francis <David.Francis@amd.com>
Change-Id: I7426d0db48e933f3bb870467ea88476f7a283040
2022-07-13 09:37:01 -04:00
David Francis 4b8c74bf04 kfdtest: Change KFDQMTest.EmptyDispatch to NoopIsa
When the shaders were moved to ShaderStore,
KFDQMTest.EmptyDispatch was erroneously
changed to use LoopIsa instead of NoopIsa.

Change it back.

Signed-off-by: David Francis <David.Francis@amd.com>
Change-Id: Iaf7d0d107e3bf3bd8b7d616b137a1740e309cf91
2022-07-13 09:36:45 -04:00
Jonathan Kim a31206e98b libhsakmt: fix runtime disable check
The kernel debug IOCTL version was bumped to v11.
The runtime enable check was updated, but the runtime disable check update was missed.

Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Change-Id: I71a8970ccfe7dc517abe7b4ad962369aea6a0496
2022-07-12 20:59:08 -04:00
Kent Russell 9745db3053 KFDTest: Use default packaging name for kfdtest
Previously we omitted the version and arch in the filenames. Adding them,
along with the ROCM build variable, allows for easy version detection on
systems. Instead of kfdtest = v1.0.0, the package name will now feature
the build number, making it easier to identify which version is
installed.

Change-Id: I311ed7010486e7c70af669d282910fe29ee8db45
2022-07-12 12:41:20 -04:00
Eric Huang 37be876cad libhsakmt: allocate unified memory for ctx save restore area
To improve performance on queue preemption, allocate ctx s/r
area in VRAM instead of system memory, and migrate it back
to system memory when VRAM is full.

Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
Change-Id: If775782027188dbe84b6868260e429373675434c
2022-07-11 18:06:55 -04:00
Eric Huang e1d1a6fbb0 libhsakmt: add new flag for svm
Add a new option to always keep the GPU mapping, and
bump the KFD version for the unified save/restore
memory feature.

Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
Change-Id: Iebee35e6de4d52fa29f82dd19f6bbf5640249492
2022-07-11 18:06:48 -04:00
Jonathan Kim f600687537 Only allow pairwise CU enable for devices with WGPs
A work group processor (WGP) requires both of its CUs to be enabled
in order to be enabled itself.

The KFD will round robin distribute by even-indexed pairs so
enforce this requirement for runtime set mask calls.

Change-Id: Ic46661b01f398aa1fe24d96b5c9c31f122f967a3
2022-07-07 12:50:24 -04:00
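
The pairwise requirement from f600687537 can be checked with a few bit operations. This is a minimal sketch under stated assumptions: `IsPairwiseCuMask` is a hypothetical name, the mask is modeled as 32-bit words with CU index 2k at bit 2k, and the sketch only reports whether a mask is valid. What the runtime does with an invalid mask is not covered here.

```cpp
#include <cstdint>
#include <vector>

// Returns true if every even-indexed CU pair (bits 2k and 2k+1 of each word)
// is either fully enabled or fully disabled.
bool IsPairwiseCuMask(const std::vector<std::uint32_t>& cu_mask) {
  constexpr std::uint32_t kEvenBits = 0x55555555u;
  for (std::uint32_t word : cu_mask) {
    // XOR with the word shifted right by one leaves, at each even position
    // 2k, a 1 exactly when bit 2k differs from bit 2k+1.
    if (((word ^ (word >> 1)) & kEvenBits) != 0) {
      return false;
    }
  }
  return true;
}
```

For example, a mask word of 0x3 (CUs 0 and 1) passes, while 0x1 (CU 0 alone) and 0x6 (CUs 1 and 2, which straddle two pairs) are both rejected.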
Philip Yang 7799611c01 kfdtest: add KFDSVMRangeTest.HMMProfilingEvent
Open SMI event file handle, prefetch to migrate svm range to GPU, read
HMM profiling events, then check that event_id, address, size, pid and
event triggers have the expected values.

Start a separate thread to read SMI events, the same way applications do.
Use a thread barrier to ensure no event is dropped.

Change-Id: I0683969d18d1579847e125d86aa4257602adb13f
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
2022-07-06 10:08:44 -04:00
Sean Keely a8603b9397 Fix IPC copy agent lookup.
Discovered agent handles should only apply to copy routing, not to
copy device selection.  The user may not have mapped all allocations
to all GPUs so we must ensure that the copying device is one passed
by the user.

Change-Id: I2532e66d30e6842624e594f235dd144a186220d4
2022-07-05 22:51:26 -05:00
Sean Keely dec37625ed Report nominal GPU wallclock frequency.
Adds agent info query HSA_AMD_AGENT_INFO_TIMESTAMP_FREQUENCY.

Change-Id: Ib9108d51f9df89f8566291258aab3d1b87243441
2022-06-28 11:25:18 -04:00
Sean Keely 33e8919743 Add hwloc5 dev headers to rocrtst.
Allows easy building on platforms without native hwloc v1 support.

Change-Id: I20d711f914d176decb1b64381fd4b51ccc4262b5
2022-06-28 11:23:43 -04:00
Sean Keely d27d4545e2 Add cu masking test.
Change-Id: I8b62ebd60f2edde3ea0b298f0353381855163fea
2022-06-28 11:22:42 -04:00
Jonathan Kim 79cd63fab6 libhsakmt: permit runtime enable version for new hw mode set restrictions
The KFD no longer allows debug ops that modify HW state prior to
trap activation so permit bump in major version.

Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Change-Id: I072d3998b7b043df9a67f0f6762b0afdfa9382c6
2022-06-24 10:21:57 -04:00
Sean Keely 965df6eef7 Basic SVM profiler.
Mostly a demo at this point.  Logs SVM (aka HMM) info to
HSA_SVM_PROFILE if set.

Example: HSA_SVM_PROFILE=log.txt SomeApp

Change-Id: Ib6fd688f661a21b2c695f586b833be93662a15f4
2022-06-23 19:30:06 -05:00
skhatri e7fc301aa7 Adding support for rocrtracer tools loading without environment variable
During the HSA initialization stage, ROCr now searches all the loaded
libraries for a symbol "HSA_AMD_TOOL_PRIORITY" and adds all those libraries
to the tools library init list.  Tools libraries listed in the HSA_TOOLS_LIB
env variable are also loaded in the given order and take priority
over HSA_AMD_TOOL_PRIORITY.

Change-Id: I739af42bbd777c44a9152c11e17dd69979b65e82
2022-06-23 20:08:30 -04:00
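
The ordering rule described in e7fc301aa7 can be sketched as a merge of two lists. This is an illustration, not ROCr's loader: `BuildToolInitList` is a hypothetical name, and it assumes that `HSA_TOOLS_LIB` entries are whitespace-separated, that discovered libraries keep their discovery order, and that duplicates are dropped.

```cpp
#include <algorithm>
#include <sstream>
#include <string>
#include <vector>

// Build the tools init order: libraries named in the HSA_TOOLS_LIB value come
// first, in the order given, followed by libraries that were discovered via
// the HSA_AMD_TOOL_PRIORITY symbol. The whitespace delimiter and the
// de-duplication are assumptions of this sketch.
std::vector<std::string> BuildToolInitList(
    const std::string& hsa_tools_lib,
    const std::vector<std::string>& discovered) {
  std::vector<std::string> ordered;
  auto add_unique = [&ordered](const std::string& name) {
    if (std::find(ordered.begin(), ordered.end(), name) == ordered.end()) {
      ordered.push_back(name);
    }
  };

  std::istringstream env(hsa_tools_lib);
  std::string name;
  while (env >> name) {
    add_unique(name);  // env-listed libraries take priority
  }
  for (const std::string& lib : discovered) {
    add_unique(lib);  // discovered libraries follow
  }
  return ordered;
}
```

Given the value "libb.so liba.so" and discovered libraries liba.so and libc.so, the resulting order is libb.so, liba.so, libc.so.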
Sean Keely e7152c8b16 Add format script.
Adds a script to run clang-format on the latest patch so we don't
need to remember the command line.

Also applies missing formatting to the prior commit,
"Add API for available GPU memory".

Change-Id: Ida51aedc38af229f6a26e275072654860748fa93
2022-06-23 20:08:30 -04:00
Graham Sider 350eba3a07 kfdtest: Update gpuName logic for ip discovery
Kernel amd-staging-drm-next branch changes GFX11 fish_colour sysfs
naming to "ip discovery". Update run_kfdtest.sh to use sysfs
gfx_target_version for ASICs that have transitioned to IP discovery
topology.

Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Change-Id: If202a0ceeed7324364539a33661f0abcf0973f07
2022-06-23 13:48:26 -04:00
Graham Sider e17b159230 libhsakmt: Make queue memory allocation non-paged
Non-paged allocation for queue memory is necessary for binding wptr to
GART. Required to support usermode queue oversubscription with MES on
GFX11.

Change ensures queue memory does not specify ATS.

Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Change-Id: I10b23b0205c90dad902c711a88cfb5e9b4979617
2022-06-22 18:37:26 -04:00
Ranjith Ramakrishnan 52bea549e3 Use GNUInstallDirs
Use GNUInstallDirs variables in post install scripts

Change-Id: Id0e3e37d412a30521d9846082d025a9e19a43942
2022-06-22 16:28:06 -04:00
Philip Yang 405fbd6f93 libhsakmt: add open SMI event handle
System Management Interface (SMI) events are read from an anonymous file
handle; this helper wraps the ioctl interface to get the anonymous file
handle for a GPU nodeid.

Define SMI event IDs and event triggers, copying the same values from
kfd_ioctl.h to avoid translation.

Change-Id: I5c8ba5301473bb3b80bb4e2aa33a9f675bedb001
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
2022-06-20 08:49:13 -04:00
Philip Yang 88778e13dc libhsakmt: hsaKmtGetNodeProperties add gpu_id
Add KFDGpuID to HsaNodeProperties to return gpu_id to the upper layer;
gpu_id is a hash ID generated by KFD to distinguish GPUs on the system.
ROCr and ROCProfiler will use gpu_id to analyze SMI event messages.

Change-Id: I6eabe6849230e04120674f5bc55e6ea254a532d6
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
2022-06-20 08:48:44 -04:00