Commit graph

911 commits

Author SHA1 Message Date
David Yat Sin ecdebef0b9 Add agent info for fw and sdma ucode
Add two new agent info fields:
HSA_AMD_AGENT_INFO_UCODE_VERSION
HSA_AMD_AGENT_INFO_SDMA_UCODE_VERSION

Change-Id: I51cb853724b23a26e945e5c1ac32c16d0cb3bc31
2022-12-07 19:07:31 -05:00
raghavmedicherla 5727a10a1b [hsa-runtime] Modify elfsection checks in amd_elf_image class
Modified the if-condition checks in GElfImage::pullElf() of amd_elf_image.cpp to
 check section types instead of comparing strings.

Change-Id: I1ab92f0a9118fb2382652a1cc900a3150cbee2da
2022-12-05 14:42:02 -05:00
David Yat Sin e39ad34d9c Check for debug support after parsing topology
Thunk keeps an internal cache of system topology that can be used to
speed up subsequent calls to hsaKmtAcquireSystemProperties(). This cache
is cleared by calling hsaKmtReleaseSystemProperties() at the beginning
of BuildTopology().
hsaKmtRuntimeEnable() also calls hsaKmtAcquireSystemProperties() inside
Thunk. Move call to hsaKmtRuntimeEnable() after BuildTopology() so that
we can re-use Thunk's internal cache.
Parsing of topology can take ~150 ms on systems with a large number of
nodes.

Change-Id: I741709d49d67d244f5fbd707fe8f01ab923bb153
2022-12-02 11:26:00 -05:00
Shweta Khatri 8751e65b79 Fixed callback method for dl_iterate_phdr api which is called for each loaded shared object
Simplified the callback method. Also fixed the way loaded shared objects were appended to a string vector,
which was not being passed to this callback method.

Change-Id: I68661dd73f61a11c42fa92f670e8e7b6ffcb5711
2022-11-21 19:00:34 -05:00
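The callback pattern described in the commit above can be sketched as follows. This is a minimal illustration, not the code from the commit: `CollectObjectName` and `ListLoadedObjects` are hypothetical names, and the sketch assumes a Linux/glibc system where `dl_iterate_phdr` is declared in `<link.h>`. The point it shows is that the vector reaches the callback through the `data` argument.

```cpp
#include <link.h>  // dl_iterate_phdr, dl_phdr_info (Linux/glibc)

#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Called once per loaded object. `data` is the pointer that was handed to
// dl_iterate_phdr, here a std::vector<std::string>*.
static int CollectObjectName(struct dl_phdr_info* info, size_t /*size*/, void* data) {
  assert(data != nullptr);
  auto* names = static_cast<std::vector<std::string>*>(data);
  // dlpi_name is an empty string for the main program.
  names->emplace_back(info->dlpi_name ? info->dlpi_name : "");
  return 0;  // A non-zero return value would stop the iteration early.
}

// Hypothetical helper: returns the names of all currently loaded objects.
std::vector<std::string> ListLoadedObjects() {
  std::vector<std::string> names;
  dl_iterate_phdr(CollectObjectName, &names);
  return names;
}
```

Passing the container through `data` avoids a global variable and keeps the callback free of shared state.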
Ranjith Ramakrishnan a34804ed3e Change pragma message to warning
The file reorganization feature was implemented with backward compatibility.
The backward compatibility support will be deprecated in a future release.
Changed the #pragma message to #warning for a smooth transition.

Change-Id: Ibaedc1873bc764d25f74d9ca9416077d084e332d
2022-11-17 09:38:24 -08:00
David Yat Sin b9d1ad8604 Revert "Correct limit query return type to match spec ABI."
This reverts commit 7826d4ca2d.

Changing the parameter sizes breaks backward ABI.

Change-Id: Iff14b7c11294f0931f36fcfd42fff11a492d4205
2022-11-14 19:13:58 -05:00
David Yat Sin cb71e2d715 Allow page-aligned len for ipc_memory_create
Previous versions of HIP will call hsa_amd_ipc_memory_create with the
len aligned to granularity. Temporarily allow this so that we do not
break backward compatibility. Will remove this after 2 releases.

Change-Id: I6b5ac2cad5d32d62c803637cf1a2c6deebc03169
2022-11-09 15:01:47 +00:00
David Yat Sin c1e836b6ab Use paged memory for queues on MEC devices
MES devices need GART mappings and therefore need non-paged memory. But
using non-paged memory introduces performance regression where it can
take over 80 ms to see the signal changes if the memory is in the wrong
NUMA node. Currently, we cannot control NUMA affinity when allocating
non-paged memory. Use non-paged memory allocation only on devices that
have the MES scheduler.

Change-Id: Ib27fb01d75247aa4f2bb2aa4503c6af5a98afda0
2022-11-04 13:23:21 +00:00
David Yat Sin 0e4c7336ff Use os::createThread to launch SVM profiler thread
Using previous method of std::thread for SVM profiler task was causing
segfaults on thread launch on RHEL 8 if libhsa-runtime library is loaded
using dlopen.

Change-Id: Ic010cd6ae9bc6e6ed0605de02b93f6aae8ed3e97
2022-11-03 10:52:11 -04:00
Jonathan Kim f9edf73cd7 Fix doorbell offset fetch for GFX11
Transient exec usage is not required for GFX11 and will result in a NULL
return of s_sendmsg_rtn if directly returned to exec_lo.

Directly fetch and mask the doorbell ID to ttmp3 for GFX11 instead.

Change-Id: Ie17ed69d68d84ab18869b1c7871a0ed0482cd661
2022-11-02 11:55:37 -04:00
Nirmal Unnikrishnan 8225271e18 Updating the Rocrtst packaging
Update rocrtst packaging to add dependency on rocm-core so that rocrtst
gets uninstalled when rocm-core package is removed

Depends-On : I1e7ed52d7eed2c190d0b5651e7ded7192d7634b5

Change-Id: I7243dd29950b93a2665720a0062816c574f0f640
2022-11-02 09:38:48 -04:00
Ranjith Ramakrishnan 76cf5d2edc Add libelf-dev to package depends list
In ubuntu, the package depends list was not showing libelf. Added the same

Change-Id: I713951bd7181f44d667561aaf437f85c6cd783b0
2022-10-31 13:07:55 -07:00
David Yat Sin b4f26534eb No-Op for allow access on imported IPC
If hsa_amd_agents_allow_access is called for an imported IPC handle,
ignore the request, as this pointer will already have been mapped to
the other GPUs during IPCAttach().

Change-Id: I4bf33ed57e93b5a3ead749d4f87ab6f2750bed58
2022-10-25 22:38:47 +00:00
David Yat Sin 18547173e9 Early return for invalid pointer queries
If a user queries the pointer info on an invalid pointer,
hsaKmtQueryPointerInfo will return error or unknown pointer. The other
fields in HsaPointerInfo are invalid, so we do not return them to the
user.
Also removing the assert and returning unknown pointer instead, as the
assert will not trigger in release builds.
hsaKmtQueryPointerInfo may also return unknown pointer for userptrs as
they are not always tracked by thunk. Adjusting code to still treat
these pointers as valid in this case.

Change-Id: Idf5cd8b61cd532d31b072f449839d223369bb138
2022-10-21 15:28:48 -04:00
Freddy Paul ac66865385 Remove RPATH/RUNPATH from ROCm libraries
Since all public interface libraries are present in the
same folder RUNPATH/RPATH is not required in the library itself.
Application shall provide the required RPATH/RUNPATH to load all
libraries.

Change-Id: I1d1ba920bf291eb89bd1f4c0fd0cfd80c7d739bd
2022-10-21 11:05:06 -04:00
David Belanger a0d3db6e8d Initial changes for gfx1101, based on gfx1100/gfx1102 implementation.
Change-Id: I949c1027ccabf38b4f924590e42e7327dc550f73
Signed-off-by: David Belanger <david.belanger@amd.com>
Reviewed-by: Jonathan Kim <jonathan.kim@amd.com>
2022-10-13 09:28:39 -04:00
David Yat Sin 39632a713e Use user requested size for memory fragments
Amount of memory requested by user may be aligned-up internally to
the memory pool granularity. The extra padded memory should not be
considered when validating pointers from the user. Also return the
user requested size when user queries pointer information.

Change-Id: I28b25448ea03c836b44fafdb34b7330cf6887424
2022-10-07 21:32:49 +00:00
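The distinction between requested and padded size in the commit above can be sketched like this. The `Fragment` record and the helper names are hypothetical and do not come from the runtime; the sketch assumes the pool granularity is a power of two.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Rounds `size` up to the next multiple of `granularity` (a power of two).
constexpr std::size_t AlignUp(std::size_t size, std::size_t granularity) {
  return (size + granularity - 1) & ~(granularity - 1);
}

// Hypothetical bookkeeping record for one allocation.
struct Fragment {
  std::uintptr_t base;
  std::size_t requested_size;  // what the user asked for
  std::size_t allocated_size;  // requested_size aligned up to the granularity
};

Fragment MakeFragment(std::uintptr_t base, std::size_t requested,
                      std::size_t granularity) {
  assert(granularity != 0 && (granularity & (granularity - 1)) == 0);
  return Fragment{base, requested, AlignUp(requested, granularity)};
}

// A user pointer is accepted only inside the requested range; the padding
// added by the alignment is treated as out of bounds.
bool IsValidUserPointer(const Fragment& f, std::uintptr_t ptr) {
  return ptr >= f.base && ptr - f.base < f.requested_size;
}
```

With a 4096-byte granularity, a 100-byte request occupies 4096 bytes internally, but only offsets 0 through 99 validate as user pointers.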
David Yat Sin 9cb10a3dd8 Fix compile warnings and remove unused variables
Change-Id: I7acaee5e9cf218b358ffaf0e3af6067faf6f3d2a
2022-10-06 10:11:17 -04:00
Yifan Zhang f05770610c Adjust the passing value for GPU agent when do max single allocation test
For APU ASICs, the default configured size of video memory is
relatively small, while the reserved region has become larger in
recent-generation ASICs. The ratio of max alloc size to the pool size
may fall below the expected value, so adjust it.

Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>
Change-Id: I0e847c4c13e957cf6e811d3f379842619cf53370
2022-10-05 22:16:58 -04:00
Sean Keely 7826d4ca2d Correct limit query return type to match spec ABI.
Change-Id: I2eeed1f4b79d10c7d9ab0fd36c0146063053c76a
2022-10-04 01:48:26 +00:00
Jeremy Newton 1621936e32 Implement RPM Recommends for libdrm
What we want for libdrm-amdgpu is for it to be a recommended package.
Either libdrm or libdrm-amdgpu can be used, but we recommend the latter.

Using "SUGGESTS" does not seem like a strong enough requirement, but
CPACK does not support RPM recommends. Although, it does allow
customizing the RPM SPEC file template. By generating a template, which
is done by setting:

-DCPACK_RPM_GENERATE_USER_BINARY_SPECFILE_TEMPLATE=1

This template file can be trivially modified to allow adding a line to
implement CPACK_RPM_PACKAGE_RECOMMENDS.

Fixes 

Signed-off-by: Jeremy Newton <Jeremy.Newton@amd.com>
Change-Id: I34467b1ba878827ced9b8db74977967815732552
2022-10-03 12:42:51 -04:00
David Yat Sin d9935e6fba Add .kd to symbol kernel name for Binary Search sample
Fix Binary Search sample code as kernel symbol name has a .kd
extension.

Change-Id: Id21d2e432faa40bcd5cf343345502e823678fd0f
2022-09-12 16:17:04 -04:00
David Yat Sin dd255d31b8 Fix uninitialized variable warning
Fix warning when using valgrind

Change-Id: Ie59eaa990b9b5d339a178a2c6f9f4fac0e34e925
2022-09-08 09:10:00 -04:00
David Yat Sin e2388f242a Disable automatic dependency for rocrtst RPM
Disable automatic dependency detection when generating rocrtst RPMs.
This was adding unnecessary dependency on libhwloc, which is now
provided with the rocrtst package.
This matches behavior for DEB packages where there is no dependency
list for rocrtst.

Change-Id: If4a93f5b4c039b2f45e9445f60f65eefe84e32eb
2022-09-06 15:05:40 -04:00
jie1zhan 8941e7135c fix rocrtst on hang issue
close the file at the end of every test, instead of the whole test

Change-Id: Ia510990dad8d0bd82625bbd9b2958181e8f1dd25
2022-08-31 17:03:09 +08:00
Lang Yu d0e7c617df Query agent family id from roct
Add agent info query HSA_AMD_AGENT_INFO_ASIC_FAMILY_ID.
Then we can remove the code that parses the family id.

Signed-off-by: Lang Yu <Lang.Yu@amd.com>
Change-Id: I3ac4746d3015e89b32322ebc0f8a3084f98677a4
2022-08-25 10:15:43 -04:00
David Yat Sin 0647960019 Revert "Change search path to use RPATH"
This reverts commit c904cc5856.

The change from using RUNPATH to RPATH was not approved formally.
Reverting this patch until this gets approved.

Change-Id: Ibc1a8f9d5dfa6694adacccfd9e3b0d053660e848
2022-08-23 07:28:14 -04:00
Jonathan Kim 2b75a73ce7 Report no cooperative launch support with CU masking
The allocation logic of the SPI does not take into account compute
user thread management settings for masking CUs with the exception of
skipping fully disabled SEs.  This means that occupancy limited
dispatches such as cooperative launch may over allocate onto hardware
resources that are not immediately available, resulting in a potential
barrier logic hang as occupying work groups are waiting on enqueued
work groups to reach the barrier.

Further work will have to be done to get the per-SA CU enablement count
from the KFD in order to correctly clip the cooperative CU limit based
on the CU mask, which will require breaking the current ABI.

For now, report that cooperative launch is not supported while a CU
mask has been applied to prevent potential shader hangs.

Change-Id: I8be4bb47d65ceb62d805f36ef6ef3996d756021f
2022-08-22 08:22:28 -04:00
David Yat Sin c904cc5856 Change search path to use RPATH
Change default behavior for library search to use RPATH instead of
RUNPATH.

Change-Id: I328766006d02c2a8c76a3b1e0780ae5ca678ed86
2022-08-21 19:14:27 -04:00
David Yat Sin df3fe8c2fb Add env variable to disable CPU affinity override
New environment variable HSA_OVERRIDE_CPU_AFFINITY_DEBUG to
enable/disable overriding CPU affinity.

Default value is enabled(1).

This is a temporary variable and may be removed in the future.

Change-Id: Id6a7c611730471ddc276ca333fde1e57046bf32a
2022-08-19 11:07:49 -04:00
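A flag that defaults to enabled, as described in the commit above, could be read along these lines. Only the variable name HSA_OVERRIDE_CPU_AFFINITY_DEBUG comes from the commit message; the helper names and the parsing rule (any value other than "0" counts as enabled) are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

// Reads a boolean-like flag from the environment.
// Unset or empty means "use the default"; "0" means disabled (assumed rule).
bool ReadFlag(const char* name, bool default_value) {
  assert(name != nullptr);
  const char* value = std::getenv(name);
  if (value == nullptr || *value == '\0') return default_value;
  return std::string(value) != "0";
}

// Hypothetical wrapper: the override is enabled unless explicitly disabled.
bool CpuAffinityOverrideEnabled() {
  return ReadFlag("HSA_OVERRIDE_CPU_AFFINITY_DEBUG", true);
}
```

Under this sketch, running a program with `HSA_OVERRIDE_CPU_AFFINITY_DEBUG=0` set in its environment would turn the override off.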
David Yat Sin a7db31c5d1 Expose memory executable bit for SVM ranges
Add support to expose executable bit.

Change-Id: I054f5c3173822c369dd9908eec5c449459600ce1
Signed-off-by: David Yat Sin <David.YatSin@amd.com>
2022-08-17 12:05:42 -04:00
David Yat Sin 50b636d1d8 Fix for too many open files in rocrtst
Fix for regression in commit:
8a0fe6a832

When running rocrtstNeg.Queue_Validation_InvalidWorkGroupSize, each
time rocrtst::LoadKernelFromObjFile is called, a new CodeObject is
created and not deleted until end of the whole test. Each CodeObject
keeps an open file descriptor of the kernel file and this can exceed
maximum allowed open files on some systems. Deleting the CodeObjects
after each iteration in the test.

Change-Id: I388e56f95f7b671ecc29d5ecb4eb8ac2d0ddc412
2022-08-16 14:55:38 -04:00
David Yat Sin ec759c7995 Add rocrtst to Query agent memory available
Add new test for GPU agents memory available

Change-Id: Ib07e2003a21659b99732b535cd004081635d6aa1
Signed-off-by: David Yat Sin <david.yatsin@amd.com>
2022-08-11 09:36:58 -04:00
David Yat Sin 86e4cb1ddd Add max enum value to hsa_agent_info_t
Add max enum value to force size of enum and avoid clang compile
warnings.

Change-Id: I9cdf529517cc605a5039c3a924fd718ece16029d
2022-08-10 11:11:36 -04:00
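The technique named in the commit above, adding a maximum enumerator so the enum needs at least 32 bits, can be sketched as below. `example_info_t` and its enumerators are hypothetical stand-ins, not the real `hsa_agent_info_t`, and the sentinel value 0x7fffffff is an assumption.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical enum used only for illustration.
typedef enum {
  EXAMPLE_INFO_NAME = 0,
  EXAMPLE_INFO_VENDOR = 1,
  // Sentinel: the largest value a signed 32-bit integer can hold. Its
  // presence means the underlying type must be able to represent it.
  EXAMPLE_INFO_MAX = 0x7fffffff
} example_info_t;

static_assert(sizeof(example_info_t) >= sizeof(std::int32_t),
              "enum must be at least 32 bits wide");

// Hypothetical helper: converts an enumerator to a fixed-width integer.
std::int32_t ToInt32(example_info_t value) {
  assert(value >= 0);
  return static_cast<std::int32_t>(value);
}
```

Without the sentinel, a compiler option such as `-fshort-enums` may pick a narrower type for an enum whose values all fit in one byte.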
David Yat Sin 117495fe88 Fix image LUT for gfx11
For gfx11 the image type table has some different values compared to
previous asic families (e.g TYPE_SRGB). Creating a new LUT class to
use these new values.

Change-Id: Ifdfc6cd29bfd5f4ec2643c848fcb9986eb874f9e
2022-08-04 11:23:28 -04:00
Yifan Zhang daa01b8d57 Add gfx1103 support
This patch adds gfx1103 support

Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>
Change-Id: I7f1d580059fcd501bce2c8fea894637960c29bc1
2022-08-04 11:23:28 -04:00
David Yat Sin 574bea4a4c Use FAMILY_GFX1103 for gfx1103
Also adding elf entry

Change-Id: Id47ec379f2880961022b4607eb7f106b7e9d7048
2022-08-04 11:23:28 -04:00
David Yat Sin f971834d7a Update entries for gfx11
Update image table enums and format tables for gfx11.
Remove some entries that are not needed.

Change-Id: I060c1e285925a6d428ef1c5498f5dd89f5d79d97
2022-08-04 11:23:28 -04:00
David Yat Sin 319e71e79f Use FAMILY_GFX1100 for GFX11 devices
Change-Id: Ib182b647a91987040d655dbc05cbe5f867d4f61a
2022-08-04 11:23:28 -04:00
David Yat Sin a742b7e830 Update addrLib to support gfx11
This library was taken from public MESA library:
https://gitlab.freedesktop.org/mesa/mesa/-/tree/main/src/amd/addrlib

with top commit:
2866ae32da0348caf71ad2d11c353321df626ff4

Removing macros.h as it is no longer used by addrlib

Change-Id: I0fdabfe48b74c259b4d29d81beae89604bbc141a
2022-08-04 11:23:28 -04:00
David Yat Sin c2a60a4d5d Fix scratch memory alignment on GFX11
GFX11 requires scratch memory alignment of 256 Bytes instead of 1024.

Change-Id: I103de1c12f3a4877d7d36f13254301166c66e11f
2022-08-04 11:23:28 -04:00
David Yat Sin 90322899fe Update scratch register definitions for GFX11
Update scratch register definitions for GFX11 asics.

Change-Id: I6195e04b0a099fe84d1015c2f34ca3756a8175ef
2022-08-04 11:23:28 -04:00
Graham Sider 061aa04147 Make queue memory allocation non-paged
Non-paged allocation for queue memory necessary for binding wptr to
GART. Required to support usermode queue oversubscription with MES for
GFX11.

Adds AllocateNonPaged entry to MemoryRegion::AllocateEnum for clarity;
aliases AllocateIPC.

Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Change-Id: I1a97a1820da26cf2433d9c237b2e6d2b0b8628b4
2022-08-04 11:21:00 -04:00
Graham Sider db1a13aa05 Clean up includes in queue.h
Formatting.

Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Change-Id: I141c8308d6b283b376035e21344629dc665289bb
2022-08-03 10:57:17 -04:00
David Yat Sin 907e05c1b3 Add new ImageManager for GFX11
Adding new ImageManager class for GFX11 GPUs

ImageManagerGfx11 functions copied from ImageManagerNv.
Register descriptions in resource_gfx11.h updated for gfx11.

Signed-off-by: David Yat Sin <david.yatsin@amd.com>
Change-Id: I48b39f6a633aef14aa829f7240a43fe0feb1c290
2022-08-03 10:57:09 -04:00
David Yat Sin cc3bd31591 Add gfx1102 support
Change-Id: I39cbda81a7a999aa2ecfad7a3e720000f7ca3408
Signed-off-by: David Yat Sin <David.YatSin@amd.com>
2022-08-03 10:56:54 -04:00
Graham Sider 446c5e9672 Add gfx1100 support
Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Change-Id: Ic5d5559e43df5c73409ba900a42c6901aabae661
2022-08-03 10:56:49 -04:00
Jay Cornwall 710adcc252 Add gfx11 blit/trap shaders
David Yat Sin:
   Rebased to amd-staging branch
   Changed MSG_GET_DOORBELL to MSG_RTN_GET_DOORBELL

Change-Id: I6015e54c4d8897f4c796f58c7fbc298758c6d76d
2022-08-03 10:56:41 -04:00
Jonathan Kim 9d2fe1ac2a Fix GPU destruction when user disabled
GPUs excluded by RVD are not expected to have scratch, memory, trap
handling nor memory regions set up.  Now that these GPUs are added to
a new list, early return on agent destruction to prevent bad function
calls on destroy.

Also fix up broken memory releases between the gpu lists and ugly braces.

Change-Id: I52fc6e86ceba0a0383cedc63310eb409515eaf9f
2022-08-02 14:18:43 -04:00
jie1zhan 8a0fe6a832 Free the executable memory when it is not used
Fix the rocrtst test failure "The runtime failed to allocate the necessary resources"

Change-Id: Ie4ffeb939fb322db068f3132a7973a359c204176
2022-07-29 15:16:37 -04:00