Commit Graph

2959 Commits

Author SHA1 Message Date
David Yat Sin e30be76f37 Add query for IOMMU support
Reporting whether IOMMU V2 is supported.
IOMMU V1 support is not relevant to user, so not reporting it.

Change-Id: I77389484a87a352da9c2f7b2a5d9de264f90ee53
2023-01-19 11:33:21 -05:00
David Yat Sin 722794e258 Add memory pool query to return location
Change-Id: I240b77119d7b8ccfc5ff6a3190d6669d69f243e8
2023-01-19 08:45:05 -05:00
David Yat Sin a4f898ad15 Add env variable to print image SRD contents
Add environment variable HSA_IMAGE_PRINT_SRD to print contents of SRD
registers for image functions

Change-Id: Ifb47a73dcfad8745ee7445e20de96e1021b80bd6
2023-01-13 11:01:04 -05:00
Alexander Turek f7e3782b42 isa: Add fix for hsa_isa_iterate_wavefronts always returns 64
Currently, Wavefront::GetInfo(HSA_WAVEFRONT_INFO_SIZE.. always returns
64. Instead, return the proper wavefront size based on the ISA.

Temporarily, we only return one wavefront size for each ISA, as we do not
have a mechanism from upper layers to determine the correct wavefront size
when multiple wavefront sizes are supported. We are temporarily
returning 32 for all gfx1xxx cards even though they support 64, as the
kernels for gfx1xxx are compiled for wavefront-32 by default.

Change-Id: Ic6c2917b7e6d3704daf742d243f5ec7f49430de9
2023-01-12 08:40:07 -05:00
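
The selection rule described in the commit above can be sketched roughly as follows. This is an illustration in Python, not the runtime's code: the function name `wavefront_size_for_isa` is hypothetical, and recognizing gfx1xxx targets by a four-character suffix starting with `1` is an assumption made for this example.

```python
def wavefront_size_for_isa(isa_name: str) -> int:
    """Return a single wavefront size for an ISA name such as 'gfx1100'.

    Hypothetical helper: gfx1xxx targets report 32 (their kernels are
    compiled for wavefront-32 by default); everything else reports 64.
    """
    if not isa_name.startswith("gfx"):
        raise ValueError(f"unrecognized ISA name: {isa_name!r}")
    suffix = isa_name[len("gfx"):]
    # Assumption: "gfx1xxx" means a four-character suffix starting with '1'.
    if len(suffix) == 4 and suffix.startswith("1"):
        return 32
    return 64


if __name__ == "__main__":
    for name in ("gfx906", "gfx90a", "gfx1030", "gfx1100"):
        print(name, wavefront_size_for_isa(name))
```

A single value per ISA mirrors the temporary limitation stated in the message; a fuller design would need input from upper layers to choose between 32 and 64 where both are supported.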
David Belanger b25867c4b8 libhsakmt: Disabled allocation of CWSR with SVM for GFX11.
This is a temporary workaround for GPU hang issues observed on GFX11.

Change-Id: I98fbedbbd1c51fe402c2116b35ca548931a390c9
Signed-off-by: David Belanger <david.belanger@amd.com>
2023-01-11 17:28:31 -05:00
Shweta Khatri ed0a1be2c3 Enforce uncached memory on AllocatePCIeRW request
Change-Id: Ib5a624ab979220d50205448ef37b4550672fb97d
2023-01-11 16:52:15 -05:00
Ranjith Ramakrishnan dbf8905dd1 Revert "Remove RPATH/RUNPATH from ROCm libraries"
This reverts commit ac66865385.

Reason for revert: is blocked due to new proposal, so reverting the changes.

Change-Id: Id9b8cc1560ba3eea6e484e67df3fdc647da9f37d
2023-01-10 13:52:02 -05:00
Eric Huang 505287412f Revert "libhsakmt: Remove unnecessary CPU unmap"
This reverts commit 7787a039bd.

It causes a regression in pytorch benchmark.

Change-Id: I96173dbd061cf38d6f451c02cb181ae51b7f625e
Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
2023-01-06 17:16:40 -05:00
David Yat Sin 0f2fa3ba72 Force rocrtsts to use Code Object V4
Temporarily force rocrtsts to use Code Object V4 while compiler team is
about to switch the default Code Object to V5. Will switch back to using
default compiler setting once everything is tested/fixed.

Change-Id: I18e5c6771fffd8c60792fc197501d373c7ec22f3
2023-01-06 12:01:03 -05:00
Shweta Khatri e72329ab76 Fixed GFX11 Texture, Buffer and Sampler Resource Descriptor definitions
Change-Id: I101806f9f91ec2ad78339dabc98375bd09946dd0
2023-01-05 15:40:47 -05:00
Ranjith Ramakrishnan 5c90c762f9 Corrected libelf package name in depends list
The libelf1 package contains libelf.so.1. Updated the package name.
Improvement: Removed the initialization of cmake_install_libdir in the source code.
The build scripts initialize the variable to "lib" and pass it as a build argument.

Change-Id: I16a8cdc4c231487410c1114b818e9d01df4854de
2022-12-15 23:30:22 -08:00
Alex Sierra f2bda56d04 Revert "src: use SVM mechanism to register userptr memory"
This reverts commit 178a619b80.
There are some OpenMP issues that were introduced after SVM userptr
feature was added.

Signed-off-by: Alex Sierra <Alex.Sierra@amd.com>
Change-Id: I7ef87c5232a3bcbe594c743fa4b4958601845ba5
2022-12-08 17:33:51 -06:00
Alex Sierra d9f86ae02b Revert "libhsakmt: query svm info from userptrs at fault events"
This reverts commit 45fad29752.
There are some OpenMP issues that were introduced after SVM userptr
feature was added.

Signed-off-by: Alex Sierra <Alex.Sierra@amd.com>
Change-Id: I6566c9f0d39d05ecb92f38159880763f432939a5
2022-12-08 17:33:50 -06:00
Alex Sierra 21e95a4f2a Revert "libhsakmt: add env var to en/dis registration through SVM"
This reverts commit 8a746bdaed.
There are some OpenMP issues that were introduced after SVM userptr
feature was added.

Signed-off-by: Alex Sierra <Alex.Sierra@amd.com>
Change-Id: Ib01046571d2c84fa0fd228ecba0dee0eae3f994d
2022-12-08 17:33:48 -06:00
David Yat Sin 6bfe57aeb2 Add Stream Performance Monitor(SPM) APIs
Change-Id: I0d48782887814ef245b7e0182e2d5570aa8c3f50
2022-12-08 13:56:29 -05:00
David Yat Sin ecdebef0b9 Add agent info for fw and sdma ucode
Add two new agent info fields:
HSA_AMD_AGENT_INFO_UCODE_VERSION
HSA_AMD_AGENT_INFO_SDMA_UCODE_VERSION

Change-Id: I51cb853724b23a26e945e5c1ac32c16d0cb3bc31
2022-12-07 19:07:31 -05:00
raghavmedicherla 5727a10a1b [hsa-runtime] Modify elfsection checks in amd_elf_image class
Modified the if-condition checks in GElfImage::pullElf() of amd_elf_image.cpp to
check section types instead of doing a string check.

Change-Id: I1ab92f0a9118fb2382652a1cc900a3150cbee2da
2022-12-05 14:42:02 -05:00
David Yat Sin e39ad34d9c Check for debug support after parsing topology
Thunk keeps an internal cache of system topology that can be used to
speed up subsequent calls to hsaKmtAcquireSystemProperties(). This cache
is cleared by calling hsaKmtReleaseSystemProperties() at the beginning
of BuildTopology().
hsaKmtRuntimeEnable() also calls hsaKmtAcquireSystemProperties() inside
Thunk. Move call to hsaKmtRuntimeEnable() after BuildTopology() so that
we can re-use Thunk's internal cache.
Parsing of topology can take ~150 ms on systems with a large number of
nodes.

Change-Id: I741709d49d67d244f5fbd707fe8f01ab923bb153
2022-12-02 11:26:00 -05:00
James Zhu 7db29c4797 kfdtest: track Test Status in syslog
Track Test Status in syslog; it will help understand
syslog entries associated with test cases.

Change-Id: I7c0749102db9bc73d6ae3a237ec347a8fefb12e9
Signed-off-by: James Zhu <James.Zhu@amd.com>
2022-11-29 17:46:40 -05:00
Felix Kuehling 7787a039bd libhsakmt: Remove unnecessary CPU unmap
This is handled by __fmm_release calling aperture_release_area.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Change-Id: Ib8ed300e1734f03aeb9dfc8074897ece310b8af9
2022-11-28 17:18:13 -05:00
Felix Kuehling 73b0fb3d7c libhsakmt: Refactor and clean up CPU mappings
Use a common helper for CPU mappings to reduce duplicate code.
Consistently use MAP_SHARED for all render_fd mappings.
Remove double-mapping for AQL queue buffers on the CPU. This workaround
is only needed on the GPU.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Change-Id: Iff86c8cc9f1e5c982614b3f11129bc2cf8cbba02
2022-11-28 17:18:05 -05:00
Felix Kuehling 2d53430ce3 libhsakmt: Fix and simplify debug_get_reg_status
The NULL pointer check was the only way for that function to fail. And it
was done after the pointer was accessed. Simplify this by just returning
the result as a return value instead of using a pointer as output
parameter. This way the function can never fail and the caller doesn't
need to do any error handling.

Declare the function in libhsakmt.h instead of duplicating the
declaration in fmm.c.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Change-Id: I91b90d66166fd3b5cdc47c73a9bbc369c45b51fe
2022-11-28 17:17:43 -05:00
Alex Sierra 8a746bdaed libhsakmt: add env var to en/dis registration through SVM
Setting this variable to '0' forces memory registration/allocation
through the SVM API mechanism to be disabled.
If it is not set, or is set to '1', the SVM API will be used only if all
GPUs support it.

Signed-off-by: Alex Sierra <Alex.Sierra@amd.com>
Change-Id: Icdf7656de09aa9988b567ec6c024953398e9bb48
2022-11-28 13:42:43 -05:00
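
The decision described in the commit above can be sketched as follows. The commit message does not name the variable, so `EXAMPLE_SVM_REGISTRATION` is a placeholder, and the function name is hypothetical. Two further assumptions are made for illustration: any value other than '0' behaves like '1', and an empty GPU list counts as "not supported".

```python
from typing import Mapping, Sequence

# Placeholder name for illustration only; the real variable is not named here.
EXAMPLE_VAR = "EXAMPLE_SVM_REGISTRATION"


def svm_registration_enabled(env: Mapping[str, str],
                             gpu_supports_svm: Sequence[bool]) -> bool:
    """Decide whether to register memory through the SVM API.

    '0' force-disables the SVM path. Unset or '1' enables it only when
    every GPU supports SVM.
    """
    if env.get(EXAMPLE_VAR) == "0":
        return False
    # Assumption: with no GPUs at all, fall back to the non-SVM path.
    return len(gpu_supports_svm) > 0 and all(gpu_supports_svm)


if __name__ == "__main__":
    print(svm_registration_enabled({}, [True, True]))
    print(svm_registration_enabled({EXAMPLE_VAR: "0"}, [True, True]))
    print(svm_registration_enabled({EXAMPLE_VAR: "1"}, [True, False]))
```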
Daniel Phillips e71eb13784 kfdtest: Also detect under-reporting of available memory
Detect under-reporting of available memory by initially attempting to
allocate substantially more than reported available memory, and ensure
that the allocation fails. Continue shrinking the attempted allocation
until it succeeds, then fail the test if the successful allocation is
either too much more than or too much less than reported available.

Signed-off-by: Daniel Phillips <daniel.phillips@amd.com>
Change-Id: Ib418f0aa26e8db80590a6c5f2578da56a4b60f2b
2022-11-28 11:43:48 -05:00
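
The shrinking search described in the commit above could look roughly like the sketch below. It is not the kfdtest implementation: `try_alloc` is a hypothetical callback standing in for a real allocation attempt, and the starting factor (1.5x), shrink step (5%), and tolerance (10%) are invented for illustration.

```python
from typing import Callable


def reported_memory_is_plausible(reported: int,
                                 try_alloc: Callable[[int], bool],
                                 tolerance_pct: int = 10) -> bool:
    """Check that `reported` bytes of available memory is neither too high nor too low.

    `try_alloc(size)` is a hypothetical callback returning True when an
    allocation of `size` bytes succeeds.
    """
    # Start well above the reported amount; this attempt must fail,
    # otherwise available memory was under-reported.
    size = reported * 3 // 2
    if try_alloc(size):
        return False
    # Shrink the request by 5% per step until an allocation succeeds.
    while True:
        size = size * 95 // 100
        if size <= 0:
            return False
        if try_alloc(size):
            break
    # The first successful size must be close to the reported amount.
    return abs(size - reported) * 100 <= tolerance_pct * reported


if __name__ == "__main__":
    actual = 1000  # pretend this much memory can really be allocated
    fake_alloc = lambda size: size <= actual
    for reported in (500, 1000, 2000):
        print(reported, reported_memory_is_plausible(reported, fake_alloc))
```

Integer arithmetic is used for the shrink step so that the sequence of attempted sizes does not depend on floating-point rounding.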
Felix Kuehling 8e69b9c70e libhsakmt: Fix use of uninitialized variable
When hsaKmtCreateQueue is called for the first time for a node,
doorbells[NodeId].size is initialized to zero in init_process_doorbells
but is used to calculate the doorbell offset. It works just by accident
because doorbells[NodeId].size is uint32_t, so -1 will be 0xFFFFFFFF, which
is zero-extended into 0x00000000FFFFFFFF, and it will work as long as the mmap
offset bits are not within the lower 32 bits.

Bug: https://github.com/RadeonOpenCompute/ROCT-Thunk-Interface/issues/78
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Change-Id: Ia791adfc51363d4704cb50fa4f01137b7dd48a75
2022-11-25 14:07:45 -05:00
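
The arithmetic in the commit above can be reproduced with a small Python model of fixed-width integers. The commit only says the size is "used to calculate the doorbell offset"; the exact expression modeled here, masking the offset with `~(size - 1)`, is an assumption made for illustration, and the function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def u32_sub(a: int, b: int) -> int:
    """Subtract with uint32_t wraparound."""
    return (a - b) & MASK32


def align_mask(size_u32: int) -> int:
    """Model ~(uint64_t)(size - 1), where size is a uint32_t.

    size - 1 wraps in 32 bits, is zero-extended to 64 bits,
    and is then inverted in 64 bits.
    """
    low = u32_sub(size_u32, 1)  # zero-extension leaves the upper 32 bits at 0
    return ~low & MASK64


def masked_offset(mmap_offset: int, size_u32: int) -> int:
    """Assumed use of the size: align an mmap offset down to a multiple of size."""
    return mmap_offset & align_mask(size_u32)


if __name__ == "__main__":
    print(hex(u32_sub(0, 1)))                    # size == 0, so size - 1 wraps
    print(hex(align_mask(0)))                    # only the upper 32 bits survive
    print(hex(masked_offset(0x100002000, 0)))    # the lower 32 bits are lost
    print(hex(masked_offset(0x100002000, 0x2000)))
```

With a size of zero the mask keeps only the upper 32 bits, which is why the original code appeared to work whenever the offset had no bits set in the lower half.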
Eric Huang 8e8aa024fd kfdtest: remove scc test in MapUnmapToNodes for gfx90a A+A
Modifier scc is disabled from gfx90a's asm, so remove the
shader for gfx90a A+A and keep it for newer asics with scc
support.

Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
Change-Id: Iec3c7ccd5156a855adb2b02feb3db0761876aa2f
2022-11-25 13:55:28 -05:00
David Yat Sin f46ddb7ead libhsakmt: Initialize fd to -1
Fix compile error due to warning in some environments

Change-Id: Ie5fcfabb872c27c0de349eb215345b997fae7201
2022-11-25 15:01:53 +00:00
Ranjith Ramakrishnan 01fd84db5e Change pragma message to warning
The file reorganization feature was implemented with backward compatibility.
The backward compatibility support will be deprecated in a future release.
Changed the #pragma message to #warning for a smooth transition.

Change-Id: I21025f4cefb40721f095130263b4247877979d36
2022-11-23 13:06:34 -05:00
Shweta Khatri 8751e65b79 Fixed callback method for dl_iterate_phdr api which is called for each loaded shared object
Simplified the callback method. Also fixed the way loaded shared objects were appended to a string vector,
which was not being passed to this callback method.

Change-Id: I68661dd73f61a11c42fa92f670e8e7b6ffcb5711
2022-11-21 19:00:34 -05:00
Ranjith Ramakrishnan a34804ed3e Change pragma message to warning
The file reorganization feature was implemented with backward compatibility.
The backward compatibility support will be deprecated in a future release.
Changed the #pragma message to #warning for a smooth transition.

Change-Id: Ibaedc1873bc764d25f74d9ca9416077d084e332d
2022-11-17 09:38:24 -08:00
David Francis 88934cec2c libhsakmt: Don't close kfd_fd
When hsa is closed, it would close open fds for /dev/kfd but
not for /dev/dri/renderD*. This caused issues with CRIU
checkpoint, which expects that /dev/kfd will be open if
/dev/dri/renderD* is.

As a workaround for the CRIU behaviour, leave /dev/kfd open
when closing hsa.

Signed-off-by: David Francis <David.Francis@amd.com>
Change-Id: Ie1b2d5b1d8986750b0e560ae2934b7c73cff942e
2022-11-17 10:04:24 -05:00
David Yat Sin b9d1ad8604 Revert "Correct limit query return type to match spec ABI."
This reverts commit 7826d4ca2d.

Changing the parameter sizes breaks backward ABI.

Change-Id: Iff14b7c11294f0931f36fcfd42fff11a492d4205
2022-11-14 19:13:58 -05:00
Nirmal Unnikrishnan e0476629fe : updating the kfdtest packaging version
Change-Id: I4132d1106bd997b64b1496ea268961172a545102
2022-11-14 11:40:45 -06:00
David Yat Sin cb71e2d715 Allow page-aligned len for ipc_memory_create
Previous versions of HIP will call hsa_amd_ipc_memory_create with the
len aligned to granularity. Temporarily allow this so that we do not
break backward compatibility. Will remove this after 2 releases.

Change-Id: I6b5ac2cad5d32d62c803637cf1a2c6deebc03169
2022-11-09 15:01:47 +00:00
Graham Sider 6467664ec7 kfdtest: Rename IterateIsa to PersistentIterateIsa
To avoid confusion since this shader has changed to be persistent
(original IterateIsa may be re-used for debugger tests).

Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Change-Id: I4643692765fc7665933257e89d5b922e779ad2e5
2022-11-09 09:37:06 -05:00
Jeremy Newton a63d53ad8b Fix libc/gcc hardpath issues
Don't use the full path to link against libc, but rather let
cmake find it.

Regarding gcc_s, it doesn't seem like this is still needed, so I've
removed it instead.

Change-Id: I1dc594f10c647b2abfdab7c5e0de90c331c6eeaf
Signed-off-by: Jeremy Newton <Jeremy.Newton@amd.com>
2022-11-08 10:51:29 -05:00
Jeremy Newton 4fa9404fe2 Rework libdrm requires/recommends
To be more aligned with ROCr, libdrm dev package appears to be
required, but we don't care if it's ours or the distro's. So require
either but recommend our package to get the latest version.

Change-Id: I744ce4861644a83ba94c39e0bf4230eab58cc68a
Signed-off-by: Jeremy Newton <Jeremy.Newton@amd.com>
2022-11-04 17:33:08 -04:00
David Yat Sin c1e836b6ab Use paged memory for queues on MEC devices
MES devices need GART mappings and therefore need non-paged memory. But
using non-paged memory introduces performance regression where it can
take over 80 ms to see the signal changes if the memory is in the wrong
NUMA node. Currently, we cannot control NUMA affinity when allocating
non-paged memory. Use non-paged memory allocation only on devices that
have the MES scheduler.

Change-Id: Ib27fb01d75247aa4f2bb2aa4503c6af5a98afda0
2022-11-04 13:23:21 +00:00
David Yat Sin 0e4c7336ff Use os::createThread to launch SVM profiler thread
Using the previous method of std::thread for the SVM profiler task was causing
segfaults on thread launch on RHEL 8 if the libhsa-runtime library is loaded
using dlopen.

Change-Id: Ic010cd6ae9bc6e6ed0605de02b93f6aae8ed3e97
2022-11-03 10:52:11 -04:00
Ramesh Errabolu 75428364a7 Add support for CRIU testing
Change-Id: I8945a078ee8ae491245da6091e64b118584a48ab
2022-11-02 15:40:03 -04:00
Jonathan Kim f9edf73cd7 Fix doorbell offset fetch for GFX11
Transient exec usage is not required for GFX11 and will result in a NULL
return of s_sendmsg_rtn if directly returned to exec_lo.

Directly fetch and mask the doorbell ID to ttmp3 for GFX11 instead.

Change-Id: Ie17ed69d68d84ab18869b1c7871a0ed0482cd661
2022-11-02 11:55:37 -04:00
Nirmal Unnikrishnan 8225271e18 Updating the Rocrtst packaging
Update rocrtst packaging to add dependency on rocm-core so that rocrtst
gets uninstalled when rocm-core package is removed

Depends-On : I1e7ed52d7eed2c190d0b5651e7ded7192d7634b5

Change-Id: I7243dd29950b93a2665720a0062816c574f0f640
2022-11-02 09:38:48 -04:00
Ranjith Ramakrishnan 76cf5d2edc Add libelf-dev to package depends list
On Ubuntu, the package depends list was not showing libelf. Added it.

Change-Id: I713951bd7181f44d667561aaf437f85c6cd783b0
2022-10-31 13:07:55 -07:00
David Yat Sin b4f26534eb No-Op for allow access on imported IPC
If hsa_amd_agents_allow_access is called for an imported IPC handle,
ignore the request, as the pointer will already have been
mapped to other GPUs during IPCAttach().

Change-Id: I4bf33ed57e93b5a3ead749d4f87ab6f2750bed58
2022-10-25 22:38:47 +00:00
Alex Sierra 45fad29752 libhsakmt: query svm info from userptrs at fault events
Get more debug information about user pointers that were registered
through SVM API, and triggered by memory exception events.
A new kfdtest with this use case was also included inside
KFDExceptionTest.

Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Change-Id: I8e9df3c1c6c3f42d7b9235d12406d80d31746443
2022-10-21 15:33:14 -04:00
Alex Sierra 178a619b80 src: use SVM mechanism to register userptr memory
Register and map userptrs through the Shared Virtual Memory (SVM) API at
the kernel level when available. With this approach, performance
will improve, as registering/unregistering memory will not trigger any
system call to the KFD driver.

Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Change-Id: I20723cbeb340bf48b95e1115f0102c031397bc14
2022-10-21 15:32:02 -04:00
David Yat Sin 18547173e9 Early return for invalid pointer queries
If a user queries the pointer info on an invalid pointer,
hsaKmtQueryPointerInfo will return error or unknown pointer. The other
fields in HsaPointerInfo are invalid, so we do not return them to the
user.
Also removing the assert and returning unknown pointer instead, as the
assert will not trigger in release builds.
hsaKmtQueryPointerInfo may also return unknown pointer for userptrs as
they are not always tracked by thunk. Adjusting code to still treat
these pointers as valid in this case.

Change-Id: Idf5cd8b61cd532d31b072f449839d223369bb138
2022-10-21 15:28:48 -04:00
Freddy Paul ac66865385 Remove RPATH/RUNPATH from ROCm libraries
:Since all public interface libraries are present in
same folder RUNPATH/RPATH is not required in the library itself.
Application shall provide the required RPATH/RUNPATH to load all
libraries.

Change-Id: I1d1ba920bf291eb89bd1f4c0fd0cfd80c7d739bd
2022-10-21 11:05:06 -04:00
David Belanger a0d3db6e8d Initial changes for gfx1101, based on gfx1100/gfx1102 implementation.
Change-Id: I949c1027ccabf38b4f924590e42e7327dc550f73
Signed-off-by: David Belanger <david.belanger@amd.com>
Reviewed-by: Jonathan Kim <jonathan.kim@amd.com>
2022-10-13 09:28:39 -04:00
Yifan Zhang 4b9461dd42 kfdtest: add blacklist for gfx1103
Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>
Change-Id: I9ffe4dd8add505d0a6cfd3ed974fab6cef05f039
2022-10-12 10:08:13 +08:00