Commit Graph

2959 Commits

Author SHA1 Message Date
David Yat Sin 39632a713e Use user requested size for memory fragments
Amount of memory requested by user may be aligned-up internally to
the memory pool granularity. The extra padded memory should not be
considered when validating pointers from the user. Also return the
user requested size when user queries pointer information.

Change-Id: I28b25448ea03c836b44fafdb34b7330cf6887424
2022-10-07 21:32:49 +00:00
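The idea in commit 39632a713e can be sketched as follows. This is a minimal illustration, not the ROCr implementation: the Fragment struct, its field names, and the helper functions are all hypothetical. The pool remembers both the user-requested size and the aligned-up size, and only the requested size is used for pointer validation and for pointer-info queries.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical bookkeeping record for one allocation from a memory pool.
struct Fragment {
    std::uintptr_t base;    // start address handed to the user
    std::size_t requested;  // size the user asked for
    std::size_t allocated;  // size after aligning up to the pool granularity
};

// Round `size` up to the next multiple of `granule` (granule must be non-zero).
std::size_t align_up(std::size_t size, std::size_t granule) {
    assert(granule != 0);
    return (size + granule - 1) / granule * granule;
}

Fragment make_fragment(std::uintptr_t base, std::size_t requested,
                       std::size_t granule) {
    return Fragment{base, requested, align_up(requested, granule)};
}

// A user range [ptr, ptr + len) is valid only if it lies inside the requested
// size; the padding between `requested` and `allocated` is not user memory.
bool is_valid_user_range(const Fragment& f, std::uintptr_t ptr, std::size_t len) {
    if (ptr < f.base || len > f.requested) return false;
    return ptr - f.base <= f.requested - len;
}

// Pointer-info queries report the requested size, not the padded one.
std::size_t reported_size(const Fragment& f) { return f.requested; }
```

With a 4096-byte granularity, a 100-byte request occupies 4096 bytes internally, but byte 100 and beyond are rejected as user pointers.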
David Yat Sin 9cb10a3dd8 Fix compile warnings and remove unused variables
Change-Id: I7acaee5e9cf218b358ffaf0e3af6067faf6f3d2a
2022-10-06 10:11:17 -04:00
Yifan Zhang f05770610c Adjust the passing value for GPU agent when doing the max single allocation test
For APU ASICs, the default configured size of video memory is
relatively small, while the reserved region has become larger in recent
generation ASICs, so the ratio of max alloc size to pool size may fall
below the expected value. Adjust the passing value accordingly.

Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>
Change-Id: I0e847c4c13e957cf6e811d3f379842619cf53370
2022-10-05 22:16:58 -04:00
Sean Keely 7826d4ca2d Correct limit query return type to match spec ABI.
Change-Id: I2eeed1f4b79d10c7d9ab0fd36c0146063053c76a
2022-10-04 01:48:26 +00:00
Jeremy Newton 1621936e32 Implement RPM Recommends for libdrm
What we want for libdrm-amdgpu is for it to be a recommended package.
Either libdrm or libdrm-amdgpu can be used, but we recommend the latter.

Using "SUGGESTS" does not seem like a strong enough requirement, but
CPACK does not support RPM recommends. Although, it does allow
customizing the RPM SPEC file template. By generating a template, which
is done by setting:

-DCPACK_RPM_GENERATE_USER_BINARY_SPECFILE_TEMPLATE=1

This template file can be trivially modified to allow adding a line to
implement CPACK_RPM_PACKAGE_RECOMMENDS.

Fixes 

Signed-off-by: Jeremy Newton <Jeremy.Newton@amd.com>
Change-Id: I34467b1ba878827ced9b8db74977967815732552
2022-10-03 12:42:51 -04:00
Graham Sider 73adbdee2c kfdtest: Make KFDCWSRTest.BasicTest buffer sizes dynamic
KFDCWSRTest.BasicTest is parameterized to allow an easy method of
tweaking the number of work-items (and save/restores). The input/output
buffers were previously hardcoded to a single page, which would cause a
segmentation fault if the number of work-items specified is greater than
1024 for wave32.

Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Change-Id: Ieefc819a5d81c77cee88081a287fd383e6378e74
2022-09-30 18:18:17 -04:00
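The sizing problem fixed by commit 73adbdee2c can be sketched as below. The sketch assumes one 32-bit slot per work-item, which is consistent with 1024 work-items filling a 4096-byte page but is not confirmed by the commit message. buffer_bytes is a hypothetical helper, not a kfdtest function.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Size, in bytes, of a buffer that holds one 32-bit slot per work-item
// (an assumption for illustration), rounded up to whole pages.
std::size_t buffer_bytes(std::size_t num_witems, std::size_t page_size) {
    assert(page_size != 0);
    std::size_t needed = num_witems * sizeof(std::uint32_t);
    if (needed == 0) needed = 1;  // always reserve at least one page
    return (needed + page_size - 1) / page_size * page_size;
}
```

Under that assumption, 1024 work-items still fit in a single 4096-byte page, while 1025 work-items need a second page, which a hardcoded single-page buffer cannot provide.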
Graham Sider af48352d9a kfdtest: Add missing break in FamilyIdFromNode
Add missing break after GFX11 FamilyId patch.

Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Change-Id: I8fd727054442d74a95f69724ab07c4095f8ae77e
2022-09-29 09:22:05 -04:00
Graham Sider 6294ef564b kfdtest: Update COMPUTE_PGM_RSRC1 for software trap
For software trap in GFX11, COMPUTE_PGM_RSRC1 must have PRIV = 1.

Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Change-Id: Id504889c3ca2588b6c8cefdebaec00dcfc217995
2022-09-23 16:38:31 -04:00
Graham Sider e44952d8a6 kfdtest: Add dedicated FamilyId for GFX11
Add FAMILY_GFX11 to KfdFamilyId enum.

Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Change-Id: Ib2aa3f8cf31041f7b4cdd9a2f3e36489dde5554c
2022-09-23 16:29:09 -04:00
Philip Yang 590fd531c0 kfdtest: Correct mmap return value checking
On error, mmap returns MAP_FAILED, which is (void *)-1, not a NULL
pointer.

Change-Id: I81b187266c943fa0aa4fab21b529d4c2989b12ad
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
2022-09-21 15:38:11 -04:00
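The check described in commit 590fd531c0 can be illustrated with a short sketch. map_anonymous is a hypothetical wrapper written for this example; only mmap, munmap, and MAP_FAILED come from the POSIX API.

```cpp
#include <sys/mman.h>
#include <cassert>
#include <cstddef>

// Hypothetical wrapper: returns `size` bytes of anonymous memory,
// or nullptr if the mapping could not be created.
void* map_anonymous(std::size_t size) {
    void* p = mmap(nullptr, size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    // mmap signals failure with MAP_FAILED ((void *)-1), never with NULL,
    // so comparing the result against NULL would miss every error.
    if (p == MAP_FAILED) return nullptr;
    return p;
}

// Usage check: a normal request succeeds, and a zero-length request
// (which mmap rejects) is reported as nullptr by the wrapper.
void example_usage() {
    void* p = map_anonymous(4096);
    assert(p != nullptr);
    munmap(p, 4096);
    assert(map_anonymous(0) == nullptr);
}
```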
Graham Sider 79279e860f libhsakmt: Skip hsa_gfxip_table search for GFX11+
Prior to launch some ASICs may re-use PCI DIDs from older generations.
This can cause issues during topology initialization as hsa_gfxip_table
lookups will override sysfs-provided gfx versions, causing incorrect
gfxip selection. Since no new entries will be added to hsa_gfxip_table,
limit its search only to pre-GFX11 ASICs.

Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Change-Id: I53eaefac5db2650a36a6ce9f21daf750f50cfd26
2022-09-21 14:09:35 -04:00
Daniel Phillips 169673a435 kfdtest: Add thunk test for KFD memory availability ioctl
Signed-off-by: Daniel Phillips <Daniel.Phillips@amd.com>
Change-Id: Ic4c1ffefdc3570718a1fce4e53ca5f1ebde8c479
2022-09-21 13:26:38 -04:00
David Yat Sin d9935e6fba Add .kd to symbol kernel name for Binary Search sample
Fix Binary Search sample code as kernel symbol name has a .kd
extension.

Change-Id: Id21d2e432faa40bcd5cf343345502e823678fd0f
2022-09-12 16:17:04 -04:00
Graham Sider 14aa475905 kfdtest: Update exclude for gfx1101
Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Change-Id: I78c1130c4a85a98a265a090e40390e56d3be2819
2022-09-09 09:40:23 -04:00
Graham Sider 82a41c7e4d kfdtest: Remove unnecessary v_ ops in IterateIsa
IterateIsa had some leftover instructions from when the shader was
getting updated for KFDCWSRTest.BasicTest.

Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Change-Id: I41ae7b7948cbe2aff8bf61b170b9a7d498b836a3
2022-09-09 09:40:04 -04:00
Philip Yang 093cf898fb libhsakmt: Set CWSR SVM range MADV_DONTFORK
When a process forks, the copy-on-write MMU notifier on the CWSR range
evicts user queues, then updates the GPU mapping and resumes the queues.
Use MADV_DONTFORK to avoid the COW MMU notifier callback on the CWSR SVM
range.

Use mmap to allocate the SVM range for CWSR because posix_memalign doesn't
allocate a new range in the child process; registering the SVM range then
fails because the range is an invalid address in the forked child process.

Change-Id: Ibaea56a691dd6f577ed2e1f2d43f4a3500b8316f
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
2022-09-08 22:53:47 -04:00
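A minimal sketch of the mmap plus MADV_DONTFORK combination from commit 093cf898fb, for Linux only. map_dontfork is a hypothetical name and the SVM registration step is omitted; the sketch only shows how a range is excluded from fork.

```cpp
#include <sys/mman.h>
#include <cassert>
#include <cstddef>

// Hypothetical helper: map anonymous memory and mark it so that a forked
// child does not inherit the range (and no copy-on-write is set up for it).
void* map_dontfork(std::size_t size) {
    assert(size > 0);
    void* p = mmap(nullptr, size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) return nullptr;
    if (madvise(p, size, MADV_DONTFORK) != 0) {
        munmap(p, size);  // do not leak the mapping if the advice fails
        return nullptr;
    }
    return p;
}
```

The range stays fully usable in the parent; it is simply absent from the address space of any child created by fork afterwards.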
Philip Yang b2691c359d libhsakmt: Use mmap aligned for scratch allocation
To remove duplicate mmap aligned allocation code.

Change-Id: Ibc05cc4aaf6d190bd2382e33bdeca1496960c5f2
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
2022-09-08 22:53:47 -04:00
Philip Yang b7710a1dda libhsakmt: Add mmap alloc aligned helper function
The helper mmaps a larger address range that includes alignment padding
plus guard pages, then unmaps the padding and guard pages at the beginning
and end of the range and returns the aligned address range.

Change-Id: Iaf3c711a079c744289efbafee9b5e63aaf724765
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
2022-09-08 22:53:47 -04:00
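The over-allocate-then-trim technique from commit b7710a1dda can be sketched as below. This is a simplified illustration under stated assumptions, not the libhsakmt helper: mmap_aligned is a hypothetical name, the alignment must be a power of two of at least one page, and guard pages are left out.

```cpp
#include <sys/mman.h>
#include <unistd.h>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: map at least `size` bytes starting at an address that
// is a multiple of `align`. The caller unmaps with munmap(ptr, size).
void* mmap_aligned(std::size_t size, std::size_t align) {
    const std::size_t page = static_cast<std::size_t>(sysconf(_SC_PAGESIZE));
    assert(size > 0);
    assert(align >= page && (align & (align - 1)) == 0);

    size = (size + page - 1) & ~(page - 1);  // work in whole pages
    const std::size_t padded = size + align; // room to slide to an aligned start

    void* raw = mmap(nullptr, padded, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (raw == MAP_FAILED) return nullptr;

    const auto base = reinterpret_cast<std::uintptr_t>(raw);
    const std::uintptr_t start =
        (base + align - 1) & ~(static_cast<std::uintptr_t>(align) - 1);
    const std::size_t head = start - base;          // padding before the range
    const std::size_t tail = padded - head - size;  // padding after the range

    // Give the unused padding back to the kernel.
    if (head != 0) munmap(raw, head);
    if (tail != 0) munmap(reinterpret_cast<void*>(start + size), tail);
    return reinterpret_cast<void*>(start);
}
```

Because the mapping is enlarged by exactly `align` bytes, an aligned start address always exists inside it, whatever address the kernel picks.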
Philip Yang 2230d01c8a kfdtest: Add KFDSVMRangeTest.PrefaultPartialRangeTest
Change-Id: I00617dd5a2216fab90c2b0d116825ec274d14d13
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
2022-09-08 22:53:47 -04:00
Philip Yang 115f1f8d1f kfdtest: Add KFDSVMRangeTest.VramOvercommitGiantRangeTest
Change-Id: I5f6b3b6a910ff6646bf4b0c48ae3e94ad243cf88
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
2022-09-08 22:53:47 -04:00
Philip Yang 3190f189b4 kfdtest: Add KFDSVMRangeTest.VramOvercommitTest
Change-Id: Id5b23d5efd4f6a9717d1ca196c8635846493f77d
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
2022-09-08 22:52:43 -04:00
Aaron Liu 06a90612e9 libhsakmt: expand control stack size limit for all gfx103x
GFX1036(ISA version) is not included in the previous range.
This patch can really include all gfx10 series ASICs.

Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Change-Id: I0e28dbfc031c216166b306b9fb39f644f75a330f
2022-09-08 22:37:50 -04:00
jie1zhan d98c729ff9 libhsakmt: Add check for the runtime debugger
To avoid a segfault, enabling the runtime debugger is not supported
if the GPU firmware doesn't support debug exceptions.

Signed-off-by: jie1zhan <jesse.zhang@amd.com>
Change-Id: Ifad57a6e78cb1c92b1f8927355ece8c64e89c51b
2022-09-08 20:52:01 -04:00
Mukul Joshi 57a1c6f3ff libhsakmt: Remove potential double free in queue creation
Remove potential double free condition when free_queue() is called
after hsaKmtDestroyQueue() if mapping doorbell fails during queue
creation.

Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Change-Id: If2aa19c455b30d2940b232dbafb9cc1eaad721a5
2022-09-08 11:39:34 -04:00
David Yat Sin dd255d31b8 Fix uninitialized variable warning
Fix warning when using valgrind

Change-Id: Ie59eaa990b9b5d339a178a2c6f9f4fac0e34e925
2022-09-08 09:10:00 -04:00
Shikai Guo 4951495fca libhsakmt: add filter node for new chip
When running kfdtest test cases, the test cases are not supported because
the filter node of the new chip is missing in libhsakmt, so a new node
is added in order to support the kfdtest cases.

Signed-off-by: shikaguo <shikai.guo@amd.com>
Change-Id: I0cd9ffd7d4387129cfb0f8de6b669f431949ab49
2022-09-07 15:18:18 +08:00
David Yat Sin e2388f242a Disable automatic dependency for rocrtst RPM
Disable automatic dependency detection when generating rocrtst RPMs.
This was adding unnecessary dependency on libhwloc, which is now
provided with the rocrtst package.
This matches behavior for DEB packages where there is no dependency
list for rocrtst.

Change-Id: If4a93f5b4c039b2f45e9445f60f65eefe84e32eb
2022-09-06 15:05:40 -04:00
Philip Yang 3dbf5feffe libhsakmt: Fix queue leaking debug memory
Queue ctx_save_restore memory is allocated with size
ctx_save_restore_size + debug_memory_size, use the same size
in free_queue to free ctx_save_restore memory.

Change-Id: I4902ff15fb82ddea64b8342b89776a1bf5c38d13
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
2022-09-01 12:31:16 -04:00
David Yat Sin 1c385fb257 libhsakmt: Add check for invalid aperture
Avoiding segfault when an invalid SharedMemoryHandle is passed in
when calling fmm_register_shared_memory.

Change-Id: I0e0bbed01487fc10afcbb170eb9330e70b209d14
Signed-off-by: David Yat Sin <David.YatSin@amd.com>
2022-08-31 15:14:16 -04:00
jie1zhan 8941e7135c fix rocrtst on hang issue
Close the file at the end of every test instead of at the end of the whole test run.

Change-Id: Ia510990dad8d0bd82625bbd9b2958181e8f1dd25
2022-08-31 17:03:09 +08:00
kent.russell@amd.com ea4d4917c1 src/topology.c: Fix NULL check
Now that HsaNodeProperties is passed in to
topology_get_node_props_from_drm, check that pointer instead of the
pointer for MarketingName (which throws a compiler warning)

Signed-off-by: kent.russell@amd.com <kent.russell@amd.com>
Change-Id: If76b24e1bab5a62e514ab440b6316c7b7cd264c1
2022-08-29 08:56:41 -04:00
Lang Yu d0e7c617df Query agent family id from roct
Add the agent info query HSA_AMD_AGENT_INFO_ASIC_FAMILY_ID.
Then we can remove the code that parses the family id.

Signed-off-by: Lang Yu <Lang.Yu@amd.com>
Change-Id: I3ac4746d3015e89b32322ebc0f8a3084f98677a4
2022-08-25 10:15:43 -04:00
Lang Yu 66e9e97e0d libhsakmt: add FamilyID info into HsaNodeProperties
Query family id info from drm render node, then
ROCr can query this info directly from Thunk
instead of parsing the info by itself.

Signed-off-by: Lang Yu <Lang.Yu@amd.com>
Change-Id: I030bd27ab2379fbf87f3d787302c3b8613456278
2022-08-24 21:14:06 -04:00
Graham Sider 4267c4b524 kfdtest: Bump C++ compiler to gnu++17
Required due to LLVM retirement of llvm::apply_tuple, instead using
std::apply which was introduced in C++17.

Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Change-Id: I6646ebcca7d71d3e1bcf340ccfa3db2c15a3110a
2022-08-24 11:46:18 -04:00
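For readers unfamiliar with the replacement named in commit 4267c4b524, std::apply (from the C++17 header <tuple>) calls a callable with the elements of a tuple as its arguments. The functions below are hypothetical stand-ins and do not come from kfdtest.

```cpp
#include <cassert>
#include <tuple>

// Hypothetical function that takes its parameters individually.
int add_scaled(int a, int b, int scale) { return (a + b) * scale; }

// std::apply unpacks the tuple and forwards its elements as arguments,
// so this is equivalent to add_scaled(get<0>(args), get<1>(args), get<2>(args)).
int call_with_tuple(const std::tuple<int, int, int>& args) {
    return std::apply(add_scaled, args);
}

// Usage check: (1 + 2) * 10 == 30.
void example_usage() {
    assert(call_with_tuple(std::make_tuple(1, 2, 10)) == 30);
}
```

The code must be compiled as C++17 or later, which is why the commit raises the compiler standard to gnu++17.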
Graham Sider 0055ef46c4 kfdtest: Add KFDCWSRTest.BasicTest* to GFX10 blacklist
Failure with new CWSR tests reported for GFX10, for now add to blacklist.



Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Change-Id: I5b2bd9ec61c64ad66e1c34ba2c192bece808f56f
2022-08-23 16:03:24 -04:00
David Yat Sin 0647960019 Revert "Change search path to use RPATH"
This reverts commit c904cc5856.

The change from using RUNPATH to RPATH was not approved formally.
Reverting this patch until this gets approved.

Change-Id: Ibc1a8f9d5dfa6694adacccfd9e3b0d053660e848
2022-08-23 07:28:14 -04:00
Graham Sider 0dbac97b75 kfdtest: Overhaul KFDCWSRTest.BasicTest
This patch restructures the CWSR basic test and allows for
creating parameterized CWSR tests. This patch introduces four
parameterizations. These tests behave as follows:

This test dispatches the IterateIsa shader, which continuously
increments a vgpr for (num_witems / WAVE_SIZE) waves. While this shader
is running, dequeue/requeue requests are sent in a loop to trigger
CWSRs.

This test defines a CWSR threshold. Once the number of CWSRs triggered
reaches the threshold, a known-value is filled into the inputBuf to
signal the shader to exit.

4 parameterized tests are defined:

KFDCWSRTest.BasicTest/0
KFDCWSRTest.BasicTest/1
KFDCWSRTest.BasicTest/2
KFDCWSRTest.BasicTest/3

0: 1 work-item, CWSR threshold of 10
1: 256 work-items, CWSR threshold of 50
2: 512 work-items, CWSR threshold of 100
3: 1024 work-items, CWSR threshold of 1000

Tuple Format: (num_witems, cwsr_thresh)

num_witems: Defines the number of work-items.
cwsr_thresh: Defines the number of CWSRs to trigger.

Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Change-Id: I639eb7bd75b14ee70e190b4bd19dcf34096fc7bf
2022-08-22 19:55:04 -04:00
Jonathan Kim c1d8ac8437 libhsakmt: bump debug major rev for snapshot and watchpoint changes
The debugger can now request snapshot copies with entry size and
set/clear watchpoints by device.

v3: drop min version check to v10.0

v2: check runtime allowance from v10.3 to 13.x

Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Change-Id: I9befefb596201a11591de218db29a9317b41e69b
2022-08-22 12:11:04 -04:00
Jonathan Kim 2b75a73ce7 Report no cooperative launch support with CU masking
The allocation logic of the SPI does not take into account compute
user thread management settings for masking CUs with the exception of
skipping fully disabled SEs.  This means that occupancy limited
dispatches such as cooperative launch may over allocate onto hardware
resources that are not immediately available, resulting in a potential
barrier logic hang as occupying work groups are waiting on enqueued
work groups to reach the barrier.

Further work will have to be done to get the per-SA CU enablement count
from the KFD in order to correctly clip the cooperative CU limit based
on the CU mask, which will require breaking the current ABI.

For now, report that cooperative launch is not supported while a CU
mask has been applied to prevent potential shader hangs.

Change-Id: I8be4bb47d65ceb62d805f36ef6ef3996d756021f
2022-08-22 08:22:28 -04:00
David Yat Sin c904cc5856 Change search path to use RPATH
Change default behavior for library search to use RPATH instead of
RUNPATH.

Change-Id: I328766006d02c2a8c76a3b1e0780ae5ca678ed86
2022-08-21 19:14:27 -04:00
David Yat Sin df3fe8c2fb Add env variable to disable CPU affinity override
New environment variable HSA_OVERRIDE_CPU_AFFINITY_DEBUG to
enable/disable overriding CPU affinity.

Default value is enabled(1).

This is a temporary variable and may be removed in the future.

Change-Id: Id6a7c611730471ddc276ca333fde1e57046bf32a
2022-08-19 11:07:49 -04:00
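How such a switch might be read is sketched below. Only the variable name HSA_OVERRIDE_CPU_AFFINITY_DEBUG and its default of enabled (1) come from commit df3fe8c2fb; the parsing rule (any value other than "0" counts as enabled) and the function name are assumptions made for this example.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>

// Hypothetical reader for the flag. Unset means enabled, matching the
// documented default of 1; treating only "0" as disabled is an assumption.
bool cpu_affinity_override_enabled() {
    const char* value = std::getenv("HSA_OVERRIDE_CPU_AFFINITY_DEBUG");
    if (value == nullptr) return true;
    return std::strcmp(value, "0") != 0;
}

// Usage check: toggle the variable and observe the result.
void example_usage() {
    setenv("HSA_OVERRIDE_CPU_AFFINITY_DEBUG", "0", 1);
    assert(!cpu_affinity_override_enabled());
    unsetenv("HSA_OVERRIDE_CPU_AFFINITY_DEBUG");
    assert(cpu_affinity_override_enabled());
}
```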
David Yat Sin a7db31c5d1 Expose memory executable bit for SVM ranges
Add support to expose executable bit.

Change-Id: I054f5c3173822c369dd9908eec5c449459600ce1
Signed-off-by: David Yat Sin <David.YatSin@amd.com>
2022-08-17 12:05:42 -04:00
David Yat Sin 50b636d1d8 Fix for too many open files in rocrtst
Fix for regression in commit:
8a0fe6a832

When running rocrtstNeg.Queue_Validation_InvalidWorkGroupSize, each
time rocrtst::LoadKernelFromObjFile is called, a new CodeObject is
created and not deleted until end of the whole test. Each CodeObject
keeps an open file descriptor of the kernel file and this can exceed
maximum allowed open files on some systems. Delete the CodeObjects
after each iteration of the test.

Change-Id: I388e56f95f7b671ecc29d5ecb4eb8ac2d0ddc412
2022-08-16 14:55:38 -04:00
David Yat Sin ec759c7995 Add rocrtst to Query agent memory available
Add new test for GPU agents memory available

Change-Id: Ib07e2003a21659b99732b535cd004081635d6aa1
Signed-off-by: David Yat Sin <david.yatsin@amd.com>
2022-08-11 09:36:58 -04:00
David Yat Sin 86e4cb1ddd Add max enum value to hsa_agent_info_t
Add max enum value to force size of enum and avoid clang compile
warnings.

Change-Id: I9cdf529517cc605a5039c3a924fd718ece16029d
2022-08-10 11:11:36 -04:00
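The general technique behind commit 86e4cb1ddd can be sketched with a made-up enum. The enumerator names and the sentinel value 0x7FFFFFFF are assumptions for illustration; the commit message does not state which value was added to hsa_agent_info_t.

```cpp
#include <cassert>

// Hypothetical enum, not the real hsa_agent_info_t.
typedef enum {
    EXAMPLE_INFO_NAME = 0,
    EXAMPLE_INFO_VENDOR = 1,
    // Sentinel: a large final value means the enum's underlying type must be
    // able to hold 31 value bits, so it cannot be shrunk to a narrower type.
    EXAMPLE_INFO_MAX = 0x7FFFFFFF
} example_info_t;

// With the sentinel present, the enum occupies at least four bytes.
static_assert(sizeof(example_info_t) >= 4, "enum narrower than 32 bits");

// Usage check: the sentinel keeps its value and the small values still work.
void example_usage() {
    assert(EXAMPLE_INFO_MAX == 0x7FFFFFFF);
    assert(EXAMPLE_INFO_VENDOR == 1);
}
```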
David Yat Sin 117495fe88 Fix image LUT for gfx11
For gfx11 the image type table has some different values compared to
previous asic families (e.g TYPE_SRGB). Creating a new LUT class to
use these new values.

Change-Id: Ifdfc6cd29bfd5f4ec2643c848fcb9986eb874f9e
2022-08-04 11:23:28 -04:00
Yifan Zhang daa01b8d57 Add gfx1103 support
This patch adds gfx1103 support

Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>
Change-Id: I7f1d580059fcd501bce2c8fea894637960c29bc1
2022-08-04 11:23:28 -04:00
David Yat Sin 574bea4a4c Use FAMILY_GFX1103 for gfx1103
Also adding elf entry

Change-Id: Id47ec379f2880961022b4607eb7f106b7e9d7048
2022-08-04 11:23:28 -04:00
David Yat Sin f971834d7a Update entries for gfx11
Update image table enums and format tables for gfx11.
Remove some entries that are not needed.

Change-Id: I060c1e285925a6d428ef1c5498f5dd89f5d79d97
2022-08-04 11:23:28 -04:00
David Yat Sin 319e71e79f Use FAMILY_GFX1100 for GFX11 devices
Change-Id: Ib182b647a91987040d655dbc05cbe5f867d4f61a
2022-08-04 11:23:28 -04:00