rocm-systems

Author	SHA1	Message	Date
David Yat Sin	39632a713e	Use user requested size for memory fragments Amount of memory requested by user may be aligned-up internally to the memory pool granularity. The extra padded memory should not be considered when validating pointers from the user. Also return the user requested size when user queries pointer information. Change-Id: I28b25448ea03c836b44fafdb34b7330cf6887424	2022-10-07 21:32:49 +00:00
David Yat Sin	9cb10a3dd8	Fix compile warnings and remove unused variables Change-Id: I7acaee5e9cf218b358ffaf0e3af6067faf6f3d2a	2022-10-06 10:11:17 -04:00
Yifan Zhang	f05770610c	Adjust the passing value for GPU agent when doing the max single allocation test For APU ASICs, the default configured size of video memory is relatively small, while the reserved region has grown in recent-generation ASICs, so the ratio of max alloc size to pool size may fall below the expected value; adjust it accordingly. Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Change-Id: I0e847c4c13e957cf6e811d3f379842619cf53370	2022-10-05 22:16:58 -04:00
Sean Keely	7826d4ca2d	Correct limit query return type to match spec ABI. Change-Id: I2eeed1f4b79d10c7d9ab0fd36c0146063053c76a	2022-10-04 01:48:26 +00:00
Jeremy Newton	1621936e32	Implement RPM Recommends for libdrm What we want for libdrm-amdgpu is for it to be a recommended package. Either libdrm or libdrm-amdgpu can be used, but we recommend the latter. Using "SUGGESTS" does not seem like a strong enough requirement, but CPACK does not support RPM recommends. Although, it does allow customizing the RPM SPEC file template. By generating a template, which is done by setting: -DCPACK_RPM_GENERATE_USER_BINARY_SPECFILE_TEMPLATE=1 This template file can be trivially modified to allow adding a line to implement CPACK_RPM_PACKAGE_RECOMMENDS. Fixes Signed-off-by: Jeremy Newton <Jeremy.Newton@amd.com> Change-Id: I34467b1ba878827ced9b8db74977967815732552	2022-10-03 12:42:51 -04:00
Graham Sider	73adbdee2c	kfdtest: Make KFDCWSRTest.BasicTest buffer sizes dynamic KFDCWSRTest.BasicTest is parameterized to allow an easy method of tweaking the number of work-items (and save/restores). The input/output buffers were previously hardcoded to a single page, which would cause a segmentation fault if the number of work-items specified is greater than 1024 for wave32. Signed-off-by: Graham Sider <Graham.Sider@amd.com> Change-Id: Ieefc819a5d81c77cee88081a287fd383e6378e74	2022-09-30 18:18:17 -04:00
Graham Sider	af48352d9a	kfdtest: Add missing break in FamilyIdFromNode Add missing break after GFX11 FamilyId patch. Signed-off-by: Graham Sider <Graham.Sider@amd.com> Change-Id: I8fd727054442d74a95f69724ab07c4095f8ae77e	2022-09-29 09:22:05 -04:00
Graham Sider	6294ef564b	kfdtest: Update COMPUTE_PGM_RSRC1 for software trap For software trap in GFX11, COMPUTE_PGM_RSRC1 must have PRIV = 1. Signed-off-by: Graham Sider <Graham.Sider@amd.com> Change-Id: Id504889c3ca2588b6c8cefdebaec00dcfc217995	2022-09-23 16:38:31 -04:00
Graham Sider	e44952d8a6	kfdtest: Add dedicated FamilyId for GFX11 Add FAMILY_GFX11 to KfdFamilyId enum. Signed-off-by: Graham Sider <Graham.Sider@amd.com> Change-Id: Ib2aa3f8cf31041f7b4cdd9a2f3e36489dde5554c	2022-09-23 16:29:09 -04:00
Philip Yang	590fd531c0	kfdtest: Correct mmap return value checking On error mmap returns value MAP_FAILED, which is (void *)-1, not NULL pointer. Change-Id: I81b187266c943fa0aa4fab21b529d4c2989b12ad Signed-off-by: Philip Yang <Philip.Yang@amd.com>	2022-09-21 15:38:11 -04:00
Graham Sider	79279e860f	libhsakmt: Skip hsa_gfxip_table search for GFX11+ Prior to launch some ASICs may re-use PCI DIDs from older generations. This can cause issues during topology initialization as hsa_gfxip_table lookups will override sysfs-provided gfx versions, causing incorrect gfxip selection. Since no new entries will be added to hsa_gfxip_table, limit its search only to pre-GFX11 ASICs. Signed-off-by: Graham Sider <Graham.Sider@amd.com> Change-Id: I53eaefac5db2650a36a6ce9f21daf750f50cfd26	2022-09-21 14:09:35 -04:00
Daniel Phillips	169673a435	kfdtest: Add thunk test for KFD memory availability ioctl Signed-off-by: Daniel Phillips <Daniel.Phillips@amd.com> Change-Id: Ic4c1ffefdc3570718a1fce4e53ca5f1ebde8c479	2022-09-21 13:26:38 -04:00
David Yat Sin	d9935e6fba	Add .kd to symbol kernel name for Binary Search sample Fix Binary Search sample code as kernel symbol name has a .kd extension. Change-Id: Id21d2e432faa40bcd5cf343345502e823678fd0f	2022-09-12 16:17:04 -04:00
Graham Sider	14aa475905	kfdtest: Update exclude for gfx1101 Signed-off-by: Graham Sider <Graham.Sider@amd.com> Change-Id: I78c1130c4a85a98a265a090e40390e56d3be2819	2022-09-09 09:40:23 -04:00
Graham Sider	82a41c7e4d	kfdtest: Remove unnecessary v_ ops in IterateIsa IterateIsa had some leftover instructions from when the shader was getting updated for KFDCWSRTest.BasicTest. Signed-off-by: Graham Sider <Graham.Sider@amd.com> Change-Id: I41ae7b7948cbe2aff8bf61b170b9a7d498b836a3	2022-09-09 09:40:04 -04:00
Philip Yang	093cf898fb	libhsakmt: Set CWSR SVM range MADV_DONTFORK On process fork, the copy-on-write MMU notifier on the CWSR range will evict user queues, then update the GPU mapping and resume the queues. Use MADV_DONTFORK to avoid the COW MMU notifier callback on the CWSR SVM range. Use mmap to allocate the SVM range for CWSR because posix_memalign doesn't allocate a new range in the child process, so registering the SVM range fails because the range is an invalid address in the forked child process. Change-Id: Ibaea56a691dd6f577ed2e1f2d43f4a3500b8316f Signed-off-by: Philip Yang <Philip.Yang@amd.com>	2022-09-08 22:53:47 -04:00
Philip Yang	b2691c359d	libhsakmt: Use mmap aligned for scratch allocation To remove duplicate mmap aligned allocation code. Change-Id: Ibc05cc4aaf6d190bd2382e33bdeca1496960c5f2 Signed-off-by: Philip Yang <Philip.Yang@amd.com>	2022-09-08 22:53:47 -04:00
Philip Yang	b7710a1dda	libhsakmt: Add mmap alloc aligned helper function mmap alloc larger address range with align padding page plus guard pages, then unmap the padding and guard pages at beginning and end of the range, return aligned address range. Change-Id: Iaf3c711a079c744289efbafee9b5e63aaf724765 Signed-off-by: Philip Yang <Philip.Yang@amd.com>	2022-09-08 22:53:47 -04:00
Philip Yang	2230d01c8a	kfdtest: Add KFDSVMRangeTest.PrefaultPartialRangeTest Change-Id: I00617dd5a2216fab90c2b0d116825ec274d14d13 Signed-off-by: Philip Yang <Philip.Yang@amd.com>	2022-09-08 22:53:47 -04:00
Philip Yang	115f1f8d1f	kfdtest: Add KFDSVMRangeTest.VramOvercommitGiantRangeTest Change-Id: I5f6b3b6a910ff6646bf4b0c48ae3e94ad243cf88 Signed-off-by: Philip Yang <Philip.Yang@amd.com>	2022-09-08 22:53:47 -04:00
Philip Yang	3190f189b4	kfdtest: Add KFDSVMRangeTest.VramOvercommitTest Change-Id: Id5b23d5efd4f6a9717d1ca196c8635846493f77d Signed-off-by: Philip Yang <Philip.Yang@amd.com>	2022-09-08 22:52:43 -04:00
Aaron Liu	06a90612e9	libhsakmt: expand control stack size limit for all gfx103x GFX1036(ISA version) is not included in the previous range. This patch can really include all gfx10 series ASICs. Signed-off-by: Aaron Liu <aaron.liu@amd.com> Change-Id: I0e28dbfc031c216166b306b9fb39f644f75a330f	2022-09-08 22:37:50 -04:00
jie1zhan	d98c729ff9	libhsakmt: Add check for the runtime debugger To avoid a segfault, runtime debugger enable is not supported if the GPU firmware doesn't support debug exceptions. Signed-off-by: jie1zhan <jesse.zhang@amd.com> Change-Id: Ifad57a6e78cb1c92b1f8927355ece8c64e89c51b	2022-09-08 20:52:01 -04:00
Mukul Joshi	57a1c6f3ff	libhsakmt: Remove potential double free in queue creation Remove potential double free condition when free_queue() is called after hsaKmtDestroyQueue() if mapping doorbell fails during queue creation. Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Change-Id: If2aa19c455b30d2940b232dbafb9cc1eaad721a5	2022-09-08 11:39:34 -04:00
David Yat Sin	dd255d31b8	Fix uninitialized variable warning Fix warning when using valgrind Change-Id: Ie59eaa990b9b5d339a178a2c6f9f4fac0e34e925	2022-09-08 09:10:00 -04:00
Shikai Guo	4951495fca	libhsakmt: add filter node for new chip When running a kfdtest test case, the test case is reported as not supported because the filter node for the new chip is missing in libhsakmt, so a new test node is added in order to support the kfdtest case. Signed-off-by: shikaguo <shikai.guo@amd.com> Change-Id: I0cd9ffd7d4387129cfb0f8de6b669f431949ab49	2022-09-07 15:18:18 +08:00
David Yat Sin	e2388f242a	Disable automatic dependency for rocrtst RPM Disable automatic dependency detection when generating rocrtst RPMs. This was adding unnecessary dependency on libhwloc, which is now provided with the rocrtst package. This matches behavior for DEB packages where there is no dependency list for rocrtst. Change-Id: If4a93f5b4c039b2f45e9445f60f65eefe84e32eb	2022-09-06 15:05:40 -04:00
Philip Yang	3dbf5feffe	libhsakmt: Fix queue leaking debug memory Queue ctx_save_restore memory is allocated with size ctx_save_restore_size + debug_memory_size, use the same size in free_queue to free ctx_save_restore memory. Change-Id: I4902ff15fb82ddea64b8342b89776a1bf5c38d13 Signed-off-by: Philip Yang <Philip.Yang@amd.com>	2022-09-01 12:31:16 -04:00
David Yat Sin	1c385fb257	libhsakmt: Add check for invalid aperture Avoiding segfault when an invalid SharedMemoryHandle is passed in when calling fmm_register_shared_memory. Change-Id: I0e0bbed01487fc10afcbb170eb9330e70b209d14 Signed-off-by: David Yat Sin <David.YatSin@amd.com>	2022-08-31 15:14:16 -04:00
jie1zhan	8941e7135c	fix rocrtst on hang issue close the file at the end of every test, instead of the whole test Change-Id: Ia510990dad8d0bd82625bbd9b2958181e8f1dd25	2022-08-31 17:03:09 +08:00
kent.russell@amd.com	ea4d4917c1	src/topology.c: Fix NULL check Now that HsaNodeProperties is passed in to topology_get_node_props_from_drm, check that pointer instead of the pointer for MarketingName (which throws a compiler warning) Signed-off-by: kent.russell@amd.com <kent.russell@amd.com> Change-Id: If76b24e1bab5a62e514ab440b6316c7b7cd264c1	2022-08-29 08:56:41 -04:00
Lang Yu	d0e7c617df	Query agent family id from roct Add agent info query HSA_AMD_AGENT_INFO_ASIC_FAMILY_ID. Then we can remove the codes to parse family id. Signed-off-by: Lang Yu <Lang.Yu@amd.com> Change-Id: I3ac4746d3015e89b32322ebc0f8a3084f98677a4	2022-08-25 10:15:43 -04:00
Lang Yu	66e9e97e0d	libhsakmt: add FamilyID info into HsaNodeProperties Query family id info from drm render node, then ROCr can query this info directly from Thunk instead of parsing the info by itself. Signed-off-by: Lang Yu <Lang.Yu@amd.com> Change-Id: I030bd27ab2379fbf87f3d787302c3b8613456278	2022-08-24 21:14:06 -04:00
Graham Sider	4267c4b524	kfdtest: Bump C++ compiler to gnu++17 Required due to LLVM retirement of llvm::apply_tuple, instead using std::apply which was introduced in C++17. Signed-off-by: Graham Sider <Graham.Sider@amd.com> Change-Id: I6646ebcca7d71d3e1bcf340ccfa3db2c15a3110a	2022-08-24 11:46:18 -04:00
Graham Sider	0055ef46c4	kfdtest: Add KFDCWSRTest.BasicTest* to GFX10 blacklist Failure with new CWSR tests reported for GFX10, for now add to blacklist. Signed-off-by: Graham Sider <Graham.Sider@amd.com> Change-Id: I5b2bd9ec61c64ad66e1c34ba2c192bece808f56f	2022-08-23 16:03:24 -04:00
David Yat Sin	0647960019	Revert "Change search path to use RPATH" This reverts commit `c904cc5856`. The change from using RUNPATH to RPATH was not approved formally. Reverting this patch until this gets approved. Change-Id: Ibc1a8f9d5dfa6694adacccfd9e3b0d053660e848	2022-08-23 07:28:14 -04:00
Graham Sider	0dbac97b75	kfdtest: Overhaul KFDCWSRTest.BasicTest This patch restructures the CWSR basic test and allows for creating parameterized CWSR tests. This patch introduces four parameterizations. These tests behave as follows: This test dispatches the IterateIsa shader, which continuously increments a vgpr for (num_witems / WAVE_SIZE) waves. While this shader is running, dequeue/requeue requests are sent in a loop to trigger CWSRs. This test defines a CWSR threshold. Once the number of CWSRs triggered reaches the threshold, a known-value is filled into the inputBuf to signal the shader to exit. 4 parameterized tests are defined: KFDCWSRTest.BasicTest/0 KFDCWSRTest.BasicTest/1 KFDCWSRTest.BasicTest/2 KFDCWSRTest.BasicTest/3 0: 1 work-item, CWSR threshold of 10 1: 256 work-items, CWSR threshold of 50 2: 512 work-items, CWSR threshold of 100 3: 1024 work-items, CWSR threshold of 1000 Tuple Format: (num_witems, cwsr_thresh) num_witems: Defines the number of work-items. cwsr_thresh: Defines the number of CWSRs to trigger. Signed-off-by: Graham Sider <Graham.Sider@amd.com> Change-Id: I639eb7bd75b14ee70e190b4bd19dcf34096fc7bf	2022-08-22 19:55:04 -04:00
Jonathan Kim	c1d8ac8437	libhsakmt: bump debug major rev for snapshot and watchpoint changes The debugger can now request snapshot copies with entry size and set/clear watchpoints by device. v3: drop min version check to v10.0 v2: check runtime allowance from v10.3 to 13.x Signed-off-by: Jonathan Kim <jonathan.kim@amd.com> Change-Id: I9befefb596201a11591de218db29a9317b41e69b	2022-08-22 12:11:04 -04:00
Jonathan Kim	2b75a73ce7	Report no cooperative launch support with CU masking The allocation logic of the SPI does not take into account compute user thread management settings for masking CUs with the exception of skipping fully disabled SEs. This means that occupancy limited dispatches such as cooperative launch may over allocate onto hardware resources that are not immediately available, resulting in a potential barrier logic hang as occupying work groups are waiting on enqueued work groups to reach the barrier. Further work will have to be done to get the per-SA CU enablement count from the KFD in order to correctly clip the cooperative CU limit based on the CU mask, which will require breaking the current ABI. For now, report that cooperative launch is not supported while a CU mask has been applied to prevent potential shader hangs. Change-Id: I8be4bb47d65ceb62d805f36ef6ef3996d756021f	2022-08-22 08:22:28 -04:00
David Yat Sin	c904cc5856	Change search path to use RPATH Change default behavior for library search to use RPATH instead of RUNPATH. Change-Id: I328766006d02c2a8c76a3b1e0780ae5ca678ed86	2022-08-21 19:14:27 -04:00
David Yat Sin	df3fe8c2fb	Add env variable to disable CPU affinity override New environment variable HSA_OVERRIDE_CPU_AFFINITY_DEBUG to enable/disable overriding CPU affinity. Default value is enabled(1). This is a temporary variable and may be removed in the future. Change-Id: Id6a7c611730471ddc276ca333fde1e57046bf32a	2022-08-19 11:07:49 -04:00
David Yat Sin	a7db31c5d1	Expose memory executable bit for SVM ranges Add support to expose executable bit. Change-Id: I054f5c3173822c369dd9908eec5c449459600ce1 Signed-off-by: David Yat Sin <David.YatSin@amd.com>	2022-08-17 12:05:42 -04:00
David Yat Sin	50b636d1d8	Fix for too many open files in rocrtst Fix for regression in commit: `8a0fe6a832` When running rocrtstNeg.Queue_Validation_InvalidWorkGroupSize, each time rocrtst::LoadKernelFromObjFile is called, a new CodeObject is created and not deleted until end of the whole test. Each CodeObject keeps an open file descriptor of the kernel file and this can exceed maximum allowed open files on some systems. Deleting the CodeObjects after each iteration in the test. Change-Id: I388e56f95f7b671ecc29d5ecb4eb8ac2d0ddc412	2022-08-16 14:55:38 -04:00
David Yat Sin	ec759c7995	Add rocrtst to Query agent memory available Add new test for GPU agents memory available Change-Id: Ib07e2003a21659b99732b535cd004081635d6aa1 Signed-off-by: David Yat Sin <david.yatsin@amd.com>	2022-08-11 09:36:58 -04:00
David Yat Sin	86e4cb1ddd	Add max enum value to hsa_agent_info_t Add max enum value to force size of enum and avoid clang compile warnings. Change-Id: I9cdf529517cc605a5039c3a924fd718ece16029d	2022-08-10 11:11:36 -04:00
David Yat Sin	117495fe88	Fix image LUT for gfx11 For gfx11 the image type table has some different values compared to previous asic families (e.g TYPE_SRGB). Creating a new LUT class to use these new values. Change-Id: Ifdfc6cd29bfd5f4ec2643c848fcb9986eb874f9e	2022-08-04 11:23:28 -04:00
Yifan Zhang	daa01b8d57	Add gfx1103 support This patch adds gfx1103 support Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Change-Id: I7f1d580059fcd501bce2c8fea894637960c29bc1	2022-08-04 11:23:28 -04:00
David Yat Sin	574bea4a4c	Use FAMILY_GFX1103 for gfx1103 Also adding elf entry Change-Id: Id47ec379f2880961022b4607eb7f106b7e9d7048	2022-08-04 11:23:28 -04:00
David Yat Sin	f971834d7a	Update entries for gfx11 Update image table enums and format tables for gfx11. Remove some entries that are not needed. Change-Id: I060c1e285925a6d428ef1c5498f5dd89f5d79d97	2022-08-04 11:23:28 -04:00
David Yat Sin	319e71e79f	Use FAMILY_GFX1100 for GFX11 devices Change-Id: Ib182b647a91987040d655dbc05cbe5f867d4f61a	2022-08-04 11:23:28 -04:00
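
Commit b7710a1dda in the log above describes an mmap-based aligned allocation helper: map a range larger than requested, then unmap the padding at both ends so that an aligned range remains. Commit 590fd531c0 notes that mmap signals failure with MAP_FAILED rather than NULL. The following is a minimal sketch of both points, not the libhsakmt code: the function name `mmap_aligned_sketch` is hypothetical, the guard pages mentioned in the commit message are left out, and it assumes `size` is a multiple of the page size and `align` is a power of two no smaller than a page.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper (not the libhsakmt function): returns `size` bytes of
 * anonymous memory whose start address is a multiple of `align`, or NULL. */
static void *mmap_aligned_sketch(size_t size, size_t align)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);

    /* Preconditions assumed by this sketch. */
    assert(size > 0 && size % page == 0);
    assert(align >= page && (align & (align - 1)) == 0);

    /* Over-allocate so that an aligned start must exist inside the range. */
    size_t padded = size + align;
    void *raw = mmap(NULL, padded, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (raw == MAP_FAILED)      /* mmap reports errors as (void *)-1, not NULL */
        return NULL;

    uintptr_t start = (uintptr_t)raw;
    uintptr_t aligned = (start + align - 1) & ~((uintptr_t)align - 1);

    size_t head = (size_t)(aligned - start);   /* padding before the range */
    size_t tail = padded - head - size;        /* padding after the range */

    /* Trim the unused padding at the beginning and end of the mapping. */
    if (head != 0)
        munmap(raw, head);
    if (tail != 0)
        munmap((void *)(aligned + size), tail);

    return (void *)aligned;
}
```

The caller releases the range with `munmap(ptr, size)`. Over-allocating by a full `align` keeps the arithmetic simple at the cost of briefly reserving extra address space.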


2959 commits