rocm-systems

Author	SHA1	Message	Date
David Yat Sin	ecdebef0b9	Add agent info for fw and sdma ucode Add two new agent info fields: HSA_AMD_AGENT_INFO_UCODE_VERSION HSA_AMD_AGENT_INFO_SDMA_UCODE_VERSION Change-Id: I51cb853724b23a26e945e5c1ac32c16d0cb3bc31	2022-12-07 19:07:31 -05:00
raghavmedicherla	5727a10a1b	[hsa-runtime] Modify elfsection checks in amd_elf_image class Modified If condition checks in GElfImage::pullElf() of amd_elf_image.cpp to check using section types instead of a string check. Change-Id: I1ab92f0a9118fb2382652a1cc900a3150cbee2da	2022-12-05 14:42:02 -05:00
David Yat Sin	e39ad34d9c	Check for debug support after parsing topology Thunk keeps an internal cache of system topology that can be used to speed up subsequent calls to hsaKmtAcquireSystemProperties(). This cache is cleared by calling hsaKmtReleaseSystemProperties() at the beginning of BuildTopology(). hsaKmtRuntimeEnable() also calls hsaKmtAcquireSystemProperties() inside Thunk. Move call to hsaKmtRuntimeEnable() after BuildTopology() so that we can re-use Thunk's internal cache. Parsing of topology can take ~150 ms on systems with a large number of nodes. Change-Id: I741709d49d67d244f5fbd707fe8f01ab923bb153	2022-12-02 11:26:00 -05:00
Shweta Khatri	8751e65b79	Fixed callback method for dl_iterate_phdr api which is called for each loaded shared object Simplified the callback method. Also fixed the way loaded shared objects were appended to a string vector, which was not being passed to this callback method. Change-Id: I68661dd73f61a11c42fa92f670e8e7b6ffcb5711	2022-11-21 19:00:34 -05:00
Ranjith Ramakrishnan	a34804ed3e	Change pragma message to warning File reorganization feature was implemented with backward compatibility. The backward compatibility support will be deprecated in a future release. Changed the #pragma message to #warning for a smooth transition. Change-Id: Ibaedc1873bc764d25f74d9ca9416077d084e332d	2022-11-17 09:38:24 -08:00
David Yat Sin	b9d1ad8604	Revert "Correct limit query return type to match spec ABI." This reverts commit `7826d4ca2d`. Changing the parameter sizes breaks backward ABI. Change-Id: Iff14b7c11294f0931f36fcfd42fff11a492d4205	2022-11-14 19:13:58 -05:00
David Yat Sin	cb71e2d715	Allow page-aligned len for ipc_memory_create Previous versions of HIP will call hsa_amd_ipc_memory_create with the len aligned to granularity. Temporarily allow this so that we do not break backward compatibility. Will remove this after 2 releases. Change-Id: I6b5ac2cad5d32d62c803637cf1a2c6deebc03169	2022-11-09 15:01:47 +00:00
David Yat Sin	c1e836b6ab	Use paged memory for queues on MEC devices MES devices need GART mappings and therefore need non-paged memory. But using non-paged memory introduces a performance regression where it can take over 80 ms to see the signal changes if the memory is in the wrong NUMA node. Currently, we cannot control NUMA affinity when allocating non-paged memory. Use non-paged memory allocation only on devices that have the MES scheduler. Change-Id: Ib27fb01d75247aa4f2bb2aa4503c6af5a98afda0	2022-11-04 13:23:21 +00:00
David Yat Sin	0e4c7336ff	Use os::createThread to launch SVM profiler thread Using previous method of std::thread for SVM profiler task was causing segfaults on thread launch on RHEL 8 if libhsa-runtime library is loaded using dlopen. Change-Id: Ic010cd6ae9bc6e6ed0605de02b93f6aae8ed3e97	2022-11-03 10:52:11 -04:00
Jonathan Kim	f9edf73cd7	Fix doorbell offset fetch for GFX11 Transient exec usage is not required for GFX11 and will result in a NULL return of s_sendmsg_rtn if directly returned to exec_lo. Directly fetch and mask the doorbell ID to ttmp3 for GFX11 instead. Change-Id: Ie17ed69d68d84ab18869b1c7871a0ed0482cd661	2022-11-02 11:55:37 -04:00
Ranjith Ramakrishnan	76cf5d2edc	Add libelf-dev to package depends list In Ubuntu, the package depends list was not showing libelf. Added it. Change-Id: I713951bd7181f44d667561aaf437f85c6cd783b0	2022-10-31 13:07:55 -07:00
David Yat Sin	b4f26534eb	No-Op for allow access on imported IPC If hsa_amd_agents_allow_access is called for an imported IPC handle, ignore the request, as this pointer will already have been mapped to other GPUs during IPCAttach(). Change-Id: I4bf33ed57e93b5a3ead749d4f87ab6f2750bed58	2022-10-25 22:38:47 +00:00
David Yat Sin	18547173e9	Early return for invalid pointer queries If a user queries the pointer info on an invalid pointer, hsaKmtQueryPointerInfo will return error or unknown pointer. The other fields in HsaPointerInfo are invalid, so we do not return them to the user. Also removing the assert and returning unknown pointer instead, as the assert will not trigger in release builds. hsaKmtQueryPointerInfo may also return unknown pointer for userptrs, as they are not always tracked by thunk. Adjusting code to still treat these pointers as valid in this case. Change-Id: Idf5cd8b61cd532d31b072f449839d223369bb138	2022-10-21 15:28:48 -04:00
Freddy Paul	ac66865385	Remove RPATH/RUNPATH from ROCm libraries Since all public interface libraries are present in the same folder, RUNPATH/RPATH is not required in the library itself. Application shall provide the required RPATH/RUNPATH to load all libraries. Change-Id: I1d1ba920bf291eb89bd1f4c0fd0cfd80c7d739bd	2022-10-21 11:05:06 -04:00
David Belanger	a0d3db6e8d	Initial changes for gfx1101, based on gfx1100/gfx1102 implementation. Change-Id: I949c1027ccabf38b4f924590e42e7327dc550f73 Signed-off-by: David Belanger <david.belanger@amd.com> Reviewed-by: Jonathan Kim <jonathan.kim@amd.com>	2022-10-13 09:28:39 -04:00
David Yat Sin	39632a713e	Use user requested size for memory fragments Amount of memory requested by user may be aligned-up internally to the memory pool granularity. The extra padded memory should not be considered when validating pointers from the user. Also return the user requested size when user queries pointer information. Change-Id: I28b25448ea03c836b44fafdb34b7330cf6887424	2022-10-07 21:32:49 +00:00
David Yat Sin	9cb10a3dd8	Fix compile warnings and remove unused variables Change-Id: I7acaee5e9cf218b358ffaf0e3af6067faf6f3d2a	2022-10-06 10:11:17 -04:00
Sean Keely	7826d4ca2d	Correct limit query return type to match spec ABI. Change-Id: I2eeed1f4b79d10c7d9ab0fd36c0146063053c76a	2022-10-04 01:48:26 +00:00
Jeremy Newton	1621936e32	Implement RPM Recommends for libdrm What we want for libdrm-amdgpu is for it to be a recommended package. Either libdrm or libdrm-amdgpu can be used, but we recommend the latter. Using "SUGGESTS" does not seem like a strong enough requirement, but CPACK does not support RPM recommends. Although, it does allow customizing the RPM SPEC file template. By generating a template, which is done by setting: -DCPACK_RPM_GENERATE_USER_BINARY_SPECFILE_TEMPLATE=1 This template file can be trivially modified to allow adding a line to implement CPACK_RPM_PACKAGE_RECOMMENDS. Fixes Signed-off-by: Jeremy Newton <Jeremy.Newton@amd.com> Change-Id: I34467b1ba878827ced9b8db74977967815732552	2022-10-03 12:42:51 -04:00
David Yat Sin	dd255d31b8	Fix uninitialized variable warning Fix warning when using valgrind Change-Id: Ie59eaa990b9b5d339a178a2c6f9f4fac0e34e925	2022-09-08 09:10:00 -04:00
Lang Yu	d0e7c617df	Query agent family id from roct Add agent info query HSA_AMD_AGENT_INFO_ASIC_FAMILY_ID. Then we can remove the codes to parse family id. Signed-off-by: Lang Yu <Lang.Yu@amd.com> Change-Id: I3ac4746d3015e89b32322ebc0f8a3084f98677a4	2022-08-25 10:15:43 -04:00
David Yat Sin	0647960019	Revert "Change search path to use RPATH" This reverts commit `c904cc5856`. The change from using RUNPATH to RPATH was not approved formally. Reverting this patch until this gets approved. Change-Id: Ibc1a8f9d5dfa6694adacccfd9e3b0d053660e848	2022-08-23 07:28:14 -04:00
Jonathan Kim	2b75a73ce7	Report no cooperative launch support with CU masking The allocation logic of the SPI does not take into account compute user thread management settings for masking CUs with the exception of skipping fully disabled SEs. This means that occupancy limited dispatches such as cooperative launch may over allocate onto hardware resources that are not immediately available, resulting in a potential barrier logic hang as occupying work groups are waiting on enqueued work groups to reach the barrier. Further work will have to be done to get the per-SA CU enablement count from the KFD in order to correctly clip the cooperative CU limit based on the CU mask, which will require breaking the current ABI. For now, report that cooperative launch is not supported while a CU mask has been applied to prevent potential shader hangs. Change-Id: I8be4bb47d65ceb62d805f36ef6ef3996d756021f	2022-08-22 08:22:28 -04:00
David Yat Sin	c904cc5856	Change search path to use RPATH Change default behavior for library search to use RPATH instead of RUNPATH. Change-Id: I328766006d02c2a8c76a3b1e0780ae5ca678ed86	2022-08-21 19:14:27 -04:00
David Yat Sin	df3fe8c2fb	Add env variable to disable CPU affinity override New environment variable HSA_OVERRIDE_CPU_AFFINITY_DEBUG to enable/disable overriding CPU affinity. Default value is enabled(1). This is a temporary variable and may be removed in the future. Change-Id: Id6a7c611730471ddc276ca333fde1e57046bf32a	2022-08-19 11:07:49 -04:00
David Yat Sin	a7db31c5d1	Expose memory executable bit for SVM ranges Add support to expose executable bit. Change-Id: I054f5c3173822c369dd9908eec5c449459600ce1 Signed-off-by: David Yat Sin <David.YatSin@amd.com>	2022-08-17 12:05:42 -04:00
David Yat Sin	86e4cb1ddd	Add max enum value to hsa_agent_info_t Add max enum value to force size of enum and avoid clang compile warnings. Change-Id: I9cdf529517cc605a5039c3a924fd718ece16029d	2022-08-10 11:11:36 -04:00
David Yat Sin	117495fe88	Fix image LUT for gfx11 For gfx11 the image type table has some different values compared to previous asic families (e.g TYPE_SRGB). Creating a new LUT class to use these new values. Change-Id: Ifdfc6cd29bfd5f4ec2643c848fcb9986eb874f9e	2022-08-04 11:23:28 -04:00
Yifan Zhang	daa01b8d57	Add gfx1103 support This patch adds gfx1103 support Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Change-Id: I7f1d580059fcd501bce2c8fea894637960c29bc1	2022-08-04 11:23:28 -04:00
David Yat Sin	574bea4a4c	Use FAMILY_GFX1103 for gfx1103 Also adding elf entry Change-Id: Id47ec379f2880961022b4607eb7f106b7e9d7048	2022-08-04 11:23:28 -04:00
David Yat Sin	f971834d7a	Update entries for gfx11 Update image table enums and format tables for gfx11. Remove some entries that are not needed. Change-Id: I060c1e285925a6d428ef1c5498f5dd89f5d79d97	2022-08-04 11:23:28 -04:00
David Yat Sin	319e71e79f	Use FAMILY_GFX1100 for GFX11 devices Change-Id: Ib182b647a91987040d655dbc05cbe5f867d4f61a	2022-08-04 11:23:28 -04:00
David Yat Sin	a742b7e830	Update addrLib to support gfx11 This library was taken from public MESA library: https://gitlab.freedesktop.org/mesa/mesa/-/tree/main/src/amd/addrlib with top commit: 2866ae32da0348caf71ad2d11c353321df626ff4 Removing macros.h as it is no longer used by addrlib Change-Id: I0fdabfe48b74c259b4d29d81beae89604bbc141a	2022-08-04 11:23:28 -04:00
David Yat Sin	c2a60a4d5d	Fix scratch memory alignment on GFX11 GFX11 requires scratch memory alignment of 256 Bytes instead of 1024. Change-Id: I103de1c12f3a4877d7d36f13254301166c66e11f	2022-08-04 11:23:28 -04:00
David Yat Sin	90322899fe	Update scratch register definitions for GFX11 Update scratch register definitions for GFX11 asics. Change-Id: I6195e04b0a099fe84d1015c2f34ca3756a8175ef	2022-08-04 11:23:28 -04:00
Graham Sider	061aa04147	Make queue memory allocation non-paged Non-paged allocation for queue memory necessary for binding wptr to GART. Required to support usermode queue oversubscription with MES for GFX11. Adds AllocateNonPaged entry to MemoryRegion::AllocateEnum for clarity; aliases AllocateIPC. Signed-off-by: Graham Sider <Graham.Sider@amd.com> Change-Id: I1a97a1820da26cf2433d9c237b2e6d2b0b8628b4	2022-08-04 11:21:00 -04:00
Graham Sider	db1a13aa05	Clean up includes in queue.h Formatting. Signed-off-by: Graham Sider <Graham.Sider@amd.com> Change-Id: I141c8308d6b283b376035e21344629dc665289bb	2022-08-03 10:57:17 -04:00
David Yat Sin	907e05c1b3	Add new ImageManager for GFX11 Adding new ImageManager class for GFX11 GPUs ImageManagerGfx11 functions copied from ImageManagerNv. Register descriptions in resource_gfx11.h updated for gfx11. Signed-off-by: David Yat Sin <david.yatsin@amd.com> Change-Id: I48b39f6a633aef14aa829f7240a43fe0feb1c290	2022-08-03 10:57:09 -04:00
David Yat Sin	cc3bd31591	Add gfx1102 support Change-Id: I39cbda81a7a999aa2ecfad7a3e720000f7ca3408 Signed-off-by: David Yat Sin <David.YatSin@amd.com>	2022-08-03 10:56:54 -04:00
Graham Sider	446c5e9672	Add gfx1100 support Signed-off-by: Graham Sider <Graham.Sider@amd.com> Change-Id: Ic5d5559e43df5c73409ba900a42c6901aabae661	2022-08-03 10:56:49 -04:00
Jay Cornwall	710adcc252	Add gfx11 blit/trap shaders David Yat Sin: Rebased to amd-staging branch Changed MSG_GET_DOORBELL to MSG_RTN_GET_DOORBELL Change-Id: I6015e54c4d8897f4c796f58c7fbc298758c6d76d	2022-08-03 10:56:41 -04:00
Jonathan Kim	9d2fe1ac2a	Fix GPU destruction when user disabled GPUs excluded by RVD are not expected to have scratch, memory, trap handling nor memory regions set up. Now that these GPUs are added to a new list, early return on agent destruction to prevent bad function calls on destroy. Also fix up broken memory releases between the gpu lists and ugly braces. Change-Id: I52fc6e86ceba0a0383cedc63310eb409515eaf9f	2022-08-02 14:18:43 -04:00
skhatri	364715cbc6	Enabled allocation of pseudo fine grain memory where memory ordering is per point-to-point connection. Atomic memory operations on these memory buffers are not guaranteed to be visible at system scope. Change-Id: I4cccde114632071a000384502a83bc191e77e85b	2022-07-29 15:15:56 -04:00
Konstantin Zhuravlyov	d962fc39bb	Add support for the following kernel symbol query: - HSA_EXECUTABLE_SYMBOL_INFO_KERNEL_DYNAMIC_CALLSTACK Change-Id: Idff5c1a2ce2a3e2d65bcc9cf1f66a68d37cd41ef	2022-07-29 15:15:24 -04:00
Konstantin Zhuravlyov	5a49b4d17f	Bring AMDHSAKernelDescriptor.h in sync with llvm Change-Id: Icd35100ad4d7eb8638786d306ecfbbb1c8842db1	2022-07-29 15:14:39 -04:00
Ashutosh Mishra	a229f5c320	Removing package dependency to thunk The current state of hsa-rocr does NOT require thunk lib as its dependency. It is unnecessarily pulling the thunk package while installing rocr. This patch corrects that. Change-Id: Id98ede8b66ffd9aaf4a47da96ba2f981f4c3da73	2022-07-22 09:42:38 -04:00
Sean Keely	c2b9abaa1d	Add missing query on CPU agents. Adds HSA_AMD_AGENT_INFO_SVM_DIRECT_HOST_ACCESS. Change-Id: I317d7b451ed2910cdf2290b196fd89e3bf0be435	2022-07-22 09:42:38 -04:00
Ashutosh Mishra	23f908708a	Adding Maintainer DL Maintainer distribution list field had wrong information. Adding the newly formed DL by the component team. Change-Id: I61651e429375cdc512d0fe4b0768f917506b5392	2022-07-22 09:42:28 -04:00
Jonathan Kim	f600687537	Only allow pairwise CU enable for devices with WGPs A work group processor (WGP) requires both of its CUs to be enabled in order to be enabled. The KFD will round robin distribute by even-indexed pairs, so enforce this requirement for runtime set mask calls. Change-Id: Ic46661b01f398aa1fe24d96b5c9c31f122f967a3	2022-07-07 12:50:24 -04:00
Sean Keely	a8603b9397	Fix IPC copy agent lookup. Discovered agent handles should only apply to copy routing, not to copy device selection. The user may not have mapped all allocations to all GPUs so we must ensure that the copying device is one passed by the user. Change-Id: I2532e66d30e6842624e594f235dd144a186220d4	2022-07-05 22:51:26 -05:00

712 Commits