Previous versions of HIP call hsa_amd_ipc_memory_create with the
len aligned to granularity. Temporarily allow this so that we do not
break backward compatibility. This will be removed after 2 releases.
Change-Id: I6b5ac2cad5d32d62c803637cf1a2c6deebc03169
[ROCm/ROCR-Runtime commit: cb71e2d715]
To avoid confusion since this shader has changed to be persistent
(original IterateIsa may be re-used for debugger tests).
Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Change-Id: I4643692765fc7665933257e89d5b922e779ad2e5
[ROCm/ROCR-Runtime commit: 6467664ec7]
Don't use the full path to link against libc, but rather let
cmake find it.
Regarding gcc_s, it doesn't seem like this is still needed, so I've
removed it instead.
Change-Id: I1dc594f10c647b2abfdab7c5e0de90c331c6eeaf
Signed-off-by: Jeremy Newton <Jeremy.Newton@amd.com>
[ROCm/ROCR-Runtime commit: a63d53ad8b]
To be more aligned with ROCr, the libdrm dev package appears to be
required, but we don't care if it's ours or the distro's. So require
either, but recommend our package to get the latest version.
Change-Id: I744ce4861644a83ba94c39e0bf4230eab58cc68a
Signed-off-by: Jeremy Newton <Jeremy.Newton@amd.com>
[ROCm/ROCR-Runtime commit: 4fa9404fe2]
MES devices need GART mappings and therefore need non-paged memory. But
using non-paged memory introduces a performance regression where it can
take over 80 ms to see the signal changes if the memory is in the wrong
NUMA node. Currently, we cannot control NUMA affinity when allocating
non-paged memory. Use non-paged memory allocation only on devices that
have the MES scheduler.
Change-Id: Ib27fb01d75247aa4f2bb2aa4503c6af5a98afda0
[ROCm/ROCR-Runtime commit: c1e836b6ab]
Using the previous method of std::thread for the SVM profiler task was
causing segfaults on thread launch on RHEL 8 if the libhsa-runtime
library is loaded using dlopen.
Change-Id: Ic010cd6ae9bc6e6ed0605de02b93f6aae8ed3e97
[ROCm/ROCR-Runtime commit: 0e4c7336ff]
Transient exec usage is not required for GFX11 and will result in a NULL
return of s_sendmsg_rtn if directly returned to exec_lo.
Directly fetch and mask the doorbell ID to ttmp3 for GFX11 instead.
Change-Id: Ie17ed69d68d84ab18869b1c7871a0ed0482cd661
[ROCm/ROCR-Runtime commit: f9edf73cd7]
Update rocrtst packaging to add a dependency on rocm-core so that rocrtst
gets uninstalled when the rocm-core package is removed.
Depends-On : I1e7ed52d7eed2c190d0b5651e7ded7192d7634b5
Change-Id: I7243dd29950b93a2665720a0062816c574f0f640
[ROCm/ROCR-Runtime commit: 8225271e18]
On Ubuntu, the package Depends list was not showing libelf. Add it.
Change-Id: I713951bd7181f44d667561aaf437f85c6cd783b0
[ROCm/ROCR-Runtime commit: 76cf5d2edc]
If hsa_amd_agents_allow_access is called for an imported IPC handle,
ignore the request, as this pointer will already have been mapped to
the other GPUs during IPCAttach().
Change-Id: I4bf33ed57e93b5a3ead749d4f87ab6f2750bed58
[ROCm/ROCR-Runtime commit: b4f26534eb]
Get more debug information, triggered by memory exception events, about
user pointers that were registered through the SVM API.
A new kfdtest covering this use case was also added to
KFDExceptionTest.
Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Change-Id: I8e9df3c1c6c3f42d7b9235d12406d80d31746443
[ROCm/ROCR-Runtime commit: 45fad29752]
Register and map userptrs through the Shared Virtual Memory (SVM) API at
the kernel level when available. With this approach, performance
will be improved, as register/unregister memory will not trigger any
system call to the KFD driver.
Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Change-Id: I20723cbeb340bf48b95e1115f0102c031397bc14
[ROCm/ROCR-Runtime commit: 178a619b80]
If a user queries the pointer info on an invalid pointer,
hsaKmtQueryPointerInfo will return error or unknown pointer. The other
fields in HsaPointerInfo are invalid, so we do not return them to the
user.
Also remove the assert and return unknown pointer instead, as the
assert will not trigger in release builds.
hsaKmtQueryPointerInfo may also return unknown pointer for userptrs, as
they are not always tracked by thunk. Adjust the code to still treat
these pointers as valid in this case.
Change-Id: Idf5cd8b61cd532d31b072f449839d223369bb138
[ROCm/ROCR-Runtime commit: 18547173e9]
Since all public interface libraries are present in the same folder,
RUNPATH/RPATH is not required in the library itself.
The application shall provide the required RPATH/RUNPATH to load all
libraries.
Change-Id: I1d1ba920bf291eb89bd1f4c0fd0cfd80c7d739bd
[ROCm/ROCR-Runtime commit: ac66865385]
The amount of memory requested by the user may be aligned up internally
to the memory pool granularity. The extra padded memory should not be
considered when validating pointers from the user. Also return the
user-requested size when the user queries pointer information.
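
A minimal sketch of the idea, not the runtime's actual bookkeeping: the
struct and function names below are hypothetical. The record keeps both
the user-requested size and the padded size, and validates ranges only
against the former.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record for one allocation (illustration only). */
struct alloc_record {
    uintptr_t base;     /* start address handed to the user */
    size_t user_size;   /* size the user asked for */
    size_t padded_size; /* user_size rounded up to the pool granularity */
};

/* Round size up to a multiple of granule, which must be a power of two. */
static size_t align_up(size_t size, size_t granule)
{
    assert(granule != 0 && (granule & (granule - 1)) == 0);
    return (size + granule - 1) & ~(granule - 1);
}

static struct alloc_record make_record(uintptr_t base, size_t user_size,
                                       size_t granule)
{
    struct alloc_record r = { base, user_size, align_up(user_size, granule) };
    return r;
}

/* A range is valid only if it lies inside the user-requested size.
 * The padding between user_size and padded_size is never accepted. */
static bool range_is_valid(const struct alloc_record *r, uintptr_t ptr,
                           size_t len)
{
    if (ptr < r->base)
        return false;
    size_t offset = ptr - r->base;
    return offset <= r->user_size && len <= r->user_size - offset;
}
```

With a 4096-byte granularity, a 5000-byte request is padded to 8192
bytes, but a query would still report 5000 and a range ending past byte
5000 would be rejected.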
Change-Id: I28b25448ea03c836b44fafdb34b7330cf6887424
[ROCm/ROCR-Runtime commit: 39632a713e]
For APU ASICs, the default configured size of video memory is
relatively small, while the reserved region has become larger in
recent-generation ASICs. The ratio of max alloc size to pool size may
fall below the expected value, so adjust it.
Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>
Change-Id: I0e847c4c13e957cf6e811d3f379842619cf53370
[ROCm/ROCR-Runtime commit: f05770610c]
What we want for libdrm-amdgpu is for it to be a recommended package.
Either libdrm or libdrm-amdgpu can be used, but we recommend the latter.
Using "SUGGESTS" does not seem like a strong enough requirement, but
CPack does not support RPM recommends. It does, however, allow
customizing the RPM SPEC file template. A template can be generated by
setting:
-DCPACK_RPM_GENERATE_USER_BINARY_SPECFILE_TEMPLATE=1
This template file can then be trivially modified to add a line that
implements CPACK_RPM_PACKAGE_RECOMMENDS.
Fixes
Signed-off-by: Jeremy Newton <Jeremy.Newton@amd.com>
Change-Id: I34467b1ba878827ced9b8db74977967815732552
[ROCm/ROCR-Runtime commit: 1621936e32]
KFDCWSRTest.BasicTest is parameterized to allow an easy method of
tweaking the number of work-items (and save/restores). The input/output
buffers were previously hardcoded to a single page, which would cause a
segmentation fault if the number of work-items specified is greater than
1024 for wave32.
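
As a rough illustration of sizing the buffers from the parameter, the
sketch below assumes a 4 KiB page and one 32-bit slot per work-item.
Both are assumptions made for this example, as is the helper name; the
test's real layout may differ.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u /* assumed page size for this example */

/* Bytes needed when each work-item owns one 32-bit slot (an assumption),
 * rounded up to whole pages, with a minimum of one page. */
static size_t buffer_bytes_for(size_t work_items)
{
    size_t bytes = work_items * sizeof(uint32_t);
    size_t pages = (bytes + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE;
    return (pages ? pages : 1) * SKETCH_PAGE_SIZE;
}
```

Under these assumptions a single page covers exactly 1024 work-items,
and any larger count needs additional pages.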
Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Change-Id: Ieefc819a5d81c77cee88081a287fd383e6378e74
[ROCm/ROCR-Runtime commit: 73adbdee2c]
For software trap in GFX11, COMPUTE_PGM_RSRC1 must have PRIV = 1.
Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Change-Id: Id504889c3ca2588b6c8cefdebaec00dcfc217995
[ROCm/ROCR-Runtime commit: 6294ef564b]
On error, mmap returns MAP_FAILED, which is (void *)-1, not a NULL
pointer.
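
A short sketch of the correct check. The wrapper name is hypothetical
and is not taken from the patch.

```c
#define _GNU_SOURCE /* for MAP_ANONYMOUS under strict -std=c* modes */
#include <stddef.h>
#include <sys/mman.h>

/* Returns a zero-filled anonymous mapping, or NULL on failure.
 * mmap reports failure with MAP_FAILED, so that is the value to test. */
static void *map_anonymous(size_t len)
{
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) /* not: if (p == NULL) or if (!p) */
        return NULL;
    return p;
}
```

A NULL test on the raw mmap result would treat (void *)-1 as a valid
address and fault on first use.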
Change-Id: I81b187266c943fa0aa4fab21b529d4c2989b12ad
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
[ROCm/ROCR-Runtime commit: 590fd531c0]
Prior to launch, some ASICs may re-use PCI DIDs from older generations.
This can cause issues during topology initialization, as hsa_gfxip_table
lookups will override sysfs-provided gfx versions, causing incorrect
gfxip selection. Since no new entries will be added to hsa_gfxip_table,
limit its search to pre-GFX11 ASICs only.
Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Change-Id: I53eaefac5db2650a36a6ce9f21daf750f50cfd26
[ROCm/ROCR-Runtime commit: 79279e860f]
Fix the Binary Search sample code, as the kernel symbol name has a .kd
extension.
Change-Id: Id21d2e432faa40bcd5cf343345502e823678fd0f
[ROCm/ROCR-Runtime commit: d9935e6fba]
IterateIsa had some leftover instructions from when the shader was
getting updated for KFDCWSRTest.BasicTest.
Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Change-Id: I41ae7b7948cbe2aff8bf61b170b9a7d498b836a3
[ROCm/ROCR-Runtime commit: 82a41c7e4d]
On fork, the copy-on-write MMU notifier on the CWSR range evicts user
queues, then updates the GPU mapping and resumes the queues. Use
MADV_DONTFORK to avoid the COW MMU notifier callback on the CWSR SVM
range.
Use mmap to allocate the SVM range for CWSR, because posix_memalign
doesn't allocate a new range in the child process; registering the SVM
range then fails because the range is an invalid address in the forked
child process.
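
A minimal sketch of the mmap plus MADV_DONTFORK combination on Linux.
The helper name is hypothetical and SVM registration is left out.

```c
#define _GNU_SOURCE /* for MAP_ANONYMOUS and MADV_DONTFORK */
#include <stddef.h>
#include <sys/mman.h>

/* Map an anonymous range and mark it MADV_DONTFORK so that a forked
 * child does not inherit the mapping. Returns NULL on failure. */
static void *map_dontfork(size_t len)
{
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return NULL;
    if (madvise(p, len, MADV_DONTFORK) != 0) {
        munmap(p, len); /* undo the mapping if the advice is rejected */
        return NULL;
    }
    return p;
}
```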
Change-Id: Ibaea56a691dd6f577ed2e1f2d43f4a3500b8316f
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
[ROCm/ROCR-Runtime commit: 093cf898fb]
mmap allocates a larger address range that includes alignment padding
plus guard pages, then unmaps the padding and guard pages at the
beginning and end of the range, and returns the aligned address range.
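
The over-allocate-and-trim approach can be sketched as follows. This is
an illustration under stated assumptions (size, align and guard are
multiples of the page size, and align is a power of two), with a
hypothetical function name; it is not the patch's code.

```c
#define _GNU_SOURCE /* for MAP_ANONYMOUS */
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/* Reserve size + align + 2 * guard bytes, pick the first aligned address
 * at least guard bytes into the mapping, then unmap everything before
 * and after the aligned range. Returns NULL on failure. */
static void *mmap_aligned(size_t size, size_t align, size_t guard)
{
    size_t total = size + align + 2 * guard;
    char *raw = mmap(NULL, total, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (raw == MAP_FAILED)
        return NULL;

    uintptr_t start = ((uintptr_t)raw + guard + align - 1)
                      & ~((uintptr_t)align - 1);
    char *aligned = (char *)start;
    size_t head = (size_t)(aligned - raw); /* leading guard + padding */
    size_t tail = total - head - size;     /* trailing padding + guard */

    if (head)
        munmap(raw, head);
    if (tail)
        munmap(aligned + size, tail);
    return aligned;
}
```

The caller releases the result with munmap(ptr, size), since only the
aligned range remains mapped.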
Change-Id: Iaf3c711a079c744289efbafee9b5e63aaf724765
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
[ROCm/ROCR-Runtime commit: b7710a1dda]
GFX1036 (ISA version) is not included in the previous range.
This patch extends the range to include all gfx10 series ASICs.
Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Change-Id: I0e28dbfc031c216166b306b9fb39f644f75a330f
[ROCm/ROCR-Runtime commit: 06a90612e9]
To avoid a segfault, runtime debugger enable is not supported
if the GPU firmware doesn't support debug exceptions.
Signed-off-by: jie1zhan <jesse.zhang@amd.com>
Change-Id: Ifad57a6e78cb1c92b1f8927355ece8c64e89c51b
[ROCm/ROCR-Runtime commit: d98c729ff9]
Remove potential double free condition when free_queue() is called
after hsaKmtDestroyQueue() if mapping doorbell fails during queue
creation.
Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Change-Id: If2aa19c455b30d2940b232dbafb9cc1eaad721a5
[ROCm/ROCR-Runtime commit: 57a1c6f3ff]
When running kfdtest test cases, the test case is reported as not
supported because the filter node for the new chip is missing in
libhsakmt. Add a new test node in order to support the kfdtest case.
Signed-off-by: shikaguo <shikai.guo@amd.com>
Change-Id: I0cd9ffd7d4387129cfb0f8de6b669f431949ab49
[ROCm/ROCR-Runtime commit: 4951495fca]
Disable automatic dependency detection when generating rocrtst RPMs.
This was adding an unnecessary dependency on libhwloc, which is now
provided with the rocrtst package.
This matches the behavior for DEB packages, where there is no dependency
list for rocrtst.
Change-Id: If4a93f5b4c039b2f45e9445f60f65eefe84e32eb
[ROCm/ROCR-Runtime commit: e2388f242a]
Queue ctx_save_restore memory is allocated with size
ctx_save_restore_size + debug_memory_size. Use the same size
in free_queue to free ctx_save_restore memory.
Change-Id: I4902ff15fb82ddea64b8342b89776a1bf5c38d13
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
[ROCm/ROCR-Runtime commit: 3dbf5feffe]
Avoid a segfault when an invalid SharedMemoryHandle is passed to
fmm_register_shared_memory.
Change-Id: I0e0bbed01487fc10afcbb170eb9330e70b209d14
Signed-off-by: David Yat Sin <David.YatSin@amd.com>
[ROCm/ROCR-Runtime commit: 1c385fb257]
Close the file at the end of every test, instead of at the end of the
whole test.
Change-Id: Ia510990dad8d0bd82625bbd9b2958181e8f1dd25
[ROCm/ROCR-Runtime commit: 8941e7135c]
Now that HsaNodeProperties is passed in to
topology_get_node_props_from_drm, check that pointer instead of the
pointer for MarketingName (which throws a compiler warning)
Signed-off-by: kent.russell@amd.com <kent.russell@amd.com>
Change-Id: If76b24e1bab5a62e514ab440b6316c7b7cd264c1
[ROCm/ROCR-Runtime commit: ea4d4917c1]