This patch is a hotfix that corrects the parameter-count check after
the removal of the dgpu input.
Signed-off-by: Huang Rui <ray.huang@amd.com>
Change-Id: Ic980588f78616f99076de742af580afb4273fb2f
[ROCm/ROCR-Runtime commit: 8fc816affe]
gfx90c should use GFX902, the same as gfx902.
Signed-off-by: Huang Rui <ray.huang@amd.com>
Change-Id: Id24dc2c85c9f49f36b00889c3b8b1b19cce34e09
[ROCm/ROCR-Runtime commit: 8ea0d49337]
These are removed now that we've consolidated the dev package
information into CMakeLists.txt from hsakmt-dev.txt.
Change-Id: I49496ec5def85b0af7fa6b15110910528a8e0be0
[ROCm/ROCR-Runtime commit: 654ee83ac8]
Add extended descriptions and e-mail address to CMakeLists.
A lintian error regarding stripping the .so will remain, as we
will not be doing this for Release versions of the hsakmt .so.
Signed-off-by: Kent Russell <kent.russell@amd.com>
Change-Id: I41c768dee28c0564d92b9c103a6e2d97590e4589
[ROCm/ROCR-Runtime commit: 0a4b23d625]
Whether to use the dGPU path is now decided by checking the properties
exposed by the kernel, so it no longer needs to be hard-coded in the
ASIC table.
Signed-off-by: Huang Rui <ray.huang@amd.com>
Change-Id: I0c018a26b219914a41197ff36dbec7a75945d452
[ROCm/ROCR-Runtime commit: ad87f38dad]
KFD already implements the fallback path for APUs. The Thunk now uses
the flag exposed by KFD to configure is_dgpu instead of the previous
hard-coded value.
Signed-off-by: Huang Rui <ray.huang@amd.com>
Change-Id: I445f6cf668f9484dd06cd9ae1bb3cfe7428ec7eb
[ROCm/ROCR-Runtime commit: 12813691a2]
For small copies, cache flush latency is larger than data transfer
latency in local VRAM. Select SDMA for small copies.
Environment key HSA_FORCE_SDMA_SIZE is added for easy adjustment
of the small copy size. This may be removed after tuning is done.
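The selection described above can be sketched as follows. This is a minimal illustration, not the runtime's code: the CopyEngine enum, the helper names, the default threshold, and the assumption that HSA_FORCE_SDMA_SIZE holds a byte count are all hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical engine choice; the runtime's real types are not shown here.
enum class CopyEngine { kSdma, kBlitKernel };

// Hypothetical default threshold in bytes, chosen only for illustration.
constexpr std::size_t kDefaultSmallCopyBytes = 4096;

// Reads HSA_FORCE_SDMA_SIZE (assumed to be a byte count), falling back to
// the default when the variable is unset or cannot be parsed as a number.
inline std::size_t SmallCopyThreshold() {
  const char* value = std::getenv("HSA_FORCE_SDMA_SIZE");
  if (value == nullptr || *value == '\0') return kDefaultSmallCopyBytes;
  char* end = nullptr;
  const unsigned long long parsed = std::strtoull(value, &end, 0);
  if (end == value || *end != '\0') return kDefaultSmallCopyBytes;
  return static_cast<std::size_t>(parsed);
}

// Copies at or below the threshold go to SDMA; larger ones use the
// (hypothetical) blit kernel path that pays the cache flush cost.
inline CopyEngine SelectEngine(std::size_t copy_bytes, std::size_t threshold) {
  return copy_bytes <= threshold ? CopyEngine::kSdma : CopyEngine::kBlitKernel;
}
```

A caller would evaluate SelectEngine(size, SmallCopyThreshold()) once per copy, so the threshold can be tuned without rebuilding.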
Change-Id: I733fa0ae01c616617c5de50e71226b51fd589ef2
[ROCm/ROCR-Runtime commit: 2a0c6774fb]
Gfx10 needs a 12 bytes/wave control stack.
Change-Id: I6c6f2819572e6b43aa3140d4dbe79d930e4c1c9c
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Signed-off-by: Jay Cornwall <Jay.Cornwall@amd.com>
[ROCm/ROCR-Runtime commit: 3d3b28b670]
l_name is populated by strdup which requires using free rather
than delete.
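The rule can be illustrated with a small sketch. The FreeDeleter, MallocString, and DuplicateName names are hypothetical and are not taken from the runtime; the point is only that memory from strdup is paired with free.

```cpp
#include <cassert>
#include <cstdlib>
#include <memory>
#include <string.h>  // strdup is POSIX and declared here

// Deleter that releases malloc-family allocations such as strdup's result.
struct FreeDeleter {
  void operator()(char* p) const { std::free(p); }
};
using MallocString = std::unique_ptr<char, FreeDeleter>;

// Duplicates `name` with strdup and hands ownership to a free()-based owner,
// so the buffer can never be released with delete by mistake.
inline MallocString DuplicateName(const char* name) {
  assert(name != nullptr);
  return MallocString(strdup(name));
}
```

Tying the deallocator to the owning type is one way to keep the allocation and release functions matched; calling free directly at the release site is equally valid.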
Change-Id: I9d9bdcfaa3ef095502270f332b95a0ee5c0bbcfc
[ROCm/ROCR-Runtime commit: 9c20f0e649]
We want wraparound behavior here but we don't want to trigger sanitizer
warnings. Converting to int64_t and then wrapping around by a cast to
uint64_t avoids the UB issue that triggers the sanitizer warning.
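A minimal sketch of the pattern, assuming the value being converted is a possibly negative double (the original expression is not shown in this message, so the source type and the helper names are hypothetical):

```cpp
#include <cassert>
#include <cstdint>

// Converts a possibly negative delta to uint64_t with wraparound.
// Casting a negative double straight to uint64_t is undefined behavior;
// converting to int64_t first is well defined for in-range values, and the
// signed-to-unsigned cast that follows is defined as reduction modulo 2^64.
inline std::uint64_t WrapToU64(double delta) {
  // Assumed precondition: delta fits in int64_t.
  assert(delta >= -9.2e18 && delta <= 9.2e18);
  const std::int64_t as_signed = static_cast<std::int64_t>(delta);
  return static_cast<std::uint64_t>(as_signed);
}

// Unsigned addition wraps by definition, so a negative delta subtracts.
inline std::uint64_t AddWrapped(std::uint64_t base, double delta) {
  return base + WrapToU64(delta);
}
```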
Change-Id: I9400b988dce7899e9ba42cab3e35c7ffedec8fe1
[ROCm/ROCR-Runtime commit: 5f43778a51]
This is needed to avoid additional references to mapped BOs in child
processes that can prevent freeing memory in the parent process and lead
to out-of-memory conditions.
Change-Id: I25c90510a14dde515cc23ea5dc1f68e8d7e37a66
Signed-off-by: Philip Cox <Philip.Cox@amd.com>
[ROCm/ROCR-Runtime commit: f7a3427c99]
strlen(src) should not be used as the length in strncpy. Use memcpy
since we know the length of the string, and ensure that we
NULL-terminate regardless of length.
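A minimal sketch of the approach, using a hypothetical CopyString helper (the name and signature are not from the Thunk):

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Copies at most dst_size - 1 bytes of src into dst and always terminates
// the result, whether or not src had to be truncated.
inline void CopyString(char* dst, std::size_t dst_size, const char* src) {
  assert(dst != nullptr && src != nullptr && dst_size > 0);
  std::size_t len = std::strlen(src);
  if (len >= dst_size) len = dst_size - 1;  // truncate to fit the buffer
  std::memcpy(dst, src, len);
  dst[len] = '\0';
}
```

Bounding the copy by the destination size, rather than by strlen(src), is what keeps the write inside the buffer.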
Signed-off-by: Kent Russell <kent.russell@amd.com>
Change-Id: I21cc6d106510c69464e7ac9d3fc7da3a1e6d1a68
[ROCm/ROCR-Runtime commit: 04f6b9e16b]
The option to use kfd_fd for CPU mapping exists only for very old,
broken KFD versions and is not used upstream. It causes issues when
multiple processes use shared system memory because the GTT address is
over 40 bits.
Change to always use the render node fd to create the CPU mapping.
Change-Id: Id7e7b2a2e2f13c6e62c5de170589abfff4d456b0
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
[ROCm/ROCR-Runtime commit: 9e9771a7d9]
Use modern CMake for rocrtest, in accordance with the recent changes
in HSA. These changes also ensure that the tests can be compiled
against both static and dynamic libs.
Change-Id: I6dfb5259a4cbd994f413f68d1ebadc2ba5fe4f34
[ROCm/ROCR-Runtime commit: d13342d03a]
The following CMake changes are in accordance with the changes in HSA,
THUNK, VDI, etc. They make the code compilable with both static and
dynamic libs.
Change-Id: I4d8d3e2b84d6e1ea00531594522111ccbce8a87b
[ROCm/ROCR-Runtime commit: 4827d1d4d4]
Make explicit reference to hsa_api_trace.cpp from
initialization of hsa_table_interface.cpp. Breaks
the ability to use hsa_table_interface.cpp in plugins.
Change-Id: I22a42d3a132512b0d9ec7a1ca629b169e7f8eba7
[ROCm/ROCR-Runtime commit: f4fe7ddf47]
Rather than manually linking to the device libraries, the compiler
can now handle linking with them. Allow the build to continue using
the old layout if the build system still uses it, thereby maintaining
compatibility with ROCm 3.7 and earlier.
Change-Id: Ida81775da3d0f7c2c67386a71cb057ede31a1545
[ROCm/ROCR-Runtime commit: d23b26f760]
The excess declarations mark implementation functions as default
visibility. Normally this is not an issue, since our linker script
specifies which visible symbols are permitted into the dynamic
symbol table. However, for static linking methods that apply linker
directives during incremental linking, symbol visibility must be
correct in the (non-dynamic) symbol table.
Change-Id: I13dc8dd1019368e8943920d36335a91f0c555a92
[ROCm/ROCR-Runtime commit: f6e6eae86d]
The size of the m0 payload for MSG_INTERRUPT has changed in gfx10. It is
now 23 bits wide instead of the 24 bits used in gfx9.
Since we are generating different binaries for gfx9 and gfx10, we can
conditionally set DEBUG_INTERRUPT_CONTEXT_ID_BIT to 23 for gfx9 and
22 for gfx10.
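The arithmetic can be written out as a small sketch. It assumes the context-id flag occupies the top bit of the payload, which is an inference from the numbers above; the constant and helper names are hypothetical, and the real definition lives in the trap handler source rather than in C++.

```cpp
#include <cassert>

// Width in bits of the MSG_INTERRUPT m0 payload, per the description above.
constexpr unsigned kGfx9PayloadBits = 24;
constexpr unsigned kGfx10PayloadBits = 23;

// Assumption: the context-id flag is the highest bit of the payload.
constexpr unsigned DebugInterruptContextIdBit(unsigned payload_bits) {
  return payload_bits - 1;
}

static_assert(DebugInterruptContextIdBit(kGfx9PayloadBits) == 23, "gfx9");
static_assert(DebugInterruptContextIdBit(kGfx10PayloadBits) == 22, "gfx10");
```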
Change-Id: Ifc15a9fa4399d35328ab58b742f791f1660bcd9a
[ROCm/ROCR-Runtime commit: 23df617150]
This reverts commit c0a0ada18b.
Reason for revert: Change was submitted by accident
Change-Id: If05c705e22296fd3ca789f269737d379a933361d
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
[ROCm/ROCR-Runtime commit: fec3780c1a]
- Make code object reader use mmap when loading from a file on Linux.
- Support computing code object URI for memory either from the loaded
host executables, or from all mmapped files. Define the environment
variable HSA_LOADER_ENABLE_MMAP_URI to non 0 to search the mmap
files, otherwise only the loaded executables will be searched.
- For mmap search, determine file size and omit offset and size URI
fragment when the code object is the whole file even when specifying
a file size explicitly or specifying memory that has been mmapped.
- Always return a non-empty code object URI.
- When a code object reader is created, complete all fields to ensure
it can be used in a multi-threaded manner using only const
operations.
- Add missing exception handlers in the AMD vendor extensions.
- More rigorous checking for errors.
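The whole-file rule in the list above can be sketched as follows. The FileCodeObjectUri helper is hypothetical, and the "file://" scheme with a "#offset=..&size=.." fragment is assumed here for illustration; the loader's actual implementation is not shown in this message.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Builds a file-based code object URI (assumed syntax, see the note above).
// The offset/size fragment is omitted when the code object spans the
// entire file, and the result is never empty.
inline std::string FileCodeObjectUri(const std::string& path,
                                     std::size_t file_size,
                                     std::size_t offset, std::size_t size) {
  assert(offset <= file_size && size <= file_size - offset);
  std::string uri = "file://" + path;
  const bool whole_file = (offset == 0 && size == file_size);
  if (!whole_file) {
    uri += "#offset=" + std::to_string(offset) +
           "&size=" + std::to_string(size);
  }
  return uri;
}
```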
Change-Id: I07797b1dc60c5c64245142d77becf9f7c9643395
[ROCm/ROCR-Runtime commit: 91cb98dab6]
Since CMAKE_MODULE_PATH can already be set by another project,
we should just append the libhsakmt cmake module directory to it.
Change-Id: I999dc52a2862e4bbff02e0a8e8b39530f4dae2cd
Signed-off-by: Vlad Sytchenko <vladislav.sytchenko@amd.com>
[ROCm/ROCR-Runtime commit: 5fb771a195]
This is to avoid circular dependencies when using Ninja as a generator.
Change-Id: I703f225c9f342dfb07c36ad0920927c40c922fb8
[ROCm/ROCR-Runtime commit: ea80e94756]
New addrlib trips this warning in release builds on Ubuntu 18.04
with gcc.
Change-Id: I4a8aa0e531fa21011ddde99d769a8452d333ff20
[ROCm/ROCR-Runtime commit: 2e1b863195]