Commit graph

2930 commits

Author SHA1 Message Date
Yifan Zhang 94d5ab8c9e Avoid memory leak when rocrtstFunc.Memory_Available fails
The assert aborts the test thread with memPtr1 still allocated. Free
memPtr1 to avoid the memory leak.

Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>
Change-Id: I4e1a202c1acb9ba71a23e112254f875bf5a0abcf


[ROCm/ROCR-Runtime commit: afae35b0fd]
2023-02-16 20:13:15 +08:00
Yifan Zhang 04ea6db7e6 Fix rocrtstFunc.Memory_Max_Mem failure for APUs w/ small VRAM
Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>
Change-Id: I1c0f481af8b1d2a0939d28fb184ff6887747ab03


[ROCm/ROCR-Runtime commit: 4ebb9857ee]
2023-02-16 20:12:19 +08:00
Lang Yu d44cbfd4ad Fix memory async copy test performance issue
Copying memory from device to host with a CPU agent
causes poor performance because the CPU reads
uncached device memory.

Fix it by using a GPU agent.

Signed-off-by: Lang Yu <Lang.Yu@amd.com>
Change-Id: Ia3b562758fe73ef9efaa284f47e67bf569cc7b7b


[ROCm/ROCR-Runtime commit: 8501c0bcb1]
2023-02-15 22:24:55 -05:00
Ranjith Ramakrishnan 1ff8eae7be File reorg backward compatibility message changed to #error
Change-Id: I699dee834865ee573a516d58b8b8faa1da4f288a


[ROCm/ROCR-Runtime commit: 3636d487c9]
2023-02-14 21:46:43 -08:00
David Yat Sin 3d2f3991f1 Fix for uninitialized variables
Change-Id: Ie8a004db699248d0cde4213077520ea503754399


[ROCm/ROCR-Runtime commit: 53d53655d7]
2023-02-14 14:19:31 +00:00
Jonathan Kim ff620e9fdc Add interface to DMA copy directly to a target engine.
Change-Id: Ic87cfeabb11c1a465f98f3f444d39955f5300525


[ROCm/ROCR-Runtime commit: 30920fc94d]
2023-02-13 13:50:49 -05:00
Jonathan Kim f161963c09 Make SDMA engine availability status queryable.
Report the availability of SDMA engines for memory copies.

Change-Id: Ie31b02d6b65355122bb8c98bc73700a59bee166e


[ROCm/ROCR-Runtime commit: 8f27f495c6]
2023-02-13 13:50:49 -05:00
Jonathan Kim 9021f5970d Make the number of per agent SDMA engines queryable.
Change-Id: Iae1cc9b7ec783fdda05f9384f0ad0327ea1a8cc3


[ROCm/ROCR-Runtime commit: 4f283d9bb3]
2023-02-13 13:50:49 -05:00
Ranjith Ramakrishnan df4cae2121 File reorg backward compatibility message changed to #error
Change-Id: I70b6f06b5e82242b3f50e7d1f0dac8a1eb8add11


[ROCm/ROCR-Runtime commit: 053b89414a]
2023-02-10 13:09:10 -05:00
David Yat Sin 59195c9478 Fix uninitialized variable warning in valgrind
Change-Id: I91e70d67671a8f7289b734407011380b6b97238a


[ROCm/ROCR-Runtime commit: fb8f42233d]
2023-02-09 17:35:53 -05:00
Xiaogang Chen 2fb44725df libhsakmt: Correct reporting of Shader Engines number.
The number of Shader Engines should be the shader array_count divided by
simd_arrays_per_engine, not array_count.

Signed-off-by: Xiaogang Chen <Xiaogang.Chen@amd.com>
Change-Id: I808d1fedd6b9843500719e902ecf759f5668a7d1


[ROCm/ROCR-Runtime commit: efcc9b275b]
2023-02-09 14:34:17 -05:00
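
The correction described in commit 2fb44725df can be illustrated with a small sketch. The field names array_count and simd_arrays_per_engine come from the commit message; the function name, the use of integer division, and the sample values are assumptions made only for illustration and are not taken from libhsakmt.

```python
def shader_engine_count(array_count: int, simd_arrays_per_engine: int) -> int:
    """Return the number of shader engines for a GPU.

    The commit message states that the count is array_count divided by
    simd_arrays_per_engine rather than array_count itself. Integer
    division is an assumption of this sketch.
    """
    if simd_arrays_per_engine <= 0:
        raise ValueError("simd_arrays_per_engine must be positive")
    return array_count // simd_arrays_per_engine


if __name__ == "__main__":
    # Hypothetical topology values, chosen only to show the difference
    # between reporting array_count and reporting the quotient.
    arrays, per_engine = 8, 2
    print("array_count:", arrays)
    print("shader engines:", shader_engine_count(arrays, per_engine))
```

With these sample values, reporting array_count directly would overstate the number of shader engines by a factor of simd_arrays_per_engine.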
Cordell Bloor 921ccf5f60 Fix static initialization order
Change-Id: I1d51e150b526d050b988fe5a422644667a561cd7


[ROCm/ROCR-Runtime commit: 5873a78d58]
2023-02-09 13:51:08 -05:00
David Yat Sin 000f4c0547 Add flag for external memory allocations
ROCr internally uses the same allocation_map_ list to track memory
allocations that are both for internal allocations and allocations by
users of ROCr library. In some edge cases, the library user would call
hsa_amd_pointer_info on an invalid pointer, but ROCR would return the
pointer as valid because this pointer belongs to a memory range that
was allocated internally within ROCr. Adding a flag to differentiate
between internal and external allocations.

Change-Id: I98c52bd85f3985d1ba1b0e3101d2254b003412cf


[ROCm/ROCR-Runtime commit: 59685f4492]
2023-02-09 13:21:43 -05:00
Sean Keely c6d7c62307 Track size of pending operations in blits.
Track and report the size, in bytes, of pending unexecuted blit
commands.  To be used in copy ganging.

Change-Id: Ia7453ff88571e927df771c6c819b73c17e67708e


[ROCm/ROCR-Runtime commit: 27596aef0c]
2023-02-06 12:38:40 -05:00
Konstantin Zhuravlyov e51d58a646 Compile image blit kernels with code object v4
Change-Id: I4b1923fe8f22dda1277409794d0856419228eceb


[ROCm/ROCR-Runtime commit: f115a3505c]
2023-02-02 17:33:15 -05:00
Graham Sider a56d22e215 kfdtest: Remove redundant SGPR/VGPR size checks in KFDTopologyTest
KFDTopologyTest.BasicTest duplicates Thunk logic to calculate VGPR size,
so the values always match, and SGPR size is a constant. Since the
comparisons provide no benefit, remove them.

Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Change-Id: I99e7ff6fb69ed07bc0716fdf43946b19c67b9268


[ROCm/ROCR-Runtime commit: 3fb1496fb3]
2023-01-26 15:08:12 -05:00
Ranjith Ramakrishnan f4ccd8979e rocrtst package name updated as per ROCm standards
Change-Id: I6d7096a5c5c27648bf0cbfb4b1e83e72b7949421


[ROCm/ROCR-Runtime commit: a649357dec]
2023-01-26 13:04:41 -05:00
Shweta Khatri 4f670d6df7 Fixes hang due to change in order of initialization of libraries
Fixes a hang caused by a change in the initialization order of libraries
that have cyclical dependencies and that call hsa_init() during their
initialization phase.
This implementation looks for a symbol called "HSA_AMD_TOOL_PRIORITY"
across all loaded shared libraries using the dynamic section entries of
each loaded library instead of using dlopen and dlsym for the same purpose.

Change-Id: I4865f2fd18dd186ec311a432ec38fbb5583805d2


[ROCm/ROCR-Runtime commit: 8aac885318]
2023-01-26 01:17:22 -05:00
David Belanger 868a9fe9a5 Revert "libhsakmt: Disabled allocation of CWSR with SVM for GFX11."
This reverts commit 1ba59d64e1.

Change-Id: I05bf82266f563c63c0b794a24b0926e7652ce42d
Signed-off-by: David Belanger <david.belanger@amd.com>


[ROCm/ROCR-Runtime commit: 0eb0bae38b]
2023-01-25 10:48:46 -05:00
David Belanger dbf94b7dd3 libhsakmt: Fixed VGPR memory size for GFX11.0 and GFX11.1.
Fixed VGPR memory size; the size was too small for some GPUs, causing a memory overflow.
Refactored macro code into a function.
Thanks to Jay Cornwall for locating the problem and proposing the fix.

Change-Id: Iffedea1c4f341967f02c56d810ff048225b02c16
Signed-off-by: David Belanger <david.belanger@amd.com>


[ROCm/ROCR-Runtime commit: a847a7b80e]
2023-01-25 10:45:44 -05:00
Ranjith Ramakrishnan b99bf7f3b6 Added OS details to kfdtest rpm package name
Change-Id: I600e094c364e1c7219ae3db12f0c4e1f7598c132


[ROCm/ROCR-Runtime commit: bc40579f96]
2023-01-23 12:13:32 -08:00
David Yat Sin 8a86cddd3a Add query for IOMMU support
Reporting whether IOMMU V2 is supported.
IOMMU V1 support is not relevant to user, so not reporting it.

Change-Id: I77389484a87a352da9c2f7b2a5d9de264f90ee53


[ROCm/ROCR-Runtime commit: e30be76f37]
2023-01-19 11:33:21 -05:00
David Yat Sin 580ce4fd25 Add memory pool query to return location
Change-Id: I240b77119d7b8ccfc5ff6a3190d6669d69f243e8


[ROCm/ROCR-Runtime commit: 722794e258]
2023-01-19 08:45:05 -05:00
David Yat Sin 523bdde26f Add env variable to print image SRD contents
Add environment variable HSA_IMAGE_PRINT_SRD to print contents of SRD
registers for image functions

Change-Id: Ifb47a73dcfad8745ee7445e20de96e1021b80bd6


[ROCm/ROCR-Runtime commit: a4f898ad15]
2023-01-13 11:01:04 -05:00
Alexander Turek e907b85904 isa: Add fix for hsa_isa_iterate_wavefronts always returns 64
Currently, Wavefront::GetInfo(HSA_WAVEFRONT_INFO_SIZE.. always returns
64. Instead, return the proper wavefront size based on the ISA.

Temporarily, we return only one wavefront size for each ISA, as we do not
have a mechanism from upper layers to determine the correct wavefront size
when multiple wavefront sizes are supported. We are temporarily
returning 32 for all gfx1xxx cards even though they support 64, as the
kernels for gfx1xxx are compiled for wavefront-32 by default.

Change-Id: Ic6c2917b7e6d3704daf742d243f5ec7f49430de9


[ROCm/ROCR-Runtime commit: f7e3782b42]
2023-01-12 08:40:07 -05:00
David Belanger 1ba59d64e1 libhsakmt: Disabled allocation of CWSR with SVM for GFX11.
This is a temporary workaround for GPU hang issues observed on GFX11.

Change-Id: I98fbedbbd1c51fe402c2116b35ca548931a390c9
Signed-off-by: David Belanger <david.belanger@amd.com>


[ROCm/ROCR-Runtime commit: b25867c4b8]
2023-01-11 17:28:31 -05:00
Shweta Khatri 36da397f96 Enforce uncached memory on AllocatePCIeRW request
Change-Id: Ib5a624ab979220d50205448ef37b4550672fb97d


[ROCm/ROCR-Runtime commit: ed0a1be2c3]
2023-01-11 16:52:15 -05:00
Ranjith Ramakrishnan 829d6536f8 Revert "Remove RPATH/RUNPATH from ROCm libraries"
This reverts commit 993b1dee7e.

Reason for revert: the change is blocked due to a new proposal, so it is being reverted.

Change-Id: Id9b8cc1560ba3eea6e484e67df3fdc647da9f37d


[ROCm/ROCR-Runtime commit: dbf8905dd1]
2023-01-10 13:52:02 -05:00
Eric Huang 54e3e5ab8f Revert "libhsakmt: Remove unnecessary CPU unmap"
This reverts commit 1bb6d872ac.

It causes a regression in pytorch benchmark.

Change-Id: I96173dbd061cf38d6f451c02cb181ae51b7f625e
Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>


[ROCm/ROCR-Runtime commit: 505287412f]
2023-01-06 17:16:40 -05:00
David Yat Sin 474d087407 Force rocrtsts to use Code Object V4
Temporarily force rocrtsts to use Code Object V4 while compiler team is
about to switch the default Code Object to V5. Will switch back to using
default compiler setting once everything is tested/fixed.

Change-Id: I18e5c6771fffd8c60792fc197501d373c7ec22f3


[ROCm/ROCR-Runtime commit: 0f2fa3ba72]
2023-01-06 12:01:03 -05:00
Shweta Khatri 811c411e78 Fixed GFX11 Texture, Buffer and Sampler Resource Descriptor definitions
Change-Id: I101806f9f91ec2ad78339dabc98375bd09946dd0


[ROCm/ROCR-Runtime commit: e72329ab76]
2023-01-05 15:40:47 -05:00
Ranjith Ramakrishnan 2d2992ea65 Corrected libelf package name in depends list
The libelf1 package contains libelf.so.1, so the package name was updated.
Improvement: removed the initialization of cmake_install_libdir from the source code.
The build scripts initialize the variable to "lib" and pass it as a build argument.

Change-Id: I16a8cdc4c231487410c1114b818e9d01df4854de


[ROCm/ROCR-Runtime commit: 5c90c762f9]
2022-12-15 23:30:22 -08:00
Alex Sierra d1bbeac5fe Revert "src: use SVM mechanism to register userptr memory"
This reverts commit ea19fbb646.
There are some OpenMP issues that were introduced after SVM userptr
feature was added.

Signed-off-by: Alex Sierra <Alex.Sierra@amd.com>
Change-Id: I7ef87c5232a3bcbe594c743fa4b4958601845ba5


[ROCm/ROCR-Runtime commit: f2bda56d04]
2022-12-08 17:33:51 -06:00
Alex Sierra 933292eedb Revert "libhsakmt: query svm info from userptrs at fault events"
This reverts commit a89bcd0518.
There are some OpenMP issues that were introduced after SVM userptr
feature was added.

Signed-off-by: Alex Sierra <Alex.Sierra@amd.com>
Change-Id: I6566c9f0d39d05ecb92f38159880763f432939a5


[ROCm/ROCR-Runtime commit: d9f86ae02b]
2022-12-08 17:33:50 -06:00
Alex Sierra f0e2e1936c Revert "libhsakmt: add env var to en/dis registration through SVM"
This reverts commit 6789a0f3bd.
There are some OpenMP issues that were introduced after SVM userptr
feature was added.

Signed-off-by: Alex Sierra <Alex.Sierra@amd.com>
Change-Id: Ib01046571d2c84fa0fd228ecba0dee0eae3f994d


[ROCm/ROCR-Runtime commit: 21e95a4f2a]
2022-12-08 17:33:48 -06:00
David Yat Sin 93c4ffe473 Add Stream Performance Monitor(SPM) APIs
Change-Id: I0d48782887814ef245b7e0182e2d5570aa8c3f50


[ROCm/ROCR-Runtime commit: 6bfe57aeb2]
2022-12-08 13:56:29 -05:00
David Yat Sin 652a617846 Add agent info for fw and sdma ucode
Add two new agent info fields:
HSA_AMD_AGENT_INFO_UCODE_VERSION
HSA_AMD_AGENT_INFO_SDMA_UCODE_VERSION

Change-Id: I51cb853724b23a26e945e5c1ac32c16d0cb3bc31


[ROCm/ROCR-Runtime commit: ecdebef0b9]
2022-12-07 19:07:31 -05:00
raghavmedicherla 2b666f57fa [hsa-runtime] Modify elfsection checks in amd_elf_image class
Modified the if-condition checks in GElfImage::pullElf() of amd_elf_image.cpp to
check section types instead of comparing strings.

Change-Id: I1ab92f0a9118fb2382652a1cc900a3150cbee2da


[ROCm/ROCR-Runtime commit: 5727a10a1b]
2022-12-05 14:42:02 -05:00
David Yat Sin 736e5ae731 Check for debug support after parsing topology
Thunk keeps an internal cache of system topology that can be used to
speed up subsequent calls to hsaKmtAcquireSystemProperties(). This cache
is cleared by calling hsaKmtReleaseSystemProperties() at the beginning
of BuildTopology().
hsaKmtRuntimeEnable() also calls hsaKmtAcquireSystemProperties() inside
Thunk. Move the call to hsaKmtRuntimeEnable() after BuildTopology() so that
we can reuse Thunk's internal cache.
Parsing the topology can take ~150 ms on systems with a large number of
nodes.

Change-Id: I741709d49d67d244f5fbd707fe8f01ab923bb153


[ROCm/ROCR-Runtime commit: e39ad34d9c]
2022-12-02 11:26:00 -05:00
James Zhu 9d84ed8de6 kfdtest: track Test Status in syslog
Track test status in syslog. This helps relate
syslog entries to the test cases that produced them.

Change-Id: I7c0749102db9bc73d6ae3a237ec347a8fefb12e9
Signed-off-by: James Zhu <James.Zhu@amd.com>


[ROCm/ROCR-Runtime commit: 7db29c4797]
2022-11-29 17:46:40 -05:00
Felix Kuehling 1bb6d872ac libhsakmt: Remove unnecessary CPU unmap
This is handled by __fmm_release calling aperture_release_area.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Change-Id: Ib8ed300e1734f03aeb9dfc8074897ece310b8af9


[ROCm/ROCR-Runtime commit: 7787a039bd]
2022-11-28 17:18:13 -05:00
Felix Kuehling b3db03a7d5 libhsakmt: Refactor and clean up CPU mappings
Use a common helper for CPU mappings to reduce duplicate code.
Consistently use MAP_SHARED for all render_fd mappings.
Remove double-mapping for AQL queue buffers on the CPU. This workaround
is only needed on the GPU.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Change-Id: Iff86c8cc9f1e5c982614b3f11129bc2cf8cbba02


[ROCm/ROCR-Runtime commit: 73b0fb3d7c]
2022-11-28 17:18:05 -05:00
Felix Kuehling eac689291a libhsakmt: Fix and simplify debug_get_reg_status
The NULL pointer check was the only way for that function to fail. And it
was done after the pointer was accessed. Simplify this by just returning
the result as a return value instead of using a pointer as output
parameter. This way the function can never fail and the caller doesn't
need to do any error handling.

Declare the function in libhsakmt.h instead of duplicating the
declaration in fmm.c.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Change-Id: I91b90d66166fd3b5cdc47c73a9bbc369c45b51fe


[ROCm/ROCR-Runtime commit: 2d53430ce3]
2022-11-28 17:17:43 -05:00
Alex Sierra 6789a0f3bd libhsakmt: add env var to en/dis registration through SVM
Setting this variable to '0' forcibly disables memory
registration/allocation through the SVM API mechanism.
If it is unset or set to '1', the SVM API is used only if all
GPUs support it.

Signed-off-by: Alex Sierra <Alex.Sierra@amd.com>
Change-Id: Icdf7656de09aa9988b567ec6c024953398e9bb48


[ROCm/ROCR-Runtime commit: 8a746bdaed]
2022-11-28 13:42:43 -05:00
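
The decision rule described in commit 6789a0f3bd can be sketched as follows. The commit message does not name the environment variable, so EXAMPLE_SVM_REGISTRATION is a hypothetical placeholder, as is the function name. Treating any value other than '0' like '1', and treating a system with no GPUs as unsupported, are assumptions of this sketch rather than documented behaviour.

```python
import os

# Hypothetical name: the commit message does not state the real variable name.
ENV_VAR = "EXAMPLE_SVM_REGISTRATION"


def svm_registration_enabled(env, gpus_support_svm):
    """Decide whether registration should go through the SVM API.

    env is a mapping such as os.environ; gpus_support_svm is a list of
    booleans, one per GPU, telling whether that GPU supports SVM.
    """
    if env.get(ENV_VAR) == "0":
        # Explicitly disabled, regardless of hardware support.
        return False
    # Unset or "1": use SVM only if every GPU supports it.
    # Assumption: other values behave like "1", and no GPUs means "no".
    return len(gpus_support_svm) > 0 and all(gpus_support_svm)


if __name__ == "__main__":
    print(svm_registration_enabled(os.environ, [True, True]))
```

Passing the environment as a parameter keeps the rule easy to exercise without modifying the real process environment.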
Daniel Phillips e1b6def53c kfdtest: Also detect under-reporting of available memory
Detect under-reporting of available memory by initially attempting to
allocate substantially more than reported available memory, and ensure
that the allocation fails. Continue shrinking the attempted allocation
until it succeeds, then fail the test if the successful allocation is
either too much more than or too much less than reported available.

Signed-off-by: Daniel Phillips <daniel.phillips@amd.com>
Change-Id: Ib418f0aa26e8db80590a6c5f2578da56a4b60f2b


[ROCm/ROCR-Runtime commit: e71eb13784]
2022-11-28 11:43:48 -05:00
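
The procedure described in commit e1b6def53c can be sketched roughly as follows. This is not the kfdtest implementation: the function name, the try_alloc callback, and the overshoot, shrink and tolerance values are all hypothetical, chosen only to make the steps concrete.

```python
def reported_memory_is_plausible(reported, try_alloc,
                                 overshoot=1.5, shrink=0.95, tolerance=0.10):
    """Check a reported 'available memory' figure against real allocations.

    reported  -- the available memory reported by the driver, in bytes
    try_alloc -- callable(size) -> bool; True if an allocation of that
                 size succeeds (the caller is expected to free it again)
    """
    # Step 1: an allocation substantially larger than reported must fail.
    size = int(reported * overshoot)
    if try_alloc(size):
        return False  # memory was under-reported

    # Step 2: keep shrinking the request until an allocation succeeds.
    while size > 0 and not try_alloc(size):
        size = int(size * shrink)

    # Step 3: the first successful size must be close to the reported value,
    # neither much more nor much less.
    return abs(size - reported) <= tolerance * reported


if __name__ == "__main__":
    actual = 1000  # pretend the device really has 1000 bytes free

    def fake_alloc(n):
        return n <= actual

    for reported in (1000, 500, 2000):
        print(reported, reported_memory_is_plausible(reported, fake_alloc))
```

In the fake setup, a report of 500 is caught in step 1 because the oversized request succeeds, while a report of 2000 is caught in step 3 because the first successful allocation lands far below the reported figure.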
Felix Kuehling 021ceccd80 libhsakmt: Fix use of uninitialized variable
When hsaKmtCreateQueue is called for the first time for a node,
doorbells[NodeId].size is initialized to zero in init_process_doorbells
but is used to calculate the doorbell offset. It works only by accident:
doorbells[NodeId].size is a uint32_t, so -1 becomes 0xFFFFFFFF, which
is zero-extended to 0x00000000FFFFFFFF, and it works as long as the mmap
offset bits are not within the lower 32 bits.

Bug: https://github.com/RadeonOpenCompute/ROCT-Thunk-Interface/issues/78
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Change-Id: Ia791adfc51363d4704cb50fa4f01137b7dd48a75


[ROCm/ROCR-Runtime commit: 8e69b9c70e]
2022-11-25 14:07:45 -05:00
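
The arithmetic described in commit 021ceccd80 can be reproduced with explicit masks. The wraparound and zero-extension follow from the commit message. The form of the offset calculation, offset & ~(size - 1), is an assumption of this sketch, since the message says only that the size is used to calculate the doorbell offset. The function names and sample offsets are hypothetical.

```python
UINT32_MASK = 0xFFFFFFFF
UINT64_MASK = 0xFFFFFFFFFFFFFFFF


def size_minus_one_u32(size):
    """Compute size - 1 with 32-bit unsigned wraparound, as uint32_t would."""
    return (size - 1) & UINT32_MASK


def align_down_u64(offset, size):
    """Assumed usage: clear the low bits of a 64-bit offset using size - 1.

    The 32-bit value is zero-extended to 64 bits before it is inverted,
    so its upper 32 bits are zero.
    """
    mask64 = size_minus_one_u32(size)  # zero-extended: upper bits are 0
    return offset & (~mask64 & UINT64_MASK)


if __name__ == "__main__":
    # A size of zero wraps around to 0xFFFFFFFF.
    print(hex(size_minus_one_u32(0)))
    # The offset's upper 32 bits survive, its lower 32 bits are cleared.
    print(hex(align_down_u64(0x0000000100002000, 0)))
    # With a real size (hypothetical 0x2000) only the low 13 bits are cleared.
    print(hex(align_down_u64(0x0000000100002345, 0x2000)))
```

Under that assumption, an offset whose significant bits all sit above bit 31 passes through unchanged when the size is zero, which matches the commit's remark that the code worked by accident.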
Eric Huang 6d9fc60ea1 kfdtest: remove scc test in MapUnmapToNodes for gfx90a A+A
Modifier scc is disabled from gfx90a's asm, so remove the
shader for gfx90a A+A and keep it for newer asics with scc
support.

Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
Change-Id: Iec3c7ccd5156a855adb2b02feb3db0761876aa2f


[ROCm/ROCR-Runtime commit: 8e8aa024fd]
2022-11-25 13:55:28 -05:00
David Yat Sin 0e7a4d8ace libhsakmt: Initialize fd to -1
Fix compile error due to warning in some environments

Change-Id: Ie5fcfabb872c27c0de349eb215345b997fae7201


[ROCm/ROCR-Runtime commit: f46ddb7ead]
2022-11-25 15:01:53 +00:00
Ranjith Ramakrishnan ddaee6ccc6 Change pragma message to warning
The file reorganization feature was implemented with backward compatibility.
The backward compatibility support will be deprecated in a future release.
Changed the #pragma message to #warning for a smooth transition.

Change-Id: I21025f4cefb40721f095130263b4247877979d36


[ROCm/ROCR-Runtime commit: 01fd84db5e]
2022-11-23 13:06:34 -05:00
Shweta Khatri c236e10be6 Fixed callback method for dl_iterate_phdr api which is called for each loaded shared object
Simplified the callback method. Also fixed the way loaded shared objects were appended to a string vector,
which was not being passed to this callback method.

Change-Id: I68661dd73f61a11c42fa92f670e8e7b6ffcb5711


[ROCm/ROCR-Runtime commit: 8751e65b79]
2022-11-21 19:00:34 -05:00