* Changed stream error warning, removed regex search from attach execute test
* Formatting
* Revert accidental change
* Fix stream hang caused by acquiring the same lock twice
* Updated add stream code, need to update tests
* Update attachment tests to use streams, threads, and multiple devices
* Update tests and fix stream issues
* Updated error messages to be more explicit, updated json to csv code in conftest to include streams and threads
* Formatting
* Add attachment label to attachment tests and update validation to fix errors
* Fix attach twice conftest
* Disabled ThreadSanitizer tests for attachment since they no longer work with the bin file changes
* Updated for comment
* Added null check for getting attach status
* migrate docs update workflow from rocm-libraries
* add test branch to the trigger condition
* modify docs to test workflow
* temporarily rename project folder name to match the test project
* add more content for testing
* test successful, restore test modifications
Modify the code that computes the adjusted CU mask array to account
for additional cases of inactive CUs.
Signed-off-by: David Belanger <david.belanger@amd.com>
* SWDEV-534207 - fix 'Unit_hipFreeMipmappedArrayImplicitSyncArray - float' out-of-memory error with extent (1024, 1024, 1024) and 1 level on 740M iGPUs. totalGlobalMem is not really the amount of device memory available for compute
* SWDEV-534207 - compare expected available memory within a range in Unit_hipMalloc3D_Basic to account for some bookkeeping overhead (instead of in exact 64MB chunks)
* SWDEV-534207 - set the missing SvmGpuMemoryCreateInfo::interprocess flag in the 'fine' and 'fine uncached' memory and 'MemorySubAllocator' cases. Coarse allocation was added first; the flag was missed when the other three cases were added
* SWDEV-534207 - allow more room for the check of available memory after hipFree() in Unit_hipMalloc3D_Basic; it was still failing on 740M
---------
Co-authored-by: Gerardo Hernandez <gerardo.hernandez@amd.com>
Co-authored-by: systems-assistant[bot] <systems-assistant[bot]@users.noreply.github.com>
* Add GHCR retry logic
* Add retries to Install ROCm Packages step in rocprofiler-systems-redhat.yml
* Update containers-ci.yml file to use latest RHEL9/10 releases
* Use build-docker-ci script in rocprofiler-systems-containers
* Remove working-directory from step in rocprofiler-systems-redhat.yml
* Remove shell bash from Install ROCm Packages step
* Revert RHEL version change in rocprofiler-systems-redhat.yml
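A minimal sketch of the kind of retry logic the bullets above describe, written in Python for illustration only. The workflow's actual retry mechanism is not shown in these messages, so the names `retry`, `action`, `attempts`, `delay_seconds`, and `sleep` are all hypothetical.

```python
import time


def retry(action, attempts=3, delay_seconds=0.0, sleep=time.sleep):
    """Call `action` until it succeeds or `attempts` tries are used up.

    Hypothetical helper: `action` stands in for a flaky step such as a
    registry login or a package install.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except Exception as error:  # a real script would catch narrower errors
            last_error = error
            if attempt < attempts:
                # Linear backoff: wait a little longer after each failure.
                sleep(delay_seconds * attempt)
    # Every attempt failed, so surface the last error to the caller.
    raise last_error
```

Passing `sleep` as a parameter keeps the helper testable without real delays.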
This test prefetches SVM memory and then verifies that the memory is
sourced from the expected NUMA node.
Signed-off-by: Sunday Clement <Sunday.Clement@amd.com>
* Add rdhc script to the rocm-core package
* Create the rdhc symlink within the package itself.
* Removed hard-coding of rocm-core name, used CORE_TARGET instead.
* [RDHC] Check whether the required pip packages are present and warn.
rdhc checks whether the required pip packages are present.
If any are missing, it warns the user and exits gracefully.
Signed-off-by: Saravanan Solaiyappan <saravanan.solaiyappan@amd.com>
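The dependency check described above could look roughly like the following sketch. It is not the rdhc implementation: `find_missing_modules` and `check_requirements` are hypothetical names, and the sketch assumes each required package can be identified by its top-level import name, which does not always match the pip package name.

```python
import importlib.util


def find_missing_modules(required):
    """Return the top-level module names in `required` that are not importable."""
    return [name for name in required if importlib.util.find_spec(name) is None]


def check_requirements(required):
    """Warn about missing modules; return True only when all are present."""
    missing = find_missing_modules(required)
    if missing:
        print("WARNING: missing Python packages: " + ", ".join(missing))
        return False
    return True
```

A caller can exit cleanly when `check_requirements` returns False instead of failing later with an import error.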
Sanity-check tools report a buffer overrun that does not actually
occur. Even so, remove the function that triggers the report and make
the code more robust with defensive programming so that tools no
longer flag issues here. Also fix comments, correct typos, and add a
new return type required by the refactoring.
Query IPC handles on shared memory export/import for any metadata as a
means to uniquely identify handles that happen to be backed by buffers
that point to the same memory.
- On some hosts, wget can finish too soon and PAPI doesn't catch even a single network event.
- On some hosts, there are multiple default NICs and the scripts didn't work in that case.
- The test script wrote the wget output to the /tmp directory, which causes a problem if another user runs the same test: an output file with the same name already exists there under a different owner, so the test fails.
---------
Co-authored-by: David Galiffi <David.Galiffi@amd.com>
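The messages above do not say how the shared /tmp output file was fixed. One common way to avoid that kind of collision is to create a uniquely named file per run, sketched here in Python under that assumption. `make_private_output_file` and the `wget_output_` prefix are hypothetical.

```python
import os
import tempfile


def make_private_output_file(prefix="wget_output_"):
    """Create a uniquely named temporary file and return its path.

    tempfile.mkstemp picks an unused name and creates the file readable
    and writable only by the current user, so two users running the same
    test never contend for one fixed path.
    """
    fd, path = tempfile.mkstemp(prefix=prefix)
    os.close(fd)  # only the path is needed; the downloader reopens it
    return path
```

The caller is responsible for deleting the file once the test finishes.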