* update AG direct and single node LL threshold
* update thresholds based on MI350 experimental results
* disable using LL for direct AG
* enable direct AG for lower GPU counts
* direct AG single node tuning
* fix in-place buffer allocation for AG unit test
* whitespace fix
* gate direct AG for gfx950 and gfx942
---------
Co-authored-by: Nusrat Islam <nusislam@nova-login-gtu2.prov.gtu.zts.cpe.ice.amd.com>
The minimum required CMake version in test/CMakeLists.txt is bumped from 2.8
to 3.16. This aligns with the version used in CMakeLists.txt and
enables building with CMake 4.
* [rocm_regression] Return errors when HSA_NO_SCRATCH_RECLAIM=1 even for rocm >= 6.4.0
* [rocm_regression] Check firmware version
* [rocm_regression] Resolve review comments
* [rocm_regression] Move hsa env checking into init once func
* [rocm_regression] Prevent hot fix version in firmware
* [rocm_regression] Improve unit tests
Changed `TestBedChild` protocol to send the result code before the return value to avoid hanging if the call fails. Switched `TestBedChild::GetUniqueId` to use this.
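The ordering change can be illustrated with a minimal sketch. It is not the actual `TestBedChild` code: `Channel`, `sendReply`, `recvReply`, `UniqueId`, and the error constants are hypothetical names, and an in-memory byte queue stands in for the real parent/child pipe. The point is that the reader learns from the result code whether a payload follows, so it never waits for bytes that a failed call will not send.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>

// Hypothetical in-memory stand-in for the parent/child pipe.
struct Channel {
  std::deque<std::uint8_t> bytes;

  void write(const void* src, std::size_t n) {
    const auto* p = static_cast<const std::uint8_t*>(src);
    bytes.insert(bytes.end(), p, p + n);
  }

  // Returns false instead of blocking when fewer than n bytes are queued.
  bool read(void* dst, std::size_t n) {
    if (bytes.size() < n) return false;
    auto* p = static_cast<std::uint8_t*>(dst);
    for (std::size_t i = 0; i < n; ++i) {
      p[i] = bytes.front();
      bytes.pop_front();
    }
    return true;
  }
};

constexpr std::int32_t kSuccess = 0;          // assumed "no error" code
constexpr std::int32_t kTransportError = -1;  // assumed local read failure

struct UniqueId { char data[128]; };  // hypothetical payload type

// Child side: write the result code first, then the payload only on success.
void sendReply(Channel& ch, std::int32_t result, const UniqueId* id) {
  ch.write(&result, sizeof(result));
  if (result == kSuccess) {
    assert(id != nullptr);
    ch.write(id, sizeof(*id));
  }
}

// Parent side: read the result code, and read the payload only if the
// child reported success. On failure nothing else is expected on the pipe.
std::int32_t recvReply(Channel& ch, UniqueId& out) {
  std::int32_t result = kTransportError;
  if (!ch.read(&result, sizeof(result))) return kTransportError;
  if (result == kSuccess && !ch.read(&out, sizeof(out))) return kTransportError;
  return result;
}
```

With the previous ordering (payload first), a parent blocked on the payload would wait indefinitely when the child's call failed and produced none.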
* File containing test for comm.h
* Update CommTest.cpp
Added gtest API for assert
* Update CommTest.cpp
Adding copyright
* Update CommTest.cpp
Removing info and tested as not required.
* Update and rename CommTest.cpp to CommTests.cpp
* Update CMakeLists.txt
* Added new binary for executing unit tests
Added new unit tests for argcheck.cc and alt_rsmi.cc files
Modified how unit tests are executed so that they can cover static
functions: a bash script converts static functions and variables to
non-static on the fly, restricted to the debug build type.
* Added new unit tests for src/transport/shm.cc
* Added new unit tests for graph/xml.cc
* Increased max stack size to 640
This feature tracks the proxy events and status of each send/recv op. ProxyTrace keeps a fixed number of active ops in host memory and dumps the status of each op when the program crashes or hangs.
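A rough sketch of the idea, assuming a fixed-size ring of records: it is not the ProxyTrace implementation, and `OpRecord`, `TraceRing`, and their fields are hypothetical. Hooking the dump into a crash or hang detector is left out.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <sstream>
#include <string>

enum class OpKind { Send, Recv };

// Hypothetical per-op status record kept in host memory.
struct OpRecord {
  std::uint64_t id;
  OpKind kind;
  int peer;
  std::uint64_t posted;  // steps posted so far
  std::uint64_t done;    // steps completed so far
  bool used;
};

// Fixed-capacity trace: once N ops are tracked, the oldest slot is reused.
template <std::size_t N>
class TraceRing {
 public:
  TraceRing() : slots_{}, next_(0) {}

  // Start tracking a new op and return the slot that holds it.
  std::size_t begin(OpKind kind, int peer) {
    const std::size_t slot = static_cast<std::size_t>(next_ % N);
    OpRecord& r = slots_[slot];
    r.id = next_++;
    r.kind = kind;
    r.peer = peer;
    r.posted = 0;
    r.done = 0;
    r.used = true;
    return slot;
  }

  // Record progress for the op currently stored in `slot`.
  void update(std::size_t slot, std::uint64_t posted, std::uint64_t done) {
    assert(slot < N && slots_[slot].used);
    slots_[slot].posted = posted;
    slots_[slot].done = done;
  }

  // Render one line per tracked op, in slot order.
  std::string dump() const {
    std::ostringstream os;
    for (const OpRecord& r : slots_) {
      if (!r.used) continue;
      os << "op " << r.id << (r.kind == OpKind::Send ? " send" : " recv")
         << " peer=" << r.peer << " posted=" << r.posted
         << " done=" << r.done << "\n";
    }
    return os.str();
  }

 private:
  std::array<OpRecord, N> slots_;
  std::uint64_t next_;
};
```

A fixed capacity keeps memory use bounded and avoids allocation on the hot path, at the cost of losing the oldest records once the ring wraps.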
* First version of new replayer, with comments on future TODOs
* plus minor fixes for UT
* Updated the recorder format, especially the binary format, to match the replayer's needs
* Revert "Revert "replacing rccl_float8 with hip_fp8 and address compatibility …"
This reverts commit 824b81c034.
* [UT] Modify max stack size to 496
* adding a check for OCP type and replacing ROCM_VERSION with HIP_VERSION
* addressing the ci failure
* Adding the device tag
---------
Co-authored-by: Marzieh Berenjkoub <mberenjk@amd.com>