* Update primitive tests for multi-workgroup support
* Update workgroup primitive tests for multi-workgroup support
* Update wavefront primitive tests for multi-workgroup support
* Update team-based primitive tests for multi-workgroup support
* Update RMA functional tests to capture timing after quiet call
- Modified RMA functional tests to record the time after a `quiet` call in thread, wavefront, and workgroup RMA calls.
* Improve error handling and memory management
- Replaced `cout` with `cerr` for improved error reporting.
- Ensured all allocated memory is freed when `rocshmem_malloc` fails.
* Update start time in primitive tests and latency calculations
- Modified primitive tests to capture the earliest start time.
- Updated latency calculations in functional tests.
* Remove `GetSwarmTester`
* Update start time in team primitive tests
* Invoke quiet call from a single thread within a block on a rocshmem context
[ROCm/rocshmem commit: aa3121a967]
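
The "earliest start time" change above can be pictured with a small host-side sketch. This is not the test suite's code: `WorkgroupTiming` and `elapsed_ticks` are hypothetical names, and pairing the earliest start with the latest end is an assumption, since the commit message only mentions the start time.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical per-workgroup timestamps in timer ticks (illustrative only).
struct WorkgroupTiming {
  std::uint64_t start;
  std::uint64_t end;
};

// Elapsed ticks measured from the earliest start across all workgroups.
// Assumption: the latest end is used as the stop point.
std::uint64_t elapsed_ticks(const std::vector<WorkgroupTiming>& timings) {
  if (timings.empty()) return 0;
  std::uint64_t earliest_start = timings.front().start;
  std::uint64_t latest_end = timings.front().end;
  for (const auto& t : timings) {
    earliest_start = std::min(earliest_start, t.start);
    latest_end = std::max(latest_end, t.end);
  }
  return latest_end - earliest_start;
}
```

With several workgroups running, taking each workgroup's own start would under-report latency whenever one workgroup launches later than another; anchoring on the minimum avoids that.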
- Remove hdp and ipc pointers from BlockHandle, align RO stats with RO contexts
- Add run commands for `rocshmem_g` and `rocshmem_p` API tests in driver.sh
- Allocate rocshmem API return buffers based on number of device contexts.
- Associate status flag address with blocking calls and remove threadId dependency
- Associated the status flag address with each blocking call request to notify the GPU thread.
- Removed dependency on threadId for determining the appropriate status flag index.
- Move status flag buffer allocation to backend.
- Initialize allocated memory to zero
[ROCm/rocshmem commit: df4ad2c04d]
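
A minimal sketch of the status flag change described above, assuming a request record that carries the address of its own flag. `BlockingRequest`, `complete`, and `make_status_flags` are hypothetical names for illustration and are not rocSHMEM identifiers; the real code would use GPU-visible memory and suitable memory ordering, which this single-threaded host sketch leaves out.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical request record (illustrative, not the rocSHMEM layout).
struct BlockingRequest {
  int op;        // placeholder for the requested operation
  char* status;  // address of the flag the requesting thread polls
};

// Stand-in for the service side: signals completion through the address
// carried in the request, so no threadId-based index lookup is needed.
void complete(const BlockingRequest& req) {
  *req.status = 1;
}

// Allocates a flag buffer with every flag initialized to zero.
std::vector<char> make_status_flags(std::size_t slots) {
  return std::vector<char>(slots, 0);
}
```

Because the flag address travels with the request, the side that completes the request does not need to know how flags are laid out or which thread issued the call.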
* Rearrange CMakefile
* Enable linking to external rocshmem library
* Minor fix for the functional test driver
* ROCSHMEM_HOME detection fixed
[ROCm/rocshmem commit: 96424a59a8]
* Update install_dependencies.sh
* Updated to ROCm repos
* Merge pull request #37 from ROCm/depBuild
* Locked ompi and ucx to specific versions
* [IPC] Fix ROCSHMEM_SIGNAL_ADD
* Generate CMake Package Configuration Files
---------
Co-authored-by: akolliasAMD <99202231+akolliasAMD@users.noreply.github.com>
Co-authored-by: akolliasAMD <akollias@amd.com>
[ROCm/rocshmem commit: 785e31aa48]
* Implemented tiled version of put*_wave and get*_wave functions
* Maintain single kernel that supports both tiled and untiled versions
* Disable IPC in the default RO build script
[ROCm/rocshmem commit: b6d31ac7ef]
* These functional tests are simple puts and gets, where every wave will get/put the same amount of data
* Enabled workgroup level puts and gets tests
[ROCm/rocshmem commit: ff954237dd]
* add backend_ipc.{cpp & hpp}
* rename context_ipc.{cpp & hpp} to context_ipc_device.{cpp & hpp}
* add host interface to IPC backend
* add context_ipc_host.{cpp & hpp} to support host interface
* add USE_RO compile flag to enable support for a single backend interface at a time
* add ipc_single script to build rocSHMEM with IPC backend
[ROCm/rocshmem commit: 49779863c2]
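
One way a compile flag such as `USE_RO` could select a single backend at build time is sketched below. Only the flag name comes from the commit message; the `ROBackend` and `IPCBackend` types, the `SelectedBackend` alias, and the assumption that leaving `USE_RO` undefined selects IPC are all hypothetical.

```cpp
#include <string>

// Hypothetical backend stand-ins (illustrative only).
struct ROBackend {
  std::string name() const { return "RO"; }
};
struct IPCBackend {
  std::string name() const { return "IPC"; }
};

// Exactly one backend type is compiled in, chosen by the USE_RO flag.
// Assumption: an undefined USE_RO means the IPC backend is selected.
#if defined(USE_RO)
using SelectedBackend = ROBackend;
#else
using SelectedBackend = IPCBackend;
#endif

// Reports which backend this translation unit was built against.
std::string selected_backend_name() {
  return SelectedBackend{}.name();
}
```

Compiling with `-DUSE_RO` would switch the alias to the RO type, so callers written against `SelectedBackend` need no changes.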