Commit Graph

26 Commits

Author SHA1 Message Date
Avinash Kethineedi 526105d315 Implement rocshmem_ptr in IPC conduit (#197)
* Implement `rocshmem_ptr` in IPC conduit

* tests: add functional test for `rocshmem_ptr`
  - Add safety check for pointer access and condition check before printing results for `rocshmem_ptr` test
  - Use `rocshmem_put` to store `rocshmem_ptr` availability for data validation
2025-07-28 12:01:02 -05:00
Aurelien Bouteiller 63a79892b2 rocshmem_config.h has a different include path when installed and built-dir (#186)
* rocshmem_config.h needs to be in a similar directory structure for
includes to work when building testers in build, and from an installed
library

* Do not change installed rocshmem.hpp
2025-07-02 16:51:38 -04:00
Avinash Kethineedi bf48bcabf2 Refactor Barrier_all and Sync_all APIs to use default context (#159)
* Refactor `Barrier_all` and `Sync_all` to use default context

- Removed context-specific implementations of barrier_all and sync_all
- Added barrier_all and sync_all to the default context implementation
- Updated functional tests to use the default context for barrier_all and sync_all

* Update `Barrier_all` and `Sync_all` API usage in documentation

* Update `CHANGELOG`

---------

Co-authored-by: Yiltan <ytemucin@amd.com>
2025-06-17 11:16:18 -05:00
Avinash Kethineedi f6ef19f5a9 Add SPDX license identifiers and update copyright headers (#85)
* Update copyright information and add SPDX license identifier

* Update AUTHORS

* Remove `sos_tests`
2025-04-15 15:37:53 -05:00
Avinash Kethineedi c652f58cef Update Barrier_All and Sync_All APIs (#72)
* Fix deadlock in `rocshmem_ctx_wg_barrier_all` API in IPC conduit by adding per-context pSync buffers and context IDs
  - Added separate pSync buffers for each device context
  - Resolved deadlock when invoking barrier API (`rocshmem_ctx_wg_barrier_all`) concurrently from multiple contexts

* Update barrier_all functional tests for multi-context support

* Add thread, wavefront, and workgroup-level barrier_all APIs in IPC and RO conduits
  - Implemented barrier_all APIs at thread, wavefront, and workgroup granularity
  - Added support in both IPC and RO conduits
  - Updated functional tests to cover all `barrier_all` APIs

* Add thread, wavefront, and workgroup-level sync_all APIs in IPC and RO conduits
  - Implemented sync_all APIs for thread, wavefront, and workgroup scopes
  - Added support in both IPC and RO conduits
  - Added functional tests to cover all `sync_all` APIs
2025-04-02 11:58:55 -05:00
Avinash Kethineedi 248972b30b Merge pull request #31 from avinashkethineedi/rocshmem_g
Implement `rocshmem_g` API and optimize memory usage
2025-02-04 11:15:41 -06:00
Yiltan Hassan Temucin fd3eaa3f69 [IPC] Fix ROCSHMEM_SIGNAL_ADD 2025-02-03 09:59:28 -08:00
avinashkethineedi 757d7e53ca Implement rocshmem_g API and optimize memory usage
- Implement `rocshmem_g` API
- Free up memory space allocated for `rocshmem_g` and atomic operations' return values
2025-02-02 05:56:46 +00:00
avinashkethineedi 6486e29078 Rename config.h to roc_shmem_config.h 2024-12-06 01:08:13 +00:00
avinashkethineedi d8ce066adc Merge branch PR #55 into naming_scheme 2024-12-04 21:46:38 +00:00
Brandon Potter fd8dbc7fb6 Use new naming scheme 2024-11-25 14:25:29 -06:00
Yiltan Temucin d8f44e4436 Added Signalling Operations 2024-11-22 15:36:17 -06:00
avinashkethineedi d1ee997542 Update puts and gets to include a fence following data movement, ensuring data visibility 2024-11-12 16:52:07 +00:00
Yiltan Hassan Temucin fe767d9abf remove cooperative groups 2024-10-30 20:10:21 +00:00
Edgar Gabriel 87db7f7d38 fix barrier synchronization on gfx90a 2024-10-24 15:08:28 +00:00
avinashkethineedi 6685d0ab60 Add fine grained memory buffers for work/sync arrays
* Add internal put_mem/get_mem{_wave, _wg} functions to read/write to work/sync arrays
* Add condition check to ensure all MPI processes are on the same compute node for IPC conduit
2024-10-21 15:28:39 +00:00
Yiltan Hassan Temucin 509277c034 fixed notifier bug 2024-10-10 06:45:43 -07:00
avinashkethineedi 92fb1abaf2 Add team information to the context
* Update roc_shmem_ctx_fence API to use team-relative PE numbering
* Update backend to populate team_opaque member of ROC_SHMEM_CTX_DEFAULT (used to store information about the team wrt TEAM_WORLD)
2024-10-04 17:56:15 +00:00
avinashkethineedi 979aed105a Add fence and quiet functionality
* Perform atomic stores to enforce memory ordering
2024-10-03 06:28:12 +00:00
Avinash Kethineedi e58077e3cf Merge branch 'ipc_bringup' into ipc_atomics 2024-09-09 14:22:55 -05:00
avinashkethineedi 7bbf34d334 remove local_pe calculation from puts, gets and atomics functions
* All the PEs are assumed to be accessible using IPC backend
2024-09-05 11:52:00 -07:00
Edgar Gabriel aae6295460 ipc/context_ipc_device.cpp: set barrier_sync
set the barrier_sync variable on the context during
object creation
2024-08-28 09:41:05 -07:00
avinashkethineedi 45a8cb3354 Update IPC object
* Update the IPC object in the context class with the instance created in the IPC backend
2024-08-28 08:14:38 -07:00
Edgar Gabriel 0de3b5e6fc first cut on collectives and sync
code is based on the GPUIB implementations of the routines, which seem
however generic enough to work also for the IPC conduit.

Some code is in for broadcast, fcollect, and alltoall.
2024-08-27 15:03:38 -07:00
avinashkethineedi c8b0f2378e Add gets and puts functionality to IPC context 2024-08-15 13:17:44 -07:00
avinashkethineedi 49779863c2 Add IPC backend
* add backend_ipc.{cpp & hpp}
* rename context_ipc.{cpp & hpp} to context_ipc_device.{cpp & hpp}
* add host interface to IPC backend
* add context_ipc_host.{cpp & hpp} to support host interface
* add USE_RO compile flag to enable support for single backend interface at a time
* add ipc_single script to build rocSHMEM with IPC backend
2024-08-14 22:59:02 -07:00