Commit graph

16 Commits

Author SHA1 Message Date
Dimple Prajapati 87f99e7ec6 Add host APIs for querying device ctx and remote heap pointer (#200)
* Add host APIs for querying device ctx and remote heap pointer

* Host API to query device pointer for ROCSHMEM_DEFAULT_CONTEXT,
  this is needed to support dynamic module initialization via device kernel
  library bitcode.
* Host API to query remote symmetric heap pointer that can be used in
  custom device kernel for RMA operations.

* Added rocshmem_ptr implementation within the Host Context class
* Enables pointer retrieval functionality for symmetric data objects
* Copy IPC pointers to host memory in RO host context

---------

Co-authored-by: avinashkethineedi <avinash.kethineedi@amd.com>
2025-07-24 11:03:03 -07:00
Aurelien Bouteiller 63a79892b2 rocshmem_config.h has a different include path when installed and built-dir (#186)
* rocshmem_config.h needs to be in a similar directory structure for
includes to work both when building testers in the build directory and
when building against an installed library

* Do not change installed rocshmem.hpp
2025-07-02 16:51:38 -04:00
Avinash Kethineedi bf48bcabf2 Refactor Barrier_all and Sync_all APIs to use default context (#159)
* Refactor `Barrier_all` and `Sync_all` to use default context

- Removed context-specific implementations of barrier_all and sync_all
- Added barrier_all and sync_all to the default context implementation
- Updated functional tests to use the default context for barrier_all and sync_all

* Update `Barrier_all` and `Sync_all` API usage in documentation

* Update `CHANGELOG`

---------

Co-authored-by: Yiltan <ytemucin@amd.com>
2025-06-17 11:16:18 -05:00
Aurelien Bouteiller 3600291558 Detailed logs (#124)
* Use a single printf per line (reduce chances of lines being cut in logs)

* team_comm can be an int or a pointer depending on the MPI implementation.
"Received" is confusing (since we are on the origin); use "submitted"
instead

* Print arguments to calls when using DEBUG

---------

Signed-off-by: Aurelien Bouteiller <abouteil@amd.com>
2025-05-15 10:37:12 -04:00
Aurelien Bouteiller 9befbe8293 bugfix: do not dereference ctx during create_ctx if we did run out (#83) 2025-04-16 10:37:44 -04:00
Avinash Kethineedi f6ef19f5a9 Add SPDX license identifiers and update copyright headers (#85)
* Update copyright information and add SPDX license identifier

* Update AUTHORS

* Remove `sos_tests`
2025-04-15 15:37:53 -05:00
Avinash Kethineedi 68421895d6 Update collective APIs naming (#77)
* Update the naming convention for collective APIs to ensure consistency across the interface.

* Move all collective API declarations to rocshmem_COLL.hpp

* The following APIs were updated as part of this change:
  - `barrier`
  - `barrier_all`
  - `sync`
  - `sync_all`
  - `all_to_all`
  - `broadcast`
  - `fcollect`
  - `all_reduce`

* Update header file generation code for collective APIs
2025-04-10 12:14:47 -05:00
Avinash Kethineedi dc61bca066 Update Barrier and Sync APIs (#73)
* Add thread, wavefront, and workgroup-level `barrier` APIs in IPC and RO conduits; remove collectives on default context
  - Implemented `barrier` APIs for thread, wavefront, and workgroup scopes
  - Added support in both IPC and RO conduits
  - Added functional tests to cover all `barrier` APIs
  - Removed collective operations on default context

* Add thread, wavefront, and workgroup-level `sync` APIs in IPC and RO conduits.
  - Implemented `sync` APIs for thread, wavefront, and workgroup scopes
  - Added support in both IPC and RO conduits
  - Added functional tests to cover all `sync` APIs

* Update naming convention for context-based `barrier` APIs
2025-04-08 11:25:31 -05:00
Avinash Kethineedi c652f58cef Update Barrier_All and Sync_All APIs (#72)
* Fix deadlock in `rocshmem_ctx_wg_barrier_all` API in IPC conduit by adding per-context pSync buffers and context IDs
  - Added separate pSync buffers for each device context
  - Resolved deadlock when invoking barrier API (`rocshmem_ctx_wg_barrier_all`) concurrently from multiple contexts

* Update barrier_all functional tests for multi-context support

* Add thread, wavefront, and workgroup-level barrier_all APIs in IPC and RO conduits
  - Implemented barrier_all APIs at thread, wavefront, and workgroup granularity
  - Added support in both IPC and RO conduits
  - Updated functional tests to cover all `barrier_all` APIs

* Add thread, wavefront, and workgroup-level sync_all APIs in IPC and RO conduits
  - Implemented sync_all APIs for thread, wavefront, and workgroup scopes
  - Added support in both IPC and RO conduits
  - Added functional tests to cover all `sync_all` APIs
2025-04-02 11:58:55 -05:00
Edgar Gabriel bcbc42e78f add rocshmem_barrier() (#61)
* add team-barrier implementation

add a team-barrier API and implementation in the IPC and RO conduit.
Clean up some of the logic in the RO Conduit to distinguish between
sync, sync_all, barrier, and barrier_all.

* add team_barrier_tests to functional tests
2025-03-24 11:23:03 -05:00
Yiltan 658bf2a3b5 Removed GPU_IB (#59) 2025-03-24 09:04:52 -04:00
avinashkethineedi 21dbd5cc5e Remove rocshmem_timer function 2025-02-17 17:10:51 +00:00
avinashkethineedi e311400d15 Fix rocshmem_ctx_my_pe and rocshmem_ctx_n_pes APIs to return PE numbering and size relative to the team in a team-specific context. 2025-02-05 03:41:40 +00:00
avinashkethineedi 6486e29078 Rename config.h to roc_shmem_config.h 2024-12-06 01:08:13 +00:00
avinashkethineedi d8ce066adc Merge branch PR #55 into naming_scheme 2024-12-04 21:46:38 +00:00
Brandon Potter fd8dbc7fb6 Use new naming scheme 2024-11-25 14:25:29 -06:00