27 Commits

Author SHA1 Message Date
Aurelien Bouteiller 972893bab2 Reenable building test-only with external MPI (#352)
[ROCm/rocshmem commit: 1a16b3bedc]
2025-12-10 11:40:29 -05:00
Edgar Gabriel feab645795 update to build system: (#277)
- add the PMIx library at compile time only if PMIx support was found.
  This is required, e.g., when compiling rocSHMEM with ompi
  4.0/4.1, which do not have a built-in PMIx version.
- setting USE_EXTERNAL_MPI=OFF ensures that we do not
  check for external MPI libraries (even if one is available).

[ROCm/rocshmem commit: ed957302d4]
2025-10-17 07:42:11 -05:00
Aurelien Bouteiller 8837414042 Cleanup/wg init (#260)
* remove wg_init and wg_finalize from functional tests

* Remove wg_init and wg_finalize from examples

* deprecate wg_init/finalize

* Updated docs

* Typo in documentation

---------

Co-authored-by: Yiltan <yiltan@amd.com>

[ROCm/rocshmem commit: 6e7277b544]
2025-10-07 14:34:18 -04:00
Edgar Gabriel 53fa35b980 Remove MPI compile-time dependency (#264)
* use dlsym for MPI functions

To allow compiling without MPI support, convert the use of MPI functions and symbols to a dlopen/dlsym-based mechanism. It turns out this cannot be done in an entirely vendor-neutral way; slightly different solutions might be required for Open MPI, MPICH, and the new MPI ABI.

* checkpoint

more work to be done.

* checkpoint 2

* checkpoint 3

* checkpoint 4

examples compile and link correctly

* checkpoint 5 (I think)

* Checkpoint 6

* dyld-mpi: adapt GDA

* dyldmpi: tests that depend on MPI need to link with it themselves

* do not include ../mpi_instance.h

* dyldmpi: make the symetricHeapTestFixture compile

* dyldmpi: Change cmakery; compiles and runs gda w/o external MPI

* Make it also compile in external MPI mode

* dyldmpi: ipc unit tests compile but do not link

* dyldmpi: new approach, if external mpi required, link with mpi,
otherwise use ompi5 abi

* C-style comments in cmakelist..

* dyldmpi: examples: do not fail compiling if MPI not found at build time,
instead do not compile the MPI required examples

* more updates to CMake logic

* convert RO backend

and a few other cleanups

* update some unit tests

to work with the dlopen MPI environment correctly.

---------

Co-authored-by: Aurelien Bouteiller <abouteil@amd.com>

[ROCm/rocshmem commit: e4c427a736]
2025-10-01 08:06:56 -05:00
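
The dlopen/dlsym approach described in the commit above can be illustrated with a minimal sketch. It is not the rocSHMEM implementation: the `DynLib` wrapper and its `resolve` method are hypothetical names, and the library and symbol names are supplied by the caller. It only shows the general pattern of loading a library at run time and resolving a function pointer, so that nothing from that library is needed at compile or link time.

```cpp
#include <dlfcn.h>

// Hypothetical RAII wrapper around a dlopen() handle (illustration only,
// not taken from rocSHMEM).
class DynLib {
 public:
  // Try to load the shared library at `path`; failure is recorded, not fatal.
  explicit DynLib(const char* path)
      : handle_(dlopen(path, RTLD_NOW | RTLD_LOCAL)) {}

  ~DynLib() {
    if (handle_ != nullptr) dlclose(handle_);
  }

  DynLib(const DynLib&) = delete;
  DynLib& operator=(const DynLib&) = delete;

  // True if the library was found and loaded.
  bool loaded() const { return handle_ != nullptr; }

  // Look up `name` and cast it to the function-pointer type Fn.
  // Returns nullptr if the library is not loaded or the symbol is missing.
  template <typename Fn>
  Fn resolve(const char* name) const {
    if (handle_ == nullptr) return nullptr;
    return reinterpret_cast<Fn>(dlsym(handle_, name));
  }

 private:
  void* handle_;
};
```

A caller would typically check `loaded()` and each resolved pointer before use, and skip MPI-dependent functionality when either is missing, for example `DynLib lib("libmpi.so"); auto init = lib.resolve<int (*)(int*, char***)>("MPI_Init");`. The file name `libmpi.so` is an assumption here and may differ between MPI vendors and installations.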
Edgar Gabriel e167f50803 Introduce support for executing the IPC conduit without MPI (#153)
* relax MPI dependency from code

This commit (series) removes the strict dependency on MPI in the code base.
rocSHMEM will still be compiled with MPI, but the goal is to make the
code work even if MPI_Init_thread has not been invoked, at least for
certain, well-defined scenarios. Hence, the goal is not to remove every
mention of MPI from rocSHMEM, but to ensure correct execution of the
ipc conduit even if the library has been initialized using other means.

Details:
 - add non-MPI version of remote_heap and WindowInfo classes
 - host interfaces work on WindowInfoMPI, they will not work with the
   non-MPI code path. Since it is unclear whether we plan to support the
   host interfaces at all, this is probably not a major limitation.

* update symmetric_heap structures and backend

* first cut on initialization

and enabling non-MPI initialization of the IPCBackend

* add non-MPI hostInterface methods

at the moment, only barrier_all and sync_all are explicitly supported.

* add non-mpi version of ipc_policy

and a number of smaller fixes required in other files.
A small init/finalize test already passes now with the branch.

* add non-mpi team_split_strided code

* minor fixes for non-MPI use-case

* disable symmetric-heap-window-info test

disable this test for now just to make the compilation pass. Will have
to rework it.

* make no-mpi great again

after rebasing on top of the MPI singleton changes.

* enable running functional tests with uuid init

to run the functional tests using rocshmem_init_attr and the uuid
mechanism requires
a) a PMIx installation on the system
b) setting the environment variable ROCSHMEM_TEST_UUID=1

* fix multi-team creation bug

fix a bug occurring when creating many teams, which was the result of
incorrectly applying two indices in our own implementation of Allreduce.

* make unit tests pass again

* reverse offload was impacted by code change

fix the RO conduit to cope with the non-MPI path introduced for the IPC
conduit.

* update to cmake logic to find pmix

* Update src/memory/window_info.hpp

Co-authored-by: Yiltan <ytemucin@amd.com>

* Update CMakeLists.txt

Co-authored-by: Yiltan <ytemucin@amd.com>

* document ROCSHMEM_UNIQUEID_NO_MPI

* rename env. variable to UNIQUEID_WITH_MPI

* update host.cpp to use USE_HDP_FLUSH macro

instead of the deprecated USE_COHERENT_HEAP.

* add note for running example with RO conduit

add a note clarifying that running init_attr_test from the example
directory requires setting an additional environment variable with the
RO conduit.

* Find PMIx in more cases; only apply pmix build options to the test that
needs it; abort if OMPI_COMM_WORLD_LOCAL_RANK is not set

---------

Co-authored-by: Yiltan <ytemucin@amd.com>
Co-authored-by: Aurelien Bouteiller <abouteil@amd.com>

[ROCm/rocshmem commit: 6ea5edc951]
2025-06-21 13:23:11 -05:00
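
The commit above mentions enabling the uuid-based test path by setting `ROCSHMEM_TEST_UUID=1`. A minimal sketch of how such an opt-in environment flag can be checked is shown below. The helper name `env_flag_enabled` is hypothetical, and the assumption that only the exact value "1" enables the flag is mine; how the rocSHMEM tests actually parse the variable is not stated in the commit message.

```cpp
#include <cstdlib>
#include <cstring>

// Hypothetical helper (not from rocSHMEM): returns true only when the
// environment variable `name` is set to exactly "1".
bool env_flag_enabled(const char* name) {
  const char* value = std::getenv(name);
  return value != nullptr && std::strcmp(value, "1") == 0;
}
```

A test driver could then branch on `env_flag_enabled("ROCSHMEM_TEST_UUID")` to choose between the default initialization and the uuid-based one.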
Aurelien Bouteiller 08d8324f74 Rework cmakery: (#136)
* Rework cmakery:
  * detect rocm/hip/rocshmem better, make sure that ROCM_PATH and
    ROCM_ROOT don't conflict and are taken by default
  * add /opt/rocm as a fallback when nothing else found
  * obtain hipcc in a sanitized way (ensure we use the same logic we
    use to later find_package hip)
  * factorize redundancies
  * export GPU_TARGETS as part of the cmake target for librocshmem,
    this helps with a clean error when an application tries to link
    with the wrong offload-target flag (rather than a cryptic link error)
  * phased out ROCSHMEM_HOME, in favor of rocshmem_ROOT (the cmake
    blessed way)

* Remove references to ROCSHMEM_HOME, we prefer ROCSHMEM_ROOT

* Pick CMAKE_PREFIX_PATH method for consistently finding hip/rocm

* Undo this pr using LANGUAGE HIP, maybe later

* Use only rocmcmakebuildtools as recommended from 6.4 onward

[ROCm/rocshmem commit: ee5363be7a]
2025-06-18 11:46:33 -04:00
Yiltan bceeadeb63 Multi-Node rocshmem_finalize() bug (#138)
[ROCm/rocshmem commit: 3f01d89207]
2025-06-04 10:02:03 -04:00
Yiltan 835de6be0e Added XNACK support (#94)
* Added xnack flags
* Updated examples compile command

[ROCm/rocshmem commit: edcd1ed57e]
2025-04-30 08:57:55 -04:00
Edgar Gabriel 38346e5bdd use correct MPI initialization method (#90)
* use correct MPI initialization method

rocSHMEM requires that the MPI library is initialized with
THREAD_MULTIPLE support. Let's therefore use that function in our
examples.

* Update examples/rocshmem_init_attr_test.cc

Co-authored-by: Aurelien Bouteiller <Aurelien.bouteiller@gmail.com>

---------

Co-authored-by: Aurelien Bouteiller <Aurelien.bouteiller@gmail.com>

[ROCm/rocshmem commit: 2e01af22ca]
2025-04-29 16:22:46 -05:00
Avinash Kethineedi c4de6833f6 Add SPDX license identifiers and update copyright headers (#85)
* Update copyright information and add SPDX license identifier

* Update AUTHORS

* Remove `sos_tests`

[ROCm/rocshmem commit: f6ef19f5a9]
2025-04-15 15:37:53 -05:00
Edgar Gabriel bac7769483 Revamp the uniqueId code to support subgroups of processes (#80)
* add code for bootstrapping

the bootstrapping code has been extracted from the MSCCLPP library,
which in part is based on code from NVIDIA. The code has been
modified to match the specific requirements of the rocSHMEM library.

* add code to use the new uniqueId bootstrapping

* adjust init_attr example

extend the rocshmem_init_attr example to use two disjoint groups
of processes, in order to trigger the new code path.

* add env variable for bootstrap timeout

* Update examples/rocshmem_init_attr_test.cc

Co-authored-by: Aurelien Bouteiller <Aurelien.bouteiller@gmail.com>

* Update src/rocshmem.cpp

Co-authored-by: Aurelien Bouteiller <Aurelien.bouteiller@gmail.com>

---------

Co-authored-by: Aurelien Bouteiller <Aurelien.bouteiller@gmail.com>

[ROCm/rocshmem commit: b5830a623b]
2025-04-14 12:02:09 -05:00
Avinash Kethineedi 41d5d739e2 Update collective APIs naming (#77)
* Update the naming convention for collective APIs to ensure consistency across the interface.

* Move all collective API declarations to rocshmem_COLL.hpp

* The following APIs were updated as part of this change:
  - `barrier`
  - `barrier_all`
  - `sync`
  - `sync_all`
  - `all_to_all`
  - `broadcast`
  - `fcollect`
  - `all_reduce`

* Update header file generation code for collective APIs

[ROCm/rocshmem commit: 68421895d6]
2025-04-10 12:14:47 -05:00
Edgar Gabriel 2ab585ce8d add uniqueID initialization (#69)
add the interfaces required to support rocshmem initialization
through the uniqueID mechanism. At the moment this still maps to
MPI initialization under the hood, but adding the functions might
simplify the porting of some applications to rocshmem. In addition, if
we need to transition away from MPI one day, this is also one step in
that direction.

[ROCm/rocshmem commit: e9f6227d75]
2025-03-28 16:34:00 -05:00
Yiltan 1380f43156 ROCm 6.4.0rc3 bug fix (#56)
[ROCm/rocshmem commit: 68a1646399]
2025-03-19 15:37:58 -04:00
Yiltan a16492cdf9 Added option to build only tests and link to an external rocshmem library (#43)
* Rearrange CMakefile

* Enable linking to external rocshmem library

* Minor fix for the functional test driver

* ROCSHMEM_HOME detection fixed

[ROCm/rocshmem commit: 96424a59a8]
2025-03-13 15:49:50 -04:00
Yiltan Temucin 3164874941 Use ROCm-CMake
[ROCm/rocshmem commit: b60a460681]
2024-12-06 15:49:41 -06:00
avinashkethineedi 52088167ae Add header files based on sections in OpenSHMEM specifications
* rocshmem_RMA.hpp
* rocshmem_AMO.hpp
* rocshmem_SIG_OP.hpp
* rocshmem_COLL.hpp
* rocshmem_P2P_SYNC.hpp
* rocshmem_RMA_X.hpp


[ROCm/rocshmem commit: 3117a47b8d]
2024-12-05 23:24:10 +00:00
avinashkethineedi 04daad3625 Merge branch PR #55 into naming_scheme
[ROCm/rocshmem commit: d8ce066adc]
2024-12-04 21:46:38 +00:00
Brandon Potter 913ce47ef1 Use new naming scheme
[ROCm/rocshmem commit: fd8dbc7fb6]
2024-11-25 14:25:29 -06:00
Yiltan Temucin 134911a5fb Fixed typo in examples
[ROCm/rocshmem commit: ff8aab522b]
2024-11-22 15:36:17 -06:00
Yiltan Temucin fc3855514d Create put_signal example
[ROCm/rocshmem commit: ec72aad517]
2024-11-22 15:36:17 -06:00
avinashkethineedi 6bfcb173ee Add CMake file for examples folder
[ROCm/rocshmem commit: 1f3b242e12]
2024-11-14 19:50:23 +00:00
Yiltan Temucin 8df27a93be updated examples to use new APIs
[ROCm/rocshmem commit: 799d9d5ed7]
2024-11-06 09:49:06 -06:00
avinashkethineedi 832dda25e6 Merge branch 'ROCm:develop' into active_set_APIs
[ROCm/rocshmem commit: b2b0d559cb]
2024-11-05 23:02:44 +00:00
avinashkethineedi f682dcee3f Add example code demonstrating team-based broadcast and alltoall API usage
* Update all_reduce test to keep the naming convention uniform across the examples


[ROCm/rocshmem commit: 68c893d790]
2024-10-30 19:09:17 +00:00
avinashkethineedi 5869709dac Update all_reduce algorithm to use internal put/get functions for updating pWrk and pSync arrays
* Change log_stride calculations to stride calculations
* Update all_reduce example code to use team based interface


[ROCm/rocshmem commit: abec29bd6a]
2024-10-28 22:10:18 +00:00
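
The difference between a log_stride and a stride, as mentioned in the commit above, can be sketched as follows. In the OpenSHMEM active-set convention the stride is given as a base-2 logarithm, whereas team-based interfaces take the stride directly. The function names below are hypothetical and this is not the rocSHMEM all_reduce code; the sketch only enumerates the PEs that each convention selects.

```cpp
#include <vector>

// PEs selected by a plain stride: start, start + stride, start + 2*stride, ...
std::vector<int> pes_from_stride(int start, int stride, int size) {
  std::vector<int> pes;
  for (int i = 0; i < size; ++i) {
    pes.push_back(start + i * stride);
  }
  return pes;
}

// PEs selected by an active-set style log2 stride: the effective stride
// is 2^log_stride, so only power-of-two strides can be expressed.
std::vector<int> pes_from_log_stride(int start, int log_stride, int size) {
  return pes_from_stride(start, 1 << log_stride, size);
}
```

With a plain stride, a set such as PEs 0, 3, 6 (stride 3) can be described directly, which a log2 stride cannot represent.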
Edgar Gabriel 777401ae29 add some example code
first examples include a getmem testcase and an allreduce (to_all)
example.


[ROCm/rocshmem commit: a0ac7b2d60]
2024-10-24 15:07:17 +00:00