Commit Graph

285 Commits

Author SHA1 Message Date
Avinash Kethineedi 2a7416d016 Implement rocshmem_ptr in IPC conduit (#197)
* Implement `rocshmem_ptr` in IPC conduit

* tests: add functional test for `rocshmem_ptr`
  - Add safety check for pointer access and condition check before printing results for `rocshmem_ptr` test
  - Use `rocshmem_put` to store `rocshmem_ptr` availability for data validation

[ROCm/rocshmem commit: 526105d315]
2025-07-28 12:01:02 -05:00
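
Note on #197: the commit message does not describe how the IPC conduit derives the pointer that `rocshmem_ptr` returns. A common technique for IPC-mapped symmetric heaps is offset translation: take the offset of the object in the local heap and add it to the locally mapped base of the peer's heap. The sketch below models that idea in Python with integers standing in for device addresses. Every name in it (`translate_symmetric_address`, `ipc_bases`, `heap_size`, `peer`) is hypothetical, and returning `None` for an unreachable peer is an assumption modeled on the null-pointer convention of OpenSHMEM's `shmem_ptr`.

```python
from typing import Optional, Sequence


def translate_symmetric_address(
    local_addr: int,
    local_base: int,
    heap_size: int,
    ipc_bases: Sequence[Optional[int]],
    peer: int,
) -> Optional[int]:
    """Map an address in the local symmetric heap to the same object on a peer.

    `ipc_bases[peer]` is the locally mapped base of that peer's heap, or None
    when the peer is not reachable through IPC (hypothetical layout).
    """
    offset = local_addr - local_base
    # Reject addresses that do not lie inside the symmetric heap.
    if not 0 <= offset < heap_size:
        return None
    # Reject unknown peers and peers without an IPC mapping.
    if not 0 <= peer < len(ipc_bases) or ipc_bases[peer] is None:
        return None
    return ipc_bases[peer] + offset
```

A caller would check the result before dereferencing it, which matches the "safety check for pointer access" the functional test in this commit mentions.
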
Dimple Prajapati 72ed270a5c Add host APIs for querying device ctx and remote heap pointer (#200)
* Add host APIs for querying device ctx and remote heap pointer

* Host API to query device pointer for ROCSHMEM_DEFAULT_CONTEXT,
  this is needed to support dynamic module initialization via device kernel
  library bitcode.
* Host API to query remote symmetric heap pointer that can be used in
  custom device kernel for RMA operations.

* Added rocshmem_ptr implementation within the Host Context class
* Enables pointer retrieval functionality for symmetric data objects
* Copy IPC pointers to host memory in RO host context

---------

Co-authored-by: avinashkethineedi <avinash.kethineedi@amd.com>

[ROCm/rocshmem commit: 87f99e7ec6]
2025-07-24 11:03:03 -07:00
Aurelien Bouteiller 93cf1b680e Documentation for RO (#189)
* Update documentation to include RO and how to use it

* Clarify supported configuration

Co-authored-by: yugang-amd <yugang.wang@amd.com>


[ROCm/rocshmem commit: 42e28835ad]
2025-07-10 18:49:10 -04:00
Edgar Gabriel f38ffbf84d Revert "Add host API to query Device side context detail (#183)" (#196)
This reverts commit 31804fcad3.

[ROCm/rocshmem commit: a66f782540]
2025-07-07 16:51:44 -05:00
Dimple Prajapati 31804fcad3 Add host API to query Device side context detail (#183)
* API support for enabling rocshmem bitcode integration

* move implementation alongside the host-side APIs

[ROCm/rocshmem commit: 105382710a]
2025-07-07 16:04:16 -05:00
akolliasAMD d2e4a18f11 changed the function tests name on the codebase (#177)
[ROCm/rocshmem commit: ebd92a7b3c]
2025-07-04 13:28:59 -06:00
akolliasAMD 3bb0b22fa5 fixed compilation targets on cmake (#182)
* fixed compilation targets on cmake
* moved gpu target generation


[ROCm/rocshmem commit: e2e334e630]
2025-07-04 13:27:20 -06:00
Avinash Kethineedi 81b55c3769 functional_tests: use size_t for size variable (#190)
Changed the data type of `size` to `size_t` in all functional tests to ensure
consistency with rocSHMEM APIs.

[ROCm/rocshmem commit: 7a5c6f86d7]
2025-07-03 13:26:54 -05:00
Aurelien Bouteiller 63da8a137f Let it compile when pmix is not found (#185)
[ROCm/rocshmem commit: 0c40bc58d8]
2025-07-02 17:02:43 -04:00
Aurelien Bouteiller c4d3488f62 rocshmem_config.h has a different include path when installed and built-dir (#186)
* rocshmem_config.h needs to be in a similar directory structure for
includes to work when building testers in build, and from an installed
library

* Do not change installed rocshmem.hpp

[ROCm/rocshmem commit: 63a79892b2]
2025-07-02 16:51:38 -04:00
Avinash Kethineedi 63d52b73c8 Fix pSync buffer initialization in IPC (#180)
- Initialize the entire pSync buffer with the default synchronization value

[ROCm/rocshmem commit: 6dba253890]
2025-06-30 13:38:24 -05:00
akolliasAMD 9cb96eafdb added new gfx target (#171)
[ROCm/rocshmem commit: e2bae5131a]
2025-06-25 17:35:54 -06:00
dependabot[bot] 86ab9a8f89 Bump urllib3 from 2.4.0 to 2.5.0 in /docs/sphinx (#170)
Bumps [urllib3](https://github.com/urllib3/urllib3) from 2.4.0 to 2.5.0.
- [Release notes](https://github.com/urllib3/urllib3/releases)
- [Changelog](https://github.com/urllib3/urllib3/blob/main/CHANGES.rst)
- [Commits](https://github.com/urllib3/urllib3/compare/2.4.0...2.5.0)

---
updated-dependencies:
- dependency-name: urllib3
  dependency-version: 2.5.0
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

[ROCm/rocshmem commit: 47bd7ec0d8]
2025-06-25 11:08:42 -04:00
dependabot[bot] a33dbbee03 Bump requests from 2.32.3 to 2.32.4 in /docs/sphinx (#169)
Bumps [requests](https://github.com/psf/requests) from 2.32.3 to 2.32.4.
- [Release notes](https://github.com/psf/requests/releases)
- [Changelog](https://github.com/psf/requests/blob/main/HISTORY.md)
- [Commits](https://github.com/psf/requests/compare/v2.32.3...v2.32.4)

---
updated-dependencies:
- dependency-name: requests
  dependency-version: 2.32.4
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

[ROCm/rocshmem commit: 49f7f1bab1]
2025-06-25 11:08:28 -04:00
Edgar Gabriel e167f50803 Introduce support for executing the IPC conduit without MPI (#153)
* relax MPI dependency from code

This commit (series) removes the strict dependency on MPI in the code base.
rocSHMEM will still be compiled with MPI, but the goal is to make the
code work even if MPI_Init_thread has not been invoked, at least for
certain, well-defined scenarios. Hence, the goal is not to remove every
mention of MPI from rocSHMEM, but to ensure correct execution of the
ipc conduit even if the library has been initialized using other means.

Details:
 - add non-MPI version of remote_heap and WindowInfo classes
 - host interfaces work on WindowInfoMPI, they will not work with the
   non-MPI code path. Since it is unclear whether we plan to support the
   host interfaces at all, this is probably not a major limitation.

* update symmetric_heap structures and backend

* first cut on initialization

and enabling non-MPI initialization of the IPCBackend

* add non-MPI hostInterface methods

at the moment, only barrier_all and sync_all are explicitly supported.

* add non-mpi version of ipc_policy

and a number of smaller fixes required in other files.
A small init/finalize test already passes now with the branch.

* add non-mpi team_split_strided code

* minor fixes for non-MPI use-case

* disable symmetric-heap-window-info test

disable this test for now just to make the compilation pass. Will have
to rework it.

* make no-mpi great again

after rebasing on top of the MPI singleton changes.

* enable running functional tests with uuid init

to run the functional tests using rocshmem_init_attr and the uuid
mechanism requires
a) a PMIx installation on the system
b) setting the environment variable ROCSHMEM_TEST_UUID=1

* fix multi-team creation bug

fix a bug occurring when creating many teams, which was the result of
incorrectly applying two indices in our own implementation of Allreduce.

* make unit tests pass again

* reverse offload was impacted by code change

fix the RO conduit to cope with the non-MPI path introduced for the IPC
conduit.

* update to cmake logic to find pmix

* Update src/memory/window_info.hpp

Co-authored-by: Yiltan <ytemucin@amd.com>

* Update CMakeLists.txt

Co-authored-by: Yiltan <ytemucin@amd.com>

* document ROCSHMEM_UNIQUEID_NO_MPI

* rename env. variable to UNIQUEID_WITH_MPI

* update host.cpp to use USE_HDP_FLUSH macro

instead of the deprecated USE_COHERENT_HEAP.

* add note for running example with RO conduit

add a note clarifying that running init_attr_test from the example
directory requires setting an additional environment variable with the
RO conduit.

* Find PMIx in more cases; only apply PMIx build options to the test that
needs it; abort if OMPI_COMM_WORLD_LOCAL_RANK is not set

---------

Co-authored-by: Yiltan <ytemucin@amd.com>
Co-authored-by: Aurelien Bouteiller <abouteil@amd.com>

[ROCm/rocshmem commit: 6ea5edc951]
2025-06-21 13:23:11 -05:00
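
Note on #153: the "non-mpi team_split_strided code" item concerns the strided team split. In OpenSHMEM terms, a strided split selects `size` PEs from the parent team, beginning at `start` and stepping by `stride`. The Python sketch below models only that membership rule; the function names and the `-1` "not a member" convention are illustrative assumptions, not rocSHMEM's implementation.

```python
from typing import List


def strided_team_members(start: int, stride: int, size: int,
                         parent_size: int) -> List[int]:
    """Return the parent-team ranks selected by a (start, stride, size) split."""
    if size < 1:
        raise ValueError("a team needs at least one member")
    members = [start + i * stride for i in range(size)]
    # Every selected rank must exist in the parent team.
    if any(pe < 0 or pe >= parent_size for pe in members):
        raise ValueError("split selects a PE outside the parent team")
    return members


def team_rank(pe: int, start: int, stride: int, size: int) -> int:
    """Return the rank of `pe` in the new team, or -1 if it is not a member."""
    members = [start + i * stride for i in range(size)]
    return members.index(pe) if pe in members else -1
```

For example, a split with `start=1`, `stride=2`, `size=3` over an 8-PE parent selects parent ranks 1, 3 and 5, and parent rank 5 becomes rank 2 in the new team.
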
Aurelien Bouteiller 08d8324f74 Rework cmakery: (#136)
* Rework cmakery:
  * detect rocm/hip/rocshmem better, make sure that ROCM_PATH and
    ROCM_ROOT don't conflict and are taken by default
  * add /opt/rocm as a fallback when nothing else found
  * obtain hipcc in a sanitized way (ensure we use the same logic we
    use to later find_package hip)
  * factorize redundancies
  * export GPU_TARGETS as part of the cmake target for librocshmem,
    this helps with a clean error when an application tries to link
    with the wrong offload-target flag (rather than a cryptic link error)
  * phased out ROCSHMEM_HOME, in favor of rocshmem_ROOT (the cmake
    blessed way)

* Remove references to ROCSHMEM_HOME, we prefer ROCSHMEM_ROOT

* Pick CMAKE_PREFIX_PATH method for consistently finding hip/rocm

* Undo this pr using LANGUAGE HIP, maybe later

* Use only rocmcmakebuildtools as recommended from 6.4 onward

[ROCm/rocshmem commit: ee5363be7a]
2025-06-18 11:46:33 -04:00
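
Note on #136: the lookup order described in this commit (honor `ROCM_PATH` and `ROCM_ROOT`, make sure they do not conflict, fall back to `/opt/rocm`) can be modeled as a small function. This is a Python model of the decision logic only, not the CMake code from the commit. The function name is hypothetical, and treating a conflict as a hard error is an assumption; the commit message does not say how conflicts are resolved.

```python
from typing import Mapping

DEFAULT_ROCM_ROOT = "/opt/rocm"  # fallback named in the commit message


def resolve_rocm_root(env: Mapping[str, str]) -> str:
    """Pick the ROCm install prefix from ROCM_PATH / ROCM_ROOT or the fallback."""
    rocm_path = env.get("ROCM_PATH")
    rocm_root = env.get("ROCM_ROOT")
    # Assumption: two different values are reported instead of silently
    # preferring one of them.
    if rocm_path and rocm_root and rocm_path != rocm_root:
        raise ValueError("ROCM_PATH and ROCM_ROOT point to different locations")
    return rocm_path or rocm_root or DEFAULT_ROCM_ROOT
```

Passing `os.environ` as `env` would apply the same rule to the real process environment.
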
Avinash Kethineedi 14756a73b1 Refactor Barrier_all and Sync_all APIs to use default context (#159)
* Refactor `Barrier_all` and `Sync_all` to use default context

- Removed context-specific implementations of barrier_all and sync_all
- Added barrier_all and sync_all to the default context implementation
- Updated functional tests to use the default context for barrier_all and sync_all

* Update `Barrier_all` and `Sync_all` API usage in documentation

* Update `CHANGELOG`

---------

Co-authored-by: Yiltan <ytemucin@amd.com>

[ROCm/rocshmem commit: bf48bcabf2]
2025-06-17 11:16:18 -05:00
Aurelien Bouteiller 56a3181a6f Swdev/536571 with additional issues found for other various missing includes (#158)
* Revert "SWDEV-536571 - Include assert header. (#157)"

This reverts commit 87d2efa430.

* Fix use of assert/abort and required includes

* Disable IPC AMO testers for non-implemented functions

[ROCm/rocshmem commit: 551603829c]
2025-06-16 20:21:06 -04:00
Aurelien Bouteiller b5c685ef9d Fix rocshmem_info not compiling in out of source builds (#160)
[ROCm/rocshmem commit: 8138d130b9]
2025-06-16 11:49:30 -04:00
Yiltan 66e267235f Added simple rocshmem_info command (#156)
* Added simple rocshmem_info
* Add GPU Arch info

[ROCm/rocshmem commit: 72639277a3]
2025-06-13 15:40:32 -04:00
akolliasAMD 482490f48f added init example and all_reduce example on the files (#150)
* added init example and all_reduce example on the files

* typo fix on folder name

[ROCm/rocshmem commit: 08a6a733d8]
2025-06-13 15:28:13 -04:00
Aurelien Bouteiller f3345dbf05 Use finegrain allocator by default (#140)
* Use FineGrained allocator for heap by default, consolidate all types of
allocators under saner cmake controls

Co-authored-by: Yiltan <ytemucin@amd.com>

* Uncached may not be only for debug

Need to include the rocshmem config; otherwise we produce an inconsistent
build with different allocators used in different files

* Undo this pr adding presumably useless hip_host_allocator_noncoherent

* Rename HEAP_IS_COHERENT/USE_COHERENT_HEAP to USE_HDP_FLUSH as the former
was misleading

* Remove unused __roc_inv()

---------

Co-authored-by: Yiltan <ytemucin@amd.com>

[ROCm/rocshmem commit: 41fd9e2d57]
2025-06-13 15:26:26 -04:00
Aurelien Bouteiller 87d2efa430 SWDEV-536571 - Include assert header. (#157)
(cherry picked from commit 96cd36e47e4759b7515373e8fa455385f3d985aa)

Co-authored-by: Patel <jaypatel@amd.com>

[ROCm/rocshmem commit: bcc14b1a34]
2025-06-12 13:16:30 -04:00
Yiltan 381625f060 [SWDEV-536571] Update OMPI Commit (#152)
Signed-off-by: Yiltan Hassan Temucin <yiltan.temucin@amd.com>

[ROCm/rocshmem commit: e340a220f9]
2025-06-09 11:03:48 -04:00
Jobbins 229e97afef Fix typo (#147)
[ROCm/rocshmem commit: e0ef34a9d1]
2025-06-04 10:46:38 -06:00
Yiltan 9e2e489451 Updated CHANGELOG.md with ROCm 6.4.2 changes (#149)
[ROCm/rocshmem commit: 6d1dd5f113]
2025-06-04 11:25:22 -04:00
Yiltan bceeadeb63 Multi-Node rocshmem_finalize() bug (#138)
[ROCm/rocshmem commit: 3f01d89207]
2025-06-04 10:02:03 -04:00
akolliasAMD 032d5e5c6b updated version and made the header its only source of truth (#144)
* updated version and made it only source of truth

* bumped Version number

[ROCm/rocshmem commit: ca5fdd4718]
2025-05-28 14:48:20 -06:00
akolliasAMD 13ed3ff034 added akolliasAMD to codeowners (#145)
[ROCm/rocshmem commit: fc22f8130d]
2025-05-28 12:52:58 -06:00
dependabot[bot] 5727670930 Bump tornado from 6.4.2 to 6.5.1 in /docs/sphinx (#143)
Bumps [tornado](https://github.com/tornadoweb/tornado) from 6.4.2 to 6.5.1.
- [Changelog](https://github.com/tornadoweb/tornado/blob/master/docs/releases.rst)
- [Commits](https://github.com/tornadoweb/tornado/compare/v6.4.2...v6.5.1)

---
updated-dependencies:
- dependency-name: tornado
  dependency-version: 6.5.1
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

[ROCm/rocshmem commit: e0c9ee45a7]
2025-05-27 11:10:07 -04:00
Yiltan 92bb6aeaaa [SWDEV-534546] Disable building tests in default build (#141)
[ROCm/rocshmem commit: 9fe166c8e1]
2025-05-26 16:50:22 -04:00
yugang-amd 8fbb892cc1 Final edits (#126)
* final edits

* more edits per review

* more edits

* attempt to fix dead link

[ROCm/rocshmem commit: 8a266e698c]
2025-05-21 16:59:00 -04:00
Edgar Gabriel 35e3a27890 free MPI Communicators on destruction (#134)
make sure that all communicators that have been created during the
runtime of the application are correctly freed again, to avoid memory
leaks.

[ROCm/rocshmem commit: bce329abd6]
2025-05-20 09:59:00 -05:00
Jobbins 22ea3f0c8b Code Coverage (#82)
code coverage: generate code coverage reports

* Add instrumentation flags to rocshmem target when adding -DBUILD_CODE_COVERAGE cmake flag
* Add helper script to build all subprojects and generate code coverage reports
* Update README with code coverage instructions

[ROCm/rocshmem commit: 474112d03c]
2025-05-16 09:09:17 -06:00
Yiltan 6a7644e467 Updated ROCm-docs to match the current status of the repository (#117)
* Updated docs to match the current status of the repository

Co-authored-by: yugang-amd <yugang.wang@amd.com>

[ROCm/rocshmem commit: f43e3cf4fa]
2025-05-16 09:26:59 -04:00
Aurelien Bouteiller 03a9fac960 Detailed logs (#124)
* Use a single printf per line (reduce chances of lines being cut in logs)

* team_comm can be an int or a pointer depending on MPI impl.
"Received" is confusing (since we are on the origin); use "submitted"
instead

* Print arguments to calls when using DEBUG

---------

Signed-off-by: Aurelien Bouteiller <abouteil@amd.com>

[ROCm/rocshmem commit: 3600291558]
2025-05-15 10:37:12 -04:00
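
Note on #124: the first item ("a single printf per line") relies on a general logging technique: build the complete record in memory, then hand it to the output stream in one call, so that output from concurrent writers is less likely to be interleaved mid-line. The Python sketch below illustrates the pattern. `log_line` and the `[PE n]` prefix are made up for the example and are not taken from the rocSHMEM sources.

```python
import io


def log_line(stream, pe: int, fmt: str, *args) -> None:
    """Format the whole record first, then emit it with a single write."""
    line = "[PE %d] " % pe + (fmt % args) + "\n"
    # One write call per log line instead of one per fragment.
    stream.write(line)


# Usage with an in-memory stream; sys.stderr works the same way.
buffer = io.StringIO()
log_line(buffer, 3, "put %d bytes to PE %d", 64, 1)
```
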
yugang-amd 17cde51fb7 Style edits (#122)
[ROCm/rocshmem commit: 67bff9ca30]
2025-05-13 16:26:28 -04:00
alexxu-amd 2f82ed9bf0 move requirements.txt from docs/ to docs/sphinx/ (#118)
[ROCm/rocshmem commit: 9088383dab]
2025-05-08 15:37:58 -04:00
Yiltan 1667e63e30 Initial ROCm-docs (#92)
* Initial ROCm-docs commit

Co-authored-by: Aurélien Bouteiller <bouteill@icl.utk.edu>
Co-authored-by: Alex Xu <alex.xu@amd.com>
Co-authored-by: yugang-amd <yugang.wang@amd.com>

[ROCm/rocshmem commit: f693c98fb2]
2025-05-08 13:39:28 -04:00
Aurelien Bouteiller 44d43901b1 Remove unused parts of dlmalloc to improve coverity score (#106)
[ROCm/rocshmem commit: 87179b1ffd]
2025-05-07 13:05:04 -04:00
Yiltan 95bce1c2ba Added initial changelog (#105)
[ROCm/rocshmem commit: 644857d375]
2025-05-07 11:39:14 -04:00
Aurelien Bouteiller 394b52c0ed func-tests: Don't rely on asserts to catch invalid argv/env params (#96)
[ROCm/rocshmem commit: 2bbe21db56]
2025-05-02 12:00:35 -04:00
Aurelien Bouteiller 19e98852af cleanup leftovers from SOS testers removal (#97)
Followup to pr#85

[ROCm/rocshmem commit: f0501550f7]
2025-05-02 11:59:52 -04:00
Aurelien Bouteiller 27d1189ff3 Substitute pow2bin allocator with a dlmalloc based allocator (#71)
* Add dlmalloc_strat allocator strategy
 - Use mspace variant to ease encapsulation
 - Make pow2bins and dlmalloc cmake selectable
* Add unit tester for dlmalloc, rework single_heap, pow2bins unit testers
accordingly
 - add dlmalloc get_used/get_avail, and have all strats allocators also have a get_used
 - Rework memallocator unit tests: bin size is per strat, alignment is verified in singleheap
* bugfix: dlmalloc exposed that the pingpong test would write past end of
allocation with -w 32
* iostream leakage/mixed usage of cerr and fprintf(stderr

---------

Signed-off-by: Aurelien Bouteiller <aurelien.bouteiller@amd.com>

[ROCm/rocshmem commit: b835de6cd5]
2025-05-01 11:55:23 -04:00
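
Note on #71: the commit message does not state why the pow2bin allocator was replaced. As background, the sketch below shows the rounding overhead of a power-of-two binning scheme, assuming (from the name only, not verified against the source) that each request is rounded up to the next power-of-two bin. The function names and the 16-byte minimum bin are hypothetical.

```python
def pow2_bin_size(nbytes: int, min_bin: int = 16) -> int:
    """Return the power-of-two bin that would hold a request of `nbytes`."""
    if nbytes <= 0:
        raise ValueError("request size must be positive")
    size = min_bin
    while size < nbytes:
        size *= 2
    return size


def internal_waste(nbytes: int, min_bin: int = 16) -> int:
    """Bytes reserved but unused when `nbytes` is rounded up to its bin."""
    return pow2_bin_size(nbytes, min_bin) - nbytes
```

Under this assumption, a request one byte past a bin boundary (for example 1025 bytes) lands in the next bin and reserves almost twice the memory it needs.
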
Yiltan 835de6be0e Added XNACK support (#94)
* Added xnack flags
* Updated examples compile command

[ROCm/rocshmem commit: edcd1ed57e]
2025-04-30 08:57:55 -04:00
Edgar Gabriel 38346e5bdd use correct MPI initialization method (#90)
* use correct MPI initialization method

rocSHMEM requires that the MPI library is initialized with
MPI_THREAD_MULTIPLE support. Let's therefore use the matching
initialization function (MPI_Init_thread) in our examples.

* Update examples/rocshmem_init_attr_test.cc

Co-authored-by: Aurelien Bouteiller <Aurelien.bouteiller@gmail.com>

---------

Co-authored-by: Aurelien Bouteiller <Aurelien.bouteiller@gmail.com>

[ROCm/rocshmem commit: 2e01af22ca]
2025-04-29 16:22:46 -05:00
Edgar Gabriel 2e6fed8e79 unify env variables and use DPRINTF (#89)
* unify handling of env variables

create a class containing all (most?) environment variables used by rocshmem and an object that is instantiated
before library_init, since some of the environment variables need to be
set before we start the bootstrapping process.

This allows us to remove two files from the bootstrap directory.

* replace INFO and TRACE macros with DPRINTF

to be more consistent with the rest of the rocSHMEM code

[ROCm/rocshmem commit: db74307195]
2025-04-29 06:05:25 -05:00
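
Note on #89: the pattern described here, one object that reads all environment variables once and is created before the rest of the library initializes, can be sketched as follows. This is a Python model of the pattern, not the rocSHMEM class. `EnvSettings`, `EXAMPLE_HEAP_SIZE`, `EXAMPLE_DEBUG` and the default values are invented for illustration.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class EnvSettings:
    """Snapshot of the environment variables the library cares about."""
    heap_size: int
    debug: bool

    @classmethod
    def from_environ(cls, environ: Optional[Mapping[str, str]] = None) -> "EnvSettings":
        env = os.environ if environ is None else environ
        return cls(
            # Hypothetical variable names and defaults.
            heap_size=int(env.get("EXAMPLE_HEAP_SIZE", str(1 << 30))),
            debug=env.get("EXAMPLE_DEBUG", "0") == "1",
        )


# Created once, before any bootstrap code runs; later code only reads it.
SETTINGS = EnvSettings.from_environ()
```

Keeping the snapshot immutable means later initialization stages all see the same values, regardless of when they run.
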
Yiltan 8f135af156 Check RMA functional test data in GPU kernel (#91)
[ROCm/rocshmem commit: c81722c339]
2025-04-28 16:06:05 -04:00
Aurelien Bouteiller 19e7b4798e Show and log what the functional test driver is running (#70)
Show and log what the functional test driver is running
* Log errors in the log file
* list all failed tests at the end
* pretty colors :x
* Print stderr when the test has failed

---------

Signed-off-by: Aurelien Bouteiller <aurelien.bouteiller@amd.com>

[ROCm/rocshmem commit: 67bc5b9e5a]
2025-04-23 10:21:35 -04:00
Edgar Gabriel 32f11bd5e5 use correct id when accessing ipc-bases (#88)
we need to use the position of that process in the local ipc-bases
array, not the global rank.

[ROCm/rocshmem commit: e3b0353fa9]
2025-04-17 10:11:32 -05:00
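
Note on #88: the fix distinguishes a process's global rank from its position among the processes that share IPC mappings on the same node. The Python sketch below models that lookup. `local_ipc_index` and `node_ranks` are hypothetical names, and the `-1` return value for an off-node rank is an assumption made for the example.

```python
from typing import Sequence


def local_ipc_index(global_rank: int, node_ranks: Sequence[int]) -> int:
    """Return the position of `global_rank` among the ranks on this node.

    `node_ranks` lists the global ranks that share IPC mappings, in the same
    order as the local ipc-bases array. Returns -1 for an off-node rank.
    """
    try:
        return list(node_ranks).index(global_rank)
    except ValueError:
        return -1
```

With `node_ranks = [4, 5, 6, 7]`, global rank 6 maps to slot 2 of a four-entry array; indexing that array with the global rank 6 directly would read past its end.
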