Commit graph

326 Commits

Author SHA1 Message Date
Avinash Kethineedi 526105d315 Implement rocshmem_ptr in IPC conduit (#197)
* Implement `rocshmem_ptr` in IPC conduit

* tests: add functional test for `rocshmem_ptr`
  - Add safety check for pointer access and condition check before printing results for `rocshmem_ptr` test
  - Use `rocshmem_put` to store `rocshmem_ptr` availability for data validation
2025-07-28 12:01:02 -05:00
Dimple Prajapati 87f99e7ec6 Add host APIs for querying device ctx and remote heap pointer (#200)
* Add host APIs for querying device ctx and remote heap pointer

* Host API to query device pointer for ROCSHMEM_DEFAULT_CONTEXT,
  this is needed to support dynamic module initialization via device kernel
  library bitcode.
* Host API to query remote symmetric heap pointer that can be used in
  custom device kernel for RMA operations.

* Added rocshmem_ptr implementation within the Host Context class
* Enables pointer retrieval functionality for symmetric data objects
* Copy IPC pointers to host memory in RO host context

---------

Co-authored-by: avinashkethineedi <avinash.kethineedi@amd.com>
2025-07-24 11:03:03 -07:00
Aurelien Bouteiller 42e28835ad Documentation for RO (#189)
* Update documentation to include RO and how to use it

* Clarify supported configuration

Co-authored-by: yugang-amd <yugang.wang@amd.com>
2025-07-10 18:49:10 -04:00
Edgar Gabriel a66f782540 Revert "Add host API to query Device side context detail (#183)" (#196)
This reverts commit 105382710a.
2025-07-07 16:51:44 -05:00
Dimple Prajapati 105382710a Add host API to query Device side context detail (#183)
* API support for enabling rocshmem bitcode integration

* move the implementation alongside the host-side APIs
2025-07-07 16:04:16 -05:00
akolliasAMD ebd92a7b3c changed the function tests name on the codebase (#177) 2025-07-04 13:28:59 -06:00
akolliasAMD e2e334e630 fixed compilation targets on cmake (#182)
* fixed compilation targets on cmake
* moved gpu target generation
2025-07-04 13:27:20 -06:00
Avinash Kethineedi 7a5c6f86d7 functional_tests: use size_t for size variable (#190)
Changed the data type of `size` to `size_t` in all functional tests to ensure
consistency with rocSHMEM APIs.
2025-07-03 13:26:54 -05:00
Aurelien Bouteiller 0c40bc58d8 Let it compile when pmix is not found (#185) 2025-07-02 17:02:43 -04:00
Aurelien Bouteiller 63a79892b2 rocshmem_config.h has a different include path when installed and built-dir (#186)
* rocshmem_config.h needs to be in a similar directory structure for
includes to work when building testers in build, and from an installed
library

* Do not change installed rocshmem.hpp
2025-07-02 16:51:38 -04:00
Avinash Kethineedi 6dba253890 Fix pSync buffer initialization in IPC (#180)
- Initialize the entire pSync buffer with the default synchronization value
2025-06-30 13:38:24 -05:00
akolliasAMD e2bae5131a added new gfx target (#171) 2025-06-25 17:35:54 -06:00
dependabot[bot] 47bd7ec0d8 Bump urllib3 from 2.4.0 to 2.5.0 in /docs/sphinx (#170)
Bumps [urllib3](https://github.com/urllib3/urllib3) from 2.4.0 to 2.5.0.
- [Release notes](https://github.com/urllib3/urllib3/releases)
- [Changelog](https://github.com/urllib3/urllib3/blob/main/CHANGES.rst)
- [Commits](https://github.com/urllib3/urllib3/compare/2.4.0...2.5.0)

---
updated-dependencies:
- dependency-name: urllib3
  dependency-version: 2.5.0
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-06-25 11:08:42 -04:00
dependabot[bot] 49f7f1bab1 Bump requests from 2.32.3 to 2.32.4 in /docs/sphinx (#169)
Bumps [requests](https://github.com/psf/requests) from 2.32.3 to 2.32.4.
- [Release notes](https://github.com/psf/requests/releases)
- [Changelog](https://github.com/psf/requests/blob/main/HISTORY.md)
- [Commits](https://github.com/psf/requests/compare/v2.32.3...v2.32.4)

---
updated-dependencies:
- dependency-name: requests
  dependency-version: 2.32.4
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-06-25 11:08:28 -04:00
Edgar Gabriel 6ea5edc951 Introduce support for executing the IPC conduit without MPI (#153)
* relax MPI dependency from code

This commit (series) removes the strict dependency on MPI in the code base.
rocSHMEM will still be compiled with MPI, but the goal is to make the
code work even if MPI_Init_thread has not been invoked, at least for
certain, well-defined scenarios. Hence, the goal is not to remove every
mention of MPI from rocSHMEM, but to ensure correct execution of the
ipc conduit even if the library has been initialized using other means.

Details:
 - add non-MPI version of remote_heap and WindowInfo classes
 - host interfaces work on WindowInfoMPI, they will not work with the
   non-MPI code path. Since it is unclear whether we plan to support the
   host interfaces at all, this is probably not a major limitation.

* update symmetric_heap structures and backend

* first cut on initialization

and enabling non-MPI initialization of the IPCBackend

* add non-MPI hostInterface methods

at the moment, only barrier_all and sync_all are explicitly supported.

* add non-mpi version of ipc_policy

and a number of smaller fixes required in other files.
A small init/finalize test already passes now with the branch.

* add non-mpi team_split_strided code

* minor fixes for non-MPI use-case

* disable symmetric-heap-window-info test

disable this test for now just to make the compilation pass. Will have
to rework it.

* make no-mpi great again

after rebasing on top of the MPI singleton changes.

* enable running functional tests with uuid init

to run the functional tests using rocshmem_init_attr and the uuid
mechanism requires
a) a PMIx installation on the system
b) setting the environment variable ROCSHMEM_TEST_UUID=1

* fix multi-team creation bug

fix a bug occurring when creating many teams, which was the result of
incorrectly applying two indices in our own implementation of Allreduce.

* make unit tests pass again

* reverse offload was impacted by code change

fix the RO conduit to cope with the non-MPI path introduced for the IPC
conduit.

* update to cmake logic to find pmix

* Update src/memory/window_info.hpp

Co-authored-by: Yiltan <ytemucin@amd.com>

* Update CMakeLists.txt

Co-authored-by: Yiltan <ytemucin@amd.com>

* document ROCSHMEM_UNIQUEID_NO_MPI

* rename env. variable to UNIQUEID_WITH_MPI

* update host.cpp to use USE_HDP_FLUSH macro

instead of the deprecated USE_COHERENT_HEAP.

* add note for running example with RO conduit

add a note clarifying that running init_attr_test from the example
directory requires setting an additional environment variable with the
RO conduit.

* Find PMIx in more cases; only apply pmix build options to the test that
needs it; abort if OMPI_COMM_WORLD_LOCAL_RANK is not set

---------

Co-authored-by: Yiltan <ytemucin@amd.com>
Co-authored-by: Aurelien Bouteiller <abouteil@amd.com>
2025-06-21 13:23:11 -05:00
Aurelien Bouteiller ee5363be7a Rework cmakery: (#136)
* Rework cmakery:
  * detect rocm/hip/rocshmem better, make sure that ROCM_PATH and
    ROCM_ROOT don't conflict and are taken by default
  * add /opt/rocm as a fallback when nothing else found
  * obtain hipcc in a sanitized way (ensure we use the same logic we
    use to later find_package hip)
  * factorize redundancies
  * export GPU_TARGETS as part of the cmake target for librocshmem,
    this helps with a clean error when an application tries to link
    with the wrong offload-target flag (rather than a cryptic link error)
  * phased out ROCSHMEM_HOME, in favor of rocshmem_ROOT (the cmake
    blessed way)

* Remove references to ROCSHMEM_HOME, we prefer ROCSHMEM_ROOT

* Pick CMAKE_PREFIX_PATH method for consistently finding hip/rocm

* Undo this pr using LANGUAGE HIP, maybe later

* Use only rocmcmakebuildtools as recommended from 6.4 onward
2025-06-18 11:46:33 -04:00
Avinash Kethineedi bf48bcabf2 Refactor Barrier_all and Sync_all APIs to use default context (#159)
* Refactor `Barrier_all` and `Sync_all` to use default context

- Removed context-specific implementations of barrier_all and sync_all
- Added barrier_all and sync_all to the default context implementation
- Updated functional tests to use the default context for barrier_all and sync_all

* Update `Barrier_all` and `Sync_all` API usage in documentation

* Update `CHANGELOG`

---------

Co-authored-by: Yiltan <ytemucin@amd.com>
2025-06-17 11:16:18 -05:00
Aurelien Bouteiller 551603829c Swdev/536571 with additional issues found for other various missing includes (#158)
* Revert "SWDEV-536571 - Include assert header. (#157)"

This reverts commit bcc14b1a34.

* Fix use of assert/abort and required includes

* Disable IPC AMO testers for non-implemented functions
2025-06-16 20:21:06 -04:00
Aurelien Bouteiller 8138d130b9 Fix rocshmem_info not compiling in out of source builds (#160) 2025-06-16 11:49:30 -04:00
Yiltan 72639277a3 Added simple rocshmem_info command (#156)
* Added simple rocshmem_info
* Add GPU Arch info
2025-06-13 15:40:32 -04:00
akolliasAMD 08a6a733d8 added init example and all_reduce example on the files (#150)
* added init example and all_reduce example on the files

* typo fix on folder name
2025-06-13 15:28:13 -04:00
Aurelien Bouteiller 41fd9e2d57 Use finegrain allocator by default (#140)
* Use FineGrained allocator for heap by default, consolidate all types of
allocators under saner cmake controls

Co-authored-by: Yiltan <ytemucin@amd.com>

* Uncached may not be only for debug

Need to include the rocshmem config, otherwise we produce an inconsistent
build with different allocators used in different files

* Undo this pr adding presumably useless hip_host_allocator_noncoherent

* Rename HEAP_IS_COHERENT/USE_COHERENT_HEAP to USE_HDP_FLUSH as the former
was misleading

* Remove unused __roc_inv()

---------

Co-authored-by: Yiltan <ytemucin@amd.com>
2025-06-13 15:26:26 -04:00
Aurelien Bouteiller bcc14b1a34 SWDEV-536571 - Include assert header. (#157)
(cherry picked from commit 96cd36e47e4759b7515373e8fa455385f3d985aa)

Co-authored-by: Patel <jaypatel@amd.com>
2025-06-12 13:16:30 -04:00
Yiltan e340a220f9 [SWDEV-536571] Update OMPI Commit (#152)
Signed-off-by: Yiltan Hassan Temucin <yiltan.temucin@amd.com>
2025-06-09 11:03:48 -04:00
Jobbins e0ef34a9d1 Fix typo (#147) 2025-06-04 10:46:38 -06:00
Yiltan 6d1dd5f113 Updated CHANGELOG.md with ROCm 6.4.2 changes (#149) 2025-06-04 11:25:22 -04:00
Yiltan 3f01d89207 Multi-Node rocshmem_finalize() bug (#138) 2025-06-04 10:02:03 -04:00
akolliasAMD ca5fdd4718 updated version and made the header its only source of truth (#144)
* updated version and made it the only source of truth

* bumped Version number
2025-05-28 14:48:20 -06:00
akolliasAMD fc22f8130d added akolliasAMD to codeowners (#145) 2025-05-28 12:52:58 -06:00
dependabot[bot] e0c9ee45a7 Bump tornado from 6.4.2 to 6.5.1 in /docs/sphinx (#143)
Bumps [tornado](https://github.com/tornadoweb/tornado) from 6.4.2 to 6.5.1.
- [Changelog](https://github.com/tornadoweb/tornado/blob/master/docs/releases.rst)
- [Commits](https://github.com/tornadoweb/tornado/compare/v6.4.2...v6.5.1)

---
updated-dependencies:
- dependency-name: tornado
  dependency-version: 6.5.1
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-05-27 11:10:07 -04:00
Yiltan 9fe166c8e1 [SWDEV-534546] Disable building tests in default build (#141) 2025-05-26 16:50:22 -04:00
yugang-amd 8a266e698c Final edits (#126)
* final edits

* more edits per review

* more edits

* attempt to fix dead link
2025-05-21 16:59:00 -04:00
Edgar Gabriel bce329abd6 free MPI Communicators on destruction (#134)
make sure that all communicators that have been created during the
runtime of the application are correctly freed again, to avoid memory
leaks.
2025-05-20 09:59:00 -05:00
Jobbins 474112d03c Code Coverage (#82)
code coverage:  generate code coverage reports

* Add instrumentation flags to rocshmem target when adding -DBUILD_CODE_COVERAGE cmake flag
* Add helper script to build all subprojects and generate code coverage reports
* Update README with code coverage instructions
2025-05-16 09:09:17 -06:00
Yiltan f43e3cf4fa Updated ROCm-docs to match the current status of the repository (#117)
* Updated docs to match the current status of the repository

Co-authored-by: yugang-amd <yugang.wang@amd.com>
2025-05-16 09:26:59 -04:00
Aurelien Bouteiller 3600291558 Detailed logs (#124)
* Use a single printf per line (reduce chances of lines being cut in logs)

* team_comm can be an int or a pointer depending on MPI impl.
Received is confusing (since we are on the origin), use submitted
instead

* Print arguments to calls when using DEBUG

---------

Signed-off-by: Aurelien Bouteiller <abouteil@amd.com>
2025-05-15 10:37:12 -04:00
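The "single printf per line" change above can be illustrated with a minimal sketch. It assumes the idea is to assemble the whole line in a buffer first and then hand it to stdio in one call. The helper names `format_log_line` and `log_line` and the `[PE n]` prefix are hypothetical and do not come from the rocSHMEM sources.

```cpp
#include <cassert>
#include <cstdarg>
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper (not from the rocSHMEM sources): build one complete
// log line, prefix and trailing newline included, in a local buffer.
std::string format_log_line(int pe, const char* fmt, ...) {
  char body[512];
  va_list args;
  va_start(args, fmt);
  std::vsnprintf(body, sizeof(body), fmt, args);  // truncates if too long
  va_end(args);

  char line[600];
  std::snprintf(line, sizeof(line), "[PE %d] %s\n", pe, body);
  return std::string(line);
}

// Emit the finished line with a single stdio call, so the prefix, body, and
// newline are less likely to be split apart by output from other writers.
void log_line(int pe, const char* msg) {
  const std::string line = format_log_line(pe, "%s", msg);
  std::fputs(line.c_str(), stderr);
}
```

Calling `log_line(0, "barrier")` writes the prefix, text, and newline in one `fputs`, rather than in three separate `printf` calls that other processes' output could land between.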
yugang-amd 67bff9ca30 Style edits (#122) 2025-05-13 16:26:28 -04:00
alexxu-amd 9088383dab move requirements.txt from docs/ to docs/sphinx/ (#118) 2025-05-08 15:37:58 -04:00
Yiltan f693c98fb2 Initial ROCm-docs (#92)
* Initial ROCm-docs commit

Co-authored-by: Aurélien Bouteiller <bouteill@icl.utk.edu>
Co-authored-by: Alex Xu <alex.xu@amd.com>
Co-authored-by: yugang-amd <yugang.wang@amd.com>
2025-05-08 13:39:28 -04:00
Aurelien Bouteiller 87179b1ffd Remove unused parts of dlmalloc to improve coverity score (#106) 2025-05-07 13:05:04 -04:00
Yiltan 644857d375 Added initial changelog (#105) 2025-05-07 11:39:14 -04:00
Aurelien Bouteiller 2bbe21db56 func-tests: Don't rely on asserts to catch invalid argv/env params (#96) 2025-05-02 12:00:35 -04:00
Aurelien Bouteiller f0501550f7 cleanup leftovers from SOS testers removal (#97)
Followup to pr#85
2025-05-02 11:59:52 -04:00
Aurelien Bouteiller b835de6cd5 Substitute pow2bin allocator with a dlmalloc based allocator (#71)
* Add dlmalloc_strat allocator strategy
 - Use mspace variant to ease encapsulation
 - Make pow2bins and dlmalloc cmake selectable
* Add unit tester for dlmalloc, rework single_heap, pow2bins unit testers
accordingly
 - add dlmalloc get_used/get_avail, and have all strats allocators also have a get_used
 - Rework memallocator unit tests: bin size is per strat, alignment is verified in singleheap
* bugfix: dlmalloc exposed that the pingpong test would write past end of
allocation with -w 32
* iostream leakage/mixed usage of cerr and fprintf(stderr

---------

Signed-off-by: Aurelien Bouteiller <aurelien.bouteiller@amd.com>
2025-05-01 11:55:23 -04:00
Yiltan edcd1ed57e Added XNACK support (#94)
* Added xnack flags
* Updated examples compile command
2025-04-30 08:57:55 -04:00
Edgar Gabriel 2e01af22ca use correct MPI initialization method (#90)
* use correct MPI initialization method

rocSHMEM requires that the MPI library is initialized using
THREAD_MULTIPLE support. Let's therefore use that function in our
examples.

* Update examples/rocshmem_init_attr_test.cc

Co-authored-by: Aurelien Bouteiller <Aurelien.bouteiller@gmail.com>

---------

Co-authored-by: Aurelien Bouteiller <Aurelien.bouteiller@gmail.com>
2025-04-29 16:22:46 -05:00
Edgar Gabriel db74307195 unify env variables and use DPRINTF (#89)
* unify handling of env variables

create a class containing all (most?) environment variables used by rocshmem and an object that is instantiated
before library_init, since some of the environment variables need to be
set before we start the bootstrapping process.

This allows us to remove two files from the bootstrap directory.

* replace INFO and TRACE macros with DPRINTF

to be more consistent with the rest of the rocSHMEM code
2025-04-29 06:05:25 -05:00
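A minimal sketch of the pattern described in the commit above: one class reads every environment variable once, in its constructor, and later code queries the object instead of calling `getenv` in many places. The class name `EnvConfig`, the variable names `EXAMPLE_DEBUG_LEVEL` and `EXAMPLE_BOOTSTRAP`, and the defaults are all hypothetical. They are not the names used in rocSHMEM.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

// Hypothetical container for environment settings (illustrative names only).
// All variables are read once, at construction time.
class EnvConfig {
 public:
  EnvConfig()
      : debug_level_(read_long("EXAMPLE_DEBUG_LEVEL", 0)),
        bootstrap_(read_string("EXAMPLE_BOOTSTRAP", "mpi")) {}

  long debug_level() const { return debug_level_; }
  const std::string& bootstrap() const { return bootstrap_; }

 private:
  // Parse an integer variable; fall back if it is unset or not a number.
  static long read_long(const char* name, long fallback) {
    const char* raw = std::getenv(name);
    if (raw == nullptr) return fallback;
    char* end = nullptr;
    const long value = std::strtol(raw, &end, 10);
    return (end == raw) ? fallback : value;
  }

  // Read a string variable; fall back if it is unset.
  static std::string read_string(const char* name, const char* fallback) {
    const char* raw = std::getenv(name);
    return std::string(raw != nullptr ? raw : fallback);
  }

  long debug_level_;
  std::string bootstrap_;
};
```

An instance defined at namespace scope would be constructed before `main` runs, and therefore before any initialization routine called from it. This is one way to have the values available before bootstrapping starts.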
Yiltan c81722c339 Check RMA functional test data in GPU kernel (#91) 2025-04-28 16:06:05 -04:00
Aurelien Bouteiller 67bc5b9e5a Show and log what the functional test driver is running (#70)
Show and log what the functional test driver is running
* Log errors in the log file
* list all failed tests at the end
* pretty colors :x
* Print stderr when the test has failed

---------

Signed-off-by: Aurelien Bouteiller <aurelien.bouteiller@amd.com>
2025-04-23 10:21:35 -04:00
Edgar Gabriel e3b0353fa9 use correct id when accessing ipc-bases (#88)
we need to use the position of that process in the local ipc-bases
array, not the global rank.
2025-04-17 10:11:32 -05:00
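The indexing issue fixed in the commit above can be sketched as follows. The sketch assumes the ipc-bases array holds one entry per process on the local node, ordered by that node's list of global ranks. The function names and the data layout are hypothetical and are not taken from the rocSHMEM sources.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical lookup: position of `global_rank` within the list of ranks
// that share this node, or -1 if that rank is not node-local.
int local_index_of(const std::vector<int>& local_ranks, int global_rank) {
  for (std::size_t i = 0; i < local_ranks.size(); ++i) {
    if (local_ranks[i] == global_rank) return static_cast<int>(i);
  }
  return -1;
}

// The ipc-bases array has one entry per node-local process, so it must be
// indexed with the local position. Indexing it with the global rank would
// read the wrong entry (or run past the end) on any node but the first.
char* ipc_base_for(const std::vector<int>& local_ranks,
                   const std::vector<char*>& ipc_bases, int global_rank) {
  const int idx = local_index_of(local_ranks, global_rank);
  if (idx < 0) return nullptr;  // not reachable through IPC
  return ipc_bases[static_cast<std::size_t>(idx)];
}
```

For a node hosting global ranks 4 to 7, rank 6 maps to local position 2, so `ipc_base_for` returns the third entry of the array rather than attempting to read a seventh.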