51 Commits

Author SHA1 Message Date
akolliasAMD 2606c13155 Tests package (#384)
* added packaging for the tests and for the driver.sh

* making .sh files into programs so they keep permissions

[ROCm/rocshmem commit: e7269cb925]
2026-01-16 09:10:36 -07:00
Edgar Gabriel d37af80d7e add support for GPUs using wavefront size of 32 (#285)
* add gfx1100 support

Add support for Radeon 7900 GPUs (RX and PRO), and 7800 PRO.

I considered adding the gfx1101 and gfx1102 GPUs as well, but those are lower-end models that are less likely to be used for compute-intensive jobs. In addition, I do not have access to them to test the support.

* update WF_SIZE for different options

Radeon systems use a WarpSize of 32, unlike current Instinct systems,
which use a warp size of 64. For the device side, a gfx specific ifdef
is sufficient. For the host side, we need to query the device
properties.

* adjust functional tests to wf_size of 32

* update unit tests to handle wf_size of 32

* address reviewer comments

[ROCm/rocshmem commit: d0c2845031]
2025-10-22 16:04:58 -05:00
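
The commit above replaces a fixed wavefront size of 64 with a value that depends on the GPU. A minimal sketch of the arithmetic that changes, written in plain C++ so it runs without HIP. The helper name `wavefronts_per_block` is hypothetical and is not taken from rocSHMEM; on a real host path the wavefront size would come from the device properties (for example the `warpSize` field of `hipDeviceProp_t`) rather than from a literal.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not rocSHMEM code): number of wavefronts needed to
// cover a thread block when the wavefront size is a parameter, not a constant.
constexpr std::size_t wavefronts_per_block(std::size_t block_threads,
                                           std::size_t wf_size) {
  // Round up so a partially filled last wavefront is still counted.
  return (block_threads + wf_size - 1) / wf_size;
}

// The same 256-thread block maps to a different number of wavefronts
// depending on whether the device uses 64 or 32 lanes per wavefront.
static_assert(wavefronts_per_block(256, 64) == 4, "wave64 case");
static_assert(wavefronts_per_block(256, 32) == 8, "wave32 case");
```

Any buffer or test expectation sized per wavefront has to use the queried value, which is presumably why the functional and unit tests were adjusted in the same commit.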
Omri Mor 5bc35a7eb6 Unify environment variable management (#235)
* Add environment variable configuration infrastructure
  - Namespace rocshmem::envvar
  - Track all config env vars in per-category lists
  - Remove duplicates from list of allowed env var types
  - Reject negative inputs for unsigned integer types
  - Accept empty strings for std::string
  - Print error source location using C++20 std::source_location
  - Unit tests
* Port environment variables
  - ROCSHMEM_UNIQUEID_WITH_MPI
  - ROCSHMEM_RO_DISABLE_IPC
  - ROCSHMEM_BOOTSTRAP_TIMEOUT
  - ROCSHMEM_BOOTSTRAP_HOSTID
  - ROCSHMEM_BOOTSTRAP_SOCKET_IFNAME
  - ROCSHMEM_RO_PROGRESS_DELAY
  - ROCSHMEM_BOOTSTRAP_SOCKET_FAMILY
  - ROCSHMEM_MAX_NUM_CONTEXTS
    + Merge the independent per-backend copies into a single variable
      that is used by all three backends (IPC, RO, GDA).
    + Set default to 32 (for GDA); prior default for IPC and RO was 1024.
  - ROCSHMEM_MAX_NUM_HOST_CONTEXTS
  - ROCSHMEM_MAX_WF_BUFFERS
  - ROCSHMEM_SQ_SIZE
  - ROCSHMEM_RO_NET_CPU_QUEUE
    + Renamed from RO_NET_CPU_QUEUE
    + Change env var input type to bool, default to false
    + Invert code logic: setting RO_NET_CPU_QUEUE to anything
      would /disable/ a variable gpu_queue, which defaulted to true.
      Variable is now named config::ro::net_cpu_queue,
      with all prior checks for gpu_queue inverted.
  - ROCSHMEM_USE_IB_HCA
  - ROCSHMEM_HEAP_SIZE
    + Defaults to 1L << 30 i.e. 1 GiB,
      from default heap size in memory/heap_memory.hpp.
  - ROCSHMEM_MAX_NUM_TEAMS
    + Unlike other env vars, this can be referenced from devices.
    + Function currently narrows from size_t to int: uses need to be audited
      for safety and correctness in using size_t directly.
  - ROCSHMEM_GDA_ALTERNATE_QP_PORTS
* New env var ROCSHMEM_DEBUG
  - Debug levels:
    + NONE
    + VERSION
    + WARN
    + INFO
    + TRACE
  - Currently unused - will be added later
  - Mirrors RCCL debug control
* Remove rocshmem::rocshmem_env_config
* Change interface for GetClosestNicToGpu
  to accept const char** instead of char**:
  the pointed-to string does not need to be modified
  - Files were not audited for inclusion of util.hpp only for env vars
---------
Signed-off-by: Omri Mor <Omri.Mor@amd.com>

[ROCm/rocshmem commit: a0fcbf8d35]
2025-10-06 10:05:57 -07:00
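
The "reject negative inputs for unsigned integer types" item in the commit above addresses a real pitfall: `std::strtoull` accepts a leading minus sign and wraps the result. A minimal sketch of such a check, assuming nothing about the actual `rocshmem::envvar` interface. The function names `parse_unsigned` and `read_unsigned_env` and the variable name used in the usage function are hypothetical.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdint>
#include <cstdlib>
#include <string>

// Hypothetical helper (not the rocshmem::envvar API): parse a decimal
// unsigned integer, rejecting empty, negative, and malformed inputs.
bool parse_unsigned(const std::string& text, std::uint64_t& out) {
  const std::size_t first = text.find_first_not_of(" \t\n\v\f\r");
  if (first == std::string::npos) return false;  // empty or whitespace only
  if (text[first] == '-') return false;          // strtoull would wrap this
  errno = 0;
  char* end = nullptr;
  const unsigned long long value = std::strtoull(text.c_str(), &end, 10);
  if (errno == ERANGE) return false;             // out of range
  if (end == text.c_str() || *end != '\0') return false;  // no digits or junk
  out = value;
  return true;
}

// Hypothetical helper: read an environment variable, falling back to a
// default when it is unset or invalid.
std::uint64_t read_unsigned_env(const char* name, std::uint64_t fallback) {
  const char* raw = std::getenv(name);
  std::uint64_t value = 0;
  if (raw != nullptr && parse_unsigned(raw, value)) return value;
  return fallback;
}

// Usage: an unset variable yields the default, here 1 GiB as in the
// heap-size default mentioned in the commit. The variable name is made up.
std::uint64_t example_heap_size() {
  return read_unsigned_env("ROCSHMEM_SKETCH_UNSET_VAR", 1ULL << 30);
}
```

Falling back silently is a simplification chosen for this sketch; the commit itself mentions printing the error source location instead.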
Edgar Gabriel 53fa35b980 Remove MPI compile-time dependency (#264)
* use dlsym for MPI functions

To allow compiling without MPI support, convert the usage of MPI functions and symbols to a dlopen/dlsym-based mechanism. It turns out this cannot be done in an entirely vendor-neutral way: slightly different solutions might be required for Open MPI, MPICH, and the new MPI ABI.

* checkpoint

more work to be done.

* checkpoint 2

* checkpoint 3

* checkpoint 4

examples compile and link correctly

* checkpoint 5 (I think)

* Checkpoint 6

* dyld-mpi: adapt GDA

* dyldmpi: tests that depend on MPI need to link with it themselves

* do not ../mpi_instance.h

* dyldmpi: make the symetricHeapTestFixture compile

* dyldmpi: Change cmakery, compiles and runs gda w/o external MPI

* Make it also compile in external MPI mode

* dyldmpi: ipc unit tests compile but do not link

* dyldmpi: new approach, if external mpi required, link with mpi,
otherwise use ompi5 abi

* C-style comments in cmakelist..

* dyldmpi: examples: do not fail compiling if MPI not found at build time,
instead do not compile the MPI required examples

* more updates to CMake logic

* convert RO backend

and a few other cleanups

* update some unit tests

to work with the dlopen MPI environment correctly.

---------

Co-authored-by: Aurelien Bouteiller <abouteil@amd.com>

[ROCm/rocshmem commit: e4c427a736]
2025-10-01 08:06:56 -05:00
Avinash Kethineedi 39894092eb tests: remove rocthrust and rocprim dependencies from free_list unit tests (#231)
[ROCm/rocshmem commit: 0b973bcfc7]
2025-09-02 10:37:49 -05:00
Edgar Gabriel e167f50803 Introduce support for executing the IPC conduit without MPI (#153)
* relax MPI dependency from code

This commit (series) removes the strict dependency on MPI in the code base.
rocSHMEM will still be compiled with MPI, but the goal is to make the
code work even if MPI_Init_thread has not been invoked, at least for
certain, well-defined scenarios. Hence, the goal is not to remove every
mention of MPI from rocSHMEM, but to ensure correct execution of the
ipc conduit even if the library has been initialized using other means.

Details:
 - add non-MPI version of remote_heap and WindowInfo classes
 - host interfaces work on WindowInfoMPI, they will not work with the
   non-MPI code path. Since it is unclear whether we plan to support the
   host interfaces at all, this is probably not a major limitation.

* update symmetric_heap structures and backend

* first cut on initialization

and enabling non-MPI initialization of the IPCBackend

* add non-MPI hostInterface methods

at the moment, only barrier_all and sync_all are explicitly supported.

* add non-mpi version of ipc_policy

and a number of smaller fixes required in other files.
A small init/finalize test already passes now with the branch.

* add non-mpi team_split_strided code

* minor fixes for non-MPI use-case

* disable symmetric-heap-window-info test

disable this test for now just to make the compilation pass. Will have
to rework it.

* make no-mpi great again

after rebasing on top of the MPI singleton changes.

* enable running functional tests with uuid init

Running the functional tests using rocshmem_init_attr and the uuid
mechanism requires
a) a PMIx installation on the system
b) setting the environment variable ROCSHMEM_TEST_UUID=1

* fix multi-team creation bug

fix a bug occurring when creating many teams, which was the result of
incorrectly applying two indices in our own implementation of Allreduce.

* make unit tests pass again

* reverse offload was impacted by code change

fix the RO conduit to cope with the non-MPI path introduced for the IPC
conduit.

* update to cmake logic to find pmix

* Update src/memory/window_info.hpp

Co-authored-by: Yiltan <ytemucin@amd.com>

* Update CMakeLists.txt

Co-authored-by: Yiltan <ytemucin@amd.com>

* document ROCSHMEM_UNIQUEID_NO_MPI

* rename env. variable to UNIQUEID_WITH_MPI

* update host.cpp to use USE_HDP_FLUSH macro

instead of the deprecated USE_COHERENT_HEAP.

* add note for running example with RO conduit

add a note clarifying that running init_attr_test from the example
directory requires setting an additional environment variable with the
RO conduit.

* Find PMIx in more cases, only apply pmix build options to the test that
needs it, if OMPI_COMM_WORLD_LOCAL_RANK is not set, abort

---------

Co-authored-by: Yiltan <ytemucin@amd.com>
Co-authored-by: Aurelien Bouteiller <abouteil@amd.com>

[ROCm/rocshmem commit: 6ea5edc951]
2025-06-21 13:23:11 -05:00
Aurelien Bouteiller 08d8324f74 Rework cmakery: (#136)
* Rework cmakery:
  * detect rocm/hip/rocshmem better, make sure that ROCM_PATH and
    ROCM_ROOT don't conflict and are taken by default
  * add /opt/rocm as a fallback when nothing else found
  * obtain hipcc in a sanitized way (ensure we use the same logic we
    use to later find_package hip)
  * factorize redundancies
  * export GPU_TARGETS as part of the cmake target for librocshmem,
    this helps with a clean error when an application tries to link
    with the wrong offload-target flag (rather than a cryptic link error)
  * phased out ROCSHMEM_HOME, in favor of rocshmem_ROOT (the cmake
    blessed way)

* Remove references to ROCSHMEM_HOME, we prefer ROCSHMEM_ROOT

* Pick CMAKE_PREFIX_PATH method for consistent finding hip/rocm

* Undo this pr using LANGUAGE HIP, maybe later

* Use only rocmcmakebuildtools as recommended from 6.4 onward

[ROCm/rocshmem commit: ee5363be7a]
2025-06-18 11:46:33 -04:00
Aurelien Bouteiller 56a3181a6f Swdev/536571 with additional issues found for other various missing includes (#158)
* Revert "SWDEV-536571 - Include assert header. (#157)"

This reverts commit 87d2efa430.

* Fix use of assert/abort and required includes

* Disable IPC AMO testers for non-implemented functions

[ROCm/rocshmem commit: 551603829c]
2025-06-16 20:21:06 -04:00
Yiltan bceeadeb63 Multi-Node rocshmem_finalize() bug (#138)
[ROCm/rocshmem commit: 3f01d89207]
2025-06-04 10:02:03 -04:00
Aurelien Bouteiller 27d1189ff3 Substitute pow2bin allocator with a dlmalloc based allocator (#71)
* Add dlmalloc_strat allocator strategy
 - Use mspace variant to ease encapsulation
 - Make pow2bins and dlmalloc cmake selectable
* Add unit tester for dlmalloc, rework single_heap, pow2bins unit testers
accordingly
 - add dlmalloc get_used/get_avail, and have all strats allocators also have a get_used
 - Rework memallocator unit tests: bin size is per strat, alignment is verified in singleheap
* bugfix: dlmalloc exposed that the pingpong test would write past end of
allocation with -w 32
* iostream leakage/mixed usage of cerr and fprintf(stderr

---------

Signed-off-by: Aurelien Bouteiller <aurelien.bouteiller@amd.com>

[ROCm/rocshmem commit: b835de6cd5]
2025-05-01 11:55:23 -04:00
Yiltan 835de6be0e Added XNACK support (#94)
* Added xnack flags
* Updated examples compile command

[ROCm/rocshmem commit: edcd1ed57e]
2025-04-30 08:57:55 -04:00
Avinash Kethineedi c4de6833f6 Add SPDX license identifiers and update copyright headers (#85)
* Update copyright information and add SPDX license identifier

* Update AUTHORS

* Remove `sos_tests`

[ROCm/rocshmem commit: f6ef19f5a9]
2025-04-15 15:37:53 -05:00
Brandon Potter a3a211a677 Cleanup unused code in repository (#75)
* Remove unused forward_list

* Remove unused __read_clock function

* Replace wallClk code with hip function

* Remove unused unit test for ipc

* Remove slab heap

* Remove unused EBO spinlock

[ROCm/rocshmem commit: 0fd628458c]
2025-04-10 14:47:24 -05:00
Yiltan 0cde5f53dc Update GTEST version (#68)
[ROCm/rocshmem commit: e16ca7a1e3]
2025-03-31 08:58:30 -04:00
Avinash Kethineedi 370e2dda09 Add AtomicWFQueue implementation and tests (#62)
* feat: Add AtomicWFQueue implementation
  - Implemented wavefront-safe atomic FIFO queue ensuring first-come, first-serve order
  - Added efficient synchronization using atomics
  - Enhanced `dequeue` to wait until an element is available

* test: Add GTest for AtomicWFQueue
  - Implemented unit tests for AtomicWFQueue using GoogleTest framework
  - Added tests for `enqueue`, `dequeue`, and edge cases
  - Ensured synchronization behavior and correctness under concurrent conditions

* Add assert in `enqueue` and update atomics
  - Added an assert in the `enqueue` function to ensure it fails if the queue is full

[ROCm/rocshmem commit: b84b5638cf]
2025-03-25 00:45:19 -05:00
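
The AtomicWFQueue commit above describes a FIFO queue synchronized with atomics, where `dequeue` waits for an element and `enqueue` asserts when the queue is full. A host-side sketch of that general idea using tickets and per-slot sequence numbers. This is an assumption about one way to build such a queue, not the rocSHMEM implementation; the class name `TicketFifo` and all member names are hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical sketch (not AtomicWFQueue itself): bounded FIFO where each
// caller takes a ticket with fetch_add, so service order follows ticket order.
class TicketFifo {
 public:
  explicit TicketFifo(std::size_t capacity) : cells_(capacity) {
    // Slot i is initially free for ticket i.
    for (std::size_t i = 0; i < capacity; ++i) {
      cells_[i].seq.store(i, std::memory_order_relaxed);
    }
  }

  void enqueue(int value) {
    const std::size_t pos = tail_.fetch_add(1, std::memory_order_relaxed);
    Cell& cell = cells_[pos % cells_.size()];
    // If the slot is not free for this ticket, the queue is full.
    assert(cell.seq.load(std::memory_order_acquire) == pos);
    cell.value = value;
    // Publish the element to the dequeuer holding ticket pos.
    cell.seq.store(pos + 1, std::memory_order_release);
  }

  int dequeue() {
    const std::size_t pos = head_.fetch_add(1, std::memory_order_relaxed);
    Cell& cell = cells_[pos % cells_.size()];
    // Wait until the matching enqueue has published its element.
    while (cell.seq.load(std::memory_order_acquire) != pos + 1) {
    }
    const int value = cell.value;
    // Mark the slot free for the enqueuer one lap ahead.
    cell.seq.store(pos + cells_.size(), std::memory_order_release);
    return value;
  }

 private:
  struct Cell {
    std::atomic<std::size_t> seq{0};
    int value = 0;
  };
  std::vector<Cell> cells_;
  std::atomic<std::size_t> head_{0};
  std::atomic<std::size_t> tail_{0};
};
```

The full-queue check relies on `assert`, so it disappears when `NDEBUG` is defined; a production version would need a different policy for that case.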
Yiltan 6d6dccfebe Sync Reverse Offload Scripts (#52)
* Sync Reverse Offload scripts
- Disable IPC unit tests when IPC is not available in the rocSHMEM configuration

* Added missing ptr in ipc_policy

[ROCm/rocshmem commit: 3428957de9]
2025-03-19 14:31:07 -04:00
Yiltan a16492cdf9 Added option to build only tests and link to an external rocshmem library (#43)
* Rearrange CMakefile

* Enable linking to external rocshmem library

* Minor fix for the functional test driver

* ROCSHMEM_HOME detection fixed

[ROCm/rocshmem commit: 96424a59a8]
2025-03-13 15:49:50 -04:00
Yiltan 95c4c0d428 Fix ROCm 6.4 warnings (#47)
* Removed __AMDGCN_WAVEFRONT_SIZE

* Added unit test to validate WF_SIZE

[ROCm/rocshmem commit: 487e5b7d0f]
2025-02-24 13:34:13 -05:00
Brandon Potter 413114da9f Fix signal calculation bug for fine-tiled unit tests
[ROCm/rocshmem commit: b1f6621f33]
2024-12-19 18:34:47 +00:00
Yiltan Temucin 48605db5de Remove comparisons of signed to unsigned values
[ROCm/rocshmem commit: fa0858833e]
2024-12-12 10:21:08 -06:00
Yiltan Temucin 3164874941 Use ROCm-CMake
[ROCm/rocshmem commit: b60a460681]
2024-12-06 15:49:41 -06:00
Brandon Potter 40016e0a9e Merge pull request #48 from BKP/ipc_fine_tiled_unit_11-04-24
Add tiled fine-grained unit tests

[ROCm/rocshmem commit: 46f0b42ac3]
2024-11-25 14:36:04 -06:00
Yiltan Temucin 5ed5c4642e Explicitly require rocPRIM and rocThrust.
[ROCm/rocshmem commit: 50e46847c6]
2024-11-19 08:54:18 -06:00
Brandon Potter 4fed83a40b Update tests/unit_tests/ipc_impl_tiled_fine_gtest.cpp
Co-authored-by: Avinash Kethineedi <avinash.kethineedi@amd.com>

[ROCm/rocshmem commit: 03719bbb0e]
2024-11-14 13:08:20 -06:00
Brandon Potter 17f9e07ecc Add tiled fine-grained unit tests
[ROCm/rocshmem commit: d241015e0f]
2024-11-04 17:16:07 -06:00
Brandon Potter 74dec9374d Convert simple fine tests into parameterized tests
[ROCm/rocshmem commit: 749d9f0781]
2024-11-04 10:46:50 -06:00
Brandon Potter 7f19a42778 Merge branch 'ROCm:develop' into ipc_parameterized_simple_tests_10-01-24
[ROCm/rocshmem commit: ce0ca36d37]
2024-10-11 12:49:56 -05:00
Brandon Potter 8e44e5d458 Merge pull request #31 from BKP/ipc_bringup_fine_unit_09-26-24
Add IPC Simple Buffer Fine-grained Unit Tests

[ROCm/rocshmem commit: 787cf0ff3f]
2024-10-01 15:12:30 -05:00
avinashkethineedi 285ac5cab6 Add MPI_THREAD_MULTIPLE check
[ROCm/rocshmem commit: 2f0739d823]
2024-10-01 20:05:15 +00:00
Brandon Potter 44803b3ba1 Use gtest parameterized test macros for IPC simple
The IPC simple test fixtures had replicated code in many places.
This changeset removes most of the duplication in the relevant files.


[ROCm/rocshmem commit: 526811957b]
2024-10-01 14:57:21 -05:00
avinashkethineedi 0641a4a29e make MPI_Init and MPI_Finalize independent of the test fixtures
[ROCm/rocshmem commit: 0f7dc70894]
2024-10-01 18:33:36 +00:00
Brandon Potter 25d7d7fccd Change notifier max thread block value to account for MI300 CPX
[ROCm/rocshmem commit: db221b022a]
2024-09-27 11:17:53 -05:00
Brandon Potter 325ce3cba7 Bugfixes for the ipc unit tests
[ROCm/rocshmem commit: f85c46ec0a]
2024-09-26 13:40:05 -05:00
Brandon Potter 56c1626df1 Update fine-grained simple tests
[ROCm/rocshmem commit: 46fdb1851c]
2024-09-10 09:35:41 -07:00
Brandon Potter 10d351b6a1 Intermediate commit for rebase
[ROCm/rocshmem commit: 2806e1be79]
2024-09-10 07:10:22 -07:00
Brandon Potter 74c4a248cc Add an extra assertion check for nullptr
[ROCm/rocshmem commit: 678564ba3c]
2024-09-10 07:10:22 -07:00
Brandon Potter aed0da61d0 Add sync method to notifier class
[ROCm/rocshmem commit: 359d6be797]
2024-09-10 07:10:21 -07:00
Brandon Potter 9b0e4dc05d Change notifier fixture to prep for other fixtures
[ROCm/rocshmem commit: 1289d50be5]
2024-09-10 07:10:21 -07:00
Brandon Potter 68384da019 Update Notifier fixture to Block
[ROCm/rocshmem commit: 5b42cff96c]
2024-09-10 07:10:21 -07:00
Brandon Potter 13ec689cdf Updates to Notifier
[ROCm/rocshmem commit: 51c33b2a66]
2024-09-10 07:10:21 -07:00
Brandon Potter 4896235ada Change read/write to load/store in Notifier API
[ROCm/rocshmem commit: 039ea82777]
2024-09-10 07:10:21 -07:00
Brandon Potter ef4a14e947 Fix problems with Notifier
[ROCm/rocshmem commit: 0c53a075f2]
2024-09-10 07:10:21 -07:00
Brandon Potter 4610659bf1 Add simple fine test
[ROCm/rocshmem commit: da93542c40]
2024-09-10 07:10:21 -07:00
avinashkethineedi fe866aaea6 Update context_ipc_gtest.cpp to use IPCbackend
[ROCm/rocshmem commit: c9dbcf80c2]
2024-08-15 11:54:56 -07:00
avinashkethineedi a13e2599d9 Code refactor
move ipc_policy.hpp and ipc_policy.cpp files to src, since they are used by all the conduits.


[ROCm/rocshmem commit: 24375a949e]
2024-08-14 20:44:35 -07:00
Brandon Potter bda68c964e Add ipc unit_tests
[ROCm/rocshmem commit: 58c5a98b5d]
2024-08-07 12:18:12 -07:00
Edgar Gabriel 7e446cfc81 unit_tests: add ipc_context tests
add the initial outline of an ipc_context unit test. The current test
only invokes the ipc_context constructor, more tests will be added later
as the class is being populated.

Also, at the moment the unit test takes an ROBackend as an argument for
the constructor, not sure whether this will be the final solution.


[ROCm/rocshmem commit: b1bc4a497a]
2024-07-30 15:22:03 -07:00
Brandon Potter 950a4f75cd Disable forward_list unit_test
[ROCm/rocshmem commit: afd51b3cbb]
2024-07-11 10:22:36 -07:00
Brandon Potter 77ddf075a2 Remove SpinEBOBlockMutex usage and unit tests
[ROCm/rocshmem commit: 770890a107]
2024-07-11 10:12:19 -07:00
Brandon Potter 76f739ca40 Disable unit_test for slab allocator
[ROCm/rocshmem commit: 3aaf29399c]
2024-07-11 08:49:36 -07:00