avinashkethineedi
4e71fa5d22
Remove active-set-based reduction test from the functional tests suite
...
[ROCm/rocshmem commit: e9484bbb86 ]
2024-10-28 21:22:46 +00:00
Yiltan
da9466f7cc
Merge pull request #44 from ROCm/fix-printing
...
Clean up functional tests output
[ROCm/rocshmem commit: 9885f984f6 ]
2024-10-28 15:45:28 -04:00
Yiltan
bd3a07d28b
Merge pull request #43 from ROCm/LWPRHMEM-75-API-differences-bug-fix
...
LWPRHMEM-75 API differences bug fix
[ROCm/rocshmem commit: 794b888d69 ]
2024-10-28 15:45:15 -04:00
Yiltan Temucin
0ed689439b
Cleaned up how we print the output
...
[ROCm/rocshmem commit: 9576ff6440 ]
2024-10-28 13:37:33 -05:00
Yiltan Temucin
bc315d1c8b
API bug fix in IB conduit
...
[ROCm/rocshmem commit: 98afb41263 ]
2024-10-24 11:52:03 -05:00
Yiltan Temucin
2da1f4e925
API change bug fix
...
[ROCm/rocshmem commit: e210020e9b ]
2024-10-24 11:52:03 -05:00
Edgar Gabriel
95c01c67af
add ascii art for ring allreduce
...
[ROCm/rocshmem commit: 11df5427a6 ]
2024-10-24 15:08:32 +00:00
Edgar Gabriel
2836240906
fix odd-case allreduce scenarios
...
if the number of elements to be used in the allreduce operation is not
an exact multiple of the work-array buffer size and the number of PEs, we
need to adjust the algorithm to:
- initially perform a ring_allreduce on n_segments * chunk_size elements
(n_segments being the integer division of the number of elements by the
work-buffer size, so this step will not cover the entire buffer)
- perform another ring_allreduce where chunk_size is reduced to match
the remaining elements
- if the remaining elements from the previous step cannot be evenly
divided by the number of PEs, we need to perform a direct_allreduce on
the outstanding number of elements.
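The three steps above can be sketched as a small host-side planning function. This is only an illustration of the arithmetic, not the code from the commit: `AllreducePlan`, `plan_allreduce`, and the parameter `work_elems` (assumed here to be the number of elements one full ring pass can cover) are hypothetical names.

```cpp
#include <cstddef>

// Hypothetical plan type; the names are illustrative, not taken from rocSHMEM.
struct AllreducePlan {
  std::size_t full_ring_elems;     // step 1: elements covered by full ring passes
  std::size_t reduced_chunk_size;  // step 2: per-PE chunk size for the remainder
  std::size_t reduced_ring_elems;  // step 2: elements covered by the reduced ring pass
  std::size_t direct_elems;        // step 3: leftover elements for the direct allreduce
};

// Split nelems into the three phases described in the commit message.
// Assumption: work_elems is the element count of one full ring pass.
AllreducePlan plan_allreduce(std::size_t nelems, std::size_t work_elems,
                             std::size_t n_pes) {
  AllreducePlan plan{0, 0, 0, 0};
  if (work_elems == 0 || n_pes == 0) {
    // Degenerate input: fall back to the direct algorithm for everything.
    plan.direct_elems = nelems;
    return plan;
  }
  // Step 1: as many full passes as fit (integer division).
  const std::size_t n_segments = nelems / work_elems;
  plan.full_ring_elems = n_segments * work_elems;

  // Step 2: shrink the chunk so one more ring pass covers what divides evenly.
  const std::size_t remaining = nelems - plan.full_ring_elems;
  plan.reduced_chunk_size = remaining / n_pes;
  plan.reduced_ring_elems = plan.reduced_chunk_size * n_pes;

  // Step 3: whatever does not divide evenly by the number of PEs.
  plan.direct_elems = remaining - plan.reduced_ring_elems;
  return plan;
}
```

With 1003 elements, a 256-element pass, and 4 PEs, this split gives 768 elements for the full ring passes, 232 for the reduced ring pass (chunk size 58), and 3 for the direct allreduce.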
[ROCm/rocshmem commit: a4b4281f50 ]
2024-10-24 15:08:32 +00:00
Edgar Gabriel
7b760bb023
fix barrier synchronization on gfx90a
...
[ROCm/rocshmem commit: 87db7f7d38 ]
2024-10-24 15:08:28 +00:00
Edgar Gabriel
777401ae29
add some example code
...
first examples include a getmem testcase and an allreduce (to_all)
example.
[ROCm/rocshmem commit: a0ac7b2d60 ]
2024-10-24 15:07:17 +00:00
Edgar Gabriel
c9b5f03548
ipc: add ring_allreduce algorithms
...
add the ring allreduce algorithm to the ipc conduit in order to be able
to execute slightly larger reductions.
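For readers unfamiliar with the technique, a ring allreduce can be simulated on the host as below. This is a minimal sketch of the textbook algorithm (reduce-scatter followed by allgather), assuming a sum reduction and a buffer of exactly n_pes * chunk elements per PE; it is not the GPU implementation in the ipc conduit, and `Buffers` and `ring_allreduce_sum` are hypothetical names.

```cpp
#include <cstddef>
#include <vector>

// data[p] is the buffer owned by simulated PE p; each holds n_pes * chunk elements.
using Buffers = std::vector<std::vector<long>>;

// Host-side simulation of a ring allreduce (sum). Illustrative only.
void ring_allreduce_sum(Buffers& data, std::size_t chunk) {
  const std::size_t n = data.size();
  if (n < 2) return;

  // Phase 1, reduce-scatter: in step s, PE p sends chunk (p - s) mod n to its
  // right neighbour, which accumulates it. After n - 1 steps, PE p holds the
  // fully reduced chunk (p + 1) mod n.
  for (std::size_t s = 0; s + 1 < n; ++s) {
    const Buffers snapshot = data;  // models all sends happening at once
    for (std::size_t p = 0; p < n; ++p) {
      const std::size_t c = (p + n - s) % n;
      const std::size_t dst = (p + 1) % n;
      for (std::size_t i = 0; i < chunk; ++i)
        data[dst][c * chunk + i] += snapshot[p][c * chunk + i];
    }
  }

  // Phase 2, allgather: in step s, PE p forwards the reduced chunk
  // (p + 1 - s) mod n to its right neighbour, which overwrites its copy.
  for (std::size_t s = 0; s + 1 < n; ++s) {
    const Buffers snapshot = data;
    for (std::size_t p = 0; p < n; ++p) {
      const std::size_t c = (p + 1 + n - s) % n;
      const std::size_t dst = (p + 1) % n;
      for (std::size_t i = 0; i < chunk; ++i)
        data[dst][c * chunk + i] = snapshot[p][c * chunk + i];
    }
  }
}
```

Each PE only ever exchanges one chunk per step with its neighbour, which is why the ring variant suits larger reductions than a direct algorithm where every PE reads every other PE's full buffer.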
[ROCm/rocshmem commit: 1fbb89bc73 ]
2024-10-24 15:07:17 +00:00
Edgar Gabriel
5f0f2f6e85
ipc/to_all: add direct allreduce algorithm
...
add a simple version of an allreduce algorithm as a starting point.
[ROCm/rocshmem commit: ba21cb7b85 ]
2024-10-24 15:07:14 +00:00
Brandon Potter
a032b41c20
Merge pull request #34 from BKP/ipc_parameterized_simple_tests_10-01-24
...
IPC Parameterized Simple Tests
[ROCm/rocshmem commit: 416dffa129 ]
2024-10-24 08:23:26 -05:00
Avinash Kethineedi
6979677ec5
Merge pull request #41 from avinashkethineedi/collective_routine_buffers
...
Fine grained memory buffers for work/sync arrays
[ROCm/rocshmem commit: 8a16968cf2 ]
2024-10-23 23:33:48 -05:00
avinashkethineedi
82d296db73
Fix quiet and fence of default context
...
* Update tinfo of default context
[ROCm/rocshmem commit: d5ea5868e3 ]
2024-10-22 16:18:05 +00:00
avinashkethineedi
fbcba80cd3
Add fine grained memory buffers for work/sync arrays
...
* Add internal put_mem/get_mem{_wave, _wg} functions to read/write to work/sync arrays
* Add condition check to ensure all MPI processes are on the same compute node for IPC conduit
[ROCm/rocshmem commit: 6685d0ab60 ]
2024-10-21 15:28:39 +00:00
Yiltan
08ab0b4a41
Merge pull request #39 from Yiltan/LWPRHMEM-75-API-differences
...
LWPRHMEM-75 API Differences
[ROCm/rocshmem commit: b922bdcf4c ]
2024-10-18 15:27:34 -04:00
avinashkethineedi
10eb11c1d5
Use C++ iota function to reset buffers and use its values for verification
...
* Update functional test script to include new tests
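The iota-based reset and verification mentioned in this commit can be illustrated with a short host-side sketch. The helper names `reset_buffer` and `first_mismatch` are hypothetical and the actual test code may be organised differently; only `std::iota` itself is taken from the commit subject.

```cpp
#include <cstddef>
#include <numeric>
#include <vector>

// Reset a buffer to the sequence start, start + 1, start + 2, ...
void reset_buffer(std::vector<int>& buf, int start = 0) {
  std::iota(buf.begin(), buf.end(), start);
}

// Return the index of the first element that differs from the iota pattern,
// or buf.size() when the whole buffer matches.
std::size_t first_mismatch(const std::vector<int>& buf, int start = 0) {
  for (std::size_t i = 0; i < buf.size(); ++i) {
    if (buf[i] != start + static_cast<int>(i)) return i;
  }
  return buf.size();
}
```

Because every position holds a distinct value, a put or get that lands at the wrong offset shows up as a mismatch, which a constant fill pattern would not reveal.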
[ROCm/rocshmem commit: 18a1bdd0ac ]
2024-10-15 20:23:25 +00:00
Avinash Kethineedi
1a5536bfaa
Merge branch 'ROCm:develop' into functional_tests/puts_gets
...
[ROCm/rocshmem commit: e981f61693 ]
2024-10-14 10:27:54 -05:00
Yiltan Hassan Temucin
31fe937259
updated atomic_fetch() parameters
...
[ROCm/rocshmem commit: 8b3854b252 ]
2024-10-11 13:34:28 -07:00
Yiltan Hassan Temucin
3d0fca0387
updated *_wait* APIs to use int rather than roc_shmem_cmps
...
[ROCm/rocshmem commit: 722a5f0731 ]
2024-10-11 13:34:28 -07:00
Yiltan Hassan Temucin
496f06dd2b
*_wait* routines changed parameter from ptr to ivars to match OpenSHMEM
...
[ROCm/rocshmem commit: bcf3fdff10 ]
2024-10-11 13:34:28 -07:00
Brandon Potter
7f19a42778
Merge branch 'ROCm:develop' into ipc_parameterized_simple_tests_10-01-24
...
[ROCm/rocshmem commit: ce0ca36d37 ]
2024-10-11 12:49:56 -05:00
Brandon Potter
5b47cf482d
Merge pull request #29 from ROCm/improve-ib-latency
...
Vectorize WQE segment writes
[ROCm/rocshmem commit: e419a8b963 ]
2024-10-11 11:55:48 -05:00
Yiltan Hassan Temucin
17323323f8
fixed notifier bug
...
[ROCm/rocshmem commit: 509277c034 ]
2024-10-10 06:45:43 -07:00
Yiltan Hassan Temucin
8334214b98
added notifier->sync() when we are not using cooperative groups
...
updated scope bug
[ROCm/rocshmem commit: b1134e8633 ]
2024-10-09 13:11:28 -07:00
Yiltan Hassan Temucin
caa6d356c0
Added Cooperative Groups configure option and header
...
[ROCm/rocshmem commit: 63667a3167 ]
2024-10-09 13:11:12 -07:00
Yiltan Hassan Temucin
45976b23ae
Fix initialization order bug
...
[ROCm/rocshmem commit: 1baa071edf ]
2024-10-09 13:11:12 -07:00
Yiltan Hassan Temucin
ef571f5863
fixed barrier issue on MI250X
...
[ROCm/rocshmem commit: e2f6a65284 ]
2024-10-08 13:18:04 -07:00
Yiltan Hassan Temucin
3cba9ccd42
added .gitignore, we do not want to include the build directory in our commits
...
[ROCm/rocshmem commit: 120453c75c ]
2024-10-08 13:18:04 -07:00
avinashkethineedi
7eec77ea17
Add script to run unit tests
...
[ROCm/rocshmem commit: c1bcf336b4 ]
2024-10-08 18:12:07 +00:00
avinashkethineedi
37b1de86cd
Add team information to the context
...
* Update roc_shmem_ctx_fence API to use team-relative PE numbering
* Update backend to populate team_opaque member of ROC_SHMEM_CTX_DEFAULT (used to store information about the team wrt TEAM_WORLD)
[ROCm/rocshmem commit: 92fb1abaf2 ]
2024-10-04 17:56:15 +00:00
avinashkethineedi
69784a7423
Add fence and quiet functionality
...
* Perform atomic stores to enforce memory ordering
[ROCm/rocshmem commit: 979aed105a ]
2024-10-03 06:28:12 +00:00
Brandon Potter
8e44e5d458
Merge pull request #31 from BKP/ipc_bringup_fine_unit_09-26-24
...
Add IPC Simple Buffer Fine-grained Unit Tests
[ROCm/rocshmem commit: 787cf0ff3f ]
2024-10-01 15:12:30 -05:00
avinashkethineedi
285ac5cab6
Add MPI_THREAD_MULTIPLE check
...
[ROCm/rocshmem commit: 2f0739d823 ]
2024-10-01 20:05:15 +00:00
Brandon Potter
cd44115728
Poll the signal from one thread instead of all
...
[ROCm/rocshmem commit: 24b928a007 ]
2024-10-01 15:01:37 -05:00
Brandon Potter
44803b3ba1
Use gtest parameterized test macros for IPC simple
...
The IPC simple test fixtures had replicated code in many places.
This changeset removes most of the duplication in the relevant files.
[ROCm/rocshmem commit: 526811957b ]
2024-10-01 14:57:21 -05:00
avinashkethineedi
0641a4a29e
make MPI_Init and MPI_Finalize independent of the test fixtures
...
[ROCm/rocshmem commit: 0f7dc70894 ]
2024-10-01 18:33:36 +00:00
Brandon Potter
c097da70c4
Poll the signal from one thread instead of all
...
[ROCm/rocshmem commit: 0659f8d93c ]
2024-09-27 15:17:57 -05:00
Brandon Potter
25d7d7fccd
Change notifier max thread block value to account for MI300 CPX
...
[ROCm/rocshmem commit: db221b022a ]
2024-09-27 11:17:53 -05:00
Brandon Potter
24a527dcda
Reset config options to original values
...
[ROCm/rocshmem commit: 56b2ed699b ]
2024-09-27 11:17:11 -05:00
Brandon Potter
325ce3cba7
Bugfixes for the ipc unit tests
...
[ROCm/rocshmem commit: f85c46ec0a ]
2024-09-26 13:40:05 -05:00
Edgar Gabriel
bed676f89d
fix assembly switch/case instruction
...
move the case statement out of the architecture specific section.
[ROCm/rocshmem commit: c133ea18a5 ]
2024-09-20 20:25:40 +00:00
Muhammad Awad
fe3ecde6f6
Vectorize WQE segment writes
...
Signed-off-by: Muhammad Awad <MuhammadAbdelghaffar.Awad@amd.com>
[ROCm/rocshmem commit: 3162d49b56 ]
2024-09-17 20:34:18 -05:00
Brandon Potter
56c1626df1
Update fine-grained simple tests
...
[ROCm/rocshmem commit: 46fdb1851c ]
2024-09-10 09:35:41 -07:00
Brandon Potter
e64264d233
Add missing header file
...
[ROCm/rocshmem commit: 86a2f34539 ]
2024-09-10 09:35:02 -07:00
Brandon Potter
bf79c21ea8
Conservatively use SEQ_CST atomics in IPC conduit
...
[ROCm/rocshmem commit: 7411c45591 ]
2024-09-10 09:34:45 -07:00
Brandon Potter
10d351b6a1
Intermediate commit for rebase
...
[ROCm/rocshmem commit: 2806e1be79 ]
2024-09-10 07:10:22 -07:00
Brandon Potter
74c4a248cc
Add an extra assertion check for nullptr
...
[ROCm/rocshmem commit: 678564ba3c ]
2024-09-10 07:10:22 -07:00
Brandon Potter
688826937f
Minor updates to Notifier sync method
...
[ROCm/rocshmem commit: 45c29e7734 ]
2024-09-10 07:10:21 -07:00