Commit Graph

108 Commits

Author SHA1 Message Date
avinashkethineedi 4e71fa5d22 Remove active-set-based reduction test from the functional tests suite
[ROCm/rocshmem commit: e9484bbb86]
2024-10-28 21:22:46 +00:00
Yiltan da9466f7cc Merge pull request #44 from ROCm/fix-printing
Clean up functional tests output

[ROCm/rocshmem commit: 9885f984f6]
2024-10-28 15:45:28 -04:00
Yiltan bd3a07d28b Merge pull request #43 from ROCm/LWPRHMEM-75-API-differences-bug-fix
LWPRHMEM-75 API differences bug fix

[ROCm/rocshmem commit: 794b888d69]
2024-10-28 15:45:15 -04:00
Yiltan Temucin 0ed689439b Cleaned up how we print the output
[ROCm/rocshmem commit: 9576ff6440]
2024-10-28 13:37:33 -05:00
Yiltan Temucin bc315d1c8b API bug fix in IB conduit
[ROCm/rocshmem commit: 98afb41263]
2024-10-24 11:52:03 -05:00
Yiltan Temucin 2da1f4e925 API change bug fix
[ROCm/rocshmem commit: e210020e9b]
2024-10-24 11:52:03 -05:00
Edgar Gabriel 95c01c67af add ascii art for ring allreduce
[ROCm/rocshmem commit: 11df5427a6]
2024-10-24 15:08:32 +00:00
Edgar Gabriel 2836240906 fix odd-case allreduce scenarios
if the number of elements to be used in the allreduce operation is not
an exact multiple of the work-array buffer size and number of PEs, we need
to adjust the algorithm to:
 - initially perform a ring_allreduce on n_segments * chunk_size (where
   n_segments is the integer division of the number of elements by the
   work-buffer size, i.e. this will not cover the entire buffer)
 - perform another ring_allreduce where chunk_size is reduced to match
   the remaining elements
 - if the remaining elements from the previous step cannot be evenly
   divided by the number of PEs, we need to perform a direct_allreduce on
   the outstanding number of elements.


[ROCm/rocshmem commit: a4b4281f50]
2024-10-24 15:08:32 +00:00
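
The three-phase split described in commit 2836240906 can be illustrated with a little arithmetic. The sketch below is not taken from rocSHMEM: `AllreducePlan`, `plan_allreduce`, and the `segment_elems` parameter are hypothetical names, and it assumes that one full-size ring pass covers `segment_elems` elements (a value derived from the work-buffer size). It only computes how many elements each phase would handle.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not part of rocSHMEM): describes how many elements
// each of the three phases from the commit message would process.
struct AllreducePlan {
  std::size_t n_segments;     // number of full-size ring passes (phase 1)
  std::size_t full_elems;     // elements covered by phase 1
  std::size_t reduced_chunk;  // per-PE chunk size of the reduced ring pass (phase 2)
  std::size_t reduced_elems;  // elements covered by phase 2
  std::size_t direct_elems;   // leftover elements for the direct allreduce (phase 3)
};

// Assumption: segment_elems is the number of elements one full-size ring
// pass can handle; the real code derives its sizes from the work buffer.
AllreducePlan plan_allreduce(std::size_t nelems, std::size_t segment_elems,
                             std::size_t n_pes) {
  assert(segment_elems > 0 && n_pes > 0);
  AllreducePlan p{};
  // Phase 1: as many full-size ring passes as fit (integer division).
  p.n_segments = nelems / segment_elems;
  p.full_elems = p.n_segments * segment_elems;
  // Phase 2: one more ring pass with a chunk size shrunk to the remainder.
  const std::size_t remaining = nelems - p.full_elems;
  p.reduced_chunk = remaining / n_pes;
  p.reduced_elems = p.reduced_chunk * n_pes;
  // Phase 3: whatever does not divide evenly among the PEs.
  p.direct_elems = remaining - p.reduced_elems;
  return p;
}
```

For example, under these assumptions 1003 elements with 256-element passes on 4 PEs split into 768 elements for the full ring passes, 232 for the reduced pass, and 3 for the direct allreduce.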
Edgar Gabriel 7b760bb023 fix barrier synchronization on gfx90a
[ROCm/rocshmem commit: 87db7f7d38]
2024-10-24 15:08:28 +00:00
Edgar Gabriel 777401ae29 add some example code
first examples include a getmem testcase and an allreduce (to_all)
example.


[ROCm/rocshmem commit: a0ac7b2d60]
2024-10-24 15:07:17 +00:00
Edgar Gabriel c9b5f03548 ipc: add ring_allreduce algorithms
add the ring allreduce algorithm to the ipc conduit in order to be able
to execute slightly larger reductions.


[ROCm/rocshmem commit: 1fbb89bc73]
2024-10-24 15:07:17 +00:00
Edgar Gabriel 5f0f2f6e85 ipc/to_all: add direct allreduce algorithm
add a simple version of an allreduce algorithm as a starting point.


[ROCm/rocshmem commit: ba21cb7b85]
2024-10-24 15:07:14 +00:00
Brandon Potter a032b41c20 Merge pull request #34 from BKP/ipc_parameterized_simple_tests_10-01-24
IPC Parameterized Simple Tests

[ROCm/rocshmem commit: 416dffa129]
2024-10-24 08:23:26 -05:00
Avinash Kethineedi 6979677ec5 Merge pull request #41 from avinashkethineedi/collective_routine_buffers
Fine grained memory buffers for work/sync arrays

[ROCm/rocshmem commit: 8a16968cf2]
2024-10-23 23:33:48 -05:00
avinashkethineedi 82d296db73 Fix quiet and fence of default context
* Update tinfo of default context


[ROCm/rocshmem commit: d5ea5868e3]
2024-10-22 16:18:05 +00:00
avinashkethineedi fbcba80cd3 Add fine grained memory buffers for work/sync arrays
* Add internal put_mem/get_mem{_wave, _wg} functions to read/write to work/sync arrays
* Add condition check to ensure all MPI processes are on the same compute node for IPC conduit


[ROCm/rocshmem commit: 6685d0ab60]
2024-10-21 15:28:39 +00:00
Yiltan 08ab0b4a41 Merge pull request #39 from Yiltan/LWPRHMEM-75-API-differences
LWPRHMEM-75 API Differences

[ROCm/rocshmem commit: b922bdcf4c]
2024-10-18 15:27:34 -04:00
avinashkethineedi 10eb11c1d5 Use C++ iota function to reset buffers and use its values for verification
* Update functional test script to include new tests


[ROCm/rocshmem commit: 18a1bdd0ac]
2024-10-15 20:23:25 +00:00
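
The reset-and-verify pattern named in commit 10eb11c1d5 can be sketched with `std::iota` from `<numeric>`. This is a host-side illustration only, not the test code from the repository: `reset_buffer` and `verify_buffer` are hypothetical helpers, and the element type `int` is an assumption.

```cpp
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical helper: fill the buffer with start, start+1, start+2, ...
void reset_buffer(std::vector<int>& buf, int start) {
  std::iota(buf.begin(), buf.end(), start);
}

// Hypothetical helper: check that the buffer still holds the iota sequence,
// e.g. after the data has been moved by a put or get.
bool verify_buffer(const std::vector<int>& buf, int start) {
  for (std::size_t i = 0; i < buf.size(); ++i) {
    if (buf[i] != start + static_cast<int>(i)) {
      return false;
    }
  }
  return true;
}
```

Because every element has a distinct, position-dependent value, this kind of check also catches data that arrives at the wrong offset, which a constant fill pattern would miss.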
Avinash Kethineedi 1a5536bfaa Merge branch 'ROCm:develop' into functional_tests/puts_gets
[ROCm/rocshmem commit: e981f61693]
2024-10-14 10:27:54 -05:00
Yiltan Hassan Temucin 31fe937259 updated atomic_fetch() parameters
[ROCm/rocshmem commit: 8b3854b252]
2024-10-11 13:34:28 -07:00
Yiltan Hassan Temucin 3d0fca0387 updated *_wait* APIs to use int rather than roc_shmem_cmps
[ROCm/rocshmem commit: 722a5f0731]
2024-10-11 13:34:28 -07:00
Yiltan Hassan Temucin 496f06dd2b *_wait* routines changed parameter from ptr to ivars to match OpenSHMEM
[ROCm/rocshmem commit: bcf3fdff10]
2024-10-11 13:34:28 -07:00
Brandon Potter 7f19a42778 Merge branch 'ROCm:develop' into ipc_parameterized_simple_tests_10-01-24
[ROCm/rocshmem commit: ce0ca36d37]
2024-10-11 12:49:56 -05:00
Brandon Potter 5b47cf482d Merge pull request #29 from ROCm/improve-ib-latency
Vectorize WQE segments writes

[ROCm/rocshmem commit: e419a8b963]
2024-10-11 11:55:48 -05:00
Yiltan Hassan Temucin 17323323f8 fixed notifier bug
[ROCm/rocshmem commit: 509277c034]
2024-10-10 06:45:43 -07:00
Yiltan Hassan Temucin 8334214b98 added notifier->sync() when we are not using cooperative groups
updated scope bug


[ROCm/rocshmem commit: b1134e8633]
2024-10-09 13:11:28 -07:00
Yiltan Hassan Temucin caa6d356c0 Added Cooperative Groups configure option and header
[ROCm/rocshmem commit: 63667a3167]
2024-10-09 13:11:12 -07:00
Yiltan Hassan Temucin 45976b23ae Fix initialization order bug
[ROCm/rocshmem commit: 1baa071edf]
2024-10-09 13:11:12 -07:00
Yiltan Hassan Temucin ef571f5863 fixed barrier issue on MI250X
[ROCm/rocshmem commit: e2f6a65284]
2024-10-08 13:18:04 -07:00
Yiltan Hassan Temucin 3cba9ccd42 added .gitignore, we do not want to include the build directory in our commits
[ROCm/rocshmem commit: 120453c75c]
2024-10-08 13:18:04 -07:00
avinashkethineedi 7eec77ea17 Add script to run unit tests
[ROCm/rocshmem commit: c1bcf336b4]
2024-10-08 18:12:07 +00:00
avinashkethineedi 37b1de86cd Add team information to the context
* Update roc_shmem_ctx_fence API to use team-relative PE numbering
* Update backend to populate team_opaque member of ROC_SHMEM_CTX_DEFAULT (used to store information about the team wrt TEAM_WORLD)


[ROCm/rocshmem commit: 92fb1abaf2]
2024-10-04 17:56:15 +00:00
avinashkethineedi 69784a7423 Add fence and quiet functionality
* Perform atomic stores to enforce memory ordering


[ROCm/rocshmem commit: 979aed105a]
2024-10-03 06:28:12 +00:00
Brandon Potter 8e44e5d458 Merge pull request #31 from BKP/ipc_bringup_fine_unit_09-26-24
Add IPC Simple Buffer Fine-grained Unit Tests

[ROCm/rocshmem commit: 787cf0ff3f]
2024-10-01 15:12:30 -05:00
avinashkethineedi 285ac5cab6 Add MPI_THREAD_MULTIPLE check
[ROCm/rocshmem commit: 2f0739d823]
2024-10-01 20:05:15 +00:00
Brandon Potter cd44115728 Poll the signal from one thread instead of all
[ROCm/rocshmem commit: 24b928a007]
2024-10-01 15:01:37 -05:00
Brandon Potter 44803b3ba1 Use gtest parameterized test macros for IPC simple
The IPC simple test fixtures had replicated code in many places.
This changeset removes most of the duplication in the relevant files.


[ROCm/rocshmem commit: 526811957b]
2024-10-01 14:57:21 -05:00
avinashkethineedi 0641a4a29e make MPI_Init and MPI_Finalize independent of the test fixtures
[ROCm/rocshmem commit: 0f7dc70894]
2024-10-01 18:33:36 +00:00
Brandon Potter c097da70c4 Poll the signal from one thread instead of all
[ROCm/rocshmem commit: 0659f8d93c]
2024-09-27 15:17:57 -05:00
Brandon Potter 25d7d7fccd Change notifier max thread block value to account for MI300 CPX
[ROCm/rocshmem commit: db221b022a]
2024-09-27 11:17:53 -05:00
Brandon Potter 24a527dcda Reset config options to original values
[ROCm/rocshmem commit: 56b2ed699b]
2024-09-27 11:17:11 -05:00
Brandon Potter 325ce3cba7 Bugfixes for the ipc unit tests
[ROCm/rocshmem commit: f85c46ec0a]
2024-09-26 13:40:05 -05:00
Edgar Gabriel bed676f89d fix assembly switch/case instruction
move the case statement out of the architecture specific section.


[ROCm/rocshmem commit: c133ea18a5]
2024-09-20 20:25:40 +00:00
Muhammad Awad fe3ecde6f6 Vectorize WQE segments writes
Signed-off-by: Muhammad Awad <MuhammadAbdelghaffar.Awad@amd.com>


[ROCm/rocshmem commit: 3162d49b56]
2024-09-17 20:34:18 -05:00
Brandon Potter 56c1626df1 Update fine-grained simple tests
[ROCm/rocshmem commit: 46fdb1851c]
2024-09-10 09:35:41 -07:00
Brandon Potter e64264d233 Add missing header file
[ROCm/rocshmem commit: 86a2f34539]
2024-09-10 09:35:02 -07:00
Brandon Potter bf79c21ea8 Conservatively use SEQ_CST atomics in IPC conduit
[ROCm/rocshmem commit: 7411c45591]
2024-09-10 09:34:45 -07:00
Brandon Potter 10d351b6a1 Intermediate commit for rebase
[ROCm/rocshmem commit: 2806e1be79]
2024-09-10 07:10:22 -07:00
Brandon Potter 74c4a248cc Add an extra assertion check for nullptr
[ROCm/rocshmem commit: 678564ba3c]
2024-09-10 07:10:22 -07:00
Brandon Potter 688826937f Minor updates to Notifier sync method
[ROCm/rocshmem commit: 45c29e7734]
2024-09-10 07:10:21 -07:00