Commit Graph

58 Commits

Author SHA1 Message Date
Jobbins 474112d03c Code Coverage (#82)
code coverage: generate code coverage reports

* Add instrumentation flags to rocshmem target when adding -DBUILD_CODE_COVERAGE cmake flag
* Add helper script to build all subprojects and generate code coverage reports
* Update README with code coverage instructions
2025-05-16 09:09:17 -06:00
Aurelien Bouteiller f0501550f7 cleanup leftovers from SOS testers removal (#97)
Follow-up to PR #85
2025-05-02 11:59:52 -04:00
Aurelien Bouteiller b835de6cd5 Substitute pow2bin allocator with a dlmalloc based allocator (#71)
* Add dlmalloc_strat allocator strategy
 - Use mspace variant to ease encapsulation
 - Make pow2bins and dlmalloc cmake selectable
* Add unit tester for dlmalloc, rework single_heap, pow2bins unit testers
accordingly
 - add dlmalloc get_used/get_avail, and have all strats allocators also have a get_used
 - Rework memallocator unit tests: bin size is per strat, alignment is verified in singleheap
* bugfix: dlmalloc exposed that the pingpong test would write past end of
allocation with -w 32
* iostream leakage/mixed usage of cerr and fprintf(stderr)

---------

Signed-off-by: Aurelien Bouteiller <aurelien.bouteiller@amd.com>
2025-05-01 11:55:23 -04:00
Aurelien Bouteiller 67bc5b9e5a Show and log what the functional test driver is running (#70)
Show and log what the functional test driver is running
* Log errors in the log file
* list all failed tests at the end
* pretty colors :x
* Print stderr when the test has failed

---------

Signed-off-by: Aurelien Bouteiller <aurelien.bouteiller@amd.com>
2025-04-23 10:21:35 -04:00
Avinash Kethineedi f6ef19f5a9 Add SPDX license identifiers and update copyright headers (#85)
* Update copyright information and add SPDX license identifier

* Update AUTHORS

* Remove `sos_tests`
2025-04-15 15:37:53 -05:00
Yiltan 5ee0c3407e Added sphinx dependencies (#84) 2025-04-15 11:28:16 -04:00
Edgar Gabriel 5e49567b6c add new flag to build instructions (#78)
This flag is required to link a pytorch use-case correctly.
It doesn't seem to impact the rocSHMEM code.
2025-04-10 08:39:54 -05:00
Yiltan 25e7109b64 Enable RO CI (#65) 2025-04-08 16:12:22 -04:00
Avinash Kethineedi dc61bca066 Update Barrier and Sync APIs (#73)
* Add thread, wavefront, and workgroup-level `barrier` APIs in IPC and RO conduits; remove collectives on default context
 - Implemented `barrier` APIs for thread, wavefront, and workgroup scopes
 - Added support in both IPC and RO conduits
 - Added functional tests to cover all `barrier` APIs
 - Removed collective operations on default context

* Add thread, wavefront, and workgroup-level `sync` APIs in IPC and RO conduits.
  - Implemented `sync` APIs for thread, wavefront, and workgroup scopes
  - Added support in both IPC and RO conduits
  - Added functional tests to cover all `sync` APIs

* update naming convention for context-based `barrier` APIs
2025-04-08 11:25:31 -05:00
Avinash Kethineedi c652f58cef Update Barrier_All and Sync_All APIs (#72)
* Fix deadlock in `rocshmem_ctx_wg_barrier_all` API in IPC conduit by adding per-context pSync buffers and context IDs
  - Added separate pSync buffers for each device context
  - Resolved deadlock when invoking barrier API (`rocshmem_ctx_wg_barrier_all`) concurrently from multiple contexts

* Update barrier_all functional tests for multi-context support

* Add thread, wavefront, and workgroup-level barrier_all APIs in IPC and RO conduits
  - Implemented barrier_all APIs at thread, wavefront, and workgroup granularity
  - Added support in both IPC and RO conduits
  - Updated functional tests to cover all `barrier_all` APIs

* Add thread, wavefront, and workgroup-level sync_all APIs in IPC and RO conduits
  - Implemented sync_all APIs for thread, wavefront, and workgroup scopes
  - Added support in both IPC and RO conduits
  - Added functional tests to cover all `sync_all` APIs
2025-04-02 11:58:55 -05:00
Edgar Gabriel 4e48c9748e update README documentation for RO (#63)
* README: update documentation for RO support

update the README and the install_dependencies script to match the
requirements of the RO conduit.

* add CODEOWNERS file
2025-03-25 07:50:15 -05:00
Avinash Kethineedi c16b0d6952 Fix/RO Backend Hang Issue (#53)
* Update HIP version check for compatibility with versions >= 5.5

* Update memory allocator for context BlockHandle
   - Replaced `HIPAllocator` with `HIPDefaultFinegrainedAllocator` for context `BlockHandle`.

* Update run commands for `rocshmem_g` and `rocshmem_p` functional tests
2025-03-24 22:54:07 -05:00
Edgar Gabriel bcbc42e78f add rocshmem_barrier() (#61)
* add team-barrier implementation

add a team-barrier API and implementation in the IPC and RO conduit.
Clean up some of the logic in the RO Conduit to distinguish between
sync, sync_all, barrier, and barrier_all.

* add team_barrier_tests to functional tests
2025-03-24 11:23:03 -05:00
Yiltan 658bf2a3b5 Removed GPU_IB (#59) 2025-03-24 09:04:52 -04:00
Yiltan 3428957de9 Sync Reverse Offload Scripts (#52)
* Sync Reverse Offload scripts
- Disable IPC unit tests when IPC is not available in the rocSHMEM configuration

* Added missing ptr in ipc_policy
2025-03-19 14:31:07 -04:00
Yiltan 7d9e82fb34 Bug fix for PR43 (#54) 2025-03-19 09:39:07 -04:00
Avinash Kethineedi aa3121a967 Update RMA functional tests (#50)
* Update primitive tests for multi-workgroup support

* Update workgroup primitive tests for multi-workgroup support

* Update wavefront primitive tests for multi-workgroup support

* Update team based primitive tests for multi-workgroup support

* Update RMA functional tests to capture timing after quiet call
   - Modified RMA functional tests to record the time after a `quiet` call in thread, wavefront, and workgroup RMA calls.

* Improve error handling and memory management
   - Replaced `cout` with `cerr` for improved error reporting.
   - Ensured all allocated memory is freed when `rocshmem_malloc` fails.

* Update start time in primitive tests and latency calculations
   - Modified primitive tests to capture the earliest start time.
   - Updated latency calculations in functional tests.

* Remove `GetSwarmTester`

* Update start time in team primitive tests

* Invoke quiet call from a single thread within a block on a rocshmem context
2025-03-18 14:39:57 -05:00
Avinash Kethineedi df4ad2c04d Refactor RO backend data structures (#49)
- Remove hdp and ipc pointers from BlockHandle, align RO stats with RO contexts

- Add run commands for `rocshmem_g` and `rocshmem_p` API tests in driver.sh

- Allocate rocshmem API return buffers based on number of device contexts.

- Associate status flag address with blocking calls and remove threadId dependency
   - Associated the status flag address with each blocking call request to notify the GPU thread.
   - Removed dependency on threadId for determining the appropriate status flag index.

- Move status flag buffer allocation to backend.

- Initialize allocated memory to zero
2025-03-14 10:49:44 -05:00
Yiltan 96424a59a8 Added option to build only tests and link to an external rocshmem library (#43)
* Rearrange CMakefile

* Enable linking to external rocshmem library

* Minor fix for the functional test driver

* ROCSHMEM_HOME detection fixed
2025-03-13 15:49:50 -04:00
Yiltan 785e31aa48 Sync develop with amd-mainline (#46)
* Update install_dependencies.sh

* Updated to ROCm repos

* Merge pull request #37 from ROCm/depBuild

locked specific version on ompi and ucx

* locked specific version on ompi and ucx

* [IPC] Fix ROCSHMEM_SIGNAL_ADD

* Generate CMake Package Configuration Files

---------

Co-authored-by: akolliasAMD <99202231+akolliasAMD@users.noreply.github.com>
Co-authored-by: akolliasAMD <akollias@amd.com>
2025-02-18 12:30:34 -05:00
Yiltan Hassan Temucin b83ff2fa84 Use the precalculated num_warps variable 2025-02-06 13:21:25 -06:00
Yiltan Hassan Temucin 8d74c7b73e Validate signal after put signal operations 2025-02-06 08:17:22 -06:00
Yiltan Hassan Temucin 3a8b0d4647 Updated RO build script and functional test driver for multi-node support 2025-01-23 16:46:19 -06:00
Yiltan fa90f4b0ac Minor fixes for packaging 2025-01-20 18:15:07 +00:00
Yiltan 0fb673e186 Update scripts/install_dependencies.sh
Co-authored-by: Avinash Kethineedi <avinash.kethineedi@amd.com>
2025-01-16 13:38:08 -05:00
Yiltan Temucin 5de0371bec Added script to install dependencies 2025-01-16 10:06:39 -06:00
avinashkethineedi 23172c9150 Updated driver.sh and tester.hpp with sequential numbering for test identification
* Enabled Ping Pong tests
* Removed test commands for multi-workgroup collective tests
2024-12-26 21:28:21 +00:00
Yiltan Temucin 98c164d72e Added timeout to unit tests 2024-12-06 15:50:22 -06:00
Yiltan Temucin d08ea96ea3 Update build scripts
- Only build for the machine we are on
- Saves CI time
2024-12-06 15:49:55 -06:00
avinashkethineedi d8ce066adc Merge branch PR #55 into naming_scheme 2024-12-04 21:46:38 +00:00
Yiltan 46dcfbbb9e Merge pull request #57 from Yiltan/CI-fix-degeneratetiledfine
[CI Bug Fix] Updated gfilter flags for DegenerateTiledFine tests
2024-12-03 14:56:00 -05:00
Yiltan Temucin 37fe71343f Updated gfilter flags for new unit tests 2024-11-26 13:51:29 -06:00
Brandon Potter fd8dbc7fb6 Use new naming scheme 2024-11-25 14:25:29 -06:00
Yiltan Temucin f710a301fe Added functional tests 2024-11-22 15:36:17 -06:00
Yiltan Temucin 4ad24b5aab Propagate errors from build scripts so CI doesn't silently fail 2024-11-15 11:17:33 -06:00
Yiltan Temucin 3f857718fd Fixed bug in functional and unit tests driver.sh
- The driver previously did not propagate errors correctly
- Adjusted gtest filters

driver edit
2024-11-15 10:50:31 -06:00
Yiltan Hassan Temucin 997eb69b5a modified team based to_all -> reduce 2024-11-06 09:46:43 -06:00
Yiltan Hassan Temucin fe767d9abf remove cooperative groups 2024-10-30 20:10:21 +00:00
avinashkethineedi 18a1bdd0ac Use C++ iota function to reset buffers and use its values for verification
* Update functional test script to include new tests
2024-10-15 20:23:25 +00:00
Avinash Kethineedi e981f61693 Merge branch 'ROCm:develop' into functional_tests/puts_gets 2024-10-14 10:27:54 -05:00
Yiltan Hassan Temucin 63667a3167 Added Cooperative Groups configure option and header 2024-10-09 13:11:12 -07:00
avinashkethineedi c1bcf336b4 Add script to run unit tests 2024-10-08 18:12:07 +00:00
Brandon Potter 56b2ed699b Reset config options to original values 2024-09-27 11:17:11 -05:00
Brandon Potter 2806e1be79 Intermediate commit for rebase 2024-09-10 07:10:22 -07:00
Brandon Potter c4b7e0d91b Partial notifier 2024-09-10 07:10:21 -07:00
Brandon Potter e9fb01ab6b Merge pull request #27 from ROCm/ipc_bringup
Ipc bringup
2024-09-10 09:06:51 -05:00
avinashkethineedi b6d31ac7ef Add tiled version of puts and gets at wavefront level to the functional test suite
* Implemented tiled version of put*_wave and get*_wave functions
* Maintain single kernel that supports both tiled and untiled versions
* Disable IPC in the default RO build script
2024-09-07 16:06:36 -07:00
avinashkethineedi d226922733 Add tiled version of puts and gets at the workgroup level to the functional test suite 2024-09-07 15:58:14 -07:00
avinashkethineedi ff954237dd add functional tests for puts and gets at wavefront level
* These functional tests are simple puts and gets, where every wave will get/put the same amount of data
* Enabled workgroup level puts and gets tests
2024-09-05 14:52:48 -07:00
avinashkethineedi 9c9ef4ffd3 Comment out ping pong test
* ping pong test fails sporadically
* issues with roc_shmem_wait_until
2024-08-28 12:40:51 -07:00