29 Commits

Author SHA1 Message Date
Bertan Dogancay f35777e9b0 improve compilation time and create timetrace plot (#773)
* improve compilation time and create time-trace plot

* set default value for nproc
2023-06-14 09:17:51 -06:00
gilbertlee-amd 777d8747a5 Refactoring CMakeFiles (#755) 2023-05-25 16:08:54 -06:00
akolliasAMD 58db1cb96d updated install script to enable all of npkit (#754) 2023-05-24 14:44:01 -06:00
akolliasAMD 9fe5a349f1 added npkit_enable on CI tests (#698) 2023-04-05 08:05:23 -06:00
PedramAlizadeh 45872d170f Changed the name of UnitTests to rccl-UnitTests (wrapper executable included). 2022-12-13 21:45:57 +00:00
Nirmal Unnikrishnan 676a4737c1 File reorganization as per the new defined standard
The header files will be in /opt/rocm-xxx/include/rccl.
Libraries and cmake will be in /opt/rocm-xxx/lib folder.
Added wrappers for header files using rocm-cmake functions for backward compatibility.
2022-03-08 17:32:02 +00:00
gilbertlee-amd 29ad0f5fbe Unit test refactor (#500)
Refactoring and consolidating single-process / multi-process unit testing
2022-02-25 08:59:07 -07:00
Stanley Tsang 7b8b54955b Set ROCM_PATH CMake variable in install script (#470)
* Fixing cmake_install_prefix search to include /opt/rocm-xxxx

* Removing all hard references to /opt/rocm with ROCM_PATH

* Setting ROCM_PATH CMake variable in install script
2021-11-18 14:44:19 -07:00
Stanley Tsang 7e55b211c5 Build AllReduce only mode (#443)
* Initial commit of all_reduce_only support

* Working AllReduce only build

* Removing printfs and restoring release build

* Restore P2P index

* Updates to build_allreduce_only mode.

* cleaning up macro ifdefs
2021-10-26 17:36:46 -06:00
Eiden Yoshida eea7b24058 Add address sanitizer build option (#389) 2021-06-10 09:14:54 -06:00
Stanley Tsang 820a53287f Fixing install script so that invoking -r alone does not trigger rebuild (#382) 2021-06-04 09:46:04 -06:00
Stanley Tsang 5d3b1fa4d0 Temporarily disabling multiprocess unit tests 2021-02-25 23:59:44 +00:00
Wenkai Du 3a1aebd742 Merge remote-tracking branch 'rccl/develop' into 2.8.3 2021-02-15 13:17:38 -05:00
pramenku e9f7908592 Update install.sh (#317)
* Update install.sh

install.sh hard-codes paths such as /opt/rocm/bin/hipcc for rocm_path and default_path=/opt/rocm.
This works only with a standalone ROCm install. Anyone with a side-by-side install will hit the error below.

Can we use ROCM_PATH=$ROCM_PATH instead of "default_path" as the variable name, and
ROCM_BIN_PATH=$ROCM_PATH/bin, so that rocm_path can be replaced with ROCM_BIN_PATH?

This way, ROCM_PATH can be exported as an environment variable as needed before running the script.
I have also tried this locally and it works. ROCM_PATH is the common variable name we already use.

If you are OK with this, I can also submit a PR for it.


Error when side-by-side install is done for driver.
# ./install.sh -dtr 2>&1 | tee /dockerx/6519_rccl-test.log
CMake Error at /usr/share/cmake/Modules/CMakeDetermineCXXCompiler.cmake:48 (message):
Could not find compiler set in environment variable CXX:
/opt/rocm/bin/hipcc.
Call Stack (most recent call first):
CMakeLists.txt:12 (project)

CMake Error: CMAKE_CXX_COMPILER not set, after EnableLanguage
-- Configuring incomplete, errors occurred!
See also "/root/driver/rccl/build/release/CMakeFiles/CMakeOutput.log".

* Update install.sh

Removed ROCM_PATH=$ROCM_PATH

* Update install.sh

Set default value if external value is not supplied.
2021-02-12 08:44:30 -08:00
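The last change in the install.sh commit above ("Set default value if external value is not supplied") can be illustrated with a minimal sketch. This is not the actual contents of install.sh: the helper name resolve_rocm_path is hypothetical, and the /opt/rocm fallback and the ROCM_BIN_PATH variable are taken from the commit message.

```shell
#!/usr/bin/env bash
# Illustrative sketch only; not the real install.sh.

# Hypothetical helper: print the supplied path, or /opt/rocm when the
# argument is empty or missing.
resolve_rocm_path() {
  echo "${1:-/opt/rocm}"
}

# Honor an exported ROCM_PATH; otherwise fall back to the default.
ROCM_PATH="$(resolve_rocm_path "${ROCM_PATH:-}")"

# Derive the bin directory and compiler path from ROCM_PATH instead of
# hard-coding /opt/rocm/bin/hipcc.
ROCM_BIN_PATH="${ROCM_PATH}/bin"
CXX="${ROCM_BIN_PATH}/hipcc"

echo "ROCM_PATH=${ROCM_PATH}"
echo "CXX=${CXX}"
```

With a side-by-side install, the caller would export ROCM_PATH to the versioned directory before running the script, and the compiler path follows from it.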
Stanley Tsang d00b7d17bd Update MP UT to support arbitrary # of GPUs; multiple bugfixes (#16)
* Fixing temp file creation/deletion for Clique kernel mode.

* Refactoring of MP unit tests; include bugfixes and general support for any number of GPUs

* GroupCall MP UT properly quits when too many devices specified

* MP UT will programmatically set NCCL_COMM_ID if not specified; updated install script
2021-02-05 16:49:25 -08:00
Stanley Tsang d3fa257682 Adding multiprocess unit tests (#312)
Adding multiprocess unit tests for collectives.  

To run, NCCL_COMM_ID=$HOSTNAME:12345 build/release/test/UnitTestsMultiProcess
2021-01-15 16:34:36 -07:00
Stanley Tsang 8c90aefb6d Adding the ability to force install dependencies (namely gtest); gtest library installation fix for centos (#265)
* Adding the ability to force install dependencies (namely gtest); gtest library installation fix for centos

* Removing potentially unnecessary dependencies from install script
2020-09-10 17:27:22 -06:00
Stanley Tsang c5d4d9eb76 Adding static library building option. (#244)
* Adding static library building option.

* Disabling running tests for static build

* Removing static packaging in CI

Co-authored-by: Saad Rahim <saad.rahim@amd.com>
2020-08-06 11:19:43 -06:00
Stanley Tsang 684f3e6af4 Adding better naming to unit tests for filtering; adding short and full unit test suites (#235) 2020-07-21 12:19:47 -06:00
gilbertlee-amd f87ba17737 Removing UnitTest as install, removing unused env var (#231) 2020-07-10 09:30:28 -06:00
Stanley Tsang 8d21adb5e3 Documentation updates for NCCL 2.7.0 (#219)
* Making hip-clang the default compiler; documentation update

* Adding back --hip-clang to install.sh as a silent option for CI

* Documentation updates for NCCL 2.7

* Restoring deleted line in install script
2020-06-16 16:48:11 -06:00
saadrahim 87db65f22d Fixing CI as install.sh script should not install dependencies without user request (#217) 2020-06-05 11:04:03 -06:00
Stanley Tsang dc403e0ca2 Making hip-clang the default compiler; documentation update (#216)
* Making hip-clang the default compiler; documentation update

* Adding back --hip-clang to install.sh as a silent option for CI
2020-06-04 11:58:27 -06:00
Stanley Tsang 20fa04d9b6 Updating copyright notices for 2020. 2020-01-29 15:28:08 -08:00
Akila Premachandra f48ae5c98d Added hip-clang options to install script, and openmp/pthread options to CMakeLists.txt 2019-08-23 22:02:42 +00:00
Stanley Tsang 329a62a01f Fixing install script to actually install library when requested (#88)
* Fixing install script to actually install library when requested.  Cleaning up unused code.

Removing unused arguments from install script.

Fixing weird whitespacing

* Fixing install script to install to correct location /opt/rocm, now creates symlink in /opt/rocm/lib

* Updates and corrections to README and install script
2019-06-25 17:25:21 -06:00
saadrahim 4c4351673b Jenkinsfile (#65)
* Changing Jenkinsfile to support runs without docker
* Updating install file for build options
* Fixing command execution
* Fixing Jenkinsfile
* fixing test execution
* Removing junit search
2019-05-22 15:32:32 -06:00
Wenkai Du e517dbed5c By default will not build test program 2019-05-20 18:37:58 +00:00
Gilbert Lee 55a4b22ad7 Updating RCCL based on NCCL 2.3.7
- Contains modifications to support AMD hardware
- Adds unit tests
2019-05-16 16:16:18 +00:00