* added RCCL version using rccl-tests
* Added function to get RCCL version from rccl-tests
* removed whitespace
* Added RCCL version
* Updated readme and fixed formatting
* removed debug prints
* Initial Script ready for review
* Added RCCL-tests and RCCL versions
* Added output folder and README
* Base format built
* Added ROCm version
* Added function to center titles and VRAM information
* Added HIP version
* Cleaned formatting
* Added UCX version and MPI version
* Added NUMA balancing
* Added rocminfo
* Removed notes
* Changed regex for Broadcom NIC
* Removed note by the ACS info
* Added Hostname to summary and details
* Print summary to terminal
* Added argparse
* Added flags and readme
* Added GPU ID
* fixed spelling
* renamed script again
* Added file descriptor and locked mem checks
* Removed extra spaces from summary table
* printing output file location
* Removed sudo in code and ACS flag
* Add another rome model and override
* Fix bug
* Fix typo
* Add ring
* Update ring
* Fix model matching
* Clean up
* Clean up
* Reverse rings for NCCL_RINGS input
* Only reverse NCCL_RINGS for ring graph
* Fix mapping issue when using NCCL_RINGS
* Add NCCL_RINGS_REMAP to handle inconsistent net names
* adding rocprof parser script
* adding support for multiple JSON files
* adding PyTorch profiler script
* remove filtering from PyTorch log
* addressing the comments and adding the feature to parse all kernels
* completing the report for torch profiler
---------
Co-authored-by: Marzieh Berenjkoub <mberenjk@amd.com>
* Add 1H16P GPU model
* Implement NIC identification and remapping
* Revert "Sort IB devices based on device name (#413)"
This reverts commit 2d0ed8dff6.
* Fix permute and check order
* Correction on IB speed reporting
* Revert "Allow user to link layer with RCCL_IB_HCA_SKIP_LINK_LAYER (#361)"
This reverts commit caf5c9992a.
* Add another Rome model
* Add gfx908 4P3L models and support
* Revert "Use cached value for detecting GDR support only once"
This reverts commit 67c8e72ce3.
* Skip using ibverb for GPU direct RDMA detection
* Fine-tune one Rome model