Commit Graph

48 Commits

Author SHA1 Message Date
corey-derochie-amd 503a472a25 Replaced ROCmSoftwarePlatform and RadeonOpenCompute links with ROCm links. (#1125) 2024-03-25 16:29:13 -06:00
gilbertlee-amd 10dbd2a452 Fixing formatting for copywrite (#638) 2022-10-19 13:43:21 -06:00
gilbertlee-amd ebb8b5bf63 Updating files for missing licenses (#637) 2022-10-14 13:49:16 -06:00
gilbertlee-amd bd7d589446 Removing TransferBench from tools (#632)
Point to new TransferBench repo
2022-09-30 11:53:32 -06:00
gilbertlee-amd 685bcea127 [TransferBench] Syncing with TransferBench v1.02 (#541) 2022-04-27 20:43:24 -06:00
gilbertlee-amd def6832287 Transfer bench single stream mode (#531)
- Adding single stream mode
- Removing some unused env vars
- Adding output to CSV mode for p2p benchmark, topology listing modes
2022-04-08 15:20:55 -06:00
gilbertlee-amd 2d558c9abc Adding explicit request for coarse-grained host memory due to changes in HipHostMalloc (#517) 2022-03-25 13:05:07 -06:00
gilbertlee-amd f3c2cafd9d [TransferBench] Fix for cases with subsets of configured numa nodes (#495) 2022-02-07 12:16:19 -07:00
gilbertlee-amd 84d5fce7dd TransferBench: Adding ability to reindex GPUs based on PCIe address (#494) 2022-02-02 08:51:41 -07:00
gilbertlee-amd 2530a2f084 [TransferBench] Updating for 2.11.4. Decoupling from RCCL kernel (#485) 2022-01-05 16:33:25 -07:00
Wenkai Du 434ecb0e1f Merge remote-tracking branch 'origin/develop' into 2.11.4 2022-01-03 09:54:16 -08:00
gilbertlee-amd 1157c2edfe [TransferBench] Adding more preset benchmarks to filter read mode, cpu vs gpu pairs (#477) 2021-11-24 18:05:37 -07:00
Wenkai Du 3a919c1f49 Merge remote-tracking branch 'nccl/master' into develop 2021-11-11 14:22:12 -08:00
gilbertlee-amd 1c7ef1b790 [TransferBench] Adding #CUs / RRLW mode to p2p benchmark (#464) 2021-11-08 14:36:04 -07:00
gilbertlee-amd 18246fc191 [TransferBench] Changing default per block multiple to 256B, adding BLOCK_BYTES env var (#446) 2021-10-25 11:23:29 -06:00
gilbertlee-amd 550d732d6c TransferBench p2p benchmark mode (#444)
* [TransferBench] Adding a p2p benchmark mode
* [TransferBench] Switching to using single sync mode by default (USE_SINGLE_SYNC=1)
2021-10-21 15:28:16 -06:00
gilbertlee-amd f6b7ac693e [TransferBench] Adding comment echoing to help distinguish tests (#438) 2021-10-13 14:56:57 -06:00
gilbertlee-amd 269f07fbc3 [TransferBench] Adding shared memory per threadblock env var. Defaulting to 1 threadblock per CU (#436) 2021-10-12 09:32:54 -06:00
gilbertlee-amd aa917c3fc8 [TransferBench] Adding ability to specify suffix for numBytes (#435) 2021-10-08 16:36:19 -06:00
gilbertlee-amd e506d14d18 [TransferBench] Fixing advanced config, adding new all-1-hop sample test (#433)
* [TransferBench] Fixing advanced config, adding new all-1-hop sample test
2021-10-07 15:57:21 -06:00
Gilbert Lee 68ec3f84e6 [TransferBench] Update to 2.10.3 2021-08-02 05:53:20 -05:00
gilbertlee-amd 51d64894ff [TransferBench] ConfigFile parsing fixes, adding additional info (#422)
* [TransferBench] Adding GPU to NUMA distance detection, parsing fixes, config file generation fix

* [TransferBench] Fixing up NUMA node detection by filtering pools
2021-09-07 15:28:16 -06:00
gilbertlee-amd 1ed272e5f0 [TransferBench] Removing dependency on hip_fp16 header, fixing swapped output CSV header (#416) 2021-08-04 10:53:41 -06:00
gilbertlee-amd 2b0b608270 [TransferBench] Fixing a typo in TransferBench usage example (#401) 2021-06-22 17:08:57 -06:00
gilbertlee-amd 720374a767 [TransferBench] Switching from little-endian fill pattern to big-endian (#399) 2021-06-21 14:28:51 -06:00
gilbertlee-amd ff413be933 [TransferBench] Adding ability to specify source data pattern (#394)
* [TransferBench] Adding ability to specify source data pattern
2021-06-15 08:41:57 -06:00
Gilbert Lee f372c53d52 [TransferBench] Fixing some merge issues 2021-02-05 16:46:20 +00:00
Wenkai Du ab1e7a0318 Merge remote-tracking branch 'origin/develop' into 2.8.3 2021-02-04 20:02:34 -05:00
Gilbert Lee 9ce203dd0a [TransferBench] Updating for 2.8.3 2021-02-04 18:58:25 +00:00
gilbertlee-amd 62e0447e9a [TransferBench] Restore some previous fixes - memory leak, PCIe address (#314) 2021-02-01 09:48:09 -07:00
gilbertlee-amd 41c35dad48 [TransferBench] Fixing bug with fine-grained memory allocation (#311)
* Fixing bug with fine-grained memory
2020-12-15 17:37:31 -07:00
gilbertlee-amd ae0c4092c7 [TransferBench] Adding ability to perform CPU-executed copies, various upgrades (#309)
* Adding CPU based execution, fixing typos, adding Fine-grained mem
* Exposing sampling factor when generating range of data sizes
* Refactoring how Links are launched, now once per thread
* Documentation updates
2020-12-11 10:21:14 -07:00
gilbertlee-amd b80ae551b1 [TransferBench] Support multiple of 4 byte sizes, changing default GPU timing mechanism (#307)
* Changing default timing mechanism, adjusting CPU bandwidth calc, adding flag to use combined timing
* Adding support for smaller transfers (byte size must be multiple of 4 instead of 128)
2020-12-04 14:57:13 -07:00
gilbertlee-amd bfab1d3592 Adding output to CSV, removing OpenMP, decreasing default numBytes to 64MB, adding aggregate stats (#290) 2020-10-27 09:00:33 -06:00
gilbertlee-amd 61e1a71d14 [TransferBench] Displaying PCIe Bus ID (#288)
* Adding PCIe BusID per GPU in topology display
2020-10-21 16:13:36 -06:00
gilbertlee-amd 769418c5c7 TransferBench Typo. Pinned host memory uses C not P (#286) 2020-10-21 12:05:38 -06:00
gilbertlee-amd ee262819a7 New TransferBench features (#273)
* Upgrading TransferBench to support pinned CPU memory, expanding functionality, cleaning up env vars
2020-09-25 12:20:48 -06:00
gilbertlee-amd ec9af40fcd Upgrading various TransferBench features (#257) 2020-08-19 09:47:19 -06:00
gilbertlee-amd c985478133 Fixes to make TransferBench compile for hipclang (#254) 2020-08-13 12:25:28 -06:00
Gilbert Lee 339bf9ff19 Adding option to re-use streams instead of re-creating per topology 2020-04-23 15:53:40 +00:00
Aaron Enye Shi a95090d981 Fix HIP-Clang build with HSA headers
HIP-Clang does not include these HSA headers, and they need to be explicitly added in RCCL.
2020-04-03 17:58:23 -04:00
Stanley Tsang 20fa04d9b6 Updating copyright notices for 2020. 2020-01-29 15:28:08 -08:00
Gilbert Lee e5074ce94d Changing single sync mode to time all iterations instead of just last 2019-12-20 17:08:39 -08:00
gilbertlee-amd 2f4269d06d Adding new sleep after sync capability for data fabric profiling (#162)
Fixing missing header include for ROCM 3.0 changes
2019-12-12 15:20:54 -07:00
gilbertlee-amd fd94f4fa25 Adding interactive mode for profiling purposes (#150) 2019-11-05 17:10:16 -07:00
gilbertlee-amd 2f9edd2432 Single Sync Timing mode (#144)
* Adding single sync timing mode to emulate timing reported by rccl-prim-test / rccl-tests
* Adding duration / overhead info
2019-11-01 10:18:25 -06:00
Gilbert Lee 648c1ee7cc Adding ability to switch between fine/coarse grain destination GPU memory
Adding ability to switch between memset/memcpy
2019-10-29 12:00:32 -06:00
gilbertlee-amd b8cf48fc16 Adding TransferBench tool (#113)
* Adding standalone TransferBench tool
2019-08-07 17:21:41 -06:00