From 778bb2e78a14c44abbd5887a9828e9316fd100fa Mon Sep 17 00:00:00 2001
From: Rahul Garg
Date: Mon, 22 Aug 2016 11:37:37 +0530
Subject: [PATCH] Added initial draft for performance optimizations, started with unpinned memory transfers

Change-Id: Icbce2aec347d015bc66cc0c08f6193057bf36b4c
---
 hipamd/docs/markdown/hip_performance.md | 42 +++++++++++++++++++++++++
 1 file changed, 42 insertions(+)
 create mode 100644 hipamd/docs/markdown/hip_performance.md

diff --git a/hipamd/docs/markdown/hip_performance.md b/hipamd/docs/markdown/hip_performance.md
new file mode 100644
index 0000000000..98197b3db7
--- /dev/null
+++ b/hipamd/docs/markdown/hip_performance.md
@@ -0,0 +1,42 @@
+# HIP Performance Optimizations
+
+This document lists ways to experiment with the HIP stack to gain performance. Results may vary from platform to platform.
+
+### Unpinned Memory Transfer Optimizations
+
+#### On Small BAR Setup
+
+There are two possible ways to transfer data from Host to Device (H2D) and from Device to Host (D2H):
+ * Using Staging Buffers
+ * Using PinInPlace
+
+#### On Large BAR Setup
+
+There are three possible ways to transfer data from Host to Device (H2D):
+ * Using Staging Buffers
+ * Using PinInPlace
+ * Direct Memcpy
+
+And there are two possible ways to transfer data from Device to Host (D2H):
+ * Using Staging Buffers
+ * Using PinInPlace
+
+Some GPUs may not be able to directly access host memory. In these cases the H2D and D2H copies are
+staged through an optimized pinned staging buffer. The copy is broken into buffer-sized chunks, which limits the size of the buffer and improves performance by overlapping the CPU copies with the DMA copies.
+
+PinInPlace is another algorithm, which pins the host memory "in-place" and copies it with the DMA
+engine.
+
+By default, staging buffers are used for unpinned memory transfers. The other paths can be selected by setting a few environment variables, so there is no need to rebuild the code.
+
+The following environment variables can be used:
+
+- HIP_PININPLACE - Forces the use of PinInPlace logic for all unpinned memory copies.
+
+- HIP_OPTIMAL_MEM_TRANSFER - Enables a hybrid memory copy logic based on thresholds. The thresholds can be set with the following environment variables:
+    - HIP_H2D_MEM_TRANSFER_THRESHOLD_STAGING_OR_PININPLACE - Threshold in bytes for H2D copies. Sizes smaller than the threshold use the staging buffer logic; other sizes use the PinInPlace logic.
+    - HIP_H2D_MEM_TRANSFER_THRESHOLD_DIRECT_OR_STAGING - Threshold in bytes for H2D copies. Sizes smaller than the threshold use the direct copy logic; other sizes use the staging buffer logic.
+    - HIP_D2H_MEM_TRANSFER_THRESHOLD - Threshold in bytes for D2H copies. Sizes smaller than the threshold use the staging buffer logic; other sizes use the PinInPlace logic.
+
+
+
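As a rough illustration of the hybrid logic described for HIP_OPTIMAL_MEM_TRANSFER, the C++ sketch below picks a copy path from the transfer size. It is not the HIP runtime's implementation. `CopyPath`, `Thresholds`, `envBytes`, `loadThresholds`, `chooseH2D`, and `chooseD2H` are hypothetical names, and the fallback threshold values are placeholders rather than real defaults. Combining the two H2D thresholds into a three-way split, with direct copy considered only on a large BAR setup, is an assumption drawn from the descriptions above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical types and helpers; only the HIP_* variable names come from the document.
enum class CopyPath { Direct, Staging, PinInPlace };

struct Thresholds {
    std::size_t h2dDirectOrStaging;      // below this: direct copy (large BAR only)
    std::size_t h2dStagingOrPinInPlace;  // below this: staging buffers, else PinInPlace
    std::size_t d2h;                     // below this: staging buffers, else PinInPlace
};

// Read a byte count from the environment; fall back when unset or not a plain number.
std::size_t envBytes(const char* name, std::size_t fallback) {
    const char* value = std::getenv(name);
    if (value == nullptr || *value == '\0') return fallback;
    char* end = nullptr;
    unsigned long long parsed = std::strtoull(value, &end, 10);
    return (*end == '\0') ? static_cast<std::size_t>(parsed) : fallback;
}

// The fallback values are illustrative placeholders, not the runtime's defaults.
Thresholds loadThresholds() {
    return {
        envBytes("HIP_H2D_MEM_TRANSFER_THRESHOLD_DIRECT_OR_STAGING", 64u * 1024u),
        envBytes("HIP_H2D_MEM_TRANSFER_THRESHOLD_STAGING_OR_PININPLACE", 1024u * 1024u),
        envBytes("HIP_D2H_MEM_TRANSFER_THRESHOLD", 1024u * 1024u),
    };
}

// Assumed H2D ordering: direct < staging < PinInPlace as the size grows.
CopyPath chooseH2D(std::size_t bytes, const Thresholds& t, bool largeBar) {
    assert(t.h2dDirectOrStaging <= t.h2dStagingOrPinInPlace);
    if (largeBar && bytes < t.h2dDirectOrStaging) return CopyPath::Direct;
    if (bytes < t.h2dStagingOrPinInPlace) return CopyPath::Staging;
    return CopyPath::PinInPlace;
}

// D2H has a single threshold: staging below it, PinInPlace at or above it.
CopyPath chooseD2H(std::size_t bytes, const Thresholds& t) {
    return bytes < t.d2h ? CopyPath::Staging : CopyPath::PinInPlace;
}
```

With thresholds of, say, 4096 and 65536 bytes for H2D, a 1 KiB copy on a large BAR setup would take the direct path in this sketch, a 16 KiB copy would be staged, and a 1 MiB copy would be pinned in place. The best values depend on the platform and should be found by measurement.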