diff --git a/projects/hip/CONTRIBUTING.md b/projects/hip/CONTRIBUTING.md
index 060e70e519..6adf73bdf5 100644
--- a/projects/hip/CONTRIBUTING.md
+++ b/projects/hip/CONTRIBUTING.md
@@ -94,7 +94,7 @@ Differences or limitations of HIP APIs as compared to CUDA APIs should be clearl
 - Code Indentation:
 - Tabs should be expanded to spaces.
 - Use 4 spaces indendation.
-- Capitaliziation and Naming
+- Capitalization and Naming
 - Prefer camelCase for HIP interfaces and internal symbols. Note HCC uses _ for separator. This guideline is not yet consistently followed in HIP code - eventual compliance is aspirational.
 - Member variables should begin with a leading "_". This allows them to be easily distinguished from other variables or functions.
@@ -110,6 +110,7 @@ Differences or limitations of HIP APIs as compared to CUDA APIs should be clearl
 doFooElse();
 }
 '''
+ - namespace should be on same line as { and separated by a space.
 - Single-line if statement should still use {/} pair (even though C++ does not require).
 - Miscellaneous
 - All references in function parameter lists should be const.
diff --git a/projects/hip/docs/markdown/hip_kernel_language.md b/projects/hip/docs/markdown/hip_kernel_language.md
index ba52b88433..c9df5ce1b1 100644
--- a/projects/hip/docs/markdown/hip_kernel_language.md
+++ b/projects/hip/docs/markdown/hip_kernel_language.md
@@ -669,20 +669,23 @@ The following C++ features are not supported:
 - Try/catch
 ## Kernel Compilation
-HIP now supports compiling C++/HIP kernels to binary. Eventhough HIP does not support fatbinary (yet), the user can specify the target for which the binary can be generated. The file format for binary is `.co` which means Code Object. The following command builds the binary using `hipcc`.
+hipcc now supports compiling C++/HIP kernels to binary code objects.
+The user can specify the target for which the binary can be generated. HIP/HCC does not yet support fat binaries so only a single target may be specified.
+The file format for binary is `.co` which means Code Object. The following command builds the code object using `hipcc`.
-`hipcc --genisa --target-isa=[TARGET GPU] [INPUT FILE] -o [OUTPUT FILE]`
+`hipcc --genco --target-isa=[TARGET GPU] [INPUT FILE] -o [OUTPUT FILE]`
 ```[TARGET GPU] = fiji/hawaii
 [INPUT FILE] = Name of the file containing kernels
 [OUTPUT FILE] = Name of the generated code object file```
-Note that the kernel file should have `int main(){}` at the end it so that the binary is generated. This happens because HCC generates binaries at linking time rather than compilation
+Note that the kernel file should have `int main(){}` at the end it so that the binary is generated. This happens because HCC generates binaries at linking time rather than compilation.
-You need 3 things to run kernel in binary.
-1. Kernel Binary
-2. Name of kernel binary
-3. Name of the kernel
-
-We already got first two of them. In order to get name of the kernel, try `objdump -x [OUTPUT FILE]`. OUTPUT FILE is file generated by hipcc during kernel compilation. The output from objdump has symbol to the kernel whose name is mangled with `grid_launch_parm`, `__functor`, `__cxxamp_trampoline`. An example of how it looks is `ZN12_GLOBAL__N_137_Z3Cpy16grid_launch_parmPfS0__functor19__cxxamp_trampolineEiiiiiiPKfPf` where `Cpy` is the name of the kernel written in C++.
+To load a kernel into HIP, we need both the code object and the name of the kernel stored within the code object.
+In order to get name of the kernel, use:
+```
+$ objdump -x [CODE_OBJECT_FILE]`.
+```
+CODE_OBJECT_FILE is file generated by hipcc during kernel compilation. The output from objdump has symbol to the kernel whose name is mangled with `grid_launch_parm`, `__functor`, `__cxxamp_trampoline`. An example of how it looks is `ZN12_GLOBAL__N_137_Z3Cpy16grid_launch_parmPfS0__functor19__cxxamp_trampolineEiiiiiiPKfPf` where `Cpy` is the name of the kernel written in C++.
The hipLoadKernelModule API needs to specify this mangled name on the HIP/hcc path.
+
diff --git a/projects/hip/docs/markdown/hip_performance.md b/projects/hip/docs/markdown/hip_performance.md
index 98197b3db7..bd550c9255 100644
--- a/projects/hip/docs/markdown/hip_performance.md
+++ b/projects/hip/docs/markdown/hip_performance.md
@@ -12,7 +12,7 @@ There are two possible ways to transfer data from Host to Device (H2D) and Devic
 #### On Large BAR Setup
-There are two possible ways to transfer data from Host to Device (H2D)
+There are three possible ways to transfer data from Host to Device (H2D)
 * Using Staging Buffers
 * Using PinInPlace
 * Direct Memcpy
@@ -24,12 +24,9 @@ There are two possible ways to transfer data from Host to Device (H2D)
 Some GPUs may not be able to directly access host memory, and in these cases we need to stage the copy through an optimized pinned staging buffer, to implement H2D and D2H copies.The copy is broken into buffer-sized chunks to limit the size of the buffer and also to provide better performance by overlapping the CPU copies with the DMA copies.
-PinInPlace is another algorithm which pins the host memory "in-place", and copies it with the DMA
-engine.
+PinInPlace is another algorithm which pins the host memory "in-place", and copies it with the DMA engine.
-By default staging buffers are used for unpinned memory transfers, however other ways can be used by enabling few environment variables (so no need to build the code again!!!)
-
-Following environment variables can be used:
+By default staging buffers are used for unpinned memory transfers.
Environment variables allow control over the unpinned copy algorithm and parameters:
 - HIP_PININPLACE - This environment variable forces the use of PinInPlace logic for all unpinned memory copies
diff --git a/projects/hip/include/hcc_detail/hip_runtime_api.h b/projects/hip/include/hcc_detail/hip_runtime_api.h
index 029682a341..213928151a 100644
--- a/projects/hip/include/hcc_detail/hip_runtime_api.h
+++ b/projects/hip/include/hcc_detail/hip_runtime_api.h
@@ -726,7 +726,7 @@ hipError_t hipHostAlloc(void** ptr, size_t size, unsigned int flags) __attribute
 hipError_t hipHostGetDevicePointer(void** devPtr, void* hstPtr, unsigned int flags) ;
 /**
- * @brief Get flags associated with host pointer
+ * @brief Return flags associated with host pointer
  *
  * @param[out] flagsPtr Memory location to store flags
  * @param[in] hostPtr Host Pointer allocated through hipHostMalloc
@@ -1186,13 +1186,12 @@ hipError_t hipCtxGetSharedMemConfig ( hipSharedMemConfig * pConfig );
 hipError_t hipCtxSynchronize ( void );
 /**
- * @brief Get flags used for creating current/default context.
+ * @brief Return flags used for creating default context.
  *
  * @param [out] flags
  *
  * @returns #hipSuccess.
  */
-
 hipError_t hipCtxGetFlags ( unsigned int* flags );
 /**