@@ -1,7 +1,32 @@
# FAQ

<!-- START doctoc generated TOC please keep comment here to allow auto update -->
<!-- DON'T EDIT THIS SECTION, INSTEAD RE-RUN doctoc TO UPDATE -->
**Table of Contents** *generated with [DocToc](https://github.com/thlorenz/doctoc)*

### What APIs does HIP support?
- [FAQ](#faq)
- [Table of Contents](#table-of-contents)
- [What APIs does HIP support?](#what-apis-does-hip-support)
- [What is not supported?](#what-is-not-supported)
- [Run-time features:](#run-time-features)
- [How does HIP compare with OpenCL?](#how-does-hip-compare-with-opencl)
- [What hardware does HIP support?](#what-hardware-does-hip-support)
- [Does Hipify automatically convert all source code?](#does-hipify-automatically-convert-all-source-code)
- [What is NVCC?](#what-is-nvcc)
- [What is HCC?](#what-is-hcc)
- [Why use HIP rather than supporting CUDA directly?](#why-use-hip-rather-than-supporting-cuda-directly)
- [Can I develop HIP code on an Nvidia CUDA platform?](#can-i-develop-hip-code-on-an-nvidia-cuda-platform)
- [Can I develop HIP code on an AMD HCC platform?](#can-i-develop-hip-code-on-an-amd-hcc-platform)
- [Can a HIP binary run on both AMD and Nvidia platforms?](#can-a-hip-binary-run-on-both-amd-and-nvidia-platforms)
- [Hmmm](#hmmm)
- [Link2 Is it ready?](#link2-is-it-ready)
- [What's the difference between HIP and hc?](#whats-the-difference-between-hip-and-hc)

<!-- END doctoc generated TOC please keep comment here to allow auto update -->
=================


### What APIs does HIP support?
HIP provides the following:
- Devices (hipSetDevice(), hipGetDeviceProperties(), etc.)
- Memory management (hipMalloc(), hipMemcpy(), hipFree())
@@ -15,7 +40,7 @@ HIP provides the following:
The HIP documentation describes each API and its limitations, if any, compared with the equivalent CUDA API.

### What is not supported?
#### Run-time features:
#### Run-time features
- Textures
- Dynamic parallelism
- Managed memory
@@ -23,7 +48,7 @@ The HIP documentation describes each API and its limitations, if any, compared w
- CUDA array, mipmappedArray and pitched memory
- CUDA Driver API

#### Kernel language features:
#### Kernel language features
- Device-side dynamic memory allocations (malloc, free, new, delete)
- Virtual functions, indirect functions and try/catch
- `__prof_trigger`
@@ -58,9 +83,10 @@ NVCC is Nvidia's compiler driver for compiling "CUDA C++" code into PTX or devic
### What is HCC?
HCC is AMD's compiler driver which compiles "heterogeneous C++" code into HSAIL or GCN device code for AMD GPUs. HCC is an open-source compiler based on recent versions of CLANG/LLVM.

### Why use HIP rather than supporting CUDA run time directly?
### Why use HIP rather than supporting CUDA directly?
While HIP is a strong subset of CUDA, it is still a subset. The HIP layer allows that subset to be clearly defined and documented.
Developers who code to the HIP API can be assured their code will remain portable across Nvidia and AMD platforms.
In addition, HIP defines portable mechanisms to query architectural features, and supports a larger 64-lane wavefront size, which expands the return type of cross-lane functions like ballot and shuffle from 32-bit ints to 64-bit ints.
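
As a rough illustration of why a wider wavefront needs a wider ballot result, the sketch below emulates a ballot on the host in plain C++. It is not the HIP device intrinsic: `emulate_ballot` and `needs_64_bits` are hypothetical helper names, and the only assumption taken from the text above is that a wavefront can have up to 64 lanes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Host-side emulation of a warp "ballot": bit i of the result is set when
// lane i's predicate is true. This is NOT the HIP device intrinsic, only an
// illustration of why the result width must match the wavefront size.
std::uint64_t emulate_ballot(const std::vector<bool>& lane_predicates) {
    // A 64-bit result can describe at most 64 lanes.
    assert(lane_predicates.size() <= 64);
    std::uint64_t mask = 0;
    for (std::size_t lane = 0; lane < lane_predicates.size(); ++lane) {
        if (lane_predicates[lane]) {
            mask |= std::uint64_t{1} << lane;
        }
    }
    return mask;
}

// True when the mask would lose information if truncated to 32 bits.
bool needs_64_bits(std::uint64_t mask) {
    return (mask >> 32) != 0;
}
```

With 32 lanes the mask always fits in 32 bits; as soon as a lane above 31 votes true, the upper half of the 64-bit result is needed.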

### Can I develop HIP code on an Nvidia CUDA platform?
Yes! HIP's CUDA path only exposes the APIs and functionality that work on both NVCC and HCC back-ends.
@@ -75,6 +101,7 @@ Yes! HIP's HCC path only exposes the APIs and functions that work on both NVCC a
### Can a HIP binary run on both AMD and Nvidia platforms?
HIP is a source-portable language that can be compiled to run on either the HCC or NVCC platform. HIP tools don't create a "fat binary" that can run on either platform, however.
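
One way to picture source portability without a fat binary is that the back-end is fixed when the source is compiled. The sketch below assumes the HIP toolchain defines `__HIP_PLATFORM_HCC__` or `__HIP_PLATFORM_NVCC__` for the selected back-end; `hip_backend_name` is a hypothetical helper, and with a plain host compiler neither macro is defined.

```cpp
#include <string>

// Assumption: the HIP toolchain defines one of these macros, depending on
// which back-end compiles the translation unit. The choice is baked in at
// compile time, so each binary targets a single platform.
std::string hip_backend_name() {
#if defined(__HIP_PLATFORM_HCC__)
    return "hcc";
#elif defined(__HIP_PLATFORM_NVCC__)
    return "nvcc";
#else
    return "none";  // built without a HIP back-end selected
#endif
}
```

Running on the other platform therefore means recompiling the same source, not reusing the binary.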


### What's the difference between HIP and hc?
HIP is a portable C++ language that supports a strong subset of the CUDA run-time APIs and device-kernel language. It's designed to simplify CUDA conversion to portable C++. HIP provides a C-compatible run-time API, C-compatible kernel-launch mechanism, C++ kernel language and pointer-based memory management.

@@ -1,53 +1,54 @@
# HIP Kernel Language
<!-- START doctoc generated TOC please keep comment here to allow auto update -->
<!-- DON'T EDIT THIS SECTION, INSTEAD RE-RUN doctoc TO UPDATE -->
**Table of Contents** *generated with [DocToc](https://github.com/thlorenz/doctoc)*

- [Introduction](#introduction)
- [Function-Type Qualifiers](#function-type-qualifiers)
- [`__device__`](#__device__)
- [`__global__`](#__global__)
- [`__host__`](#__host__)
- [Calling `__global__` Functions](#calling-__global__-functions)
- [Kernel-Launch Example](#kernel-launch-example)
- [Variable-Type Qualifiers](#variable-type-qualifiers)
- [`__constant__`](#__constant__)
- [`__shared__`](#__shared__)
- [`__managed__`](#__managed__)
- [`__restrict__`](#__restrict__)
- [Built-In Variables](#built-in-variables)
- [Coordinate Built-Ins](#coordinate-built-ins)
- [warpSize](#warpsize)
- [Vector Types](#vector-types)
- [Short Vector Types](#short-vector-types)
- [dim3](#dim3)
- [Memory-Fence Instructions](#memory-fence-instructions)
- [Synchronization Functions](#synchronization-functions)
- [Math Functions](#math-functions)
- [Single Precision Mathematical Functions](#single-precision-mathematical-functions)
- [Double Precision Mathematical Functions](#double-precision-mathematical-functions)
- [Integer Intrinsics](#integer-intrinsics)
- [Texture Functions](#texture-functions)
- [Surface Functions](#surface-functions)
- [Timer Functions](#timer-functions)
- [Atomic Functions](#atomic-functions)
- [Caveats and Features Under-Development:](#caveats-and-features-under-development)
- [Warp Cross-Lane Functions](#warp-cross-lane-functions)
- [Warp Vote and Ballot Functions](#warp-vote-and-ballot-functions)
- [Warp Shuffle Functions](#warp-shuffle-functions)
- [Profiler Counter Function](#profiler-counter-function)
- [Assert](#assert)
- [Printf](#printf)
- [Device-Side Dynamic Global Memory Allocation](#device-side-dynamic-global-memory-allocation)
- [`__launch_bounds__`](#__launch_bounds__)
- [Register Keyword](#register-keyword)
- [Pragma Unroll](#pragma-unroll)
- [In-Line Assembly](#in-line-assembly)
- [C++ Support](#c-support)

<!-- END doctoc generated TOC please keep comment here to allow auto update -->

### Table of Contents
=================

* [HIP Kernel Language](#hip-kernel-language)
* [Table of Contents](#table-of-contents)
* [Function-Type Qualifiers](#function-type-qualifiers)
* [__device__](#__device__)
* [__global__](#__global__)
* [__host__](#__host__)
* [Calling __global__ Functions](#calling-__global__-functions)
* [Kernel-Launch Example](#kernel-launch-example)
* [Variable-Type Qualifiers](#variable-type-qualifiers)
* [__constant__](#__constant__)
* [__shared__](#__shared__)
* [__managed__](#__managed__)
* [__restrict__](#__restrict__)
* [Built-In Variables](#built-in-variables)
* [Coordinate Built-Ins](#coordinate-built-ins)
* [warpSize](#warpsize)
* [Vector Types](#vector-types)
* [Short Vector Types](#short-vector-types)
* [dim3](#dim3)
* [Memory-Fence Instructions](#memory-fence-instructions)
* [Synchronization Functions](#synchronization-functions)
* [Math Functions](#math-functions)
* [Single Precision Mathematical Functions](#single-precision-mathematical-functions)
* [Double Precision Mathematical Functions](#double-precision-mathematical-functions)
* [Integer Intrinsics](#integer-intrinsics)
* [Texture Functions](#texture-functions)
* [Surface Functions](#surface-functions)
* [Timer Functions](#timer-functions)
* [Atomic Functions](#atomic-functions)
* [Caveats and Features Under-Development:](#caveats-and-features-under-development)
* [Warp Cross-Lane Functions](#warp-cross-lane-functions)
* [Warp Vote and Ballot Functions](#warp-vote-and-ballot-functions)
* [Warp Shuffle Functions](#warp-shuffle-functions)
* [Profiler Counter Function](#profiler-counter-function)
* [Assert](#assert)
* [Printf](#printf)
* [Device-Side Dynamic Global Memory Allocation](#device-side-dynamic-global-memory-allocation)
* [__launch_bounds__](#__launch_bounds__)
* [Register Keyword](#register-keyword)
* [Pragma Unroll](#pragma-unroll)
* [In-Line Assembly](#in-line-assembly)
* [C++ Support](#c-support)

## Introduction

HIP provides a C++ syntax that is suitable for compiling most code that commonly appears in compute kernels, including classes, namespaces, operator overloading, templates and more. Additionally, it defines other language features designed specifically to target accelerators, such as the following:
- A kernel-launch syntax that uses standard C++, resembles a function call and is portable to all HIP targets

@@ -6,9 +6,9 @@ and provides practical suggestions on how to port CUDA code and work through com
### Table of Contents
=================

* [HIP Porting Guide](#hip-porting-guide)
* [Table of Contents](#table-of-contents)
* [Porting a New Cuda Project](#porting-a-new-cuda-project)
* [General Tips](#general-tips)
* [Scanning existing CUDA code to scope the porting effort](#scanning-existing-cuda-code-to-scope-the-porting-effort)
* [Distinguishing Compiler Modes](#distinguishing-compiler-modes)
@@ -42,7 +42,7 @@ and provides practical suggestions on how to port CUDA code and work through com
* [What Does This Error Mean?](#what-does-this-error-mean)
* [/usr/include/c++/v1/memory:5172:15: error: call to implicitly deleted default constructor of 'std::__1::bad_weak_ptr' throw bad_weak_ptr();](#usrincludecv1memory517215-error-call-to-implicitly-deleted-default-constructor-of-std__1bad_weak_ptr-throw-bad_weak_ptr)
* [grid_launch kernel dispatch - fallback](#grid_launch-kernel-dispatch---fallback)
* [Editor Highlighting](#editor-highlighting)


## Porting a New Cuda Project
@@ -95,6 +95,32 @@ info: TOTAL-converted 89 CUDA->HIP refs( dev:3 mem:32 kern:2 builtin:37 math:0 s
kernels (1 total) : kmeansPoint(1)
```

### Converting a project "in-place"

```shell
> hipify --inplace
```

For each input file FILE, this script will:
- If "FILE.prehip" does not exist, copy the original code to a new file with the extension ".prehip", then hipify the code file.
- If "FILE.prehip" exists, hipify FILE.prehip and save the result to FILE.

This is useful for testing improvements to the hipify toolset.
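
The two cases above can be sketched in C++17 as follows. This is only an illustration of the backup-then-regenerate logic, not the hipify script itself: `toy_hipify` is a hypothetical stand-in that merely rewrites the `cuda` prefix to `hip`, and `convert_in_place`, `read_file` and `write_file` are helper names invented for the example.

```cpp
#include <cstddef>
#include <filesystem>
#include <fstream>
#include <sstream>
#include <string>

namespace fs = std::filesystem;

// Hypothetical stand-in for the real hipify translation: it only rewrites
// the "cuda" prefix to "hip" so the example stays self-contained.
std::string toy_hipify(std::string text) {
    const std::string from = "cuda", to = "hip";
    for (std::size_t pos = text.find(from); pos != std::string::npos;
         pos = text.find(from, pos + to.size())) {
        text.replace(pos, from.size(), to);
    }
    return text;
}

std::string read_file(const fs::path& p) {
    std::ifstream in(p, std::ios::binary);
    std::ostringstream buf;
    buf << in.rdbuf();
    return buf.str();
}

void write_file(const fs::path& p, const std::string& text) {
    std::ofstream out(p, std::ios::binary | std::ios::trunc);
    out << text;
}

// In-place conversion of one file, following the two cases described above.
void convert_in_place(const fs::path& file) {
    fs::path backup = file;
    backup += ".prehip";  // FILE -> FILE.prehip
    if (!fs::exists(backup)) {
        // First run: preserve the original CUDA source.
        fs::copy_file(file, backup);
    }
    // Always translate from the pristine backup, so re-running after a
    // tool upgrade regenerates FILE from the original code.
    write_file(file, toy_hipify(read_file(backup)));
}
```

Because the second and later runs read from the `.prehip` copy, any earlier conversion result in FILE is simply overwritten, which is what makes repeated runs safe while the toolset is being improved.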


The "hipconvertinplace.sh" script performs in-place conversion of all code files in the specified directory.
This is handy when dealing with an existing CUDA code base, because the script preserves the existing directory structure
and filenames, so includes continue to work. After converting in place, you can review the code to add additional parameters to
directory names.

```shell
> hipconvertinplace.sh MY_SRC_DIR
```

## Distinguishing Compiler Modes
