inline asm

This tutorial shows how to use inline GCN assembly in a kernel, using the simple Matrix Transpose sample as the example.

Introduction:

If you want the extra performance that hand-written assembly can offer, or need special GPU hardware features that are only available through assembly, this tutorial is for you. It explains how to start writing inline asm in a kernel.

For more insight, please read the following blog posts by Ben Sander:

The Art of AMDGCN Assembly: How to Bend the Machine to Your Will
AMD GCN Assembly: Cross-Lane Operations

For more information:

AMD GCN3 ISA Architecture Manual
User Guide for AMDGPU Back-end

Requirements:

For hardware requirements and software installation, see Installation.

Prerequisite knowledge:

Programmers familiar with CUDA or OpenCL will be able to quickly learn and start coding with the HIP API. If you are not, don't worry: you have picked a good place to start. We'll explain everything assuming you are completely new to GPGPU programming.

Simple Matrix Transpose

We will be using the Simple Matrix Transpose application from our very first tutorial.

asm() Assembler statement

We insert GCN ISA instructions into the kernel using the asm() assembler statement. In the same source code we used for MatrixTranspose, we'll add the following:

asm volatile ("v_mov_b32_e32 %0, %1" : "=v" (out[x*width + y]) : "v" (in[y*width + x]));
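
In this statement, v_mov_b32 copies one 32-bit value between vector registers, and the "v" constraints ask the compiler to place both operands in vector registers. The line therefore performs the same element copy as out[x*width + y] = in[y*width + x]. As a minimal sketch of how one might check the kernel's result on the host, the plain C++ reference below applies the same index mapping on the CPU. It assumes a square, row-major matrix, and the names transposeReference and countMismatches are hypothetical helpers, not part of the sample.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Host-side reference for a square width x width matrix stored row-major.
// Applies the same index mapping as the asm statement:
//   out[x*width + y] = in[y*width + x]
// (hypothetical helper, not taken from the sample source)
std::vector<float> transposeReference(const std::vector<float>& in, std::size_t width) {
    assert(in.size() == width * width);
    std::vector<float> out(in.size());
    for (std::size_t y = 0; y < width; ++y) {
        for (std::size_t x = 0; x < width; ++x) {
            out[x * width + y] = in[y * width + x];
        }
    }
    return out;
}

// Counts the positions where a result copied back from the GPU differs
// from the host reference (hypothetical helper).
std::size_t countMismatches(const std::vector<float>& gpu, const std::vector<float>& ref) {
    assert(gpu.size() == ref.size());
    std::size_t mismatches = 0;
    for (std::size_t i = 0; i < gpu.size(); ++i) {
        if (gpu[i] != ref[i]) {
            ++mismatches;
        }
    }
    return mismatches;
}
```

Since the instruction is a plain copy with no arithmetic, an exact comparison against the reference seems reasonable here; a result of zero mismatches would suggest the inline asm kernel behaves like the original transpose.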

How to build and run:

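Build the sample with the make command and run the result with ./exe. The application is built with hipcc, which uses hcc on AMD platforms and nvcc on NVIDIA platforms. Because the inline assembly here is GCN ISA, this sample targets AMD GPUs.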

More Info:

HIP FAQ
HIP Kernel Language
HIP Runtime API (Doxygen)
HIP Porting Guide
HIP Terminology (including Rosetta Stone of GPU computing terms across CUDA/HIP/HC/AMP/OpenCL)
clang-hipify
Developer/CONTRIBUTING Info
Release Notes