1. Split hip_ir.ll into hip_hc.ll and hip_hc_gfx803.ll
a. hip_hc.ll contains arch-generic IR implementations
b. hip_hc_gfx803.ll contains IR specific to gfx803 (Fiji, Polaris)
2. HIPCC can now parse --amdgpu-target=*.
a. Usage: hipcc --amdgpu-target=gfx803 --amdgpu-target=gfx701
b. TODO: Convert to --amdgpu-target=gfx803,gfx701
3. Removed inline asm half math ops, now that llc in HCC can generate native f16 ISA
4. Fixed threadfence and threadfence_block to use functions in rocdl
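
The TODO in item 2b (a single --amdgpu-target=gfx803,gfx701 option) mostly comes down to splitting the option value on commas. A minimal sketch of that step, assuming the value has already been separated from the option name; splitTargets is a hypothetical helper and not part of hipcc:

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (not part of hipcc): split the value of one
// --amdgpu-target=a,b,c option into individual target names.
std::vector<std::string> splitTargets(const std::string& value) {
    std::vector<std::string> targets;
    std::stringstream stream(value);
    std::string item;
    while (std::getline(stream, item, ',')) {
        if (!item.empty()) {  // ignore empty fields such as "a,,b"
            targets.push_back(item);
        }
    }
    return targets;
}
```

With this approach the repeated form (--amdgpu-target=gfx803 --amdgpu-target=gfx701) could stay supported by appending the result of each occurrence to one list.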
Change-Id: Ic9a9e3e04139b0d75d2c2a263c030ca77adc1019
1. Removed the HIP_EXPERIMENTAL env variable so that device code is accessed from LLVM IR
2. Moved soft support out of the headers and into hip_fp16.cpp
3. Added LLVM IR + inline asm to hip_ir.ll
4. Added test for fp16
5. Added barriers between hcc 3.5 and hcc 4.0 for half support
a. hcc 4.0 can parse __fp16, but hcc 3.5 can't
b. The hcc 4.0 code is implemented now; hcc 3.5 will be added later
Change-Id: Ic37859b2688ebb02e168bab643d1882bf4727952
Differences from the similar scripts for hipify.pl:
1. A CSV file with extended statistics is produced.
2. The scripts' arguments have changed slightly:
DIRNAME [hipify options] [--] [clang options]
where -- is a delimiter; all the arguments are optional, except DIRNAME.
Usage example:
./hipexamine2.sh ./tmp -o-stats ./tmp/stats.csv -- -I/usr/local/cuda-7.5/include -I/usr/local/hipify-clang/hipblas/include 2>&1 | tee log
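
The argument layout above (hipify options, then --, then clang options) can be sketched as a single pass over the arguments that follow DIRNAME. This is only an illustration under that assumption; splitAtDelimiter is a hypothetical name, and the real scripts are shell, not C++:

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

using Args = std::vector<std::string>;

// Hypothetical helper: arguments before the first "--" are treated as
// hipify options, arguments after it as clang options. The delimiter
// itself is dropped. Without a "--", everything is a hipify option.
std::pair<Args, Args> splitAtDelimiter(const Args& args) {
    Args hipifyOptions;
    Args clangOptions;
    bool seenDelimiter = false;
    for (const std::string& arg : args) {
        if (!seenDelimiter && arg == "--") {
            seenDelimiter = true;
            continue;
        }
        (seenDelimiter ? clangOptions : hipifyOptions).push_back(arg);
    }
    return {hipifyOptions, clangOptions};
}
```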
hipcc accepts new parameters -use-staticlib and -use-sharedlib to
control linking behavior. The default is still the static library.
Change-Id: I28fb9a939f8177c75abefd8b77d8118a6666d1f4
1. Use -DHIP_FAST_MATH to compile precise math functions as fast math
2. Added double fast math functions for sqrt
3. Changed hipcc to parse -use_fast_math (not working)
4. Added passed tag to hipFloatMath test
Change-Id: I72884b2436b4efe61e9a9297346c1358fee38a2d
Users who desire otherwise can set HIP_ATP_MARKER=0.
Also remove old unused hipcc_explicit_lib option.
Change-Id: I2bf07ba880329e7a3b1365dd33a3b2be6794370f
Missing includes are set explicitly.
The workaround is on by default; to disable it, set HCC_SYS_INCLUDES_WA=0.
It will be removed once [SWDEV-105366] is fixed.
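
The default-on behavior of HCC_SYS_INCLUDES_WA can be pictured as follows. This is a sketch of the general pattern only, assuming that any value other than exactly "0" leaves the workaround enabled; both function names are hypothetical and the actual check lives in hipcc, not in C++:

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>

// Hypothetical helper: a switch that is on unless its value is exactly "0".
// A null pointer means the variable is not set at all.
bool enabledUnlessZero(const char* value) {
    return value == nullptr || std::strcmp(value, "0") != 0;
}

// Hypothetical wrapper reading the variable named in the commit message.
bool sysIncludesWorkaroundEnabled() {
    return enabledUnlessZero(std::getenv("HCC_SYS_INCLUDES_WA"));
}
```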
1. Added feature for __threadfence and __threadfence_block
2. Added feature for using LLVM IR files directly during compilation
3. Added test for threadfence and threadfence_block
Change-Id: Ib7e5d89b4cca1a135952b317e5809cd05b56a3c9
- Expand message when a HIP version mismatch is detected.
- Doc touchup.
- Change sorting of hipBusBandwidth so byte results are shown at the top.
Change-Id: Ifb4e44a5fdfb65d59c4994b11e5f13385705f7e0