Commit Graph

1316 Commits

Author SHA1 Message Date
Ben Sander ca1cef4e06 Doc update - describe debug techniques
Also tweak sample to remove unneeded HIP_KERNEL_NAME.
Comment update
2017-01-19 12:40:45 -06:00
Ben Sander 2ffc9f4e22 Fix debug display for Module launch kernels 2017-01-19 12:40:45 -06:00
Rahul Garg aa3f278475 Fixed hipcommander default execution for HCSWAP-106
Change-Id: I9fbd10dfaeeb4928b2ec23ceed131b5200a658f9
2017-01-19 15:04:32 +05:30
Aditya Atluri 706a032a29 moved half device function declarations to top of the file
1. Moved half device functions around so that script can catch the signatures
2. Generated docs for half precision apis

Change-Id: Iee27658e3a639fdb02af135e71841dc6427f15e2
2017-01-18 15:06:18 -06:00
Aditya Atluri c9bc71dc86 more clarification about using device_md_gen.py
Change-Id: I3e207b65683f34d62be3454444ffb32f8814c0aa
2017-01-18 14:49:41 -06:00
Aditya Atluri 3d4dcee35d Added script for generating math api docs
1. Commented out unsupported device math functions
2. Moved function signatures to the top of implementation snippets
3. Added script to generate markdown documentation for device math apis
4. Added the generated file from the script, which should be present every time

Change-Id: Ic579dd8b8fdffa6e1b4d4f5f3fd8a803f4dcaac7
2017-01-18 14:40:50 -06:00
Aditya Atluri 41a46effef fixed compilation issues
1. Fixed compilation issues for tests
2. Added missing intrinsics + math functions
3. Disabled some device functions as they are causing linking error with HCC

Change-Id: I79d52c4c7a539cc8ef40580247ad97ffcb975f09
2017-01-18 11:53:47 -06:00
Aditya Atluri d23b6b8694 Moved device code to mimic cuda header behavior
1. All fp32, fp64 math device/host functions should be in math_functions.h/.cpp
2. All fp32, fp64 fast math intrinsics for device/host functions should be in device_functions.h/.cpp
3. All the device code implementations should be in device_util.h/.cpp
4. Hence, made changes appropriately by moving code and creating new header files
5. Added math_functions.cpp/.h
6. Changed #ifndef signature to make sure no conflicts between headers with same names in hip/hip_runtime.h and hip/hcc_detail/hip_runtime.h
7. Changed tests to fit the code changes, making them include the appropriate headers
8. Added math_functions.cpp to CMakeLists.txt
9. Some of the tests are still broken, mostly host math functions; will fix them in the next commit
10. TODO: FIX compilation issues for host math functions

Change-Id: I7a17637d7e294a7d224ffba932c1a08668febd26
2017-01-17 14:57:51 -06:00
Aditya Atluri 3f9a9d9318 enabled integer intrinsics tests
Change-Id: I5d28d556f228240eda2fc0098121ed3b29b041e7
2017-01-17 09:59:08 -06:00
Aditya Atluri 9ca135ac2e added last few integer intrinsic support
1. Added usad, umulhi, urhadd
2. Corrected implementation of __hadd, __hradd
3. TODO: __sad(). It gets tricky as ISA sees them as unsigned

Change-Id: Ibd2c2133b462f9393f3990355706386c79256bba
2017-01-17 09:27:51 -06:00
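The integer intrinsics named in the entry above can be modelled on the host for testing. The following is a minimal sketch, not the code from this commit: the `ref_*` names are hypothetical, and the semantics are assumed to follow the CUDA definitions (high 32 bits of the product, rounded average, sum of absolute difference). The entry spells the rounded average `__hradd`; the CUDA intrinsic is named `__rhadd`.

```cpp
#include <cstdint>

// Hypothetical host-side reference models; these are NOT the HIP implementation.

// High 32 bits of the 64-bit product of two unsigned 32-bit values (CUDA __umulhi).
constexpr std::uint32_t ref_umulhi(std::uint32_t a, std::uint32_t b) {
    return static_cast<std::uint32_t>((static_cast<std::uint64_t>(a) * b) >> 32);
}

// Unsigned rounded average (a + b + 1) >> 1, widened to 64 bits to avoid overflow (CUDA __urhadd).
constexpr std::uint32_t ref_urhadd(std::uint32_t a, std::uint32_t b) {
    return static_cast<std::uint32_t>((static_cast<std::uint64_t>(a) + b + 1u) >> 1);
}

// |a - b| + c on unsigned operands (CUDA __usad).
constexpr std::uint32_t ref_usad(std::uint32_t a, std::uint32_t b, std::uint32_t c) {
    return (a > b ? a - b : b - a) + c;
}

// Signed average (a + b) >> 1 without overflow (CUDA __hadd).
// Assumption: >> on a negative value is an arithmetic shift, which holds on
// mainstream compilers and is guaranteed from C++20 onward.
constexpr std::int32_t ref_hadd(std::int32_t a, std::int32_t b) {
    return static_cast<std::int32_t>((static_cast<std::int64_t>(a) + b) >> 1);
}

// Signed rounded average (a + b + 1) >> 1 without overflow (CUDA __rhadd).
constexpr std::int32_t ref_rhadd(std::int32_t a, std::int32_t b) {
    return static_cast<std::int32_t>((static_cast<std::int64_t>(a) + b + 1) >> 1);
}
```

A device test could compare each intrinsic's result against the matching reference function over a set of edge values such as 0, 1, and the type's maximum.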
Aditya Atluri f0ea51c786 fixed broken tests and device code for integer intrinsics
1. Fixed build issues with new Integer intrinsics
2. Changed tests to work exactly as CUDA code
3. Still some integer intrinsics need to be supported

Change-Id: Ie6f4171259cf4da517436895d4f6f01e01f59b11
2017-01-17 09:00:09 -06:00
Aditya Atluri feba9fe213 v1: Working on Integer Intrinsics
1. Half way through
2. May not work
3. No test written

Change-Id: I705b743a78b142ff068e2521870e73fca7ad2b1c
2017-01-16 14:55:29 -06:00
Aditya Atluri e95456eee8 moved most of the fp16 code inside hip_fp16.cpp
1. As we use holder data structure, we move all the cmp, math, cvt apis to cpp file
2. All the tests passed
3. Add more extensive testing for half

Change-Id: I92c6399dace602a0a24432728e3f2a07124e6fb1
2017-01-16 12:32:35 -06:00
Aditya Atluri d496576b55 Added type conversion intrinsics
1. Added all type conversion intrinsics
2. NO TESTS have been added. (Will add in next commit)
3. Sanitized code in hip_runtime.h
4. Added passed() to hipTestHalf to make it pass on HIT

Change-Id: I0987963c802fc7ff4d7e07d7b88d86da35da53c9
2017-01-16 12:10:05 -06:00
Aditya Atluri 5c5f5c1ad1 added half2 log, log10, exp, exp10 math functions
1. Enabled tests for log, log10, exp, exp10 half2
2. h2rint is still disabled.

Change-Id: I01f6002f6992259919893c524c526db5ee09473a
2017-01-13 13:26:10 -06:00
Aditya Atluri eff68c989a added half2 math operations
1. They use SDWA + LLVM IR
2. Added these functions to test
3. Need to do exp, exp10, log, log10, rint

Change-Id: I06176acc6cb8bb054495310531777406a41b54e4
2017-01-13 12:27:11 -06:00
Aditya Atluri fe38e9652b added math functions for half
1. Added math functions for half precision
2. HRCP is not available due to device code linking errors, will be enabled once it is fixed
3. Added math functions to half test file

Change-Id: Ie317ce70ef518a4fc3f27142143d01e0327f5df3
2017-01-13 12:05:29 -06:00
Aditya Atluri 646f566bbf added half2 cmp and conv, data movement device functions
1. Added half2 comparison functions
2. Added conversion and data movement half apis

Change-Id: Ia33c0e957d9deb1f2b7a8fde8e22168f4d41b88b
2017-01-13 10:56:07 -06:00
Evgeny Mankov 5200073b4c [HIPIFY] Formatting, no functional changes. 2017-01-13 14:59:15 +03:00
Robert 32a35eda75 fix spelling errors
Conflicts:
	README.md
	docs/markdown/hip_faq.md

Change-Id: I8ca025e01276939ed3d7be24200ecaa8cf5e1e2c
2017-01-13 14:42:37 +05:30
Aditya Atluri 89998d436f added comparison device functions for fp16
1. Added comparison device functions
2. Added test to check correct isa getting generated

Change-Id: I16732f5a1438bdce145f7bfcecd28198e3cc4b79
2017-01-12 14:52:14 -06:00
Aditya Atluri eeef055469 added packed math fp16 native device functions
1. Added SDWA implementation inside IR file
2. Added device functions to header + used them in test

Change-Id: Ib4e059a58eee201cc82438689e3e9bc5f9d26653
2017-01-12 14:10:51 -06:00
Aditya Atluri c286bf6f8a Started adding native half math library support
1. Removed HIP_EXPERIMENTAL env variable so that device code will be accessed from LLVM IR
2. Removed soft support from headers and moved to hip_fp16.cpp
3. Added LLVM IR + inline asm to hip_ir.ll
4. Added test for fp16
5. Added barriers for hcc 3.5 and hcc 4.0 for half support
a. Which means, hcc 4.0 can parse __fp16 but hcc 3.5 can't
b. HCC 4.0 code is implemented now, hcc 3.5 will be added later

Change-Id: Ic37859b2688ebb02e168bab643d1882bf4727952
2017-01-12 11:30:20 -06:00
Aditya Atluri f85d7b7d97 changed data type used for complex
Change-Id: I0a3bb281af3d5ac1290207821c7c45aea40f513f
2017-01-11 18:23:37 -06:00
Aditya Atluri 7dbf63dde2 changed copyright year from 2016 to 2017 in include directory
Change-Id: Ib5935a84fb51a04b3446df31cc2287101f791b83
2017-01-11 18:09:33 -06:00
Aditya Atluri e9ff23e5f9 changed copyright year from 2016 to 2017 in src directory
Change-Id: Idb97db509b2b4b1656b2df7a14a62ade38c9d574
2017-01-11 18:05:41 -06:00
Aditya Atluri 1d8700096c added test for vector data types
Change-Id: I0b6624886e474601cb1ef003c5f10adf399a21c9
2017-01-11 18:02:30 -06:00
Aditya Atluri 430e1364f5 fixed compilation issues with operator overloading device data types
Change-Id: I6a60282f0c04a3c0d382cdf2d67ad8d9156880ad
2017-01-11 17:53:32 -06:00
Aditya Atluri 4e57822d95 Added proper device data types
Change-Id: I42029635ff68c3c13a764a3eda6447e6c77878c6
2017-01-11 15:06:25 -06:00
Evgeny Mankov fd0c56a767 [HIPIFY] cudaDataType_t and libraryPropertyType_t support (CUDA 8.0.44 only)
All are marked as HIP_UNSUPPORTED.
IMPORTANT:
1. libraryPropertyType_t has no cuda prefix. => TO_DO: new matcher is needed.
2. all libraries (cublas, cufft, cusolver, cusparse, nvgraph) have started to use these types (since 8.0).
2017-01-10 20:24:27 +03:00
Evgeny Mankov 81fd34f236 [HIPIFY] cudaDeviceAttr (RT API) support up to CUDA 8.0.44
Attributes, which are not yet supported by HIP, are marked as HIP_UNSUPPORTED.
2017-01-10 19:29:33 +03:00
Evgeny Mankov c0c04f34be [HIPIFY] CUdevice_attribute support up to CUDA 8.0.44
Attributes, which are not yet supported by HIP, are marked as HIP_UNSUPPORTED.
2017-01-10 17:54:22 +03:00
Ben Sander ff77106399 Fix delete[] 2017-01-09 21:03:11 -06:00
Ben Sander b29fbf736d Add HIP_MAX_QUEUES feature.
Includes some tricky manipulation of the locks for contexts and streams.
The issue is that stealing a stream requires we lock the context to
walk the streams to find a victim.  To avoid deadlock, we can't
have a stream locked when we lock the context.  This implementation
releases the stream lock, then acquires the context and selects the
victim.
A more stable implementation might be to copy the stream list
from a context so that a lock is not required to walk all streams.
Smart shared_ptr could be used to prevent the streams from being
deallocated during the walk.
2017-01-09 21:02:56 -06:00
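The lock ordering described in the entry above can be sketched as follows. This is an illustrative model under stated assumptions, not the HIP runtime's code: `Stream`, `Context`, `steal_queue`, and `snapshot_streams` are hypothetical names, and a boolean stands in for ownership of a hardware queue.

```cpp
#include <memory>
#include <mutex>
#include <vector>

// Hypothetical stand-ins for the runtime's stream and context objects.
struct Stream {
    std::mutex lock;
    bool has_queue = false;  // stands in for owning a hardware queue
};

struct Context {
    std::mutex lock;
    std::vector<std::shared_ptr<Stream>> streams;
};

// Called with self_lock held on self.lock. The stream lock is released before
// the context lock is taken, so no thread ever holds stream -> context.
std::shared_ptr<Stream> steal_queue(Context& ctx, Stream& self,
                                    std::unique_lock<std::mutex>& self_lock) {
    self_lock.unlock();  // 1. release the stream lock
    std::shared_ptr<Stream> victim;
    {
        std::lock_guard<std::mutex> ctx_guard(ctx.lock);  // 2. acquire the context
        for (auto& s : ctx.streams) {
            if (s.get() == &self) continue;
            std::lock_guard<std::mutex> g(s->lock);  // order is context -> stream
            if (s->has_queue) {                      // 3. select the victim
                s->has_queue = false;
                victim = s;
                break;
            }
        }
    }
    self_lock.lock();  // re-acquire our own lock with no other lock held
    if (victim) self.has_queue = true;
    return victim;
}

// The alternative mentioned in the message: copy the list so the walk itself
// needs no context lock. The shared_ptr copies keep every stream alive during
// the walk. The context is still locked briefly to make the copy.
std::vector<std::shared_ptr<Stream>> snapshot_streams(Context& ctx) {
    std::lock_guard<std::mutex> g(ctx.lock);
    return ctx.streams;
}
```

The snapshot variant trades a short copy under the lock for a walk that cannot deadlock against stream locks, at the cost of possibly seeing a slightly stale list.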
Ben Sander c9f5fe34e6 First pass at virtualized queue support.
Also updated stream debug messages to consistently use trace_helper.
2017-01-09 21:02:53 -06:00
Ben Sander a6034b88e2 Add more notes on debugging HIP apps. 2017-01-09 21:02:50 -06:00
Ben Sander 49d1477b9d tolerate spaces in hip args 2017-01-09 20:57:13 -06:00
Rahul Garg 090eadd0bd Added state for hipDevice.
Change-Id: Idbc3c04cd054a01b634856a1e0a23ff172e991aa
2017-01-09 23:54:01 +05:30
scchan d71ac5c91a [cmake] add library dependencies to hip_hcc libraries 2017-01-05 18:26:54 -05:00
Maneesh Gupta 9b02f0d7e5 hipcc: Link to shared HIP runtime by default
Change-Id: I5030e3245e4afb6863b401656ca5d1ad9ae84310
2017-01-04 12:39:09 +05:30
Evgeny Mankov bbb75fdd8e [HIPIFY] Elapsed time is added to statistics. 2016-12-28 20:44:05 +03:00
Evgeny Mankov ddf51d382c [HIPIFY] Added the rest of cuBlas API.
CUBLAS API 7.5 now is supported by hipify;
API calls, which are not yet supported by hcblas/hipblas, are listed as HIP_UNSUPPORTED.
2016-12-28 18:08:10 +03:00
Evgeny Mankov 9070d55814 [HIPIFY] Formatting, no functional changes. 2016-12-27 19:48:59 +03:00
Evgeny Mankov e740d368d9 [HIPIFY] [Fix] An argument of a function used as macro argument is not hipified.
https://github.com/GPUOpen-ProfessionalCompute-Tools/HIP/issues/35
2016-12-27 18:54:02 +03:00
Evgeny Mankov 3cd1adcb24 [HIPIFY] Pointer to typedef declaration is not hipified
Fix for https://github.com/GPUOpen-ProfessionalCompute-Tools/HIP/issues/60
2016-12-26 19:03:50 +03:00
Evgeny Mankov f6aa3ddb0e [HIPIFY] Add hipconvertinplace2.sh and hipexamine2.sh scripts for hipify-clang.
The differences from the similar scripts for hipify.pl:
1. CSV file with extended statistics is produced.
2. scripts' arguments are changed a bit:
DIRNAME [hipify options] [--] [clang options]

where -- is a delimiter; all the arguments are optional, except DIRNAME.

Usage example:
./hipexamine2.sh ./tmp -o-stats ./tmp/stats.csv -- -I/usr/local/cuda-7.5/include -I/usr/local/hipify-clang/hipblas/include 2>&1 | tee log
2016-12-23 22:06:20 +03:00
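The argument convention `DIRNAME [hipify options] [--] [clang options]` from the entry above can be illustrated with a small splitter. This is a sketch of the convention only, not the scripts' actual logic (they are shell scripts); `SplitArgs` and `split_args` are hypothetical names.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical result type for the three argument groups.
struct SplitArgs {
    std::string dirname;
    std::vector<std::string> hipify_options;
    std::vector<std::string> clang_options;
};

// args excludes the program name. Returns false when the required DIRNAME is missing.
inline bool split_args(const std::vector<std::string>& args, SplitArgs& out) {
    if (args.empty()) return false;
    out = SplitArgs{};
    out.dirname = args[0];
    bool after_delimiter = false;
    for (std::size_t i = 1; i < args.size(); ++i) {
        if (!after_delimiter && args[i] == "--") {
            after_delimiter = true;  // everything after the first "--" goes to clang
            continue;
        }
        (after_delimiter ? out.clang_options : out.hipify_options).push_back(args[i]);
    }
    return true;
}
```

Applied to the usage example in the message, `./tmp` becomes the directory, `-o-stats ./tmp/stats.csv` the hipify options, and the two `-I` flags the clang options.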
Evgeny Mankov e8c5906062 [HIPIFY] Fix line endings. 2016-12-23 18:01:26 +03:00
Evgeny Mankov 52b3fb9f79 [HIPIFY] Stats: Calculation of changed code amount, based on actually replaced bytes.
+ REPLACED bytes, TOTAL bytes & CODE CHANGED are added to statistics.
+ -o-stats option for specifying the file with statistic.
2016-12-23 17:40:06 +03:00
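The message above does not give the formula behind CODE CHANGED. One plausible reading, shown here purely as an assumption, is the share of replaced bytes in the total byte count, expressed as a percentage. The function name is hypothetical.

```cpp
#include <cstdint>

// Assumed formula: percentage of the input's bytes that were replaced.
// Returns 0 for an empty input to avoid dividing by zero.
constexpr double code_changed_percent(std::uint64_t replaced_bytes,
                                      std::uint64_t total_bytes) {
    return total_bytes == 0
               ? 0.0
               : 100.0 * static_cast<double>(replaced_bytes) /
                     static_cast<double>(total_bytes);
}
```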
Maneesh Gupta 16b705d912 hipcc: link to hip runtime using absolute path
Change-Id: I714b3e9da0bc1d49665b079d9c4cec1c1a2efa80
2016-12-23 11:49:00 +05:30
Ben Sander fd5b0c68b1 Support size_t in memset kernel.
Add disable for HSA_AMD_AGENT_INFO_MAX_WAVES_PER_CU
Remove one copy of completion_future in memset.
2016-12-22 12:25:09 -06:00