Commit Graph

6428 Commits

Author SHA1 Message Date
Aditya Atluri 5ef8ef3bd7 added packed math fp16 native device functions
1. Added SDWA implementation inside IR file
2. Added device functions to header + used them in test

Change-Id: Ib4e059a58eee201cc82438689e3e9bc5f9d26653
2017-01-12 14:10:51 -06:00
Aditya Atluri d180fdaae0 Started adding native half math library support
1. Removed HIP_EXPERIMENTAL env variable so that device code will be accessed from LLVM IR
2. Removed soft support from headers and moved to hip_fp16.cpp
3. Added LLVM IR + inline asm to hip_ir.ll
4. Added test for fp16
5. Added barriers for hcc 3.5 and hcc 4.0 for half support
a. This means hcc 4.0 can parse __fp16 but hcc 3.5 can't
b. The hcc 4.0 code is implemented now; hcc 3.5 will be added later

Change-Id: Ic37859b2688ebb02e168bab643d1882bf4727952
2017-01-12 11:30:20 -06:00
Aditya Atluri e2318cda74 changed data type used for complex
Change-Id: I0a3bb281af3d5ac1290207821c7c45aea40f513f
2017-01-11 18:23:37 -06:00
Aditya Atluri 98c4221dc2 changed copyright year from 2016 to 2017 in include directory
Change-Id: Ib5935a84fb51a04b3446df31cc2287101f791b83
2017-01-11 18:09:33 -06:00
Aditya Atluri 73fcce26f9 changed copyright year from 2016 to 2017 in src directory
Change-Id: Idb97db509b2b4b1656b2df7a14a62ade38c9d574
2017-01-11 18:05:41 -06:00
Aditya Atluri 57294ce461 added test for vector data types
Change-Id: I0b6624886e474601cb1ef003c5f10adf399a21c9
2017-01-11 18:02:30 -06:00
Aditya Atluri e30887dc69 fixed compilation issues with operator overloading device data types
Change-Id: I6a60282f0c04a3c0d382cdf2d67ad8d9156880ad
2017-01-11 17:53:32 -06:00
Aditya Atluri 39910029a6 Added proper device data types
Change-Id: I42029635ff68c3c13a764a3eda6447e6c77878c6
2017-01-11 15:06:25 -06:00
Evgeny Mankov 9a0780001b [HIPIFY] cudaDataType_t and libraryPropertyType_t support (CUDA 8.0.44 only)
All are marked as HIP_UNSUPPORTED.
IMPORTANT:
1. libraryPropertyType_t has no cuda prefix. => TO_DO: new matcher is needed.
2. all libraries (cublas, cufft, cusolver, cusparse, nvgraph) have started to use these types (since 8.0).
2017-01-10 20:24:27 +03:00
Evgeny Mankov 3a99536ed5 [HIPIFY] cudaDeviceAttr (RT API) support up to CUDA 8.0.44
Attributes, which are not yet supported by HIP, are marked as HIP_UNSUPPORTED.
2017-01-10 19:29:33 +03:00
Evgeny Mankov 7ed2b163de [HIPIFY] CUdevice_attribute support up to CUDA 8.0.44
Attributes, which are not yet supported by HIP, are marked as HIP_UNSUPPORTED.
2017-01-10 17:54:22 +03:00
Ben Sander a15d236de3 Fix delete[] 2017-01-09 21:03:11 -06:00
Ben Sander a3e0012567 Add HIP_MAX_QUEUES feature.
Includes some tricky manipulation of the locks for contexts and streams.
The issue is that stealing a stream requires that we lock the context to
walk the streams to find a victim.  To avoid deadlock, we can't
have a stream locked when we lock the context.  This implementation
releases the stream lock, then acquires the context and selects the
victim.
A more stable implementation might be to copy the stream list
from a context so that a lock is not required to walk all streams.
Smart shared_ptr could be used to prevent the streams from being
deallocated during the walk.
2017-01-09 21:02:56 -06:00
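The lock ordering described in the HIP_MAX_QUEUES commit above can be sketched roughly as follows. This is a minimal illustration in Python, not the HIP runtime code: the names `Stream`, `Context`, `steal_queue`, and `snapshot_streams` are hypothetical, a boolean `has_queue` stands in for ownership of a hardware queue, and ordinary Python object references play the role the commit message assigns to shared_ptr.

```python
import threading


class Stream:
    """Hypothetical stand-in for a stream that may own a hardware queue."""

    def __init__(self, name, has_queue=False):
        self.name = name
        self.lock = threading.Lock()
        self.has_queue = has_queue


class Context:
    """Hypothetical stand-in for a context that tracks its streams."""

    def __init__(self, streams):
        self.lock = threading.Lock()
        self.streams = list(streams)


def steal_queue(ctx, needy):
    """Move a queue from some other stream to `needy`.

    Returns the victim stream, or None if nothing was stolen.
    No stream lock is ever held while the context lock is taken.
    """
    with needy.lock:
        if needy.has_queue:
            return None
    # The stream lock is released here, before the context lock is acquired.
    with ctx.lock:
        victim = next(
            (s for s in ctx.streams if s is not needy and s.has_queue), None
        )
    if victim is None:
        return None
    with victim.lock:
        if not victim.has_queue:
            return None  # another thread took the queue first
        victim.has_queue = False
    with needy.lock:
        needy.has_queue = True
    return victim


def snapshot_streams(ctx):
    """Alternative from the commit message: copy the list under the lock,
    then walk the copy without holding the context lock. The references in
    the copy keep the streams alive during the walk."""
    with ctx.lock:
        return list(ctx.streams)
```

Because the locks are never nested, no acquisition order can produce a deadlock in this sketch. The cost is that the state can change between the steps, which is why the victim is re-checked under its own lock.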
Ben Sander 93fbc9cf7b First pass at virtualized queue support.
Also updated stream debug messages to consistently use trace_helper.
2017-01-09 21:02:53 -06:00
Ben Sander fd209f37d9 Add more notes on debugging HIP apps. 2017-01-09 21:02:50 -06:00
Ben Sander 3a42a7642a tolerate spaces in hip args 2017-01-09 20:57:13 -06:00
Rahul Garg 5fb09879c7 Added state for hipDevice.
Change-Id: Idbc3c04cd054a01b634856a1e0a23ff172e991aa
2017-01-09 23:54:01 +05:30
Maneesh Gupta 9199f952f9 Merge branch 'amd-develop' into amd-master
Change-Id: I81429e5f3f55a71498da6cece9d08a8b1c170057
2017-01-06 12:55:36 +05:30
scchan 4fd48084a6 [cmake] add library dependencies to hip_hcc libraries 2017-01-05 18:26:54 -05:00
Maneesh Gupta a42da10c44 hipcc: Link to shared HIP runtime by default
Change-Id: I5030e3245e4afb6863b401656ca5d1ad9ae84310
2017-01-04 12:39:09 +05:30
Evgeny Mankov 14e9cf7e62 [HIPIFY] Elapsed time is added to statistics. 2016-12-28 20:44:05 +03:00
Evgeny Mankov 6ceb85a03a [HIPIFY] Added the rest of cuBlas API.
CUBLAS API 7.5 now is supported by hipify;
API calls, which are not yet supported by hcblas/hipblas, are listed as HIP_UNSUPPORTED.
2016-12-28 18:08:10 +03:00
Evgeny Mankov d7d3fcc77d [HIPIFY] Formatting, no functional changes. 2016-12-27 19:48:59 +03:00
Evgeny Mankov 5ec0488ce8 [HIPIFY] [Fix] An argument of a function used as macro argument is not hipified.
https://github.com/GPUOpen-ProfessionalCompute-Tools/HIP/issues/35
2016-12-27 18:54:02 +03:00
Evgeny Mankov 24703944de [HIPIFY] Pointer to typedef declaration is not hipified
Fix for https://github.com/GPUOpen-ProfessionalCompute-Tools/HIP/issues/60
2016-12-26 19:03:50 +03:00
Evgeny Mankov bcbbc32fa6 [HIPIFY] Add hipconvertinplace2.sh and hipexamine2.sh scripts for hipify-clang.
The differences from the similar scripts for hipify.pl:
1. CSV file with extended statistics is produced.
2. scripts' arguments are changed a bit:
DIRNAME [hipify options] [--] [clang options]

where -- is a delimiter; all the arguments are optional, except DIRNAME.

Usage example:
./hipexamine2.sh ./tmp -o-stats ./tmp/stats.csv -- -I/usr/local/cuda-7.5/include -I/usr/local/hipify-clang/hipblas/include 2>&1 | tee log
2016-12-23 22:06:20 +03:00
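The argument layout `DIRNAME [hipify options] [--] [clang options]` from the commit above could be split along these lines. This is only a sketch in Python, not the contents of hipexamine2.sh; `split_args` is a hypothetical helper, and treating everything after DIRNAME as hipify options when `--` is absent is an assumption.

```python
def split_args(argv):
    """Split argv into (dirname, hipify_options, clang_options).

    DIRNAME is mandatory; "--" separates hipify options from clang options.
    Assumption: without "--", all remaining arguments are hipify options.
    """
    if not argv:
        raise ValueError("DIRNAME is required")
    dirname, rest = argv[0], argv[1:]
    if "--" in rest:
        i = rest.index("--")
        return dirname, rest[:i], rest[i + 1:]
    return dirname, rest, []
```

Applied to the usage example from the commit message, the call would be `split_args(["./tmp", "-o-stats", "./tmp/stats.csv", "--", "-I/usr/local/cuda-7.5/include", "-I/usr/local/hipify-clang/hipblas/include"])`, which separates the statistics option from the two include paths.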
Evgeny Mankov ab00e2a627 [HIPIFY] Fix line endings. 2016-12-23 18:01:26 +03:00
Evgeny Mankov 6882057fd2 [HIPIFY] Stats: Calculation of changed code amount, based on actually replaced bytes.
+ REPLACED bytes, TOTAL bytes & CODE CHANGED are added to statistics.
+ -o-stats option for specifying the statistics file.
2016-12-23 17:40:06 +03:00
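One plausible reading of the REPLACED bytes, TOTAL bytes, and CODE CHANGED figures from the commit above is a simple ratio, sketched here in Python. The commit message does not give the exact formula used by hipify-clang, so both the `code_changed` helper and the choice to count the bytes of the original (replaced) text are assumptions.

```python
def code_changed(source, replacements):
    """Return (replaced_bytes, total_bytes, percent_changed).

    `source` is the original file content as bytes.
    `replacements` is a list of (old_text, new_text) byte-string pairs that
    were actually applied. Assumption: "replaced bytes" counts the length of
    each old_text, not the new text.
    """
    total = len(source)
    replaced = sum(len(old) for old, _new in replacements)
    percent = 100.0 * replaced / total if total else 0.0
    return replaced, total, percent
```

Counting only bytes that were actually replaced, as the commit title says, makes the figure independent of how many CUDA references were merely recognized.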
Maneesh Gupta f6e9f6f0bf hipcc: link to hip runtime using absolute path
Change-Id: I714b3e9da0bc1d49665b079d9c4cec1c1a2efa80
2016-12-23 11:49:00 +05:30
Ben Sander c325c988b1 Support size_t in memset kernel.
Add disable for HSA_AMD_AGENT_INFO_MAX_WAVES_PER_CU
Remove one copy of completion_future in memset.
2016-12-22 12:25:09 -06:00
Maneesh Gupta 9b6d1588ba hip_hcc package changes
- updated hip_hcc package creation dependencies
- support building the hip_hcc package for HCC-1.0

Change-Id: Idf23e415eff8cb352a8906191c79bd822c7618e7
2016-12-22 15:30:38 +05:30
Ben Sander 37d8cafb12 Increment API sequence number.
Change name to tls_tidInfo
2016-12-21 15:30:36 -06:00
Evgeny Mankov 4bb8bf8dab [HIPIFY] Statistics in CSV file.
+ Stats by CUDA ref name.
+ Conversion %.

TODO: Calculation of changed code amount, based on actually replaced bytes.
2016-12-21 23:08:01 +03:00
Rahul Garg 4704547bab Removed redundant GetPCIBusID int version function
Change-Id: I37f2ff87d09fcfb1e3b104c44c51f606fcb83c01
2016-12-20 23:25:16 +05:30
Evgeny Mankov 3dd32e969d [HIPIFY] Reflect unsupported CUDA API refs in statistics
https://github.com/GPUOpen-ProfessionalCompute-Tools/HIP/issues/53

+ Unsupported refs (by HIP) may now be listed along with the supported ones.
+ Warnings are added for the unhandled (by HIPIFY) refs, for instance:
  "warning: the following reference is not handled: 'cublasContext' [param decl ptr]."
+ Reflect unsupported CUDA API refs in statistics.
+ Occupancy API [HIP_UNSUPPORTED].
+ A few CUBLAS refs are listed as HIP_UNSUPPORTED.

TODO: Statistics in CSV file.
2016-12-19 14:38:19 +03:00
Maneesh Gupta 8f31ad6ce0 Merge branch 'amd-develop' into amd-master
Change-Id: I77fa88b460be549bfcf9e18d3212e732ffc045f5
2016-12-19 16:20:38 +05:30
Rahul Garg fbf7ed63a8 Fix for HCSWAP-67
Change-Id: I0b2ce5ab933237947fb41d89769db3da16e5be6a

Conflicts:
	src/hip_hcc.cpp
2016-12-19 16:19:51 +05:30
Maneesh Gupta f052f43b3b Updated doxygen documentation
Change-Id: If04d1155173fba8d3e050f3259da8b3edc60e076
2016-12-19 04:04:06 +00:00
Ben Sander 90c69e14bb Add name for function 2016-12-17 08:54:09 -06:00
Ben Sander 8bf4bd2f7d Remove HSA dependency from hipFunction_t
Place _groupSegmentSize and _privateSegmentSize inside Function,
remove hsa_executable_symbol_t.
2016-12-17 07:22:56 -06:00
Ben Sander 6ed7e1c1c1 Remove USE_DISPATCH_HSA_KERNEL=0 path. 2016-12-17 07:22:56 -06:00
Ben Sander 4d29885be3 Refactor Module and Function APIs.
- hipFunction_t is now returned by value.  This eliminates dynamic
  allocation / memory management complexity in the module.  Removed
  the kernel name so the structure is just 16 bytes now.

- Moved the hsa_executable_load_module and hsa_executable_freeze
  calls to the hipModuleLoad and hipModuleLoadData calls.

- Apply sharedMemBytes in hipModuleLaunchKernel to group segment
  size (not private).
2016-12-17 07:22:33 -06:00
Rahul Garg 263a9614ff Mapped hipDevice_t to int
Change-Id: I6cfa56c42b7cd04aa0e0bce510c0d72d34ea211a
2016-12-17 16:53:03 +05:30
Aditya Atluri 2665ad2762 disabled half native support as inline asm is not working
Change-Id: I3073d8ae39eed321987f0f2f0e689eec4cdbb48c
2016-12-16 09:24:59 -06:00
Ben Sander 43635f51dc Print limits on CUDA devices 2016-12-16 08:55:11 -06:00
Ben Sander bd19bb4074 Fix typo 2016-12-15 14:42:52 -06:00
Ben Sander 8ed38bae69 fix copyright 2016-12-15 14:42:52 -06:00
Ben Sander 4080fe209d remove TODO file 2016-12-15 14:42:52 -06:00
Evgeny Mankov 2383d9bc1a [HIPIFY] nested macro is not hipified, when it isAnyIdentifier
Fix for https://github.com/GPUOpen-ProfessionalCompute-Tools/HIP/issues/55
2016-12-15 21:00:34 +03:00
Maneesh Gupta 617737e0df Merge branch 'amd-develop' into amd-master
Change-Id: I52830df409da1f021c32ea569d4ae671aeb57b03
2016-12-15 16:25:33 +05:30