Commit graph

1330 commits

Author SHA1 Message Date
Ben Sander 57e1efebab Add debug tips to docs
[ROCm/clr commit: 4de3df746c]
2017-01-23 22:34:41 -06:00
Ben Sander 948c5e013c Add debug tips to docs
[ROCm/clr commit: fe24996326]
2017-01-23 22:34:41 -06:00
Ben Sander 1ff12d95a6 Log error with ihipLogError. Cleans up CXL trace display.
[ROCm/clr commit: d19c4767b7]
2017-01-23 22:34:41 -06:00
Ben Sander 6f2a8bf97b Add HIP_IGNORE_HCC_VERSION.
Ignores strict checking of HCC and HIP version.
Can be useful when developing new HCC code.


[ROCm/clr commit: df74158d1c]
2017-01-23 22:34:41 -06:00
Aditya Atluri 35631ea2a2 added ir code sad u8
Change-Id: Ie0d454b3bb9a6c9a028c091ad3aa969719b02cc9


[ROCm/clr commit: 9952117d64]
2017-01-20 17:21:51 -06:00
Aditya Atluri 11cf5fc117 added driver_types.h and texture_types.h header files to hip
Change-Id: Ic3b2403f07d6767dadf83d6c278fd14e87f6acdb


[ROCm/clr commit: 97315e8748]
2017-01-20 17:09:52 -06:00
Aditya Atluri 0e061ea69a fixed hipArray issues
1. Fixed build issues produced from previous commit
2. Create new header files to manage data structures better

Change-Id: I704d82c196c1858ed7617d76e40612eb507d2aa0


[ROCm/clr commit: 5b2d4c0e60]
2017-01-20 16:54:48 -06:00
Aditya Atluri 5d51e1ddbd changes device functions documentation according to the supported apis
Change-Id: I47ac6bbde11d54d8265e0d27ec8cd5da4d03eb8e


[ROCm/clr commit: 5f10a69ef7]
2017-01-20 14:19:09 -06:00
Aditya Atluri 100ef6d9b2 added nvcc backend for hipArrays
1. Added hip_texture.h to hip_runtime_api.h as cuda does declare array runtime apis inside cuda_runtime_api.h
2. Added nvcc backend for hipArray runtime apis
3. Didn't test on nvidia platform (should work)

Change-Id: I1a14aef41840e4f55e5535132e3443a918b55967


[ROCm/clr commit: a7fa600176]
2017-01-20 14:11:45 -06:00
Aditya Atluri f6d09573aa added more test coverage for vector data types
Change-Id: I9f57a8b597bd2ee4b265eadfd0859531497a6ada


[ROCm/clr commit: fd2e6ac2f0]
2017-01-20 13:52:02 -06:00
Aditya Atluri b6f4fedaaf fixed compilation issues for vector types and math functions
1. Added math_functions.h to hip_runtime.h
2. Changed operator overloading classifier static to static inline
3. Added vector types test for gpu
4. Separated __host__ and __device__ for math functions in headers

Change-Id: I499862fad5d7b10da686da9011d7ecefe523f8e2


[ROCm/clr commit: 02190736e3]
2017-01-20 09:49:11 -06:00
Ben Sander 81488d5d00 Add HIP_SYNC_HOST_ALLOC, HipReadEnv
[ROCm/clr commit: db3f4889ca]
2017-01-19 23:55:24 -06:00
Ben Sander 7a992b9fc3 Change ihipDeviceSetState,ihipDevice* so it doesn't log error
Cleans up debug trace.


[ROCm/clr commit: 6de88d4293]
2017-01-19 23:55:24 -06:00
Aditya Atluri c50f5cbd2c added operator overloading for complex data types
Change-Id: Id96d5d000651914169f04497af6ff78ad96d846a


[ROCm/clr commit: fe5f45caaf]
2017-01-19 15:15:25 -06:00
Ben Sander 48bd62db9a Doc update - describe debug techniques
Also tweak sample to remove unneeded HIP_KERNEL_NAME.
Comment update


[ROCm/clr commit: ca1cef4e06]
2017-01-19 12:40:45 -06:00
Ben Sander 3bc2e3ba02 Fix debug display for Module launch kernels
[ROCm/clr commit: 2ffc9f4e22]
2017-01-19 12:40:45 -06:00
Rahul Garg 707c31913d Fixed hipcommander default execution for HCSWAP-106
Change-Id: I9fbd10dfaeeb4928b2ec23ceed131b5200a658f9


[ROCm/clr commit: aa3f278475]
2017-01-19 15:04:32 +05:30
Aditya Atluri d84be1d089 moved half device function declarations to top of the file
1. Moved half device functions around so that script can catch the signatures
2. Generated docs for half precision apis

Change-Id: Iee27658e3a639fdb02af135e71841dc6427f15e2


[ROCm/clr commit: 706a032a29]
2017-01-18 15:06:18 -06:00
Aditya Atluri e264a9740e more clarification about using device_md_gen.py
Change-Id: I3e207b65683f34d62be3454444ffb32f8814c0aa


[ROCm/clr commit: c9bc71dc86]
2017-01-18 14:49:41 -06:00
Aditya Atluri 5ea40f27b3 Added script for generating math api docs
1. Commented out unsupported device math functions
2. Moved function signatures to the top of implementation snippets
3. Added script to generate markdown documentation for device math apis
4. Added the generated file from the script, which should be present every time

Change-Id: Ic579dd8b8fdffa6e1b4d4f5f3fd8a803f4dcaac7


[ROCm/clr commit: 3d4dcee35d]
2017-01-18 14:40:50 -06:00
Aditya Atluri 69903887a4 fixed compilation issues
1. Fixed compilation issues for tests
2. Added missing intrinsics + math functions
3. Disabled some device functions as they are causing linking error with HCC

Change-Id: I79d52c4c7a539cc8ef40580247ad97ffcb975f09


[ROCm/clr commit: 41a46effef]
2017-01-18 11:53:47 -06:00
Aditya Atluri 42c627fbe8 Moved device code to mimic cuda header behavior
1. All fp32, fp64 math device/host functions should be in math_functions.h/.cpp
2. All fp32, fp64 fast math intrinsics for device/host functions should be in device_functions.h/.cpp
3. All the device code implementations should be in device_util.h/.cpp
4. Hence, made changes appropriately by moving code and creating new header files
5. Added math_functions.cpp/.h
6. Changed #ifndef signature to make sure no conflicts between headers with same names in hip/hip_runtime.h and hip/hcc_detail/hip_runtime.h
7. Changed tests to fit the code changes, making them to include appropriate headers
8. Added math_functions.cpp to CMakeLists.txt
9. Some of the tests are still broken, mostly host math functions will fix them in next commit
10. TODO: FIX compilation issues for host math functions

Change-Id: I7a17637d7e294a7d224ffba932c1a08668febd26


[ROCm/clr commit: d23b6b8694]
2017-01-17 14:57:51 -06:00
Aditya Atluri 968f2e9489 enabled integer intrinsics tests
Change-Id: I5d28d556f228240eda2fc0098121ed3b29b041e7


[ROCm/clr commit: 3f9a9d9318]
2017-01-17 09:59:08 -06:00
Aditya Atluri 70a03445f0 added last few integer intrinsic support
1. Added usad, umulhi, urhadd
2. Corrected implementation of __hadd, __hradd
3. TODO: __sad(). It gets tricky as ISA sees them as unsigned

Change-Id: Ibd2c2133b462f9393f3990355706386c79256bba


[ROCm/clr commit: 9ca135ac2e]
2017-01-17 09:27:51 -06:00
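For readers unfamiliar with the integer intrinsics named in the commit above, here is a minimal host-side sketch of the arithmetic their CUDA counterparts are documented to perform. The `ref_*` names are hypothetical and are not part of HIP or CUDA. The sketch also assumes that `__hradd` in the commit message refers to the rounded-average intrinsic `__rhadd`. It is a reference model only and does not reflect how the commit implements these functions on the device.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical host-side reference models (ref_* names are illustrative only).

// Floor of s / 2, written without shifting a negative value.
constexpr int64_t floor_half(int64_t s) {
    return s / 2 - ((s % 2) < 0 ? 1 : 0);
}

// Models __hadd: (x + y) >> 1 with no overflow in the intermediate sum.
constexpr int32_t ref_hadd(int32_t x, int32_t y) {
    return static_cast<int32_t>(floor_half(static_cast<int64_t>(x) + y));
}

// Models __rhadd: (x + y + 1) >> 1, the rounded average.
constexpr int32_t ref_rhadd(int32_t x, int32_t y) {
    return static_cast<int32_t>(floor_half(static_cast<int64_t>(x) + y + 1));
}

// Models __urhadd: rounded average of unsigned inputs.
constexpr uint32_t ref_urhadd(uint32_t x, uint32_t y) {
    return static_cast<uint32_t>((static_cast<uint64_t>(x) + y + 1) >> 1);
}

// Models __umulhi: upper 32 bits of the 64-bit product.
constexpr uint32_t ref_umulhi(uint32_t x, uint32_t y) {
    return static_cast<uint32_t>((static_cast<uint64_t>(x) * y) >> 32);
}

// Models __usad: |x - y| + z on unsigned inputs.
constexpr uint32_t ref_usad(uint32_t x, uint32_t y, uint32_t z) {
    return (x > y ? x - y : y - x) + z;
}

// Small self-check that a test harness can call.
inline void check_reference_intrinsics() {
    assert(ref_hadd(3, 4) == 3);
    assert(ref_rhadd(3, 4) == 4);
    assert(ref_usad(3u, 10u, 5u) == 12u);
}
```

Widening to 64 bits before adding or multiplying is what keeps the intermediate result from overflowing. A signed `__sad` differs in that the absolute difference must be taken on signed inputs, which may be related to the TODO in the commit message.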
Aditya Atluri b864271e2c fixed broken tests and device code for integer intrinsics
1. Fixed build issues with new Integer intrinsics
2. Changed tests to work exactly as CUDA code
3. Still some integer intrinsics need to be supported

Change-Id: Ie6f4171259cf4da517436895d4f6f01e01f59b11


[ROCm/clr commit: f0ea51c786]
2017-01-17 09:00:09 -06:00
Aditya Atluri d259da2e42 v1: Working on Integer Intrinsics
1. Half way through
2. May not work
3. No test written

Change-Id: I705b743a78b142ff068e2521870e73fca7ad2b1c


[ROCm/clr commit: feba9fe213]
2017-01-16 14:55:29 -06:00
Aditya Atluri 65552aa2c5 moved most of the fp16 code inside hip_fp16.cpp
1. As we use holder data structure, we move all the cmp, math, cvt apis to cpp file
2. All the tests passed
3. Add more extensive testing for half

Change-Id: I92c6399dace602a0a24432728e3f2a07124e6fb1


[ROCm/clr commit: e95456eee8]
2017-01-16 12:32:35 -06:00
Aditya Atluri d9845446ef Added type conversion intrinsics
1. Added all type conversion intrinsics
2. NO TESTS have been added. (Will add in next commit)
3. Sanitized code in hip_runtime.h
4. Added passed() to hipTestHalf to make it pass on HIT

Change-Id: I0987963c802fc7ff4d7e07d7b88d86da35da53c9


[ROCm/clr commit: d496576b55]
2017-01-16 12:10:05 -06:00
Aditya Atluri d71fc0e60a added half2 log, log10, exp, exp10 math functions
1. Enabled tests for log, log10, exp, exp10 half2
2. h2rint is still disabled.

Change-Id: I01f6002f6992259919893c524c526db5ee09473a


[ROCm/clr commit: 5c5f5c1ad1]
2017-01-13 13:26:10 -06:00
Aditya Atluri 4756cf7c16 added half2 math operations
1. They use SDWA + LLVM IR
2. Added these functions to test
3. Need to do exp, exp10, log, log10, rint

Change-Id: I06176acc6cb8bb054495310531777406a41b54e4


[ROCm/clr commit: eff68c989a]
2017-01-13 12:27:11 -06:00
Aditya Atluri b2973b97e2 added math functions for half
1. Added math functions for half precision
2. HRCP is not available due to device code linking errors, will be enabled once it is fixed
3. Added math functions to half test file

Change-Id: Ie317ce70ef518a4fc3f27142143d01e0327f5df3


[ROCm/clr commit: fe38e9652b]
2017-01-13 12:05:29 -06:00
Aditya Atluri d3fe56550e added half2 cmp and conv, data movement device functions
1. Added half2 comparison functions
2. Added conversion and data movement half apis

Change-Id: Ia33c0e957d9deb1f2b7a8fde8e22168f4d41b88b


[ROCm/clr commit: 646f566bbf]
2017-01-13 10:56:07 -06:00
Evgeny Mankov 740888f088 [HIPIFY] Formatting, no functional changes.
[ROCm/clr commit: 5200073b4c]
2017-01-13 14:59:15 +03:00
Robert 4f1b94ab25 fix spelling errors
Conflicts:
	README.md
	docs/markdown/hip_faq.md

Change-Id: I8ca025e01276939ed3d7be24200ecaa8cf5e1e2c


[ROCm/clr commit: 32a35eda75]
2017-01-13 14:42:37 +05:30
Aditya Atluri a52591d117 added comparison device functions for fp16
1. Added comparison device functions
2. Added test to check correct isa getting generated

Change-Id: I16732f5a1438bdce145f7bfcecd28198e3cc4b79


[ROCm/clr commit: 89998d436f]
2017-01-12 14:52:14 -06:00
Aditya Atluri d1f7c4e048 added packed math fp16 native device functions
1. Added SDWA implementation inside IR file
2. Added device functions to header + used them in test

Change-Id: Ib4e059a58eee201cc82438689e3e9bc5f9d26653


[ROCm/clr commit: eeef055469]
2017-01-12 14:10:51 -06:00
Aditya Atluri 5317ffa9ec Started adding native half math library support
1. Removed HIP_EXPERIMENTAL env variable so that device code will be accessed from LLVM IR
2. Removed soft support from headers and moved to hip_fp16.cpp
3. Added LLVM IR + inline asm to hip_ir.ll
4. Added test for fp16
5. Added barriers for hcc 3.5 and hcc 4.0 for half support
a. Which means hcc 4.0 can parse __fp16 but hcc 3.5 can't
b. HCC 4.0 code is implemented now, hcc 3.5 will be added later

Change-Id: Ic37859b2688ebb02e168bab643d1882bf4727952


[ROCm/clr commit: c286bf6f8a]
2017-01-12 11:30:20 -06:00
Aditya Atluri 93a0f1497a changed data type used for complex
Change-Id: I0a3bb281af3d5ac1290207821c7c45aea40f513f


[ROCm/clr commit: f85d7b7d97]
2017-01-11 18:23:37 -06:00
Aditya Atluri ba96b2f6c8 changed copyright year from 2016 to 2017 in include directory
Change-Id: Ib5935a84fb51a04b3446df31cc2287101f791b83


[ROCm/clr commit: 7dbf63dde2]
2017-01-11 18:09:33 -06:00
Aditya Atluri a86633f210 changed copyright year from 2016 to 2017 in src directory
Change-Id: Idb97db509b2b4b1656b2df7a14a62ade38c9d574


[ROCm/clr commit: e9ff23e5f9]
2017-01-11 18:05:41 -06:00
Aditya Atluri e4189cab53 added test for vector data types
Change-Id: I0b6624886e474601cb1ef003c5f10adf399a21c9


[ROCm/clr commit: 1d8700096c]
2017-01-11 18:02:30 -06:00
Aditya Atluri 702036c468 fixed compilation issues with operator overloading device data types
Change-Id: I6a60282f0c04a3c0d382cdf2d67ad8d9156880ad


[ROCm/clr commit: 430e1364f5]
2017-01-11 17:53:32 -06:00
Aditya Atluri fe2d13c861 Added proper device data types
Change-Id: I42029635ff68c3c13a764a3eda6447e6c77878c6


[ROCm/clr commit: 4e57822d95]
2017-01-11 15:06:25 -06:00
Evgeny Mankov a72a2e84f2 [HIPIFY] cudaDataType_t and libraryPropertyType_t support (CUDA 8.0.44 only)
All are marked as HIP_UNSUPPORTED.
IMPORTANT:
1. libraryPropertyType_t has no cuda prefix. => TO_DO: new matcher is needed.
2. all libraries (cublas, cufft, cusolver, cusparse, nvgraph) have started to use these types (since 8.0).


[ROCm/clr commit: fd0c56a767]
2017-01-10 20:24:27 +03:00
Evgeny Mankov 87d5b745ac [HIPIFY] cudaDeviceAttr (RT API) support up to CUDA 8.0.44
Attributes, which are not yet supported by HIP, are marked as HIP_UNSUPPORTED.


[ROCm/clr commit: 81fd34f236]
2017-01-10 19:29:33 +03:00
Evgeny Mankov a194e1af1f [HIPIFY] CUdevice_attribute support up to CUDA 8.0.44
Attributes, which are not yet supported by HIP, are marked as HIP_UNSUPPORTED.


[ROCm/clr commit: c0c04f34be]
2017-01-10 17:54:22 +03:00
Ben Sander 9af6545379 Fix delete[]
[ROCm/clr commit: ff77106399]
2017-01-09 21:03:11 -06:00
Ben Sander 13bf4c39cc Add HIP_MAX_QUEUES feature.
Includes some tricky manipulation of the locks for contexts and streams.
The issue is that stealing a stream requires we lock the context to
walk the streams to find a victim.  To avoid deadlock, we can't
have a stream locked when we lock the context.  This implementation
releases the stream lock, then acquires the context and selects the
victim.
A more stable implementation might be to copy the stream list
from a context so that a lock is not required to walk all streams.
Smart shared_ptr could be used to prevent the streams from being
deallocated during the walk.


[ROCm/clr commit: b29fbf736d]
2017-01-09 21:02:56 -06:00
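The lock ordering described in the HIP_MAX_QUEUES commit above can be sketched roughly as follows. This is an illustration under assumptions, not the HIP runtime's code: `Stream`, `Context`, `steal_queue`, and `snapshot_streams` are hypothetical names, and a boolean flag stands in for ownership of a hardware queue.

```cpp
#include <memory>
#include <mutex>
#include <vector>

// Hypothetical stand-ins for the runtime's stream and context objects.
struct Stream {
    std::mutex mtx;
    bool has_queue = false;  // stand-in for owning a hardware queue
};

struct Context {
    std::mutex mtx;
    std::vector<std::shared_ptr<Stream>> streams;
};

// Steal a queue on behalf of `needy`. The caller enters holding needy's lock.
// The lock order is always context first, then stream, so the stream lock is
// released before the context lock is taken.
std::shared_ptr<Stream> steal_queue(Context& ctx, Stream& needy,
                                    std::unique_lock<std::mutex>& needy_lock) {
    needy_lock.unlock();  // never hold a stream lock while locking the context

    std::shared_ptr<Stream> victim;
    {
        std::lock_guard<std::mutex> ctx_lock(ctx.mtx);
        for (auto& s : ctx.streams) {
            if (s.get() == &needy) continue;
            std::lock_guard<std::mutex> s_lock(s->mtx);  // context -> stream
            if (s->has_queue) {
                s->has_queue = false;  // take the victim's queue
                victim = s;
                break;
            }
        }
    }  // context lock released here

    needy_lock.lock();  // reacquire the caller's stream lock
    if (victim) needy.has_queue = true;
    return victim;
}

// Alternative suggested in the commit message: copy the stream list so the
// walk itself needs no context lock. The shared_ptr copies keep every stream
// alive for the duration of the walk.
std::vector<std::shared_ptr<Stream>> snapshot_streams(Context& ctx) {
    std::lock_guard<std::mutex> ctx_lock(ctx.mtx);
    return ctx.streams;
}
```

Because the stream lock is dropped for a moment, a real implementation would need to re-check the needy stream's state after reacquiring its lock. The sketch omits that step for brevity.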
Ben Sander 82cf0397c5 First pass at virtualized queue support.
Also updated stream debug messages to consistently use trace_helper.


[ROCm/clr commit: c9f5fe34e6]
2017-01-09 21:02:53 -06:00
Ben Sander a3d325d206 Add more notes on debugging HIP apps.
[ROCm/clr commit: a6034b88e2]
2017-01-09 21:02:50 -06:00