6428 Commits

Author SHA1 Message Date
Aditya Atluri b37422d30b fixed arch compiler flag
Change-Id: I19f7a0ea513d6b8059f0c10cd0e7a5ead510e246
2017-01-27 08:51:48 -06:00
Aditya Atluri f7ff199daa fixed threadfence ir
Change-Id: Ia3afb54bdb50864e678d849608d72a3c321edba1
2017-01-27 08:42:26 -06:00
Aditya Atluri 6ba848a40f fixed kernel only compilation for module api
Change-Id: I567992fa9b87125318edba41fd82d2f7bc5504a1
2017-01-27 08:41:42 -06:00
Rahul Garg 3e21d55c98 hipMallocArray fixes for NV path
Change-Id: I1ca43e6bc0cd405998888005c20dfb1ea57003d5
2017-01-27 14:32:08 +05:30
pensun 2e1a66103a more fix on hipmallocarray on NV path
Change-Id: I890a36cab10c101f4a112bc4567f765b318d486c
2017-01-26 12:30:52 -06:00
pensun 8404215658 fix missing semicolon on NV path
Change-Id: I3cfecb7bd534578a1f5a07ca9397092dcf01db07
2017-01-26 11:28:15 -06:00
Ben Sander 0409bf639c Add HIP_FAIL_SOC.
Fail sub-optimal-copies rather than perform them slowly.
SOCs occur on async copy of unpinned memory, or P2P copy between GPUs
that are not peers.
2017-01-25 21:53:17 -06:00
Ben Sander 1635b8f43f Read HCC_OPT_FLUSH and optimize dispatch accordingly.
If HCC is in this mode, we can use less aggressive flushes in some
cases.
2017-01-25 21:50:52 -06:00
Aditya Atluri d75d0596bf added device functions header to hip_runtime.h
Change-Id: I635931e1fbe4b7f0f64f3a126c0f1badcd6d234b
2017-01-25 08:14:30 -06:00
pensun 95677edabb Add more hip_bug.md entry, regarding hang after hipLaunchKernel
Change-Id: I5800cb627179ec0e913cd36d332fb8c2994ab71e
2017-01-24 22:43:25 -06:00
pensun f3da91de4e Initial commit on hip_bugs markdown doc
Change-Id: I5a6915337b8664cfed9eaee9443c6e4406348574
2017-01-24 22:30:36 -06:00
Ben Sander 9dff0782a4 Add debug tip to docs 2017-01-23 22:34:41 -06:00
Ben Sander 813c189b33 Show dynamic shared mem usage not static. 2017-01-23 22:34:41 -06:00
Ben Sander 0dabdeb01f Move core env var processing to env.cpp 2017-01-23 22:34:41 -06:00
Ben Sander 96eac67929 Add debug tips to docs 2017-01-23 22:34:41 -06:00
Ben Sander 85d41dcd05 Add debug tips to docs 2017-01-23 22:34:41 -06:00
Ben Sander 4586091dfe Log error with ihipLogError. Cleans up CXL trace display. 2017-01-23 22:34:41 -06:00
Ben Sander 138cf36547 Add HIP_IGNORE_HCC_VERSION.
Ignores strict checking of HCC and HIP version.
Can be useful when developing new HCC code.
2017-01-23 22:34:41 -06:00
Aditya Atluri 4e3afa6514 added ir code sad u8
Change-Id: Ie0d454b3bb9a6c9a028c091ad3aa969719b02cc9
2017-01-20 17:21:51 -06:00
Aditya Atluri 8ddec0426b added driver_types.h and texture_types.h header files to hip
Change-Id: Ic3b2403f07d6767dadf83d6c278fd14e87f6acdb
2017-01-20 17:09:52 -06:00
Aditya Atluri 22acd654cf fixed hipArray issues
1. Fixed build issues produced from previous commit
2. Create new header files to manage data structures better

Change-Id: I704d82c196c1858ed7617d76e40612eb507d2aa0
2017-01-20 16:54:48 -06:00
Aditya Atluri 7765469987 changes device functions documentation according to the supported apis
Change-Id: I47ac6bbde11d54d8265e0d27ec8cd5da4d03eb8e
2017-01-20 14:19:09 -06:00
Aditya Atluri b1eca6c855 added nvcc backend for hipArrays
1. Added hip_texture.h to hip_runtime_api.h as cuda does declare array runtime apis inside cuda_runtime_api.h
2. Added nvcc backend for hipArray runtime apis
3. Didn't test on nvidia platform (should work)

Change-Id: I1a14aef41840e4f55e5535132e3443a918b55967
2017-01-20 14:11:45 -06:00
Aditya Atluri b134a1a640 added more test coverage for vector data types
Change-Id: I9f57a8b597bd2ee4b265eadfd0859531497a6ada
2017-01-20 13:52:02 -06:00
Aditya Atluri f537d96633 fixed compilation issues for vector types and math functions
1. Added math_functions.h to hip_runtime.h
2. Changed operator overloading classifier static to static inline
3. Added vector types test for gpu
4. Separated __host__ and __device__ for math functions in headers

Change-Id: I499862fad5d7b10da686da9011d7ecefe523f8e2
2017-01-20 09:49:11 -06:00
Ben Sander 927ac3d81c Add HIP_SYNC_HOST_ALLOC, HipReadEnv 2017-01-19 23:55:24 -06:00
Ben Sander 8209320ef0 Change ihipDeviceSetState,ihipDevice* so it doesn't log error
Cleans up debug trace.
2017-01-19 23:55:24 -06:00
Aditya Atluri 6ca2b289a2 added operator overloading for complex data types
Change-Id: Id96d5d000651914169f04497af6ff78ad96d846a
2017-01-19 15:15:25 -06:00
Ben Sander 1f5d16afe7 Doc update - describe debug techniques
Also tweak sample to remove unneeded HIP_KERNEL_NAME.
Comment update
2017-01-19 12:40:45 -06:00
Ben Sander 1c73e44ebe Fix debug display for Module launch kernels 2017-01-19 12:40:45 -06:00
Rahul Garg cc0d2a6753 Fixed hipcommander default execution for HCSWAP-106
Change-Id: I9fbd10dfaeeb4928b2ec23ceed131b5200a658f9
2017-01-19 15:04:32 +05:30
Aditya Atluri 56d73aaee7 moved half device function declarations to top of the file
1. Moved half device functions around so that script can catch the signatures
2. Generated docs for half precision apis

Change-Id: Iee27658e3a639fdb02af135e71841dc6427f15e2
2017-01-18 15:06:18 -06:00
Aditya Atluri db99ac798b more clarification about using device_md_gen.py
Change-Id: I3e207b65683f34d62be3454444ffb32f8814c0aa
2017-01-18 14:49:41 -06:00
Aditya Atluri 91ae5d6bd7 Added script for generating math api docs
1. Commented out unsupported device math functions
2. Moved function signatures to the top of implementation snippets
3. Added script to generate markdown documentation for device math apis
4. Added the generated file from the script which should be present every time

Change-Id: Ic579dd8b8fdffa6e1b4d4f5f3fd8a803f4dcaac7
2017-01-18 14:40:50 -06:00
Aditya Atluri ea382e15f8 fixed compilation issues
1. Fixed compilation issues for tests
2. Added missing intrinsics + math functions
3. Disabled some device functions as they are causing linking error with HCC

Change-Id: I79d52c4c7a539cc8ef40580247ad97ffcb975f09
2017-01-18 11:53:47 -06:00
Aditya Atluri b723169ee9 Moved device code to mimic cuda header behavior
1. All fp32, fp64 math device/host functions should be in math_functions.h/.cpp
2. All fp32, fp64 fast math intrinsics for device/host functions should be in device_functions.h/.cpp
3. All the device code implementations should be in device_util.h/.cpp
4. Hence, made changes appropriately by moving code and creating new header files
5. Added math_functions.cpp/.h
6. Changed #ifndef signature to make sure no conflicts between headers with same names in hip/hip_runtime.h and hip/hcc_detail/hip_runtime.h
7. Changed tests to fit the code changes, making them to include appropriate headers
8. Added math_functions.cpp to CMakeLists.txt
9. Some of the tests are still broken, mostly host math functions; will fix them in next commit
10. TODO: FIX compilation issues for host math functions

Change-Id: I7a17637d7e294a7d224ffba932c1a08668febd26
2017-01-17 14:57:51 -06:00
Aditya Atluri 13ce9ece77 enabled integer intrinsics tests
Change-Id: I5d28d556f228240eda2fc0098121ed3b29b041e7
2017-01-17 09:59:08 -06:00
Aditya Atluri 02c7f3a70f added last few integer intrinsic support
1. Added usad, umulhi, urhadd
2. Corrected implementation of __hadd, __hradd
3. TODO: __sad(). It gets tricky as ISA sees them as unsigned

Change-Id: Ibd2c2133b462f9393f3990355706386c79256bba
2017-01-17 09:27:51 -06:00
Aditya Atluri c0fd0921cb fixed broken tests and device code for integer intrinsics
1. Fixed build issues with new Integer intrinsics
2. Changed tests to work exactly as CUDA code
3. Still some integer intrinsics need to be supported

Change-Id: Ie6f4171259cf4da517436895d4f6f01e01f59b11
2017-01-17 09:00:09 -06:00
Aditya Atluri b09ad764a1 v1: Working on Integer Intrinsics
1. Half way through
2. May not work
3. No test written

Change-Id: I705b743a78b142ff068e2521870e73fca7ad2b1c
2017-01-16 14:55:29 -06:00
Aditya Atluri 18631efbc0 moved most of the fp16 code inside hip_fp16.cpp
1. As we use holder data structure, we move all the cmp, math, cvt apis to cpp file
2. All the tests passed
3. Add more extensive testing for half

Change-Id: I92c6399dace602a0a24432728e3f2a07124e6fb1
2017-01-16 12:32:35 -06:00
Aditya Atluri 6f2cfddc67 Added type conversion intrinsics
1. Added all type conversion intrinsics
2. NO TESTS have been added. (Will add in next commit)
3. Sanitized code in hip_runtime.h
4. Added passed() to hipTestHalf to make it pass on HIT

Change-Id: I0987963c802fc7ff4d7e07d7b88d86da35da53c9
2017-01-16 12:10:05 -06:00
Aditya Atluri bf45105c7c added half2 log, log10, exp, exp10 math functions
1. Enabled tests for log, log10, exp, exp10 half2
2. h2rint is still disabled.

Change-Id: I01f6002f6992259919893c524c526db5ee09473a
2017-01-13 13:26:10 -06:00
Aditya Atluri 0e576295b4 added half2 math operations
1. They use SDWA + LLVM IR
2. Added these functions to test
3. Need to do exp, exp10, log, log10, rint

Change-Id: I06176acc6cb8bb054495310531777406a41b54e4
2017-01-13 12:27:11 -06:00
Aditya Atluri 8c978c210c added math functions for half
1. Added math functions for half precision
2. HRCP is not available due to device code linking errors, will be enabled once it is fixed
3. Added math functions to half test file

Change-Id: Ie317ce70ef518a4fc3f27142143d01e0327f5df3
2017-01-13 12:05:29 -06:00
Aditya Atluri 3f52f76194 added half2 cmp and conv, data movement device functions
1. Added half2 comparison functions
2. Added conversion and data movement half apis

Change-Id: Ia33c0e957d9deb1f2b7a8fde8e22168f4d41b88b
2017-01-13 10:56:07 -06:00
Evgeny Mankov b7992fa252 [HIPIFY] Formatting, no functional changes. 2017-01-13 14:59:15 +03:00
Robert 65ad9d80d7 fix spelling errors
Conflicts:
	README.md
	docs/markdown/hip_faq.md

Change-Id: I8ca025e01276939ed3d7be24200ecaa8cf5e1e2c
2017-01-13 14:42:37 +05:30
Maneesh Gupta 53016c3e86 Merge branch 'amd-develop' into amd-master
Change-Id: I1ec6b0bacf555d11adc5f67a55086dc6bd648efb
2017-01-13 09:12:39 +05:30
Aditya Atluri 2dcd7600dc added comparison device functions for fp16
1. Added comparison device functions
2. Added test to check that the correct ISA is generated

Change-Id: I16732f5a1438bdce145f7bfcecd28198e3cc4b79
2017-01-12 14:52:14 -06:00