rocm-systems

Author	SHA1	Message	Date
Ben Sander	57e1efebab	Add debug tips to docs [ROCm/clr commit: `4de3df746c`]	2017-01-23 22:34:41 -06:00
Ben Sander	948c5e013c	Add debug tips to docs [ROCm/clr commit: `fe24996326`]	2017-01-23 22:34:41 -06:00
Ben Sander	1ff12d95a6	Log error with ihipLogError. Cleans up CXL trace display. [ROCm/clr commit: `d19c4767b7`]	2017-01-23 22:34:41 -06:00
Ben Sander	6f2a8bf97b	Add HIP_IGNORE_HCC_VERSION. Ignores strict checking of HCC and HIP version. Can be useful when developing new HCC code. [ROCm/clr commit: `df74158d1c`]	2017-01-23 22:34:41 -06:00
Aditya Atluri	35631ea2a2	added ir code sad u8 Change-Id: Ie0d454b3bb9a6c9a028c091ad3aa969719b02cc9 [ROCm/clr commit: `9952117d64`]	2017-01-20 17:21:51 -06:00
Aditya Atluri	11cf5fc117	added driver_types.h and texture_types.h header files to hip Change-Id: Ic3b2403f07d6767dadf83d6c278fd14e87f6acdb [ROCm/clr commit: `97315e8748`]	2017-01-20 17:09:52 -06:00
Aditya Atluri	0e061ea69a	fixed hipArray issues 1. Fixed build issues produced from previous commit 2. Create new header files to manage data structures better Change-Id: I704d82c196c1858ed7617d76e40612eb507d2aa0 [ROCm/clr commit: `5b2d4c0e60`]	2017-01-20 16:54:48 -06:00
Aditya Atluri	5d51e1ddbd	changes device functions documentation according to the supported apis Change-Id: I47ac6bbde11d54d8265e0d27ec8cd5da4d03eb8e [ROCm/clr commit: `5f10a69ef7`]	2017-01-20 14:19:09 -06:00
Aditya Atluri	100ef6d9b2	added nvcc backend for hipArrays 1. Added hip_texture.h to hip_runtime_api.h as cuda does declare array runtime apis inside cuda_runtime_api.h 2. Added nvcc backend for hipArray runtime apis 3. Didn't test on nvidia platform (should work) Change-Id: I1a14aef41840e4f55e5535132e3443a918b55967 [ROCm/clr commit: `a7fa600176`]	2017-01-20 14:11:45 -06:00
Aditya Atluri	f6d09573aa	added more test coverage for vector data types Change-Id: I9f57a8b597bd2ee4b265eadfd0859531497a6ada [ROCm/clr commit: `fd2e6ac2f0`]	2017-01-20 13:52:02 -06:00
Aditya Atluri	b6f4fedaaf	fixed compilation issues for vector types and math functions 1. Added math_functions.h to hip_runtime.h 2. Changed operator overloading classifier static to static inline 3. Added vector types test for gpu 4. Separated __host__ and __device__ for math functions in headers Change-Id: I499862fad5d7b10da686da9011d7ecefe523f8e2 [ROCm/clr commit: `02190736e3`]	2017-01-20 09:49:11 -06:00
Ben Sander	81488d5d00	Add HIP_SYNC_HOST_ALLOC, HipReadEnv [ROCm/clr commit: `db3f4889ca`]	2017-01-19 23:55:24 -06:00
Ben Sander	7a992b9fc3	Change ihipDeviceSetState,ihipDevice* so it doesn't log error Cleans up debug trace. [ROCm/clr commit: `6de88d4293`]	2017-01-19 23:55:24 -06:00
Aditya Atluri	c50f5cbd2c	added operator overloading for complex data types Change-Id: Id96d5d000651914169f04497af6ff78ad96d846a [ROCm/clr commit: `fe5f45caaf`]	2017-01-19 15:15:25 -06:00
Ben Sander	48bd62db9a	Doc update - describe debug techniques Also tweak sample to remove unneeded HIP_KERNEL_NAME. Comment update [ROCm/clr commit: `ca1cef4e06`]	2017-01-19 12:40:45 -06:00
Ben Sander	3bc2e3ba02	Fix debug display for Module launch kernels [ROCm/clr commit: `2ffc9f4e22`]	2017-01-19 12:40:45 -06:00
Rahul Garg	707c31913d	Fixed hipcommander default execution for HCSWAP-106 Change-Id: I9fbd10dfaeeb4928b2ec23ceed131b5200a658f9 [ROCm/clr commit: `aa3f278475`]	2017-01-19 15:04:32 +05:30
Aditya Atluri	d84be1d089	moved half device function declarations to top of the file 1. Moved half device functions around so that script can catch the signatures 2. Generated docs for half precision apis Change-Id: Iee27658e3a639fdb02af135e71841dc6427f15e2 [ROCm/clr commit: `706a032a29`]	2017-01-18 15:06:18 -06:00
Aditya Atluri	e264a9740e	more clarification about using device_md_gen.py Change-Id: I3e207b65683f34d62be3454444ffb32f8814c0aa [ROCm/clr commit: `c9bc71dc86`]	2017-01-18 14:49:41 -06:00
Aditya Atluri	5ea40f27b3	Added script for generating math api docs 1. Commented out unsupported device math functions 2. Moved function signatures to the top of implementation snippets 3. Added script to generate markdown documentation for device math apis 4. Added the generated file from the script which should be present every time Change-Id: Ic579dd8b8fdffa6e1b4d4f5f3fd8a803f4dcaac7 [ROCm/clr commit: `3d4dcee35d`]	2017-01-18 14:40:50 -06:00
Aditya Atluri	69903887a4	fixed compilation issues 1. Fixed compilation issues for tests 2. Added missing intrinsics + math functions 3. Disabled some device functions as they are causing linking error with HCC Change-Id: I79d52c4c7a539cc8ef40580247ad97ffcb975f09 [ROCm/clr commit: `41a46effef`]	2017-01-18 11:53:47 -06:00
Aditya Atluri	42c627fbe8	Moved device code to mimic cuda header behavior 1. All fp32, fp64 math device/host functions should be in math_functions.h/.cpp 2. All fp32, fp64 fast math intrinsics for device/host functions should be in device_functions.h/.cpp 3. All the device code implementations should be in device_util.h/.cpp 4. Hence, made changes appropriately by moving code and creating new header files 5. Added math_functions.cpp/.h 6. Changed #ifndef signature to make sure no conflicts between headers with same names in hip/hip_runtime.h and hip/hcc_detail/hip_runtime.h 7. Changed tests to fit the code changes, making them to include appropriate headers 8. Added math_functions.cpp to CMakeLists.txt 9. Some of the tests are still broken, mostly host math functions will fix them in next commit 10. TODO: FIX compilation issues for host math functions Change-Id: I7a17637d7e294a7d224ffba932c1a08668febd26 [ROCm/clr commit: `d23b6b8694`]	2017-01-17 14:57:51 -06:00
Aditya Atluri	968f2e9489	enabled integer intrinsics tests Change-Id: I5d28d556f228240eda2fc0098121ed3b29b041e7 [ROCm/clr commit: `3f9a9d9318`]	2017-01-17 09:59:08 -06:00
Aditya Atluri	70a03445f0	added last few integer intrinsic support 1. Added usad, umulhi, urhadd 2. Corrected implementation of __hadd, __hradd 3. TODO: __sad(). It gets tricky as ISA sees them as unsigned Change-Id: Ibd2c2133b462f9393f3990355706386c79256bba [ROCm/clr commit: `9ca135ac2e`]	2017-01-17 09:27:51 -06:00
Aditya Atluri	b864271e2c	fixed broken tests and device code for integer intrinsics 1. Fixed build issues with new Integer intrinsics 2. Changed tests to work exactly as CUDA code 3. Still some integer intrinsics need to be supported Change-Id: Ie6f4171259cf4da517436895d4f6f01e01f59b11 [ROCm/clr commit: `f0ea51c786`]	2017-01-17 09:00:09 -06:00
Aditya Atluri	d259da2e42	v1: Working on Integer Intrinsics 1. Half way through 2. May not work 3. No test written Change-Id: I705b743a78b142ff068e2521870e73fca7ad2b1c [ROCm/clr commit: `feba9fe213`]	2017-01-16 14:55:29 -06:00
Aditya Atluri	65552aa2c5	moved most of the fp16 code inside hip_fp16.cpp 1. As we use holder data structure, we move all the cmp, math, cvt apis to cpp file 2. All the tests passed 3. Add more extensive testing for half Change-Id: I92c6399dace602a0a24432728e3f2a07124e6fb1 [ROCm/clr commit: `e95456eee8`]	2017-01-16 12:32:35 -06:00
Aditya Atluri	d9845446ef	Added type conversion intrinsics 1. Added all type conversion intrinsics 2. NO TESTS have been added. (Will add in next commit) 3. Sanitized code in hip_runtime.h 4. Added passed() to hipTestHalf to make it pass on HIT Change-Id: I0987963c802fc7ff4d7e07d7b88d86da35da53c9 [ROCm/clr commit: `d496576b55`]	2017-01-16 12:10:05 -06:00
Aditya Atluri	d71fc0e60a	added half2 log, log10, exp, exp10 math functions 1. Enabled tests for log, log10, exp, exp10 half2 2. h2rint is still disabled. Change-Id: I01f6002f6992259919893c524c526db5ee09473a [ROCm/clr commit: `5c5f5c1ad1`]	2017-01-13 13:26:10 -06:00
Aditya Atluri	4756cf7c16	added half2 math operations 1. They use SDWA + LLVM IR 2. Added these functions to test 3. Need to do exp, exp10, log, log10, rint Change-Id: I06176acc6cb8bb054495310531777406a41b54e4 [ROCm/clr commit: `eff68c989a`]	2017-01-13 12:27:11 -06:00
Aditya Atluri	b2973b97e2	added math functions for half 1. Added math functions for half precision 2. HRCP is not available due to device code linking errors, will be enabled once it is fixed 3. Added math functions to half test file Change-Id: Ie317ce70ef518a4fc3f27142143d01e0327f5df3 [ROCm/clr commit: `fe38e9652b`]	2017-01-13 12:05:29 -06:00
Aditya Atluri	d3fe56550e	added half2 cmp and conv, data movement device functions 1. Added half2 comparison functions 2. Added conversion and data movement half apis Change-Id: Ia33c0e957d9deb1f2b7a8fde8e22168f4d41b88b [ROCm/clr commit: `646f566bbf`]	2017-01-13 10:56:07 -06:00
Evgeny Mankov	740888f088	[HIPIFY] Formatting, no functional changes. [ROCm/clr commit: `5200073b4c`]	2017-01-13 14:59:15 +03:00
Robert	4f1b94ab25	fix spelling errors Conflicts: README.md docs/markdown/hip_faq.md Change-Id: I8ca025e01276939ed3d7be24200ecaa8cf5e1e2c [ROCm/clr commit: `32a35eda75`]	2017-01-13 14:42:37 +05:30
Aditya Atluri	a52591d117	added comparison device functions for fp16 1. Added comparison device functions 2. Added test to check correct isa getting generated Change-Id: I16732f5a1438bdce145f7bfcecd28198e3cc4b79 [ROCm/clr commit: `89998d436f`]	2017-01-12 14:52:14 -06:00
Aditya Atluri	d1f7c4e048	added packed math fp16 native device functions 1. Added SDWA implementation inside IR file 2. Added device functions to header + used them in test Change-Id: Ib4e059a58eee201cc82438689e3e9bc5f9d26653 [ROCm/clr commit: `eeef055469`]	2017-01-12 14:10:51 -06:00
Aditya Atluri	5317ffa9ec	Started adding native half math library support 1. Removed HIP_EXPERIMENTAL env variable so that device code will be accessed from LLVM IR 2. Removed soft support from headers and moved to hip_fp16.cpp 3. Added LLVM IR + inline asm to hip_ir.ll 4. Added test for fp16 5. Added barriers for hcc 3.5 and hcc 4.0 for half support a. Which means, hcc 4.0 can parse __fp16 but hcc 3.5 cant b. HCC 4.0 code is implemented now, hcc 3.5 will be added later Change-Id: Ic37859b2688ebb02e168bab643d1882bf4727952 [ROCm/clr commit: `c286bf6f8a`]	2017-01-12 11:30:20 -06:00
Aditya Atluri	93a0f1497a	changed data type used for complex Change-Id: I0a3bb281af3d5ac1290207821c7c45aea40f513f [ROCm/clr commit: `f85d7b7d97`]	2017-01-11 18:23:37 -06:00
Aditya Atluri	ba96b2f6c8	changed copyright year from 2016 to 2017 in include directory Change-Id: Ib5935a84fb51a04b3446df31cc2287101f791b83 [ROCm/clr commit: `7dbf63dde2`]	2017-01-11 18:09:33 -06:00
Aditya Atluri	a86633f210	changed copyright year from 2016 to 2017 in src directory Change-Id: Idb97db509b2b4b1656b2df7a14a62ade38c9d574 [ROCm/clr commit: `e9ff23e5f9`]	2017-01-11 18:05:41 -06:00
Aditya Atluri	e4189cab53	added test for vector data types Change-Id: I0b6624886e474601cb1ef003c5f10adf399a21c9 [ROCm/clr commit: `1d8700096c`]	2017-01-11 18:02:30 -06:00
Aditya Atluri	702036c468	fixed compilation issues with operator overloading device data types Change-Id: I6a60282f0c04a3c0d382cdf2d67ad8d9156880ad [ROCm/clr commit: `430e1364f5`]	2017-01-11 17:53:32 -06:00
Aditya Atluri	fe2d13c861	Added proper device data types Change-Id: I42029635ff68c3c13a764a3eda6447e6c77878c6 [ROCm/clr commit: `4e57822d95`]	2017-01-11 15:06:25 -06:00
Evgeny Mankov	a72a2e84f2	[HIPIFY] cudaDataType_t and libraryPropertyType_t support (CUDA 8.0.44 only) All are marked as HIP_UNSUPPORTED. IMPORTANT: 1. libraryPropertyType_t has no cuda prefix. => TO_DO: new matcher is needed. 2. all libraries (cublas, cufft, cusolver, cusparse, nvgraph) have started to use these types (since 8.0). [ROCm/clr commit: `fd0c56a767`]	2017-01-10 20:24:27 +03:00
Evgeny Mankov	87d5b745ac	[HIPIFY] cudaDeviceAttr (RT API) support up to CUDA 8.0.44 Attributes, which are not yet supported by HIP, are marked as HIP_UNSUPPORTED. [ROCm/clr commit: `81fd34f236`]	2017-01-10 19:29:33 +03:00
Evgeny Mankov	a194e1af1f	[HIPIFY] CUdevice_attribute support up to CUDA 8.0.44 Attributes, which are not yet supported by HIP, are marked as HIP_UNSUPPORTED. [ROCm/clr commit: `c0c04f34be`]	2017-01-10 17:54:22 +03:00
Ben Sander	9af6545379	Fix delete[] [ROCm/clr commit: `ff77106399`]	2017-01-09 21:03:11 -06:00
Ben Sander	13bf4c39cc	Add HIP_MAX_QUEUES feature. Includes some tricky manipulation of the locks for contexts and streams. The issue is that stealing a stream requires we lock the context to walk the streams to find a victim. To avoid deadlock, we can't have a stream locked when we lock the context. This implementation releases the stream lock, then acquires the context and selects the victim. A more stable implementation might be to copy the stream list from a context so that a lock is not required to walk all streams. Smart shared_ptr could be used to prevent the streams from being deallocated during the walk. [ROCm/clr commit: `b29fbf736d`]	2017-01-09 21:02:56 -06:00
Ben Sander	82cf0397c5	First pass at virtualized queue support. Also updated stream debug messages to consistently use trace_helper. [ROCm/clr commit: `c9f5fe34e6`]	2017-01-09 21:02:53 -06:00
Ben Sander	a3d325d206	Add more notes on debugging HIP apps. [ROCm/clr commit: `a6034b88e2`]	2017-01-09 21:02:50 -06:00

1330 commits