rocm-systems

Penulis	SHA1	Pesan	Tanggal
Rahul Garg	bddd6b73c0	Context management related changes in HIP. - -Contexts across threads are listed under device -Device reset cleans up all contexts and re-initializes _primaryCtx Change-Id: Ie1cfbb26d43a8dc6869be3e6ebaf7344ce374643	2017-02-27 15:24:17 +05:30
Aditya Atluri	2e245ae58c	Added initial support for hipMemcpyFromSymbol. But not working! Change-Id: I48d8c7de4ec9f85c6c942be995fb488a3931f5d7	2017-02-23 11:29:06 -06:00
Aditya Atluri	639fd4dd5e	added runtime api hipMemcpyFromSymbolAsync Change-Id: Ibaf925faf0ba464dd0ed6c5ea74c224c2ce38889	2017-02-22 19:16:35 -06:00
Aditya Atluri	d52c5867f2	Enable symbol tests Change-Id: I6bd036bf00c8051c8ff728ee60562c4ebd222160	2017-02-22 13:42:03 -06:00
Aditya Atluri	d03fe5a40d	v3: added free for ihipModuleSymbol_t structures inside tracker Change-Id: Ib8041a05312c08cbdf2d4fee5e7cbae17df6efff	2017-02-10 13:42:10 -06:00
Aditya Atluri	378eb3fa55	v2: Fixed hipModule memory management 1. Changed test to assert for same hipFunction values 2. Added better memory management for hipModule Change-Id: I10d7aef13c215a2211e262f3c79017f26a17d9a7	2017-02-10 13:32:13 -06:00
Aditya Atluri	6fd3daed30	fixed hipFunction memory management Change-Id: I7ebb323419bcd220ebd6466a8eb38e7bfdb1520a	2017-02-09 17:22:55 -06:00
Aditya Atluri	01b66dd998	Fixed Hawaii link issues 1. Split hip_ir.ll to hip_hc.ll and hip_hc_gfx803.ll a. hip_hc.ll contains arch generic ir implementations b. hip_hc_gfx803.ll contains gfx803 (fiji, polaris) specific ir 2. HIPCC can now parse --amdgpu-target=*. a. Usage: hipcc --amdgpu-target=gfx803 --amdgpu-target=gfx701 b. TODO: Convert to --amdgpu-target=gfx803,gfx701 3. With LLC in HCC able to generate native f16 isa, removed inline half asm math ops 4. Fixed threadfence and threadfence_block to use functions in rocdl Change-Id: Ic9a9e3e04139b0d75d2c2a263c030ca77adc1019	2017-02-08 12:04:05 -06:00
Aditya Atluri	5e3d63c0a3	changed __global__ attribute 1. Moved around tests and added them to HIT Change-Id: I5d75280c42a5af852670ebabc7305ee56721ec7b	2017-02-03 10:53:36 -06:00
Aditya Atluri	2790e9a448	fixed symbol memcpy issue Change-Id: I89d7401be51d194bcbf771020ba66e3d3b6a18f8	2017-02-01 17:54:59 -06:00
Aditya Atluri	f7ff199daa	fixed threadfence ir Change-Id: Ia3afb54bdb50864e678d849608d72a3c321edba1	2017-01-27 08:42:26 -06:00
Ben Sander	0409bf639c	Add HIP_FAIL_SOC. Fail sub-optimal-copies rather than perform them slowly. SOC occur on async copy of unpinned memory, or P2P copy between GPUs that are not peers.	2017-01-25 21:53:17 -06:00
Ben Sander	1635b8f43f	Read HCC_OPT_FLUSH and optimize dispatch accordingly. If HCC is in this mode, we can use less aggressive flushes in some cases.	2017-01-25 21:50:52 -06:00
Ben Sander	813c189b33	Show dynamic shared mem usage not static.	2017-01-23 22:34:41 -06:00
Ben Sander	0dabdeb01f	Move core env var processing to env.cpp	2017-01-23 22:34:41 -06:00
Ben Sander	96eac67929	Add debug tips to docs	2017-01-23 22:34:41 -06:00
Ben Sander	4586091dfe	Log error with ihipLogError. Cleans up CXL trace display.	2017-01-23 22:34:41 -06:00
Aditya Atluri	4e3afa6514	added ir code sad u8 Change-Id: Ie0d454b3bb9a6c9a028c091ad3aa969719b02cc9	2017-01-20 17:21:51 -06:00
Aditya Atluri	f537d96633	fixed compilation issues for vector types and math functions 1. Added math_functions.h to hip_runtime.h 2. Changed operator overloading classifier static to static inline 3. Added vector types test for gpu 4. Seperated __host__ and __device__ for math functions in headers Change-Id: I499862fad5d7b10da686da9011d7ecefe523f8e2	2017-01-20 09:49:11 -06:00
Ben Sander	927ac3d81c	Add HIP_SYNC_HOST_ALLOC, HipReadEnv	2017-01-19 23:55:24 -06:00
Ben Sander	8209320ef0	Change ihipDeviceSetState,ihipDevice* so it doesn't log error Cleans up debug trace.	2017-01-19 23:55:24 -06:00
Ben Sander	1f5d16afe7	Doc update - describe debug techniques Also tweak sample to remove unneeded HIP_KERNEL_NAME. Comment update	2017-01-19 12:40:45 -06:00
Ben Sander	1c73e44ebe	Fix debug display for Module launch kernels	2017-01-19 12:40:45 -06:00
Aditya Atluri	ea382e15f8	fixed compilation issues 1. Fixed compilation issues for tests 2. Added missing intrinsics + math functions 3. Disabled some device functions as they are causing linking error with HCC Change-Id: I79d52c4c7a539cc8ef40580247ad97ffcb975f09	2017-01-18 11:53:47 -06:00
Aditya Atluri	b723169ee9	Moved device code to mimic cuda header behavior 1. All fp32, fp64 math device/host functions should be in math_functions.h/.cpp 2. All fp32, fp64 fast math intrinsics for device/host functions should be in device_functions.h/.cpp 3. All the device code implementations should be in device_util.h/.cpp 4. Hence, made changes appropriately by moving code and creating new header files 5. Added math_functions.cpp/.h 6. Changed #ifndef signature to make sure no conflicts between headers with same names in hip/hip_runtime.h and hip/hcc_detail/hip_runtime.h 7. Changed tests to fit the code changes, making them to include appropriate headers 8. Added math_functions.cpp to CMakeLists.txt 9. Some of the tests are still broken, mostly host math functions will fix them in next commit 10. TODO: FIX compilation issues for host math functions Change-Id: I7a17637d7e294a7d224ffba932c1a08668febd26	2017-01-17 14:57:51 -06:00
Aditya Atluri	13ce9ece77	enabled integer intrinsics tests Change-Id: I5d28d556f228240eda2fc0098121ed3b29b041e7	2017-01-17 09:59:08 -06:00
Aditya Atluri	b09ad764a1	v1: Working on Integer Intrinsics 1. Half way through 2. May not work 3. No test written Change-Id: I705b743a78b142ff068e2521870e73fca7ad2b1c	2017-01-16 14:55:29 -06:00
Aditya Atluri	18631efbc0	moved most of the fp16 code inside hip_fp16.cpp 1. As we use holder data structure, we move all the cmp, math, cvt apis to cpp file 2. All the tests passed 3. Add more extensive testing for half Change-Id: I92c6399dace602a0a24432728e3f2a07124e6fb1	2017-01-16 12:32:35 -06:00
Aditya Atluri	6f2cfddc67	Added type conversion intrinsics 1. Added all type conversion intrinsics 2. NO TESTS have been added. (Will add in next commit) 3. Sanatized code in hip_runtime.h 4. Added passed() to hipTestHalf to make it pass on HIT Change-Id: I0987963c802fc7ff4d7e07d7b88d86da35da53c9	2017-01-16 12:10:05 -06:00
Aditya Atluri	0e576295b4	added half2 math operations 1. They use SDWA + LLVM IR 2. Added these functions to test 3. Need to do exp, exp10, log, log10, rint Change-Id: I06176acc6cb8bb054495310531777406a41b54e4	2017-01-13 12:27:11 -06:00
Aditya Atluri	5ef8ef3bd7	added packed math fp16 native device functions 1. Added SDWA implementation inside IR file 2. Added device functions to header + used them in test Change-Id: Ib4e059a58eee201cc82438689e3e9bc5f9d26653	2017-01-12 14:10:51 -06:00
Aditya Atluri	d180fdaae0	Started adding native half math library support 1. Removed HIP_EXPERIMENTAL env variable so that device code will be accessed from LLVM IR 2. Removed soft support from headers and moved to hip_fp16.cpp 3. Added LLVM IR + inline asm to hip_ir.ll 4. Added test for fp16 5. Added barriers for hcc 3.5 and hcc 4.0 for half support a. Which means, hcc 4.0 can parse __fp16 but hcc 3.5 cant b. HCC 4.0 code is implemented now, hcc 3.5 will be added later Change-Id: Ic37859b2688ebb02e168bab643d1882bf4727952	2017-01-12 11:30:20 -06:00
Aditya Atluri	73fcce26f9	changed copyright year from 2016 to 2017 in src directory Change-Id: Idb97db509b2b4b1656b2df7a14a62ade38c9d574	2017-01-11 18:05:41 -06:00
Aditya Atluri	39910029a6	Added proper device data types Change-Id: I42029635ff68c3c13a764a3eda6447e6c77878c6	2017-01-11 15:06:25 -06:00
Ben Sander	a3e0012567	Add HIP_MAX_QUEUES feature. Includes some tricky manipulation of the locks for contexts and streams. issue is that stealing a stream requires we lock the context to walk the streams to find a victim. To avoid deadlock, we can't have a stream locked when we lock the context. This implementation releases the stream lock, then acquires the context and selects the victim. A more stable implemenation might be to copy the stream list from a context so that a lock is not required to walk all streams. Smart shared_ptr could be used to prevent the streams from being deallocated during the walk.	2017-01-09 21:02:56 -06:00
Ben Sander	93fbc9cf7b	First pass at virtualized queue support. Also updated stream debug messages to consistently use trace_helper.	2017-01-09 21:02:53 -06:00
Ben Sander	3a42a7642a	tolerate spaces in hip args	2017-01-09 20:57:13 -06:00
Rahul Garg	5fb09879c7	Added state for hipDevice. Change-Id: Idbc3c04cd054a01b634856a1e0a23ff172e991aa	2017-01-09 23:54:01 +05:30
Ben Sander	c325c988b1	Support size_t in memset kernel. Add disable for HSA_AMD_AGENT_INFO_MAX_WAVES_PER_CU Remove one copy of completion_future in memset.	2016-12-22 12:25:09 -06:00
Ben Sander	37d8cafb12	Increment API sequence number. Change name to tls_tidInfo	2016-12-21 15:30:36 -06:00
Rahul Garg	fbf7ed63a8	Fix for HCSWAP-67 Change-Id: I0b2ce5ab933237947fb41d89769db3da16e5be6a Conflicts: src/hip_hcc.cpp	2016-12-19 16:19:51 +05:30
Ben Sander	90c69e14bb	Add name for function	2016-12-17 08:54:09 -06:00
Ben Sander	8bf4bd2f7d	Remove HSA dependency from hipFunction_t Place _groupSegmentSize and _privateSegmentSize inside Function, remove hsa_executable_symbol_t.	2016-12-17 07:22:56 -06:00
Ben Sander	6ed7e1c1c1	Remove USE_DISPATCH_HSA_KERNEL=0 path.	2016-12-17 07:22:56 -06:00
Ben Sander	4d29885be3	Refactor Module and Function APIs. - hipFunction_t is now returned by value. This eliminates dynamic allocation / memory management complexity in the module. Removed the kernel name so the structure is just 16 bytes now. - Moved the hsa_executable_load_module and hsa_executable_freeze calls to the hipModuleLoad and hipModuleLoadData calls. - Apply sharedMemBytes in hipModuleLaunchKernel to group segment size (not private).	2016-12-17 07:22:33 -06:00
Rahul Garg	263a9614ff	Mapped hipDevice_t to int Change-Id: I6cfa56c42b7cd04aa0e0bce510c0d72d34ea211a	2016-12-17 16:53:03 +05:30
Aditya Atluri	2665ad2762	disabled half native support as inline asm is not working Change-Id: I3073d8ae39eed321987f0f2f0e689eec4cdbb48c	2016-12-16 09:24:59 -06:00
Ben Sander	8ed38bae69	fix copyright	2016-12-15 14:42:52 -06:00
Aditya Atluri	68c57c38ff	fixed compilation issues Change-Id: I96692538736e2e4f2da9dba9c8c29a164aec4c0d	2016-12-14 16:50:16 -06:00
Aditya Atluri	d2daf6ad75	added half2 support Change-Id: I0f3b9b7037fed97e80ec99f5369c75a63f001aae	2016-12-14 14:18:48 -06:00

1 2 3 4 5 ...

474 Melakukan