Commit graph

2959 Commits

Author SHA1 Message Date
Amber Lin b6f65f9849 Add CPU cache information
Fill up cache properties of CPU node by reading data from /proc/cpuinfo
and /sys/devices/system/cpu/cpuX/cache/indexY



Change-Id: I0a96760575e504e38962554f192c3fe66bea3c15
2015-11-09 07:16:24 -05:00
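
As a rough illustration of the sysfs side of this change, the sketch below walks a `cache/indexY` directory tree and collects a few attributes per cache. It is written in Python for brevity, not the thunk's C. The function name `read_cpu_cache` is hypothetical. The attribute files `level`, `type` and `size` are standard Linux sysfs cache attributes, but the commit does not say which ones the thunk actually reads, so that choice is an assumption.

```python
from pathlib import Path


def read_cpu_cache(cpu_dir):
    """Collect cache descriptors from <cpu_dir>/cache/index*/ (sysfs layout).

    cpu_dir would be something like /sys/devices/system/cpu/cpu0 on Linux.
    Returns one dict per index directory; missing attributes are skipped.
    """
    caches = []
    for index_dir in sorted(Path(cpu_dir, "cache").glob("index*")):
        entry = {}
        # Assumption: these three attributes are enough for illustration.
        for name in ("level", "type", "size"):
            attr = index_dir / name
            if attr.is_file():
                entry[name] = attr.read_text().strip()
        caches.append(entry)
    return caches
```

Taking the CPU directory as a parameter keeps the helper usable against a fake tree in tests as well as against the real `/sys` hierarchy.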
Ramesh Errabolu (xN/A) TX 2f0425d354 Update Binarysearch and BlackScholes Hsa Sample to support FULL and BASE Profiles
[git-p4: depot-paths = "//depot/stg/hsa/drivers/hsa/runtime/": change = 1206169]
2015-10-30 17:53:25 -05:00
Kent Russell cb3a664065 Add option to create release build for Thunk
By adding REL=1 to the make command line (e.g. make REL=1 deb), we can
create a release build of the Thunk. This will not affect existing
functionality, and will only have an effect if REL=1 is specified on the
command line, or in the build_thunk.sh script.

Change-Id: Iedc3b6094e70a4ebd726499eda56013cc254b83d
2015-10-30 14:05:40 -04:00
Kent Russell cabbcbabff Cleanup RPM build of thunk
Change-Id: Ib437a3ec7be9f5aa7d3ef9e53c13e3c5e7b7382e
2015-10-30 08:42:16 -04:00
Felix Kuehling bd93eecc64 Use correct aperture for _fmm_unmap_from_gpu_scratch
Passing in the wrong aperture resulted in failure to unmap scratch.


Change-Id: Icd7423abfb1bcc773b33becffcbefc233f4ff340
2015-10-29 18:26:15 -04:00
Philip Cox 0c234c7ef3 Add SDMA IOCTL type to Create Queue function.
Change-Id: I7e31507b761ca388b2cac93f994f6106de962f17
2015-10-29 10:25:41 -04:00
Kent Russell 6ceed7def3 libhsakmt - Add make option to package thunk as RPM
Add an option to libhsakmt to allow the thunk to be packaged as an RPM.
The default build remains as-is, but the thunk can now be packaged
as an RPM by using "make src rpm". build_thunk.sh will be modified to
reflect this new option.

Change-Id: I38e03d10cfb5035bdf0a87635a784c47a709a5b6
2015-10-29 07:49:13 -04:00
Harish Kasiviswanathan f885e551aa Remove erroneous and redundant memory banks reported
hsaKmtGetNodeMemoryProperties -
	- Return only HSA_HEAPTYPE_SYSTEM memory for CPU only node.
	- For dGPU remove redundant HSA_HEAPTYPE_FRAME_BUFFER_PRIVATE
	  entry.

Change-Id: I0349be39b8409a0fd64a038b8b2956191356d937
2015-10-23 18:43:46 -04:00
Harish Kasiviswanathan 69662da3dc Correct parameter name for topology_is_dgpu()
The function expects device_id and not gpu_id.

Change-Id: I79794fd4e58e6e6adb26659da30f3e4d8e108434
2015-10-23 18:43:45 -04:00
Harish Kasiviswanathan cb53548c89 Unify fmm_get_aperture_xxx functions
Unify fmm_get_aperture_base and fmm_get_aperture_limit into one
function. Change the return type to HSAKMT_STATUS.

Change-Id: I0b3f563ffb268947ab891f4935f61788d0af0e01
2015-10-23 18:43:34 -04:00
Felix Kuehling 5131ab4e64 Implement flat scratch support for dGPU
hsaKmtAllocMemory only allocates aligned address space and sets up
the scratch_physical aperture to match the allocated address space.

Actual allocation of backing memory happens in hsaKmtMapMemoryToGPU.

Change-Id: Ie709815ab9bedb3d682e096b4005fdfb5e94d3a7
2015-10-22 20:40:22 -04:00
Felix Kuehling 149261ba09 Allow address space allocations with specific alignment
Change-Id: I4bf7f7ac53c3921dd330b9dc7a40582611f88b69
2015-10-22 20:27:49 -04:00
Ben Goz 55b1a5dc43 Casting local memory size to uint64_t
Change-Id: I5c2010056b84ac01bb65361210d2a693e437050a
Signed-off-by: Ben Goz <ben.goz@amd.com>
2015-10-22 09:05:34 -04:00
Ben Goz e61500c46e Adding support for new AQL Queue Memory allocation
Change-Id: If84fc4b961627dbdd0b77b1c509a3c9a4c709b9f
Signed-off-by: Ben Goz <ben.goz@amd.com>
2015-10-22 13:13:54 +03:00
Felix Kuehling 590c8e522c Fix node 0 system memory allocation for dGPU
This is a hack to allow the Runtime to allocate system memory with
PreferredNode=0 on a dGPU system. We allocated it from Node 1
instead so that the node 1 GPU can map the memory. A proper fix
will be implemented together with multi-GPU support.

Change-Id: Ieb52599e5275781c04ee34405ea850bf782c523a
2015-10-21 20:00:01 -04:00
Felix Kuehling 39bde26c9b Reserve more SVM process address space
Try to reserve as much SVM address space as GPUVM can address.
Implement a fallback scheme to smaller sizes if larger allocations
fail or are not addressable by the GPU, down to an (arbitrary)
minimum of 4GB.

Change-Id: I770177834cc9e6ddd6ef4f20d789eab63c8055cb
2015-10-19 17:44:23 -04:00
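
The fallback scheme described in this commit can be sketched roughly as follows. This is Python rather than the thunk's C, and it makes some assumptions: the callback `try_reserve` is a hypothetical stand-in for the actual reservation attempt, and halving the size on each failure is an assumed step policy, since the commit message only says that smaller sizes are tried down to a 4GB minimum.

```python
GIB = 1 << 30
MIN_SVM_SIZE = 4 * GIB  # the (arbitrary) minimum named in the commit message


def reserve_with_fallback(try_reserve, max_size, min_size=MIN_SVM_SIZE):
    """Return the first size accepted by try_reserve, or None.

    try_reserve(size) -> bool is a hypothetical callback that attempts
    to reserve `size` bytes of address space.
    Assumption: the size is halved after every failed attempt.
    """
    size = max_size
    while size >= min_size:
        if try_reserve(size):
            return size
        size //= 2
    return None
```

For example, if only reservations up to 16 GiB succeed, starting from 64 GiB tries 64, 32 and then 16 GiB before returning.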
Andres Rodriguez 0df346aaf9 make: add 'deb' target for creating deb packages
When 'make deb' is run, create a libhsakmt.deb archive that installs
libhsakmt into the appropriate folder on the target where the dynamic
linker can find it.

Change-Id: I32de7198975f7831e509a67371e78456982b5c42
2015-10-16 19:13:51 -04:00
Harish Kasiviswanathan 5cc56a2647 Fix init process apertures
Kernel ioctl AMDKFD_IOC_GET_PROCESS_APERTURES returns process apertures
only for GPU nodes. The current implementation assumed that the list of
GPU nodes returned by the ioctl has a one-to-one correspondence to sysfs
topology nodes. This fails when non-GPU nodes exist in the topology, as in
the case of Intel + gfx802.

Fix this by using gpu_id (./sys/.../kfd/topology/nodes/1/gpu_id) to map
information obtained from kernel ioctl call.

Change-Id: I4ab8ae5354f12cf0b6609fc4b24182b82eb3677f
2015-10-15 15:38:14 -04:00
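
A minimal sketch of the idea behind this fix, in Python rather than the thunk's C: aperture entries are joined to topology nodes by `gpu_id` instead of by list position. The dict layouts and the name `match_apertures` are hypothetical, and the sketch assumes that a CPU-only node carries a `gpu_id` that no aperture entry uses.

```python
def match_apertures(topology_nodes, ioctl_apertures):
    """Map each topology node id to its aperture entry, or None.

    topology_nodes:  list of {"node_id": int, "gpu_id": int}   (hypothetical layout)
    ioctl_apertures: list of {"gpu_id": int, ...}              (hypothetical layout)
    Matching is by gpu_id, so nodes without a GPU simply get None.
    """
    by_gpu_id = {ap["gpu_id"]: ap for ap in ioctl_apertures}
    return {node["node_id"]: by_gpu_id.get(node["gpu_id"])
            for node in topology_nodes}
```

With a CPU node at index 0 and a GPU node at index 1, a positional match would assign the single aperture entry to node 0, while the keyed match assigns it to node 1.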
Harish Kasiviswanathan b6c6f79143 Fix hard-coded usage of Node 0
Use appropriate NodeId instead

Change-Id: I46af93b76978fea7bedb34457fcc0864ed4fe2d4
2015-10-14 17:27:38 -04:00
Felix Kuehling 6a5ca4bc5a Fix various dgpu memory management issues
Fix TONGA_PAGE_SIZE value and move it to libhsakmt.h so it is used
consistently in all places that require the same alignment for the
same reason. Create a generic alignment helper macro to replace some
incorrect hand-coded size alignments.

Move virtual address and size alignments down into aperture management
functions. Alignment is a per-aperture property that is set during
fmm_init_process_apertures. Doing the alignment there ensures that
all allocations in the same aperture are aligned the same way. Finding
objects by size and address can take the alignment into account.

Also align the size of physical allocations to back aligned virtual
address allocations. CPU mappings do not need to be aligned.

Map anonymous pages over released memory mappings to allow the
backing pages to be released, while keeping the address space
reserved.

Add alignment parameter to free_exec_aligned_memory_gpu to match the
interface of allocate_exec_aligned_memory_cpu. It doesn't make sense
to allow an alignment parameter in one but assume a specific
alignment in the other.

Change-Id: I74226ca6938f4948f643e5aee1d474720cd89e78
2015-10-13 19:14:56 -04:00
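
The commit does not show the alignment helper macro itself. The Python sketch below only illustrates the usual round-up-to-a-power-of-two technique that such a helper typically implements. The name `align_up` and the 4096-byte value in the usage note are illustrative assumptions, not the thunk's macro or the value of TONGA_PAGE_SIZE.

```python
def align_up(value, alignment):
    """Round value up to the next multiple of alignment.

    alignment must be a positive power of two, which is what makes the
    mask trick below valid.
    """
    if alignment <= 0 or alignment & (alignment - 1):
        raise ValueError("alignment must be a positive power of two")
    return (value + alignment - 1) & ~(alignment - 1)
```

For instance, `align_up(4097, 4096)` gives 8192, while a value that is already aligned, such as 4096, is returned unchanged.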
Felix Kuehling 0fc0a5b526 Add support for gfx803
Create new device_info and add device ID. Add helper macros to
identify chip families (VI, discrete). For now gfx803 behaves like
gfx802. But if necessary we can have gfx802 or gfx803-specific
code paths or workarounds in the future.

Change-Id: I61b4ffef7dd7796bb34cb01fbff0089bd49507bb
2015-10-09 17:40:54 -04:00
Harish Kasiviswanathan 758824db17 Fix assert failure for CPU only node
hsa_gfxip_table lists only (supported) GPUs. So assert fail only when a
non-supported GPU is detected.

Change-Id: I6207dc7cd55860c8b3348b6a4ca6102131975722
2015-10-08 11:52:59 -04:00
Harish Kasiviswanathan f2a46101d3 Refactor hsa_gfxip_table lookup
Also fix some formatting

Change-Id: Ia04d7a9cd3972cc4d283c576161de639027aac6d
2015-10-08 11:52:59 -04:00
Felix Kuehling b94ae66c62 Update HsaMemFlags.ui32.CoarseGrain comment
As advised by Paul Blinzer

Change-Id: Icabf4acd94866ddbbe53faf48a71e1113f0c76b6
2015-10-05 16:48:50 -04:00
Felix Kuehling 8e836f8183 Setup APE1 on dGPU for coherent access
The default is non-coherent access for better performance on dGPU.
Disabled hsaKmtSetMemoryPolicy function on dGPU to prevent app from
overriding the APE1 settings at runtime.
Fixed dGPU VM aperture limit to be inclusive.

Change-Id: I378ff74a654f533572775c0c97c19779a56bc6d9
2015-10-02 17:20:33 -04:00
Felix Kuehling 7505893cc7 Add all gfx802 device IDs to supported_devices
Without this, queue creation segfaults on unknown devices.

Change-Id: Ieea0bc4783e7313b3dcdabf03ab1269e3670b217
2015-10-02 15:33:37 -04:00
Felix Kuehling f3aaba0621 Fix returning of base and limit on dgpu_mem_init reinitialization
Change-Id: I1d1500ee57c3b85fc39c224d233a62097f981719
2015-09-30 18:07:04 -04:00
Felix Kuehling f2f45cc0e4 Add CoarseGrain memory flag
Change-Id: If8ac0339ae8c809c6e6a4f56592a4061d110ea94
2015-09-30 18:07:04 -04:00
shaoyunl 2d63ee7b8f Initial support for CWSR on thunk
1. Add IOCTL defines to set trap handler
2. Add control stack size information on create queue argument.
3. Increase the total save&restore area size for carrizo to include the control stack size.

Signed-off-by: Shaoyun Liu <Shaoyun.liu@amd.com>

Change-Id: Iccf15e073b7db2519e96e7f7b46a89d57ab9a4df
2015-09-25 15:12:25 -04:00
Harish Kasiviswanathan 1897acd78e Merge "Sync up HSA_ENGINE_ID type with Windows/Perforce" into amd-staging 2015-09-24 11:03:23 -04:00
Amber Lin 082f8314c4 Sync up HSA_ENGINE_ID type with Windows/Perforce
HSA_ENGINE_ID in Perforce added ui32 to the typedef while Git did not.
This causes conflicts in RT applications. The decision is to change Git
to match Perforce.

Change-Id: I7e9c6437b023bb23ec9578737f8534e9453589b9
2015-09-24 00:10:52 -04:00
Harish Kasiviswanathan 1438f15fd0 Fix VM range for dGPU local memory
Currently, Kernel imposes a limit on VM. Thunk should be aware of it.
This fix is required till Kernel VM limit is sorted.

For now both "Host Access" memory and "Local Memory" share the same VM range.

Change-Id: I5a9220face20df9ede2b78bd6201a01dd2ea70e0
Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
2015-09-23 16:18:50 -04:00
Harish Kasiviswanathan 4b768872c0 Fix mem size variable type
Memory size is 64-bit. So use HSAuint64 instead of uint32.

Change-Id: Iaa607dec9c1a1c5ac46ea442fd482210ea550b45
Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
2015-09-23 15:33:54 -04:00
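
To see why a 32-bit variable is not enough for a memory size, the small Python sketch below emulates storing a value in an unsigned 32-bit integer by masking. The helper name `as_uint32` is hypothetical and the sizes are only examples, not values taken from the thunk.

```python
GIB = 1 << 30


def as_uint32(value):
    """Emulate storing value in an unsigned 32-bit variable (wraps modulo 2**32)."""
    return value & 0xFFFFFFFF


# A 4 GiB size wraps to 0 and a 6 GiB size wraps to 2 GiB.
truncated_4g = as_uint32(4 * GIB)
truncated_6g = as_uint32(6 * GIB)
```

Any size of 4 GiB or more loses its upper bits this way, which is why the commit switches the variable to HSAuint64.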
Amber Lin f7fffdc2be Enable GFXIP version info for dGPU
Add GFXIP version 8.0.2(major.minor.stepping) for gfx802 and 8.0.3 for gfx803.


Change-Id: Icc7cac6b2e8a78d9cff4105aeb2bfcd2c7759027
2015-09-22 15:04:43 -04:00
Ben Goz 6170080cf6 Adding support for local memory on dGPU
Change-Id: I1a926b11730ba295605eeb37c9b1fc438bed8a64
Signed-off-by: Ben Goz <ben.goz@amd.com>
2015-09-21 14:13:15 -04:00
Ben Goz 692e004047 Adding new memory allocation IOCTL
Change-Id: I0eb1924811a2e1e436296ebe632d8f112a61637d
Signed-off-by: Ben Goz <ben.goz@amd.com>
2015-09-21 13:58:32 -04:00
Harish Kasiviswanathan 3e9773ff2c Revert "Topology, memory allocation, cleanup issue for gGPU"
This reverts commit ee08f537a7.

Change-Id: I92a4ed91bf566259916d1a96207e1fe9a6099c31
Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
2015-09-21 10:47:30 -04:00
Harish Kasiviswanathan ee08f537a7 Topology, memory allocation, cleanup issue for gGPU
Patch submitted by Besar Wicaksono

1. Bug on detecting local memory size interpreted as 32 bit value
instead of 64. The bug causes thunk to go into an infinite loop trying
to reserve virtual address range for dgpu system memory.
2. SIMD count in the node property is 0. Runtime uses this attribute to
find a gpu device.
	Regarding other attributes of intel+tonga topology, Harish started a
	discussion on August iirc, could you please share an update ?
	This would help me progress with more tests such as scratch memory,
	which require the scratch aperture information in order to construct a
	buffer srd in gpuvm space.
3. Bug on releasing memory via fmm_release, where no actual release is
being done. The vm_object can't be found because the memory size does
not match, since the allocation padded the size by 32KB.
4. Pointer arithmetic on vm_area allocation/release. The value of
vm_area_t::end seems to be interpreted inconsistently whether it is
(start + size  -1) or (start + size).
	One example of potential issue I see is the logic could report
	larger size of the hole in the vm area list.
5. Resource cleanup on multiple library load/unload within a single
process.
	- Any memory allocation on subsequent library load will result
	an error "va above limit". To my understanding this is due to
	the reserved memory for the system memory not being released on unload.
	- The static variable events_page needs to be invalidated
	appropriately on library unload so the next load could
	reinitialize it.
6. Could you please update if AQL queue is ready to test with the stg
kfd/kmt ?
7. The system memory allocation with size larger than 32KB seems to be
padded by an extra 32KB. I was wondering if we could remove this
overhead.

Change-Id: I039988d36637525089c7569dc3b77e58750e2121
2015-09-15 13:15:04 -04:00
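
Item 4 of the patch notes above concerns whether an area's end is (start + size - 1) or (start + size). The Python sketch below shows how mixing the two conventions changes a computed hole size. The tuple layout and function name are hypothetical and do not reproduce the thunk's vm_area_t code.

```python
def hole_size(prev_area, next_area, inclusive_end):
    """Free bytes between two areas given as (start, end) tuples.

    inclusive_end=True  means end == start + size - 1 (last valid byte).
    inclusive_end=False means end == start + size     (one past the last byte).
    """
    prev_end = prev_area[1] + 1 if inclusive_end else prev_area[1]
    return next_area[0] - prev_end


# Two areas stored with inclusive ends, 0x1000 bytes apart.
a = (0x1000, 0x1FFF)
b = (0x3000, 0x3FFF)
correct = hole_size(a, b, inclusive_end=True)    # 0x1000
misread = hole_size(a, b, inclusive_end=False)   # 0x1001, one byte too large
```

Reading an inclusive end as exclusive overstates the hole by one byte, which is consistent with the concern that the logic could report a larger hole than really exists.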
David Ogbeide 8a01cd1212 libhsakmt: specify build output via variable
Makefile currently sends build output to a default location.
Allow choice of build output location, if so desired,
using a variable.



Signed-off-by: David Ogbeide <davidboyowa.ogbeide@amd.com>
2015-09-01 14:30:53 -04:00
Ding, Wei (xN/A) TX a32c2b9854 ECR #333755 - HSA samples changes for dGPU. All passed on gfx802.
[git-p4: depot-paths = "//depot/stg/hsa/drivers/hsa/runtime/": change = 1186398]
2015-08-31 14:41:50 -05:00
Ben Goz fb8378a18b Support gfx802 dGPU
Signed-off-by: Ben Goz <ben.goz@amd.com>
2015-08-30 14:13:53 +03:00
shaoyunl 2dff5cabfa Minor fix in libhsathunk for KFDMemory test
Signed-off-by: shaoyun liu(shaoyun.liu@amd.com)
Reviewed-by: Ben Goz(Ben.Goz@amd.com)
2015-08-05 17:32:00 -04:00
Ben Goz bb4a5cddd9 Revert "Enable creating SDMA queue."
This reverts commit 112f7e751a.
2015-08-05 13:33:42 +03:00
Amber Lin a3925a3a19 Enable version info via thunk interface
- Replace HSAuint32 with HSA_ENGINE_ID for EngineId type so it explicitly
  presents version information for ucode and GfxIP
- Created a GfxIP lookup table to pass the version information. This lookup
  searches for matching device ID.

Signed-off-by: Amber Lin <Amber.Lin@amd.com>
Acked-by: John Bridgman <John.Bridgman@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
2015-07-31 14:56:33 -04:00
Flora Cui fc4e07daa3 Add interface to set CU mask
Signed-off-by: Flora Cui <flora.cui@amd.com>
Acked-by: Ben Goz <ben.goz@amd.com>
2015-07-23 15:44:01 +08:00
Moses Reuben 29c083f695 adding support for scratch memory
Signed-off-by: Moses Reuben <moses.reuben@amd.com>
2015-07-21 16:43:23 +03:00
Andrey Kasaurov (xN/A) SP 3bbf3c6a8b Fix HSA Finalizer
including Cherry pick of CL#1166690 from SC Stg and update for RT samples. It contains:

Change 1166280 on 2015/06/30 by bolek@bolek-common2

EPR #092474 - Fix missing integrations

	Change 1164156 on 2015-06-23 by nhaustov
	ECR #010005 - HSA Finalizer: Add missing tests.

	Change 1164232 on 2015-06-23 by nhaustov
	ECR #333756 - HSA Finalizer: Implement reading of notes.

Change 1166268 on 2015/06/30 by bolek@bolek-laser

EPR #010001 - Promotion of the Shader Compiler (SC)

	Release SC Library version 0001.IL01-02.0339 Date: June 30, 2015
	Changelist (stg/sc): 1165197

	Change 1163976 on 2015-06-22 by sashao
	EPR #373149 - OpenGL ES 3.0 Development

	Change 1164122 on 2015-06-23 by nhaustov
	ECR #033756 - HSA Finalizer: Fix compilation warnings.

	Change 1164150 on 2015-06-23 by efinger
	EPR #092474 - bugzilla 10829 - optimize out useless V_PERM_B32 feeding packed math op, in early expansion, rather than late expansion, and do it regardless of whether the packed op will be split or not.

	Change 1164187 on 2015-06-23 by efinger
	EPR #092474 - Add and use GetUAVInfo() and GetNumUAVs()

	Change 1164194 on 2015-06-23 by rgottlie
	EPR #092474 - Fix Linux Build Issue for SC_OPEN_SOURCE

	Change 1164204 on 2015-06-23 by mbedy
	EPR #092474 - Update Open Source build - now working from SC stg.

	Change 1164216 on 2015-06-23 by rouellet
	EPR #092474 - Add directive to do what -il_interpreter does.

	Change 1164232 on 2015-06-23 by nhaustov
	ECR #333756 - HSA Finalizer: Implement reading of notes.

	Change 1164239 on 2015-06-23 by nhaustov
	ECR #333756 - HSA Finalizer: Fix OpenCL build problem.

	Change 1164275 on 2015-06-23 by nhaustov
	ECR #333756 - HSA Finalizer: Fix Linux build errors.

	Change 1164365 on 2015-06-23 by efinger
	EPR #092474 - Cleanup UAV Atomic handling

	Change 1164393 on 2015-06-23 by kzhuravl
	EPR #333756 - Finalizer/Loader fixes

	Change 1164654 on 2015-06-24 by dpreobra
	ECR #333753 - HSA HLC: SPB_ASM: TestGen improvements

	Change 1164727 on 2015-06-24 by bolek
	EPR #092474 - Enable function level linking, COMDAT folding and unused function removal optimizations in Dev release builds. This saves about 2.6MB in code size on 64-bit Dev.

	Change 1164760 on 2015-06-24 by rgottlie
	EPR #422210 - Fix problem with TransformScratch heuristics

	Change 1164761 on 2015-06-24 by rgottlie
	EPR #422181 - Fix handling of sub-dword load instructions in propagation of immediates from store to load in RefineMemory

	Change 1164764 on 2015-06-24 by bolek
	EPR #092474 - Add missing const

	Change 1164769 on 2015-06-24 by efinger
	EPR #092474 - Cleanup GDS atomics

	Change 1164776 on 2015-06-24 by efinger
	EPR #092474 - Fix linux build

	Change 1164799 on 2015-06-24 by mbedy
	EPR #092474 - Improve alignment for 2 DWORD instructions by more closely

	Change 1164803 on 2015-06-24 by efinger
	EPR #092474 - Open Source Cleanup

	Change 1164809 on 2015-06-24 by bfavela
	EPR #092474 - Escape an infinite loop in shader during the build of a DAG when a block is visited twice

	Change 1164814 on 2015-06-24 by bolek
	EPR #092474 - Add Dev command line option to disable individual peephole patterns (blame Chris for this one).

	Change 1164827 on 2015-06-24 by bfavela
	EPR #092474 - Adding small change to CL 1164809 as suggested by creeve to remove superfluous if()

	Change 1164842 on 2015-06-24 by gujin
	EPR #092474 - Prevent moving exit-loop checking to the end of loop if there is a branch in the loop that is optimized with a target replacement bypassing the loop end. This is to fix an OpenGL hull shader conformance test fail (bug 10859).

	Change 1164876 on 2015-06-24 by rgottlie
	EPR #092474 - Only allow memory merging if no memory scope or order is specified

	Change 1164883 on 2015-06-24 by kdintino
	EPR #092474 - Add HSAIL files to the AMD -> LLVM copyright replacement loop.

	Change 1165060 on 2015-06-25 by efinger
	EPR #092474 - Open Source Cleanup - Copyright

	Change 1165077 on 2015-06-25 by efinger
	EPR #092474 - Cleanup LDS atomics - part 1 (groundwork)

	Change 1165080 on 2015-06-25 by bfavela
	EPR #092474 - Extension to SUPPRESS_PI_REDUCE_F32 for TAN (TAN_F16 is already handled by expansion)

	Change 1165189 on 2015-06-25 by efinger
	EPR #092474 - Cleanup LDS atomics - part 2

	Change 1165196 on 2015-06-25 by bolek
	EPR #092474 - Add syntax to the peephole pattern language to specify SCInst flag values or wildcards.

	Change 1165197 on 2015-06-25 by bolek
	EPR #092474 - Allow the MulAddToMadF peephole pattern to modify instructions marked as invariant (result should still be the same)

Change 1165438 on 2015/06/26 by bolek@bolek-common2

EPR #010001 - Promotion of the Shader Compiler (SC)

	Release SC Library version 0001.IL01-02.0338 Date: June 26, 2015
	Changelist (stg/sc): 1163954

	Change 1161629 on 2015-06-15 by efinger
	EPR #092474 - Move CFG:IL2IRProcessDeclare() to global scope

	Change 1161633 on 2015-06-15 by rouellet
	EPR #092474 - Bugzilla 10852 call ConverInstFields when translating COND_MOVE.

	Change 1161643 on 2015-06-15 by rgottlie
	EPR #092474 - Handle manually inserted wait state for SALU writing M0 followed by VINTERP

	Change 1161718 on 2015-06-15 by lifpan
	EPR #092474 - The "point size" in copy shader of GS

	Change 1161721 on 2015-06-15 by xlji
	EPR #092474 - Split DIV_F16 and DIV_PRECISE_F16 

	Change 1161850 on 2015-06-16 by kzhuravl
	EPR #333756 - Change a few function names, general cleanup (no functional change)

	Change 1161934 on 2015-06-16 by efinger
	EPR #092474 - Fix linux compile warnings

	Change 1161946 on 2015-06-16 by nhaustov
	ECR #333756 - HSA Finalizer: Fix Linux build warnings.

	Change 1161981 on 2015-06-16 by efinger
	EPR #092474 - Open Source Cleanup

	Change 1161991 on 2015-06-16 by efinger
	EPR #092474 - Move CFG::IL2IRProcessSpecial() to global scope

	Change 1161997 on 2015-06-16 by rgottlie
	EPR #092474 - Fix compile warnings under Linux

	Change 1162001 on 2015-06-16 by efinger
	EPR #092474 - Fix linux build

	Change 1162045 on 2015-06-16 by mherdeg
	EPR #092474 - Comment out unused functions to fix linux compiler warnings.

	Change 1162048 on 2015-06-16 by akasauro
	EPR #092474 - SC: Some AMD OCL SDK tests (including BinomialOption) assert in SCInst.cpp. [on behalf of Atrem Tamazov]

	Change 1162061 on 2015-06-16 by efinger
	EPR #092474 - Rename NewIRInst to MakeIRInst and drop last (unused) arg.

	Change 1162066 on 2015-06-16 by creeve
	EPR #092474 - Linux build fixes for open source.

	Change 1162067 on 2015-06-16 by creeve
	EPR #092474 - Improve hash table grow and sanitize.

	Change 1162072 on 2015-06-16 by creeve
	EPR #092474 - Peephole |x| * |x| => x*x

	Change 1162089 on 2015-06-16 by chfang
	EPR #092474 - Fix linux compiler warnings in SCStructureAnalyzer.cpp.

	Change 1162145 on 2015-06-16 by efinger
	EPR #092474 - Improve interface to MakeInstOp[123]

	Change 1162427 on 2015-06-17 by efinger
	EPR #092474 - bugzilla 10862 - Back out changelist 1161549

	Change 1162434 on 2015-06-17 by rgottlie
	EPR #092474 - Only dump individual functions in each pass of Refine Memory

	Change 1162436 on 2015-06-17 by kzhuravl
	EPR #333756 - Integrate runtime independent loader from stg hsa + update project files

	Change 1162442 on 2015-06-17 by efinger
	EPR #092474 - Add and use CreateRegTemp()

	Change 1162505 on 2015-06-17 by skolton
	ECR #333756 - HSA Finalizer: Doorbell signals support 

	Change 1162527 on 2015-06-17 by kzhuravl
	EPR #333756 - Always set dx10_clamp to true for hsa

	Change 1162531 on 2015-06-17 by efinger
	EPR #092474 - Fix linux compile warnings

	Change 1162568 on 2015-06-17 by mbedy
	EPR #092474 - Specify a newer DX9 SDK for SCDevUtil that correctly links with WDK n10136.

	Change 1162623 on 2015-06-17 by mherdeg
	EPR #092474 - Remove duplicate #include "SCHSAInterface.h". It confuses Intellisense in Visual Studio.

	Change 1162905 on 2015-06-18 by rgottlie
	EPR #092474 - Fix Linux Build Warnings

	Change 1162930 on 2015-06-18 by nhaustov
	ECR #333756 - HSA Finalizer: Cleanup amdhsafin command-line tool.

	Change 1162938 on 2015-06-18 by nhaustov
	ECR #333756 - HSA Finalizer: Fix build problem.

	Change 1162944 on 2015-06-18 by rgottlie
	EPR #092474 - Clean up bug descriptions as per Phil's suggestion

	Change 1162951 on 2015-06-18 by skolton
	ECR #333756 - HSA Finalizer:  Bug fix for 1DB query image

	Change 1163009 on 2015-06-18 by nhaustov
	ECR #333756 - HSA Finalizer: build amdhsafin with WITH_LIBBRIGDWARF when needed.

	Change 1163263 on 2015-06-19 by nhaustov
	ECR #092474 - Fix patgen VS build by quoting %TMPDIR%.

	Change 1163265 on 2015-06-19 by skolton
	ECR #333756 - HSA Finalizer: Fix for doorbell signal store.

	Change 1163310 on 2015-06-19 by nhaustov
	ECR #333756 - HSA Finalizer: Introduce separate amdhsacode library.

	Change 1163316 on 2015-06-19 by nhaustov
	ECR #333756 - HSA Finalizer: Fix OpenCL build problem.

	Change 1163320 on 2015-06-19 by nhaustov
	ECR #333756 - HSA Finalizer: Fix another OpenCL build problem.

	Change 1163331 on 2015-06-19 by mjared
	EPR #092474 - Replace asin/acos 5th order minimax polynomial with a 6th order double locked (at 0 and 1) minimax polynomial

	Change 1163353 on 2015-06-19 by efinger
	EPR #092474 - Use normal temps (not expansion temps) for expansion template T regs.

	Change 1163473 on 2015-06-19 by mjared
	EPR #092474 - Improve accuracy of ATAN instruction by replacing rational approximation with a 17th order double locked minimax polynomial. Also increase degree of ASIN/ACOS double locked minimax polynomial to 7.

	Change 1163475 on 2015-06-19 by creeve
	EPR #092474 - Avoid putting partial write on export instruction. This feature existed before but only occurred if the output was point sprite. This change removed that restriction. Also fixed the implementation of //EsMode and //LsMode shader directi

	Change 1163481 on 2015-06-19 by mjared
	EPR #092747 - Misc. python scripts for working with transcendental functions. Includes fast implementation of remez minimax algorithm for absolute error and slower optimization-based remez for weighted/custom error reduction.

	Change 1163528 on 2015-06-19 by creeve
	EPR #092474 - Fix build issue.

	Change 1163603 on 2015-06-21 by bolek
	EPR #092474 - patgen makefile cleanup

	Change 1163614 on 2015-06-21 by kzhuravl
	EPR #333756 - Integrate runtime independent loader changes from stg hsa

	Change 1163699 on 2015-06-22 by rouellet
	EPR #092474 - bugzilla 10854 Get cb0[1] initialized with group dimensions for compute shaders on r800 and newer.  Make IL and HW interpreter details and variable names more closely match. Flush denorms when doing cube mapped samples (the cb0[1] init 

	Change 1163713 on 2015-06-22 by efinger
	EPR #092474 - Convert all usage of expansion temps to regular temps

	Change 1163718 on 2015-06-22 by nhaustov
	ECR #333756 - HSA Finalizer: Add loader (-loader option) to amdhsafin and update tests.

	Change 1163732 on 2015-06-22 by nhaustov
	ECR #333756 - HSA Finalizer: Implement images in amdhsafin loader and update tests.

	Change 1163774 on 2015-06-22 by mbedy
	EPR #092474 - Strip _DEV macros from open source. Fix issue in ifdef stripping.

	Change 1163786 on 2015-06-22 by mbedy
	EPR #092474 - Revert unintentionally submitted change.

	Change 1163803 on 2015-06-22 by bolek
	EPR #092474 - Peephole compile-time performance improvements

	Change 1163832 on 2015-06-22 by efinger
	EPR #092474 - bugzilla 10849 - fix copy propagation bug with SDWA

	Change 1163916 on 2015-06-22 by efinger
	EPR #092474 - Nuke support for expansion temps

	Change 1163954 on 2015-06-22 by creeve
	EPR #092474 - More code sanitization.

Change 1164740 on 2015/06/24 by vpykhtin@vpykhtin-SC

ECR #333753 - Cherrypicking CL1164641 from stg/sc (that is cherrypick of CL1164640 from stg/opencl)

Testing: TC PSDB

[git-p4: depot-paths = "//depot/stg/hsa/drivers/hsa/runtime/": change = 1167011]
2015-07-02 10:27:31 -05:00
Nikolay Haustov [TEXT] 5a8c84e012 ECR #010005 - Update HSA samples and test to use libHSAIL high-level tool interface.
[git-p4: depot-paths = "//depot/stg/hsa/drivers/hsa/runtime/": change = 1155310]
2015-05-28 05:38:18 -05:00
Oded Gabbay 2e76017278 increase event limit to provide 4K events
Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
2015-05-18 11:01:42 +03:00
Zhuravlyov, Konstantin (x21446) MA 20bed7ce7f ECR #333756 - Add support for relocations/offline global support in finalizer/loader
Testing: precheckin (http://ocltc:8111/viewModification.html?modId=51121&personal=true&init=1&tab=vcsModificationBuilds)

[git-p4: depot-paths = "//depot/stg/hsa/drivers/hsa/runtime/": change = 1147298]
2015-05-05 08:29:40 -05:00