Commit graph

2959 commits

Author SHA1 Message Date
Shi, Aaron (en ye) (xN/A) TO ad21f0606e HSA Finalizer: Promote SC PRM -> Finalizer (HSA tree) up to CL 1258514
[git-p4: depot-paths = "//depot/stg/hsa/drivers/hsa/runtime/": change = 1259784]
2016-04-19 15:31:52 -05:00
Jay Cornwall (xN/A) UK 1d4a257225 Fix SDMA fill for >=4MB regions
max_single_fill_size_ overflowed the packet field size. Reduce by one dword.

[git-p4: depot-paths = "//depot/stg/hsa/drivers/hsa/runtime/": change = 1259263]
2016-04-18 16:05:13 -05:00
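The overflow described in the commit above can be sketched as follows. The field width (22 bits, counting bytes) is an assumption picked so that a 4MB size wraps; it is not taken from the runtime source, and `split_fill` is a hypothetical helper, not the runtime's function.

```python
# Assumption: the packet's byte-count field is 22 bits wide (illustrative only).
FIELD_BITS = 22
FIELD_MAX = (1 << FIELD_BITS) - 1   # largest value the field can hold
DWORD = 4                           # bytes per dword

# Largest dword-aligned size that still fits: one dword below 2**FIELD_BITS.
MAX_SINGLE_FILL = (1 << FIELD_BITS) - DWORD


def split_fill(total_bytes):
    """Split a fill into chunk sizes that each fit in the packet field."""
    if total_bytes < 0 or total_bytes % DWORD:
        raise ValueError("size must be a non-negative multiple of 4")
    chunks = []
    remaining = total_bytes
    while remaining > 0:
        chunk = min(remaining, MAX_SINGLE_FILL)
        chunks.append(chunk)
        remaining -= chunk
    return chunks
```

Under this assumption, a chunk of exactly `1 << FIELD_BITS` bytes would be masked to 0 by the field, which is why the limit sits one dword lower.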
Andres Rodriguez 44572965f6 package: rename to hsathk-rocm-dev
Since we include headers and not just a library anymore, we should be
considered a -dev package and not a lib package.

Change-Id: I220465ea4ffc8d66d8d76e6716e6c6c50cdacea1
2016-04-13 19:39:54 -04:00
Besar Wicaksono (xN/A) TX [TEXT] 5a584fa1ab Fix query HSA_AMD_AGENT_MEMORY_POOL_INFO_LINK_INFO
Querying HSA_AMD_AGENT_MEMORY_POOL_INFO_LINK_INFO between a GPU agent
and its own local memory pool returns wrong information.
Fix: return link with 0 hop count.

[git-p4: depot-paths = "//depot/stg/hsa/drivers/hsa/runtime/": change = 1257544]
2016-04-13 12:39:25 -05:00
Hari Thangirala 0545761aa9 ROCR Build ID support
Fix dirty-tree status. Thanks to Fan for fixing the issue.

[git-p4: depot-paths = "//depot/stg/hsa/drivers/hsa/runtime/": change = 1256716]
2016-04-11 18:48:29 -05:00
Besar Wicaksono (xN/A) TX [TEXT] ea67bb8374 Sdma wraparound optimization.
Remove mutex and just make the thread spin again if the queue is wrapping.
Remove the wait for the queue to finish wrapping, and just check if there is enough space to recycle when reserving queue space.

[git-p4: depot-paths = "//depot/stg/hsa/drivers/hsa/runtime/": change = 1256713]
2016-04-11 18:31:45 -05:00
Andres Rodriguez 9f355b78a0 Adopt new ROCm packaging guidelines
All files should go into /opt/rocm/$component

For developer convenience, a single include directory is created through
symlinks, from the component include directory to /opt/rocm/include.

Similarly, a unified linked directory is present in /opt/rocm/lib

The component lib directory should not include linker names (library
names without version numbers).

This commit also makes 'make rpm' run correctly without the need to
source build/envsetup.sh.

Change-Id: I95a680f6d3e3bd1ae688d0694934a0577dbd007c
2016-04-11 18:30:54 -04:00
James Edwards (xN/A) TX 871412adff Remove ENV variables from CMakeLists.txt files.
[git-p4: depot-paths = "//depot/stg/hsa/drivers/hsa/runtime/": change = 1256687]
2016-04-11 17:18:01 -05:00
Hari Thangirala a148fd0b68 ROCR Build ID support
Build system/Package maintainer:
-    BUILDID is specified at cmake.
-    USAGE: cmake -DBUILDID=<ID> ../src

For developer builds, which typically don't provide BUILDID, cmake will:
-    Determine the last git commit when this tree was sync'd
-    Determine the build date
-    Check if the tree is clean when built

The idea of this embedded string is that later, when you get a ROCR build, you can get some idea of where the build originated by using: strings libhsa-runtime.so.1 | grep "ROCR BUILD ID"

For example:
-    If it's a Jenkins build 25, it returns: "ROCR BUILD ID: 25"
-    If it's a developer build sync'd @ 06f5f2a with modifications, it returns: "ROCR BUILD ID: 06f5f2a-2016-04-11-0"

[git-p4: depot-paths = "//depot/stg/hsa/drivers/hsa/runtime/": change = 1256588]
2016-04-11 15:03:06 -05:00
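The `strings ... | grep` lookup described in the commit above can be approximated in a few lines. This is a minimal sketch: `find_build_ids` is a hypothetical helper, and the 4-character minimum only mirrors the usual default of `strings`.

```python
import re

MARKER = b"ROCR BUILD ID"


def find_build_ids(blob, min_len=4):
    """Return printable ASCII runs in blob that contain the build ID marker."""
    # A run of at least min_len printable ASCII characters (space through '~').
    pattern = re.compile(rb"[ -~]{%d,}" % min_len)
    return [
        match.group().decode("ascii")
        for match in pattern.finditer(blob)
        if MARKER in match.group()
    ]
```

To use it on a real library, read the file in binary mode and pass the bytes in, for example `find_build_ids(open(path, "rb").read())`, where `path` points at the installed `libhsa-runtime.so.1`.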
Felix Kuehling 82b3fad320 Fix 4GB and larger system memory allocations
Intermediate size was stored in a 32-bit variable. This caused
4GB allocations to fail in KFD because the size wrapped to 0. Larger
allocations would allocate the wrong amount of memory.

Change-Id: If19dedf64952f1d2edd813793241e12c0362d220
2016-04-11 11:17:06 -04:00
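The truncation behind the commit above is easy to reproduce. The sketch below mimics storing a size in an unsigned 32-bit variable; `to_u32` is a hypothetical helper for illustration, not code from the thunk.

```python
GIB = 1 << 30


def to_u32(size):
    """Mimic assigning a size to an unsigned 32-bit variable."""
    return size & 0xFFFFFFFF


four_gib = to_u32(4 * GIB)   # wraps to 0, so the allocation is rejected
five_gib = to_u32(5 * GIB)   # wraps to 1 GiB, so the wrong amount is allocated
```

Keeping the intermediate value in a 64-bit variable avoids both cases.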
Zhuravlyov, Konstantin (x21446) MA 503fd728dd Fail gracefully if memory allocation did not succeed
Testing: precheckin (http://ocltc.amd.com:8111/viewModification.html?modId=69427&personal=true&tab=vcsModificationBuilds)

[git-p4: depot-paths = "//depot/stg/hsa/drivers/hsa/runtime/": change = 1256179]
2016-04-09 16:40:24 -05:00
Besar Wicaksono (xN/A) TX [TEXT] 2ebde5d2a7 Fix unit test build error due to CL#1256098
[git-p4: depot-paths = "//depot/stg/hsa/drivers/hsa/runtime/": change = 1256119]
2016-04-08 16:51:45 -05:00
Besar Wicaksono (xN/A) TX [TEXT] 7760839934 Fix build error from CL#1256102 due to whitespace issue.
[git-p4: depot-paths = "//depot/stg/hsa/drivers/hsa/runtime/": change = 1256108]
2016-04-08 16:40:05 -05:00
Besar Wicaksono (xN/A) TX [TEXT] a03c5148a7 Add AMD extension version
[git-p4: depot-paths = "//depot/stg/hsa/drivers/hsa/runtime/": change = 1256102]
2016-04-08 16:31:00 -05:00
Besar Wicaksono (xN/A) TX [TEXT] 4ccc695b95 Add global memory clock and width info on the agent attribute list and deprecate the ones in the memory region attribute list.
[git-p4: depot-paths = "//depot/stg/hsa/drivers/hsa/runtime/": change = 1256098]
2016-04-08 16:29:10 -05:00
Andres Rodriguez 31861c838e package: change install directory to /opt/rocm
Align with the rest of the driver stack on the new installation path
/opt/rocm/*

This mechanism for generating packages should be replaced with something
nicer and more standards-compliant in the future.

Change-Id: Ic31409b0d0b8f6ee4b25296d2580982a76aab564
2016-04-08 11:41:49 -04:00
Nikolay Haustov [TEXT] a795909bca Cherry-pick CL 1250286 from SC stg.
HSA Finalizer: Add dumping of code object, ISA and executable to loader.

This is controlled by loader options -dump-all, -dump-isa, -dump-code, -dump-exec

The options can now also be set with env variable LOADER_OPTIONS_APPEND.

Added tests to finalizer_offline

Testing: smoke, dumping on hardware

Reviewed by: Konstantin Zhuravlyov

[git-p4: depot-paths = "//depot/stg/hsa/drivers/hsa/runtime/": change = 1255351]
2016-04-07 06:01:20 -05:00
Besar Wicaksono (xN/A) TX [TEXT] 823c254d61 Cleanup TODO format
[git-p4: depot-paths = "//depot/stg/hsa/drivers/hsa/runtime/": change = 1255182]
2016-04-06 16:50:50 -05:00
Ramesh Errabolu (xN/A) TX b93946790d Update Private Segment Size parameter of the dispatch packet
[git-p4: depot-paths = "//depot/stg/hsa/drivers/hsa/runtime/": change = 1254638]
2016-04-05 14:03:33 -05:00
Besar Wicaksono (xN/A) TX [TEXT] c95f96a9e4 Add environment flag to enable sdma workaround that will wait for the sdma queue to be idle before updating the write pointer. Add class to manage environment flags.
[git-p4: depot-paths = "//depot/stg/hsa/drivers/hsa/runtime/": change = 1254004]
2016-04-01 17:13:45 -05:00
James Edwards (xN/A) TX e3670a2bef Branch Brig.h file into opensrc hsa-runtime directory.
[git-p4: depot-paths = "//depot/stg/hsa/drivers/hsa/runtime/": change = 1251455]
2016-03-25 15:26:18 -05:00
Nikolay Haustov [TEXT] 46842a57e5 HSA Finalizer: Merge changes in libamdhsacode and loader from sc_prm into hsa/compiler/finalizer and hsa/runtime.
Testing: pre-checkin

[git-p4: depot-paths = "//depot/stg/hsa/drivers/hsa/runtime/": change = 1251389]
2016-03-25 08:36:20 -05:00
Zhuravlyov, Konstantin (x21446) MA f6565a2f70 Clean up extensions and provide public extension/API to query host address given device address:
- Partially remove 'amd_load_map' extension because it is not used and will not be used
- Remove 'hsa_amd_query_kernel_host_address' API
- Add 'hsa_ext_amd_loaded_code_object' extension
- Add 'hsa_ext_amd_loaded_code_object_query_host_address' API
	- Most likely to be used by debugger, profiler, and hcc (printf)
- Update affected sources
	- 'hsa_system_extension_supported'
	- 'hsa_system_get_extension_table'
	- SoftCP path
- Integrate CLs 1250699, 1251204, 1251214 from stg sc

ReviewBoardURL: http://ocltc.amd.com/reviews/r/10091/
Testing: smoke (ok), teamcity (ok), samples on fiji (AQL and SoftCP) (ok)

[git-p4: depot-paths = "//depot/stg/hsa/drivers/hsa/runtime/": change = 1251223]
2016-03-24 19:00:30 -05:00
Besar Wicaksono (xN/A) TX [TEXT] 9fa0531950 Always wait for queue wraparound to finish and don't return "not enough resources".
[git-p4: depot-paths = "//depot/stg/hsa/drivers/hsa/runtime/": change = 1251141]
2016-03-24 15:52:45 -05:00
Sean Keely (xN/A) TX 1c7142c129 Minor fix to hsa_amd_image_descriptor_t.
Change uint32_t data[0]; to uint32_t data[1];

[git-p4: depot-paths = "//depot/stg/hsa/drivers/hsa/runtime/": change = 1251050]
2016-03-24 13:24:22 -05:00
David Ogbeide 682776d89a libhsakmt: get CPU model name from proc/cpuinfo
HSA thunk is currently only aware of GPU node
model info; CPU names are NULL.

Signed-off-by: David Ogbeide <davidboyowa.ogbeide@amd.com>
Change-Id: I3c2adbb8566a5048b44c39fff4fd8228912468ff
2016-03-23 11:11:18 -04:00
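Reading the CPU model name from /proc/cpuinfo, as the commit above describes, amounts to scanning for the `model name` key. The sketch below is a hypothetical Python equivalent, not the thunk's C code, and it assumes the x86 Linux layout of `key : value` lines.

```python
def cpu_model_names(cpuinfo_text):
    """Return the 'model name' value of every processor entry."""
    names = []
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "model name":
            names.append(value.strip())
    return names
```

On a Linux system the input would come from `open("/proc/cpuinfo").read()`; each logical processor contributes one entry to the result.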
James Edwards (xN/A) TX 7d2bc9d113 Separate open source core runtime code from DK makefiles.
[git-p4: depot-paths = "//depot/stg/hsa/drivers/hsa/runtime/": change = 1250152]
2016-03-22 18:10:13 -05:00
James Edwards (xN/A) TX 7d1e6c3a57 Remove opensrc test files.
[git-p4: depot-paths = "//depot/stg/hsa/drivers/hsa/runtime/": change = 1249961]
2016-03-22 13:39:51 -05:00
James Edwards (xN/A) TX c9ffe0004e Check open source core runtime code into perforce. This includes license and README files.
[git-p4: depot-paths = "//depot/stg/hsa/drivers/hsa/runtime/": change = 1249136]
2016-03-20 15:39:40 -05:00
Felix Kuehling 06d391c6c9 Add environment variable to disable GPU caching
This option may help debug synchronization or coherency issues
involving the GPU caches. It works only on dGPUs, by changing the
cache policy of the GPUVM default aperture to "coherent", which is
implemented as non-cached on current dGPU hardware.

Change-Id: I544ac9cc5c0cf1fa5c4e30f67aa42b3b5e44ae67
2016-03-17 18:51:47 -04:00
Harish Kasiviswanathan f1fbacca15 Add QPI or HT io_links
Create QPI or HT links among all NUMA nodes. For now, assume all the
NUMA nodes are interconnected with the same weight (=1).

Change-Id: Id48ba95b9d75515a186f7dc5006b19bd92743ae3
2016-03-15 21:10:53 -04:00
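The fully connected NUMA topology assumed by the commit above can be sketched as a link table. The node names and the dictionary representation are illustrative assumptions; the thunk stores links in its own C structures.

```python
from itertools import permutations


def numa_links(numa_nodes, weight=1):
    """Link every ordered pair of distinct NUMA nodes with the same weight."""
    return {(src, dst): weight for src, dst in permutations(numa_nodes, 2)}
```

For n nodes this yields n * (n - 1) directed links, all with weight 1 by default.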
Harish Kasiviswanathan ee1dd5d9c2 Get processor vendor from /proc/cpuinfo
Change-Id: I9039385d268ef1693fab121cbf1caf442129a12e
2016-03-15 15:37:52 -04:00
Besar Wicaksono (xN/A) TX [TEXT] 73d43224e9 Add IOLink support
[git-p4: depot-paths = "//depot/stg/hsa/drivers/hsa/runtime/": change = 1247220]
2016-03-14 18:42:31 -05:00
shaoyunl 79077811f5 Add Imprecise flag for memory access fault
KFD may not be able to provide the precise VM fault address and status.
This flag indicates whether the event data has the fault details.

Change-Id: I15ffd5c25f555003c6450cc0700efb769418f76b
2016-03-14 15:17:17 -04:00
Felix Kuehling 0ed29f5191 Report SVM heap in topology
The Runtime requested this information so they can tell easily
whether a pointer is part of HSA shared address space or not.


Change-Id: If2041ed34031636677d692bc2dc6625634027ed4
2016-03-14 11:52:36 -04:00
Harish Kasiviswanathan 1c1bc32477 Sync IOLINK defines to thunk spec
Current thunk spec v1.07 dated Feb 1, 2016

Change-Id: Ie1821f7f1903ac48b76cb68d452a6073d3a3c8d9
2016-03-11 18:59:57 -05:00
Harish Kasiviswanathan 8ff2bcd48d Fix indirect io_links
Connect only (Peer-to-Peer) GPUs that belong to the same NUMA node. Without
this additional check, non-direct GPUs would also get connected.

Change-Id: I9a5ed19b8f06cd0527854cbbdb51ede99eade28b
2016-03-11 18:54:32 -05:00
Felix Kuehling cac0c08496 Fix lstopo
Lstopo doesn't have system memory mappings at low addresses. Make
sure we leave enough GPUVM address space for kernel allocations
(currently only CWSR) before the start of the user-managed SVM
aperture.

Change-Id: Ic197f7bd5a3cfb150a0da2bfdbc848664e7869be
2016-03-11 11:01:12 -05:00
Harish Kasiviswanathan 7042292c60 Add indirect io_links
Connect (Peer-to-Peer) GPUs that belong to the same NUMA node.
Connect all [GPU] <--> [Non Parent NUMA] nodes.

Change-Id: Ib4b08a6545d28b7dce4c9b1a90378bfc51bed07e
2016-03-10 15:11:17 -05:00
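The two rules in the commit above can be sketched as a function over a GPU-to-parent map. The data layout and all names here are assumptions made for illustration; the thunk's actual implementation works on its node array in C.

```python
from itertools import permutations


def indirect_links(gpu_parent, numa_nodes):
    """Build indirect links from a {gpu: parent NUMA node} map."""
    links = set()
    # Rule 1: peer-to-peer links between GPUs under the same NUMA node.
    for a, b in permutations(gpu_parent, 2):
        if gpu_parent[a] == gpu_parent[b]:
            links.add((a, b))
    # Rule 2: each GPU <--> every NUMA node that is not its parent.
    for gpu, parent in gpu_parent.items():
        for node in numa_nodes:
            if node != parent:
                links.add((gpu, node))
                links.add((node, gpu))
    return links
```

Direct GPU-to-parent links are deliberately left out, since those are not indirect.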
Harish Kasiviswanathan 1e729510d2 Allocate memory for indirect io_links
To simplify, allocate maximum needed memory for node_t->link array.
No need for realloc when indirect links are added. Trade off - for some
nodes more memory than required will be allocated.

This means the loop to compute the number of direct (reverse) io_links
for a CPU node is not necessary.

Change-Id: I2b2559142cbec3b262d0b4ea5fdebfd8f36c28fc
2016-03-10 15:10:48 -05:00
Felix Kuehling 61ec3df2f9 Add support for hsaKmtRegisterGraphicsHandleToNodes
Change-Id: I6fd7154dea78188480d5cb89ac237bad572356c4
2016-03-10 11:16:02 -05:00
Ben Goz b1393f8224 Support MapMemoryToGPUNodes on APU
Change-Id: Ie77a2eb23cd9fe6671ff9e0630977220218e55dd
Signed-off-by: Ben Goz <ben.goz@amd.com>
2016-03-09 21:31:52 -05:00
Felix Kuehling cb0315d31d Update kfd_ioctl.h from kernel
Change-Id: I9852ef2e33e1f3b24343747e3c1c09b0050ffdc1
2016-03-09 10:55:12 -05:00
Felix Kuehling b837c3e7b0 Clean up GPUVM aperture management
Non-canonical GPUVM aperture doesn't exist on dGPUs. Remove comments
and code that say otherwise.

Fix alignment of GPUVM aperture for gfx801. Requires the same workaround
as gfx802. It's not used for anything on gfx801 yet, but will be soon.

Change-Id: I88607fe7b340081cc0715b85f28fdbf5f1bb0ad7
2016-03-09 10:55:12 -05:00
Yair Shachar c42ec0b82c Name unnamed struct within HsaMemMapFlags union
For aligning with RT definitions

Change-Id: I4dca0c5818fdcea6c596a48c7516835fc595a289
Signed-off-by: Yair Shachar <Yair.Shachar@amd.com>
2016-03-07 18:43:03 +02:00
Harish Kasiviswanathan 1d1c30db7c Add reverse direct io_links
The kernel only creates a one-way direct link -
	GPU(PCI_BUS) --> [Parent NUMA Node]

Create the reverse direct io_link here -
	[Parent NUMA Node] -->  GPU(PCI_BUS)

Change-Id: I829a1b1b7f34bda42871ede3472d60915e88418c
2016-03-04 15:54:03 -05:00
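Adding the reverse of each direct link, as the commit above describes, can be sketched with a set of (source, destination) pairs. This representation and the helper name are assumptions for illustration only.

```python
def add_reverse_links(links):
    """Return the links plus a reversed copy of each (src, dst) pair."""
    return set(links) | {(dst, src) for src, dst in links}
```

Given only `("gpu0", "numa0")` from the kernel, the result also contains `("numa0", "gpu0")`; applying the function twice changes nothing further.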
Harish Kasiviswanathan a80d2f2303 Add free_nodes() helper function
Change-Id: I18ae0ac91b05275d7ad9d93175bae06870080844
2016-03-03 18:33:59 -05:00
Andres Rodriguez 7c376247b5 README: spelling and date fixes
Change-Id: I51fa196971b91ea71fd8b0abe169fe23502ebb96
2016-03-02 18:42:01 -05:00
Andres Rodriguez 35e8fc6b15 readme: add an initial README.md file
This is a simple README.md since most of the details should be in the
ROCK project.

Change-Id: I3175e2a5ade0f9ecb913076a4842b528f14947f0
2016-03-02 18:42:01 -05:00
Ben Goz e2fb4bc312 Align hsaKmtMapMemoryToGPUNodes according to thunk spec
Change-Id: I507ba5c6029ca5e7088c25930d46f5221679ace4
Signed-off-by: Ben Goz <ben.goz@amd.com>
2016-03-02 16:12:03 +02:00