Due to a misinterpretation of the HSA specification the microcode has,
until now, been responsible for ensuring a coherent view of the
amd_kernel_code_t object when acquire_fence_scope is set to agent or system.
To correct this the runtime must instead assume this responsibility.
Introduce GpuAgentInt::InvalidateCodeCaches to perform this operation
on-demand. Invoke this after code object allocation. Extend the Queue
implementations to support PM4 command submission, through which the
PM4 command ACQUIRE_MEM can be submitted to perform cache invalidation.
Submit through a runtime-managed queue shared with the blit implementation.
This change depends on microcode support and this is checked against the
running version. Older microcode builds will perform cache invalidation
themselves, so it is acceptable for this change to do nothing in that case.
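The version gate described above can be sketched as follows. This is a minimal illustration, not the runtime's code: AgentSketch, its constructor parameters, and the counter standing in for the ACQUIRE_MEM submission are all hypothetical, and no real microcode version numbers are implied.

```cpp
#include <cstdint>

// Hypothetical stand-in for a GPU agent; not the runtime's GpuAgent class.
class AgentSketch {
 public:
  AgentSketch(uint32_t running_ucode, uint32_t first_supported_ucode)
      : running_ucode_(running_ucode),
        first_supported_ucode_(first_supported_ucode),
        submitted_(0) {}

  // Returns true if the runtime submitted an invalidation command.
  bool InvalidateCodeCaches() {
    // Older microcode invalidates the caches itself, so doing nothing is fine.
    if (running_ucode_ < first_supported_ucode_) return false;
    // Placeholder for building ACQUIRE_MEM and submitting it on the shared queue.
    ++submitted_;
    return true;
  }

  uint32_t submitted() const { return submitted_; }

 private:
  uint32_t running_ucode_;
  uint32_t first_supported_ucode_;
  uint32_t submitted_;
};
```

The call is safe on any microcode version; only the side that performs the invalidation changes.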
Change-Id: I268dd2b83af3decdd9ad07430a81df8a2ecb6bd2
This option was disabled by default to address issues writing to stderr
in Windows applications. The lack of an error message for memory access
faults is confusing to users, however.
Enable the error message by default on Linux only.
Change-Id: I1f44ba42362f8874abdc7c8e63ddd54a855b5394
The runtime needs a queue on which to submit cache management commands.
Device-to-device blit copy already creates a queue unconditionally.
We can share this queue for both purposes.
This change restructures the BlitKernel interface to accept, rather than
create, a queue. GpuAgent creates queues as needed for both cache
management and blit compute.
Fix queue full detection in AcquireWriteIndex (<= vs <).
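For illustration, a ring-buffer reservation with an off-by-one-sensitive full check might look like the sketch below. The message does not say which operator was wrong, and RingSketch and its members are hypothetical names rather than the runtime's Queue code. The sketch only shows why the boundary case matters: a reservation that exactly fills the ring must succeed, and one slot more must fail.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical ring with monotonically increasing read/write indices.
class RingSketch {
 public:
  explicit RingSketch(uint64_t capacity)
      : capacity_(capacity), write_(0), read_(0) {}

  // Reserves `count` slots. On success stores the first reserved index
  // in *first and returns true; returns false if the ring is too full.
  bool TryAcquireWriteIndex(uint64_t count, uint64_t* first) {
    uint64_t write = write_.load(std::memory_order_relaxed);
    for (;;) {
      uint64_t read = read_.load(std::memory_order_acquire);
      // Slots in use after the reservation must be <= capacity.
      // Rejecting the == case would waste a slot; accepting > would overwrite.
      if (write + count - read > capacity_) return false;
      if (write_.compare_exchange_weak(write, write + count,
                                       std::memory_order_acq_rel)) {
        *first = write;
        return true;
      }
      // On failure `write` was reloaded by compare_exchange_weak; retry.
    }
  }

  // Consumer side: marks `count` slots as processed.
  void Release(uint64_t count) {
    read_.fetch_add(count, std::memory_order_release);
  }

 private:
  const uint64_t capacity_;
  std::atomic<uint64_t> write_;
  std::atomic<uint64_t> read_;
};
```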
Change-Id: I61d0c6b9d04f2dba74872f0676ad791435778ba4
This is the first part of transitioning to the LLVM-based assembler.
SP3 is deprecated and all references to the library are removed.
Pending LLVM support, relevant shaders have been precompiled.
Change-Id: I7d44cef5ded1836c4a74b77881af5bea8803d2c1
On multi-node systems only the first CPU node was recognized in the
signal consumer list, causing fallback to non-interrupt signals.
Change-Id: I9bd0706bafbe046be9d7f210d05fa4cf1fcd16fa
Before this change, the runtime hard-coded the device name. With this
commit, the name is queried from KFD. The UTF-16 to UTF-8 conversion will
switch to codecvt once GCC supports it.
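Until codecvt is available, a UTF-16 to UTF-8 conversion can be written by hand. The sketch below is one way to do it and is not taken from this commit: Utf16ToUtf8 is a hypothetical helper name, and replacing unpaired surrogates with U+FFFD is an assumed policy.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper: converts UTF-16 code units to a UTF-8 string.
std::string Utf16ToUtf8(const std::u16string& in) {
  std::string out;
  for (std::size_t i = 0; i < in.size(); ++i) {
    uint32_t cp = in[i];
    const bool high = cp >= 0xD800 && cp <= 0xDBFF;
    const bool low = cp >= 0xDC00 && cp <= 0xDFFF;
    if (high && i + 1 < in.size() && in[i + 1] >= 0xDC00 &&
        in[i + 1] <= 0xDFFF) {
      // Combine a valid surrogate pair into one code point.
      cp = 0x10000 + ((cp - 0xD800) << 10) + (in[i + 1] - 0xDC00);
      ++i;
    } else if (high || low) {
      cp = 0xFFFD;  // Unpaired surrogate: emit the replacement character.
    }
    if (cp < 0x80) {
      out.push_back(static_cast<char>(cp));
    } else if (cp < 0x800) {
      out.push_back(static_cast<char>(0xC0 | (cp >> 6)));
      out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
    } else if (cp < 0x10000) {
      out.push_back(static_cast<char>(0xE0 | (cp >> 12)));
      out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
      out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
    } else {
      out.push_back(static_cast<char>(0xF0 | (cp >> 18)));
      out.push_back(static_cast<char>(0x80 | ((cp >> 12) & 0x3F)));
      out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
      out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
    }
  }
  return out;
}
```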
Change-Id: I7c4dc32ef857296296c810d083888c5ba1c808b6
Have amd::MemoryRegion::Lock use host_ptr instead of asserting when
alternate_va is null. When the src/dst memory pointer is allocated
via KFD, host_ptr is already a GPUVA.
Change-Id: If44368cc2854d4c0c477ae56e4eeabc37e54c1a5
Reduces the number of blit queues from 3 to 2, when SDMA is unavailable,
improving the availability of queue slots for applications.
Change-Id: I8860d2b6c6d6527494b9fc35d164099e1313886a
for the kernel args.
Most image-related HSA conformance tests pass now.
Many more ocltst/oclperf image tests pass too.
Change-Id: I3f28d4ee7369f0ebc7af5128d3ffe1390957db98
max_single_fill_size_ overflowed the packet field size. Reduce by one dword.
[git-p4: depot-paths = "//depot/stg/hsa/drivers/hsa/runtime/": change = 1259263]
Querying HSA_AMD_AGENT_MEMORY_POOL_INFO_LINK_INFO between a GPU agent
and its own local memory pool returns incorrect information.
Fix: return a link with a hop count of 0.
[git-p4: depot-paths = "//depot/stg/hsa/drivers/hsa/runtime/": change = 1257544]
Remove mutex and just make the thread spin again if the queue is wrapping.
Remove the wait for the queue to finish wrapping, and just check if there is enough space to recycle when reserving queue space.
[git-p4: depot-paths = "//depot/stg/hsa/drivers/hsa/runtime/": change = 1256713]
Build system/Package maintainer:
- BUILDID is specified at cmake.
- USAGE: cmake -DBUILDID=<ID> ../src
For developer builds, which typically don't provide BUILDID, cmake will:
- Determine the last git commit when this tree was synced
- Determine the build date
- Check if the tree is clean when built
The idea of this embedded string is that later, when you get a ROCR build, you can get some idea of the build's origin by using: strings libhsa-runtime.so.1 | grep "ROCR BUILD ID"
For example:
- If it's a Jenkins build 25, it returns: "ROCR BUILD ID: 25"
- If it's a developer build synced @ 06f5f2a with modifications, it returns: "ROCR BUILD ID: 06f5f2a-2016-04-11-0"
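On the source side, embedding such a string could look like the sketch below. The macro name ROCR_BUILD_ID, the symbol kRocrBuildId, and the "unknown" fallback are assumptions for illustration; the message does not show how the runtime actually defines the string.

```cpp
#include <cstring>

// Hypothetical macro: assume the build system passes the ID as a string
// literal, for example -DROCR_BUILD_ID="\"25\"" on the compiler command line.
#ifndef ROCR_BUILD_ID
#define ROCR_BUILD_ID "unknown"
#endif

// Declared extern so the array has external linkage; an exported object
// is emitted into the library, where the strings tool can find it.
extern const char kRocrBuildId[];
const char kRocrBuildId[] = "ROCR BUILD ID: " ROCR_BUILD_ID;

// Returns only the ID portion, after the fixed prefix.
const char* BuildIdValue() {
  return kRocrBuildId + std::strlen("ROCR BUILD ID: ");
}
```

Adjacent string literals are concatenated at compile time, so the prefix and the ID end up as one contiguous string that a plain grep can match.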
[git-p4: depot-paths = "//depot/stg/hsa/drivers/hsa/runtime/": change = 1256588]