Commit Graph

2930 commits

Author SHA1 Message Date
Chen Gong fc87256172 libhsakmt: enhancing support to gfx1033
This patch makes the get_block_properties() function work on the gfx1033 platform.

Change-Id: Ie5be7dfb38575eec8b39b91f3ee5b3a31abe8bd1
Signed-off-by: Chen Gong <curry.gong@amd.com>


[ROCm/ROCR-Runtime commit: 4cf50fdeaa]
2020-12-22 14:15:23 +08:00
Yifan Zhang 72b5ce407a kfdtest: Take VRAM size into account when calculating the buffer number.
VRAM size is relatively small on APUs, e.g. 512MB.
The current MMBench doesn't support small-VRAM systems.
Running MMBench may produce the errors below:

[ RUN      ] KFDMemoryTest.MMBench
[          ] Found VRAM of 512MB.
[          ] Test (avg. ns)        alloc   mapOne  umapOne   mapAll  umapAll     free
[          ] --------------------------------------------------------------------------
[          ]   4K-SysMem-noSDMA         4569    20098     1292    18835      926     2218
[          ]  64K-SysMem-noSDMA        12738    20469     1030    19201     1293     4560
[          ]   2M-SysMem-noSDMA       256384    21020     1022    20568     1196    36294
[          ]  32M-SysMem-noSDMA      4031812    83750     5406    61156     4312   535656
[          ]   1G-SysMem-noSDMA    129260000   427000    34000   390000    30000 18548000
[          ] --------------------------------------------------------------------------
[          ]   4K-VRAM-noSDMA         3594    19637      979    19624     1357     2829
[          ]  64K-VRAM-noSDMA         3540    21062     1407    19614     1654     3024
/home/foreman/build/hsakmt-roct-amdgpu-1.0.9/sources/libhsakmt/tests/kfdtest/src/KFDMemoryTest.cpp:1119: Failure
Value of: (hsaKmtAllocMemory(allocNode, bufSize, memFlags, &bufs[i]))
  Actual: 6
Expected: HSAKMT_STATUS_SUCCESS
Which is: 0
[  FAILED  ] KFDMemoryTest.MMBench (723 ms)

Fix this issue by changing buffer number calculation in MMBench.

Change-Id: I5cce95707a048248f1e825c807586818619eddaf
Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>


[ROCm/ROCR-Runtime commit: 742f718722]
2020-12-17 07:41:24 -05:00
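The buffer-number change described in commit 72b5ce407a above can be pictured roughly as follows. This is only a sketch, not the actual MMBench code: the helper name `bench_buffer_count`, the `max_bufs` cap, and the choice to budget half of VRAM are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not the actual MMBench code: choose how many buffers
 * of buf_size bytes to allocate so the total stays within a VRAM budget. */
static uint64_t bench_buffer_count(uint64_t vram_bytes, uint64_t buf_size,
                                   uint64_t max_bufs)
{
    assert(buf_size > 0);

    /* Assumption: use at most half of VRAM, leaving room for other users. */
    uint64_t budget = vram_bytes / 2;
    uint64_t fit = budget / buf_size;

    /* Never exceed the benchmark's usual buffer count. */
    return fit < max_bufs ? fit : max_bufs;
}
```

With 512MB of VRAM and 2MB buffers, this sketch yields 128 buffers instead of a fixed count that could exhaust a small APU carve-out.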
Chengming Gui 5d08071b4f kfdtest: remove unsupported modifier 'offset'
fix 
v2: fix VGPR conflict
v3: use s_addc_u32 to replace s_add_u32

Signed-off-by: Chengming Gui <Jack.Gui@amd.com>
Change-Id: I8fe6bf1f5bf99544038ad16128c2bebd559d3da9


[ROCm/ROCR-Runtime commit: 3ed8b96bf0]
2020-12-14 17:29:13 +08:00
Huang Rui d72d4943b4 libhsakmt: add gfx1033 support
This patch adds Van Gogh support to the thunk.

Change-Id: I75819329b865e4c38c097e83e3a0cb4e4f566fa2
Signed-off-by: Huang Rui <ray.huang@amd.com>


[ROCm/ROCR-Runtime commit: 9600760ff7]
2020-12-08 23:54:46 -05:00
Sean Keely 14dd324d2f Cleanup warnings when using clang.
Change-Id: I09f72831e29bccdb4170c54e203872412e2f0b59


[ROCm/ROCR-Runtime commit: bd63a2b690]
2020-12-04 22:18:14 -06:00
changzhu 2608314fae Distinguish iommuv2/dgpu_fallback when getting gpuName
The memory tests for iommuv2 and dgpu_fallback are different, so they
need to be distinguished.

Change-Id: Icc64e9ae0fc1638c3d148795a5f247d9e5e8e503
Signed-off-by: changzhu <Changfeng.Zhu@amd.com>


[ROCm/ROCR-Runtime commit: 39386c03bf]
2020-12-04 02:24:49 -05:00
Philip Cox c03f46e198 kfdtest: increase default timeout to 10,000
The default kfdtest timeout is not enough for certain platforms, and
tests are failing.

Change-Id: I2027eadcbeb12a2fbbc9c55f92f31869fa13dbcb
Signed-off-by: Philip Cox <Philip.Cox@amd.com>


[ROCm/ROCR-Runtime commit: 4bbfbe7789]
2020-11-27 15:06:41 -05:00
Gang Ba 3d11816985 kfdtest: check peer accessible with new function
Check GPU peer accessibility using the p2p_links in the system.

Signed-off-by: Gang Ba <gaba@amd.com>
Change-Id: I026f16564303b687811d6648f0b7f84be6819979


[ROCm/ROCR-Runtime commit: 8e94dde685]
2020-11-26 10:34:06 -05:00
Tony b36aad204e Make supported targets consistent
Add missing target names and make all parts consistent with which
targets are supported.

- Add gfx805 as a supported target.

- Add all ELF targets to generic code.

- Make offline loader match supported targets.

Change-Id: Idab4d69edc71645aecaa83aa55e29c1aeee4c1d6


[ROCm/ROCR-Runtime commit: b443397bcc]
2020-11-24 03:14:31 +00:00
Kent Russell e39dab44e3 Look in /opt/rocm* for SMI for setting clocks to high
Now that symlinks aren't necessarily guaranteed, use "find" to try to
find the rocm-smi, and clarify the error message if it is not found

Also tie in a fix for parsing the output now that the output has changed

Change-Id: I2081442a71731c186c3ad00585a2ba6e8a8e5a28


[ROCm/ROCR-Runtime commit: 2651ce37d8]
2020-11-23 14:05:10 -05:00
Sean Keely 62138712cf Add asserts and minimum values for kernarg alignment and utility functions.
Kernel argument size and alignment queries are not supported on
code object v3.

Change-Id: I1bdd34e2e62132f912ac39d80355efd3456df87c


[ROCm/ROCR-Runtime commit: 6182abf5e9]
2020-11-21 21:39:49 -06:00
Tony e1734526fc Update code object V3 kernarg queries
Code object V2 had the ability to support the following queries:

- HSA_CODE_SYMBOL_INFO_KERNEL_KERNARG_SEGMENT_SIZE
- HSA_EXECUTABLE_SYMBOL_INFO_KERNEL_KERNARG_SEGMENT_SIZE
- HSA_CODE_SYMBOL_INFO_KERNEL_KERNARG_SEGMENT_ALIGNMENT
- HSA_EXECUTABLE_SYMBOL_INFO_KERNEL_KERNARG_SEGMENT_ALIGNMENT

However code object V3 onwards cannot support these as the kernel
descriptor changed. These queries need to be deprecated.

Until then return more reasonable values:

- For kernarg alignment return 16 which is the minimum alignment
  required by the HSA standard.

- For kernarg size return the field from the kernel descriptor which
  is a hint. If it is 0 then the compiler is not specifying the kernarg
  size, or the kernel has no kernarg.

Change-Id: I19ce6cd0f3658a2bf62277492f39100ea5ab4256


[ROCm/ROCR-Runtime commit: ef755e4c82]
2020-11-20 21:39:18 -05:00
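The fallback values described in commit e1734526fc above could look something like the sketch below. The struct `kd_sketch` and both function names are hypothetical stand-ins, not the runtime's real kernel descriptor type or query API. Only the two rules come from the commit message: alignment is reported as 16, and size is passed through from the descriptor hint.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Minimum kernarg alignment, as stated in the commit message. */
#define KERNARG_MIN_ALIGNMENT 16u

/* Hypothetical stand-in for the single kernel descriptor field used here. */
struct kd_sketch {
    uint32_t kernarg_size; /* a hint; 0 means unspecified or no kernarg */
};

/* Code object V3 cannot report a per-kernel alignment, so return the minimum. */
static uint32_t query_kernarg_alignment(void)
{
    return KERNARG_MIN_ALIGNMENT;
}

/* Pass the descriptor hint through unchanged; callers must handle 0. */
static uint32_t query_kernarg_size(const struct kd_sketch *kd)
{
    assert(kd != NULL);
    return kd->kernarg_size;
}
```

A caller that receives 0 from the size query cannot tell "no kernarg" apart from "not specified by the compiler", which is why the commit message calls the field a hint.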
Sean Keely ce4de85616 Cache scratch allocations.
Avoids calling to KFD to map/unmap scratch allocations for
every large scratch using dispatch.

Change-Id: I9fab5705251ec82b03e4f2f2ca6da7cdccabefb9


[ROCm/ROCR-Runtime commit: 27e044ae4d]
2020-11-20 15:07:01 -05:00
Sean Keely 20f9fbd7f2 Limit clock synchronization to 16Hz.
Improves HIP event performance in directed benchmarks where
clock sync latency is significant.

Change-Id: I78b724a14a8f5b6a9a2b9f4d85afe9d8b81808a6


[ROCm/ROCR-Runtime commit: 32d0fcafa9]
2020-11-20 15:06:13 -05:00
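One plausible way to limit clock synchronization to 16Hz, as commit 20f9fbd7f2 above describes, is to skip a resync unless at least 1/16 of a second has passed since the last one. The sketch below assumes that approach. The state struct and function name are hypothetical, and the runtime's real mechanism may differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SYNC_INTERVAL_NS (1000000000ull / 16) /* 16 Hz is one sync per 62.5 ms */

/* Hypothetical bookkeeping for the most recent synchronization. */
struct clock_sync_state {
    bool     valid;        /* false until the first sync has happened */
    uint64_t last_sync_ns; /* monotonic timestamp of the last sync */
};

/* Returns true (and records now_ns) when a new synchronization is due. */
static bool clock_sync_due(struct clock_sync_state *s, uint64_t now_ns)
{
    assert(s != NULL);

    if (s->valid && now_ns - s->last_sync_ns < SYNC_INTERVAL_NS)
        return false; /* too soon: reuse the previous sync result */

    s->valid = true;
    s->last_sync_ns = now_ns;
    return true;
}
```

Callers that ask more often than 16 times per second would reuse the cached result, which is where the latency saving would come from under this assumption.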
Sean Keely eacc927741 Style update for SDMA enable flag.
Updated to match xnack flag's style.

Change-Id: I6115c0b53660d789e698de1606a9388ae1789866


[ROCm/ROCR-Runtime commit: b51f68b535]
2020-11-20 15:06:02 -05:00
Cordell Bloor 63953d98e1 Fix CMake configure error due to CMP0012
The modern meaning of the construct if( NOT ON ) was added in CMake 2.8,
but when cmake_minimum_required is not set in user code and no policy
level is set in the CMake config, CMake 2.8 features cannot be
used. In old CMake (the default), ON is interpreted as a variable, and
because it is not defined, it is considered false. The same is true of
OFF.

This change sets a variable as ON, so that the old CMake interpretation is
correct, and the if works as expected regardless of policy version.

Change-Id: I67d7ed4ceaf8248eeb5a1c7f54009d72313f3f5d


[ROCm/ROCR-Runtime commit: 4a35f560f6]
2020-11-20 15:04:41 -05:00
Cole Nelson 5dd453a265 opensrc/hsa-runtime/CMakeLists.txt: conformant package names
Names test good:
hsa-rocr-dev_1.2.0.30900-crdnnv.415_amd64.deb
hsa-rocr-dev-1.2.0.30900-crdnnv.415.el7.x86_64.rpm
hsa-rocr-dev-1.2.0.30900-crdnnv.sles151.415.x86_64.rpm

http://confluence.amd.com/display/GPUCPT/Package+File+Naming

Note: rpm requires 'devel' instead of 'dev', to be a subsequent
patchset.

Change-Id: Id6a422f3c335448b52c70c77ed39c9041114b80f
Signed-off-by: Cole Nelson <cole.nelson@amd.com>


[ROCm/ROCR-Runtime commit: 90f2dd5b1b]
2020-11-18 14:56:24 -05:00
Aakash Sudhanwa 0e2e1aca74 Revert "CMakeLists: Fix RPM dependency declaration"
This reverts commit 4aae34276b.

Reason for revert: <INSERT REASONING HERE>

Change-Id: Ieaf2da0067c3e89577569c5d478c70b97ca5f5ca


[ROCm/ROCR-Runtime commit: ac6e96e7a3]
2020-11-17 20:10:45 -05:00
Gang Ba 98bb73932d libhsakmt: Create P2P links
1. Create P2P links
2. Determine FRAMEBUFFER_PUBLIC/PRIVATE based only on
   host-accessibility, not peer-accessibility

Signed-off-by: Gang Ba <gaba@amd.com>
Change-Id: I15fccdc60386b453e2a47849a16df15157324b21


[ROCm/ROCR-Runtime commit: bedecc5957]
2020-11-17 15:43:12 -05:00
Kent Russell 4aae34276b CMakeLists: Fix RPM dependency declaration
RPM needs _REQUIRES at the end, not _DEPENDS, and also requires a space
before the version of the required package.

Change-Id: I9dd70bd92fc2407b7e8b31e4d46df43c52438a65
Signed-off-by: Kent Russell <kent.russell@amd.com>


[ROCm/ROCR-Runtime commit: 089fdeb1fe]
2020-11-17 12:36:05 -05:00
Gang Ba 0ddd455ead libhsakmt: add Streaming Performance Monitors APIs
Signed-off-by: Gang Ba <gaba@amd.com>
Change-Id: Iab9a98fa2079b7cae7158c524479dfc3fa672407


[ROCm/ROCR-Runtime commit: e8c0426c54]
2020-11-16 16:36:21 -05:00
Pruthvi Madugundu f254139e48 Fix for uninstallation problem of hsa-rocr-dev
- The /opt/rocm-xx/hsa/include directory wasn't deleted after
  debian package uninstallation

Signed-off-by: Pruthvi Madugundu <pruthvi.madugundu@amd.com>
Change-Id: I213439d73f6533ff3a55e2b0071061d970cf56d4


[ROCm/ROCR-Runtime commit: 87955f8551]
2020-11-11 12:32:11 -08:00
Ashutosh Mishra 5026ed2f8a Standardizing Package names
Enables standards-compliant package naming for debian and rpm

Signed-off-by: Ashutosh Mishra <ashutosh.mishra@amd.com>
Change-Id: I177af15ec7a3f909d05135be30a0acc7b0b20745


[ROCm/ROCR-Runtime commit: aefc173da4]
2020-11-11 00:37:20 -05:00
Konstantin Zhuravlyov ba667661c5 Implement Target ID Proposal
Changes from Konstantin Zhuravlyov, Tony Tye

Change-Id: I532801193afa9d5b8ac2a877b5497eab661f0597


[ROCm/ROCR-Runtime commit: 3a08d0964e]
2020-11-10 13:42:35 -05:00
Tianci.Yin 349fe6a583 kfdtest: add DID for gfx1010 blockchain SKU
Change-Id: Icd52c5db4dd975086fcfb13deb6727919c1f5809
Signed-off-by: Tianci.Yin <tianci.yin@amd.com>


[ROCm/ROCR-Runtime commit: 1231570441]
2020-11-04 01:14:32 -05:00
Chengming Gui 55fe1ba5a3 kfdtest: update shader code for gfx10.3 kfdtest
s_store_* instruction set was retired from gfx10.3

Signed-off-by: Chengming Gui <Jack.Gui@amd.com>
Change-Id: Ibe41a3fe7e053fb345b1af6ad4abc22a0885bc81


[ROCm/ROCR-Runtime commit: f283fe2854]
2020-11-03 22:25:39 -05:00
Felix Kuehling d8ded1fa6e Revert "libhsakmt: optimize system memory allocation"
This reverts commit 35b07e1e28.

Reason for revert: This commit caused a regression in the rocrtst memory
subtest: Maximum Single Allocation in Memory Pools failed.


Change-Id: I15330625603f893200a08cd8b5b097f9bf95361f


[ROCm/ROCR-Runtime commit: e515fd818b]
2020-11-03 12:19:56 -05:00
Jeremy Newton a803c56889 kfdtest: Use ldflags for drm
This fixes a build issue with kfdtest and the amdgpu pro driver build.

This was requested as kfdtest is needed for regular testing due to the
inclusion of the ROCr/KFD stack in the amdgpu pro driver (OSGSUP-199)

Change-Id: I224d2e9ee3f02065596890b4d8226484f4fac04f
Signed-off-by: Jeremy Newton <Jeremy.Newton@amd.com>


[ROCm/ROCR-Runtime commit: 8026ba250c]
2020-11-02 16:46:45 -05:00
Kent Russell 693a1d2b95 kfdtest: Fix bit compare logic by using | instead of ||
We want to compare bits, not check if a defined value is true

Change-Id: Ie51ede96d18eae01aff6677d852a056ee12bd9c6


[ROCm/ROCR-Runtime commit: e34dfa8ebd]
2020-11-02 06:03:59 -05:00
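The difference that commit 693a1d2b95 above fixes can be shown with a small example. The function names and flag values here are hypothetical and are not taken from kfdtest. The point is only how `||` and `|` behave when building a bit mask.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Buggy form: (a || b) collapses to 0 or 1, so only bit 0 is ever tested. */
static bool any_flag_set_logical(uint32_t value, uint32_t a, uint32_t b)
{
    return (value & (uint32_t)(a || b)) != 0;
}

/* Intended form: (a | b) builds a mask containing the bits of both flags. */
static bool any_flag_set_bitwise(uint32_t value, uint32_t a, uint32_t b)
{
    assert((a | b) != 0); /* an empty mask would make the test meaningless */
    return (value & (a | b)) != 0;
}
```

For a value of 0x4 checked against flags 0x1 and 0x4, the logical form reports no match while the bitwise form reports a match.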
Kent Russell 9057877de1 kfdtest: Add missing default switch label in GetQueue function
There is no default case, and we were missing a few types defined from
hsakmttypes.h. This was found via clang

Change-Id: I26193cb111a9d8220b1eff21c7313fe060288f36


[ROCm/ROCR-Runtime commit: 761d9d84d2]
2020-11-02 06:03:59 -05:00
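As an illustration of the missing default label that commit 9057877de1 above addresses, the sketch below uses a hypothetical enum and a hypothetical name-lookup function. The real queue types are defined in hsakmttypes.h and are not reproduced here.

```c
/* Hypothetical queue kinds, for illustration only. */
enum queue_kind_sketch { QK_COMPUTE, QK_SDMA, QK_COMPUTE_AQL };

/* Maps a queue kind to a printable name. */
static const char *queue_kind_name(enum queue_kind_sketch kind)
{
    switch (kind) {
    case QK_COMPUTE:     return "compute";
    case QK_SDMA:        return "sdma";
    case QK_COMPUTE_AQL: return "compute-aql";
    default:             return "unknown"; /* unlisted or future values */
    }
}
```

Without the default label, a value outside the listed cases would fall off the end of the function, which is the kind of problem the commit says clang flagged.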
Kent Russell 85b0b04903 Remove extra strlen call
While the ternary is nice to read, strlen in general is an expensive
call, so call it once and check if the value is greater than our maximum
allowable string length and adjust accordingly

Signed-off-by: Kent Russell <kent.russell@amd.com>
Change-Id: Id744f2ba0eb81bb2b3c52eb69f38a615398a655d


[ROCm/ROCR-Runtime commit: 025036a662]
2020-10-30 06:44:48 -04:00
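The pattern described in commit 85b0b04903 above, calling strlen once and clamping the result, might look like this. The limit `MAX_NAME_LEN` and the function name are assumptions for illustration, not the identifiers used in the actual patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_NAME_LEN 8 /* hypothetical limit, excluding the terminator */

/* Copies src into dst (capacity MAX_NAME_LEN + 1), truncating if needed.
 * strlen() runs once; the clamped result is reused for copy and terminator. */
static size_t copy_name_clamped(char *dst, const char *src)
{
    assert(dst != NULL && src != NULL);

    size_t len = strlen(src);
    if (len > MAX_NAME_LEN)
        len = MAX_NAME_LEN;

    memcpy(dst, src, len);
    dst[len] = '\0';
    return len;
}
```

A ternary such as `strlen(s) > MAX ? MAX : strlen(s)` may walk the string twice, whereas storing the length in a local walks it once.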
Felix Kuehling acdda4811b libhsakmt: handle GPU mapping errors
Don't update the vm_object if GPU mapping failed. Print an error message
to help diagnose underlying problems.

Change-Id: I801ab6fe6c155bd25e6c0358007c106a4a019480
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>


[ROCm/ROCR-Runtime commit: c7e6f5a274]
2020-10-26 18:46:24 -04:00
Felix Kuehling 35b07e1e28 libhsakmt: optimize system memory allocation
Use MAP_POPULATE when allocating anonymous system memory for later
GPU mapping as a userptr. This can speed up large allocations by
more than a factor of 2. I suspect populating pages in this way is more
efficient than the CPU page fault code path triggered by
get_user_pages in the kernel.

Change-Id: I188bbc1462ccb650d48cbfb1080dbb8eb7ada8b5
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>


[ROCm/ROCR-Runtime commit: 8f26c0c40c]
2020-10-26 18:44:20 -04:00
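The MAP_POPULATE technique from commit 35b07e1e28 above can be sketched as below. The wrapper name `alloc_prefaulted` is hypothetical and this is not the libhsakmt code. It only shows the Linux mmap flags involved. Note that this log also contains a later revert of that commit (d8ded1fa6e) because of a test regression.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* exposes MAP_ANONYMOUS and MAP_POPULATE on Linux/glibc */
#endif
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Allocates anonymous memory and asks the kernel to populate its pages up
 * front rather than on first touch. Returns NULL on failure. */
static void *alloc_prefaulted(size_t size)
{
    assert(size > 0);

    void *p = mmap(NULL, size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_POPULATE, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}
```

Memory obtained this way is released with `munmap(p, size)`. MAP_POPULATE is Linux-specific, so this sketch is not portable to other systems.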
Sean Keely 56fa259711 Disable sram ecc reporting.
Temporary workaround while language and compiler teams sort out
handling both modes.

Change-Id: I5d676cd546382dba05ec0b62bb885baa854614f6


[ROCm/ROCR-Runtime commit: a09ba8bcc8]
2020-10-20 17:06:30 -05:00
Laurent Morichetti d933e70089 libhsakmt: Fix the ctrl stack size calculation
On gfx9, the maximum number of wavefronts per queue is the minimum of
40 waves per compute unit, or 512 waves per shader engine. On gfx10,
there can only be 32 waves per compute unit.

Signed-off-by: Laurent Morichetti <laurent.morichetti@amd.com>
Change-Id: I148d1a4fe6c07cdbfaa1f77939eb29311c81c008


[ROCm/ROCR-Runtime commit: 783e346777]
2020-10-16 12:35:11 -07:00
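The wave limits quoted in commit d933e70089 above translate into a small calculation. The function below is a sketch with a hypothetical name and parameters. It assumes the 512-waves-per-shader-engine cap also applies on gfx10, which the commit message does not state explicitly.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: maximum wavefronts per queue from the limits in the
 * commit message. gfx_major is 9 for gfx9 and 10 for gfx10. */
static uint32_t max_waves_per_queue(uint32_t gfx_major, uint32_t num_cus,
                                    uint32_t num_ses)
{
    assert(num_cus > 0 && num_ses > 0);

    uint32_t waves_per_cu = (gfx_major >= 10) ? 32u : 40u;
    uint32_t by_cu = num_cus * waves_per_cu;
    uint32_t by_se = num_ses * 512u; /* assumed to apply on gfx10 as well */

    return by_cu < by_se ? by_cu : by_se;
}
```

For example, a gfx9 part with 64 compute units and 4 shader engines gives min(2560, 2048) = 2048 waves.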
Arun Sunil 8c495a73de CMakeLists.txt: Fix issue with rocrtst
Fix for issue where rocrtst could not be built if out directory 
was outside the src (WORK_ROOT) directory due to hard-coded 
relative path for OPENCL_INC_DIR.

Change-Id: Icb93de2266d568e9c2437166e34c88ec526fb45c


[ROCm/ROCR-Runtime commit: 8d00f1aa59]
2020-10-13 18:14:26 -04:00
Evgeny 2c1f448744 aqlprofile: adding counters DISABLE get-info id
Change-Id: I90d0f6ae96b0d80c481648eecf907301fc13ab74


[ROCm/ROCR-Runtime commit: 0d1e5cbcb6]
2020-10-12 17:12:25 -05:00
Chengming Gui 120457680e kfdtest: Add gfx1032 support
Signed-off-by: Chengming Gui <Jack.Gui@amd.com>
Change-Id: I04cd412a5e243dfe7aa7596287341e1671c1521a


[ROCm/ROCR-Runtime commit: 4a9d55c414]
2020-10-12 11:30:39 +08:00
Chengming Gui c5429d56c1 libhsakmt: Prepare gfx1032 support
PCI IDs have yet to be added.

Signed-off-by: Chengming Gui <Jack.Gui@amd.com>
Change-Id: I28f657201868423012e856df4310a493b7cd5752


[ROCm/ROCR-Runtime commit: b543f4f77c]
2020-10-12 11:30:39 +08:00
Sean Keely c3142d6b6d Initialize intercept queue packets properly.
Change-Id: I0ff1540940665409a9ade3a517dd576a8f334c7b


[ROCm/ROCR-Runtime commit: 9192dfe1b0]
2020-10-08 15:33:43 -05:00
Laurent Morichetti 18240aa6d6 Update the context save area size
Reserve some space in the context save area for the debugger's
use. There should be 32 bytes per wave for a given queue.

Change-Id: I65ddb6123d0f6afd3149844617ad19023009101d


[ROCm/ROCR-Runtime commit: 2ed2e46b9b]
2020-10-05 12:10:58 -07:00
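The debugger reservation in commit 18240aa6d6 above amounts to adding 32 bytes per wave to the context save area. A minimal sketch of that arithmetic follows. The function name and the `base_size` parameter are hypothetical, and any alignment the real code applies is left out.

```c
#include <assert.h>
#include <stdint.h>

#define DEBUGGER_BYTES_PER_WAVE 32u /* from the commit message */

/* Hypothetical helper: context save area size including debugger space. */
static uint64_t ctx_save_area_size(uint64_t base_size, uint32_t waves_per_queue)
{
    uint64_t debug_bytes = (uint64_t)waves_per_queue * DEBUGGER_BYTES_PER_WAVE;

    assert(base_size <= UINT64_MAX - debug_bytes); /* guard against overflow */
    return base_size + debug_bytes;
}
```

For a queue with 2048 waves, the debugger portion alone comes to 64KB.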
Chris Freehill 6e6cc27c73 Add README for rocrtst
Change-Id: Icd43a243ccfc9caf5ade3cd0e7ffc00e251fc0a2


[ROCm/ROCR-Runtime commit: 2b41fb9fdc]
2020-09-28 20:27:53 -05:00
Jay Cornwall 7821213760 libhsakmt: Limit control stack size on gfx10
The queue control stack size cannot exceed 0x7000 on ASICs
gfx1010 through gfx1031. The lower limit is not achievable
with AQL so this should have no practical effect.

Fixes control stack size overflow on large ASICs.

Change-Id: Ib78cf6e4c5f096044bf8de24debe211689891caa
Signed-off-by: Jay Cornwall <jay.cornwall@amd.com>


[ROCm/ROCR-Runtime commit: 44f80d170d]
2020-09-28 18:14:57 -05:00
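The 0x7000 limit from commit 7821213760 above is a simple clamp. In the sketch below, the function name and the `asic_is_limited` flag are assumptions. How libhsakmt actually identifies the gfx1010 through gfx1031 range is not shown in this log.

```c
#include <stdbool.h>
#include <stdint.h>

#define GFX10_CTL_STACK_MAX 0x7000u /* limit quoted in the commit message */

/* Hypothetical helper: cap the control stack size on affected ASICs.
 * asic_is_limited should be true for gfx1010 through gfx1031. */
static uint32_t limit_ctl_stack_size(uint32_t requested, bool asic_is_limited)
{
    if (asic_is_limited && requested > GFX10_CTL_STACK_MAX)
        return GFX10_CTL_STACK_MAX;
    return requested;
}
```

Sizes at or below the limit pass through unchanged, so only large ASICs that would otherwise overflow are affected.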
Sean Keely e78c1df5e3 Correct return type error in hsa_amd_signal_wait_any.
The error checking macro IS_OPEN returns an hsa_signal_t.
This conflicts with the return type of uint32_t.

Add an assert and rely on spurious return rule to return zero
when rocr is not initialized.

Change-Id: Ifc9bb75e22ecdd675273de59b31e5026a69c62e0


[ROCm/ROCR-Runtime commit: a3c4aaf95a]
2020-09-25 21:33:23 -04:00
Sean Keely d78de2d062 Add try/catch blocks to image APIs.
Change-Id: I724dcc8015ac556649278dd6cdf1ad4097aaa846


[ROCm/ROCR-Runtime commit: 248904ab26]
2020-09-22 19:49:36 -04:00
Sean Keely e5df994bff Correct image limits tables to SI limits.
Limits remain unchanged through gfx1030.

Change-Id: Ibdd39b7b97101ea0133af6cebdf295aeef81ac45


[ROCm/ROCR-Runtime commit: 33a57ddf72]
2020-09-22 19:49:08 -04:00
Harish Kasiviswanathan 0e9dbc61d2 libhsakmt: add device ID for gfx1030
Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Change-Id: I3024bf706f88c0e612391f6d8045020101007bdc


[ROCm/ROCR-Runtime commit: 3311785c7b]
2020-09-21 14:23:46 -04:00
Chengming Gui 90298c1436 kfdtest: fix MTYPE error when init sdma fence packet
Signed-off-by: Chengming Gui <Jack.Gui@amd.com>
Change-Id: I14a1ef204dbd5ab1e9f1840b9555f88b0df361c0


[ROCm/ROCR-Runtime commit: befc7edaea]
2020-09-21 03:57:12 -04:00
Jinzhou.Su 8fc695a7c1 kfdtest: fix gfx90c blacklist issue
This patch is a hot fix to update the gfx90c blacklist.

Change-Id: I41f48154ad5ec3035fcb7891a224fc940dca991f


[ROCm/ROCR-Runtime commit: 7efa823ec8]
2020-09-16 09:42:01 +08:00
Jinzhou.Su 0b78c33201 kfdtest: Update gfx90c blacklist
1. Add KFDEvictTest.* for gfx90c based on CI test results

2. Remove the SDMA blacklist now that the SDMA issue is fixed

Change-Id: I86910fc98a5141f29959b35248a900f0c098a6e8


[ROCm/ROCR-Runtime commit: 36249ddc0e]
2020-09-14 22:49:53 -04:00