Commit Graph

2959 Commits

Author SHA1 Message Date
Huang Rui 554ed5e76d Add gfx 10.3.3 into rocrtst list
Change-Id: I854e5092236175e47a2134d703f154885cae8c3e
Signed-off-by: Huang Rui <ray.huang@amd.com>
2021-01-22 04:22:15 -05:00
Huang Rui feeb2f62e2 Add gfx10.3.3 ISA support for Van Gogh
This patch lets ROCr recognize the new gfx10.3.3 ISA.

Change-Id: Ied23eee2752e14c19c8c0a6d7789fded9940e31e
Signed-off-by: Huang Rui <ray.huang@amd.com>
2021-01-22 04:22:15 -05:00
Laurent Morichetti 8aec53969f Don't terminate waves halted at s_endpgm
To support single stepping the instruction preceding an s_endpgm,
unwind the PC by 8 bytes and set ttmp11[9] to notify the debugger
that the wave is halted with a modified PC.

Bump the debug r_version for this new trap handler ABI.

Change-Id: I55e4e0d65576f92da14a336266c31c513baab547
2021-01-21 20:51:38 -08:00
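
The PC adjustment described in the commit above can be illustrated with plain integer arithmetic. This is only a sketch: the function name `halt_at_endpgm` and the modelling of registers as Python integers are assumptions made for illustration, not the trap handler's actual code. Only the 8-byte unwind and the use of bit 9 of ttmp11 come from the commit message.

```python
PC_UNWIND_BYTES = 8          # from the commit message: unwind the PC by 8 bytes
TTMP11_PC_MODIFIED_BIT = 9   # from the commit message: ttmp11[9] flags a modified PC


def halt_at_endpgm(pc, ttmp11):
    """Return the (pc, ttmp11) pair for a wave halted at s_endpgm.

    Hypothetical helper: registers are modelled as plain integers and
    ttmp11 is kept within 32 bits.
    """
    new_pc = pc - PC_UNWIND_BYTES
    new_ttmp11 = (ttmp11 | (1 << TTMP11_PC_MODIFIED_BIT)) & 0xFFFFFFFF
    return new_pc, new_ttmp11
```

A debugger that sees bit 9 set would know the reported PC has been moved back and is not where the wave actually stopped.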
Laurent Morichetti 8808ed3177 Correct gfx10.3+ trap handler.
Change-Id: I77d2b41c8882014a430d741ecd777718a1f61561
2021-01-21 09:24:20 -08:00
Gang Ba 7652932c38 libhsakmt: Correct number of io_links
Inside Docker, when the number of GPUs is limited to one, a node's
numIOLinks may be larger than the total number of nodes.

Signed-off-by: Gang Ba <gaba@amd.com>
Change-Id: Ib84f2f05f8e0c70e48b9043b79aec02b5a214bbe
2021-01-19 19:46:25 -05:00
Tony Tye 26fe26e415 Correct isa lookup for targets that do not support a target feature
Change-Id: I130070a53162e5d9fcc6a64a4bdda7869179be82
2021-01-18 15:47:19 +00:00
changzhu 18d9cca879 Remove MMBench test from kfdtest blacklist for gfx90c and gfx902
The MMBench issue has been fixed by the patch:
kfdtest: Take vram size into account when calculate buffer number
so the test can now be removed from the kfdtest blacklist.

Change-Id: Ib918bca72adf28f4082248fae1e3287d395c32bf
Signed-off-by: changzhu <Changfeng.Zhu@amd.com>
2021-01-18 14:53:45 +08:00
Chris Freehill 09bc75bf0d Correct some target ID strings for gfx908
Change-Id: I7833b561447b9928447cf49472cfe1ca1867e71d
2021-01-15 14:56:38 -06:00
Prike.Liang 7e184ebb3a libhsakmt: add more gfx90c family device support
This patch is to add Cezanne/Lucienne support on thunk.

Change-Id: Icd9b9913fa87bbfe6c71b36a2892d6ddb73e3ddd
Signed-off-by: Prike.Liang <Prike.Liang@amd.com>
2021-01-15 09:48:41 +08:00
Kent Russell bb7e7df02a Remove extra brace, use libsan vs libasan
Change-Id: I82e0d4fc8ea7dc292def7485bcf53c3849442c47
2021-01-14 07:51:23 -05:00
Sean Keely 7bc6aac5d2 Correct computation of scratch slot requirements.
Each SE must be assigned equal numbers of slots and slots
must be assigned in units of whole groups.

Change-Id: I8f3677237fa6f2e2d25e3e78210c5a7a0ad792f3
2021-01-13 15:09:00 -05:00
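
The two constraints named in the commit above (equal slots per SE, whole groups only) amount to rounding the slot count to a multiple of one unit. The sketch below assumes the count is rounded up so that the request is always satisfied; the function name, parameter names and rounding direction are all hypothetical and are not taken from the runtime's code.

```python
def round_scratch_slots(requested_slots, num_se, slots_per_group):
    """Round a slot request so every SE gets the same number of whole groups.

    Hypothetical helper: the total must be a multiple of
    num_se * slots_per_group, and this sketch rounds up to that multiple.
    """
    unit = num_se * slots_per_group           # smallest legal allocation
    units = -(-requested_slots // unit)       # ceiling division
    return units * unit
```

With 4 SEs and 8 slots per group, a request for 100 slots becomes 128, which is 32 slots (4 whole groups) per SE.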
Sean Keely 9fe8ccc3ee Revert "Revert "Cache scratch allocations.""
This reverts commit 7e2ba23566.

Change-Id: I3f3c257270016559f8b2e70151664f0931db28d2
2021-01-13 15:08:53 -05:00
Tony Tye 6bbf6b1c9c Improve Isa class
- Use consistent naming in Isa class.
- Remove unused Isa methods.
- Simplify Isa methods.

Change-Id: I7c4045d08fbfe0d94b3181db8ebc5e5ed8c8cc82
2021-01-10 18:23:54 +00:00
Tony 853ccc762e Store target ID in isa registry
Store target ID string in isa registry and use for returning agent and
isa name.

Change-Id: I72a20d8ff963c73d86392158aff3853e4c9bfdbd
2021-01-10 18:23:54 +00:00
Tony 12eb2764cd Correct code object V2 support
- Remove gfx800, gfx804 and gfx901 as they do not exist.
- Map the V2 note record of "AMD:AMDGPU:8:0:0" to gfx802 as they are
  the same target just connected to a different motherboard.
- Correct typo for supporting gfx902:xnack+.
- Support agent names with a minor or stepping version greater than 9.

Change-Id: Ife933449f60ab4687e2aaab9baf4c9fc5b86339d
2021-01-10 18:23:54 +00:00
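
The mapping from a V2 note record to a target name described in the commit above might look roughly like the following. This is a sketch under assumptions: the function names and the alias table are hypothetical, and the name format (decimal major, then minor and stepping as single hexadecimal digits, which is how a stepping greater than 9 fits into a name such as gfx90c) is assumed here rather than quoted from the loader. Only the 8:0:0 to gfx802 mapping comes from the commit message.

```python
# From the commit message: the V2 note "AMD:AMDGPU:8:0:0" is treated as gfx802.
V2_ALIASES = {(8, 0, 0): (8, 0, 2)}


def format_target(major, minor, stepping):
    """Build a target name; minor and stepping are assumed to be hex digits."""
    return "gfx{}{:x}{:x}".format(major, minor, stepping)


def v2_note_to_target(note):
    """Translate a V2 note string such as "AMD:AMDGPU:8:0:0" to a target name."""
    vendor, arch, major, minor, stepping = note.split(":")
    if (vendor, arch) != ("AMD", "AMDGPU"):
        raise ValueError("unsupported note record: " + note)
    version = (int(major), int(minor), int(stepping))
    version = V2_ALIASES.get(version, version)
    return format_target(*version)
```

For example, `v2_note_to_target("AMD:AMDGPU:8:0:0")` yields `gfx802`, and `format_target(9, 0, 12)` yields `gfx90c`.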
Sean Keely 7e2ba23566 Revert "Cache scratch allocations."
This reverts commit 27e044ae4d.

Change-Id: I698b33bacb2be3de6c8185fe89597a60a79521c5
2021-01-08 11:57:40 -06:00
Kent Russell c742764d01 Merge address sanitizer branch into amd-staging
Merge in topic branch to enable address sanitizer and CLANG compilation
support into amd-staging branch

Change-Id: I3fcd24c6fac83d0619bef4cbbc56fd95e9fb009d
Signed-off-by: Kent Russell <kent.russell@amd.com>
2021-01-06 11:50:54 -05:00
Kent Russell 1290d4d56c CMakeLists.txt: Use %{dist} in RPM naming
The %{dist} suffix is part of the package name due to
CPACK_RPM_PACKAGE_RELEASE_DIST, but the string provided to the
"REQUIRES" field lacks it. Add it in here so the devel package can
reference the thunk package correctly. Use a nice function suggested by
Cole since CPACK_RPM_PACKAGE_RELEASE_DIST has caused some infra issues
in the past

This works for packages built on both Ubuntu and CentOS
Also fix a mistake in the naming for DEBIAN packages, which should be a
no-op since both the DEBIAN and RPM PACKAGE_RELEASE variables are the
same right now

Change-Id: I70659d2e1b6ff9027b8564ca4366d81b0c164760
Signed-off-by: Kent Russell <kent.russell@amd.com>
2021-01-06 08:06:49 -05:00
Sean Keely d39ae13420 Add support for gfx1032.
Change-Id: I36f93a6b61e74cf17aac1a05d7c1d4ba6369fcc9
2021-01-05 17:28:19 -06:00
Kent Russell 3c8273c57b libhsakmt: Explicitly set shared/static sanitizer flags
Don't rely on default values for static/shared sanitizer flags, set them
explicitly based on whether BUILD_SHARED_LIBS is defined or not

Change-Id: Ifbfe95269d1cf184237643176a033a3ce98b62f9
2020-12-24 10:32:01 -08:00
Kent Russell 323bab0734 kfdtest: Quote all CXX flags
Otherwise it doesn't play nicely with -O2
Change-Id: I2e5b60c73ee1ec668b186088a4e2e3a03af65033
2020-12-24 10:32:01 -08:00
Kent Russell 92ad039915 kfdtest: Add sanitizer flags after C flags are set
Otherwise they get overwritten

Change-Id: I9042422d4515e7ac812ed34779906b0b2c44545c
2020-12-24 10:32:01 -08:00
Kent Russell f6f47aa43d Remove address-sanitizer debug messages
Change-Id: I08509aaed36459329f0a65264e42f287c27f4a18
2020-12-24 10:32:01 -08:00
Kent Russell 9cca1216e9 kfdtest: Support address sanitizer in KFDTest
Change-Id: Iee1182608ddc9896c82feb5004b3fe078d3d3223
2020-12-24 10:32:01 -08:00
Kent Russell 8e0a9aa417 Set -no-undefined properly if it's CC
Address-sanitizer doesn't like it at all. And it's called differently
under clang than gcc, so adjust accordingly

Change-Id: Iebe8cd68618d3f7a4c310419c64b4f73d7ecfda4
2020-12-24 10:32:01 -08:00
Kent Russell 3d9f60d7fe CMakeLists: Address-sanitizer fix and cleanup
Move all the logic into 1 spot, and make sure -fsanitize=address is also
passed to the library flags

Change-Id: I7b60629d32df6436b5c7ad37997fe14ea48f5d72
2020-12-24 10:32:01 -08:00
Gefei Jiang d3bc75d229 CMakeLists.txt: Address Sanitizer Support
1. Add the sanitize flag to the link flags.
2. Use ${ADDRESS_SANITIZER} as the condition to turn the feature on/off
   instead of (DEFINED ADDRESS_SANITIZER).
   The latter always turns the feature on, regardless of the value,
   as long as "-DADDRESS_SANITIZER" is on the cmake command line,
   which will be an issue when merging to the mainline.
Amended: put -fsanitize=address at the beginning of the link flags.

Change-Id: I84df0e5b6d7fb8f02f18bf7961f25f15cac10443
Signed-off-by: Gefei Jiang <gefei.jiang@amd.com>
2020-12-24 10:32:01 -08:00
Gefei Jiang b92d28bd71 CMakeLists.txt: Address Sanitize Support
ROCMOPS-1249
	correct if statement and -f flag name

Change-Id: I92e9aa30b1c81f855ad269c0c686ec1e136a85fd
Signed-off-by: Gefei Jiang <gefei.jiang@amd.com>
2020-12-24 10:32:01 -08:00
Gefei Jiang f0e6e7ae17 CMakeList.txt -- Support Address Sanitize
ROCMOPS-1249
	append address sanitize flag

Change-Id: Ie5d1e5b8b93022b80e0ca74106a16d53d52e41af
Signed-off-by: Gefei Jiang <gefei.jiang@amd.com>
2020-12-24 10:32:01 -08:00
Chen Gong 4cf50fdeaa libhsakmt: enhancing support to gfx1033
This patch makes the get_block_properties() function work on the gfx1033 platform.

Change-Id: Ie5be7dfb38575eec8b39b91f3ee5b3a31abe8bd1
Signed-off-by: Chen Gong <curry.gong@amd.com>
2020-12-22 14:15:23 +08:00
Yifan Zhang 742f718722 kfdtest: Take vram size into account when calculate buffer number.
VRAM size is relatively small on APUs, e.g. 512MB.
The current MMBench doesn't support small-VRAM systems.
Running MMBench may produce the errors below:

[ RUN      ] KFDMemoryTest.MMBench
[          ] Found VRAM of 512MB.
[          ] Test (avg. ns)        alloc   mapOne  umapOne   mapAll  umapAll     free
[          ] --------------------------------------------------------------------------
[          ]   4K-SysMem-noSDMA         4569    20098     1292    18835      926     2218
[          ]  64K-SysMem-noSDMA        12738    20469     1030    19201     1293     4560
[          ]   2M-SysMem-noSDMA       256384    21020     1022    20568     1196    36294
[          ]  32M-SysMem-noSDMA      4031812    83750     5406    61156     4312   535656
[          ]   1G-SysMem-noSDMA    129260000   427000    34000   390000    30000 18548000
[          ] --------------------------------------------------------------------------
[          ]   4K-VRAM-noSDMA         3594    19637      979    19624     1357     2829
[          ]  64K-VRAM-noSDMA         3540    21062     1407    19614     1654     3024
/home/foreman/build/hsakmt-roct-amdgpu-1.0.9/sources/libhsakmt/tests/kfdtest/src/KFDMemoryTest.cpp:1119: Failure
Value of: (hsaKmtAllocMemory(allocNode, bufSize, memFlags, &bufs[i]))
  Actual: 6
Expected: HSAKMT_STATUS_SUCCESS
Which is: 0
[  FAILED  ] KFDMemoryTest.MMBench (723 ms)

Fix this issue by changing buffer number calculation in MMBench.

Change-Id: I5cce95707a048248f1e825c807586818619eddaf
Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>
2020-12-17 07:41:24 -05:00
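
The idea behind the fix above, capping the number of benchmark buffers by the available VRAM, can be sketched as follows. Everything here is a hypothetical illustration: the function name, the default cap of 1000 buffers and the choice to use at most half of VRAM are assumptions, not values taken from KFDMemoryTest.cpp.

```python
def mmbench_buffer_count(buf_size, vram_size, max_buffers=1000, vram_divisor=2):
    """Pick how many buffers of buf_size bytes to allocate in VRAM.

    Hypothetical policy: never use more than vram_size / vram_divisor bytes,
    never exceed max_buffers, and always allocate at least one buffer.
    """
    fit = (vram_size // vram_divisor) // buf_size
    return max(1, min(max_buffers, fit))
```

On a 512MB APU this sketch still allows 1000 buffers of 64K, but only 8 buffers of 32M, so large allocations no longer exhaust VRAM.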
Chengming Gui 3ed8b96bf0 kfdtest: remove unsupported modifier 'offset'
fix 
v2: fix VGPR conflict
v3: use s_addc_u32 to replace s_add_u32

Signed-off-by: Chengming Gui <Jack.Gui@amd.com>
Change-Id: I8fe6bf1f5bf99544038ad16128c2bebd559d3da9
2020-12-14 17:29:13 +08:00
Huang Rui 9600760ff7 libhsakmt: add gfx1033 support
This patch is to add Van Gogh support on thunk.

Change-Id: I75819329b865e4c38c097e83e3a0cb4e4f566fa2
Signed-off-by: Huang Rui <ray.huang@amd.com>
2020-12-08 23:54:46 -05:00
Sean Keely bd63a2b690 Cleanup warnings when using clang.
Change-Id: I09f72831e29bccdb4170c54e203872412e2f0b59
2020-12-04 22:18:14 -06:00
changzhu 39386c03bf Add distinguish for iommuv2/dgpu_fallback when getting gpuName
The memory tests for iommuv2 and dgpu_fallback are different, so the
two need to be distinguished.

Change-Id: Icc64e9ae0fc1638c3d148795a5f247d9e5e8e503
Signed-off-by: changzhu <Changfeng.Zhu@amd.com>
2020-12-04 02:24:49 -05:00
Philip Cox 4bbfbe7789 kfdtest: increase default timeout to 10,000
The default kfdtest timeout is not enough for certain platforms, and
tests are failing.

Change-Id: I2027eadcbeb12a2fbbc9c55f92f31869fa13dbcb
Signed-off-by: Philip Cox <Philip.Cox@amd.com>
2020-11-27 15:06:41 -05:00
Gang Ba 8e94dde685 kfdtest: check peer accessible with new function
Check GPU peer accessibility using the p2p_links in the system.

Signed-off-by: Gang Ba <gaba@amd.com>
Change-Id: I026f16564303b687811d6648f0b7f84be6819979
2020-11-26 10:34:06 -05:00
Tony b443397bcc Make supported targets consistent
Add missing target names and make all parts consistent with which
targets are supported.

- Add gfx805 as a supported target.

- Add all ELF targets to generic code.

- Make offline loader match supported targets.

Change-Id: Idab4d69edc71645aecaa83aa55e29c1aeee4c1d6
2020-11-24 03:14:31 +00:00
Kent Russell 2651ce37d8 Look in /opt/rocm* for SMI for setting clocks to high
Now that symlinks aren't necessarily guaranteed, use "find" to try to
find the rocm-smi, and clarify the error message if it is not found

Also tie in a fix for parsing the output now that the output has changed

Change-Id: I2081442a71731c186c3ad00585a2ba6e8a8e5a28
2020-11-23 14:05:10 -05:00
Sean Keely 6182abf5e9 Add asserts and minimum values for kernarg alignment and utility functions.
Kernel argument size and alignment queries are not supported on
code object v3.

Change-Id: I1bdd34e2e62132f912ac39d80355efd3456df87c
2020-11-21 21:39:49 -06:00
Tony ef755e4c82 Update code object V3 kernarg queries
Code object V2 had the ability to support the following queries:

- HSA_CODE_SYMBOL_INFO_KERNEL_KERNARG_SEGMENT_SIZE
- HSA_EXECUTABLE_SYMBOL_INFO_KERNEL_KERNARG_SEGMENT_SIZE
- HSA_CODE_SYMBOL_INFO_KERNEL_KERNARG_SEGMENT_ALIGNMENT
- HSA_EXECUTABLE_SYMBOL_INFO_KERNEL_KERNARG_SEGMENT_ALIGNMENT

However code object V3 onwards cannot support these as the kernel
descriptor changed. These queries need to be deprecated.

Until then return more reasonable values:

- For kernarg alignment return 16 which is the minimum alignment
  required by the HSA standard.

- For kernarg size return the field from the kernel descriptor which
  is a hint. If it is 0 then the compiler is not specifying the kernarg
  size, or the kernel has no kernarg.

Change-Id: I19ce6cd0f3658a2bf62277492f39100ea5ab4256
2020-11-20 21:39:18 -05:00
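
The fallback behaviour described in the commit above can be summarised in a few lines. This is a sketch only: the function name and the tuple return shape are hypothetical, while the alignment of 16 and the use of the kernel descriptor's size hint come from the commit message.

```python
HSA_MIN_KERNARG_ALIGNMENT = 16  # from the commit message: minimum required by HSA


def v3_kernarg_info(descriptor_kernarg_size):
    """Return (size, alignment) for a code object V3 kernel.

    The size is only the kernel descriptor's hint; 0 means either that the
    compiler did not specify a size or that the kernel has no kernarg.
    """
    return descriptor_kernarg_size, HSA_MIN_KERNARG_ALIGNMENT
```

A caller cannot tell the two meanings of a size of 0 apart, which is one reason the commit says these queries need to be deprecated.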
Sean Keely 27e044ae4d Cache scratch allocations.
Avoids calling into KFD to map/unmap scratch allocations for
every dispatch that uses large scratch.

Change-Id: I9fab5705251ec82b03e4f2f2ca6da7cdccabefb9
2020-11-20 15:07:01 -05:00
Sean Keely 32d0fcafa9 Limit clock synchronization to 16Hz.
Improves HIP event performance in directed benchmarks where
clock sync latency is significant.

Change-Id: I78b724a14a8f5b6a9a2b9f4d85afe9d8b81808a6
2020-11-20 15:06:13 -05:00
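
Limiting an operation to 16Hz, as the commit above does for clock synchronization, is commonly done by remembering when it last ran. The class below is a generic rate-limiter sketch, not the runtime's implementation; the class name, method name and injectable `now` parameter are assumptions for illustration.

```python
import time


class ClockSyncLimiter:
    """Allow a resync at most max_hz times per second (hypothetical helper)."""

    def __init__(self, max_hz=16.0, now=time.monotonic):
        self._min_interval = 1.0 / max_hz   # 62.5 ms at 16Hz
        self._now = now                     # injectable clock, eases testing
        self._last_sync = None

    def should_sync(self):
        """Return True, and record the time, if enough time has passed."""
        t = self._now()
        if self._last_sync is None or t - self._last_sync >= self._min_interval:
            self._last_sync = t
            return True
        return False
```

Callers that ask more often than every 62.5 ms simply reuse the previous synchronization result, which is where the latency saving would come from.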
Sean Keely b51f68b535 Style update for SDMA enable flag.
Updated to match xnack flag's style.

Change-Id: I6115c0b53660d789e698de1606a9388ae1789866
2020-11-20 15:06:02 -05:00
Cordell Bloor 4a35f560f6 Fix CMake configure error due to CMP0012
The modern meaning of the construct if( NOT ON ) was added in CMake 2.8,
but when cmake_minimum_required is not set in user code and no policy
level is set in the CMake config, CMake 2.8 features cannot be
used. In old CMake (the default), ON is interpreted as a variable, and
because it is not defined, it is considered false. The same is true of
OFF.

This change sets a variable as ON, so that old CMake interpretation is
correct, and the if works as expected regardless of policy version.

Change-Id: I67d7ed4ceaf8248eeb5a1c7f54009d72313f3f5d
2020-11-20 15:04:41 -05:00
Cole Nelson 90f2dd5b1b opensrc/hsa-runtime/CMakeLists.txt: conformant package names
Names test good:
hsa-rocr-dev_1.2.0.30900-crdnnv.415_amd64.deb
hsa-rocr-dev-1.2.0.30900-crdnnv.415.el7.x86_64.rpm
hsa-rocr-dev-1.2.0.30900-crdnnv.sles151.415.x86_64.rpm

http://confluence.amd.com/display/GPUCPT/Package+File+Naming

Note: rpm requires 'devel' instead of 'dev', to be a subsequent
patchset.

Change-Id: Id6a422f3c335448b52c70c77ed39c9041114b80f
Signed-off-by: Cole Nelson <cole.nelson@amd.com>
2020-11-18 14:56:24 -05:00
Aakash Sudhanwa ac6e96e7a3 Revert "CMakeLists: Fix RPM dependency declaration"
This reverts commit 089fdeb1fe.

Reason for revert: <INSERT REASONING HERE>

Change-Id: Ieaf2da0067c3e89577569c5d478c70b97ca5f5ca
2020-11-17 20:10:45 -05:00
Gang Ba bedecc5957 libhsakmt: Create P2P links
1. Create P2P links
2. Determine FRAMEBUFFER_PUBLIC/PRIVATE based only on
   host accessibility, not peer accessibility

Signed-off-by: Gang Ba <gaba@amd.com>
Change-Id: I15fccdc60386b453e2a47849a16df15157324b21
2020-11-17 15:43:12 -05:00
Kent Russell 089fdeb1fe CMakeLists: Fix RPM dependency declaration
RPM needs _REQUIRES at the end, not _DEPENDS, and also requires a space
before the version of the required package.

Change-Id: I9dd70bd92fc2407b7e8b31e4d46df43c52438a65
Signed-off-by: Kent Russell <kent.russell@amd.com>
2020-11-17 12:36:05 -05:00
Gang Ba e8c0426c54 libhsakmt: add Streaming Performance Monitors APIs
Signed-off-by: Gang Ba <gaba@amd.com>
Change-Id: Iab9a98fa2079b7cae7158c524479dfc3fa672407
2020-11-16 16:36:21 -05:00