Add 'projects/rocprofiler-systems/' from commit '92e1d84c72c9321d79a1866e0090fae0215e6557'

git-subtree-dir: projects/rocprofiler-systems
git-subtree-mainline: ee9e74df21
git-subtree-split: 92e1d84c72
This commit is contained in:
systems-assistant[bot]
2025-07-17 18:13:44 +00:00
parents ee9e74df21 92e1d84c72
commit 6755fa3a36
699 changed files with 179131 additions and 0 deletions
+46
@@ -0,0 +1,46 @@
resources:
  repositories:
  - repository: pipelines_repo
    type: github
    endpoint: ROCm
    name: ROCm/ROCm

variables:
- group: common
- template: /.azuredevops/variables-global.yml@pipelines_repo

trigger:
  batch: true
  branches:
    include:
    - amd-staging
    - amd-mainline
  paths:
    exclude:
    - .github
    - docs
    - '.*.y*ml'
    - '*.md'
    - LICENSE
    - VERSION
    - .wordlist.txt

pr:
  autoCancel: true
  branches:
    include:
    - amd-staging
    - amd-mainline
  paths:
    exclude:
    - .github
    - docs
    - '.*.y*ml'
    - '*.md'
    - LICENSE
    - VERSION
    - .wordlist.txt
  drafts: false

jobs:
  - template: ${{ variables.CI_COMPONENT_PATH }}/rocprofiler-systems.yml@pipelines_repo
+148
@@ -0,0 +1,148 @@
# clang-format v18
---
Language: Cpp
AccessModifierOffset: -4
AlignAfterOpenBracket: Align
AlignConsecutiveMacros: true
AlignConsecutiveAssignments: true
AlignConsecutiveBitFields: true
AlignConsecutiveDeclarations: true
AlignEscapedNewlines: Right
AlignOperands: Align
AlignTrailingComments: true
AllowAllArgumentsOnNextLine: true
AllowAllConstructorInitializersOnNextLine: true
AllowAllParametersOfDeclarationOnNextLine: false
AllowShortEnumsOnASingleLine: true
AllowShortBlocksOnASingleLine: Never
AllowShortCaseLabelsOnASingleLine: true
AllowShortFunctionsOnASingleLine: All
AllowShortLambdasOnASingleLine: All
AllowShortIfStatementsOnASingleLine: true
AllowShortLoopsOnASingleLine: false
AlwaysBreakAfterDefinitionReturnType: TopLevel
AlwaysBreakAfterReturnType: TopLevel
AlwaysBreakBeforeMultilineStrings: false
AlwaysBreakTemplateDeclarations: true
BinPackArguments: true
BinPackParameters: true
BraceWrapping:
  AfterCaseLabel: true
  AfterClass: true
  AfterControlStatement: Always
  AfterEnum: true
  AfterFunction: true
  AfterNamespace: true
  AfterObjCDeclaration: false
  AfterStruct: true
  AfterUnion: true
  AfterExternBlock: true
  BeforeCatch: false
  BeforeElse: true
  BeforeLambdaBody: false
  BeforeWhile: false
  IndentBraces: false
  SplitEmptyFunction: false
  SplitEmptyRecord: false
  SplitEmptyNamespace: false
BreakBeforeBinaryOperators: None
BreakBeforeBraces: Custom
BreakBeforeInheritanceComma: true
BreakInheritanceList: BeforeComma
BreakBeforeTernaryOperators: true
BreakConstructorInitializersBeforeComma: false
BreakConstructorInitializers: BeforeComma
BreakAfterJavaFieldAnnotations: false
BreakStringLiterals: true
ColumnLimit: 90
CommentPragmas: '^ IWYU pragma:'
CompactNamespaces: true
ConstructorInitializerAllOnOneLineOrOnePerLine: false
ConstructorInitializerIndentWidth: 0
ContinuationIndentWidth: 4
Cpp11BracedListStyle: false
DeriveLineEnding: true
DerivePointerAlignment: false
DisableFormat: false
ExperimentalAutoDetectBinPacking: false
FixNamespaceComments: true
ForEachMacros:
  - foreach
  - Q_FOREACH
  - BOOST_FOREACH
IncludeBlocks: Preserve
IncludeCategories:
  - Regex: '^"(llvm|llvm-c|clang|clang-c)/'
    Priority: 2
    SortPriority: 0
  - Regex: '^(<|"(gtest|gmock|isl|json)/)'
    Priority: 3
    SortPriority: 0
  - Regex: '.*'
    Priority: 1
    SortPriority: 0
IncludeIsMainRegex: '(Test)?$'
IncludeIsMainSourceRegex: ''
IndentCaseLabels: true
IndentCaseBlocks: false
IndentGotoLabels: true
IndentPPDirectives: AfterHash
IndentExternBlock: AfterExternBlock
IndentWidth: 4
IndentWrappedFunctionNames: false
InsertTrailingCommas: None
JavaScriptQuotes: Leave
JavaScriptWrapImports: true
KeepEmptyLinesAtTheStartOfBlocks: false
MacroBlockBegin: ''
MacroBlockEnd: ''
MaxEmptyLinesToKeep: 1
NamespaceIndentation: None
ObjCBinPackProtocolList: Auto
ObjCBlockIndentWidth: 2
ObjCBreakBeforeNestedBlockParam: true
ObjCSpaceAfterProperty: true
ObjCSpaceBeforeProtocolList: false
PenaltyBreakAssignment: 2
PenaltyBreakBeforeFirstCallParameter: 19
PenaltyBreakComment: 300
PenaltyBreakFirstLessLess: 120
PenaltyBreakString: 1000
PenaltyBreakTemplateDeclaration: 10
PenaltyExcessCharacter: 1000000
PenaltyReturnTypeOnItsOwnLine: 200
PointerAlignment: Left
ReflowComments: true
SortIncludes: true
SortUsingDeclarations: true
SpaceAfterCStyleCast: true
SpaceAfterLogicalNot: false
SpaceAfterTemplateKeyword: true
SpaceBeforeAssignmentOperators: true
SpaceBeforeCpp11BracedList: false
SpaceBeforeCtorInitializerColon: true
SpaceBeforeInheritanceColon: true
SpaceBeforeParens: Never
SpaceBeforeRangeBasedForLoopColon: true
SpaceInEmptyBlock: false
SpaceInEmptyParentheses: false
SpacesBeforeTrailingComments: 2
SpacesInAngles: false
SpacesInConditionalStatement: false
SpacesInContainerLiterals: true
SpacesInCStyleCastParentheses: false
SpacesInParentheses: false
SpacesInSquareBrackets: false
SpaceBeforeSquareBrackets: false
Standard: Latest
StatementMacros:
  - Q_UNUSED
  - QT_REQUIRE_VERSION
TabWidth: 4
UseCRLF: false
UseTab: Never
WhitespaceSensitiveMacros:
  - STRINGIZE
  - PP_STRINGIZE
  - BOOST_PP_STRINGIZE
...
+50
@@ -0,0 +1,50 @@
---
Checks: "-*,\
misc-*,\
-misc-incorrect-roundings,\
-misc-macro-parentheses,\
-misc-misplaced-widening-cast,\
-misc-static-assert,\
-misc-no-recursion,\
-misc-non-private-member-variables-in-classes,\
modernize-*,\
-modernize-deprecated-headers,\
-modernize-raw-string-literal,\
-modernize-return-braced-init-list,\
-modernize-use-transparent-functors,\
-modernize-use-trailing-return-type,\
-modernize-avoid-c-arrays,\
-modernize-redundant-void-arg,\
-modernize-use-using,\
-modernize-use-auto,\
-modernize-concat-nested-namespaces,\
-modernize-use-nodiscard,\
performance-*,\
readability-*,\
-readability-function-size,\
-readability-identifier-naming,\
-readability-identifier-length,\
-readability-implicit-bool-cast,\
-readability-inconsistent-declaration-parameter-name,\
-readability-named-parameter,\
-readability-magic-numbers,\
-readability-redundant-declaration,\
-readability-redundant-member-init,\
-readability-simplify-boolean-expr,\
-readability-uppercase-literal-suffix,\
-readability-braces-around-statements,\
-readability-avoid-const-params-in-decls,\
-readability-else-after-return,\
-readability-isolate-declaration,\
-readability-redundant-string-cstr,\
-readability-static-accessed-through-instance,\
-readability-const-return-type,\
-readability-redundant-access-specifiers,\
-readability-function-cognitive-complexity,\
"
CheckOptions:
  - key: readability-braces-around-statements.ShortStatementLines
    value: '2'
  - key: readability-implicit-bool-conversion.AllowPointerConditions
    value: '1'
...
+5
@@ -0,0 +1,5 @@
# yaml-language-server: $schema=https://raw.githubusercontent.com/BlankSpruce/gersemi/0.19.3/gersemi/configuration.schema.json
warn_about_unknown_commands: false
indent: 4
line_length: 90
+8
@@ -0,0 +1,8 @@
* @ROCm/rocprof-sys @jrmadsen
# Documentation files
docs/** @ROCm/rocm-documentation
*.md @ROCm/rocm-documentation
*.rst @ROCm/rocm-documentation
.readthedocs.yaml @ROCm/rocm-documentation
docs/sphinx/* @samjwu
+17
@@ -0,0 +1,17 @@
# To get started with Dependabot version updates, you'll need to specify which
# package ecosystems to update and where the package manifests are located.
# Please see the documentation for all configuration options:
# https://docs.github.com/github/administering-a-repository/configuration-options-for-dependency-updates
version: 2
updates:
  - package-ecosystem: "pip" # See documentation for possible values
    directory: "/docs/sphinx" # Location of package manifests
    open-pull-requests-limit: 10
    schedule:
      interval: "daily"
    labels:
      - "documentation"
      - "dependencies"
    reviewers:
      - "samjwu"
+35
@@ -0,0 +1,35 @@
# rocprofiler-systems Pull Request
## Related Issue
<!-- Please link to the external GitHub issue(s) that this PR addresses.
If providing a JIRA ticket, please don't include an internal link -->
- [ ] Closes #<issue number>
## What type of PR is this? (check all that apply)
- [ ] Bug Fix
- [ ] Cherry Pick
- [ ] Continuous Integration
- [ ] Documentation Update
- [ ] Feature
- [ ] Optimization
- [ ] Refactor
- [ ] Other (please specify)
## Technical Details
<!-- Please explain the changes. -->
## Have you added or updated tests to validate functionality?
- [ ] Yes
- [ ] No - does not apply to this PR
## Added / Updated documentation?
- [ ] Yes
- [ ] No - does not apply to this PR
## Have you updated CHANGELOG?
<!-- Needed for Release updates for a ROCm release. -->
- [ ] Yes
- [ ] No - does not apply to this PR
+189
@@ -0,0 +1,189 @@
name: Continuous Integration Containers
run-name: ci-containers
# nightly build
on:
workflow_dispatch:
schedule:
- cron: 0 5 * * *
push:
branches: [amd-staging, amd-mainline]
paths:
- '.github/workflows/containers.yml'
- 'docker/**'
concurrency:
group: ${{ github.workflow }}-${{ github.ref }}
cancel-in-progress: true
env:
GIT_DISCOVERY_ACROSS_FILESYSTEM: 1
jobs:
rocprofiler-systems-ci:
if: github.repository == 'ROCm/rocprofiler-systems'
runs-on: ubuntu-latest
strategy:
fail-fast: false
matrix:
include:
- distro: "ubuntu"
version: "20.04"
- distro: "ubuntu"
version: "22.04"
- distro: "ubuntu"
version: "24.04"
- distro: "opensuse"
version: "15.5"
- distro: "opensuse"
version: "15.6"
- distro: "rhel"
version: "8.10"
- distro: "rhel"
version: "9.3"
- distro: "rhel"
version: "9.4"
- distro: "rhel"
version: "9.5"
steps:
- uses: actions/checkout@v4
with:
submodules: recursive
- name: Set up QEMU
uses: docker/setup-qemu-action@v2
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@v2
- name: Login to DockerHub
uses: docker/login-action@v2
with:
username: ${{ secrets.DOCKERHUB_USERNAME }}
password: ${{ secrets.DOCKERHUB_TOKEN }}
- name: Build CI Container
timeout-minutes: 45
uses: nick-fields/retry@v3
with:
retry_wait_seconds: 60
timeout_minutes: 45
max_attempts: 3
command: |
pushd docker
./build-docker-ci.sh --distro ${{ matrix.distro }} --versions ${{ matrix.version }} --user ${{ secrets.DOCKERHUB_USERNAME }} --push --jobs 2 --elfutils-version 0.186 --boost-version 1.79.0
popd
rocprofiler-systems-release:
if: github.repository == 'ROCm/rocprofiler-systems'
runs-on: ubuntu-latest
strategy:
fail-fast: false
matrix:
include:
# ubuntu 20.04
- os-distro: "ubuntu"
os-version: "20.04"
rocm-version: "0.0"
- os-distro: "ubuntu"
os-version: "20.04"
rocm-version: "6.3"
# ubuntu 22.04
- os-distro: "ubuntu"
os-version: "22.04"
rocm-version: "0.0"
- os-distro: "ubuntu"
os-version: "22.04"
rocm-version: "6.3"
- os-distro: "ubuntu"
os-version: "22.04"
rocm-version: "6.4"
# ubuntu 24.04
- os-distro: "ubuntu"
os-version: "24.04"
rocm-version: "0.0"
- os-distro: "ubuntu"
os-version: "24.04"
rocm-version: "6.3"
- os-distro: "ubuntu"
os-version: "24.04"
rocm-version: "6.4"
# opensuse 15.5
- os-distro: "opensuse"
os-version: "15.5"
rocm-version: "0.0"
- os-distro: "opensuse"
os-version: "15.5"
rocm-version: "6.3"
# opensuse 15.6
- os-distro: "opensuse"
os-version: "15.6"
rocm-version: "0.0"
- os-distro: "opensuse"
os-version: "15.6"
rocm-version: "6.3"
- os-distro: "opensuse"
os-version: "15.6"
rocm-version: "6.4"
# RHEL 8.10
- os-distro: "rhel"
os-version: "8.10"
rocm-version: "0.0"
- os-distro: "rhel"
os-version: "8.10"
rocm-version: "6.3"
- os-distro: "rhel"
os-version: "8.10"
rocm-version: "6.4"
# RHEL 9.4
- os-distro: "rhel"
os-version: "9.4"
rocm-version: "0.0"
- os-distro: "rhel"
os-version: "9.4"
rocm-version: "6.3"
- os-distro: "rhel"
os-version: "9.4"
rocm-version: "6.4"
# RHEL 9.5
- os-distro: "rhel"
os-version: "9.5"
rocm-version: "0.0"
- os-distro: "rhel"
os-version: "9.5"
rocm-version: "6.3"
- os-distro: "rhel"
os-version: "9.5"
rocm-version: "6.4"
steps:
- uses: actions/checkout@v4
- name: Set up QEMU
uses: docker/setup-qemu-action@v2
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@v2
- name: Login to DockerHub
uses: docker/login-action@v2
with:
username: ${{ secrets.DOCKERHUB_USERNAME }}
password: ${{ secrets.DOCKERHUB_TOKEN }}
- name: Build Base Container
timeout-minutes: 45
uses: nick-fields/retry@v3
with:
retry_wait_seconds: 60
timeout_minutes: 45
max_attempts: 3
command: |
pushd docker
./build-docker.sh --distro ${{ matrix.os-distro }} --versions ${{ matrix.os-version }} --rocm-versions ${{ matrix.rocm-version }} --user ${{ secrets.DOCKERHUB_USERNAME }} --push
popd
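Note on the release matrix above: every distro / OS version / ROCm version combination is written out by hand. As a rough sketch only (not part of this commit), the same 22 entries could be generated from a compact table. The names `SUPPORTED` and `release_matrix` are hypothetical, and the reading of `"0.0"` as "build without ROCm" is an assumption drawn from how the workflows branch on that value.

```python
# Hypothetical table: (os-distro, os-version) -> ROCm versions built for it.
# "0.0" is assumed to mean "no ROCm", mirroring the matrix in the workflow.
SUPPORTED = {
    ("ubuntu", "20.04"): ["0.0", "6.3"],
    ("ubuntu", "22.04"): ["0.0", "6.3", "6.4"],
    ("ubuntu", "24.04"): ["0.0", "6.3", "6.4"],
    ("opensuse", "15.5"): ["0.0", "6.3"],
    ("opensuse", "15.6"): ["0.0", "6.3", "6.4"],
    ("rhel", "8.10"): ["0.0", "6.3", "6.4"],
    ("rhel", "9.4"): ["0.0", "6.3", "6.4"],
    ("rhel", "9.5"): ["0.0", "6.3", "6.4"],
}


def release_matrix(supported=SUPPORTED):
    """Expand the table into a list shaped like the workflow's `include:` entries."""
    return [
        {"os-distro": distro, "os-version": version, "rocm-version": rocm}
        for (distro, version), rocm_versions in supported.items()
        for rocm in rocm_versions
    ]
```

Generating the list would keep the container and packaging matrices in sync, at the cost of an extra step to emit the YAML.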
+191
@@ -0,0 +1,191 @@
name: Installer Packaging (CPack)
run-name: cpack
on:
workflow_dispatch:
push:
branches: [amd-staging, amd-mainline, release/**]
tags:
- "v[1-9].[0-9]+.[0-9]+*"
- "rocm-[1-9].[0-9]+.[0-9]+*"
paths-ignore:
- '*.md'
- 'docs/**'
- 'source/docs/**'
pull_request:
branches: [amd-staging, amd-mainline, release/**]
paths:
- '.github/workflows/cpack.yml'
- 'docker/**'
concurrency:
group: ${{ github.workflow }}-${{ github.ref }}
cancel-in-progress: true
env:
GIT_DISCOVERY_ACROSS_FILESYSTEM: 1
jobs:
installers:
if: github.repository == 'ROCm/rocprofiler-systems'
runs-on: ubuntu-latest
strategy:
fail-fast: false
matrix:
include:
# ubuntu 20.04
- os-distro: "ubuntu"
os-version: "20.04"
rocm-version: "0.0"
- os-distro: "ubuntu"
os-version: "20.04"
rocm-version: "6.3"
# ubuntu 22.04
- os-distro: "ubuntu"
os-version: "22.04"
rocm-version: "0.0"
- os-distro: "ubuntu"
os-version: "22.04"
rocm-version: "6.3"
- os-distro: "ubuntu"
os-version: "22.04"
rocm-version: "6.4"
# ubuntu 24.04
- os-distro: "ubuntu"
os-version: "24.04"
rocm-version: "0.0"
- os-distro: "ubuntu"
os-version: "24.04"
rocm-version: "6.3"
- os-distro: "ubuntu"
os-version: "24.04"
rocm-version: "6.4"
# opensuse 15.5
- os-distro: "opensuse"
os-version: "15.5"
rocm-version: "0.0"
- os-distro: "opensuse"
os-version: "15.5"
rocm-version: "6.3"
# opensuse 15.6
- os-distro: "opensuse"
os-version: "15.6"
rocm-version: "0.0"
- os-distro: "opensuse"
os-version: "15.6"
rocm-version: "6.3"
- os-distro: "opensuse"
os-version: "15.6"
rocm-version: "6.4"
# RHEL 8.10
- os-distro: "rhel"
os-version: "8.10"
rocm-version: "0.0"
- os-distro: "rhel"
os-version: "8.10"
rocm-version: "6.3"
- os-distro: "rhel"
os-version: "8.10"
rocm-version: "6.4"
# RHEL 9.4
- os-distro: "rhel"
os-version: "9.4"
rocm-version: "0.0"
- os-distro: "rhel"
os-version: "9.4"
rocm-version: "6.3"
- os-distro: "rhel"
os-version: "9.4"
rocm-version: "6.4"
# RHEL 9.5
- os-distro: "rhel"
os-version: "9.5"
rocm-version: "0.0"
- os-distro: "rhel"
os-version: "9.5"
rocm-version: "6.3"
- os-distro: "rhel"
os-version: "9.5"
rocm-version: "6.4"
steps:
- name: Free Disk Space
uses: jlumbroso/free-disk-space@v1.2.0
with:
tool-cache: false
android: true
dotnet: true
haskell: true
large-packages: false
swap-storage: false
- uses: actions/checkout@v4
with:
submodules: recursive
- name: Configure ROCm Version
if: ${{ matrix.rocm-version == 0 }}
run: |
echo "CI_SCRIPT_ARGS=--core +python" >> $GITHUB_ENV
- name: Configure ROCm Version
if: ${{ matrix.rocm-version > 0 }}
run: |
echo "CI_SCRIPT_ARGS=--rocm +python" >> $GITHUB_ENV
- name: Configure Generators
run: |
echo "CI_GENERATOR_ARGS=--generators STGZ" >> $GITHUB_ENV
- name: Build Base Container
timeout-minutes: 30
run: |
pushd docker
./build-docker.sh --distro ${{ matrix.os-distro }} --versions ${{ matrix.os-version }} --rocm-versions ${{ matrix.rocm-version }}
popd
- name: Build Release
timeout-minutes: 150
run: |
pushd docker
./build-docker-release.sh --distro ${{ matrix.os-distro }} --versions ${{ matrix.os-version }} --rocm-versions ${{ matrix.rocm-version }} -- ${CI_SCRIPT_ARGS} ${CI_GENERATOR_ARGS}
popd
- name: List Files
timeout-minutes: 10
run: |
find build-release -type f | egrep '\.(sh|deb|rpm)$'
- name: STGZ Artifacts
timeout-minutes: 10
uses: actions/upload-artifact@v4
with:
name: rocprofiler-systems-stgz-${{ matrix.os-distro }}-${{ matrix.os-version }}-rocm-${{ matrix.rocm-version }}-installer
path: |
build-release/stgz/*.sh
# before testing remove any artifacts of the build
- name: Remove Build
timeout-minutes: 10
run: |
shopt -s nullglob
for i in $(find build-release -type f | egrep '/(stgz|deb|rpm)/.*\.(sh|deb|rpm)$'); do mv ${i} ./; done
sudo rm -rf build-release
sudo rm -rf /opt/rocprofiler-systems
- name: Test STGZ Install
timeout-minutes: 20
run: |
set -v
for i in rocprofiler-systems-*.sh
do
./docker/test-docker-release.sh --distro ${{ matrix.os-distro }} --versions ${{ matrix.os-version }} --rocm-versions ${{ matrix.rocm-version }} -- --stgz ${i}
done
- name: Upload STGZ Release Assets
uses: softprops/action-gh-release@v2
if: startsWith(github.ref, 'refs/tags/') && github.repository == 'ROCm/rocprofiler-systems'
with:
fail_on_unmatched_files: True
files: |
rocprofiler-systems-*.sh
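Note on the tag filters in this workflow (`v[1-9].[0-9]+.[0-9]+*` and `rocm-[1-9].[0-9]+.[0-9]+*`): these are GitHub filter patterns, not regular expressions. The sketch below translates them into Python regexes under the assumption that `.` is literal, `+` means "one or more of the preceding character", and `*` matches any run of characters other than `/`. The helper name `is_release_tag` is hypothetical and only illustrates which tags the filters are expected to accept.

```python
import re

# Assumed regex equivalents of the workflow's tag filter patterns.
TAG_PATTERNS = [
    re.compile(r"v[1-9]\.[0-9]+\.[0-9]+[^/]*"),
    re.compile(r"rocm-[1-9]\.[0-9]+\.[0-9]+[^/]*"),
]


def is_release_tag(tag: str) -> bool:
    """Return True if the whole tag name matches one of the assumed patterns."""
    return any(p.fullmatch(tag) is not None for p in TAG_PATTERNS)
```

Under this reading the major version is a single digit from 1 to 9, so a tag such as `v10.0.0` would not trigger the workflow.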
+104
@@ -0,0 +1,104 @@
name: Formatting
run-name: formatting
on:
push:
branches: [ amd-mainline, amd-staging, release/** ]
pull_request:
branches: [ amd-mainline, amd-staging, release/** ]
concurrency:
group: ${{ github.workflow }}-${{ github.ref }}
cancel-in-progress: true
jobs:
call-workflow-passing-data:
name: Documentation
uses: ROCm/rocm-docs-core/.github/workflows/linting.yml@develop
python:
runs-on: ubuntu-22.04
strategy:
matrix:
python-version: [3.8]
steps:
- uses: actions/checkout@v4
- name: Set up Python ${{ matrix.python-version }}
uses: actions/setup-python@v5
with:
python-version: ${{ matrix.python-version }}
- name: Install dependencies
run: |
python -m pip install --upgrade pip
pip install black
if [ -f requirements.txt ]; then pip install -r requirements.txt; fi
- name: black format
run: |
black --diff --check .
cmake:
runs-on: ubuntu-24.04
steps:
- uses: actions/checkout@v4
- name: Install dependencies
run: |
sudo apt-get update
sudo apt-get install -y python3-pip
python3 -m pip install gersemi
- name: gersemi
run: |
set +e
gersemi -i $(find . -type f ! -path '*/external/*' | grep -E 'CMakeLists.txt|\.cmake$')
if [ $(git diff | wc -l) -gt 0 ]; then
echo -e "\nError! CMake code not formatted. Run gersemi ...\n"
echo -e "\nFiles:\n"
git diff --name-only
echo -e "\nFull diff:\n"
git diff
exit 1
fi
source:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- name: Install dependencies
run: |
DISTRIB_CODENAME=$(cat /etc/lsb-release | grep DISTRIB_CODENAME | awk -F '=' '{print $NF}')
sudo apt-get update
sudo apt-get install -y software-properties-common wget curl clang-format-18
- name: clang-format
run: |
set +e
FILES=$(find source examples tests -type f | egrep '\.(h|hpp|c|cpp)(|\.in)$')
FORMAT_OUT=$(clang-format-18 -output-replacements-xml ${FILES})
RET=$(echo ${FORMAT_OUT} | grep -c '<replacement ')
if [ "${RET}" -ne 0 ]; then
echo -e "\nError! Code not formatted. Detected ${RET} lines\n"
clang-format-18 -i ${FILES}
git diff
exit ${RET}
fi
includes:
runs-on: ubuntu-22.04
steps:
- uses: actions/checkout@v4
- name: check-includes
run: |
set +e
FILES=$(find source examples -type f | egrep '\.(hpp|cpp)(|\.in)$')
MATCHES=$(egrep 'include "timemory/|include <bits/' ${FILES})
if [ -n "${MATCHES}" ]; then
echo -e "\nError! Included timemory header with quotes or bits folder included\n"
echo -e "### MATCHES: ###"
echo -e "${MATCHES}"
echo -e "################"
exit 1
fi
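Note on the `check-includes` job above: it rejects timemory headers included with quotes and any include from the `bits/` folder. A minimal Python sketch of the same check follows; it reuses the pattern from the workflow's `egrep` call, while the function name `forbidden_includes` is hypothetical.

```python
import re

# Same alternation as the workflow's egrep pattern.
FORBIDDEN = re.compile(r'include "timemory/|include <bits/')


def forbidden_includes(lines):
    """Return the lines that contain a quoted timemory include or a bits/ include."""
    return [line for line in lines if FORBIDDEN.search(line)]
```

For example, `#include "timemory/units.hpp"` would be flagged, while `#include <timemory/units.hpp>` would pass.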
+176
@@ -0,0 +1,176 @@
name: OpenSUSE 15 (GCC, Python)
run-name: opensuse-15
on:
push:
branches: [ amd-mainline, amd-staging, release/** ]
paths-ignore:
- '*.md'
- 'docs/**'
- 'source/docs/**'
- 'source/python/gui/**'
- '.github/workflows/docs.yml'
- '.github/workflows/cpack.yml'
- '.github/workflows/containers.yml'
- '.github/workflows/formatting.yml'
- '.github/workflows/weekly-mainline-sync.yml'
- 'docker/**'
- .wordlist.txt
- CMakePresets.json
pull_request:
branches: [ amd-mainline, amd-staging, release/** ]
paths-ignore:
- '*.md'
- 'docs/**'
- 'source/docs/**'
- 'source/python/gui/**'
- '.github/workflows/docs.yml'
- '.github/workflows/cpack.yml'
- '.github/workflows/containers.yml'
- '.github/workflows/formatting.yml'
- '.github/workflows/weekly-mainline-sync.yml'
- 'docker/**'
- .wordlist.txt
- CMakePresets.json
concurrency:
group: ${{ github.workflow }}-${{ github.ref }}
cancel-in-progress: true
env:
ROCPROFSYS_CI: ON
ROCPROFSYS_TMPDIR: "%env{PWD}%/testing-tmp"
jobs:
opensuse:
runs-on: ubuntu-latest
container:
image: dgaliffiamd/rocprofiler-systems:ci-base-opensuse-${{ matrix.os-release }}
strategy:
fail-fast: false
matrix:
compiler: ['g++']
os-release: [ '15.5', '15.6' ]
build-type: ['Release']
steps:
- uses: actions/checkout@v4
- name: Install Packages
timeout-minutes: 25
uses: nick-fields/retry@v3
with:
retry_wait_seconds: 30
timeout_minutes: 25
max_attempts: 5
command: |
if [ "${{ matrix.os-release }}" == "15.5" ]; then
wget https://commondatastorage.googleapis.com/perfetto-luci-artifacts/v47.0/linux-amd64/trace_processor_shell -P /opt/trace_processor/bin &&
chmod +x /opt/trace_processor/bin/trace_processor_shell
fi
python3 -m pip install --upgrade pip &&
python3 -m pip install --upgrade numpy perfetto dataclasses &&
python3 -m pip install 'cmake==3.21' &&
for i in 6 7 8 9 10 11; do /opt/conda/envs/py3.${i}/bin/python -m pip install --upgrade numpy perfetto dataclasses; done
- name: Configure Env
run:
echo "CC=$(echo '${{ matrix.compiler }}' | sed 's/+/c/g')" >> $GITHUB_ENV &&
echo "CXX=${{ matrix.compiler }}" >> $GITHUB_ENV &&
echo "/opt/rocprofiler-systems/bin:${HOME}/.local/bin" >> $GITHUB_PATH &&
echo "LD_LIBRARY_PATH=/opt/rocprofiler-systems/lib:${LD_LIBRARY_PATH}" >> $GITHUB_ENV
- name: Configure, Build, and Test
timeout-minutes: 115
shell: bash
run:
git config --global --add safe.directory ${PWD} &&
cmake --version &&
python3 ./scripts/run-ci.py -B build
--name ${{ github.repository_owner }}-${{ github.ref_name }}-opensuse-${{ matrix.os-release }}-${{ matrix.compiler }}-nompi-python
--build-jobs 2
--site GitHub
--
-DCMAKE_C_COMPILER=$(echo '${{ matrix.compiler }}' | sed 's/+/c/g')
-DCMAKE_CXX_COMPILER=${{ matrix.compiler }}
-DCMAKE_BUILD_TYPE=${{ matrix.build-type }}
-DCMAKE_INSTALL_PREFIX=/opt/rocprofiler-systems
-DROCPROFSYS_BUILD_TESTING=ON
-DROCPROFSYS_USE_MPI=OFF
-DROCPROFSYS_USE_ROCM=OFF
-DROCPROFSYS_USE_OMPT=OFF
-DROCPROFSYS_USE_PYTHON=ON
-DROCPROFSYS_BUILD_DYNINST=ON
-DROCPROFSYS_BUILD_BOOST=ON
-DROCPROFSYS_BUILD_TBB=ON
-DROCPROFSYS_BUILD_ELFUTILS=ON
-DROCPROFSYS_BUILD_LIBIBERTY=ON
-DROCPROFSYS_INSTALL_PERFETTO_TOOLS=OFF
-DROCPROFSYS_USE_MPI_HEADERS=ON
-DROCPROFSYS_PYTHON_PREFIX=/opt/conda/envs
-DROCPROFSYS_PYTHON_ENVS="py3.6;py3.7;py3.8;py3.9;py3.10;py3.11"
-DROCPROFSYS_CI_MPI_RUN_AS_ROOT=ON
-DROCPROFSYS_MAX_THREADS=64
-DROCPROFSYS_DISABLE_EXAMPLES="transpose;rccl;openmp-target;videodecode;jpegdecode"
-DROCPROFSYS_BUILD_NUMBER=${{ github.run_attempt }}
--
-LE "transpose|rccl|videodecode|jpegdecode|network|mpi"
- name: Install
timeout-minutes: 10
run:
cmake --build build --target install --parallel 2
- name: Test Install
timeout-minutes: 10
run: |
set -v
export ROCPROFSYS_DEBUG=ON
which rocprof-sys-avail
ldd $(which rocprof-sys-avail)
rocprof-sys-avail --help
rocprof-sys-avail -a
which rocprof-sys-instrument
ldd $(which rocprof-sys-instrument)
rocprof-sys-instrument --help
rocprof-sys-instrument -e -v 1 -o ls.inst --simulate -- ls
for i in $(find rocprofsys-ls.inst-output -type f); do echo -e "\n\n --> ${i} \n\n"; cat ${i}; done
rocprof-sys-instrument -e -v 1 -o ls.inst -- ls
rocprof-sys-run -- ./ls.inst
rocprof-sys-instrument -e -v 1 --simulate -- ls
for i in $(find rocprofsys-ls-output -type f); do echo -e "\n\n --> ${i} \n\n"; cat ${i}; done
rocprof-sys-instrument -e -v 1 -- ls
- name: Test User API
timeout-minutes: 10
run: |
set -v
./scripts/test-find-package.sh --install-dir /opt/rocprofiler-systems
- name: CTest Artifacts
if: failure()
continue-on-error: True
uses: actions/upload-artifact@v4
with:
name: ctest-${{ github.job }}-${{ strategy.job-index }}-log
path: |
build/*.log
- name: Data Artifacts
if: failure()
continue-on-error: True
uses: actions/upload-artifact@v4
with:
name: data-${{ github.job }}-${{ strategy.job-index }}-files
path: |
build/rocprofsys-tests-config/*.cfg
build/rocprofsys-tests-output/**/*.txt
build/rocprofsys-tests-output/**/*-instr*.json
- name: Kill Perfetto
if: success() || failure()
continue-on-error: True
run: |
set +e
RUNNING_PROCS=$(pgrep trace_processor_shell)
if [ -n "${RUNNING_PROCS}" ]; then kill -s 9 ${RUNNING_PROCS}; fi
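Note on the `sed 's/+/c/g'` idiom used above for `CC` and `CMAKE_C_COMPILER`: it derives the C compiler name from the C++ compiler name by replacing every `+` with `c`. A small Python equivalent, with the hypothetical name `c_compiler_for`, makes the mapping and its limits explicit.

```python
def c_compiler_for(cxx: str) -> str:
    """Mirror `sed 's/+/c/g'`: replace every '+' in the C++ compiler name with 'c'."""
    return cxx.replace("+", "c")
```

This works for the `g++` family (`g++` becomes `gcc`, `g++-11` becomes `gcc-11`) but would turn `clang++` into `clangcc`, so it only suits matrices restricted to GCC, as the matrices in these workflows are.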
+45
@@ -0,0 +1,45 @@
name: Python
run-name: Python

on:
  push:
    branches: [ amd-mainline, amd-staging ]
    paths:
      - 'source/python/gui/*.py'
      - 'source/python/gui/**/*.py'
  pull_request:
    branches: [ amd-mainline, amd-staging ]
    paths:
      - 'source/python/gui/*.py'
      - 'source/python/gui/**/*.py'

concurrency:
  group: ${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: true

jobs:
  linting:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        python-version: ["3.7", "3.8", "3.9", "3.10"]

    steps:
    - uses: actions/checkout@v4
    - name: Set up Python ${{ matrix.python-version }}
      uses: actions/setup-python@v5
      with:
        python-version: ${{ matrix.python-version }}
    - name: Install dependencies
      working-directory: ${{ github.workspace }}/source/python/gui
      run: |
        python -m pip install --upgrade pip
        pip install flake8
        if [ -f requirements.txt ]; then pip install -r requirements.txt; fi
    - name: Lint with flake8
      working-directory: ${{ github.workspace }}/source/python/gui
      run: |
        # stop the build if there are Python syntax errors or undefined names
        flake8 . --count --select=E9,F63,F7,F82 --show-source --statistics
        # flake8 options are defined in setup.cfg
        flake8 . --count --statistics
+180
@@ -0,0 +1,180 @@
name: RedHat Linux (GCC, Python, ROCm)
run-name: redhat
on:
push:
branches: [ amd-mainline, amd-staging, release/** ]
paths-ignore:
- '*.md'
- 'docs/**'
- 'source/docs/**'
- 'source/python/gui/**'
- '.github/workflows/docs.yml'
- '.github/workflows/cpack.yml'
- '.github/workflows/containers.yml'
- '.github/workflows/formatting.yml'
- '.github/workflows/weekly-mainline-sync.yml'
- 'docker/**'
- .wordlist.txt
- CMakePresets.json
pull_request:
branches: [ amd-mainline, amd-staging, release/** ]
paths-ignore:
- '*.md'
- 'docs/**'
- 'source/docs/**'
- 'source/python/gui/**'
- '.github/workflows/docs.yml'
- '.github/workflows/cpack.yml'
- '.github/workflows/containers.yml'
- '.github/workflows/formatting.yml'
- '.github/workflows/weekly-mainline-sync.yml'
- 'docker/**'
- .wordlist.txt
- CMakePresets.json
concurrency:
group: ${{ github.workflow }}-${{ github.ref }}
cancel-in-progress: true
env:
ROCPROFSYS_CI: ON
ROCPROFSYS_TMPDIR: "%env{PWD}%/testing-tmp"
jobs:
rhel:
runs-on: ubuntu-latest
container:
image: dgaliffiamd/rocprofiler-systems:ci-base-rhel-${{ matrix.os-release }}
strategy:
fail-fast: false
matrix:
compiler: ['g++']
os-release: [ '8.10', '9.3', '9.4' ]
rocm-version: [ '0.0', '6.3', '6.4' ]
build-type: ['Release']
steps:
- uses: actions/checkout@v4
- name: Configure Env
shell: bash
run:
echo "CC=$(echo '${{ matrix.compiler }}' | sed 's/+/c/g')" >> $GITHUB_ENV &&
echo "CXX=${{ matrix.compiler }}" >> $GITHUB_ENV &&
echo "OS_VERSION_MAJOR=$(cat /etc/os-release | grep 'VERSION_ID' | sed 's/=/ /1' | awk '{print $NF}' | sed 's/"//g' | sed 's/\./ /g' | awk '{print $1}')" >> $GITHUB_ENV &&
env
- name: Install Packages
shell: bash
run: |
if [ $OS_VERSION_MAJOR -eq 8 ]; then
wget https://commondatastorage.googleapis.com/perfetto-luci-artifacts/v47.0/linux-amd64/trace_processor_shell -P /opt/trace_processor/bin &&
chmod +x /opt/trace_processor/bin/trace_processor_shell
fi
python3 -m pip install --upgrade pip &&
python3 -m pip install --upgrade numpy perfetto dataclasses &&
python3 -m pip install 'cmake==3.21' &&
for i in 6 7 8 9 10 11; do /opt/conda/envs/py3.${i}/bin/python -m pip install --upgrade numpy perfetto dataclasses; done
- name: Install ROCm Packages
if: ${{ matrix.rocm-version > 0 }}
timeout-minutes: 30
shell: bash
run: |
RPM_TAG=".el${OS_VERSION_MAJOR}"
ROCM_VERSION=${{ matrix.rocm-version }}
ROCM_MAJOR=$(echo ${ROCM_VERSION} | sed 's/\./ /g' | awk '{print $1}')
ROCM_MINOR=$(echo ${ROCM_VERSION} | sed 's/\./ /g' | awk '{print $2}')
ROCM_VERSN=$(( (${ROCM_MAJOR}*10000)+(${ROCM_MINOR}*100) ))
if [ "${OS_VERSION_MAJOR}" -eq 8 ]; then PERL_REPO=powertools; else PERL_REPO=crb; fi
dnf -y --enablerepo=${PERL_REPO} install perl-File-BaseDir
yum install -y https://repo.radeon.com/amdgpu-install/${{ matrix.rocm-version }}/rhel/${{ matrix.os-release }}/amdgpu-install-${ROCM_MAJOR}.${ROCM_MINOR}.${ROCM_VERSN}-1${RPM_TAG}.noarch.rpm
yum install -y rocm-dev rocdecode-devel
if [ "${OS_VERSION_MAJOR}" -gt 8 ]; then dnf install -y libavcodec-free-devel libavformat-free-devel; fi
- name: Configure, Build, and Test
timeout-minutes: 115
shell: bash
run:
git config --global --add safe.directory ${PWD} &&
cmake --version &&
TAG="${{ github.repository_owner }}-${{ github.ref_name }}-rhel-${{ matrix.os-release }}-${{ matrix.compiler }}-python-mpip" &&
USE_HIP=OFF &&
if [ ${{ matrix.rocm-version }} != "0.0" ]; then USE_HIP=ON; TAG="${TAG}-rocm-${{ matrix.rocm-version }}"; fi &&
python3 ./scripts/run-ci.py -B build
--name ${TAG}
--build-jobs 2
--site GitHub
--
-DCMAKE_C_COMPILER=$(echo '${{ matrix.compiler }}' | sed 's/+/c/g')
-DCMAKE_CXX_COMPILER=${{ matrix.compiler }}
-DCMAKE_BUILD_TYPE=${{ matrix.build-type }}
-DCMAKE_INSTALL_PREFIX=/opt/rocprofiler-systems
-DROCPROFSYS_BUILD_TESTING=ON
-DROCPROFSYS_USE_MPI=OFF
-DROCPROFSYS_USE_ROCM=${USE_HIP}
-DROCPROFSYS_USE_OMPT=OFF
-DROCPROFSYS_USE_PYTHON=ON
-DROCPROFSYS_BUILD_DYNINST=ON
-DROCPROFSYS_BUILD_BOOST=ON
-DROCPROFSYS_BUILD_TBB=ON
-DROCPROFSYS_BUILD_ELFUTILS=ON
-DROCPROFSYS_BUILD_LIBIBERTY=ON
-DROCPROFSYS_USE_MPI_HEADERS=ON
-DROCPROFSYS_CI_MPI_RUN_AS_ROOT=ON
-DROCPROFSYS_MAX_THREADS=64
-DROCPROFSYS_INSTALL_PERFETTO_TOOLS=OFF
-DROCPROFSYS_PYTHON_PREFIX=/opt/conda/envs
-DROCPROFSYS_PYTHON_ENVS="py3.6;py3.7;py3.8;py3.9;py3.10;py3.11"
-DROCPROFSYS_DISABLE_EXAMPLES="transpose;rccl;openmp-target"
-DROCPROFSYS_BUILD_NUMBER=${{ github.run_attempt }}
--
-LE "transpose|rccl|videodecode|jpegdecode|network"
- name: Install
timeout-minutes: 10
run:
cmake --build build --target install --parallel 2
- name: Test Install
timeout-minutes: 10
shell: bash
run: |
set -v
source /opt/rocprofiler-systems/share/rocprofiler-systems/setup-env.sh
./scripts/test-install.sh --test-rocprof-sys-{instrument,avail,sample,rewrite,runtime,python}=1
- name: Test User API
timeout-minutes: 10
run: |
set -v
./scripts/test-find-package.sh --install-dir /opt/rocprofiler-systems
- name: CTest Artifacts
if: failure()
continue-on-error: True
uses: actions/upload-artifact@v4
with:
name: ctest-${{ github.job }}-${{ strategy.job-index }}-log
path: |
build/*.log
- name: Data Artifacts
if: failure()
continue-on-error: True
uses: actions/upload-artifact@v4
with:
name: data-${{ github.job }}-${{ strategy.job-index }}-files
path: |
build/rocprofsys-tests-config/*.cfg
build/rocprofsys-tests-output/**/*.txt
build/rocprofsys-tests-output/**/*-instr*.json
- name: Kill Perfetto
if: success() || failure()
continue-on-error: True
run: |
set +e
RUNNING_PROCS=$(pgrep trace_processor_shell)
if [ -n "${RUNNING_PROCS}" ]; then kill -s 9 ${RUNNING_PROCS}; fi
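Note on the "Install ROCm Packages" step above: it encodes the ROCm version as `major*10000 + minor*100` and embeds that number in the `amdgpu-install` RPM file name. The sketch below reproduces that arithmetic and the URL layout from the step. The function names are hypothetical, the OS major version is taken from the `os-release` matrix value rather than from `/etc/os-release` as the workflow does, and whether a given URL actually exists on the server is not checked here.

```python
def rocm_version_number(version: str) -> int:
    """Encode 'MAJOR.MINOR' as MAJOR*10000 + MINOR*100, as the shell step does."""
    major, minor = (int(part) for part in version.split(".")[:2])
    return major * 10000 + minor * 100


def amdgpu_install_url(rocm_version: str, os_release: str) -> str:
    """Build the installer RPM URL using the same layout as the workflow step."""
    major, minor = rocm_version.split(".")[:2]
    os_major = os_release.split(".")[0]  # assumption: derived from the matrix value
    versn = rocm_version_number(rocm_version)
    return (
        f"https://repo.radeon.com/amdgpu-install/{rocm_version}/rhel/{os_release}/"
        f"amdgpu-install-{major}.{minor}.{versn}-1.el{os_major}.noarch.rpm"
    )
```

For ROCm 6.3 the encoded number is 60300, so the package name contains `6.3.60300`.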
+40
@@ -0,0 +1,40 @@
name: Release

on:
  workflow_dispatch:
  push:
    tags:
      - "v[1-9].[0-9]+.[0-9]+*"
      - "rocm-[1-9].[0-9]+.[0-9]+*"

concurrency:
  group: ${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: true

env:
  GIT_DISCOVERY_ACROSS_FILESYSTEM: 1

jobs:
  release:
    if: github.repository == 'ROCm/rocprofiler-systems'
    runs-on: ubuntu-latest
    permissions:
      contents: write
      packages: write
    steps:
      - name: Checkout
        uses: actions/checkout@v4
      - name: Generate generic installer script
        shell: bash
        run: |
          sudo apt-get update
          sudo apt-get install -y cmake
          cmake -D OUTPUT_DIR=${PWD} -P scripts/write-rocprof-sys-install.cmake
      - name: Generate Release
        uses: softprops/action-gh-release@v1
        with:
          draft: False
          generate_release_notes: True
          fail_on_unmatched_files: True
          files: |
            rocprofiler-systems-install.py
+487
@@ -0,0 +1,487 @@
name: Ubuntu 20.04 (GCC, Python, ROCm, MPICH, OpenMPI)
run-name: ubuntu-focal
on:
push:
branches: [ amd-mainline, amd-staging, release/** ]
paths-ignore:
- '*.md'
- 'docs/**'
- 'source/docs/**'
- 'source/python/gui/**'
- '.github/workflows/docs.yml'
- '.github/workflows/cpack.yml'
- '.github/workflows/containers.yml'
- '.github/workflows/formatting.yml'
- '.github/workflows/weekly-mainline-sync.yml'
- 'docker/**'
- .wordlist.txt
- CMakePresets.json
pull_request:
branches: [ amd-mainline, amd-staging, release/** ]
paths-ignore:
- '*.md'
- 'docs/**'
- 'source/docs/**'
- 'source/python/gui/**'
- '.github/workflows/docs.yml'
- '.github/workflows/cpack.yml'
- '.github/workflows/containers.yml'
- '.github/workflows/formatting.yml'
- '.github/workflows/weekly-mainline-sync.yml'
- 'docker/**'
- .wordlist.txt
- CMakePresets.json
concurrency:
group: ${{ github.workflow }}-${{ github.ref }}
cancel-in-progress: true
env:
ROCPROFSYS_CI: ON
ROCPROFSYS_TMPDIR: "%env{PWD}%/testing-tmp"
jobs:
ubuntu-focal-external:
runs-on: ubuntu-latest
container:
image: dgaliffiamd/rocprofiler-systems:ci-base-ubuntu-20.04
strategy:
fail-fast: false
matrix:
compiler: ['g++-7', 'g++-8']
lto: ['OFF']
strip: ['OFF']
python: ['OFF']
build-type: ['Release']
mpi-headers: ['OFF']
static-libgcc: ['OFF']
static-libstdcxx: ['OFF']
include:
- compiler: 'g++-9'
lto: 'OFF'
strip: 'ON'
python: 'OFF'
build-type: 'Release'
mpi-headers: 'ON'
static-libgcc: 'ON'
static-libstdcxx: 'ON'
- compiler: 'g++-10'
lto: 'OFF'
strip: 'ON'
python: 'ON'
build-type: 'Release'
mpi-headers: 'ON'
static-libgcc: 'ON'
static-libstdcxx: 'OFF'
- compiler: 'g++-11'
lto: 'ON'
strip: 'ON'
python: 'OFF'
build-type: 'Release'
mpi-headers: 'ON'
static-libgcc: 'ON'
static-libstdcxx: 'OFF'
steps:
- uses: actions/checkout@v4
- name: Install Packages
timeout-minutes: 25
uses: nick-fields/retry@v3
with:
retry_wait_seconds: 30
timeout_minutes: 25
max_attempts: 5
command: |
apt-get update &&
apt-get install -y software-properties-common &&
add-apt-repository -y ppa:ubuntu-toolchain-r/test &&
apt-get update &&
apt-get upgrade -y &&
apt-get install -y autoconf bison build-essential clang environment-modules gettext libiberty-dev libmpich-dev libtool m4 mpich python3-pip texinfo ${{ matrix.compiler }} &&
wget https://commondatastorage.googleapis.com/perfetto-luci-artifacts/v47.0/linux-amd64/trace_processor_shell -P /opt/trace_processor/bin &&
chmod +x /opt/trace_processor/bin/trace_processor_shell &&
python3 -m pip install --upgrade pip &&
python3 -m pip install --upgrade numpy perfetto dataclasses &&
python3 -m pip install 'cmake==3.21' &&
for i in 6 7 8 9 10 11; do /opt/conda/envs/py3.${i}/bin/python -m pip install --upgrade numpy perfetto dataclasses; done &&
apt-get -y --purge autoremove &&
apt-get -y clean &&
/opt/conda/bin/conda clean -y -a
- name: Test Environment Modules
timeout-minutes: 15
shell: bash
run: |
set -v
source /usr/share/modules/init/$(basename ${SHELL})
module avail
- name: Configure Env
run:
echo "CC=$(echo '${{ matrix.compiler }}' | sed 's/+/c/g')" >> $GITHUB_ENV &&
echo "CXX=${{ matrix.compiler }}" >> $GITHUB_ENV
- name: Configure, Build, and Test
timeout-minutes: 115
shell: bash
run:
git config --global --add safe.directory ${PWD} &&
TAG="" &&
append-tagname() { if [ "${1}" == "ON" ]; then TAG="${TAG}-${2}"; fi; } &&
append-tagname ${{ matrix.lto }} lto &&
append-tagname ${{ matrix.strip }} strip &&
append-tagname ${{ matrix.python }} python &&
append-tagname ${{ matrix.mpi-headers }} mpip &&
append-tagname ${{ matrix.static-libgcc }} libgcc &&
append-tagname ${{ matrix.static-libstdcxx }} libstdcxx &&
cmake --version &&
python3 ./scripts/run-ci.py -B build
--name ${{ github.repository_owner }}-${{ github.ref_name }}-ubuntu-focal-${{ matrix.compiler }}${TAG}
--build-jobs 2
--site GitHub
--
-DCMAKE_C_COMPILER=$(echo '${{ matrix.compiler }}' | sed 's/+/c/g')
-DCMAKE_CXX_COMPILER=${{ matrix.compiler }}
-DCMAKE_BUILD_TYPE=${{ matrix.build-type }}
-DCMAKE_INSTALL_PREFIX=/opt/rocprofiler-systems
-DROCPROFSYS_BUILD_TESTING=ON
-DROCPROFSYS_USE_MPI=OFF
-DROCPROFSYS_USE_ROCM=OFF
-DROCPROFSYS_USE_OMPT=OFF
-DROCPROFSYS_USE_PAPI=OFF
-DROCPROFSYS_BUILD_DYNINST=ON
-DROCPROFSYS_BUILD_BOOST=ON
-DROCPROFSYS_BUILD_TBB=ON
-DROCPROFSYS_BUILD_ELFUTILS=ON
-DROCPROFSYS_BUILD_LIBIBERTY=ON
-DROCPROFSYS_USE_PYTHON=${{ matrix.python }}
-DROCPROFSYS_USE_MPI_HEADERS=${{ matrix.mpi-headers }}
-DROCPROFSYS_STRIP_LIBRARIES=${{ matrix.strip }}
-DROCPROFSYS_BUILD_LTO=${{ matrix.lto }}
-DROCPROFSYS_BUILD_STATIC_LIBGCC=${{ matrix.static-libgcc }}
-DROCPROFSYS_BUILD_STATIC_LIBSTDCXX=${{ matrix.static-libstdcxx }}
-DROCPROFSYS_PYTHON_PREFIX=/opt/conda/envs
-DROCPROFSYS_PYTHON_ENVS="py3.6;py3.7;py3.8;py3.9;py3.10;py3.11"
-DROCPROFSYS_MAX_THREADS=64
-DROCPROFSYS_DISABLE_EXAMPLES="transpose;rccl;videodecode;jpegdecode;openmp-target"
-DROCPROFSYS_BUILD_NUMBER=${{ github.run_attempt }}
-DMPI_HEADERS_ALLOW_MPICH=OFF
--
-LE "transpose|rccl|videodecode|jpegdecode|network"
- name: Test Build-Tree Module
timeout-minutes: 45
shell: bash
run: |
cd build
source /usr/share/modules/init/$(basename ${SHELL})
module use ./share/modulefiles
module avail
module load rocprofiler-systems
echo $(which rocprof-sys-instrument)
ldd $(which rocprof-sys-instrument)
rocprof-sys-instrument --help
rocprof-sys-avail --help
rocprof-sys-sample --help
- name: Test Build-Tree Source Script
timeout-minutes: 45
shell: bash
run: |
cd build
source ./share/rocprofiler-systems/setup-env.sh
echo $(which rocprof-sys-instrument)
ldd $(which rocprof-sys-instrument)
rocprof-sys-instrument --help
rocprof-sys-avail --help
rocprof-sys-sample --help
- name: Install
timeout-minutes: 10
run:
cmake --build build --target install --parallel 2
- name: Test Install
timeout-minutes: 15
shell: bash
run: |
source /usr/share/modules/init/$(basename ${SHELL})
module use /opt/rocprofiler-systems/share/modulefiles
module avail
module load rocprofiler-systems
./scripts/test-install.sh --test-rocprof-sys-{instrument,avail,sample,rewrite,runtime}=1 --test-rocprof-sys-python=${{ matrix.python }}
- name: Test User API
timeout-minutes: 10
run: |
set -v
./scripts/test-find-package.sh --install-dir /opt/rocprofiler-systems
- name: CTest Artifacts
if: failure()
continue-on-error: True
uses: actions/upload-artifact@v4
with:
name: ctest-${{ github.job }}-${{ strategy.job-index }}-log
path: |
build/*.log
- name: Data Artifacts
if: failure()
continue-on-error: True
uses: actions/upload-artifact@v4
with:
name: data-${{ github.job }}-${{ strategy.job-index }}-files
path: |
build/rocprofsys-tests-config/*.cfg
build/rocprofsys-tests-output/**/*.txt
build/rocprofsys-tests-output/**/*-instr*.json
- name: Kill Perfetto
if: success() || failure()
continue-on-error: True
run: |
set +e
RUNNING_PROCS=$(pgrep trace_processor_shell)
if [ -n "${RUNNING_PROCS}" ]; then kill -s 9 ${RUNNING_PROCS}; fi
ubuntu-focal-external-rocm:
runs-on: ubuntu-latest
container:
image: dgaliffiamd/rocprofiler-systems:ci-base-ubuntu-20.04
strategy:
fail-fast: false
matrix:
compiler: ['g++']
rocm-version: ['6.3']
mpi-headers: ['OFF']
build-jobs: ['3']
ctest-exclude: ['-LE "transpose|videodecode|jpegdecode|network"']
env:
BUILD_TYPE: MinSizeRel
OMPI_ALLOW_RUN_AS_ROOT: 1
OMPI_ALLOW_RUN_AS_ROOT_CONFIRM: 1
steps:
- uses: actions/checkout@v4
- name: Install Packages
timeout-minutes: 25
uses: nick-fields/retry@v3
with:
retry_wait_seconds: 30
timeout_minutes: 25
max_attempts: 5
command: |
apt-get update &&
apt-get install -y software-properties-common wget gnupg2 &&
ROCM_VERSION=${{ matrix.rocm-version }} &&
ROCM_MAJOR=$(echo ${ROCM_VERSION} | sed 's/\./ /g' | awk '{print $1}') &&
ROCM_MINOR=$(echo ${ROCM_VERSION} | sed 's/\./ /g' | awk '{print $2}') &&
ROCM_VERSN=$(( (${ROCM_MAJOR}*10000)+(${ROCM_MINOR}*100) )) &&
echo "ROCM_MAJOR=${ROCM_MAJOR} ROCM_MINOR=${ROCM_MINOR} ROCM_VERSN=${ROCM_VERSN}" &&
wget -q https://repo.radeon.com/amdgpu-install/${{ matrix.rocm-version }}/ubuntu/focal/amdgpu-install_${ROCM_MAJOR}.${ROCM_MINOR}.${ROCM_VERSN}-1_all.deb &&
apt-get install -y ./amdgpu-install_${ROCM_MAJOR}.${ROCM_MINOR}.${ROCM_VERSN}-1_all.deb &&
apt-get update &&
apt-get install -y autoconf bison build-essential clang curl gettext libfabric-dev libnuma1 libomp-dev libopenmpi-dev libpapi-dev libtool libudev1 m4 openmpi-bin python3-pip rocm-dev texinfo &&
apt-get install -y rocdecode-dev libavformat-dev libavcodec-dev &&
wget https://commondatastorage.googleapis.com/perfetto-luci-artifacts/v47.0/linux-amd64/trace_processor_shell -P /opt/trace_processor/bin &&
chmod +x /opt/trace_processor/bin/trace_processor_shell &&
python3 -m pip install --upgrade pip &&
python3 -m pip install --upgrade numpy perfetto dataclasses &&
python3 -m pip install 'cmake==3.21' &&
for i in 6 7 8 9 10 11; do /opt/conda/envs/py3.${i}/bin/python -m pip install --upgrade numpy perfetto dataclasses; done &&
apt-get -y --purge autoremove &&
apt-get -y clean &&
/opt/conda/bin/conda clean -y -a
- name: Configure Env
run: |
echo "CC=$(echo '${{ matrix.compiler }}' | sed 's/+/c/g')" >> $GITHUB_ENV
echo "CXX=${{ matrix.compiler }}" >> $GITHUB_ENV
echo "CMAKE_PREFIX_PATH=/opt/dyninst:${CMAKE_PREFIX_PATH}" >> $GITHUB_ENV
echo "LD_LIBRARY_PATH=/opt/rocm/lib:/usr/local/lib:${LD_LIBRARY_PATH}" >> $GITHUB_ENV
cat << EOF > test-install.cfg
ROCPROFSYS_PROFILE = ON
ROCPROFSYS_TRACE = ON
ROCPROFSYS_USE_PID = OFF
ROCPROFSYS_USE_SAMPLING = OFF
ROCPROFSYS_USE_PROCESS_SAMPLING = OFF
ROCPROFSYS_COUT_OUTPUT = ON
ROCPROFSYS_TIME_OUTPUT = OFF
ROCPROFSYS_TIMEMORY_COMPONENTS = cpu_clock cpu_util current_peak_rss kernel_mode_time monotonic_clock monotonic_raw_clock network_stats num_io_in num_io_out num_major_page_faults num_minor_page_faults page_rss peak_rss priority_context_switch process_cpu_clock process_cpu_util read_bytes read_char system_clock thread_cpu_clock thread_cpu_util timestamp trip_count user_clock user_mode_time virtual_memory voluntary_context_switch wall_clock written_bytes written_char
ROCPROFSYS_OUTPUT_PATH = rocprofsys-tests-output
ROCPROFSYS_OUTPUT_PREFIX = %tag%/
ROCPROFSYS_DEBUG = OFF
ROCPROFSYS_VERBOSE = 3
ROCPROFSYS_DL_VERBOSE = 3
ROCPROFSYS_PERFETTO_BACKEND = system
EOF
realpath test-install.cfg
cat test-install.cfg
- name: Configure, Build, and Test
timeout-minutes: 115
shell: bash
run:
git config --global --add safe.directory ${PWD} &&
cmake --version &&
TAG="-rocm-${{ matrix.rocm-version }}" &&
TAG="$(echo ${TAG} | sed 's/debian/latest/g')" &&
python3 ./scripts/run-ci.py -B build
--name ${{ github.repository_owner }}-${{ github.ref_name }}-ubuntu-focal-rocm-${{ matrix.compiler }}${TAG}
--build-jobs 2
--site GitHub
--
-DCMAKE_C_COMPILER=$(echo '${{ matrix.compiler }}' | sed 's/+/c/g')
-DCMAKE_CXX_COMPILER=${{ matrix.compiler }}
-DCMAKE_BUILD_TYPE=${{ env.BUILD_TYPE }}
-DCMAKE_INSTALL_PREFIX=/opt/rocprofiler-systems
-DROCPROFSYS_BUILD_TESTING=ON
-DROCPROFSYS_BUILD_DEVELOPER=ON
-DROCPROFSYS_BUILD_EXTRA_OPTIMIZATIONS=OFF
-DROCPROFSYS_BUILD_LTO=OFF
-DROCPROFSYS_USE_MPI=OFF
-DROCPROFSYS_USE_ROCM=ON
-DROCPROFSYS_MAX_THREADS=64
-DROCPROFSYS_USE_PAPI=OFF
-DROCPROFSYS_USE_OMPT=OFF
-DROCPROFSYS_USE_PYTHON=ON
-DROCPROFSYS_USE_MPI_HEADERS=${{ matrix.mpi-headers }}
-DROCPROFSYS_BUILD_DYNINST=ON
-DROCPROFSYS_BUILD_BOOST=ON
-DROCPROFSYS_BUILD_TBB=ON
-DROCPROFSYS_BUILD_ELFUTILS=ON
-DROCPROFSYS_BUILD_LIBIBERTY=ON
-DROCPROFSYS_USE_SANITIZER=OFF
-DROCPROFSYS_PYTHON_PREFIX=/opt/conda/envs
-DROCPROFSYS_PYTHON_ENVS="py3.6;py3.7;py3.8;py3.9;py3.10;py3.11"
-DROCPROFSYS_CI_MPI_RUN_AS_ROOT=${{ matrix.mpi-headers }}
-DROCPROFSYS_CI_GPU=OFF
-DCMAKE_INSTALL_RPATH_USE_LINK_PATH=OFF
-DROCPROFSYS_BUILD_NUMBER=${{ github.run_attempt }}
--
${{ matrix.ctest-exclude }}
- name: Install
run:
cmake --build build --target install --parallel 2
- name: Test Install
timeout-minutes: 15
shell: bash
run: |
source /opt/rocprofiler-systems/share/rocprofiler-systems/setup-env.sh
./scripts/test-install.sh --test-rocprof-sys-{instrument,avail,sample,python,rewrite,runtime}=1
- name: Test User API
timeout-minutes: 10
run: |
set -v
./scripts/test-find-package.sh --install-dir /opt/rocprofiler-systems
- name: CTest Artifacts
if: failure()
continue-on-error: True
uses: actions/upload-artifact@v4
with:
name: ctest-${{ github.job }}-${{ strategy.job-index }}-log
path: |
build/*.log
- name: Data Artifacts
if: failure()
continue-on-error: True
uses: actions/upload-artifact@v4
with:
name: data-${{ github.job }}-${{ strategy.job-index }}-files
path: |
rocprofsys-tests-output/**/*.txt
build/rocprofsys-tests-config/*.cfg
build/rocprofsys-tests-output/**/*.txt
build/rocprofsys-tests-output/**/*-instr*.json
- name: Kill Perfetto
if: success() || failure()
continue-on-error: True
run: |
set +e
RUNNING_PROCS=$(pgrep trace_processor_shell)
if [ -n "${RUNNING_PROCS}" ]; then kill -s 9 ${RUNNING_PROCS}; fi
ubuntu-focal-codecov:
runs-on: ubuntu-latest
container:
image: dgaliffiamd/rocprofiler-systems:ci-base-ubuntu-20.04
options: --cap-add CAP_SYS_ADMIN
env:
ROCPROFSYS_VERBOSE: 2
ROCPROFSYS_CAUSAL_BACKEND: perf
steps:
- uses: actions/checkout@v4
- name: Install Packages
timeout-minutes: 25
uses: nick-fields/retry@v3
with:
retry_wait_seconds: 30
timeout_minutes: 25
max_attempts: 5
command: |
apt-get update &&
apt-get install -y autoconf bison build-essential clang environment-modules gcc g++ libmpich-dev libomp-dev libtool m4 mpich python3-pip texinfo &&
wget https://commondatastorage.googleapis.com/perfetto-luci-artifacts/v47.0/linux-amd64/trace_processor_shell -P /opt/trace_processor/bin &&
chmod +x /opt/trace_processor/bin/trace_processor_shell &&
python3 -m pip install --upgrade pip &&
python3 -m pip install --upgrade numpy perfetto dataclasses &&
python3 -m pip install 'cmake==3.21' &&
for i in 6 7 8 9 10 11; do /opt/conda/envs/py3.${i}/bin/python -m pip install --upgrade numpy perfetto dataclasses; done &&
apt-get -y --purge autoremove &&
apt-get -y clean &&
/opt/conda/bin/conda clean -y -a
- name: Configure Env
run:
echo "${HOME}/.local/bin" >> $GITHUB_PATH
- name: Configure, Build, and Test
timeout-minutes: 115
shell: bash
run:
git config --global --add safe.directory ${PWD} &&
cmake --version &&
python3 ./scripts/run-ci.py -B build
--name ${{ github.repository_owner }}-${{ github.ref_name }}-ubuntu-focal-codecov-mpi-python-ompt-papi
--build-jobs 2
--site GitHub
--coverage
--
-DCMAKE_INSTALL_PREFIX=/opt/rocprofiler-systems
-DROCPROFSYS_BUILD_CI=OFF
-DROCPROFSYS_BUILD_TESTING=ON
-DROCPROFSYS_BUILD_DYNINST=ON
-DROCPROFSYS_BUILD_BOOST=ON
-DROCPROFSYS_BUILD_TBB=ON
-DROCPROFSYS_BUILD_ELFUTILS=ON
-DROCPROFSYS_BUILD_LIBIBERTY=ON
-DROCPROFSYS_BUILD_DEBUG=OFF
-DROCPROFSYS_BUILD_HIDDEN_VISIBILITY=OFF
-DROCPROFSYS_USE_MPI=ON
-DROCPROFSYS_USE_PYTHON=ON
-DROCPROFSYS_USE_OMPT=ON
-DROCPROFSYS_USE_PAPI=ON
-DROCPROFSYS_USE_ROCM=OFF
-DROCPROFSYS_USE_RCCL=OFF
-DROCPROFSYS_MAX_THREADS=64
-DROCPROFSYS_DISABLE_EXAMPLES="transpose;rccl;videodecode;jpegdecode;openmp-target"
-DROCPROFSYS_BUILD_NUMBER=${{ github.run_attempt }}
--
-LE "transpose|rccl|videodecode|jpegdecode|network"
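The build-name suffix and the C compiler name used by the focal jobs above can be traced with a small standalone sketch. It assumes the option values of the `g++-10` matrix entry and uses a POSIX-friendly variant of the workflow's `append-tagname` helper (renamed `append_tagname` here, which is an assumption made for portability, not the workflow's own spelling).

```shell
#!/bin/sh
# Sketch only: reproduces the tag-suffix and C-compiler derivation from the
# "Configure, Build, and Test" step. Option values mirror the g++-10 matrix
# entry and are hard-coded here for illustration.

TAG=""

# POSIX-friendly variant of the workflow's append-tagname helper:
# append "-<label>" to TAG only when the option value is exactly "ON".
append_tagname() {
    if [ "$1" = "ON" ]; then TAG="${TAG}-$2"; fi
}

append_tagname OFF lto
append_tagname ON  strip
append_tagname ON  python
append_tagname ON  mpip
append_tagname ON  libgcc
append_tagname OFF libstdcxx

CXX_COMPILER='g++-10'
# In sed's basic regular expressions '+' is a literal character, so every
# '+' in the C++ compiler name becomes 'c'.
C_COMPILER=$(echo "${CXX_COMPILER}" | sed 's/+/c/g')

echo "C compiler: ${C_COMPILER}"
echo "tag suffix: ${TAG}"
```

Options set to `OFF` contribute nothing to the suffix, so the CI build name only lists the features that are enabled for that matrix entry.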
@@ -0,0 +1,381 @@
name: Ubuntu 22.04 (GCC, Python, ROCm)
run-name: ubuntu-jammy
on:
push:
branches: [ amd-mainline, amd-staging, release/** ]
paths-ignore:
- '*.md'
- 'docs/**'
- 'source/docs/**'
- 'source/python/gui/**'
- '.github/workflows/docs.yml'
- '.github/workflows/cpack.yml'
- '.github/workflows/containers.yml'
- '.github/workflows/formatting.yml'
- '.github/workflows/weekly-mainline-sync.yml'
- 'docker/**'
- .wordlist.txt
- CMakePresets.json
pull_request:
branches: [ amd-mainline, amd-staging, release/** ]
paths-ignore:
- '*.md'
- 'docs/**'
- 'source/docs/**'
- 'source/python/gui/**'
- '.github/workflows/docs.yml'
- '.github/workflows/cpack.yml'
- '.github/workflows/containers.yml'
- '.github/workflows/formatting.yml'
- '.github/workflows/weekly-mainline-sync.yml'
- 'docker/**'
- .wordlist.txt
- CMakePresets.json
concurrency:
group: ${{ github.workflow }}-${{ github.ref }}
cancel-in-progress: true
env:
ROCPROFSYS_CI: ON
ROCPROFSYS_TMPDIR: "%env{PWD}%/testing-tmp"
jobs:
ubuntu-jammy-external:
runs-on: ubuntu-latest
container:
image: dgaliffiamd/rocprofiler-systems:ci-base-ubuntu-22.04
strategy:
fail-fast: false
matrix:
compiler: ['g++-11', 'g++-12']
rocm: ['OFF']
mpi: ['OFF']
ompt: ['OFF']
papi: ['OFF']
python: ['ON']
strip: ['OFF']
hidden: ['ON', 'OFF']
build-type: ['Release']
mpi-headers: ['ON', 'OFF']
build-dyninst: ['ON']
rocm-version: ['0.0']
env:
OMPI_ALLOW_RUN_AS_ROOT: 1
OMPI_ALLOW_RUN_AS_ROOT_CONFIRM: 1
ROCPROFSYS_CI: 'ON'
steps:
- uses: actions/checkout@v4
- name: Install Packages
timeout-minutes: 25
uses: nick-fields/retry@v3
with:
retry_wait_seconds: 30
timeout_minutes: 25
max_attempts: 5
command: |
apt-get update &&
apt-get install -y software-properties-common &&
apt-get upgrade -y &&
apt-get install -y autoconf bison build-essential clang environment-modules \
gettext libfabric-dev libiberty-dev libomp-dev libopenmpi-dev libtool m4 \
openmpi-bin python3-pip texinfo ${{ matrix.compiler }} &&
python3 -m pip install --upgrade pip &&
python3 -m pip install --upgrade numpy perfetto dataclasses &&
python3 -m pip install 'cmake==3.21' &&
for i in 6 7 8 9 10 11; do /opt/conda/envs/py3.${i}/bin/python -m pip install --upgrade numpy perfetto dataclasses; done
- name: Test Environment Modules
timeout-minutes: 15
shell: bash
run: |
set -v
source /usr/share/modules/init/$(basename ${SHELL})
module avail
- name: Configure Env
run: |
echo "CC=$(echo '${{ matrix.compiler }}' | sed 's/+/c/g')" >> $GITHUB_ENV
echo "CXX=${{ matrix.compiler }}" >> $GITHUB_ENV
- name: Configure, Build, and Test
timeout-minutes: 115
shell: bash
run:
git config --global --add safe.directory ${PWD} &&
cmake --version &&
TAG="" &&
append-tagname() { if [ "${1}" == "ON" ]; then TAG="${TAG}-${2}"; fi; } &&
append-tagname ${{ matrix.mpi }} mpi &&
append-tagname ${{ matrix.ompt }} ompt &&
append-tagname ${{ matrix.papi }} papi &&
append-tagname ${{ matrix.python }} python &&
append-tagname ${{ matrix.mpi-headers }} mpip &&
append-tagname ${{ matrix.build-dyninst }} internal-dyninst &&
append-tagname ${{ matrix.strip }} strip &&
append-tagname ${{ matrix.hidden }} hidden-viz &&
python3 ./scripts/run-ci.py -B build
--name ${{ github.repository_owner }}-${{ github.ref_name }}-ubuntu-jammy-${{ matrix.compiler }}${TAG}
--build-jobs 2
--site GitHub
--
-DCMAKE_C_COMPILER=$(echo '${{ matrix.compiler }}' | sed 's/+/c/g')
-DCMAKE_CXX_COMPILER=${{ matrix.compiler }}
-DCMAKE_BUILD_TYPE=${{ matrix.build-type }}
-DCMAKE_INSTALL_PREFIX=/opt/rocprofiler-systems-dev
-DROCPROFSYS_BUILD_TESTING=ON
-DROCPROFSYS_USE_MPI=${{ matrix.mpi }}
-DROCPROFSYS_USE_ROCM=${{ matrix.rocm }}
-DROCPROFSYS_USE_OMPT=${{ matrix.ompt }}
-DROCPROFSYS_USE_PAPI=${{ matrix.papi }}
-DROCPROFSYS_USE_PYTHON=${{ matrix.python }}
-DROCPROFSYS_USE_MPI_HEADERS=${{ matrix.mpi-headers }}
-DROCPROFSYS_BUILD_DYNINST=${{ matrix.build-dyninst }}
-DROCPROFSYS_BUILD_BOOST=${{ matrix.build-dyninst }}
-DROCPROFSYS_BUILD_TBB=${{ matrix.build-dyninst }}
-DROCPROFSYS_BUILD_ELFUTILS=${{ matrix.build-dyninst }}
-DROCPROFSYS_BUILD_LIBIBERTY=${{ matrix.build-dyninst }}
-DROCPROFSYS_BUILD_HIDDEN_VISIBILITY=${{ matrix.hidden }}
-DROCPROFSYS_PYTHON_PREFIX=/opt/conda/envs
-DROCPROFSYS_PYTHON_ENVS="py3.7;py3.8;py3.9;py3.10;py3.11"
-DROCPROFSYS_STRIP_LIBRARIES=${{ matrix.strip }}
-DROCPROFSYS_MAX_THREADS=64
-DROCPROFSYS_DISABLE_EXAMPLES="transpose;rccl;openmp-target"
-DROCPROFSYS_BUILD_NUMBER=${{ github.run_attempt }}
-DUSE_CLANG_OMP=OFF
--
-LE "transpose|rccl|videodecode|jpegdecode|network"
- name: Install
timeout-minutes: 10
run:
cmake --build build --target install --parallel 2
- name: CPack and Install
run: |
cd build
cpack -G STGZ
mkdir -p /opt/rocprofiler-systems
./rocprofiler-systems-*.sh --prefix=/opt/rocprofiler-systems --exclude-subdir --skip-license
- name: Test Install with Modulefile
timeout-minutes: 15
shell: bash
run: |
set -v
source /usr/share/modules/init/$(basename ${SHELL})
module use /opt/rocprofiler-systems/share/modulefiles
module avail
module load rocprofiler-systems
./scripts/test-install.sh --test-rocprof-sys-{instrument,avail,sample,python,rewrite,runtime}=1
- name: Test User API
timeout-minutes: 10
run: |
set -v
./scripts/test-find-package.sh --install-dir /opt/rocprofiler-systems
- name: CTest Artifacts
if: failure()
continue-on-error: True
uses: actions/upload-artifact@v4
with:
name: ctest-${{ github.job }}-${{ strategy.job-index }}-log
path: |
build/*.log
- name: Data Artifacts
if: failure()
continue-on-error: True
uses: actions/upload-artifact@v4
with:
name: data-${{ github.job }}-${{ strategy.job-index }}-files
path: |
build/rocprofsys-tests-config/*.cfg
build/rocprofsys-tests-output/**/*.txt
build/rocprofsys-tests-output/**/*-instr*.json
ubuntu-jammy-external-rocm:
runs-on: ubuntu-latest
container:
image: dgaliffiamd/rocprofiler-systems:ci-base-ubuntu-22.04
strategy:
fail-fast: false
matrix:
compiler: ['g++']
rocm: ['ON']
mpi: ['OFF']
ompt: ['OFF']
papi: ['OFF']
python: ['ON']
strip: ['OFF']
hidden: ['ON']
build-type: ['Release']
mpi-headers: ['OFF']
build-dyninst: ['ON']
rocm-version: ['6.3', '6.4']
env:
OMPI_ALLOW_RUN_AS_ROOT: 1
OMPI_ALLOW_RUN_AS_ROOT_CONFIRM: 1
ROCPROFSYS_CI: 'ON'
steps:
- uses: actions/checkout@v4
- name: Install Packages
timeout-minutes: 25
uses: nick-fields/retry@v3
with:
retry_wait_seconds: 30
timeout_minutes: 25
max_attempts: 5
command: |
apt-get update &&
apt-get install -y software-properties-common &&
apt-get upgrade -y &&
apt-get install -y autoconf bison build-essential clang environment-modules \
gettext libfabric-dev libiberty-dev libomp-dev libopenmpi-dev libtool m4 \
openmpi-bin python3-pip texinfo ${{ matrix.compiler }} &&
python3 -m pip install --upgrade pip &&
python3 -m pip install --upgrade numpy perfetto dataclasses &&
python3 -m pip install 'cmake==3.21' &&
for i in 6 7 8 9 10 11; do /opt/conda/envs/py3.${i}/bin/python -m pip install --upgrade numpy perfetto dataclasses; done
- name: Install ROCm Packages
timeout-minutes: 25
uses: nick-fields/retry@v3
with:
retry_wait_seconds: 30
timeout_minutes: 25
max_attempts: 5
shell: bash
command: |
ROCM_VERSION=${{ matrix.rocm-version }}
ROCM_MAJOR=$(echo ${ROCM_VERSION} | sed 's/\./ /g' | awk '{print $1}')
ROCM_MINOR=$(echo ${ROCM_VERSION} | sed 's/\./ /g' | awk '{print $2}')
ROCM_VERSN=$(( (${ROCM_MAJOR}*10000)+(${ROCM_MINOR}*100) ))
echo "ROCM_MAJOR=${ROCM_MAJOR} ROCM_MINOR=${ROCM_MINOR} ROCM_VERSN=${ROCM_VERSN}"
wget -q https://repo.radeon.com/amdgpu-install/${{ matrix.rocm-version }}/ubuntu/jammy/amdgpu-install_${ROCM_MAJOR}.${ROCM_MINOR}.${ROCM_VERSN}-1_all.deb
apt-get install -y ./amdgpu-install_${ROCM_MAJOR}.${ROCM_MINOR}.${ROCM_VERSN}-1_all.deb
apt-get update
apt-get install -y rocm-dev rocdecode-dev libavformat-dev libavcodec-dev
echo "/opt/rocm/bin" >> $GITHUB_PATH
echo "ROCM_PATH=/opt/rocm" >> $GITHUB_ENV
echo "LD_LIBRARY_PATH=/opt/rocm/lib:${LD_LIBRARY_PATH}" >> $GITHUB_ENV
/opt/rocm/bin/hipcc -O3 -c ./examples/transpose/transpose.cpp -o /tmp/transpose.o
- name: Test Environment Modules
timeout-minutes: 15
shell: bash
run: |
set -v
source /usr/share/modules/init/$(basename ${SHELL})
module avail
- name: Configure Env
run: |
echo "CC=$(echo '${{ matrix.compiler }}' | sed 's/+/c/g')" >> $GITHUB_ENV
echo "CXX=${{ matrix.compiler }}" >> $GITHUB_ENV
- name: Configure, Build, and Test
timeout-minutes: 115
shell: bash
run:
git config --global --add safe.directory ${PWD} &&
cmake --version &&
TAG="" &&
append-tagname() { if [ "${1}" == "ON" ]; then TAG="${TAG}-${2}"; fi; } &&
append-tagname ${{ matrix.rocm }} rocm-${{ matrix.rocm-version }} &&
append-tagname ${{ matrix.mpi }} mpi &&
append-tagname ${{ matrix.ompt }} ompt &&
append-tagname ${{ matrix.papi }} papi &&
append-tagname ${{ matrix.python }} python &&
append-tagname ${{ matrix.mpi-headers }} mpip &&
append-tagname ${{ matrix.build-dyninst }} internal-dyninst &&
append-tagname ${{ matrix.strip }} strip &&
append-tagname ${{ matrix.hidden }} hidden-viz &&
python3 ./scripts/run-ci.py -B build
--name ${{ github.repository_owner }}-${{ github.ref_name }}-ubuntu-jammy-${{ matrix.compiler }}${TAG}
--build-jobs 2
--site GitHub
--
-DCMAKE_C_COMPILER=$(echo '${{ matrix.compiler }}' | sed 's/+/c/g')
-DCMAKE_CXX_COMPILER=${{ matrix.compiler }}
-DCMAKE_BUILD_TYPE=${{ matrix.build-type }}
-DCMAKE_INSTALL_PREFIX=/opt/rocprofiler-systems-dev
-DROCPROFSYS_BUILD_TESTING=ON
-DROCPROFSYS_USE_MPI=${{ matrix.mpi }}
-DROCPROFSYS_USE_ROCM=${{ matrix.rocm }}
-DROCPROFSYS_USE_OMPT=${{ matrix.ompt }}
-DROCPROFSYS_USE_PAPI=${{ matrix.papi }}
-DROCPROFSYS_USE_PYTHON=${{ matrix.python }}
-DROCPROFSYS_USE_MPI_HEADERS=${{ matrix.mpi-headers }}
-DROCPROFSYS_BUILD_DYNINST=${{ matrix.build-dyninst }}
-DROCPROFSYS_BUILD_BOOST=${{ matrix.build-dyninst }}
-DROCPROFSYS_BUILD_TBB=${{ matrix.build-dyninst }}
-DROCPROFSYS_BUILD_ELFUTILS=${{ matrix.build-dyninst }}
-DROCPROFSYS_BUILD_LIBIBERTY=${{ matrix.build-dyninst }}
-DROCPROFSYS_BUILD_HIDDEN_VISIBILITY=${{ matrix.hidden }}
-DROCPROFSYS_PYTHON_PREFIX=/opt/conda/envs
-DROCPROFSYS_PYTHON_ENVS="py3.7;py3.8;py3.9;py3.10;py3.11"
-DROCPROFSYS_STRIP_LIBRARIES=${{ matrix.strip }}
-DROCPROFSYS_MAX_THREADS=64
-DROCPROFSYS_DISABLE_EXAMPLES="transpose;rccl;openmp-target"
-DROCPROFSYS_BUILD_NUMBER=${{ github.run_attempt }}
-DUSE_CLANG_OMP=OFF
--
-LE "transpose|rccl|videodecode|jpegdecode|network"
- name: Install
timeout-minutes: 10
run:
cmake --build build --target install --parallel 2
- name: CPack and Install
run: |
cd build
cpack -G STGZ
mkdir -p /opt/rocprofiler-systems
./rocprofiler-systems-*.sh --prefix=/opt/rocprofiler-systems --exclude-subdir --skip-license
- name: Test Install with Modulefile
timeout-minutes: 15
shell: bash
run: |
set -v
source /usr/share/modules/init/$(basename ${SHELL})
module use /opt/rocprofiler-systems/share/modulefiles
module avail
module load rocprofiler-systems
./scripts/test-install.sh --test-rocprof-sys-{instrument,avail,sample,python,rewrite,runtime}=1
- name: Test User API
timeout-minutes: 10
run: |
set -v
./scripts/test-find-package.sh --install-dir /opt/rocprofiler-systems
- name: CTest Artifacts
if: failure()
continue-on-error: True
uses: actions/upload-artifact@v4
with:
name: ctest-${{ github.job }}-${{ strategy.job-index }}-log
path: |
build/*.log
- name: Data Artifacts
if: failure()
continue-on-error: True
uses: actions/upload-artifact@v4
with:
name: data-${{ github.job }}-${{ strategy.job-index }}-files
path: |
build/rocprofsys-tests-config/*.cfg
build/rocprofsys-tests-output/**/*.txt
build/rocprofsys-tests-output/**/*-instr*.json
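The "Install ROCm Packages" step above builds the installer file name from the `major.minor` ROCm version with a little arithmetic. A minimal sketch of that derivation follows; the function name `rocm_installer_name` is hypothetical, and the sketch only reproduces the string the workflow requests, without checking that such a package exists on the server.

```shell
#!/bin/sh
# Sketch only: derive the amdgpu-install file name from a "major.minor"
# ROCm version, following the same sed/awk/arithmetic steps as the workflow.
rocm_installer_name() {
    version="$1"
    # Split "6.3" into "6 3" and pick each field.
    major=$(echo "${version}" | sed 's/\./ /g' | awk '{print $1}')
    minor=$(echo "${version}" | sed 's/\./ /g' | awk '{print $2}')
    # Encode the version as major*10000 + minor*100, e.g. 6.3 gives 60300.
    versn=$(( (${major} * 10000) + (${minor} * 100) ))
    echo "amdgpu-install_${major}.${minor}.${versn}-1_all.deb"
}

rocm_installer_name 6.3
rocm_installer_name 6.4
```

For the two matrix values, this prints `amdgpu-install_6.3.60300-1_all.deb` and `amdgpu-install_6.4.60400-1_all.deb`.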
@@ -0,0 +1,121 @@
name: Ubuntu 24.04 (GCC, Python, ROCm)
run-name: ubuntu-noble
on:
push:
branches: [ amd-mainline, amd-staging, release/** ]
paths-ignore:
- '*.md'
- 'docs/**'
- 'source/docs/**'
- 'source/python/gui/**'
- '.github/workflows/docs.yml'
- '.github/workflows/cpack.yml'
- '.github/workflows/containers.yml'
- '.github/workflows/formatting.yml'
- '.github/workflows/weekly-mainline-sync.yml'
- 'docker/**'
- .wordlist.txt
- CMakePresets.json
pull_request:
branches: [ amd-mainline, amd-staging, release/** ]
paths-ignore:
- '*.md'
- 'docs/**'
- 'source/docs/**'
- 'source/python/gui/**'
- '.github/workflows/docs.yml'
- '.github/workflows/cpack.yml'
- '.github/workflows/containers.yml'
- '.github/workflows/formatting.yml'
- '.github/workflows/weekly-mainline-sync.yml'
- 'docker/**'
- .wordlist.txt
- CMakePresets.json
concurrency:
group: ${{ github.workflow }}-${{ github.ref }}
cancel-in-progress: true
env:
ROCPROFSYS_CI: ON
ROCPROFSYS_TMPDIR: "%env{PWD}%/testing-tmp"
jobs:
ubuntu-noble:
runs-on: ubuntu-latest
container:
image: dgaliffiamd/rocprofiler-systems:ci-base-ubuntu-24.04
strategy:
fail-fast: false
matrix:
compiler: ['g++']
build-type: ['Release', 'Debug']
strip: ['OFF']
build-dyninst: ['OFF']
rocm-version: ['0.0','6.3','6.4']
env:
ROCPROFSYS_CI: 'ON'
steps:
- uses: actions/checkout@v4
- name: Install Packages
timeout-minutes: 25
uses: nick-fields/retry@v3
with:
retry_wait_seconds: 30
timeout_minutes: 25
max_attempts: 5
command: |
apt-get -y update && apt-get upgrade -y &&
apt-get install -y \
libiberty-dev clang libomp-dev libopenmpi-dev libfabric-dev \
openmpi-bin ${{ matrix.compiler }} &&
for i in 8 9 10 11 12; do /opt/conda/envs/py3.${i}/bin/python -m pip install numpy perfetto dataclasses; done
- name: Install ROCm Packages
if: ${{ matrix.rocm-version > 0 }}
timeout-minutes: 30
shell: bash
run: |
ROCM_VERSION=${{ matrix.rocm-version }}
ROCM_MAJOR=$(echo ${ROCM_VERSION} | sed 's/\./ /g' | awk '{print $1}')
ROCM_MINOR=$(echo ${ROCM_VERSION} | sed 's/\./ /g' | awk '{print $2}')
ROCM_VERSN=$(( (${ROCM_MAJOR}*10000)+(${ROCM_MINOR}*100) ))
echo "ROCM_MAJOR=${ROCM_MAJOR} ROCM_MINOR=${ROCM_MINOR} ROCM_VERSN=${ROCM_VERSN}"
wget -q https://repo.radeon.com/amdgpu-install/${{ matrix.rocm-version }}/ubuntu/noble/amdgpu-install_${ROCM_MAJOR}.${ROCM_MINOR}.${ROCM_VERSN}-1_all.deb
apt-get install -y ./amdgpu-install_${ROCM_MAJOR}.${ROCM_MINOR}.${ROCM_VERSN}-1_all.deb
apt-get update
apt-get install -y rocm-dev rocdecode-dev libavformat-dev libavcodec-dev
- name: Configure
timeout-minutes: 30
shell: bash
run: |
git config --global --add safe.directory ${PWD} &&
cmake --version
USE_ROCM=OFF
if [ ${{ matrix.rocm-version }} != "0.0" ]; then USE_ROCM=ON; fi
cmake -B build \
-DCMAKE_C_COMPILER=$(echo '${{ matrix.compiler }}' | sed 's/+/c/g') \
-DCMAKE_CXX_COMPILER=${{ matrix.compiler }} \
-DCMAKE_BUILD_TYPE=${{ matrix.build-type }} \
-DCMAKE_INSTALL_PREFIX=/opt/rocprofiler-systems \
-DROCPROFSYS_BUILD_TESTING=ON \
-DROCPROFSYS_DISABLE_EXAMPLES="transpose;rccl;openmp-target" \
-DROCPROFSYS_USE_ROCM=${USE_ROCM} \
-DROCPROFSYS_USE_PYTHON=ON \
-DROCPROFSYS_BUILD_DYNINST=ON \
-DROCPROFSYS_BUILD_BOOST=ON \
-DROCPROFSYS_BUILD_TBB=ON \
-DROCPROFSYS_BUILD_ELFUTILS=ON \
-DROCPROFSYS_BUILD_LIBIBERTY=ON \
-DROCPROFSYS_STRIP_LIBRARIES=${{ matrix.strip }} \
-DROCPROFSYS_PYTHON_PREFIX=/opt/conda/envs \
-DROCPROFSYS_PYTHON_ENVS="py3.8;py3.9;py3.10;py3.11;py3.12"
- name: Build
timeout-minutes: 115
run: cmake --build build --parallel 2
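In the noble matrix, the `rocm-version` value `'0.0'` appears to act as a sentinel for a build without ROCm. A minimal sketch of that toggle, mirroring the `USE_ROCM` logic of the Configure step (the function name `use_rocm_flag` is hypothetical):

```shell
#!/bin/sh
# Sketch only: map a matrix rocm-version value to the ON/OFF value passed
# to -DROCPROFSYS_USE_ROCM, treating "0.0" as "no ROCm".
use_rocm_flag() {
    if [ "$1" != "0.0" ]; then echo ON; else echo OFF; fi
}

for v in 0.0 6.3 6.4; do
    echo "rocm-version=${v} -> -DROCPROFSYS_USE_ROCM=$(use_rocm_flag "${v}")"
done
```

The comparison is done on strings, so any value other than the literal `0.0` enables ROCm in this sketch.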
@@ -0,0 +1,25 @@
name: Sync Mainline with Staging
on:
  workflow_dispatch:
jobs:
  promote-stg-to-main:
    if: github.repository == 'ROCm/rocprofiler-systems'
    runs-on: ubuntu-latest
    name: Promote Staging to Mainline
    steps:
      - name: Checkout
        uses: actions/checkout@v4
        with:
          ref: amd-mainline
          fetch-depth: '0'
      - name: Merge - Fast Forward Only
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
        run: |
          git checkout amd-mainline
          git checkout -b promote-staging-$(date +%F)
          git merge --ff-only origin/amd-staging
          git push -u origin HEAD
          gh pr create --base amd-mainline --title "Promote \`amd-staging\` to \`amd-mainline\`" --fill --label "automerge"
@@ -0,0 +1,50 @@
# Edit files
*~
# Prerequisites
*.d
# Compiled Object files
*.slo
*.lo
*.o
*.obj
# Precompiled Headers
*.gch
*.pch
# Compiled Dynamic libraries
*.so
*.dylib
*.dll
# Fortran module files
*.mod
*.smod
# Compiled Static libraries
*.lai
*.la
*.a
*.lib
# Executables
*.exe
*.out
*.app
# Python cache files
*.pyc
# Documentation artifacts
/_build
_toc.yml
/build*
/.vscode
/.cache
/.clangd
/compile_commands.json
/rocprof-sys-install.py
/scripts/rocprof-sys-install.py
@@ -0,0 +1,26 @@
[submodule "external/timemory"]
path = external/timemory
url = https://github.com/ROCm/timemory.git
branch = rocprofiler-systems
[submodule "external/perfetto"]
path = external/perfetto
url = https://github.com/google/perfetto.git
[submodule "external/elfio"]
path = external/elfio
url = https://github.com/jrmadsen/ELFIO.git
[submodule "external/dyninst"]
path = external/dyninst
url = https://github.com/ROCm/dyninst.git
branch = dyninst_13
[submodule "external/PTL"]
path = external/PTL
url = https://github.com/jrmadsen/PTL.git
[submodule "external/kokkos"]
path = examples/lulesh/external/kokkos
url = https://github.com/kokkos/kokkos.git
[submodule "external/papi"]
path = external/papi
url = https://github.com/icl-utk-edu/papi.git
[submodule "external/pybind11"]
path = external/pybind11
url = https://github.com/jrmadsen/pybind11.git
@@ -0,0 +1,68 @@
# MIT License
#
# Copyright (c) 2025 Advanced Micro Devices, Inc. All rights reserved.
#
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
#
# The above copyright notice and this permission notice shall be included in
# all copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
# THE SOFTWARE.
# Pre-configuration file for pre-commit hooks
# This is optional. To run pre-commit, see CONTRIBUTING.md
exclude: \.(svg)$|(^|/)\.gitignore$ # Exclude files with these extensions
default_stages: [pre-commit]
repos:
- repo: https://github.com/pre-commit/pre-commit-hooks
rev: v5.0.0
hooks:
- id: check-yaml # Check YAML files for syntax errors
- id: trailing-whitespace # Remove trailing whitespace
- id: end-of-file-fixer # Fix files to have a newline at the end
- repo: https://github.com/pre-commit/mirrors-clang-format
rev: v18.1.8 # Version 18 as specified in contributor guide
hooks:
- id: clang-format
files: \.(c|cpp|h.*)$
- repo: https://github.com/BlankSpruce/gersemi
rev: 0.19.3
hooks:
- id: gersemi
- repo: local
hooks:
- id: check-copyright
name: copyright-detector
require_serial: true # Slightly slower, but prevents hook running script many times
entry: ./scripts/check-copyright.sh
language: script
files: \.(c|h|txt|cpp|hpp|py)$
exclude: ^\.|^docs/|^examples/lulesh/|^examples/mpi/|^examples/openmp/|^external/|^cmake/
# - repo: local
# hooks:
# - id: check-copyright-date
# name: Check Copyright Date
# # Check copyright date in all files
# # Fails if something is out of date.
# # -u automatically updates copyright year
# entry: ./scripts/check-copyright.sh -u
# language: script
# files: \.(c|h|txt|cpp|hpp|py)$
# exclude: ^\.|^docs/|^examples/lulesh/|^examples/mpi/|^examples/openmp/|^external/|^cmake/
+18
@@ -0,0 +1,18 @@
# Read the Docs configuration file
# See https://docs.readthedocs.io/en/stable/config-file/v2.html for details
version: 2
build:
os: ubuntu-22.04
tools:
python: "3.10"
python:
install:
- requirements: docs/sphinx/requirements.txt
sphinx:
configuration: docs/conf.py
formats: []
+53
@@ -0,0 +1,53 @@
aarch
amd
bundler
Coz
CrayPAT
dl
durations
Dyninst
enp
Kokkos
KokkosP
librocprof
LibIberty
libunwind
Linkable
LIKWID
lvalue
metaprogramming
MPICH
mpirun
NVTX
OpenMPI
OpenSUSE
PAPI
perf
pid
polymorphism
POSIX
ppc
proc
proto
Pthreads
rocDecode
rocdecode
ROCprofiler
ROCPROFSYS
rocJPEG
rocjpeg
ROCTX
roctx
rpath
RPATH
RSS
rvalues
sdk
SELinux
STGZ
sudo
sys
TBB
Timemory
VCN
VTune
+110
@@ -0,0 +1,110 @@
# Changelog for ROCm Systems Profiler
Full documentation for ROCm Systems Profiler is available at [https://rocm.docs.amd.com/projects/rocprofiler-systems/en/latest/](https://rocm.docs.amd.com/projects/rocprofiler-systems/en/latest/).
## ROCm Systems Profiler 1.1.0 for ROCm 7.0
### Added
- Profiling and metric collection capabilities for VCN engine activity, JPEG engine activity, and API tracing for rocDecode, rocJPEG, and VA-APIs.
- How-to document for VCN and JPEG activity sampling and tracing.
- Support for tracing Fortran applications.
- Support for tracing MPI API in Fortran.
### Changed
- Replaced ROCm SMI backend with AMD SMI backend for collecting GPU metrics.
- ROCprofiler-SDK is now used to trace RCCL API and collect communication counters.
- Updated the Dyninst submodule to v13.0.
- Set the default value of `ROCPROFSYS_SAMPLING_CPUS` to `none`.
### Resolved issues
- Fixed GPU metric collection settings with `ROCPROFSYS_AMD_SMI_METRICS`.
- Fixed a build issue with CMake 4.
- Fixed incorrect kernel names shown for kernel dispatch tracks in Perfetto.
- Fixed formatting of some output logs.
- Fixed an issue where ROC-TX ranges were displayed as two separate events instead of a single spanning event.
## ROCm Systems Profiler 1.0.2 for ROCm 6.4.2
### Optimized
- Improved readability of the OpenMP target offload traces by showing them on a single Perfetto track.
### Resolved issues
- Fixed the file path to the script that merges Perfetto files from multi-process MPI runs. The script has also been renamed from `merge-multiprocess-output.sh` to `rocprof-sys-merge-output.sh`.
## ROCm Systems Profiler 1.0.1 for ROCm 6.4.1
### Added
- How-to document for [network performance profiling](https://rocm.docs.amd.com/projects/rocprofiler-systems/en/amd-staging/how-to/nic-profiling.html) for standard Network Interface Cards (NICs).
### Resolved issues
- Fixed a build issue with Dyninst on GCC 13.
## ROCm Systems Profiler 1.0.0 for ROCm 6.4.0
### Added
- Support for VA-API and rocDecode tracing.
- Aggregation of MPI data collected across distributed nodes and ranks. The data is concatenated into a single proto file.
### Changed
- Backend refactored to use ROCprofiler-SDK rather than ROCProfiler and ROCTracer.
### Resolved issues
- Fixed hardware counter summary files not being generated after profiling.
- Fixed an application crash when collecting performance counters with ROCProfiler.
- Fixed interruption in config file generation.
- Fixed segmentation fault while running `rocprof-sys-instrument`.
- Fixed an issue where running `rocprof-sys-causal` or using the `-I all` option with `rocprof-sys-sample` caused the system to become non-responsive.
- Fixed an issue where sampling multi-GPU Python workloads caused the system to stop responding.
## ROCm Systems Profiler 0.1.1 for ROCm 6.3.2
### Resolved issues
- Fixed an error when building from source on some SUSE and RHEL systems when using the `ROCPROFSYS_BUILD_DYNINST` option.
## ROCm Systems Profiler 0.1.0 for ROCm 6.3.1
### Added
- Improvements to support OMPT target offload.
### Resolved issues
- Fixed an issue with generated Perfetto files.
- Fixed an issue with merging multiple `.proto` files.
- Fixed an issue causing GPU resource data to be missing from traces of Instinct MI300A systems.
- Fixed a minor issue for users upgrading to ROCm 6.3 from 6.2 post-rename from `omnitrace`.
## ROCm Systems Profiler 0.1.0 for ROCm 6.3.0
### Changed
- Renamed Omnitrace to ROCm Systems Profiler.
## Omnitrace 1.11.2 for ROCm 6.2.1
### Known issues
- Perfetto can no longer open Omnitrace proto files. Loading the Perfetto trace output `.proto` file in `ui.perfetto.dev` can
result in a dialog with the message, "Oops, something went wrong! Please file a bug." The information in the dialog will
refer to an "Unknown field type." The workaround is to open the files with the previous version of the Perfetto UI found
at https://ui.perfetto.dev/v46.0-35b3d9845/#!/.
+488
@@ -0,0 +1,488 @@
cmake_minimum_required(VERSION 3.18.4 FATAL_ERROR)
if(
CMAKE_SOURCE_DIR STREQUAL CMAKE_BINARY_DIR
AND CMAKE_CURRENT_SOURCE_DIR STREQUAL CMAKE_SOURCE_DIR
)
set(MSG "")
message(STATUS "Warning! Building from the source directory is not recommended")
message(STATUS "If unintended, please remove 'CMakeCache.txt' and 'CMakeFiles'")
message(STATUS "and build from a separate directory")
message(AUTHOR_WARNING "In-source build")
endif()
# find_package() uses upper-case <PACKAGENAME>_ROOT variables.
if(POLICY CMP0144)
cmake_policy(SET CMP0144 NEW)
endif()
if(NOT UNIX OR APPLE)
message(
AUTHOR_WARNING
"rocprofiler-systems only supports Linux. Configure and/or build is likely to fail"
)
endif()
file(READ "${CMAKE_CURRENT_SOURCE_DIR}/VERSION" FULL_VERSION_STRING LIMIT_COUNT 1)
string(REGEX REPLACE "(\n|\r)" "" FULL_VERSION_STRING "${FULL_VERSION_STRING}")
string(
REGEX REPLACE
"([0-9]+)\\.([0-9]+)\\.([0-9]+)(.*)"
"\\1.\\2.\\3"
ROCPROFSYS_VERSION
"${FULL_VERSION_STRING}"
)
project(
rocprofiler-systems
LANGUAGES C CXX
VERSION ${ROCPROFSYS_VERSION}
DESCRIPTION "CPU/GPU Application tracing with static/dynamic binary instrumentation"
HOMEPAGE_URL "https://github.com/ROCm/rocprofiler-systems"
)
set(PROJECT_NAME_UNDERSCORED "rocprofiler_systems")
set(BINARY_NAME_PREFIX "rocprof-sys")
find_package(Git)
if(Git_FOUND AND EXISTS "${PROJECT_SOURCE_DIR}/.git")
execute_process(
COMMAND ${GIT_EXECUTABLE} describe --tags
OUTPUT_VARIABLE ROCPROFSYS_GIT_DESCRIBE
OUTPUT_STRIP_TRAILING_WHITESPACE
RESULT_VARIABLE _GIT_DESCRIBE_RESULT
ERROR_QUIET
)
if(NOT _GIT_DESCRIBE_RESULT EQUAL 0)
execute_process(
COMMAND ${GIT_EXECUTABLE} describe
OUTPUT_VARIABLE ROCPROFSYS_GIT_DESCRIBE
OUTPUT_STRIP_TRAILING_WHITESPACE
RESULT_VARIABLE _GIT_DESCRIBE_RESULT
ERROR_QUIET
)
endif()
execute_process(
COMMAND ${GIT_EXECUTABLE} rev-parse HEAD
OUTPUT_VARIABLE ROCPROFSYS_GIT_REVISION
OUTPUT_STRIP_TRAILING_WHITESPACE
ERROR_QUIET
)
else()
set(ROCPROFSYS_GIT_DESCRIBE "v${ROCPROFSYS_VERSION}")
set(ROCPROFSYS_GIT_REVISION "")
endif()
message(
STATUS
"[${PROJECT_NAME}] version ${PROJECT_VERSION_MAJOR}.${PROJECT_VERSION_MINOR}.${PROJECT_VERSION_PATCH} (${FULL_VERSION_STRING})"
)
message(STATUS "[${PROJECT_NAME}] git revision: ${ROCPROFSYS_GIT_REVISION}")
message(STATUS "[${PROJECT_NAME}] git describe: ${ROCPROFSYS_GIT_DESCRIBE}")
set(CMAKE_MODULE_PATH
${PROJECT_SOURCE_DIR}/cmake
${PROJECT_SOURCE_DIR}/cmake/Modules
${PROJECT_SOURCE_DIR}/source/python/cmake
${CMAKE_MODULE_PATH}
)
set(BUILD_SHARED_LIBS ON CACHE BOOL "Build shared libraries")
set(BUILD_STATIC_LIBS OFF CACHE BOOL "Build static libraries")
set(CMAKE_POSITION_INDEPENDENT_CODE ON CACHE BOOL "Build position independent code")
if(CMAKE_VERSION VERSION_GREATER_EQUAL 3.24)
cmake_policy(SET CMP0135 NEW)
endif()
if("${CMAKE_BUILD_TYPE}" STREQUAL "")
set(CMAKE_BUILD_TYPE Release CACHE STRING "Build type" FORCE)
else()
set(VALID_BUILD_TYPES "Release" "RelWithDebInfo" "Debug" "MinSizeRel")
if(NOT "${CMAKE_BUILD_TYPE}" IN_LIST VALID_BUILD_TYPES)
string(REPLACE ";" ", " _VALID_BUILD_TYPES "${VALID_BUILD_TYPES}")
message(
FATAL_ERROR
"Invalid CMAKE_BUILD_TYPE :: ${CMAKE_BUILD_TYPE}. Valid build types are: ${_VALID_BUILD_TYPES}"
)
endif()
endif()
set(_STRIP_LIBRARIES_DEFAULT OFF)
if("${CMAKE_BUILD_TYPE}" STREQUAL "Release")
set(_STRIP_LIBRARIES_DEFAULT ON)
endif()
if(DEFINED CMAKE_INSTALL_LIBDIR AND NOT DEFINED CMAKE_DEFAULT_INSTALL_LIBDIR)
# always have a fresh install
unset(CMAKE_INSTALL_LIBDIR CACHE)
include(GNUInstallDirs) # install directories
# force this because dyninst always installs to lib
set(CMAKE_DEFAULT_INSTALL_LIBDIR
"${CMAKE_INSTALL_LIBDIR}"
CACHE STRING
"Object code libraries"
FORCE
)
endif()
if(NOT "$ENV{ROCPROFSYS_CI}" STREQUAL "")
set(CI_BUILD $ENV{ROCPROFSYS_CI})
else()
set(CI_BUILD OFF)
endif()
include(GNUInstallDirs) # install directories
include(MacroUtilities) # various functions and macros
if(CI_BUILD)
rocprofiler_systems_add_option(ROCPROFSYS_BUILD_CI "Enable internal asserts, etc." ON
ADVANCED NO_FEATURE
)
rocprofiler_systems_add_option(ROCPROFSYS_BUILD_TESTING
"Enable building the testing suite" ON ADVANCED
)
rocprofiler_systems_add_option(
ROCPROFSYS_BUILD_DEBUG "Enable building with extensive debug symbols" OFF
ADVANCED
)
rocprofiler_systems_add_option(
ROCPROFSYS_BUILD_HIDDEN_VISIBILITY
"Build with hidden visibility (disable for Debug builds)" OFF ADVANCED
)
rocprofiler_systems_add_option(ROCPROFSYS_STRIP_LIBRARIES "Strip the libraries" OFF
ADVANCED
)
else()
rocprofiler_systems_add_option(ROCPROFSYS_BUILD_CI "Enable internal asserts, etc."
OFF ADVANCED NO_FEATURE
)
rocprofiler_systems_add_option(ROCPROFSYS_BUILD_EXAMPLES
"Enable building the examples" OFF ADVANCED
)
rocprofiler_systems_add_option(ROCPROFSYS_BUILD_TESTING
"Enable building the testing suite" OFF ADVANCED
)
rocprofiler_systems_add_option(
ROCPROFSYS_BUILD_DEBUG "Enable building with extensive debug symbols" OFF
ADVANCED
)
rocprofiler_systems_add_option(
ROCPROFSYS_BUILD_HIDDEN_VISIBILITY
"Build with hidden visibility (disable for Debug builds)" ON ADVANCED
)
rocprofiler_systems_add_option(ROCPROFSYS_STRIP_LIBRARIES "Strip the libraries"
${_STRIP_LIBRARIES_DEFAULT} ADVANCED
)
endif()
include(Compilers) # compiler identification
include(BuildSettings) # compiler flags
set(CMAKE_INSTALL_LIBDIR "lib" CACHE STRING "Object code libraries (lib)" FORCE)
set(CMAKE_CXX_STANDARD 17 CACHE STRING "CXX language standard")
rocprofiler_systems_add_feature(CMAKE_BUILD_TYPE "Build optimization level")
rocprofiler_systems_add_feature(CMAKE_INSTALL_PREFIX "Installation prefix")
rocprofiler_systems_add_feature(CMAKE_CXX_COMPILER "C++ compiler")
rocprofiler_systems_add_feature(CMAKE_CXX_STANDARD "CXX language standard")
rocprofiler_systems_add_option(CMAKE_CXX_STANDARD_REQUIRED
"Require C++ language standard" ON
)
rocprofiler_systems_add_option(CMAKE_CXX_EXTENSIONS
"Compiler specific language extensions" OFF
)
rocprofiler_systems_add_option(CMAKE_INSTALL_RPATH_USE_LINK_PATH
"Enable rpath to linked libraries" ON
)
set(CMAKE_INSTALL_MESSAGE "LAZY" CACHE STRING "Installation message")
mark_as_advanced(CMAKE_INSTALL_MESSAGE)
rocprofiler_systems_add_option(ROCPROFSYS_USE_CLANG_TIDY "Enable clang-tidy" OFF)
rocprofiler_systems_add_option(ROCPROFSYS_USE_BFD
"Enable BFD support (map call-stack samples to LOC)" ON
)
rocprofiler_systems_add_option(ROCPROFSYS_USE_MPI "Enable MPI support" OFF)
rocprofiler_systems_add_option(ROCPROFSYS_USE_ROCM "Enable ROCm support" ON)
rocprofiler_systems_add_option(ROCPROFSYS_USE_PAPI "Enable HW counter support via PAPI"
ON
)
rocprofiler_systems_add_option(
ROCPROFSYS_USE_MPI_HEADERS
"Enable wrapping MPI functions w/o enabling MPI dependency" ON
)
rocprofiler_systems_add_option(ROCPROFSYS_USE_OMPT "Enable OpenMP tools support" ON)
rocprofiler_systems_add_option(ROCPROFSYS_USE_PYTHON "Enable Python support" OFF)
rocprofiler_systems_add_option(ROCPROFSYS_BUILD_DYNINST "Build dyninst from submodule"
OFF
)
rocprofiler_systems_add_option(ROCPROFSYS_BUILD_LIBUNWIND
"Build libunwind from submodule" ON
)
rocprofiler_systems_add_option(ROCPROFSYS_BUILD_CODECOV "Build for code coverage" OFF)
rocprofiler_systems_add_option(ROCPROFSYS_INSTALL_PERFETTO_TOOLS
"Install perfetto tools (i.e. traced, perfetto, etc.)" OFF
)
if(ROCPROFSYS_USE_PAPI)
rocprofiler_systems_add_option(ROCPROFSYS_BUILD_PAPI "Build PAPI from submodule" ON)
endif()
if(ROCPROFSYS_USE_PYTHON)
rocprofiler_systems_add_option(ROCPROFSYS_BUILD_PYTHON
"Build python bindings with internal pybind11" ON
)
elseif("$ENV{ROCPROFSYS_CI}")
# quiet warnings in dashboard
if(ROCPROFSYS_PYTHON_ENVS OR ROCPROFSYS_PYTHON_PREFIX)
rocprofiler_systems_message(
STATUS
"Ignoring values of ROCPROFSYS_PYTHON_ENVS and/or ROCPROFSYS_PYTHON_PREFIX"
)
endif()
endif()
if(ROCPROFSYS_BUILD_TESTING)
set(ROCPROFSYS_BUILD_EXAMPLES ON CACHE BOOL "Enable building the examples" FORCE)
endif()
include(ProcessorCount)
ProcessorCount(ROCPROFSYS_PROCESSOR_COUNT)
if(ROCPROFSYS_PROCESSOR_COUNT LESS 8)
set(ROCPROFSYS_THREAD_COUNT 128)
else()
math(EXPR ROCPROFSYS_THREAD_COUNT "16 * ${ROCPROFSYS_PROCESSOR_COUNT}")
compute_pow2_ceil(ROCPROFSYS_THREAD_COUNT "16 * ${ROCPROFSYS_PROCESSOR_COUNT}")
# set the default to 2048 if it could not be calculated
if(ROCPROFSYS_THREAD_COUNT LESS 2)
set(ROCPROFSYS_THREAD_COUNT 2048)
endif()
endif()
set(ROCPROFSYS_MAX_THREADS
"${ROCPROFSYS_THREAD_COUNT}"
CACHE STRING
"Maximum number of threads in the host application. Likely only needs to be increased if host app does not use thread-pool but creates many threads"
)
rocprofiler_systems_add_feature(
ROCPROFSYS_MAX_THREADS
"Maximum number of total threads supported in the host application (default: max of 128 or 16 * nproc)"
)
compute_pow2_ceil(_MAX_THREADS "${ROCPROFSYS_MAX_THREADS}")
if(_MAX_THREADS GREATER 0 AND NOT ROCPROFSYS_MAX_THREADS EQUAL _MAX_THREADS)
rocprofiler_systems_message(
FATAL_ERROR
"Error! ROCPROFSYS_MAX_THREADS must be a power of 2. Recommendation: ${_MAX_THREADS}"
)
elseif(NOT ROCPROFSYS_MAX_THREADS EQUAL _MAX_THREADS)
rocprofiler_systems_message(
AUTHOR_WARNING
"ROCPROFSYS_MAX_THREADS (=${ROCPROFSYS_MAX_THREADS}) must be a power of 2. We were unable to verify it so we are emitting this warning instead. Estimate resulted in: ${_MAX_THREADS}"
)
endif()
set(ROCPROFSYS_MAX_UNWIND_DEPTH
"64"
CACHE STRING
"Maximum call-stack depth to search during call-stack unwinding. Decreasing this value will result in sampling consuming less memory"
)
rocprofiler_systems_add_feature(
ROCPROFSYS_MAX_UNWIND_DEPTH
"Maximum call-stack depth to search during call-stack unwinding. Decreasing this value will result in sampling consuming less memory"
)
# default visibility settings
set(CMAKE_C_VISIBILITY_PRESET
"default"
CACHE STRING
"Visibility preset for non-inline C functions"
)
set(CMAKE_CXX_VISIBILITY_PRESET
"default"
CACHE STRING
"Visibility preset for non-inline C++ functions/objects"
)
set(CMAKE_VISIBILITY_INLINES_HIDDEN
OFF
CACHE BOOL
"Visibility preset for inline functions"
)
set(CMAKE_EXPORT_COMPILE_COMMANDS ON)
include(Formatting) # format target
include(Packages) # finds third-party libraries
rocprofiler_systems_activate_clang_tidy()
# custom visibility settings
if(ROCPROFSYS_BUILD_HIDDEN_VISIBILITY)
set(CMAKE_C_VISIBILITY_PRESET "internal")
set(CMAKE_CXX_VISIBILITY_PRESET "internal")
set(CMAKE_VISIBILITY_INLINES_HIDDEN ON)
endif()
if(ROCPROFSYS_BUILD_TESTING OR "$ENV{ROCPROFSYS_CI}" MATCHES "[1-9]+|ON|on|y|yes")
enable_testing()
include(CTest)
endif()
# ------------------------------------------------------------------------------#
#
# library and executables
#
# ------------------------------------------------------------------------------#
set(CMAKE_INSTALL_DEFAULT_COMPONENT_NAME core)
if(ROCPROFSYS_BUILD_CODECOV)
rocprofiler_systems_save_variables(CODECOV_FLAGS VARIABLES CMAKE_C_FLAGS
CMAKE_CXX_FLAGS
)
foreach(_BUILD_TYPE DEBUG MINSIZEREL RELWITHDEBINFO RELEASE)
rocprofiler_systems_save_variables(
CODECOV_FLAGS VARIABLES CMAKE_C_FLAGS_${_BUILD_TYPE}
CMAKE_CXX_FLAGS_${_BUILD_TYPE}
)
endforeach()
foreach(_BUILD_TYPE DEBUG MINSIZEREL RELWITHDEBINFO RELEASE)
set(CMAKE_C_FLAGS_${_BUILD_TYPE}
"-Og -g3 -fno-omit-frame-pointer -fprofile-abs-path -fprofile-arcs -ftest-coverage"
)
set(CMAKE_CXX_FLAGS_${_BUILD_TYPE}
"-Og -g3 -fno-omit-frame-pointer -fprofile-abs-path -fprofile-arcs -ftest-coverage"
)
endforeach()
set(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} --coverage")
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} --coverage")
endif()
add_subdirectory(source)
if(ROCPROFSYS_BUILD_CODECOV)
rocprofiler_systems_restore_variables(CODECOV_FLAGS VARIABLES CMAKE_C_FLAGS
CMAKE_CXX_FLAGS
)
foreach(_BUILD_TYPE DEBUG MINSIZEREL RELWITHDEBINFO RELEASE)
rocprofiler_systems_restore_variables(
CODECOV_FLAGS VARIABLES CMAKE_C_FLAGS_${_BUILD_TYPE}
CMAKE_CXX_FLAGS_${_BUILD_TYPE}
)
endforeach()
endif()
# ------------------------------------------------------------------------------#
#
# miscellaneous installs
#
# ------------------------------------------------------------------------------#
configure_file(
${PROJECT_SOURCE_DIR}/LICENSE
${PROJECT_BINARY_DIR}/${CMAKE_INSTALL_DATAROOTDIR}/doc/${PROJECT_NAME}/LICENSE
COPYONLY
)
configure_file(
${PROJECT_SOURCE_DIR}/perfetto.cfg
${PROJECT_BINARY_DIR}/${CMAKE_INSTALL_DATAROOTDIR}/${PROJECT_NAME}/perfetto.cfg
COPYONLY
)
configure_file(
${PROJECT_SOURCE_DIR}/cmake/Templates/setup-env.sh.in
${PROJECT_BINARY_DIR}/${CMAKE_INSTALL_DATAROOTDIR}/${PROJECT_NAME}/setup-env.sh
@ONLY
)
configure_file(
${PROJECT_SOURCE_DIR}/cmake/Templates/modulefile.in
${PROJECT_BINARY_DIR}/${CMAKE_INSTALL_DATAROOTDIR}/modulefiles/${PROJECT_NAME}/${ROCPROFSYS_VERSION}
@ONLY
)
configure_file(
${PROJECT_SOURCE_DIR}/scripts/merge-multiprocess-output.sh
${PROJECT_BINARY_DIR}/${CMAKE_INSTALL_LIBEXECDIR}/${PROJECT_NAME}/rocprof-sys-merge-output.sh
COPYONLY
)
install(
FILES
${PROJECT_BINARY_DIR}/${CMAKE_INSTALL_DATAROOTDIR}/${PROJECT_NAME}/setup-env.sh
${PROJECT_BINARY_DIR}/${CMAKE_INSTALL_DATAROOTDIR}/${PROJECT_NAME}/perfetto.cfg
DESTINATION ${CMAKE_INSTALL_DATAROOTDIR}/${PROJECT_NAME}
COMPONENT setup
)
install(
FILES
${PROJECT_BINARY_DIR}/${CMAKE_INSTALL_DATAROOTDIR}/modulefiles/${PROJECT_NAME}/${ROCPROFSYS_VERSION}
DESTINATION ${CMAKE_INSTALL_DATAROOTDIR}/modulefiles/${PROJECT_NAME}
COMPONENT setup
)
install(
FILES ${PROJECT_BINARY_DIR}/${CMAKE_INSTALL_DATAROOTDIR}/doc/${PROJECT_NAME}/LICENSE
DESTINATION ${CMAKE_INSTALL_DATAROOTDIR}/doc/${PROJECT_NAME}
COMPONENT setup
)
install(
PROGRAMS
${PROJECT_BINARY_DIR}/${CMAKE_INSTALL_LIBEXECDIR}/${PROJECT_NAME}/rocprof-sys-merge-output.sh
DESTINATION ${CMAKE_INSTALL_LIBEXECDIR}/${PROJECT_NAME}
COMPONENT setup
)
# ------------------------------------------------------------------------------#
#
# install
#
# ------------------------------------------------------------------------------#
set(CMAKE_INSTALL_DEFAULT_COMPONENT_NAME core)
include(ConfigInstall)
# ------------------------------------------------------------------------------#
#
# examples
#
# ------------------------------------------------------------------------------#
if(ROCPROFSYS_BUILD_EXAMPLES)
set(CMAKE_INSTALL_DEFAULT_COMPONENT_NAME examples)
add_subdirectory(examples)
endif()
# ------------------------------------------------------------------------------#
#
# tests
#
# ------------------------------------------------------------------------------#
if(ROCPROFSYS_BUILD_TESTING)
set(CMAKE_INSTALL_DEFAULT_COMPONENT_NAME testing)
add_subdirectory(tests)
endif()
# ------------------------------------------------------------------------------#
#
# packaging
#
# ------------------------------------------------------------------------------#
set(CMAKE_INSTALL_DEFAULT_COMPONENT_NAME core)
include(ConfigCPack)
# ------------------------------------------------------------------------------#
#
# config info
#
# ------------------------------------------------------------------------------#
rocprofiler_systems_print_features()
+94
@@ -0,0 +1,94 @@
{
"version": 3,
"configurePresets": [
{
"name": "ci",
"displayName": "official CI build",
"description": "Official CI build parameters",
"generator": "Ninja",
"binaryDir": "${sourceDir}/build/ci",
"cacheVariables": {
"CMAKE_BUILD_TYPE": "Release",
"CMAKE_INSTALL_PREFIX": "/opt/rocprofiler-systems",
"CMAKE_C_COMPILER": "gcc",
"CMAKE_CXX_COMPILER": "g++",
"ROCPROFSYS_USE_ROCM": "ON",
"ROCPROFSYS_USE_PYTHON": "ON",
"ROCPROFSYS_BUILD_DYNINST": "ON",
"ROCPROFSYS_BUILD_TBB": "ON",
"ROCPROFSYS_BUILD_BOOST": "ON",
"ROCPROFSYS_BUILD_ELFUTILS": "ON",
"ROCPROFSYS_BUILD_LIBIBERTY": "ON",
"ROCPROFSYS_BUILD_TESTING": "ON",
"ROCPROFSYS_STRIP_LIBRARIES": "OFF",
"ROCPROFSYS_MAX_THREADS": "64",
"ROCPROFSYS_BUILD_CI": "ON"
}
},
{
"name": "debug",
"displayName": "official debug build",
"description": "Debug build parameters with tests",
"binaryDir": "${sourceDir}/build/debug",
"generator": "Ninja",
"cacheVariables": {
"CMAKE_BUILD_TYPE": "Debug",
"CMAKE_INSTALL_PREFIX": "/opt/rocprofiler-systems",
"CMAKE_C_COMPILER": "gcc",
"CMAKE_CXX_COMPILER": "g++",
"ROCPROFSYS_USE_ROCM": "ON",
"ROCPROFSYS_USE_PYTHON": "ON",
"ROCPROFSYS_BUILD_DYNINST": "ON",
"ROCPROFSYS_BUILD_TBB": "ON",
"ROCPROFSYS_BUILD_BOOST": "ON",
"ROCPROFSYS_BUILD_ELFUTILS": "ON",
"ROCPROFSYS_BUILD_LIBIBERTY": "ON",
"ROCPROFSYS_BUILD_TESTING": "ON",
"ROCPROFSYS_STRIP_LIBRARIES": "OFF",
"ROCPROFSYS_BUILD_DEBUG": "ON"
}
},
{
"name": "debug-optimized",
"displayName": "release build with debug info",
"description": "Release build with debug info with tests",
"generator": "Ninja",
"binaryDir": "${sourceDir}/build/debug-optimized",
"cacheVariables": {
"CMAKE_BUILD_TYPE": "RelWithDebInfo",
"CMAKE_INSTALL_PREFIX": "/opt/rocprofiler-systems",
"CMAKE_C_COMPILER": "gcc",
"CMAKE_CXX_COMPILER": "g++",
"ROCPROFSYS_USE_ROCM": "ON",
"ROCPROFSYS_USE_PYTHON": "ON",
"ROCPROFSYS_BUILD_DYNINST": "ON",
"ROCPROFSYS_BUILD_TBB": "ON",
"ROCPROFSYS_BUILD_BOOST": "ON",
"ROCPROFSYS_BUILD_ELFUTILS": "ON",
"ROCPROFSYS_BUILD_LIBIBERTY": "ON",
"ROCPROFSYS_BUILD_TESTING": "ON",
"ROCPROFSYS_STRIP_LIBRARIES": "OFF"
}
},
{
"name": "release",
"displayName": "official release build",
"description": "Official release build",
"generator": "Ninja",
"binaryDir": "${sourceDir}/build/release",
"cacheVariables": {
"CMAKE_BUILD_TYPE": "Release",
"CMAKE_INSTALL_PREFIX": "/opt/rocprofiler-systems",
"CMAKE_C_COMPILER": "gcc",
"CMAKE_CXX_COMPILER": "g++",
"ROCPROFSYS_USE_ROCM": "ON",
"ROCPROFSYS_USE_PYTHON": "ON",
"ROCPROFSYS_BUILD_DYNINST": "ON",
"ROCPROFSYS_BUILD_TBB": "ON",
"ROCPROFSYS_BUILD_BOOST": "ON",
"ROCPROFSYS_BUILD_ELFUTILS": "ON",
"ROCPROFSYS_BUILD_LIBIBERTY": "ON"
}
}
]
}
+107
@@ -0,0 +1,107 @@
<head>
<meta charset="UTF-8">
<meta name="description" content="Contributing to rocprofiler-systems">
<meta name="keywords" content="ROCm, contributing, rocprofiler-systems">
</head>
# Contributing to rocprofiler-systems #
ROCm Systems Profiler (rocprofiler-systems), formerly Omnitrace, is a comprehensive profiling and tracing tool for parallel applications written in C, C++, Fortran, HIP, OpenCL, and Python which execute on the CPU or CPU+GPU.
We welcome contributions to rocprofiler-systems. Please follow these guidelines to help ensure your contributions are accepted.
## Table of Contents ##
1. [Issue Discussion](#issue-discussion)
2. [Acceptance Criteria](#acceptance-criteria)
3. [Pull Request Guidelines](#pull-request-guidelines)
4. [Coding Style](#coding-style)
5. [Code License](#code-license)
6. [References](#references)
## Issue Discussion ##
Please use the GitHub Issues tab to notify us of issues.
* Use your best judgement for issue creation. Search [existing issues](https://github.com/ROCm/rocprofiler-systems/issues) to make sure your issue isn't already listed.
* If your issue is already listed, upvote the issue and comment or post to provide additional details, such as how you reproduced this issue.
* If you're not sure if your issue is the same, err on the side of caution and file your issue. You can add a comment to include the issue number (and link) for the similar issue. If we evaluate your issue as being the same as the existing issue, we'll close the duplicate.
* If your issue doesn't exist, use the issue template to file a new issue.
* When filing an issue, be sure to provide as much information as possible, including script output so we can collect information about your configuration. This helps reduce the time required to reproduce your issue.
* Check your issue regularly, as we may require additional information to successfully reproduce the issue.
* You may also open an issue to ask the maintainers whether a proposed change meets the acceptance criteria, or to discuss an idea pertaining to the project.
## Acceptance Criteria ##
* Contributions should align with the project's goals and maintainability.
* Code should be well-documented and include tests where applicable.
* Ensure that your changes do not break existing functionality.
* Each commit must be digitally signed. For more details, see [About commit signature verification - GitHub Docs](https://docs.github.com/en/authentication/managing-commit-signature-verification/about-commit-signature-verification).
### Exceptions ###
* If you believe your contribution does not fit the guidelines but is still valuable, please discuss it with the maintainers before submitting.
## Pull Request Guidelines ##
By creating a pull request, you agree to the statements made in the [code license](#code-license) section. Your pull request should target the default branch. Our current default branch is the **amd-staging** branch, which serves as our integration branch.
### Process ###
* Fork the repository and create your branch from `amd-staging`.
* If you've added code that should be tested, add tests.
* Ensure the test suite passes.
* Make sure your code conforms to the format. Use clang-format-18 and/or gersemi.
* Use clear and descriptive commit messages.
* Submit your PR and work with the reviewer or maintainer to get your PR approved.
* Once approved, the PR is brought onto internal CI systems and may be merged into the component during our release cycle, as coordinated by the maintainer.
### Setting Up the Development Environment ###
* It is recommended to [fork](https://github.com/ROCm/rocprofiler-systems/fork) the repository.
* Clone your forked repository: `git clone https://github.com/<yourgithub-id>/rocprofiler-systems.git`
* Navigate to the project directory: `cd rocprofiler-systems`
* Set the original repository URL as the remote upstream using `git remote add upstream https://github.com/ROCm/rocprofiler-systems` (or `git remote set-url upstream https://github.com/ROCm/rocprofiler-systems`)
* Verify that `origin` and `upstream` point to the correct repositories with `git remote -v`.
* Start a new branch for your work: `git checkout -b topic-<yourFeatureName>`
* Build the project as outlined in [ROCm documentation](https://github.com/ROCm/rocprofiler-systems/blob/a03770c0606c23fda5e2c83782f2d188eb8522f5/docs/install/install.rst#building-and-installing-rocm-systems-profiler).
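The steps above can be sketched as one shell function. This is a minimal sketch, not an official script: it assumes your fork keeps the name `rocprofiler-systems` under your own GitHub account, and `GITHUB_ID`, `FEATURE_NAME`, and `setup_fork` are hypothetical names introduced for illustration.

```shell
# Hypothetical placeholders: replace with your GitHub ID and feature name.
GITHUB_ID="your-github-id"
FEATURE_NAME="my-feature"

# URL of your fork (assumed to keep the upstream repository name).
FORK_URL="https://github.com/${GITHUB_ID}/rocprofiler-systems.git"
UPSTREAM_URL="https://github.com/ROCm/rocprofiler-systems"

# Clone the fork, register the upstream remote, and start a topic branch.
setup_fork() {
    git clone "${FORK_URL}" rocprofiler-systems &&
        cd rocprofiler-systems &&
        git remote add upstream "${UPSTREAM_URL}" &&
        git remote -v &&
        git checkout -b "topic-${FEATURE_NAME}"
}
```

After editing the two placeholders, call `setup_fork` from the directory where the clone should be created.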
### Running Tests ###
* To run the test suite, use the following command: `make test`
* Ensure all tests pass before submitting a pull request.
* If the project was configured with `-D ROCPROFSYS_BUILD_TESTING=ON`, the tests are built along with it. Individual test groups can be run with `ctest -R <test-name> -V --output-on-failure`, where `-R` matches test names by regular expression. `ctest -N` lists the test names without running them, and `ctest --print-labels` lists the available test labels, which can be selected with `-L <label>`.
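The configure, build, and test steps can be combined roughly as follows. This is a sketch under stated assumptions: `run`, `build_and_test`, and `DRY_RUN` are hypothetical helpers introduced here, a real configure will likely need further options from the installation guide, and `ctest --test-dir` requires CMake 3.20 or newer (on older versions, `cd` into the build directory and run `ctest` there).

```shell
# Hypothetical helper: print the command instead of running it when DRY_RUN is set.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Configure with the test suite enabled, build, and run the tests.
# Run from the source root; the build directory defaults to "build".
build_and_test() {
    build_dir="${1:-build}"
    run cmake -B "${build_dir}" -D ROCPROFSYS_BUILD_TESTING=ON . &&
        run cmake --build "${build_dir}" --parallel &&
        run ctest --test-dir "${build_dir}" --output-on-failure
}
```

Running `DRY_RUN=1 build_and_test` prints the three commands without executing them, which is a cheap way to review what would run before starting a long build.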
## Coding Style ##
* Adhere to the coding style used in the project. This includes naming conventions, indentation, and commenting practices.
* Follow the existing directory structure and organization of the codebase.
* Group related files together and maintain a logical hierarchy.
* Use `clang-format-18` and `gersemi` formatters to ensure consistency.
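Running both formatters by hand might look like the sketch below. The C/C++ pattern is the one the clang-format hook uses in `.pre-commit-config.yaml`; the CMake pattern, the function names, and the assumption that file names contain no whitespace are illustrative choices, not project policy.

```shell
# Keep only C/C++ sources (same pattern as the clang-format pre-commit hook).
filter_cxx() {
    grep -E '\.(c|cpp|h.*)$'
}

# Keep only CMake files (assumed pattern; adjust to the project's needs).
filter_cmake() {
    grep -E '(^|/)CMakeLists\.txt$|\.cmake$'
}

# Format all tracked sources in place; run from the repository root.
# `xargs -r` (GNU) skips the command when no file matches.
format_all() {
    git ls-files | filter_cxx | xargs -r clang-format-18 -i
    git ls-files | filter_cmake | xargs -r gersemi -i
}
```

The filters read file names from standard input, so they can also be fed from `git diff --name-only` to format only the files you changed.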
### Using pre-commit hooks ###
Our project supports optional [*pre-commit hooks*](https://pre-commit.com/#introduction) which developers can leverage to verify formatting before publishing their code. Once enabled, any commits you propose to the repository will be automatically checked for formatting. Initial setup is as follows:
```shell
pip install pre-commit # or: apt-get install pre-commit
cd rocprofiler-systems
pre-commit install
```
**Note:** pre-commit version **3.0.0 or higher** is required.
Now, when you commit code to the repository you should see something like this:
![A screen capture showing terminal output from a pre-commit hook](docs/data/pre-commit-hook.png)
Please see the [pre-commit documentation](https://pre-commit.com/#quick-start) for additional information.
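To confirm that the installed pre-commit meets the 3.0.0 requirement, a check along these lines may help. It is a sketch that assumes `sort -V` is available (GNU coreutils) and that `pre-commit --version` prints the version as its last field; `version_ge` and `check_pre_commit` are hypothetical helper names.

```shell
# Succeed when version $1 is greater than or equal to version $2.
version_ge() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Compare the installed pre-commit version against the required minimum.
check_pre_commit() {
    installed="$(pre-commit --version | awk '{print $NF}')"
    if version_ge "${installed}" "3.0.0"; then
        echo "pre-commit ${installed} is new enough"
    else
        echo "pre-commit ${installed} is too old; 3.0.0 or higher is required" >&2
        return 1
    fi
}
```

Once the check passes, `pre-commit run --all-files` runs every configured hook against the whole tree instead of only the staged files.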
## Code License ##
All code contributed to this project will be licensed under the license identified in the [License](LICENSE). Your contribution will be accepted under the same license.
## References ##
1. [ROCm Systems Profiler Documentation](https://rocm.docs.amd.com/projects/rocprofiler-systems/en/latest/index.html)
2. [ROCm Systems Profiler README](README.md)
+21
@@ -0,0 +1,21 @@
MIT License
Copyright (c) 2025 Advanced Micro Devices, Inc. All Rights Reserved.
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
Executable file
+419
@@ -0,0 +1,419 @@
# ROCm Systems Profiler: Application profiling, tracing, and analysis
[![Ubuntu 20.04 with GCC, ROCm, and MPI](https://github.com/ROCm/rocprofiler-systems/actions/workflows/ubuntu-focal.yml/badge.svg)](https://github.com/ROCm/rocprofiler-systems/actions/workflows/ubuntu-focal.yml)
[![Ubuntu 22.04 (GCC, Python, ROCm)](https://github.com/ROCm/rocprofiler-systems/actions/workflows/ubuntu-jammy.yml/badge.svg)](https://github.com/ROCm/rocprofiler-systems/actions/workflows/ubuntu-jammy.yml)
[![OpenSUSE 15.x with GCC](https://github.com/ROCm/rocprofiler-systems/actions/workflows/opensuse.yml/badge.svg)](https://github.com/ROCm/rocprofiler-systems/actions/workflows/opensuse.yml)
[![RedHat Linux (GCC, Python, ROCm)](https://github.com/ROCm/rocprofiler-systems/actions/workflows/redhat.yml/badge.svg)](https://github.com/ROCm/rocprofiler-systems/actions/workflows/redhat.yml)
[![Installer Packaging (CPack)](https://github.com/ROCm/rocprofiler-systems/actions/workflows/cpack.yml/badge.svg)](https://github.com/ROCm/rocprofiler-systems/actions/workflows/cpack.yml)
[![Documentation](https://github.com/ROCm/rocprofiler-systems/actions/workflows/docs.yml/badge.svg)](https://github.com/ROCm/rocprofiler-systems/actions/workflows/docs.yml)
> [!NOTE]
> If you are using a version of ROCm prior to ROCm 6.3.1 and are experiencing problems viewing your trace in the latest version of [Perfetto](http://ui.perfetto.dev), then try using [Perfetto UI v46.0](https://ui.perfetto.dev/v46.0-35b3d9845/#!/).
## Overview
ROCm Systems Profiler (rocprofiler-systems), formerly Omnitrace, is a comprehensive profiling and tracing tool for parallel applications written in C, C++, Fortran, HIP, OpenCL, and Python which execute on the CPU or CPU+GPU.
It is capable of gathering the performance information of functions through any combination of binary instrumentation, call-stack sampling, user-defined regions, and Python interpreter hooks.
ROCm Systems Profiler supports interactive visualization of comprehensive traces in the web browser in addition to high-level summary profiles with mean/min/max/stddev statistics.
In addition to runtimes, ROCm Systems Profiler supports the collection of system-level metrics such as the CPU frequency, GPU temperature, and GPU utilization, process-level metrics
such as the memory usage, page-faults, and context-switches, and thread-level metrics such as memory usage, CPU time, and numerous hardware counters.
> [!NOTE]
> Full documentation is available at [ROCm Systems Profiler documentation](https://rocm.docs.amd.com/projects/rocprofiler-systems/en/latest/index.html) in an organized, easy-to-read, searchable format.
The documentation source files reside in the [`/docs`](/docs) folder of this repository. For information on contributing to the documentation, see
[Contribute to ROCm documentation](https://rocm.docs.amd.com/en/latest/contribute/contributing.html)
### Data collection modes
- Dynamic instrumentation
- Runtime instrumentation
- Instrument executable and shared libraries at runtime
- Binary rewriting
- Generate a new executable and/or library with instrumentation built-in
- Statistical sampling
- Periodic software interrupts per-thread
- Process-level sampling
- Background thread records process-, system- and device-level metrics while the application executes
- Causal profiling
- Quantifies the potential impact of optimizations in parallel codes
### Data analysis
- High-level summary profiles with mean/min/max/stddev statistics
- Low overhead, memory efficient
- Ideal for running at scale
- Comprehensive traces
- Every individual event/measurement
- Application speedup predictions resulting from potential optimizations in functions and lines of code (causal profiling)
### Parallelism API support
- HIP
- HSA
- Pthreads
- MPI
- Kokkos-Tools (KokkosP)
- OpenMP-Tools (OMPT)
### GPU metrics
- GPU hardware counters
- HIP API tracing
- HIP kernel tracing
- HSA API tracing
- HSA operation tracing
- rocDecode API tracing
- rocJPEG API tracing
- System-level sampling (via AMD-SMI)
- Memory usage
- Power usage
- Temperature
- Utilization
- VCN Utilization
- JPEG Utilization
> [!NOTE]
> The availability of VCN and JPEG engine utilization depends on device support for different ASICs. If unsupported, all values for VCN_ACTIVITY and JPEG_ACTIVITY will be reported as N/A in the output of `amd-smi metric --usage`.
### CPU metrics
- CPU hardware counters sampling and profiles
- CPU frequency sampling
- Various timing metrics
- Wall time
- CPU time (process and/or thread)
- CPU utilization (process and/or thread)
- User CPU time
- Kernel CPU time
- Various memory metrics
- High-water mark (sampling and profiles)
- Memory page allocation
- Virtual memory usage
- Network statistics
- I/O metrics
- ... many more
## Quick start
### Installation
- Visit the [Releases](https://github.com/ROCm/rocprofiler-systems/releases) page
- Select the appropriate installer (recommendation: `.sh` scripts do not require super-user privileges, unlike the DEB/RPM installers)
- If targeting a ROCm application, find the installer script with the matching ROCm version
- If you are unsure about your Linux distro, check `/etc/os-release` or use the `rocprofiler-systems-install.py` script
If the above recommendation is not desired, download the `rocprofiler-systems-install.py` script and specify `--prefix <install-directory>` when
executing it. This script will attempt to auto-detect a compatible OS distribution and version.
If ROCm support is desired, specify `--rocm X.Y` where `X` is the ROCm major version and `Y`
is the ROCm minor version, e.g. `--rocm 6.2`.
```console
wget https://github.com/ROCm/rocprofiler-systems/releases/latest/download/rocprofiler-systems-install.py
python3 ./rocprofiler-systems-install.py --prefix /opt/rocprofiler-systems --rocm 6.2
```
See the [ROCm Systems Profiler installation guide](https://rocm.docs.amd.com/projects/rocprofiler-systems/en/latest/install/install.html) for detailed information.
### Setup
> [!NOTE]
> Replace `/opt/rocprofiler-systems` below with your installation prefix as necessary.
- **Option 1**: Source `setup-env.sh` script
```bash
source /opt/rocprofiler-systems/share/rocprofiler-systems/setup-env.sh
```
- **Option 2**: Load modulefile
```bash
module use /opt/rocprofiler-systems/share/modulefiles
module load rocprofiler-systems
```
- **Option 3**: Manual
```bash
export PATH=/opt/rocprofiler-systems/bin:${PATH}
export LD_LIBRARY_PATH=/opt/rocprofiler-systems/lib:${LD_LIBRARY_PATH}
```
### Testing environment
The `build-docker` script can be used to create a testing environment. To see the available options, use the following commands:
```shell
cd docker
./build-docker.sh --help
```
> [!NOTE]
> The `-m` argument can be used to show supported OS + ROCm combinations.
**Example:** To set up an Ubuntu 24.04 + ROCm 6.4 + Python 3.12 environment for building and testing, run the following commands:
```shell
cd docker
./build-docker.sh --distro ubuntu --versions 24.04 \
--rocm-versions 6.4 --python-versions 12 --retry 1
docker run -v "$(cd .. && pwd)":/home/development \
-it -w /home/development \
--device /dev/kfd --device /dev/dri \
$(whoami)/rocprofiler-systems:release-base-ubuntu-24.04-rocm-6.4
```
Inside the container, clean, build, and install the project with testing enabled using the following commands:
```shell
rm -rf rocprof-sys-build
cmake -B rocprof-sys-build -S . \
-D CMAKE_INSTALL_PREFIX=/opt/rocprofiler-systems \
-D ROCPROFSYS_USE_PYTHON=ON -D ROCPROFSYS_BUILD_DYNINST=ON \
-D ROCPROFSYS_BUILD_TBB=ON -D ROCPROFSYS_BUILD_BOOST=ON \
-D ROCPROFSYS_BUILD_ELFUTILS=ON -D ROCPROFSYS_BUILD_LIBIBERTY=ON \
-D ROCPROFSYS_BUILD_TESTING=ON
cmake --build rocprof-sys-build --target all --parallel 8
cmake --build rocprof-sys-build --target install
source /opt/rocprofiler-systems/share/rocprofiler-systems/setup-env.sh
```
> [!NOTE]
> If you see "dubious ownership" Git errors when working in the container, run:
>
> ```shell
> git config --global --add safe.directory /home/development
> ```
>
> and
>
> ```shell
> git config --global --add safe.directory /home/development/external/timemory
> ```
Then, use the following command to start automated testing:
```shell
ctest --test-dir rocprof-sys-build --output-on-failure
```
To enable MPI testing inside the container, set the following environment variables:
```shell
export OMPI_ALLOW_RUN_AS_ROOT=1
export OMPI_ALLOW_RUN_AS_ROOT_CONFIRM=1
```
For manual testing, you can find the executables in `rocprof-sys-build/bin`.
### ROCm Systems Profiler settings
Generate a rocprofiler-systems configuration file using `rocprof-sys-avail -G rocprof-sys.cfg`. Optionally, use `rocprof-sys-avail -G rocprof-sys.cfg --all` for
a verbose configuration file with descriptions, categories, etc. Modify the configuration file as desired, e.g. enable
[perfetto](https://perfetto.dev/), [timemory](https://github.com/ROCm/timemory), sampling, and process-level sampling by default
and tweak some sampling default values:
```console
# ...
ROCPROFSYS_TRACE = true
ROCPROFSYS_PROFILE = true
ROCPROFSYS_USE_SAMPLING = true
ROCPROFSYS_USE_PROCESS_SAMPLING = true
# ...
ROCPROFSYS_SAMPLING_FREQ = 50
ROCPROFSYS_SAMPLING_CPUS = all
ROCPROFSYS_SAMPLING_GPUS = $env:HIP_VISIBLE_DEVICES
```
Once the configuration file is adjusted to your preferences, either export the path to this file via `ROCPROFSYS_CONFIG_FILE=/path/to/rocprof-sys.cfg`
or place this file in `${HOME}/.rocprof-sys.cfg` to ensure these values are always read as the default. If you wish to change any of these settings,
you can override them via environment variables or by specifying an alternative `ROCPROFSYS_CONFIG_FILE`.
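The precedence described above (configuration-file values act as defaults, and environment variables override them) can be sketched in a few lines of Python. This is an illustrative model only, not the parser rocprofiler-systems uses: the `parse_config` and `effective_settings` helpers are hypothetical, and the sketch assumes plain `KEY = VALUE` lines with `#` comments. It does not resolve `$env:` references.

```python
def parse_config(text):
    """Parse simple `KEY = VALUE` lines, skipping blank lines and `#` comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            settings[key.strip()] = value.strip()
    return settings


def effective_settings(config_text, environ):
    """Config-file values are defaults; matching environment variables win."""
    settings = parse_config(config_text)
    for key in settings:
        if key in environ:
            settings[key] = environ[key]
    return settings


example_cfg = """
# ...
ROCPROFSYS_TRACE = true
ROCPROFSYS_SAMPLING_FREQ = 50
"""

# Hypothetical environment in which only the sampling frequency is overridden.
merged = effective_settings(example_cfg, {"ROCPROFSYS_SAMPLING_FREQ": "100"})
print(merged)
```

In this model, `ROCPROFSYS_TRACE` keeps its configuration-file value while `ROCPROFSYS_SAMPLING_FREQ` takes the value from the environment.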
### Call-Stack sampling
The `rocprof-sys-sample` executable is used to execute call-stack sampling on a target application without binary instrumentation.
Use a double-hyphen (`--`) to separate the command-line arguments for `rocprof-sys-sample` from the target application and its arguments.
```shell
rocprof-sys-sample --help
rocprof-sys-sample <rocprof-sys-options> -- <exe> <exe-options>
rocprof-sys-sample -f 1000 -- ls -la
```
### Binary instrumentation
The `rocprof-sys-instrument` executable is used to instrument an existing binary. Call-stack sampling can be enabled alongside
the execution of an instrumented binary, to help "fill in the gaps" between the instrumentation, by setting the `ROCPROFSYS_USE_SAMPLING`
configuration variable to `ON`.
Similar to `rocprof-sys-sample`, use a double-hyphen (`--`) to separate the command-line arguments for `rocprof-sys-instrument` from the target application and its arguments.
```shell
rocprof-sys-instrument --help
rocprof-sys-instrument <rocprof-sys-options> -- <exe-or-library> <exe-options>
```
#### Binary rewrite
Rewrite the text section of an executable or library with instrumentation:
```shell
rocprof-sys-instrument -o app.inst -- /path/to/app
```
In binary rewrite mode, if you also want instrumentation in the linked libraries, you must also rewrite those libraries.
Example of rewriting the functions starting with `"hip"` with instrumentation in the amdhip64 library:
```shell
mkdir -p ./lib
rocprof-sys-instrument -R '^hip' -o ./lib/libamdhip64.so.4 -- /opt/rocm/lib/libamdhip64.so.4
export LD_LIBRARY_PATH=${PWD}/lib:${LD_LIBRARY_PATH}
```
> [!NOTE]
> Verify via `ldd` that your executable will load the instrumented library. If you built your executable with an RPATH to the original library's directory, then prefixing `LD_LIBRARY_PATH` will have no effect.
Once you have rewritten your executable and/or libraries with instrumentation, you can run the instrumented executable
(or an executable that loads the instrumented libraries) as you normally would, e.g.:
```shell
rocprof-sys-run -- ./app.inst
```
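The note above recommends checking with `ldd` that the instrumented library is the one that will actually be loaded. A minimal Python sketch of that check follows. The `resolved_library_path` helper is hypothetical, the sample `ldd` output is invented for illustration, and the sketch assumes the usual `<soname> => <path> (<address>)` line layout.

```python
def resolved_library_path(ldd_output, soname):
    """Return the path `ldd` resolved for `soname`, or None if absent or not found."""
    for line in ldd_output.splitlines():
        parts = line.split()
        # Typical line: "<soname> => <path> (<load address>)"
        if len(parts) >= 3 and parts[0] == soname and parts[1] == "=>":
            # "<soname> => not found" means the loader could not resolve it.
            return None if parts[2] == "not" else parts[2]
    return None


# Invented sample output; real output comes from running `ldd ./app.inst`.
sample = """
	libamdhip64.so.4 => /home/user/work/lib/libamdhip64.so.4 (0x00007f0000000000)
	libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f0000400000)
	libmissing.so.1 => not found
"""

print(resolved_library_path(sample, "libamdhip64.so.4"))
```

To use this on a real binary, the text could be obtained with `subprocess.run(["ldd", "./app.inst"], capture_output=True, text=True).stdout`. If the returned path still points to the original library directory, the executable is not picking up the instrumented copy.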
If you want to redefine the default value of certain settings in a binary rewrite, use the `--env` option. This `rocprof-sys-instrument` option
sets the environment variable to the given value only when it is not already set, so a value already present in the environment is not overridden. For example, the default value of `ROCPROFSYS_PERFETTO_BUFFER_SIZE_KB`
is 1024000 KB (approximately 1 GiB):
```shell
# buffer size defaults to 1024000
rocprof-sys-instrument -o app.inst -- /path/to/app
rocprof-sys-run -- ./app.inst
```
Passing `--env ROCPROFSYS_PERFETTO_BUFFER_SIZE_KB=5120000` will change the default value in `app.inst` to 5120000 KB (approximately 5 GiB):
```shell
# defaults to 5 GiB buffer size
rocprof-sys-instrument -o app.inst --env ROCPROFSYS_PERFETTO_BUFFER_SIZE_KB=5120000 -- /path/to/app
rocprof-sys-run -- ./app.inst
```
```shell
# override default 5 GiB buffer size to 200 MB via command-line
rocprof-sys-run --trace-buffer-size=200000 -- ./app.inst
# override default 5 GiB buffer size to 200 MB via environment
export ROCPROFSYS_PERFETTO_BUFFER_SIZE_KB=200000
rocprof-sys-run -- ./app.inst
```
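The buffer sizes quoted above are rounded. The short sketch below converts the three values used in this section, assuming that one unit of `ROCPROFSYS_PERFETTO_BUFFER_SIZE_KB` is 1024 bytes. That is an assumption of this example (the text mixes KB and KiB), and the `describe_buffer` helper is hypothetical.

```python
KIB_PER_MIB = 1024
KIB_PER_GIB = 1024 ** 2


def describe_buffer(size_kb):
    """Format a buffer size given in kilobyte units, treating 1 KB as 1024 bytes."""
    if size_kb >= KIB_PER_GIB:
        return f"{size_kb / KIB_PER_GIB:.2f} GiB"
    return f"{size_kb / KIB_PER_MIB:.2f} MiB"


# The three buffer sizes that appear in the examples above.
for size in (1024000, 5120000, 200000):
    print(size, "->", describe_buffer(size))
```

Under that assumption the default is 1000 MiB (just under 1 GiB), the `--env` example is about 4.88 GiB, and the override is about 195 MiB.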
#### Runtime instrumentation
Runtime instrumentation will not only instrument the text section of the executable but also the text sections of the
linked libraries. Thus, it may be useful to exclude those libraries via the `-ME` (module exclude) regex option
or exclude specific functions with the `-E` regex option.
```shell
rocprof-sys-instrument -- /path/to/app
rocprof-sys-instrument -ME '^(libhsa-runtime64|libz\.so)' -- /path/to/app
rocprof-sys-instrument -E 'rocr::atomic|rocr::core|rocr::HSA' -- /path/to/app
```
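To get a feel for what such exclusion patterns select, they can be tried against candidate names before running the instrumenter. The sketch below uses Python's `re` module as a stand-in. It assumes the patterns are applied as an unanchored search against the library file name or function name, which may not match the tool's regex engine exactly. The module and function names are hypothetical.

```python
import re

# Patterns mirroring the -ME (module exclude) and -E (function exclude) examples.
module_exclude = re.compile(r"^(libhsa-runtime64|libz\.so)")
function_exclude = re.compile(r"rocr::atomic|rocr::core|rocr::HSA")

# Hypothetical names, for illustration only.
modules = ["libhsa-runtime64.so.1", "libz.so.1", "libamdhip64.so.6", "app"]
functions = ["rocr::core::Runtime::Acquire", "hipMalloc", "main"]

kept_modules = [name for name in modules if not module_exclude.search(name)]
kept_functions = [name for name in functions if not function_exclude.search(name)]

print(kept_modules)
print(kept_functions)
```

Names that survive the filter are the ones that would remain candidates for instrumentation under these assumptions.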
### Python profiling and tracing
Use the `rocprof-sys-python` script to profile/trace Python interpreter function calls.
Use a double-hyphen (`--`) to separate the command-line arguments for `rocprof-sys-python` from the target script and its arguments.
```shell
rocprof-sys-python --help
rocprof-sys-python <rocprof-sys-options> -- <python-script> <script-args>
rocprof-sys-python -- ./script.py
```
> [!NOTE]
> The first argument after the double-hyphen must be a Python script, e.g. `rocprof-sys-python -- ./script.py`.
If you need to specify a specific python interpreter version, use `rocprof-sys-python-X.Y` where `X.Y` is the Python
major and minor version:
```shell
rocprof-sys-python-3.8 -- ./script.py
```
If you need to specify the full path to a Python interpreter, set the `PYTHON_EXECUTABLE` environment variable:
```shell
PYTHON_EXECUTABLE=/opt/conda/bin/python rocprof-sys-python -- ./script.py
```
If you want to restrict the data collection to specific function(s) and their callees, pass the `-b` / `--builtin` option after decorating the
function(s) with `@profile`. Use the `@noprofile` decorator to exclude function(s) and their callees:
```python
def foo():
    pass


@noprofile
def bar():
    foo()


@profile
def spam():
    foo()
    bar()
```
Each time `spam` is called during profiling, the profiling results will include 1 entry for `spam` and 1 entry
for `foo` via the direct call within `spam`. There will be no entries for `bar` or the `foo` invocation within it.
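The semantics described above can be reproduced with a small toy model built on the standard `sys.setprofile` hook. This is only a sketch of the observable behavior, not how `rocprof-sys-python` is implemented. The `profile` and `noprofile` decorators below are stand-ins defined in the example itself, and `recorded`, `_hook`, and `_scoped` are hypothetical names.

```python
import functools
import sys

recorded = []      # names of the Python functions the toy hook observed
_wrappers = set()  # code objects of the decorator wrappers, hidden from output


def _hook(frame, event, arg):
    # Record Python-level function entries only, skipping the wrappers themselves.
    if event == "call" and frame.f_code not in _wrappers:
        recorded.append(frame.f_code.co_name)


def _scoped(hook):
    """Build a decorator that installs `hook` for the duration of each call."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            previous = sys.getprofile()
            sys.setprofile(hook)
            try:
                return func(*args, **kwargs)
            finally:
                sys.setprofile(previous)
        _wrappers.add(wrapper.__code__)
        return wrapper
    return decorator


profile = _scoped(_hook)   # collect data for the function and its callees
noprofile = _scoped(None)  # suspend collection for the function and its callees


def foo():
    pass


@noprofile
def bar():
    foo()


@profile
def spam():
    foo()
    bar()


spam()
print(recorded)
```

In this model, one call to `spam` records `spam` and the direct `foo` call. `bar` and the `foo` call inside it are skipped because collection is suspended while `bar` runs.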
### Trace visualization
- Visit [ui.perfetto.dev](https://ui.perfetto.dev) in a web browser
- Select "Open trace file" from the panel on the left
- Locate the rocprofiler-systems perfetto output (extension: `.proto`)
![rocprof-sys-perfetto](docs/data/rocprof-sys-perfetto.png)
![rocprof-sys-rocm](docs/data/rocprof-sys-rocm.png)
![rocprof-sys-rocm-flow](docs/data/rocprof-sys-rocm-flow.png)
![rocprof-sys-user-api](docs/data/rocprof-sys-user-api.png)
## Using Perfetto tracing with system backend
Perfetto tracing with the system backend supports multiple processes writing to the same
output file. Thus, it is a useful technique if rocprofiler-systems is built with partial MPI support
because all the perfetto output will be coalesced into a single file. The
installation docs for perfetto can be found [here](https://perfetto.dev/docs/contributing/build-instructions).
If you are building rocprofiler-systems from source, you can configure CMake with `ROCPROFSYS_INSTALL_PERFETTO_TOOLS=ON`
and the `perfetto` and `traced` applications will be installed as part of the build process. However,
it should be noted that to prevent this option from accidentally overwriting an existing perfetto install,
all the perfetto executables installed by ROCm Systems Profiler are prefixed with `rocprof-sys-perfetto-`, except
for the `perfetto` executable, which is just renamed `rocprof-sys-perfetto`.
Enable `traced` and `perfetto` in the background:
```shell
pkill traced
traced --background
perfetto --out ./rocprof-sys-perfetto.proto --txt -c ${ROCPROFSYS_ROOT}/share/perfetto.cfg --background
```
> [!NOTE]
> If the perfetto tools were installed by rocprofiler-systems, replace `traced` with `rocprof-sys-perfetto-traced` and `perfetto` with `rocprof-sys-perfetto`.
Configure rocprofiler-systems to use the perfetto system backend via the `--perfetto-backend` option of `rocprof-sys-run`:
```shell
# enable sampling on the uninstrumented binary
rocprof-sys-run --sample --trace --perfetto-backend=system -- ./myapp
# instrument the binary, then trace the instrumented binary
rocprof-sys-instrument -o ./myapp.inst -- ./myapp
rocprof-sys-run --trace --perfetto-backend=system -- ./myapp.inst
```
or via the `--env` option of `rocprof-sys-instrument` + runtime instrumentation:
```shell
rocprof-sys-instrument --env ROCPROFSYS_PERFETTO_BACKEND=system -- ./myapp
```
+1
@@ -0,0 +1 @@
1.1.0
+451
@@ -0,0 +1,451 @@
# include guard
include_guard(DIRECTORY)
# ########################################################################################
#
# Handles the build settings
#
# ########################################################################################
include(GNUInstallDirs)
include(Compilers)
include(FindPackageHandleStandardArgs)
include(MacroUtilities)
rocprofiler_systems_add_option(
ROCPROFSYS_BUILD_DEVELOPER "Extra build flags for development like -Werror"
${ROCPROFSYS_BUILD_CI}
)
rocprofiler_systems_add_option(ROCPROFSYS_BUILD_RELEASE
"Build with minimal debug line info" OFF
)
rocprofiler_systems_add_option(ROCPROFSYS_BUILD_EXTRA_OPTIMIZATIONS
"Extra optimization flags" OFF
)
rocprofiler_systems_add_option(ROCPROFSYS_BUILD_LTO "Build with link-time optimization"
OFF
)
rocprofiler_systems_add_option(ROCPROFSYS_USE_COMPILE_TIMING
"Build with timing metrics for compilation" OFF
)
rocprofiler_systems_add_option(ROCPROFSYS_USE_SANITIZER
"Build with -fsanitize=\${ROCPROFSYS_SANITIZER_TYPE}" OFF
)
rocprofiler_systems_add_option(ROCPROFSYS_BUILD_STATIC_LIBGCC
"Build with -static-libgcc if possible" OFF
)
rocprofiler_systems_add_option(ROCPROFSYS_BUILD_STATIC_LIBSTDCXX
"Build with -static-libstdc++ if possible" OFF
)
rocprofiler_systems_add_option(ROCPROFSYS_BUILD_STACK_PROTECTOR
"Build with -fstack-protector" ON
)
rocprofiler_systems_add_cache_option(
ROCPROFSYS_BUILD_LINKER
"If set to a non-empty value, pass -fuse-ld=\${ROCPROFSYS_BUILD_LINKER}" STRING "bfd"
)
rocprofiler_systems_add_cache_option(ROCPROFSYS_BUILD_NUMBER "Internal CI use" STRING "0"
ADVANCED NO_FEATURE
)
rocprofiler_systems_add_interface_library(rocprofiler-systems-static-libgcc
"Link to static version of libgcc"
)
rocprofiler_systems_add_interface_library(rocprofiler-systems-static-libstdcxx
"Link to static version of libstdc++"
)
rocprofiler_systems_add_interface_library(rocprofiler-systems-static-libgcc-optional
"Link to static version of libgcc"
)
rocprofiler_systems_add_interface_library(rocprofiler-systems-static-libstdcxx-optional
"Link to static version of libstdc++"
)
target_compile_definitions(
rocprofiler-systems-compile-options
INTERFACE $<$<CONFIG:DEBUG>:DEBUG>
)
set(ROCPROFSYS_SANITIZER_TYPE "leak" CACHE STRING "Sanitizer type")
if(ROCPROFSYS_USE_SANITIZER)
rocprofiler_systems_add_feature(
ROCPROFSYS_SANITIZER_TYPE
"Sanitizer type, e.g. leak, thread, address, memory, etc."
)
endif()
if(ROCPROFSYS_BUILD_CI)
rocprofiler_systems_target_compile_definitions(${LIBNAME}-compile-options
INTERFACE ROCPROFSYS_CI
)
endif()
# ----------------------------------------------------------------------------------------#
# dynamic linking and runtime libraries
#
if(CMAKE_DL_LIBS AND NOT "${CMAKE_DL_LIBS}" STREQUAL "dl")
# if cmake provides dl library, use that
set(dl_LIBRARY ${CMAKE_DL_LIBS} CACHE FILEPATH "dynamic linking system library")
endif()
foreach(_TYPE dl rt dw)
if(NOT ${_TYPE}_LIBRARY)
find_library(${_TYPE}_LIBRARY NAMES ${_TYPE})
endif()
endforeach()
find_package_handle_standard_args(dl-library REQUIRED_VARS dl_LIBRARY)
find_package_handle_standard_args(rt-library REQUIRED_VARS rt_LIBRARY)
# find_package_handle_standard_args(dw-library REQUIRED_VARS dw_LIBRARY)
if(dl_LIBRARY)
target_link_libraries(rocprofiler-systems-compile-options INTERFACE ${dl_LIBRARY})
endif()
# ----------------------------------------------------------------------------------------#
# set the compiler flags
#
add_flag_if_avail(
"-W" "-Wall" "-Wno-unknown-pragmas" "-Wno-unused-function" "-Wno-ignored-attributes"
"-Wno-attributes" "-Wno-missing-field-initializers" "-Wno-interference-size"
)
if(ROCPROFSYS_BUILD_DEBUG)
add_flag_if_avail("-g3" "-fno-omit-frame-pointer")
endif()
if(WIN32)
# suggested by MSVC for spectre mitigation in rapidjson implementation
add_cxx_flag_if_avail("/Qspectre")
endif()
if(CMAKE_CXX_COMPILER_IS_CLANG)
add_cxx_flag_if_avail("-Wno-mismatched-tags")
endif()
# ----------------------------------------------------------------------------------------#
# extra flags for debug information in debug or optimized binaries
#
rocprofiler_systems_add_interface_library(
rocprofiler-systems-compile-debuginfo
"Attempts to set best flags for more expressive profiling information in debug or optimized binaries"
)
add_target_flag_if_avail(rocprofiler-systems-compile-debuginfo "-g3"
"-fno-omit-frame-pointer" "-fno-optimize-sibling-calls"
)
if(CMAKE_CUDA_COMPILER_IS_NVIDIA)
add_target_cuda_flag(rocprofiler-systems-compile-debuginfo "-lineinfo")
endif()
target_compile_options(
rocprofiler-systems-compile-debuginfo
INTERFACE
$<$<COMPILE_LANGUAGE:C>:$<$<C_COMPILER_ID:GNU>:-rdynamic>>
$<$<COMPILE_LANGUAGE:CXX>:$<$<CXX_COMPILER_ID:GNU>:-rdynamic>>
)
if(NOT APPLE)
target_link_options(
rocprofiler-systems-compile-debuginfo
INTERFACE $<$<CXX_COMPILER_ID:GNU>:-rdynamic>
)
endif()
if(CMAKE_CUDA_COMPILER_IS_NVIDIA)
target_compile_options(
rocprofiler-systems-compile-debuginfo
INTERFACE
$<$<COMPILE_LANGUAGE:CUDA>:$<$<CXX_COMPILER_ID:GNU>:-Xcompiler=-rdynamic>>
)
endif()
if(dl_LIBRARY)
target_link_libraries(rocprofiler-systems-compile-debuginfo INTERFACE ${dl_LIBRARY})
endif()
if(rt_LIBRARY)
target_link_libraries(rocprofiler-systems-compile-debuginfo INTERFACE ${rt_LIBRARY})
endif()
# ----------------------------------------------------------------------------------------#
# non-debug optimizations
#
rocprofiler_systems_add_interface_library(rocprofiler-systems-compile-extra
"Extra optimization flags"
)
if(NOT ROCPROFSYS_BUILD_CODECOV AND ROCPROFSYS_BUILD_EXTRA_OPTIMIZATIONS)
add_target_flag_if_avail(
rocprofiler-systems-compile-extra "-finline-functions" "-funroll-loops"
"-ftree-vectorize" "-ftree-loop-optimize" "-ftree-loop-vectorize"
)
endif()
if(
NOT "${CMAKE_BUILD_TYPE}" STREQUAL "Debug"
AND ROCPROFSYS_BUILD_EXTRA_OPTIMIZATIONS
AND NOT ROCPROFSYS_BUILD_CODECOV
)
target_link_libraries(
rocprofiler-systems-compile-options
INTERFACE $<BUILD_INTERFACE:rocprofiler-systems-compile-extra>
)
add_flag_if_avail(
"-fno-signaling-nans" "-fno-trapping-math" "-fno-signed-zeros"
"-ffinite-math-only" "-fno-math-errno" "-fpredictive-commoning"
"-fvariable-expansion-in-unroller"
)
# add_flag_if_avail("-freciprocal-math" "-fno-signed-zeros" "-mfast-fp")
endif()
# ----------------------------------------------------------------------------------------#
# debug-safe optimizations
#
add_cxx_flag_if_avail("-faligned-new")
rocprofiler_systems_add_interface_library(rocprofiler-systems-lto
"Adds link-time-optimization flags"
)
if(NOT ROCPROFSYS_BUILD_CODECOV)
rocprofiler_systems_save_variables(FLTO VARIABLES CMAKE_CXX_FLAGS)
set(_CXX_FLAGS "${CMAKE_CXX_FLAGS}")
set(CMAKE_CXX_FLAGS "-flto=thin ${_CXX_FLAGS}")
add_target_flag_if_avail(rocprofiler-systems-lto "-flto=thin")
if(NOT cxx_rocprofiler_systems_lto_flto_thin)
set(CMAKE_CXX_FLAGS "-flto ${_CXX_FLAGS}")
add_target_flag_if_avail(rocprofiler-systems-lto "-flto")
if(NOT cxx_rocprofiler_systems_lto_flto)
set(ROCPROFSYS_BUILD_LTO OFF)
else()
target_link_options(rocprofiler-systems-lto INTERFACE -flto)
endif()
add_target_flag_if_avail(rocprofiler-systems-lto "-fno-fat-lto-objects")
if(cxx_rocprofiler_systems_lto_fno_fat_lto_objects)
target_link_options(rocprofiler-systems-lto INTERFACE -fno-fat-lto-objects)
endif()
else()
target_link_options(rocprofiler-systems-lto INTERFACE -flto=thin)
endif()
rocprofiler_systems_restore_variables(FLTO VARIABLES CMAKE_CXX_FLAGS)
endif()
# ----------------------------------------------------------------------------------------#
# print compilation timing reports (Clang compiler)
#
rocprofiler_systems_add_interface_library(
rocprofiler-systems-compile-timing
"Adds compiler flags which report compilation timing metrics"
)
if(CMAKE_CXX_COMPILER_IS_CLANG)
add_target_flag_if_avail(rocprofiler-systems-compile-timing "-ftime-trace")
if(NOT cxx_rocprofiler_systems_compile_timing_ftime_trace)
add_target_flag_if_avail(rocprofiler-systems-compile-timing "-ftime-report")
endif()
else()
add_target_flag_if_avail(rocprofiler-systems-compile-timing "-ftime-report")
endif()
if(ROCPROFSYS_USE_COMPILE_TIMING)
target_link_libraries(
rocprofiler-systems-compile-options
INTERFACE rocprofiler-systems-compile-timing
)
endif()
# ----------------------------------------------------------------------------------------#
# fstack-protector
#
rocprofiler_systems_add_interface_library(rocprofiler-systems-stack-protector
"Adds stack-protector compiler flags"
)
add_target_flag_if_avail(rocprofiler-systems-stack-protector "-fstack-protector-strong"
"-Wstack-protector"
)
if(ROCPROFSYS_BUILD_STACK_PROTECTOR)
target_link_libraries(
rocprofiler-systems-compile-options
INTERFACE rocprofiler-systems-stack-protector
)
endif()
# ----------------------------------------------------------------------------------------#
# developer build flags
#
if(ROCPROFSYS_BUILD_DEVELOPER)
add_target_flag_if_avail(
rocprofiler-systems-compile-options "-Werror" "-Wdouble-promotion" "-Wshadow"
"-Wextra" "-Wpedantic" "-Wstack-usage=524288" # 512 KB
"/showIncludes"
)
if(ROCPROFSYS_BUILD_NUMBER GREATER 2)
add_target_flag_if_avail(rocprofiler-systems-compile-options "-gsplit-dwarf")
endif()
endif()
if(ROCPROFSYS_BUILD_LINKER)
target_link_options(
rocprofiler-systems-compile-options
INTERFACE
$<$<C_COMPILER_ID:GNU>:-fuse-ld=${ROCPROFSYS_BUILD_LINKER}>
$<$<CXX_COMPILER_ID:GNU>:-fuse-ld=${ROCPROFSYS_BUILD_LINKER}>
)
endif()
# ----------------------------------------------------------------------------------------#
# release build flags
#
if(ROCPROFSYS_BUILD_RELEASE AND NOT ROCPROFSYS_BUILD_DEBUG)
add_target_flag_if_avail(
rocprofiler-systems-compile-options "-g1" "-feliminate-unused-debug-symbols"
"-gno-column-info" "-gno-variable-location-views" "-gline-tables-only"
)
endif()
# ----------------------------------------------------------------------------------------#
# visibility build flags
#
rocprofiler_systems_add_interface_library(rocprofiler-systems-default-visibility
"Adds -fvisibility=default compiler flag"
)
rocprofiler_systems_add_interface_library(rocprofiler-systems-hidden-visibility
"Adds -fvisibility=hidden compiler flag"
)
add_target_flag_if_avail(rocprofiler-systems-default-visibility "-fvisibility=default")
add_target_flag_if_avail(rocprofiler-systems-hidden-visibility "-fvisibility=hidden"
"-fvisibility-inlines-hidden"
)
# ----------------------------------------------------------------------------------------#
# export all symbols to the dynamic symbol table
#
if(dl_LIBRARY)
# This instructs the linker to add all symbols, not only used ones, to the dynamic
# symbol table. This option is needed for some uses of dlopen or to allow obtaining
# backtraces from within a program.
add_flag_if_avail("-rdynamic")
endif()
# ----------------------------------------------------------------------------------------#
# sanitizer
#
set(ROCPROFSYS_SANITIZER_TYPES
address
memory
thread
leak
undefined
unreachable
null
bounds
alignment
)
set_property(
CACHE ROCPROFSYS_SANITIZER_TYPE
PROPERTY STRINGS "${ROCPROFSYS_SANITIZER_TYPES}"
)
rocprofiler_systems_add_interface_library(rocprofiler-systems-sanitizer-compile-options
"Adds compiler flags for sanitizers"
)
rocprofiler_systems_add_interface_library(
rocprofiler-systems-sanitizer
"Adds compiler flags to enable ${ROCPROFSYS_SANITIZER_TYPE} sanitizer (-fsanitize=${ROCPROFSYS_SANITIZER_TYPE})"
)
set(COMMON_SANITIZER_FLAGS
"-fno-optimize-sibling-calls"
"-fno-omit-frame-pointer"
"-fno-inline-functions"
)
add_target_flag(rocprofiler-systems-sanitizer-compile-options ${COMMON_SANITIZER_FLAGS})
foreach(_TYPE ${ROCPROFSYS_SANITIZER_TYPES})
set(_FLAG "-fsanitize=${_TYPE}")
rocprofiler_systems_add_interface_library(
rocprofiler-systems-${_TYPE}-sanitizer
"Adds compiler flags to enable ${_TYPE} sanitizer (${_FLAG})"
)
add_target_flag(rocprofiler-systems-${_TYPE}-sanitizer ${_FLAG})
target_link_libraries(
rocprofiler-systems-${_TYPE}-sanitizer
INTERFACE rocprofiler-systems-sanitizer-compile-options
)
set_property(
TARGET rocprofiler-systems-${_TYPE}-sanitizer
PROPERTY INTERFACE_LINK_OPTIONS ${_FLAG} ${COMMON_SANITIZER_FLAGS}
)
endforeach()
unset(_FLAG)
unset(COMMON_SANITIZER_FLAGS)
if(ROCPROFSYS_USE_SANITIZER)
foreach(_TYPE ${ROCPROFSYS_SANITIZER_TYPE})
if(TARGET rocprofiler-systems-${_TYPE}-sanitizer)
target_link_libraries(
rocprofiler-systems-sanitizer
INTERFACE rocprofiler-systems-${_TYPE}-sanitizer
)
else()
message(
FATAL_ERROR
"Error! Target 'rocprofiler-systems-${_TYPE}-sanitizer' does not exist!"
)
endif()
endforeach()
else()
set(ROCPROFSYS_USE_SANITIZER OFF)
endif()
# ----------------------------------------------------------------------------------------#
# static lib flags
#
target_compile_options(
rocprofiler-systems-static-libgcc
INTERFACE
$<$<COMPILE_LANGUAGE:C>:$<$<C_COMPILER_ID:GNU>:-static-libgcc>>
$<$<COMPILE_LANGUAGE:CXX>:$<$<CXX_COMPILER_ID:GNU>:-static-libgcc>>
)
target_link_options(
rocprofiler-systems-static-libgcc
INTERFACE
$<$<COMPILE_LANGUAGE:C>:$<$<C_COMPILER_ID:GNU,Clang>:-static-libgcc>>
$<$<COMPILE_LANGUAGE:CXX>:$<$<CXX_COMPILER_ID:GNU,Clang>:-static-libgcc>>
)
target_compile_options(
rocprofiler-systems-static-libstdcxx
INTERFACE $<$<COMPILE_LANGUAGE:CXX>:$<$<CXX_COMPILER_ID:GNU>:-static-libstdc++>>
)
target_link_options(
rocprofiler-systems-static-libstdcxx
INTERFACE $<$<COMPILE_LANGUAGE:CXX>:$<$<CXX_COMPILER_ID:GNU,Clang>:-static-libstdc++>>
)
if(ROCPROFSYS_BUILD_STATIC_LIBGCC)
target_link_libraries(
rocprofiler-systems-static-libgcc-optional
INTERFACE rocprofiler-systems-static-libgcc
)
endif()
if(ROCPROFSYS_BUILD_STATIC_LIBSTDCXX)
target_link_libraries(
rocprofiler-systems-static-libstdcxx-optional
INTERFACE rocprofiler-systems-static-libstdcxx
)
endif()
# ----------------------------------------------------------------------------------------#
# user customization
#
get_property(LANGUAGES GLOBAL PROPERTY ENABLED_LANGUAGES)
if(NOT APPLE OR "$ENV{CONDA_PYTHON_EXE}" STREQUAL "")
add_user_flags(rocprofiler-systems-compile-options "CXX")
endif()
+619
@@ -0,0 +1,619 @@
# include guard
# ########################################################################################
#
# Compilers
#
# ########################################################################################
#
# sets (cached):
#
# CMAKE_C_COMPILER_IS_<TYPE> CMAKE_CXX_COMPILER_IS_<TYPE>
#
# where TYPE is: - GNU - CLANG - INTEL - INTEL_ICC - INTEL_ICPC - PGI - XLC - HP_ACC -
# MIPS - MSVC
#
include(CheckCCompilerFlag)
include(CheckCSourceCompiles)
include(CheckCSourceRuns)
include(CheckCXXCompilerFlag)
include(CheckCXXSourceCompiles)
include(CheckCXXSourceRuns)
include(CMakeParseArguments)
include(MacroUtilities)
if("${LIBNAME}" STREQUAL "")
string(TOLOWER "${PROJECT_NAME}" LIBNAME)
endif()
if(NOT TARGET ${LIBNAME}-compile-options)
rocprofiler_systems_add_interface_library(
${LIBNAME}-compile-options
"Adds the standard set of compiler flags used by timemory"
)
endif()
# ----------------------------------------------------------------------------------------#
# macro converting string to list
# ----------------------------------------------------------------------------------------#
macro(to_list _VAR _STR)
# collapse double spaces first, then split the collapsed result on single spaces
string(REPLACE "  " " " ${_VAR} "${_STR}")
string(REPLACE " " ";" ${_VAR} "${${_VAR}}")
endmacro(to_list _VAR _STR)
# ----------------------------------------------------------------------------------------#
# macro converting list to string
# ----------------------------------------------------------------------------------------#
macro(to_string _VAR _STR)
string(REPLACE ";" " " ${_VAR} "${_STR}")
endmacro(to_string _VAR _STR)
# ----------------------------------------------------------------------------------------#
# Macro to add to string
# ----------------------------------------------------------------------------------------#
macro(add _VAR _FLAG)
if(NOT "${_FLAG}" STREQUAL "")
if("${${_VAR}}" STREQUAL "")
set(${_VAR} "${_FLAG}")
else()
set(${_VAR} "${${_VAR}} ${_FLAG}")
endif()
endif()
endmacro()
# ----------------------------------------------------------------------------------------#
# macro to remove duplicates from string
# ----------------------------------------------------------------------------------------#
macro(set_no_duplicates _VAR)
if(NOT "${ARGN}" STREQUAL "")
set(${_VAR} "${ARGN}")
endif()
# remove the duplicates
if(NOT "${${_VAR}}" STREQUAL "")
# create list of flags
to_list(_VAR_LIST "${${_VAR}}")
list(REMOVE_DUPLICATES _VAR_LIST)
to_string(${_VAR} "${_VAR_LIST}")
endif(NOT "${${_VAR}}" STREQUAL "")
endmacro(set_no_duplicates _VAR)
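# Usage sketch (illustrative only; _EXAMPLE_FLAGS is a hypothetical variable name):
#
# set_no_duplicates(_EXAMPLE_FLAGS "-O2 -g -O2")
#
# This should leave _EXAMPLE_FLAGS as the space-separated string "-O2 -g": the value is
# split on single spaces, de-duplicated as a list, and joined back with spaces.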
# ----------------------------------------------------------------------------------------#
# call before running check_{c,cxx}_compiler_flag
# ----------------------------------------------------------------------------------------#
macro(rocprofiler_systems_begin_flag_check)
if(ROCPROFSYS_QUIET_CONFIG)
if(NOT DEFINED CMAKE_REQUIRED_QUIET)
set(CMAKE_REQUIRED_QUIET OFF)
endif()
rocprofiler_systems_save_variables(FLAG_CHECK VARIABLES CMAKE_REQUIRED_QUIET)
set(CMAKE_REQUIRED_QUIET ON)
endif()
endmacro()
# ----------------------------------------------------------------------------------------#
# call after running check_{c,cxx}_compiler_flag
# ----------------------------------------------------------------------------------------#
macro(rocprofiler_systems_end_flag_check)
if(ROCPROFSYS_QUIET_CONFIG)
rocprofiler_systems_restore_variables(FLAG_CHECK VARIABLES CMAKE_REQUIRED_QUIET)
endif()
endmacro()
# ########################################################################################
#
# C compiler flags
#
# ########################################################################################
# ----------------------------------------------------------------------------------------#
# add C flag to target
# ----------------------------------------------------------------------------------------#
macro(ADD_TARGET_C_FLAG _TARG)
get_target_property(_TARG_TYPE ${_TARG} TYPE)
if("${_TARG_TYPE}" MATCHES "INTERFACE_LIBRARY")
set(_SCOPE INTERFACE)
else()
set(_SCOPE PRIVATE)
endif()
string(REPLACE "-" "_" _MAKE_TARG "${_TARG}")
list(APPEND ROCPROFSYS_MAKE_TARGETS ${_MAKE_TARG})
target_compile_options(${_TARG} ${_SCOPE} $<$<COMPILE_LANGUAGE:C>:${ARGN}>)
list(APPEND ${_MAKE_TARG}_C_FLAGS ${ARGN})
endmacro()
# ----------------------------------------------------------------------------------------#
# add C flag w/o check
# ----------------------------------------------------------------------------------------#
macro(ADD_C_FLAG FLAG)
set(_TARG)
set(_LTARG)
if(NOT "${ARGN}" STREQUAL "")
set(_TARG ${ARGN})
string(TOLOWER "_${ARGN}" _LTARG)
endif()
if(NOT "${FLAG}" STREQUAL "")
if("${_LTARG}" STREQUAL "")
list(APPEND ${PROJECT_NAME}_C_FLAGS "${FLAG}")
list(APPEND ${PROJECT_NAME}_C_COMPILE_OPTIONS "${FLAG}")
add_target_c_flag(${LIBNAME}-compile-options ${FLAG})
else()
add_target_c_flag(${_TARG} ${FLAG})
endif()
endif()
unset(_TARG)
unset(_LTARG)
endmacro()
# ----------------------------------------------------------------------------------------#
# check C flag
# ----------------------------------------------------------------------------------------#
macro(ADD_C_FLAG_IF_AVAIL FLAG)
set(_ENABLE ON)
if(DEFINED ROCPROFSYS_BUILD_C AND NOT ROCPROFSYS_BUILD_C)
set(_ENABLE OFF)
endif()
set(_TARG)
set(_LTARG)
if(NOT "${ARGN}" STREQUAL "")
set(_TARG ${ARGN})
string(TOLOWER "_${ARGN}" _LTARG)
endif()
if(NOT "${FLAG}" STREQUAL "")
string(REGEX REPLACE "^/" "c${_LTARG}_" FLAG_NAME "${FLAG}")
string(REGEX REPLACE "^-" "c${_LTARG}_" FLAG_NAME "${FLAG_NAME}")
string(REPLACE "-" "_" FLAG_NAME "${FLAG_NAME}")
string(REPLACE " " "_" FLAG_NAME "${FLAG_NAME}")
string(REPLACE "=" "_" FLAG_NAME "${FLAG_NAME}")
if(NOT ROCPROFSYS_BUILD_C)
set(${FLAG_NAME} ON)
else()
rocprofiler_systems_begin_flag_check()
check_c_compiler_flag("-Werror" c_werror)
if(c_werror)
check_c_compiler_flag("${FLAG} -Werror" ${FLAG_NAME})
else()
check_c_compiler_flag("${FLAG}" ${FLAG_NAME})
endif()
rocprofiler_systems_end_flag_check()
if(${FLAG_NAME})
if("${_LTARG}" STREQUAL "")
list(APPEND ${PROJECT_NAME}_C_FLAGS "${FLAG}")
list(APPEND ${PROJECT_NAME}_C_COMPILE_OPTIONS "${FLAG}")
add_target_c_flag(${LIBNAME}-compile-options ${FLAG})
else()
add_target_c_flag(${_TARG} ${FLAG})
endif()
endif()
endif()
endif()
unset(_TARG)
unset(_LTARG)
endmacro()
# ----------------------------------------------------------------------------------------#
# check C flag(s) and add to target if available
# ----------------------------------------------------------------------------------------#
macro(ADD_TARGET_C_FLAG_IF_AVAIL _TARG)
foreach(_FLAG ${ARGN})
add_c_flag_if_avail(${_FLAG} ${_TARG})
endforeach()
endmacro()
# ########################################################################################
#
# CXX compiler flags
#
# ########################################################################################
# ----------------------------------------------------------------------------------------#
# add CXX flag to target
# ----------------------------------------------------------------------------------------#
macro(ADD_TARGET_CXX_FLAG _TARG)
get_target_property(_TARG_TYPE ${_TARG} TYPE)
if("${_TARG_TYPE}" MATCHES "INTERFACE_LIBRARY")
set(_SCOPE INTERFACE)
else()
set(_SCOPE PRIVATE)
endif()
string(REPLACE "-" "_" _MAKE_TARG "${_TARG}")
list(APPEND ROCPROFSYS_MAKE_TARGETS ${_MAKE_TARG})
target_compile_options(${_TARG} ${_SCOPE} $<$<COMPILE_LANGUAGE:CXX>:${ARGN}>)
list(APPEND ${_MAKE_TARG}_CXX_FLAGS ${ARGN})
get_property(LANGUAGES GLOBAL PROPERTY ENABLED_LANGUAGES)
if(CMAKE_CUDA_COMPILER_IS_NVIDIA)
target_compile_options(
${_TARG}
${_SCOPE}
$<$<COMPILE_LANGUAGE:CUDA>:-Xcompiler=${ARGN}>
)
list(APPEND ${_MAKE_TARG}_CUDA_FLAGS -Xcompiler=${ARGN})
elseif(CMAKE_CUDA_COMPILER_IS_CLANG)
target_compile_options(${_TARG} ${_SCOPE} $<$<COMPILE_LANGUAGE:CUDA>:${ARGN}>)
list(APPEND ${_MAKE_TARG}_CUDA_FLAGS ${ARGN})
endif()
endmacro()
# ----------------------------------------------------------------------------------------#
# add CXX flag w/o check
# ----------------------------------------------------------------------------------------#
macro(ADD_CXX_FLAG FLAG)
set(_TARG)
set(_LTARG)
if(NOT "${ARGN}" STREQUAL "")
set(_TARG ${ARGN})
string(TOLOWER "_${ARGN}" _LTARG)
endif()
if(NOT "${FLAG}" STREQUAL "")
if("${_LTARG}" STREQUAL "")
list(APPEND ${PROJECT_NAME}_CXX_FLAGS "${FLAG}")
list(APPEND ${PROJECT_NAME}_CXX_COMPILE_OPTIONS "${FLAG}")
add_target_cxx_flag(${LIBNAME}-compile-options ${FLAG})
else()
add_target_cxx_flag(${_TARG} ${FLAG})
endif()
endif()
unset(_TARG)
unset(_LTARG)
endmacro()
# ----------------------------------------------------------------------------------------#
# check CXX flag
# ----------------------------------------------------------------------------------------#
macro(ADD_CXX_FLAG_IF_AVAIL FLAG)
set(_TARG)
set(_LTARG)
if(NOT "${ARGN}" STREQUAL "")
set(_TARG ${ARGN})
string(TOLOWER "_${ARGN}" _LTARG)
endif()
if(NOT "${FLAG}" STREQUAL "")
string(REGEX REPLACE "^/" "cxx${_LTARG}_" FLAG_NAME "${FLAG}")
string(REGEX REPLACE "^-" "cxx${_LTARG}_" FLAG_NAME "${FLAG_NAME}")
string(REPLACE "-" "_" FLAG_NAME "${FLAG_NAME}")
string(REPLACE " " "_" FLAG_NAME "${FLAG_NAME}")
string(REPLACE "=" "_" FLAG_NAME "${FLAG_NAME}")
string(REPLACE "/" "_" FLAG_NAME "${FLAG_NAME}")
rocprofiler_systems_begin_flag_check()
check_cxx_compiler_flag("-Werror" cxx_werror)
if(cxx_werror)
check_cxx_compiler_flag("${FLAG} -Werror" ${FLAG_NAME})
else()
check_cxx_compiler_flag("${FLAG}" ${FLAG_NAME})
endif()
rocprofiler_systems_end_flag_check()
if(${FLAG_NAME})
if("${_LTARG}" STREQUAL "")
list(APPEND ${PROJECT_NAME}_CXX_FLAGS "${FLAG}")
list(APPEND ${PROJECT_NAME}_CXX_COMPILE_OPTIONS "${FLAG}")
add_target_cxx_flag(${LIBNAME}-compile-options ${FLAG})
else()
add_target_cxx_flag(${_TARG} ${FLAG})
endif()
endif()
endif()
unset(_TARG)
unset(_LTARG)
endmacro()
# ----------------------------------------------------------------------------------------#
# check CXX flag(s) and add to target if available
# ----------------------------------------------------------------------------------------#
macro(ADD_TARGET_CXX_FLAG_IF_AVAIL _TARG)
foreach(_FLAG ${ARGN})
add_cxx_flag_if_avail(${_FLAG} ${_TARG})
endforeach()
endmacro()
# ########################################################################################
#
# Common
#
# ########################################################################################
# ----------------------------------------------------------------------------------------#
# add C and CXX flag to compile-options w/o checking
# ----------------------------------------------------------------------------------------#
macro(ADD_FLAG)
foreach(_ARG ${ARGN})
add_c_flag("${_ARG}")
add_cxx_flag("${_ARG}")
endforeach()
endmacro()
# ----------------------------------------------------------------------------------------#
# add C and CXX flag to target w/o checking
# ----------------------------------------------------------------------------------------#
macro(ADD_TARGET_FLAG _TARG)
foreach(_ARG ${ARGN})
add_target_c_flag(${_TARG} ${_ARG})
add_target_cxx_flag(${_TARG} ${_ARG})
endforeach()
endmacro()
# ----------------------------------------------------------------------------------------#
# check C and CXX flag
# ----------------------------------------------------------------------------------------#
macro(ADD_FLAG_IF_AVAIL)
foreach(_ARG ${ARGN})
add_c_flag_if_avail("${_ARG}")
add_cxx_flag_if_avail("${_ARG}")
endforeach()
endmacro()
# ----------------------------------------------------------------------------------------#
# check C and CXX flag and add to target if available
# ----------------------------------------------------------------------------------------#
macro(ADD_TARGET_FLAG_IF_AVAIL _TARG)
foreach(_ARG ${ARGN})
add_target_c_flag_if_avail(${_TARG} ${_ARG})
add_target_cxx_flag_if_avail(${_TARG} ${_ARG})
endforeach()
endmacro()
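# Usage sketch for the macros above (illustrative only; example-target is a hypothetical
# target that must already exist):
#
# add_flag_if_avail("-Wall" "-Wextra")                 # no target given, checked
# add_target_flag_if_avail(example-target "-Wshadow")  # one target, checked
# add_target_flag(example-target "-g")                 # one target, not checked
#
# The *_if_avail variants only add a flag when check_{c,cxx}_compiler_flag accepts it.
# Flags added without an explicit target go to the ${LIBNAME}-compile-options interface
# library.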
# ----------------------------------------------------------------------------------------#
# add flags to a target per language, optionally checking availability (IF_AVAIL)
# ----------------------------------------------------------------------------------------#
function(ROCPROFILER_SYSTEMS_TARGET_FLAG _TARG_TARGET)
cmake_parse_arguments(_TARG "IF_AVAIL" "MODE" "FLAGS;LANGUAGES" ${ARGN})
if(NOT _TARG_MODE)
set(_TARG_MODE INTERFACE)
endif()
get_property(ENABLED_LANGUAGES GLOBAL PROPERTY ENABLED_LANGUAGES)
if(NOT _TARG_LANGUAGES)
get_property(_TARG_LANGUAGES GLOBAL PROPERTY ENABLED_LANGUAGES)
endif()
string(TOLOWER "_${_TARG_TARGET}" _LTARG)
foreach(_FLAG ${_TARG_FLAGS})
foreach(_LANG ${_TARG_LANGUAGES})
if(NOT _TARG_IF_AVAIL)
target_compile_options(
${_TARG_TARGET}
${_TARG_MODE}
$<$<COMPILE_LANGUAGE:${_LANG}>:${_FLAG}>
)
continue()
endif()
if("${_LANG}" STREQUAL "C")
string(REGEX REPLACE "^/" "c${_LTARG}_" FLAG_NAME "${_FLAG}")
string(REGEX REPLACE "^-" "c${_LTARG}_" FLAG_NAME "${FLAG_NAME}")
string(REPLACE "-" "_" FLAG_NAME "${FLAG_NAME}")
string(REPLACE " " "_" FLAG_NAME "${FLAG_NAME}")
string(REPLACE "=" "_" FLAG_NAME "${FLAG_NAME}")
rocprofiler_systems_begin_flag_check()
check_c_compiler_flag("-Werror" c_werror)
if(c_werror)
check_c_compiler_flag("${_FLAG} -Werror" ${FLAG_NAME})
else()
check_c_compiler_flag("${_FLAG}" ${FLAG_NAME})
endif()
rocprofiler_systems_end_flag_check()
if(${FLAG_NAME})
target_compile_options(
${_TARG_TARGET}
${_TARG_MODE}
$<$<COMPILE_LANGUAGE:${_LANG}>:${_FLAG}>
)
endif()
elseif("${_LANG}" STREQUAL "CXX")
string(REGEX REPLACE "^/" "cxx${_LTARG}_" FLAG_NAME "${_FLAG}")
string(REGEX REPLACE "^-" "cxx${_LTARG}_" FLAG_NAME "${FLAG_NAME}")
string(REPLACE "-" "_" FLAG_NAME "${FLAG_NAME}")
string(REPLACE " " "_" FLAG_NAME "${FLAG_NAME}")
string(REPLACE "=" "_" FLAG_NAME "${FLAG_NAME}")
rocprofiler_systems_begin_flag_check()
check_cxx_compiler_flag("-Werror" cxx_werror)
if(cxx_werror)
check_cxx_compiler_flag("${_FLAG} -Werror" ${FLAG_NAME})
else()
check_cxx_compiler_flag("${_FLAG}" ${FLAG_NAME})
endif()
rocprofiler_systems_end_flag_check()
if(${FLAG_NAME})
target_compile_options(
${_TARG_TARGET}
${_TARG_MODE}
$<$<COMPILE_LANGUAGE:${_LANG}>:${_FLAG}>
)
if(CMAKE_CUDA_COMPILER_IS_NVIDIA)
target_compile_options(
${_TARG_TARGET}
${_TARG_MODE}
$<$<COMPILE_LANGUAGE:CUDA>:-Xcompiler=${_FLAG}>
)
elseif(CMAKE_CUDA_COMPILER_IS_CLANG)
target_compile_options(
${_TARG_TARGET}
${_TARG_MODE}
$<$<COMPILE_LANGUAGE:CUDA>:${_FLAG}>
)
endif()
endif()
endif()
endforeach()
endforeach()
endfunction()
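# Usage sketch (illustrative only; example-target is a hypothetical, already-defined
# target):
#
# rocprofiler_systems_target_flag(
#     example-target
#     MODE PRIVATE
#     FLAGS -Wall -Wextra
#     LANGUAGES C CXX
#     IF_AVAIL
# )
#
# As written above, MODE defaults to INTERFACE and LANGUAGES defaults to the globally
# enabled languages. With IF_AVAIL, only the C and CXX branches perform a compiler check;
# any other language in LANGUAGES appears to be skipped without adding the flag.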
# ----------------------------------------------------------------------------------------#
# add CUDA flag to target
# ----------------------------------------------------------------------------------------#
macro(ADD_TARGET_CUDA_FLAG _TARG)
string(REPLACE "-" "_" _MAKE_TARG "${_TARG}")
list(APPEND ROCPROFSYS_MAKE_TARGETS ${_MAKE_TARG})
target_compile_options(${_TARG} INTERFACE $<$<COMPILE_LANGUAGE:CUDA>:${ARGN}>)
list(APPEND ${_MAKE_TARG}_CUDA_FLAGS ${ARGN})
endmacro()
# ----------------------------------------------------------------------------------------#
# add user-provided flags (CMake or environment variables) for the given language
# ----------------------------------------------------------------------------------------#
function(ADD_USER_FLAGS _TARGET _LANGUAGE)
set(_FLAGS
${${_LANGUAGE}FLAGS}
$ENV{${_LANGUAGE}FLAGS}
${${_LANGUAGE}_FLAGS}
$ENV{${_LANGUAGE}_FLAGS}
)
string(REPLACE " " ";" _FLAGS "${_FLAGS}")
set(${PROJECT_NAME}_${_LANGUAGE}_FLAGS
${${PROJECT_NAME}_${_LANGUAGE}_FLAGS}
${_FLAGS}
PARENT_SCOPE
)
set(${PROJECT_NAME}_${_LANGUAGE}_COMPILE_OPTIONS
${${PROJECT_NAME}_${_LANGUAGE}_COMPILE_OPTIONS}
${_FLAGS}
PARENT_SCOPE
)
target_compile_options(
${_TARGET}
INTERFACE $<$<COMPILE_LANGUAGE:${_LANGUAGE}>:${_FLAGS}>
)
endfunction()
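# Usage sketch (illustrative only; the flag values are arbitrary examples). For
# _LANGUAGE = CXX the function reads the CMake variables CXXFLAGS and CXX_FLAGS and the
# environment variables of the same names, so a configure step such as
#
# CXX_FLAGS="-g -fno-omit-frame-pointer" cmake -B build .
#
# followed by
#
# add_user_flags(rocprofiler-systems-compile-options "CXX")
#
# should append both flags to the CXX compile options of that interface target.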
# ----------------------------------------------------------------------------------------#
# add compiler definition
# ----------------------------------------------------------------------------------------#
function(ROCPROFILER_SYSTEMS_TARGET_COMPILE_DEFINITIONS _TARG _VIS)
foreach(_DEF ${ARGN})
if(NOT "${_DEF}" MATCHES "[A-Za-z_]+=.*" AND "${_DEF}" MATCHES "^ROCPROFSYS_")
set(_DEF "${_DEF}=1")
endif()
target_compile_definitions(${_TARG} ${_VIS} $<$<COMPILE_LANGUAGE:CXX>:${_DEF}>)
if(CMAKE_CUDA_COMPILER_IS_NVIDIA)
target_compile_definitions(
${_TARG}
${_VIS}
$<$<COMPILE_LANGUAGE:CUDA>:${_DEF}>
)
elseif(CMAKE_CUDA_COMPILER_IS_CLANG)
target_compile_definitions(
${_TARG}
${_VIS}
$<$<COMPILE_LANGUAGE:CUDA>:${_DEF}>
)
endif()
endforeach()
endfunction()
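# Usage sketch (illustrative only; example-target, ROCPROFSYS_EXAMPLE_FEATURE and
# EXAMPLE_LEVEL are hypothetical names):
#
# rocprofiler_systems_target_compile_definitions(
#     example-target
#     PRIVATE
#     ROCPROFSYS_EXAMPLE_FEATURE
#     EXAMPLE_LEVEL=2
# )
#
# A ROCPROFSYS_-prefixed name without a value is expanded to NAME=1, so this should
# define ROCPROFSYS_EXAMPLE_FEATURE=1 and EXAMPLE_LEVEL=2. The definitions apply to CXX
# sources (and to CUDA sources when a CUDA compiler was detected), not to C sources.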
# ----------------------------------------------------------------------------------------#
# determine compiler types for each language
# ----------------------------------------------------------------------------------------#
get_property(ENABLED_LANGUAGES GLOBAL PROPERTY ENABLED_LANGUAGES)
foreach(LANG C CXX CUDA)
if(NOT DEFINED CMAKE_${LANG}_COMPILER)
set(CMAKE_${LANG}_COMPILER "")
endif()
if(NOT DEFINED CMAKE_${LANG}_COMPILER_ID)
set(CMAKE_${LANG}_COMPILER_ID "")
endif()
function(SET_COMPILER_VAR VAR _BOOL)
set(CMAKE_${LANG}_COMPILER_IS_${VAR}
${_BOOL}
CACHE INTERNAL
"CMake ${LANG} compiler identification (${VAR})"
FORCE
)
mark_as_advanced(CMAKE_${LANG}_COMPILER_IS_${VAR})
endfunction()
if(
("${LANG}" STREQUAL "C" AND CMAKE_COMPILER_IS_GNUCC)
OR ("${LANG}" STREQUAL "CXX" AND CMAKE_COMPILER_IS_GNUCXX)
)
# GNU compiler
set_compiler_var(GNU 1)
elseif(CMAKE_${LANG}_COMPILER MATCHES "icc.*")
# Intel icc compiler
set_compiler_var(INTEL 1)
set_compiler_var(INTEL_ICC 1)
elseif(CMAKE_${LANG}_COMPILER MATCHES "icpc.*")
# Intel icpc compiler
set_compiler_var(INTEL 1)
set_compiler_var(INTEL_ICPC 1)
elseif(CMAKE_${LANG}_COMPILER_ID MATCHES "AppleClang")
# Clang/LLVM compiler
set_compiler_var(CLANG 1)
set_compiler_var(APPLE_CLANG 1)
elseif(CMAKE_${LANG}_COMPILER_ID MATCHES "Clang")
# Clang/LLVM compiler
set_compiler_var(CLANG 1)
# HIP Clang compiler
if(CMAKE_${LANG}_COMPILER MATCHES "hipcc")
set_compiler_var(HIPCC 1)
endif()
elseif(CMAKE_${LANG}_COMPILER_ID MATCHES "PGI")
# PGI compiler
set_compiler_var(PGI 1)
elseif(CMAKE_${LANG}_COMPILER MATCHES "xlC" AND UNIX)
# IBM xlC compiler
set_compiler_var(XLC 1)
elseif(CMAKE_${LANG}_COMPILER MATCHES "aCC" AND UNIX)
# HP aC++ compiler
set_compiler_var(HP_ACC 1)
elseif(
CMAKE_${LANG}_COMPILER MATCHES "CC"
AND CMAKE_SYSTEM_NAME MATCHES "IRIX"
AND UNIX
)
# IRIX MIPSpro CC Compiler
set_compiler_var(MIPS 1)
elseif(CMAKE_${LANG}_COMPILER_ID MATCHES "Intel")
set_compiler_var(INTEL 1)
set(CTYPE ICC)
if("${LANG}" STREQUAL "CXX")
set(CTYPE ICPC)
endif()
set_compiler_var(INTEL_${CTYPE} 1)
elseif(CMAKE_${LANG}_COMPILER MATCHES "MSVC")
# Windows Visual Studio compiler
set_compiler_var(MSVC 1)
elseif(CMAKE_${LANG}_COMPILER_ID MATCHES "NVIDIA")
# NVCC
set_compiler_var(NVIDIA 1)
endif()
# set other to no
foreach(
TYPE
GNU
INTEL
INTEL_ICC
INTEL_ICPC
APPLE_CLANG
CLANG
PGI
XLC
HP_ACC
MIPS
MSVC
NVIDIA
HIPCC
)
if(NOT DEFINED CMAKE_${LANG}_COMPILER_IS_${TYPE})
set_compiler_var(${TYPE} 0)
endif()
endforeach()
endforeach()
@@ -0,0 +1,347 @@
# configure packaging
function(rocprofiler_systems_parse_release)
if(EXISTS /etc/lsb-release AND NOT IS_DIRECTORY /etc/lsb-release)
file(READ /etc/lsb-release _LSB_RELEASE)
if(_LSB_RELEASE)
string(
REGEX REPLACE
"DISTRIB_ID=(.*)\nDISTRIB_RELEASE=(.*)\nDISTRIB_CODENAME=.*"
"\\1-\\2"
_SYSTEM_NAME
"${_LSB_RELEASE}"
)
endif()
elseif(EXISTS /etc/os-release AND NOT IS_DIRECTORY /etc/os-release)
file(READ /etc/os-release _OS_RELEASE)
if(_OS_RELEASE)
string(REPLACE "\"" "" _OS_RELEASE "${_OS_RELEASE}")
string(REPLACE "-" " " _OS_RELEASE "${_OS_RELEASE}")
string(
REGEX REPLACE
"NAME=.*\nVERSION=([0-9\.]+).*\nID=([a-z]+).*"
"\\2-\\1"
_SYSTEM_NAME
"${_OS_RELEASE}"
)
endif()
endif()
string(TOLOWER "${_SYSTEM_NAME}" _SYSTEM_NAME)
if(NOT _SYSTEM_NAME)
set(_SYSTEM_NAME "${CMAKE_SYSTEM_NAME}")
endif()
set(_SYSTEM_NAME "${_SYSTEM_NAME}" PARENT_SCOPE)
endfunction()
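# Worked example (illustrative only; the file contents are assumed, not taken from a real
# system). Given an /etc/lsb-release that starts with
#
# DISTRIB_ID=Ubuntu
# DISTRIB_RELEASE=20.04
# DISTRIB_CODENAME=focal
#
# the regex above captures "Ubuntu" and "20.04", and after string(TOLOWER) the function
# should export _SYSTEM_NAME as "ubuntu-20.04". If neither file can be parsed,
# _SYSTEM_NAME falls back to CMAKE_SYSTEM_NAME.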
# parse either /etc/lsb-release or /etc/os-release
rocprofiler_systems_parse_release()
if(NOT _SYSTEM_NAME)
set(_SYSTEM_NAME "${CMAKE_SYSTEM_NAME}")
endif()
# Add packaging directives
set(CPACK_PACKAGE_NAME ${PROJECT_NAME})
set(CPACK_PACKAGE_VENDOR "Advanced Micro Devices, Inc.")
set(CPACK_PACKAGE_DESCRIPTION_SUMMARY
"Runtime instrumentation and binary rewriting for Perfetto via Dyninst"
)
set(CPACK_PACKAGE_VERSION_MAJOR "${PROJECT_VERSION_MAJOR}")
set(CPACK_PACKAGE_VERSION_MINOR "${PROJECT_VERSION_MINOR}")
set(CPACK_PACKAGE_VERSION_PATCH "${PROJECT_VERSION_PATCH}")
set(CPACK_PACKAGE_CONTACT "https://github.com/ROCm/rocprofiler-systems")
set(CPACK_RESOURCE_FILE_LICENSE "${PROJECT_SOURCE_DIR}/LICENSE")
set(CPACK_INCLUDE_TOPLEVEL_DIRECTORY OFF)
# For handling the project rebranding from "omnitrace" to "rocprofiler-systems"
set(OMNITRACE_PACKAGE_NAME "omnitrace")
set(ROCPROFSYS_CPACK_SYSTEM_NAME
"${_SYSTEM_NAME}"
CACHE STRING
"System name, e.g. Linux or Ubuntu-20.04"
)
set(ROCPROFSYS_CPACK_PACKAGE_SUFFIX "")
if(ROCPROFSYS_USE_ROCM)
set(ROCPROFSYS_CPACK_PACKAGE_SUFFIX
"${ROCPROFSYS_CPACK_PACKAGE_SUFFIX}-ROCm-${ROCmVersion_NUMERIC_VERSION}"
)
endif()
if(ROCPROFSYS_USE_PAPI)
set(ROCPROFSYS_CPACK_PACKAGE_SUFFIX "${ROCPROFSYS_CPACK_PACKAGE_SUFFIX}-PAPI")
endif()
if(ROCPROFSYS_USE_OMPT)
set(ROCPROFSYS_CPACK_PACKAGE_SUFFIX "${ROCPROFSYS_CPACK_PACKAGE_SUFFIX}-OMPT")
endif()
if(ROCPROFSYS_USE_MPI)
set(VALID_MPI_IMPLS "mpich" "openmpi")
if("${MPI_C_COMPILER_INCLUDE_DIRS};${MPI_C_HEADER_DIR}" MATCHES "openmpi")
set(ROCPROFSYS_MPI_IMPL "openmpi")
elseif("${MPI_C_COMPILER_INCLUDE_DIRS};${MPI_C_HEADER_DIR}" MATCHES "mpich")
set(ROCPROFSYS_MPI_IMPL "mpich")
else()
message(
WARNING
"MPI implementation could not be determined. Please set ROCPROFSYS_MPI_IMPL to one of the following for CPack: ${VALID_MPI_IMPLS}"
)
endif()
if(ROCPROFSYS_MPI_IMPL AND NOT "${ROCPROFSYS_MPI_IMPL}" IN_LIST VALID_MPI_IMPLS)
message(
SEND_ERROR
"Invalid ROCPROFSYS_MPI_IMPL (${ROCPROFSYS_MPI_IMPL}). Should be one of: ${VALID_MPI_IMPLS}"
)
else()
rocprofiler_systems_add_feature(ROCPROFSYS_MPI_IMPL
"MPI implementation for CPack DEBIAN depends"
)
endif()
if("${ROCPROFSYS_MPI_IMPL}" STREQUAL "openmpi")
set(ROCPROFSYS_MPI_IMPL_UPPER "OpenMPI")
elseif("${ROCPROFSYS_MPI_IMPL}" STREQUAL "mpich")
set(ROCPROFSYS_MPI_IMPL_UPPER "MPICH")
else()
set(ROCPROFSYS_MPI_IMPL_UPPER "MPI")
endif()
set(ROCPROFSYS_CPACK_PACKAGE_SUFFIX
"${ROCPROFSYS_CPACK_PACKAGE_SUFFIX}-${ROCPROFSYS_MPI_IMPL_UPPER}"
)
endif()
if(ROCPROFSYS_USE_PYTHON)
set(_ROCPROFSYS_PYTHON_NAME "Python3")
foreach(_VER ${ROCPROFSYS_PYTHON_VERSIONS})
if("${_VER}" VERSION_LESS 3.0.0)
set(_ROCPROFSYS_PYTHON_NAME "Python")
endif()
endforeach()
set(ROCPROFSYS_CPACK_PACKAGE_SUFFIX
"${ROCPROFSYS_CPACK_PACKAGE_SUFFIX}-${_ROCPROFSYS_PYTHON_NAME}"
)
endif()
set(CPACK_PACKAGE_FILE_NAME
"${CPACK_PACKAGE_NAME}-${ROCPROFSYS_VERSION}-${ROCPROFSYS_CPACK_SYSTEM_NAME}${ROCPROFSYS_CPACK_PACKAGE_SUFFIX}"
)
if(DEFINED ENV{CPACK_PACKAGE_FILE_NAME})
set(CPACK_PACKAGE_FILE_NAME $ENV{CPACK_PACKAGE_FILE_NAME})
endif()
set(ROCPROFSYS_PACKAGE_FILE_NAME
${CPACK_PACKAGE_NAME}-${ROCPROFSYS_VERSION}-${ROCPROFSYS_CPACK_SYSTEM_NAME}${ROCPROFSYS_CPACK_PACKAGE_SUFFIX}
)
rocprofiler_systems_add_feature(ROCPROFSYS_PACKAGE_FILE_NAME "CPack filename")
if(ROCM_DEP_ROCMCORE OR ROCPROFILER_DEP_ROCMCORE)
set(_DEBIAN_PACKAGE_DEPENDS "rocm-core")
set(_RPM_PACKAGE_REQUIRES "rocm-core")
else()
set(_DEBIAN_PACKAGE_DEPENDS "")
set(_RPM_PACKAGE_REQUIRES "")
endif()
# -------------------------------------------------------------------------------------- #
#
# Debian package specific variables
#
# -------------------------------------------------------------------------------------- #
set(CPACK_DEBIAN_PACKAGE_HOMEPAGE "https://github.com/ROCm/rocprofiler-systems")
set(CPACK_DEBIAN_PACKAGE_RELEASE
"${ROCPROFSYS_CPACK_SYSTEM_NAME}${ROCPROFSYS_CPACK_PACKAGE_SUFFIX}"
)
string(
REGEX REPLACE
"([a-zA-Z])-([0-9])"
"\\1\\2"
CPACK_DEBIAN_PACKAGE_RELEASE
"${CPACK_DEBIAN_PACKAGE_RELEASE}"
)
string(REPLACE "-" "~" CPACK_DEBIAN_PACKAGE_RELEASE "${CPACK_DEBIAN_PACKAGE_RELEASE}")
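# Worked example (illustrative only; the input value is assumed). Starting from
#
# ubuntu-20.04-ROCm-60200-PAPI
#
# the regex removes each "-" that sits between a letter and a digit, giving
# "ubuntu20.04-ROCm60200-PAPI", and the remaining "-" characters are then replaced with
# "~", giving "ubuntu20.04~ROCm60200~PAPI". The RPM release string further below is
# derived with the same two steps.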
if(ROCPROFSYS_USE_PAPI AND NOT ROCPROFSYS_BUILD_PAPI)
list(APPEND _DEBIAN_PACKAGE_DEPENDS libpapi-dev libpfm4)
endif()
if(NOT ROCPROFSYS_BUILD_DYNINST)
if(NOT ROCPROFSYS_BUILD_BOOST)
foreach(
_BOOST_COMPONENT
atomic
system
thread
date-time
filesystem
timer
)
list(
APPEND
_DEBIAN_PACKAGE_DEPENDS
"libboost-${_BOOST_COMPONENT}-dev (>= 1.67.0)"
)
endforeach()
endif()
if(NOT ROCPROFSYS_BUILD_TBB)
list(APPEND _DEBIAN_PACKAGE_DEPENDS "libtbb-dev (>= 2018.6)")
endif()
if(NOT ROCPROFSYS_BUILD_LIBIBERTY)
list(APPEND _DEBIAN_PACKAGE_DEPENDS "libiberty-dev (>= 20170913)")
endif()
endif()
if(ROCmVersion_FOUND)
set(_AMD_SMI_SUFFIX
" (>= ${ROCmVersion_MAJOR_VERSION}.0.0.${ROCmVersion_NUMERIC_VERSION})"
)
endif()
if(ROCPROFSYS_USE_ROCM)
list(APPEND _DEBIAN_PACKAGE_DEPENDS "amd-smi-lib${_AMD_SMI_SUFFIX}")
list(APPEND _DEBIAN_PACKAGE_DEPENDS "rocprofiler-sdk (>= ${rocprofiler-sdk_VERSION})")
endif()
if(ROCPROFSYS_USE_MPI)
if("${ROCPROFSYS_MPI_IMPL}" STREQUAL "openmpi")
list(APPEND _DEBIAN_PACKAGE_DEPENDS "libopenmpi-dev")
elseif("${ROCPROFSYS_MPI_IMPL}" STREQUAL "mpich")
list(APPEND _DEBIAN_PACKAGE_DEPENDS "libmpich-dev")
endif()
endif()
if(ROCPROFSYS_BUILD_TESTING)
list(APPEND _DEBIAN_PACKAGE_DEPENDS "rocdecode-test")
list(APPEND _DEBIAN_PACKAGE_DEPENDS "rocjpeg-test")
endif()
string(REPLACE ";" ", " _DEBIAN_PACKAGE_DEPENDS "${_DEBIAN_PACKAGE_DEPENDS}")
set(CPACK_DEBIAN_PACKAGE_DEPENDS
"${_DEBIAN_PACKAGE_DEPENDS}"
CACHE STRING
"Debian package dependencies"
FORCE
)
set(CPACK_DEBIAN_FILE_NAME "DEB-DEFAULT")
set(CPACK_DEBIAN_PACKAGE_SHLIBDEPS ON)
# Handle the project rebranding from "omnitrace" to "rocprofiler-systems"
set(CPACK_DEBIAN_PACKAGE_PROVIDES ${OMNITRACE_PACKAGE_NAME})
set(CPACK_DEBIAN_PACKAGE_REPLACES ${OMNITRACE_PACKAGE_NAME})
set(CPACK_DEBIAN_PACKAGE_BREAKS ${OMNITRACE_PACKAGE_NAME})
# -------------------------------------------------------------------------------------- #
#
# RPM package specific variables
#
# -------------------------------------------------------------------------------------- #
if(DEFINED CPACK_PACKAGING_INSTALL_PREFIX)
set(CPACK_RPM_EXCLUDE_FROM_AUTO_FILELIST_ADDITION "${CPACK_PACKAGING_INSTALL_PREFIX}")
endif()
set(CPACK_RPM_PACKAGE_RELEASE
"${ROCPROFSYS_CPACK_SYSTEM_NAME}${ROCPROFSYS_CPACK_PACKAGE_SUFFIX}"
)
string(
REGEX REPLACE
"([a-zA-Z])-([0-9])"
"\\1\\2"
CPACK_RPM_PACKAGE_RELEASE
"${CPACK_RPM_PACKAGE_RELEASE}"
)
string(REPLACE "-" "~" CPACK_RPM_PACKAGE_RELEASE "${CPACK_RPM_PACKAGE_RELEASE}")
# Handle the project rebranding from "omnitrace" to "rocprofiler-systems"
set(CPACK_RPM_PACKAGE_OBSOLETES "${OMNITRACE_PACKAGE_NAME} <= 1.13.0")
set(CPACK_RPM_PACKAGE_CONFLICTS ${OMNITRACE_PACKAGE_NAME})
set(_RPM_PACKAGE_PROVIDES ${OMNITRACE_PACKAGE_NAME})
if(ROCPROFSYS_BUILD_LIBUNWIND)
list(APPEND _RPM_PACKAGE_PROVIDES "libunwind.so.99()(64bit)")
list(APPEND _RPM_PACKAGE_PROVIDES "libunwind-x86_64.so.99()(64bit)")
list(APPEND _RPM_PACKAGE_PROVIDES "libunwind-setjmp.so.0()(64bit)")
list(APPEND _RPM_PACKAGE_PROVIDES "libunwind-ptrace.so.0()(64bit)")
endif()
string(REPLACE ";" ", " CPACK_RPM_PACKAGE_PROVIDES "${_RPM_PACKAGE_PROVIDES}")
set(CPACK_RPM_PACKAGE_PROVIDES
"${CPACK_RPM_PACKAGE_PROVIDES}"
CACHE STRING
"RPM package provides"
FORCE
)
if(ROCPROFSYS_USE_MPI)
if("${ROCPROFSYS_MPI_IMPL}" STREQUAL "openmpi")
list(APPEND _RPM_PACKAGE_REQUIRES "libopenmpi-devel")
elseif("${ROCPROFSYS_MPI_IMPL}" STREQUAL "mpich")
list(APPEND _RPM_PACKAGE_REQUIRES "libmpich-devel")
endif()
endif()
if(ROCPROFSYS_USE_ROCM)
if(ROCPROFSYS_BUILD_TESTING)
list(APPEND _RPM_PACKAGE_REQUIRES "rocdecode-test")
list(APPEND _RPM_PACKAGE_REQUIRES "rocjpeg-test")
endif()
endif()
string(REPLACE ";" ", " _RPM_PACKAGE_REQUIRES "${_RPM_PACKAGE_REQUIRES}")
set(CPACK_RPM_PACKAGE_REQUIRES
"${_RPM_PACKAGE_REQUIRES}"
CACHE STRING
"RPM package requires"
FORCE
)
set(CPACK_RPM_SPEC_MORE_DEFINE "%undefine __brp_mangle_shebangs")
set(CPACK_RPM_PACKAGE_LICENSE "MIT")
set(CPACK_RPM_FILE_NAME "RPM-DEFAULT")
set(CPACK_RPM_PACKAGE_RELEASE_DIST ON)
set(CPACK_RPM_PACKAGE_AUTOPROV ON)
set(CPACK_RPM_PACKAGE_AUTOREQ ON)
# -------------------------------------------------------------------------------------- #
#
# Prepare final CPACK parameters
#
# -------------------------------------------------------------------------------------- #
set(CPACK_PACKAGE_VERSION
"${CPACK_PACKAGE_VERSION_MAJOR}.${CPACK_PACKAGE_VERSION_MINOR}.${CPACK_PACKAGE_VERSION_PATCH}"
)
if(DEFINED ENV{ROCM_LIBPATCH_VERSION})
set(CPACK_PACKAGE_VERSION "${CPACK_PACKAGE_VERSION}.$ENV{ROCM_LIBPATCH_VERSION}")
endif()
if(DEFINED ENV{CPACK_DEBIAN_PACKAGE_RELEASE})
set(CPACK_DEBIAN_PACKAGE_RELEASE $ENV{CPACK_DEBIAN_PACKAGE_RELEASE})
endif()
if(DEFINED ENV{CPACK_RPM_PACKAGE_RELEASE})
set(CPACK_RPM_PACKAGE_RELEASE $ENV{CPACK_RPM_PACKAGE_RELEASE})
endif()
rocprofiler_systems_add_feature(CPACK_PACKAGE_NAME "Package name")
rocprofiler_systems_add_feature(CPACK_PACKAGE_VERSION "Package version")
rocprofiler_systems_add_feature(CPACK_PACKAGING_INSTALL_PREFIX
"Package installation prefix"
)
rocprofiler_systems_add_feature(CPACK_DEBIAN_FILE_NAME "Debian file name")
rocprofiler_systems_add_feature(CPACK_DEBIAN_PACKAGE_RELEASE
"Debian package release version"
)
rocprofiler_systems_add_feature(CPACK_DEBIAN_PACKAGE_DEPENDS
"Debian package dependencies"
)
rocprofiler_systems_add_feature(CPACK_DEBIAN_PACKAGE_SHLIBDEPS
"Debian package shared library dependencies"
)
rocprofiler_systems_add_feature(CPACK_RPM_FILE_NAME "RPM file name")
rocprofiler_systems_add_feature(CPACK_RPM_PACKAGE_RELEASE "RPM package release version")
rocprofiler_systems_add_feature(CPACK_RPM_PACKAGE_AUTOREQ
"RPM package auto generate requires"
)
rocprofiler_systems_add_feature(CPACK_RPM_PACKAGE_AUTOPROV
"RPM package auto generate provides"
)
rocprofiler_systems_add_feature(CPACK_RPM_PACKAGE_REQUIRES "RPM package requires")
rocprofiler_systems_add_feature(CPACK_RPM_PACKAGE_PROVIDES "RPM package provides")
include(CPack)
@@ -0,0 +1,86 @@
# include guard
include_guard(GLOBAL)
include(CMakePackageConfigHelpers)
set(CMAKE_INSTALL_DEFAULT_COMPONENT_NAME config)
install(
EXPORT rocprofiler-systems-library-targets
FILE ${PROJECT_NAME}-library-targets.cmake
NAMESPACE rocprofiler-systems::
DESTINATION ${CMAKE_INSTALL_LIBDIR}/cmake/${PROJECT_NAME}
)
# ------------------------------------------------------------------------------#
# install tree
#
set(PROJECT_INSTALL_DIR ${CMAKE_INSTALL_PREFIX})
set(INCLUDE_INSTALL_DIR ${CMAKE_INSTALL_INCLUDEDIR})
set(LIB_INSTALL_DIR ${CMAKE_INSTALL_LIBDIR})
set(PROJECT_BUILD_TARGETS user)
configure_package_config_file(
${PROJECT_SOURCE_DIR}/cmake/Templates/rocprof-sys-config.cmake.in
${PROJECT_BINARY_DIR}/${CMAKE_INSTALL_LIBDIR}/cmake/${PROJECT_NAME}/${PROJECT_NAME}-config.cmake
INSTALL_DESTINATION ${CMAKE_INSTALL_LIBDIR}/cmake/${PROJECT_NAME}
INSTALL_PREFIX ${CMAKE_INSTALL_PREFIX}
PATH_VARS PROJECT_INSTALL_DIR INCLUDE_INSTALL_DIR LIB_INSTALL_DIR
)
write_basic_package_version_file(
${PROJECT_BINARY_DIR}/${CMAKE_INSTALL_LIBDIR}/cmake/${PROJECT_NAME}/${PROJECT_NAME}-version.cmake
VERSION ${PROJECT_VERSION}
COMPATIBILITY SameMinorVersion
)
install(
FILES
${PROJECT_BINARY_DIR}/${CMAKE_INSTALL_LIBDIR}/cmake/${PROJECT_NAME}/${PROJECT_NAME}-config.cmake
${PROJECT_BINARY_DIR}/${CMAKE_INSTALL_LIBDIR}/cmake/${PROJECT_NAME}/${PROJECT_NAME}-version.cmake
DESTINATION ${CMAKE_INSTALL_LIBDIR}/cmake/${PROJECT_NAME}
OPTIONAL
)
export(PACKAGE ${PROJECT_NAME})
# ------------------------------------------------------------------------------#
# install the validate-causal-json python script as a utility
#
configure_file(
${PROJECT_SOURCE_DIR}/tests/validate-causal-json.py
${PROJECT_BINARY_DIR}/${CMAKE_INSTALL_BINDIR}/rocprof-sys-causal-print
COPYONLY
)
install(
PROGRAMS ${PROJECT_BINARY_DIR}/${CMAKE_INSTALL_BINDIR}/rocprof-sys-causal-print
DESTINATION ${CMAKE_INSTALL_LIBEXECDIR}/${PROJECT_NAME}
)
# ------------------------------------------------------------------------------#
# build tree
#
set(_BUILDTREE_EXPORT_DIR
"${PROJECT_BINARY_DIR}/${CMAKE_INSTALL_LIBDIR}/cmake/${PROJECT_NAME}"
)
if(NOT EXISTS "${PROJECT_BINARY_DIR}/${CMAKE_INSTALL_LIBDIR}")
file(MAKE_DIRECTORY "${PROJECT_BINARY_DIR}/${CMAKE_INSTALL_LIBDIR}")
endif()
if(NOT EXISTS "${_BUILDTREE_EXPORT_DIR}")
file(MAKE_DIRECTORY "${_BUILDTREE_EXPORT_DIR}")
endif()
if(NOT EXISTS "${_BUILDTREE_EXPORT_DIR}/${PROJECT_NAME}-library-targets.cmake")
file(TOUCH "${_BUILDTREE_EXPORT_DIR}/${PROJECT_NAME}-library-targets.cmake")
endif()
export(
EXPORT ${PROJECT_NAME}-library-targets
NAMESPACE rocprofiler-systems::
FILE "${_BUILDTREE_EXPORT_DIR}/${PROJECT_NAME}-library-targets.cmake"
)
set(${PROJECT_NAME}_DIR "${_BUILDTREE_EXPORT_DIR}" CACHE PATH "${PROJECT_NAME}" FORCE)
@@ -0,0 +1,437 @@
# ========================================================================================================
# Boost.cmake
#
# Configure Boost for Dyninst
#
# ----------------------------------------
#
# Accepts the following CMake variables
#
# Boost_ROOT_DIR           - Hint directory that contains the Boost installation
# PATH_BOOST               - Alias for Boost_ROOT_DIR
# Boost_MIN_VERSION        - Minimum acceptable version of Boost
# Boost_USE_MULTITHREADED  - Use the multithreaded version of Boost
# Boost_USE_STATIC_RUNTIME - Use libraries linked statically to the C++ runtime
#
# Options inherited from Modules/FindBoost.cmake that may be useful
#
# BOOST_INCLUDEDIR - Hint directory that contains the Boost header files
# BOOST_LIBRARYDIR - Hint directory that contains the Boost library files
#
# Advanced options:
#
# Boost_DEBUG - Enable debug output from FindBoost
# Boost_NO_SYSTEM_PATHS - Disable searching in locations not specified by hint variables
#
# Exports the following CMake cache variables
#
# Boost_ROOT_DIR - Computed base directory of the Boost installation
# Boost_INCLUDE_DIRS - Boost include directories
# Boost_INCLUDE_DIR - Alias for Boost_INCLUDE_DIRS
# Boost_LIBRARY_DIRS - Link directories for Boost libraries
# Boost_DEFINES - Boost compiler definitions
# Boost_LIBRARIES - Boost library files
# Boost_<C>_LIBRARY_RELEASE - Release libraries to link for component <C> (<C> is upper-case)
# Boost_<C>_LIBRARY_DEBUG - Debug libraries to link for component <C>
# Boost_THREAD_LIBRARY - The filename of the Boost thread library
# Boost_USE_MULTITHREADED - Use the multithreaded version of Boost
# Boost_USE_STATIC_RUNTIME - Use libraries linked statically to the C++ runtime
#
# NOTE: The exported Boost_ROOT_DIR can be different from the value provided by the user
# in the case that it is determined to build Boost from source. In such a case,
# Boost_ROOT_DIR will contain the directory of the from-source installation.
#
# See Modules/FindBoost.cmake for additional input and exported variables
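#
# Example invocation (a sketch only; /opt/boost is a hypothetical install path and
# "build" is an arbitrary build directory):
#
#   cmake -S . -B build -DBoost_ROOT_DIR=/opt/boost -DBoost_MIN_VERSION=1.70.0
#
# Pass either Boost_ROOT_DIR or PATH_BOOST, not both, and keep Boost_MIN_VERSION at
# or above the built-in minimum set below. These hints appear to matter only when
# the initial find_package(Boost) call below does not already locate Boost.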
#
# ========================================================================================================
include_guard(GLOBAL)
if(NOT BUILD_BOOST)
find_package(Boost)
endif()
if(Boost_FOUND)
return()
endif()
# Need at least Boost-1.67 because of deprecated headers
set(_boost_min_version 1.67.0)
# Provide a default, if the user didn't specify
set(Boost_MIN_VERSION ${_boost_min_version} CACHE STRING "Minimum Boost version")
# Enforce minimum version
if(${Boost_MIN_VERSION} VERSION_LESS ${_boost_min_version})
rocprofiler_systems_message(
FATAL_ERROR
"Requested Boost-${Boost_MIN_VERSION} is less than minimum supported version (${_boost_min_version})"
)
endif()
# -------------- RUNTIME CONFIGURATION ----------------------------------------
# Use the multithreaded version of Boost.
# NB: This _must_ be a cache variable as it controls the tagged layout of Boost
# library names
set(Boost_USE_MULTITHREADED ON CACHE BOOL "Enable multithreaded Boost libraries")
# Don't use libraries linked statically to the C++ runtime.
# NB: This _must_ be a cache variable as it controls the tagged layout of Boost
# library names
set(Boost_USE_STATIC_RUNTIME
OFF
CACHE BOOL
"Enable usage of libraries statically linked to C++ runtime"
)
# If using multithreaded Boost, make sure Threads has been initialized
if(Boost_USE_MULTITHREADED AND NOT DEFINED CMAKE_THREAD_LIBS_INIT)
find_package(Threads)
endif()
# Enable debug output from FindBoost
set(Boost_DEBUG OFF CACHE BOOL "Enable debug output from FindBoost")
# -------------- PATHS --------------------------------------------------------
# By default, search system paths
set(Boost_NO_SYSTEM_PATHS
OFF
CACHE BOOL
"Disable searching in locations not specified by hint variables"
)
# A sanity check. This must be done _before_ the cache variables are set
if(PATH_BOOST AND Boost_ROOT_DIR)
rocprofiler_systems_message(
FATAL_ERROR
"PATH_BOOST AND Boost_ROOT_DIR both specified. Please provide only one"
)
endif()
# Provide a default root directory
if(NOT PATH_BOOST AND NOT Boost_ROOT_DIR)
set(PATH_BOOST "/usr")
endif()
# Set the default location to look for Boost
set(Boost_ROOT_DIR ${PATH_BOOST} CACHE PATH "Base directory of the Boost installation")
# In FindBoost, Boost_ROOT_DIR is spelled BOOST_ROOT
set(BOOST_ROOT ${Boost_ROOT_DIR})
# -------------- COMPILER DEFINES ---------------------------------------------
set(_boost_defines)
# Disable auto-linking
list(APPEND _boost_defines BOOST_ALL_NO_LIB=1)
# Disable generating serialization code in boost::multi_index
list(APPEND _boost_defines BOOST_MULTI_INDEX_DISABLE_SERIALIZATION)
# There are broken versions of MSVC that won't handle variadic templates correctly
# (despite the C++11 test case passing).
if(MSVC)
list(APPEND _boost_defines BOOST_NO_CXX11_VARIADIC_TEMPLATES)
endif()
set(Boost_DEFINES ${_boost_defines} CACHE STRING "Boost compiler defines")
add_compile_definitions(${Boost_DEFINES})
# -------------- INTERNALS ----------------------------------------------------
# Disable Boost's own CMake as it's known to be buggy.
# NB: This should not be a cache variable
set(Boost_NO_BOOST_CMAKE ON)
# The required Boost library components.
# NB: These are just the ones that require compilation/linking.
# This should _not_ be a cache variable
set(_boost_components
atomic
chrono
date_time
filesystem
system
thread
timer
)
if(NOT BUILD_BOOST)
find_package(Boost ${Boost_MIN_VERSION} QUIET COMPONENTS ${_boost_components})
endif()
# -------------- SOURCE BUILD -------------------------------------------------
if(Boost_FOUND AND NOT BUILD_BOOST)
# Force the cache entries to be updated. Normally, these would not be exported.
# However, we need them in the Testsuite
set(Boost_INCLUDE_DIRS
${Boost_INCLUDE_DIRS}
CACHE PATH
"Boost include directory"
FORCE
)
set(Boost_LIBRARY_DIRS
${Boost_LIBRARY_DIRS}
CACHE PATH
"Boost library directory"
FORCE
)
set(Boost_INCLUDE_DIR ${Boost_INCLUDE_DIR} CACHE PATH "Boost include directory" FORCE)
elseif(NOT Boost_FOUND AND STERILE_BUILD)
rocprofiler_systems_message(
FATAL_ERROR "Boost not found and cannot be downloaded because build is sterile."
)
elseif(NOT BUILD_BOOST)
rocprofiler_systems_message(
FATAL_ERROR
"Boost was not found. Either configure cmake to find Boost properly or set BUILD_BOOST=ON to download and build"
)
else()
rocprofiler_systems_add_option(BOOST_LINK_STATIC "Link to boost libraries statically"
ON
)
# If we didn't find a suitable version on the system, then download one from the web
rocprofiler_systems_add_cache_option(
ROCPROFSYS_BOOST_DOWNLOAD_VERSION "Version of boost to download and install"
STRING "1.79.0"
)
# Make sure the version to download is not older than the minimum supported
# version.
if(${ROCPROFSYS_BOOST_DOWNLOAD_VERSION} VERSION_LESS ${Boost_MIN_VERSION})
rocprofiler_systems_message(
FATAL_ERROR
"Boost download version is set to ${ROCPROFSYS_BOOST_DOWNLOAD_VERSION} but Boost minimum version is set to ${Boost_MIN_VERSION}"
)
endif()
rocprofiler_systems_message(
STATUS
"Attempting to build BOOST(${ROCPROFSYS_BOOST_DOWNLOAD_VERSION}) as external project"
)
if(Boost_USE_MULTITHREADED)
set(_boost_threading multi)
else()
set(_boost_threading single)
endif()
if(Boost_USE_STATIC_RUNTIME)
set(_boost_runtime_link static)
else()
set(_boost_runtime_link shared)
endif()
# Change the base directory
set(Boost_ROOT_DIR
${TPL_STAGING_PREFIX}/boost
CACHE PATH
"Base directory of the Boost installation"
FORCE
)
# Update the exported variables
set(Boost_INCLUDE_DIRS
"$<BUILD_INTERFACE:${Boost_ROOT_DIR}/include>;$<INSTALL_INTERFACE:${CMAKE_INSTALL_LIBDIR}/${TPL_INSTALL_INCLUDE_DIR}>"
CACHE PATH
"Boost include directory"
FORCE
)
set(Boost_LIBRARY_DIRS
"$<BUILD_INTERFACE:${Boost_ROOT_DIR}/lib>;$<INSTALL_INTERFACE:${CMAKE_INSTALL_LIBDIR}/${TPL_INSTALL_LIB_DIR}>"
CACHE PATH
"Boost library directory"
FORCE
)
set(Boost_INCLUDE_DIR
${Boost_INCLUDE_DIRS}
CACHE PATH
"Boost include directory"
FORCE
)
file(MAKE_DIRECTORY ${Boost_ROOT_DIR}/include)
file(MAKE_DIRECTORY ${Boost_ROOT_DIR}/lib)
if(BOOST_LINK_STATIC)
set(_BOOST_LINK static)
else()
set(_BOOST_LINK shared)
endif()
set(BOOST_ARGS
link=${_BOOST_LINK}
runtime-link=${_boost_runtime_link}
threading=${_boost_threading}
)
if(WIN32)
# NB: We need to build both debug/release on windows as we don't use
# CMAKE_BUILD_TYPE
set(BOOST_BOOTSTRAP call bootstrap.bat)
set(BOOST_BUILD ".\\b2")
if(CMAKE_SIZEOF_VOID_P STREQUAL "8")
list(APPEND BOOST_ARGS address-model=64)
endif()
else()
set(BOOST_BOOTSTRAP "./bootstrap.sh")
set(BOOST_BUILD "./b2")
if(CMAKE_BUILD_TYPE MATCHES "^(Debug|DEBUG)$")
list(APPEND BOOST_ARGS variant=debug)
else()
list(APPEND BOOST_ARGS variant=release)
endif()
endif()
# Join the component names together to pass to --with-libraries during bootstrap
set(_boost_lib_names "headers,")
foreach(c ${_boost_components})
# list(JOIN ...) is in cmake 3.12
string(CONCAT _boost_lib_names "${_boost_lib_names}${c},")
endforeach()
if(CMAKE_CXX_COMPILER_ID MATCHES "(GNU|Clang|Intel)")
list(APPEND BOOST_ARGS cflags=-fPIC cxxflags=-fPIC)
endif()
string(REPLACE "." "_" _boost_download_filename ${ROCPROFSYS_BOOST_DOWNLOAD_VERSION})
# zip is subject to locales on Unix
set(_boost_download_ext "zip")
if(UNIX)
set(_boost_download_ext "tar.gz")
endif()
set(_LIB_SUFFIX "${CMAKE_SHARED_LIBRARY_SUFFIX}")
if(BOOST_LINK_STATIC)
set(_LIB_SUFFIX "${CMAKE_STATIC_LIBRARY_SUFFIX}")
endif()
if(WIN32)
# We need to specify different library names for debug vs release
set(Boost_LIBRARIES "")
foreach(c ${_boost_components})
list(APPEND Boost_LIBRARIES "optimized libboost_${c} debug libboost_${c}-gd ")
list(
APPEND
_boost_build_byproducts
"${Boost_ROOT_DIR}/lib/libboost_${c}${_LIB_SUFFIX}"
)
set(Boost_${c}_LIBRARY
$<BUILD_INTERFACE:${Boost_ROOT_DIR}/lib/libboost_${c}${_LIB_SUFFIX}>
$<INSTALL_INTERFACE:boost_${c}>
)
set(Boost_${c}_LIBRARY_DEBUG
$<BUILD_INTERFACE:${Boost_ROOT_DIR}/lib/libboost_${c}${_LIB_SUFFIX}>
$<INSTALL_INTERFACE:libboost_${c}-gd>
)
# Also export cache variables for the file location of each library
string(TOUPPER ${c} _basename)
set(Boost_${_basename}_LIBRARY_RELEASE
"${Boost_${c}_LIBRARY}"
CACHE FILEPATH
""
FORCE
)
set(Boost_${_basename}_LIBRARY_DEBUG
"${Boost_${c}_LIBRARY_DEBUG}"
CACHE FILEPATH
""
FORCE
)
endforeach()
else()
# Transform the component names into the library filenames e.g., system ->
# boost_system
set(Boost_LIBRARIES "")
foreach(c ${_boost_components})
set(Boost_${c}_LIBRARY
$<BUILD_INTERFACE:${Boost_ROOT_DIR}/lib/libboost_${c}${_LIB_SUFFIX}>
$<INSTALL_INTERFACE:$<INSTALL_PREFIX>/${INSTALL_LIB_DIR}/${TPL_INSTALL_LIB_DIR}/libboost_${c}${_LIB_SUFFIX}>
)
list(
APPEND
_boost_build_byproducts
"${Boost_ROOT_DIR}/lib/libboost_${c}${_LIB_SUFFIX}"
)
list(APPEND Boost_LIBRARIES "${Boost_${c}_LIBRARY}")
# Also export cache variables for the file location of each library
string(TOUPPER ${c} _basename)
set(Boost_${_basename}_LIBRARY_RELEASE
"${Boost_${c}_LIBRARY}"
CACHE FILEPATH
""
FORCE
)
set(Boost_${_basename}_LIBRARY_DEBUG
"${Boost_${c}_LIBRARY}"
CACHE FILEPATH
""
FORCE
)
endforeach()
endif()
include(ExternalProject)
ExternalProject_Add(
rocprofiler-systems-boost-build
PREFIX ${Boost_ROOT_DIR}
GIT_REPOSITORY https://github.com/boostorg/boost.git
GIT_TAG boost-${ROCPROFSYS_BOOST_DOWNLOAD_VERSION}
BUILD_IN_SOURCE 1
CONFIGURE_COMMAND
${BOOST_BOOTSTRAP} --prefix=${Boost_ROOT_DIR}
--with-libraries=${_boost_lib_names}
BUILD_COMMAND
${BOOST_BUILD} --ignore-site-config --prefix=${Boost_ROOT_DIR} -j2
${BOOST_ARGS} -d0 install
BUILD_BYPRODUCTS ${_boost_build_byproducts}
INSTALL_COMMAND ""
)
# target for re-executing the installation
add_custom_target(
rocprofiler-systems-boost-install
COMMAND ${BOOST_BUILD} ${BOOST_ARGS} -d0 install
WORKING_DIRECTORY ${Boost_ROOT_DIR}/src/rocprofiler-systems-boost-build
COMMENT "Installing Boost..."
)
endif()
# -------------- EXPORT VARIABLES ---------------------------------------------
# Export Boost_THREAD_LIBRARY
list(FIND _boost_components "thread" _building_threads)
# list(FIND) yields -1 when "thread" is absent, and -1 is truthy in if()
if(Boost_USE_MULTITHREADED AND NOT _building_threads EQUAL -1)
# On Windows, always use the debug version. On Linux, we don't use tagged builds, so
# the debug/release filenames are the same
set(Boost_THREAD_LIBRARY
${Boost_THREAD_LIBRARY_DEBUG}
CACHE FILEPATH
"Boost thread library"
)
endif()
# Add the system thread library
if(Boost_USE_MULTITHREADED)
list(APPEND Boost_LIBRARIES ${CMAKE_THREAD_LIBS_INIT})
endif()
# Export the complete set of libraries
set(Boost_LIBRARIES ${Boost_LIBRARIES} CACHE FILEPATH "Boost library files" FORCE)
target_include_directories(
rocprofiler-systems-boost
SYSTEM
INTERFACE ${Boost_INCLUDE_DIRS}
)
target_compile_definitions(rocprofiler-systems-boost INTERFACE ${Boost_DEFINES})
target_link_directories(rocprofiler-systems-boost INTERFACE ${Boost_LIBRARY_DIRS})
target_link_libraries(rocprofiler-systems-boost INTERFACE ${Boost_LIBRARIES})
rocprofiler_systems_message(STATUS "Boost includes: ${Boost_INCLUDE_DIRS}")
rocprofiler_systems_message(STATUS "Boost library dirs: ${Boost_LIBRARY_DIRS}")
rocprofiler_systems_message(STATUS "Boost thread library: ${Boost_THREAD_LIBRARY}")
rocprofiler_systems_message(STATUS "Boost libraries: ${Boost_LIBRARIES}")
# Just the headers (effectively a simplified Boost::headers target)
add_library(Dyninst::Boost_headers INTERFACE IMPORTED)
target_include_directories(Dyninst::Boost_headers SYSTEM INTERFACE ${Boost_INCLUDE_DIRS})
+243
@@ -0,0 +1,243 @@
# ======================================================================================
# elfutils.cmake
#
# Configure elfutils for Dyninst
#
# ----------------------------------------
#
# Accepts the following CMake variables
#
# ElfUtils_ROOT_DIR - Base directory of the elfutils installation
# ElfUtils_INCLUDEDIR - Hint directory that contains the elfutils header files
# ElfUtils_LIBRARYDIR - Hint directory that contains the elfutils library files
# ElfUtils_MIN_VERSION - Minimum acceptable version of elfutils
#
# Directly exports the following CMake variables
#
# ElfUtils_ROOT_DIR - Computed base directory of the elfutils installation
# ElfUtils_INCLUDE_DIRS - elfutils include directories
# ElfUtils_LIBRARY_DIRS - Link directories for elfutils libraries
# ElfUtils_LIBRARIES - elfutils library files
#
# NOTE: The exported ElfUtils_ROOT_DIR can be different from the value provided by the
# user in the case that it is determined to build elfutils from source. In such a case,
# ElfUtils_ROOT_DIR will contain the directory of the from-source installation.
#
# See Modules/FindLibElf.cmake and Modules/FindLibDwarf.cmake for details
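#
# Example invocation (a sketch only; /opt/elfutils is a hypothetical install path and
# "build" is an arbitrary build directory):
#
#   cmake -S . -B build -DElfUtils_ROOT_DIR=/opt/elfutils -DElfUtils_MIN_VERSION=0.186
#
# ElfUtils_INCLUDEDIR and ElfUtils_LIBRARYDIR default to the include/ and lib/
# subdirectories of ElfUtils_ROOT_DIR, so they should only need setting for other
# layouts. ElfUtils_MIN_VERSION must not be lower than the built-in minimum below.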
#
# ======================================================================================
include_guard(GLOBAL)
if(NOT BUILD_ELFUTILS)
find_package(Elfutils)
endif()
if(LibElf_FOUND AND LibDwarf_FOUND AND NOT ENABLE_DEBUGINFOD)
return()
endif()
if(NOT UNIX)
return()
endif()
# Minimum acceptable version of elfutils.
# NB: We need >=0.178 because libdw isn't thread-safe before then
set(_min_version 0.178)
set(ElfUtils_MIN_VERSION
${_min_version}
CACHE STRING
"Minimum acceptable elfutils version"
)
if(${ElfUtils_MIN_VERSION} VERSION_LESS ${_min_version})
rocprofiler_systems_message(
FATAL_ERROR
"Requested version ${ElfUtils_MIN_VERSION} is less than minimum supported version (${_min_version})"
)
endif()
# -------------- PATHS --------------------------------------------------------
# Base directory of the elfutils installation
set(ElfUtils_ROOT_DIR "/usr" CACHE PATH "Base directory of the elfutils installation")
# Hint directory that contains the elfutils header files
set(ElfUtils_INCLUDEDIR
"${ElfUtils_ROOT_DIR}/include"
CACHE PATH
"Hint directory that contains the elfutils header files"
)
# Hint directory that contains the elfutils library files
set(ElfUtils_LIBRARYDIR
"${ElfUtils_ROOT_DIR}/lib"
CACHE PATH
"Hint directory that contains the elfutils library files"
)
# libelf/dwarf-specific directory hints
foreach(l LibElf LibDwarf LibDebuginfod)
foreach(d ROOT_DIR INCLUDEDIR LIBRARYDIR)
set(${l}_${d} ${ElfUtils_${d}})
endforeach()
endforeach()
# -------------- PACKAGES------------------------------------------------------
if(NOT BUILD_ELFUTILS)
find_package(LibElf ${ElfUtils_MIN_VERSION})
# Don't search for libdw or libdebuginfod if we didn't find a suitable libelf
if(LibElf_FOUND)
find_package(LibDwarf ${ElfUtils_MIN_VERSION})
if(ENABLE_DEBUGINFOD)
find_package(LibDebuginfod ${ElfUtils_MIN_VERSION} REQUIRED)
endif()
endif()
endif()
# -------------- SOURCE BUILD -------------------------------------------------
if(LibElf_FOUND AND LibDwarf_FOUND AND (NOT ENABLE_DEBUGINFOD OR LibDebuginfod_FOUND))
if(ENABLE_DEBUGINFOD AND LibDebuginfod_FOUND)
set(_eu_root ${ElfUtils_ROOT_DIR})
set(_eu_inc_dirs
${LibElf_INCLUDE_DIRS}
${LibDwarf_INCLUDE_DIRS}
${LibDebuginfod_INCLUDE_DIRS}
)
set(_eu_lib_dirs
${LibElf_LIBRARY_DIRS}
${LibDwarf_LIBRARY_DIRS}
${LibDebuginfod_LIBRARY_DIRS}
)
set(_eu_libs ${LibElf_LIBRARIES} ${LibDwarf_LIBRARIES} ${LibDebuginfod_LIBRARIES})
else()
set(_eu_root ${ElfUtils_ROOT_DIR})
set(_eu_inc_dirs ${LibElf_INCLUDE_DIRS} ${LibDwarf_INCLUDE_DIRS})
set(_eu_lib_dirs ${LibElf_LIBRARY_DIRS} ${LibDwarf_LIBRARY_DIRS})
set(_eu_libs ${LibElf_LIBRARIES} ${LibDwarf_LIBRARIES})
endif()
elseif(NOT (LibElf_FOUND AND LibDwarf_FOUND) AND STERILE_BUILD)
rocprofiler_systems_message(
FATAL_ERROR
"ElfUtils not found and cannot be downloaded because build is sterile."
)
elseif(NOT BUILD_ELFUTILS)
rocprofiler_systems_message(
FATAL_ERROR
"ElfUtils was not found. Either configure cmake to find ElfUtils properly or set BUILD_ELFUTILS=ON to download and build"
)
else()
# If we didn't find a suitable version on the system, then download one from the web
rocprofiler_systems_add_cache_option(
ELFUTILS_DOWNLOAD_VERSION "Version of elfutils to download and install" STRING
"0.188"
)
set(ELFUTILS_DOWNLOAD_VERSION ${ElfUtils_DOWNLOAD_VERSION})
# make sure we are not downloading a version less than minimum
if(${ELFUTILS_DOWNLOAD_VERSION} VERSION_LESS ${ElfUtils_MIN_VERSION})
rocprofiler_systems_message(
FATAL_ERROR
"elfutils download version is set to ${ELFUTILS_DOWNLOAD_VERSION} but elfutils minimum version is set to ${ElfUtils_MIN_VERSION}"
)
endif()
rocprofiler_systems_message(STATUS "${ElfUtils_ERROR_REASON}")
rocprofiler_systems_message(
STATUS
"Attempting to build elfutils(${ELFUTILS_DOWNLOAD_VERSION}) as external project"
)
if(
NOT (${CMAKE_CXX_COMPILER_ID} STREQUAL "GNU")
OR NOT (${CMAKE_C_COMPILER_ID} STREQUAL "GNU")
)
rocprofiler_systems_message(FATAL_ERROR
"ElfUtils will only build with the GNU compiler"
)
endif()
set(_eu_root ${TPL_STAGING_PREFIX}/elfutils)
set(_eu_inc_dirs $<BUILD_INTERFACE:${_eu_root}/include>)
set(_eu_lib_dirs $<BUILD_INTERFACE:${_eu_root}/lib>)
set(_eu_libs
$<BUILD_INTERFACE:${_eu_root}/lib/libdw${CMAKE_SHARED_LIBRARY_SUFFIX}>
$<BUILD_INTERFACE:${_eu_root}/lib/libelf${CMAKE_SHARED_LIBRARY_SUFFIX}>
)
set(_eu_build_byproducts
"${_eu_root}/lib/libdw${CMAKE_SHARED_LIBRARY_SUFFIX}"
"${_eu_root}/lib/libelf${CMAKE_SHARED_LIBRARY_SUFFIX}"
)
file(MAKE_DIRECTORY "${_eu_root}/lib")
file(MAKE_DIRECTORY "${_eu_root}/include")
include(ExternalProject)
ExternalProject_Add(
rocprofiler-systems-elfutils-build
PREFIX ${_eu_root}
URL
${ElfUtils_DOWNLOAD_URL}
"https://sourceware.org/elfutils/ftp/${ELFUTILS_DOWNLOAD_VERSION}/elfutils-${ELFUTILS_DOWNLOAD_VERSION}.tar.bz2"
"https://mirrors.kernel.org/sourceware/elfutils/${ELFUTILS_DOWNLOAD_VERSION}/elfutils-${ELFUTILS_DOWNLOAD_VERSION}.tar.bz2"
BUILD_IN_SOURCE 1
CONFIGURE_COMMAND
${CMAKE_COMMAND} -E env CC=${CMAKE_C_COMPILER} CFLAGS=-fPIC\ -O3
CXX=${CMAKE_CXX_COMPILER} CXXFLAGS=-fPIC\ -O3
[=[LDFLAGS=-Wl,-rpath='$$ORIGIN']=] <SOURCE_DIR>/configure
--enable-install-elfh --prefix=${_eu_root} --disable-libdebuginfod
--disable-debuginfod --enable-thread-safety ${ElfUtils_CONFIG_OPTIONS}
--libdir=${_eu_root}/lib
BUILD_COMMAND make install
BUILD_BYPRODUCTS ${_eu_build_byproducts}
INSTALL_COMMAND ""
)
# target for re-executing the installation
add_custom_target(
rocprofiler-systems-elfutils-install
COMMAND make install
WORKING_DIRECTORY ${_eu_root}/src/rocprofiler-systems-elfutils-build
COMMENT "Installing ElfUtils..."
)
install(
DIRECTORY ${_eu_root}/lib/
DESTINATION ${CMAKE_INSTALL_LIBDIR}/${PROJECT_NAME}
FILES_MATCHING
PATTERN "*${CMAKE_SHARED_LIBRARY_SUFFIX}*"
)
endif()
# -------------- EXPORT VARIABLES ---------------------------------------------
set(ElfUtils_ROOT_DIR
${_eu_root}
CACHE PATH
"Base directory of the elfutils installation"
FORCE
)
set(ElfUtils_INCLUDE_DIRS ${_eu_inc_dirs} CACHE PATH "elfutils include directory" FORCE)
set(ElfUtils_LIBRARY_DIRS ${_eu_lib_dirs} CACHE PATH "elfutils library directory" FORCE)
set(ElfUtils_INCLUDE_DIR
${ElfUtils_INCLUDE_DIRS}
CACHE PATH
"elfutils include directory"
FORCE
)
set(ElfUtils_LIBRARIES ${_eu_libs} CACHE FILEPATH "elfutils library files" FORCE)
target_include_directories(
rocprofiler-systems-elfutils
SYSTEM
INTERFACE ${ElfUtils_INCLUDE_DIRS}
)
target_compile_definitions(rocprofiler-systems-elfutils INTERFACE ${ElfUtils_DEFINITIONS})
target_link_directories(rocprofiler-systems-elfutils INTERFACE ${ElfUtils_LIBRARY_DIRS})
target_link_libraries(rocprofiler-systems-elfutils INTERFACE ${ElfUtils_LIBRARIES})
rocprofiler_systems_message(STATUS "ElfUtils includes: ${ElfUtils_INCLUDE_DIRS}")
rocprofiler_systems_message(STATUS "ElfUtils library dirs: ${ElfUtils_LIBRARY_DIRS}")
rocprofiler_systems_message(STATUS "ElfUtils libraries: ${ElfUtils_LIBRARIES}")
+139
@@ -0,0 +1,139 @@
include(MacroUtilities)
# Map deprecated DYNINST_BUILD_* variables to new ROCPROFSYS_BUILD_* variables
foreach(dep BOOST TBB ELFUTILS LIBIBERTY)
if(DYNINST_BUILD_${dep})
message(
WARNING
"DYNINST_BUILD_${dep} is deprecated. Using ROCPROFSYS_BUILD_${dep} instead."
)
set(ROCPROFSYS_BUILD_${dep} ON)
endif()
endforeach()
# Set BUILD_* to ON if ROCPROFSYS_BUILD_* is ON
foreach(dep BOOST TBB ELFUTILS LIBIBERTY)
if(ROCPROFSYS_BUILD_${dep})
if(dep STREQUAL "BOOST")
rocprofiler_systems_add_option(BUILD_BOOST "Enable building Boost internally"
ON
)
elseif(dep STREQUAL "TBB")
rocprofiler_systems_add_option(BUILD_TBB "Enable building TBB internally" ON)
elseif(dep STREQUAL "ELFUTILS")
rocprofiler_systems_add_option(BUILD_ELFUTILS
"Enable building elfutils internally" ON
)
elseif(dep STREQUAL "LIBIBERTY")
rocprofiler_systems_add_option(BUILD_LIBIBERTY
"Enable building libiberty internally" ON
)
endif()
endif()
endforeach()
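# Example (a sketch only; "build" is an arbitrary build directory): request
# from-source builds of Boost and TBB at configure time:
#
#   cmake -S . -B build -DROCPROFSYS_BUILD_BOOST=ON -DROCPROFSYS_BUILD_TBB=ON
#
# Per the loops above, each ROCPROFSYS_BUILD_<dep> that is ON registers the matching
# BUILD_<dep> option, assuming rocprofiler_systems_add_option takes (name, description,
# default). The deprecated DYNINST_BUILD_<dep> spelling is mapped to it with a warning.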
set(TPL_STAGING_PREFIX
"${PROJECT_BINARY_DIR}/external"
CACHE PATH
"Third-party library build-tree install prefix"
)
file(MAKE_DIRECTORY "${TPL_STAGING_PREFIX}")
file(MAKE_DIRECTORY "${TPL_STAGING_PREFIX}/include")
add_custom_target(external-prebuild)
# Add external dependencies to be built
include(DyninstBoost)
if(TARGET rocprofiler-systems-boost-build)
# Make Boost build serially
set_target_properties(
rocprofiler-systems-boost-build
PROPERTIES JOB_POOL_COMPILE external_deps_pool JOB_POOL_LINK external_deps_pool
)
# Create a prebuild target that depends on Boost
add_dependencies(external-prebuild rocprofiler-systems-boost-build)
endif()
include(DyninstTBB)
if(TARGET rocprofiler-systems-tbb-build AND TARGET external-prebuild)
# Make TBB build serially and wait for Boost
set_target_properties(
rocprofiler-systems-tbb-build
PROPERTIES JOB_POOL_COMPILE external_deps_pool JOB_POOL_LINK external_deps_pool
)
add_dependencies(external-prebuild rocprofiler-systems-tbb-build)
endif()
include(DyninstElfUtils)
if(TARGET rocprofiler-systems-elfutils-build AND TARGET external-prebuild)
set_target_properties(
rocprofiler-systems-elfutils-build
PROPERTIES JOB_POOL_COMPILE external_deps_pool JOB_POOL_LINK external_deps_pool
)
add_dependencies(external-prebuild rocprofiler-systems-elfutils-build)
endif()
include(DyninstLibIberty)
if(TARGET rocprofiler-systems-libiberty-build AND TARGET external-prebuild)
set_target_properties(
rocprofiler-systems-libiberty-build
PROPERTIES JOB_POOL_COMPILE external_deps_pool JOB_POOL_LINK external_deps_pool
)
add_dependencies(external-prebuild rocprofiler-systems-libiberty-build)
endif()
# Final dependency check
if(NOT TARGET external-prebuild)
message(WARNING "Not all dyninst external dependencies found. Build may fail.")
endif()
# Create a dummy target to ensure external dependencies are fully built
add_custom_target(external-deps-complete)
if(TARGET external-prebuild)
add_dependencies(external-deps-complete external-prebuild)
endif()
if(NOT TARGET Dyninst::Boost AND TARGET rocprofiler-systems-boost)
add_library(Dyninst::Boost INTERFACE IMPORTED)
set_target_properties(
Dyninst::Boost
PROPERTIES INTERFACE_LINK_LIBRARIES rocprofiler-systems-boost
)
message(
STATUS
"Created imported target Dyninst::Boost linked to rocprofiler-systems-boost"
)
endif()
if(NOT TARGET Dyninst::ElfUtils AND TARGET rocprofiler-systems-elfutils)
add_library(Dyninst::ElfUtils INTERFACE IMPORTED)
set_target_properties(
Dyninst::ElfUtils
PROPERTIES INTERFACE_LINK_LIBRARIES rocprofiler-systems-elfutils
)
message(
STATUS
"Created imported target Dyninst::ElfUtils linked to rocprofiler-systems-elfutils"
)
endif()
if(NOT TARGET Dyninst::TBB AND TARGET rocprofiler-systems-tbb)
add_library(Dyninst::TBB INTERFACE IMPORTED)
set_target_properties(
Dyninst::TBB
PROPERTIES INTERFACE_LINK_LIBRARIES rocprofiler-systems-tbb
)
message(
STATUS
"Created imported target Dyninst::TBB linked to rocprofiler-systems-tbb"
)
endif()
if(NOT TARGET Dyninst::LibIberty AND TARGET rocprofiler-systems-libiberty)
add_library(Dyninst::LibIberty INTERFACE IMPORTED)
set_target_properties(
Dyninst::LibIberty
PROPERTIES INTERFACE_LINK_LIBRARIES rocprofiler-systems-libiberty
)
message(
STATUS
"Created imported target Dyninst::LibIberty linked to rocprofiler-systems-libiberty"
)
endif()
+156
@@ -0,0 +1,156 @@
# ======================================================================================
# LibIberty.cmake
#
# Configure LibIberty for Dyninst
#
# ----------------------------------------
#
# Directly exports the following CMake variables
#
# LibIberty_ROOT_DIR - Computed base directory of the LibIberty installation
# LibIberty_LIBRARY_DIRS - Link directories for LibIberty libraries
# LibIberty_LIBRARIES - LibIberty library files
# LibIberty_INCLUDE - LibIberty include files
#
# NOTE: The exported LibIberty_ROOT_DIR can be different from the value provided by the
# user in the case that it is determined to build LibIberty from source. In such a case,
# LibIberty_ROOT_DIR will contain the directory of the from-source installation.
#
# See Modules/FindLibIberty.cmake for details
#
# ======================================================================================
include_guard(GLOBAL)
if(NOT UNIX)
return()
endif()
# -------------- PATHS --------------------------------------------------------
# Base directory of the LibIberty installation
set(LibIberty_ROOT_DIR "/usr" CACHE PATH "Base directory of the LibIberty installation")
# Hint directory that contains the LibIberty library files
set(LibIberty_LIBRARYDIR
"${LibIberty_ROOT_DIR}/lib"
CACHE PATH
"Hint directory that contains the LibIberty library files"
)
# -------------- PACKAGES -----------------------------------------------------
if(NOT BUILD_LIBIBERTY)
find_package(LibIberty)
endif()
# -------------- SOURCE BUILD -------------------------------------------------
if(LibIberty_FOUND)
set(_li_root ${LibIberty_ROOT_DIR})
set(_li_inc_dirs ${LibIberty_INCLUDE_DIRS})
set(_li_lib_dirs ${LibIberty_LIBRARY_DIRS})
set(_li_libs ${LibIberty_LIBRARIES})
elseif(STERILE_BUILD)
rocprofiler_systems_message(
FATAL_ERROR
"LibIberty not found and cannot be downloaded because build is sterile."
)
elseif(NOT BUILD_LIBIBERTY)
rocprofiler_systems_message(
FATAL_ERROR
"LibIberty was not found. Either configure cmake to find LibIberty properly or set BUILD_LIBIBERTY=ON to download and build"
)
else()
rocprofiler_systems_message(STATUS "${LibIberty_ERROR_REASON}")
rocprofiler_systems_message(STATUS
"Attempting to build LibIberty as external project"
)
set(_li_root ${TPL_STAGING_PREFIX}/binutils)
set(_li_project_name rocprofiler-systems-libiberty-build)
set(_li_working_dir ${_li_root}/src/${_li_project_name})
set(_li_inc_dirs $<BUILD_INTERFACE:${_li_root}/include>)
set(_li_lib_dirs $<BUILD_INTERFACE:${_li_root}/lib>)
set(_li_libs
$<BUILD_INTERFACE:${_li_root}/lib/libiberty${CMAKE_STATIC_LIBRARY_SUFFIX}>
)
set(_li_build_byproducts "${_li_root}/lib/libiberty${CMAKE_STATIC_LIBRARY_SUFFIX}")
file(MAKE_DIRECTORY "${_li_root}/lib")
file(MAKE_DIRECTORY "${_li_root}/include")
include(ExternalProject)
ExternalProject_Add(
${_li_project_name}
PREFIX ${_li_root}
URL
${DYNINST_BINUTILS_DOWNLOAD_URL}
http://ftpmirror.gnu.org/gnu/binutils/binutils-2.42.tar.gz
http://mirrors.kernel.org/sourceware/binutils/releases/binutils-2.42.tar.gz
BUILD_IN_SOURCE 1
CONFIGURE_COMMAND
${CMAKE_COMMAND} -E env CC=${CMAKE_C_COMPILER} CFLAGS=-fPIC\ -O3
CXX=${CMAKE_CXX_COMPILER} CXXFLAGS=-fPIC\ -O3 <SOURCE_DIR>/configure
--prefix=${_li_root}
BUILD_COMMAND make
BUILD_BYPRODUCTS ${_li_build_byproducts}
INSTALL_COMMAND ""
)
add_custom_command(
TARGET ${_li_project_name}
POST_BUILD
COMMAND install
ARGS -C ${_li_working_dir}/libiberty/libiberty.a ${_li_root}/lib
COMMAND install
ARGS -C ${_li_working_dir}/include/*.h ${_li_root}/include
COMMENT "Installing LibIberty..."
)
# target for re-executing the installation
add_custom_target(
rocprofiler-systems-libiberty-install
COMMAND install -C ${_li_working_dir}/libiberty/libiberty.a ${_li_root}/lib
COMMAND install -C ${_li_working_dir}/include/*.h ${_li_root}/include
WORKING_DIRECTORY ${_li_working_dir}
COMMENT "Installing LibIberty..."
)
# For backward compatibility
set(IBERTY_FOUND TRUE)
set(IBERTY_BUILD TRUE)
endif()
# -------------- EXPORT VARIABLES ---------------------------------------------
foreach(_DIR_TYPE inc lib)
if(_li_${_DIR_TYPE}_dirs)
list(REMOVE_DUPLICATES _li_${_DIR_TYPE}_dirs)
endif()
endforeach()
target_include_directories(rocprofiler-systems-libiberty INTERFACE ${_li_inc_dirs})
target_link_directories(rocprofiler-systems-libiberty INTERFACE ${_li_lib_dirs})
target_link_libraries(rocprofiler-systems-libiberty INTERFACE ${_li_libs})
set(LibIberty_ROOT_DIR
${_li_root}
CACHE PATH
"Base directory of the LibIberty installation"
FORCE
)
set(LibIberty_INCLUDE_DIRS
${_li_inc_dirs}
CACHE PATH
"LibIberty include directories"
FORCE
)
set(LibIberty_LIBRARY_DIRS ${_li_lib_dirs} CACHE PATH "LibIberty library directory" FORCE)
set(LibIberty_LIBRARIES ${_li_libs} CACHE FILEPATH "LibIberty library files" FORCE)
# For backward compatibility only
set(IBERTY_LIBRARIES ${LibIberty_LIBRARIES})
rocprofiler_systems_message(STATUS "LibIberty include dirs: ${LibIberty_INCLUDE_DIRS}")
rocprofiler_systems_message(STATUS "LibIberty library dirs: ${LibIberty_LIBRARY_DIRS}")
rocprofiler_systems_message(STATUS "LibIberty libraries: ${LibIberty_LIBRARIES}")
+248
@@ -0,0 +1,248 @@
# =====================================================================================
# ThreadingBuildingBlocks.cmake
#
# Configure Intel's Threading Building Blocks for Dyninst
#
# ----------------------------------------
#
# Accepts the following CMake variables
#
# TBB_ROOT_DIR - Hint directory that contains the TBB installation
# TBB_INCLUDEDIR - Hint directory that contains the TBB header files
# TBB_LIBRARYDIR - Hint directory that contains the TBB library files
# TBB_LIBRARY - Alias for TBB_LIBRARYDIR
# TBB_USE_DEBUG_BUILD - Use debug version of tbb libraries, if present
# TBB_MIN_VERSION - Minimum acceptable version of TBB
#
# Directly exports the following CMake variables
#
# TBB_ROOT_DIR - Computed base directory of TBB installation
# TBB_INCLUDE_DIRS - TBB include directory
# TBB_INCLUDE_DIR - Alias for TBB_INCLUDE_DIRS
# TBB_LIBRARY_DIRS - TBB library directory
# TBB_LIBRARY_DIR - Alias for TBB_LIBRARY_DIRS
# TBB_DEFINITIONS - TBB compiler definitions
# TBB_LIBRARIES - TBB library files
#
# TBB_<c>_LIBRARY_RELEASE - Path to the release version of component <c>
# TBB_<c>_LIBRARY_DEBUG - Path to the debug version of component <c>
#
# NOTE: The exported TBB_ROOT_DIR can be different from the value provided by the user in
# the case that it is determined to build TBB from source. In such a case, TBB_ROOT_DIR
# will contain the directory of the from-source installation.
#
# See Modules/FindTBB.cmake for additional input and exported variables
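#
# Example invocation (a sketch only; /opt/tbb is a hypothetical install path and
# "build" is an arbitrary build directory):
#
#   cmake -S . -B build -DTBB_ROOT_DIR=/opt/tbb -DTBB_MIN_VERSION=2019.9
#
# TBB_MIN_VERSION must not be lower than the built-in minimum chosen below
# (2019.7 for Clang, 2018.6 otherwise).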
#
# =====================================================================================
include_guard(GLOBAL)
if(TBB_FOUND)
return()
endif()
# -------------- RUNTIME CONFIGURATION ----------------------------------------
# Use debug versions of TBB libraries
set(TBB_USE_DEBUG_BUILD OFF CACHE BOOL "Use debug versions of TBB libraries")
# Minimum version of TBB (assumes a dotted-decimal format: YYYY.XX)
if("${CMAKE_CXX_COMPILER_ID}" STREQUAL "Clang")
set(_tbb_min_version 2019.7)
else()
set(_tbb_min_version 2018.6)
endif()
set(TBB_MIN_VERSION
${_tbb_min_version}
CACHE STRING
"Minimum version of TBB (assumes a dotted-decimal format: YYYY.XX)"
)
if("${TBB_MIN_VERSION}" VERSION_LESS "${_tbb_min_version}")
rocprofiler_systems_message(
FATAL_ERROR
"Requested TBB version ${TBB_MIN_VERSION} is less than minimum supported version ${_tbb_min_version}"
)
endif()
# -------------- PATHS --------------------------------------------------------
# TBB root directory
set(TBB_ROOT_DIR "/usr" CACHE PATH "TBB root directory")
# TBB include directory hint
set(TBB_INCLUDEDIR "${TBB_ROOT_DIR}/include" CACHE INTERNAL "TBB include directory")
# TBB library directory hint
set(TBB_LIBRARYDIR "${TBB_ROOT_DIR}/lib" CACHE INTERNAL "TBB library directory")
# Translate to FindTBB names
set(TBB_LIBRARY ${TBB_LIBRARYDIR})
set(TBB_INCLUDE_DIR ${TBB_INCLUDEDIR})
# The specific TBB libraries we need. NB: This should _NOT_ be a cache variable
set(_tbb_components tbb tbbmalloc_proxy tbbmalloc)
if(NOT BUILD_TBB)
find_package(TBB ${TBB_MIN_VERSION} COMPONENTS ${_tbb_components})
endif()
# -------------- SOURCE BUILD -------------------------------------------------
if(TBB_FOUND)
# Force the cache entries to be updated. Normally, these would not be exported.
# However, we need them in the Testsuite.
set(TBB_INCLUDE_DIRS ${TBB_INCLUDE_DIRS} CACHE PATH "TBB include directory" FORCE)
set(TBB_LIBRARY_DIRS ${TBB_LIBRARY_DIRS} CACHE PATH "TBB library directory" FORCE)
set(TBB_DEFINITIONS ${TBB_DEFINITIONS} CACHE STRING "TBB compiler definitions" FORCE)
set(TBB_LIBRARIES ${TBB_LIBRARIES} CACHE FILEPATH "TBB library files" FORCE)
elseif(STERILE_BUILD)
rocprofiler_systems_message(
FATAL_ERROR "TBB not found and cannot be downloaded because build is sterile."
)
elseif(NOT BUILD_TBB)
rocprofiler_systems_message(
FATAL_ERROR
"TBB was not found. Either configure cmake to find TBB properly or set BUILD_TBB=ON to download and build"
)
else()
# If we didn't find a suitable version on the system, then download one from the web
rocprofiler_systems_message(STATUS "${ThreadingBuildingBlocks_ERROR_REASON}")
rocprofiler_systems_message(
STATUS "Attempting to build TBB(${TBB_MIN_VERSION}) as external project"
)
if(NOT UNIX)
rocprofiler_systems_message(
FATAL_ERROR "Building TBB from source is not supported on this platform"
)
endif()
set(TBB_ROOT_DIR ${TPL_STAGING_PREFIX}/tbb CACHE PATH "TBB root directory" FORCE)
set(_tbb_libraries)
set(_tbb_components_cfg)
set(_tbb_library_dirs
$<BUILD_INTERFACE:${TBB_ROOT_DIR}/lib>
$<INSTALL_INTERFACE:${INSTALL_LIB_DIR}/${TPL_INSTALL_LIB_DIR}>
)
set(_tbb_include_dirs
$<BUILD_INTERFACE:${TBB_ROOT_DIR}/include>
$<INSTALL_INTERFACE:${INSTALL_LIB_DIR}/${TPL_INSTALL_INCLUDE_DIR}>
)
# Forcibly update the cache variables
set(TBB_INCLUDE_DIRS "${_tbb_include_dirs}" CACHE PATH "TBB include directory" FORCE)
set(TBB_LIBRARY_DIRS "${_tbb_library_dirs}" CACHE PATH "TBB library directory" FORCE)
set(TBB_DEFINITIONS "" CACHE STRING "TBB compiler definitions" FORCE)
file(MAKE_DIRECTORY "${TBB_ROOT_DIR}/include")
file(MAKE_DIRECTORY "${TBB_ROOT_DIR}/lib")
foreach(c ${_tbb_components})
# Generate make target names
if(${c} STREQUAL tbbmalloc_proxy)
# tbbmalloc_proxy is spelled tbbproxy in their Makefiles
list(APPEND _tbb_components_cfg tbbproxy_release)
else()
list(APPEND _tbb_components_cfg ${c}_release)
endif()
set(_tbb_${c}_lib
$<BUILD_INTERFACE:${TBB_ROOT_DIR}/lib/lib${c}${CMAKE_SHARED_LIBRARY_SUFFIX}>
$<INSTALL_INTERFACE:${c}>
)
# Generate library filenames
list(APPEND _tbb_libraries ${_tbb_${c}_lib})
list(
APPEND
_tbb_build_byproducts
"${TBB_ROOT_DIR}/lib/lib${c}${CMAKE_SHARED_LIBRARY_SUFFIX}"
)
foreach(t RELEASE DEBUG)
set(TBB_${c}_LIBRARY_${t} "${_tbb_${c}_lib}" CACHE FILEPATH "" FORCE)
endforeach()
endforeach()
set(TBB_LIBRARIES "${_tbb_libraries}" CACHE FILEPATH "TBB library files" FORCE)
# Split the dotted decimal version into major/minor parts
string(REGEX REPLACE "\\." ";" _tbb_download_name ${TBB_MIN_VERSION})
list(GET _tbb_download_name 0 _tbb_ver_major)
list(GET _tbb_download_name 1 _tbb_ver_minor)
# Set the compiler for TBB. Its build assumes gcc and tests for Intel, so clang is the
# only one that needs special treatment.
if("${CMAKE_CXX_COMPILER_ID}" STREQUAL "Clang")
set(_tbb_compiler "compiler=clang")
endif()
find_program(MAKE_EXECUTABLE NAMES make gmake PATH_SUFFIXES bin)
if(NOT MAKE_EXECUTABLE AND CMAKE_GENERATOR MATCHES "Ninja")
rocprofiler_systems_message(
FATAL_ERROR
"make/gmake executable not found. Please re-run with -DMAKE_EXECUTABLE=/path/to/make"
)
elseif(NOT MAKE_EXECUTABLE AND CMAKE_GENERATOR MATCHES "Makefiles")
set(MAKE_EXECUTABLE "$(MAKE)")
endif()
include(ExternalProject)
ExternalProject_Add(
rocprofiler-systems-tbb-build
PREFIX ${TBB_ROOT_DIR}
URL
https://github.com/ajanicijamd/oneTBB/archive/refs/tags/v${_tbb_ver_major}.${_tbb_ver_minor}.01.tar.gz
BUILD_IN_SOURCE 1
CONFIGURE_COMMAND ""
BUILD_COMMAND
${CMAKE_COMMAND} -E env CC=${CMAKE_C_COMPILER} CXX=${CMAKE_CXX_COMPILER}
[=[LDFLAGS=-Wl,-rpath='$$ORIGIN']=] ${MAKE_EXECUTABLE} -C src
${_tbb_components_cfg} tbb_build_dir=${TBB_ROOT_DIR}/src tbb_build_prefix=tbb
${_tbb_compiler}
BUILD_BYPRODUCTS ${_tbb_build_byproducts}
INSTALL_COMMAND ""
)
# post-build target for installing build
add_custom_command(
TARGET rocprofiler-systems-tbb-build
POST_BUILD
COMMAND ${CMAKE_COMMAND}
ARGS
-DLIBDIR=${TBB_LIBRARY_DIRS} -DINCDIR=${TBB_INCLUDE_DIRS}
-DPREFIX=${TBB_ROOT_DIR} -DCMAKE_STRIP=${CMAKE_STRIP} -P
${CMAKE_CURRENT_LIST_DIR}/DyninstTBBInstall.cmake
COMMENT "Installing TBB..."
)
add_custom_target(
rocprofiler-systems-tbb-install
COMMAND
${CMAKE_COMMAND} -DLIBDIR=${TBB_LIBRARY_DIRS} -DINCDIR=${TBB_INCLUDE_DIRS}
-DPREFIX=${TBB_ROOT_DIR} -P ${CMAKE_CURRENT_LIST_DIR}/DyninstTBBInstall.cmake
COMMENT "Installing TBB..."
)
install(
DIRECTORY ${TPL_STAGING_PREFIX}/tbb/lib/
DESTINATION ${CMAKE_INSTALL_LIBDIR}/${PROJECT_NAME}
FILES_MATCHING
PATTERN "*${CMAKE_SHARED_LIBRARY_SUFFIX}*"
)
endif()
foreach(_DIR_TYPE INCLUDE LIBRARY)
if(TBB_${_DIR_TYPE}_DIRS)
list(REMOVE_DUPLICATES TBB_${_DIR_TYPE}_DIRS)
endif()
endforeach()
target_include_directories(rocprofiler-systems-tbb SYSTEM INTERFACE ${TBB_INCLUDE_DIRS})
target_compile_definitions(rocprofiler-systems-tbb INTERFACE ${TBB_DEFINITIONS})
target_link_directories(rocprofiler-systems-tbb INTERFACE ${TBB_LIBRARY_DIRS})
target_link_libraries(rocprofiler-systems-tbb INTERFACE ${TBB_LIBRARIES})
rocprofiler_systems_message(STATUS "TBB include directory: ${TBB_INCLUDE_DIRS}")
rocprofiler_systems_message(STATUS "TBB library directory: ${TBB_LIBRARY_DIRS}")
rocprofiler_systems_message(STATUS "TBB libraries: ${TBB_LIBRARIES}")
rocprofiler_systems_message(STATUS "TBB definitions: ${TBB_DEFINITIONS}")
+46
View file
@@ -0,0 +1,46 @@
# ########################################################################################
# ThreadingBuildingBlocks.cmake
#
# Install Intel's Threading Building Blocks for Dyninst
#
# The default TBB build does not have an 'install' target, so we have to do it manually.
# This file contains the necessary CMake commands to complete the installation assuming it
# has been built using ExternalProject_Add.
#
# ########################################################################################
cmake_minimum_required(VERSION 3.13.0)
if(NOT CMAKE_STRIP)
find_program(CMAKE_STRIP NAMES strip)
endif()
file(MAKE_DIRECTORY ${LIBDIR} ${INCDIR})
file(
COPY ${PREFIX}/src/tbb_release/
DESTINATION ${LIBDIR}
FILES_MATCHING
PATTERN "*.so.*"
)
file(COPY ${PREFIX}/src/rocprofiler-systems-tbb-build/include/tbb DESTINATION ${INCDIR})
file(GLOB _tbb_libs ${LIBDIR}/libtbb*.so.*)
foreach(_lib ${_tbb_libs})
string(REGEX REPLACE "\\.2$" "" _lib_short ${_lib})
get_filename_component(_lib "${_lib}" NAME)
execute_process(
COMMAND ${CMAKE_COMMAND} -E create_symlink ${_lib} ${_lib_short}
WORKING_DIRECTORY ${LIBDIR}
)
endforeach()
foreach(_lib ${_tbb_libs})
get_filename_component(_lib_realpath "${_lib}" REALPATH)
if(NOT "${_lib_realpath}" IN_LIST _tbb_libs_realpath)
list(APPEND _tbb_libs_realpath ${_lib_realpath})
endif()
endforeach()
foreach(_lib ${_tbb_libs_realpath})
execute_process(COMMAND ${CMAKE_STRIP} ${_lib})
endforeach()
+154
View file
@@ -0,0 +1,154 @@
include_guard(DIRECTORY)
# ----------------------------------------------------------------------------------------#
#
# Clang Tidy
#
# ----------------------------------------------------------------------------------------#
# clang-tidy
macro(ROCPROFILER_SYSTEMS_ACTIVATE_CLANG_TIDY)
if(ROCPROFSYS_USE_CLANG_TIDY)
find_program(CLANG_TIDY_COMMAND NAMES clang-tidy)
rocprofiler_systems_add_feature(CLANG_TIDY_COMMAND "Path to clang-tidy command")
if(NOT CLANG_TIDY_COMMAND)
rocprofiler_systems_message(
WARNING "ROCPROFSYS_USE_CLANG_TIDY is ON but clang-tidy is not found!"
)
set(ROCPROFSYS_USE_CLANG_TIDY OFF)
else()
set(CMAKE_CXX_CLANG_TIDY ${CLANG_TIDY_COMMAND})
# Create a preprocessor definition that depends on .clang-tidy content so the
# compile command will change when .clang-tidy changes. This ensures that a
# subsequent build re-runs clang-tidy on all sources even if they do not
# otherwise need to be recompiled. Nothing actually uses this definition. We
# add it to targets on which we run clang-tidy just to get the build
# dependency on the .clang-tidy file.
file(SHA1 ${CMAKE_CURRENT_LIST_DIR}/.clang-tidy clang_tidy_sha1)
set(CLANG_TIDY_DEFINITIONS "CLANG_TIDY_SHA1=${clang_tidy_sha1}")
unset(clang_tidy_sha1)
endif()
endif()
endmacro()
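# Example (a sketch; `my_target` is a hypothetical target name): after activating
# clang-tidy, attach the generated definition to a target so that edits to
# .clang-tidy change its compile command and trigger a re-run on its sources.
#
#   rocprofiler_systems_activate_clang_tidy()
#   if(ROCPROFSYS_USE_CLANG_TIDY)
#       target_compile_definitions(my_target PRIVATE ${CLANG_TIDY_DEFINITIONS})
#   endif()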
# ------------------------------------------------------------------------------#
#
# clang-format target
#
# ------------------------------------------------------------------------------#
find_program(ROCPROFSYS_CLANG_FORMAT_EXE NAMES clang-format-18 clang-format)
find_program(ROCPROFSYS_CMAKE_FORMAT_EXE NAMES gersemi)
find_program(ROCPROFSYS_BLACK_FORMAT_EXE NAMES black)
add_custom_target(format-rocprofiler-systems)
if(NOT TARGET format)
add_custom_target(format)
endif()
foreach(_TYPE source python cmake)
if(NOT TARGET format-${_TYPE})
add_custom_target(format-${_TYPE})
endif()
endforeach()
if(
ROCPROFSYS_CLANG_FORMAT_EXE
OR ROCPROFSYS_BLACK_FORMAT_EXE
OR ROCPROFSYS_CMAKE_FORMAT_EXE
)
file(
GLOB_RECURSE sources
${PROJECT_SOURCE_DIR}/source/*.cpp
${PROJECT_SOURCE_DIR}/source/*.c
)
file(
GLOB_RECURSE headers
${PROJECT_SOURCE_DIR}/source/*.hpp
${PROJECT_SOURCE_DIR}/source/*.hpp.in
${PROJECT_SOURCE_DIR}/source/*.h
${PROJECT_SOURCE_DIR}/source/*.h.in
)
file(
GLOB_RECURSE examples
${PROJECT_SOURCE_DIR}/examples/*.cpp
${PROJECT_SOURCE_DIR}/examples/*.c
${PROJECT_SOURCE_DIR}/examples/*.hpp
${PROJECT_SOURCE_DIR}/examples/*.h
)
file(
GLOB_RECURSE tests_source
${PROJECT_SOURCE_DIR}/tests/source/*.cpp
${PROJECT_SOURCE_DIR}/tests/source/*.hpp
)
file(GLOB_RECURSE external ${PROJECT_SOURCE_DIR}/examples/lulesh/external/kokkos/*)
file(
GLOB_RECURSE cmake_files
${PROJECT_SOURCE_DIR}/source/*CMakeLists.txt
${PROJECT_SOURCE_DIR}/examples/*CMakeLists.txt
${PROJECT_SOURCE_DIR}/tests/*CMakeLists.txt
${PROJECT_SOURCE_DIR}/source/*.cmake
${PROJECT_SOURCE_DIR}/examples/*.cmake
${PROJECT_SOURCE_DIR}/tests/*.cmake
${PROJECT_SOURCE_DIR}/cmake/*.cmake
)
list(APPEND cmake_files ${PROJECT_SOURCE_DIR}/CMakeLists.txt)
if(external)
list(REMOVE_ITEM examples ${external})
list(REMOVE_ITEM cmake_files ${external})
endif()
if(ROCPROFSYS_CLANG_FORMAT_EXE)
add_custom_target(
format-rocprofiler-systems-source
${ROCPROFSYS_CLANG_FORMAT_EXE} -i ${sources} ${headers} ${examples}
${tests_source}
COMMENT
"[rocprofiler-systems] Running C++ formatter ${ROCPROFSYS_CLANG_FORMAT_EXE}..."
)
endif()
if(ROCPROFSYS_BLACK_FORMAT_EXE)
add_custom_target(
format-rocprofiler-systems-python
${ROCPROFSYS_BLACK_FORMAT_EXE} -q ${PROJECT_SOURCE_DIR}
COMMENT
"[rocprofiler-systems] Running Python formatter ${ROCPROFSYS_BLACK_FORMAT_EXE}..."
)
if(NOT TARGET format-python)
add_custom_target(format-python)
endif()
endif()
if(ROCPROFSYS_CMAKE_FORMAT_EXE)
add_custom_target(
format-rocprofiler-systems-cmake
${ROCPROFSYS_CMAKE_FORMAT_EXE} -i ${cmake_files}
COMMENT
"[rocprofiler-systems] Running CMake formatter ${ROCPROFSYS_CMAKE_FORMAT_EXE}..."
)
if(NOT TARGET format-cmake)
add_custom_target(format-cmake)
endif()
endif()
foreach(_TYPE source python cmake)
if(TARGET format-rocprofiler-systems-${_TYPE})
add_dependencies(
format-rocprofiler-systems
format-rocprofiler-systems-${_TYPE}
)
add_dependencies(format-${_TYPE} format-rocprofiler-systems-${_TYPE})
endif()
endforeach()
foreach(_TYPE source python)
if(TARGET format-rocprofiler-systems-${_TYPE})
add_dependencies(format format-rocprofiler-systems-${_TYPE})
endif()
endforeach()
else()
message(
STATUS
"No formatter (clang-format, gersemi, black) could be found. Format build targets will do nothing."
)
endif()
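# Example (a sketch; assumes the build tree was configured in a directory named
# `build`, which is a hypothetical name). Each target only runs the formatters that
# were found at configure time.
#
#   cmake --build build --target format                      # C++ and Python
#   cmake --build build --target format-cmake                # CMake files
#   cmake --build build --target format-rocprofiler-systems  # all of the above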
File diff not shown because it is too large.
File diff not shown because it is too large.
+142
View file
@@ -0,0 +1,142 @@
#[=======================================================================[.rst:
FindLibDW
---------
Find libdw, the elfutils library for DWARF data and ELF file or process inspection.
Variables that affect this module
``LibDW_NO_SYSTEM_PATHS``
If `True`, no system paths are searched.
Imported targets
^^^^^^^^^^^^^^^^
This module defines the following :prop_tgt:`IMPORTED` target:
``LibDW::LibDW``
The libdw library, if found.
Result variables
^^^^^^^^^^^^^^^^
This module will set the following variables in your project:
``LibDW_INCLUDE_DIRS``
where to find libdw.h, etc.
``LibDW_LIBRARIES``
the libraries to link against to use libdw.
``LibDW_FOUND``
If false, do not try to use libdw.
``LibDW_VERSION``
the version of the libdw library found
#]=======================================================================]
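# Example (a sketch; `my_tool` is a hypothetical target, 0.173 is an arbitrary
# minimum version, and the directory holding this module is assumed to be on
# CMAKE_MODULE_PATH):
#
#   find_package(LibDW 0.173 REQUIRED)
#   target_link_libraries(my_tool PRIVATE LibDW::LibDW)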
cmake_policy(SET CMP0074 NEW) # Use <Package>_ROOT
if(LibDW_NO_SYSTEM_PATHS)
set(_find_path_args NO_CMAKE_SYSTEM_PATH NO_SYSTEM_ENVIRONMENT_PATH)
endif()
# There is no way to tell pkg-config to ignore directories, so disable it
if(NOT LibDW_NO_SYSTEM_PATHS)
find_package(PkgConfig QUIET)
if(PKG_CONFIG_FOUND)
if(NOT "x${LibDW_FIND_VERSION}" STREQUAL "x")
set(_version ">=${LibDW_FIND_VERSION}")
endif()
if(LibDW_FIND_QUIETLY)
set(_quiet "QUIET")
endif()
pkg_check_modules(PC_LIBDW ${_quiet} "libdw${_version}")
unset(_version)
unset(_quiet)
endif()
endif()
if(PC_LIBDW_FOUND)
# FindPkgConfig sometimes gets the include dir wrong
if("x${PC_LIBDW_INCLUDE_DIRS}" STREQUAL "x")
pkg_get_variable(PC_LIBDW_INCLUDE_DIRS libdw includedir)
endif()
set(LibDW_INCLUDE_DIRS ${PC_LIBDW_INCLUDE_DIRS} CACHE PATH "")
set(LibDW_LIBRARIES ${PC_LIBDW_LINK_LIBRARIES} CACHE PATH "")
set(LibDW_VERSION ${PC_LIBDW_VERSION} CACHE STRING "")
else()
find_path(LibDW_INCLUDE_DIRS NAMES libdw.h PATH_SUFFIXES elfutils ${_find_path_args})
find_library(LibDW_LIBRARIES NAMES libdw dw PATH_SUFFIXES elfutils ${_find_path_args})
if(EXISTS "${LibDW_INCLUDE_DIRS}/version.h")
file(
STRINGS
"${LibDW_INCLUDE_DIRS}/version.h"
_version_line
REGEX "^#define _ELFUTILS_VERSION[ \t]+[0-9]+"
)
string(REGEX MATCH "[0-9]+" _version "${_version_line}")
if(NOT "x${_version}" STREQUAL "x")
set(LibDW_VERSION "0.${_version}")
endif()
unset(_version_line)
unset(_version)
endif()
endif()
include(FindPackageHandleStandardArgs)
find_package_handle_standard_args(
LibDW
FOUND_VAR LibDW_FOUND
REQUIRED_VARS LibDW_LIBRARIES LibDW_INCLUDE_DIRS
VERSION_VAR LibDW_VERSION
)
if(LibDW_FOUND)
mark_as_advanced(LibDW_INCLUDE_DIRS)
mark_as_advanced(LibDW_LIBRARIES)
mark_as_advanced(LibDW_VERSION)
# Some platforms explicitly list libelf as a dependency, so separate it out
list(LENGTH LibDW_LIBRARIES _cnt)
if(${_cnt} GREATER 1)
foreach(_l ${LibDW_LIBRARIES})
if(${_l} MATCHES "libdw")
set(_libdw ${_l})
else()
list(APPEND _link_libs ${_l})
endif()
endforeach()
endif()
unset(_cnt)
if(NOT TARGET LibDW::LibDW)
add_library(LibDW::LibDW UNKNOWN IMPORTED)
set_target_properties(
LibDW::LibDW
PROPERTIES INTERFACE_INCLUDE_DIRECTORIES "${LibDW_INCLUDE_DIRS}"
)
if(NOT "x${_link_libs}" STREQUAL "x")
set_target_properties(
LibDW::LibDW
PROPERTIES
IMPORTED_LINK_INTERFACE_LANGUAGES "C"
IMPORTED_LINK_DEPENDENT_LIBRARIES "${_link_libs}"
)
set(LibDW_LIBRARIES ${_libdw})
unset(_libdw)
unset(_link_libs)
endif()
set_target_properties(
LibDW::LibDW
PROPERTIES
IMPORTED_LINK_INTERFACE_LANGUAGES "C"
IMPORTED_LOCATION "${LibDW_LIBRARIES}"
)
endif()
endif()
unset(_find_path_args)
+121
View file
@@ -0,0 +1,121 @@
#[=======================================================================[.rst:
FindLibDebuginfod
-----------------
Find libdebuginfod, the elfutils library to query debuginfo files from debuginfod servers.
Variables that affect this module
``LibDebuginfod_NO_SYSTEM_PATHS``
If `True`, no system paths are searched.
Imported targets
^^^^^^^^^^^^^^^^
This module defines the following :prop_tgt:`IMPORTED` target:
``LibDebuginfod::LibDebuginfod``
The libdebuginfod library, if found.
Result variables
^^^^^^^^^^^^^^^^
This module will set the following variables in your project:
``LibDebuginfod_INCLUDE_DIRS``
where to find debuginfod.h, etc.
``LibDebuginfod_LIBRARIES``
the libraries to link against to use libdebuginfod.
``LibDebuginfod_FOUND``
If false, do not try to use libdebuginfod.
``LibDebuginfod_VERSION``
the version of the libdebuginfod library found
#]=======================================================================]
cmake_policy(SET CMP0074 NEW) # Use <Package>_ROOT
if(LibDebuginfod_NO_SYSTEM_PATHS)
set(_find_path_args NO_CMAKE_SYSTEM_PATH NO_SYSTEM_ENVIRONMENT_PATH)
endif()
# There is no way to tell pkg-config to ignore directories, so disable it
if(NOT LibDebuginfod_NO_SYSTEM_PATHS)
find_package(PkgConfig QUIET)
if(PKG_CONFIG_FOUND)
if(NOT "x${LibDebuginfod_FIND_VERSION}" STREQUAL "x")
set(_version ">=${LibDebuginfod_FIND_VERSION}")
endif()
if(LibDebuginfod_FIND_QUIETLY)
set(_quiet "QUIET")
endif()
pkg_check_modules(PC_LIBDEBUGINFOD ${_quiet} "libdebuginfod${_version}")
unset(_version)
unset(_quiet)
endif()
endif()
if(PC_LIBDEBUGINFOD_FOUND)
# FindPkgConfig sometimes gets the include dir wrong
if("x${PC_LIBDEBUGINFOD_INCLUDE_DIRS}" STREQUAL "x")
pkg_get_variable(PC_LIBDEBUGINFOD_INCLUDE_DIRS libdebuginfod includedir)
endif()
set(LibDebuginfod_INCLUDE_DIRS ${PC_LIBDEBUGINFOD_INCLUDE_DIRS} CACHE PATH "")
set(LibDebuginfod_LIBRARIES ${PC_LIBDEBUGINFOD_LINK_LIBRARIES} CACHE PATH "")
set(LibDebuginfod_VERSION ${PC_LIBDEBUGINFOD_VERSION} CACHE STRING "")
else()
find_path(
LibDebuginfod_INCLUDE_DIRS
NAMES debuginfod.h
PATH_SUFFIXES elfutils ${_find_path_args}
)
find_library(
LibDebuginfod_LIBRARIES
NAMES libdebuginfod debuginfod
PATH_SUFFIXES elfutils ${_find_path_args}
)
if(EXISTS "${LibDebuginfod_INCLUDE_DIRS}/version.h")
file(
STRINGS
"${LibDebuginfod_INCLUDE_DIRS}/version.h"
_version_line
REGEX "^#define _ELFUTILS_VERSION[ \t]+[0-9]+"
)
string(REGEX MATCH "[0-9]+" _version "${_version_line}")
if(NOT "x${_version}" STREQUAL "x")
set(LibDebuginfod_VERSION "0.${_version}")
endif()
unset(_version_line)
unset(_version)
endif()
endif()
include(FindPackageHandleStandardArgs)
find_package_handle_standard_args(
LibDebuginfod
FOUND_VAR LibDebuginfod_FOUND
REQUIRED_VARS LibDebuginfod_LIBRARIES LibDebuginfod_INCLUDE_DIRS
VERSION_VAR LibDebuginfod_VERSION
)
if(LibDebuginfod_FOUND)
mark_as_advanced(LibDebuginfod_INCLUDE_DIRS)
mark_as_advanced(LibDebuginfod_LIBRARIES)
mark_as_advanced(LibDebuginfod_VERSION)
if(NOT TARGET LibDebuginfod::LibDebuginfod)
add_library(LibDebuginfod::LibDebuginfod UNKNOWN IMPORTED)
set_target_properties(
LibDebuginfod::LibDebuginfod
PROPERTIES
INTERFACE_INCLUDE_DIRECTORIES "${LibDebuginfod_INCLUDE_DIRS}"
IMPORTED_LINK_INTERFACE_LANGUAGES "C"
IMPORTED_LOCATION "${LibDebuginfod_LIBRARIES}"
)
endif()
endif()
unset(_find_path_args)
+89
View file
@@ -0,0 +1,89 @@
# ===================================================================================
# FindLibDwarf.cmake
#
# Find libdw include dirs and libraries
#
# ----------------------------------------
#
# Use this module by invoking find_package with the form::
#
# find_package(LibDwarf
#   [version] [EXACT]  # Minimum or EXACT version e.g. 0.173
#   [REQUIRED]         # Fail with error if libdw is not found
# )
#
# This module reads hints about search locations from variables::
#
# LibDwarf_ROOT_DIR - Base directory of the libdw installation
# LibDwarf_INCLUDEDIR - Hint directory that contains the libdw header files
# LibDwarf_LIBRARYDIR - Hint directory that contains the libdw library files
#
# and saves search results persistently in CMake cache entries::
#
# LibDwarf_FOUND - True if headers and requested libraries were found
# LibDwarf_INCLUDE_DIRS - libdw include directories
# LibDwarf_LIBRARY_DIRS - Link directories for libdw libraries
# LibDwarf_LIBRARIES - libdw library files
#
# ===================================================================================
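# Example (a sketch; the prefix /opt/elfutils and the target `my_tool` are
# hypothetical, and the directory holding this module is assumed to be on
# CMAKE_MODULE_PATH):
#
#   set(LibDwarf_ROOT_DIR /opt/elfutils)
#   find_package(LibDwarf 0.173 REQUIRED)
#   target_link_libraries(my_tool PRIVATE LibDwarf::LibDwarf)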
# Non-standard subdirectories to search
set(_path_suffixes libdw libdwarf elfutils)
find_path(
LibDwarf_INCLUDE_DIR
NAMES libdw.h
HINTS ${LibDwarf_ROOT_DIR}/include ${LibDwarf_ROOT_DIR} ${LibDwarf_INCLUDEDIR}
PATHS ${DYNINST_SYSTEM_INCLUDE_PATHS}
PATH_SUFFIXES ${_path_suffixes}
DOC "libdw include directories"
)
find_library(
LibDwarf_LIBRARIES
NAMES libdw.so.1 libdw.so
HINTS ${LibDwarf_ROOT_DIR}/lib ${LibDwarf_ROOT_DIR} ${LibDwarf_LIBRARYDIR}
PATHS ${DYNINST_SYSTEM_LIBRARY_PATHS}
PATH_SUFFIXES ${_path_suffixes}
)
# Find the library with the highest version
set(_max_ver 0.0)
set(_max_ver_lib)
foreach(l ${LibDwarf_LIBRARIES})
get_filename_component(_dw_realpath "${l}" REALPATH)
string(REGEX MATCH "libdw\\-(.+)\\.so\\.*$" res "${_dw_realpath}")
# The library version number is stored in CMAKE_MATCH_1 (empty if nothing matched)
set(_cur_ver "${CMAKE_MATCH_1}")
if("${_cur_ver}" VERSION_GREATER "${_max_ver}")
set(_max_ver ${_cur_ver})
set(_max_ver_lib ${l})
endif()
endforeach()
# Set the exported variables to the best match
set(LibDwarf_LIBRARIES ${_max_ver_lib})
set(LibDwarf_VERSION ${_max_ver})
include(FindPackageHandleStandardArgs)
find_package_handle_standard_args(
LibDwarf
FOUND_VAR LibDwarf_FOUND
REQUIRED_VARS LibDwarf_LIBRARIES LibDwarf_INCLUDE_DIR
VERSION_VAR LibDwarf_VERSION
)
# Export cache variables
if(LibDwarf_FOUND)
set(LibDwarf_INCLUDE_DIRS ${LibDwarf_INCLUDE_DIR})
set(LibDwarf_LIBRARIES ${LibDwarf_LIBRARIES})
# Because we only report the library with the largest version, we are guaranteed there
# is only one file in LibDwarf_LIBRARIES
get_filename_component(_dw_dir ${LibDwarf_LIBRARIES} DIRECTORY)
set(LibDwarf_LIBRARY_DIRS ${_dw_dir})
add_library(LibDwarf::LibDwarf INTERFACE IMPORTED)
target_include_directories(LibDwarf::LibDwarf INTERFACE ${LibDwarf_INCLUDE_DIR})
target_link_directories(LibDwarf::LibDwarf INTERFACE ${LibDwarf_LIBRARY_DIRS})
target_link_libraries(LibDwarf::LibDwarf INTERFACE ${LibDwarf_LIBRARIES})
endif()
+91
View file
@@ -0,0 +1,91 @@
# ========================================================================================
# FindLibElf.cmake
#
# Find libelf include dirs and libraries
#
# ----------------------------------------
#
# Use this module by invoking find_package with the form::
#
# find_package(LibElf
#   [version] [EXACT]  # Minimum or EXACT version e.g. 0.173
#   [REQUIRED]         # Fail with error if libelf is not found
# )
#
# This module reads hints about search locations from variables::
#
# LibElf_ROOT_DIR - Base directory of the libelf installation
# LibElf_INCLUDEDIR - Hint directory that contains the libelf header files
# LibElf_LIBRARYDIR - Hint directory that contains the libelf library files
#
# and saves search results persistently in CMake cache entries::
#
# LibElf_FOUND - True if headers and requested libraries were found
# LibElf_INCLUDE_DIRS - libelf include directories
# LibElf_LIBRARY_DIRS - Link directories for libelf libraries
# LibElf_LIBRARIES - libelf library files
#
# Based on the version by Bernhard Walle <bernhard.walle@gmx.de> Copyright (c) 2008
#
# ========================================================================================
# Non-standard subdirectories to search
set(_path_suffixes libelf libelfls elfutils)
find_path(
LibElf_INCLUDE_DIR
NAMES libelf.h
HINTS ${LibElf_ROOT_DIR}/include ${LibElf_ROOT_DIR} ${LibElf_INCLUDEDIR}
PATHS ${DYNINST_SYSTEM_INCLUDE_PATHS}
PATH_SUFFIXES ${_path_suffixes}
DOC "libelf include directories"
)
find_library(
LibElf_LIBRARIES
NAMES libelf.so.1 libelf.so
HINTS ${LibElf_ROOT_DIR}/lib ${LibElf_ROOT_DIR} ${LibElf_LIBRARYDIR}
PATHS ${DYNINST_SYSTEM_LIBRARY_PATHS}
PATH_SUFFIXES ${_path_suffixes}
)
# Find the library with the highest version
set(_max_ver 0.0)
set(_max_ver_lib)
foreach(l ${LibElf_LIBRARIES})
get_filename_component(_elf_realpath "${l}" REALPATH)
string(REGEX MATCH "libelf\\-(.+)\\.so\\.*$" res "${_elf_realpath}")
# The library version number is stored in CMAKE_MATCH_1 (empty if nothing matched)
set(_cur_ver "${CMAKE_MATCH_1}")
if("${_cur_ver}" VERSION_GREATER "${_max_ver}")
set(_max_ver ${_cur_ver})
set(_max_ver_lib ${l})
endif()
endforeach()
# Set the exported variables to the best match
set(LibElf_LIBRARIES ${_max_ver_lib})
set(LibElf_VERSION ${_max_ver})
include(FindPackageHandleStandardArgs)
find_package_handle_standard_args(
LibElf
FOUND_VAR LibElf_FOUND
REQUIRED_VARS LibElf_LIBRARIES LibElf_INCLUDE_DIR
VERSION_VAR LibElf_VERSION
)
# Export cache variables
if(LibElf_FOUND)
set(LibElf_INCLUDE_DIRS ${LibElf_INCLUDE_DIR})
set(LibElf_LIBRARIES ${LibElf_LIBRARIES})
# Because we only report the library with the largest version, we are guaranteed there
# is only one file in LibElf_LIBRARIES
get_filename_component(_elf_dir ${LibElf_LIBRARIES} DIRECTORY)
set(LibElf_LIBRARY_DIRS ${_elf_dir} "${_elf_dir}/elfutils")
add_library(LibElf::LibElf INTERFACE IMPORTED)
target_include_directories(LibElf::LibElf INTERFACE ${LibElf_INCLUDE_DIR})
target_link_directories(LibElf::LibElf INTERFACE ${LibElf_LIBRARY_DIRS})
target_link_libraries(LibElf::LibElf INTERFACE ${LibElf_LIBRARIES})
endif()
+83
View file
@@ -0,0 +1,83 @@
# ========================================================================================
# FindLibIberty.cmake
#
# Find LibIberty include dirs and libraries
#
# ----------------------------------------
#
# Use this module by invoking find_package with the form::
#
# find_package(LibIberty
#   [REQUIRED]  # Fail with error if LibIberty is not found
# )
#
# This module reads hints about search locations from variables::
#
# LibIberty_ROOT_DIR - Base directory of the LibIberty installation
# LibIberty_LIBRARYDIR - Hint directory that contains the LibIberty library files
# IBERTY_LIBRARIES - Alias for LibIberty_LIBRARIES (backwards compatibility only)
# LibIberty_INCLUDEDIR - Hint directory that contains the libiberty header files
#
# and saves search results persistently in CMake cache entries::
#
# LibIberty_FOUND - True if headers and requested libraries were found
# IBERTY_FOUND - Alias for LibIberty_FOUND (backwards compatibility only)
# LibIberty_INCLUDE_DIRS - libiberty include directories
# LibIberty_LIBRARY_DIRS - Link directories for LibIberty libraries
# LibIberty_LIBRARIES - LibIberty library files
# IBERTY_LIBRARIES - Alias for LibIberty_LIBRARIES (backwards compatibility only)
#
# ========================================================================================
cmake_minimum_required(VERSION 3.13.0 FATAL_ERROR)
# Keep the semantics of IBERTY_LIBRARIES for backward compatibility. NB: If both are
# specified, LibIberty_LIBRARIES is ignored
if(NOT "${IBERTY_LIBRARIES}" STREQUAL "")
set(LibIberty_LIBRARIES ${IBERTY_LIBRARIES})
endif()
# Non-standard subdirectories to search
set(_path_suffixes libiberty iberty)
find_path(
LibIberty_INCLUDE_DIRS
NAMES libiberty.h
HINTS ${LibIberty_ROOT_DIR} ${LibIberty_ROOT_DIR}/include ${LibIberty_INCLUDEDIR}
PATHS ${DYNINST_SYSTEM_INCLUDE_PATHS}
PATH_SUFFIXES ${_path_suffixes}
DOC "LibIberty include directories"
)
# iberty_pic is for Debian <= wheezy
find_library(
LibIberty_LIBRARIES
NAMES iberty_pic iberty
HINTS ${LibIberty_ROOT_DIR} ${LibIberty_LIBRARYDIR} ${IBERTY_LIBRARIES}
PATHS ${DYNINST_SYSTEM_LIBRARY_PATHS}
PATH_SUFFIXES ${_path_suffixes}
)
include(FindPackageHandleStandardArgs)
find_package_handle_standard_args(
LibIberty
FOUND_VAR LibIberty_FOUND
REQUIRED_VARS LibIberty_INCLUDE_DIRS LibIberty_LIBRARIES
)
# For backwards compatibility only
set(IBERTY_FOUND ${LibIberty_FOUND})
if(LibIberty_FOUND)
foreach(l ${LibIberty_LIBRARIES})
get_filename_component(_dir ${l} DIRECTORY)
if(NOT "${_dir}" IN_LIST LibIberty_LIBRARY_DIRS)
list(APPEND LibIberty_LIBRARY_DIRS ${_dir})
endif()
endforeach()
add_library(LibIberty::LibIberty INTERFACE IMPORTED)
target_include_directories(LibIberty::LibIberty INTERFACE ${LibIberty_INCLUDE_DIRS})
target_link_libraries(LibIberty::LibIberty INTERFACE ${LibIberty_LIBRARIES})
# For backwards compatibility only
set(IBERTY_LIBRARIES ${LibIberty_LIBRARIES})
endif()
+56
View file
@@ -0,0 +1,56 @@
# Distributed under the OSI-approved BSD 3-Clause License. See accompanying file
# Copyright.txt or https://cmake.org/licensing for details.
include(FindPackageHandleStandardArgs)
# ----------------------------------------------------------------------------------------#
set(LIBVA_HEADERS_INCLUDE_DIR_INTERNAL
"${PROJECT_SOURCE_DIR}/source/lib/rocprof-sys/library/tpls"
CACHE PATH
"Path to internal va headers"
)
# ----------------------------------------------------------------------------------------#
find_path(
LIBVA_HEADERS_INCLUDE_DIR
NAMES va/va.h
PATHS /opt/amdgpu/include
NO_DEFAULT_PATH
)
if(NOT EXISTS "${LIBVA_HEADERS_INCLUDE_DIR}")
rocprofiler_systems_message(
AUTHOR_WARNING
"VA API header does not exist! Setting LIBVA_HEADERS_INCLUDE_DIR to internal directory: ${LIBVA_HEADERS_INCLUDE_DIR_INTERNAL}"
)
set(LIBVA_HEADERS_INCLUDE_DIR
"${LIBVA_HEADERS_INCLUDE_DIR_INTERNAL}"
CACHE PATH
"Path to VA API headers"
FORCE
)
else()
rocprofiler_systems_message(STATUS
"VA API header found: ${LIBVA_HEADERS_INCLUDE_DIR}"
)
endif()
mark_as_advanced(LIBVA_HEADERS_INCLUDE_DIR)
# ----------------------------------------------------------------------------------------#
find_package_handle_standard_args(Libva-headers DEFAULT_MSG LIBVA_HEADERS_INCLUDE_DIR)
# ------------------------------------------------------------------------------#
if(Libva-headers_FOUND)
add_library(roc::libva-headers INTERFACE IMPORTED)
target_include_directories(
roc::libva-headers
SYSTEM
INTERFACE ${LIBVA_HEADERS_INCLUDE_DIR}
)
endif()
# ------------------------------------------------------------------------------#
+198
View file
@@ -0,0 +1,198 @@
# ------------------------------------------------------------------------------#
#
# Finds headers for MPI
#
# ------------------------------------------------------------------------------#
include(FindPackageHandleStandardArgs)
set(MPI_HEADERS_VENDOR_INTERNAL
"OpenMPI"
CACHE STRING
"Distribution type of internal mpi.h"
)
set(MPI_HEADERS_INCLUDE_DIR_INTERNAL
"${PROJECT_SOURCE_DIR}/source/lib/rocprof-sys/library/tpls/mpi"
CACHE PATH
"Path to internal ${MPI_HEADERS_VENDOR_INTERNAL} mpi.h"
)
mark_as_advanced(MPI_HEADERS_VENDOR_INTERNAL)
mark_as_advanced(MPI_HEADERS_INCLUDE_DIR_INTERNAL)
if(
DEFINED _MPI_HEADERS_LAST_MPI_HEADERS_INCLUDE_DIR
AND NOT _MPI_HEADERS_LAST_MPI_HEADERS_INCLUDE_DIR STREQUAL MPI_HEADERS_INCLUDE_DIR
)
unset(MPI_HEADERS_VENDOR CACHE)
# if skipping mpicxx was enabled only because the internal header was in use, unset it
if(
MPI_HEADERS_SKIP_MPICXX
AND "${_MPI_HEADERS_LAST_MPI_HEADERS_INCLUDE_DIR}"
STREQUAL
"${MPI_HEADERS_INCLUDE_DIR_INTERNAL}"
)
unset(MPI_HEADERS_SKIP_MPICXX CACHE)
endif()
endif()
# define the (OMPI|MPICH)_SKIP_MPICXX pp definition
option(MPI_HEADERS_SKIP_MPICXX "Skip MPI C++" ON)
mark_as_advanced(MPI_HEADERS_SKIP_MPICXX)
# ------------------------------------------------------------------------------#
#
# Try to find an openmpi header
#
# ------------------------------------------------------------------------------#
find_path(
MPI_HEADERS_INCLUDE_DIR
NAMES mpi.h
PATH_SUFFIXES include/openmpi openmpi include
HINTS ${MPI_ROOT_DIR}
PATHS ${MPI_ROOT_DIR}
)
# ------------------------------------------------------------------------------#
#
# If direct find failed, try to find MPI and use MPI_C_INCLUDE_DIRS
#
# ------------------------------------------------------------------------------#
if(NOT MPI_HEADERS_INCLUDE_DIR)
find_package(MPI QUIET)
if(MPI_C_INCLUDE_DIRS)
set(MPI_HEADERS_INCLUDE_DIR
${MPI_C_INCLUDE_DIRS}
CACHE PATH
"Include directory for MPI"
FORCE
)
elseif(MPI_CXX_INCLUDE_DIRS)
set(MPI_HEADERS_INCLUDE_DIR
${MPI_CXX_INCLUDE_DIRS}
CACHE PATH
"Include directory for MPI"
FORCE
)
endif()
endif()
# ------------------------------------------------------------------------------#
#
# If found, try to determine the MPI vendor (i.e. distribution)
#
# ------------------------------------------------------------------------------#
if(MPI_HEADERS_INCLUDE_DIR)
file(STRINGS ${MPI_HEADERS_INCLUDE_DIR}/mpi.h _MPI_H_LINES REGEX "#([ \t]+)define ")
foreach(_LINE ${_MPI_H_LINES})
if("${_LINE}" MATCHES "define([ \t]+)OMPI_")
set(MPI_HEADERS_VENDOR
"OpenMPI"
CACHE STRING
"MPI headers are from OpenMPI distribution"
)
break()
elseif("${_LINE}" MATCHES "define([ \t]+)MPICH_")
set(MPI_HEADERS_VENDOR
"MPICH"
CACHE STRING
"MPI headers are from MPICH distribution"
)
break()
elseif("${_LINE}" MATCHES "define([ \t]+)MVAPICH_")
set(MPI_HEADERS_VENDOR
"MVAPICH"
CACHE STRING
"MPI headers are from MVAPICH distribution"
)
break()
endif()
endforeach()
endif()
# ------------------------------------------------------------------------------#
#
# If not found, or if the vendor is MPICH and MPICH is not explicitly allowed, use the internal version
#
# ------------------------------------------------------------------------------#
if(NOT MPI_HEADERS_INCLUDE_DIR)
set(MPI_HEADERS_INCLUDE_DIR "${MPI_HEADERS_INCLUDE_DIR_INTERNAL}" CACHE PATH "" FORCE)
set(MPI_HEADERS_VENDOR
"${MPI_HEADERS_VENDOR_INTERNAL}"
CACHE STRING
"MPI headers are from OpenMPI distribution"
FORCE
)
set(MPI_HEADERS_SKIP_MPICXX ON CACHE BOOL "" FORCE)
elseif("${MPI_HEADERS_VENDOR}" STREQUAL "MPICH")
option(
MPI_HEADERS_ALLOW_MPICH
"Permit the use of MPI headers from MPICH instead of using internal OpenMPI header"
OFF
)
mark_as_advanced(MPI_HEADERS_ALLOW_MPICH)
if(NOT MPI_HEADERS_ALLOW_MPICH)
set(_MESSAGE "\nFound MPI headers belonging to an MPICH distribution. ")
set(_MESSAGE
"${_MESSAGE}The data types for MPICH will cause segfaults when an application uses OpenMPI, "
)
set(_MESSAGE
"${_MESSAGE}whereas the OpenMPI data types are compatible with both. "
)
set(_MESSAGE
"${_MESSAGE}Forcing internal OpenMPI header... This can be disabled via MPI_HEADERS_ALLOW_MPICH=ON ...\n"
)
message(AUTHOR_WARNING "${_MESSAGE}")
unset(_MESSAGE)
set(MPI_HEADERS_INCLUDE_DIR
"${MPI_HEADERS_INCLUDE_DIR_INTERNAL}"
CACHE PATH
""
FORCE
)
set(MPI_HEADERS_VENDOR
"${MPI_HEADERS_VENDOR_INTERNAL}"
CACHE STRING
"MPI headers are from OpenMPI distribution"
FORCE
)
set(MPI_HEADERS_SKIP_MPICXX ON CACHE BOOL "" FORCE)
endif()
endif()
# set local variable
if(MPI_HEADERS_INCLUDE_DIR)
set(MPI_HEADERS_INCLUDE_DIRS ${MPI_HEADERS_INCLUDE_DIR})
endif()
mark_as_advanced(MPI_HEADERS_INCLUDE_DIR)
# store value to detect changes
set(_MPI_HEADERS_LAST_MPI_HEADERS_INCLUDE_DIR
"${MPI_HEADERS_INCLUDE_DIR}"
CACHE INTERNAL
"Last value of MPI_HEADERS_INCLUDE_DIR"
)
# handle find_package
find_package_handle_standard_args(MPI-Headers REQUIRED_VARS MPI_HEADERS_INCLUDE_DIR)
if(MPI-Headers_FOUND)
add_library(MPI::MPI_HEADERS IMPORTED INTERFACE)
if(MPI_HEADERS_SKIP_MPICXX)
if(MPI_HEADERS_VENDOR STREQUAL "MPICH")
target_compile_definitions(MPI::MPI_HEADERS INTERFACE MPICH_SKIP_MPICXX=1)
else()
target_compile_definitions(MPI::MPI_HEADERS INTERFACE OMPI_SKIP_MPICXX=1)
endif()
endif()
target_include_directories(
MPI::MPI_HEADERS
INTERFACE
$<$<COMPILE_LANGUAGE:C>:${MPI_HEADERS_INCLUDE_DIR}>
$<$<COMPILE_LANGUAGE:CXX>:${MPI_HEADERS_INCLUDE_DIR}>
)
endif()
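# ------------------------------------------------------------------------------#
#
# A minimal usage sketch, assuming this module is on CMAKE_MODULE_PATH under a file
# name matching the MPI-Headers package name, and that `my_mpi_wrappers` is a
# hypothetical target defined by the consuming project:
#
# find_package(MPI-Headers REQUIRED)
# target_link_libraries(my_mpi_wrappers PRIVATE MPI::MPI_HEADERS)
#
# Linking the interface target should propagate the mpi.h include directory and, when
# MPI_HEADERS_SKIP_MPICXX is ON, the matching (OMPI|MPICH)_SKIP_MPICXX=1 definition.
#
# ------------------------------------------------------------------------------#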
+372
@@ -0,0 +1,372 @@
#[=======================================================================[.rst:
FindROCmVersion
---------------
Search the <ROCM_PATH>/.info/version* files to determine the version of ROCm
Use this module by invoking find_package with the form::
find_package(ROCmVersion
[version] [EXACT]
[REQUIRED])
This module finds the version info for ROCm. The cached variables are::
ROCmVersion_FOUND - Whether the ROCm versioning was found
ROCmVersion_FULL_VERSION - The exact string from `<ROCM_PATH>/.info/version` or similar
ROCmVersion_MAJOR_VERSION - Major version, e.g. 4 in 4.5.2.100-40502
ROCmVersion_MINOR_VERSION - Minor version, e.g. 5 in 4.5.2.100-40502
ROCmVersion_PATCH_VERSION - Patch version, e.g. 2 in 4.5.2.100-40502
ROCmVersion_TWEAK_VERSION - Tweak version, e.g. 100 in 4.5.2.100-40502
ROCmVersion_REVISION_VERSION - Revision version, e.g. 40502 in 4.5.2.100-40502.
ROCmVersion_EPOCH_VERSION - See deb-version for a description of epochs. Epochs are used when the versioning scheme changes
ROCmVersion_CANONICAL_VERSION - `[<EPOCH>:]<MAJOR>.<MINOR>.<PATCH>[.<TWEAK>][-<REVISION>]`
ROCmVersion_NUMERIC_VERSION - `10000*<MAJOR> + 100*<MINOR> + <PATCH>`, e.g. 40502 for ROCm 4.5.2
ROCmVersion_TRIPLE_VERSION - `<MAJOR>.<MINOR>.<PATCH>`, e.g. 4.5.2 for ROCm 4.5.2
These variables are relevant for the find procedure::
ROCmVersion_DEBUG - Print info about processing
ROCmVersion_VERSION_FILE - `<FILE>` to read from in `<ROCM_PATH>/.info/<FILE>`, e.g. `version`, `version-dev`, `version-hip-libraries`, etc.
It may also be a full path
ROCmVersion_DIR - Root location for <ROCM_PATH>
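A minimal usage sketch, assuming this module's directory is on CMAKE_MODULE_PATH; the 5.0.0 threshold is purely illustrative::
find_package(ROCmVersion REQUIRED)
if(ROCmVersion_NUMERIC_VERSION GREATER_EQUAL 50000)
message(STATUS "Found ROCm ${ROCmVersion_TRIPLE_VERSION} in ${ROCmVersion_DIR}")
endif()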
#]=======================================================================]
set(ROCmVersion_VARIABLES
EPOCH
MAJOR
MINOR
PATCH
TWEAK
REVISION
TRIPLE
NUMERIC
CANONICAL
FULL
)
function(ROCM_VERSION_MESSAGE _TYPE)
if(ROCmVersion_DEBUG)
message(${_TYPE} "[ROCmVersion] ${ARGN}")
endif()
endfunction()
# parse a full version string and propagate the version components to the calling scope
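#
# Worked example, traced by hand rather than taken from a test run: for the input
# "4.5.2.100-40502" (the sample string used in the module documentation), the
# steps below are expected to yield
# MAJOR=4, MINOR=5, PATCH=2, TWEAK=100, REVISION=40502
# with no EPOCH, since the string has no "<N>:" prefix, and therefore
# TRIPLE = 4.5.2
# NUMERIC = 10000*4 + 100*5 + 2 = 40502
#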
function(ROCM_VERSION_COMPUTE FULL_VERSION_STRING _VAR_PREFIX)
# remove any line endings
string(REGEX REPLACE "(\n|\r)" "" FULL_VERSION_STRING "${FULL_VERSION_STRING}")
# store the full version so it can be set later
set(FULL_VERSION "${FULL_VERSION_STRING}")
# get number and remove from full version string
string(REGEX REPLACE "([0-9]+)\:(.*)" "\\1" EPOCH_VERSION "${FULL_VERSION_STRING}")
string(
REGEX REPLACE
"([0-9]+)\:(.*)"
"\\2"
FULL_VERSION_STRING
"${FULL_VERSION_STRING}"
)
if(EPOCH_VERSION STREQUAL FULL_VERSION)
set(EPOCH_VERSION)
endif()
# get number and remove from full version string
string(REGEX REPLACE "([0-9]+)(.*)" "\\1" MAJOR_VERSION "${FULL_VERSION_STRING}")
string(
REGEX REPLACE
"([0-9]+)(.*)"
"\\2"
FULL_VERSION_STRING
"${FULL_VERSION_STRING}"
)
# get number and remove from full version string
string(REGEX REPLACE "\.([0-9]+)(.*)" "\\1" MINOR_VERSION "${FULL_VERSION_STRING}")
string(
REGEX REPLACE
"\.([0-9]+)(.*)"
"\\2"
FULL_VERSION_STRING
"${FULL_VERSION_STRING}"
)
# get number and remove from full version string
string(REGEX REPLACE "\.([0-9]+)(.*)" "\\1" PATCH_VERSION "${FULL_VERSION_STRING}")
string(
REGEX REPLACE
"\.([0-9]+)(.*)"
"\\2"
FULL_VERSION_STRING
"${FULL_VERSION_STRING}"
)
if(NOT PATCH_VERSION LESS 100)
set(PATCH_VERSION 0)
endif()
# get number and remove from full version string
string(REGEX REPLACE "\.([0-9]+)(.*)" "\\1" TWEAK_VERSION "${FULL_VERSION_STRING}")
string(
REGEX REPLACE
"\.([0-9]+)(.*)"
"\\2"
FULL_VERSION_STRING
"${FULL_VERSION_STRING}"
)
# get number
string(
REGEX REPLACE
"-([0-9A-Za-z+~]+)"
"\\1"
REVISION_VERSION
"${FULL_VERSION_STRING}"
)
set(CANONICAL_VERSION)
set(_MAJOR_SEP ":")
set(_MINOR_SEP ".")
set(_PATCH_SEP ".")
set(_TWEAK_SEP ".")
set(_REVISION_SEP "-")
foreach(
_V
EPOCH
MAJOR
MINOR
PATCH
TWEAK
REVISION
)
if(${_V}_VERSION)
set(CANONICAL_VERSION "${CANONICAL_VERSION}${_${_V}_SEP}${${_V}_VERSION}")
else()
set(CANONICAL_VERSION "${CANONICAL_VERSION}${_${_V}_SEP}0")
endif()
endforeach()
set(_MAJOR_SEP "")
foreach(_V MAJOR MINOR PATCH)
if(${_V}_VERSION)
set(TRIPLE_VERSION "${TRIPLE_VERSION}${_${_V}_SEP}${${_V}_VERSION}")
else()
set(TRIPLE_VERSION "${TRIPLE_VERSION}${_${_V}_SEP}0")
endif()
endforeach()
math(
EXPR
NUMERIC_VERSION
"(10000 * (${MAJOR_VERSION}+0)) + (100 * (${MINOR_VERSION}+0)) + (${PATCH_VERSION}+0)"
)
# propagate to parent scopes
foreach(_V ${ROCmVersion_VARIABLES})
set(${_VAR_PREFIX}_${_V}_VERSION ${${_V}_VERSION} PARENT_SCOPE)
endforeach()
endfunction()
# this function watches for changes in the variables and unsets the remaining cache
# variables when they change
function(ROCM_VERSION_WATCH_FOR_CHANGE _var)
set(_rocm_version_watch_var_name ROCmVersion_WATCH_VALUE_${_var})
if(DEFINED ${_rocm_version_watch_var_name})
if("${${_var}}" STREQUAL "${${_rocm_version_watch_var_name}}")
if(NOT "${${_var}}" STREQUAL "")
rocm_version_message(STATUS "${_var} :: ${${_var}}")
endif()
list(REMOVE_ITEM _REMAIN_VARIABLES ${_var})
set(_REMAIN_VARIABLES "${_REMAIN_VARIABLES}" PARENT_SCOPE)
return()
else()
rocm_version_message(
STATUS
"${_var} changed :: ${${_rocm_version_watch_var_name}} --> ${${_var}}"
)
foreach(_V ${_REMAIN_VARIABLES})
rocm_version_message(
STATUS "${_var} changed :: Unsetting cache variable ${_V}..."
)
unset(${_V} CACHE)
endforeach()
endif()
else()
if(NOT "${${_var}}" STREQUAL "")
rocm_version_message(STATUS "${_var} :: ${${_var}}")
endif()
endif()
# store the value for the next run
set(${_rocm_version_watch_var_name}
"${${_var}}"
CACHE INTERNAL
"Last value of ${_var}"
FORCE
)
endfunction()
# scope this to a function to avoid leaking local variables
function(ROCM_VERSION_PARSE_VERSION_FILES)
# the list of variables set by this module. when one of these changes, we need to unset
# the cache variables after it
set(_ALL_VARIABLES)
foreach(_V ${ROCmVersion_VARIABLES})
list(APPEND _ALL_VARIABLES ROCmVersion_${_V}_VERSION)
endforeach()
set(_REMAIN_VARIABLES ${_ALL_VARIABLES})
# read a .info/version* file and propagate the variables to the calling scope
function(ROCM_VERSION_READ_FILE _FILE _VAR_PREFIX)
file(READ "${_FILE}" FULL_VERSION_STRING LIMIT_COUNT 1)
rocm_version_compute("${FULL_VERSION_STRING}" "${_VAR_PREFIX}")
# propagate to parent scopes
foreach(_V ${ROCmVersion_VARIABLES})
set(${_VAR_PREFIX}_${_V}_VERSION ${${_VAR_PREFIX}_${_V}_VERSION} PARENT_SCOPE)
endforeach()
endfunction()
# search for HIP to set ROCM_PATH
# if(NOT hip_FOUND)
# find_package(hip)
# endif()
function(COMPUTE_ROCM_VERSION_DIR)
if(
EXISTS "${ROCmVersion_VERSION_FILE}"
AND IS_ABSOLUTE "${ROCmVersion_VERSION_FILE}"
)
get_filename_component(_VERSION_DIR "${ROCmVersion_VERSION_FILE}" PATH)
get_filename_component(_VERSION_DIR "${_VERSION_DIR}/.." REALPATH)
set(ROCmVersion_DIR
"${_VERSION_DIR}"
CACHE PATH
"Root path to ROCm's .info/${ROCmVersion_VERSION_FILE}"
${ARGN}
)
rocm_version_watch_for_change(ROCmVersion_DIR)
endif()
endfunction()
if(ROCmVersion_VERSION_FILE)
get_filename_component(_VERSION_FILE "${ROCmVersion_VERSION_FILE}" NAME)
set(_VERSION_FILES ${_VERSION_FILE})
compute_rocm_version_dir(FORCE)
else()
set(_VERSION_FILES
version
version-dev
version-hip-libraries
version-hiprt
version-hiprt-devel
version-hip-sdk
version-libs
version-utils
)
rocm_version_message(STATUS "ROCmVersion version files: ${_VERSION_FILES}")
endif()
# convert env to cache if not defined
foreach(
_PATH
ROCmVersion_DIR
ROCmVersion_ROOT
ROCmVersion_ROOT_DIR
ROCPROFSYS_DEFAULT_ROCM_PATH
ROCM_PATH
)
if(NOT DEFINED ${_PATH} AND DEFINED ENV{${_PATH}})
set(_VAL "$ENV{${_PATH}}")
get_filename_component(_VAL "${_VAL}" REALPATH)
set(${_PATH}
"${_VAL}"
CACHE PATH
"Search path for ROCm version for ROCmVersion"
)
endif()
endforeach()
if(ROCmVersion_DIR)
set(_PATHS ${ROCmVersion_DIR})
else()
set(_PATHS)
foreach(
_DIR
${ROCmVersion_DIR}
${ROCmVersion_ROOT}
${ROCmVersion_ROOT_DIR}
$ENV{CMAKE_PREFIX_PATH}
${CMAKE_PREFIX_PATH}
${ROCPROFSYS_DEFAULT_ROCM_PATH}
${ROCM_PATH}
/opt/rocm
)
if(EXISTS ${_DIR})
get_filename_component(_ABS_DIR "${_DIR}" REALPATH)
list(APPEND _PATHS ${_ABS_DIR})
endif()
endforeach()
rocm_version_message(STATUS "ROCmVersion search paths: ${_PATHS}")
endif()
string(REPLACE ":" ";" _PATHS "${_PATHS}")
foreach(_PATH ${_PATHS})
foreach(_FILE ${_VERSION_FILES})
set(_F ${_PATH}/.info/${_FILE})
if(EXISTS ${_F})
set(ROCmVersion_VERSION_FILE
"${_F}"
CACHE FILEPATH
"File with versioning info"
)
rocm_version_watch_for_change(ROCmVersion_VERSION_FILE)
compute_rocm_version_dir()
else()
rocm_version_message(AUTHOR_WARNING "File does not exist: ${_F}")
endif()
endforeach()
endforeach()
if(EXISTS "${ROCmVersion_VERSION_FILE}")
set(_F "${ROCmVersion_VERSION_FILE}")
rocm_version_message(STATUS "Reading ${_F}...")
get_filename_component(_B "${_F}" NAME)
string(REPLACE "." "_" _B "${_B}")
string(REPLACE "-" "_" _B "${_B}")
rocm_version_read_file(${_F} ${_B})
foreach(_V ${ROCmVersion_VARIABLES})
set(_CACHE_VAR ROCmVersion_${_V}_VERSION)
set(_LOCAL_VAR ${_B}_${_V}_VERSION)
set(ROCmVersion_${_V}_VERSION
"${${_LOCAL_VAR}}"
CACHE STRING
"ROCm ${_V} version"
)
rocm_version_watch_for_change(${_CACHE_VAR})
endforeach()
endif()
endfunction()
# execute
rocm_version_parse_version_files()
include(FindPackageHandleStandardArgs)
find_package_handle_standard_args(
ROCmVersion
VERSION_VAR ROCmVersion_FULL_VERSION
REQUIRED_VARS
ROCmVersion_FULL_VERSION
ROCmVersion_TRIPLE_VERSION
ROCmVersion_DIR
ROCmVersion_VERSION_FILE
)
# don't add major/minor/patch/etc. version variables to required vars because they might
# be zero, which will cause CMake to evaluate it as not set
+297
@@ -0,0 +1,297 @@
# ======================================================================================================
# FindTBB.cmake
#
# Find TBB include directories and libraries.
#
# ----------------------------------------
#
# Use this module by invoking find_package with the form::
#
# find_package(TBB [major[.minor]] [EXACT]       # Minimum or EXACT version, e.g. 2018.6
#              [QUIET]
#              [REQUIRED]                        # Fail with error if TBB is not found
#              [[COMPONENTS] [components...]]    # Required components
#              [OPTIONAL_COMPONENTS components...])  # Optional components
#
# This module reads hints about search locations from variables::
#
# TBB_ROOT_DIR          - The base directory of the TBB installation.
# TBB_INCLUDE_DIR       - The directory that contains the TBB header files.
# TBB_LIBRARY           - The directory that contains the TBB library files.
# TBB_<library>_LIBRARY - The path of the corresponding TBB library. These libraries
#                         override the corresponding library search results.
# TBB_USE_DEBUG_BUILD   - Use the debug version of tbb libraries
#
# Environment variable aliases for TBB_ROOT_DIR:
#
# TBB_INSTALL_DIR TBBROOT LIBRARY_PATH
#
# This module will set the following variables:
#
# TBB_FOUND                     - If false or undefined, TBB was not found or is not wanted.
# TBB_<component>_FOUND         - If false, the optional <component> part of the TBB system
#                                 is not available.
# TBB_VERSION                   - The full version string
# TBB_VERSION_MAJOR             - The major version
# TBB_VERSION_MINOR             - The minor version
# TBB_INTERFACE_VERSION         - The interface version number defined in tbb/tbb_stddef.h.
# TBB_<library>_LIBRARY_RELEASE - The path of the TBB release version of <library>.
# TBB_<library>_LIBRARY_DEBUG   - The path of the TBB debug version of <library>.
#
# The following variables should be used to build and link with TBB:
#
# TBB_INCLUDE_DIRS        - The include directory for TBB.
# TBB_LIBRARY_DIRS        - The library directory for TBB.
# TBB_LIBRARIES           - The libraries to link against to use TBB.
# TBB_LIBRARIES_RELEASE   - The release libraries to link against to use TBB.
# TBB_LIBRARIES_DEBUG     - The debug libraries to link against to use TBB.
# TBB_DEFINITIONS         - Definitions to use when compiling code that uses TBB.
# TBB_DEFINITIONS_RELEASE - Definitions to use when compiling release code that uses TBB.
# TBB_DEFINITIONS_DEBUG   - Definitions to use when compiling debug code that uses TBB.
#
# This module will also create the "TBB" target that may be used when building executables
# and libraries.
#
# Based on the version by Justus Calvin - Copyright (c) 2015
#
# ======================================================================================================
if(TBB_FOUND)
return()
endif()
include(FindPackageHandleStandardArgs)
#
# Check the build type
#
if(NOT DEFINED TBB_USE_DEBUG_BUILD)
if(CMAKE_BUILD_TYPE MATCHES "(Debug|DEBUG|debug)")
set(TBB_BUILD_TYPE DEBUG)
else()
set(TBB_BUILD_TYPE RELEASE)
endif()
elseif(TBB_USE_DEBUG_BUILD)
set(TBB_BUILD_TYPE DEBUG)
else()
set(TBB_BUILD_TYPE RELEASE)
endif()
#
# Set the TBB search directories
#
# Define search paths based on user input and environment variables
set(TBB_SEARCH_DIR ${TBB_ROOT_DIR} $ENV{TBB_INSTALL_DIR} $ENV{TBBROOT})
# Define the search directories based on the current platform
if(CMAKE_SYSTEM_NAME STREQUAL "Windows")
set(TBB_DEFAULT_SEARCH_DIR
"C:/Program Files/Intel/TBB"
"C:/Program Files (x86)/Intel/TBB"
)
# Set the target architecture
if(CMAKE_SIZEOF_VOID_P EQUAL 8)
set(TBB_ARCHITECTURE "intel64")
else()
set(TBB_ARCHITECTURE "ia32")
endif()
# Set the TBB search library path search suffix based on the version of VC
if(WINDOWS_STORE)
set(TBB_LIB_PATH_SUFFIX "lib/${TBB_ARCHITECTURE}/vc11_ui")
elseif(MSVC14)
set(TBB_LIB_PATH_SUFFIX "lib/${TBB_ARCHITECTURE}/vc14")
elseif(MSVC12)
set(TBB_LIB_PATH_SUFFIX "lib/${TBB_ARCHITECTURE}/vc12")
elseif(MSVC11)
set(TBB_LIB_PATH_SUFFIX "lib/${TBB_ARCHITECTURE}/vc11")
elseif(MSVC10)
set(TBB_LIB_PATH_SUFFIX "lib/${TBB_ARCHITECTURE}/vc10")
endif()
# Add the library path search suffix for the VC independent version of TBB
list(APPEND TBB_LIB_PATH_SUFFIX "lib/${TBB_ARCHITECTURE}/vc_mt")
elseif(CMAKE_SYSTEM_NAME STREQUAL "Darwin")
# OS X
set(TBB_DEFAULT_SEARCH_DIR "/opt/intel/tbb")
# TODO: Check to see which C++ library is being used by the compiler.
if(NOT ${CMAKE_SYSTEM_VERSION} VERSION_LESS 13.0)
# The default C++ library on OS X 10.9 and later is libc++
set(TBB_LIB_PATH_SUFFIX "lib/libc++" "lib")
else()
set(TBB_LIB_PATH_SUFFIX "lib")
endif()
elseif(CMAKE_SYSTEM_NAME STREQUAL "Linux")
# Linux
set(TBB_DEFAULT_SEARCH_DIR "/opt/intel/tbb")
# TODO: Check compiler version to see whether the suffix should be <arch>/gcc4.1 or
# <arch>/gcc4.4. For now, assume that the compiler is gcc 4.4.x or later.
if(CMAKE_SYSTEM_PROCESSOR STREQUAL "x86_64")
set(TBB_LIB_PATH_SUFFIX "lib/intel64/gcc4.4")
elseif(CMAKE_SYSTEM_PROCESSOR MATCHES "^i.86$")
set(TBB_LIB_PATH_SUFFIX "lib/ia32/gcc4.4")
endif()
endif()
#
# Find the TBB include dir
#
find_path(
TBB_INCLUDE_DIRS
tbb/tbb.h
HINTS ${TBB_INCLUDE_DIRS} ${TBB_SEARCH_DIR}
PATHS ${TBB_DEFAULT_SEARCH_DIR}
PATH_SUFFIXES include
)
#
# Set version strings
#
if(TBB_INCLUDE_DIRS)
# Starting in 2020.1.1, tbb_stddef.h is replaced by version.h
set(_version_files
"${TBB_INCLUDE_DIRS}/tbb/tbb_stddef.h"
"${TBB_INCLUDE_DIRS}/tbb/version.h"
)
foreach(f IN ITEMS ${_version_files})
if(EXISTS ${f})
set(_version_file ${f})
endif()
endforeach()
unset(_version_files)
file(READ ${_version_file} _tbb_version_file)
string(
REGEX REPLACE
".*#define TBB_VERSION_MAJOR ([0-9]+).*"
"\\1"
TBB_VERSION_MAJOR
"${_tbb_version_file}"
)
string(
REGEX REPLACE
".*#define TBB_VERSION_MINOR ([0-9]+).*"
"\\1"
TBB_VERSION_MINOR
"${_tbb_version_file}"
)
string(
REGEX REPLACE
".*#define TBB_INTERFACE_VERSION ([0-9]+).*"
"\\1"
TBB_INTERFACE_VERSION
"${_tbb_version_file}"
)
# TBB_VERSION_MINOR isn't necessarily changed for minor releases. Hence, we need to
# read the engineering version in TBB_INTERFACE_VERSION to get the correct minor
# version
if("${TBB_VERSION_MINOR}" STREQUAL "0")
math(EXPR _tbb_iface_major_ver "${TBB_INTERFACE_VERSION} / 100")
math(
EXPR
TBB_VERSION_MINOR
"${TBB_INTERFACE_VERSION} - ${_tbb_iface_major_ver} * 100"
)
endif()
set(TBB_VERSION "${TBB_VERSION_MAJOR}.${TBB_VERSION_MINOR}")
endif()
#
# Find TBB components
#
if(TBB_VERSION VERSION_LESS 4.3)
set(TBB_SEARCH_COMPONENTS tbb_preview tbbmalloc tbb)
else()
set(TBB_SEARCH_COMPONENTS tbb_preview tbbmalloc_proxy tbbmalloc tbb)
endif()
set(TBB_LIBRARY_DIRS)
# Find each component
foreach(_comp ${TBB_SEARCH_COMPONENTS})
# message(STATUS "Searching for ${_comp}...")
# message(STATUS "Hints: ${TBB_LIBRARY} ${TBB_SEARCH_DIR}")
if(";${TBB_FIND_COMPONENTS};tbb;" MATCHES ";${_comp};")
# Search for the libraries
find_library(
TBB_${_comp}_LIBRARY_RELEASE
${_comp}
HINTS ${TBB_LIBRARY} ${TBB_SEARCH_DIR}
PATHS ${TBB_DEFAULT_SEARCH_DIR}
ENV LIBRARY_PATH
PATH_SUFFIXES ${TBB_LIB_PATH_SUFFIX} lib_release
)
find_library(
TBB_${_comp}_LIBRARY_DEBUG
${_comp}_debug
HINTS ${TBB_LIBRARY} ${TBB_SEARCH_DIR}
PATHS ${TBB_DEFAULT_SEARCH_DIR}
ENV LIBRARY_PATH
PATH_SUFFIXES ${TBB_LIB_PATH_SUFFIX} lib_debug
)
if(TBB_${_comp}_LIBRARY_DEBUG)
list(APPEND TBB_LIBRARIES_DEBUG "${TBB_${_comp}_LIBRARY_DEBUG}")
# message(STATUS "Found ${TBB_${_comp}_LIBRARY_DEBUG}")
endif()
if(TBB_${_comp}_LIBRARY_RELEASE)
list(APPEND TBB_LIBRARIES_RELEASE "${TBB_${_comp}_LIBRARY_RELEASE}")
# message(STATUS "Found ${TBB_${_comp}_LIBRARY_RELEASE}")
endif()
if(TBB_${_comp}_LIBRARY_${TBB_BUILD_TYPE} AND NOT TBB_${_comp}_LIBRARY)
set(TBB_${_comp}_LIBRARY "${TBB_${_comp}_LIBRARY_${TBB_BUILD_TYPE}}")
endif()
if(TBB_${_comp}_LIBRARY AND EXISTS "${TBB_${_comp}_LIBRARY}")
set(TBB_${_comp}_FOUND TRUE)
else()
set(TBB_${_comp}_FOUND FALSE)
endif()
# Mark internal variables as advanced
mark_as_advanced(TBB_${_comp}_LIBRARY_RELEASE)
mark_as_advanced(TBB_${_comp}_LIBRARY_DEBUG)
mark_as_advanced(TBB_${_comp}_LIBRARY)
# Save the directory names for each library component
if(TBB_USE_DEBUG_BUILD)
get_filename_component(_dir ${TBB_${_comp}_LIBRARY_DEBUG} DIRECTORY)
else()
get_filename_component(_dir ${TBB_${_comp}_LIBRARY_RELEASE} DIRECTORY)
endif()
list(APPEND TBB_LIBRARY_DIRS ${_dir})
endif()
endforeach()
#
# Set compile flags and libraries
#
set(TBB_DEFINITIONS_RELEASE "")
set(TBB_DEFINITIONS_DEBUG "-DTBB_USE_DEBUG=1")
if(TBB_LIBRARIES_${TBB_BUILD_TYPE})
set(TBB_DEFINITIONS "${TBB_DEFINITIONS_${TBB_BUILD_TYPE}}")
set(TBB_LIBRARIES "${TBB_LIBRARIES_${TBB_BUILD_TYPE}}")
elseif(TBB_LIBRARIES_RELEASE)
set(TBB_DEFINITIONS "${TBB_DEFINITIONS_RELEASE}")
set(TBB_LIBRARIES "${TBB_LIBRARIES_RELEASE}")
elseif(TBB_LIBRARIES_DEBUG)
set(TBB_DEFINITIONS "${TBB_DEFINITIONS_DEBUG}")
set(TBB_LIBRARIES "${TBB_LIBRARIES_DEBUG}")
endif()
find_package_handle_standard_args(
TBB
REQUIRED_VARS TBB_INCLUDE_DIRS TBB_LIBRARIES
HANDLE_COMPONENTS
VERSION_VAR TBB_VERSION
)
mark_as_advanced(TBB_INCLUDE_DIRS TBB_LIBRARIES TBB_LIBRARY_DIRS)
unset(TBB_ARCHITECTURE)
unset(TBB_BUILD_TYPE)
unset(TBB_LIB_PATH_SUFFIX)
unset(TBB_DEFAULT_SEARCH_DIR)
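#
# A minimal usage sketch based on the variables documented in the header, assuming
# this module is on CMAKE_MODULE_PATH; `my_app` is a hypothetical target of the
# consuming project and the 2018.6 minimum version is only the header's example:
#
# find_package(TBB 2018.6 REQUIRED COMPONENTS tbbmalloc)
# target_include_directories(my_app PRIVATE ${TBB_INCLUDE_DIRS})
# target_link_libraries(my_app PRIVATE ${TBB_LIBRARIES})
#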
+86
@@ -0,0 +1,86 @@
# Distributed under the OSI-approved BSD 3-Clause License. See accompanying file
# Copyright.txt or https://cmake.org/licensing for details.
include(FindPackageHandleStandardArgs)
# ----------------------------------------------------------------------------------------#
if(NOT ROCM_PATH AND NOT "$ENV{ROCM_PATH}" STREQUAL "")
set(ROCM_PATH "$ENV{ROCM_PATH}")
endif()
foreach(_DIR ${ROCmVersion_DIR} ${ROCM_PATH} /opt/rocm /opt/rocm/amd_smi)
if(EXISTS ${_DIR})
get_filename_component(_ABS_DIR "${_DIR}" REALPATH)
list(APPEND _AMD_SMI_PATHS ${_ABS_DIR})
endif()
endforeach()
# ----------------------------------------------------------------------------------------#
find_path(
amd-smi_ROOT_DIR
NAMES include/amd_smi/amdsmi.h
HINTS ${_AMD_SMI_PATHS}
PATHS ${_AMD_SMI_PATHS}
PATH_SUFFIXES amd_smi
)
mark_as_advanced(amd-smi_ROOT_DIR)
# ----------------------------------------------------------------------------------------#
find_path(
amd-smi_INCLUDE_DIR
NAMES amd_smi/amdsmi.h
HINTS ${amd-smi_ROOT_DIR} ${_AMD_SMI_PATHS}
PATHS ${amd-smi_ROOT_DIR} ${_AMD_SMI_PATHS}
PATH_SUFFIXES include amd_smi/include
)
mark_as_advanced(amd-smi_INCLUDE_DIR)
# ----------------------------------------------------------------------------------------#
find_library(
amd-smi_LIBRARY
NAMES amd_smi
HINTS ${amd-smi_ROOT_DIR} ${_AMD_SMI_PATHS}
PATHS ${amd-smi_ROOT_DIR} ${_AMD_SMI_PATHS}
PATH_SUFFIXES amd-smi/lib lib
)
if(amd-smi_LIBRARY)
get_filename_component(amd-smi_LIBRARY_DIR "${amd-smi_LIBRARY}" PATH CACHE)
endif()
mark_as_advanced(amd-smi_LIBRARY)
# ----------------------------------------------------------------------------------------#
find_package_handle_standard_args(
amd-smi
DEFAULT_MSG
amd-smi_ROOT_DIR
amd-smi_INCLUDE_DIR
amd-smi_LIBRARY
)
# ------------------------------------------------------------------------------#
if(amd-smi_FOUND)
add_library(amd-smi::amd-smi INTERFACE IMPORTED)
add_library(amd-smi::roctx INTERFACE IMPORTED)
set(amd-smi_INCLUDE_DIRS ${amd-smi_INCLUDE_DIR})
set(amd-smi_LIBRARIES ${amd-smi_LIBRARY})
set(amd-smi_LIBRARY_DIRS ${amd-smi_LIBRARY_DIR})
target_include_directories(amd-smi::amd-smi INTERFACE ${amd-smi_INCLUDE_DIR})
target_link_libraries(amd-smi::amd-smi INTERFACE ${amd-smi_LIBRARY})
endif()
# ------------------------------------------------------------------------------#
unset(_AMD_SMI_PATHS)
# ------------------------------------------------------------------------------#
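#
# A minimal usage sketch, assuming this module is on CMAKE_MODULE_PATH under a file
# name matching the amd-smi package name; `my_gpu_monitor` is a hypothetical target
# of the consuming project:
#
# find_package(amd-smi REQUIRED)
# target_link_libraries(my_gpu_monitor PRIVATE amd-smi::amd-smi)
#
# The imported target carries amd-smi_INCLUDE_DIR and amd-smi_LIBRARY as interface
# properties, so no separate include or link directories should be needed.
#
# ------------------------------------------------------------------------------#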
+388
@@ -0,0 +1,388 @@
# ======================================================================================
# PAPI.cmake
#
# Configure papi for rocprofiler-systems
#
# ======================================================================================
include_guard(GLOBAL)
rocprofiler_systems_checkout_git_submodule(
RELATIVE_PATH external/papi
WORKING_DIRECTORY ${PROJECT_SOURCE_DIR}
REPO_URL https://bitbucket.org/icl/papi.git
REPO_BRANCH effd1ef4e0fd4b80e36546791277215a2d6b9eba
TEST_FILE src/configure
)
set(PAPI_LIBPFM_SOVERSION "4.11.1" CACHE STRING "libpfm.so version")
set(ROCPROFSYS_PAPI_SOURCE_DIR ${PROJECT_BINARY_DIR}/external/papi/source)
set(ROCPROFSYS_PAPI_INSTALL_DIR ${PROJECT_BINARY_DIR}/external/papi/install)
if(NOT EXISTS "${ROCPROFSYS_PAPI_SOURCE_DIR}")
execute_process(
COMMAND
${CMAKE_COMMAND} -E copy_directory ${PROJECT_SOURCE_DIR}/external/papi
${ROCPROFSYS_PAPI_SOURCE_DIR}
)
endif()
if(NOT EXISTS "${ROCPROFSYS_PAPI_INSTALL_DIR}")
execute_process(
COMMAND ${CMAKE_COMMAND} -E make_directory ${ROCPROFSYS_PAPI_INSTALL_DIR}
)
execute_process(
COMMAND ${CMAKE_COMMAND} -E make_directory ${ROCPROFSYS_PAPI_INSTALL_DIR}/include
)
execute_process(
COMMAND ${CMAKE_COMMAND} -E make_directory ${ROCPROFSYS_PAPI_INSTALL_DIR}/lib
)
execute_process(
COMMAND
${CMAKE_COMMAND} -E touch ${ROCPROFSYS_PAPI_INSTALL_DIR}/lib/libpapi.a
${ROCPROFSYS_PAPI_INSTALL_DIR}/lib/libpfm.a
${ROCPROFSYS_PAPI_INSTALL_DIR}/lib/libpfm.so
)
set(_ROCPROFSYS_PAPI_BUILD_BYPRODUCTS
${ROCPROFSYS_PAPI_INSTALL_DIR}/lib/libpapi.a
${ROCPROFSYS_PAPI_INSTALL_DIR}/lib/libpfm.a
${ROCPROFSYS_PAPI_INSTALL_DIR}/lib/libpfm.so
)
endif()
# Set ROCPROFSYS_PAPI_CONFIGURE_JOBS to 1 for commands that must run serially
set(ROCPROFSYS_PAPI_CONFIGURE_JOBS 1)
rocprofiler_systems_add_option(ROCPROFSYS_PAPI_AUTO_COMPONENTS
"Automatically enable components" OFF
)
# -------------- PACKAGES -----------------------------------------------------
set(_ROCPROFSYS_VALID_PAPI_COMPONENTS
appio
bgpm
coretemp
coretemp_freebsd
cuda
emon
example
host_micpower
infiniband
intel_gpu
io
libmsr
lmsensors
lustre
micpower
mx
net
nvml
pcp
perfctr
perfctr_ppc
perf_event
perf_event_uncore
perfmon2
perfmon_ia64
perfnec
powercap
powercap_ppc
rapl
rocm
rocm_smi
sde
sensors_ppc
stealtime
sysdetect
vmware
)
set(ROCPROFSYS_VALID_PAPI_COMPONENTS
"${_ROCPROFSYS_VALID_PAPI_COMPONENTS}"
CACHE STRING
"Valid PAPI components"
)
mark_as_advanced(ROCPROFSYS_VALID_PAPI_COMPONENTS)
# default components which do not require 3rd-party headers or libraries
set(_ROCPROFSYS_PAPI_COMPONENTS
appio
coretemp
io
infiniband
# lustre micpower mx
net
perf_event
perf_event_uncore
# rapl stealtime
)
if(ROCPROFSYS_PAPI_AUTO_COMPONENTS)
# lmsensors
find_path(
ROCPROFSYS_PAPI_LMSENSORS_ROOT_DIR
NAMES include/sensors/sensors.h include/sensors.h
)
if(ROCPROFSYS_PAPI_LMSENSORS_ROOT_DIR)
list(APPEND _ROCPROFSYS_PAPI_COMPONENTS lmsensors)
endif()
# pcp
find_path(ROCPROFSYS_PAPI_PCP_ROOT_DIR NAMES include/pcp/impl.h)
find_library(ROCPROFSYS_PAPI_PCP_LIBRARY NAMES pcp PATH_SUFFIXES lib lib64)
if(ROCPROFSYS_PAPI_PCP_ROOT_DIR AND ROCPROFSYS_PAPI_PCP_LIBRARY)
list(APPEND _ROCPROFSYS_PAPI_COMPONENTS pcp)
endif()
endif()
# set the ROCPROFSYS_PAPI_COMPONENTS cache variable
set(ROCPROFSYS_PAPI_COMPONENTS
"${_ROCPROFSYS_PAPI_COMPONENTS}"
CACHE STRING
"PAPI components"
)
rocprofiler_systems_add_feature(ROCPROFSYS_PAPI_COMPONENTS "PAPI components")
string(REPLACE ";" "\ " _ROCPROFSYS_PAPI_COMPONENTS "${ROCPROFSYS_PAPI_COMPONENTS}")
set(ROCPROFSYS_PAPI_EXTRA_ENV)
foreach(_COMP ${ROCPROFSYS_PAPI_COMPONENTS})
string(
REPLACE
";"
", "
_ROCPROFSYS_VALID_PAPI_COMPONENTS_MSG
"${ROCPROFSYS_VALID_PAPI_COMPONENTS}"
)
if(NOT "${_COMP}" IN_LIST ROCPROFSYS_VALID_PAPI_COMPONENTS)
rocprofiler_systems_message(
AUTHOR_WARNING
"ROCPROFSYS_PAPI_COMPONENTS contains an unknown component '${_COMP}'. Known components: ${_ROCPROFSYS_VALID_PAPI_COMPONENTS_MSG}"
)
endif()
unset(_ROCPROFSYS_VALID_PAPI_COMPONENTS_MSG)
endforeach()
if("rocm" IN_LIST ROCPROFSYS_PAPI_COMPONENTS)
find_package(ROCmVersion REQUIRED)
list(APPEND ROCPROFSYS_PAPI_EXTRA_ENV PAPI_ROCM_ROOT=${ROCmVersion_DIR})
endif()
if("lmsensors" IN_LIST ROCPROFSYS_PAPI_COMPONENTS AND ROCPROFSYS_PAPI_LMSENSORS_ROOT_DIR)
list(
APPEND
ROCPROFSYS_PAPI_EXTRA_ENV
PAPI_LMSENSORS_ROOT=${ROCPROFSYS_PAPI_LMSENSORS_ROOT_DIR}
)
endif()
if("pcp" IN_LIST ROCPROFSYS_PAPI_COMPONENTS AND ROCPROFSYS_PAPI_PCP_ROOT_DIR)
list(APPEND ROCPROFSYS_PAPI_EXTRA_ENV PAPI_PCP_ROOT=${ROCPROFSYS_PAPI_PCP_ROOT_DIR})
endif()
if(
"perf_event_uncore" IN_LIST ROCPROFSYS_PAPI_COMPONENTS
AND NOT "perf_event" IN_LIST ROCPROFSYS_PAPI_COMPONENTS
)
rocprofiler_systems_message(
FATAL_ERROR
"ROCPROFSYS_PAPI_COMPONENTS :: 'perf_event_uncore' requires 'perf_event' component"
)
endif()
find_program(MAKE_EXECUTABLE NAMES make gmake PATH_SUFFIXES bin)
if(NOT MAKE_EXECUTABLE)
rocprofiler_systems_message(
FATAL_ERROR
"make/gmake executable not found. Please re-run with -DMAKE_EXECUTABLE=/path/to/make"
)
endif()
set(_PAPI_C_COMPILER ${CMAKE_C_COMPILER})
if(CMAKE_C_COMPILER_IS_CLANG)
find_program(ROCPROFSYS_GNU_C_COMPILER NAMES gcc)
if(ROCPROFSYS_GNU_C_COMPILER)
set(_PAPI_C_COMPILER ${ROCPROFSYS_GNU_C_COMPILER})
endif()
endif()
set(PAPI_C_COMPILER ${_PAPI_C_COMPILER} CACHE FILEPATH "C compiler used to compile PAPI")
include(ExternalProject)
ExternalProject_Add(
rocprofiler-systems-papi-build
PREFIX ${PROJECT_BINARY_DIR}/external/papi
SOURCE_DIR ${ROCPROFSYS_PAPI_SOURCE_DIR}/src
BUILD_IN_SOURCE 1
PATCH_COMMAND
${CMAKE_COMMAND} -E env CC=${PAPI_C_COMPILER}
CFLAGS=-fPIC\ -O3\ -Wno-stringop-truncation\ -Wno-use-after-free LIBS=-lrt
LDFLAGS=-lrt ${ROCPROFSYS_PAPI_EXTRA_ENV} <SOURCE_DIR>/configure --quiet
--prefix=${ROCPROFSYS_PAPI_INSTALL_DIR} --with-static-lib=yes --with-shared-lib=no
--with-perf-events --with-tests=no
--with-components=${_ROCPROFSYS_PAPI_COMPONENTS}
--libdir=${ROCPROFSYS_PAPI_INSTALL_DIR}/lib
CONFIGURE_COMMAND
${CMAKE_COMMAND} -E env
CFLAGS=-fPIC\ -O3\ -Wno-stringop-truncation\ -Wno-use-after-free
${ROCPROFSYS_PAPI_EXTRA_ENV} ${MAKE_EXECUTABLE} static install -s -j
${ROCPROFSYS_PAPI_CONFIGURE_JOBS}
BUILD_COMMAND
${CMAKE_COMMAND} -E env
CFLAGS=-fPIC\ -O3\ -Wno-stringop-truncation\ -Wno-use-after-free
${ROCPROFSYS_PAPI_EXTRA_ENV} ${MAKE_EXECUTABLE} utils install-utils -s
INSTALL_COMMAND ""
BUILD_BYPRODUCTS "${_ROCPROFSYS_PAPI_BUILD_BYPRODUCTS}"
)
# target for re-executing the installation
add_custom_target(
rocprofiler-systems-papi-install
COMMAND
${CMAKE_COMMAND} -E env
CFLAGS=-fPIC\ -O3\ -Wno-stringop-truncation\ -Wno-use-after-free
${ROCPROFSYS_PAPI_EXTRA_ENV} ${MAKE_EXECUTABLE} static install -s
COMMAND
${CMAKE_COMMAND} -E env
CFLAGS=-fPIC\ -O3\ -Wno-stringop-truncation\ -Wno-use-after-free
${ROCPROFSYS_PAPI_EXTRA_ENV} ${MAKE_EXECUTABLE} utils install-utils -s
WORKING_DIRECTORY ${ROCPROFSYS_PAPI_SOURCE_DIR}/src
COMMENT "Installing PAPI..."
)
add_custom_target(
rocprofiler-systems-papi-clean
COMMAND ${MAKE_EXECUTABLE} distclean
COMMAND ${CMAKE_COMMAND} -E rm -rf ${ROCPROFSYS_PAPI_INSTALL_DIR}/include/*
COMMAND ${CMAKE_COMMAND} -E rm -rf ${ROCPROFSYS_PAPI_INSTALL_DIR}/lib/*
COMMAND
${CMAKE_COMMAND} -E touch ${ROCPROFSYS_PAPI_INSTALL_DIR}/lib/libpapi.a
${ROCPROFSYS_PAPI_INSTALL_DIR}/lib/libpfm.a
${ROCPROFSYS_PAPI_INSTALL_DIR}/lib/libpfm.so
WORKING_DIRECTORY ${ROCPROFSYS_PAPI_SOURCE_DIR}/src
COMMENT "Cleaning PAPI..."
)
set(PAPI_ROOT_DIR
${ROCPROFSYS_PAPI_INSTALL_DIR}
CACHE PATH
"Root PAPI installation"
FORCE
)
set(PAPI_INCLUDE_DIR
${ROCPROFSYS_PAPI_INSTALL_DIR}/include
CACHE PATH
"PAPI include folder"
FORCE
)
set(PAPI_LIBRARY
${ROCPROFSYS_PAPI_INSTALL_DIR}/lib/libpapi.a
CACHE FILEPATH
"PAPI library"
FORCE
)
set(PAPI_pfm_LIBRARY
${ROCPROFSYS_PAPI_INSTALL_DIR}/lib/libpfm.so
CACHE FILEPATH
"PAPI pfm library"
FORCE
)
set(PAPI_STATIC_LIBRARY
${ROCPROFSYS_PAPI_INSTALL_DIR}/lib/libpapi.a
CACHE FILEPATH
"PAPI static library"
FORCE
)
set(PAPI_pfm_STATIC_LIBRARY
${ROCPROFSYS_PAPI_INSTALL_DIR}/lib/libpfm.a
CACHE FILEPATH
"PAPI pfm static library"
FORCE
)
target_include_directories(
rocprofiler-systems-papi
SYSTEM
INTERFACE $<BUILD_INTERFACE:${PAPI_INCLUDE_DIR}>
)
target_link_libraries(
rocprofiler-systems-papi
INTERFACE $<BUILD_INTERFACE:${PAPI_LIBRARY}> $<BUILD_INTERFACE:${PAPI_pfm_LIBRARY}>
)
rocprofiler_systems_target_compile_definitions(
rocprofiler-systems-papi INTERFACE ROCPROFSYS_USE_PAPI
$<BUILD_INTERFACE:TIMEMORY_USE_PAPI=1>
)
install(
DIRECTORY ${ROCPROFSYS_PAPI_INSTALL_DIR}/lib/
DESTINATION ${CMAKE_INSTALL_LIBDIR}/${PROJECT_NAME}
COMPONENT papi
FILES_MATCHING
PATTERN "*.so*"
)
foreach(
_UTIL_EXE
papi_avail
papi_clockres
papi_command_line
papi_component_avail
papi_cost
papi_decode
papi_error_codes
papi_event_chooser
papi_hardware_avail
papi_hl_output_writer.py
papi_mem_info
papi_multiplex_cost
papi_native_avail
papi_version
papi_xml_event_info
)
string(REPLACE "_" "-" _UTIL_EXE_INSTALL_NAME "${BINARY_NAME_PREFIX}-${_UTIL_EXE}")
# RPM installer on RedHat/RockyLinux throws error that #!/usr/bin/python should either
# be #!/usr/bin/python2 or #!/usr/bin/python3
if(_UTIL_EXE STREQUAL "papi_hl_output_writer.py")
file(
READ
"${PROJECT_BINARY_DIR}/external/papi/source/src/high-level/scripts/${_UTIL_EXE}"
_HL_OUTPUT_WRITER
)
string(
REPLACE
"#!/usr/bin/python\n"
"#!/usr/bin/python3\n"
_HL_OUTPUT_WRITER
"${_HL_OUTPUT_WRITER}"
)
file(MAKE_DIRECTORY "${ROCPROFSYS_PAPI_INSTALL_DIR}/bin")
file(
WRITE
"${ROCPROFSYS_PAPI_INSTALL_DIR}/bin/${_UTIL_EXE}3"
"${_HL_OUTPUT_WRITER}"
)
set(_UTIL_EXE "${_UTIL_EXE}3")
# python script file install to libexec
install(
PROGRAMS ${ROCPROFSYS_PAPI_INSTALL_DIR}/bin/${_UTIL_EXE}
DESTINATION ${CMAKE_INSTALL_LIBEXECDIR}/${PROJECT_NAME}
RENAME ${_UTIL_EXE_INSTALL_NAME}
COMPONENT papi
OPTIONAL
)
else()
# Binary files moved to bin
install(
PROGRAMS ${ROCPROFSYS_PAPI_INSTALL_DIR}/bin/${_UTIL_EXE}
DESTINATION ${CMAKE_INSTALL_LIBDIR}/${PROJECT_NAME}
RENAME ${_UTIL_EXE_INSTALL_NAME}
COMPONENT papi
OPTIONAL
)
endif()
endforeach()
File diff suppressed because it is too large
+302
@@ -0,0 +1,302 @@
# ======================================================================================
# Perfetto.cmake
#
# Configure perfetto for rocprofiler-systems
#
# ======================================================================================
include_guard(GLOBAL)
include(ExternalProject)
include(ProcessorCount)
# ---------------------------------------------------------------------------------------#
#
# executables and libraries
#
# ---------------------------------------------------------------------------------------#
find_program(ROCPROFSYS_COPY_EXECUTABLE NAMES cp PATH_SUFFIXES bin)
find_program(ROCPROFSYS_NINJA_EXECUTABLE NAMES ninja PATH_SUFFIXES bin)
mark_as_advanced(ROCPROFSYS_COPY_EXECUTABLE)
mark_as_advanced(ROCPROFSYS_NINJA_EXECUTABLE)
# ---------------------------------------------------------------------------------------#
#
# variables
#
# ---------------------------------------------------------------------------------------#
ProcessorCount(NUM_PROCS_REAL)
math(EXPR _NUM_THREADS "${NUM_PROCS_REAL} - (${NUM_PROCS_REAL} / 2)")
if(_NUM_THREADS GREATER 8)
set(_NUM_THREADS 8)
elseif(_NUM_THREADS LESS 1)
set(_NUM_THREADS 1)
endif()
set(ROCPROFSYS_PERFETTO_SOURCE_DIR ${PROJECT_BINARY_DIR}/external/perfetto/source)
set(ROCPROFSYS_PERFETTO_TOOLS_DIR ${PROJECT_BINARY_DIR}/external/perfetto/source/tools)
set(ROCPROFSYS_PERFETTO_BINARY_DIR
${PROJECT_BINARY_DIR}/external/perfetto/source/out/linux
)
set(ROCPROFSYS_PERFETTO_INSTALL_DIR
${PROJECT_BINARY_DIR}/external/perfetto/source/out/linux/stripped
)
set(ROCPROFSYS_PERFETTO_LINK_FLAGS
"-static-libgcc"
CACHE STRING
"Link flags for perfetto"
)
set(ROCPROFSYS_PERFETTO_BUILD_THREADS
${_NUM_THREADS}
CACHE STRING
"Number of threads to use when building perfetto tools"
)
if(CMAKE_CXX_COMPILER_IS_CLANG)
set(PERFETTO_IS_CLANG true)
set(ROCPROFSYS_PERFETTO_C_FLAGS "" CACHE STRING "Perfetto C flags")
set(ROCPROFSYS_PERFETTO_CXX_FLAGS "" CACHE STRING "Perfetto C++ flags")
else()
set(PERFETTO_IS_CLANG false)
set(ROCPROFSYS_PERFETTO_C_FLAGS
"-static-libgcc -Wno-maybe-uninitialized -Wno-stringop-overflow"
CACHE STRING
"Perfetto C flags"
)
set(ROCPROFSYS_PERFETTO_CXX_FLAGS
"-static-libgcc -Wno-maybe-uninitialized -Wno-stringop-overflow -Wno-mismatched-new-delete"
CACHE STRING
"Perfetto C++ flags"
)
endif()
mark_as_advanced(ROCPROFSYS_PERFETTO_C_FLAGS)
mark_as_advanced(ROCPROFSYS_PERFETTO_CXX_FLAGS)
mark_as_advanced(ROCPROFSYS_PERFETTO_LINK_FLAGS)
if(NOT ROCPROFSYS_NINJA_EXECUTABLE)
set(ROCPROFSYS_NINJA_EXECUTABLE
${ROCPROFSYS_PERFETTO_TOOLS_DIR}/ninja
CACHE FILEPATH
"Ninja"
FORCE
)
endif()
# ---------------------------------------------------------------------------------------#
#
# source tree
#
# ---------------------------------------------------------------------------------------#
if(NOT EXISTS "${ROCPROFSYS_PERFETTO_SOURCE_DIR}")
execute_process(
COMMAND ${CMAKE_COMMAND} -E make_directory ${PROJECT_BINARY_DIR}/external/perfetto
)
# cmake -E copy_directory fails for some reason
execute_process(
COMMAND
${ROCPROFSYS_COPY_EXECUTABLE} -r ${PROJECT_SOURCE_DIR}/external/perfetto/
${ROCPROFSYS_PERFETTO_SOURCE_DIR}
)
endif()
file(READ ${PROJECT_SOURCE_DIR}/external/perfetto/sdk/perfetto.h _PERFETTO_HEADER)
string(
REGEX REPLACE
" perfetto::internal::ValidateEventNameType"
" ::perfetto::internal::ValidateEventNameType"
_PERFETTO_HEADER
"${_PERFETTO_HEADER}"
)
if(ROCPROFSYS_USE_SANITIZER AND ROCPROFSYS_SANITIZER_TYPE MATCHES "address")
string(
REPLACE
"__asan_poison_memory_region((a), (s))"
""
_PERFETTO_HEADER
"${_PERFETTO_HEADER}"
)
string(
REPLACE
"__asan_unpoison_memory_region((a), (s))"
""
_PERFETTO_HEADER
"${_PERFETTO_HEADER}"
)
endif()
file(WRITE ${ROCPROFSYS_PERFETTO_SOURCE_DIR}/sdk/perfetto.h.tmp "${_PERFETTO_HEADER}")
configure_file(
${ROCPROFSYS_PERFETTO_SOURCE_DIR}/sdk/perfetto.h.tmp
${ROCPROFSYS_PERFETTO_SOURCE_DIR}/sdk/perfetto.h
COPYONLY
)
configure_file(
${PROJECT_SOURCE_DIR}/external/perfetto/sdk/perfetto.cc
${ROCPROFSYS_PERFETTO_SOURCE_DIR}/sdk/perfetto.cc
COPYONLY
)
configure_file(
${PROJECT_SOURCE_DIR}/cmake/Templates/args.gn.in
${ROCPROFSYS_PERFETTO_BINARY_DIR}/args.gn
@ONLY
)
# ---------------------------------------------------------------------------------------#
#
# build tools
#
# ---------------------------------------------------------------------------------------#
if(ROCPROFSYS_INSTALL_PERFETTO_TOOLS)
find_program(ROCPROFSYS_CURL_EXECUTABLE NAMES curl PATH_SUFFIXES bin)
if(NOT ROCPROFSYS_CURL_EXECUTABLE)
rocprofiler_systems_message(
SEND_ERROR
"curl executable cannot be found. install-build-deps script for perfetto will fail"
)
endif()
ExternalProject_Add(
rocprofiler-systems-perfetto-build
PREFIX ${PROJECT_BINARY_DIR}/external/perfetto
SOURCE_DIR ${ROCPROFSYS_PERFETTO_SOURCE_DIR}
BUILD_IN_SOURCE 1
PATCH_COMMAND ${ROCPROFSYS_PERFETTO_TOOLS_DIR}/install-build-deps
CONFIGURE_COMMAND
${ROCPROFSYS_PERFETTO_TOOLS_DIR}/gn gen ${ROCPROFSYS_PERFETTO_BINARY_DIR}
BUILD_COMMAND
${ROCPROFSYS_NINJA_EXECUTABLE} -C ${ROCPROFSYS_PERFETTO_BINARY_DIR} -j
${ROCPROFSYS_PERFETTO_BUILD_THREADS}
INSTALL_COMMAND ""
BUILD_BYPRODUCTS ${ROCPROFSYS_PERFETTO_BINARY_DIR}/args.gn
)
add_custom_target(
rocprofiler-systems-perfetto-clean
COMMAND ${ROCPROFSYS_NINJA_EXECUTABLE} -t clean
COMMAND
${CMAKE_COMMAND} -E rm -rf
${PROJECT_BINARY_DIR}/external/perfetto/src/rocprofiler-systems-perfetto-build-stamp
WORKING_DIRECTORY ${ROCPROFSYS_PERFETTO_BINARY_DIR}
COMMENT "Cleaning Perfetto..."
)
install(
DIRECTORY ${ROCPROFSYS_PERFETTO_INSTALL_DIR}/
DESTINATION ${CMAKE_INSTALL_LIBDIR}/${PROJECT_NAME}
COMPONENT perfetto
FILES_MATCHING
PATTERN "*libperfetto.so*"
)
foreach(
_FILE
perfetto
traced
tracebox
traced_probes
traced_perf
trigger_perfetto
)
if("${_FILE}" STREQUAL "perfetto")
string(REPLACE "_" "-" _INSTALL_FILE "rocprof-sys-${_FILE}")
else()
string(REPLACE "_" "-" _INSTALL_FILE "rocprof-sys-perfetto-${_FILE}")
endif()
install(
PROGRAMS ${ROCPROFSYS_PERFETTO_INSTALL_DIR}/${_FILE}
DESTINATION ${CMAKE_INSTALL_BINDIR}
COMPONENT perfetto
RENAME ${_INSTALL_FILE}
OPTIONAL
)
endforeach()
endif()
# ---------------------------------------------------------------------------------------#
#
# perfetto static library
#
# ---------------------------------------------------------------------------------------#
add_library(rocprofiler-systems-perfetto-library STATIC)
add_library(
rocprofiler-systems::rocprofiler-systems-perfetto-library
ALIAS rocprofiler-systems-perfetto-library
)
target_sources(
rocprofiler-systems-perfetto-library
PRIVATE
${ROCPROFSYS_PERFETTO_SOURCE_DIR}/sdk/perfetto.cc
${ROCPROFSYS_PERFETTO_SOURCE_DIR}/sdk/perfetto.h
)
target_link_libraries(
rocprofiler-systems-perfetto-library
PRIVATE
rocprofiler-systems::rocprofiler-systems-threading
rocprofiler-systems::rocprofiler-systems-static-libgcc
rocprofiler-systems::rocprofiler-systems-static-libstdcxx
)
set_target_properties(
rocprofiler-systems-perfetto-library
PROPERTIES
OUTPUT_NAME perfetto
ARCHIVE_OUTPUT_DIRECTORY ${ROCPROFSYS_PERFETTO_BINARY_DIR}
POSITION_INDEPENDENT_CODE ON
CXX_VISIBILITY_PRESET "internal"
)
set(perfetto_DIR ${ROCPROFSYS_PERFETTO_SOURCE_DIR})
set(PERFETTO_ROOT_DIR
${ROCPROFSYS_PERFETTO_SOURCE_DIR}
CACHE PATH
"Root Perfetto installation"
FORCE
)
set(PERFETTO_INCLUDE_DIR
${ROCPROFSYS_PERFETTO_SOURCE_DIR}/sdk
CACHE PATH
"Perfetto include folder"
FORCE
)
set(PERFETTO_LIBRARY
${ROCPROFSYS_PERFETTO_BINARY_DIR}/${CMAKE_STATIC_LIBRARY_PREFIX}perfetto${CMAKE_STATIC_LIBRARY_SUFFIX}
CACHE FILEPATH
"Perfetto library"
FORCE
)
mark_as_advanced(PERFETTO_ROOT_DIR)
mark_as_advanced(PERFETTO_INCLUDE_DIR)
mark_as_advanced(PERFETTO_LIBRARY)
# ---------------------------------------------------------------------------------------#
#
# perfetto interface library
#
# ---------------------------------------------------------------------------------------#
rocprofiler_systems_target_compile_definitions(rocprofiler-systems-perfetto
INTERFACE ROCPROFSYS_USE_PERFETTO
)
target_include_directories(
rocprofiler-systems-perfetto
SYSTEM
INTERFACE $<BUILD_INTERFACE:${PERFETTO_INCLUDE_DIR}>
)
target_link_libraries(
rocprofiler-systems-perfetto
INTERFACE
$<BUILD_INTERFACE:${PERFETTO_LIBRARY}>
$<BUILD_INTERFACE:rocprofiler-systems::rocprofiler-systems-threading>
)
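The default for `ROCPROFSYS_PERFETTO_BUILD_THREADS` in Perfetto.cmake above is the processor count minus half of it (integer division), clamped to the range 1 to 8. A minimal Python sketch of that arithmetic, for checking the default on a given machine; the helper name `default_build_threads` is hypothetical and not part of the build:

```python
def default_build_threads(num_procs: int) -> int:
    """Mirror the CMake expression NUM_PROCS_REAL - (NUM_PROCS_REAL / 2),
    then clamp to [1, 8] as the if/elseif block does."""
    threads = num_procs - num_procs // 2  # integer division, as in math(EXPR ...)
    return max(1, min(8, threads))


if __name__ == "__main__":
    for n in (0, 1, 4, 5, 16, 64):
        print(n, "->", default_build_threads(n))
```

A processor count of 0 is included because `ProcessorCount` can report 0 when detection fails; the lower clamp then yields one thread.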
+28
@@ -0,0 +1,28 @@
# Set build arguments here. See `gn help buildargs`.
cc = "@CMAKE_C_COMPILER@"
cxx = "@CMAKE_CXX_COMPILER@"
is_debug = false
is_clang = @PERFETTO_IS_CLANG@
is_hermetic_clang = false
enable_perfetto_benchmarks = false
enable_perfetto_integration_tests = true
enable_perfetto_unittests = false
enable_perfetto_fuzzers = false
# enable_perfetto_stderr_crash_dump = false
enable_perfetto_heapprofd = false
enable_perfetto_tools = false
enable_perfetto_trace_processor = true
enable_perfetto_trace_processor_httpd = true
enable_perfetto_trace_processor_json = false
enable_perfetto_trace_processor_linenoise = false
enable_perfetto_trace_processor_percentile = false
enable_perfetto_trace_processor_sqlite = true
enable_perfetto_ui = false
extra_cflags = "@ROCPROFSYS_PERFETTO_C_FLAGS@"
extra_cxxflags = "@ROCPROFSYS_PERFETTO_CXX_FLAGS@"
extra_ldflags = "@ROCPROFSYS_PERFETTO_LINK_FLAGS@ -Wl,-rpath=\\\$ORIGIN:\\\$ORIGIN/../lib:\\\$ORIGIN/../lib/rocprof-sys"
+17
@@ -0,0 +1,17 @@
#!/usr/bin/env bash
export PYTHONPATH=$(cd $(dirname ${BASH_SOURCE[0]})/../@CMAKE_INSTALL_PYTHONDIR@ && pwd):${PYTHONPATH}
: ${PYTHON_EXECUTABLE:=@PYTHON_EXECUTABLE@}
if [ ! -f ${PYTHON_EXECUTABLE} ]; then PYTHON_EXECUTABLE=$(basename ${PYTHON_EXECUTABLE}); fi
set -e
run-script()
{
echo -e "\n##### ${PROJECT_NAME} :: executing '${@}'... #####\n"
eval $@
}
run-script ${PYTHON_EXECUTABLE} -m @SCRIPT_SUBMODULE@ "$(printf ' %q' "$@")"
+17
@@ -0,0 +1,17 @@
#%Module1.0
module-whatis "@PROJECT_NAME@ (version @PROJECT_VERSION@)"
proc ModulesHelp { } {
puts stderr "Loads @PROJECT_NAME@ v@PROJECT_VERSION@"
}
set ROOT [file normalize [file dirname [file normalize ${ModulesCurrentModulefile}]]/../../..]
setenv @PROJECT_NAME_UNDERSCORED@_ROOT "${ROOT}"
prepend-path CMAKE_PREFIX_PATH "${ROOT}"
prepend-path PATH "${ROOT}/bin"
prepend-path PATH "${ROOT}/@CMAKE_INSTALL_LIBEXECDIR@/@PROJECT_NAME@"
prepend-path LD_LIBRARY_PATH "${ROOT}/@CMAKE_INSTALL_LIBDIR@"
prepend-path PYTHONPATH "${ROOT}/@CMAKE_INSTALL_PYTHONDIR@"
setenv @PROJECT_NAME_UNDERSCORED@_DIR "${ROOT}/@CMAKE_INSTALL_DATAROOTDIR@/cmake/@PROJECT_NAME@"
+56
@@ -0,0 +1,56 @@
# - Config file for @PROJECT_NAME@ and its component libraries
# It defines the following variables:
#
# @PROJECT_NAME@_INCLUDE_DIRS
# @PROJECT_NAME@_LIBRARIES
# @PROJECT_NAME@_INTERNAL_DEFINES - used by the test suite
# compute paths
get_filename_component(@PROJECT_NAME@_CMAKE_DIR "${CMAKE_CURRENT_LIST_FILE}" PATH)
# version
include(${CMAKE_CURRENT_LIST_DIR}/@PROJECT_NAME@-version.cmake)
@PACKAGE_INIT@
set_and_check(@PROJECT_NAME@_INCLUDE_DIR "@PACKAGE_INCLUDE_INSTALL_DIR@")
set_and_check(@PROJECT_NAME@_LIB_DIR "@PACKAGE_LIB_INSTALL_DIR@")
get_filename_component(@PROJECT_NAME@_ROOT_DIR ${@PROJECT_NAME@_INCLUDE_DIR} PATH)
set(@PROJECT_NAME@_LIBRARIES)
add_library(@PROJECT_NAME@::@PROJECT_NAME@ INTERFACE IMPORTED)
include("${@PROJECT_NAME@_CMAKE_DIR}/@PROJECT_NAME@-library-targets.cmake")
# Library dependencies
foreach(TARG @PROJECT_BUILD_TARGETS@)
set(TARG @PROJECT_NAME@-${TARG}-library)
if(NOT @PROJECT_NAME@_FIND_COMPONENTS)
list(APPEND @PROJECT_NAME@_LIBRARIES @PROJECT_NAME@::${TARG})
target_link_libraries(@PROJECT_NAME@::@PROJECT_NAME@
INTERFACE @PROJECT_NAME@::${TARG})
endif()
endforeach()
if(@PROJECT_NAME@_FIND_COMPONENTS)
foreach(COMP ${@PROJECT_NAME@_FIND_COMPONENTS})
set(TARG @PROJECT_NAME@::@PROJECT_NAME@-${COMP}-library)
if(TARGET ${TARG})
set(@PROJECT_NAME@_${COMP}_FOUND 1)
list(APPEND @PROJECT_NAME@_LIBRARIES ${TARG})
target_link_libraries(@PROJECT_NAME@::@PROJECT_NAME@
INTERFACE ${TARG})
else()
set(@PROJECT_NAME@_${COMP}_FOUND 0)
endif()
endforeach()
endif()
include(FindPackageHandleStandardArgs)
find_package_handle_standard_args(
@PROJECT_NAME@
FOUND_VAR @PROJECT_NAME@_FOUND
REQUIRED_VARS @PROJECT_NAME@_ROOT_DIR @PROJECT_NAME@_INCLUDE_DIR @PROJECT_NAME@_LIBRARIES
VERSION_VAR @PROJECT_NAME@_VERSION
HANDLE_COMPONENTS)
+294
@@ -0,0 +1,294 @@
#!/usr/bin/env python3
import os
import re
import sys
import stat
import argparse
import tempfile
import subprocess as sp
from urllib import request
from urllib.error import HTTPError
rocprofsys_version = "@ROCPROFSYS_VERSION@"
rocprofsys_git_tag = "@ROCPROFSYS_GIT_TAG@"
_rocm_path = os.environ.get("ROCM_PATH", "/opt/rocm")
_rocm_version = None


def get_rocm_version(rocm_hint):
    global _rocm_path
    global _rocm_version

    if rocm_hint is not None and rocm_hint is not True:
        if rocm_hint.replace(".", "0").isnumeric():
            _rocm_version = rocm_hint
        else:
            _rocm_path = rocm_hint

    def _parse_version(_v):
        return re.split(r"[\\.-]", _v) if _v is not None else None

    _version = _parse_version(_rocm_version)
    for fname in [
        "version",
        "version-dev",
        "version-hip-libraries",
        "version-hiprt",
        "version-hiprt-devel",
        "version-hip-sdk",
        "version-libs",
        "version-utils",
    ]:
        if _version is not None and len(_version) > 0:
            break
        _fname = os.path.join(_rocm_path, ".info", fname)
        if os.path.exists(_fname):
            with open(_fname, "r") as f:
                _version = _parse_version(f.readlines()[0].strip("\n"))

    if _version is not None and len(_version) > 0:
        _major = int(_version[0])
        _minor = int(_version[1]) if len(_version) >= 2 else 0
        _rocm_version = f"{_major}.{_minor}"
        return "-ROCm-{}".format((10000 * _major) + (100 * _minor))

    return None


def get_os_info(os_distrib, os_version):
    _os_info = {}
    with open("/etc/os-release", "r") as f:
        for line in [_v.strip() for _v in f.readlines()]:
            if "=" not in line:
                continue
            _key, _data = line.split("=", 1)
            _os_info[_key] = _data.strip('"')

    def _parse_version(_v):
        _version = re.split(r"[\\.-]", _v)
        return (
            "{}.{}".format(_version[0], _version[1])
            if len(_version) > 1
            else "{}".format(_version[0])
        )

    if os_distrib is None or os_distrib == "auto":
        if "ubuntu" in _os_info["ID"]:
            os_distrib = "ubuntu"
        elif "opensuse" in _os_info["ID"]:
            os_distrib = "opensuse"
        elif "rhel" in _os_info["ID"]:
            os_distrib = "rhel"
        elif "centos" in _os_info["ID"]:
            os_distrib = "rhel"
        elif "rockylinux" in _os_info["ID"]:
            os_distrib = "rhel"
        elif "debian" in _os_info["ID"]:
            os_distrib = "ubuntu"
            if "debian" in _os_info["ID"] and os_version is None:
                _debian_version = float(_parse_version(_os_info["VERSION_ID"]))
                if _debian_version >= 11.0:
                    os_version = "20.04"
                else:
                    os_version = "18.04"
        elif "fedora" in _os_info["ID"]:
            os_distrib = "rhel"
            # fedora has different versioning system so fallback to 8.7
            if os_version is None:
                os_version = "8.7"
        else:
            # if we don't have an exact match, check ID_LIKE
            if "ID_LIKE" not in _os_info.keys():
                _os_info["ID_LIKE"] = _os_info["ID"]
            if "debian" in _os_info["ID_LIKE"]:
                os_distrib = "ubuntu"
                if os_version is None:
                    # fallback on 20.04 if ID is not ubuntu but debian-like
                    os_version = "20.04"
            elif "suse" in _os_info["ID_LIKE"]:
                os_distrib = "opensuse"
                # fallback on 15.3 if ID is not opensuse but suse-like
                if os_version is None:
                    os_version = "15.3"
            elif "rhel" in _os_info["ID_LIKE"] or "centos" in _os_info["ID_LIKE"]:
                os_distrib = "rhel"
                if os_version is None:
                    os_version = "8.7"
            else:
                raise RuntimeError(
                    "Unknown ID_LIKE value in /etc/os-release: {}".format(
                        _os_info["ID_LIKE"]
                    )
                )
    elif os_distrib == "centos":
        os_distrib = "rhel"
        # uses same versioning system
    elif os_distrib == "fedora":
        os_distrib = "rhel"
        if os_version is None:
            # fedora has different versioning system so fallback to 8.7
            os_version = "8.7"

    if os_version is None:
        os_version = _parse_version(_os_info["VERSION_ID"])

    return (os_distrib, os_version)


def print_log(*args, **kwargs):
    sys.stdout.flush()
    sys.stderr.flush()
    sys.stderr.write("### ")
    sys.stderr.write(*args, **kwargs)
    sys.stderr.write("\n")
    sys.stderr.flush()


def run(*args, **kwargs):
    print_log("Executing: {}\n".format(" ".join(*args)))
    sp.run(*args, **kwargs, check=True)
    sys.stderr.write("\n")


if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--version",
        help="Print the ROCm Systems Profiler version that will be installed",
        action="store_true",
    )
    parser.add_argument(
        "-p", "--prefix", help="Installation prefix", type=str, default="/opt/rocprofiler-systems"
    )
    parser.add_argument(
        "-i",
        "--interactive",
        help="Prompt to accept the license and include/exclude subdirectory",
        action="store_true",
    )
    parser.add_argument(
        "-D",
        "--download-path",
        help="Download directory (default: temporary directory)",
        type=str,
        default=None,
    )
    parser.add_argument(
        "-d",
        "--os-distrib",
        help="Target OS distribution",
        type=str,
        default=None,
        choices=("auto", "ubuntu", "opensuse", "rhel", "centos", "fedora"),
    )
    parser.add_argument(
        "-v", "--os-version", help="Target OS version", type=str, default=None
    )
    parser.add_argument(
        "-k",
        "--keep-download",
        help="Do not delete the downloaded file after installation",
        action="store_true",
    )
    parser.add_argument(
        "--rocm",
        help="Install ROCm Systems Profiler with ROCm support. Accepts either a ROCm version (e.g. '6.2') or the root path to the ROCm install containing .info/version* file(s) (e.g. /opt/rocm if /opt/rocm/.info/version exists). If no argument is provided, an attempt is made to deduce the ROCm version from $ENV{ROCM_PATH}/.info/version",
        nargs="?",
        default=None,
        const=True,
        metavar="VERSION or ROCM_PATH with .info/version file(s)",
    )
    # right now, the only valid set of extensions is: papi + ompt + python3
    # in the future, this might change, e.g. MPI variants
    parser.add_argument(
        "-e",
        "--extensions",
        help="ROCm Systems Profiler extensions, e.g. PAPI, OMPT, and Python3",
        nargs="*",
        default=("papi", "ompt", "python3"),
        choices=("papi", "ompt", "python3"),
    )

    args = parser.parse_args()

    if args.version:
        print(f"ROCm Systems Profiler {rocprofsys_version}")
        sys.exit(0)

    os_distrib, os_version = get_os_info(args.os_distrib, args.os_version)
    rocm_version = get_rocm_version(args.rocm) if args.rocm is not None else ""

    extensions = ""
    if "papi" in args.extensions:
        extensions += "-PAPI"
    if "ompt" in args.extensions:
        extensions += "-OMPT"
    if "python3" in args.extensions:
        extensions += "-Python3"

    if rocm_version is None:
        raise RuntimeError(
            f"Error! ROCm version could not be determined from {_rocm_path}/.info/version*. Please provide a ROCm version or the root path to the ROCm install containing the .info directory, e.g. '--rocm 5.4' or '--rocm /path/to/rocm/install'"
        )

    script = f"rocprofiler-systems-{rocprofsys_version}-{os_distrib}-{os_version}{rocm_version}{extensions}.sh"
    url = f"https://github.com/ROCm/rocprofiler-systems/releases/download/{rocprofsys_git_tag}/{script}"
    download_dir = (
        tempfile.mkdtemp(prefix="rocprof-sys-install-")
        if args.download_path is None
        else args.download_path
    )
    install_script = os.path.join(download_dir, script)

    try:
        if not os.path.exists(download_dir):
            print_log(f"Creating download directory: {download_dir} ...")
            os.makedirs(download_dir)

        print_log(f"Downloading {url} ...")
        try:
            response = request.urlretrieve(url, install_script)
        except HTTPError as e:
            print_log("")
            print_log(f"Error: {e}")
            print_log("")
            print_log(f"Error: Installer script download from {url} failed!")
            if args.rocm is not None:
                print_log(
                    f"There may not be a pre-built installer for ROCm version {_rocm_version}"
                )
            sys.exit(-1)

        if os.path.exists(install_script):
            print_log(f"Download completed: {install_script}")
        else:
            raise RuntimeError(f"Download completed but {install_script} does not exist")

        os.chmod(install_script, stat.S_IRWXU)

        if not os.path.exists(args.prefix):
            print_log(f"Creating directory: {args.prefix} ...")
            os.makedirs(args.prefix)

        install_args = (
            ["--exclude-subdir", "--skip-license"] if not args.interactive else []
        )

        print_log(f"Installing ROCm Systems Profiler to {args.prefix} ...")
        run([install_script, f"--prefix={args.prefix}"] + install_args)

        print_log(
            f"ROCm Systems Profiler v{rocprofsys_version} installation to {args.prefix} succeeded!"
        )
    finally:
        if not args.keep_download:
            # the script may be missing if the download itself failed
            if os.path.exists(install_script):
                print_log(f"Removing install script {install_script} ...")
                os.remove(install_script)
            # remove the directory if it is a temporary directory
            if args.download_path is None:
                print_log(f"Removing temporary directory {download_dir} ...")
                os.rmdir(download_dir)
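The installer file name assembled by the script above follows a fixed pattern: project version, OS distribution and version, an optional `-ROCm-<N>` tag where N is `10000 * major + 100 * minor`, and the extension suffixes. A minimal standalone sketch of that naming scheme; the helper names `rocm_suffix` and `installer_name` and the version `1.0.0` are placeholders for illustration, not values taken from a release:

```python
import re


def rocm_suffix(version: str) -> str:
    """Encode a ROCm version such as '6.2' the way the install script does."""
    parts = re.split(r"[.-]", version)  # split on '.' or '-'
    major = int(parts[0])
    minor = int(parts[1]) if len(parts) >= 2 else 0
    return "-ROCm-{}".format(10000 * major + 100 * minor)


def installer_name(version, distrib, os_version, rocm=None,
                   extensions=("papi", "ompt", "python3")):
    """Compose the self-extracting installer file name."""
    labels = (("papi", "-PAPI"), ("ompt", "-OMPT"), ("python3", "-Python3"))
    ext = "".join(label for key, label in labels if key in extensions)
    rocm_part = rocm_suffix(rocm) if rocm else ""
    return f"rocprofiler-systems-{version}-{distrib}-{os_version}{rocm_part}{ext}.sh"


if __name__ == "__main__":
    # '1.0.0' is a placeholder version for illustration
    print(installer_name("1.0.0", "ubuntu", "22.04", rocm="6.2"))
```

Whether an installer with a given name was actually published depends on the release; the script above reports a download failure when it was not.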
+29
@@ -0,0 +1,29 @@
#!/usr/bin/env bash
if [ -z "$BASH_SOURCE" ]; then
# If not running bash, try to obtain directory with $0
BASEDIR="$( cd "$(dirname "$0")"; pwd -P )"
else
BASEDIR=$(dirname ${BASH_SOURCE[0]})
fi
command -v realpath &> /dev/null && BASEDIR=$(realpath ${BASEDIR}/../..) || BASEDIR=$(cd ${BASEDIR}/../.. && pwd)
if [ ! -d "${BASEDIR}" ]; then
echo "${BASEDIR} does not exist"
return 1
fi
@PROJECT_NAME_UNDERSCORED@_ROOT=${BASEDIR}
PATH=${BASEDIR}/bin:${PATH}
PATH=${BASEDIR}/@CMAKE_INSTALL_LIBEXECDIR@/@PROJECT_NAME@:${PATH}
LD_LIBRARY_PATH=${BASEDIR}/@CMAKE_INSTALL_LIBDIR@:${LD_LIBRARY_PATH}
PYTHONPATH=${BASEDIR}/@CMAKE_INSTALL_PYTHONDIR@:${PYTHONPATH}
CMAKE_PREFIX_PATH=${BASEDIR}:${CMAKE_PREFIX_PATH}
@PROJECT_NAME_UNDERSCORED@_DIR=${BASEDIR}/@CMAKE_INSTALL_DATAROOTDIR@/cmake/@PROJECT_NAME@
export @PROJECT_NAME_UNDERSCORED@_ROOT
export PATH
export LD_LIBRARY_PATH
export PYTHONPATH
export CMAKE_PREFIX_PATH
export @PROJECT_NAME_UNDERSCORED@_DIR
+1
@@ -0,0 +1 @@
/dyninst-source
+65
@@ -0,0 +1,65 @@
ARG DISTRO=opensuse/leap
ARG VERSION=15.5
FROM ${DISTRO}:${VERSION}
ENV HOME /root
ENV SHELL /bin/bash
ENV BASH_ENV /etc/bash.bashrc
ENV DEBIAN_FRONTEND noninteractive
WORKDIR /tmp
SHELL [ "/bin/bash", "-c" ]
ENV PATH /usr/local/bin:${PATH}
ENV LIBRARY_PATH ${LIBRARY_PATH}:/opt/amdgpu/lib64
RUN set +e; \
zypper --non-interactive -i --gpg-auto-import-keys refresh; \
zypper --non-interactive -i patch; \
zypper --non-interactive -i patch; \
zypper --non-interactive -i --gpg-auto-import-keys refresh; \
exit 0
RUN zypper --non-interactive update -y && \
zypper --non-interactive dist-upgrade -y && \
zypper --non-interactive install -y -t pattern devel_basis && \
zypper --non-interactive install -y binutils-gold chrpath cmake curl dpkg-devel \
gcc-c++ git iproute2 libdrm-devel libnuma-devel openmpi3-devel python3-pip rpm-build \
sqlite3-devel wget && \
python3 -m pip install 'cmake==3.21'
ARG ROCM_VERSION=0.0
RUN ROCM_MAJOR=$(echo "${ROCM_VERSION}" | sed 's/\./ /g' | awk '{print $1}') && \
ROCM_MINOR=$(echo "${ROCM_VERSION}" | sed 's/\./ /g' | awk '{print $2}') && \
if [ "${ROCM_MAJOR}" != "0" ] || [ "${ROCM_MINOR}" != "0" ]; then \
OS_VERSION=$(grep '^VERSION_ID=' /etc/os-release | cut -d'=' -f2 | tr -d '"') && \
OS_VERSION_MAJOR=$(echo "$OS_VERSION" | cut -d'.' -f1) && \
OS_VERSION_MINOR=$(echo "$OS_VERSION" | cut -d'.' -f2) && \
ROCM_PATCH=$(echo "${ROCM_VERSION}" | sed 's/\./ /g' | awk '{print $3}') && \
if [ -z "${ROCM_PATCH}" ] || [ "${ROCM_PATCH}" = "0" ]; then \
ROCM_PATCH=0 && \
ROCM_VERSION=$(echo "${ROCM_VERSION}" | sed 's/\.0$//') \
; fi && \
ROCM_VERSN=$(( ("${ROCM_MAJOR}"*10000)+("${ROCM_MINOR}"*100) + ("${ROCM_PATCH}"))) && \
zypper --non-interactive addrepo https://download.opensuse.org/repositories/devel:languages:perl/15.6/devel:languages:perl.repo && \
zypper --non-interactive --no-gpg-checks install -y https://repo.radeon.com/amdgpu-install/${ROCM_VERSION}/sle/${OS_VERSION}/amdgpu-install-${ROCM_MAJOR}.${ROCM_MINOR}.${ROCM_VERSN}-1.noarch.rpm && \
zypper --non-interactive --gpg-auto-import-keys refresh && \
zypper --non-interactive install -y rocm-dev rccl-devel libpciaccess0 && \
zypper --non-interactive clean --all; \
fi
ARG PYTHON_VERSIONS="6 7 8 9 10 11 12 13"
RUN wget https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-Linux-x86_64.sh -O miniforge.sh && \
bash miniforge.sh -b -p /opt/conda && \
export PATH="/opt/conda/bin:${PATH}" && \
conda config --set always_yes yes --set changeps1 no && \
conda update -c conda-forge -n base conda && \
for i in ${PYTHON_VERSIONS}; do conda create -n py3.${i} -c conda-forge python=3.${i} pip; done && \
for i in ${PYTHON_VERSIONS}; do /opt/conda/envs/py3.${i}/bin/python -m pip install numpy perfetto dataclasses; done && \
conda clean -a -y && \
conda init
WORKDIR /home
SHELL [ "/bin/bash", "--login", "-c" ]
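The ROCm install step in the Dockerfile above reads `VERSION_ID` from `/etc/os-release` with `grep`, `cut`, and `tr`, then takes the part before the first dot as the major version. A minimal Python sketch of the same extraction, taking the file contents as a string so it can be tried without a container; the helper name `os_release_version` is hypothetical:

```python
def os_release_version(text: str):
    """Return (VERSION_ID, major) from os-release style text, or None."""
    for line in text.splitlines():
        if line.startswith("VERSION_ID="):
            # drop the key and any double quotes, like cut -d'=' -f2 | tr -d '"'
            value = line.split("=", 1)[1].replace('"', "")
            major = value.split(".", 1)[0]
            return value, major
    return None


if __name__ == "__main__":
    sample = 'NAME="openSUSE Leap"\nVERSION_ID="15.5"\nID="opensuse-leap"\n'
    print(os_release_version(sample))
```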
+51
@@ -0,0 +1,51 @@
ARG DISTRO=opensuse/leap
ARG VERSION=15.5
FROM ${DISTRO}:${VERSION}
ENV HOME /root
ENV SHELL /bin/bash
ENV BASH_ENV /etc/bash.bashrc
ENV DEBIAN_FRONTEND noninteractive
WORKDIR /tmp
SHELL [ "/bin/bash", "-c" ]
ENV PATH /usr/local/bin:${PATH}
ARG EXTRA_PACKAGES=""
ARG ELFUTILS_DOWNLOAD_VERSION="0.188"
ARG BOOST_DOWNLOAD_VERSION="1.79.0"
ARG NJOBS="8"
RUN set +e; \
zypper --non-interactive -i --gpg-auto-import-keys refresh; \
zypper --non-interactive -i patch; \
zypper --non-interactive -i patch; \
zypper --non-interactive -i --gpg-auto-import-keys refresh; \
exit 0
RUN zypper --non-interactive update -y && \
zypper --non-interactive dist-upgrade -y && \
zypper --non-interactive install -y -t pattern devel_basis && \
zypper --non-interactive install -y binutils-gold chrpath cmake curl dpkg-devel \
gcc-c++ git iproute2 libnuma-devel openmpi3-devel papi-devel python3-pip \
rpm-build sqlite3-devel vim wget && \
zypper --non-interactive clean --all && \
python3 -m pip install 'cmake==3.21' perfetto
ARG PYTHON_VERSIONS="6 7 8 9 10 11 12 13"
RUN wget https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-Linux-x86_64.sh -O miniforge.sh && \
bash miniforge.sh -b -p /opt/conda && \
export PATH="/opt/conda/bin:${PATH}" && \
conda config --set always_yes yes --set changeps1 no && \
conda update -c conda-forge -n base conda && \
for i in ${PYTHON_VERSIONS}; do conda create -n py3.${i} -c conda-forge python=3.${i} pip numpy; done && \
for i in ${PYTHON_VERSIONS}; do /opt/conda/envs/py3.${i}/bin/python -m pip install numpy perfetto dataclasses; done && \
conda clean -a -y && \
cd /tmp && \
shopt -s dotglob extglob && \
rm -rf *
WORKDIR /home
SHELL [ "/bin/bash", "--login", "-c" ]
+64
@@ -0,0 +1,64 @@
ARG DISTRO=rockylinux/rockylinux
ARG VERSION=8
FROM ${DISTRO}:${VERSION}
ENV HOME /root
ENV SHELL /bin/bash
ENV BASH_ENV /etc/bash.bashrc
ENV DEBIAN_FRONTEND noninteractive
WORKDIR /tmp
SHELL [ "/bin/bash", "-c" ]
ENV PATH /usr/lib64/openmpi/bin:/usr/local/bin:${PATH}
ENV LIBRARY_PATH ${LIBRARY_PATH}:/opt/amdgpu/lib64
RUN yum groupinstall -y "Development Tools" && \
yum install -y epel-release && crb enable && \
yum install -y --allowerasing chrpath cmake curl dpkg-devel iproute libdrm-devel \
numactl-devel openmpi-devel papi-devel python3-pip sqlite-devel texinfo \
wget which zlib-devel && \
yum clean all && \
python3 -m pip install 'cmake==3.21' && \
python3 -m pip install 'perfetto'
ARG ROCM_VERSION=0.0
RUN ROCM_MAJOR=$(echo "${ROCM_VERSION}" | sed 's/\./ /g' | awk '{print $1}') && \
ROCM_MINOR=$(echo "${ROCM_VERSION}" | sed 's/\./ /g' | awk '{print $2}') && \
if [ "${ROCM_MAJOR}" != "0" ] || [ "${ROCM_MINOR}" != "0" ]; then \
OS_VERSION=$(grep '^VERSION_ID=' /etc/os-release | cut -d'=' -f2 | tr -d '"') && \
OS_VERSION_MAJOR=$(echo "$OS_VERSION" | cut -d'.' -f1) && \
RPM_TAG=".el${OS_VERSION_MAJOR}" && \
ROCM_PATCH=$(echo "${ROCM_VERSION}" | sed 's/\./ /g' | awk '{print $3}') && \
if [ -z "${ROCM_PATCH}" ] || [ "${ROCM_PATCH}" = "0" ]; then \
ROCM_PATCH=0 && \
ROCM_VERSION=$(echo "${ROCM_VERSION}" | sed 's/\.0$//') \
; fi && \
ROCM_VERSN=$(( ("${ROCM_MAJOR}"*10000)+("${ROCM_MINOR}"*100) + ("${ROCM_PATCH}"))) && \
if [ "${OS_VERSION_MAJOR}" -eq 8 ]; then PERL_REPO=powertools; else PERL_REPO=crb; fi && \
dnf -y --enablerepo=${PERL_REPO} install perl-File-BaseDir && \
yum install -y https://repo.radeon.com/amdgpu-install/${ROCM_VERSION}/rhel/${OS_VERSION}/amdgpu-install-${ROCM_MAJOR}.${ROCM_MINOR}.${ROCM_VERSN}-1${RPM_TAG}.noarch.rpm && \
yum install -y rocm-dev && \
yum clean all; \
fi
ARG PYTHON_VERSIONS="6 7 8 9 10 11 12 13"
RUN wget https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-Linux-x86_64.sh -O miniforge.sh && \
bash miniforge.sh -b -p /opt/conda && \
export PATH="/opt/conda/bin:${PATH}" && \
conda config --set always_yes yes --set changeps1 no && \
conda update -c conda-forge -n base conda && \
for i in ${PYTHON_VERSIONS}; do conda create -n py3.${i} -c conda-forge python=3.${i} pip; done && \
for i in ${PYTHON_VERSIONS}; do /opt/conda/envs/py3.${i}/bin/python -m pip install numpy perfetto dataclasses; done && \
conda clean -a -y && \
conda init
RUN if [ "${ROCM_VERSION}" != "0.0" ]; then ln -sf /opt/rocm-${ROCM_VERSION}* /opt/rocm; fi
WORKDIR /home
ENV LC_ALL C.UTF-8
SHELL [ "/bin/bash", "--login", "-c" ]
COPY ./entrypoint-rhel.sh /docker-entrypoint.sh
ENTRYPOINT [ "/docker-entrypoint.sh" ]
+42
@@ -0,0 +1,42 @@
ARG DISTRO=rockylinux/rockylinux
ARG VERSION=8
FROM ${DISTRO}:${VERSION}
ENV HOME /root
ENV SHELL /bin/bash
ENV BASH_ENV /etc/bash.bashrc
ENV DEBIAN_FRONTEND noninteractive
WORKDIR /tmp
SHELL [ "/bin/bash", "-c" ]
ENV PATH /usr/lib64/openmpi/bin:/usr/local/bin:${PATH}
ARG EXTRA_PACKAGES=""
ARG ELFUTILS_DOWNLOAD_VERSION="0.188"
ARG BOOST_DOWNLOAD_VERSION="1.79.0"
ARG NJOBS="8"
RUN yum groupinstall -y "Development Tools" && \
yum install -y epel-release && crb enable && \
yum install -y --allowerasing chrpath cmake curl dpkg-devel iproute numactl-devel \
openmpi-devel papi-devel python3-pip sqlite-devel texinfo wget which vim zlib-devel && \
yum clean all && \
python3 -m pip install 'cmake==3.21' perfetto
ARG PYTHON_VERSIONS="6 7 8 9 10 11 12 13"
RUN wget https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-Linux-x86_64.sh -O miniforge.sh && \
bash miniforge.sh -b -p /opt/conda && \
export PATH="/opt/conda/bin:${PATH}" && \
conda config --set always_yes yes --set changeps1 no && \
conda update -c conda-forge -n base conda && \
for i in ${PYTHON_VERSIONS}; do conda create -n py3.${i} -c conda-forge python=3.${i} pip numpy; done && \
for i in ${PYTHON_VERSIONS}; do /opt/conda/envs/py3.${i}/bin/python -m pip install numpy perfetto dataclasses; done && \
conda clean -a -y && \
cd /tmp && \
shopt -s dotglob extglob && \
rm -rf *
WORKDIR /home
SHELL [ "/bin/bash", "--login", "-c" ]
@@ -0,0 +1,67 @@
ARG DISTRO
ARG VERSION
FROM ${DISTRO}:${VERSION}
ENV HOME /root
ENV LANG en_US.UTF-8
ENV LANGUAGE en_US
ENV LC_ALL C
ENV SHELL /bin/bash
ENV BASH_ENV /etc/bash.bashrc
ENV DEBIAN_FRONTEND noninteractive
WORKDIR /tmp
SHELL [ "/bin/bash", "-c" ]
ARG EXTRA_PACKAGES=""
ARG ROCM_VERSION="0.0"
ENV PATH ${HOME}/.local/bin:${PATH}
RUN apt-get update && \
apt-get dist-upgrade -y && \
apt-get install -y apt-utils autoconf autotools-dev bash-completion bison \
build-essential chrpath cmake curl flex gettext git-core gnupg2 iproute2 \
libnuma1 libopenmpi-dev libpapi-dev libpfm4-dev librpm-dev libsqlite3-dev \
libtool libudev1 lsb-release m4 python3-pip rpm texinfo wget && \
OS_VERSION=$(cat /etc/os-release | grep VERSION_ID | sed 's/=/ /'1 | awk '{print $NF}' | sed 's/"//g') && \
if [ "${OS_VERSION}" == "24.04" ]; then \
python3 -m pip install --break-system-packages 'cmake==3.21'; \
else \
python3 -m pip install 'cmake==3.21'; \
fi
RUN ROCM_MAJOR=$(echo "${ROCM_VERSION}" | sed 's/\./ /g' | awk '{print $1}') && \
ROCM_MINOR=$(echo "${ROCM_VERSION}" | sed 's/\./ /g' | awk '{print $2}') && \
if [ "${ROCM_MAJOR}" != "0" ] || [ "${ROCM_MINOR}" != "0" ]; then \
OS_VERSION=$(grep '^VERSION_ID=' /etc/os-release | cut -d'=' -f2 | tr -d '"') && \
OS_CODENAME=$(grep '^VERSION_CODENAME=' /etc/os-release | cut -d'=' -f2) && \
ROCM_PATCH=$(echo "${ROCM_VERSION}" | sed 's/\./ /g' | awk '{print $3}') && \
if [ -z "${ROCM_PATCH}" ] || [ "${ROCM_PATCH}" = "0" ]; then \
ROCM_PATCH=0 && \
ROCM_VERSION=$(echo "${ROCM_VERSION}" | sed 's/\.0$//') \
; fi && \
ROCM_VERSN=$(( ("${ROCM_MAJOR}"*10000)+("${ROCM_MINOR}"*100) + ("${ROCM_PATCH}"))) && \
AMDGPU_DEB="amdgpu-install_${ROCM_MAJOR}.${ROCM_MINOR}.${ROCM_VERSN}-1_all.deb" && \
wget https://repo.radeon.com/amdgpu-install/${ROCM_VERSION}/ubuntu/${OS_CODENAME}/${AMDGPU_DEB} && \
apt-get install -y ./${AMDGPU_DEB} && \
apt-get update && \
apt-get install -y rocm-dev rccl-dev libpciaccess0 ${EXTRA_PACKAGES} && \
apt-get autoclean; \
fi
ARG PYTHON_VERSIONS="6 7 8 9 10 11 12 13"
RUN wget https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-Linux-x86_64.sh -O miniforge.sh && \
bash miniforge.sh -b -p /opt/conda && \
export PATH="/opt/conda/bin:${PATH}" && \
conda config --set always_yes yes --set changeps1 no && \
conda update -c conda-forge -n base conda && \
for i in ${PYTHON_VERSIONS}; do conda create -n py3.${i} -c conda-forge python=3.${i} pip; done && \
for i in ${PYTHON_VERSIONS}; do /opt/conda/envs/py3.${i}/bin/python -m pip install numpy perfetto dataclasses; done && \
conda clean -a -y && \
conda init
ENV LC_ALL C.UTF-8
WORKDIR /home
SHELL [ "/bin/bash", "--login", "-c" ]
@@ -0,0 +1,56 @@
ARG DISTRO
ARG VERSION
FROM ${DISTRO}:${VERSION}
ENV HOME /root
ENV LANG C.UTF-8
ENV SHELL /bin/bash
ENV BASH_ENV /etc/bash.bashrc
ENV DEBIAN_FRONTEND noninteractive
WORKDIR /tmp
SHELL [ "/bin/bash", "-c" ]
ARG EXTRA_PACKAGES=""
ARG ELFUTILS_DOWNLOAD_VERSION="0.188"
ARG BOOST_DOWNLOAD_VERSION="1.79.0"
ARG NJOBS="8"
ENV PATH /usr/local/bin:${PATH}
ENV LIBRARY_PATH /usr/local/lib:/usr/local/lib64:${LIBRARY_PATH}
ENV LD_LIBRARY_PATH /usr/local/lib:/usr/local/lib64:${LD_LIBRARY_PATH}
ENV CMAKE_PREFIX_PATH /usr/local:${CMAKE_PREFIX_PATH}
RUN apt-get update && \
apt-get dist-upgrade -y && \
apt-get install -y autoconf autotools-dev bash-completion bison build-essential \
bzip2 chrpath cmake curl environment-modules flex gettext git-core gnupg2 \
gzip iproute2 libiberty-dev libpapi-dev libpfm4-dev libsqlite3-dev libtool \
locales lsb-release m4 python3-pip texinfo unzip wget vim zip zlib1g-dev && \
apt-get autoclean && \
OS_VERSION=$(grep '^VERSION_ID=' /etc/os-release | cut -d'=' -f2 | tr -d '"') && \
if [ "${OS_VERSION}" == "24.04" ]; then \
python3 -m pip install --break-system-packages 'cmake==3.21' perfetto; \
else \
python3 -m pip install 'cmake==3.21' perfetto; \
fi
ARG PYTHON_VERSIONS="6 7 8 9 10 11 12 13"
RUN wget https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-Linux-x86_64.sh -O miniforge.sh && \
bash miniforge.sh -b -p /opt/conda && \
export PATH="/opt/conda/bin:${PATH}" && \
conda config --set always_yes yes --set changeps1 no && \
conda update -c conda-forge -n base conda && \
for i in ${PYTHON_VERSIONS}; do conda create -n py3.${i} -c conda-forge python=3.${i} pip numpy; done && \
for i in ${PYTHON_VERSIONS}; do /opt/conda/envs/py3.${i}/bin/python -m pip install numpy perfetto dataclasses; done && \
conda clean -a -y && \
cd /tmp && \
shopt -s dotglob extglob && \
rm -rf *
ENV LC_ALL C.UTF-8
WORKDIR /home
SHELL [ "/bin/bash", "--login", "-c" ]
@@ -0,0 +1,171 @@
#!/usr/bin/env bash
set -e
: ${USER:=$(whoami)}
: ${DISTRO:=ubuntu}
: ${VERSIONS:=20.04}
: ${NJOBS=$(nproc)}
: ${ELFUTILS_VERSION:=0.186}
: ${BOOST_VERSION:=1.79.0}
: ${PYTHON_VERSIONS:="6 7 8 9 10 11 12 13"}
: ${PUSH:=0}
: ${PULL:=--pull}
verbose-run()
{
echo -e "\n### Executing \"${@}\"... ###\n"
eval $@
}
tolower()
{
echo "$@" | awk -F '\\|~\\|' '{print tolower($1)}';
}
toupper()
{
echo "$@" | awk -F '\\|~\\|' '{print toupper($1)}';
}
usage()
{
print_option() { printf " --%-20s %-24s %s\n" "${1}" "${2}" "${3}"; }
echo "Options:"
print_option "help -h" "" "This message"
print_option "push" "" "Push the container to DockerHub when completed"
print_option "no-pull" "" "Do not pull down most recent base container"
echo ""
print_default_option() { printf " --%-20s %-24s %s (default: %s)\n" "${1}" "${2}" "${3}" "$(tolower ${4})"; }
print_default_option distro "[ubuntu|opensuse|rhel]" "OS distribution" "${DISTRO}"
print_default_option versions "[VERSION] [VERSION...]" "Ubuntu, OpenSUSE, or RHEL release" "${VERSIONS}"
print_default_option python-versions "[VERSION] [VERSION...]" "Python 3 minor releases" "${PYTHON_VERSIONS}"
print_default_option "jobs -j" "[N]" "parallel build jobs" "${NJOBS}"
print_default_option elfutils-version "[0.183..0.188]" "ElfUtils version" "${ELFUTILS_VERSION}"
print_default_option boost-version "[1.67.0..1.79.0]" "Boost version" "${BOOST_VERSION}"
print_default_option user "[USERNAME]" "DockerHub username" "${USER}"
}
send-error()
{
usage
echo -e "\nError: ${@}"
exit 1
}
reset-last()
{
last() { send-error "Unsupported argument :: ${1}"; }
}
reset-last
n=0
while [[ $# -gt 0 ]]
do
case "${1}" in
-h|--help)
usage
exit 0
;;
"--distro")
shift
DISTRO=${1}
last() { DISTRO="${DISTRO} ${1}"; }
;;
"--versions")
shift
VERSIONS=${1}
last() { VERSIONS="${VERSIONS} ${1}"; }
;;
"--python-versions")
shift
PYTHON_VERSIONS=${1}
last() { PYTHON_VERSIONS="${PYTHON_VERSIONS} ${1}"; }
;;
--jobs|-j)
shift
NJOBS=${1}
reset-last
;;
"--elfutils-version")
shift
ELFUTILS_VERSION=${1}
reset-last
;;
"--boost-version")
shift
BOOST_VERSION=${1}
reset-last
;;
--user|-u)
shift
USER=${1}
reset-last
;;
"--push")
PUSH=1
reset-last
;;
"--no-pull")
PULL=""
reset-last
;;
--*)
reset-last
last ${1}
;;
*)
last ${1}
;;
esac
n=$((${n} + 1))
shift
done
DOCKER_FILE=Dockerfile.${DISTRO}.ci
if [ ! -f ${DOCKER_FILE} ]; then cd docker; fi
if [ ! -f ${DOCKER_FILE} ]; then
echo "Error! Execute script from source directory"
exit 1
fi
verbose-run rm -rf ./dyninst-source
verbose-run cp -r ../external/dyninst ./dyninst-source
verbose-run rm -rf ./dyninst-source/{build,install}*
set -e
if [ "${DISTRO}" = "opensuse" ]; then
DISTRO_IMAGE="opensuse/leap"
elif [ "${DISTRO}" = "rhel" ]; then
DISTRO_IMAGE="rockylinux/rockylinux"
else
DISTRO_IMAGE=${DISTRO}
fi
for VERSION in ${VERSIONS}
do
verbose-run docker build . \
${PULL} \
-f ${DOCKER_FILE} \
--tag ${USER}/rocprofiler-systems:ci-base-${DISTRO}-${VERSION} \
--build-arg DISTRO=${DISTRO_IMAGE} \
--build-arg VERSION=${VERSION} \
--build-arg NJOBS=${NJOBS} \
--build-arg PYTHON_VERSIONS=\"${PYTHON_VERSIONS}\" \
--build-arg ELFUTILS_DOWNLOAD_VERSION=${ELFUTILS_VERSION} \
--build-arg BOOST_DOWNLOAD_VERSION=${BOOST_VERSION}
done
if [ "${PUSH}" -gt 0 ]; then
for VERSION in ${VERSIONS}
do
verbose-run docker push ${USER}/rocprofiler-systems:ci-base-${DISTRO}-${VERSION}
done
fi
verbose-run rm -rf ./dyninst-source
@@ -0,0 +1,175 @@
#!/bin/bash -e
if [ ! -f CMakeLists.txt ]; then
if [ ! -f ../CMakeLists.txt ]; then
echo "Error! Execute script from source directory"
exit 1
else
cd ..
fi
fi
set -e
tolower()
{
echo "$@" | awk -F '\\|~\\|' '{print tolower($1)}';
}
toupper()
{
echo "$@" | awk -F '\\|~\\|' '{print toupper($1)}';
}
usage()
{
print_option() { printf " --%-20s %-24s %s\n" "${1}" "${2}" "${3}"; }
echo "Options:"
print_option "help -h" "" "This message"
echo ""
print_default_option() { printf " --%-20s %-24s %s (default: %s)\n" "${1}" "${2}" "${3}" "$(tolower ${4})"; }
print_default_option distro "[ubuntu|opensuse|rhel]" "OS distribution" "${DISTRO}"
print_default_option versions "[VERSION] [VERSION...]" "Ubuntu, OpenSUSE, or RHEL release" "${VERSIONS}"
print_default_option rocm-versions "[VERSION] [VERSION...]" "ROCm versions" "${ROCM_VERSIONS}"
print_default_option python-versions "[VERSION] [VERSION...]" "Python 3 minor releases" "${PYTHON_VERSIONS}"
print_default_option "user -u" "[USERNAME]" "DockerHub username" "${USER}"
print_default_option "retry -r" "[N]" "Number of attempts to build (to account for network errors)" "${RETRY}"
echo ""
echo "Usage: ${BASH_SOURCE[0]} <OPTIONS> -- <build-release.sh OPTIONS>"
echo " e.g:"
echo " ${BASH_SOURCE[0]} --distro ubuntu --versions 20.04 --rocm-versions 5.0 5.1 -- --core +nopython --rocm-mpi +nopython"
echo " ${BASH_SOURCE[0]} --distro ubuntu --versions 20.04 --python-versions 6 7 8 9 10 -- --rocm +python --rocm-mpi +nopython"
}
send-error()
{
usage
echo -e "\nError: ${@}"
exit 1
}
verbose-run()
{
echo -e "\n### Executing \"${@}\" a maximum of ${RETRY} times... ###\n"
for i in $(seq 1 1 ${RETRY})
do
set +e
eval "${@}"
local RETC=$?
set -e
if [ "${RETC}" -eq 0 ]; then
break
else
echo -en "\n### Command failed with error code ${RETC}... "
if [ "${i}" -ne "${RETRY}" ]; then
echo -e "Retrying... ###\n"
sleep 3
else
echo -e "Exiting... ###\n"
exit ${RETC}
fi
fi
done
}
build-release()
{
CONTAINER=$1
OS=$2
ROCM_VERSION=$3
CODE_VERSION=$4
shift
shift
shift
shift
local DOCKER_ARGS=""
tty -s && DOCKER_ARGS="-it" || DOCKER_ARGS=""
verbose-run docker run ${DOCKER_ARGS} --rm -v ${PWD}:/home/rocprofiler-systems --stop-signal "SIGINT" --env DISTRO=${OS} --env ROCM_VERSION=${ROCM_VERSION} --env VERSION=${CODE_VERSION} --env PYTHON_VERSIONS=\"${PYTHON_VERSIONS}\" --env IS_DOCKER=1 ${CONTAINER} /home/rocprofiler-systems/scripts/build-release.sh ${@}
}
reset-last()
{
last() { send-error "Unsupported argument :: ${1}"; }
}
reset-last
: ${USER:=$(whoami)}
: ${DISTRO:=ubuntu}
: ${VERSIONS:=22.04 20.04}
: ${ROCM_VERSIONS:=5.0 4.5 4.3}
: ${MPI:=0}
: ${PYTHON_VERSIONS:="6 7 8 9 10 11 12 13"}
: ${RETRY:=3}
n=0
while [[ $# -gt 0 ]]
do
case "${1}" in
-h|--help)
usage
exit 0
;;
"--distro")
shift
DISTRO=${1}
last() { DISTRO="${DISTRO} ${1}"; }
;;
"--versions")
shift
VERSIONS=${1}
last() { VERSIONS="${VERSIONS} ${1}"; }
;;
"--rocm-versions")
shift
ROCM_VERSIONS=${1}
last() { ROCM_VERSIONS="${ROCM_VERSIONS} ${1}"; }
;;
"--python-versions")
shift
PYTHON_VERSIONS=${1}
last() { PYTHON_VERSIONS="${PYTHON_VERSIONS} ${1}"; }
;;
--user|-u)
shift
USER=${1}
reset-last
;;
--retry|-r)
shift
RETRY=${1}
reset-last
;;
"--")
shift
SCRIPT_ARGS=${@}
break
;;
*)
last ${1}
;;
esac
n=$((${n} + 1))
shift
done
CODE_VERSION=$(cat VERSION)
if [ "${RETRY}" -lt 1 ]; then
RETRY=1
fi
if [ "${DISTRO}" = "rhel" ]; then
SCRIPT_ARGS="${SCRIPT_ARGS} --static-libstdcxx off"
fi
for VERSION in ${VERSIONS}
do
TAG=${DISTRO}-${VERSION}
for ROCM_VERSION in ${ROCM_VERSIONS}
do
build-release ${USER}/rocprofiler-systems:release-base-${TAG}-rocm-${ROCM_VERSION} ${DISTRO}-${VERSION} ${ROCM_VERSION} ${CODE_VERSION} ${SCRIPT_ARGS}
done
done
@@ -0,0 +1,363 @@
#!/usr/bin/env bash
set-user-defaults()
{
: ${USER:=$(whoami)}
: ${ROCM_VERSIONS:="6.3"}
: ${DISTRO:=ubuntu}
: ${VERSIONS:=20.04}
: ${PYTHON_VERSIONS:="6 7 8 9 10 11 12 13"}
: ${BUILD_CI:=""}
: ${PUSH:=0}
: ${PULL:=--pull}
: ${RETRY:=3}
: ${SCRIPT_DIR=$(dirname "$(readlink -f "${BASH_SOURCE[0]:-$0}")")}
}
set-user-defaults
set -e
cd $(dirname ${SCRIPT_DIR})
declare -a MATRIX_DISTROS=()
declare -a MATRIX_VERSIONS=()
declare -a MATRIX_ROCM_VERSIONS=()
load-matrix()
{
local workflow_file=".github/workflows/containers.yml"
if [ ! -f "${workflow_file}" ]; then
echo -e "\n Error: Cannot find ${workflow_file}"
exit 1
fi
# In form os-distro;os-version;rocm-version
local matrix_data=$(awk '
/rocprofiler-systems-release:/, /steps:/ {
if (/- os-distro:/) {
gsub(/[[:space:]]*- os-distro:[[:space:]]*"/, "")
gsub(/"/, "")
distro = $0
}
if (/os-version:/) {
gsub(/[[:space:]]*os-version:[[:space:]]*"/, "")
gsub(/"/, "")
version = $0
}
if (/rocm-version:/) {
gsub(/[[:space:]]*rocm-version:[[:space:]]*"/, "")
gsub(/"/, "")
rocm = $0
printf "%s;%s;%s\n", distro, version, rocm
}
}
' "${workflow_file}")
while IFS=';' read -r os_distro os_version rocm_version; do
MATRIX_DISTROS+=("$os_distro")
MATRIX_VERSIONS+=("$os_version")
MATRIX_ROCM_VERSIONS+=("$rocm_version")
done <<< "$matrix_data"
}
validate-distro()
{
local distro="${1}"
if [ -n "${distro}" ]; then
distro=$(tolower "${distro}")
case "${distro}" in
ubuntu|opensuse|rhel)
;;
*)
send-error "Unsupported distribution '${distro}'" "Supported distributions: ubuntu, opensuse, rhel"
;;
esac
fi
}
show-matrix()
{
local filter_distro="${1:-}"
if [ -n "${filter_distro}" ]; then
validate-distro "${filter_distro}"
filter_distro=$(tolower "${filter_distro}")
fi
echo ""
if [ -n "${filter_distro}" ]; then
echo " Supported ${filter_distro} + ROCm Combinations "
echo " =============================================="
else
echo " Supported OS + ROCm Combinations "
echo " =========================================="
fi
echo ""
echo " OS Distribution Version ROCm Version"
echo " ---------------- ------- ------------"
for i in "${!MATRIX_DISTROS[@]}"; do
if [[ -z "${filter_distro}" || "${MATRIX_DISTROS[i]}" == "${filter_distro}" ]]; then
printf " %-16s %-9s %s\n" "${MATRIX_DISTROS[i]}" "${MATRIX_VERSIONS[i]}" "${MATRIX_ROCM_VERSIONS[i]}"
fi
done
echo ""
echo "ROCm '0.0' means no ROCm installation (CPU-only build)"
echo ""
echo "Note: Patch versions are also supported (See: https://repo.radeon.com/amdgpu-install/)"
echo ""
}
# Cross checks arguments against compatibility matrix (ignores ROCm patch version)
validate-combinations()
{
# Check OS version combinations
for VERSION in ${VERSIONS}; do
VERSION_MAJOR=$(echo ${VERSION} | sed 's/\./ /g' | awk '{print $1}')
VERSION_MINOR=$(echo ${VERSION} | sed 's/\./ /g' | awk '{print $2}')
local os_version_valid=0
for i in "${!MATRIX_DISTROS[@]}"; do
if [[ "${MATRIX_DISTROS[i]}" == "${DISTRO}" && \
"${MATRIX_VERSIONS[i]}" == "${VERSION}" ]]; then
os_version_valid=1
break
fi
done
if [ ${os_version_valid} -eq 0 ]; then
send-error "Unsupported OS version :: ${VERSION}. See compatibility matrix for supported versions."
fi
done
# Check ROCm version combinations
for VERSION in ${VERSIONS}; do
for ROCM_VERSION in ${ROCM_VERSIONS}; do
local valid=0
ROCM_MAJOR=$(echo ${ROCM_VERSION} | sed 's/\./ /g' | awk '{print $1}')
ROCM_MINOR=$(echo ${ROCM_VERSION} | sed 's/\./ /g' | awk '{print $2}')
ROCM_MAJOR_MINOR="${ROCM_MAJOR}.${ROCM_MINOR}"
if ! ([ "${ROCM_MAJOR_MINOR}" == "0.0" ] && [ "${ROCM_VERSION}" != "0.0" ]); then
for i in "${!MATRIX_DISTROS[@]}"; do
if [[ "${MATRIX_DISTROS[i]}" == "${DISTRO}" && \
"${MATRIX_VERSIONS[i]}" == "${VERSION}" && \
"${MATRIX_ROCM_VERSIONS[i]}" == "${ROCM_MAJOR_MINOR}" ]]; then
valid=1
break
fi
done
fi
if [ ${valid} -eq 0 ]; then
send-error "Unsupported combination :: ${DISTRO}-${VERSION} + ROCm ${ROCM_VERSION}. See compatibility matrix for supported versions."
fi
done
done
}
tolower()
{
echo "$@" | awk -F '\\|~\\|' '{print tolower($1)}';
}
toupper()
{
echo "$@" | awk -F '\\|~\\|' '{print toupper($1)}';
}
usage()
{
set-user-defaults
print_option() { printf " --%-20s %-24s %s\n" "${1}" "${2}" "${3}"; }
echo "Options:"
print_option "help -h" "" "This message"
print_option "no-pull" "" "Do not pull down most recent base container"
print_option "matrix -m" "[ubuntu|opensuse|rhel]" "Shows compatibility matrix"
echo ""
print_default_option() { printf " --%-20s %-24s %s (default: %s)\n" "${1}" "${2}" "${3}" "$(tolower ${4})"; }
print_default_option distro "[ubuntu|opensuse|rhel]" "OS distribution" "${DISTRO}"
print_default_option versions "[VERSION] [VERSION...]" "Ubuntu, OpenSUSE, or RHEL release" "${VERSIONS}"
print_default_option rocm-versions "[VERSION] [VERSION...]" "ROCm versions (format: Major.Minor.Patch, patch defaults to 0 if not specified)" "${ROCM_VERSIONS}"
print_default_option python-versions "[VERSION] [VERSION...]" "Python 3 minor releases" "${PYTHON_VERSIONS}"
print_default_option "user -u" "[USERNAME]" "DockerHub username" "${USER}"
print_default_option "retry -r" "[N]" "Number of attempts to build (to account for network errors)" "${RETRY}"
print_default_option push "" "Push the image to Dockerhub" ""
#print_default_option lto "[on|off]" "Enable LTO" "${LTO}"
}
send-error()
{
usage
echo -e "\nError: ${@}"
exit 1
}
verbose-run()
{
echo -e "\n### Executing \"${@}\"... ###\n"
eval "${@}"
}
verbose-build()
{
echo -e "\n### Executing \"${@}\" a maximum of ${RETRY} times... ###\n"
for i in $(seq 1 1 ${RETRY})
do
set +e
eval "${@}"
local RETC=$?
set -e
if [ "${RETC}" -eq 0 ]; then
break
else
echo -en "\n### Command failed with error code ${RETC}... "
if [ "${i}" -ne "${RETRY}" ]; then
echo -e "Retrying... ###\n"
sleep 3
else
echo -e "Exiting... ###\n"
exit ${RETC}
fi
fi
done
}
reset-last()
{
last() { send-error "Unsupported argument :: ${1}"; }
}
reset-last
load-matrix
n=0
while [[ $# -gt 0 ]]
do
case "${1}" in
-h|--help)
usage
exit 0
;;
-m|--matrix)
shift
if [[ $# -gt 0 && ! "${1}" =~ ^-- ]]; then
show-matrix "${1}"
else
show-matrix
fi
exit 0
;;
"--distro")
shift
DISTRO=$(tolower ${1})
last() { DISTRO="${DISTRO} $(tolower ${1})"; }
;;
"--versions")
shift
VERSIONS=${1}
last() { VERSIONS="${VERSIONS} ${1}"; }
;;
"--rocm-versions")
shift
ROCM_VERSIONS=${1}
last() { ROCM_VERSIONS="${ROCM_VERSIONS} ${1}"; }
;;
"--python-versions")
shift
PYTHON_VERSIONS=${1}
last() { PYTHON_VERSIONS="${PYTHON_VERSIONS} ${1}"; }
;;
--user|-u)
shift
USER=${1}
reset-last
;;
--push)
PUSH=1
reset-last
;;
--no-pull)
PULL=""
reset-last
;;
--retry|-r)
shift
RETRY=${1}
reset-last
;;
--*)
send-error "Unsupported argument at position $((${n} + 1)) :: ${1}"
;;
*)
last ${1}
;;
esac
n=$((${n} + 1))
shift
done
# Validate input parameters for os-distros and rocm-versions
validate-distro "${DISTRO}"
validate-combinations
DOCKER_FILE="Dockerfile.${DISTRO}"
if [ "${RETRY}" -lt 1 ]; then
RETRY=1
fi
if [ -n "${BUILD_CI}" ]; then DOCKER_FILE="${DOCKER_FILE}.ci"; fi
cd docker # Forced since PWD is parent dir
if [ ! -f ${DOCKER_FILE} ]; then send-error "File \"${DOCKER_FILE}\" not found"; fi
for VERSION in ${VERSIONS}
do
VERSION_MAJOR=$(echo ${VERSION} | sed 's/\./ /g' | awk '{print $1}')
VERSION_MINOR=$(echo ${VERSION} | sed 's/\./ /g' | awk '{print $2}')
VERSION_PATCH=$(echo ${VERSION} | sed 's/\./ /g' | awk '{print $3}')
for ROCM_VERSION in ${ROCM_VERSIONS}
do
ROCM_MAJOR=$(echo ${ROCM_VERSION} | sed 's/\./ /g' | awk '{print $1}')
ROCM_MINOR=$(echo ${ROCM_VERSION} | sed 's/\./ /g' | awk '{print $2}')
ROCM_PATCH=$(echo ${ROCM_VERSION} | sed 's/\./ /g' | awk '{print $3}')
if [ "${ROCM_PATCH}" = "0" ] || [ -z "${ROCM_PATCH}" ]; then
CONTAINER=${USER}/rocprofiler-systems:release-base-${DISTRO}-${VERSION}-rocm-${ROCM_MAJOR}.${ROCM_MINOR}
else
CONTAINER=${USER}/rocprofiler-systems:release-base-${DISTRO}-${VERSION}-rocm-${ROCM_VERSION}
fi
if [ "${DISTRO}" = "ubuntu" ]; then
case "${VERSION}" in
24.04)
ROCM_REPO_DIST="noble"
;;
22.04)
ROCM_REPO_DIST="jammy"
;;
20.04)
ROCM_REPO_DIST="focal"
;;
*)
;;
esac
verbose-build docker build . ${PULL} --progress plain -f ${DOCKER_FILE} --tag ${CONTAINER} --build-arg DISTRO=${DISTRO} --build-arg VERSION=${VERSION} --build-arg ROCM_VERSION=${ROCM_VERSION} --build-arg PYTHON_VERSIONS=\"${PYTHON_VERSIONS}\"
elif [ "${DISTRO}" = "rhel" ]; then
# use Rocky Linux as a base image for RHEL builds
DISTRO_BASE_IMAGE=rockylinux/rockylinux
verbose-build docker build . ${PULL} --progress plain -f ${DOCKER_FILE} --tag ${CONTAINER} --build-arg DISTRO=${DISTRO_BASE_IMAGE} --build-arg VERSION=${VERSION} --build-arg ROCM_VERSION=${ROCM_VERSION} --build-arg PYTHON_VERSIONS=\"${PYTHON_VERSIONS}\"
elif [ "${DISTRO}" = "opensuse" ]; then
DISTRO_IMAGE="opensuse/leap"
if [[ "${VERSION_MAJOR}" -le 15 && "${VERSION_MINOR}" -le 5 ]]; then
PERL_REPO="15.6"
else
PERL_REPO="${VERSION_MAJOR}.${VERSION_MINOR}"
fi
verbose-build docker build . ${PULL} --progress plain -f ${DOCKER_FILE} --tag ${CONTAINER} --build-arg DISTRO=${DISTRO_IMAGE} --build-arg VERSION=${VERSION} --build-arg ROCM_VERSION=${ROCM_VERSION} --build-arg PERL_REPO=${PERL_REPO} --build-arg PYTHON_VERSIONS=\"${PYTHON_VERSIONS}\"
fi
if [ "${PUSH}" -ne 0 ]; then
docker push ${CONTAINER}
fi
done
done
@@ -0,0 +1,14 @@
#!/bin/bash -l
if [ -f /etc/profile.d/modules.sh ]; then
source /etc/profile.d/modules.sh
module load mpi &> /dev/null
fi
if [ -z "${1}" ]; then
: ${SHELL:=/bin/bash}
exec ${SHELL}
else
set -e
eval $@
fi
@@ -0,0 +1,126 @@
#!/usr/bin/env bash
if [ ! -f CMakeLists.txt ]; then
if [ ! -f ../CMakeLists.txt ]; then
echo "Error! Execute script from source directory"
exit 1
else
cd ..
fi
fi
set -e
tolower()
{
echo "$@" | awk -F '\\|~\\|' '{print tolower($1)}';
}
toupper()
{
echo "$@" | awk -F '\\|~\\|' '{print toupper($1)}';
}
usage()
{
print_option() { printf " --%-20s %-24s %s\n" "${1}" "${2}" "${3}"; }
echo "Options:"
print_option "help -h" "" "This message"
echo ""
print_default_option() { printf " --%-20s %-24s %s (default: %s)\n" "${1}" "${2}" "${3}" "$(tolower ${4})"; }
print_default_option distro "[ubuntu|opensuse]" "OS distribution" "${DISTRO}"
print_default_option versions "[VERSION] [VERSION...]" "Ubuntu or OpenSUSE release" "${VERSIONS}"
print_default_option rocm-versions "[VERSION] [VERSION...]" "ROCm versions" "${ROCM_VERSIONS}"
print_default_option user "[USERNAME]" "DockerHub username" "${USER}"
echo ""
echo "Usage: ${BASH_SOURCE[0]} <OPTIONS> -- <test-release.sh OPTIONS>"
echo " e.g:"
echo " ${BASH_SOURCE[0]} --distro ubuntu --versions 20.04 --rocm-versions 5.0 5.1 -- --stgz /path/to/stgz/installer"
}
send-error()
{
echo -e "\nError: ${@}"
exit 1
}
verbose-run()
{
echo -e "\n### Executing \"${@}\"... ###\n"
"${@}"
}
test-release()
{
CONTAINER=${1}
shift
local DOCKER_ARGS=""
tty -s && DOCKER_ARGS="-it" || DOCKER_ARGS=""
verbose-run docker run ${DOCKER_ARGS} --rm -v ${PWD}:/home/rocprofiler-systems ${CONTAINER} /home/rocprofiler-systems/scripts/test-release.sh ${@}
}
reset-last()
{
last() { send-error "Unsupported argument :: ${1}"; }
}
reset-last
: ${USER:=$(whoami)}
: ${DISTRO:=ubuntu}
: ${VERSIONS:=22.04 20.04}
: ${ROCM_VERSIONS:=6.2}
n=0
while [[ $# -gt 0 ]]
do
case "${1}" in
-h|--help)
usage
exit 0
;;
"--distro")
shift
DISTRO=${1}
last() { DISTRO="${DISTRO} ${1}"; }
;;
"--versions")
shift
VERSIONS=${1}
last() { VERSIONS="${VERSIONS} ${1}"; }
;;
"--rocm-versions")
shift
ROCM_VERSIONS=${1}
last() { ROCM_VERSIONS="${ROCM_VERSIONS} ${1}"; }
;;
--user|-u)
shift
USER=${1}
reset-last
;;
"--")
shift
SCRIPT_ARGS=${@}
break
;;
*)
last ${1}
;;
esac
n=$((${n} + 1))
shift
done
CODE_VERSION=$(cat VERSION)
for VERSION in ${VERSIONS}
do
TAG=${DISTRO}-${VERSION}
for ROCM_VERSION in ${ROCM_VERSIONS}
do
test-release ${USER}/rocprofiler-systems:release-base-${TAG}-rocm-${ROCM_VERSION} ${SCRIPT_ARGS}
done
done
@@ -0,0 +1,2 @@
_build/
_doxygen/
@@ -0,0 +1,147 @@
.. meta::
:description: ROCm Systems Profiler data collection modes documentation
:keywords: rocprof-sys, rocprofiler-systems, Omnitrace, ROCm, profiler, data collection, tracking, visualization, tool, Instinct, accelerator, AMD
**********************
Data collection modes
**********************
ROCm Systems Profiler supports several modes of recording trace and profiling data for your application.
.. note::
For an explanation of the terms used in this topic, see
the :doc:`ROCm Systems Profiler glossary <../reference/rocprof-sys-glossary>`.
+-----------------------------+---------------------------------------------------------+
| Mode | Description |
+=============================+=========================================================+
| Binary Instrumentation | Locates functions (and loops, if desired) in the binary |
| | and inserts snippets at the entry and exit |
+-----------------------------+---------------------------------------------------------+
| Statistical Sampling | Periodically pauses application at specified intervals |
| | and records various metrics for the given call stack |
+-----------------------------+---------------------------------------------------------+
| Callback APIs | Parallelism frameworks such as ROCm, OpenMP, and Kokkos |
| | make callbacks into ROCm Systems Profiler to provide |
| | information about the work the API is performing |
+-----------------------------+---------------------------------------------------------+
| Dynamic Symbol Interception | Wrap function symbols defined in a position independent |
| | dynamic library/executable, like ``pthread_mutex_lock`` |
| | in ``libpthread.so`` or ``MPI_Init`` in the MPI library |
+-----------------------------+---------------------------------------------------------+
| User API | User-defined regions and controls for ROCm Systems |
| | Profiler |
+-----------------------------+---------------------------------------------------------+
The two most generic and important modes are binary instrumentation and statistical sampling.
It is important to understand their advantages and disadvantages.
Binary instrumentation and statistical sampling can both be performed with the ``rocprof-sys-instrument``
executable. If binary instrumentation isn't required, it's highly recommended to use the
``rocprof-sys-sample`` executable for statistical sampling instead.
Callback APIs and dynamic symbol interception can be used with either tool.
Binary instrumentation
-----------------------------------
Binary instrumentation lets you record deterministic measurements for
every single invocation of a given function.
Binary instrumentation effectively adds instructions to the target application to
collect the required information. It therefore has the potential to cause performance
changes which might, in some cases, lead to inaccurate results. The effect depends on
the information being collected and which features are activated in ROCm Systems Profiler.
For example, collecting only the wall-clock timing data
has less of an effect than collecting the wall-clock timing, CPU-clock timing,
memory usage, cache-misses, and number of instructions that were run. Similarly,
collecting a flat profile has less overhead than a hierarchical profile
and collecting a trace OR a profile has less overhead than collecting a
trace AND a profile.
In ROCm Systems Profiler, the primary heuristic for controlling the overhead with binary
instrumentation is the minimum number of instructions for selecting functions
for instrumentation.
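
As a rough illustration of this heuristic, the following sketch filters a list of
functions by instruction count. The ``FunctionInfo`` type and the
``select_for_instrumentation`` helper are hypothetical names introduced for this
example and are not part of ROCm Systems Profiler. Only the idea of a
minimum-instruction threshold comes from the text above.

.. code-block:: c++

    #include <cstddef>
    #include <string>
    #include <vector>

    // Hypothetical description of a function found in a binary.
    struct FunctionInfo
    {
        std::string name;
        std::size_t num_instructions;
    };

    // Keep only the functions that are large enough to amortize the cost of
    // the entry and exit snippets (illustrative, not the tool's implementation).
    std::vector<FunctionInfo>
    select_for_instrumentation(const std::vector<FunctionInfo>& functions,
                               std::size_t                      min_instructions)
    {
        std::vector<FunctionInfo> selected;
        for(const auto& fn : functions)
        {
            if(fn.num_instructions >= min_instructions) selected.push_back(fn);
        }
        return selected;
    }

Raising the threshold excludes more small functions and lowers the overhead.
Lowering it captures more detail at a higher cost.
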
Statistical sampling
-----------------------------------
Statistical call-stack sampling interrupts the application at
regular intervals using operating system interrupts.
Sampling is typically less numerically accurate and specific, but the
target program runs at nearly full speed.
In contrast to the data derived from binary instrumentation, the resulting
data is not exact but is instead a statistical approximation.
However, sampling often provides a more accurate picture of the application
execution because it is less intrusive to the target application and has fewer
side effects on memory caches or instruction decoding pipelines. Furthermore,
because sampling does not affect the execution speed as much, it is
relatively immune to over-evaluating the cost of small, frequently called
functions or "tight" loops.
In ROCm Systems Profiler, the overhead for statistical sampling depends on the
sampling rate and whether the samples are taken with respect to the CPU time
and/or real time.
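
The statistical nature of sampling can be illustrated with a small simulation.
The sketch below walks a precomputed timeline and records which function was
running at each sample point. The ``Interval`` type and the ``sample_timeline``
helper are hypothetical. A real sampler is driven by timer interrupts, not by a
known timeline, so this only shows the approximation effect.

.. code-block:: c++

    #include <cstdint>
    #include <map>
    #include <string>
    #include <vector>

    // Hypothetical record: a function occupying the CPU for some duration.
    struct Interval
    {
        std::string  function;
        std::int64_t duration_us;
    };

    // Take one sample every period_us microseconds and count which function
    // was running at that instant.
    std::map<std::string, std::int64_t>
    sample_timeline(const std::vector<Interval>& timeline, std::int64_t period_us)
    {
        std::map<std::string, std::int64_t> hits;
        if(period_us <= 0) return hits;
        std::int64_t next_sample = period_us;  // time of the next sample
        std::int64_t start       = 0;          // start time of the current interval
        for(const auto& itr : timeline)
        {
            const std::int64_t end = start + itr.duration_us;
            // every sample time in (start, end] is attributed to this function
            while(next_sample <= end)
            {
                ++hits[itr.function];
                next_sample += period_us;
            }
            start = end;
        }
        return hits;
    }

With a 1000 microsecond period, a function that runs for 200 microseconds between
two sample points is not observed at all, while long-running functions receive
a share of samples roughly proportional to their run time.
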
Binary instrumentation vs. statistical sampling example
-------------------------------------------------------
Consider the following code:
.. code-block:: c++

    #include <cstdio>
    #include <cstdlib>

    long fib(long n)
    {
        if(n < 2) return n;
        return fib(n - 1) + fib(n - 2);
    }

    // i is the iteration index reported in the output
    void run(long i, long n)
    {
        long result = fib(n);
        printf("[%li] fibonacci(%li) = %li\n", i, n, result);
    }

    int main(int argc, char** argv)
    {
        long nfib = 30;
        long nitr = 10;
        if(argc > 1) nfib = atol(argv[1]);
        if(argc > 2) nitr = atol(argv[2]);
        for(long i = 0; i < nitr; ++i)
            run(i, nfib);
        return 0;
    }

Binary instrumentation of the ``fib`` function will record **every single invocation**
of the function. For a very small function
such as ``fib``, this results in **significant** overhead since this simple function
takes about 20 instructions, whereas the entry and
exit snippets are ~1024 instructions. Therefore, you generally want to avoid
instrumenting functions that have significantly fewer instructions than the
entry and exit snippets. (Note that many of the
instructions in the entry and exit snippets are either logging functions or
depend on the runtime settings and thus might never run.) Because of
the number of potential instructions in the entry and exit snippets,
the default behavior of ``rocprof-sys-instrument`` is to only instrument functions
which contain at least 1024 instructions.
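
To give a sense of scale, the following sketch counts how many times ``fib`` is
invoked and applies the approximate figures quoted above (about 20 instructions
per call and ~1024 instructions for the entry and exit snippets). The
``counted_fib`` and ``estimate_overhead_ratio`` names are illustrative
assumptions, and the ratio is a worst-case estimate because many snippet
instructions might never run.

.. code-block:: c++

    #include <cstdint>

    // Number of invocations so far. This stands in for what an entry snippet
    // would record on every call.
    static std::uint64_t fib_calls = 0;

    long
    counted_fib(long n)
    {
        ++fib_calls;
        if(n < 2) return n;
        return counted_fib(n - 1) + counted_fib(n - 2);
    }

    // Worst-case ratio of instrumented to uninstrumented instruction counts
    // for a single call.
    double
    estimate_overhead_ratio(double function_instructions, double snippet_instructions)
    {
        return (function_instructions + snippet_instructions) / function_instructions;
    }

A single ``fib(30)`` call results in 2,692,537 invocations of the function, and
``estimate_overhead_ratio(20, 1024)`` is roughly 52, which is why instrumenting
such a small function distorts its measured cost.
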
However, recording every single invocation of the function can be extremely
useful for detecting anomalies, such as profiles that show minimum or maximum values much smaller or larger
than the average or a high standard deviation. In this case, the traces help you
identify exactly when and where those instances deviated from the norm.
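
The kind of anomaly check described above can be sketched as follows, assuming
the per-invocation durations have already been extracted from a trace. The
``Summary`` type and the ``summarize`` and ``find_outliers`` helpers are
hypothetical and only illustrate the idea. They are not part of the ROCm Systems
Profiler API.

.. code-block:: c++

    #include <algorithm>
    #include <cmath>
    #include <cstddef>
    #include <vector>

    // Summary statistics of the kind a profile reports.
    struct Summary
    {
        double min    = 0.0;
        double max    = 0.0;
        double mean   = 0.0;
        double stddev = 0.0;
    };

    Summary
    summarize(const std::vector<double>& durations)
    {
        Summary s;
        if(durations.empty()) return s;
        s.min = *std::min_element(durations.begin(), durations.end());
        s.max = *std::max_element(durations.begin(), durations.end());
        double sum = 0.0;
        for(double d : durations) sum += d;
        s.mean = sum / static_cast<double>(durations.size());
        double sq_sum = 0.0;
        for(double d : durations) sq_sum += (d - s.mean) * (d - s.mean);
        // population standard deviation
        s.stddev = std::sqrt(sq_sum / static_cast<double>(durations.size()));
        return s;
    }

    // Indices of invocations that deviate from the mean by more than
    // k standard deviations.
    std::vector<std::size_t>
    find_outliers(const std::vector<double>& durations, double k)
    {
        const Summary            s = summarize(durations);
        std::vector<std::size_t> outliers;
        for(std::size_t i = 0; i < durations.size(); ++i)
        {
            if(std::fabs(durations[i] - s.mean) > k * s.stddev) outliers.push_back(i);
        }
        return outliers;
    }

The summary alone shows that something is unusual (a maximum far above the mean).
The returned indices correspond to the individual trace events that you would
then inspect.
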
Compare the level of detail in the following traces. In the top image,
every instance of the ``fib`` function is instrumented, while in the bottom image,
the ``fib`` call-stack is derived via sampling.
Binary instrumentation of the Fibonacci function
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
.. image:: ../data/fibonacci-instrumented.png
:alt: Visualization of the output of a binary instrumentation of the Fibonacci function
Statistical sampling of the Fibonacci function
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
.. image:: ../data/fibonacci-sampling.png
:alt: Visualization of the output of a statistical sample of the Fibonacci function
@@ -0,0 +1,137 @@
.. meta::
:description: ROCm Systems Profiler feature set documentation and reference
:keywords: rocprof-sys, rocprofiler-systems, Omnitrace, ROCm, profiler, feature set, use cases, tracking, visualization, tool, Instinct, accelerator, AMD
********************************************
ROCm Systems Profiler features and use cases
********************************************
`ROCm Systems Profiler <https://github.com/ROCm/rocprofiler-systems>`_ is designed to be highly extensible.
Internally, it leverages the `Timemory performance analysis toolkit <https://github.com/ROCm/timemory>`_
to manage extensions, resources, data, and other items. It supports the following features,
modes, metrics, and APIs.
Data collection modes
========================================
* Dynamic instrumentation

  * Runtime instrumentation: Instrument executables and shared libraries at runtime
  * Binary rewriting: Generate a new executable and/or library with instrumentation built-in

* Statistical sampling: Periodic software interrupts per-thread
* Process-level sampling: A background thread records process-, system- and device-level metrics while the application runs
* Causal profiling: Quantifies the potential impact of optimizations in parallel code
Data analysis
========================================
* High-level summary profiles with mean, min, max, and standard deviation statistics
* Low overhead and memory efficient
* Ideal for running at scale
* Comprehensive traces for every individual event and measurement
* Application speed-up predictions resulting from potential optimizations in functions and lines of code based on causal profiling
Parallelism API support
========================================
* HIP
* HSA
* Pthreads
* MPI
* Kokkos-Tools (KokkosP)
* OpenMP-Tools (OMPT)
GPU metrics
========================================
* GPU hardware counters
* HIP API tracing
* HIP kernel tracing
* HSA API tracing
* HSA operation tracing
* rocDecode API tracing
* rocJPEG API tracing
* System-level sampling (via AMD-SMI)

  * Memory usage
  * Power usage
  * Temperature
  * Utilization
  * VCN activity
  * JPEG activity

Note: The availability of VCN and JPEG engine activity metrics depends on the ASIC. On devices that do not support them, all values for ``VCN_ACTIVITY`` and ``JPEG_ACTIVITY`` are reported as ``N/A`` in the output of ``amd-smi metric --usage``.
CPU metrics
========================================
* CPU hardware counters sampling and profiles
* CPU frequency sampling
* Various timing metrics

  * Wall time
  * CPU time (process and thread)
  * CPU utilization (process and thread)
  * User CPU time
  * Kernel CPU time

* Various memory metrics

  * High-water mark (sampling and profiles)
  * Memory page allocation
  * Virtual memory usage

* Network statistics
* I/O metrics
* Many others
Third-party API support
========================================
* TAU
* LIKWID
* Caliper
* CrayPAT
* VTune
* NVTX
* ROCTX
ROCm Systems Profiler use cases
========================================
When analyzing the performance of an application, do NOT
assume you know where the performance bottlenecks are
and why they are happening. ROCm Systems Profiler is a tool for analyzing the entire
application and its performance. It is
ideal for characterizing where optimization would have the greatest impact
on an end-to-end run of the application and for
viewing what else is happening on the system during a performance bottleneck.
When GPUs are involved, there is a tendency to assume that
the quickest path to performance improvement is minimizing
the runtime of the GPU kernels. This is a highly flawed assumption.
If you optimize the runtime of a kernel from 1 millisecond
to 1 microsecond (a 1000x speed-up) but the original application never
spent time waiting for kernels to complete,
there would be no statistically significant reduction in the end-to-end
runtime of your application. In other words, it does not matter
how fast or slow the code on the GPU is if the application is not
bottlenecked waiting on the GPU.
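To make this concrete, here is a rough estimate in the style of Amdahl's law. The symbols are introduced here for illustration only: :math:`f` is the fraction of the end-to-end runtime that the application spends waiting on the kernel, :math:`s` is the kernel speed-up, and :math:`S` is the resulting end-to-end speed-up.

.. math::

   S(f, s) = \frac{1}{(1 - f) + f / s}

With :math:`s = 1000`, an application that never waits on the kernel (:math:`f = 0`) gets :math:`S = 1`, that is, no improvement at all. At :math:`f = 0.01` the result is :math:`S \approx 1.01`, and only at :math:`f = 0.5` does the end-to-end speed-up approach 2x (:math:`S \approx 1.998`).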
Use ROCm Systems Profiler to obtain a high-level view of the entire application. Use it
to determine where the performance bottlenecks are and
obtain clues to why these bottlenecks are happening. Rather than worrying about kernel
performance, start your investigation with ROCm Systems Profiler, which characterizes the
broad picture.
.. note::
For insight into the execution of individual kernels on the GPU,
use `ROCm Compute Profiler <https://github.com/rocm/rocprofiler-compute>`_.
In terms of CPU analysis, ROCm Systems Profiler does not target any specific vendor.
It works just as well on AMD and non-AMD CPUs.
With regard to the GPU, ROCm Systems Profiler is currently restricted to HIP and HSA APIs
and kernels running on AMD GPUs.
@@ -0,0 +1,56 @@
# MIT License
# Copyright (c) 2023 - 2025 Advanced Micro Devices, Inc. All rights reserved.
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
# The above copyright notice and this permission notice shall be included in all
# copies or substantial portions of the Software.
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
# SOFTWARE.
# Configuration file for the Sphinx documentation builder.
#
# This file only contains a selection of the most common options. For a full
# list see the documentation:
# https://www.sphinx-doc.org/en/master/usage/configuration.html
import re
from rocm_docs import ROCmDocs
with open("../VERSION", encoding="utf-8") as f:
    match = re.search(r"([0-9.]+)[^0-9.]+", f.read())
    if not match:
        raise ValueError("VERSION not found!")
    version_number = match[1]
external_projects_current_project = "rocprofiler-systems"
project = "rocprofiler-systems"
author = "Advanced Micro Devices, Inc."
copyright = "Copyright (c) 2025 Advanced Micro Devices, Inc. All rights reserved."
version = version_number
release = version_number
html_title = f"ROCm Systems Profiler {version} documentation"
external_toc_path = "./sphinx/_toc.yml"
docs_core = ROCmDocs(html_title)
docs_core.setup()
docs_core.run_doxygen(doxygen_root="doxygen", doxygen_path="doxygen/xml")
docs_core.enable_api_reference()
for sphinx_var in ROCmDocs.SPHINX_VARS:
    globals()[sphinx_var] = getattr(docs_core, sphinx_var)
@@ -0,0 +1,3 @@
html/
latex/
xml/
@@ -0,0 +1,373 @@
# Doxyfile 1.8.20
#---------------------------------------------------------------------------
# Project related configuration options
#---------------------------------------------------------------------------
DOXYFILE_ENCODING = UTF-8
PROJECT_NAME = rocprofiler-systems
PROJECT_NUMBER = 1.11.3
PROJECT_BRIEF = "High-level and comprehensive application tracing and profiling on both the CPU and GPU"
PROJECT_LOGO =
OUTPUT_DIRECTORY = .
CREATE_SUBDIRS = NO
ALLOW_UNICODE_NAMES = YES
OUTPUT_LANGUAGE = English
OUTPUT_TEXT_DIRECTION = None
BRIEF_MEMBER_DESC = YES
REPEAT_BRIEF = YES
ABBREVIATE_BRIEF =
ALWAYS_DETAILED_SEC = YES
INLINE_INHERITED_MEMB = YES
FULL_PATH_NAMES = YES
STRIP_FROM_PATH = /home/docs/checkouts/readthedocs.org/user_builds/advanced-micro-devices-rocprofiler-systems/checkouts/
STRIP_FROM_INC_PATH = /home/docs/checkouts/readthedocs.org/user_builds/advanced-micro-devices-rocprofiler-systems/checkouts/
SHORT_NAMES = NO
JAVADOC_AUTOBRIEF = NO
JAVADOC_BANNER = NO
QT_AUTOBRIEF = NO
MULTILINE_CPP_IS_BRIEF = YES
PYTHON_DOCSTRING = YES
INHERIT_DOCS = YES
SEPARATE_MEMBER_PAGES = NO
TAB_SIZE = 4
ALIASES =
OPTIMIZE_OUTPUT_FOR_C = NO
OPTIMIZE_OUTPUT_JAVA = NO
OPTIMIZE_FOR_FORTRAN = NO
OPTIMIZE_OUTPUT_VHDL = NO
OPTIMIZE_OUTPUT_SLICE = NO
EXTENSION_MAPPING = hpp=C++ \
cpp=C++ \
hh=C++ \
cc=C++ \
h=C \
c=C \
py=Python
MARKDOWN_SUPPORT = YES
TOC_INCLUDE_HEADINGS = 2
AUTOLINK_SUPPORT = YES
BUILTIN_STL_SUPPORT = YES
CPP_CLI_SUPPORT = NO
SIP_SUPPORT = NO
IDL_PROPERTY_SUPPORT = YES
DISTRIBUTE_GROUP_DOC = NO
GROUP_NESTED_COMPOUNDS = YES
SUBGROUPING = YES
INLINE_GROUPED_CLASSES = NO
INLINE_SIMPLE_STRUCTS = YES
TYPEDEF_HIDES_STRUCT = NO
LOOKUP_CACHE_SIZE = 5
NUM_PROC_THREADS = 0
#---------------------------------------------------------------------------
# Build related configuration options
#---------------------------------------------------------------------------
EXTRACT_ALL = YES
EXTRACT_PRIVATE = NO
EXTRACT_PRIV_VIRTUAL = NO
EXTRACT_PACKAGE = NO
EXTRACT_STATIC = NO
EXTRACT_LOCAL_CLASSES = YES
EXTRACT_LOCAL_METHODS = NO
EXTRACT_ANON_NSPACES = NO
HIDE_UNDOC_MEMBERS = NO
HIDE_UNDOC_CLASSES = YES
HIDE_FRIEND_COMPOUNDS = NO
HIDE_IN_BODY_DOCS = NO
INTERNAL_DOCS = NO
CASE_SENSE_NAMES = NO
HIDE_SCOPE_NAMES = NO
HIDE_COMPOUND_REFERENCE= NO
SHOW_INCLUDE_FILES = YES
SHOW_GROUPED_MEMB_INC = NO
FORCE_LOCAL_INCLUDES = YES
INLINE_INFO = YES
SORT_MEMBER_DOCS = YES
SORT_BRIEF_DOCS = NO
SORT_MEMBERS_CTORS_1ST = YES
SORT_GROUP_NAMES = NO
SORT_BY_SCOPE_NAME = NO
STRICT_PROTO_MATCHING = NO
GENERATE_TODOLIST = NO
GENERATE_TESTLIST = NO
GENERATE_BUGLIST = NO
GENERATE_DEPRECATEDLIST= NO
ENABLED_SECTIONS =
MAX_INITIALIZER_LINES = 30
SHOW_USED_FILES = YES
SHOW_FILES = YES
SHOW_NAMESPACES = YES
FILE_VERSION_FILTER =
LAYOUT_FILE =
CITE_BIB_FILES =
#---------------------------------------------------------------------------
# Configuration options related to warning and progress messages
#---------------------------------------------------------------------------
QUIET = NO
WARNINGS = YES
WARN_IF_UNDOCUMENTED = YES
WARN_IF_DOC_ERROR = YES
WARN_NO_PARAMDOC = YES
WARN_AS_ERROR = YES
WARN_FORMAT = "---> WARNING! $file:$line: $text"
WARN_LOGFILE = doc/warnings.log
#---------------------------------------------------------------------------
# Configuration options related to the input files
#---------------------------------------------------------------------------
INPUT = ../../README.md \
../../source/lib/rocprof-sys-user/rocprofiler-systems/types.h \
../../source/lib/rocprof-sys-user/rocprofiler-systems/categories.h \
../../source/lib/rocprof-sys-user/rocprofiler-systems/user.h \
../../source/lib/rocprof-sys-user/rocprofiler-systems/causal.h
INPUT_ENCODING = UTF-8
FILE_PATTERNS = *.h \
*.hh \
*.hpp \
*.c \
*.cc \
*.cxx \
*.cpp \
*.c++ \
*.icc \
*.tcc \
*.py
RECURSIVE = YES
EXCLUDE =
EXCLUDE_SYMLINKS = YES
EXCLUDE_PATTERNS = */.git/* \
../../external/* \
../../examples/* \
../../tests/*
EXCLUDE_SYMBOLS = "std::*" \
"ROCPROFSYS_ATTRIBUTE" \
"ROCPROFSYS_VISIBILITY" \
"ROCPROFSYS_PUBLIC_API" \
"ROCPROFSYS_HIDDEN_API" \
"SpaceHandle" \
"KokkosPDevice*"
EXAMPLE_PATH = ../../examples
EXAMPLE_PATTERNS = *.h \
*.hh \
*.hpp \
*.c \
*.cc \
*.cpp \
*.py \
*.txt
EXAMPLE_RECURSIVE = YES
IMAGE_PATH =
INPUT_FILTER =
FILTER_PATTERNS =
FILTER_SOURCE_FILES = NO
FILTER_SOURCE_PATTERNS =
USE_MDFILE_AS_MAINPAGE = ../../README.md
#---------------------------------------------------------------------------
# Configuration options related to source browsing
#---------------------------------------------------------------------------
SOURCE_BROWSER = YES
INLINE_SOURCES = YES
STRIP_CODE_COMMENTS = NO
REFERENCED_BY_RELATION = YES
REFERENCES_RELATION = YES
REFERENCES_LINK_SOURCE = YES
SOURCE_TOOLTIPS = YES
USE_HTAGS = NO
VERBATIM_HEADERS = YES
#---------------------------------------------------------------------------
# Configuration options related to the alphabetical class index
#---------------------------------------------------------------------------
ALPHABETICAL_INDEX = YES
COLS_IN_ALPHA_INDEX = 5
IGNORE_PREFIX =
#---------------------------------------------------------------------------
# Configuration options related to the HTML output
#---------------------------------------------------------------------------
GENERATE_HTML = YES
HTML_OUTPUT = html
HTML_FILE_EXTENSION = .html
HTML_HEADER = ../_doxygen/header.html
HTML_FOOTER = ../_doxygen/footer.html
HTML_STYLESHEET = ../_doxygen/stylesheet.css
HTML_EXTRA_STYLESHEET = ../_doxygen/extra_stylesheet.css
HTML_EXTRA_FILES =
HTML_COLORSTYLE_HUE = 220
HTML_COLORSTYLE_SAT = 100
HTML_COLORSTYLE_GAMMA = 80
HTML_TIMESTAMP = YES
HTML_DYNAMIC_MENUS = YES
HTML_DYNAMIC_SECTIONS = YES
HTML_INDEX_NUM_ENTRIES = 1000
GENERATE_DOCSET = NO
DOCSET_FEEDNAME = "Doxygen generated docs"
DOCSET_BUNDLE_ID = org.doxygen.rocprofiler-systems
DOCSET_PUBLISHER_ID = org.doxygen.amd
DOCSET_PUBLISHER_NAME = "Advanced Micro Devices, Inc."
GENERATE_HTMLHELP = NO
CHM_FILE =
HHC_LOCATION =
GENERATE_CHI = NO
CHM_INDEX_ENCODING =
BINARY_TOC = NO
TOC_EXPAND = YES
GENERATE_QHP = NO
QCH_FILE =
QHP_NAMESPACE =
QHP_VIRTUAL_FOLDER = doc
QHP_CUST_FILTER_NAME =
QHP_CUST_FILTER_ATTRS =
QHP_SECT_FILTER_ATTRS =
QHG_LOCATION =
GENERATE_ECLIPSEHELP = NO
ECLIPSE_DOC_ID = org.doxygen.rocprofiler-systems
DISABLE_INDEX = NO
GENERATE_TREEVIEW = NO
ENUM_VALUES_PER_LINE = 1
TREEVIEW_WIDTH = 300
EXT_LINKS_IN_WINDOW = YES
HTML_FORMULA_FORMAT = png
FORMULA_FONTSIZE = 12
FORMULA_TRANSPARENT = YES
FORMULA_MACROFILE =
USE_MATHJAX = NO
MATHJAX_FORMAT = HTML-CSS
MATHJAX_RELPATH = http://cdn.mathjax.org/mathjax/latest
MATHJAX_EXTENSIONS =
MATHJAX_CODEFILE =
SEARCHENGINE = NO
SERVER_BASED_SEARCH = NO
EXTERNAL_SEARCH = NO
SEARCHENGINE_URL =
SEARCHDATA_FILE = searchdata.xml
EXTERNAL_SEARCH_ID =
EXTRA_SEARCH_MAPPINGS =
#---------------------------------------------------------------------------
# Configuration options related to the LaTeX output
#---------------------------------------------------------------------------
GENERATE_LATEX = NO
LATEX_OUTPUT = latex
LATEX_CMD_NAME = latex
MAKEINDEX_CMD_NAME = makeindex
LATEX_MAKEINDEX_CMD = makeindex
COMPACT_LATEX = NO
PAPER_TYPE = a4wide
EXTRA_PACKAGES = float
LATEX_HEADER =
LATEX_FOOTER =
LATEX_EXTRA_STYLESHEET =
LATEX_EXTRA_FILES =
PDF_HYPERLINKS = YES
USE_PDFLATEX = YES
LATEX_BATCHMODE = YES
LATEX_HIDE_INDICES = NO
LATEX_SOURCE_CODE = YES
LATEX_BIB_STYLE = plain
LATEX_TIMESTAMP = NO
LATEX_EMOJI_DIRECTORY =
#---------------------------------------------------------------------------
# Configuration options related to the RTF output
#---------------------------------------------------------------------------
GENERATE_RTF = NO
RTF_OUTPUT = rtf
COMPACT_RTF = NO
RTF_HYPERLINKS = NO
RTF_STYLESHEET_FILE =
RTF_EXTENSIONS_FILE =
RTF_SOURCE_CODE = NO
#---------------------------------------------------------------------------
# Configuration options related to the man page output
#---------------------------------------------------------------------------
GENERATE_MAN = NO
MAN_OUTPUT = man
MAN_EXTENSION = .3
MAN_SUBDIR =
MAN_LINKS = YES
#---------------------------------------------------------------------------
# Configuration options related to the XML output
#---------------------------------------------------------------------------
GENERATE_XML = YES
XML_OUTPUT = xml
XML_PROGRAMLISTING = YES
XML_NS_MEMB_FILE_SCOPE = YES
#---------------------------------------------------------------------------
# Configuration options related to the DOCBOOK output
#---------------------------------------------------------------------------
GENERATE_DOCBOOK = NO
DOCBOOK_OUTPUT = docbook
DOCBOOK_PROGRAMLISTING = NO
#---------------------------------------------------------------------------
# Configuration options for the AutoGen Definitions output
#---------------------------------------------------------------------------
GENERATE_AUTOGEN_DEF = NO
#---------------------------------------------------------------------------
# Configuration options related to the Perl module output
#---------------------------------------------------------------------------
GENERATE_PERLMOD = NO
PERLMOD_LATEX = NO
PERLMOD_PRETTY = YES
PERLMOD_MAKEVAR_PREFIX =
#---------------------------------------------------------------------------
# Configuration options related to the preprocessor
#---------------------------------------------------------------------------
ENABLE_PREPROCESSING = YES
MACRO_EXPANSION = YES
EXPAND_ONLY_PREDEF = NO
SEARCH_INCLUDES = YES
INCLUDE_PATH = ../../source/lib/rocprof-sys-user
INCLUDE_FILE_PATTERNS = *.h \
*.hpp
PREDEFINED = ROCPROFSYS_PUBLIC_API= \
ROCPROFSYS_HIDDEN_API= \
"ROCPROFSYS_ATTRIBUTE(...)=" \
"ROCPROFSYS_VISIBILITY(...)=" \
"__attribute__(x)=" \
"__declspec(x)=" \
"size_t=unsigned long" \
"uintptr_t=unsigned long" \
DOXYGEN_SHOULD_SKIP_THIS
EXPAND_AS_DEFINED =
SKIP_FUNCTION_MACROS = NO
#---------------------------------------------------------------------------
# Configuration options related to external references
#---------------------------------------------------------------------------
TAGFILES =
GENERATE_TAGFILE = html/tagfile.xml
ALLEXTERNALS = NO
EXTERNAL_GROUPS = YES
EXTERNAL_PAGES = YES
#---------------------------------------------------------------------------
# Configuration options related to the dot tool
#---------------------------------------------------------------------------
CLASS_DIAGRAMS = YES
DIA_PATH =
HIDE_UNDOC_RELATIONS = NO
HAVE_DOT = NO
DOT_NUM_THREADS = 0
DOT_FONTNAME = Helvetica
DOT_FONTSIZE = 12
DOT_FONTPATH =
CLASS_GRAPH = NO
COLLABORATION_GRAPH = YES
GROUP_GRAPHS = YES
UML_LOOK = YES
UML_LIMIT_NUM_FIELDS = 10
TEMPLATE_RELATIONS = YES
INCLUDE_GRAPH = YES
INCLUDED_BY_GRAPH = YES
CALL_GRAPH = NO
CALLER_GRAPH = NO
GRAPHICAL_HIERARCHY = YES
DIRECTORY_GRAPH = YES
DOT_IMAGE_FORMAT = svg
INTERACTIVE_SVG = YES
DOT_PATH = /usr/bin/dot
DOTFILE_DIRS =
MSCFILE_DIRS =
DIAFILE_DIRS =
PLANTUML_JAR_PATH =
PLANTUML_CFG_FILE =
PLANTUML_INCLUDE_PATH =
DOT_GRAPH_MAX_NODES = 50
MAX_DOT_GRAPH_DEPTH = 0
DOT_TRANSPARENT = NO
DOT_MULTI_TARGETS = YES
GENERATE_LEGEND = YES
DOT_CLEANUP = YES
The file diff is not shown because it is too large.
@@ -0,0 +1,71 @@
.. meta::
:description: ROCm Systems Profiler environment validation documentation and reference
:keywords: rocprof-sys, rocprofiler-systems, Omnitrace, ROCm, profiler, environment, tracking, visualization, tool, Instinct, accelerator, AMD
****************************************************
Configuring and validating the environment
****************************************************
After installing `ROCm Systems Profiler <https://github.com/ROCm/rocprofiler-systems>`_, additional steps are required to set up
and validate the environment.
.. note::
The following instructions use the installation path ``/opt/rocprofiler-systems``. If
ROCm Systems Profiler is installed elsewhere, substitute the actual installation path.
Configuring the environment
========================================
After ROCm Systems Profiler is installed, source the ``setup-env.sh`` script to prepend the installation directories to
``PATH``, ``LD_LIBRARY_PATH``, and other environment variables:
.. code-block:: shell
source /opt/rocprofiler-systems/share/rocprofiler-systems/setup-env.sh
Alternatively, if environment modules are supported, add the ``<prefix>/share/modulefiles`` directory
to ``MODULEPATH``:
.. code-block:: shell
module use /opt/rocprofiler-systems/share/modulefiles
.. note::
As an alternative, the above line can be added to the ``${HOME}/.modulerc`` file.
After ROCm Systems Profiler has been added to the ``MODULEPATH``, it can be loaded
using ``module load rocprofiler-systems/<VERSION>`` and unloaded using ``module unload rocprofiler-systems/<VERSION>``.
.. code-block:: shell
module load rocprofiler-systems/1.0.0
module unload rocprofiler-systems/1.0.0
.. note::
You might also need to add the path to the ROCm libraries to ``LD_LIBRARY_PATH``,
for example, ``export LD_LIBRARY_PATH=/opt/rocm/lib:${LD_LIBRARY_PATH}``.
Validating the environment configuration
========================================
If the following commands all run successfully with the expected output,
then you are ready to use ROCm Systems Profiler:
.. code-block:: shell
which rocprof-sys
which rocprof-sys-avail
which rocprof-sys-sample
rocprof-sys-instrument --help
rocprof-sys-avail --all
rocprof-sys-sample --help
If ROCm Systems Profiler was built with Python support, validate these additional commands:
.. code-block:: shell
which rocprof-sys-python
rocprof-sys-python --help
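The ``which`` checks above can also be scripted. The following is a minimal sketch and not part of ROCm Systems Profiler: ``check_tools`` is a hypothetical helper name, and the tool list is taken from the commands shown in this section. It relies only on the POSIX ``command -v`` built-in.

.. code-block:: shell

   # Hypothetical helper: report every listed command that is not on PATH.
   # Returns 0 if all commands were found and 1 otherwise.
   check_tools() {
       missing=0
       for tool in "$@"; do
           if ! command -v "$tool" > /dev/null 2>&1; then
               echo "not found on PATH: $tool"
               missing=1
           fi
       done
       return "$missing"
   }

   if check_tools rocprof-sys-instrument rocprof-sys-avail rocprof-sys-sample; then
       echo "all listed ROCm Systems Profiler tools were found"
   fi

If a tool is reported as missing, re-source ``setup-env.sh`` or reload the module and run the check again.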
@@ -0,0 +1,60 @@
.. meta::
:description: ROCm Systems Profiler general tips and usage documentation and reference
:keywords: rocprof-sys, rocprofiler-systems, Omnitrace, ROCm, tips, how to, profiler, tracking, visualization, tool, Instinct, accelerator, AMD
********************************************
General tips for using ROCm Systems Profiler
********************************************
Follow these general guidelines when using ROCm Systems Profiler. For an explanation of the terms used in this topic, see
the :doc:`ROCm Systems Profiler glossary <../reference/rocprof-sys-glossary>`.
* Use ``rocprof-sys-avail`` to look up configuration settings, hardware counters, and data collection components
* Use the ``-d`` flag for descriptions
* Generate a default configuration with ``rocprof-sys-avail -G ${HOME}/.rocprof-sys.cfg`` and adjust it
to the desired default behavior
* **Decide whether binary instrumentation, statistical sampling, or both** provides the desired performance data (for non-Python applications)
* Compile code with optimization enabled (``-O2`` or higher), disable asserts (for example, with ``-DNDEBUG``), and include debug info (``-g1`` at a minimum)
* Compiling with debug info does not slow down the code; it only increases compile time and the size of the binary
* In CMake, this is generally done by setting either ``CMAKE_BUILD_TYPE=RelWithDebInfo``, or ``CMAKE_BUILD_TYPE=Release`` together with ``CMAKE_<LANG>_FLAGS=-g1``
* **Use binary instrumentation for characterizing the performance of every invocation of specific functions**
* **Use statistical sampling to characterize the performance of the entire application while minimizing overhead**
* Enable statistical sampling after binary instrumentation to help "fill in the gaps" between instrumented regions
* Use the user API to create custom regions and enable/disable ROCm Systems Profiler for specific processes, threads, and regions
* Dynamic symbol interception, callback APIs, and the user API are always available with binary instrumentation and sampling
* Dynamic symbol interception and callback APIs are (generally) controlled through ``ROCPROFSYS_USE_<API>``
options, for example, ``ROCPROFSYS_USE_KOKKOSP`` and ``ROCPROFSYS_USE_OMPT`` enable Kokkos-Tools and OpenMP-Tools
callbacks, respectively
* When generically seeking regions for performance improvement:
* **Start off by collecting a flat profile**
* Look for functions with high call counts, large cumulative runtimes/values, or large standard deviations
* When call counts are high, improving the performance of the function or inlining it can result in quick and easy performance improvements
* When the standard deviation is high, collect a hierarchical profile and see if the high variation can be attributed to the calling context.
In this scenario, consider creating a specialized version of the function for the longer-running contexts
* **Collect a hierarchical profile** and verify that the functions indicated in the flat profile are part of the "critical path" of your
application
* For example, functions with high call counts but which are part of a "setup" or "post-processing"
phase that does not consume much time relative to the overall time are generally a lower priority for optimization
* **Use the information from the profiles when analyzing detailed traces**
* When using binary instrumentation in "trace" mode, **binary rewrites are preferable to runtime instrumentation**.
* Binary rewrites only instrument the functions defined in the target binary, whereas runtime instrumentation might instrument functions defined in the shared libraries which are linked into the target binary
* When using binary instrumentation with MPI, avoid runtime instrumentation
* Runtime instrumentation requires a fork and a ``ptrace``, which is generally incompatible with how MPI applications spawn processes
* Perform a binary rewrite of the MPI executable (and, optionally, the libraries it uses) and run
the generated instrumented executable using ``rocprof-sys-run`` instead of the original.
For example, instead of ``mpirun -n 2 ./myexe``, use ``mpirun -n 2 rocprof-sys-run -- ./myexe.inst``, where
``myexe.inst`` is the instrumented ``myexe`` executable that was generated.
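The MPI workflow in the last tip can be sketched as follows. The executable name ``myexe`` and the rank count are the placeholders used above. ``run`` and ``DRY_RUN`` are hypothetical names introduced here so that the commands can be reviewed before they are executed. The ``-o`` option and the ``--`` separator are the ones documented for ``rocprof-sys-instrument``; verify them against your installed version.

.. code-block:: shell

   # Hypothetical helper: print each command, and execute it only when DRY_RUN=0.
   DRY_RUN="${DRY_RUN:-1}"
   run() {
       echo "+ $*"
       if [ "$DRY_RUN" = "0" ]; then
           "$@"
       fi
   }

   # 1. Binary rewrite: generate an instrumented copy of the MPI executable.
   run rocprof-sys-instrument -o ./myexe.inst -- ./myexe

   # 2. Launch the instrumented executable through rocprof-sys-run under MPI.
   run mpirun -n 2 rocprof-sys-run -- ./myexe.inst

By default the sketch only prints the two command lines. Setting ``DRY_RUN=0`` in the environment executes them.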
@@ -0,0 +1,939 @@
.. meta::
:description: ROCm Systems Profiler binary instrumentation and rewrite documentation and reference
:keywords: rocprof-sys, rocprofiler-systems, Omnitrace, ROCm, binary instrumentation, binary rewrite, profiler, tracking, visualization, tool, Instinct, accelerator, AMD
****************************************************
Instrumenting and rewriting a binary application
****************************************************
There are three ways to perform instrumentation with the ``rocprof-sys-instrument`` executable:
* Runtime instrumentation
* Attaching to an already running process
* Binary rewrite
Here is a comparison of the three modes:
* Runtime instrumentation of the application using the ``rocprof-sys-instrument`` executable
(analogous to ``gdb --args <program> <args>``)
* This mode is the default if neither the ``-p`` nor ``-o`` command-line options are used
* Runtime instrumentation supports instrumenting not only the target executable but also
the shared libraries loaded by the target executable. Consequently, this mode consumes more memory,
takes longer to perform the instrumentation, and tends to add more significant overhead to the
runtime of the application.
* This mode is recommended if you want to analyze not only the performance of your executable and/or
libraries but also the performance of the library dependencies
* Attaching to a process that is currently running (analogous to ``gdb -p <PID>``)
* This mode is activated using ``-p <PID>``
* The same caveats as for runtime instrumentation apply with respect to memory and overhead
.. note::
Attaching to a running process is an alpha feature. Detaching from the target process
without ending it is not currently supported.
* Binary rewrite to generate a new executable or library with the instrumentation built-in
* This mode is activated through the ``-o <output-file>`` option
* Binary rewriting is limited to the text section of the target executable or library. It does not instrument
the dynamically-linked libraries. Consequently, this mode performs the
instrumentation significantly faster
and has a much lower overhead when running the instrumented executable and libraries.
* Binary rewriting is the recommended mode when the target executable uses
process-level parallelism (for example, MPI)
* If the target executable has a minimal ``main`` routine and the bulk of your
application is in one specific dynamic library,
see :ref:`binary-rewriting-library-label` for help
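As a sketch, the three modes correspond to command lines of the following shape. ``./myexe``, its ``--size 100`` argument, and the PID ``12345`` are placeholders, and ``instrument_cmd`` is a hypothetical helper that only prints the command line for a mode rather than running it. The ``-p``, ``-o``, and ``--`` forms come from the ``rocprof-sys-instrument`` help output. The exact form accepted when attaching is an assumption and might differ between versions.

.. code-block:: shell

   # Hypothetical helper: print the rocprof-sys-instrument command line for a mode.
   # Usage: instrument_cmd <runtime|attach|rewrite> <target> [args...]
   instrument_cmd() {
       mode="$1"
       target="$2"
       shift 2
       case "$mode" in
           runtime) echo "rocprof-sys-instrument --" "$target" "$@" ;;
           attach)  echo "rocprof-sys-instrument -p $target" ;;
           rewrite) echo "rocprof-sys-instrument -o $target.inst -- $target" ;;
           *)       echo "unknown mode: $mode" >&2; return 1 ;;
       esac
   }

   instrument_cmd runtime ./myexe --size 100   # launch and instrument at runtime
   instrument_cmd attach 12345                 # attach to a running process (alpha)
   instrument_cmd rewrite ./myexe              # write the instrumented ./myexe.inst

The helper mirrors the rule stated above: without ``-p`` or ``-o``, the tool defaults to runtime instrumentation.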
The rocprof-sys-instrument executable
========================================
Instrumentation is performed with the ``rocprof-sys-instrument`` executable. For more details, use the ``-h`` or ``--help`` option to
view the help menu.
.. code-block:: shell
$ rocprof-sys-instrument --help
[rocprof-sys-instrument] Usage: rocprof-sys-instrument [ --help (count: 0, dtype: bool)
--version (count: 0, dtype: bool)
--verbose (max: 1, dtype: bool)
--error (max: 1, dtype: boolean)
--debug (max: 1, dtype: bool)
--log (count: 1)
--log-file (count: 1)
--simulate (max: 1, dtype: boolean)
--print-format (min: 1, dtype: string)
--print-dir (count: 1, dtype: string)
--print-available (count: 1)
--print-instrumented (count: 1)
--print-coverage (count: 1)
--print-excluded (count: 1)
--print-overlapping (count: 1)
--print-instructions (max: 1, dtype: bool)
--output (min: 0, dtype: string)
--pid (count: 1, dtype: int)
--mode (count: 1)
--force (max: 1, dtype: bool)
--command (count: 1)
--prefer (count: 1)
--library (count: unlimited)
--main-function (count: 1)
--load (count: unlimited, dtype: string)
--load-instr (count: unlimited, dtype: filepath)
--init-functions (count: unlimited, dtype: string)
--fini-functions (count: unlimited, dtype: string)
--all-functions (max: 1, dtype: boolean)
--function-include (count: unlimited)
--function-exclude (count: unlimited)
--function-restrict (count: unlimited)
--caller-include (count: unlimited)
--module-include (count: unlimited)
--module-exclude (count: unlimited)
--module-restrict (count: unlimited)
--internal-function-include (count: unlimited)
--internal-module-include (count: unlimited)
--instruction-exclude (count: unlimited)
--internal-library-deps (min: 0, dtype: boolean)
--internal-library-append (count: unlimited)
--internal-library-remove (count: unlimited)
--linkage (min: 1)
--visibility (min: 1)
--label (count: unlimited, dtype: string)
--config (min: 1, dtype: string)
--default-components (count: unlimited, dtype: string)
--env (count: unlimited)
--mpi (max: 1, dtype: bool)
--instrument-loops (max: 1, dtype: boolean)
--min-instructions (count: 1, dtype: int)
--min-address-range (count: 1, dtype: int)
--min-instructions-loop (count: 1, dtype: int)
--min-address-range-loop (count: 1, dtype: int)
--coverage (max: 1, dtype: bool)
--dynamic-callsites (max: 1, dtype: boolean)
--traps (max: 1, dtype: boolean)
--loop-traps (max: 1, dtype: boolean)
--allow-overlapping (max: 1, dtype: bool)
--parse-all-modules (max: 1, dtype: bool)
--batch-size (count: 1, dtype: int)
--dyninst-rt (min: 1, dtype: filepath)
--dyninst-options (count: unlimited)
] -- <CMD> <ARGS>
Options:
-h, -?, --help Shows this page
--version Prints the version and exit
[DEBUG OPTIONS]
-v, --verbose Verbose output
-e, --error All warnings produce runtime errors
--debug Debug output
--log Number of log entries to display after an error. Any value < 0 will emit the entire log
--log-file Write the log out the specified file during the run
--simulate Exit after outputting diagnostic {available,instrumented,excluded,overlapping} module
function lists, e.g. available.txt
--print-format [ json | txt | xml ]
Output format for diagnostic {available,instrumented,excluded,overlapping} module
function lists, e.g. {print-dir}/available.txt
--print-dir Output directory for diagnostic {available,instrumented,excluded,overlapping} module
function lists, e.g. {print-dir}/available.txt
--print-available [ functions | functions+ | modules | pair | pair+ ]
Print the available entities for instrumentation (functions, modules, or module-function
pair) to stdout after applying regular expressions
--print-instrumented [ functions | functions+ | modules | pair | pair+ ]
Print the instrumented entities (functions, modules, or module-function pair) to stdout
after applying regular expressions
--print-coverage [ functions | functions+ | modules | pair | pair+ ]
Print the instrumented coverage entities (functions, modules, or module-function pair) to
stdout after applying regular expressions
--print-excluded [ functions | functions+ | modules | pair | pair+ ]
Print the entities for instrumentation (functions, modules, or module-function pair)
which are excluded from the instrumentation to stdout after applying regular expressions
--print-overlapping [ functions | functions+ | modules | pair | pair+ ]
Print the entities for instrumentation (functions, modules, or module-function pair)
which overlap other function calls or have multiple entry points to stdout after applying
regular expressions
--print-instructions Print the instructions for each basic-block in the JSON/XML outputs
[MODE OPTIONS]
-o, --output Enable generation of a new executable (binary-rewrite). If a filename is not provided,
rocprof-sys will use the basename and output to the cwd, unless the target binary is in the
cwd. In the latter case, rocprof-sys will either use ${PWD}/<basename>.inst (non-libraries)
or ${PWD}/instrumented/<basename> (libraries)
-p, --pid Connect to running process
-M, --mode [ coverage | sampling | trace ]
Instrumentation mode. 'trace' mode instruments the selected functions, 'sampling' mode
only instruments the main function to start and stop the sampler.
-f, --force Force the command-line argument configuration, i.e. don't get cute. Useful for forcing
runtime instrumentation of an executable that [A] Dyninst thinks is a library after
reading ELF and [B] whose name makes it look like a library (e.g. starts with 'lib'
and/or ends in '.so', '.so.*', or '.a')
-c, --command Input executable and arguments (if '-- <CMD>' not provided)
[LIBRARY OPTIONS]
--prefer [ shared | static ] Prefer this library types when available
-L, --library Libraries with instrumentation routines (default: "librocprof-sys-dl")
-m, --main-function The primary function to instrument around, e.g. 'main'
--load Supplemental instrumentation library names w/o extension (e.g. 'libinstr' for
'libinstr.so' or 'libinstr.a')
--load-instr Load {available,instrumented,excluded,overlapping}-instr JSON or XML file(s) and override
what is read from the binary
--init-functions Initialization function(s) for supplemental instrumentation libraries (see '--load'
option)
--fini-functions Finalization function(s) for supplemental instrumentation libraries (see '--load' option)
--all-functions When finding functions, include the functions which are not instrumentable. This is
purely diagnostic for the available/excluded functions output
[SYMBOL SELECTION OPTIONS]
-I, --function-include Regex(es) for including functions (despite heuristics)
-E, --function-exclude Regex(es) for excluding functions (always applied)
-R, --function-restrict Regex(es) for restricting functions only to those that match the provided
regular-expressions
--caller-include Regex(es) for including functions that call the listed functions (despite heuristics)
-MI, --module-include Regex(es) for selecting modules/files/libraries (despite heuristics)
-ME, --module-exclude Regex(es) for excluding modules/files/libraries (always applied)
-MR, --module-restrict Regex(es) for restricting modules/files/libraries only to those that match the provided
regular-expressions
--internal-function-include Regex(es) for including functions which are (likely) utilized by rocprof-sys itself. Use
this option with care.
--internal-module-include Regex(es) for including modules/libraries which are (likely) utilized by rocprof-sys
itself. Use this option with care.
--instruction-exclude Regex(es) for excluding functions containing certain instructions
--internal-library-deps Treat the libraries linked to the internal libraries as internal libraries. This increase
the internal library processing time and consume more memory (so use with care) but may
be useful when the application uses Boost libraries and Dyninst is dynamically linked
against the same boost libraries
--internal-library-append Append to the list of libraries which rocprof-sys treats as being used internally, e.g.
ROCm Systems Profiler will find all the symbols in this library and prevent them from being
instrumented.
--internal-library-remove [ ld-linux-x86-64.so.2
libBrokenLocale.so.1
libanl.so.1
libbfd.so
libbz2.so
libc.so.6
libcaliper.so
libcommon.so
libcrypt.so.1
libdl.so.2
libdw.so
libdwarf.so
libdyninstAPI_RT.so
libelf.so
libgcc_s.so.1
libgotcha.so
liblikwid.so
liblzma.so
libnsl.so.1
libnss_compat.so.2
libnss_db.so.2
libnss_dns.so.2
libnss_files.so.2
libnss_hesiod.so.2
libnss_ldap.so.2
libnss_nis.so.2
libnss_nisplus.so.2
libnss_test1.so.2
libnss_test2.so.2
libpapi.so
libpfm.so
libprofiler.so
libpthread.so.0
libresolv.so.2
libamd_smi64.so
librocprofiler-sdk.so
librt.so.1
libstdc++.so.6
libtbb.so
libtbbmalloc.so
libtbbmalloc_proxy.so
libtcmalloc.so
libtcmalloc_and_profiler.so
libtcmalloc_debug.so
libtcmalloc_minimal.so
libtcmalloc_minimal_debug.so
libthread_db.so.1
libunwind-coredump.so
libunwind-generic.so
libunwind-ptrace.so
libunwind-setjmp.so
libunwind-x86_64.so
libunwind.so
libutil.so.1
libz.so
libzstd.so ]
Remove the specified libraries from being treated as being used internally, e.g.
ROCm System Profiler will permit all the symbols in these libraries to be eligible for
instrumentation.
--linkage [ global | local | unique | unknown | weak ]
Only instrument functions with specified linkage (default: global, local, unique)
--visibility [ default | hidden | internal | protected | unknown ]
Only instrument functions with specified visibility (default: default, internal, hidden,
protected)
[RUNTIME OPTIONS]
--label [ args | file | line | return ]
Labeling info for functions. By default, just the function name is recorded. Use these
options to gain more information about the function signature or location of the
functions
-C, --config Read in a configuration file and encode these values as the defaults in the executable
-d, --default-components Default components to instrument (only useful when timemory is enabled in rocprof-sys
library)
--env Environment variables to add to the runtime in form VARIABLE=VALUE. E.g. use '--env
ROCPROFSYS_PROFILE=ON' to default to using timemory instead of perfetto
--mpi Enable MPI support (requires rocprof-sys built w/ full or partial MPI support). NOTE: this
will automatically be activated if MPI_Init, MPI_Init_thread, MPI_Finalize,
MPI_Comm_rank, or MPI_Comm_size are found in the symbol table of target
[GRANULARITY OPTIONS]
-l, --instrument-loops Instrument at the loop level
-i, --min-instructions If the number of instructions in a function is less than this value, exclude it from
instrumentation
-r, --min-address-range If the address range of a function is less than this value, exclude it from
instrumentation
--min-instructions-loop If the number of instructions in a function containing a loop is less than this value,
exclude it from instrumentation
--min-address-range-loop If the address range of a function containing a loop is less than this value, exclude it
from instrumentation
--coverage [ basic_block | function | none ]
Enable recording the code coverage. If instrumenting in coverage mode ('-M coverage'),
this simply specifies the granularity. If instrumenting in trace or sampling mode, this
enables recording code-coverage in addition to the instrumentation of that mode (if any).
--dynamic-callsites Force instrumentation if a function has dynamic callsites (e.g. function pointers)
--traps Instrument points which require using a trap. On the x86 architecture, because
instructions are of variable size, the instruction at a point may be too small for
Dyninst to replace it with the normal code sequence used to call instrumentation. Also,
when instrumentation is placed at points other than subroutine entry, exit, or call
points, traps may be used to ensure the instrumentation fits. In this case, Dyninst
replaces the instruction with a single-byte instruction that generates a trap.
--loop-traps Instrument points within a loop which require using a trap (only relevant when
--instrument-loops is enabled).
--allow-overlapping Allow dyninst to instrument either multiple functions which overlap (share part of same
function body) or single functions with multiple entry points. For more info, see Section
2 of the DyninstAPI documentation.
--parse-all-modules By default, rocprof-sys simply requests Dyninst to provide all the procedures in the
application image. If this option is enabled, rocprof-sys will iterate over all the modules
and extract the functions. Theoretically, it should be the same but the data is slightly
different, possibly due to weak binding scopes. In general, enabling option will probably
have no visible effect
[DYNINST OPTIONS]
-b, --batch-size Dyninst supports batch insertion of multiple points during runtime instrumentation. If
one large batch insertion fails, this value will be used to create smaller batches.
Larger batches generally decrease the instrumentation time
--dyninst-rt Path(s) to the dyninstAPI_RT library
--dyninst-options [ BaseTrampDeletion
DebugParsing
DelayedParsing
InstrStackFrames
MergeTramp
SaveFPR
TrampRecursive
TypeChecking ]
Advanced dyninst options: BPatch::set<OPTION>(bool), e.g. bpatch->setTrampRecursive(true)
``rocprof-sys-instrument`` uses a syntax similar to LLVM's to separate its own command-line arguments from the
application's arguments. It uses a standalone
double-hyphen (``--``) as a separator.
All arguments preceding the double-hyphen
are interpreted as belonging to ROCm Systems Profiler and all arguments following the
double-hyphen are interpreted as being part of the
application and its arguments. In binary rewrite mode, all application arguments after the first argument
are ignored. As an example, ``./rocprof-sys-instrument -o ls.inst -- ls -l`` interprets ``ls`` as
the target to instrument, ignoring the ``-l`` argument,
and generates a ``ls.inst`` executable that you can subsequently run using the
``rocprof-sys-run -- ls.inst -l`` command.
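The separator convention can be illustrated with a small shell sketch. The
``split_at_double_hyphen`` helper is hypothetical and is not part of ROCm Systems Profiler.
It only shows how a command line divides at the first standalone ``--``, under the assumption
that any later ``--`` belongs to the application.

.. code-block:: shell

   # Hypothetical helper: label each argument by the side of the first
   # standalone "--" on which it appears.
   split_at_double_hyphen() {
       side="profiler"
       for arg in "$@"; do
           if [ "$side" = "profiler" ] && [ "$arg" = "--" ]; then
               side="application"
               continue
           fi
           printf '%s: %s\n' "$side" "$arg"
       done
   }

   # Prints "profiler: -o", "profiler: ls.inst", "application: ls", "application: -l"
   split_at_double_hyphen -o ls.inst -- ls -l
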
Runtime instrumentation example
========================================
The following example shows how to enable runtime instrumentation.
.. code-block:: shell
rocprof-sys-instrument <rocprof-sys-options> -- <exe> [<exe-options>...]
Attaching to a running process
========================================
Use the following command to attach to an active process.
.. code-block:: shell
rocprof-sys-instrument <rocprof-sys-options> -p <PID> -- <exe-name>
Binary rewrite
========================================
This example demonstrates how to rewrite a binary.
.. code-block:: shell
rocprof-sys-instrument <rocprof-sys-options> -o <name-of-new-exe-or-library> -- <exe-or-library>
.. _binary-rewriting-library-label:
Binary rewrite of a library
-----------------------------------
Many applications bundle the bulk of their functionality into one or more
dynamic libraries and have a relatively simple ``main``
which links to these libraries and serves as the "driver" for
setting up the workflow. If you perform a binary rewrite of an
executable like this and find there is insufficient information, you
can either switch to runtime instrumentation or perform a
binary rewrite on the relevant libraries.
Support for stand-alone binary rewriting of a dynamic library without a binary rewrite of
the executable is a beta feature.
In general, it is supported as long as the library contains the ``_init`` and
``_fini`` symbols, but these symbols are not
standardized to the same extent as ``main`` is in an executable.
Here is the recommended workflow for the binary rewrite of a library:
#. Determine the names of the dynamically linked libraries of interest using ``ldd``
#. Generate a binary rewrite of the executable
#. Generate a binary rewrite of the desired libraries with the same base name as the
original library, for example, ``libfoo.so.2`` instead of ``libfoo.so``, and output the instrumented
library into a different folder than the original library.
#. Prefix the ``LD_LIBRARY_PATH`` environment variable with the output folder from the previous step
#. Use ``ldd`` to verify that the instrumented executable can resolve the location of the instrumented library
Binary rewrite of a library example
-----------------------------------
The ``foo`` executable is dynamically linked to ``libfoo.so.2``:
.. code-block:: shell
$ pwd
/home/user
$ which foo
/usr/local/bin/foo
$ ldd /usr/local/bin/foo
...
libfoo.so.2 => /usr/local/lib/libfoo.so.2 (...)
...
Generate binary rewrites of ``foo`` and ``libfoo.so.2``:
.. code-block:: shell
rocprof-sys-instrument -o ./foo.inst -- foo
rocprof-sys-instrument -o ./libfoo.so.2 -- /usr/local/lib/libfoo.so.2
At this point, the instrumented ``foo.inst`` executable still dynamically loads the
original ``libfoo.so.2`` in ``/usr/local/lib``:
.. code-block:: shell
$ ldd ./foo.inst
...
libfoo.so.2 => /usr/local/lib/libfoo.so.2 (...)
...
Prefix the ``LD_LIBRARY_PATH`` environment variable with the folder containing
the instrumented ``libfoo.so.2``:
.. code-block:: shell
export LD_LIBRARY_PATH=/home/user:${LD_LIBRARY_PATH}
``foo.inst`` now loads the instrumented library when it runs:
.. code-block:: shell
$ ldd ./foo.inst
...
libfoo.so.2 => /home/user/libfoo.so.2 (...)
...
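If ``LD_LIBRARY_PATH`` is unset or empty, the ``export`` shown earlier leaves a trailing colon,
and the dynamic loader treats an empty entry as the current working directory. The following is
a minimal sketch of a guard that uses standard shell parameter expansion. The ``prepend_path``
helper name is an assumption made for illustration.

.. code-block:: shell

   # Hypothetical helper: prepend directory $1 to the colon-separated list $2
   # without leaving a trailing ":" when the list is empty.
   prepend_path() {
       printf '%s\n' "$1${2:+:$2}"
   }

   LD_LIBRARY_PATH=$(prepend_path /home/user "${LD_LIBRARY_PATH:-}")
   export LD_LIBRARY_PATH
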
Selective instrumentation
========================================
The default behavior of ``rocprof-sys-instrument`` does not instrument every symbol in the binary.
The default rules are:
* Skip instrumenting dynamic call-sites (such as function pointers)
* The ``--dynamic-callsites`` option forces instrumentation for all dynamic call-sites
* The cost of a function can be loosely approximated by the number of
instructions. By default, ``rocprof-sys-instrument`` only instruments functions
with at least 1024 instructions
* The ``--min-instructions`` option modifies this heuristic for all functions which do not contain loops
* The ``--min-instructions-loop`` option modifies this heuristic for functions which contain loops.
* The cost of a function can also be loosely approximated by the size of the function
in the binary, so this heuristic can be used in lieu of or in addition to the
minimum number of instructions
* The ``--min-address-range`` option modifies this heuristic for all functions which do not contain loops
* The ``--min-address-range-loop`` option modifies this heuristic for functions which contain loops
* Skip instrumentation points which require using a trap
* See the description for the ``--traps`` and ``--loop-traps`` options for more information
* Skip instrumenting loops within the body of a function
* The ``--instrument-loops`` option enables loop-level instrumentation
* Skip instrumenting functions with overlapping function bodies and single
functions with multiple entry points
* These behaviors arise from various optimizations. Enable instrumenting for these functions
by using the ``--allow-overlapping`` option
.. note::
The separate loop options ``--min-instructions-loop`` and ``--min-address-range-loop``
are provided because functions with loops can be compact in the binary while still being costly to execute.
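To get a feel for how a size threshold prunes small functions, the following sketch filters
rows by the ``AddressRange`` column of the available-functions listing shown later in this
document. It emulates the idea only and is not the tool's implementation. The
``filter_min_address_range`` helper and the threshold of ``1000`` are assumptions chosen for
illustration, and the sample rows are copied from the lulesh listing.

.. code-block:: shell

   # Hypothetical helper: read "AddressRange Module Function" rows on stdin and
   # print the function name of every row whose AddressRange is at least $1.
   filter_min_address_range() {
       awk -v min="$1" '$1 + 0 >= min + 0 { print $3 }'
   }

   # Prints CommMonoQ and Domain::BuildMesh; the two smaller functions are dropped.
   printf '%s\n' \
       '9165 ../examples/lulesh/lulesh-comm.cc CommMonoQ' \
       '126 ../examples/lulesh/lulesh-comm.cc _GLOBAL__sub_I_lulesh_comm.cc' \
       '218 ../examples/lulesh/lulesh-init.cc InitMeshDecomp' \
       '1264 ../examples/lulesh/lulesh-init.cc Domain::BuildMesh' |
       filter_min_address_range 1000
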
Viewing the available, instrumented, excluded, and overlapping functions
-------------------------------------------------------------------------
Whenever ``rocprof-sys-instrument`` runs with a verbosity of zero or higher,
it generates files that list which functions
were available for instrumentation (along with the module each was defined in), which were actually instrumented,
which were excluded, and which contained overlapping function bodies.
By default, these files are saved to the ``rocprof-sys-<NAME>-output`` folder
where ``<NAME>`` is the base name of the targeted binary (or
the base name of the resulting executable in the case of binary rewrite). For example,
``rocprof-sys-instrument -- ls`` outputs these files to ``rocprof-sys-ls-output``
whereas ``rocprof-sys-instrument -o ls.inst -- ls`` places them in ``rocprof-sys-ls.inst-output``.
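The naming rule can be sketched in shell as follows. The ``default_output_dir`` helper is
hypothetical and only mirrors the rule as described here. It does not query the tool.

.. code-block:: shell

   # Hypothetical helper mirroring the naming rule described above.
   #   $1: target binary
   #   $2: (optional) value passed to -o in binary rewrite mode
   default_output_dir() {
       name=$(basename "${2:-$1}")
       printf 'rocprof-sys-%s-output\n' "$name"
   }

   default_output_dir /usr/bin/ls            # rocprof-sys-ls-output
   default_output_dir /usr/bin/ls ls.inst    # rocprof-sys-ls.inst-output
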
To generate these files without running or generating an
executable, use the ``--simulate`` option:
.. code-block:: shell
rocprof-sys-instrument --simulate -- foo
rocprof-sys-instrument --simulate -o foo.inst -- foo
Excluding and including modules and functions
----------------------------------------------
ROCm Systems Profiler provides six command-line options, each of which accepts one or more
regular expressions for customizing which modules and/or functions are
instrumented. Multiple regex patterns per option are treated as an OR operation,
for example, ``--module-include libfoo libbar`` is effectively the same as ``--module-include 'libfoo|libbar'``.
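The OR semantics can be checked outside the tool with ``grep -E`` as a stand-in. This is only
a sketch: the library names are made up, and the regular expression dialect used by
``rocprof-sys-instrument`` may differ from the one ``grep`` uses.

.. code-block:: shell

   # Made-up module names for illustration.
   modules='libfoo.so
   libbar.so
   libbaz.so'

   # Two separate patterns: a name is selected if either one matches ...
   separate=$(printf '%s\n' "$modules" | grep -E -e 'libfoo' -e 'libbar')
   # ... which selects the same names as a single alternation.
   combined=$(printf '%s\n' "$modules" | grep -E 'libfoo|libbar')

   # Prints libfoo.so and libbar.so when both selections agree.
   [ "$separate" = "$combined" ] && printf '%s\n' "$separate"
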
To force the inclusion of certain modules and/or functions
without changing any of the heuristics, use the ``--module-include`` and/or ``--function-include`` options.
These options do not exclude modules or functions which do
not satisfy their regular expression.
To narrow the scope of the instrumentation to a specific set
of libraries and/or functions, use the ``--module-restrict`` and ``--function-restrict`` options.
These options let you exclusively select the union of one or more
regular expressions, regardless of whether or not the functions satisfy the
previously-mentioned default heuristics. Any function or module that is not within
the union of these regular expressions is excluded from instrumentation.
To avoid instrumenting a set of modules and/or functions,
use the ``--module-exclude`` and ``--function-exclude`` options.
These options are always applied, even if the module or function
satisfies the "restrict" or "include" regular expression.
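The interaction between the three kinds of options can be summarized with a simplified model.
The sketch below is an interpretation of the rules as described in this section, not the tool's
actual implementation. The ``matches`` and ``is_instrumented`` helpers are hypothetical, an
empty pattern means the option was not given, and ``grep -E`` stands in for the tool's regular
expression engine.

.. code-block:: shell

   # Hypothetical helper: succeed if name $1 matches the non-empty regex $2.
   matches() { [ -n "$2" ] && printf '%s\n' "$1" | grep -Eq -- "$2"; }

   # Hypothetical model: decide whether one function would be instrumented.
   #   $1 name, $2 "pass" or "fail" (result of the default heuristics),
   #   $3 include regex, $4 restrict regex, $5 exclude regex
   is_instrumented() {
       # "exclude" is always applied
       if matches "$1" "$5"; then return 1; fi
       # "restrict" keeps only matching names, regardless of the heuristics
       if [ -n "$4" ]; then matches "$1" "$4"; return $?; fi
       # "include" forces inclusion despite the heuristics
       [ "$2" = "pass" ] || matches "$1" "$3"
   }

   is_instrumented tiny_helper fail 'tiny_.*' '' ''       && echo "tiny_helper: instrumented"
   is_instrumented tiny_helper fail 'tiny_.*' '' 'helper' || echo "tiny_helper: excluded"

The first call succeeds because the "include" pattern overrides the failed heuristics. The
second fails because the "exclude" pattern takes precedence over everything else.
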
.. _available-module-function-output:
An example of the available module and function info output
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
.. code-block:: shell
rocprof-sys-instrument -o lulesh.inst --label file line args --simulate -- lulesh
.. code-block:: shell
AddressRange Module Function FunctionSignature
9165 ../examples/lulesh/lulesh-comm.cc CommMonoQ CommMonoQ(domain) [lulesh-comm.cc:1891]
3396 ../examples/lulesh/lulesh-comm.cc CommRecv CommRecv(domain, int, Index_t, Index_t, Index_t, Index_t, bool, bool) [lulesh...
8666 ../examples/lulesh/lulesh-comm.cc CommSBN CommSBN(domain, int, Domain_member *) [lulesh-comm.cc:926]
10212 ../examples/lulesh/lulesh-comm.cc CommSend CommSend(domain, int, Index_t, Domain_member *, Index_t, Index_t, Index_t, bo...
6823 ../examples/lulesh/lulesh-comm.cc CommSyncPosVel CommSyncPosVel(domain) [lulesh-comm.cc:1404]
126 ../examples/lulesh/lulesh-comm.cc _GLOBAL__sub_I_lulesh_comm.cc _GLOBAL__sub_I_lulesh_comm.cc() [lulesh-comm.cc]
308 ../examples/lulesh/lulesh-init.cc .omp_outlined..26 .omp_outlined..26(const , const , const ParallelFor<Kokkos::Impl::ViewCopy<Ko...
628 ../examples/lulesh/lulesh-init.cc .omp_outlined..34 .omp_outlined..34(const , const , const ParallelFor<Kokkos::Impl::ViewCopy<Ko...
656 ../examples/lulesh/lulesh-init.cc .omp_outlined..41 .omp_outlined..41(const , const , const ParallelFor<Kokkos::Impl::ViewCopy<Ko...
662 ../examples/lulesh/lulesh-init.cc .omp_outlined..45 .omp_outlined..45(const , const , const ParallelFor<Kokkos::Impl::ViewCopy<Ko...
550 ../examples/lulesh/lulesh-init.cc .omp_outlined..55 .omp_outlined..55(const , const , const ParallelFor<Kokkos::Impl::ViewFill<Ko...
556 ../examples/lulesh/lulesh-init.cc .omp_outlined..57 .omp_outlined..57(const , const , const ParallelFor<Kokkos::Impl::ViewFill<Ko...
550 ../examples/lulesh/lulesh-init.cc .omp_outlined..78 .omp_outlined..78(const , const , const ParallelFor<Kokkos::Impl::ViewFill<Ko...
640 ../examples/lulesh/lulesh-init.cc .omp_outlined..84 .omp_outlined..84(const , const , const ParallelFor<Kokkos::Impl::ViewCopy<Ko...
646 ../examples/lulesh/lulesh-init.cc .omp_outlined..88 .omp_outlined..88(const , const , const ParallelFor<Kokkos::Impl::ViewCopy<Ko...
1840 ../examples/lulesh/lulesh-init.cc Domain::AllocateElemPersistent Domain::AllocateElemPersistent(Domain *, Int_t) [lulesh-init.cc:94]
1384 ../examples/lulesh/lulesh-init.cc Domain::AllocateNodePersistent Domain::AllocateNodePersistent(Domain *, Int_t) [lulesh-init.cc:94]
1264 ../examples/lulesh/lulesh-init.cc Domain::BuildMesh Domain::BuildMesh(Domain *, Int_t, Int_t, Int_t) [lulesh-init.cc:308]
2312 ../examples/lulesh/lulesh-init.cc Domain::CreateRegionIndexSets Domain::CreateRegionIndexSets(Domain *, Int_t, Int_t) [lulesh-init.cc:409]
7109 ../examples/lulesh/lulesh-init.cc Domain::Domain Domain::Domain(Domain *, Int_t, Index_t, Index_t, Index_t, Index_t, int, int,...
2458 ../examples/lulesh/lulesh-init.cc Domain::SetupBoundaryConditions Domain::SetupBoundaryConditions(Domain *, Int_t) [lulesh-init.cc:409]
956 ../examples/lulesh/lulesh-init.cc Domain::SetupCommBuffers Domain::SetupCommBuffers(Domain *, Int_t) [lulesh-init.cc]
1456 ../examples/lulesh/lulesh-init.cc Domain::SetupElementConnectivities Domain::SetupElementConnectivities(Domain *, Int_t) [lulesh-init.cc:409]
721 ../examples/lulesh/lulesh-init.cc Domain::SetupSymmetryPlanes Domain::SetupSymmetryPlanes(Domain *, Int_t) [lulesh-init.cc:409]
1591 ../examples/lulesh/lulesh-init.cc Domain::SetupThreadSupportStructures Domain::SetupThreadSupportStructures(Domain *) [lulesh-init.cc:376]
1644 ../examples/lulesh/lulesh-init.cc Domain::~Domain Domain::~Domain(Domain *) [lulesh-init.cc:286]
218 ../examples/lulesh/lulesh-init.cc InitMeshDecomp InitMeshDecomp(Int_t, Int_t, Int_t *, Int_t *, Int_t *, Int_t *) [lulesh-init...
260 ../examples/lulesh/lulesh-init.cc Kokkos::Impl::CommonSubview<Kokkos::View<int* [8], Kokkos::LayoutRight>, Kokk... Kokkos::Impl::CommonSubview<Kokkos::View<int* [8], Kokkos::LayoutRight>, Kokk...
1786 ../examples/lulesh/lulesh-init.cc Kokkos::Impl::HostIterateTile<Kokkos::MDRangePolicy<Kokkos::OpenMP, Kokkos::R... Kokkos::Impl::HostIterateTile<Kokkos::MDRangePolicy<Kokkos::OpenMP, Kokkos::R...
330 ../examples/lulesh/lulesh-init.cc Kokkos::Impl::ParallelConstructName<Kokkos::Impl::ViewCopy<Kokkos::View<int**... Kokkos::Impl::ParallelConstructName<Kokkos::Impl::ViewCopy<Kokkos::View<int**...
330 ../examples/lulesh/lulesh-init.cc Kokkos::Impl::ParallelConstructName<Kokkos::Impl::ViewCopy<Kokkos::View<int**... Kokkos::Impl::ParallelConstructName<Kokkos::Impl::ViewCopy<Kokkos::View<int**...
330 ../examples/lulesh/lulesh-init.cc Kokkos::Impl::ParallelConstructName<Kokkos::Impl::ViewCopy<Kokkos::View<int*,... Kokkos::Impl::ParallelConstructName<Kokkos::Impl::ViewCopy<Kokkos::View<int*,...
330 ../examples/lulesh/lulesh-init.cc Kokkos::Impl::ParallelConstructName<Kokkos::Impl::ViewCopy<Kokkos::View<int*,... Kokkos::Impl::ParallelConstructName<Kokkos::Impl::ViewCopy<Kokkos::View<int*,...
330 ../examples/lulesh/lulesh-init.cc Kokkos::Impl::ParallelConstructName<Kokkos::Impl::ViewFill<Kokkos::View<doubl... Kokkos::Impl::ParallelConstructName<Kokkos::Impl::ViewFill<Kokkos::View<doubl...
330 ../examples/lulesh/lulesh-init.cc Kokkos::Impl::ParallelConstructName<Kokkos::Impl::ViewFill<Kokkos::View<doubl... Kokkos::Impl::ParallelConstructName<Kokkos::Impl::ViewFill<Kokkos::View<doubl...
330 ../examples/lulesh/lulesh-init.cc Kokkos::Impl::ParallelConstructName<Kokkos::Impl::ViewFill<Kokkos::View<doubl... Kokkos::Impl::ParallelConstructName<Kokkos::Impl::ViewFill<Kokkos::View<doubl...
522 ../examples/lulesh/lulesh-init.cc Kokkos::Impl::ParallelFor<Kokkos::Impl::ViewCopy<Kokkos::View<int**, Kokkos::... Kokkos::Impl::ParallelFor<Kokkos::Impl::ViewCopy<Kokkos::View<int**, Kokkos::...
232 ../examples/lulesh/lulesh-init.cc Kokkos::Impl::ParallelFor<Kokkos::Impl::ViewCopy<Kokkos::View<int**, Kokkos::... Kokkos::Impl::ParallelFor<Kokkos::Impl::ViewCopy<Kokkos::View<int**, Kokkos::...
49 ../examples/lulesh/lulesh-init.cc Kokkos::Impl::SharedAllocationRecord<Kokkos::HostSpace, Kokkos::Impl::ViewVal... Kokkos::Impl::SharedAllocationRecord<Kokkos::HostSpace, Kokkos::Impl::ViewVal...
1476 ../examples/lulesh/lulesh-init.cc Kokkos::Impl::Tile_Loop_Type<2, false, int, void, void>::apply<Kokkos::Impl::... Kokkos::Impl::Tile_Loop_Type<2, false, int, void, void>::apply<Kokkos::Impl::...
555 ../examples/lulesh/lulesh-init.cc Kokkos::Impl::ViewCopy<Kokkos::View<int**, Kokkos::LayoutRight, Kokkos::Devic... Kokkos::Impl::ViewCopy<Kokkos::View<int**, Kokkos::LayoutRight, Kokkos::Devic...
613 ../examples/lulesh/lulesh-init.cc Kokkos::Impl::ViewCopy<Kokkos::View<int**, Kokkos::LayoutRight, Kokkos::Devic... Kokkos::Impl::ViewCopy<Kokkos::View<int**, Kokkos::LayoutRight, Kokkos::Devic...
603 ../examples/lulesh/lulesh-init.cc Kokkos::Impl::ViewCopy<Kokkos::View<int*, Kokkos::LayoutLeft, Kokkos::Device<... Kokkos::Impl::ViewCopy<Kokkos::View<int*, Kokkos::LayoutLeft, Kokkos::Device<...
604 ../examples/lulesh/lulesh-init.cc Kokkos::Impl::ViewCopy<Kokkos::View<int*, Kokkos::LayoutLeft, Kokkos::Device<... Kokkos::Impl::ViewCopy<Kokkos::View<int*, Kokkos::LayoutLeft, Kokkos::Device<...
281 ../examples/lulesh/lulesh-init.cc Kokkos::Impl::ViewCtorProp<std::__cxx11::basic_string<char, std::char_traits<... Kokkos::Impl::ViewCtorProp<std::__cxx11::basic_string<char, std::char_traits<...
281 ../examples/lulesh/lulesh-init.cc Kokkos::Impl::ViewCtorProp<std::__cxx11::basic_string<char, std::char_traits<... Kokkos::Impl::ViewCtorProp<std::__cxx11::basic_string<char, std::char_traits<...
281 ../examples/lulesh/lulesh-init.cc Kokkos::Impl::ViewCtorProp<std::__cxx11::basic_string<char, std::char_traits<... Kokkos::Impl::ViewCtorProp<std::__cxx11::basic_string<char, std::char_traits<...
281 ../examples/lulesh/lulesh-init.cc Kokkos::Impl::ViewCtorProp<std::__cxx11::basic_string<char, std::char_traits<... Kokkos::Impl::ViewCtorProp<std::__cxx11::basic_string<char, std::char_traits<...
281 ../examples/lulesh/lulesh-init.cc Kokkos::Impl::ViewCtorProp<std::__cxx11::basic_string<char, std::char_traits<... Kokkos::Impl::ViewCtorProp<std::__cxx11::basic_string<char, std::char_traits<...
524 ../examples/lulesh/lulesh-init.cc Kokkos::Impl::ViewFill<Kokkos::View<double*, Kokkos::LayoutRight, Kokkos::Dev... Kokkos::Impl::ViewFill<Kokkos::View<double*, Kokkos::LayoutRight, Kokkos::Dev...
525 ../examples/lulesh/lulesh-init.cc Kokkos::Impl::ViewFill<Kokkos::View<double*, Kokkos::LayoutRight, Kokkos::Dev... Kokkos::Impl::ViewFill<Kokkos::View<double*, Kokkos::LayoutRight, Kokkos::Dev...
524 ../examples/lulesh/lulesh-init.cc Kokkos::Impl::ViewFill<Kokkos::View<double*, Kokkos::LayoutRight, Kokkos::Dev... Kokkos::Impl::ViewFill<Kokkos::View<double*, Kokkos::LayoutRight, Kokkos::Dev...
583 ../examples/lulesh/lulesh-init.cc Kokkos::Impl::ViewMapping<Kokkos::ViewTraits<int* [8], Kokkos::LayoutRight>, ... SharedAllocationRecord<void, void> * Kokkos::Impl::ViewMapping<Kokkos::ViewTr...
529 ../examples/lulesh/lulesh-init.cc Kokkos::Impl::ViewMapping<Kokkos::ViewTraits<int*, Kokkos::HostSpace>, void>:... SharedAllocationRecord<void, void> * Kokkos::Impl::ViewMapping<Kokkos::ViewTr...
529 ../examples/lulesh/lulesh-init.cc Kokkos::Impl::ViewMapping<Kokkos::ViewTraits<int*>, void>::allocate_shared<st... SharedAllocationRecord<void, void> * Kokkos::Impl::ViewMapping<Kokkos::ViewTr...
203 ../examples/lulesh/lulesh-init.cc Kokkos::Impl::ViewRemap<Kokkos::View<int* [8], Kokkos::LayoutRight>, Kokkos::... Kokkos::Impl::ViewRemap<Kokkos::View<int* [8], Kokkos::LayoutRight>, Kokkos::...
331 ../examples/lulesh/lulesh-init.cc Kokkos::Impl::ViewRemap<Kokkos::View<int*>, Kokkos::View<int*>, Kokkos::OpenM... Kokkos::Impl::ViewRemap<Kokkos::View<int*>, Kokkos::View<int*>, Kokkos::OpenM...
461 ../examples/lulesh/lulesh-init.cc Kokkos::Impl::ViewValueFunctor<Kokkos::Device<Kokkos::OpenMP, Kokkos::HostSpa... enable_if_t<std::is_trivial<int>::value && std::is_trivially_copy_assignable<...
353 ../examples/lulesh/lulesh-init.cc Kokkos::Impl::contiguous_fill<Kokkos::OpenMP, double*> Kokkos::Impl::contiguous_fill<Kokkos::OpenMP, double*>(exec_space, dst, value...
139 ../examples/lulesh/lulesh-init.cc Kokkos::Impl::contiguous_fill<Kokkos::OpenMP, double, Kokkos::LayoutRight, Ko... Kokkos::Impl::contiguous_fill<Kokkos::OpenMP, double, Kokkos::LayoutRight, Ko...
824 ../examples/lulesh/lulesh-init.cc Kokkos::Impl::view_copy<Kokkos::View<int* [8], Kokkos::LayoutRight, Kokkos::D... Kokkos::Impl::view_copy<Kokkos::View<int* [8], Kokkos::LayoutRight, Kokkos::D...
824 ../examples/lulesh/lulesh-init.cc Kokkos::Impl::view_copy<Kokkos::View<int* [8], Kokkos::LayoutRight, Kokkos::D... Kokkos::Impl::view_copy<Kokkos::View<int* [8], Kokkos::LayoutRight, Kokkos::D...
824 ../examples/lulesh/lulesh-init.cc Kokkos::Impl::view_copy<Kokkos::View<int* [8], Kokkos::LayoutRight>, Kokkos::... Kokkos::Impl::view_copy<Kokkos::View<int* [8], Kokkos::LayoutRight>, Kokkos::...
824 ../examples/lulesh/lulesh-init.cc Kokkos::Impl::view_copy<Kokkos::View<int* [8], Kokkos::LayoutRight>, Kokkos::... Kokkos::Impl::view_copy<Kokkos::View<int* [8], Kokkos::LayoutRight>, Kokkos::...
697 ../examples/lulesh/lulesh-init.cc Kokkos::Impl::view_copy<Kokkos::View<int*, Kokkos::LayoutRight, Kokkos::Devic... Kokkos::Impl::view_copy<Kokkos::View<int*, Kokkos::LayoutRight, Kokkos::Devic...
697 ../examples/lulesh/lulesh-init.cc Kokkos::Impl::view_copy<Kokkos::View<int*>, Kokkos::View<int*> > Kokkos::Impl::view_copy<Kokkos::View<int*>, Kokkos::View<int*> >(dst, src) [l...
2036 ../examples/lulesh/lulesh-init.cc Kokkos::RangePolicy<Kokkos::OpenMP, Kokkos::Schedule<Kokkos::Static>, int>::R... Kokkos::RangePolicy<Kokkos::OpenMP, Kokkos::Schedule<Kokkos::Static>, int>::R...
2506 ../examples/lulesh/lulesh-init.cc Kokkos::RangePolicy<Kokkos::OpenMP, Kokkos::Schedule<Kokkos::Static>, long>::... Kokkos::RangePolicy<Kokkos::OpenMP, Kokkos::Schedule<Kokkos::Static>, long>::...
271 ../examples/lulesh/lulesh-init.cc Kokkos::StaticCrsGraph<int, Kokkos::LayoutLeft, Kokkos::OpenMP, Kokkos::Memor... Kokkos::StaticCrsGraph<int, Kokkos::LayoutLeft, Kokkos::OpenMP, Kokkos::Memor...
470 ../examples/lulesh/lulesh-init.cc Kokkos::View<int* [8], Kokkos::LayoutRight>::View<std::__cxx11::basic_string<... Kokkos::View<int* [8], Kokkos::LayoutRight>::View<std::__cxx11::basic_string<...
323 ../examples/lulesh/lulesh-init.cc Kokkos::View<int* [8], Kokkos::LayoutRight>::View<std::__cxx11::basic_string<... Kokkos::View<int* [8], Kokkos::LayoutRight>::View<std::__cxx11::basic_string<...
410 ../examples/lulesh/lulesh-init.cc Kokkos::View<int*, Kokkos::HostSpace>::View<char [10]> Kokkos::View<int*, Kokkos::HostSpace>::View<char [10]>(View<int *, Kokkos::Ho...
410 ../examples/lulesh/lulesh-init.cc Kokkos::View<int*, Kokkos::HostSpace>::View<char [14]> Kokkos::View<int*, Kokkos::HostSpace>::View<char [14]>(View<int *, Kokkos::Ho...
462 ../examples/lulesh/lulesh-init.cc Kokkos::View<int*, Kokkos::HostSpace>::View<std::__cxx11::basic_string<char, ... Kokkos::View<int*, Kokkos::HostSpace>::View<std::__cxx11::basic_string<char, ...
410 ../examples/lulesh/lulesh-init.cc Kokkos::View<int*>::View<char [16]> Kokkos::View<int*>::View<char [16]>(View<int *> *, arg_label, type, const siz...
410 ../examples/lulesh/lulesh-init.cc Kokkos::View<int*>::View<char [19]> Kokkos::View<int*>::View<char [19]>(View<int *> *, arg_label, type, const siz...
410 ../examples/lulesh/lulesh-init.cc Kokkos::View<int*>::View<char [21]> Kokkos::View<int*>::View<char [21]>(View<int *> *, arg_label, type, const siz...
462 ../examples/lulesh/lulesh-init.cc Kokkos::View<int*>::View<std::__cxx11::basic_string<char, std::char_traits<ch... Kokkos::View<int*>::View<std::__cxx11::basic_string<char, std::char_traits<ch...
323 ../examples/lulesh/lulesh-init.cc Kokkos::View<int*>::View<std::__cxx11::basic_string<char, std::char_traits<ch... Kokkos::View<int*>::View<std::__cxx11::basic_string<char, std::char_traits<ch...
6589 ../examples/lulesh/lulesh-init.cc Kokkos::deep_copy<double*, , double*, Kokkos::LayoutRight, Kokkos::Device<Kok... Kokkos::deep_copy<double*, , double*, Kokkos::LayoutRight, Kokkos::Device<Kok...
1052 ../examples/lulesh/lulesh-init.cc Kokkos::deep_copy<double*> Kokkos::deep_copy<double*>(dst, value) [lulesh-init.cc]
1050 ../examples/lulesh/lulesh-init.cc Kokkos::deep_copy<double, Kokkos::LayoutRight, Kokkos::Device<Kokkos::OpenMP,... Kokkos::deep_copy<double, Kokkos::LayoutRight, Kokkos::Device<Kokkos::OpenMP,...
7686 ../examples/lulesh/lulesh-init.cc Kokkos::deep_copy<int* [8], Kokkos::LayoutRight, Kokkos::Device<Kokkos::OpenM... Kokkos::deep_copy<int* [8], Kokkos::LayoutRight, Kokkos::Device<Kokkos::OpenM...
7686 ../examples/lulesh/lulesh-init.cc Kokkos::deep_copy<int* [8], Kokkos::LayoutRight, int* [8], Kokkos::LayoutRigh... Kokkos::deep_copy<int* [8], Kokkos::LayoutRight, int* [8], Kokkos::LayoutRigh...
6589 ../examples/lulesh/lulesh-init.cc Kokkos::deep_copy<int*, , int*, Kokkos::LayoutRight, Kokkos::Device<Kokkos::O... Kokkos::deep_copy<int*, , int*, Kokkos::LayoutRight, Kokkos::Device<Kokkos::O...
6589 ../examples/lulesh/lulesh-init.cc Kokkos::deep_copy<int*, Kokkos::LayoutLeft, Kokkos::Device<Kokkos::OpenMP, Ko... Kokkos::deep_copy<int*, Kokkos::LayoutLeft, Kokkos::Device<Kokkos::OpenMP, Ko...
6589 ../examples/lulesh/lulesh-init.cc Kokkos::deep_copy<int*, Kokkos::LayoutRight, Kokkos::Device<Kokkos::OpenMP, K... Kokkos::deep_copy<int*, Kokkos::LayoutRight, Kokkos::Device<Kokkos::OpenMP, K...
863 ../examples/lulesh/lulesh-init.cc Kokkos::impl_resize<, int* [8], Kokkos::LayoutRight> type Kokkos::impl_resize<, int* [8], Kokkos::LayoutRight>(v, const size_t, co...
854 ../examples/lulesh/lulesh-init.cc Kokkos::impl_resize<, int*> type Kokkos::impl_resize<, int*>(v, const size_t, const size_t, const size_t,...
697 ../examples/lulesh/lulesh-init.cc Kokkos::parallel_for<Kokkos::MDRangePolicy<Kokkos::OpenMP, Kokkos::Rank<2u, (... Kokkos::parallel_for<Kokkos::MDRangePolicy<Kokkos::OpenMP, Kokkos::Rank<2u, (...
706 ../examples/lulesh/lulesh-init.cc Kokkos::parallel_for<Kokkos::MDRangePolicy<Kokkos::OpenMP, Kokkos::Rank<2u, (... Kokkos::parallel_for<Kokkos::MDRangePolicy<Kokkos::OpenMP, Kokkos::Rank<2u, (...
912 ../examples/lulesh/lulesh-init.cc Kokkos::parallel_for<Kokkos::RangePolicy<Kokkos::OpenMP, Kokkos::IndexType<in... Kokkos::parallel_for<Kokkos::RangePolicy<Kokkos::OpenMP, Kokkos::IndexType<in...
791 ../examples/lulesh/lulesh-init.cc Kokkos::parallel_for<Kokkos::RangePolicy<Kokkos::OpenMP, Kokkos::IndexType<in... Kokkos::parallel_for<Kokkos::RangePolicy<Kokkos::OpenMP, Kokkos::IndexType<in...
791 ../examples/lulesh/lulesh-init.cc Kokkos::parallel_for<Kokkos::RangePolicy<Kokkos::OpenMP, Kokkos::IndexType<in... Kokkos::parallel_for<Kokkos::RangePolicy<Kokkos::OpenMP, Kokkos::IndexType<in...
944 ../examples/lulesh/lulesh-init.cc Kokkos::parallel_for<Kokkos::RangePolicy<Kokkos::OpenMP, Kokkos::IndexType<lo... Kokkos::parallel_for<Kokkos::RangePolicy<Kokkos::OpenMP, Kokkos::IndexType<lo...
839 ../examples/lulesh/lulesh-init.cc Kokkos::parallel_for<Kokkos::RangePolicy<Kokkos::OpenMP, Kokkos::IndexType<lo... Kokkos::parallel_for<Kokkos::RangePolicy<Kokkos::OpenMP, Kokkos::IndexType<lo...
126 ../examples/lulesh/lulesh-init.cc _GLOBAL__sub_I_lulesh_init.cc _GLOBAL__sub_I_lulesh_init.cc() [lulesh-init.cc]
6589 ../examples/lulesh/lulesh-util.cc Kokkos::deep_copy<double*, Kokkos::LayoutRight, Kokkos::Device<Kokkos::OpenMP... Kokkos::deep_copy<double*, Kokkos::LayoutRight, Kokkos::Device<Kokkos::OpenMP...
1345 ../examples/lulesh/lulesh-util.cc ParseCommandLineOptions ParseCommandLineOptions(int, char * *, int, cmdLineOpts *) [lulesh-util.cc:67]
171 ../examples/lulesh/lulesh-util.cc PrintCommandLineOptions PrintCommandLineOptions(char *, int) [lulesh-util.cc:31]
67 ../examples/lulesh/lulesh-util.cc StrToInt int StrToInt(const char *, int *) [lulesh-util.cc:13]
706 ../examples/lulesh/lulesh-util.cc VerifyAndWriteFinalOutput VerifyAndWriteFinalOutput(Real_t, locDom, Int_t, Int_t) [lulesh-util.cc:222]
126 ../examples/lulesh/lulesh-util.cc _GLOBAL__sub_I_lulesh_util.cc _GLOBAL__sub_I_lulesh_util.cc() [lulesh-util.cc]
17 ../examples/lulesh/lulesh-viz.cc DumpToVisit DumpToVisit(domain, int, int, int) [lulesh-viz.cc:415]
126 ../examples/lulesh/lulesh-viz.cc _GLOBAL__sub_I_lulesh_viz.cc _GLOBAL__sub_I_lulesh_viz.cc() [lulesh-viz.cc]
451 ../examples/lulesh/lulesh.cc .omp_outlined..103 .omp_outlined..103(const , const , const ParallelReduce<(lambda at ../example...
796 ../examples/lulesh/lulesh.cc .omp_outlined..109 .omp_outlined..109(const , const , const ParallelFor<(lambda at ../examples/l...
394 ../examples/lulesh/lulesh.cc .omp_outlined..111 .omp_outlined..111(const , const , const ParallelFor<(lambda at ../examples/l...
402 ../examples/lulesh/lulesh.cc .omp_outlined..113 .omp_outlined..113(const , const , const ParallelFor<(lambda at ../examples/l...
427 ../examples/lulesh/lulesh.cc .omp_outlined..115 .omp_outlined..115(const , const , const ParallelReduce<(lambda at ../example...
859 ../examples/lulesh/lulesh.cc .omp_outlined..119 .omp_outlined..119(const , const , const ParallelFor<(lambda at ../examples/l...
243 ../examples/lulesh/lulesh.cc .omp_outlined..122 .omp_outlined..122(const , const , const ParallelFor<(lambda at ../examples/l...
426 ../examples/lulesh/lulesh.cc .omp_outlined..124 .omp_outlined..124(const , const , const ParallelFor<(lambda at ../examples/l...
529 ../examples/lulesh/lulesh.cc .omp_outlined..127 .omp_outlined..127(const , const , const ParallelFor<(lambda at ../examples/l...
865 ../examples/lulesh/lulesh.cc .omp_outlined..130 .omp_outlined..130(const , const , const ParallelFor<(lambda at ../examples/l...
539 ../examples/lulesh/lulesh.cc .omp_outlined..132 .omp_outlined..132(const , const , const ParallelReduce<(lambda at ../example...
456 ../examples/lulesh/lulesh.cc .omp_outlined..134 .omp_outlined..134(const , const , const ParallelReduce<(lambda at ../example...
252 ../examples/lulesh/lulesh.cc .omp_outlined..20 .omp_outlined..20(const , const , const ParallelFor<(lambda at ../examples/lu...
870 ../examples/lulesh/lulesh.cc .omp_outlined..35 .omp_outlined..35(const , const , const ParallelFor<(lambda at ../examples/lu...
473 ../examples/lulesh/lulesh.cc .omp_outlined..42 .omp_outlined..42(const , const , const ParallelFor<(lambda at ../examples/lu...
252 ../examples/lulesh/lulesh.cc .omp_outlined..46 .omp_outlined..46(const , const , const ParallelFor<(lambda at ../examples/lu...
1101 ../examples/lulesh/lulesh.cc .omp_outlined..48 .omp_outlined..48(const , const , const ParallelFor<(lambda at ../examples/lu...
427 ../examples/lulesh/lulesh.cc .omp_outlined..55 .omp_outlined..55(const , const , const ParallelReduce<(lambda at ../examples...
1326 ../examples/lulesh/lulesh.cc .omp_outlined..57 .omp_outlined..57(const , const , const ParallelReduce<(lambda at ../examples...
243 ../examples/lulesh/lulesh.cc .omp_outlined..61 .omp_outlined..61(const , const , const ParallelFor<(lambda at ../examples/lu...
1101 ../examples/lulesh/lulesh.cc .omp_outlined..63 .omp_outlined..63(const , const , const ParallelFor<(lambda at ../examples/lu...
372 ../examples/lulesh/lulesh.cc .omp_outlined..66 .omp_outlined..66(const , const , const ParallelFor<(lambda at ../examples/lu...
499 ../examples/lulesh/lulesh.cc .omp_outlined..71 .omp_outlined..71(const , const , const ParallelFor<(lambda at ../examples/lu...
499 ../examples/lulesh/lulesh.cc .omp_outlined..73 .omp_outlined..73(const , const , const ParallelFor<(lambda at ../examples/lu...
499 ../examples/lulesh/lulesh.cc .omp_outlined..75 .omp_outlined..75(const , const , const ParallelFor<(lambda at ../examples/lu...
465 ../examples/lulesh/lulesh.cc .omp_outlined..78 .omp_outlined..78(const , const , const ParallelFor<(lambda at ../examples/lu...
396 ../examples/lulesh/lulesh.cc .omp_outlined..81 .omp_outlined..81(const , const , const ParallelFor<(lambda at ../examples/lu...
656 ../examples/lulesh/lulesh.cc .omp_outlined..85 .omp_outlined..85(const , const , const ParallelFor<Kokkos::Impl::ViewCopy<Ko...
662 ../examples/lulesh/lulesh.cc .omp_outlined..89 .omp_outlined..89(const , const , const ParallelFor<Kokkos::Impl::ViewCopy<Ko...
443 ../examples/lulesh/lulesh.cc .omp_outlined..93 .omp_outlined..93(const , const , const ParallelReduce<(lambda at ../examples...
243 ../examples/lulesh/lulesh.cc .omp_outlined..96 .omp_outlined..96(const , const , const ParallelFor<(lambda at ../examples/lu...
243 ../examples/lulesh/lulesh.cc .omp_outlined..99 .omp_outlined..99(const , const , const ParallelFor<(lambda at ../examples/lu...
13367 ../examples/lulesh/lulesh.cc ApplyMaterialPropertiesForElems ApplyMaterialPropertiesForElems(domain) [lulesh.cc:409]
1530 ../examples/lulesh/lulesh.cc CalcElemCharacteristicLength Real_t CalcElemCharacteristicLength(const Real_t *, const Real_t *, const Rea...
982 ../examples/lulesh/lulesh.cc CalcElemFBHourglassForce CalcElemFBHourglassForce(const Real_t *, const Real_t[] *, coefficient, Real_...
2428 ../examples/lulesh/lulesh.cc CalcElemNodeNormals CalcElemNodeNormals(Real_t *, Real_t *, Real_t *, const Real_t *, const Real_...
853 ../examples/lulesh/lulesh.cc CalcElemShapeFunctionDerivatives CalcElemShapeFunctionDerivatives(const Real_t *, const Real_t *, const Real_t...
1097 ../examples/lulesh/lulesh.cc CalcElemVolumeDerivative CalcElemVolumeDerivative(i, dvdx, dvdy, dvdz, const Real_t *, const Real_t *,...
1054 ../examples/lulesh/lulesh.cc CalcKinematicsForElems CalcKinematicsForElems(domain, Real_t, Index_t) [lulesh.cc]
14160 ../examples/lulesh/lulesh.cc CalcVolumeForceForElems CalcVolumeForceForElems(domain) [lulesh.cc:409]
366 ../examples/lulesh/lulesh.cc Domain::AllocateGradients Domain::AllocateGradients(Domain *, Int_t, Int_t) [lulesh.cc:214]
475 ../examples/lulesh/lulesh.cc Domain::DeallocateGradients Domain::DeallocateGradients(Domain *) [lulesh.cc:105]
250 ../examples/lulesh/lulesh.cc Domain::DeallocateStrains Domain::DeallocateStrains(Domain *) [lulesh.cc:105]
4356 ../examples/lulesh/lulesh.cc Domain::Domain Domain::Domain(Domain *) [lulesh.cc:78]
15 ../examples/lulesh/lulesh.cc Domain::delv_eta Domain::delv_eta(const Domain *, const Index_t) [lulesh.cc:371]
15 ../examples/lulesh/lulesh.cc Domain::delv_xi Domain::delv_xi(const Domain *, const Index_t) [lulesh.cc:368]
15 ../examples/lulesh/lulesh.cc Domain::delv_zeta Domain::delv_zeta(const Domain *, const Index_t) [lulesh.cc:374]
15 ../examples/lulesh/lulesh.cc Domain::fx Domain::fx(const Domain *, const Index_t) [lulesh.cc:303]
15 ../examples/lulesh/lulesh.cc Domain::fy Domain::fy(const Domain *, const Index_t) [lulesh.cc:306]
15 ../examples/lulesh/lulesh.cc Domain::fz Domain::fz(const Domain *, const Index_t) [lulesh.cc:309]
15 ../examples/lulesh/lulesh.cc Domain::nodalMass Domain::nodalMass(const Domain *, const Index_t) [lulesh.cc:314]
15 ../examples/lulesh/lulesh.cc Domain::x Domain::x(const Domain *, const Index_t) [lulesh.cc:257]
15 ../examples/lulesh/lulesh.cc Domain::xd Domain::xd(const Domain *, const Index_t) [lulesh.cc:272]
15 ../examples/lulesh/lulesh.cc Domain::y Domain::y(const Domain *, const Index_t) [lulesh.cc:258]
15 ../examples/lulesh/lulesh.cc Domain::yd Domain::yd(const Domain *, const Index_t) [lulesh.cc:275]
15 ../examples/lulesh/lulesh.cc Domain::z Domain::z(const Domain *, const Index_t) [lulesh.cc:259]
15 ../examples/lulesh/lulesh.cc Domain::zd Domain::zd(const Domain *, const Index_t) [lulesh.cc:278]
330 ../examples/lulesh/lulesh.cc Kokkos::Impl::ParallelConstructName<Kokkos::Impl::ViewCopy<Kokkos::View<doubl... Kokkos::Impl::ParallelConstructName<Kokkos::Impl::ViewCopy<Kokkos::View<doubl...
330 ../examples/lulesh/lulesh.cc Kokkos::Impl::ParallelConstructName<Kokkos::Impl::ViewCopy<Kokkos::View<doubl... Kokkos::Impl::ParallelConstructName<Kokkos::Impl::ViewCopy<Kokkos::View<doubl...
1508 ../examples/lulesh/lulesh.cc Kokkos::Impl::ParallelFor<CalcEnergyForElems(double*, double*, double*, doubl... type Kokkos::Impl::ParallelFor<CalcEnergyForElems(double*, double*, double*, ...
3606 ../examples/lulesh/lulesh.cc Kokkos::Impl::ParallelFor<CalcFBHourglassForceForElems(Domain&, double*, Kokk... type Kokkos::Impl::ParallelFor<CalcFBHourglassForceForElems(Domain&, double*,...
2917 ../examples/lulesh/lulesh.cc Kokkos::Impl::ParallelFor<CalcKinematicsForElems(Domain&, double, int)::$_0, ... type Kokkos::Impl::ParallelFor<CalcKinematicsForElems(Domain&, double, int)::...
3119 ../examples/lulesh/lulesh.cc Kokkos::Impl::ParallelFor<CalcMonotonicQGradientsForElems(Domain&)::{lambda(i... type Kokkos::Impl::ParallelFor<CalcMonotonicQGradientsForElems(Domain&)::{lam...
1969 ../examples/lulesh/lulesh.cc Kokkos::Impl::ParallelFor<CalcMonotonicQRegionForElems(Domain&, int, double):... type Kokkos::Impl::ParallelFor<CalcMonotonicQRegionForElems(Domain&, int, dou...
1265 ../examples/lulesh/lulesh.cc Kokkos::Impl::ParallelFor<IntegrateStressForElems(Domain&, double*, double*, ... type Kokkos::Impl::ParallelFor<IntegrateStressForElems(Domain&, double*, doub...
49 ../examples/lulesh/lulesh.cc Kokkos::Impl::SharedAllocationRecord<Kokkos::HostSpace, Kokkos::Impl::ViewVal... Kokkos::Impl::SharedAllocationRecord<Kokkos::HostSpace, Kokkos::Impl::ViewVal...
1497 ../examples/lulesh/lulesh.cc Kokkos::Impl::TeamPolicyInternal<Kokkos::OpenMP>::TeamPolicyInternal Kokkos::Impl::TeamPolicyInternal<Kokkos::OpenMP>::TeamPolicyInternal(TeamPoli...
603 ../examples/lulesh/lulesh.cc Kokkos::Impl::ViewCopy<Kokkos::View<double*, Kokkos::LayoutLeft, Kokkos::Devi... Kokkos::Impl::ViewCopy<Kokkos::View<double*, Kokkos::LayoutLeft, Kokkos::Devi...
604 ../examples/lulesh/lulesh.cc Kokkos::Impl::ViewCopy<Kokkos::View<double*, Kokkos::LayoutLeft, Kokkos::Devi... Kokkos::Impl::ViewCopy<Kokkos::View<double*, Kokkos::LayoutLeft, Kokkos::Devi...
281 ../examples/lulesh/lulesh.cc Kokkos::Impl::ViewCtorProp<std::__cxx11::basic_string<char, std::char_traits<... Kokkos::Impl::ViewCtorProp<std::__cxx11::basic_string<char, std::char_traits<...
281 ../examples/lulesh/lulesh.cc Kokkos::Impl::ViewCtorProp<std::__cxx11::basic_string<char, std::char_traits<... Kokkos::Impl::ViewCtorProp<std::__cxx11::basic_string<char, std::char_traits<...
521 ../examples/lulesh/lulesh.cc Kokkos::Impl::ViewMapping<Kokkos::ViewTraits<double*>, void>::allocate_shared... SharedAllocationRecord<void, void> * Kokkos::Impl::ViewMapping<Kokkos::ViewTr...
331 ../examples/lulesh/lulesh.cc Kokkos::Impl::ViewRemap<Kokkos::View<double*>, Kokkos::View<double*>, Kokkos:... Kokkos::Impl::ViewRemap<Kokkos::View<double*>, Kokkos::View<double*>, Kokkos:...
461 ../examples/lulesh/lulesh.cc Kokkos::Impl::ViewValueFunctor<Kokkos::Device<Kokkos::OpenMP, Kokkos::HostSpa... enable_if_t<std::is_trivial<double>::value && std::is_trivially_copy_assignab...
1609 ../examples/lulesh/lulesh.cc Kokkos::Impl::runtime_check_rank_host Kokkos::Impl::runtime_check_rank_host(const size_t, const bool, const size_t,...
697 ../examples/lulesh/lulesh.cc Kokkos::Impl::view_copy<Kokkos::View<double*, Kokkos::LayoutRight, Kokkos::De... Kokkos::Impl::view_copy<Kokkos::View<double*, Kokkos::LayoutRight, Kokkos::De...
697 ../examples/lulesh/lulesh.cc Kokkos::Impl::view_copy<Kokkos::View<double*>, Kokkos::View<double*> > Kokkos::Impl::view_copy<Kokkos::View<double*>, Kokkos::View<double*> >(dst, s...
2250 ../examples/lulesh/lulesh.cc Kokkos::RangePolicy<Kokkos::OpenMP>::RangePolicy Kokkos::RangePolicy<Kokkos::OpenMP>::RangePolicy(RangePolicy<Kokkos::OpenMP> ...
213 ../examples/lulesh/lulesh.cc Kokkos::StaticCrsGraph<int, Kokkos::LayoutLeft, Kokkos::OpenMP, Kokkos::Memor... Kokkos::StaticCrsGraph<int, Kokkos::LayoutLeft, Kokkos::OpenMP, Kokkos::Memor...
410 ../examples/lulesh/lulesh.cc Kokkos::View<double*>::View<char [6]> Kokkos::View<double*>::View<char [6]>(View<double *> *, arg_label, type, cons...
410 ../examples/lulesh/lulesh.cc Kokkos::View<double*>::View<char [7]> Kokkos::View<double*>::View<char [7]>(View<double *> *, arg_label, type, cons...
462 ../examples/lulesh/lulesh.cc Kokkos::View<double*>::View<std::__cxx11::basic_string<char, std::char_traits... Kokkos::View<double*>::View<std::__cxx11::basic_string<char, std::char_traits...
323 ../examples/lulesh/lulesh.cc Kokkos::View<double*>::View<std::__cxx11::basic_string<char, std::char_traits... Kokkos::View<double*>::View<std::__cxx11::basic_string<char, std::char_traits...
25 ../examples/lulesh/lulesh.cc Kokkos::View<double*>::~View Kokkos::View<double*>::~View(View<double *> *) [lulesh.cc:409]
840 ../examples/lulesh/lulesh.cc Kokkos::abort Kokkos::abort(const const char *, const const char *) [lulesh.cc:202]
854 ../examples/lulesh/lulesh.cc Kokkos::impl_resize<, double*> type Kokkos::impl_resize<, double*>(v, const size_t, const size_t, const size...
928 ../examples/lulesh/lulesh.cc Kokkos::parallel_for<Kokkos::RangePolicy<Kokkos::OpenMP, Kokkos::IndexType<in... Kokkos::parallel_for<Kokkos::RangePolicy<Kokkos::OpenMP, Kokkos::IndexType<in...
960 ../examples/lulesh/lulesh.cc Kokkos::parallel_for<Kokkos::RangePolicy<Kokkos::OpenMP, Kokkos::IndexType<lo... Kokkos::parallel_for<Kokkos::RangePolicy<Kokkos::OpenMP, Kokkos::IndexType<lo...
21470 ../examples/lulesh/lulesh.cc LagrangeLeapFrog LagrangeLeapFrog(domain) [lulesh.cc]
226 ../examples/lulesh/lulesh.cc ResizeBuffer ResizeBuffer(const size_t) [lulesh.cc:23]
169 ../examples/lulesh/lulesh.cc _GLOBAL__sub_I_lulesh.cc _GLOBAL__sub_I_lulesh.cc() [lulesh.cc]
1836 ../examples/lulesh/lulesh.cc main int main(int, char * *) [lulesh.cc]
63 ../examples/lulesh/lulesh.cc std::_Rb_tree<std::__cxx11::basic_string<char, std::char_traits<char>, std::a... std::_Rb_tree<std::__cxx11::basic_string<char, std::char_traits<char>, std::a...
20 ../examples/lulesh/lulesh.cc std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::alloca... std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::alloca...
160 ../examples/lulesh/lulesh.cc std::operator+<char, std::char_traits<char>, std::allocator<char> > basic_string<char, std::char_traits<char>, std::allocator<char> > std::operat...
187 ../examples/lulesh/lulesh.cc std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::alloc... std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::alloc...
11 lulesh __clang_call_terminate __clang_call_terminate() [lulesh]
33 lulesh __do_global_dtors_aux __do_global_dtors_aux() [lulesh]
5 lulesh __libc_csu_fini __libc_csu_fini() [lulesh]
101 lulesh __libc_csu_init __libc_csu_init() [lulesh]
5 lulesh _dl_relocate_static_pie _dl_relocate_static_pie() [lulesh]
13 lulesh _fini _fini() [lulesh]
27 lulesh _init _init() [lulesh]
47 lulesh _start _start() [lulesh]
6 lulesh frame_dummy frame_dummy() [lulesh]
An example of instrumented module and function info output
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
.. code-block:: shell
rocprof-sys-instrument -o lulesh.inst --label file line args --simulate -- lulesh
After the heuristics are applied based on the pattern in :ref:`available-module-function-output`,
the selected modules and functions are:
.. code-block:: shell
AddressRange Module Function FunctionSignature
9165 ../examples/lulesh/lulesh-comm.cc CommMonoQ CommMonoQ(domain) [lulesh-comm.cc:1891]
3396 ../examples/lulesh/lulesh-comm.cc CommRecv CommRecv(domain, int, Index_t, Index_t, Index_t, Index_t, bool, bool) [lulesh...
8666 ../examples/lulesh/lulesh-comm.cc CommSBN CommSBN(domain, int, Domain_member *) [lulesh-comm.cc:926]
10212 ../examples/lulesh/lulesh-comm.cc CommSend CommSend(domain, int, Index_t, Domain_member *, Index_t, Index_t, Index_t, bo...
6823 ../examples/lulesh/lulesh-comm.cc CommSyncPosVel CommSyncPosVel(domain) [lulesh-comm.cc:1404]
1840 ../examples/lulesh/lulesh-init.cc Domain::AllocateElemPersistent Domain::AllocateElemPersistent(Domain *, Int_t) [lulesh-init.cc:94]
1384 ../examples/lulesh/lulesh-init.cc Domain::AllocateNodePersistent Domain::AllocateNodePersistent(Domain *, Int_t) [lulesh-init.cc:94]
1264 ../examples/lulesh/lulesh-init.cc Domain::BuildMesh Domain::BuildMesh(Domain *, Int_t, Int_t, Int_t) [lulesh-init.cc:308]
2312 ../examples/lulesh/lulesh-init.cc Domain::CreateRegionIndexSets Domain::CreateRegionIndexSets(Domain *, Int_t, Int_t) [lulesh-init.cc:409]
7109 ../examples/lulesh/lulesh-init.cc Domain::Domain Domain::Domain(Domain *, Int_t, Index_t, Index_t, Index_t, Index_t, int, int,...
2458 ../examples/lulesh/lulesh-init.cc Domain::SetupBoundaryConditions Domain::SetupBoundaryConditions(Domain *, Int_t) [lulesh-init.cc:409]
956 ../examples/lulesh/lulesh-init.cc Domain::SetupCommBuffers Domain::SetupCommBuffers(Domain *, Int_t) [lulesh-init.cc]
1456 ../examples/lulesh/lulesh-init.cc Domain::SetupElementConnectivities Domain::SetupElementConnectivities(Domain *, Int_t) [lulesh-init.cc:409]
721 ../examples/lulesh/lulesh-init.cc Domain::SetupSymmetryPlanes Domain::SetupSymmetryPlanes(Domain *, Int_t) [lulesh-init.cc:409]
1591 ../examples/lulesh/lulesh-init.cc Domain::SetupThreadSupportStructures Domain::SetupThreadSupportStructures(Domain *) [lulesh-init.cc:376]
1644 ../examples/lulesh/lulesh-init.cc Domain::~Domain Domain::~Domain(Domain *) [lulesh-init.cc:286]
271 ../examples/lulesh/lulesh-init.cc Kokkos::StaticCrsGraph<int, Kokkos::LayoutLeft, Kokkos::OpenMP, Kokkos::Memor... Kokkos::StaticCrsGraph<int, Kokkos::LayoutLeft, Kokkos::OpenMP, Kokkos::Memor...
410 ../examples/lulesh/lulesh-init.cc Kokkos::View<int*, Kokkos::HostSpace>::View<char [10]> Kokkos::View<int*, Kokkos::HostSpace>::View<char [10]>(View<int *, Kokkos::Ho...
410 ../examples/lulesh/lulesh-init.cc Kokkos::View<int*, Kokkos::HostSpace>::View<char [14]> Kokkos::View<int*, Kokkos::HostSpace>::View<char [14]>(View<int *, Kokkos::Ho...
410 ../examples/lulesh/lulesh-init.cc Kokkos::View<int*>::View<char [16]> Kokkos::View<int*>::View<char [16]>(View<int *> *, arg_label, type, const siz...
410 ../examples/lulesh/lulesh-init.cc Kokkos::View<int*>::View<char [19]> Kokkos::View<int*>::View<char [19]>(View<int *> *, arg_label, type, const siz...
410 ../examples/lulesh/lulesh-init.cc Kokkos::View<int*>::View<char [21]> Kokkos::View<int*>::View<char [21]>(View<int *> *, arg_label, type, const siz...
6589 ../examples/lulesh/lulesh-init.cc Kokkos::deep_copy<double*, , double*, Kokkos::LayoutRight, Kokkos::Device<Kok... Kokkos::deep_copy<double*, , double*, Kokkos::LayoutRight, Kokkos::Device<Kok...
1052 ../examples/lulesh/lulesh-init.cc Kokkos::deep_copy<double*> Kokkos::deep_copy<double*>(dst, value) [lulesh-init.cc]
1050 ../examples/lulesh/lulesh-init.cc Kokkos::deep_copy<double, Kokkos::LayoutRight, Kokkos::Device<Kokkos::OpenMP,... Kokkos::deep_copy<double, Kokkos::LayoutRight, Kokkos::Device<Kokkos::OpenMP,...
7686 ../examples/lulesh/lulesh-init.cc Kokkos::deep_copy<int* [8], Kokkos::LayoutRight, Kokkos::Device<Kokkos::OpenM... Kokkos::deep_copy<int* [8], Kokkos::LayoutRight, Kokkos::Device<Kokkos::OpenM...
7686 ../examples/lulesh/lulesh-init.cc Kokkos::deep_copy<int* [8], Kokkos::LayoutRight, int* [8], Kokkos::LayoutRigh... Kokkos::deep_copy<int* [8], Kokkos::LayoutRight, int* [8], Kokkos::LayoutRigh...
6589 ../examples/lulesh/lulesh-init.cc Kokkos::deep_copy<int*, , int*, Kokkos::LayoutRight, Kokkos::Device<Kokkos::O... Kokkos::deep_copy<int*, , int*, Kokkos::LayoutRight, Kokkos::Device<Kokkos::O...
6589 ../examples/lulesh/lulesh-init.cc Kokkos::deep_copy<int*, Kokkos::LayoutLeft, Kokkos::Device<Kokkos::OpenMP, Ko... Kokkos::deep_copy<int*, Kokkos::LayoutLeft, Kokkos::Device<Kokkos::OpenMP, Ko...
6589 ../examples/lulesh/lulesh-init.cc Kokkos::deep_copy<int*, Kokkos::LayoutRight, Kokkos::Device<Kokkos::OpenMP, K... Kokkos::deep_copy<int*, Kokkos::LayoutRight, Kokkos::Device<Kokkos::OpenMP, K...
697 ../examples/lulesh/lulesh-init.cc Kokkos::parallel_for<Kokkos::MDRangePolicy<Kokkos::OpenMP, Kokkos::Rank<2u, (... Kokkos::parallel_for<Kokkos::MDRangePolicy<Kokkos::OpenMP, Kokkos::Rank<2u, (...
706 ../examples/lulesh/lulesh-init.cc Kokkos::parallel_for<Kokkos::MDRangePolicy<Kokkos::OpenMP, Kokkos::Rank<2u, (... Kokkos::parallel_for<Kokkos::MDRangePolicy<Kokkos::OpenMP, Kokkos::Rank<2u, (...
912 ../examples/lulesh/lulesh-init.cc Kokkos::parallel_for<Kokkos::RangePolicy<Kokkos::OpenMP, Kokkos::IndexType<in... Kokkos::parallel_for<Kokkos::RangePolicy<Kokkos::OpenMP, Kokkos::IndexType<in...
791 ../examples/lulesh/lulesh-init.cc Kokkos::parallel_for<Kokkos::RangePolicy<Kokkos::OpenMP, Kokkos::IndexType<in... Kokkos::parallel_for<Kokkos::RangePolicy<Kokkos::OpenMP, Kokkos::IndexType<in...
791 ../examples/lulesh/lulesh-init.cc Kokkos::parallel_for<Kokkos::RangePolicy<Kokkos::OpenMP, Kokkos::IndexType<in... Kokkos::parallel_for<Kokkos::RangePolicy<Kokkos::OpenMP, Kokkos::IndexType<in...
944 ../examples/lulesh/lulesh-init.cc Kokkos::parallel_for<Kokkos::RangePolicy<Kokkos::OpenMP, Kokkos::IndexType<lo... Kokkos::parallel_for<Kokkos::RangePolicy<Kokkos::OpenMP, Kokkos::IndexType<lo...
839 ../examples/lulesh/lulesh-init.cc Kokkos::parallel_for<Kokkos::RangePolicy<Kokkos::OpenMP, Kokkos::IndexType<lo... Kokkos::parallel_for<Kokkos::RangePolicy<Kokkos::OpenMP, Kokkos::IndexType<lo...
6589 ../examples/lulesh/lulesh-util.cc Kokkos::deep_copy<double*, Kokkos::LayoutRight, Kokkos::Device<Kokkos::OpenMP... Kokkos::deep_copy<double*, Kokkos::LayoutRight, Kokkos::Device<Kokkos::OpenMP...
1345 ../examples/lulesh/lulesh-util.cc ParseCommandLineOptions ParseCommandLineOptions(int, char * *, int, cmdLineOpts *) [lulesh-util.cc:67]
706 ../examples/lulesh/lulesh-util.cc VerifyAndWriteFinalOutput VerifyAndWriteFinalOutput(Real_t, locDom, Int_t, Int_t) [lulesh-util.cc:222]
13367 ../examples/lulesh/lulesh.cc ApplyMaterialPropertiesForElems ApplyMaterialPropertiesForElems(domain) [lulesh.cc:409]
982 ../examples/lulesh/lulesh.cc CalcElemFBHourglassForce CalcElemFBHourglassForce(const Real_t *, const Real_t[] *, coefficient, Real_...
2428 ../examples/lulesh/lulesh.cc CalcElemNodeNormals CalcElemNodeNormals(Real_t *, Real_t *, Real_t *, const Real_t *, const Real_...
853 ../examples/lulesh/lulesh.cc CalcElemShapeFunctionDerivatives CalcElemShapeFunctionDerivatives(const Real_t *, const Real_t *, const Real_t...
1054 ../examples/lulesh/lulesh.cc CalcKinematicsForElems CalcKinematicsForElems(domain, Real_t, Index_t) [lulesh.cc]
14160 ../examples/lulesh/lulesh.cc CalcVolumeForceForElems CalcVolumeForceForElems(domain) [lulesh.cc:409]
366 ../examples/lulesh/lulesh.cc Domain::AllocateGradients Domain::AllocateGradients(Domain *, Int_t, Int_t) [lulesh.cc:214]
475 ../examples/lulesh/lulesh.cc Domain::DeallocateGradients Domain::DeallocateGradients(Domain *) [lulesh.cc:105]
4356 ../examples/lulesh/lulesh.cc Domain::Domain Domain::Domain(Domain *) [lulesh.cc:78]
410 ../examples/lulesh/lulesh.cc Kokkos::View<double*>::View<char [6]> Kokkos::View<double*>::View<char [6]>(View<double *> *, arg_label, type, cons...
410 ../examples/lulesh/lulesh.cc Kokkos::View<double*>::View<char [7]> Kokkos::View<double*>::View<char [7]>(View<double *> *, arg_label, type, cons...
928 ../examples/lulesh/lulesh.cc Kokkos::parallel_for<Kokkos::RangePolicy<Kokkos::OpenMP, Kokkos::IndexType<in... Kokkos::parallel_for<Kokkos::RangePolicy<Kokkos::OpenMP, Kokkos::IndexType<in...
960 ../examples/lulesh/lulesh.cc Kokkos::parallel_for<Kokkos::RangePolicy<Kokkos::OpenMP, Kokkos::IndexType<lo... Kokkos::parallel_for<Kokkos::RangePolicy<Kokkos::OpenMP, Kokkos::IndexType<lo...
21470 ../examples/lulesh/lulesh.cc LagrangeLeapFrog LagrangeLeapFrog(domain) [lulesh.cc]
1836 ../examples/lulesh/lulesh.cc main int main(int, char * *) [lulesh.cc]
Sampling
========================================
.. note::
This capability has been deprecated in favor of :doc:`Call stack sampling <./sampling-call-stack>`.
By default, ``rocprof-sys-instrument`` uses ``--mode trace`` for instrumentation. The ``--mode sampling`` option
only instruments ``main`` in an executable. It activates both CPU call-stack sampling and
background system-level thread sampling by default.
Tracing capabilities that do not rely on instrumentation, such as HIP API and kernel tracing,
are still available.
The ROCm Systems Profiler sampling capabilities are always available, even in trace mode, but are deactivated by default.
To activate sampling in trace mode, set ``ROCPROFSYS_USE_SAMPLING=ON`` in the environment
or in a ROCm Systems Profiler configuration file.
Embedding a default configuration
========================================
Use the ``--env`` option to embed a default configuration into the target. Although this option
works for runtime instrumentation, it is most useful when generating new binaries because the generated
binary can be used later on in a different login session when the environment might have changed.
For example, if the following commands are run,
the configuration settings are not preserved for subsequent sessions:
.. code-block:: shell
rocprof-sys-instrument -o ./foo.inst -- ./foo
export ROCPROFSYS_USE_SAMPLING=ON
export ROCPROFSYS_SAMPLING_FREQ=5
rocprof-sys-run -- ./foo.inst
Whereas the following command preserves those environment variables:
.. code-block:: shell
rocprof-sys-instrument -o ./foo.samp --env ROCPROFSYS_USE_SAMPLING=ON ROCPROFSYS_SAMPLING_FREQ=5 -- ./foo
The embedded settings now apply in future sessions:
.. code-block:: shell
# will sample 5x per second
rocprof-sys-run -- ./foo.samp
Even though the environment variables are preserved, subsequent sessions can still override those defaults:
.. code-block:: shell
# will sample 100x per second
export ROCPROFSYS_SAMPLING_FREQ=100
rocprof-sys-run -- ./foo.samp
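
The override can also be scoped to a single run by prefixing the command with the
variable assignment. This is standard shell behavior rather than anything specific to
ROCm Systems Profiler. The following sketch wraps the idea in a hypothetical helper,
``run_with_freq``, and uses ``sh`` as a stand-in for the profiler so that it runs on
any system:

.. code-block:: shell

   # run_with_freq is a hypothetical helper, not part of ROCm Systems Profiler.
   # It sets ROCPROFSYS_SAMPLING_FREQ only for the command that follows.
   run_with_freq() {
       freq="$1"
       shift
       ROCPROFSYS_SAMPLING_FREQ="$freq" "$@"
   }

   # Stand-in command that echoes the value the child process sees.
   run_with_freq 100 sh -c 'echo "sampling freq: ${ROCPROFSYS_SAMPLING_FREQ}"'

   # With the profiler installed, the equivalent call would be:
   #   run_with_freq 100 rocprof-sys-run -- ./foo.samp

Because the assignment is attached to one external command, the value embedded with
``--env`` still applies to later runs that do not set the variable.
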
.. _rpath-troubleshooting:
Troubleshooting
----------------------------------------------
Checking for RPATH
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
If ``ldd ./foo.inst`` from the :ref:`binary-rewriting-library-label`
section still returns ``/usr/local/lib/libfoo.so.2``, the executable could have
an rpath encoded in the binary.
This ELF entry results in the dynamic linker ignoring ``LD_LIBRARY_PATH`` if
it finds ``libfoo.so.2`` in the rpath.
Using the ``objdump`` tool, perform the following query:
.. code-block:: shell
objdump -p <exe-or-library> | grep -E 'RPATH|RUNPATH'
If the query produces output similar to the following, the binary has an rpath or runpath encoded:
.. code-block:: shell
RUNPATH $ORIGIN:$ORIGIN/../lib
Remove or modify the rpath to get ``foo.inst`` to resolve
to the instrumented ``libfoo.so.2`` as explained in the next section.
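
The check above can be wrapped in a small function for use in scripts. This is a
minimal sketch: ``has_rpath`` is a hypothetical helper name and ``./foo.inst`` is the
example binary from this guide. The function returns a non-zero status when no entry
is found, when the file cannot be read, or when ``objdump`` is not installed.

.. code-block:: shell

   # has_rpath is a hypothetical helper, not part of ROCm Systems Profiler.
   # It prints any RPATH or RUNPATH entries and succeeds only if one exists.
   has_rpath() {
       # objdump -p prints the dynamic section; grep -E keeps the two tags.
       objdump -p "$1" 2>/dev/null | grep -E 'RPATH|RUNPATH'
   }

   if has_rpath ./foo.inst; then
       echo "rpath or runpath entry found"
   else
       echo "no rpath or runpath entry found (or the file could not be read)"
   fi
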
Modifying an RPATH
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
This code snippet uses the ``patchelf`` tool to modify the rpath of the given executable
or library to ``/home/user``, which is where the instrumented libraries are located.
.. note::
This functionality requires the ``patchelf`` package.
.. code-block:: shell
patchelf --remove-rpath <exe-or-library>
patchelf --set-rpath '/home/user' <exe-or-library>
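
Before modifying a binary in place, it can be useful to keep a copy and confirm the
result afterwards. The following sketch assumes the example binary ``./foo.inst`` and
the ``/home/user`` directory used above. It uses the ``--print-rpath`` option of
``patchelf`` to display the new value and does nothing when ``patchelf`` or the target
file is missing.

.. code-block:: shell

   target=./foo.inst        # assumption: the example binary from this guide
   new_rpath=/home/user     # directory holding the instrumented libraries

   if command -v patchelf >/dev/null 2>&1 && [ -f "$target" ]; then
       # Keep a copy so the original rpath can be restored if needed.
       cp -p "$target" "${target}.bak"
       patchelf --set-rpath "$new_rpath" "$target"
       # Show the rpath now stored in the binary.
       patchelf --print-rpath "$target"
   else
       echo "patchelf or ${target} not available; nothing changed"
   fi
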
+131
@@ -0,0 +1,131 @@
.. meta::
:description: ROCm Systems Profiler network performance profiling
:keywords: rocprof-sys, rocprofiler-systems, ROCm, tips, how to, profiler, tracking, NIC, network, AMD
********************************************
Network performance profiling
********************************************
`ROCm Systems Profiler <https://github.com/ROCm/rocprofiler-systems>`_ supports network profiling.
To list all network events that can be traced on the system, run:
.. code-block:: shell
rocprof-sys-avail -H -r net
For example, if the system's NIC is ``enp7s0``, the output of this command looks like:
.. code-block:: shell
|-------------------------------|---------|-----------|-------------------------------|
| HARDWARE COUNTER | DEVICE | AVAILABLE | SUMMARY |
|-------------------------------|---------|-----------|-------------------------------|
| net:::enp7s0:rx:byte | CPU | true | enp7s0 receive byte |
| net:::enp7s0:rx:packet | CPU | true | enp7s0 receive packet |
| net:::enp7s0:rx:error | CPU | true | enp7s0 receive error |
| net:::enp7s0:rx:droppe | CPU | true | enp7s0 receive droppe |
| net:::enp7s0:rx:fif | CPU | true | enp7s0 receive fif |
| net:::enp7s0:rx:fram | CPU | true | enp7s0 receive fram |
| net:::enp7s0:rx:compresse | CPU | true | enp7s0 receive compresse |
| net:::enp7s0:rx:multicas | CPU | true | enp7s0 receive multicas |
| net:::enp7s0:tx:byte | CPU | true | enp7s0 transmit byte |
| net:::enp7s0:tx:packet | CPU | true | enp7s0 transmit packet |
| net:::enp7s0:tx:error | CPU | true | enp7s0 transmit error |
| net:::enp7s0:tx:droppe | CPU | true | enp7s0 transmit droppe |
| net:::enp7s0:tx:fif | CPU | true | enp7s0 transmit fif |
| net:::enp7s0:tx:coll | CPU | true | enp7s0 transmit coll |
| net:::enp7s0:tx:carrie | CPU | true | enp7s0 transmit carrie |
| net:::enp7s0:tx:compresse | CPU | true | enp7s0 transmit compresse |
|-------------------------------|---------|-----------|-------------------------------|
To track the bytes and packets sent and received by the NIC ``enp7s0``, set ``ROCPROFSYS_PAPI_EVENTS`` as in the following example:
.. code-block:: shell
ROCPROFSYS_PAPI_EVENTS = net:::enp7s0:tx:byte net:::enp7s0:rx:byte net:::enp7s0:tx:packet net:::enp7s0:rx:packet
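As a minimal sketch, the event list for a given interface can be generated rather than typed by hand. The helper name ``papi_net_events`` is hypothetical and is not part of ROCm Systems Profiler; it only assembles the byte and packet event names listed above for whichever interface name is passed in:

.. code-block:: shell

    # Hypothetical helper (not part of rocprof-sys): print a
    # ROCPROFSYS_PAPI_EVENTS line for the byte and packet counters of one NIC.
    papi_net_events() {
        nic="$1"
        events=""
        for counter in tx:byte rx:byte tx:packet rx:packet; do
            # Append each event, separated by a single space.
            events="${events:+${events} }net:::${nic}:${counter}"
        done
        printf 'ROCPROFSYS_PAPI_EVENTS=%s\n' "${events}"
    }

    # Interface names on a Linux system can be listed with: ls /sys/class/net
    papi_net_events enp7s0

The printed line can then be pasted into a configuration file. Checking the generated names against the output of ``rocprof-sys-avail -H -r net`` is still advisable, because the available counters depend on the system.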
Configuration
=============
A sample set of configuration parameters looks like:
.. code-block:: shell
ROCPROFSYS_SAMPLING_FREQ=10
ROCPROFSYS_USE_SAMPLING=ON
ROCPROFSYS_TIMEMORY_COMPONENTS=wall_clock papi_array network_stats
ROCPROFSYS_NETWORK_INTERFACE=enp7s0
ROCPROFSYS_PAPI_EVENTS=net:::enp7s0:tx:byte net:::enp7s0:rx:byte net:::enp7s0:rx:packet net:::enp7s0:tx:packet
The settings in this example are:
* **Sampling Frequency**: 10 samples per second
* **TIMEMORY**: Outputs the summaries for the ``wall_clock``, ``papi_array``, and ``network_stats`` components.
* **Network Interface**: ``enp7s0`` is the predictable network interface device name.
* **Events for the network device to be sampled**: Bytes transmitted, bytes received, packets transmitted, and packets received.
The configuration parameter settings can be saved in a configuration file. Here is an example of a complete configuration file, ``rocprofsys.cfg``:
.. code-block:: shell
ROCPROFSYS_VERBOSE=1
ROCPROFSYS_DL_VERBOSE=1
ROCPROFSYS_SAMPLING_FREQ=10
ROCPROFSYS_SAMPLING_DELAY=0.05
ROCPROFSYS_SAMPLING_CPUS=0-9
ROCPROFSYS_SAMPLING_GPUS=$env:HIP_VISIBLE_DEVICES
ROCPROFSYS_TRACE=ON
ROCPROFSYS_PROFILE=ON
ROCPROFSYS_USE_SAMPLING=ON
ROCPROFSYS_USE_PROCESS_SAMPLING=OFF
ROCPROFSYS_TIME_OUTPUT=OFF
ROCPROFSYS_FILE_OUTPUT=ON
ROCPROFSYS_TIMEMORY_COMPONENTS=wall_clock papi_array network_stats
ROCPROFSYS_USE_PID=OFF
ROCPROFSYS_OUTPUT_PREFIX=foo/
ROCPROFSYS_NETWORK_INTERFACE=enp7s0
ROCPROFSYS_PAPI_EVENTS = net:::enp7s0:tx:byte net:::enp7s0:rx:byte net:::enp7s0:rx:packet net:::enp7s0:tx:packet
To specify the configuration file, use the ``ROCPROFSYS_CONFIG_FILE`` setting:
.. code-block:: shell
ROCPROFSYS_CONFIG_FILE=/path/to/rocprofsys.cfg
This setting defines the location of the ROCm Systems Profiler configuration file.
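The following sketch writes the sample settings to a file and points the profiler at it. It assumes that ``ROCPROFSYS_CONFIG_FILE`` is supplied as an exported environment variable and that the current directory is writable; the file location is only an example:

.. code-block:: shell

    # Write a minimal configuration file (values taken from the sample above).
    cfg="${PWD}/rocprofsys.cfg"
    printf '%s\n' \
        'ROCPROFSYS_SAMPLING_FREQ=10' \
        'ROCPROFSYS_USE_SAMPLING=ON' \
        'ROCPROFSYS_TIMEMORY_COMPONENTS=wall_clock papi_array network_stats' \
        'ROCPROFSYS_NETWORK_INTERFACE=enp7s0' \
        'ROCPROFSYS_PAPI_EVENTS=net:::enp7s0:tx:byte net:::enp7s0:rx:byte net:::enp7s0:rx:packet net:::enp7s0:tx:packet' \
        > "${cfg}"

    # Assumption: the setting is read from the environment of this shell.
    export ROCPROFSYS_CONFIG_FILE="${cfg}"

Profiler commands started from the same shell should then pick up these settings.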
.. note::
To collect network counters using the Performance Application Programming Interface (PAPI), ensure that
``/proc/sys/kernel/perf_event_paranoid`` has a value <= 2. See
:ref:`rocprof-sys_papi_events`
for details.
Instrumenting and running a program
===================================
An example rocprof-sys-instrument command is:
.. code-block:: shell
rocprof-sys-instrument -o foo.inst \
--log-file mylog.log --verbose --debug \
--print-instrumented functions -e -v 2 --caller-include \
inner -i 4096 -- ./foo
This command generates an instrumented binary ``foo.inst``. Then, run
it with the following command:
.. code-block:: shell
rocprof-sys-run -- ./foo.inst
To view the generated ``.proto`` file in the browser, open the
`Perfetto UI page <https://ui.perfetto.dev/>`_. Then, click on
``Open trace file`` and select the ``.proto`` file. In the browser, it looks
like this:
.. image:: ../data/rocprof-sys-perfetto-nic-trace.png
:alt: Visualization of a performance graph in Perfetto with network tracks
:width: 800
+626
@@ -0,0 +1,626 @@
.. meta::
:description: ROCm Systems Profiler causal profiling documentation and reference
:keywords: rocprof-sys, rocprofiler-systems, Omnitrace, ROCm, causal profiling, profiler, tracking, visualization, tool, Instinct, accelerator, AMD
****************************************************
Performing causal profiling
****************************************************
The process of causal profiling can be summarized as:
*If you speed up a given block of code by X%, the application will run Y% faster*.
Causal profiling directs parallel application developers to where they should focus their optimization
efforts by quantifying the potential impact of optimizations. Causal profiling is rooted in the concept
that *software execution speed is relative*. Speeding up a block of code by X% is mathematically equivalent
to that block of code running at its current speed if all the other code is running slower by X%.
Thus, causal profiling works by performing experiments on blocks of code during program execution which
insert pauses to slow down all other concurrently running code. During post-processing, these experiments
are translated into calculations for the potential impact of speeding up this block of code.
Consider the following C++ code executing ``foo`` and ``bar`` concurrently in two different threads
where ``foo`` is ideally 30% faster than ``bar``:
.. code-block:: cpp
#include <cstddef>
#include <thread>
constexpr size_t FOO_N = 7 * 1000000000UL;
constexpr size_t BAR_N = 10 * 1000000000UL;
void foo()
{
for(volatile size_t i = 0; i < FOO_N; ++i) {}
}
void bar()
{
for(volatile size_t i = 0; i < BAR_N; ++i) {}
}
int main()
{
std::thread _threads[] = { std::thread{ foo },
std::thread{ bar } };
for(auto& itr : _threads)
itr.join();
}
No matter how many optimizations are applied to ``foo``, the application will always
require the same amount of time
because the end-to-end performance is limited by ``bar``. However, a 5% speed-up
in ``bar`` results in the
end-to-end performance improving by 5%. This trend continues linearly, with a 10% speed-up
in ``bar`` yielding a 10% speed-up in
end-to-end performance, and so on, up to a 30% speed-up, at which point ``bar`` runs as fast as ``foo``.
Any speed-up to ``bar`` beyond 30% still only yields an end-to-end performance
improvement of 30% because the application
is now limited by the performance of ``foo``, as demonstrated below in the causal
profiling visualization:
.. image:: ../data/causal-foobar.png
:alt: Visualization of the performance improvements for two functions with causal profiling
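The same trend can be written as a short worked formula. This is a sketch under simplifying assumptions that are not stated in the source: a fractional speed-up :math:`s` reduces the runtime of ``bar`` to :math:`(1 - s)` of its original value, and thread start-up and join overheads are negligible. With ``foo`` taking 70% of the runtime of ``bar``, the end-to-end runtime is set by the slower thread:

.. math::

    T(s) = \max\bigl(T_{\mathrm{foo}},\ (1 - s)\,T_{\mathrm{bar}}\bigr),
    \qquad T_{\mathrm{foo}} = 0.7\,T_{\mathrm{bar}}

.. math::

    1 - \frac{T(s)}{T(0)} = 1 - \max(0.7,\ 1 - s) = \min(s,\ 0.3)

For example, :math:`s = 0.05` gives a 5% end-to-end improvement, while :math:`s = 0.5` gives only 30%.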
The full details of the causal profiling methodology can be found in the paper
`Coz: Finding Code that Counts with Causal Profiling <http://arxiv.org/pdf/1608.03676v1.pdf>`_.
The authors' implementation is publicly available on `GitHub <https://github.com/plasma-umass/coz>`_.
Getting started
========================================
To effectively use causal profiling, it is important to understand a few key
concepts, such as progress points.
Progress points
-----------------------------------
Causal profiling requires "progress points" to track progress through the code
in between samples. Progress points must be triggered in a deterministic manner via instrumentation.
This can happen in three different ways:
* `ROCm Systems Profiler <https://github.com/ROCm/rocprofiler-systems>`_ can leverage the callbacks from
Kokkos-Tools, OpenMP-Tools, rocprofiler-sdk, etc. and the wrappers around functions for
MPI, NUMA, RCCL, etc. to act as progress points
* Users can leverage the :doc:`runtime instrumentation capabilities <./instrumenting-rewriting-binary-application>`
to insert progress points
* Users can leverage :doc:`User APIs <../how-to/using-rocprof-sys-api>`,
such as ``ROCPROFSYS_CAUSAL_PROGRESS``
.. note::
Binary rewrite to insert progress points is not supported. When a rewritten binary
runs, Dyninst translates the instruction pointer address in order to perform
the instrumentation. As a result, call stack samples never return instruction
pointer addresses within the valid ROCm Systems Profiler range.
Key concepts
-----------------------------------
+------------------+--------------------------------------+----------------------------------+--------------------------------------------+
| Concept | Setting | Options | Description |
+==================+======================================+==================================+============================================+
| Backend | ``ROCPROFSYS_CAUSAL_BACKEND`` | ``perf``, ``timer`` | Backend for recording samples required |
| | | | to calculate the virtual speed-up |
+------------------+--------------------------------------+----------------------------------+--------------------------------------------+
| Mode | ``ROCPROFSYS_CAUSAL_MODE`` | ``function``, ``line`` | Select an entire function or individual |
| | | | line of code for causal experiments |
+------------------+--------------------------------------+----------------------------------+--------------------------------------------+
| End-to-end | ``ROCPROFSYS_CAUSAL_END_TO_END`` | Boolean | Perform a single experiment during the |
| | | | entire run (does not require |
| | | | progress points) |
+------------------+--------------------------------------+----------------------------------+--------------------------------------------+
| Fixed speed-up | ``ROCPROFSYS_CAUSAL_FIXED_SPEEDUP`` | one or more values from [0, 100] | Virtual speed-up or pool of virtual |
| | | | speed-ups to randomly select |
+------------------+--------------------------------------+----------------------------------+--------------------------------------------+
| Binary scope | ``ROCPROFSYS_CAUSAL_BINARY_SCOPE`` | regular expression(s) | Dynamic binaries containing code for |
| | | | experiments |
+------------------+--------------------------------------+----------------------------------+--------------------------------------------+
| Source scope | ``ROCPROFSYS_CAUSAL_SOURCE_SCOPE`` | regular expression(s) | ``<file>`` and/or ``<file>:<line>`` |
| | | | containing code to include in experiments |
+------------------+--------------------------------------+----------------------------------+--------------------------------------------+
| Function scope | ``ROCPROFSYS_CAUSAL_FUNCTION_SCOPE`` | regular expression(s) | Restricts experiments to matching |
| | | | functions (function mode) or lines of |
| | | | code within matching functions (line mode) |
+------------------+--------------------------------------+----------------------------------+--------------------------------------------+
.. note::
* Binary scope defaults to ``%MAIN%`` (in the executable), but the scope can be expanded to include linked libraries.
* ``<file>`` and ``<file>:<line>`` support requires debug info (for example, the code must be compiled with ``-g`` or, preferably, with ``-g3``)
* Function mode does not require debug info but does not support stripped binaries
Backends
-----------------------------------
There are two backends to choose from: ``perf`` and ``timer``.
They are used to record the samples required to calculate the virtual speedup.
Both backends interrupt each thread 1000 times per second (of CPU-time) to apply the virtual speed-ups.
The difference between each backend is how the samples are recorded.
There are three key differences between the two backends:
* the ``perf`` backend requires Linux Perf and elevated security privileges
* the ``perf`` backend interrupts the application less frequently whereas the ``timer`` backend
interrupts the application 1000 times per second of realtime
* the ``timer`` backend has less accurate call stacks due to instruction pointer skid
In general, the ``perf`` backend is preferred over the ``timer`` backend when
security privileges permit its usage.
If ``ROCPROFSYS_CAUSAL_BACKEND`` is set to ``auto``, ROCm Systems Profiler falls back
to using the ``timer`` backend only if
the ``perf`` backend fails. If ``ROCPROFSYS_CAUSAL_BACKEND`` is
set to ``perf`` and using this backend fails, ROCm Systems Profiler aborts.
Instruction pointer skid
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Instruction pointer (IP) skid measures how many instructions run after the event of interest
before the program actually stops. The IP skid is calculated by subtracting
the location of the IP at the point of interest from the location of the IP
when the kernel finally stops the application.
For the ``timer`` backend, this translates to the
difference in the IP between when the timer generated a signal and when the
signal was actually handled. Although IP skid still occurs with the ``perf`` backend,
it is much more pronounced with the ``timer`` backend due to the overhead of pausing the entire thread.
This means the ``timer`` backend tends to have a lower resolution than the ``perf`` backend,
especially in ``line`` mode.
Installing Linux Perf
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Linux Perf is built into the kernel and may already be installed
(for instance, it is included in the default kernel for OpenSUSE).
The official method of checking whether Linux Perf is installed is
checking for the existence of the file
``/proc/sys/kernel/perf_event_paranoid``. If the file exists, the kernel has Perf installed.
If this file does not exist, as can be the case on Debian-based systems such as Ubuntu, run the following command as superuser:
.. code-block:: shell
apt-get install linux-tools-common linux-tools-generic linux-tools-$(uname -r)
Then reboot your computer. In order to use the ``perf`` backend, the value
of ``/proc/sys/kernel/perf_event_paranoid``
should be less than or equal to 2. If the value in this file is greater than 2, you can't
use the ``perf`` backend.
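The check described above can be sketched in a few lines of shell. The function name ``perf_backend_allowed`` is hypothetical and only encodes the "less than or equal to 2" rule stated here:

.. code-block:: shell

    # Hypothetical helper: succeed (exit status 0) when the given paranoid
    # level permits the perf backend, that is, when it is <= 2.
    perf_backend_allowed() {
        [ "$1" -le 2 ]
    }

    paranoid_file=/proc/sys/kernel/perf_event_paranoid
    if [ ! -r "${paranoid_file}" ]; then
        echo "Linux Perf does not appear to be available"
    elif perf_backend_allowed "$(cat "${paranoid_file}")"; then
        echo "perf backend can be used"
    else
        echo "perf backend is blocked; consider the timer backend"
    fi

Running this before a causal profiling session gives an early indication of which backend is likely to be used.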
To update the paranoid level temporarily until the system is rebooted, run
one of the following commands
as a superuser (where ``PARANOID_LEVEL=<N>`` has a value of ``<N>`` in the range ``[-1, 2]``):
.. code-block:: shell
echo ${PARANOID_LEVEL} | sudo tee /proc/sys/kernel/perf_event_paranoid
or
.. code-block:: shell
sysctl kernel.perf_event_paranoid=${PARANOID_LEVEL}
To make the paranoid level persistent after a reboot, add ``kernel.perf_event_paranoid=<N>``
(where ``<N>`` is the desired paranoid level) to the ``/etc/sysctl.conf`` file.
Speed-up prediction variability and the rocprof-sys-causal executable
-----------------------------------------------------------------------
Causal profiling typically requires running the application several times in
order to adequately sample all the code domains, experiment
with speed-ups and other techniques, and resolve statistical fluctuations.
The ``rocprof-sys-causal`` executable is designed to simplify this procedure:
.. code-block:: shell
$ rocprof-sys-causal --help
[rocprof-sys-causal] Usage: ./bin/rocprof-sys-causal [ --help (count: 0, dtype: bool)
--version (count: 0, dtype: bool)
--monochrome (max: 1, dtype: bool)
--debug (max: 1, dtype: bool)
--verbose (count: 1)
--config (min: 0, dtype: filepath)
--launcher (count: 1, dtype: executable)
--generate-configs (min: 0, dtype: folder)
--no-defaults (min: 0, dtype: bool)
--mode (count: 1, dtype: string)
--output-name (min: 1, dtype: filename)
--reset (max: 1, dtype: bool)
--end-to-end (max: 1, dtype: bool)
--wait (count: 1, dtype: seconds)
--duration (count: 1, dtype: seconds)
--iterations (count: 1, dtype: int)
--speedups (min: 0, dtype: integers)
--binary-scope (min: 0, dtype: integers)
--source-scope (min: 0, dtype: integers)
--function-scope (min: 0, dtype: regex-list)
--binary-exclude (min: 0, dtype: integers)
--source-exclude (min: 0, dtype: integers)
--function-exclude (min: 0, dtype: regex-list)
]
Causal profiling usually requires multiple runs to reliably resolve the speedup estimates.
This executable is designed to streamline that process.
For example (assume all commands end with '-- <exe> <args>'):
rocprof-sys-causal -n 5 -- <exe> # runs <exe> 5x with causal profiling enabled
rocprof-sys-causal -s 0 5,10,15,20 # runs <exe> 2x with virtual speedups:
# - 0
# - randomly selected from 5, 10, 15, and 20
rocprof-sys-causal -F func_A func_B func_(A|B) # runs <exe> 3x with the function scope limited to:
# 1. func_A
# 2. func_B
# 3. func_A or func_B
General tips:
- Insert progress points at hotspots in your code or use rocprof-sys's runtime instrumentation
- Note: binary rewrite will produce a incompatible new binary
- Run rocprof-sys-causal in "function" mode first (does not require debug info)
- Run rocprof-sys-causal in "line" mode when you are targeting one function (requires debug info)
- Preferably, use predictions from the "function" mode to determine which function to target
- Limit the virtual speedups to a smaller pool, e.g., 0,5,10,25,50, to get reliable predictions quicker
- Make use of the binary, source, and function scope to limit the functions/lines selected for experiments
- Note: source scope requires debug info
Options:
-h, -?, --help Shows this page
--version Prints the version and exit
[DEBUG OPTIONS]
--monochrome Disable colorized output
--debug Debug output
-v, --verbose Verbose output
[GENERAL OPTIONS]
-c, --config Base configuration file
-l, --launcher When running MPI jobs, rocprof-sys-causal needs to be *before* the executable which launches the MPI processes (i.e.
before `mpirun`, `srun`, etc.). Pass the name of the target executable (or a regex for matching to the name of the
target) for causal profiling, e.g., `rocprof-sys-causal -l foo -- mpirun -n 4 foo`. This ensures that the rocprof-sys
library is LD_PRELOADed on the proper target
-g, --generate-configs Generate config files instead of passing environment variables directly. If no arguments are provided, the config files
will be placed in ${PWD}/rocprof-sys-causal-config folder
--no-defaults Do not activate default features which are recommended for causal profiling. For example: PID-tagging of output files
and timestamped subdirectories are disabled by default. Kokkos tools support is added by default
(ROCPROFSYS_USE_KOKKOSP=ON) because, for Kokkos applications, the Kokkos-Tools callbacks are used for progress points.
Activation of OpenMP tools support is similar
[CAUSAL PROFILING OPTIONS (General)]
(These settings will be applied to all causal profiling runs)
-m, --mode [ function (func) | line ]
Causal profiling mode
-o, --output-name Output filename of causal profiling data w/o extension
-r, --reset Overwrite any existing experiment results during the first run
-e, --end-to-end Single causal experiment for the entire application runtime
-w, --wait Set the wait time (i.e. delay) before starting the first causal experiment (in seconds)
-d, --duration Set the length of time (in seconds) to perform causal experimentation after the first experiment is started. Once this
amount of time has elapsed, no more causal experiments will be started but any currently running experiment will be
allowed to finish.
-n, --iterations Number of times to repeat the combination of run configurations
[CAUSAL PROFILING OPTIONS (Combinatorial)]
(Each individual argument to these options will multiply the number runs by the number of arguments and the number of
iterations. E.g. -n 2 -B "MAIN" -F "foo" "bar" will produce 4 runs: 2 iterations x 1 binary scope x 2 function scopes
(MAIN+foo, MAIN+bar, MAIN+foo, MAIN+bar))
-s, --speedups Pool of virtual speedups to sample from during experimentation. Each space designates a group and multiple speedups can
be grouped together by commas, e.g. -s 0 0,10,20-50 is two groups: group #1 is '0' and group #2 is '0 10 20 25 30 35 40
45 50'
-B, --binary-scope Restricts causal experiments to the binaries matching the list of regular expressions. Each space designates a group
and multiple scopes can be grouped together with a semi-colon
-S, --source-scope Restricts causal experiments to the source files or source file + lineno pairs (i.e. <file> or <file>:<line>) matching
the list of regular expressions. Each space designates a group and multiple scopes can be grouped together with a
semi-colon
-F, --function-scope Restricts causal experiments to the functions matching the list of regular expressions. Each space designates a group
and multiple scopes can be grouped together with a semi-colon
-BE, --binary-exclude Excludes causal experiments from being performed on the binaries matching the list of regular expressions. Each space
designates a group and multiple excludes can be grouped together with a semi-colon
-SE, --source-exclude Excludes causal experiments from being performed on the code from the source files or source file + lineno pair (i.e.
<file> or <file>:<line>) matching the list of regular expressions. Each space designates a group and multiple excludes
can be grouped together with a semi-colon
-FE, --function-exclude Excludes causal experiments from being performed on the functions matching the list of regular expressions. Each space
designates a group and multiple excludes can be grouped together with a semi-colon
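The combinatorial rule in the help text (iterations multiplied by the number of groups given to each combinatorial option) can be sketched as a small calculation. The helper ``count_runs`` is hypothetical and only illustrates the arithmetic; it does not query ``rocprof-sys-causal``:

.. code-block:: shell

    # Hypothetical helper: total executions = iterations multiplied by the
    # number of groups passed to each combinatorial option.
    count_runs() {
        total="$1"               # first argument: value of -n / --iterations
        shift
        for groups in "$@"; do   # remaining arguments: group count per option
            total=$((total * groups))
        done
        echo "${total}"
    }

    # -n 2 -B "MAIN" -F "foo" "bar": 2 iterations x 1 binary scope x 2 function scopes
    count_runs 2 1 2    # prints 4

    # -n 3 with 15 single-value speedup groups and 2 function scopes
    count_runs 3 15 2   # prints 90

Estimating the run count this way before launching helps keep long experiments within a time budget.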
Examples
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
.. code-block:: shell
#!/bin/bash -e
module load rocprofiler-systems
N=20
I=3
# when providing speedups to rocprof-sys-causal, speedup
# groups are separated by a space so "0,10" results in
# one speedup group where rocprof-sys samples from
# the speedup set of {0, 10}. Passing "0 10" (without
# quotes) to rocprof-sys-causal multiplies the
# number of runs by 2, where the first half of the
# runs instruct rocprof-sys to only use 0 as the
# speedup and the second half of the runs instruct
# rocprof-sys to only use 10 as the speedup.
SPEEDUPS="0,0,0,10,20,30,40,50,50,75,75,75,90,90,90"
# thus, -s ${SPEEDUPS} only multiplies the number
# of runs by 1 whereas -s ${SPEEDUPS_E2E} multiplies
# the number of runs by 15:
# - 3 runs with speedup of 0
# - 1 run for each of the speedups 10, 20, 30, and 40
# - 2 runs with speedup of 50
# - 3 runs with speedup of 75
# - 3 runs with speedup of 90
SPEEDUPS_E2E=$(echo "${SPEEDUPS}" | sed 's/,/ /g')
# 20 iterations in function mode with 1 speedup group
# and source scope set to .cpp files
#
# outputs to files:
# - causal/experiments.func.coz
# - causal/experiments.func.json
#
# total executions: 20
#
rocprof-sys-causal \
-n ${N} \
-s ${SPEEDUPS} \
-m function \
-o experiments.func \
-S ".*\\.cpp" \
-- \
./causal-rocprofsys-cpu "${@}"
# 20 iterations in line mode with 1 speedup group
# and source scope restricted to lines 100 and 110
# in the causal.cpp file.
#
# outputs to files:
# - causal/experiments.line.coz
# - causal/experiments.line.json
#
# total executions: 20
#
rocprof-sys-causal \
-n ${N} \
-s ${SPEEDUPS} \
-m line \
-o experiments.line \
-S "causal\\.cpp:(100|110)" \
-- \
./causal-rocprofsys-cpu "${@}"
# 3 iterations in function mode of 15 singular speedups
# in end-to-end mode with 2 different function scopes
# where one is restricted to "cpu_slow_func" and
# another is restricted to "cpu_fast_func".
#
# outputs to files:
# - causal/experiments.func.e2e.coz
# - causal/experiments.func.e2e.json
#
# total executions: 90
#
rocprof-sys-causal \
-n ${I} \
-s ${SPEEDUPS_E2E} \
-m func \
-e \
-o experiments.func.e2e \
-F "cpu_slow_func" \
"cpu_fast_func" \
-- \
./causal-rocprofsys-cpu "${@}"
# 3 iterations in line mode of 15 singular speedups
# in end-to-end mode with 2 different source scopes
# where one is restricted to line 100 in causal.cpp
# and another is restricted to line 110 in causal.cpp.
#
# outputs to files:
# - causal/experiments.line.e2e.coz
# - causal/experiments.line.e2e.json
#
# total executions: 90
#
rocprof-sys-causal \
-n ${I} \
-s ${SPEEDUPS_E2E} \
-m line \
-e \
-o experiments.line.e2e \
-S "causal\\.cpp:100" \
"causal\\.cpp:110" \
-- \
./causal-rocprofsys-cpu "${@}"
export OMP_NUM_THREADS=8
export OMP_PROC_BIND=spread
export OMP_PLACES=threads
# set number of iterations to 5
N=5
# 5 iterations in function mode of 1 speedup
# group with the source scope restricted
# to files containing "lulesh" in their filename
# and exclude functions which start with "Kokkos::"
# or "std::enable_if".
#
# outputs to files:
# - causal/experiments.func.coz
# - causal/experiments.func.json
#
# total executions: 5
#
# First of 5 executions overwrites any
# existing causal/experiments.func.(coz|json)
# file due to "--reset" argument
#
rocprof-sys-causal \
--reset \
-n ${N} \
-s ${SPEEDUPS} \
-m func \
-o experiments.func \
-S "lulesh.*" \
-FE "^(Kokkos::|std::enable_if)" \
-- \
./lulesh-rocprofsys -i 50 -s 200 -r 20 -b 5 -c 5 -p
# 5 iterations in line mode of 1 speedup
# group with the source scope restricted
# to files containing "lulesh" in their filename
# and exclude functions which start with "exec_range"
# or "execute", or which contain either
# "construct_shared_allocation" or "._omp_fn." in
# the function name.
#
# outputs to files:
# - causal/experiments.line.coz
# - causal/experiments.line.json
#
# total executions: 5
#
# First of 5 executions overwrites any
# existing causal/experiments.line.(coz|json)
# file due to "--reset" argument
#
rocprof-sys-causal \
--reset \
-n ${N} \
-s ${SPEEDUPS} \
-m line \
-o experiments.line \
-S "lulesh.*" \
-FE "^(exec_range|execute);construct_shared_allocation;\\._omp_fn\\." \
-- \
./lulesh-rocprofsys -i 50 -s 200 -r 20 -b 5 -c 5 -p
# 5 iterations in line mode of 1 speedup
# group with the source scope restricted
# to files whose basename is "lulesh.cc"
# for 3 different functions:
# - ApplyMaterialPropertiesForElems
# - CalcHourglassControlForElems
# - CalcVolumeForceForElems
#
# outputs to files:
# - causal/experiments.line.targeted.coz
# - causal/experiments.line.targeted.json
#
# total executions: 15
#
# First of 5 executions overwrites any
# existing causal/experiments.line.targeted.(coz|json)
# file due to "--reset" argument
#
rocprof-sys-causal \
--reset \
-n ${N} \
-s ${SPEEDUPS} \
-m line \
-o experiments.line.targeted \
-F "ApplyMaterialPropertiesForElems" \
"CalcHourglassControlForElems" \
"CalcVolumeForceForElems" \
-S "lulesh\\.cc" \
-- \
./lulesh-rocprofsys -i 50 -s 200 -r 20 -b 5 -c 5 -p
Using rocprof-sys-causal with other launchers like mpirun
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
The ``rocprof-sys-causal`` executable is intended to assist with application replay
and is designed to always be at the start of the command line as the primary process.
``rocprof-sys-causal`` typically adds an ``LD_PRELOAD`` of the ROCm Systems Profiler libraries
into the environment before launching the command to inject the functionality
required to start the causal profiling tooling. However, this is problematic
when the target application for causal profiling uses a launcher, in which case
it is listed as an argument rather than as the main application. For example,
``foo`` is the target application for profiling, but the command to run it is
``mpirun -n 2 foo``. Running the command ``rocprof-sys-causal -- mpirun -n 2 foo``
applies the causal profiling to ``mpirun`` instead of ``foo``.
``rocprof-sys-causal`` remedies this by providing a command-line option ``-l`` / ``--launcher``
to indicate the target application is using a launcher script/executable. The
argument to the command-line option is the name of, or regular expression for, the target application
on the command line. When ``--launcher`` is used, ``rocprof-sys-causal`` generates
all the replay configurations and runs them but delays adding the ``LD_PRELOAD``. Instead it
inserts a call to itself into the command line right before the target
application. This recursive call inherits the configuration from
the parent ``rocprof-sys-causal`` executable, inserts an ``LD_PRELOAD`` into the environment,
and calls ``execv`` to replace itself with the new process launched by the target
application.
In other words, the following command:
.. code-block:: shell
rocprof-sys-causal -l foo -n 3 -- mpirun -n 2 foo
Effectively results in:
.. code-block:: shell
mpirun -n 2 rocprof-sys-causal -- foo
mpirun -n 2 rocprof-sys-causal -- foo
mpirun -n 2 rocprof-sys-causal -- foo
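To make the rewriting step concrete, the sketch below prints a command line with ``rocprof-sys-causal --`` inserted before the target. It is an illustration only, not the tool's implementation: the function name ``insert_before_target`` is hypothetical, and it matches the target by exact name, whereas the real option also accepts a regular expression:

.. code-block:: shell

    # Illustration only: print the command line with "rocprof-sys-causal --"
    # inserted before the first argument that equals the target name.
    insert_before_target() {
        target="$1"
        shift
        out=""
        inserted=0
        for arg in "$@"; do
            if [ "${inserted}" -eq 0 ] && [ "${arg}" = "${target}" ]; then
                out="${out:+${out} }rocprof-sys-causal --"
                inserted=1
            fi
            out="${out:+${out} }${arg}"
        done
        printf '%s\n' "${out}"
    }

    insert_before_target foo mpirun -n 2 foo
    # prints: mpirun -n 2 rocprof-sys-causal -- foo

The printed line corresponds to one of the three effective commands shown above.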
Visualizing the causal output
-------------------------------------------------------------------------
ROCm Systems Profiler generates ``causal/experiments.json`` and ``causal/experiments.coz`` in
``${ROCPROFSYS_OUTPUT_PATH}/${ROCPROFSYS_OUTPUT_PREFIX}``. Visit
`plasma-umass.org/coz <https://plasma-umass.org/coz/>`_ to open the ``*.coz`` file.
ROCm Systems Profiler versus Coz
=======================================
This comparison is intended for readers who are familiar with the
`Coz profiler <https://github.com/plasma-umass/coz>`_.
ROCm Systems Profiler provides several additional features and utilities for causal profiling:
.. csv-table::
:header: "Feature", "Coz", "ROCm Systems Profiler", "Notes"
:widths: 20, 60, 60, 30
"Debug info", "requires debug info in DWARF v3 format (``-gdwarf-3``)", "optional, supports any DWARF format version", "See Note #1 below"
"Experiment selection", "``<file>:<line>``", "``<function>`` or ``<file>:<line>``", "See Note #2 below"
"Experiment speed-ups", "Randomly samples b/t 0..100 in increments of 5 or one fixed speed-up", "Supports specifying smaller subset", "See Note #3 below"
"Scope options", "Supports binary and source scopes", "Supports binary, source, and function scopes", "See Note #4, #5, and #6 below"
"Scope inclusion", "Uses ``%`` as a wildcard for binary and source scopes", "Full regex support for binary, source, and function scopes", ""
"Scope exclusion", "Not supported", "Supports regexes for excluding binary/source/function", "See Note #7 below"
"Call-stack sampling", "Linux Perf", "Linux Perf, libunwind", "See Note #8 below"
.. note::
#. ROCm Systems Profiler supports a "function" mode which does not require debug info.
#. ROCm Systems Profiler supports selecting an entire range of instruction pointers for a function instead
of an instruction pointer for one line. In large code bases, "function" mode
can resolve in fewer iterations. After a target function is identified, you can
switch to line mode and limit the function scope to the target function.
#. ROCm Systems Profiler supports randomly sampling from subsets, e.g. { 0, 0, 5, 10 }
where 0% is randomly selected 50% of the time and 5% and 10% are each randomly selected 25% of the time.
#. ROCm Systems Profiler and COZ have the same definition for binary scope, which is the binaries
loaded at runtime (the executable and linked libraries).
#. ROCm Systems Profiler "source scope" supports both ``<file>`` and ``<file>:<line>`` formats
in contrast to the COZ "source scope" which requires ``<file>:<line>`` format.
#. ROCm Systems Profiler supports a "function" scope which narrows the function and lines
which are eligible for causal experiments to those within the matching functions.
#. ROCm Systems Profiler supports a second filter on scopes for removing binary/source/function
caught by an inclusive match. For example ``BINARY_SCOPE=.*`` and ``BINARY_EXCLUDE=libmpi.*``
initially includes all binaries, but the exclude regex then removes the MPI libraries.
#. In ROCm Systems Profiler, the Linux Perf backend is preferred over libunwind. However,
Linux Perf usage can be restricted for security reasons.
ROCm Systems Profiler falls back to using a second POSIX timer and libunwind if
Linux Perf is not available.
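The weighting effect of repeated values in a speed-up pool, as in the ``{ 0, 0, 5, 10 }`` example in the notes, can be checked with a short calculation. The helper ``pool_weights`` is hypothetical and assumes each entry of the pool is equally likely to be drawn:

.. code-block:: shell

    # Hypothetical helper: print how often each value of a comma-separated
    # speed-up pool would be picked under uniform random selection.
    pool_weights() {
        echo "$1" | tr ',' '\n' | sort -n | uniq -c | awk '
            { count[$2] = $1; total += $1; keys[++k] = $2 }
            END {
                for (i = 1; i <= k; i++) {
                    printf "%s%% speed-up: %d/%d\n", keys[i], count[keys[i]], total
                }
            }'
    }

    pool_weights "0,0,5,10"

For this pool the 0% entry accounts for two of the four draws, matching the 50% figure in the notes.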
+340
@@ -0,0 +1,340 @@
.. meta::
:description: ROCm Systems Profiler Python profiling documentation and reference
:keywords: rocprof-sys, rocprofiler-systems, Omnitrace, ROCm, Python, profiling Python, profiler, tracking, visualization, tool, Instinct, accelerator, AMD
****************************************************
Profiling Python scripts
****************************************************
`ROCm Systems Profiler <https://github.com/ROCm/rocprofiler-systems>`_ supports profiling Python code at the
source level and the script level.
Python support is enabled via the ``ROCPROFSYS_USE_PYTHON`` and the
``ROCPROFSYS_PYTHON_VERSIONS="<MAJOR>.<MINOR>"`` CMake options.
Alternatively, to build multiple Python versions, use
``ROCPROFSYS_PYTHON_VERSIONS="<MAJOR>.<MINOR>;[<MAJOR>.<MINOR>]"``,
and ``ROCPROFSYS_PYTHON_ROOT_DIRS="/path/to/version;[/path/to/version]"`` instead of ``ROCPROFSYS_PYTHON_VERSION``.
When building multiple Python versions, the ``ROCPROFSYS_PYTHON_VERSIONS``
and ``ROCPROFSYS_PYTHON_ROOT_DIRS`` lists must
be the same length.
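A quick pre-flight check of the two lists can be sketched as follows. The helper ``list_length`` is hypothetical, and the version numbers and paths are placeholders rather than recommended values:

.. code-block:: shell

    # Hypothetical helper: count the entries of a semicolon-separated CMake list.
    list_length() {
        echo "$1" | awk -F';' '{ print NF }'
    }

    versions="3.8;3.9"
    root_dirs="/path/to/python3.8;/path/to/python3.9"   # placeholder paths

    if [ "$(list_length "${versions}")" -eq "$(list_length "${root_dirs}")" ]; then
        echo "lists match"
    else
        echo "ROCPROFSYS_PYTHON_VERSIONS and ROCPROFSYS_PYTHON_ROOT_DIRS differ in length"
    fi

If the lists match, the values would typically be passed to CMake as ``-DROCPROFSYS_PYTHON_VERSIONS="${versions}"`` and ``-DROCPROFSYS_PYTHON_ROOT_DIRS="${root_dirs}"`` together with ``-DROCPROFSYS_USE_PYTHON=ON``.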
.. note::
When using ROCm Systems Profiler with Python programs, the Python interpreter major and minor version (e.g. 3.7)
must match the interpreter major and minor version
used when compiling the Python bindings. When building ROCm Systems Profiler,
the shared object file ``libpyrocprofsys.<IMPL>-<VERSION>-<ARCH>-<OS>-<ABI>.so`` is generated
where ``IMPL`` is the Python implementation, ``VERSION`` is the major and minor
version, ``ARCH`` is the architecture,
``OS`` is the operating system, and ``ABI`` is the application binary interface,
for example, ``libpyrocprofsys.cpython-38-x86_64-linux-gnu.so``.
.. note::
ROCm Systems Profiler has limited support for Artificial Intelligence (AI) and Machine Learning (ML) workloads.
Data from child threads is not captured. For other profiling options,
see `rocprofV3 <https://rocm.docs.amd.com/projects/rocprofiler-sdk/en/latest/how-to/using-rocprofv3.html#using-rocprofv3>`_.
Getting started
========================================
The ROCm Systems Profiler Python package is installed in ``lib/pythonX.Y/site-packages/rocprofsys``.
To ensure the Python interpreter can find the ROCm Systems Profiler package,
add this path to the ``PYTHONPATH`` environment variable, as in the following example:
.. code-block:: shell
export PYTHONPATH=/opt/rocprofiler-systems/lib/python3.8/site-packages:${PYTHONPATH}
Both the ``share/rocprofiler-systems/setup-env.sh`` script and the module file in
``share/modulefiles/rocprofiler-systems`` automatically handle the prefixing of the ``PYTHONPATH``
environment variable.
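A quick way to confirm that ``PYTHONPATH`` is set correctly is to ask the interpreter whether it can locate the package. The following sketch uses only the standard library; the helper name ``is_importable`` is hypothetical.

```python
import importlib.util


def is_importable(module_name):
    """Return True if the current interpreter can locate the named module."""
    return importlib.util.find_spec(module_name) is not None


if is_importable("rocprofsys"):
    print("rocprofsys is importable")
else:
    print("rocprofsys was not found; check PYTHONPATH")
```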
Running ROCm Systems Profiler on a Python script
================================================
ROCm Systems Profiler provides a ``rocprof-sys-python`` helper bash script that
ensures ``PYTHONPATH`` is properly set and the correct Python interpreter is used.
This means the following commands are effectively equivalent:
.. code-block:: shell
rocprof-sys-python --help
and
.. code-block:: shell
export PYTHONPATH=/opt/rocprofiler-systems/lib/python3.8/site-packages:${PYTHONPATH}
python3.8 -m rocprofsys --help
.. note::
``rocprof-sys-python`` and ``python -m rocprofsys`` use the same command-line syntax
as the other ``rocprof-sys`` executables (``rocprof-sys-python <ROCPROFSYS_ARGS> -- <SCRIPT> <SCRIPT_ARGS>``)
and have similar options.
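The ``<ROCPROFSYS_ARGS> -- <SCRIPT> <SCRIPT_ARGS>`` layout can be illustrated with a short sketch. This is not the profiler's own argument parser: ``split_command_line`` is a hypothetical helper, and treating ``--`` as mandatory is an assumption made for the example.

```python
def split_command_line(argv):
    """Split argv at the first "--" into (profiler_args, script, script_args)."""
    if "--" not in argv:
        raise ValueError('expected "--" before the script path')
    idx = argv.index("--")
    profiler_args = argv[:idx]
    rest = argv[idx + 1:]
    if not rest:
        raise ValueError('no script given after "--"')
    return profiler_args, rest[0], rest[1:]


profiler_args, script, script_args = split_command_line(
    ["-b", "--label", "args", "--", "./example.py", "20"]
)
print(profiler_args)  # options consumed by the profiler
print(script)         # script to run
print(script_args)    # arguments passed through to the script
```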
Command line options
-----------------------------------
Use ``rocprof-sys-python --help`` to view the available options:
.. code-block:: shell
usage: rocprof-sys [-h] [-v VERBOSITY] [-b] [-c FILE] [-s FILE] [-F [BOOL]] [--label [{args,file,line} [{args,file,line} ...]]] [-I FUNC [FUNC ...]] [-E FUNC [FUNC ...]] [-R FUNC [FUNC ...]] [-MI FILE [FILE ...]] [-ME FILE [FILE ...]] [-MR FILE [FILE ...]] [--trace-c [BOOL]]
optional arguments:
-h, --help show this help message and exit
-v VERBOSITY, --verbosity VERBOSITY
Logging verbosity
-b, --builtin Put 'profile' in the builtins. Use '@profile' to decorate a single function, or 'with profile:' to profile a single section of code.
-c FILE, --config FILE
ROCm Systems Profiler configuration file
-s FILE, --setup FILE
Code to execute before the code to profile
-F [BOOL], --full-filepath [BOOL]
Encode the full function filename (instead of basename)
--label [{args,file,line} [{args,file,line} ...]]
Encode the function arguments, filename, and/or line number into the profiling function label
-I FUNC [FUNC ...], --function-include FUNC [FUNC ...]
Include any entries with these function names
-E FUNC [FUNC ...], --function-exclude FUNC [FUNC ...]
Filter out any entries with these function names
-R FUNC [FUNC ...], --function-restrict FUNC [FUNC ...]
Select only entries with these function names
-MI FILE [FILE ...], --module-include FILE [FILE ...]
Include any entries from these files
-ME FILE [FILE ...], --module-exclude FILE [FILE ...]
Filter out any entries from these files
-MR FILE [FILE ...], --module-restrict FILE [FILE ...]
Select only entries from these files
--trace-c [BOOL] Enable profiling C functions
usage: python3 -m rocprofsys <ROCPROFSYS_ARGS> -- <SCRIPT> <SCRIPT_ARGS>
.. note::
The ``--trace-c`` option does not incorporate ROCm Systems Profiler's dynamic instrumentation support.
It only enables profiling the underlying C function call within the Python interpreter.
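The distinction between Python-level and C-level calls can be observed with the standard ``sys.setprofile`` hook, which reports them as separate ``call`` and ``c_call`` events. The sketch below only illustrates that distinction; whether ``--trace-c`` is implemented through this mechanism is an assumption, and ``collect_events`` is a hypothetical helper.

```python
import sys


def collect_events(func):
    """Run func and record Python-level and C-level call events."""
    events = []

    def hook(frame, event, arg):
        if event == "call":
            # A Python function was entered.
            events.append(("call", frame.f_code.co_name))
        elif event == "c_call":
            # A C function is about to be called; arg is the C function object.
            events.append(("c_call", arg.__name__))

    sys.setprofile(hook)
    try:
        func()
    finally:
        sys.setprofile(None)
    return events


def work():
    return len("abc")  # len is implemented in C


events = collect_events(work)
for kind, name in events:
    print(kind, name)
```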
Selective instrumentation
-----------------------------------
Similar to the ``rocprof-sys-instrument`` executable, command-line options exist for restricting,
including, and excluding certain functions and modules, for example, ``--function-exclude "^__init__$"``.
Alternatively, add the ``@profile`` decorator to the primary function of interest
in your program and use the ``-b`` / ``--builtin`` command-line option to narrow the scope of the
instrumentation to this function and its children.
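The ``^__init__$`` example suggests that the function filters are regular expressions. Under that assumption, the effect of an exclude filter can be sketched with the standard ``re`` module; ``exclude_functions`` is a hypothetical helper and does not reproduce the profiler's own matching logic.

```python
import re


def exclude_functions(names, patterns):
    """Drop every name matched by at least one regular expression."""
    compiled = [re.compile(pattern) for pattern in patterns]
    return [
        name for name in names
        if not any(regex.search(name) for regex in compiled)
    ]


names = ["__init__", "__init_subclass__", "run", "fib", "inefficient"]
kept = exclude_functions(names, [r"^__init__$"])
print(kept)
```

Because the pattern is anchored with ``^`` and ``$``, only the exact name ``__init__`` is dropped and ``__init_subclass__`` is kept.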
Consider the following Python code (``example.py``):
.. code-block:: python

    import sys

    def fib(n):
        return n if n < 2 else (fib(n - 1) + fib(n - 2))

    def inefficient(n):
        a = 0
        for i in range(n):
            a += i
            for j in range(n):
                a += j
        return a

    def run(n):
        return fib(n) + inefficient(n)

    if __name__ == "__main__":
        run(20)

Running ``rocprof-sys-python -- ./example.py`` with ``ROCPROFSYS_PROFILE=ON`` and
``ROCPROFSYS_TIMEMORY_COMPONENTS=trip_count`` produces the following:
.. code-block:: shell
|-------------------------------------------------------------------------------------------|
| COUNTS NUMBER OF INVOCATIONS |
|-------------------------------------------------------------------------------------------|
| LABEL | COUNT | DEPTH | METRIC | SUM |
|---------------------------------------------------|--------|--------|------------|--------|
| |0>>> run | 1 | 0 | trip_count | 1 |
| |0>>> |_fib | 1 | 1 | trip_count | 1 |
| |0>>> |_fib | 2 | 2 | trip_count | 2 |
| |0>>> |_fib | 4 | 3 | trip_count | 4 |
| |0>>> |_fib | 8 | 4 | trip_count | 8 |
| |0>>> |_fib | 16 | 5 | trip_count | 16 |
| |0>>> |_fib | 32 | 6 | trip_count | 32 |
| |0>>> |_fib | 64 | 7 | trip_count | 64 |
| |0>>> |_fib | 128 | 8 | trip_count | 128 |
| |0>>> |_fib | 256 | 9 | trip_count | 256 |
| |0>>> |_fib | 512 | 10 | trip_count | 512 |
| |0>>> |_fib | 1024 | 11 | trip_count | 1024 |
| |0>>> |_fib | 2026 | 12 | trip_count | 2026 |
| |0>>> |_fib | 3632 | 13 | trip_count | 3632 |
| |0>>> |_fib | 5020 | 14 | trip_count | 5020 |
| |0>>> |_fib | 4760 | 15 | trip_count | 4760 |
| |0>>> |_fib | 2942 | 16 | trip_count | 2942 |
| |0>>> |_fib | 1152 | 17 | trip_count | 1152 |
| |0>>> |_fib | 274 | 18 | trip_count | 274 |
| |0>>> |_fib | 36 | 19 | trip_count | 36 |
| |0>>> |_fib | 2 | 20 | trip_count | 2 |
| |0>>> |_inefficient | 1 | 1 | trip_count | 1 |
|-------------------------------------------------------------------------------------------|
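The COUNT column for ``fib`` can be cross-checked without the profiler by counting how often the function is entered at each call depth. The following is a plain-Python sketch; ``fib_calls_per_depth`` is a hypothetical helper that mirrors the recursion of ``fib`` and treats ``run`` as depth 0.

```python
from collections import Counter


def fib_calls_per_depth(n, depth=1, counts=None):
    """Count how many times fib is entered at each depth (run() is depth 0)."""
    if counts is None:
        counts = Counter()
    counts[depth] += 1
    if n >= 2:
        # Same recursion as fib(n): one call for n - 1 and one for n - 2.
        fib_calls_per_depth(n - 1, depth + 1, counts)
        fib_calls_per_depth(n - 2, depth + 1, counts)
    return counts


counts = fib_calls_per_depth(20)
for depth in sorted(counts):
    print(f"depth {depth:2d}: {counts[depth]:5d} calls")
print("total fib calls:", sum(counts.values()))
```

The counts double at each level up to depth 11 and then taper off, because branches that reach ``n < 2`` stop recursing.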
If the ``inefficient`` function is decorated with ``@profile`` as follows:
.. code-block:: python

    @profile
    def inefficient(n):
        # ...

and the script is then run with ``rocprof-sys-python -b -- ./example.py``, ROCm Systems Profiler produces this output:
.. code-block:: shell
|-----------------------------------------------------------|
| COUNTS NUMBER OF INVOCATIONS |
|-----------------------------------------------------------|
| LABEL | COUNT | DEPTH | METRIC | SUM |
|-------------------|--------|--------|------------|--------|
| |0>>> inefficient | 1 | 0 | trip_count | 1 |
|-----------------------------------------------------------|
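Because ``profile`` exists only when ``-b`` places it in the builtins, a script decorated with ``@profile`` raises ``NameError`` under a plain Python interpreter. One common workaround, sketched here as an assumption rather than a documented feature of ROCm Systems Profiler, is to install a no-op fallback at the top of the script. The fallback covers only the decorator form, not ``with profile:``.

```python
import builtins

if not hasattr(builtins, "profile"):
    def _noop_profile(func):
        """Fallback decorator: return the function unchanged."""
        return func

    # Only installed when no profiler has already provided 'profile'.
    builtins.profile = _noop_profile


@profile
def inefficient(n):
    a = 0
    for i in range(n):
        a += i
        for j in range(n):
            a += j
    return a


print(inefficient(3))
```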
ROCm Systems Profiler Python source instrumentation
===================================================
Starting with the unmodified ``example.py`` script above, import the ``rocprofsys`` module:
.. code-block:: python

    import sys
    import rocprofsys  # import rocprofsys

    def fib(n):
        # ... etc. ...

Next, add ``@rocprofsys.profile()`` to the ``run`` function:
.. code-block:: python

    @rocprofsys.profile()
    def run(n):
        # ...

Alternatively, use ``rocprofsys.profile()`` as a context-manager around ``run(20)``:
.. code-block:: python

    if __name__ == "__main__":
        with rocprofsys.profile():
            run(20)

The results for both of the source-level instrumentation modes are identical to the
original ``rocprof-sys-python -- ./example.py`` results:
.. code-block:: shell
|-------------------------------------------------------------------------------------------|
| COUNTS NUMBER OF INVOCATIONS |
|-------------------------------------------------------------------------------------------|
| LABEL | COUNT | DEPTH | METRIC | SUM |
|---------------------------------------------------|--------|--------|------------|--------|
| |0>>> run | 1 | 0 | trip_count | 1 |
| |0>>> |_fib | 1 | 1 | trip_count | 1 |
| |0>>> |_fib | 2 | 2 | trip_count | 2 |
| |0>>> |_fib | 4 | 3 | trip_count | 4 |
| |0>>> |_fib | 8 | 4 | trip_count | 8 |
| |0>>> |_fib | 16 | 5 | trip_count | 16 |
| |0>>> |_fib | 32 | 6 | trip_count | 32 |
| |0>>> |_fib | 64 | 7 | trip_count | 64 |
| |0>>> |_fib | 128 | 8 | trip_count | 128 |
| |0>>> |_fib | 256 | 9 | trip_count | 256 |
| |0>>> |_fib | 512 | 10 | trip_count | 512 |
| |0>>> |_fib | 1024 | 11 | trip_count | 1024 |
| |0>>> |_fib | 2026 | 12 | trip_count | 2026 |
| |0>>> |_fib | 3632 | 13 | trip_count | 3632 |
| |0>>> |_fib | 5020 | 14 | trip_count | 5020 |
| |0>>> |_fib | 4760 | 15 | trip_count | 4760 |
| |0>>> |_fib | 2942 | 16 | trip_count | 2942 |
| |0>>> |_fib | 1152 | 17 | trip_count | 1152 |
| |0>>> |_fib | 274 | 18 | trip_count | 274 |
| |0>>> |_fib | 36 | 19 | trip_count | 36 |
| |0>>> |_fib | 2 | 20 | trip_count | 2 |
| |0>>> |_inefficient | 1 | 1 | trip_count | 1 |
|-------------------------------------------------------------------------------------------|
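An object that works both as ``@obj()`` and as ``with obj():`` can be built in standard Python with ``contextlib.ContextDecorator``. The sketch below illustrates that general pattern only; ``toy_profile`` is a hypothetical stand-in that records wall-clock time and says nothing about how ``rocprofsys.profile`` is implemented.

```python
import time
from contextlib import ContextDecorator


class toy_profile(ContextDecorator):
    """Hypothetical stand-in that records wall-clock time for a region."""

    def __init__(self):
        self.elapsed = []

    def __enter__(self):
        self._start = time.perf_counter()
        return self

    def __exit__(self, exc_type, exc, tb):
        self.elapsed.append(time.perf_counter() - self._start)
        return False  # do not suppress exceptions


@toy_profile()
def run(n):
    return sum(range(n))


run(1000)  # decorator form

with toy_profile() as region:  # context-manager form
    run(1000)

print(f"region took {region.elapsed[0]:.6f} s")
```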
.. note::
When ``rocprof-sys-python`` is used without built-ins, the profiling results can be cluttered by the
numerous functions called when more complex modules are imported, such as ``import numpy``.
ROCm Systems Profiler Python source instrumentation configuration
-----------------------------------------------------------------
Within the Python source code, the profiler can be configured by directly
modifying the ``rocprofsys.profiler.config`` data fields.
.. code-block:: python

    import sys

    def fib(n):
        return n if n < 2 else (fib(n - 1) + fib(n - 2))

    def inefficient(n):
        a = 0
        for i in range(n):
            a += i
            for j in range(n):
                a += j
        return a

    def run(n):
        return fib(n) + inefficient(n)

    if __name__ == "__main__":
        from rocprofsys.profiler import config
        from rocprofsys import profile

        config.include_args = True
        config.include_filename = False
        config.include_line = False
        config.restrict_functions += ["fib", "run"]

        with profile():
            run(5)

Executing this script produces the following:
.. code-block:: shell
|------------------------------------------------------------------|
| COUNTS NUMBER OF INVOCATIONS |
|------------------------------------------------------------------|
| LABEL | COUNT | DEPTH | METRIC | SUM |
|--------------------------|--------|--------|------------|--------|
| |0>>> run(n=5) | 1 | 0 | trip_count | 1 |
| |0>>> |_fib(n=5) | 1 | 1 | trip_count | 1 |
| |0>>> |_fib(n=4) | 1 | 2 | trip_count | 1 |
| |0>>> |_fib(n=3) | 1 | 3 | trip_count | 1 |
| |0>>> |_fib(n=2) | 1 | 4 | trip_count | 1 |
| |0>>> |_fib(n=1) | 1 | 5 | trip_count | 1 |
| |0>>> |_fib(n=0) | 1 | 5 | trip_count | 1 |
| |0>>> |_fib(n=1) | 1 | 4 | trip_count | 1 |
| |0>>> |_fib(n=2) | 1 | 3 | trip_count | 1 |
| |0>>> |_fib(n=1) | 1 | 4 | trip_count | 1 |
| |0>>> |_fib(n=0) | 1 | 4 | trip_count | 1 |
| |0>>> |_fib(n=3) | 1 | 2 | trip_count | 1 |
| |0>>> |_fib(n=2) | 1 | 3 | trip_count | 1 |
| |0>>> |_fib(n=1) | 1 | 4 | trip_count | 1 |
| |0>>> |_fib(n=0) | 1 | 4 | trip_count | 1 |
| |0>>> |_fib(n=1) | 1 | 3 | trip_count | 1 |
|------------------------------------------------------------------|
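The shape of this output, with argument values in each label and only ``fib`` and ``run`` listed, can be approximated with the standard ``sys.setprofile`` hook. This is a stdlib-only sketch and not the profiler's implementation; ``collect_labels`` and its ``restrict`` parameter are hypothetical names chosen to mirror ``include_args`` and ``restrict_functions``.

```python
import sys


def fib(n):
    return n if n < 2 else (fib(n - 1) + fib(n - 2))


def inefficient(n):
    return sum(range(n))


def run(n):
    return fib(n) + inefficient(n)


def collect_labels(func, *args, restrict=("fib", "run")):
    """Record (depth, 'name(arg=value)') for each call to a restricted function."""
    labels = []
    depth = 0

    def hook(frame, event, arg):
        nonlocal depth
        code = frame.f_code
        if code.co_name not in restrict:
            return  # ignore functions outside the restricted set
        if event == "call":
            arg_names = code.co_varnames[: code.co_argcount]
            rendered = ", ".join(
                f"{name}={frame.f_locals[name]!r}" for name in arg_names
            )
            labels.append((depth, f"{code.co_name}({rendered})"))
            depth += 1
        elif event == "return":
            depth -= 1

    sys.setprofile(hook)
    try:
        func(*args)
    finally:
        sys.setprofile(None)
    return labels


labels = collect_labels(run, 5)
for depth, label in labels:
    print("  " * depth + label)
```

The printed tree lists ``run(n=5)`` first, followed by the ``fib`` calls in the order they are entered, while ``inefficient`` is omitted because it is not in the restricted set.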

Some files were not shown because too many files were changed.