rocm-systems/.github/workflows/continuous_integration.yml
Jonathan R. Madsen c5e45803e9 Code Coverage Reporting (#334)
* Update lib/rocprofiler-sdk/counters/{tests,parser/tests}/CMakeLists.txt

- use rocprofiler-static-library instead of rocprofiler-object-library

* Update scripts/run-ci.py

- support gcovr and pycobertura

* Update CI workflow for code coverage

- load/save cache for XML code coverage (via gcovr)
- generate and write code coverage comment
- archive code coverage HTML report
- fix name for sanitizer jobs

* Update CI workflow

- tweaks to env for PATH and LD_LIBRARY_PATH

* Add scripts/upload-image-to-github.py

- script for saving images to orphan branches to be used in markdown links

* Update CI workflow

- fix upload artifact conflict
- use upload-image-to-github.py

* Update CI workflow

- install extra packages for wkhtmltopdf/wkhtmltoimage

* Update CI workflow (code coverage)

- install more recent git
- tweak package installs for wkhtmltopdf/wkhtmltoimage

* Update CI workflow (code coverage)

- remove duplicate --cap-add=SYS_PTRACE

* Update CI and upload-image-to-github.py

- print versions

* Update upload-image-to-github.py

- check exit code of some subprocesses

* Update CI workflow

- fix GITHUB_PATH ordering
- fix LD_LIBRARY_PATH

* Update CI workflow

- fix code coverage cache keys (use SHAs)
- copy .codecov to .codecov.ref if a cached .codecov exists

* Update upload-image-to-github.py

- Update git pull/push commands

* Update upload-image-to-github.py

- git fetch before pulling
- git pull before committing

* Update upload-image-to-github.py

- git fetch after committing
- git pull after committing

* Update CI workflow

- list files before cat

* Update upload-image-to-github.py

- output messages

* Update CI workflow and upload-image-to-github.py

- fix output directory path for script to work with CI workflow

* Update CI workflow

- finishing touches/fixes on the code coverage comment generation

* Reproducible filenames

* Update CI workflow

- fix archive of code coverage data

* Fix relative path of reproducible file loc

* Update upload-image-to-github.py

- change update method

* rocprofiler-v2-internal -> rocprofiler-sdk-internal
2024-01-02 19:22:43 -06:00


name: Continuous Integration

on:
  workflow_dispatch:
  push:
    branches: [ "main" ]
    paths-ignore:
      - '*.md'
      - 'source/docs/**'
  pull_request:
    branches: [ "main" ]
    paths-ignore:
      - '*.md'
      - 'source/docs/**'

concurrency:
  group: ${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: true

env:
  # TODO(jrmadsen): replace LD_RUNPATH_FLAG, GPU_LIST, etc. with internal handling in cmake
  ROCM_PATH: "/opt/rocm"
  GPU_LIST: "gfx900 gfx906 gfx908 gfx90a gfx940 gfx941 gfx942 gfx1030 gfx1100 gfx1101 gfx1102"
  PATH: "/usr/bin:$PATH"

jobs:
  get_latest_mainline_build_number:
    # Run job on vega20 instead of mi200 as mi200 is unstable, as per ammar's instructions.
    # TODO: Change it back when re-enabling on mi200
    #runs-on: mi200
    runs-on: vega20
    outputs:
      LATEST_BUILD_NUMBER: ${{ steps.get_build_number.outputs.LATEST_BUILD_NUMBER }}
    steps:
      - id: get_build_number
        run: echo "LATEST_BUILD_NUMBER=$(wget -qO- 'http://rocm-ci.amd.com/job/compute-rocm-dkms-no-npi-hipclang/lastSuccessfulBuild/buildNumber')" >> $GITHUB_OUTPUT

  # Changed job name from mi200-ubuntu to vega20-ubuntu
  # TODO: Change it back when re-enabling on mi200
  vega20-ubuntu:
    # See: https://docs.github.com/en/free-pro-team@latest/actions/learn-github-actions/managing-complex-workflows#using-a-build-matrix
    strategy:
      fail-fast: true
      max-parallel: 4
      matrix:
        include:
          # Run job on vega20 instead of mi200 as mi200 is unstable, as per ammar's instructions.
          # TODO: Change it back when re-enabling on mi200
          # - os: 'ubuntu-22.04'
          #   runner: 'renderD131'
          #   device: '/renderD131'
          #   build-type: 'RelWithDebInfo'
          #   ci-flags: '--linter clang-tidy'
          #   name-tag: ''
          - os: 'ubuntu-22.04'
            runner: 'vega20'
            build-type: 'RelWithDebInfo'
            ci-flags: '--linter clang-tidy'
            name-tag: ''
    runs-on: ${{ matrix.runner }}
    # define this for containers
    env:
      GIT_DISCOVERY_ACROSS_FILESYSTEM: 1
    # TODO: Uncomment this when re-enabling tests on the mi200 as it contains --memory and --cpus flag for the mi200. Remove these 2 options when running on vega20.
    # vega20 machine only has 24 cpus available.
    # container:
    #   image: compute-artifactory.amd.com:5000/rocm-plus-docker/compute-rocm-dkms-no-npi-hipclang:${{ needs.get_latest_mainline_build_number.outputs.LATEST_BUILD_NUMBER }}-${{ matrix.os }}-stg1
    #   options: --memory=128g --cpus=32 --ipc=host --device=/dev/kfd --group-add video --cap-add=SYS_PTRACE --cap-add CAP_SYS_ADMIN --security-opt seccomp=unconfined
    container:
      image: compute-artifactory.amd.com:5000/rocm-plus-docker/compute-rocm-dkms-no-npi-hipclang:${{ needs.get_latest_mainline_build_number.outputs.LATEST_BUILD_NUMBER }}-${{ matrix.os }}-stg1
      options: --ipc=host --device=/dev/kfd --device=/dev/dri --group-add video --cap-add=SYS_PTRACE --cap-add CAP_SYS_ADMIN --security-opt seccomp=unconfined
    needs: get_latest_mainline_build_number
    steps:
      - uses: actions/checkout@v4
        with:
          submodules: true
      - name: Install requirements
        shell: bash
        run: |
          git config --global --add safe.directory '*'
          apt-get update
          apt-get install -y cmake clang-tidy-11 g++-11 g++-12 libgtest-dev python3-pip
          update-alternatives --install /usr/bin/clang-tidy clang-tidy /usr/bin/clang-tidy-11 10
          update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-11 10 --slave /usr/bin/g++ g++ /usr/bin/g++-11
          update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-12 20 --slave /usr/bin/g++ g++ /usr/bin/g++-12
          python3 -m pip install -r requirements.txt
          python3 -m pip install pytest
          python3 -m pip install 'cmake>=3.22.0'
      - name: List Files
        shell: bash
        run: |
          which-realpath() { echo -e "\n$1 resolves to $(realpath $(which $1))"; echo "$($(which $1) --version &> /dev/stdout | head -n 1)"; }
          for i in python python3 git cmake ctest; do which-realpath $i; done
          ls -la
      - name: Configure, Build, and Test
        timeout-minutes: 30
        shell: bash
        # Replaced 'mi200' with '${{ matrix.runner }}' when disabling jobs on mi200 and running it on vega20.
        # TODO: Change it back when re-enabling on mi200
        run:
          python3 ./source/scripts/run-ci.py -B build
            --name ${{ github.repository }}-${{ github.ref_name }}-${{ matrix.runner }}-${{ matrix.os }}${{ matrix.name-tag }}
            --build-jobs 8
            --site ${{ matrix.runner }}
            --gpu-targets ${{ env.GPU_LIST }}
            ${{ matrix.ci-flags }}
            --
            -DCMAKE_BUILD_TYPE=${{ matrix.build-type }}
            -DCMAKE_INSTALL_PREFIX=/opt/rocprofiler/v2
            -DCPACK_GENERATOR='DEB;RPM;TGZ'
            -DCPACK_PACKAGING_INSTALL_PREFIX="$(realpath /opt/rocm)"
            -DPython3_EXECUTABLE=$(which python3)
      - name: Install
        timeout-minutes: 10
        run:
          cmake --build build --target install --parallel 8
      - name: Build Packaging
        timeout-minutes: 10
        run:
          cmake --build build --target package --parallel 8
      - name: Test Install Build
        timeout-minutes: 10
        shell: bash
        run: |
          CMAKE_PREFIX_PATH=/opt/rocprofiler/v2 cmake -B build-samples samples
          CMAKE_PREFIX_PATH=/opt/rocprofiler/v2 cmake -B build-tests tests
          export LD_LIBRARY_PATH=/opt/rocprofiler/v2/lib:${LD_LIBRARY_PATH}
          cmake --build build-samples --target all --parallel 8
          cmake --build build-tests --target all --parallel 8
          ctest --test-dir build-samples --output-on-failure
          ctest --test-dir build-tests --output-on-failure
      - name: Install Packages
        timeout-minutes: 5
        shell: bash
        run: |
          export PATH=${PATH}:/usr/local/sbin:/usr/sbin:/sbin
          ls -la
          ls -la ./build
          for i in $(ls -S ./build/rocprofiler-sdk*.deb); do dpkg -i ${i}; done;
      - name: Test Installed Packages
        timeout-minutes: 20
        shell: bash
        run: |
          CMAKE_PREFIX_PATH=/opt/rocm cmake -B build-samples-deb /opt/rocm/share/rocprofiler-sdk/samples
          CMAKE_PREFIX_PATH=/opt/rocm cmake -B build-tests-deb /opt/rocm/share/rocprofiler-sdk/tests
          cmake --build build-samples-deb --target all --parallel 8
          cmake --build build-tests-deb --target all --parallel 8
          ctest --test-dir build-samples-deb --output-on-failure
          ctest --test-dir build-tests-deb --output-on-failure
      - name: Archive production artifacts
        uses: actions/upload-artifact@v4
        with:
          name: installers
          path: |
            ${{github.workspace}}/build/*.deb
            ${{github.workspace}}/build/*.rpm
            ${{github.workspace}}/build/*.tgz

  code-coverage:
    strategy:
      fail-fast: true
      max-parallel: 4
      matrix:
        # TODO: Change it back when re-enabling on mi200
        include:
          - os: 'ubuntu-22.04'
            runner: 'vega20'
            build-type: 'Release'
        # include:
        #   - os: 'ubuntu-22.04'
        #     runner: 'renderD131'
        #     device: '/renderD131'
        #     build-type: 'Release'
    runs-on: ${{ matrix.runner }}
    # define this for containers
    env:
      GIT_DISCOVERY_ACROSS_FILESYSTEM: 1
    # TODO: Uncomment this when re-enabling tests on the mi200 as it contains --memory and --cpus flag for the mi200. Remove these 2 options when running on vega20.
    # vega20 machine only has 24 cpus available.
    container:
      image: compute-artifactory.amd.com:5000/rocm-plus-docker/compute-rocm-dkms-no-npi-hipclang:${{ needs.get_latest_mainline_build_number.outputs.LATEST_BUILD_NUMBER }}-${{ matrix.os }}-stg1
      options: --ipc=host --device=/dev/kfd --device=/dev/dri --group-add video --cap-add=SYS_PTRACE --cap-add CAP_SYS_ADMIN --security-opt seccomp=unconfined
    # container:
    #   image: compute-artifactory.amd.com:5000/rocm-plus-docker/compute-rocm-dkms-no-npi-hipclang:${{ needs.get_latest_mainline_build_number.outputs.LATEST_BUILD_NUMBER }}-${{ matrix.os }}-stg1
    #   options: --memory=128g --cpus=32 --ipc=host --device=/dev/kfd --device=/dev/dri${{ matrix.device }} --group-add video --cap-add=SYS_PTRACE --cap-add CAP_SYS_ADMIN --security-opt seccomp=unconfined
    needs: get_latest_mainline_build_number
    steps:
      - name: Patch Git
        timeout-minutes: 25
        run: |
          apt-get update
          apt-get install -y software-properties-common
          add-apt-repository -y ppa:git-core/ppa
          apt-get update
          apt-get install -y git
      - uses: actions/checkout@v4
        with:
          submodules: true
      - name: Load Existing XML Code Coverage
        if: github.event_name == 'pull_request'
        id: load-coverage
        uses: actions/cache@v3
        with:
          key: ${{ github.event.pull_request.base.sha }}-codecov
          path: .codecov/**
      - name: Copy Existing XML Code Coverage
        if: github.event_name == 'pull_request'
        shell: bash
        run: |
          if [ -d .codecov ]; then cp -r .codecov .codecov.ref; fi
      - name: Configure Env
        shell: bash
        run: |
          echo "${PATH}:/usr/local/bin:${HOME}/.local/bin" >> $GITHUB_PATH
          echo "LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:/usr/local/lib:${HOME}/.local/lib" >> $GITHUB_ENV
      - name: List Files
        shell: bash
        run: |
          echo "PATH: ${PATH}"
          echo "LD_LIBRARY_PATH: ${LD_LIBRARY_PATH}"
          which-realpath() { echo -e "\n$1 resolves to $(realpath $(which $1))"; echo "$($(which $1) --version &> /dev/stdout | head -n 1)"; }
          for i in python python3 git cmake ctest; do which-realpath $i; done
          ls -la
      - name: Install requirements
        shell: bash
        run: |
          git config --global --add safe.directory '*'
          apt-get install -y cmake libgtest-dev python3-pip gcovr wkhtmltopdf xvfb xfonts-base xfonts-75dpi xfonts-100dpi xfonts-utils xfonts-encodings libfontconfig
          python3 -m pip install -r requirements.txt
          python3 -m pip install pytest pycobertura
      - name: Configure, Build, and Test (Total Code Coverage)
        timeout-minutes: 30
        shell: bash
        # Replaced 'mi200' with '${{ matrix.runner }}' when disabling jobs on mi200 and running it on vega20.
        # TODO: Change it back when re-enabling on mi200
        run:
          python3 ./source/scripts/run-ci.py -B build
            --name ${{ github.repository }}-${{ github.ref_name }}-${{ matrix.runner }}-${{ matrix.os }}-codecov
            --build-jobs 8
            --site ${{ matrix.runner }}
            --gpu-targets ${{ env.GPU_LIST }}
            --coverage all
            --
            -DCMAKE_BUILD_TYPE=${{ matrix.build-type }}
            -DPython3_EXECUTABLE=$(which python3)
      - name: Configure, Build, and Test (Tests Code Coverage)
        timeout-minutes: 30
        shell: bash
        run:
          find build -type f | egrep '\.gcda$' | xargs rm &&
          python3 ./source/scripts/run-ci.py -B build
            --name ${{ github.repository }}-${{ github.ref_name }}-${{ matrix.runner }}-${{ matrix.os }}-codecov-tests
            --build-jobs 8
            --site ${{ matrix.runner }}
            --gpu-targets ${{ env.GPU_LIST }}
            --coverage tests
            --
            -DCMAKE_BUILD_TYPE=${{ matrix.build-type }}
            -DPython3_EXECUTABLE=$(which python3)
      - name: Configure, Build, and Test (Samples Code Coverage)
        timeout-minutes: 30
        shell: bash
        run:
          find build -type f | egrep '\.gcda$' | xargs rm &&
          python3 ./source/scripts/run-ci.py -B build
            --name ${{ github.repository }}-${{ github.ref_name }}-${{ matrix.runner }}-${{ matrix.os }}-codecov-samples
            --build-jobs 8
            --site ${{ matrix.runner }}
            --gpu-targets ${{ env.GPU_LIST }}
            --coverage samples
            --
            -DCMAKE_BUILD_TYPE=${{ matrix.build-type }}
            -DPython3_EXECUTABLE=$(which python3)
      - name: Save XML Code Coverage
        id: save-coverage
        uses: actions/cache/save@v3
        with:
          key: ${{ github.sha }}-codecov
          path: |
            .codecov/*.xml
      - name: Generate Code Coverage Comment
        if: github.event_name == 'pull_request'
        timeout-minutes: 5
        shell: bash
        run: |
          echo "PWD: ${PWD}"
          ls -la
          for i in "all" "tests" "samples"; do
              wkhtmltoimage --enable-local-file-access --quality 85 .codecov/${i}.html .codecov/${i}.png
          done
          ls -la .codecov
          which -a git
          git --version
          ./source/scripts/upload-image-to-github.py --bot --token ${{ github.token }} --files .codecov/{all,tests,samples}.png --output-dir .codecov --name pr-${{ github.event.pull_request.number }}
          echo -e "\n${PWD}:"
          ls -la .
          echo -e "\n.codecov:"
          ls -la .codecov
          get-md-contents() { cat .codecov/${1}.png.md .codecov/${1}.md; }
          cat << EOF > .codecov/report.md
          # Code Coverage Report
          ## Tests Only
          $(get-md-contents tests)
          ## Samples Only
          $(get-md-contents samples)
          ## Tests + Samples
          $(get-md-contents all)
          EOF
      - name: Write Code Coverage Comment
        if: github.event_name == 'pull_request'
        timeout-minutes: 5
        uses: thollander/actions-comment-pull-request@v2
        with:
          comment_tag: codecov-report
          filePath: .codecov/report.md
      - name: Archive Code Coverage Data
        uses: actions/upload-artifact@v4
        with:
          name: code-coverage-details
          path: |
            ${{github.workspace}}/.codecov/*
      - name: Verify Test Labels
        timeout-minutes: 5
        shell: bash
        run: |
          pushd build
          #
          # if following fails, there is a test that does not have
          # a label identifying it as sample or test (unit or integration).
          # Recommended labels are:
          # - samples
          # - unittests
          # - integration-tests
          #
          ctest -N -LE 'samples|tests' -O ctest.mislabeled.log
          grep 'Total Tests: 0' ctest.mislabeled.log
          #
          # if following fails, then there is overlap between the labels.
          # A test cannot both be a sample and (unit/integration) test.
          #
          ctest -N -O ctest.all.log
          ctest -N -O ctest.samples.log -L samples
          ctest -N -O ctest.tests.log -L tests
          NUM_ALL=$(grep 'Total Tests:' ctest.all.log | awk '{print $NF}')
          NUM_SAMPLE=$(grep 'Total Tests:' ctest.samples.log | awk '{print $NF}')
          NUM_TEST=$(grep 'Total Tests:' ctest.tests.log | awk '{print $NF}')
          NUM_SUM=$((${NUM_SAMPLE} + ${NUM_TEST}))
          echo "Total tests: ${NUM_ALL}"
          echo "Total labeled tests: ${NUM_SUM}"
          if [ ${NUM_ALL} != ${NUM_SUM} ]; then
              echo "Test label overlap"
              exit 1
          fi
          popd

  sanitizers:
    strategy:
      fail-fast: false
      matrix:
        include:
          # Run job on vega20 instead of mi200 as mi200 is unstable, as per ammar's instructions.
          # TODO: Change it back when re-enabling on mi200
          - os: 'ubuntu-22.04'
            runner: 'vega20'
            build-type: 'RelWithDebInfo'
            ci-flags: ''
            sanitizer: 'AddressSanitizer'
          - os: 'ubuntu-22.04'
            runner: 'vega20'
            build-type: 'RelWithDebInfo'
            ci-flags: ''
            sanitizer: 'ThreadSanitizer'
          - os: 'ubuntu-22.04'
            runner: 'vega20'
            build-type: 'RelWithDebInfo'
            ci-flags: ''
            sanitizer: 'LeakSanitizer'
          # - os: 'ubuntu-22.04'
          #   runner: 'renderD131'
          #   device: '/renderD131'
          #   build-type: 'RelWithDebInfo'
          #   ci-flags: ''
          #   sanitizer: 'AddressSanitizer'
          # - os: 'ubuntu-22.04'
          #   runner: 'renderD131'
          #   device: '/renderD131'
          #   build-type: 'RelWithDebInfo'
          #   ci-flags: ''
          #   sanitizer: 'ThreadSanitizer'
          # - os: 'ubuntu-22.04'
          #   runner: 'renderD131'
          #   device: '/renderD131'
          #   build-type: 'RelWithDebInfo'
          #   ci-flags: ''
          #   sanitizer: 'LeakSanitizer'
    runs-on: ${{ matrix.runner }}
    # define this for containers
    env:
      GIT_DISCOVERY_ACROSS_FILESYSTEM: 1
    container:
      image: compute-artifactory.amd.com:5000/rocm-plus-docker/compute-rocm-dkms-no-npi-hipclang:${{ needs.get_latest_mainline_build_number.outputs.LATEST_BUILD_NUMBER }}-${{ matrix.os }}-stg1
      options: --privileged --ipc=host --device=/dev/kfd --device=/dev/dri --group-add video --cap-add=SYS_PTRACE --cap-add CAP_SYS_ADMIN --security-opt seccomp=unconfined
    needs: get_latest_mainline_build_number
    steps:
      - uses: actions/checkout@v4
        with:
          submodules: true
      - name: List Files
        shell: bash
        run: |
          which-realpath() { echo -e "\n$1 resolves to $(realpath $(which $1))"; echo "$($(which $1) --version &> /dev/stdout | head -n 1)"; }
          for i in python python3 git cmake ctest; do which-realpath $i; done
          ls -la
      - name: Install requirements
        shell: bash
        run: |
          git config --global --add safe.directory '*'
          apt-get install -y cmake libgtest-dev python3-pip libasan8 libtsan2
          python3 -m pip install -r requirements.txt
          python3 -m pip install pytest
      - name: Configure, Build, and Test
        timeout-minutes: 45
        shell: bash
        run:
          python3 ./source/scripts/run-ci.py -B build
            --name ${{ github.repository }}-${{ github.ref_name }}-${{ matrix.runner }}-${{ matrix.os }}-${{ matrix.sanitizer }}
            --build-jobs 8
            --site ${{ matrix.runner }}
            --gpu-targets ${{ env.GPU_LIST }}
            --memcheck=${{ matrix.sanitizer }}
            ${{ matrix.ci-flags }}
            --
            -DCMAKE_BUILD_TYPE=${{ matrix.build-type }}
            -DCMAKE_INSTALL_PREFIX="${{ env.ROCM_PATH }}"
            -DPython3_EXECUTABLE=$(which python3)