rocm-systems/.github/workflows/rocprofiler-sdk-continuous_integration.yml
Benjamin Welton 1517a398bf [rocprofiler-sdk] Buffer finalization fixes and HSA ABI 0x09 support (#2318)
* [rocprofiler-sdk] Fix buffer flush ordering and sanitizer CI improvements

Buffer Pool Design
------------------
Replace the fixed array-based double buffer with a dynamic pool design to
fix race conditions that caused "internal correlation id was retired
prematurely" errors.

The original design had a race where flush callbacks could be delivered
out-of-order: when buffer 0 fills and begins flushing, writes go to
buffer 1. If buffer 1 fills before buffer 0's flush completes, the
buffer index wraps back to 0 (which may still be flushing). Independent
flush tasks submitted to the thread pool can complete out of order.

The new pool design:
- Uses a std::deque of buffer instances that grows as needed
- Allocates buffers from the pool when the current buffer needs to flush
- Serializes flushes with a mutex to ensure FIFO callback ordering
- Returns buffers to the pool after flush completion
- Eliminates the race between buffer selection and write operations
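
The pool idea listed above can be sketched as follows. This is a minimal Python illustration and not the SDK's C++ implementation: the class name `BufferPool`, its `emplace` method, and the `on_flush` callback are hypothetical names chosen for the example. The point it shows is that taking the flush lock while the write lock is still held makes callbacks run in the same order in which buffers filled up.

```python
import threading
from collections import deque


class BufferPool:
    """Toy record buffer: grows a pool of buffers and flushes them in FIFO order."""

    def __init__(self, capacity, on_flush):
        self._capacity = capacity
        self._on_flush = on_flush              # must not call emplace() itself
        self._free = deque()                   # idle buffers, reused before allocating
        self._current = []
        self._write_lock = threading.Lock()    # protects _current
        self._flush_lock = threading.Lock()    # serializes flush callbacks

    def _acquire(self):
        # Only called with _write_lock held, so this is the only consumer of _free.
        return self._free.popleft() if self._free else []

    def emplace(self, record):
        with self._write_lock:
            self._current.append(record)
            if len(self._current) < self._capacity:
                return
            # Swap in a fresh buffer, then queue for flushing before any other
            # writer can fill and swap the next one: flush order == fill order.
            full, self._current = self._current, self._acquire()
            self._flush_lock.acquire()
        try:
            self._on_flush(list(full))
        finally:
            full.clear()
            self._free.append(full)            # return the buffer to the pool
            self._flush_lock.release()
```

With a capacity of 2, writing the records 1 through 5 delivers `[1, 2]` and then `[3, 4]` to the callback, and the record 5 stays in the current buffer. The trade-off of this sketch is that writers stall while an earlier flush is still running.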

New Unit Tests
--------------
- buffer_correlation_ordering.cpp: Tests that API records are always
  delivered before their corresponding retirement records
- buffer_ordering_stress.cpp: Stress tests buffer flush ordering under
  high contention with multiple threads rapidly filling buffers

HSA Tool Hooks
--------------
Added hsa_tool_hooks.cpp/hpp to register an HSA OnUnload callback that
waits for pending flush tasks before tool finalization, preventing
"retired prematurely" errors during HSA shutdown.

Sanitizer Improvements
----------------------
- LSAN: Set fast_unwind_on_malloc=1 to prevent deadlock in libgcc unwinder
- LSAN: Added suppressions for external tools (liblzma, liblsan, seq, strdup)
- TSAN: Added suppression for false positive on C++11 thread-safe static
  initialization in create_write_functor
- ASAN/UBSAN: Added patterns for known issues in HSA runtime, HIP, perfetto
- Disabled attachment tests for sanitizers due to library preloading issues

Other Fixes
-----------
- Thread-trace agent test: Use heap-allocated callback state
- Correlation ID: Refactored reference counting and finalization ordering

* [rocprofiler-sdk] Revert buffer pool design changes

Revert buffer.cpp and buffer.hpp to the original double-buffer
design from develop branch. The pool-based redesign introduced
concerns about:
- Signal safety (mutex vs atomic_flag)
- API changes (flush() return type)
- Complexity of the new design

This revert removes:
- Dynamic buffer pool with std::deque
- std::mutex/condition_variable synchronization
- buffer_correlation_ordering.cpp test
- buffer_ordering_stress.cpp test

The underlying buffer flush ordering issue will need to be
addressed with a different approach that preserves the original
API and synchronization characteristics.

* [rocprofiler-sdk] Consistent fini_status checks to prevent correlation ID creation during finalization

- Revert TOCTOU CAS loop change in sub_ref_count() - not needed with consistent checks
- Add fini_status check in correlation_tracing_service::construct() with ROCP_CI_LOG warning
- Add nullptr checks at all construct() call sites (queue.cpp, async_copy.cpp, memory_allocation.cpp)
- Change all 'get_fini_status() > 0' to '!= 0' for consistent behavior:
  - hsa/queue.cpp (lines 105, 210)
  - hsa/async_copy.cpp (line 344)
  - hsa/hsa_barrier.cpp (line 43)
  - buffer.cpp (lines 107, 138, 185)

This ensures no correlation IDs are created once finalization starts (fini_status != 0),
preventing races between finalization and ongoing tracing operations.
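
The guard described above can be sketched as follows. This is a minimal Python illustration of the pattern, not the SDK's C++ code: `CorrelationService`, `begin_finalization`, and `traced_call` are hypothetical names, and `None` stands in for the `nullptr` that the call sites check for.

```python
import itertools
import threading


class CorrelationService:
    """Toy correlation ID factory guarded by a finalization flag."""

    def __init__(self):
        self._fini_status = 0                  # 0 = running, non-zero = finalizing
        self._next_id = itertools.count(1)
        self._lock = threading.Lock()

    def begin_finalization(self):
        with self._lock:
            self._fini_status = 1

    def construct(self):
        """Return a new correlation ID, or None once finalization has started."""
        with self._lock:
            if self._fini_status != 0:
                return None
            return next(self._next_id)


def traced_call(service, log):
    """Caller-side check: skip tracing when no correlation ID is available."""
    corr_id = service.construct()
    if corr_id is None:
        return False
    log.append(corr_id)
    return True
```

Checking the flag and creating the ID under the same lock is what closes the window between "finalization started" and "new ID handed out" in this sketch.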

* [rocprofiler-sdk] Replace arrival-order checks with timestamp-based temporal validation

Buffer records are not guaranteed to arrive in any specific order. Tests and
samples should use timestamps for temporal ordering validation instead.

Changes:
- samples/external_correlation_id_request: Replace 'retired prematurely' arrival
  order check with timestamp-based validation that retirement timestamp >=
  max(end_timestamps) for records with the same correlation ID
- tests/external_correlation.cpp: Remove EXPECT_GT(corr_id, last_corr_id) check
- tests/registration.cpp: Remove EXPECT_GT(corr_id, last_corr_id) check
- tests/roctx.cpp: Remove EXPECT_GT(corr_id, last_corr_id) check

Correlation IDs are not guaranteed to be monotonically increasing when records
are sorted by timestamp. Temporal ordering should be validated using the
timestamp fields in each record.
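
The timestamp-based check can be sketched as follows. This is a minimal Python illustration, not the sample's actual code: the record layout (dicts with `corr_id`, `end_timestamp`, and `timestamp` keys) and the function name `find_premature_retirements` are assumptions made for the example.

```python
from collections import defaultdict


def find_premature_retirements(api_records, retirement_records):
    """Return correlation IDs retired before their last API record ended.

    The arrival order of the records is deliberately ignored; only the
    timestamps are compared.
    """
    latest_end = defaultdict(int)
    for rec in api_records:
        cid = rec["corr_id"]
        latest_end[cid] = max(latest_end[cid], rec["end_timestamp"])

    # A retirement is valid when its timestamp >= max(end_timestamps)
    # over all records sharing the same correlation ID.
    return sorted(
        {
            ret["corr_id"]
            for ret in retirement_records
            if ret["corr_id"] in latest_end
            and ret["timestamp"] < latest_end[ret["corr_id"]]
        }
    )
```

An empty result means every retirement happened at or after the latest end timestamp of its correlation ID, regardless of the order in which the buffers delivered the records.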

* [rocprofiler-sdk] Revert external/CMakeLists.txt SYSTEM keyword removal

Restore the SYSTEM keyword to target_include_directories for
rocprofiler-sdk-fmt to match develop branch.

* [rccl] Remove orphaned rocSHMEM gitlink

Remove orphaned submodule reference that was introduced during a merge
but never had a corresponding .gitmodules entry, causing CI failures
with "fatal: no submodule mapping found in .gitmodules".

* [rocprofiler-sdk] Add HSA ABI version 0x09 support

Add ABI checks for HSA_AMD_EXT_API_TABLE_STEP_VERSION 0x09 which
introduces hsa_amd_counted_queue_acquire and hsa_amd_counted_queue_release
functions (added in rocr-runtime SWDEV-561708).

* [rocprofiler-sdk] Handle finalized status gracefully in buffer flush operations

This commit consolidates fixes for handling the finalization status during
buffer flush operations across the SDK.

Changes:
- Tool and samples: Handle ROCPROFILER_STATUS_ERROR_FINALIZED gracefully
  when flushing buffers, as this indicates buffers were already flushed
  during finalization (not an error condition)
- HSA handlers (queue.cpp, async_copy.cpp, hsa_barrier.cpp): Use > 0 check
  for fini_status to allow operations during finalization process
- buffer.cpp: Revert fini_status checks to use > 0 for consistency
- correlation_id.cpp: Add fini_status > 0 check with ROCP_TRACE logging
  to prevent correlation ID creation after finalization starts

Files modified:
- source/lib/rocprofiler-sdk-tool/tool.cpp
- tests/tools/json-tool.cpp
- source/lib/rocprofiler-sdk/tests/registration.cpp
- source/lib/rocprofiler-sdk/tests/roctx.cpp
- samples/api_buffered_tracing/client.cpp
- samples/counter_collection/buffered_client.cpp
- samples/counter_collection/device_counting_async_client.cpp
- samples/external_correlation_id_request/client.cpp
- samples/pc_sampling/client.cpp
- source/lib/rocprofiler-sdk/buffer.cpp
- source/lib/rocprofiler-sdk/context/correlation_id.cpp
- source/lib/rocprofiler-sdk/hsa/queue.cpp
- source/lib/rocprofiler-sdk/hsa/async_copy.cpp
- source/lib/rocprofiler-sdk/hsa/hsa_barrier.cpp

* [rocprofiler-sdk] Remove hsa_tool_hooks and simplify buffer flush handling

Remove the hsa_tool_hooks infrastructure and simplify buffer flush calls
in samples and tools. The ERROR_FINALIZED handling was overly complex
and the hsa_tool_hooks OnUnload synchronization is no longer needed.

Changes:
- Remove hsa_tool_hooks.cpp/hpp and related registration.cpp code
- Simplify buffer flush calls in samples to use direct ROCPROFILER_CALL
- Simplify buffer flush in tool.cpp and json-tool.cpp
- Remove ERROR_FINALIZED special handling from test files

Co-Authored-By: Claude <noreply@anthropic.com>

* [rocprofiler-sdk] Fix output_stream move semantics to null source pointers

The default move constructor and move assignment operator for
output_stream did not null out the source's pointers after the move.
This caused double-close when the moved-from temporary was destroyed,
leading to use-after-free crashes (SIGSEGV in std::ostream::sentry).

Co-Authored-By: Claude <noreply@anthropic.com>

* [rocprofiler-sdk] Improve Perfetto trace writer and sanitizer configuration

- generatePerfetto.cpp: Move output_stream into shared_state to prevent
  use-after-free race conditions during Perfetto callback execution
- run-ci.py: Simplify and consolidate sanitizer environment variable
  configuration for better maintainability

Co-Authored-By: Claude <noreply@anthropic.com>

* [rocprofiler-sdk] Revert run-ci.py changes that broke sanitizer suppressions

The previous changes removed MEMCHECK_SANITIZER_OPTIONS which is required
for CTest to properly pass suppression files to the sanitizers during
memcheck runs.

Co-Authored-By: Claude <noreply@anthropic.com>

* Revert "[rccl] Remove orphaned rocSHMEM gitlink"

This reverts commit 1ad21003941355658fff8114fa27768f11a948f7.

* [rocprofiler-sdk] Revert registration.cpp changes

Revert changes to registration.cpp to match develop branch.

Co-Authored-By: Claude <noreply@anthropic.com>

* [rocprofiler-sdk] Remove suppression file content printing from run-ci.py

Co-Authored-By: Claude <noreply@anthropic.com>

* Fix output_stream move ctor/assignment operator

* Fix erroneous revert of registration.cpp

* Fix handling of fini status in correlation ID construction

* [rocprofiler-sdk] Fix OMPT segfault during finalization

Add nullptr checks in OMPT tracing code to handle the case where
correlation_tracing_service::construct() returns nullptr during
finalization. This fixes segfaults in openmp-target-sample and
tests.integration.execute.openmp-tools.

The correlation ID construction now returns nullptr when fini_status > 0,
but the OMPT callbacks were not checking for this, causing crashes when
dereferencing the null pointer during OpenMP runtime shutdown.

Changes:
- event_common(): Return nullptr early if correlation ID is null
- event(): Check for nullptr before calling sub_ref_count()
- ompt_task_create_callback(): Return early if correlation ID is null
- ompt_task_schedule_callback(): Return early if correlation ID is null

* [rocprofiler-sdk] Fix HSA API tracing segfault during finalization

Add nullptr check in hsa_api_impl::functor after correlation ID
construction. During finalization, correlation_service::construct()
returns nullptr, and without this check the code would dereference
the null pointer when accessing corr_id->internal.

This fixes the SEGV at address 0x000000000008 (null + 8 byte offset)
that occurs when HSA async event threads call hsa_signal_destroy
during runtime shutdown after finalization has started.

---------

Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
2026-01-27 13:27:54 -05:00

874 lines
39 KiB
YAML

name: rocprofiler-sdk Continuous Integration
on:
  workflow_dispatch:
  schedule:
    - cron: '0 7 * * *'
  push:
    branches: [ develop ]
    paths:
      - 'projects/rocprofiler-sdk/**'
      - '.github/workflows/rocprofiler-sdk-continuous_integration.yml'
      - '!**/*.md'
      - '!**/*.rtf'
      - '!**/*.rst'
      - '!**/.markdownlint-ci2.yaml'
      - '!**/.readthedocs.yaml'
      - '!**/.spellcheck.local.yaml'
      - '!**/.wordlist.txt'
      - '!projects/rocprofiler-sdk/CODEOWNERS'
      - '!projects/rocprofiler-sdk/source/docs/**'
  pull_request:
    paths:
      - 'projects/rocprofiler-sdk/**'
      - '.github/workflows/rocprofiler-sdk-continuous_integration.yml'
      - '!**/*.md'
      - '!**/*.rtf'
      - '!**/*.rst'
      - '!**/.markdownlint-ci2.yaml'
      - '!**/.readthedocs.yaml'
      - '!**/.spellcheck.local.yaml'
      - '!**/.wordlist.txt'
      - '!projects/rocprofiler-sdk/CODEOWNERS'
      - '!projects/rocprofiler-sdk/source/docs/**'
concurrency:
  group: ${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: true
# Needed to push/pull cached Docker layers + GHCR images
permissions:
  contents: read
  packages: write
env:
  # TODO(jrmadsen): replace LD_RUNPATH_FLAG, GPU_TARGETS, etc. with internal handling in cmake
  ROCM_PATH: "/opt/rocm"
  ROCM_VERSION: "7.1.1"
  PYTHON_VENV_PATH: "rocprofiler-sdk"
  PYTHON_VENV_ACTIVATE: "rocprofiler-sdk/bin/activate"
  GPU_TARGETS: "gfx906 gfx908 gfx90a gfx942 gfx950 gfx1030 gfx1100 gfx1101 gfx1102 gfx1201"
  PATH: "/usr/bin:$PATH"
  ## No tests should be excluded here except for extreme emergencies; tests should only be disabled in CMake
  ## A task should be assigned directly to fix the issues
  ## Scratch memory tests need to be fixed for ROCm 7.0 release
  navi3_EXCLUDE_TESTS_REGEX: ""
  vega20_EXCLUDE_TESTS_REGEX: ""
  mi200_EXCLUDE_TESTS_REGEX: ""
  mi300_EXCLUDE_TESTS_REGEX: ""
  mi300a_EXCLUDE_TESTS_REGEX: ""
  mi325_EXCLUDE_TESTS_REGEX: ""
  mi3xx_EXCLUDE_TESTS_REGEX: ""
  navi4_EXCLUDE_TESTS_REGEX: ""
  navi3_EXCLUDE_LABEL_REGEX: ""
  vega20_EXCLUDE_LABEL_REGEX: ""
  mi200_EXCLUDE_LABEL_REGEX: ""
  mi300_EXCLUDE_LABEL_REGEX: ""
  mi300a_EXCLUDE_LABEL_REGEX: ""
  mi325_EXCLUDE_LABEL_REGEX: ""
  mi3xx_EXCLUDE_LABEL_REGEX: ""
  navi4_EXCLUDE_LABEL_REGEX: ""
  GLOBAL_CMAKE_OPTIONS: ""
  ENABLE_ROCR_BUILD: "true"
  ENABLE_HIP_CLR_BUILD: "true"
  CI_MODE: ${{ github.event_name == 'schedule' && 'Nightly' || 'Continuous' }}
jobs:
  # -----------------------------------------------------------------------------
  # Ubuntu / DEB job(s)
  # -----------------------------------------------------------------------------
  core-deb:
    name: Core • ${{ matrix.system.gpu }} • ${{ matrix.system.os }}
    strategy:
      fail-fast: false
      matrix:
        system:
          # TEMPORARILY DISABLED: navi3/navi4 jobs pending CI stabilization
          # - { gpu: 'navi4', runner: 'rocprofiler-navi4-dind', os: 'ubuntu-22.04', build-type: 'RelWithDebInfo', ci-flags: '--linter clang-tidy', gpu-target: 'gfx120X' }
          # - { gpu: 'navi3', runner: 'rocprofiler-navi3-dind', os: 'ubuntu-22.04', build-type: 'RelWithDebInfo', ci-flags: '--linter clang-tidy', gpu-target: 'gfx110X' }
          - { gpu: 'mi325', runner: 'linux-mi325-1gpu-ossci-rocm-frac', os: 'ubuntu-22.04', build-type: 'RelWithDebInfo', ci-flags: '--linter clang-tidy', gpu-target: 'gfx94X' }
    runs-on: ${{ matrix.system.runner }}
    container:
      image: docker.io/rocm/rocprofiler-private:${{ matrix.system.os }}-${{ matrix.system.gpu-target }}-latest
      credentials:
        username: ${{ secrets.ROCPROFILER_AZURE_CI_USER }}
        password: ${{ secrets.ROCPROFILER_AZURE_CI_PASS }}
      env:
        DEBIAN_FRONTEND: noninteractive
      options: --privileged
    env:
      GIT_DISCOVERY_ACROSS_FILESYSTEM: 1
      CORE_EXT_RUNNER: mi325
      GPU_RUNNER: ${{ matrix.system.gpu }}
    steps:
      - name: Install Latest Nightly ROCm
        shell: bash
        working-directory: /tmp
        run: |
          tar -xf ${{ env.ROCM_PATH }}-${{ matrix.system.gpu-target }}.tar.gz -C ${{ env.ROCM_PATH }}-${{ env.ROCM_VERSION }}
          ln -s ${{ env.ROCM_PATH }}-${{ env.ROCM_VERSION }} ${{ env.ROCM_PATH }}
          echo "ROCm installed to: ${{ env.ROCM_PATH }}"
      - name: Clone ROCProfiler SDK & AQLProfile & ROCProfiler Register & ROCR-Runtime
        uses: actions/checkout@v5
        with:
          sparse-checkout: |
            projects/rocprofiler-sdk
            projects/aqlprofile
            projects/rocprofiler-register
            projects/rocr-runtime
            projects/clr
            projects/hip
          submodules: false
          set-safe-directory: true
      - name: Compute submodule cache key
        id: submods
        shell: bash
        run: |
          git --version
          git config --global --add safe.directory '*'
          git submodule status --recursive | awk '{print $1,$2}' > .git-submodules-status
          echo "hash=$(sha256sum .git-submodules-status | cut -d' ' -f1)" >> "$GITHUB_OUTPUT"
          # collect submodule paths for cache 'path'
          git config --file .gitmodules --get-regexp path | awk '{print $2}' > .git-submodule-paths
          { echo "paths<<EOF"; cat .git-submodule-paths; echo "EOF"; } >> "$GITHUB_OUTPUT"
      - name: Restore submodule cache
        uses: actions/cache@v4
        with:
          path: |
            .git/modules
            ${{ steps.submods.outputs.paths }}
          key: submods-${{ runner.os }}-${{ steps.submods.outputs.hash }}
          restore-keys: |
            submods-${{ runner.os }}-
            submods-
      - name: Init/Update submodules
        run: git submodule update --init --recursive --jobs 16
      - name: Clone ROCDecode
        uses: actions/checkout@v5
        with:
          repository: 'ROCm/rocDecode'
          ref: 'release/rocm-rel-7.0'
          set-safe-directory: true
          path: 'rocDecode'
      - name: Clone ROCJPEG
        uses: actions/checkout@v5
        with:
          repository: 'ROCm/rocJPEG'
          ref: 'release/rocm-rel-7.0'
          set-safe-directory: true
          path: 'rocJPEG'
      - name: Install requirements
        timeout-minutes: 10
        shell: bash
        working-directory: projects/rocprofiler-sdk
        run: |
          git config --global --add safe.directory '*'
          apt-get update
          apt-get install -y g++-11 g++-12 cmake python3-pip libdw-dev libsqlite3-dev libdrm-dev file autoconf pkg-config rpm libzstd-dev
          update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-11 10 --slave /usr/bin/g++ g++ /usr/bin/g++-11 --slave /usr/bin/gcov gcov /usr/bin/gcov-11
          update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-12 20 --slave /usr/bin/g++ g++ /usr/bin/g++-12 --slave /usr/bin/gcov gcov /usr/bin/gcov-12
          python3 -m pip install -U --user -r requirements.txt
          rm -rf \
            ${{ env.ROCM_PATH }}/lib/*rocprofiler-sdk* \
            ${{ env.ROCM_PATH }}/lib/cmake/*rocprofiler-sdk* \
            ${{ env.ROCM_PATH }}/share/*rocprofiler-sdk* \
            ${{ env.ROCM_PATH }}/libexec/*rocprofiler-sdk* \
            ${{ env.ROCM_PATH }}*/lib/python*/site-packages/roctx \
            ${{ env.ROCM_PATH }}*/lib/python*/site-packages/rocpd
      - name: Setup ccache
        uses: hendrikmuhs/ccache-action@63069e3931dedbf3b63792097479563182fe70d1 # v1.2.18
        with:
          key: ccache-${{ matrix.system.os }}-${{ matrix.system.runner }}-${{ matrix.system.gpu }}
          max-size: 2G
          save: true
      - name: Install Missing ROCm Dependencies
        shell: bash
        run: |
          echo -e "Building & Installing ROCDecode..."
          cmake -B build-rocdecode \
            -DCMAKE_INSTALL_PREFIX=${{ env.ROCM_PATH }}-${{ env.ROCM_VERSION }} \
            -DCMAKE_PREFIX_PATH=${{ env.ROCM_PATH }} \
            -DCMAKE_CXX_COMPILER=${{ env.ROCM_PATH }}/bin/amdclang++ \
            -DCMAKE_C_COMPILER_LAUNCHER=/usr/bin/ccache \
            -DCMAKE_CXX_COMPILER_LAUNCHER=/usr/bin/ccache \
            ${GITHUB_WORKSPACE}/rocDecode
          cmake --build build-rocdecode --target all --parallel 16
          cmake --build build-rocdecode --target install
          echo -e "ROCDecode Installed Successfully!"
          echo -e "Building & Installing ROCJPEG..."
          cmake -B build-rocjpeg \
            -DCMAKE_INSTALL_PREFIX=${{ env.ROCM_PATH }}-${{ env.ROCM_VERSION }} \
            -DCMAKE_PREFIX_PATH=${{ env.ROCM_PATH }} \
            -DCMAKE_CXX_COMPILER=${{ env.ROCM_PATH }}/bin/amdclang++ \
            -DCMAKE_C_COMPILER_LAUNCHER=/usr/bin/ccache \
            -DCMAKE_CXX_COMPILER_LAUNCHER=/usr/bin/ccache \
            ${GITHUB_WORKSPACE}/rocJPEG
          cmake --build build-rocjpeg --target all --parallel 16
          cmake --build build-rocjpeg --target install
          echo -e "ROCJPEG Installed Successfully!"
      - name: Build and Install ROCProfiler-Register
        shell: bash
        working-directory: projects/rocprofiler-register
        run: |
          export LD_LIBRARY_PATH=${{ env.ROCM_PATH }}/lib:${{ env.ROCM_PATH }}/llvm/lib:$LD_LIBRARY_PATH
          export PATH=${{ env.ROCM_PATH }}/bin:${{ env.ROCM_PATH }}/llvm/bin:$PATH
          echo "Install ROCProfiler-Register"
          cmake -B build-rocprofiler-register \
            -DCMAKE_BUILD_TYPE=RelWithDebInfo \
            -DCMAKE_PREFIX_PATH=${{ env.ROCM_PATH }}-${{ env.ROCM_VERSION }} \
            -DCMAKE_INSTALL_PREFIX=${{ env.ROCM_PATH }}-${{ env.ROCM_VERSION }} \
            -DCMAKE_C_COMPILER_LAUNCHER=/usr/bin/ccache \
            -DCMAKE_CXX_COMPILER_LAUNCHER=/usr/bin/ccache \
            .
          cmake --build build-rocprofiler-register --target all --parallel 16
          cmake --build build-rocprofiler-register --target install
          echo "✅ ROCProfiler-Register Installation complete!"
      - name: Build and Install ROCR-Runtime
        if: ${{ env.ENABLE_ROCR_BUILD == 'true' }}
        shell: bash
        working-directory: projects/rocr-runtime
        run: |
          export LD_LIBRARY_PATH=${{ env.ROCM_PATH }}/lib:${{ env.ROCM_PATH }}/llvm/lib:$LD_LIBRARY_PATH
          export PATH=${{ env.ROCM_PATH }}/bin:${{ env.ROCM_PATH }}/llvm/bin:$PATH
          echo "Install ROCR-Runtime..."
          cmake -B build \
            -DCMAKE_BUILD_TYPE=RelWithDebInfo \
            -DCMAKE_PREFIX_PATH='${{ env.ROCM_PATH }}-${{ env.ROCM_VERSION }};${{ env.ROCM_PATH }}-${{ env.ROCM_VERSION }}/llvm' \
            -DCMAKE_INSTALL_PREFIX=${{ env.ROCM_PATH }}-${{ env.ROCM_VERSION }} \
            .
          cmake --build build --target all --parallel 16
          cmake --build build --target install
          echo "✅ ROCR-Runtime Installation complete!"
      - name: Build and Install HIP
        if: ${{ env.ENABLE_HIP_CLR_BUILD == 'true' }}
        shell: bash
        working-directory: projects
        run: |
          export HIP_DIR=$PWD/hip
          export CLR_DIR=$PWD/clr
          export LD_LIBRARY_PATH=${{ env.ROCM_PATH }}/lib:${{ env.ROCM_PATH }}/llvm/lib:$LD_LIBRARY_PATH
          export PATH=${{ env.ROCM_PATH }}/bin:${{ env.ROCM_PATH }}/llvm/bin:$PATH
          echo "Install HIP..."
          cd $CLR_DIR
          cmake \
            -DHIP_COMMON_DIR=$HIP_DIR \
            -DCMAKE_BUILD_TYPE=RelWithDebInfo \
            -DHIP_PLATFORM=amd \
            -DCMAKE_PREFIX_PATH='${{ env.ROCM_PATH }}-${{ env.ROCM_VERSION }};${{ env.ROCM_PATH }}-${{ env.ROCM_VERSION }}/llvm' \
            -DCMAKE_INSTALL_PREFIX=${{ env.ROCM_PATH }}-${{ env.ROCM_VERSION }} \
            -DHIP_LLVM_ROOT=${{ env.ROCM_PATH }}/lib/llvm \
            -DHIP_CATCH_TEST=0 \
            -DCLR_BUILD_HIP=ON \
            -DCLR_BUILD_OCL=ON \
            -S $CLR_DIR \
            -B build
          cmake --build build --target all --parallel 16
          cmake --build build --target install
          echo "✅ HIP Installation complete!"
      - name: Build and Install Aqlprofile
        shell: bash
        working-directory: projects/aqlprofile
        run: |
          export LD_LIBRARY_PATH=${{ env.ROCM_PATH }}/lib:${{ env.ROCM_PATH }}/llvm/lib:$LD_LIBRARY_PATH
          export PATH=${{ env.ROCM_PATH }}/bin:${{ env.ROCM_PATH }}/llvm/bin:$PATH
          echo "Install Aqlprofile..."
          cmake -B build-aqlprofile \
            -DCMAKE_BUILD_TYPE=RelWithDebInfo \
            -DCMAKE_PREFIX_PATH=${{ env.ROCM_PATH }}-${{ env.ROCM_VERSION }} \
            -DCMAKE_INSTALL_PREFIX=${{ env.ROCM_PATH }}-${{ env.ROCM_VERSION }} \
            -DCMAKE_C_COMPILER_LAUNCHER=/usr/bin/ccache \
            -DCMAKE_CXX_COMPILER_LAUNCHER=/usr/bin/ccache \
            .
          cmake --build build-aqlprofile --target all --parallel 16
          cmake --build build-aqlprofile --target install
          echo "✅ AQLProfile Installation complete!"
      - name: List Files
        shell: bash
        working-directory: projects/rocprofiler-sdk
        run: |
          echo "PATH: ${PATH}"
          echo "LD_LIBRARY_PATH: ${LD_LIBRARY_PATH}"
          which-realpath() { echo -e "\n$1 resolves to $(realpath $(which $1))"; echo "$($(which $1) --version &> /dev/stdout | head -n 1)"; }
          for i in python3 git cmake ctest gcc g++ gcov; do which-realpath $i; done
          cat ${{ env.ROCM_PATH }}/.info/version
          ls -la
      - name: Enable PC Sampling
        if: ${{ contains(matrix.system.gpu, 'mi200') || contains(matrix.system.gpu, 'mi300a') }}
        shell: bash
        working-directory: projects/rocprofiler-sdk
        run: echo 'ROCPROFILER_PC_SAMPLING_BETA_ENABLED=1' >> $GITHUB_ENV
      - name: Configure, Build, and Test
        timeout-minutes: 30
        shell: bash
        working-directory: projects/rocprofiler-sdk
        run: |
          LD_LIBRARY_PATH=${{ env.ROCM_PATH }}/lib:${{ env.ROCM_PATH }}/llvm/lib:$LD_LIBRARY_PATH \
          PATH=${{ env.ROCM_PATH }}/bin:${{ env.ROCM_PATH }}/llvm/bin:$HOME/.local/bin:$PATH \
          python3 ./source/scripts/run-ci.py \
            -B build \
            --name ${{ github.repository }}-${{ github.ref_name }}-${{ matrix.system.os }}-${{ matrix.system.gpu }}-core \
            --build-jobs 16 \
            --mode ${CI_MODE} \
            --site ${{ matrix.system.runner }} \
            --gpu-targets ${{ env.GPU_TARGETS }} \
            --run-attempt ${{ github.run_attempt }} \
            ${{ matrix.system.ci-flags }} -- \
            -DROCPROFILER_DEP_ROCMCORE=ON \
            -DROCPROFILER_BUILD_DOCS=OFF \
            -DROCPROFILER_BUILD_FMT=OFF \
            -DROCPROFILER_INTERNAL_RCCL_API_TRACE=ON \
            -DCMAKE_BUILD_TYPE=${{ matrix.system.build-type }} \
            -DCMAKE_INSTALL_PREFIX=/opt/rocprofiler-sdk \
            -DCPACK_GENERATOR='DEB;RPM;TGZ' \
            -DCPACK_PACKAGING_INSTALL_PREFIX="$(realpath ${{ env.ROCM_PATH }})" \
            -DPython3_EXECUTABLE=$(which python3) \
            -DCMAKE_PREFIX_PATH='${{ env.ROCM_PATH }};${{ env.ROCM_PATH }}/llvm' \
            ${{ env.GLOBAL_CMAKE_OPTIONS }} -- \
            -LE "${${{ matrix.system.gpu }}_EXCLUDE_LABEL_REGEX}" \
            -E "${${{ matrix.system.gpu }}_EXCLUDE_TESTS_REGEX}"
      - name: Install
        if: ${{ contains(matrix.system.gpu, env.CORE_EXT_RUNNER) }}
        timeout-minutes: 10
        working-directory: projects/rocprofiler-sdk
        run: |
          export LD_LIBRARY_PATH=${{ env.ROCM_PATH }}/lib:${{ env.ROCM_PATH }}/llvm/lib:$LD_LIBRARY_PATH
          export PATH=${{ env.ROCM_PATH }}/bin:${{ env.ROCM_PATH }}/llvm/bin:$PATH
          cmake --build build --target install --parallel 16
      - name: Build Packaging
        if: ${{ contains(matrix.system.gpu, env.CORE_EXT_RUNNER) }}
        timeout-minutes: 10
        working-directory: projects/rocprofiler-sdk
        run: |
          export LD_LIBRARY_PATH=${{ env.ROCM_PATH }}/lib:${{ env.ROCM_PATH }}/llvm/lib:$LD_LIBRARY_PATH
          export PATH=${{ env.ROCM_PATH }}/bin:${{ env.ROCM_PATH }}/llvm/bin:$PATH
          cmake --build build --target package --parallel 16
      - name: Test Install Build
        if: ${{ contains(matrix.system.gpu, env.CORE_EXT_RUNNER) }}
        timeout-minutes: 20
        shell: bash
        working-directory: projects/rocprofiler-sdk
        run: |
          export LD_LIBRARY_PATH=${{ env.ROCM_PATH }}/lib:${{ env.ROCM_PATH }}/llvm/lib:$LD_LIBRARY_PATH
          export PATH=${{ env.ROCM_PATH }}/bin:${{ env.ROCM_PATH }}/llvm/bin:$HOME/.local/bin:$PATH
          CMAKE_PREFIX_PATH=/opt/rocprofiler-sdk cmake -B build-samples samples
          CMAKE_PREFIX_PATH=/opt/rocprofiler-sdk cmake -B build-tests -DGPU_TARGETS="gfx942" tests
          export LD_LIBRARY_PATH=/opt/rocprofiler-sdk/lib:${LD_LIBRARY_PATH}
          cmake --build build-samples --target all --parallel 16
          cmake --build build-tests --target all --parallel 16
          ctest --test-dir build-samples -LE "${${{ matrix.system.gpu }}_EXCLUDE_LABEL_REGEX}" -E "${${{ matrix.system.gpu }}_EXCLUDE_TESTS_REGEX}" --output-on-failure
          ctest --test-dir build-tests -LE "${${{ matrix.system.gpu }}_EXCLUDE_LABEL_REGEX}" -E "${${{ matrix.system.gpu }}_EXCLUDE_TESTS_REGEX}" --output-on-failure
      - name: Install Packages
        if: ${{ contains(matrix.system.gpu, env.CORE_EXT_RUNNER) }}
        timeout-minutes: 5
        shell: bash
        working-directory: projects/rocprofiler-sdk
        run: |
          export LD_LIBRARY_PATH=${{ env.ROCM_PATH }}/lib:${{ env.ROCM_PATH }}/llvm/lib:$LD_LIBRARY_PATH
          export PATH=${{ env.ROCM_PATH }}/bin:${{ env.ROCM_PATH }}/llvm/bin:$HOME/.local/bin:$PATH
          export PATH=${PATH}:/usr/local/sbin:/usr/sbin:/sbin
          ls -la
          ls -la ./build
          dpkg --force-all -i ./build/rocprofiler-sdk-roctx_*.deb
          dpkg --force-all -i ./build/rocprofiler-sdk-rocpd_*.deb
          for i in $(ls -S ./build/rocprofiler-sdk*.deb | egrep -v 'roctx|rocpd'); do dpkg --force-all -i ${i}; done;
      - name: Test Installed Packages
        if: ${{ contains(matrix.system.gpu, env.CORE_EXT_RUNNER) }}
        timeout-minutes: 20
        shell: bash
        working-directory: projects/rocprofiler-sdk
        run: |
          export LD_LIBRARY_PATH=${{ env.ROCM_PATH }}/lib:${{ env.ROCM_PATH }}/llvm/lib:$LD_LIBRARY_PATH
          export PATH=${{ env.ROCM_PATH }}/bin:${{ env.ROCM_PATH }}/llvm/bin:$HOME/.local/bin:$PATH
          CMAKE_PREFIX_PATH=${{ env.ROCM_PATH }} cmake -B build-samples-deb ${{ env.ROCM_PATH }}/share/rocprofiler-sdk/samples
          CMAKE_PREFIX_PATH=${{ env.ROCM_PATH }} cmake -B build-tests-deb -DGPU_TARGETS="gfx942" ${{ env.ROCM_PATH }}/share/rocprofiler-sdk/tests
          cmake --build build-samples-deb --target all --parallel 16
          cmake --build build-tests-deb --target all --parallel 16
          ctest --test-dir build-samples-deb -LE "${${{ matrix.system.gpu }}_EXCLUDE_LABEL_REGEX}" -E "${${{ matrix.system.gpu }}_EXCLUDE_TESTS_REGEX}" --output-on-failure
          ctest --test-dir build-tests-deb -LE "${${{ matrix.system.gpu }}_EXCLUDE_LABEL_REGEX}" -E "${${{ matrix.system.gpu }}_EXCLUDE_TESTS_REGEX}" --output-on-failure
      - name: Archive production artifacts
        if: ${{ github.event_name == 'workflow_dispatch' && contains(matrix.system.gpu, env.CORE_EXT_RUNNER) }}
        uses: actions/upload-artifact@v4
        with:
          name: installers-deb
          path: |
            ${{github.workspace}}/build/*.deb
            ${{github.workspace}}/build/*.rpm
            ${{github.workspace}}/build/*.tgz
  # -----------------------------------------------------------------------------
  # RHEL/SLES (RPM) job(s)
  # -----------------------------------------------------------------------------
  core-rpm:
    name: Core • ${{ matrix.system.gpu }} • ${{ matrix.system.os }}
    strategy:
      fail-fast: false
      matrix:
        system:
          - { os: 'rhel-8.8', runner: 'linux-mi325-1gpu-ossci-rocm-frac', gpu: 'mi325', gpu-target: 'gfx94X', build-type: 'RelWithDebInfo', ci-flags: '' }
          - { os: 'rhel-9.5', runner: 'linux-mi325-1gpu-ossci-rocm-frac', gpu: 'mi325', gpu-target: 'gfx94X', build-type: 'RelWithDebInfo', ci-flags: '' }
          - { os: 'sles-15.6', runner: 'linux-mi325-1gpu-ossci-rocm-frac', gpu: 'mi325', gpu-target: 'gfx94X', build-type: 'RelWithDebInfo', ci-flags: '' }
    runs-on: ${{ matrix.system.runner }}
    container:
      image: docker.io/rocm/rocprofiler-private:${{ matrix.system.os }}-${{ matrix.system.gpu-target }}-latest
      credentials:
        username: ${{ secrets.ROCPROFILER_AZURE_CI_USER }}
        password: ${{ secrets.ROCPROFILER_AZURE_CI_PASS }}
      options: --privileged
    env:
      GIT_DISCOVERY_ACROSS_FILESYSTEM: 1
      OS_TYPE: ${{ matrix.system.os }}
      GPU_RUNNER: ${{ matrix.system.gpu }}
    steps:
      - name: Clone ROCProfiler SDK & AQLProfile & ROCProfiler Register & ROCR-Runtime
        uses: actions/checkout@v5
        with:
          sparse-checkout: |
            projects/rocprofiler-sdk
            projects/aqlprofile
            projects/rocprofiler-register
            projects/rocr-runtime
            projects/clr
            projects/hip
          submodules: false
          set-safe-directory: true
      - name: Compute submodule cache key
        id: submods
        shell: bash
        run: |
          git --version
          git config --global --add safe.directory '*'
          git submodule status --recursive | awk '{print $1,$2}' > .git-submodules-status
          echo "hash=$(sha256sum .git-submodules-status | cut -d' ' -f1)" >> "$GITHUB_OUTPUT"
          # collect submodule paths for cache 'path'
          git config --file .gitmodules --get-regexp path | awk '{print $2}' > .git-submodule-paths
          { echo "paths<<EOF"; cat .git-submodule-paths; echo "EOF"; } >> "$GITHUB_OUTPUT"
      - name: Restore submodule cache
        uses: actions/cache@v4
        with:
          path: |
            .git/modules
            ${{ steps.submods.outputs.paths }}
          key: submods-${{ runner.os }}-${{ steps.submods.outputs.hash }}
          restore-keys: |
            submods-${{ runner.os }}-
            submods-
      - name: Init/Update submodules
        run: git submodule update --init --recursive --jobs 16
      - name: Install Latest Nightly ROCm using TheRock Tarballs
        shell: bash
        working-directory: /tmp
        run: |
          tar -xf ${{ env.ROCM_PATH }}-${{ matrix.system.gpu-target }}.tar.gz -C ${{ env.ROCM_PATH }}-${{ env.ROCM_VERSION }}
          ln -s ${{ env.ROCM_PATH }}-${{ env.ROCM_VERSION }} ${{ env.ROCM_PATH }}
          echo "ROCm installed to: ${{ env.ROCM_PATH }}"
      - name: Install requirements (venv)
        timeout-minutes: 10
        shell: bash
        working-directory: projects/rocprofiler-sdk
        run: |
          git config --global --add safe.directory '*'
          python3 -m venv ${{ env.PYTHON_VENV_PATH }}
          source ${{ env.PYTHON_VENV_ACTIVATE }}
          export PATH=/opt/rh/gcc-toolset-11/root/usr/bin:$PATH
          python3 -m pip install --upgrade pip
          python3 -m pip install --upgrade -r requirements.txt
          rm -rf \
            ${{ env.ROCM_PATH }}/lib/*rocprofiler-sdk* \
            ${{ env.ROCM_PATH }}/lib/cmake/*rocprofiler-sdk* \
            ${{ env.ROCM_PATH }}/share/*rocprofiler-sdk* \
            ${{ env.ROCM_PATH }}/libexec/*rocprofiler-sdk*
      - name: Install Curl for RHEL 8.8
        if: ${{ matrix.system.os == 'rhel-8.8' }}
        run: |
          dnf install -y curl
          ln -s /usr/local/bin/curl /usr/bin/curl
      - name: Setup ccache
        uses: hendrikmuhs/ccache-action@63069e3931dedbf3b63792097479563182fe70d1 # v1.2.18
        with:
          key: ccache-${{ matrix.system.os }}-${{ matrix.system.runner }}-${{ matrix.system.gpu }}
          max-size: 2G
          save: true
          variant: sccache
- name: Build and Install ROCProfiler-Register
shell: bash
working-directory: projects/rocprofiler-register
run: |
echo "Install ROCProfiler-Register"
cmake -B build-rocprofiler-register \
-DCMAKE_BUILD_TYPE=RelWithDebInfo \
-DCMAKE_PREFIX_PATH=${{ env.ROCM_PATH }}-${{ env.ROCM_VERSION }} \
-DCMAKE_INSTALL_PREFIX=${{ env.ROCM_PATH }}-${{ env.ROCM_VERSION }} \
-DCMAKE_C_COMPILER_LAUNCHER=/usr/local/bin/sccache \
-DCMAKE_CXX_COMPILER_LAUNCHER=/usr/local/bin/sccache \
.
cmake --build build-rocprofiler-register --target all --parallel 16
cmake --build build-rocprofiler-register --target install
echo "✅ ROCProfiler-Register Installation complete!"
- name: Build and Install ROCR-Runtime
if: ${{ env.ENABLE_ROCR_BUILD == 'true' }}
shell: bash
working-directory: projects/rocr-runtime
run: |
python3 -m venv ${{ env.PYTHON_VENV_PATH }}
source ${{ env.PYTHON_VENV_ACTIVATE }}
export PATH=/opt/rh/gcc-toolset-11/root/usr/bin:$PATH
python3 -m pip install --upgrade pip
python3 -m pip install --upgrade cmake
echo "Install ROCR-Runtime..."
cmake -B build \
-DCMAKE_BUILD_TYPE=RelWithDebInfo \
-DCMAKE_PREFIX_PATH='${{ env.ROCM_PATH }}-${{ env.ROCM_VERSION }};${{ env.ROCM_PATH }}-${{ env.ROCM_VERSION }}/llvm' \
-DCMAKE_INSTALL_PREFIX=${{ env.ROCM_PATH }}-${{ env.ROCM_VERSION }} \
.
cmake --build build --target all --parallel 16
cmake --build build --target install
echo "✅ ROCR-Runtime Installation complete!"
- name: Build and Install HIP
if: ${{ env.ENABLE_HIP_CLR_BUILD == 'true' }}
shell: bash
working-directory: projects
run: |
export HIP_DIR=$PWD/hip
export CLR_DIR=$PWD/clr
export LD_LIBRARY_PATH=${{ env.ROCM_PATH }}/lib:${{ env.ROCM_PATH }}/llvm/lib:$LD_LIBRARY_PATH
export PATH=${{ env.ROCM_PATH }}/bin:${{ env.ROCM_PATH }}/llvm/bin:$PATH
echo "Install HIP..."
cd $CLR_DIR
cmake \
-DHIP_COMMON_DIR=$HIP_DIR \
-DCMAKE_BUILD_TYPE=RelWithDebInfo \
-DHIP_PLATFORM=amd \
-DCMAKE_PREFIX_PATH='${{ env.ROCM_PATH }}-${{ env.ROCM_VERSION }};${{ env.ROCM_PATH }}-${{ env.ROCM_VERSION }}/llvm' \
-DCMAKE_INSTALL_PREFIX=${{ env.ROCM_PATH }}-${{ env.ROCM_VERSION }} \
-DHIP_LLVM_ROOT=${{ env.ROCM_PATH }}/lib/llvm \
-DHIP_CATCH_TEST=0 \
-DCLR_BUILD_HIP=ON \
-DCLR_BUILD_OCL=ON \
-S $CLR_DIR \
-B build
cmake --build build --target all --parallel 16
cmake --build build --target install
echo "✅ HIP Installation complete!"
- name: Build and Install Aqlprofile
shell: bash
working-directory: projects/aqlprofile
run: |
echo "Install AQLProfile..."
python3 -m venv ${{ env.PYTHON_VENV_PATH }}
source ${{ env.PYTHON_VENV_ACTIVATE }}
export PATH=/opt/rh/gcc-toolset-11/root/usr/bin:$PATH
python3 -m pip install --upgrade pip
python3 -m pip install --upgrade cmake
cmake -B build-aqlprofile \
-DCMAKE_BUILD_TYPE=RelWithDebInfo \
-DCMAKE_PREFIX_PATH=${{ env.ROCM_PATH }}-${{ env.ROCM_VERSION }} \
-DCMAKE_INSTALL_PREFIX=${{ env.ROCM_PATH }}-${{ env.ROCM_VERSION }} \
-DCMAKE_C_COMPILER_LAUNCHER=/usr/local/bin/sccache \
-DCMAKE_CXX_COMPILER_LAUNCHER=/usr/local/bin/sccache \
.
cmake --build build-aqlprofile --target all --parallel 16
cmake --build build-aqlprofile --target install
echo "✅ AQLProfile Installation complete!"
- name: Enable PC Sampling
if: ${{ contains(matrix.system.gpu, 'mi200') || contains(matrix.system.gpu, 'mi300a') }}
shell: bash
working-directory: projects/rocprofiler-sdk
run:
echo 'ROCPROFILER_PC_SAMPLING_BETA_ENABLED=1' >> $GITHUB_ENV
- name: List Files
shell: bash
working-directory: projects/rocprofiler-sdk
run: |
source ${{ env.PYTHON_VENV_ACTIVATE }}
echo "PATH: ${PATH}"
echo "LD_LIBRARY_PATH: ${LD_LIBRARY_PATH}"
which-realpath() { echo -e "\n$1 resolves to $(realpath $(which $1))"; echo "$($(which $1) --version &> /dev/stdout | head -n 1)"; }
for i in python3 git cmake ctest gcc g++ gcov; do which-realpath $i; done
cat ${{ env.ROCM_PATH }}/.info/version
ls -la
- name: Configure, Build, and Test
timeout-minutes: 30
shell: bash
working-directory: projects/rocprofiler-sdk
run: |
source ${{ env.PYTHON_VENV_ACTIVATE }}
export PATH=~/.local/bin:/opt/rh/gcc-toolset-11/root/usr/bin:$PATH
python3 ./source/scripts/run-ci.py -B build \
--name ${{ github.repository }}-${{ github.ref_name }}-${{ matrix.system.os }}-${{ matrix.system.gpu }}-core \
--mode ${CI_MODE} \
--build-jobs 16 \
--site ${{ matrix.system.runner }} \
--gpu-targets ${{ env.GPU_TARGETS }} \
--run-attempt ${{ github.run_attempt }} \
${{ matrix.system.ci-flags }} \
-- \
-DROCPROFILER_DEP_ROCMCORE=ON \
-DROCPROFILER_BUILD_DOCS=OFF \
-DROCPROFILER_BUILD_FMT=OFF \
-DROCPROFILER_INTERNAL_RCCL_API_TRACE=ON \
-DCMAKE_BUILD_TYPE=${{ matrix.system.build-type }} \
-DCMAKE_PREFIX_PATH='${{ env.ROCM_PATH }};${{ env.ROCM_PATH }}/llvm' \
-DPython3_EXECUTABLE=$(which python3) \
${{ env.GLOBAL_CMAKE_OPTIONS }} \
-- \
-LE "${${{ matrix.system.gpu }}_EXCLUDE_LABEL_REGEX}" \
-E "${${{ matrix.system.gpu }}_EXCLUDE_TESTS_REGEX}"
sanitizers:
name: ${{ matrix.system.sanitizer }} • ${{ matrix.system.gpu }} • ${{ matrix.system.os }}
strategy:
fail-fast: false
matrix:
system:
- { sanitizer: 'AddressSanitizer', os: 'ubuntu-22.04', runner: 'linux-mi325-1gpu-ossci-rocm-frac', gpu: 'mi325', gpu-target: 'gfx94X', build-type: 'RelWithDebInfo' }
- { sanitizer: 'ThreadSanitizer', os: 'ubuntu-22.04', runner: 'linux-mi325-1gpu-ossci-rocm-frac', gpu: 'mi325', gpu-target: 'gfx94X', build-type: 'RelWithDebInfo' }
- { sanitizer: 'LeakSanitizer', os: 'ubuntu-22.04', runner: 'linux-mi325-1gpu-ossci-rocm-frac', gpu: 'mi325', gpu-target: 'gfx94X', build-type: 'RelWithDebInfo' }
- { sanitizer: 'UndefinedBehaviorSanitizer', os: 'ubuntu-22.04', runner: 'linux-mi325-1gpu-ossci-rocm-frac', gpu: 'mi325', gpu-target: 'gfx94X', build-type: 'RelWithDebInfo' }
if: ${{ contains(github.event_name, 'pull_request') }}
runs-on: ${{ matrix.system.runner }}
container:
image: docker.io/rocm/rocprofiler-private:${{ matrix.system.os }}-${{ matrix.system.gpu-target }}-latest
credentials:
username: ${{ secrets.ROCPROFILER_AZURE_CI_USER }}
password: ${{ secrets.ROCPROFILER_AZURE_CI_PASS }}
env:
DEBIAN_FRONTEND: noninteractive
options: --privileged --cap-add=SYS_PTRACE --security-opt seccomp=unconfined
# define this for containers
env:
GIT_DISCOVERY_ACROSS_FILESYSTEM: 1
GCC_COMPILER_VERSION: 13
GPU_RUNNER: ${{ matrix.system.gpu }}
steps:
- name: Install Latest Nightly ROCm
shell: bash
working-directory: /tmp
run: |
ls -lah /opt/
tar -xf ${{ env.ROCM_PATH }}-${{ matrix.system.gpu-target }}.tar.gz -C ${{ env.ROCM_PATH }}-${{ env.ROCM_VERSION }}
ln -s ${{ env.ROCM_PATH }}-${{ env.ROCM_VERSION }} ${{ env.ROCM_PATH }}
echo "ROCm installed to: ${{ env.ROCM_PATH }}"
- name: Clone ROCProfiler SDK, AQLProfile, ROCProfiler Register, ROCR-Runtime, CLR, and HIP
uses: actions/checkout@v5
with:
sparse-checkout: |
projects/rocprofiler-sdk
projects/aqlprofile
projects/rocprofiler-register
projects/rocr-runtime
projects/clr
projects/hip
submodules: false
set-safe-directory: true
- name: Compute submodule cache key
id: submods
shell: bash
run: |
git --version
git config --global --add safe.directory '*'
git submodule status --recursive | awk '{print $1,$2}' > .git-submodules-status
echo "hash=$(sha256sum .git-submodules-status | cut -d' ' -f1)" >> "$GITHUB_OUTPUT"
# collect submodule paths for cache 'path'
git config --file .gitmodules --get-regexp path | awk '{print $2}' > .git-submodule-paths
{ echo "paths<<EOF"; cat .git-submodule-paths; echo "EOF"; } >> "$GITHUB_OUTPUT"
- name: Restore submodule cache
uses: actions/cache@v4
with:
path: |
.git/modules
${{ steps.submods.outputs.paths }}
key: submods-${{ runner.os }}-${{ steps.submods.outputs.hash }}
restore-keys: |
submods-${{ runner.os }}-
submods-
- name: Init/Update submodules
run: git submodule update --init --recursive --jobs 16
- name: Install requirements
timeout-minutes: 10
shell: bash
working-directory: projects/rocprofiler-sdk
run: |
git config --global --add safe.directory '*'
apt-get update
apt-get install -y build-essential cmake python3-pip libasan8 libtsan2 software-properties-common clang-15 libdw-dev libsqlite3-dev libdrm-dev file autoconf pkg-config
add-apt-repository -y ppa:ubuntu-toolchain-r/test
apt-get update
apt-get upgrade -y
apt-get install -y gcc-${{ env.GCC_COMPILER_VERSION }} g++-${{ env.GCC_COMPILER_VERSION }}
update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-${{ env.GCC_COMPILER_VERSION }} 100 --slave /usr/bin/g++ g++ /usr/bin/g++-${{ env.GCC_COMPILER_VERSION }} --slave /usr/bin/gcov gcov /usr/bin/gcov-${{ env.GCC_COMPILER_VERSION }}
python3 -m pip install -U --user -r requirements.txt
rm -rf ${{ env.ROCM_PATH }}/lib/*rocprofiler-sdk* ${{ env.ROCM_PATH }}/lib/cmake/*rocprofiler-sdk* ${{ env.ROCM_PATH }}/share/*rocprofiler-sdk* ${{ env.ROCM_PATH }}/libexec/*rocprofiler-sdk* ${{ env.ROCM_PATH }}*/lib/python*/site-packages/roctx ${{ env.ROCM_PATH }}*/lib/python*/site-packages/rocpd
- name: List Files
shell: bash
working-directory: projects/rocprofiler-sdk
run: |
which-realpath() { echo -e "\n$1 resolves to $(realpath $(which $1))"; echo "$($(which $1) --version &> /dev/stdout | head -n 1)"; }
for i in python3 git cmake ctest gcc g++ gcov; do which-realpath $i; done
cat ${{ env.ROCM_PATH }}/.info/version
ls -la
- name: Enable PC Sampling
if: ${{ contains(matrix.system.gpu, 'mi200') || contains(matrix.system.gpu, 'mi300a') }}
shell: bash
working-directory: projects/rocprofiler-sdk
run: echo 'ROCPROFILER_PC_SAMPLING_BETA_ENABLED=1' >> $GITHUB_ENV
- name: Setup ccache
uses: hendrikmuhs/ccache-action@63069e3931dedbf3b63792097479563182fe70d1 # v1.2.18
with:
key: ccache-${{ matrix.system.os }}-${{ matrix.system.runner }}-${{ matrix.system.gpu }}-${{ matrix.system.sanitizer }}
max-size: 2G
save: true
- name: Build and Install ROCProfiler-Register
shell: bash
working-directory: projects/rocprofiler-register
run: |
echo "Install ROCProfiler-Register"
cmake -B build-rocprofiler-register \
-DCMAKE_BUILD_TYPE=RelWithDebInfo \
-DCMAKE_PREFIX_PATH=${{ env.ROCM_PATH }}-${{ env.ROCM_VERSION }} \
-DCMAKE_INSTALL_PREFIX=${{ env.ROCM_PATH }}-${{ env.ROCM_VERSION }} \
-DCMAKE_C_COMPILER_LAUNCHER=/usr/bin/ccache \
-DCMAKE_CXX_COMPILER_LAUNCHER=/usr/bin/ccache \
.
cmake --build build-rocprofiler-register --target all --parallel 16
cmake --build build-rocprofiler-register --target install
echo "✅ ROCProfiler-Register Installation complete!"
- name: Build and Install ROCR-Runtime
if: ${{ env.ENABLE_ROCR_BUILD == 'true' }}
shell: bash
working-directory: projects/rocr-runtime
run: |
echo "Install ROCR-Runtime..."
cmake -B build \
-DCMAKE_BUILD_TYPE=RelWithDebInfo \
-DCMAKE_PREFIX_PATH='${{ env.ROCM_PATH }}-${{ env.ROCM_VERSION }};${{ env.ROCM_PATH }}-${{ env.ROCM_VERSION }}/llvm' \
-DCMAKE_INSTALL_PREFIX=${{ env.ROCM_PATH }}-${{ env.ROCM_VERSION }} \
-DCMAKE_C_COMPILER_LAUNCHER=/usr/bin/ccache \
-DCMAKE_CXX_COMPILER_LAUNCHER=/usr/bin/ccache \
.
cmake --build build --target all --parallel 16
cmake --build build --target install
echo "✅ ROCR-Runtime Installation complete!"
- name: Build and Install HIP
if: ${{ env.ENABLE_HIP_CLR_BUILD == 'true' }}
shell: bash
working-directory: projects
run: |
export HIP_DIR=$PWD/hip
export CLR_DIR=$PWD/clr
export LD_LIBRARY_PATH=${{ env.ROCM_PATH }}/lib:${{ env.ROCM_PATH }}/llvm/lib:$LD_LIBRARY_PATH
export PATH=${{ env.ROCM_PATH }}/bin:${{ env.ROCM_PATH }}/llvm/bin:$PATH
echo "Install HIP..."
cd $CLR_DIR
cmake \
-DHIP_COMMON_DIR=$HIP_DIR \
-DCMAKE_BUILD_TYPE=RelWithDebInfo \
-DHIP_PLATFORM=amd \
-DCMAKE_PREFIX_PATH='${{ env.ROCM_PATH }}-${{ env.ROCM_VERSION }};${{ env.ROCM_PATH }}-${{ env.ROCM_VERSION }}/llvm' \
-DCMAKE_INSTALL_PREFIX=${{ env.ROCM_PATH }}-${{ env.ROCM_VERSION }} \
-DHIP_LLVM_ROOT=${{ env.ROCM_PATH }}/lib/llvm \
-DHIP_CATCH_TEST=0 \
-DCLR_BUILD_HIP=ON \
-DCLR_BUILD_OCL=ON \
-S $CLR_DIR \
-B build
cmake --build build --target all --parallel 16
cmake --build build --target install
echo "✅ HIP Installation complete!"
- name: Build and Install Aqlprofile
shell: bash
working-directory: projects/aqlprofile
run: |
echo "Install AQLProfile..."
cmake -B build-aqlprofile \
-DCMAKE_BUILD_TYPE=RelWithDebInfo \
-DCMAKE_PREFIX_PATH=${{ env.ROCM_PATH }}-${{ env.ROCM_VERSION }} \
-DCMAKE_INSTALL_PREFIX=${{ env.ROCM_PATH }}-${{ env.ROCM_VERSION }} \
-DCMAKE_C_COMPILER_LAUNCHER=/usr/bin/ccache \
-DCMAKE_CXX_COMPILER_LAUNCHER=/usr/bin/ccache \
.
cmake --build build-aqlprofile --target all --parallel 16
cmake --build build-aqlprofile --target install
echo "✅ AQLProfile Installation complete!"
- name: Configure, Build, and Test
timeout-minutes: 45
shell: bash
working-directory: projects/rocprofiler-sdk
# Lower mmap ASLR entropy before the sanitizer run; sanitizer runtimes are known to fail at startup on kernels configured with a higher vm.mmap_rnd_bits.
run:
sudo sysctl -w vm.mmap_rnd_bits=28;
export PATH="$HOME/.local/bin:$PATH";
python3 ./source/scripts/run-ci.py -B build
--name ${{ github.repository }}-${{ github.ref_name }}-${{ matrix.system.os }}-${{ matrix.system.gpu }}-${{ matrix.system.sanitizer }}
--build-jobs 16
--site ${{ matrix.system.runner }}
--gpu-targets ${{ env.GPU_TARGETS }}
--memcheck ${{ matrix.system.sanitizer }}
--run-attempt ${{ github.run_attempt }}
--
-DROCPROFILER_BUILD_FMT=OFF
-DROCPROFILER_INTERNAL_RCCL_API_TRACE=ON
-DCMAKE_BUILD_TYPE=${{ matrix.system.build-type }}
-DCMAKE_INSTALL_PREFIX="${{ env.ROCM_PATH }}"
-DCMAKE_PREFIX_PATH='${{ env.ROCM_PATH }};${{ env.ROCM_PATH }}/llvm'
-DPython3_EXECUTABLE=$(which python3)
${{ env.GLOBAL_CMAKE_OPTIONS }}
--
-LE "${${{ matrix.system.gpu }}_EXCLUDE_LABEL_REGEX}"
-E "${${{ matrix.system.gpu }}_EXCLUDE_TESTS_REGEX}"