Files
rocm-systems/.github/workflows/therock-test-packages.yml
T
Scott Todd fa772be675 Reapply amdgpu-windows-interop revert. (#1893)
## Overview and rationale

This reverts https://github.com/ROCm/rocm-systems/pull/1886, which...
* Re-applies https://github.com/ROCm/rocm-systems/pull/1866
* Reverts https://github.com/ROCm/rocm-systems/pull/1728

(So it restores the [`amdgpu-windows-interop/`](https://github.com/ROCm/rocm-systems/tree/develop/shared/amdgpu-windows-interop) folder back to the state from a few weeks ago)

The rationale for this change is at https://github.com/ROCm/rocm-systems/pull/1866:
> Last PAL update broke applications on gfx12 Windows.

## Cross-repository change details

That PR failed to build but was merged with this explanation:

> TheRock CI Windows build fails as expected with this revert.
> 
> References to these PAL members need to be stripped out in a patch on TheRock.
> 
> ```
> 11.3	C:\home\runner\_work\rocm-systems\rocm-systems\projects\clr\rocclr\device\pal\palubercapturemgr.cpp(152): error C2039: 'RegisterTraceStateChangeCallback': is not a member of 'GpuUtil::TraceSession'
> 11.4	C:\home\runner\_work\rocm-systems\rocm-systems\shared\amdgpu-windows-interop\pal\inc\gpuUtil\palTraceSession.h(372): note: see declaration of 'GpuUtil::TraceSession'
> 11.4	C:\home\runner\_work\rocm-systems\rocm-systems\projects\clr\rocclr\device\pal\palubercapturemgr.cpp(195): error C2039: 'UnregisterTraceStateChangeCallback': is not a member of 'GpuUtil::TraceSession'
> 11.4	C:\home\runner\_work\rocm-systems\rocm-systems\shared\amdgpu-windows-interop\pal\inc\gpuUtil\palTraceSession.h(372): note: see declaration of 'GpuUtil::TraceSession'
> ```

The patch in TheRock was updated in https://github.com/ROCm/TheRock/pull/2154. This rolls forward by updating the ref for TheRock.

That original PR could have been sequenced differently to avoid a build break - perhaps by
* Pointing to a branch in TheRock with the patch rebased
* Deleting the patch in the workflows here but holding a local copy of the path to be applied in workflows
* Landing the patch as a normal commit instead of carrying it at all

## Test plan

1. Watch TheRock CI here (https://github.com/ROCm/rocm-systems/actions/runs/19447202693/job/55644411119?pr=1893)
2. Build locally:
    
    ```bash
    # In rocm-systems
    git am --whitespace=nowarn D:\projects\TheRock\patches\amd-mainline\rocm-systems\0001-Revert-SWDEV-543498-Some-compute-Ubertrace-profiles-.patch
    git am --whitespace=nowarn D:\projects\TheRock\patches\amd-mainline\rocm-systems\0003-Use-is_versioned-true-consistently-in-both-Comgr-Loa.patch
    git am --whitespace=nowarn D:\projects\TheRock\patches\amd-mainline\rocm-systems\0006-Explicitly-load-libamdhip64.so.7.patch
    # Note: the build fails with the observed errors if patch 0001 is not applied!
    
    # In TheRock
    cmake -DCMAKE_BUILD_TYPE=Release \
      -DCMAKE_C_COMPILER=cl.exe -DCMAKE_CXX_COMPILER=cl.exe \
      -DCMAKE_C_COMPILER_LAUNCHER=ccache -DCMAKE_CXX_COMPILER_LAUNCHER=ccache \
      -DPython3_EXECUTABLE=d:/projects/TheRock/.venv/Scripts/python \
      -DTHEROCK_ROCM_SYSTEMS_SOURCE_DIR=d:/projects/TheRock/../rocm-systems \  # IMPORTANT
      -DTHEROCK_AMDGPU_FAMILIES=gfx110X-all \
      -DBUILD_TESTING=ON \
      -DTHEROCK_ENABLE_ALL=ON \
      -Damd-llvm_BUILD_TYPE=RelWithDebInfo \
      -S D:/projects/TheRock \
      -B D:/projects/TheRock/build \
      -G Ninja
    
    cmake --build D:/projects/TheRock/build --target hip-clr
    # [build] Build finished with exit code 0
    cmake --build D:/projects/TheRock/build --target ocl-clr+dist
    # [build] Build finished with exit code 0
    ```
2025-11-18 07:17:06 -08:00

124 wiersze
4.5 KiB
YAML

name: TheRock Test Packages
on:
workflow_call:
inputs:
project_to_test:
type: string
amdgpu_families:
type: string
test_runs_on:
type: string
platform:
type: string
permissions:
contents: read
jobs:
configure_test_matrix:
name: "Configure test matrix"
runs-on: ubuntu-24.04
if: ${{ inputs.test_runs_on != '' }}
outputs:
components: ${{ steps.configure.outputs.components }}
steps:
- name: "Fetch 'build_tools' from repository"
uses: actions/checkout@08c6903cd8c0fde910a37f88322edcfb5dd907a8 # v5.0.0
with:
sparse-checkout: build_tools
path: "prejob"
repository: "ROCm/TheRock"
# Checkout failure is possible on Windows, as it's the first job on a GPU test runner.
# Post-job cleanup isn't necessary since no executables are launched in this job.
- name: Pre-job cleanup processes on Windows
if: ${{ runner.os == 'Windows' }}
shell: powershell
run: . '${{ github.workspace }}\prejob\build_tools\github_actions\cleanup_processes.ps1'
- name: "Checking out repository"
uses: actions/checkout@08c6903cd8c0fde910a37f88322edcfb5dd907a8 # v5.0.0
with:
repository: "ROCm/TheRock"
- name: "Configuring CI options"
env:
PLATFORM: ${{ inputs.platform }}
project_to_test: ${{ inputs.project_to_test }}
id: configure
run: python ./build_tools/github_actions/fetch_test_configurations.py
test_components:
name: 'Test ${{ matrix.components.job_name }}'
runs-on: ${{ inputs.test_runs_on }}
container:
image: ${{ inputs.platform == 'linux' && 'ghcr.io/rocm/no_rocm_image_ubuntu24_04@sha256:405945a40deaff9db90b9839c0f41d4cba4a383c1a7459b28627047bf6302a26' || null }}
options: --ipc host
--group-add video
--device /dev/kfd
--device /dev/dri
--group-add 992
--env-file /etc/podinfo/gha-gpu-isolation-settings
needs: configure_test_matrix
# skip tests if no test matrix to run
if: ${{ needs.configure_test_matrix.outputs.components != '[]' }}
strategy:
fail-fast: false
matrix:
components: ${{ fromJSON(needs.configure_test_matrix.outputs.components) }}
defaults:
run:
shell: bash
env:
VENV_DIR: ${{ github.workspace }}/.venv
ARTIFACT_RUN_ID: "${{ github.run_id }}"
OUTPUT_ARTIFACTS_DIR: ${{ github.workspace }}/build
THEROCK_BIN_DIR: "./build/bin"
steps:
- name: "Fetch 'build_tools' from repository"
uses: actions/checkout@08c6903cd8c0fde910a37f88322edcfb5dd907a8 # v5.0.0
with:
sparse-checkout: build_tools
path: "prejob"
repository: "ROCm/TheRock"
- name: Pre-job cleanup processes on Windows
if: ${{ runner.os == 'Windows' }}
shell: powershell
run: . '${{ github.workspace }}\prejob\build_tools\github_actions\cleanup_processes.ps1'
- name: Checkout Repository
uses: actions/checkout@08c6903cd8c0fde910a37f88322edcfb5dd907a8 # v5.0.0
with:
repository: "ROCm/TheRock"
ref: eb8f187ff47eb6af9cd5aaa0b8d9a04b06b12796 # 2025-11-15 commit
- name: Run setup test environment workflow
uses: './.github/actions/setup_test_environment'
with:
ARTIFACT_RUN_ID: ${{ env.ARTIFACT_RUN_ID }}
ARTIFACT_GROUP: ${{ inputs.amdgpu_families }}
OUTPUT_ARTIFACTS_DIR: ${{ env.OUTPUT_ARTIFACTS_DIR }}
VENV_DIR: ${{ env.VENV_DIR }}
FETCH_ARTIFACT_ARGS: ${{ matrix.components.fetch_artifact_args }}
PLATFORM: ${{ inputs.platform }}
IS_PR_FROM_FORK: ${{ github.event.pull_request.head.repo.fork }}
- name: Install additional packages
if: ${{ inputs.platform == 'linux' && (matrix.components.job_name == 'rocblas' || matrix.components.job_name == 'hipblaslt') }}
run: sudo apt install libgfortran5 -y
- name: Test
timeout-minutes: ${{ matrix.components.timeout_minutes }}
run: |
if [ "${{ inputs.PLATFORM }}" == "linux" ]; then source ${VENV_DIR}/bin/activate ; else . ${VENV_DIR}/Scripts/activate ; fi
${{ matrix.components.test_script }}
# GitHub's 'Complete job' step is unaware of launched executables
# and will fail to clean up orphan processes.
- name: Post-job cleanup processes on Windows
if: ${{ always() && runner.os == 'Windows' }}
shell: powershell
run: . '${{ github.workspace }}\build_tools\github_actions\cleanup_processes.ps1'