b5e1645a14
Problem with original test:
- Created circular dependencies between queues:
* Queue1: Kernel A → Barrier(waits for signal_2) → Kernel C
* Queue2: Barrier(waits for signal_1) → Kernel B → sets signal_2
- With strict "one kernel at a time" serialization, this created deadlock:
* Queue1 executed Kernel A, then blocked on barrier waiting for signal_2
* Serializer switched to Queue2, but Queue2 was blocked waiting for signal_1
* Neither queue could proceed: Queue1 needed Queue2's Kernel B to complete,
but Queue2 couldn't start until Queue1 finished completely
- Test would hang indefinitely at hsa_signal_wait_relaxed() for signal_2
Solution implemented:
- Reordered packet submission to eliminate circular dependencies
- Ensured signal producers execute before consumers need them:
* Kernel A produces signal_1 before Queue2's barrier needs it
* Kernel B produces signal_2 before Queue1's continuation needs it
- Dependencies now flow forward without cycles, allowing serializer progress
Refactoring changes:
- Extract common functionality into helper functions:
* create_completion_signal() for signal creation
* create_queue() for queue creation
* submit_kernel_packet() for kernel dispatch packets
* submit_barrier_packet() for barrier packets
- Add comprehensive documentation explaining expected execution pattern
- Simplify main() function making the dependency flow more readable
Co-authored-by: Benjamin Welton <bewelton@amd.com>