4e2144dbfa2ccdc772028938d611eb2ba8df4cf5
3 Cometimentos
| Autor(a) | SHA1 | Mensagem | Data | |
|---|---|---|---|---|
|
|
348d740388 |
hsa multiqueue application (#618)
* hsa multiqueuw application * cmake formatting (cmake-format) (#619) Co-authored-by: bgopesh <7112102+bgopesh@users.noreply.github.com> * source formatting (clang-format v11) (#620) Co-authored-by: bgopesh <7112102+bgopesh@users.noreply.github.com> * comppialtion fix * Update tests/bin/CMakeLists.txt Reorder `add_subdirectory` to fix (recurrent) issues with ROCTx in `CMAKE_BUILD_RPATH` * addressing early feedback * cmake updates * more cmake updates * adding queue dependency test * updating test * test updates * removed hsa_api_trace header * reformating headers to prevent clang from reordering * Fixing packaging * Fixes for hangs * source formatting (clang-format v11) (#676) Co-authored-by: bwelton <1683479+bwelton@users.noreply.github.com> * cmake formatting (cmake-format) (#673) Co-authored-by: bwelton <1683479+bwelton@users.noreply.github.com> * structure change * cmake formatting (cmake-format) (#680) Co-authored-by: bgopesh <7112102+bgopesh@users.noreply.github.com> * Adding kernel trace to test hang fix * rebased and fixed kernel-trace test * rhel clang fixes * source formatting (clang-format v11) (#685) Co-authored-by: bgopesh <7112102+bgopesh@users.noreply.github.com> * Update lib/rocprofiler-sdk-tool/helper.hpp - remove suppression of -Wshadow * Update tests/bin/hsa-queue-dependency/CMakeLists.txt - cleanup unnecessary code - GPU_LIST -> GPU_TARGETS * GPU_LIST -> GPU_TARGETS * Remove installation of test executable and libraries --------- Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: bgopesh <7112102+bgopesh@users.noreply.github.com> Co-authored-by: Jonathan R. Madsen <jrmadsen@users.noreply.github.com> Co-authored-by: Benjamin Welton <bewelton@amd.com> Co-authored-by: bwelton <1683479+bwelton@users.noreply.github.com> Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com> |
||
|
|
8c5399a68a |
Update HSA async copy active signals handling (#732)
* Enable INFO logging on retried CI jobs
* Update lib/rocprofiler-sdk/async_copy.cpp
- rework active_signals
- make hsa_signal_t member variable
- remove sync from destructor
- replace _is_set with atomic counter
- timeout of 30 seconds hsa_signal_wait
- switch from relaxed to scacquire/screlease memory ordering
- improve logging and error handling
- destroy hsa signal in active_signals in async_fini
* Update lib/rocprofiler-sdk/async_copy.cpp
- active_signals::create
- change initial value of signal to 1 instead of value of completion signal
- change condition trigger of signal callback
* Update tests/counter-collection/validate.py
* Update lib/rocprofiler-sdk/async_copy.cpp
- improved logging
- fix hsa_signal_wait_scacquire_fn check
* Cleanup tests/lib/transpose/transpose.cpp
- remove huge comment block
* Appears to be working on MI200
Dependency Versions:
clr:
|
||
|
|
7b6d3c70bd |
Shared Library Constructor (rocprofv3 deadlock fix) (#599)
* Moved tests/apps to tests/bin * Renamed cmake project in tests/bin * Update samples - Use ROCPROFILER_DEFAULT_FAIL_REGEX - tweaks to stdout messages * Update tests - Use ROCPROFILER_DEFAULT_FAIL_REGEX * Add tests/lib - libraries with HIP code * Update PTL submodule - remove atexit delete of thread_id_map * Update cmake/rocprofiler_options.cmake - Set ROCPROFILER_DEFAULT_FAIL_REGEX * Update common lib: env + logging - improved customization of logging settings - default to disabling logging to files - install failure handler for rocprofv3 - set_env support in environment.* * Add lib/rocprofiler-sdk/shared_library.cpp - shared library constructor * Update lib/rocprofiler-sdk-tool/tool.cpp - destructor thread safety - convert callback_name_info and buffered_name_info to pointers - install failure handler for logging * Add tests/bin/hip-in-libraries - hip-in-libraries is an exe which uses two shared libraries where each shared library contains HIP kernels - used for testing deadlocking within __hipRegisterFatBinary * Update bin/rocprofv3 - reorganized the env variables - use exec to launch command - set ROCPROFILER_LIBRARY_CTOR=1 * Add tests/rocprofv3/tracing-hip-in-libraries - uses hip-in-libraries exe for exe which uses shared libraries to launch HIP kernels * Update bin/rocprofv3 - fix counter collection (no exec) * Update lib/rocprofiler-sdk-tool/tool.cpp - replace "Kernel-Name" with "Kernel_Name" * Update lib/rocprofiler-sdk/registration.cpp Use RTLD_LOCAL instead of RTLD_GLOBAL for env libraries * Update tests/rocprofv3 - replace "Kernel-Name" with "Kernel_Name" * Update tests - vector-ops (bin) stream syncs + runs with 4 queues per device - improve counter-collection/input1 validation - rocprofv3/tracing-hip-in-libraries does not do sys-trace - improved validation script for tracing-hip-in-libraries - updated dispatch_callback in json-tool.cpp following reworking of prototypes for counter collection * Update samples/counter_collection - updated dispatch_callback(s) and record_callback(s) following reworking of prototypes * Update bin/rocprofv3 - reorganized help menu - added options for sub-HSA tables - added --hip-runtime-trace - changed --hip-trace to include --hip-compiler-trace * Update lib/rocprofiler-sdk-tool - improved kernel filtering - removed arch_vgpr, accum_vgpr, sgpr code (in rocprofiler-sdk) - fixed issue with counter-collection w/o tracing - added support for fine grained HSA API tracing - removed directly linking to HSA-runtime * Update lib/rocprofiler-sdk/agent.cpp - rocp_agents != hsa_agents is non-fatal when ROCPROFILER_BUILD_CI=OFF (CMake option) * GPR (vector and scalar) info in kernel symbol data - rocprofiler_callback_tracing_code_object_kernel_symbol_register_data_t contains general purpose register info * Header include order fix - Include repo headers first - Third party library headers next - standard library headers last * Update dispatch profiling public API - introduce rocprofiler_profile_counting_dispatch_data_t - change signature of rocprofiler_profile_counting_dispatch_callback_t and rocprofiler_profile_counting_record_callback_t - provide rocprofiler_user_data_t pointer in dispatch callback - provide rocprofiler_user_data_t value (from dispatch cb) in record callback * Update tests/bin/CMakeLists.txt - fix add_subdirectory(hip-in-libraries) order * Update VERSION - bump to 0.2.0 in prep for AFAR |