Commit graph

197 Commits

Author SHA1 Message Date
Mythreya 4fa165ec1a Add support for scratch reporting (#523)
* Add ToolsApiTable

Add ToolsApiTable wrapping for
scratch memory tracking

* Add initial support for scratch memory tracking

Buffering is implemented

* cmake formatting (cmake-format) (#525)

Co-authored-by: MythreyaK <MythreyaK@users.noreply.github.com>

* source formatting (clang-format v11) (#524)

Co-authored-by: MythreyaK <MythreyaK@users.noreply.github.com>

* Add callback tracing for scratch

Fixed the error where scratch tracking init was called regardless of whether any client requested it

* Apply suggestions from code review

Co-authored-by: Jonathan R. Madsen <jrmadsen@users.noreply.github.com>

* Fix tools api copy/update

Tables were saved/updated incorrectly in the previous
commit. Also adds passing of user data through the callback

* Fix OpKind sequence for scratch tracking

Previously scratch was using OpKind from rocprofiler-sdk, but
templates were instantiated using API ID. These differ by 1

* Integration tests for scratch reporting

Added buffer and callback integration tests for scratch reporting

* source formatting (clang-format v11) (#550)

Co-authored-by: MythreyaK <26112391+MythreyaK@users.noreply.github.com>

* cmake formatting (cmake-format) (#551)

Co-authored-by: MythreyaK <26112391+MythreyaK@users.noreply.github.com>

* python formatting (black) (#549)

Co-authored-by: MythreyaK <26112391+MythreyaK@users.noreply.github.com>

* CI fixes

* source formatting (clang-format v11) (#554)

Co-authored-by: MythreyaK <26112391+MythreyaK@users.noreply.github.com>

* Update api

Rebase on main and updates based on PR feedback

* Update scratch reporting and address PR comments

- Added agent id to buffer records
- Updated `test_internal_correlation_ids`; it is almost identical to
  the one in async-copy
- Updated scratch test to check for agent id
- Updated queue id serialization in callback records (prints
  handle as nested key)
- Remove `marker_api_traces` from scratch `test_internal_correlation_ids`
  validation test
- Rename `amd_tools_api` to `scratch_memory`
- Added doxygen comments
- Remove scratch callback from `tool.cpp`
- Replace assert with `LOG_IF` in `scratch_memory.cpp`

* Update tools table

Changed to match up with changes to hsa tables in main branch

* Rework scratch memory structure

* Update tests

- Added suggestions from PR review, and updated tests accordingly

* Misc cleanup

* Update scratch test

As of Apr 4th, `hsa_amd_agent_set_async_scratch_limit` is disabled.

Note,
> This API: `hsa_amd_agent_set_async_scratch_limit` is currently
> disabled. We need some changes in CP firmware to be able to do this
> and these changes are not ready yet.
> With the current code, you will also not get notifications for
> alternate-scratch allocations because this feature has been disabled
> while CP firmware is making additional changes
> We are hoping to have that feature enabled by ROCm-6.3

* Minor update to lib/rocprofiler-sdk/internal_threading.*

- delay destruction of shared_ptrs of the tasks to prevent rare (but possible) data race on the destruction of the shared_ptr

---------

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: MythreyaK <MythreyaK@users.noreply.github.com>
Co-authored-by: Jonathan R. Madsen <jrmadsen@users.noreply.github.com>
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
2024-04-05 20:32:57 -05:00
Ammar ELWazir 5ebcc6b11a Update rerun.yml 2024-04-05 16:47:30 -05:00
Ammar ELWazir ba39f8c9cc Update rerun.yml 2024-04-05 16:36:46 -05:00
Ammar ELWazir 966659eb5c Update rerun.yml 2024-04-05 16:23:35 -05:00
Ammar ELWazir fb48e28112 Update rerun.yml 2024-04-05 15:57:41 -05:00
Ammar ELWazir 417284cd51 Update rerun.yml 2024-04-05 13:30:01 -05:00
Ammar ELWazir 76b27fb2d0 Update rerun.yml (#743) 2024-04-05 12:28:16 -05:00
Ammar ELWazir 176d1552cf Update to Clang-tidy-15 (#742)
* Update continuous_integration.yml

* Update build.sh

* Update continuous_integration.yml

* Update build.sh

* Update continuous_integration.yml
2024-04-05 07:43:17 -05:00
Ammar ELWazir 91307cab11 Update rerun.yml 2024-04-04 21:38:01 -05:00
Ammar ELWazir 791aa0bcda Rerun for pc-sampling runner-set (#741)
* Update rerun.yml

* Update rerun.yml

* Update rerun.yml
2024-04-04 21:00:58 -05:00
Ammar ELWazir eae890e335 Update rerun.yml 2024-04-04 12:21:32 -05:00
Ammar ELWazir ad22340a8c Update rerun.yml 2024-04-04 12:17:36 -05:00
Ammar ELWazir 48e4af1685 Fixing Re-Run (#740)
* Update rerun.yml

* Update rerun.yml

* Update rerun.yml

* Update rerun.yml
2024-04-04 11:27:24 -05:00
Ammar ELWazir 8c6017e7ff Fixing rerun comments (#738)
* Update continuous_integration.yml

* Update continuous_integration.yml

* Update continuous_integration.yml

* Update continuous_integration.yml

* Rerun separated in different yml file

* Update rerun.yml

* Update rerun.yml

* Update rerun.yml
2024-04-04 11:02:14 -05:00
Ammar ELWazir 8c03c8a914 Adding PC Sampling CI (#739)
* Create ci_pc_sampling.yml

* Update continuous_integration.yml

* Update ci_pc_sampling.yml

* Update ci_pc_sampling.yml

* Update continuous_integration.yml
2024-04-04 10:03:08 -05:00
Ammar ELWazir 5bb087f072 Adding useful scripts for formatting and building (#737)
* Adding useful scripts for formatting and building

* Update build.sh

* Update build.sh

* Update continuous_integration.yml
2024-04-04 06:49:17 -05:00
Benjamin Welton e0caae9ebc Add debug printing for write interceptor injected packets (#674)
* Add debug printing for write interceptor injected packets

Adds debug printing for write interceptor injected
packets. All packets that pass through the write
interceptor while enabled will be printed.

Only executes/prints when the environment variable
GLOG_v is set to 2 or higher (otherwise it is a no-op
and the expression is not evaluated).

* source formatting (clang-format v11) (#675)

Co-authored-by: bwelton <1683479+bwelton@users.noreply.github.com>

* Changes on fmt location

---------

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: bwelton <1683479+bwelton@users.noreply.github.com>
2024-04-03 18:14:22 -07:00
Benjamin Welton 41c0ddd72d Convert LOG() -> ROCP_X logging macros. (#695)
* Convert LOG() -> ROCP_X logging macros.

This patch converts the LOG() macro to the ROCP_X logging macros.
There are the following levels of logs.

Logs whose expressions are not evaluated unless the log level is enabled:

ROCP_TRACE - VLOG(2) (enabled by env variable GLOG_v=2)
ROCP_INFO - VLOG(1) (enabled by env variable GLOG_v=1)

Logs whose expressions are always evaluated:

ROCP_WARNING - LOG(WARNING)
ROCP_ERROR - LOG(ERROR)
ROCP_FATAL - LOG(FATAL)
ROCP_DFATAL - DLOG(FATAL) (only fatal in debug mode)

* source formatting (clang-format v11) (#696)

Co-authored-by: bwelton <1683479+bwelton@users.noreply.github.com>

* Minor fix

* Fixes for VLOG before main

* fix vmodule

* source formatting (clang-format v11) (#718)

Co-authored-by: bwelton <1683479+bwelton@users.noreply.github.com>

* memory leak fix

* Vlog change

---------

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: bwelton <1683479+bwelton@users.noreply.github.com>
2024-04-02 17:15:30 -07:00
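The split described in 41c0ddd72d, between logs whose expressions are evaluated only when the level is enabled and logs whose expressions are always evaluated, can be illustrated with a small Python analogue. This is only a sketch: the ROCP_X macros are C++, and `lazy_log`, `eager_log`, `expensive`, and `VERBOSITY` are hypothetical names that belong to neither rocprofiler-sdk nor glog.

```python
# Hypothetical analogue of lazily vs. eagerly evaluated log arguments.
# Nothing here is rocprofiler-sdk or glog API.
VERBOSITY = 1  # stands in for the GLOG_v environment variable

calls = []  # records which message expressions were actually evaluated


def expensive(tag):
    """Build a message and record that the expression was evaluated."""
    calls.append(tag)
    return f"payload({tag})"


def lazy_log(level, make_message):
    """Call make_message() only if the verbosity level is enabled."""
    if VERBOSITY >= level:
        return make_message()
    return None


def eager_log(message):
    """The caller has already evaluated the message expression."""
    return message


trace = lazy_log(2, lambda: expensive("trace"))  # level 2 disabled: skipped
info = lazy_log(1, lambda: expensive("info"))    # level 1 enabled: evaluated
warning = eager_log(expensive("warning"))        # always evaluated
```

Wrapping the message in a callable defers the work to the logger, which is roughly what a conditional logging macro achieves at compile time in C++.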
Gopesh Bhardwaj 5e4dd502d9 README update (#730)
* README update

* Addressing review comments
2024-04-02 14:11:05 -05:00
Ammar ELWazir 872aa1b1d2 Update formatting.yml 2024-04-02 11:28:44 -05:00
Benjamin Welton 1e612a5e52 Wait for all memory copies to complete before allowing destruction (#725)
* Wait for all mem copies to complete before destroying.

* Update source/lib/rocprofiler-sdk/hsa/async_copy.cpp

Co-authored-by: Ammar ELWazir <ammar.elwazir@amd.com>

* Update async_copy.cpp

---------

Co-authored-by: Ammar ELWazir <ammar.elwazir@amd.com>
2024-04-02 08:22:37 -05:00
Jonathan R. Madsen 939e23e9d1 Stop all client contexts prior to finalization (#721)
* Stop all client contexts prior to finalization

* Update lib/common/container/static_vector.hpp

- improve emplace_back for non-{move,copy}-assignable object

* Update samples/intercept_table/client.cpp

- improve robustness against static object destruction

* Update lib/rocprofiler-sdk/context/context.cpp

- change storage of registered context array
  - stable_vector of optional contexts
  - common::static_object wrapper around stable_vector

* Update samples/intercept_table/client.cpp

- use variable template for underlying function pointer
2024-04-02 03:05:11 -05:00
Gopesh Bhardwaj e3c7eed7c0 SWDEV-451569: bug in tracing options (#728) 2024-04-02 03:03:02 -05:00
dependabot[bot] 9d6809d0b6 Bump actions/configure-pages from 4 to 5 (#706)
Bumps [actions/configure-pages](https://github.com/actions/configure-pages) from 4 to 5.
- [Release notes](https://github.com/actions/configure-pages/releases)
- [Commits](https://github.com/actions/configure-pages/compare/v4...v5)

---
updated-dependencies:
- dependency-name: actions/configure-pages
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-04-02 02:34:32 -05:00
Ammar ELWazir ddbcf34da5 Formatting issue fix (#726)
* Update formatting.yml

* Update formatting.yml

* Update client.cpp

* Update formatting.yml

* Update samples/api_buffered_tracing/client.cpp

* Update client.cpp
2024-04-02 02:14:59 -05:00
Ammar ELWazir c45573f559 Update validate.py (#727) 2024-04-02 01:59:33 -05:00
Ammar ELWazir 2905fb5e95 Update run-ci.py (#641)
* Temp: Fixing node id

* source formatting (clang-format v11) (#709)

Co-authored-by: ammarwa <3832908+ammarwa@users.noreply.github.com>

* Using logical node id

* Update agent.cpp

* Update agent.cpp

* Python formatting

* Update run-ci.py

* Update run-ci.py

* Update continuous_integration.yml

* Update continuous_integration.yml

running directly using the prepared runner container

* Update continuous_integration.yml

* Update continuous_integration.yml

* Update continuous_integration.yml

* Update continuous_integration.yml

* Update continuous_integration.yml

* Update continuous_integration.yml

* Update continuous_integration.yml

* Update continuous_integration.yml

* Update continuous_integration.yml

* Update continuous_integration.yml

* Update continuous_integration.yml

* Update continuous_integration.yml

* Update continuous_integration.yml

* Update continuous_integration.yml

* Update continuous_integration.yml

* Update continuous_integration.yml

* Update continuous_integration.yml

* Update continuous_integration.yml

* Update continuous_integration.yml

* Update continuous_integration.yml

* Update continuous_integration.yml

* Update continuous_integration.yml

* Update continuous_integration.yml

* Update continuous_integration.yml

* Update continuous_integration.yml

* Update continuous_integration.yml

* Update continuous_integration.yml

* Update continuous_integration.yml

* Update continuous_integration.yml

* Update continuous_integration.yml

* Update continuous_integration.yml

* Update continuous_integration.yml

* Update continuous_integration.yml

* Update continuous_integration.yml

* Update continuous_integration.yml

* Update run-ci.py

* Clean up

* Fixing install paths

* Update continuous_integration.yml

* Update continuous_integration.yml

* Update continuous_integration.yml

* Update continuous_integration.yml

* Update continuous_integration.yml

* Fixing GPU Agents Test Validation

* python formatting (black) (#712)

Co-authored-by: ammarwa <3832908+ammarwa@users.noreply.github.com>

* Fixing the issue with rocclr detected kernels __amd_rocclr_.*

* python formatting (black) (#713)

Co-authored-by: ammarwa <3832908+ammarwa@users.noreply.github.com>

* Fixing the issue with rocclr detected kernels __amd_rocclr_.*

* Fixing static number of async copies and using hsa_api instead for validation

* python formatting (black) (#714)

Co-authored-by: ammarwa <3832908+ammarwa@users.noreply.github.com>

* Increasing the time limit for waiting on active signals

* Update continuous_integration.yml

* Update async_copy.cpp

* Update CMakeLists.txt

* changing node id to logical node id in rocprofv3

* Update tool.cpp

* testing async mem copy signal decrement

* Update logging.cpp

* Update validate.py

---------

Co-authored-by: Ammar ELWazir <aelwazir@rocprofiler1.amd.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: ammarwa <3832908+ammarwa@users.noreply.github.com>
Co-authored-by: Ammar ELWazir <aelwazir@rocprofiler2.amd.com>
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
2024-04-02 01:39:24 -05:00
Ammar ELWazir 62625d0aa1 Use logical_node_id for mapping rocprofiler agents to HSA agents (#708)
* Temp: Fixing node id

* source formatting (clang-format v11) (#709)

Co-authored-by: ammarwa <3832908+ammarwa@users.noreply.github.com>

* Using logical node id

* Update agent.cpp

* Update agent.cpp

* Python formatting

---------

Co-authored-by: Ammar ELWazir <aelwazir@rocprofiler1.amd.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: ammarwa <3832908+ammarwa@users.noreply.github.com>
Co-authored-by: Ammar ELWazir <aelwazir@rocprofiler2.amd.com>
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
2024-04-02 01:38:18 -05:00
Benjamin Welton 4200faf813 Revert "Add additional debug info and do iteration on per-agent basis" (#723)
This reverts commit 6fc239f6eb.
2024-04-01 22:27:27 -07:00
Benjamin Welton 6fc239f6eb Add additional debug info and do iteration on per-agent basis 2024-04-01 18:45:04 -07:00
Jonathan R. Madsen 092c428b78 Update internal threading (#720)
- update lib/rocprofiler-sdk/internal_threading.*
- use PTL::TaskManager instead of PTL::TaskGroup
  - easier to handle for our needs
  - eliminate data race in rocprofiler_flush_buffer
  - combine memory management of TaskManager and ThreadPool
2024-04-01 20:31:54 -05:00
Benjamin Welton 001f9baa04 Add debug printout for data in CC validate 2024-04-01 13:02:50 -07:00
Ammar ELWazir e6237637eb Formatting as suggestion in the same branch (#711)
* Update formatting.yml

* Update agent.cpp

* Update agent.cpp
2024-04-01 13:11:09 -05:00
Ammar ELWazir 7c3e4593bf Update formatting.yml 2024-03-30 07:21:48 -05:00
Ammar ELWazir 8534dfa1f3 Update docs.yml 2024-03-30 07:21:24 -05:00
Ammar ELWazir 6077ddf9ca Update continuous_integration.yml 2024-03-29 21:58:18 -05:00
Ammar ELWazir bc97ae0370 Update continuous_integration.yml 2024-03-29 20:25:44 -05:00
Ammar ELWazir 332374b3fe Adding public sync (#703)
* Create sync-mainline.yaml

* Create sync-staging.yaml
2024-03-29 18:09:00 -05:00
Ammar ELWazir 2a1e9b3f11 Update continuous_integration.yml 2024-03-29 17:16:14 -05:00
Gopesh Bhardwaj ecc79b1fa3 SWDEV-452077 Fixing MI300 list counters and metrics issue (#701) 2024-03-29 14:44:38 -07:00
Benjamin Welton f0924c6aa7 Make dimension error message print the counter name (#658)
* temp

* source formatting (clang-format v11) (#659)

Co-authored-by: bwelton <1683479+bwelton@users.noreply.github.com>

---------

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: bwelton <1683479+bwelton@users.noreply.github.com>
2024-03-26 17:19:04 -05:00
Gopesh Bhardwaj f633278720 counter collection header update in docs (#690) 2024-03-26 17:18:39 -05:00
Jonathan R. Madsen bc9f86ec62 Update HSA copy table (#687)
- two copies of HSA table: internal and tracing
- internal is used to invoke HSA function without any possibility of triggering tracing, etc.
2024-03-26 17:11:34 -05:00
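The idea in bc9f86ec62 of keeping two copies of a function table, one for internal use and one for tracing, might look like the following in miniature. This is a sketch under stated assumptions: the table is modeled as a Python dict, and `queue_create`, `with_tracing`, `internal_table`, and `tracing_table` are illustrative names, not the HSA or rocprofiler-sdk API.

```python
# Hypothetical sketch of an internal table and a tracing table that share
# the same underlying functions. All names are illustrative.
traced_calls = []  # names of functions invoked through the tracing table


def queue_create(size):
    """Stand-in for a runtime function stored in the table."""
    return f"queue[{size}]"


def with_tracing(name, func):
    """Wrap func so that every call is recorded before being forwarded."""
    def wrapper(*args):
        traced_calls.append(name)
        return func(*args)
    return wrapper


original_table = {"queue_create": queue_create}

# Internal copy: plain functions, never instrumented.
internal_table = dict(original_table)

# Tracing copy: every entry is wrapped so that calls are reported.
tracing_table = {
    name: with_tracing(name, func) for name, func in original_table.items()
}

user_result = tracing_table["queue_create"](64)      # recorded
internal_result = internal_table["queue_create"](8)  # not recorded
```

Because the internal copy never holds a wrapper, a call made through it cannot trigger tracing, which is the property the commit message describes.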
Jonathan R. Madsen 1addfed9f6 Fix agent node id + randomize offset id (#625)
* Fix agent node id + randomize offset id

- fixes the node_id value
- randomizes a constant offset for the id.handle values
- switch to using node ids in rocprofiler-sdk-tool library
- update tests related to agents

* Logical node id

- sequential node id values from 0 to (N-1) where N is the number of agents
2024-03-21 20:04:21 -05:00
Jonathan R. Madsen 2f9b1767e9 Handle hsa_queue_destroy after finalization (#679)
* Handle hsa_queue_destroy after finalization

- fixes issue where hsa_queue_destroy(...) is invoked after rocprofiler-sdk has finalized
- hsa::get_queue_controller() returns pointer
- if queue controller is a null pointer, skip invoking QueueController::destroy_queue

* Update HIP/HSA/marker update_table logging

* Update rocprofv3 tests

- remove HSA_TOOLS_LIB env variable
- remove setting ROCPROFILER_LOG_LEVEL env variable
- add timeouts to tests which are missing them

* Disable thread sanitizer deadlock detection

* Update CI workflow

- rename vega20-ubuntu job to core-ci
- enable navi32 in core-ci and sanitizers

* Update run-ci.py

- set gcovr html medium and high threshold

* Update lib/rocprofiler-sdk/hsa/queue_controller.cpp

- remove this capture from enable/disable serialization

* Update lib/rocprofiler-sdk/hsa/{hsa_barrier,profile_serializer}.*

- hsa_barrier::set_barrier accepts const-ref to queue map
- profile_serializer::enable and profile_serializer::disable accept const-ref to queue map

* Logging for HIP/HSA/marker/profile_serializer

* Logging for HIP/HSA/marker/queue_controller

* Improve test_retired_correlation_ids asserts

* Fix tests/counter-collection/validate.py

- scale expected SQ_WAVES counter value based on warp size of GPU

* Tweak github comment for code coverage

* Remove gcovr html high/medium threshold args

* Fix tests/counter-collection/validate.py

- round before casting to int in test_counter_values

* operator bool for profile_serializer

- only wait on CV if profile_serializer is used

* Logging updates (profile_serializer + code_object)

* Update counter-collection validate.py

* QueueController does not wait on CV if finalizing/finalized

* Update CI workflow

- remove navi32 from core job

* Improve HIP/HSA/marker tracing get_functor/functor

- remove lambda wrapper around functor

* Update lib/rocprofiler-sdk/hsa/queue_controller.cpp

- do not acquire cvmutex lock during finalization

* Update lib/rocprofiler-sdk/hsa/hsa_barrier.*

- move ctor and dtor to implementation
- skip signal store screlease and destroy if already finalized

* Update CI workflow

- remove navi32 runners

* bwelton fixes for hangs

* CMake improvements + simplified demangle

- remove amd-comgr from common target (and thus removed from roctx DT_NEEDED)

---------

Co-authored-by: Benjamin Welton <bewelton@amd.com>
2024-03-21 17:52:15 -05:00
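Two of the `tests/counter-collection/validate.py` fixes listed in 2f9b1767e9 (scaling the expected SQ_WAVES value by the GPU's warp size, and rounding before casting to int) can be sketched as follows. The model used here, one wave per `warp_size` threads rounded up, is an assumption made for illustration, and `expected_waves` is a hypothetical helper rather than the function in the repository.

```python
import math


def expected_waves(total_threads, warp_size):
    """Assumed model: one wave per warp_size threads, rounded up."""
    return math.ceil(total_threads / warp_size)


# Under this model the same launch produces twice as many waves on a
# 32-wide GPU as on a 64-wide one, so the expectation must be scaled.
waves_64 = expected_waves(4096, 64)
waves_32 = expected_waves(4096, 32)

# Converting a float counter value: int() truncates toward zero, so a value
# just below an integer loses one unless it is rounded first.
measured = 63.99999999999
truncated = int(measured)
rounded = int(round(measured))
```

With these inputs `truncated` is 63 while `rounded` is 64, which is why rounding before the cast matters when comparing against an integer expectation.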
Vladimir Indic 78939e705a PCS parser is aware of external correlation IDs (#639)
* PCS parser is aware of external correlation IDs

* source formatting (clang-format v11) (#640)

Co-authored-by: vlaindic <139573562+vlaindic@users.noreply.github.com>

---------

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2024-03-15 14:04:06 -05:00
Ammar ELWazir 1fa0b51263 Update continuous_integration.yml (#638)
* Update continuous_integration.yml

Adding a more stable docker image to use

* Fixing jobs dependencies

* Fixing job image url
2024-03-15 11:18:55 -05:00
Jonathan R. Madsen 0fdb21c050 Context Updates (#624)
* Improve error checks related to context create/start/stop/is_valid

* Bump version to 0.2.1

* Track number of kernels associated with correlation id

- add atomic kernel counter variable to context::correlation_id

* Update lib/rocprofiler-sdk/hsa/queue.cpp

- apply the +/- kernel count
2024-03-14 04:40:58 -05:00
Jonathan R. Madsen 7ab1a8015f Fix tracing context domain logic for operations (#621)
* Fix tracing context domain logic for operations

- logic error: domain enabled (all operations are implicitly enabled) + domain enabled for a subset of operations resulted in only the explicitly enabled operations being treated as enabled
- domain_context: split single bitset for operations in all domains into array of bitsets for each domain

* Update lib/common/mpl.hpp

- assert_false for static_asserts in if constexpr expressions

* Update lib/rocprofiler-sdk/tests/contexts.cpp

- Tests for validating logic regarding domain and operations for callback and buffer tracing
2024-03-14 01:25:43 -05:00
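The per-domain bitset layout mentioned in 7ab1a8015f can be sketched as below. This is a minimal illustration, not the library's implementation: `DomainContext` and its methods are hypothetical, and it assumes the intended behavior is that enabling a whole domain is not narrowed by also enabling a subset of its operations.

```python
class DomainContext:
    """Hypothetical operation tracking with one bitset per domain."""

    def __init__(self, ops_per_domain):
        # ops_per_domain maps a domain name to its number of operations.
        self.num_ops = dict(ops_per_domain)
        self.bits = {domain: 0 for domain in ops_per_domain}

    def enable(self, domain, operations=None):
        """Enable every operation in the domain, or only the listed ones."""
        if operations is None:
            self.bits[domain] = (1 << self.num_ops[domain]) - 1
        else:
            for op in operations:
                self.bits[domain] |= 1 << op

    def is_enabled(self, domain, op):
        return bool((self.bits[domain] >> op) & 1)


ctx = DomainContext({"hsa": 4, "hip": 3})
ctx.enable("hsa")        # whole domain: all operations implicitly enabled
ctx.enable("hsa", [1])   # subset on top: must not narrow the domain
ctx.enable("hip", [0])   # subset only: just operation 0
```

Keeping a separate bitset per domain means a subset enable only ORs bits into that domain's set, so it cannot clear operations that a whole-domain enable already switched on.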
Ammar ELWazir 2bfce8b86d Temporarily move CI to hip staging (#615)
* Update continuous_integration.yml

* Update ostream.hpp
2024-03-13 12:34:29 -05:00