miscellaneous doc updates (#86)
* miscellaneous doc updates
* updated deprecation message
* Updated memory allocation tracking documentation
* Update comparing-with-legacy-tools.rst
  Co-authored-by: Elwazir, Ammar <Ammar.Elwazir@amd.com>
* Update comparing-with-legacy-tools.rst
  Co-authored-by: Elwazir, Ammar <Ammar.Elwazir@amd.com>
* Update comparing-with-legacy-tools.rst

---------

Co-authored-by: Ian Trowbridge <ian.trowbridge@amd.com>
Co-authored-by: Elwazir, Ammar <Ammar.Elwazir@amd.com>
@@ -151,6 +151,7 @@ Full documentation for ROCprofiler-SDK is available at [rocm.docs.amd.com/projec
- Added reduce operation for counter expression with respect to dimension.
- `--collection-period` feature added in rocprofv3, to enable filtering using time.
- `--collection-period-unit` feature added in rocprofv3, to allow the user to control the time units used by the collection period option.
- Added deprecation notice for rocprofiler(v1) and rocprofiler(v2).

### Changed

@@ -1,7 +1,7 @@
# ROCprofiler-SDK: Application Profiling, Tracing, and Performance Analysis

> [!NOTE]
rocprofiler-sdk is currently considered a beta version and is subject to change in future releases.

> [!IMPORTANT]
We are phasing out development and support for ``roctracer/rocprofiler/rocprof/rocprofv2`` in favor of ``rocprofiler-sdk/rocprofv3`` in upcoming ROCm releases. Going forward, only critical defect fixes will be addressed for older versions of profiling tools and libraries. We encourage all users to upgrade to the latest versions, the rocprofiler-sdk library and the rocprofv3 tool, to ensure continued support and access to new features.

## Overview

@@ -19,6 +19,8 @@ ROCProfiler-SDK is AMD’s new and improved tooling infrastructure, providing a
- HSA operation tracing
- Marker(ROCTx) tracing
- PC Sampling (Beta)
- RCCL tracing
- Kokkos tracing

## Tool Support

@@ -86,7 +86,14 @@ ROCprofiler-SDK introduces a new command-line tool, `rocprofv3`, which is a more
- Part of HIP and HSA Traces
- `--memory-copy-trace`
- Provides granularity for memory move operations
-
-
* - Basic tracing options
- Memory allocation Trace
- *Not Available*
- *Not Available*
- `--memory-allocation-trace`
- New option for collecting Memory Allocation Traces. Displays starting address, allocation size, and agent where allocation occurred.
-
* - Basic tracing options
- Kernel Trace
- `--kernel-trace`
@@ -136,6 +143,20 @@ ROCprofiler-SDK introduces a new command-line tool, `rocprofv3`, which is a more
- `--hsa-finalizer-trace`
- New option for collecting HSA API Traces (Finalizer-extension API), that is, HSA functions prefixed with `hsa_ext_program_` (for example, `hsa_ext_program_create`)
-
* - Advanced tracing options
- Kokkos trace
- *Not Available*
- *Not Available*
- `--kokkos-trace`
- New option to enable built-in Kokkos Tools support (implies `--marker-trace` and `--kernel-rename`)
-
* - Advanced tracing options
- RCCL trace
- *Not Available*
- *Not Available*
- `--rccl-trace`
- For collecting RCCL (ROCm Communication Collectives Library, pronounced 'Rickle') traces
-
* - Aggregate tracing options
- Sys Trace
- `--sys-trace` [hip-trace|hsa-trace|roctx-trace|kernel-trace]
@@ -300,6 +321,13 @@ ROCprofiler-SDK introduces a new command-line tool, `rocprofv3`, which is a more
| # YAML and JSON formats are more readable and easy to maintain.
| # Allows flexibility to add more features for the tool input
-
* - I/O options
- Command-line Counter Collection
- *Not Available*
- *Not Available*
- `--pmc`
- New option to collect performance counters from the command line. Separate multiple counters with commas or spaces.
-
* - I/O options
- Providing Custom metrics file
- `-m` <metric file>
@@ -318,9 +346,9 @@ ROCprofiler-SDK introduces a new command-line tool, `rocprofv3`, which is a more
- Trace Period
- `--trace-period`
- `-tp | --trace-period`
- *Not available*
- Not yet in rocprofv3
-
- `-p |--collection-period`,`--collection-period-unit`
- Users can specify multiple configurations, each defined by a triplet in the format `start_delay:collection_time:repeat`. The time unit for these values can be changed with `--collection-period-unit`.
-
* - Trace Control options
- Trace start
- `--trace-start <on|off>`
@@ -341,6 +369,13 @@ ROCprofiler-SDK introduces a new command-line tool, `rocprofv3`, which is a more
- *Not available*
- *Not available*
- Not yet in rocprofv3
-
* - PC Sampling options
- PC Sampling
- *Not available*
- *Not available*
- `--pc-sampling-beta-enabled`
- Enables PC sampling support (beta version).
-
* - Legacy options
- Timestamp On/Off

@@ -520,6 +520,17 @@ For the description of the fields in the output file, see :ref:`output-file-fiel
Memory allocation trace
+++++++++++++++++++++++++

Memory allocation traces track the HSA functions ``hsa_memory_allocate``,
``hsa_amd_memory_pool_allocate``, and ``hsa_amd_vmem_handle_create``. The function
``hipMalloc`` calls these underlying HSA functions, allowing memory allocations to be
tracked.

In addition to the HSA memory allocation functions listed above, the corresponding HSA
free functions ``hsa_memory_free``, ``hsa_amd_memory_pool_free``, and ``hsa_amd_vmem_handle_release``
are also tracked. Unlike the allocation functions, however, only the address of the freed memory
is recorded. As such, the agent id and size of the freed memory are recorded as 0 in the CSV and
JSON outputs.
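
Because free records carry only an address, a post-processing step can recover the size and agent of each freed block by matching it against the earlier allocation at the same address. The following is a minimal sketch of that idea, not part of rocprofv3. The ``MemoryRecord`` type, its field names, and the ``ALLOCATE``/``FREE`` labels are assumptions made for illustration and may not match the actual CSV or JSON column names.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass
class MemoryRecord:
    """One row of a memory allocation trace (hypothetical field names)."""
    operation: str  # "ALLOCATE" or "FREE" (assumed labels)
    address: int    # starting address of the allocation
    size: int       # bytes; recorded as 0 for frees
    agent_id: int   # recorded as 0 for frees


def resolve_frees(records: Iterable[MemoryRecord]) -> List[MemoryRecord]:
    """Copy size and agent onto each free from the allocation at the same address."""
    live: Dict[int, MemoryRecord] = {}  # address -> allocation not yet freed
    resolved: List[MemoryRecord] = []
    for rec in records:
        if rec.operation == "ALLOCATE":
            live[rec.address] = rec
            resolved.append(rec)
            continue
        origin = live.pop(rec.address, None)
        if origin is None:
            # The allocation happened outside the traced window; keep the zeros.
            resolved.append(rec)
        else:
            resolved.append(
                MemoryRecord("FREE", rec.address, origin.size, origin.agent_id)
            )
    return resolved


# Example with made-up values: one allocation followed by its free.
trace = [
    MemoryRecord("ALLOCATE", 0x1000, 4096, 1),
    MemoryRecord("FREE", 0x1000, 0, 0),
]
for rec in resolve_frees(trace):
    print(rec.operation, hex(rec.address), rec.size, rec.agent_id)
```

Records must be processed in timestamp order for the matching to be meaningful, since an address can be reused by a later allocation after it is freed.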

To trace memory allocations during the application run, use:

.. code-block:: shell
