[rocprofiler-compute] Analysis Database Schema Improvements (v1.2.0) (#2526)
* Analysis database v1.2.0
* `pc_sampling` and `roofline_data` tables should relate to `kernel` table instead of `workload` table
* Remove `kernel_name` fields in `pc_sampling` and `roofline_data` tables
* Add kernel existence check for roofline data to prevent KeyError (#2536)
* Initial plan
* Add kernel existence check for roofline data to prevent KeyError

Co-authored-by: vedithal-amd <191402304+vedithal-amd@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: vedithal-amd <191402304+vedithal-amd@users.noreply.github.com>

* Optimize analysis performance
* Refactor database schema: separate metric definitions from kernels

Reorganize the database ORM to decouple metric definitions from kernel objects. This improves the schema design as follows:

- Rename Metric -> MetricDefinition and Value -> MetricValue for clarity
- Move metric definitions from kernel-level to workload-level, since metric definitions are shared across kernels
- Update relationships: MetricDefinition belongs to Workload, MetricValue references both MetricDefinition and Kernel
- Refactor metric_view to join through the new schema structure
- Update test fixtures to use renamed table and class names
- Update documentation with new example output using nbody workload
- Regenerate database schema and views diagrams

* Add min and max aggregation in kernel_view
* Add primary key id from tables into the view

---------

Co-authored-by: Copilot <198982749+Copilot@users.noreply.github.com>
Co-authored-by: vedithal-amd <191402304+vedithal-amd@users.noreply.github.com>
@@ -2,7 +2,7 @@ default_stages: [pre-commit]
 fail_fast: true
 repos:
   - repo: https://github.com/pre-commit/pre-commit-hooks
-    rev: v5.0.0
+    rev: v6.0.0
     hooks:
       - id: check-yaml
       - id: end-of-file-fixer
@@ -12,7 +12,7 @@ repos:
   - repo: https://github.com/astral-sh/ruff-pre-commit
     # Ruff version. Check https://github.com/astral-sh/ruff-pre-commit#version-compatibility
     # for the latest ruff version supported by the hook.
-    rev: v0.12.12
+    rev: v0.14.11
     hooks:
       - id: ruff-check
         args: [--fix]
Binary file not shown.
Size before: 185 KiB; size after: 254 KiB.
Binary file not shown.
Size before: 34 KiB; size after: 240 KiB.
@@ -15,7 +15,7 @@ This section provides an overview of ROCm Compute Profiler's CLI analysis featur
|
|||||||
* :ref:`Metric customization <cli-analysis-options>`: Isolate a subset of built-in metrics or build your own profiling configuration.
|
* :ref:`Metric customization <cli-analysis-options>`: Isolate a subset of built-in metrics or build your own profiling configuration.
|
||||||
|
|
||||||
* :ref:`Filtering <cli-analysis-options>`: Hone in on a particular kernel, GPU ID, or dispatch ID via post-process filtering.
|
* :ref:`Filtering <cli-analysis-options>`: Hone in on a particular kernel, GPU ID, or dispatch ID via post-process filtering.
|
||||||
|
|
||||||
* :ref:`Per-kernel roofline analysis <per-kernel-roofline>`: Detailed arithmetic intensity and performance analysis for individual kernels.
|
* :ref:`Per-kernel roofline analysis <per-kernel-roofline>`: Detailed arithmetic intensity and performance analysis for individual kernels.
|
||||||
|
|
||||||
Run ``rocprof-compute analyze -h`` for more details.
|
Run ``rocprof-compute analyze -h`` for more details.
|
||||||
@@ -534,36 +534,46 @@ Analysis database example
|
|||||||
|
|
||||||
.. code-block:: shell-session
|
.. code-block:: shell-session
|
||||||
|
|
||||||
$ rocprof-compute analyze --verbose --db test -p workloads/vmem/MI300X_A1 -p workloads/vmem1/MI300X_A1
|
$ rocprof-compute analyze --verbose --output-name test --output-format db -p workloads/nbody/MI300X_A1 -p workloads/nbody1/MI300X_A1
|
||||||
DEBUG Execution mode = analyze
|
DEBUG Execution mode = analyze
|
||||||
|
|
||||||
__ _
|
__ _
|
||||||
_ __ ___ ___ _ __ _ __ ___ / _| ___ ___ _ __ ___ _ __ _ _| |_ ___
|
_ __ ___ ___ _ __ _ __ ___ / _| ___ ___ _ __ ___ _ __ _ _| |_ ___
|
||||||
| '__/ _ \ / __| '_ \| '__/ _ \| |_ _____ / __/ _ \| '_ ` _ \| '_ \| | | | __/ _ \
|
| '__/ _ \ / __| '_ \| '__/ _ \| |_ _____ / __/ _ \| '_ ` _ \| '_ \| | | | __/ _ \
|
||||||
| | | (_) | (__| |_) | | | (_) | _|_____| (_| (_) | | | | | | |_) | |_| | || __/
|
| | | (_) | (__| |_) | | | (_) | _|_____| (_| (_) | | | | | | |_) | |_| | || __/
|
||||||
|_| \___/ \___| .__/|_| \___/|_| \___\___/|_| |_| |_| .__/ \__,_|\__\___|
|
|_| \___/ \___| .__/|_| \___/|_| \___\___/|_| |_| |_| .__/ \__,_|\__\___|
|
||||||
|_| |_|
|
|_| |_|
|
||||||
|
|
||||||
INFO Analysis mode = db
|
INFO Analysis mode = db
|
||||||
DEBUG [omnisoc init]
|
INFO ed45b0b189
|
||||||
DEBUG [omnisoc init]
|
DEBUG [omnisoc init]
|
||||||
DEBUG [analysis] prepping to do some analysis
|
INFO ed45b0b189
|
||||||
INFO [analysis] deriving rocprofiler-compute metrics...
|
DEBUG [omnisoc init]
|
||||||
WARNING Roofline ceilings not found for /app/projects/rocprofiler-compute/workloads/vmem/MI300X_A1.
|
DEBUG [analysis] prepping to do some analysis
|
||||||
WARNING Roofline ceilings not found for /app/projects/rocprofiler-compute/workloads/vmem1/MI300X_A1.
|
INFO [analysis] deriving rocprofiler-compute metrics...
|
||||||
WARNING PC sampling data not found for /app/projects/rocprofiler-compute/workloads/vmem/MI300X_A1.
|
DEBUG Collected roofline ceilings
|
||||||
WARNING PC sampling data not found for /app/projects/rocprofiler-compute/workloads/vmem1/MI300X_A1.
|
WARNING PC sampling data not found for /app/projects/rocprofiler-compute/workloads/nbody/MI300X_A1.
|
||||||
DEBUG Collected dispatch data
|
WARNING PC sampling data not found for /app/projects/rocprofiler-compute/workloads/nbody1/MI300X_A1.
|
||||||
DEBUG Applied analysis mode filters
|
DEBUG Collected dispatch data
|
||||||
DEBUG Calculated dispatch data
|
DEBUG Applied analysis mode filters
|
||||||
DEBUG Collected metrics data
|
DEBUG Calculated dispatch data
|
||||||
WARNING Failed to evaluate expression for 3.1.39 - Value: to_round((to_avg(
|
DEBUG Collected metrics data
|
||||||
|
WARNING Failed to evaluate expression for 3.1.39 - Value: to_round((to_avg(
|
||||||
(pmc_df.get("pmc_perf_ACCUM") / pmc_df.get("SQC_ICACHE_REQ")).where((pmc_df.get("SQC_ICACHE_REQ") != 0), None)) * 100), 0) - unsupported operand type(s) for /: 'NoneType' and 'float'
|
(pmc_df.get("pmc_perf_ACCUM") / pmc_df.get("SQC_ICACHE_REQ")).where((pmc_df.get("SQC_ICACHE_REQ") != 0), None)) * 100), 0) - unsupported operand type(s) for /: 'NoneType' and 'float'
|
||||||
WARNING Failed to evaluate expression for 3.1.39 - Value: to_round((to_avg(
|
WARNING Failed to evaluate expression for 3.1.39 - Value: to_round((to_avg(
|
||||||
(pmc_df.get("pmc_perf_ACCUM") / pmc_df.get("SQC_ICACHE_REQ")).where((pmc_df.get("SQC_ICACHE_REQ") != 0), None)) * 100), 0) - unsupported operand type(s) for /: 'NoneType' and 'float'
|
(pmc_df.get("pmc_perf_ACCUM") / pmc_df.get("SQC_ICACHE_REQ")).where((pmc_df.get("SQC_ICACHE_REQ") != 0), None)) * 100), 0) - unsupported operand type(s) for /: 'NoneType' and 'float'
|
||||||
DEBUG Calculated metric values
|
DEBUG Calculating expressions for kernel: _ZN12rocrand_impl6system6detail17trampoline_kernelINS_4host33static_block_size_config_providerILj256EEEjLb0ELNS3_11target_archE942EZNS3_25xorwow_generator_templateINS0_13device_systemENS3_23default_config_providerIL16rocrand_rng_type401EEEE4initEvEUlT_DpT0_E_JPN14rocrand_device13xorwow_engineEjjyyEEEvT3_DpT4_
|
||||||
DEBUG Calculated roofline data points
|
DEBUG Calculating expressions for kernel: _ZN12rocrand_impl6system6detail17trampoline_kernelINS_4host23default_config_providerIL16rocrand_rng_type401EEEfLb0ELNS3_11target_archE942EZZNS3_25xorwow_generator_templateINS0_13device_systemES6_E8generateIfNS3_20uniform_distributionIfjEEEE14rocrand_statusPT_mT0_ENKUlSF_E_clISt17integral_constantIbLb0EEEEDaSF_EUlSF_DpT0_E_JPN14rocrand_device13xorwow_engineEjPfmSD_EEEvT3_DpT4_
|
||||||
DEBUG [analysis] generating analysis
|
DEBUG Calculating expressions for kernel: void bodyForce_block<256>(HIP_vector_type<float, 4u> const*, HIP_vector_type<float, 4u>*, float, int)
|
||||||
DEBUG SQLite database initialized with name: test.db
|
DEBUG Calculating expressions for kernel: _ZN12rocrand_impl6system6detail17trampoline_kernelINS_4host33static_block_size_config_providerILj256EEEjLb0ELNS3_11target_archE942EZNS3_25xorwow_generator_templateINS0_13device_systemENS3_23default_config_providerIL16rocrand_rng_type401EEEE4initEvEUlT_DpT0_E_JPN14rocrand_device13xorwow_engineEjjyyEEEvT3_DpT4_
|
||||||
DEBUG Initialized database: test.db
|
DEBUG Calculating expressions for kernel: _ZN12rocrand_impl6system6detail17trampoline_kernelINS_4host23default_config_providerIL16rocrand_rng_type401EEEfLb0ELNS3_11target_archE942EZZNS3_25xorwow_generator_templateINS0_13device_systemES6_E8generateIfNS3_20uniform_distributionIfjEEEE14rocrand_statusPT_mT0_ENKUlSF_E_clISt17integral_constantIbLb0EEEEDaSF_EUlSF_DpT0_E_JPN14rocrand_device13xorwow_engineEjPfmSD_EEEvT3_DpT4_
|
||||||
DEBUG Completed writing database
|
DEBUG Calculating expressions for kernel: void bodyForce_block<256>(HIP_vector_type<float, 4u> const*, HIP_vector_type<float, 4u>*, float, int)
|
||||||
|
DEBUG Calculated metric values
|
||||||
|
DEBUG Calculated roofline data points
|
||||||
|
DEBUG [analysis] generating analysis
|
||||||
|
DEBUG SQLite database initialized with name: test.db
|
||||||
|
DEBUG Initialized database: test.db
|
||||||
|
INFO ed45b0b189
|
||||||
|
INFO ed45b0b189
|
||||||
|
DEBUG Completed writing database
|
||||||
|
WARNING Created file: test.db
|
||||||
|
|||||||
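The example run above ends with `Created file: test.db`. As a minimal sketch, the snippet below uses Python's standard `sqlite3` module to list the tables and views in such a database. The in-memory stand-in, the helper name `list_tables_and_views`, and the view name `kernel_view` are assumptions made for illustration; the real file may name its views differently. For an actual run, replace the in-memory connection with `sqlite3.connect("test.db")`.

```python
import sqlite3


def list_tables_and_views(conn: sqlite3.Connection) -> dict[str, list[str]]:
    """Return the names of all tables and views, sorted alphabetically."""
    rows = conn.execute(
        "SELECT type, name FROM sqlite_master "
        "WHERE type IN ('table', 'view') ORDER BY name"
    ).fetchall()
    result: dict[str, list[str]] = {"table": [], "view": []}
    for obj_type, name in rows:
        result[obj_type].append(name)
    return result


# Hypothetical stand-in for test.db so the sketch runs anywhere.
demo = sqlite3.connect(":memory:")
demo.execute(
    "CREATE TABLE compute_workload (workload_id INTEGER PRIMARY KEY, name TEXT)"
)
demo.execute(
    "CREATE TABLE compute_kernel ("
    "kernel_uuid INTEGER PRIMARY KEY, workload_id INTEGER, kernel_name TEXT)"
)
demo.execute(
    "CREATE VIEW kernel_view AS SELECT kernel_uuid, kernel_name FROM compute_kernel"
)

listing = list_tables_and_views(demo)
print(listing)
```

Querying `sqlite_master` avoids hard-coding table names, which is useful here because this commit renames two of them.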
@@ -101,7 +101,9 @@ class db_analysis(OmniAnalyze_Base):
|
|||||||
Database.init(db_name)
|
Database.init(db_name)
|
||||||
console_debug(f"Initialized database: {db_name}")
|
console_debug(f"Initialized database: {db_name}")
|
||||||
|
|
||||||
|
# Iterate over all workloads
|
||||||
for workload_path in self._runs.keys():
|
for workload_path in self._runs.keys():
|
||||||
|
# Add workload
|
||||||
workload_obj = orm.Workload(
|
workload_obj = orm.Workload(
|
||||||
name=workload_path.split("/")[-2],
|
name=workload_path.split("/")[-2],
|
||||||
sub_name=workload_path.split("/")[-1],
|
sub_name=workload_path.split("/")[-1],
|
||||||
@@ -113,38 +115,9 @@ class db_analysis(OmniAnalyze_Base):
|
|||||||
)
|
)
|
||||||
Database.get_session().add(workload_obj)
|
Database.get_session().add(workload_obj)
|
||||||
|
|
||||||
for pc_sample in self._pc_sampling_data_per_workload.get(
|
# Add kernel
|
||||||
workload_path, pd.DataFrame()
|
|
||||||
).itertuples():
|
|
||||||
Database.get_session().add(
|
|
||||||
orm.PCsampling(
|
|
||||||
source=pc_sample.source_line,
|
|
||||||
instruction=pc_sample.instruction,
|
|
||||||
count=pc_sample.count,
|
|
||||||
kernel_name=pc_sample.kernel_name,
|
|
||||||
offset=pc_sample.offset,
|
|
||||||
count_issue=pc_sample.count_issued,
|
|
||||||
count_stall=pc_sample.count_stalled,
|
|
||||||
stall_reason=pc_sample.stall_reason,
|
|
||||||
workload=workload_obj,
|
|
||||||
)
|
|
||||||
)
|
|
||||||
|
|
||||||
for roofline_data in self._roofline_data_per_workload.get(
|
|
||||||
workload_path, pd.DataFrame()
|
|
||||||
).itertuples():
|
|
||||||
Database.get_session().add(
|
|
||||||
orm.RooflineData(
|
|
||||||
kernel_name=roofline_data.kernel_name,
|
|
||||||
total_flops=roofline_data.total_flops,
|
|
||||||
l1_cache_data=roofline_data.l1_cache_data,
|
|
||||||
l2_cache_data=roofline_data.l2_cache_data,
|
|
||||||
hbm_cache_data=roofline_data.hbm_cache_data,
|
|
||||||
workload=workload_obj,
|
|
||||||
)
|
|
||||||
)
|
|
||||||
|
|
||||||
kernel_objs: dict[str, orm.Kernel] = {}
|
kernel_objs: dict[str, orm.Kernel] = {}
|
||||||
|
|
||||||
for dispatch in self._dispatch_data_per_workload.get(
|
for dispatch in self._dispatch_data_per_workload.get(
|
||||||
workload_path, pd.DataFrame()
|
workload_path, pd.DataFrame()
|
||||||
).itertuples():
|
).itertuples():
|
||||||
@@ -167,44 +140,101 @@ class db_analysis(OmniAnalyze_Base):
|
|||||||
)
|
)
|
||||||
)
|
)
|
||||||
|
|
||||||
# Optimize: Pre-group values by (metric_id, kernel_name) for O(1) lookups
|
# Add roofline data points
|
||||||
values_df = self._values_data_per_workload.get(
|
for roofline_data in self._roofline_data_per_workload.get(
|
||||||
workload_path, pd.DataFrame()
|
|
||||||
)
|
|
||||||
values_grouped = {}
|
|
||||||
if not values_df.empty:
|
|
||||||
for value in values_df.itertuples():
|
|
||||||
key = (value.metric_id, value.kernel_name)
|
|
||||||
if key not in values_grouped:
|
|
||||||
values_grouped[key] = []
|
|
||||||
values_grouped[key].append(value)
|
|
||||||
|
|
||||||
for metric in self._metrics_info_data_per_workload.get(
|
|
||||||
workload_path, pd.DataFrame()
|
workload_path, pd.DataFrame()
|
||||||
).itertuples():
|
).itertuples():
|
||||||
for kernel_name in kernel_objs.keys():
|
if roofline_data.kernel_name not in kernel_objs:
|
||||||
metric_obj = orm.Metric(
|
console_warning(
|
||||||
name=metric.name,
|
f"Kernel {roofline_data.kernel_name} from roofline data "
|
||||||
metric_id=metric.metric_id,
|
"not found in dispatch data. Skipping roofline entry."
|
||||||
description=metric.description,
|
|
||||||
unit=metric.unit,
|
|
||||||
table_name=metric.table_name,
|
|
||||||
sub_table_name=metric.sub_table_name,
|
|
||||||
kernel=kernel_objs[kernel_name],
|
|
||||||
)
|
)
|
||||||
Database.get_session().add(metric_obj)
|
continue
|
||||||
|
Database.get_session().add(
|
||||||
|
orm.RooflineData(
|
||||||
|
total_flops=roofline_data.total_flops,
|
||||||
|
l1_cache_data=roofline_data.l1_cache_data,
|
||||||
|
l2_cache_data=roofline_data.l2_cache_data,
|
||||||
|
hbm_cache_data=roofline_data.hbm_cache_data,
|
||||||
|
kernel=kernel_objs[roofline_data.kernel_name],
|
||||||
|
)
|
||||||
|
)
|
||||||
|
|
||||||
# Direct lookup instead of iterating through all values
|
# Add pc sampling data
|
||||||
key = (metric.metric_id, kernel_name)
|
for pc_sample in self._pc_sampling_data_per_workload.get(
|
||||||
for value in values_grouped.get(key, []):
|
workload_path, pd.DataFrame()
|
||||||
Database.get_session().add(
|
).itertuples():
|
||||||
orm.Value(
|
if pc_sample.kernel_name not in kernel_objs:
|
||||||
metric=metric_obj,
|
console_warning(
|
||||||
value_name=value.value_name,
|
f"Kernel {pc_sample.kernel_name} from PC sampling data "
|
||||||
value=value.value,
|
"not found in dispatch data. Skipping PC sampling entry."
|
||||||
)
|
)
|
||||||
|
continue
|
||||||
|
Database.get_session().add(
|
||||||
|
orm.PCsampling(
|
||||||
|
source=pc_sample.source_line,
|
||||||
|
instruction=pc_sample.instruction,
|
||||||
|
count=pc_sample.count,
|
||||||
|
offset=pc_sample.offset,
|
||||||
|
count_issue=pc_sample.count_issued,
|
||||||
|
count_stall=pc_sample.count_stalled,
|
||||||
|
stall_reason=pc_sample.stall_reason,
|
||||||
|
kernel=kernel_objs[pc_sample.kernel_name],
|
||||||
|
)
|
||||||
|
)
|
||||||
|
|
||||||
|
# Add metrics and values - iterate on values, create metrics as needed
|
||||||
|
metrics_info_dict = {
|
||||||
|
row.metric_id: row
|
||||||
|
for row in self._metrics_info_data_per_workload.get(
|
||||||
|
workload_path, pd.DataFrame()
|
||||||
|
).itertuples()
|
||||||
|
}
|
||||||
|
metric_objs: dict[str, orm.MetricDefinition] = {}
|
||||||
|
|
||||||
|
for value in self._values_data_per_workload.get(
|
||||||
|
workload_path, pd.DataFrame()
|
||||||
|
).itertuples():
|
||||||
|
# Check if kernel exists
|
||||||
|
if value.kernel_name not in kernel_objs:
|
||||||
|
console_warning(
|
||||||
|
f"Kernel {value.kernel_name} from values data "
|
||||||
|
"not found in dispatch data. Skipping metric value."
|
||||||
|
)
|
||||||
|
continue
|
||||||
|
|
||||||
|
# Create or reuse metric object
|
||||||
|
if value.metric_id not in metric_objs:
|
||||||
|
# Fetch metric info
|
||||||
|
if value.metric_id not in metrics_info_dict:
|
||||||
|
console_warning(
|
||||||
|
f"Metric {value.metric_id} from values data "
|
||||||
|
"not found in metrics info. Skipping metric value."
|
||||||
)
|
)
|
||||||
|
continue
|
||||||
|
metric_info = metrics_info_dict[value.metric_id]
|
||||||
|
metric_objs[value.metric_id] = orm.MetricDefinition(
|
||||||
|
name=metric_info.name,
|
||||||
|
metric_id=metric_info.metric_id,
|
||||||
|
description=metric_info.description,
|
||||||
|
unit=metric_info.unit,
|
||||||
|
table_name=metric_info.table_name,
|
||||||
|
sub_table_name=metric_info.sub_table_name,
|
||||||
|
workload=workload_obj,
|
||||||
|
)
|
||||||
|
Database.get_session().add(metric_objs[value.metric_id])
|
||||||
|
|
||||||
|
# Add value
|
||||||
|
Database.get_session().add(
|
||||||
|
orm.MetricValue(
|
||||||
|
metric=metric_objs[value.metric_id],
|
||||||
|
kernel=kernel_objs[value.kernel_name],
|
||||||
|
value_name=value.value_name,
|
||||||
|
value=value.value,
|
||||||
|
)
|
||||||
|
)
|
||||||
|
|
||||||
|
# Add metadata
|
||||||
version = get_version(rocprof_compute_home)
|
version = get_version(rocprof_compute_home)
|
||||||
Database.get_session().add(
|
Database.get_session().add(
|
||||||
orm.Metadata(
|
orm.Metadata(
|
||||||
|
|||||||
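The hunk above iterates over metric values, skips rows whose kernel or metric is unknown, and creates each metric definition only once per workload. The following is a simplified sketch of that create-or-reuse pattern using plain dataclasses instead of the ORM; the function `build_metric_values`, its tuple-based inputs, and the sample kernel names are hypothetical and only illustrate the control flow.

```python
from dataclasses import dataclass


@dataclass
class MetricDefinition:
    metric_id: str
    name: str


@dataclass
class MetricValue:
    metric: MetricDefinition
    kernel_name: str
    value_name: str
    value: float


def build_metric_values(values, metrics_info, known_kernels):
    """Create one MetricDefinition per metric_id and one MetricValue per row.

    values: iterable of (kernel_name, metric_id, value_name, value) tuples.
    metrics_info: dict mapping metric_id -> metric name.
    known_kernels: set of kernel names seen in dispatch data.
    """
    metric_objs = {}
    metric_values = []
    skipped = []
    for kernel_name, metric_id, value_name, value in values:
        # Skip values whose kernel never appeared in dispatch data.
        if kernel_name not in known_kernels:
            skipped.append((kernel_name, metric_id))
            continue
        # Create the definition on first use, then reuse it.
        if metric_id not in metric_objs:
            if metric_id not in metrics_info:
                skipped.append((kernel_name, metric_id))
                continue
            metric_objs[metric_id] = MetricDefinition(
                metric_id, metrics_info[metric_id]
            )
        metric_values.append(
            MetricValue(metric_objs[metric_id], kernel_name, value_name, value)
        )
    return metric_objs, metric_values, skipped


rows = [
    ("kernel_a", "4.1.3", "avg", 10.0),
    ("kernel_b", "4.1.3", "avg", 20.0),
    ("kernel_missing", "4.1.3", "avg", 30.0),  # unknown kernel: skipped
    ("kernel_a", "9.9.9", "avg", 40.0),  # unknown metric: skipped
]
metric_objs, metric_values, skipped = build_metric_values(
    rows, {"4.1.3": "Wavefronts Num"}, {"kernel_a", "kernel_b"}
)
print(len(metric_objs), len(metric_values), skipped)
```

Both kernels end up pointing at the same definition object, which mirrors the move from one metric row per kernel to one per workload.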
@@ -45,7 +45,7 @@ from sqlalchemy.sql import Select
 from utils.logger import console_debug, console_error

 PREFIX = "compute_"
-SCHEMA_VERSION = "1.1.0"
+SCHEMA_VERSION = "1.2.0"


 Base = declarative_base()
@@ -63,18 +63,16 @@ class Workload(Base):
|
|||||||
|
|
||||||
# Workload can have multiple kernels
|
# Workload can have multiple kernels
|
||||||
kernels = relationship("Kernel", back_populates="workload")
|
kernels = relationship("Kernel", back_populates="workload")
|
||||||
# Workload can have multiple roofline data points
|
# Workload can have multiple metric definitions
|
||||||
roofline_data_points = relationship("RooflineData", back_populates="workload")
|
metric_definitions = relationship("MetricDefinition", back_populates="workload")
|
||||||
# Workload can have multiple pc_sampling values
|
|
||||||
pc_sampling_values = relationship("PCsampling", back_populates="workload")
|
|
||||||
|
|
||||||
|
|
||||||
class Metric(Base):
|
class MetricDefinition(Base):
|
||||||
__tablename__ = f"{PREFIX}metric"
|
__tablename__ = f"{PREFIX}metric_definition"
|
||||||
|
|
||||||
metric_uuid = Column(Integer, primary_key=True)
|
metric_uuid = Column(Integer, primary_key=True)
|
||||||
kernel_uuid = Column(
|
workload_id = Column(
|
||||||
Integer, ForeignKey(f"{PREFIX}kernel.kernel_uuid"), nullable=False
|
Integer, ForeignKey(f"{PREFIX}workload.workload_id"), nullable=False
|
||||||
)
|
)
|
||||||
name = Column(String) # e.g. Wavefronts Num
|
name = Column(String) # e.g. Wavefronts Num
|
||||||
metric_id = Column(String) # e.g. 4.1.3
|
metric_id = Column(String) # e.g. 4.1.3
|
||||||
@@ -83,27 +81,26 @@ class Metric(Base):
|
|||||||
sub_table_name = Column(String) # e.g. Wavefront stats
|
sub_table_name = Column(String) # e.g. Wavefront stats
|
||||||
unit = Column(String) # e.g. Gbps
|
unit = Column(String) # e.g. Gbps
|
||||||
|
|
||||||
# Metric can have one kernel
|
# Metric can have one workload
|
||||||
kernel = relationship("Kernel", back_populates="metrics")
|
workload = relationship("Workload", back_populates="metric_definitions")
|
||||||
# Metric can have multiple values
|
# Metric can have multiple metric values
|
||||||
values = relationship("Value", back_populates="metric")
|
metric_values = relationship("MetricValue", back_populates="metric")
|
||||||
|
|
||||||
|
|
||||||
class RooflineData(Base):
|
class RooflineData(Base):
|
||||||
__tablename__ = f"{PREFIX}roofline_data"
|
__tablename__ = f"{PREFIX}roofline_data"
|
||||||
|
|
||||||
roofline_uuid = Column(Integer, primary_key=True)
|
roofline_uuid = Column(Integer, primary_key=True)
|
||||||
workload_id = Column(
|
kernel_uuid = Column(
|
||||||
Integer, ForeignKey(f"{PREFIX}workload.workload_id"), nullable=False
|
Integer, ForeignKey(f"{PREFIX}kernel.kernel_uuid"), nullable=False
|
||||||
)
|
)
|
||||||
kernel_name = Column(String)
|
|
||||||
total_flops = Column(Float)
|
total_flops = Column(Float)
|
||||||
l1_cache_data = Column(Float)
|
l1_cache_data = Column(Float)
|
||||||
l2_cache_data = Column(Float)
|
l2_cache_data = Column(Float)
|
||||||
hbm_cache_data = Column(Float)
|
hbm_cache_data = Column(Float)
|
||||||
|
|
||||||
# Roofline data point can have one workload
|
# Roofline data point can have one kernel
|
||||||
workload = relationship("Workload", back_populates="roofline_data_points")
|
kernel = relationship("Kernel", back_populates="roofline_data_points")
|
||||||
|
|
||||||
|
|
||||||
class Dispatch(Base):
|
class Dispatch(Base):
|
||||||
@@ -135,42 +132,50 @@ class Kernel(Base):
|
|||||||
workload = relationship("Workload", back_populates="kernels")
|
workload = relationship("Workload", back_populates="kernels")
|
||||||
# Kernel can have multiple dispatches
|
# Kernel can have multiple dispatches
|
||||||
dispatches = relationship("Dispatch", back_populates="kernel")
|
dispatches = relationship("Dispatch", back_populates="kernel")
|
||||||
# Kernel can have multiple metrics
|
# Kernel can have multiple metric values
|
||||||
metrics = relationship("Metric", back_populates="kernel")
|
metric_values = relationship("MetricValue", back_populates="kernel")
|
||||||
|
# Kernel can have multiple roofline data points
|
||||||
|
roofline_data_points = relationship("RooflineData", back_populates="kernel")
|
||||||
|
# Kernel can have multiple pc_sampling values
|
||||||
|
pc_sampling_values = relationship("PCsampling", back_populates="kernel")
|
||||||
|
|
||||||
|
|
||||||
class PCsampling(Base):
|
class PCsampling(Base):
|
||||||
__tablename__ = f"{PREFIX}pcsampling"
|
__tablename__ = f"{PREFIX}pcsampling"
|
||||||
|
|
||||||
pc_sampling_uuid = Column(Integer, primary_key=True)
|
pc_sampling_uuid = Column(Integer, primary_key=True)
|
||||||
workload_id = Column(
|
kernel_uuid = Column(
|
||||||
Integer, ForeignKey(f"{PREFIX}workload.workload_id"), nullable=False
|
Integer, ForeignKey(f"{PREFIX}kernel.kernel_uuid"), nullable=False
|
||||||
)
|
)
|
||||||
source = Column(String)
|
source = Column(String)
|
||||||
instruction = Column(String)
|
instruction = Column(String)
|
||||||
count = Column(Integer)
|
count = Column(Integer)
|
||||||
kernel_name = Column(String)
|
|
||||||
offset = Column(Integer)
|
offset = Column(Integer)
|
||||||
count_issue = Column(Integer)
|
count_issue = Column(Integer)
|
||||||
count_stall = Column(Integer)
|
count_stall = Column(Integer)
|
||||||
stall_reason = Column(JSON)
|
stall_reason = Column(JSON)
|
||||||
|
|
||||||
# PCsampling can have one workload
|
# PCsampling can have one kernel
|
||||||
workload = relationship("Workload", back_populates="pc_sampling_values")
|
kernel = relationship("Kernel", back_populates="pc_sampling_values")
|
||||||
|
|
||||||
|
|
||||||
class Value(Base):
|
class MetricValue(Base):
|
||||||
__tablename__ = f"{PREFIX}value"
|
__tablename__ = f"{PREFIX}metric_value"
|
||||||
|
|
||||||
value_uuid = Column(Integer, primary_key=True)
|
value_uuid = Column(Integer, primary_key=True)
|
||||||
metric_uuid = Column(
|
metric_uuid = Column(
|
||||||
Integer, ForeignKey(f"{PREFIX}metric.metric_uuid"), nullable=False
|
Integer, ForeignKey(f"{PREFIX}metric_definition.metric_uuid"), nullable=False
|
||||||
|
)
|
||||||
|
kernel_uuid = Column(
|
||||||
|
Integer, ForeignKey(f"{PREFIX}kernel.kernel_uuid"), nullable=False
|
||||||
)
|
)
|
||||||
value_name = Column(String) # e.g. min, max, avg
|
value_name = Column(String) # e.g. min, max, avg
|
||||||
value = Column(Float) # e.g. 123.45
|
value = Column(Float) # e.g. 123.45
|
||||||
|
|
||||||
# Value can have one metric
|
# Value can have one metric
|
||||||
metric = relationship("Metric", back_populates="values")
|
metric = relationship("MetricDefinition", back_populates="metric_values")
|
||||||
|
# Value can have one kernel
|
||||||
|
kernel = relationship("Kernel", back_populates="metric_values")
|
||||||
|
|
||||||
|
|
||||||
class Metadata(Base):
|
class Metadata(Base):
|
||||||
@@ -250,11 +255,20 @@ def get_views() -> list[TextClause]:
|
|||||||
|
|
||||||
views: dict[str, Select[Any]] = {
|
views: dict[str, Select[Any]] = {
|
||||||
"kernel_view": select(
|
"kernel_view": select(
|
||||||
|
Kernel.kernel_uuid.label("kernel_uuid"),
|
||||||
|
Kernel.workload_id.label("workload_id"),
|
||||||
|
Workload.name.label("workload_name"),
|
||||||
Kernel.kernel_name,
|
Kernel.kernel_name,
|
||||||
func.count(Dispatch.dispatch_id).label("dispatch_count"),
|
func.count(Dispatch.dispatch_id).label("dispatch_count"),
|
||||||
func.sum(Dispatch.end_timestamp - Dispatch.start_timestamp).label(
|
func.sum(Dispatch.end_timestamp - Dispatch.start_timestamp).label(
|
||||||
"duration_ns_sum"
|
"duration_ns_sum"
|
||||||
),
|
),
|
||||||
|
func.min(Dispatch.end_timestamp - Dispatch.start_timestamp).label(
|
||||||
|
"duration_ns_min"
|
||||||
|
),
|
||||||
|
func.max(Dispatch.end_timestamp - Dispatch.start_timestamp).label(
|
||||||
|
"duration_ns_max"
|
||||||
|
),
|
||||||
median_calc.c.duration_ns_median,
|
median_calc.c.duration_ns_median,
|
||||||
func.avg(Dispatch.end_timestamp - Dispatch.start_timestamp).label(
|
func.avg(Dispatch.end_timestamp - Dispatch.start_timestamp).label(
|
||||||
"duration_ns_mean"
|
"duration_ns_mean"
|
||||||
@@ -262,24 +276,31 @@ def get_views() -> list[TextClause]:
|
|||||||
)
|
)
|
||||||
.select_from(Dispatch)
|
.select_from(Dispatch)
|
||||||
.join(Kernel, Dispatch.kernel_uuid == Kernel.kernel_uuid)
|
.join(Kernel, Dispatch.kernel_uuid == Kernel.kernel_uuid)
|
||||||
|
.join(Workload, Kernel.workload_id == Workload.workload_id)
|
||||||
.join(median_calc.subquery(), Kernel.kernel_name == median_calc.c.kernel_name)
|
.join(median_calc.subquery(), Kernel.kernel_name == median_calc.c.kernel_name)
|
||||||
.group_by(Kernel.kernel_name),
|
.group_by(
|
||||||
|
Kernel.kernel_uuid, Kernel.workload_id, Workload.name, Kernel.kernel_name
|
||||||
|
),
|
||||||
"metric_view": select(
|
"metric_view": select(
|
||||||
|
Workload.workload_id.label("workload_id"),
|
||||||
Workload.name.label("workload_name"),
|
Workload.name.label("workload_name"),
|
||||||
|
Kernel.kernel_uuid.label("kernel_uuid"),
|
||||||
Kernel.kernel_name,
|
Kernel.kernel_name,
|
||||||
Metric.name.label("metric_name"),
|
MetricDefinition.metric_uuid.label("metric_uuid"),
|
||||||
Metric.metric_id,
|
MetricDefinition.name.label("metric_name"),
|
||||||
Metric.description,
|
MetricDefinition.metric_id,
|
||||||
Metric.table_name,
|
MetricDefinition.description,
|
||||||
Metric.sub_table_name,
|
MetricDefinition.table_name,
|
||||||
Metric.unit,
|
MetricDefinition.sub_table_name,
|
||||||
Value.value_name,
|
MetricDefinition.unit,
|
||||||
Value.value,
|
MetricValue.value_uuid.label("value_uuid"),
|
||||||
|
MetricValue.value_name,
|
||||||
|
MetricValue.value,
|
||||||
)
|
)
|
||||||
.select_from(Metric)
|
.select_from(MetricDefinition)
|
||||||
.join(Kernel, Metric.kernel_uuid == Kernel.kernel_uuid)
|
.join(Workload, MetricDefinition.workload_id == Workload.workload_id)
|
||||||
.join(Value, Metric.metric_uuid == Value.metric_uuid)
|
.join(MetricValue, MetricDefinition.metric_uuid == MetricValue.metric_uuid)
|
||||||
.join(Workload, Kernel.workload_id == Workload.workload_id),
|
.join(Kernel, MetricValue.kernel_uuid == Kernel.kernel_uuid),
|
||||||
}
|
}
|
||||||
|
|
||||||
return [
|
return [
|
||||||
|
|||||||
@@ -989,19 +989,19 @@ def test_analyze_rocpd(
|
|||||||
Dispatch,
|
Dispatch,
|
||||||
Kernel,
|
Kernel,
|
||||||
Metadata,
|
Metadata,
|
||||||
Metric,
|
MetricDefinition,
|
||||||
|
MetricValue,
|
||||||
RooflineData,
|
RooflineData,
|
||||||
Value,
|
|
||||||
Workload,
|
Workload,
|
||||||
)
|
)
|
||||||
|
|
||||||
table_name_map = {
|
table_name_map = {
|
||||||
"compute_workload": Workload,
|
"compute_workload": Workload,
|
||||||
"compute_metric": Metric,
|
"compute_metric_definition": MetricDefinition,
|
||||||
"compute_roofline_data": RooflineData,
|
"compute_roofline_data": RooflineData,
|
||||||
"compute_dispatch": Dispatch,
|
"compute_dispatch": Dispatch,
|
||||||
"compute_kernel": Kernel,
|
"compute_kernel": Kernel,
|
||||||
"compute_value": Value,
|
"compute_metric_value": MetricValue,
|
||||||
"compute_metadata": Metadata,
|
"compute_metadata": Metadata,
|
||||||
}
|
}
|
||||||
|
|
||||||
|
|||||||