[rocprof-compute] remove references to --kernel-names (#1543)
* remove references to --kernel-names * ruff format * remove redundant comments * update docs and roofline image * added two output lines to docs
此提交包含在:
@@ -105,7 +105,6 @@ Standalone Roofline Options:
|
||||
vL1D
|
||||
LDS
|
||||
--device GPU device ID. (DEFAULT: ALL)
|
||||
--kernel-names Include kernel names in roofline plot.
|
||||
```
|
||||
|
||||
The following sample command profiles the *vcopy* workload.
|
||||
@@ -377,7 +376,7 @@ Standalone Roofline Options:
|
||||
|
||||
- The `--device` \<gpu_id> allows you to specify a device id to collect performace data from when running our roofline benchmark on your system.
|
||||
|
||||
- If you'd like to distinguish different kernels in your .pdf roofline plot use `--kernel-names`. This will give each kernel a unique marker identifiable from the plot's key.
|
||||
- Each kernel in your .pdf roofline plot is automatically distinguished with a unique marker identifiable from the plot's key.
|
||||
|
||||
|
||||
#### Roofline Only
|
||||
|
||||
@@ -315,7 +315,7 @@ Standalone Roofline Options:
|
||||
|
||||
- The `--device` \<gpu_id> allows you to specify a device id to collect performance data from when running our roofline benchmark on your system.
|
||||
|
||||
- If you would like to distinguish different kernels in your .pdf roofline plot use `--kernel-names`. This will give each kernel a unique marker identifiable from the plot's key.
|
||||
- Each kernel in your .pdf roofline plot is automatically distinguished with a unique marker identifiable from the plot's key.
|
||||
|
||||
|
||||
#### Roofline Only
|
||||
|
||||
未顯示二進位檔案。
|
之前 寬度: | 高度: | 大小: 64 KiB 之後 寬度: | 高度: | 大小: 160 KiB |
@@ -583,9 +583,11 @@ Roofline options
|
||||
|
||||
For more information on data types supported based on the GPU architecture, see :doc:`../../conceptual/performance-model`
|
||||
|
||||
To distinguish different kernels in your ``.pdf`` roofline plot use
|
||||
``--kernel-names``. This will give each kernel a unique marker identifiable from
|
||||
the plot's key.
|
||||
Each kernel in your ``.pdf`` roofline plot is automatically distinguished with a unique marker identifiable from the plot's key. The roofline PDF includes an integrated multi-subplot layout with:
|
||||
|
||||
1. **Roofline Plot** - Shows performance ceilings and kernel arithmetic intensity points
|
||||
2. **Plot Points & Values Table** - Displays AI values, performance metrics, memory/compute bound status, and cache levels for each kernel
|
||||
3. **Full Kernel Names Table** - Lists complete kernel names with their corresponding plot markers
|
||||
|
||||
|
||||
Roofline only
|
||||
@@ -595,33 +597,52 @@ The following example demonstrates profiling roofline data only:
|
||||
|
||||
.. code-block:: shell-session
|
||||
|
||||
$ rocprof-compute profile --name vcopy --roof-only -- ./vcopy -n 1048576 -b 256
|
||||
|
||||
$ rocprof-compute profile --name occupancy --roof-only -- ./tests/occupancy -n 1048576 -b 256
|
||||
__ _
|
||||
_ __ ___ ___ _ __ _ __ ___ / _| ___ ___ _ __ ___ _ __ _ _| |_ ___
|
||||
| '__/ _ \ / __| '_ \| '__/ _ \| |_ _____ / __/ _ \| '_ ` _ \| '_ \| | | | __/ _ \
|
||||
| | | (_) | (__| |_) | | | (_) | _|_____| (_| (_) | | | | | | |_) | |_| | || __/
|
||||
|_| \___/ \___| .__/|_| \___/|_| \___\___/|_| |_| |_| .__/ \__,_|\__\___|
|
||||
|_| |_|
|
||||
...
|
||||
[roofline] Checking for roofline.csv in /home/auser/repos/rocprofiler-compute/sample/workloads/vcopy/MI200
|
||||
[roofline] No roofline data found. Generating...
|
||||
Checking for roofline.csv in /home/auser/repos/rocprofiler-compute/sample/workloads/vcopy/MI200
|
||||
INFO [roofline] Generating pmc_perf.csv (roofline counters only).
|
||||
INFO Rocprofiler-Compute version: 3.3.0
|
||||
INFO Profiler choice: rocprofiler-sdk
|
||||
INFO Path: /app/projects/rocprofiler-compute/workloads/occupancy/MI300X_A1
|
||||
INFO Target: MI300X_A1
|
||||
INFO Command: ./tests/occupancy -n 1048576 -b 256
|
||||
INFO Kernel Selection: None
|
||||
INFO Dispatch Selection: None
|
||||
INFO Filtered sections: ['4']
|
||||
INFO
|
||||
INFO ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
INFO Collecting Performance Counters (Roofline Only)
|
||||
INFO ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
INFO
|
||||
INFO [Run 1/3][Approximate profiling time left: pending first measurement...]
|
||||
INFO [profiling] Current input file: /app/projects/rocprofiler-compute/workloads/occupancy/MI300X_A1/perfmon/pmc_perf_0.txt
|
||||
...
|
||||
INFO [roofline] Checking for roofline.csv in /app/projects/rocprofiler-compute/workloads/occupancy/MI300X_A1
|
||||
INFO [roofline] No roofline data found. Generating...
|
||||
Empirical Roofline Calculation
|
||||
Copyright © 2022 Advanced Micro Devices, Inc. All rights reserved.
|
||||
Total detected GPU devices: 4
|
||||
GPU Device 0: Profiling...
|
||||
99% [||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| ]
|
||||
...
|
||||
Empirical Roofline PDFs saved!
|
||||
|
||||
Copyright © 2025 Advanced Micro Devices, Inc. All rights reserved.
|
||||
Total detected GPU devices: 8
|
||||
GPU Device 0 (gfx942) with 304 CUs: Profiling...
|
||||
99% [||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| ]
|
||||
...
|
||||
An inspection of our workload output folder shows ``.pdf`` plots were generated
|
||||
successfully.
|
||||
|
||||
.. code-block:: shell-session
|
||||
|
||||
$ ls workloads/vcopy/MI200/
|
||||
$ ls workloads/occupancy/MI300X_A1
|
||||
total 48
|
||||
-rw-r--r-- 1 auser agroup 13331 Mar 1 16:05 empirRoof_gpu-0_FP32.pdf
|
||||
drwxr-xr-x 1 auser agroup 0 Mar 1 16:03 perfmon
|
||||
-rw-r--r-- 1 auser agroup 1101 Mar 1 16:03 pmc_perf.csv
|
||||
-rw-r--r-- 1 auser agroup 1715 Mar 1 16:05 roofline.csv
|
||||
-rw-r--r-- 1 auser agroup 650 Mar 1 16:03 sysinfo.csv
|
||||
-rw-r--r-- 1 auser agroup 399 Mar 1 16:03 timestamps.csv
|
||||
-rw-r--r-- 1 auser agroup 13331 Oct 29 10:33 empirRoof_gpu-0_FP32.pdf
|
||||
drwxr-xr-x 1 auser agroup 0 Oct 29 10:33 perfmon
|
||||
-rw-r--r-- 1 auser agroup 1101 Oct 29 10:33 pmc_perf.csv
|
||||
-rw-r--r-- 1 auser agroup 1715 Oct 29 10:33 roofline.csv
|
||||
-rw-r--r-- 1 auser agroup 650 Oct 29 10:33 sysinfo.csv
|
||||
-rw-r--r-- 1 auser agroup 399 Oct 29 10:33 timestamps.csv
|
||||
|
||||
.. note::
|
||||
|
||||
|
||||
@@ -439,13 +439,6 @@ Examples:
|
||||
type=int,
|
||||
help="\t\t\tTarget GPU device ID. (DEFAULT: 0)",
|
||||
)
|
||||
roofline_group.add_argument(
|
||||
"--kernel-names",
|
||||
required=False,
|
||||
default=False,
|
||||
action="store_true",
|
||||
help="\t\t\tInclude kernel names in roofline plot.",
|
||||
)
|
||||
roofline_group.add_argument(
|
||||
"-R",
|
||||
"--roofline-data-type",
|
||||
|
||||
@@ -2,7 +2,7 @@
|
||||
declare -A commands=(
|
||||
[path]=' '
|
||||
[no_roof]='--no-roof'
|
||||
[kernel_names]='--roof-only --kernel-names'
|
||||
[kernel_names]='--roof-only'
|
||||
[device_filter]='--device 0'
|
||||
[kernel]='--kernel "vecCopy(double*, double*, double*, int, int) [clone .kd]"'
|
||||
[ipblocks_SQ]='-b SQ'
|
||||
|
||||
@@ -587,14 +587,21 @@ def test_path_rocpd(
|
||||
|
||||
|
||||
@pytest.mark.roofline
|
||||
def test_roof_kernel_names(binary_handler_profile_rocprof_compute):
|
||||
def test_roof_basic_validation(binary_handler_profile_rocprof_compute):
|
||||
"""
|
||||
Test basic roofline PDF generation with full validation pipeline.
|
||||
This test runs the complete validation flow including counter logging
|
||||
and metric comparison (if enabled in config). Validates that roofline PDFs
|
||||
are generated with the integrated multi-subplot layout (roofline plot +
|
||||
plot points table + kernel names table).
|
||||
"""
|
||||
if soc in ("MI100"):
|
||||
# roofline is not supported on MI100
|
||||
assert True
|
||||
# Do not continue testing
|
||||
return
|
||||
|
||||
options = ["--device", "0", "--roof-only", "--kernel-names"]
|
||||
options = ["--device", "0", "--roof-only"]
|
||||
workload_dir = test_utils.get_output_dir()
|
||||
returncode = binary_handler_profile_rocprof_compute(
|
||||
config, workload_dir, options, check_success=False, roof=True
|
||||
@@ -631,7 +638,6 @@ def test_roof_multiple_data_types(binary_handler_profile_rocprof_compute):
|
||||
"--device",
|
||||
"0",
|
||||
"--roof-only",
|
||||
"--kernel-names",
|
||||
"--roofline-data-type",
|
||||
dtype,
|
||||
]
|
||||
@@ -666,7 +672,6 @@ def test_roof_invalid_data_type(binary_handler_profile_rocprof_compute):
|
||||
"--device",
|
||||
"0",
|
||||
"--roof-only",
|
||||
"--kernel-names",
|
||||
"--roofline-data-type",
|
||||
"INVALID_TYPE",
|
||||
]
|
||||
@@ -815,7 +820,6 @@ def test_roofline_workload_dir_not_set_error():
|
||||
class MockArgs:
|
||||
def __init__(self):
|
||||
self.roof_only = True
|
||||
self.kernel_names = False
|
||||
self.mem_level = "ALL"
|
||||
self.sort = "ALL"
|
||||
self.roofline_data_type = ["FP32"]
|
||||
@@ -828,7 +832,6 @@ def test_roofline_workload_dir_not_set_error():
|
||||
"device_id": 0,
|
||||
"sort_type": "kernels",
|
||||
"mem_level": "ALL",
|
||||
"include_kernel_names": False,
|
||||
"is_standalone": True,
|
||||
"roofline_data_type": ["FP32"],
|
||||
}
|
||||
@@ -879,8 +882,15 @@ def test_roof_workload_dir_validation(binary_handler_profile_rocprof_compute):
|
||||
@pytest.mark.roofline
|
||||
def test_roofline_empty_kernel_names_handling(binary_handler_profile_rocprof_compute):
|
||||
"""
|
||||
Test empirical_roofline() when num_kernels == 0
|
||||
This should trigger the "No kernel names found" log message
|
||||
Test roofline behavior when kernel filter doesn't match any
|
||||
kernels during initial profiling.
|
||||
|
||||
When profiling with a non-matching kernel filter, no counter
|
||||
data is collected, so roofline generation is skipped with a
|
||||
warning (but returns success code 0).
|
||||
|
||||
This is different from filtering existing profiling data with
|
||||
a non-matching kernel name, which produces an explicit error.
|
||||
"""
|
||||
if soc in ("MI100"):
|
||||
pytest.skip("Skipping roofline test for MI100")
|
||||
@@ -890,14 +900,20 @@ def test_roofline_empty_kernel_names_handling(binary_handler_profile_rocprof_com
|
||||
"--device",
|
||||
"0",
|
||||
"--roof-only",
|
||||
"--kernel-names",
|
||||
"--kernel",
|
||||
"nonexistent_kernel_name_that_should_not_match_anything",
|
||||
]
|
||||
workload_dir = test_utils.get_output_dir()
|
||||
|
||||
returncode = binary_handler_profile_rocprof_compute( # noqa: F841
|
||||
config, workload_dir, options, check_success=True, roof=True
|
||||
returncode = binary_handler_profile_rocprof_compute(
|
||||
config, workload_dir, options, check_success=False, roof=True
|
||||
)
|
||||
|
||||
assert returncode == 0, f"Expected success (returncode=0), got {returncode}"
|
||||
|
||||
pdf_files = list(Path(workload_dir).glob("empirRoof_*.pdf"))
|
||||
assert len(pdf_files) == 0, (
|
||||
"No roofline PDF should be generated when no kernels match"
|
||||
)
|
||||
|
||||
test_utils.clean_output_dir(config["cleanup"], workload_dir)
|
||||
@@ -918,7 +934,6 @@ def test_roofline_kernel_filter(binary_handler_profile_rocprof_compute):
|
||||
"--device",
|
||||
"0",
|
||||
"--roof-only",
|
||||
"--kernel-names",
|
||||
]
|
||||
workload_dir = test_utils.get_output_dir()
|
||||
|
||||
@@ -991,7 +1006,6 @@ def test_roof_plot_modes(binary_handler_profile_rocprof_compute):
|
||||
"--device",
|
||||
"0",
|
||||
"--roof-only",
|
||||
"--kernel-names",
|
||||
"--kernel",
|
||||
config["kernel_name_1"],
|
||||
],
|
||||
@@ -1083,7 +1097,6 @@ def test_roofline_missing_file_handling(binary_handler_profile_rocprof_compute):
|
||||
class MockArgs:
|
||||
def __init__(self):
|
||||
self.roof_only = True
|
||||
self.kernel_names = False
|
||||
self.mem_level = "ALL"
|
||||
self.sort = "ALL"
|
||||
self.roofline_data_type = ["FP32"]
|
||||
@@ -1098,7 +1111,6 @@ def test_roofline_missing_file_handling(binary_handler_profile_rocprof_compute):
|
||||
"device_id": 0,
|
||||
"sort_type": "kernels",
|
||||
"mem_level": "ALL",
|
||||
"include_kernel_names": False,
|
||||
"is_standalone": True,
|
||||
"roofline_data_type": ["FP32"],
|
||||
}
|
||||
@@ -1137,7 +1149,6 @@ def test_roofline_invalid_datatype_cli(binary_handler_profile_rocprof_compute):
|
||||
class MockArgs:
|
||||
def __init__(self):
|
||||
self.roof_only = True
|
||||
self.kernel_names = False
|
||||
self.mem_level = "ALL"
|
||||
self.sort = "ALL"
|
||||
self.roofline_data_type = ["FP32"]
|
||||
@@ -1150,7 +1161,6 @@ def test_roofline_invalid_datatype_cli(binary_handler_profile_rocprof_compute):
|
||||
"device_id": 0,
|
||||
"sort_type": "kernels",
|
||||
"mem_level": "ALL",
|
||||
"include_kernel_names": False,
|
||||
"is_standalone": True,
|
||||
"roofline_data_type": ["FP32"],
|
||||
}
|
||||
@@ -1187,6 +1197,223 @@ def test_roofline_ceiling_data_validation(binary_handler_profile_rocprof_compute
|
||||
test_utils.clean_output_dir(config["cleanup"], workload_dir)
|
||||
|
||||
|
||||
@pytest.mark.roofline
|
||||
def test_roofline_plot_points_data_generation():
|
||||
"""
|
||||
Test that plot points data structure is correctly generated with:
|
||||
- Symbol assignments
|
||||
- AI values (FLOPs/Byte)
|
||||
- Performance values (GFLOPs/s)
|
||||
- Memory/Compute bound status
|
||||
- Cache level information
|
||||
"""
|
||||
if soc in ("MI100"):
|
||||
pytest.skip("Skipping roofline test for MI100")
|
||||
return
|
||||
|
||||
sys.path.insert(0, str(Path(__file__).parent.parent / "src"))
|
||||
|
||||
try:
|
||||
from roofline import Roofline
|
||||
from utils.specs import generate_machine_specs
|
||||
|
||||
class MockArgs:
|
||||
def __init__(self):
|
||||
self.roof_only = True
|
||||
self.mem_level = "ALL"
|
||||
self.sort = "ALL"
|
||||
self.roofline_data_type = ["FP32"]
|
||||
|
||||
args = MockArgs()
|
||||
mspec = generate_machine_specs(None, None)
|
||||
|
||||
mock_ai_data = {
|
||||
"ai_l1": [[0.5, 1.2], [100.0, 150.0]],
|
||||
"ai_l2": [[0.3, 0.8], [80.0, 120.0]],
|
||||
"ai_hbm": [[0.1, 0.4], [50.0, 90.0]],
|
||||
"kernelNames": ["kernel_A", "kernel_B"],
|
||||
}
|
||||
|
||||
mock_ceiling_data = {
|
||||
"l1": [[0.01, 10], [10, 1000], 100],
|
||||
"l2": [[0.01, 10], [10, 800], 80],
|
||||
"hbm": [[0.01, 10], [10, 500], 50],
|
||||
"valu": [[1, 100], [200, 200], 200],
|
||||
"mfma": [[1, 100], [500, 500], 500],
|
||||
}
|
||||
|
||||
plot_points_data = []
|
||||
cache_colors = {
|
||||
"ai_l1": "blue",
|
||||
"ai_l2": "green",
|
||||
"ai_hbm": "red",
|
||||
}
|
||||
|
||||
roofline_instance = Roofline(args, mspec)
|
||||
|
||||
for cache_level in ["ai_l1", "ai_l2", "ai_hbm"]:
|
||||
if cache_level in mock_ai_data:
|
||||
x_vals = mock_ai_data[cache_level][0]
|
||||
y_vals = mock_ai_data[cache_level][1]
|
||||
num_kernels = len(mock_ai_data["kernelNames"])
|
||||
|
||||
for i in range(min(len(x_vals), num_kernels)):
|
||||
if x_vals[i] > 0 and y_vals[i] > 0:
|
||||
status = roofline_instance._determine_kernel_bound_status(
|
||||
ai_value=x_vals[i],
|
||||
performance=y_vals[i],
|
||||
cache_level=cache_level,
|
||||
ceiling_data=mock_ceiling_data,
|
||||
)
|
||||
|
||||
plot_points_data.append({
|
||||
"symbol": None,
|
||||
"color": cache_colors.get(cache_level, "gray"),
|
||||
"cache_level": cache_level.replace("ai_", "", 1).upper(),
|
||||
"ai": f"{x_vals[i]:.2f}",
|
||||
"performance": f"{y_vals[i]:.2f}",
|
||||
"status": status,
|
||||
"kernel_idx": i,
|
||||
})
|
||||
|
||||
assert len(plot_points_data) > 0, "Plot points data should not be empty"
|
||||
|
||||
for point in plot_points_data:
|
||||
assert "cache_level" in point
|
||||
assert "ai" in point
|
||||
assert "performance" in point
|
||||
assert "status" in point
|
||||
assert "kernel_idx" in point
|
||||
assert "color" in point
|
||||
|
||||
assert point["cache_level"] in ["L1", "L2", "HBM"]
|
||||
|
||||
assert point["status"] in ["Memory Bound", "Compute Bound", "Unknown"]
|
||||
|
||||
assert isinstance(point["ai"], str)
|
||||
assert isinstance(point["performance"], str)
|
||||
|
||||
except ImportError:
|
||||
pytest.skip("Could not import roofline module for direct testing")
|
||||
|
||||
|
||||
@pytest.mark.roofline
|
||||
def test_roofline_bound_status_calculation():
|
||||
"""
|
||||
Test _determine_kernel_bound_status() correctly classifies kernels as
|
||||
Memory Bound or Compute Bound based on their AI and performance vs ceilings.
|
||||
"""
|
||||
if soc in ("MI100"):
|
||||
pytest.skip("Skipping roofline test for MI100")
|
||||
return
|
||||
|
||||
sys.path.insert(0, str(Path(__file__).parent.parent / "src"))
|
||||
|
||||
try:
|
||||
from roofline import Roofline
|
||||
from utils.specs import generate_machine_specs
|
||||
|
||||
class MockArgs:
|
||||
def __init__(self):
|
||||
self.roof_only = True
|
||||
self.mem_level = "ALL"
|
||||
self.sort = "ALL"
|
||||
self.roofline_data_type = ["FP32"]
|
||||
|
||||
args = MockArgs()
|
||||
mspec = generate_machine_specs(None, None)
|
||||
roofline_instance = Roofline(args, mspec)
|
||||
|
||||
ceiling_data = {
|
||||
"hbm": [[0.01, 10], [10, 1000], 100],
|
||||
"valu": [[1, 100], [200, 200], 200],
|
||||
"mfma": [[1, 100], [500, 500], 500],
|
||||
}
|
||||
|
||||
status1 = roofline_instance._determine_kernel_bound_status(
|
||||
ai_value=1.0,
|
||||
performance=100.0,
|
||||
cache_level="ai_hbm",
|
||||
ceiling_data=ceiling_data,
|
||||
)
|
||||
assert status1 == "Memory Bound", f"Expected Memory Bound, got {status1}"
|
||||
|
||||
status2 = roofline_instance._determine_kernel_bound_status(
|
||||
ai_value=5.0,
|
||||
performance=150.0,
|
||||
cache_level="ai_hbm",
|
||||
ceiling_data=ceiling_data,
|
||||
)
|
||||
assert status2 == "Compute Bound", f"Expected Compute Bound, got {status2}"
|
||||
|
||||
status3 = roofline_instance._determine_kernel_bound_status(
|
||||
ai_value=1.0,
|
||||
performance=100.0,
|
||||
cache_level="ai_l1",
|
||||
ceiling_data=ceiling_data,
|
||||
)
|
||||
assert status3 == "Unknown", f"Expected Unknown, got {status3}"
|
||||
|
||||
bad_ceiling_data = {
|
||||
"hbm": [100],
|
||||
}
|
||||
status4 = roofline_instance._determine_kernel_bound_status(
|
||||
ai_value=1.0,
|
||||
performance=100.0,
|
||||
cache_level="ai_hbm",
|
||||
ceiling_data=bad_ceiling_data,
|
||||
)
|
||||
assert status4 == "Unknown", f"Expected Unknown for bad data, got {status4}"
|
||||
|
||||
except ImportError:
|
||||
pytest.skip("Could not import roofline module for direct testing")
|
||||
|
||||
|
||||
@pytest.mark.roofline
|
||||
def test_roofline_many_kernels_dynamic_height(binary_handler_profile_rocprof_compute):
|
||||
"""
|
||||
Test roofline PDF generation with many kernels (10+) to verify:
|
||||
- Dynamic height calculation works
|
||||
- PDF is generated successfully
|
||||
- File size is reasonable
|
||||
|
||||
Note: This test uses a regular workload but validates the PDF structure
|
||||
can handle the multi-subplot layout properly.
|
||||
"""
|
||||
if soc in ("MI100"):
|
||||
pytest.skip("Skipping roofline test for MI100")
|
||||
return
|
||||
|
||||
options = ["--device", "0", "--roof-only"]
|
||||
workload_dir = test_utils.get_output_dir()
|
||||
|
||||
returncode = binary_handler_profile_rocprof_compute(
|
||||
config, workload_dir, options, check_success=False, roof=True
|
||||
)
|
||||
|
||||
assert returncode == 0, "Roofline profiling should succeed"
|
||||
|
||||
pdf_files = list(Path(workload_dir).glob("empirRoof_*.pdf"))
|
||||
assert len(pdf_files) > 0, "At least one roofline PDF should be generated"
|
||||
|
||||
for pdf_file in pdf_files:
|
||||
assert pdf_file.exists(), f"PDF file {pdf_file} should exist"
|
||||
file_size = pdf_file.stat().st_size
|
||||
|
||||
# PDF should be larger than 10KB (has content) but less than 50MB (reasonable)
|
||||
assert file_size > 10000, (
|
||||
f"PDF {pdf_file} too small ({file_size} bytes), may be malformed"
|
||||
)
|
||||
assert file_size < 50000000, (
|
||||
f"PDF {pdf_file} too large ({file_size} bytes), may have issues"
|
||||
)
|
||||
|
||||
file_dict = test_utils.check_csv_files(workload_dir, 1, num_kernels)
|
||||
assert sorted(list(file_dict.keys())) == ROOF_ONLY_FILES
|
||||
|
||||
test_utils.clean_output_dir(config["cleanup"], workload_dir)
|
||||
|
||||
|
||||
@pytest.mark.misc
|
||||
def test_device_filter(binary_handler_profile_rocprof_compute):
|
||||
options = ["--device", "0"]
|
||||
|
||||
@@ -6,7 +6,6 @@ format_rocprof_output: csv
|
||||
hip_trace: false
|
||||
join_type: grid
|
||||
kernel: null
|
||||
kernel_names: false
|
||||
kokkos_trace: false
|
||||
list_metrics: null
|
||||
loglevel: 10
|
||||
|
||||
新增問題並參考
封鎖使用者