Update documentation (#53)
- updated info about OMNITRACE_USE_MPI
- removed wiki links
- info about metadata.json
- update HW counters and fix typos
- fix update-docs.sh
[ROCm/rocprofiler-systems commit: bab90baf0b]
Committed by GitHub
Parent: 060da8159c
Commit: 0094a471fd
@@ -149,9 +149,9 @@ source ${OMNITRACE_ROOT}/share/omnitrace/setup-env.sh
 #### MPI Support within Omnitrace
 
 [Omnitrace](https://github.com/AMDResearch/omnitrace) can have full (`OMNITRACE_USE_MPI=ON`) or partial (`OMNITRACE_USE_MPI_HEADERS=ON`) MPI support.
-The only difference between these two modes is whether or not the results collected via timemory can be aggregated into one output file. The primary
-benefits of partial or full MPI support are the automatic wrapping of MPI functions and the ability to label output with suffixes which correspond to the
-`MPI_COMM_WORLD` rank ID instead of using the system process identifier (i.e. PID).
+The only difference between these two modes is whether or not the results collected via timemory and/or perfetto can be aggregated into a single
+output file during finalization. The primary benefits of partial or full MPI support are the automatic wrapping of MPI functions and the ability
+to label output with suffixes which correspond to the `MPI_COMM_WORLD` rank ID instead of using the system process identifier (i.e. PID).
 In general, it is recommended to use partial MPI support with the OpenMPI headers as this is the most portable configuration.
 If full MPI support is selected, make sure your target application is built against the same MPI distribution as omnitrace,
 i.e. do not build omnitrace with MPICH and use it on a target application built against OpenMPI.
@@ -49,9 +49,3 @@ have different contextual meanings, e.g., omnitrace's meaning of the term "modul
 - Due to language constructs or compiler optimizations, it may be possible for multiple functions to overlap (that is, share part of the same function body) or for a single function to have multiple entry points
 - In practice, it is impossible to determine the difference between multiple overlapping functions and a single function with multiple entry points
 - By default, omnitrace avoids instrumenting overlapping functions
-
-## Additional Notes
-
-The ["Data granularity in profiler types"](https://en.wikipedia.org/wiki/Profiling_(computer_programming)#Data_granularity_in_profiler_types) section of
-the Wikipedia ["Profiling (computer programming)"](https://en.wikipedia.org/wiki/Profiling_(computer_programming)) page may be a useful reference in understanding
-the different profiling modes and their trade-offs.
@@ -55,7 +55,114 @@ $ omnitrace -- ./foo
 
 ## Metadata
 
-[Omnitrace](https://github.com/AMDResearch/omnitrace) will output a metadata.json file.
+[Omnitrace](https://github.com/AMDResearch/omnitrace) will output a metadata.json file. This metadata file will contain
+information about the settings, environment variables, output files, and info about the system and the run:
+
+```json
+{
+    "omnitrace": {
+        "metadata": {
+            "info": {
+                "HW_L1_CACHE_SIZE": 32768,
+                "HW_L2_CACHE_SIZE": 524288,
+                "SHELL": "/bin/bash",
+                "HW_PHYSICAL_CPU": 12,
+                "CPU_FEATURES": "fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid aperfmperf pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_llc mwaitx cpb cat_l3 cdp_l3 hw_pstate ssbd mba ibpb stibp vmmcall fsgsbase bmi1 avx2 smep bmi2 cqm rdt_a rdseed adx smap clflushopt clwb sha_ni xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local clzero irperf xsaveerptr rdpru wbnoinvd arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold avic v_vmsave_vmload vgif v_spec_ctrl umip rdpid overflow_recov succor smca sme sev sev_es",
+                "HW_CONCURRENCY": 24,
+                "LAUNCH_TIME": "02:04",
+                "CPU_MODEL": "AMD Ryzen Threadripper PRO 3945WX 12-Cores",
+                "TIMEMORY_GIT_REVISION": "52e7034fd419ff296506cdef43084f6071dbaba1",
+                "TIMEMORY_VERSION": "3.3.0rc4",
+                "CPU_FREQUENCY": 2400,
+                "TIMEMORY_API": "tim::project::timemory",
+                "PWD": "/home/jrmadsen/devel/c++/AARInternal/hosttrace-dyninst/build-vscode",
+                "HW_L3_CACHE_SIZE": 16777216,
+                "USER": "jrmadsen",
+                "HOME": "/home/jrmadsen",
+                "TIMEMORY_GIT_DESCRIBE": "v3.2.0-263-g52e7034f",
+                "LAUNCH_DATE": "05/08/22",
+                "CPU_VENDOR": "AuthenticAMD"
+            },
+            "output": {
+                "text": [
+                    {
+                        "value": [
+                            "omnitrace-tests-output/parallel-overhead-binary-rewrite/roctracer.txt"
+                        ],
+                        "key": "roctracer"
+                    },
+                    {
+                        "value": [
+                            "omnitrace-tests-output/parallel-overhead-binary-rewrite/wall_clock.txt"
+                        ],
+                        "key": "wall_clock"
+                    }
+                ],
+                "json": [
+                    {
+                        "value": [
+                            "omnitrace-tests-output/parallel-overhead-binary-rewrite/roctracer.json",
+                            "omnitrace-tests-output/parallel-overhead-binary-rewrite/roctracer.tree.json"
+                        ],
+                        "key": "roctracer"
+                    },
+                    {
+                        "value": [
+                            "omnitrace-tests-output/parallel-overhead-binary-rewrite/wall_clock.json",
+                            "omnitrace-tests-output/parallel-overhead-binary-rewrite/wall_clock.tree.json"
+                        ],
+                        "key": "wall_clock"
+                    }
+                ]
+            },
+            "environment": [
+                {
+                    "value": "/home/jrmadsen",
+                    "key": "HOME"
+                },
+                {
+                    "value": "/bin/bash",
+                    "key": "SHELL"
+                },
+                {
+                    "value": "jrmadsen",
+                    "key": "USER"
+                },
+                {
+                    "value": "true",
+                    "key": "... etc. ..."
+                }
+            ],
+            "settings": {
+                "OMNITRACE_JSON_OUTPUT": {
+                    "count": -1,
+                    "environ_updated": false,
+                    "name": "json_output",
+                    "data_type": "bool",
+                    "initial": true,
+                    "enabled": true,
+                    "value": true,
+                    "max_count": 1,
+                    "cmdline": [
+                        "--omnitrace-json-output"
+                    ],
+                    "environ": "OMNITRACE_JSON_OUTPUT",
+                    "config_updated": false,
+                    "categories": [
+                        "io",
+                        "json",
+                        "native"
+                    ],
+                    "description": "Write json output files"
+                },
+                "... etc. ...": {
+                    "etc.": true
+                }
+            }
+        }
+    }
+}
+```
 
 ## Configuring Output
 
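The `metadata.json` layout added in this hunk can be read programmatically. The following is a minimal Python sketch, assuming the file follows the sample structure shown above (`omnitrace` > `metadata` > `output` / `settings`). The helper names `load_metadata`, `output_files`, and `setting_value` are hypothetical and are not part of Omnitrace.

```python
import json
from pathlib import Path


def load_metadata(path):
    """Parse a metadata.json file into a Python dict."""
    return json.loads(Path(path).read_text())


def output_files(metadata):
    """Return {component key: {format: [paths]}} from the "output" section.

    Assumes each entry under "output" is a list of {"key": ..., "value": [...]}
    objects, as in the sample above.
    """
    output = metadata["omnitrace"]["metadata"]["output"]
    files = {}
    for fmt, entries in output.items():  # fmt is "text" or "json" in the sample
        for entry in entries:
            files.setdefault(entry["key"], {})[fmt] = list(entry["value"])
    return files


def setting_value(metadata, name):
    """Return the recorded value of one setting, e.g. "OMNITRACE_JSON_OUTPUT"."""
    return metadata["omnitrace"]["metadata"]["settings"][name]["value"]
```

With the sample above, `output_files(load_metadata("metadata.json"))` would group the `roctracer` and `wall_clock` paths by output format; the location of `metadata.json` depends on the configured output path and is assumed here.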
@@ -227,120 +227,401 @@ $ omnitrace-avail -C -bd
|
||||
|
||||
```console
|
||||
$ omnitrace-avail -H -bd
|
||||
|---------------------|-------------------------------------------------|
|
||||
| HARDWARE COUNTER | DESCRIPTION |
|
||||
|---------------------|-------------------------------------------------|
|
||||
| CPU | |
|
||||
|---------------------|-------------------------------------------------|
|
||||
| PAPI_L1_DCM | Level 1 data cache misses |
|
||||
| PAPI_L1_ICM | Level 1 instruction cache misses |
|
||||
| PAPI_L2_DCM | Level 2 data cache misses |
|
||||
| PAPI_L2_ICM | Level 2 instruction cache misses |
|
||||
| PAPI_L3_DCM | Level 3 data cache misses |
|
||||
| PAPI_L3_ICM | Level 3 instruction cache misses |
|
||||
| PAPI_L1_TCM | Level 1 cache misses |
|
||||
| PAPI_L2_TCM | Level 2 cache misses |
|
||||
| PAPI_L3_TCM | Level 3 cache misses |
|
||||
| PAPI_CA_SNP | Requests for a snoop |
|
||||
| PAPI_CA_SHR | Requests for exclusive access to shared cach... |
|
||||
| PAPI_CA_CLN | Requests for exclusive access to clean cache... |
|
||||
| PAPI_CA_INV | Requests for cache line invalidation |
|
||||
| PAPI_CA_ITV | Requests for cache line intervention |
|
||||
| PAPI_L3_LDM | Level 3 load misses |
|
||||
| PAPI_L3_STM | Level 3 store misses |
|
||||
| PAPI_BRU_IDL | Cycles branch units are idle |
|
||||
| PAPI_FXU_IDL | Cycles integer units are idle |
|
||||
| PAPI_FPU_IDL | Cycles floating point units are idle |
|
||||
| PAPI_LSU_IDL | Cycles load/store units are idle |
|
||||
| PAPI_TLB_DM | Data translation lookaside buffer misses |
|
||||
| PAPI_TLB_IM | Instruction translation lookaside buffer misses |
|
||||
| PAPI_TLB_TL | Total translation lookaside buffer misses |
|
||||
| PAPI_L1_LDM | Level 1 load misses |
|
||||
| PAPI_L1_STM | Level 1 store misses |
|
||||
| PAPI_L2_LDM | Level 2 load misses |
|
||||
| PAPI_L2_STM | Level 2 store misses |
|
||||
| PAPI_BTAC_M | Branch target address cache misses |
|
||||
| PAPI_PRF_DM | Data prefetch cache misses |
|
||||
| PAPI_L3_DCH | Level 3 data cache hits |
|
||||
| PAPI_TLB_SD | Translation lookaside buffer shootdowns |
|
||||
| PAPI_CSR_FAL | Failed store conditional instructions |
|
||||
| PAPI_CSR_SUC | Successful store conditional instructions |
|
||||
| PAPI_CSR_TOT | Total store conditional instructions |
|
||||
| PAPI_MEM_SCY | Cycles Stalled Waiting for memory accesses |
|
||||
| PAPI_MEM_RCY | Cycles Stalled Waiting for memory reads |
|
||||
| PAPI_MEM_WCY | Cycles Stalled Waiting for memory writes |
|
||||
| PAPI_STL_ICY | Cycles with no instruction issue |
|
||||
| PAPI_FUL_ICY | Cycles with maximum instruction issue |
|
||||
| PAPI_STL_CCY | Cycles with no instructions completed |
|
||||
| PAPI_FUL_CCY | Cycles with maximum instructions completed |
|
||||
| PAPI_HW_INT | Hardware interrupts |
|
||||
| PAPI_BR_UCN | Unconditional branch instructions |
|
||||
| PAPI_BR_CN | Conditional branch instructions |
|
||||
| PAPI_BR_TKN | Conditional branch instructions taken |
|
||||
| PAPI_BR_NTK | Conditional branch instructions not taken |
|
||||
| PAPI_BR_MSP | Conditional branch instructions mispredicted |
|
||||
| PAPI_BR_PRC | Conditional branch instructions correctly pr... |
|
||||
| PAPI_FMA_INS | FMA instructions completed |
|
||||
| PAPI_TOT_IIS | Instructions issued |
|
||||
| PAPI_TOT_INS | Instructions completed |
|
||||
| PAPI_INT_INS | Integer instructions |
|
||||
| PAPI_FP_INS | Floating point instructions |
|
||||
| PAPI_LD_INS | Load instructions |
|
||||
| PAPI_SR_INS | Store instructions |
|
||||
| PAPI_BR_INS | Branch instructions |
|
||||
| PAPI_VEC_INS | Vector/SIMD instructions (could include inte... |
|
||||
| PAPI_RES_STL | Cycles stalled on any resource |
|
||||
| PAPI_FP_STAL | Cycles the FP unit(s) are stalled |
|
||||
| PAPI_TOT_CYC | Total cycles |
|
||||
| PAPI_LST_INS | Load/store instructions completed |
|
||||
| PAPI_SYC_INS | Synchronization instructions completed |
|
||||
| PAPI_L1_DCH | Level 1 data cache hits |
|
||||
| PAPI_L2_DCH | Level 2 data cache hits |
|
||||
| PAPI_L1_DCA | Level 1 data cache accesses |
|
||||
| PAPI_L2_DCA | Level 2 data cache accesses |
|
||||
| PAPI_L3_DCA | Level 3 data cache accesses |
|
||||
| PAPI_L1_DCR | Level 1 data cache reads |
|
||||
| PAPI_L2_DCR | Level 2 data cache reads |
|
||||
| PAPI_L3_DCR | Level 3 data cache reads |
|
||||
| PAPI_L1_DCW | Level 1 data cache writes |
|
||||
| PAPI_L2_DCW | Level 2 data cache writes |
|
||||
| PAPI_L3_DCW | Level 3 data cache writes |
|
||||
| PAPI_L1_ICH | Level 1 instruction cache hits |
|
||||
| PAPI_L2_ICH | Level 2 instruction cache hits |
|
||||
| PAPI_L3_ICH | Level 3 instruction cache hits |
|
||||
| PAPI_L1_ICA | Level 1 instruction cache accesses |
|
||||
| PAPI_L2_ICA | Level 2 instruction cache accesses |
|
||||
| PAPI_L3_ICA | Level 3 instruction cache accesses |
|
||||
| PAPI_L1_ICR | Level 1 instruction cache reads |
|
||||
| PAPI_L2_ICR | Level 2 instruction cache reads |
|
||||
| PAPI_L3_ICR | Level 3 instruction cache reads |
|
||||
| PAPI_L1_ICW | Level 1 instruction cache writes |
|
||||
| PAPI_L2_ICW | Level 2 instruction cache writes |
|
||||
| PAPI_L3_ICW | Level 3 instruction cache writes |
|
||||
| PAPI_L1_TCH | Level 1 total cache hits |
|
||||
| PAPI_L2_TCH | Level 2 total cache hits |
|
||||
| PAPI_L3_TCH | Level 3 total cache hits |
|
||||
| PAPI_L1_TCA | Level 1 total cache accesses |
|
||||
| PAPI_L2_TCA | Level 2 total cache accesses |
|
||||
| PAPI_L3_TCA | Level 3 total cache accesses |
|
||||
| PAPI_L1_TCR | Level 1 total cache reads |
|
||||
| PAPI_L2_TCR | Level 2 total cache reads |
|
||||
| PAPI_L3_TCR | Level 3 total cache reads |
|
||||
| PAPI_L1_TCW | Level 1 total cache writes |
|
||||
| PAPI_L2_TCW | Level 2 total cache writes |
|
||||
| PAPI_L3_TCW | Level 3 total cache writes |
|
||||
| PAPI_FML_INS | Floating point multiply instructions |
|
||||
| PAPI_FAD_INS | Floating point add instructions |
|
||||
| PAPI_FDV_INS | Floating point divide instructions |
|
||||
| PAPI_FSQ_INS | Floating point square root instructions |
|
||||
| PAPI_FNV_INS | Floating point inverse instructions |
|
||||
| PAPI_FP_OPS | Floating point operations |
|
||||
| PAPI_SP_OPS | Floating point operations; optimized to coun... |
|
||||
| PAPI_DP_OPS | Floating point operations; optimized to coun... |
|
||||
| PAPI_VEC_SP | Single precision vector/SIMD instructions |
|
||||
| PAPI_VEC_DP | Double precision vector/SIMD instructions |
|
||||
| PAPI_REF_CYC | Reference clock cycles |
|
||||
|---------------------|-------------------------------------------------|
|
||||
|---------------------------------------|---------------------------------------|
|
||||
| HARDWARE COUNTER | DESCRIPTION |
|
||||
|---------------------------------------|---------------------------------------|
|
||||
| CPU | |
|
||||
|---------------------------------------|---------------------------------------|
|
||||
| PAPI_L1_DCM | Level 1 data cache misses |
|
||||
| PAPI_L1_ICM | Level 1 instruction cache misses |
|
||||
| PAPI_L2_DCM | Level 2 data cache misses |
|
||||
| PAPI_L2_ICM | Level 2 instruction cache misses |
|
||||
| PAPI_L3_DCM | Level 3 data cache misses |
|
||||
| PAPI_L3_ICM | Level 3 instruction cache misses |
|
||||
| PAPI_L1_TCM | Level 1 cache misses |
|
||||
| PAPI_L2_TCM | Level 2 cache misses |
|
||||
| PAPI_L3_TCM | Level 3 cache misses |
|
||||
| PAPI_CA_SNP | Requests for a snoop |
|
||||
| PAPI_CA_SHR | Requests for exclusive access to s... |
|
||||
| PAPI_CA_CLN | Requests for exclusive access to c... |
|
||||
| PAPI_CA_INV | Requests for cache line invalidation |
|
||||
| PAPI_CA_ITV | Requests for cache line intervention |
|
||||
| PAPI_L3_LDM | Level 3 load misses |
|
||||
| PAPI_L3_STM | Level 3 store misses |
|
||||
| PAPI_BRU_IDL | Cycles branch units are idle |
|
||||
| PAPI_FXU_IDL | Cycles integer units are idle |
|
||||
| PAPI_FPU_IDL | Cycles floating point units are idle |
|
||||
| PAPI_LSU_IDL | Cycles load/store units are idle |
|
||||
| PAPI_TLB_DM | Data translation lookaside buffer ... |
|
||||
| PAPI_TLB_IM | Instruction translation lookaside ... |
|
||||
| PAPI_TLB_TL | Total translation lookaside buffer... |
|
||||
| PAPI_L1_LDM | Level 1 load misses |
|
||||
| PAPI_L1_STM | Level 1 store misses |
|
||||
| PAPI_L2_LDM | Level 2 load misses |
|
||||
| PAPI_L2_STM | Level 2 store misses |
|
||||
| PAPI_BTAC_M | Branch target address cache misses |
|
||||
| PAPI_PRF_DM | Data prefetch cache misses |
|
||||
| PAPI_L3_DCH | Level 3 data cache hits |
|
||||
| PAPI_TLB_SD | Translation lookaside buffer shoot... |
|
||||
| PAPI_CSR_FAL | Failed store conditional instructions |
|
||||
| PAPI_CSR_SUC | Successful store conditional instr... |
|
||||
| PAPI_CSR_TOT | Total store conditional instructions |
|
||||
| PAPI_MEM_SCY | Cycles Stalled Waiting for memory ... |
|
||||
| PAPI_MEM_RCY | Cycles Stalled Waiting for memory ... |
|
||||
| PAPI_MEM_WCY | Cycles Stalled Waiting for memory ... |
|
||||
| PAPI_STL_ICY | Cycles with no instruction issue |
|
||||
| PAPI_FUL_ICY | Cycles with maximum instruction issue |
|
||||
| PAPI_STL_CCY | Cycles with no instructions completed |
|
||||
| PAPI_FUL_CCY | Cycles with maximum instructions c... |
|
||||
| PAPI_HW_INT | Hardware interrupts |
|
||||
| PAPI_BR_UCN | Unconditional branch instructions |
|
||||
| PAPI_BR_CN | Conditional branch instructions |
|
||||
| PAPI_BR_TKN | Conditional branch instructions taken |
|
||||
| PAPI_BR_NTK | Conditional branch instructions no... |
|
||||
| PAPI_BR_MSP | Conditional branch instructions mi... |
|
||||
| PAPI_BR_PRC | Conditional branch instructions co... |
|
||||
| PAPI_FMA_INS | FMA instructions completed |
|
||||
| PAPI_TOT_IIS | Instructions issued |
|
||||
| PAPI_TOT_INS | Instructions completed |
|
||||
| PAPI_INT_INS | Integer instructions |
|
||||
| PAPI_FP_INS | Floating point instructions |
|
||||
| PAPI_LD_INS | Load instructions |
|
||||
| PAPI_SR_INS | Store instructions |
|
||||
| PAPI_BR_INS | Branch instructions |
|
||||
| PAPI_VEC_INS | Vector/SIMD instructions (could in... |
|
||||
| PAPI_RES_STL | Cycles stalled on any resource |
|
||||
| PAPI_FP_STAL | Cycles the FP unit(s) are stalled |
|
||||
| PAPI_TOT_CYC | Total cycles |
|
||||
| PAPI_LST_INS | Load/store instructions completed |
|
||||
| PAPI_SYC_INS | Synchronization instructions compl... |
|
||||
| PAPI_L1_DCH | Level 1 data cache hits |
|
||||
| PAPI_L2_DCH | Level 2 data cache hits |
|
||||
| PAPI_L1_DCA | Level 1 data cache accesses |
|
||||
| PAPI_L2_DCA | Level 2 data cache accesses |
|
||||
| PAPI_L3_DCA | Level 3 data cache accesses |
|
||||
| PAPI_L1_DCR | Level 1 data cache reads |
|
||||
| PAPI_L2_DCR | Level 2 data cache reads |
|
||||
| PAPI_L3_DCR | Level 3 data cache reads |
|
||||
| PAPI_L1_DCW | Level 1 data cache writes |
|
||||
| PAPI_L2_DCW | Level 2 data cache writes |
|
||||
| PAPI_L3_DCW | Level 3 data cache writes |
|
||||
| PAPI_L1_ICH | Level 1 instruction cache hits |
|
||||
| PAPI_L2_ICH | Level 2 instruction cache hits |
|
||||
| PAPI_L3_ICH | Level 3 instruction cache hits |
|
||||
| PAPI_L1_ICA | Level 1 instruction cache accesses |
|
||||
| PAPI_L2_ICA | Level 2 instruction cache accesses |
|
||||
| PAPI_L3_ICA | Level 3 instruction cache accesses |
|
||||
| PAPI_L1_ICR | Level 1 instruction cache reads |
|
||||
| PAPI_L2_ICR | Level 2 instruction cache reads |
|
||||
| PAPI_L3_ICR | Level 3 instruction cache reads |
|
||||
| PAPI_L1_ICW | Level 1 instruction cache writes |
|
||||
| PAPI_L2_ICW | Level 2 instruction cache writes |
|
||||
| PAPI_L3_ICW | Level 3 instruction cache writes |
|
||||
| PAPI_L1_TCH | Level 1 total cache hits |
|
||||
| PAPI_L2_TCH | Level 2 total cache hits |
|
||||
| PAPI_L3_TCH | Level 3 total cache hits |
|
||||
| PAPI_L1_TCA | Level 1 total cache accesses |
|
||||
| PAPI_L2_TCA | Level 2 total cache accesses |
|
||||
| PAPI_L3_TCA | Level 3 total cache accesses |
|
||||
| PAPI_L1_TCR | Level 1 total cache reads |
|
||||
| PAPI_L2_TCR | Level 2 total cache reads |
|
||||
| PAPI_L3_TCR | Level 3 total cache reads |
|
||||
| PAPI_L1_TCW | Level 1 total cache writes |
|
||||
| PAPI_L2_TCW | Level 2 total cache writes |
|
||||
| PAPI_L3_TCW | Level 3 total cache writes |
|
||||
| PAPI_FML_INS | Floating point multiply instructions |
|
||||
| PAPI_FAD_INS | Floating point add instructions |
|
||||
| PAPI_FDV_INS | Floating point divide instructions |
|
||||
| PAPI_FSQ_INS | Floating point square root instruc... |
|
||||
| PAPI_FNV_INS | Floating point inverse instructions |
|
||||
| PAPI_FP_OPS | Floating point operations |
|
||||
| PAPI_SP_OPS | Floating point operations; optimiz... |
|
||||
| PAPI_DP_OPS | Floating point operations; optimiz... |
|
||||
| PAPI_VEC_SP | Single precision vector/SIMD instr... |
|
||||
| PAPI_VEC_DP | Double precision vector/SIMD instr... |
|
||||
| PAPI_REF_CYC | Reference clock cycles |
|
||||
| perf::PERF_COUNT_HW_CPU_CYCLES | PERF_COUNT_HW_CPU_CYCLES |
|
||||
| perf::PERF_COUNT_HW_CPU_CYCLES:u=0 | perf::PERF_COUNT_HW_CPU_CYCLES + m... |
|
||||
| perf::PERF_COUNT_HW_CPU_CYCLES:k=0 | perf::PERF_COUNT_HW_CPU_CYCLES + m... |
|
||||
| perf::PERF_COUNT_HW_CPU_CYCLES:h=0 | perf::PERF_COUNT_HW_CPU_CYCLES + m... |
|
||||
| perf::PERF_COUNT_HW_CPU_CYCLES:per... | perf::PERF_COUNT_HW_CPU_CYCLES + s... |
|
||||
| perf::PERF_COUNT_HW_CPU_CYCLES:freq=0 | perf::PERF_COUNT_HW_CPU_CYCLES + s... |
|
||||
| perf::PERF_COUNT_HW_CPU_CYCLES:pre... | perf::PERF_COUNT_HW_CPU_CYCLES + p... |
|
||||
| perf::PERF_COUNT_HW_CPU_CYCLES:excl=0 | perf::PERF_COUNT_HW_CPU_CYCLES + e... |
|
||||
| perf::PERF_COUNT_HW_CPU_CYCLES:mg=0 | perf::PERF_COUNT_HW_CPU_CYCLES + m... |
|
||||
| perf::PERF_COUNT_HW_CPU_CYCLES:mh=0 | perf::PERF_COUNT_HW_CPU_CYCLES + m... |
|
||||
| perf::PERF_COUNT_HW_CPU_CYCLES:cpu=0 | perf::PERF_COUNT_HW_CPU_CYCLES + C... |
|
||||
| perf::PERF_COUNT_HW_CPU_CYCLES:pin... | perf::PERF_COUNT_HW_CPU_CYCLES + p... |
|
||||
| perf::CYCLES | PERF_COUNT_HW_CPU_CYCLES |
|
||||
| perf::CYCLES:u=0 | perf::CYCLES + monitor at user level |
|
||||
| perf::CYCLES:k=0 | perf::CYCLES + monitor at kernel l... |
|
||||
| perf::CYCLES:h=0 | perf::CYCLES + monitor at hypervis... |
|
||||
| perf::CYCLES:period=0 | perf::CYCLES + sampling period |
|
||||
| perf::CYCLES:freq=0 | perf::CYCLES + sampling frequency ... |
|
||||
| perf::CYCLES:precise=0 | perf::CYCLES + precise event sampling |
|
||||
| perf::CYCLES:excl=0 | perf::CYCLES + exclusive access |
|
||||
| perf::CYCLES:mg=0 | perf::CYCLES + monitor guest execu... |
|
||||
| perf::CYCLES:mh=0 | perf::CYCLES + monitor host execution |
|
||||
| perf::CYCLES:cpu=0 | perf::CYCLES + CPU to program |
|
||||
| perf::CYCLES:pinned=0 | perf::CYCLES + pin event to counters |
|
||||
| perf::CPU-CYCLES | PERF_COUNT_HW_CPU_CYCLES |
|
||||
| perf::CPU-CYCLES:u=0 | perf::CPU-CYCLES + monitor at user... |
|
||||
| perf::CPU-CYCLES:k=0 | perf::CPU-CYCLES + monitor at kern... |
|
||||
| perf::CPU-CYCLES:h=0 | perf::CPU-CYCLES + monitor at hype... |
|
||||
| perf::CPU-CYCLES:period=0 | perf::CPU-CYCLES + sampling period |
|
||||
| perf::CPU-CYCLES:freq=0 | perf::CPU-CYCLES + sampling freque... |
|
||||
| perf::CPU-CYCLES:precise=0 | perf::CPU-CYCLES + precise event s... |
|
||||
| perf::CPU-CYCLES:excl=0 | perf::CPU-CYCLES + exclusive access |
|
||||
| perf::CPU-CYCLES:mg=0 | perf::CPU-CYCLES + monitor guest e... |
|
||||
| perf::CPU-CYCLES:mh=0 | perf::CPU-CYCLES + monitor host ex... |
|
||||
| perf::CPU-CYCLES:cpu=0 | perf::CPU-CYCLES + CPU to program |
|
||||
| perf::CPU-CYCLES:pinned=0 | perf::CPU-CYCLES + pin event to co... |
|
||||
| perf::PERF_COUNT_HW_INSTRUCTIONS | PERF_COUNT_HW_INSTRUCTIONS |
|
||||
| perf::PERF_COUNT_HW_INSTRUCTIONS:u=0 | perf::PERF_COUNT_HW_INSTRUCTIONS +... |
|
||||
| perf::PERF_COUNT_HW_INSTRUCTIONS:k=0 | perf::PERF_COUNT_HW_INSTRUCTIONS +... |
|
||||
| perf::PERF_COUNT_HW_INSTRUCTIONS:h=0 | perf::PERF_COUNT_HW_INSTRUCTIONS +... |
|
||||
| perf::PERF_COUNT_HW_INSTRUCTIONS:p... | perf::PERF_COUNT_HW_INSTRUCTIONS +... |
|
||||
| perf::PERF_COUNT_HW_INSTRUCTIONS:f... | perf::PERF_COUNT_HW_INSTRUCTIONS +... |
|
||||
| perf::PERF_COUNT_HW_INSTRUCTIONS:p... | perf::PERF_COUNT_HW_INSTRUCTIONS +... |
|
||||
| perf::PERF_COUNT_HW_INSTRUCTIONS:e... | perf::PERF_COUNT_HW_INSTRUCTIONS +... |
|
||||
| perf::PERF_COUNT_HW_INSTRUCTIONS:mg=0 | perf::PERF_COUNT_HW_INSTRUCTIONS +... |
|
||||
| perf::PERF_COUNT_HW_INSTRUCTIONS:mh=0 | perf::PERF_COUNT_HW_INSTRUCTIONS +... |
|
||||
| perf::PERF_COUNT_HW_INSTRUCTIONS:c... | perf::PERF_COUNT_HW_INSTRUCTIONS +... |
|
||||
| perf::PERF_COUNT_HW_INSTRUCTIONS:p... | perf::PERF_COUNT_HW_INSTRUCTIONS +... |
|
||||
| ... etc. ... | |
|
||||
| perf_raw::r0000 | perf_events raw event syntax: r[0-... |
|
||||
| perf_raw::r0000:u=0 | perf_raw::r0000 + monitor at user ... |
|
||||
| perf_raw::r0000:k=0 | perf_raw::r0000 + monitor at kerne... |
|
||||
| perf_raw::r0000:h=0 | perf_raw::r0000 + monitor at hyper... |
|
||||
| perf_raw::r0000:period=0 | perf_raw::r0000 + sampling period |
|
||||
| perf_raw::r0000:freq=0 | perf_raw::r0000 + sampling frequen... |
|
||||
| perf_raw::r0000:precise=0 | perf_raw::r0000 + precise event sa... |
|
||||
| perf_raw::r0000:excl=0 | perf_raw::r0000 + exclusive access |
|
||||
| perf_raw::r0000:mg=0 | perf_raw::r0000 + monitor guest ex... |
|
||||
| perf_raw::r0000:mh=0 | perf_raw::r0000 + monitor host exe... |
|
||||
| perf_raw::r0000:cpu=0 | perf_raw::r0000 + CPU to program |
|
||||
| perf_raw::r0000:pinned=0 | perf_raw::r0000 + pin event to cou... |
|
||||
| perf_raw::r0000:hw_smpl=0 | perf_raw::r0000 + enable hardware ... |
|
||||
| L1_ITLB_MISS_L2_ITLB_HIT | Number of instruction fetches that... |
|
||||
| L1_ITLB_MISS_L2_ITLB_HIT:e=0 | L1_ITLB_MISS_L2_ITLB_HIT + edge level |
|
||||
| L1_ITLB_MISS_L2_ITLB_HIT:i=0 | L1_ITLB_MISS_L2_ITLB_HIT + invert |
|
||||
| L1_ITLB_MISS_L2_ITLB_HIT:c=0 | L1_ITLB_MISS_L2_ITLB_HIT + counter... |
|
||||
| L1_ITLB_MISS_L2_ITLB_HIT:g=0 | L1_ITLB_MISS_L2_ITLB_HIT + measure... |
|
||||
| L1_ITLB_MISS_L2_ITLB_HIT:u=0 | L1_ITLB_MISS_L2_ITLB_HIT + monitor... |
|
||||
| L1_ITLB_MISS_L2_ITLB_HIT:k=0 | L1_ITLB_MISS_L2_ITLB_HIT + monitor... |
|
||||
| L1_ITLB_MISS_L2_ITLB_HIT:period=0 | L1_ITLB_MISS_L2_ITLB_HIT + samplin... |
|
||||
| L1_ITLB_MISS_L2_ITLB_HIT:freq=0 | L1_ITLB_MISS_L2_ITLB_HIT + samplin... |
|
||||
| L1_ITLB_MISS_L2_ITLB_HIT:excl=0 | L1_ITLB_MISS_L2_ITLB_HIT + exclusi... |
|
||||
| L1_ITLB_MISS_L2_ITLB_HIT:mg=0 | L1_ITLB_MISS_L2_ITLB_HIT + monitor... |
|
||||
| L1_ITLB_MISS_L2_ITLB_HIT:mh=0 | L1_ITLB_MISS_L2_ITLB_HIT + monitor... |
|
||||
| L1_ITLB_MISS_L2_ITLB_HIT:cpu=0 | L1_ITLB_MISS_L2_ITLB_HIT + CPU to ... |
|
||||
| L1_ITLB_MISS_L2_ITLB_HIT:pinned=0 | L1_ITLB_MISS_L2_ITLB_HIT + pin eve... |
|
||||
| L1_ITLB_MISS_L2_ITLB_MISS | Number of instruction fetches that... |
|
||||
| L1_ITLB_MISS_L2_ITLB_MISS:IF1G | L1_ITLB_MISS_L2_ITLB_MISS + Number... |
|
||||
| L1_ITLB_MISS_L2_ITLB_MISS:IF2M | L1_ITLB_MISS_L2_ITLB_MISS + Number... |
|
||||
| L1_ITLB_MISS_L2_ITLB_MISS:IF4K | L1_ITLB_MISS_L2_ITLB_MISS + Number... |
|
||||
| L1_ITLB_MISS_L2_ITLB_MISS:e=0 | L1_ITLB_MISS_L2_ITLB_MISS + edge l... |
|
||||
| L1_ITLB_MISS_L2_ITLB_MISS:i=0 | L1_ITLB_MISS_L2_ITLB_MISS + invert |
|
||||
| L1_ITLB_MISS_L2_ITLB_MISS:c=0 | L1_ITLB_MISS_L2_ITLB_MISS + counte... |
|
||||
| L1_ITLB_MISS_L2_ITLB_MISS:g=0 | L1_ITLB_MISS_L2_ITLB_MISS + measur... |
|
||||
| L1_ITLB_MISS_L2_ITLB_MISS:u=0 | L1_ITLB_MISS_L2_ITLB_MISS + monito... |
|
||||
| L1_ITLB_MISS_L2_ITLB_MISS:k=0 | L1_ITLB_MISS_L2_ITLB_MISS + monito... |
|
||||
| L1_ITLB_MISS_L2_ITLB_MISS:period=0 | L1_ITLB_MISS_L2_ITLB_MISS + sampli... |
|
||||
| L1_ITLB_MISS_L2_ITLB_MISS:freq=0 | L1_ITLB_MISS_L2_ITLB_MISS + sampli... |
|
||||
| L1_ITLB_MISS_L2_ITLB_MISS:excl=0 | L1_ITLB_MISS_L2_ITLB_MISS + exclus... |
|
||||
| L1_ITLB_MISS_L2_ITLB_MISS:mg=0 | L1_ITLB_MISS_L2_ITLB_MISS + monito... |
|
||||
| L1_ITLB_MISS_L2_ITLB_MISS:mh=0 | L1_ITLB_MISS_L2_ITLB_MISS + monito... |
|
||||
| L1_ITLB_MISS_L2_ITLB_MISS:cpu=0 | L1_ITLB_MISS_L2_ITLB_MISS + CPU to... |
|
||||
| L1_ITLB_MISS_L2_ITLB_MISS:pinned=0 | L1_ITLB_MISS_L2_ITLB_MISS + pin ev... |
|
||||
| RETIRED_SSE_AVX_FLOPS | This is a retire-based event. The ... |
|
||||
| RETIRED_SSE_AVX_FLOPS:ADD_SUB_FLOPS | RETIRED_SSE_AVX_FLOPS + Addition/s... |
|
||||
| RETIRED_SSE_AVX_FLOPS:MULT_FLOPS | RETIRED_SSE_AVX_FLOPS + Multiplica... |
|
||||
| RETIRED_SSE_AVX_FLOPS:DIV_FLOPS | RETIRED_SSE_AVX_FLOPS + Division F... |
|
||||
| RETIRED_SSE_AVX_FLOPS:MAC_FLOPS | RETIRED_SSE_AVX_FLOPS + Double pre... |
|
||||
| RETIRED_SSE_AVX_FLOPS:ANY | RETIRED_SSE_AVX_FLOPS + Double pre... |
|
||||
| RETIRED_SSE_AVX_FLOPS:e=0 | RETIRED_SSE_AVX_FLOPS + edge level |
|
||||
| RETIRED_SSE_AVX_FLOPS:i=0 | RETIRED_SSE_AVX_FLOPS + invert |
|
||||
| RETIRED_SSE_AVX_FLOPS:c=0 | RETIRED_SSE_AVX_FLOPS + counter-ma... |
|
||||
| RETIRED_SSE_AVX_FLOPS:g=0 | RETIRED_SSE_AVX_FLOPS + measure in... |
|
||||
| RETIRED_SSE_AVX_FLOPS:u=0 | RETIRED_SSE_AVX_FLOPS + monitor at... |
|
||||
| RETIRED_SSE_AVX_FLOPS:k=0 | RETIRED_SSE_AVX_FLOPS + monitor at... |
|
||||
| RETIRED_SSE_AVX_FLOPS:period=0 | RETIRED_SSE_AVX_FLOPS + sampling p... |
|
||||
| RETIRED_SSE_AVX_FLOPS:freq=0 | RETIRED_SSE_AVX_FLOPS + sampling f... |
|
||||
| RETIRED_SSE_AVX_FLOPS:excl=0 | RETIRED_SSE_AVX_FLOPS + exclusive ... |
|
||||
| RETIRED_SSE_AVX_FLOPS:mg=0 | RETIRED_SSE_AVX_FLOPS + monitor gu... |
|
||||
| RETIRED_SSE_AVX_FLOPS:mh=0 | RETIRED_SSE_AVX_FLOPS + monitor ho... |
|
||||
| RETIRED_SSE_AVX_FLOPS:cpu=0 | RETIRED_SSE_AVX_FLOPS + CPU to pro... |
|
||||
| RETIRED_SSE_AVX_FLOPS:pinned=0 | RETIRED_SSE_AVX_FLOPS + pin event ... |
|
||||
| DIV_CYCLES_BUSY_COUNT | Number of cycles when the divider ... |
|
||||
| DIV_CYCLES_BUSY_COUNT:e=0 | DIV_CYCLES_BUSY_COUNT + edge level |
|
||||
| DIV_CYCLES_BUSY_COUNT:i=0 | DIV_CYCLES_BUSY_COUNT + invert |
|
||||
| DIV_CYCLES_BUSY_COUNT:c=0 | DIV_CYCLES_BUSY_COUNT + counter-ma... |
|
||||
| DIV_CYCLES_BUSY_COUNT:g=0 | DIV_CYCLES_BUSY_COUNT + measure in... |
|
||||
| DIV_CYCLES_BUSY_COUNT:u=0 | DIV_CYCLES_BUSY_COUNT + monitor at... |
|
||||
| DIV_CYCLES_BUSY_COUNT:k=0 | DIV_CYCLES_BUSY_COUNT + monitor at... |
|
||||
| DIV_CYCLES_BUSY_COUNT:period=0 | DIV_CYCLES_BUSY_COUNT + sampling p... |
|
||||
| DIV_CYCLES_BUSY_COUNT:freq=0 | DIV_CYCLES_BUSY_COUNT + sampling f... |
|
||||
| DIV_CYCLES_BUSY_COUNT:excl=0 | DIV_CYCLES_BUSY_COUNT + exclusive ... |
|
||||
| DIV_CYCLES_BUSY_COUNT:mg=0 | DIV_CYCLES_BUSY_COUNT + monitor gu... |
|
||||
| DIV_CYCLES_BUSY_COUNT:mh=0 | DIV_CYCLES_BUSY_COUNT + monitor ho... |
|
||||
| DIV_CYCLES_BUSY_COUNT:cpu=0 | DIV_CYCLES_BUSY_COUNT + CPU to pro... |
|
||||
| DIV_CYCLES_BUSY_COUNT:pinned=0 | DIV_CYCLES_BUSY_COUNT + pin event ... |
|
||||
| DIV_OP_COUNT | Number of divide uops. |
|
||||
| DIV_OP_COUNT:e=0 | DIV_OP_COUNT + edge level |
|
||||
| DIV_OP_COUNT:i=0 | DIV_OP_COUNT + invert |
|
||||
| DIV_OP_COUNT:c=0 | DIV_OP_COUNT + counter-mask in ran... |
|
||||
| DIV_OP_COUNT:g=0 | DIV_OP_COUNT + measure in guest |
|
||||
| DIV_OP_COUNT:u=0 | DIV_OP_COUNT + monitor at user level |
|
||||
| DIV_OP_COUNT:k=0 | DIV_OP_COUNT + monitor at kernel l... |
| DIV_OP_COUNT:period=0 | DIV_OP_COUNT + sampling period |
| DIV_OP_COUNT:freq=0 | DIV_OP_COUNT + sampling frequency ... |
| DIV_OP_COUNT:excl=0 | DIV_OP_COUNT + exclusive access |
| DIV_OP_COUNT:mg=0 | DIV_OP_COUNT + monitor guest execu... |
| DIV_OP_COUNT:mh=0 | DIV_OP_COUNT + monitor host execution |
| DIV_OP_COUNT:cpu=0 | DIV_OP_COUNT + CPU to program |
| DIV_OP_COUNT:pinned=0 | DIV_OP_COUNT + pin event to counters |
| ... etc. ... | |
| amd64_rapl::RAPL_ENERGY_PKG | Number of Joules consumed by all c... |
| amd64_rapl::RAPL_ENERGY_PKG:u=0 | amd64_rapl::RAPL_ENERGY_PKG + moni... |
| amd64_rapl::RAPL_ENERGY_PKG:k=0 | amd64_rapl::RAPL_ENERGY_PKG + moni... |
| amd64_rapl::RAPL_ENERGY_PKG:period=0 | amd64_rapl::RAPL_ENERGY_PKG + samp... |
| amd64_rapl::RAPL_ENERGY_PKG:freq=0 | amd64_rapl::RAPL_ENERGY_PKG + samp... |
| amd64_rapl::RAPL_ENERGY_PKG:excl=0 | amd64_rapl::RAPL_ENERGY_PKG + excl... |
| amd64_rapl::RAPL_ENERGY_PKG:mg=0 | amd64_rapl::RAPL_ENERGY_PKG + moni... |
| amd64_rapl::RAPL_ENERGY_PKG:mh=0 | amd64_rapl::RAPL_ENERGY_PKG + moni... |
| amd64_rapl::RAPL_ENERGY_PKG:cpu=0 | amd64_rapl::RAPL_ENERGY_PKG + CPU ... |
| amd64_rapl::RAPL_ENERGY_PKG:pinned=0 | amd64_rapl::RAPL_ENERGY_PKG + pin ... |
| appio:::READ_BYTES | Bytes read |
| appio:::READ_CALLS | Number of read calls |
| appio:::READ_ERR | Number of read calls that resulted... |
| appio:::READ_INTERRUPTED | Number of read calls that timed ou... |
| appio:::READ_WOULD_BLOCK | Number of read calls that would ha... |
| appio:::READ_SHORT | Number of read calls that returned... |
| appio:::READ_EOF | Number of read calls that returned... |
| appio:::READ_BLOCK_SIZE | Average block size of reads |
| appio:::READ_USEC | Real microseconds spent in reads |
| appio:::WRITE_BYTES | Bytes written |
| appio:::WRITE_CALLS | Number of write calls |
| appio:::WRITE_ERR | Number of write calls that resulte... |
| appio:::WRITE_SHORT | Number of write calls that wrote l... |
| appio:::WRITE_INTERRUPTED | Number of write calls that timed o... |
| appio:::WRITE_WOULD_BLOCK | Number of write calls that would h... |
| appio:::WRITE_BLOCK_SIZE | Mean block size of writes |
| appio:::WRITE_USEC | Real microseconds spent in writes |
| appio:::OPEN_CALLS | Number of open calls |
| appio:::OPEN_ERR | Number of open calls that resulted... |
| appio:::OPEN_FDS | Number of currently open descriptors |
| appio:::SELECT_USEC | Real microseconds spent in select ... |
| appio:::RECV_BYTES | Bytes read in recv/recvmsg/recvfrom |
| appio:::RECV_CALLS | Number of recv/recvmsg/recvfrom calls |
| appio:::RECV_ERR | Number of recv/recvmsg/recvfrom ca... |
| appio:::RECV_INTERRUPTED | Number of recv/recvmsg/recvfrom ca... |
| appio:::RECV_WOULD_BLOCK | Number of recv/recvmsg/recvfrom ca... |
| appio:::RECV_SHORT | Number of recv/recvmsg/recvfrom ca... |
| appio:::RECV_EOF | Number of recv/recvmsg/recvfrom ca... |
| appio:::RECV_BLOCK_SIZE | Average block size of recv/recvmsg... |
| appio:::RECV_USEC | Real microseconds spent in recv/re... |
| appio:::SOCK_READ_BYTES | Bytes read from socket |
| appio:::SOCK_READ_CALLS | Number of read calls on socket |
| appio:::SOCK_READ_ERR | Number of read calls on socket tha... |
| appio:::SOCK_READ_SHORT | Number of read calls on socket tha... |
| appio:::SOCK_READ_WOULD_BLOCK | Number of read calls on socket tha... |
| appio:::SOCK_READ_USEC | Real microseconds spent in read(s)... |
| appio:::SOCK_WRITE_BYTES | Bytes written to socket |
| appio:::SOCK_WRITE_CALLS | Number of write calls to socket |
| appio:::SOCK_WRITE_ERR | Number of write calls to socket th... |
| appio:::SOCK_WRITE_SHORT | Number of write calls to socket th... |
| appio:::SOCK_WRITE_WOULD_BLOCK | Number of write calls to socket th... |
| appio:::SOCK_WRITE_USEC | Real microseconds spent in write(s... |
| appio:::SEEK_CALLS | Number of seek calls |
| appio:::SEEK_ABS_STRIDE_SIZE | Average absolute stride size of seeks |
| appio:::SEEK_USEC | Real microseconds spent in seek calls |
| coretemp:::hwmon2:in0_input | V, amdgpu module, label vddgfx |
| coretemp:::hwmon2:temp1_input | degrees C, amdgpu module, label edge |
| coretemp:::hwmon2:temp2_input | degrees C, amdgpu module, label ju... |
| coretemp:::hwmon2:temp3_input | degrees C, amdgpu module, label mem |
| coretemp:::hwmon2:fan1_input | RPM, amdgpu module, label ? |
| coretemp:::hwmon0:temp1_input | degrees C, nvme module, label Comp... |
| coretemp:::hwmon0:temp2_input | degrees C, nvme module, label Sens... |
| coretemp:::hwmon0:temp3_input | degrees C, nvme module, label Sens... |
| coretemp:::hwmon3:temp1_input | degrees C, k10temp module, label Tctl |
| coretemp:::hwmon3:temp2_input | degrees C, k10temp module, label Tdie |
| coretemp:::hwmon3:temp5_input | degrees C, k10temp module, label T... |
| coretemp:::hwmon3:temp7_input | degrees C, k10temp module, label T... |
| coretemp:::hwmon1:temp1_input | degrees C, enp1s0 module, label PH... |
| coretemp:::hwmon1:temp2_input | degrees C, enp1s0 module, label MA... |
| io:::rchar | Characters read. |
| io:::wchar | Characters written. |
| io:::syscr | Characters read by system calls. |
| io:::syscw | Characters written by system calls. |
| io:::read_bytes | Binary bytes read. |
| io:::write_bytes | Binary bytes written. |
| io:::cancelled_write_bytes | Binary write bytes cancelled. |
| net:::lo:rx:bytes | lo receive bytes |
| net:::lo:rx:packets | lo receive packets |
| net:::lo:rx:errors | lo receive errors |
| net:::lo:rx:dropped | lo receive dropped |
| net:::lo:rx:fifo | lo receive fifo |
| net:::lo:rx:frame | lo receive frame |
| net:::lo:rx:compressed | lo receive compressed |
| net:::lo:rx:multicast | lo receive multicast |
| net:::lo:tx:bytes | lo transmit bytes |
| net:::lo:tx:packets | lo transmit packets |
| net:::lo:tx:errors | lo transmit errors |
| net:::lo:tx:dropped | lo transmit dropped |
| net:::lo:tx:fifo | lo transmit fifo |
| net:::lo:tx:colls | lo transmit colls |
| net:::lo:tx:carrier | lo transmit carrier |
| net:::lo:tx:compressed | lo transmit compressed |
| net:::enp1s0:rx:bytes | enp1s0 receive bytes |
| net:::enp1s0:rx:packets | enp1s0 receive packets |
| net:::enp1s0:rx:errors | enp1s0 receive errors |
| net:::enp1s0:rx:dropped | enp1s0 receive dropped |
| net:::enp1s0:rx:fifo | enp1s0 receive fifo |
| net:::enp1s0:rx:frame | enp1s0 receive frame |
| net:::enp1s0:rx:compressed | enp1s0 receive compressed |
| net:::enp1s0:rx:multicast | enp1s0 receive multicast |
| net:::enp1s0:tx:bytes | enp1s0 transmit bytes |
| net:::enp1s0:tx:packets | enp1s0 transmit packets |
| net:::enp1s0:tx:errors | enp1s0 transmit errors |
| net:::enp1s0:tx:dropped | enp1s0 transmit dropped |
| net:::enp1s0:tx:fifo | enp1s0 transmit fifo |
| net:::enp1s0:tx:colls | enp1s0 transmit colls |
| net:::enp1s0:tx:carrier | enp1s0 transmit carrier |
| net:::enp1s0:tx:compressed | enp1s0 transmit compressed |
| net:::vxlan.calico:rx:bytes | vxlan.calico receive bytes |
| net:::vxlan.calico:rx:packets | vxlan.calico receive packets |
| net:::vxlan.calico:rx:errors | vxlan.calico receive errors |
| net:::vxlan.calico:rx:dropped | vxlan.calico receive dropped |
| net:::vxlan.calico:rx:fifo | vxlan.calico receive fifo |
| net:::vxlan.calico:rx:frame | vxlan.calico receive frame |
| net:::vxlan.calico:rx:compressed | vxlan.calico receive compressed |
| net:::vxlan.calico:rx:multicast | vxlan.calico receive multicast |
| net:::vxlan.calico:tx:bytes | vxlan.calico transmit bytes |
| net:::vxlan.calico:tx:packets | vxlan.calico transmit packets |
| net:::vxlan.calico:tx:errors | vxlan.calico transmit errors |
| net:::vxlan.calico:tx:dropped | vxlan.calico transmit dropped |
| net:::vxlan.calico:tx:fifo | vxlan.calico transmit fifo |
| net:::vxlan.calico:tx:colls | vxlan.calico transmit colls |
| net:::vxlan.calico:tx:carrier | vxlan.calico transmit carrier |
| net:::vxlan.calico:tx:compressed | vxlan.calico transmit compressed |
| net:::cali59d6fabc2aa:rx:bytes | cali59d6fabc2aa receive bytes |
| net:::cali59d6fabc2aa:rx:packets | cali59d6fabc2aa receive packets |
| net:::cali59d6fabc2aa:rx:errors | cali59d6fabc2aa receive errors |
| net:::cali59d6fabc2aa:rx:dropped | cali59d6fabc2aa receive dropped |
| net:::cali59d6fabc2aa:rx:fifo | cali59d6fabc2aa receive fifo |
| net:::cali59d6fabc2aa:rx:frame | cali59d6fabc2aa receive frame |
| net:::cali59d6fabc2aa:rx:compressed | cali59d6fabc2aa receive compressed |
| net:::cali59d6fabc2aa:rx:multicast | cali59d6fabc2aa receive multicast |
| net:::cali59d6fabc2aa:tx:bytes | cali59d6fabc2aa transmit bytes |
| net:::cali59d6fabc2aa:tx:packets | cali59d6fabc2aa transmit packets |
| net:::cali59d6fabc2aa:tx:errors | cali59d6fabc2aa transmit errors |
| net:::cali59d6fabc2aa:tx:dropped | cali59d6fabc2aa transmit dropped |
| net:::cali59d6fabc2aa:tx:fifo | cali59d6fabc2aa transmit fifo |
| net:::cali59d6fabc2aa:tx:colls | cali59d6fabc2aa transmit colls |
| net:::cali59d6fabc2aa:tx:carrier | cali59d6fabc2aa transmit carrier |
| net:::cali59d6fabc2aa:tx:compressed | cali59d6fabc2aa transmit compressed |
|---------------------------------------|---------------------------------------|
```

## Creating a Configuration File
@@ -370,44 +651,92 @@ but do not override an existing value for the environment variable.

```shell
# lvals starting with $ are variables
$USE = ON
$ENABLE = ON
$SAMPLE = OFF

# use fields
OMNITRACE_USE_PERFETTO = $USE
OMNITRACE_USE_TIMEMORY = $USE
OMNITRACE_USE_SAMPLING = $USE
OMNITRACE_USE_PID = OFF
OMNITRACE_USE_PERFETTO = $ENABLE
OMNITRACE_USE_TIMEMORY = $ENABLE
OMNITRACE_USE_SAMPLING = $SAMPLE
OMNITRACE_USE_THREAD_SAMPLING = $SAMPLE
OMNITRACE_CRITICAL_TRACE = OFF

# debug
OMNITRACE_DEBUG = OFF
OMNITRACE_VERBOSE = 1
OMNITRACE_DL_VERBOSE = 1

# output fields
OMNITRACE_OUTPUT_PREFIX = %tag%-
OMNITRACE_OUTPUT_PATH = omnitrace-example-output
OMNITRACE_OUTPUT_PREFIX = %tag%/
OMNITRACE_TIME_OUTPUT = OFF
OMNITRACE_USE_PID = OFF

# timemory fields
OMNITRACE_PAPI_EVENTS = PAPI_TOT_INS PAPI_FP_INS
OMNITRACE_TIMEMORY_COMPONENTS = wall_clock trip_count
OMNITRACE_MEMORY_UNITS = MB
OMNITRACE_TIMING_UNITS = sec

# sampling fields
OMNITRACE_SAMPLING_FREQ = 10

# rocm-smi fields
OMNITRACE_ROCM_SMI_DEVICES = 1
OMNITRACE_ROCM_SMI_DEVICES = 0

# misc env variables
OMNITRACE_SAMPLING_KEEP_DYNINST_SUFFIX = OFF
OMNITRACE_SAMPLING_KEEP_INTERNAL = OFF
```
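
The variable substitution used in this text format can be illustrated with a small Python sketch. It is not omnitrace's own parser: the function name `parse_config` and the rules it applies (a line starting with `#` is a comment, an rvalue that is exactly a previously defined `$NAME` is replaced by that variable's value, and a later assignment to the same key wins) are assumptions made for illustration only.

```python
def parse_config(text):
    """Parse `KEY = VALUE` lines, expanding `$NAME` variables (assumed semantics)."""
    variables = {}  # lvals starting with "$"
    settings = {}   # everything else
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        key, sep, value = line.partition("=")
        if not sep:
            continue  # ignore lines that contain no assignment
        key, value = key.strip(), value.strip()
        # Assumption: an rvalue that exactly matches a defined variable is substituted.
        value = variables.get(value, value)
        if key.startswith("$"):
            variables[key] = value
        else:
            settings[key] = value  # a later assignment overwrites an earlier one
    return settings


# Hypothetical excerpt in the same style as the sample file above.
SAMPLE = """
# lvals starting with $ are variables
$ENABLE = ON
$SAMPLE = OFF

OMNITRACE_USE_PERFETTO = $ENABLE
OMNITRACE_USE_SAMPLING = $SAMPLE
OMNITRACE_PAPI_EVENTS = PAPI_TOT_INS PAPI_FP_INS
"""

settings = parse_config(SAMPLE)
for name, value in settings.items():
    print(f"{name} = {value}")
```

Keeping variables and settings in separate dictionaries means a `$NAME` definition never leaks into the resulting settings.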

### Sample JSON Configuration File

The full JSON specification for a configuration value carries metadata in addition to the value itself, including its data type, command-line option, environment variable, categories, and description:

```json
{
    "omnitrace": {
        "settings": {
            "OMNITRACE_ADD_SECONDARY": {
                "count": -1,
                "name": "add_secondary",
                "data_type": "bool",
                "initial": true,
                "value": true,
                "max_count": 1,
                "cmdline": [
                    "--omnitrace-add-secondary"
                ],
                "environ": "OMNITRACE_ADD_SECONDARY",
                "cereal_class_version": 1,
                "categories": [
                    "component",
                    "data",
                    "native"
                ],
                "description": "Enable/disable components adding secondary (child) entries when available. E.g. suppress individual CUDA kernels, etc. when using Cupti components"
            }
        }
    }
}
```

However, when writing a JSON configuration file, the following is minimally acceptable to set `OMNITRACE_ADD_SECONDARY=false`:

```json
{
    "omnitrace": {
        "settings": {
            "OMNITRACE_ADD_SECONDARY": {
                "value": false
            }
        }
    }
}
```
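
A file in this minimal form can also be generated programmatically. The following Python sketch assumes that only the `omnitrace` / `settings` / setting name / `value` nesting shown above is required; the helper name `minimal_config` is hypothetical and not part of omnitrace.

```python
import json


def minimal_config(overrides):
    """Build the minimal JSON layout: omnitrace -> settings -> NAME -> value."""
    return {
        "omnitrace": {
            "settings": {name: {"value": value} for name, value in overrides.items()}
        }
    }


# Python's False is serialized as the JSON literal false.
config = minimal_config({"OMNITRACE_ADD_SECONDARY": False})
text = json.dumps(config, indent=4)
print(text)
```

Building the document as a dictionary and serializing it with `json.dumps` avoids hand-written quoting and bracket mistakes when several settings are overridden at once.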

### Sample XML Configuration File

The full XML specification for a configuration value contains
a lot of information:
The full XML specification for a configuration value contains the same information as the JSON specification:

```xml
<?xml version="1.0" encoding="utf-8"?>
@@ -424,7 +753,7 @@ a lot of information:
<count>-1</count>
<max_count>1</max_count>
<cmdline>
<value0>--timemory-add-secondary</value0>
<value0>--omnitrace-add-secondary</value0>
</cmdline>
<categories>
<value0>component</value0>
@@ -441,8 +770,7 @@ a lot of information:
</timemory_xml>
```

Howver when writing an XML configuration file, the following is perfectly acceptable
to set `OMNITRACE_ADD_SECONDARY=false`:
However, when writing an XML configuration file, the following is minimally acceptable to set `OMNITRACE_ADD_SECONDARY=false`:

```xml
<?xml version="1.0" encoding="utf-8"?>
@@ -456,51 +784,3 @@ to set `OMNITRACE_ADD_SECONDARY=false`:
</omnitrace>
</timemory_xml>
```

### Sample JSON Configuration File

The full JSON specification for a configuration value contains the same information as the XML:

```json
{
    "omnitrace": {
        "settings": {
            "OMNITRACE_ADD_SECONDARY": {
                "count": -1,
                "name": "add_secondary",
                "data_type": "bool",
                "initial": true,
                "value": true,
                "max_count": 1,
                "cmdline": [
                    "--timemory-add-secondary"
                ],
                "environ": "OMNITRACE_ADD_SECONDARY",
                "cereal_class_version": 1,
                "categories": [
                    "component",
                    "data",
                    "native"
                ],
                "description": "Enable/disable components adding secondary (child) entries when available. E.g. suppress individual CUDA kernels, etc. when using Cupti components"
            }
        }
    }
}
```

Similarly, the
Howver when writing an XML configuration file, the following is perfectly acceptable
to set `OMNITRACE_ADD_SECONDARY=false`:

```json
{
    "omnitrace": {
        "settings": {
            "OMNITRACE_ADD_SECONDARY": {
                "value": true
            }
        }
    }
}
```

@@ -25,8 +25,11 @@ make html

if [ -d ${SOURCE_DIR}/docs ]; then
    message "Removing stale documentation in ${SOURCE_DIR}/docs/"
    echo rm -rf ${SOURCE_DIR}/docs/*
    rm -rf ${SOURCE_DIR}/docs/*

    message "Adding nojekyll to docs/"
    cp -r ${WORK_DIR}/.nojekyll ${SOURCE_DIR}/docs/.nojekyll

    message "Copying source/docs/_build/html/* to docs/"
    echo cp -r ${WORK_DIR}/_build/html/* ${SOURCE_DIR}/docs/
    cp -r ${WORK_DIR}/_build/html/* ${SOURCE_DIR}/docs/
fi
