e8a5845661
* Added buffer counter collection API. Initial testing added into counter-collection sample. Added support for constant metrics in counter collection (#194) * Added support for constant metrics in counter collection Adds support and test cases for constant metrics (such as max wave size) and adds the metric kernel duration (though this is still not yet calculated). * Minor doc updates * Simple counter unit tests (#199) * Simple counter unit tests Unit tests and some minor fixes for simple and derived counter evaluation * Added unit tests for reduction operations (#200) * Added unit tests for reduction operations * added tests for combo (constant+regular) counters (#201) source formatting (clang-format v11) (#202) Co-authored-by: bwelton <bwelton@users.noreply.github.com> source formatting (clang-format v11) (#203) Co-authored-by: bwelton <bwelton@users.noreply.github.com> Local changes source formatting (clang-format v11) (#205) Co-authored-by: bwelton <bwelton@users.noreply.github.com> Minor doc fix Remove kernel_duration, migrate over set_dimensions to after HSA init source formatting (clang-format v11) (#207) Co-authored-by: bwelton <bwelton@users.noreply.github.com> Added output to ROCPROFILER_SAMPLE_OUTPUT_FILE: * Remove integer based counter in return struct This casues a lot of complications and seems to provide limit benefit of just treating all counters as doubles. For ease of use, drop the integer based counter. * source formatting (clang-format v11) (#217) Co-authored-by: bwelton <bwelton@users.noreply.github.com> * Add correlation id support to counters (#218) Adds correlation id support to counter collection. Requires tracing to be enabled to return any useful value currently (since we do not have HIP kernel tracing yet). * source formatting (clang-format v11) (#223) Co-authored-by: bwelton <bwelton@users.noreply.github.com> * Add sample that attempts to fetch all counters On whatever machine this test is run on, all counters available on the platform will attempted to be fetched from a kernel execution. Each counter will be fetched one time to check that the counter can be fetched on the platform and that the counter is returning the correct instance count (however due to the lack of transparency from AQL profiler this check is not functional for some counters). We do not do any implicit reduction on any counter, the result is that we see more counters than the number of events being requested. Below is the status of all counters on MI210. All counters appear functional with the changes in this PR. However, the instance count retruned will be greater than that returned by rocprofiler_query_counter_instance_count. Got 516 counters collected Counter ID: 0 (size) expected 1 instances and got 1 Counter ID: 1 (processor_id_low) expected 1 instances and got 1 Counter ID: 2 (capability) expected 1 instances and got 1 Counter ID: 3 (local_mem_size) expected 1 instances and got 1 Counter ID: 4 (min_latency) expected 1 instances and got 1 Counter ID: 5 (weight) expected 1 instances and got 1 Counter ID: 6 (node_from) expected 1 instances and got 1 Counter ID: 7 (version_major) expected 1 instances and got 1 Counter ID: 8 (version_minor) expected 1 instances and got 1 Counter ID: 9 (mem_clk_max) expected 1 instances and got 1 Counter ID: 10 (num_xcc) expected 1 instances and got 1 Counter ID: 11 (width) expected 1 instances and got 1 Counter ID: 12 (flags) expected 1 instances and got 1 Counter ID: 13 (size_in_bytes) expected 1 instances and got 1 Counter ID: 14 (array_count) expected 1 instances and got 1 Counter ID: 15 (num_gws) expected 1 instances and got 1 Counter ID: 16 (simd_id_base) expected 1 instances and got 1 Counter ID: 17 (max_waves_per_simd) expected 1 instances and got 1 Counter ID: 18 (sdma_fw_version) expected 1 instances and got 1 Counter ID: 19 (gfx_target_version) expected 1 instances and got 1 Counter ID: 20 (max_bandwidth) expected 1 instances and got 1 Counter ID: 21 (cpu_core_id_base) expected 1 instances and got 1 Counter ID: 22 (cache_line_size) expected 1 instances and got 1 Counter ID: 23 (level) expected 1 instances and got 1 Counter ID: 24 (min_bandwidth) expected 1 instances and got 1 Counter ID: 25 (location_id) expected 1 instances and got 1 Counter ID: 26 (wave_front_size) expected 1 instances and got 1 Counter ID: 27 (lds_size_in_kb) expected 1 instances and got 1 Counter ID: 28 (simd_count) expected 1 instances and got 1 Counter ID: 29 (fw_version) expected 1 instances and got 1 Counter ID: 30 (recommended_transfer_size) expected 1 instances and got 1 Counter ID: 31 (simd_per_cu) expected 1 instances and got 1 Counter ID: 32 (association) expected 1 instances and got 1 Counter ID: 33 (mem_banks_count) expected 1 instances and got 1 Counter ID: 34 (latency) expected 1 instances and got 1 Counter ID: 35 (max_latency) expected 1 instances and got 1 Counter ID: 36 (cpu_cores_count) expected 1 instances and got 1 Counter ID: 37 (io_links_count) expected 1 instances and got 1 Counter ID: 38 (domain) expected 1 instances and got 1 Counter ID: 39 (max_engine_clk_fcompute) expected 1 instances and got 1 Counter ID: 40 (caches_count) expected 1 instances and got 1 Counter ID: 41 (simd_arrays_per_engine) expected 1 instances and got 1 Counter ID: 42 (cache_lines_per_tag) expected 1 instances and got 1 Counter ID: 43 (gds_size_in_kb) expected 1 instances and got 1 Counter ID: 44 (cu_per_simd_array) expected 1 instances and got 1 Counter ID: 45 (type) expected 1 instances and got 1 Counter ID: 46 (max_slots_scratch_cu) expected 1 instances and got 1 Counter ID: 47 (vendor_id) expected 1 instances and got 1 Counter ID: 48 (device_id) expected 1 instances and got 1 Counter ID: 49 (heap_type) expected 1 instances and got 1 Counter ID: 50 (drm_render_minor) expected 1 instances and got 1 Counter ID: 51 (num_sdma_engines) expected 1 instances and got 1 Counter ID: 52 (node_to) expected 1 instances and got 1 Counter ID: 53 (num_sdma_xgmi_engines) expected 1 instances and got 1 Counter ID: 54 (num_sdma_queues_per_engine) expected 1 instances and got 1 Counter ID: 55 (hive_id) expected 1 instances and got 1 Counter ID: 56 (num_cp_queues) expected 1 instances and got 1 Counter ID: 57 (max_engine_clk_ccompute) expected 1 instances and got 1 Counter ID: 517 (MAX_WAVE_SIZE) expected 1 instances and got 1 Counter ID: 518 (SE_NUM) expected 1 instances and got 1 Counter ID: 519 (SIMD_NUM) expected 1 instances and got 1 Counter ID: 520 (CU_NUM) expected 1 instances and got 1 [ERROR]Counter ID: 521 (SQ_WAIT_INST_LDS) expected 1 instances and got 8 [ERROR]Counter ID: 522 (TCP_TCP_TA_DATA_STALL_CYCLES) expected 16 instances and got 128 Counter ID: 523 (GRBM_COUNT) expected 1 instances and got 1 Counter ID: 524 (GRBM_GUI_ACTIVE) expected 1 instances and got 1 Counter ID: 525 (GRBM_CP_BUSY) expected 1 instances and got 1 Counter ID: 526 (GRBM_SPI_BUSY) expected 1 instances and got 1 Counter ID: 527 (GRBM_TA_BUSY) expected 1 instances and got 1 Counter ID: 528 (GRBM_TC_BUSY) expected 1 instances and got 1 Counter ID: 529 (GRBM_CPC_BUSY) expected 1 instances and got 1 Counter ID: 530 (GRBM_CPF_BUSY) expected 1 instances and got 1 Counter ID: 531 (GRBM_UTCL2_BUSY) expected 1 instances and got 1 Counter ID: 532 (GRBM_EA_BUSY) expected 1 instances and got 1 Counter ID: 533 (CPC_ME1_BUSY_FOR_PACKET_DECODE) expected 1 instances and got 1 Counter ID: 534 (CPC_UTCL1_STALL_ON_TRANSLATION) expected 1 instances and got 1 Counter ID: 535 (CPC_CPC_STAT_BUSY) expected 1 instances and got 1 Counter ID: 536 (CPC_CPC_STAT_IDLE) expected 1 instances and got 1 Counter ID: 537 (CPC_CPC_STAT_STALL) expected 1 instances and got 1 Counter ID: 538 (CPC_CPC_TCIU_BUSY) expected 1 instances and got 1 Counter ID: 539 (CPC_CPC_TCIU_IDLE) expected 1 instances and got 1 Counter ID: 540 (CPC_CPC_UTCL2IU_BUSY) expected 1 instances and got 1 Counter ID: 541 (CPC_CPC_UTCL2IU_IDLE) expected 1 instances and got 1 Counter ID: 542 (CPC_CPC_UTCL2IU_STALL) expected 1 instances and got 1 Counter ID: 543 (CPC_ME1_DC0_SPI_BUSY) expected 1 instances and got 1 Counter ID: 544 (CPF_CMP_UTCL1_STALL_ON_TRANSLATION) expected 1 instances and got 1 Counter ID: 545 (CPF_CPF_STAT_BUSY) expected 1 instances and got 1 Counter ID: 546 (CPF_CPF_STAT_IDLE) expected 1 instances and got 1 Counter ID: 547 (CPF_CPF_STAT_STALL) expected 1 instances and got 1 Counter ID: 548 (CPF_CPF_TCIU_BUSY) expected 1 instances and got 1 Counter ID: 549 (CPF_CPF_TCIU_IDLE) expected 1 instances and got 1 Counter ID: 550 (CPF_CPF_TCIU_STALL) expected 1 instances and got 1 [ERROR]Counter ID: 551 (SPI_CSN_WINDOW_VALID) expected 1 instances and got 8 [ERROR]Counter ID: 552 (SPI_CSN_BUSY) expected 1 instances and got 8 [ERROR]Counter ID: 553 (SPI_CSN_NUM_THREADGROUPS) expected 1 instances and got 8 [ERROR]Counter ID: 554 (SPI_CSN_WAVE) expected 1 instances and got 8 [ERROR]Counter ID: 555 (SPI_RA_REQ_NO_ALLOC) expected 1 instances and got 8 [ERROR]Counter ID: 556 (SPI_RA_REQ_NO_ALLOC_CSN) expected 1 instances and got 8 [ERROR]Counter ID: 557 (SPI_RA_RES_STALL_CSN) expected 1 instances and got 8 [ERROR]Counter ID: 558 (SPI_RA_TMP_STALL_CSN) expected 1 instances and got 8 [ERROR]Counter ID: 559 (SPI_RA_WAVE_SIMD_FULL_CSN) expected 1 instances and got 8 [ERROR]Counter ID: 560 (SPI_RA_VGPR_SIMD_FULL_CSN) expected 1 instances and got 8 [ERROR]Counter ID: 561 (SPI_RA_SGPR_SIMD_FULL_CSN) expected 1 instances and got 8 [ERROR]Counter ID: 562 (SPI_RA_LDS_CU_FULL_CSN) expected 1 instances and got 8 [ERROR]Counter ID: 563 (SPI_RA_BAR_CU_FULL_CSN) expected 1 instances and got 8 [ERROR]Counter ID: 564 (SPI_RA_BULKY_CU_FULL_CSN) expected 1 instances and got 8 [ERROR]Counter ID: 565 (SPI_RA_TGLIM_CU_FULL_CSN) expected 1 instances and got 8 [ERROR]Counter ID: 566 (SPI_RA_WVLIM_STALL_CSN) expected 1 instances and got 8 [ERROR]Counter ID: 567 (SPI_SWC_CSC_WR) expected 1 instances and got 8 [ERROR]Counter ID: 568 (SPI_VWC_CSC_WR) expected 1 instances and got 8 [ERROR]Counter ID: 569 (SQ_ACCUM_PREV) expected 1 instances and got 8 [ERROR]Counter ID: 570 (SQ_CYCLES) expected 1 instances and got 8 [ERROR]Counter ID: 571 (SQ_BUSY_CYCLES) expected 1 instances and got 8 [ERROR]Counter ID: 572 (SQ_WAVES) expected 1 instances and got 8 [ERROR]Counter ID: 573 (SQ_LEVEL_WAVES) expected 1 instances and got 8 [ERROR]Counter ID: 574 (SQ_WAVES_EQ_64) expected 1 instances and got 8 [ERROR]Counter ID: 575 (SQ_WAVES_LT_64) expected 1 instances and got 8 [ERROR]Counter ID: 576 (SQ_WAVES_LT_48) expected 1 instances and got 8 [ERROR]Counter ID: 577 (SQ_WAVES_LT_32) expected 1 instances and got 8 [ERROR]Counter ID: 578 (SQ_WAVES_LT_16) expected 1 instances and got 8 [ERROR]Counter ID: 579 (SQ_BUSY_CU_CYCLES) expected 1 instances and got 8 [ERROR]Counter ID: 580 (SQ_ITEMS) expected 1 instances and got 8 [ERROR]Counter ID: 581 (SQ_INSTS) expected 1 instances and got 8 [ERROR]Counter ID: 582 (SQ_INSTS_VALU) expected 1 instances and got 8 [ERROR]Counter ID: 583 (SQ_INSTS_VALU_ADD_F16) expected 1 instances and got 8 [ERROR]Counter ID: 584 (SQ_INSTS_VALU_MUL_F16) expected 1 instances and got 8 [ERROR]Counter ID: 585 (SQ_INSTS_VALU_FMA_F16) expected 1 instances and got 8 [ERROR]Counter ID: 586 (SQ_INSTS_VALU_TRANS_F16) expected 1 instances and got 8 [ERROR]Counter ID: 587 (SQ_INSTS_VALU_ADD_F32) expected 1 instances and got 8 [ERROR]Counter ID: 588 (SQ_INSTS_VALU_MUL_F32) expected 1 instances and got 8 [ERROR]Counter ID: 589 (SQ_INSTS_VALU_FMA_F32) expected 1 instances and got 8 [ERROR]Counter ID: 590 (SQ_INSTS_VALU_TRANS_F32) expected 1 instances and got 8 [ERROR]Counter ID: 591 (SQ_INSTS_VALU_ADD_F64) expected 1 instances and got 8 [ERROR]Counter ID: 592 (SQ_INSTS_VALU_MUL_F64) expected 1 instances and got 8 [ERROR]Counter ID: 593 (SQ_INSTS_VALU_FMA_F64) expected 1 instances and got 8 [ERROR]Counter ID: 594 (SQ_INSTS_VALU_TRANS_F64) expected 1 instances and got 8 [ERROR]Counter ID: 595 (SQ_INSTS_VALU_INT32) expected 1 instances and got 8 [ERROR]Counter ID: 596 (SQ_INSTS_VALU_INT64) expected 1 instances and got 8 [ERROR]Counter ID: 597 (SQ_INSTS_VALU_CVT) expected 1 instances and got 8 [ERROR]Counter ID: 598 (SQ_INSTS_VALU_MFMA_I8) expected 1 instances and got 8 [ERROR]Counter ID: 599 (SQ_INSTS_VALU_MFMA_F16) expected 1 instances and got 8 [ERROR]Counter ID: 600 (SQ_INSTS_VALU_MFMA_BF16) expected 1 instances and got 8 [ERROR]Counter ID: 601 (SQ_INSTS_VALU_MFMA_F32) expected 1 instances and got 8 [ERROR]Counter ID: 602 (SQ_INSTS_VALU_MFMA_F64) expected 1 instances and got 8 [ERROR]Counter ID: 603 (SQ_INSTS_VALU_MFMA_MOPS_I8) expected 1 instances and got 8 [ERROR]Counter ID: 604 (SQ_INSTS_VALU_MFMA_MOPS_F16) expected 1 instances and got 8 [ERROR]Counter ID: 605 (SQ_INSTS_VALU_MFMA_MOPS_BF16) expected 1 instances and got 8 [ERROR]Counter ID: 606 (SQ_INSTS_VALU_MFMA_MOPS_F32) expected 1 instances and got 8 [ERROR]Counter ID: 607 (SQ_INSTS_VALU_MFMA_MOPS_F64) expected 1 instances and got 8 [ERROR]Counter ID: 608 (SQ_INSTS_MFMA) expected 1 instances and got 8 [ERROR]Counter ID: 609 (SQ_INSTS_VMEM_WR) expected 1 instances and got 8 [ERROR]Counter ID: 610 (SQ_INSTS_VMEM_RD) expected 1 instances and got 8 [ERROR]Counter ID: 611 (SQ_INSTS_VMEM) expected 1 instances and got 8 [ERROR]Counter ID: 612 (SQ_INSTS_SALU) expected 1 instances and got 8 [ERROR]Counter ID: 613 (SQ_INSTS_SMEM) expected 1 instances and got 8 [ERROR]Counter ID: 614 (SQ_INSTS_FLAT) expected 1 instances and got 8 [ERROR]Counter ID: 615 (SQ_INSTS_FLAT_LDS_ONLY) expected 1 instances and got 8 [ERROR]Counter ID: 616 (SQ_INSTS_LDS) expected 1 instances and got 8 [ERROR]Counter ID: 617 (SQ_INSTS_GDS) expected 1 instances and got 8 [ERROR]Counter ID: 618 (SQ_INSTS_EXP_GDS) expected 1 instances and got 8 [ERROR]Counter ID: 619 (SQ_INSTS_BRANCH) expected 1 instances and got 8 [ERROR]Counter ID: 620 (SQ_INSTS_SENDMSG) expected 1 instances and got 8 [ERROR]Counter ID: 621 (SQ_INSTS_VSKIPPED) expected 1 instances and got 8 [ERROR]Counter ID: 622 (SQ_INST_LEVEL_VMEM) expected 1 instances and got 8 [ERROR]Counter ID: 623 (SQ_INST_LEVEL_SMEM) expected 1 instances and got 8 [ERROR]Counter ID: 624 (SQ_INST_LEVEL_LDS) expected 1 instances and got 8 [ERROR]Counter ID: 625 (SQ_VALU_MFMA_BUSY_CYCLES) expected 1 instances and got 8 [ERROR]Counter ID: 626 (SQ_WAVE_CYCLES) expected 1 instances and got 8 [ERROR]Counter ID: 627 (SQ_WAIT_ANY) expected 1 instances and got 8 [ERROR]Counter ID: 628 (SQ_WAIT_INST_ANY) expected 1 instances and got 8 [ERROR]Counter ID: 629 (SQ_ACTIVE_INST_ANY) expected 1 instances and got 8 [ERROR]Counter ID: 630 (SQ_ACTIVE_INST_VMEM) expected 1 instances and got 8 [ERROR]Counter ID: 631 (SQ_ACTIVE_INST_LDS) expected 1 instances and got 8 [ERROR]Counter ID: 632 (SQ_ACTIVE_INST_VALU) expected 1 instances and got 8 [ERROR]Counter ID: 633 (SQ_ACTIVE_INST_SCA) expected 1 instances and got 8 [ERROR]Counter ID: 634 (SQ_ACTIVE_INST_EXP_GDS) expected 1 instances and got 8 [ERROR]Counter ID: 635 (SQ_ACTIVE_INST_MISC) expected 1 instances and got 8 [ERROR]Counter ID: 636 (SQ_ACTIVE_INST_FLAT) expected 1 instances and got 8 [ERROR]Counter ID: 637 (SQ_INST_CYCLES_VMEM_WR) expected 1 instances and got 8 [ERROR]Counter ID: 638 (SQ_INST_CYCLES_VMEM_RD) expected 1 instances and got 8 [ERROR]Counter ID: 639 (SQ_INST_CYCLES_SMEM) expected 1 instances and got 8 [ERROR]Counter ID: 640 (SQ_INST_CYCLES_SALU) expected 1 instances and got 8 [ERROR]Counter ID: 641 (SQ_THREAD_CYCLES_VALU) expected 1 instances and got 8 [ERROR]Counter ID: 642 (SQ_IFETCH) expected 1 instances and got 8 [ERROR]Counter ID: 643 (SQ_IFETCH_LEVEL) expected 1 instances and got 8 [ERROR]Counter ID: 644 (SQ_LDS_BANK_CONFLICT) expected 1 instances and got 8 [ERROR]Counter ID: 645 (SQ_LDS_ADDR_CONFLICT) expected 1 instances and got 8 [ERROR]Counter ID: 646 (SQ_LDS_UNALIGNED_STALL) expected 1 instances and got 8 [ERROR]Counter ID: 647 (SQ_LDS_MEM_VIOLATIONS) expected 1 instances and got 8 [ERROR]Counter ID: 648 (SQ_LDS_ATOMIC_RETURN) expected 1 instances and got 8 [ERROR]Counter ID: 649 (SQ_LDS_IDX_ACTIVE) expected 1 instances and got 8 [ERROR]Counter ID: 650 (SQ_ACCUM_PREV_HIRES) expected 1 instances and got 8 [ERROR]Counter ID: 651 (SQ_WAVES_RESTORED) expected 1 instances and got 8 [ERROR]Counter ID: 652 (SQ_WAVES_SAVED) expected 1 instances and got 8 [ERROR]Counter ID: 653 (SQ_INSTS_SMEM_NORM) expected 1 instances and got 8 [ERROR]Counter ID: 654 (SQC_DCACHE_INPUT_VALID_READYB) expected 1 instances and got 8 [ERROR]Counter ID: 655 (SQC_TC_REQ) expected 1 instances and got 8 [ERROR]Counter ID: 656 (SQC_TC_INST_REQ) expected 1 instances and got 8 [ERROR]Counter ID: 657 (SQC_TC_DATA_READ_REQ) expected 1 instances and got 8 [ERROR]Counter ID: 658 (SQC_TC_DATA_WRITE_REQ) expected 1 instances and got 8 [ERROR]Counter ID: 659 (SQC_TC_DATA_ATOMIC_REQ) expected 1 instances and got 8 [ERROR]Counter ID: 660 (SQC_TC_STALL) expected 1 instances and got 8 [ERROR]Counter ID: 661 (SQC_ICACHE_REQ) expected 1 instances and got 8 [ERROR]Counter ID: 662 (SQC_ICACHE_HITS) expected 1 instances and got 8 [ERROR]Counter ID: 663 (SQC_ICACHE_MISSES) expected 1 instances and got 8 [ERROR]Counter ID: 664 (SQC_ICACHE_MISSES_DUPLICATE) expected 1 instances and got 8 [ERROR]Counter ID: 665 (SQC_DCACHE_REQ) expected 1 instances and got 8 [ERROR]Counter ID: 666 (SQC_DCACHE_HITS) expected 1 instances and got 8 [ERROR]Counter ID: 667 (SQC_DCACHE_MISSES) expected 1 instances and got 8 [ERROR]Counter ID: 668 (SQC_DCACHE_MISSES_DUPLICATE) expected 1 instances and got 8 [ERROR]Counter ID: 669 (SQC_DCACHE_ATOMIC) expected 1 instances and got 8 [ERROR]Counter ID: 670 (SQC_DCACHE_REQ_READ_1) expected 1 instances and got 8 [ERROR]Counter ID: 671 (SQC_DCACHE_REQ_READ_2) expected 1 instances and got 8 [ERROR]Counter ID: 672 (SQC_DCACHE_REQ_READ_4) expected 1 instances and got 8 [ERROR]Counter ID: 673 (SQC_DCACHE_REQ_READ_8) expected 1 instances and got 8 [ERROR]Counter ID: 674 (SQC_DCACHE_REQ_READ_16) expected 1 instances and got 8 [ERROR]Counter ID: 675 (TA_TA_BUSY) expected 16 instances and got 128 [ERROR]Counter ID: 676 (TA_TOTAL_WAVEFRONTS) expected 16 instances and got 128 [ERROR]Counter ID: 677 (TA_BUFFER_WAVEFRONTS) expected 16 instances and got 128 [ERROR]Counter ID: 678 (TA_BUFFER_READ_WAVEFRONTS) expected 16 instances and got 128 [ERROR]Counter ID: 679 (TA_BUFFER_WRITE_WAVEFRONTS) expected 16 instances and got 128 [ERROR]Counter ID: 680 (TA_BUFFER_ATOMIC_WAVEFRONTS) expected 16 instances and got 128 [ERROR]Counter ID: 681 (TA_BUFFER_TOTAL_CYCLES) expected 16 instances and got 128 [ERROR]Counter ID: 682 (TA_BUFFER_COALESCED_READ_CYCLES) expected 16 instances and got 128 [ERROR]Counter ID: 683 (TA_BUFFER_COALESCED_WRITE_CYCLES) expected 16 instances and got 128 [ERROR]Counter ID: 684 (TA_ADDR_STALLED_BY_TC_CYCLES) expected 16 instances and got 128 [ERROR]Counter ID: 685 (TA_ADDR_STALLED_BY_TD_CYCLES) expected 16 instances and got 128 [ERROR]Counter ID: 686 (TA_DATA_STALLED_BY_TC_CYCLES) expected 16 instances and got 128 [ERROR]Counter ID: 687 (TA_FLAT_WAVEFRONTS) expected 16 instances and got 128 [ERROR]Counter ID: 688 (TA_FLAT_READ_WAVEFRONTS) expected 16 instances and got 128 [ERROR]Counter ID: 689 (TA_FLAT_WRITE_WAVEFRONTS) expected 16 instances and got 128 [ERROR]Counter ID: 690 (TA_FLAT_ATOMIC_WAVEFRONTS) expected 16 instances and got 128 [ERROR]Counter ID: 691 (TD_TD_BUSY) expected 16 instances and got 128 [ERROR]Counter ID: 692 (TD_TC_STALL) expected 16 instances and got 128 [ERROR]Counter ID: 693 (TD_SPI_STALL) expected 16 instances and got 128 [ERROR]Counter ID: 694 (TD_LOAD_WAVEFRONT) expected 16 instances and got 128 [ERROR]Counter ID: 695 (TD_ATOMIC_WAVEFRONT) expected 16 instances and got 128 [ERROR]Counter ID: 696 (TD_STORE_WAVEFRONT) expected 16 instances and got 128 [ERROR]Counter ID: 697 (TD_COALESCABLE_WAVEFRONT) expected 16 instances and got 128 [ERROR]Counter ID: 698 (TCP_GATE_EN1) expected 16 instances and got 128 [ERROR]Counter ID: 699 (TCP_GATE_EN2) expected 16 instances and got 128 [ERROR]Counter ID: 700 (TCP_TD_TCP_STALL_CYCLES) expected 16 instances and got 128 [ERROR]Counter ID: 701 (TCP_TCR_TCP_STALL_CYCLES) expected 16 instances and got 128 [ERROR]Counter ID: 702 (TCP_READ_TAGCONFLICT_STALL_CYCLES) expected 16 instances and got 128 [ERROR]Counter ID: 703 (TCP_WRITE_TAGCONFLICT_STALL_CYCLES) expected 16 instances and got 128 [ERROR]Counter ID: 704 (TCP_ATOMIC_TAGCONFLICT_STALL_CYCLES) expected 16 instances and got 128 [ERROR]Counter ID: 705 (TCP_PENDING_STALL_CYCLES) expected 16 instances and got 128 [ERROR]Counter ID: 706 (TCP_TA_TCP_STATE_READ) expected 16 instances and got 128 [ERROR]Counter ID: 707 (TCP_VOLATILE) expected 16 instances and got 128 [ERROR]Counter ID: 708 (TCP_TOTAL_ACCESSES) expected 16 instances and got 128 [ERROR]Counter ID: 709 (TCP_TOTAL_READ) expected 16 instances and got 128 [ERROR]Counter ID: 710 (TCP_TOTAL_WRITE) expected 16 instances and got 128 [ERROR]Counter ID: 711 (TCP_TOTAL_ATOMIC_WITH_RET) expected 16 instances and got 128 [ERROR]Counter ID: 712 (TCP_TOTAL_ATOMIC_WITHOUT_RET) expected 16 instances and got 128 [ERROR]Counter ID: 713 (TCP_TOTAL_WRITEBACK_INVALIDATES) expected 16 instances and got 128 [ERROR]Counter ID: 714 (TCP_UTCL1_REQUEST) expected 16 instances and got 128 [ERROR]Counter ID: 715 (TCP_UTCL1_TRANSLATION_MISS) expected 16 instances and got 128 [ERROR]Counter ID: 716 (TCP_UTCL1_TRANSLATION_HIT) expected 16 instances and got 128 [ERROR]Counter ID: 717 (TCP_UTCL1_PERMISSION_MISS) expected 16 instances and got 128 [ERROR]Counter ID: 718 (TCP_TOTAL_CACHE_ACCESSES) expected 16 instances and got 128 [ERROR]Counter ID: 719 (TCP_TCP_LATENCY) expected 16 instances and got 128 [ERROR]Counter ID: 720 (TCP_TCC_READ_REQ_LATENCY) expected 16 instances and got 128 [ERROR]Counter ID: 721 (TCP_TCC_WRITE_REQ_LATENCY) expected 16 instances and got 128 [ERROR]Counter ID: 722 (TCP_TCC_READ_REQ) expected 16 instances and got 128 [ERROR]Counter ID: 723 (TCP_TCC_WRITE_REQ) expected 16 instances and got 128 [ERROR]Counter ID: 724 (TCP_TCC_ATOMIC_WITH_RET_REQ) expected 16 instances and got 128 [ERROR]Counter ID: 725 (TCP_TCC_ATOMIC_WITHOUT_RET_REQ) expected 16 instances and got 128 [ERROR]Counter ID: 726 (TCP_TCC_NC_READ_REQ) expected 16 instances and got 128 [ERROR]Counter ID: 727 (TCP_TCC_NC_WRITE_REQ) expected 16 instances and got 128 [ERROR]Counter ID: 728 (TCP_TCC_NC_ATOMIC_REQ) expected 16 instances and got 128 [ERROR]Counter ID: 729 (TCP_TCC_UC_READ_REQ) expected 16 instances and got 128 [ERROR]Counter ID: 730 (TCP_TCC_UC_WRITE_REQ) expected 16 instances and got 128 [ERROR]Counter ID: 731 (TCP_TCC_UC_ATOMIC_REQ) expected 16 instances and got 128 [ERROR]Counter ID: 732 (TCP_TCC_CC_READ_REQ) expected 16 instances and got 128 [ERROR]Counter ID: 733 (TCP_TCC_CC_WRITE_REQ) expected 16 instances and got 128 [ERROR]Counter ID: 734 (TCP_TCC_CC_ATOMIC_REQ) expected 16 instances and got 128 [ERROR]Counter ID: 735 (TCP_TCC_RW_READ_REQ) expected 16 instances and got 128 [ERROR]Counter ID: 736 (TCP_TCC_RW_WRITE_REQ) expected 16 instances and got 128 [ERROR]Counter ID: 737 (TCP_TCC_RW_ATOMIC_REQ) expected 16 instances and got 128 Counter ID: 738 (TCA_CYCLE) expected 32 instances and got 32 Counter ID: 739 (TCA_BUSY) expected 32 instances and got 32 Counter ID: 740 (TCC_CYCLE) expected 32 instances and got 32 Counter ID: 741 (TCC_BUSY) expected 32 instances and got 32 Counter ID: 742 (TCC_REQ) expected 32 instances and got 32 Counter ID: 743 (TCC_STREAMING_REQ) expected 32 instances and got 32 Counter ID: 744 (TCC_NC_REQ) expected 32 instances and got 32 Counter ID: 745 (TCC_UC_REQ) expected 32 instances and got 32 Counter ID: 746 (TCC_CC_REQ) expected 32 instances and got 32 Counter ID: 747 (TCC_RW_REQ) expected 32 instances and got 32 Counter ID: 748 (TCC_PROBE) expected 32 instances and got 32 Counter ID: 749 (TCC_PROBE_ALL) expected 32 instances and got 32 Counter ID: 750 (TCC_READ) expected 32 instances and got 32 Counter ID: 751 (TCC_WRITE) expected 32 instances and got 32 Counter ID: 752 (TCC_ATOMIC) expected 32 instances and got 32 Counter ID: 753 (TCC_HIT) expected 32 instances and got 32 Counter ID: 754 (TCC_MISS) expected 32 instances and got 32 Counter ID: 755 (TCC_WRITEBACK) expected 32 instances and got 32 Counter ID: 756 (TCC_EA_WRREQ) expected 32 instances and got 32 Counter ID: 757 (TCC_EA_WRREQ_64B) expected 32 instances and got 32 Counter ID: 758 (TCC_EA_WR_UNCACHED_32B) expected 32 instances and got 32 Counter ID: 759 (TCC_EA_WRREQ_STALL) expected 32 instances and got 32 Counter ID: 760 (TCC_EA_WRREQ_IO_CREDIT_STALL) expected 32 instances and got 32 Counter ID: 761 (TCC_EA_WRREQ_GMI_CREDIT_STALL) expected 32 instances and got 32 Counter ID: 762 (TCC_EA_WRREQ_DRAM_CREDIT_STALL) expected 32 instances and got 32 Counter ID: 763 (TCC_TOO_MANY_EA_WRREQS_STALL) expected 32 instances and got 32 Counter ID: 764 (TCC_EA_WRREQ_LEVEL) expected 32 instances and got 32 Counter ID: 765 (TCC_EA_ATOMIC) expected 32 instances and got 32 Counter ID: 766 (TCC_EA_ATOMIC_LEVEL) expected 32 instances and got 32 Counter ID: 767 (TCC_EA_RDREQ) expected 32 instances and got 32 Counter ID: 768 (TCC_EA_RDREQ_32B) expected 32 instances and got 32 Counter ID: 769 (TCC_EA_RD_UNCACHED_32B) expected 32 instances and got 32 Counter ID: 770 (TCC_EA_RDREQ_IO_CREDIT_STALL) expected 32 instances and got 32 Counter ID: 771 (TCC_EA_RDREQ_GMI_CREDIT_STALL) expected 32 instances and got 32 Counter ID: 772 (TCC_EA_RDREQ_DRAM_CREDIT_STALL) expected 32 instances and got 32 Counter ID: 773 (TCC_EA_RDREQ_LEVEL) expected 32 instances and got 32 Counter ID: 774 (TCC_TAG_STALL) expected 32 instances and got 32 Counter ID: 775 (TCC_NORMAL_WRITEBACK) expected 32 instances and got 32 Counter ID: 776 (TCC_ALL_TC_OP_WB_WRITEBACK) expected 32 instances and got 32 Counter ID: 777 (TCC_NORMAL_EVICT) expected 32 instances and got 32 Counter ID: 778 (TCC_ALL_TC_OP_INV_EVICT) expected 32 instances and got 32 Counter ID: 779 (TCC_EA_RDREQ_DRAM) expected 32 instances and got 32 Counter ID: 780 (TCC_EA_WRREQ_DRAM) expected 32 instances and got 32 [ERROR]Counter ID: 1893 (MeanOccupancyPerCU) expected 1 instances and got 8 [ERROR]Counter ID: 1894 (MeanOccupancyPerActiveCU) expected 1 instances and got 8 [ERROR]Counter ID: 1895 (TA_BUSY_avr) expected 16 instances and got 1 [ERROR]Counter ID: 1896 (TA_BUSY_max) expected 16 instances and got 1 [ERROR]Counter ID: 1897 (TA_BUSY_min) expected 16 instances and got 1 [ERROR]Counter ID: 1898 (TA_TA_BUSY_sum) expected 16 instances and got 1 [ERROR]Counter ID: 1899 (TA_TOTAL_WAVEFRONTS_sum) expected 16 instances and got 1 [ERROR]Counter ID: 1900 (TA_ADDR_STALLED_BY_TC_CYCLES_sum) expected 16 instances and got 1 [ERROR]Counter ID: 1901 (TA_ADDR_STALLED_BY_TD_CYCLES_sum) expected 16 instances and got 1 [ERROR]Counter ID: 1902 (TA_DATA_STALLED_BY_TC_CYCLES_sum) expected 16 instances and got 1 [ERROR]Counter ID: 1903 (TA_FLAT_WAVEFRONTS_sum) expected 16 instances and got 1 [ERROR]Counter ID: 1904 (TA_FLAT_READ_WAVEFRONTS_sum) expected 16 instances and got 1 [ERROR]Counter ID: 1905 (TA_FLAT_WRITE_WAVEFRONTS_sum) expected 16 instances and got 1 [ERROR]Counter ID: 1906 (TA_FLAT_ATOMIC_WAVEFRONTS_sum) expected 16 instances and got 1 [ERROR]Counter ID: 1907 (TA_BUFFER_WAVEFRONTS_sum) expected 16 instances and got 1 [ERROR]Counter ID: 1908 (TA_BUFFER_READ_WAVEFRONTS_sum) expected 16 instances and got 1 [ERROR]Counter ID: 1909 (TA_BUFFER_WRITE_WAVEFRONTS_sum) expected 16 instances and got 1 [ERROR]Counter ID: 1910 (TA_BUFFER_ATOMIC_WAVEFRONTS_sum) expected 16 instances and got 1 [ERROR]Counter ID: 1911 (TA_BUFFER_TOTAL_CYCLES_sum) expected 16 instances and got 1 [ERROR]Counter ID: 1912 (TA_BUFFER_COALESCED_READ_CYCLES_sum) expected 16 instances and got 1 [ERROR]Counter ID: 1913 (TA_BUFFER_COALESCED_WRITE_CYCLES_sum) expected 16 instances and got 1 [ERROR]Counter ID: 1914 (TD_TD_BUSY_sum) expected 16 instances and got 1 [ERROR]Counter ID: 1915 (TD_TC_STALL_sum) expected 16 instances and got 1 [ERROR]Counter ID: 1916 (TD_LOAD_WAVEFRONT_sum) expected 16 instances and got 1 [ERROR]Counter ID: 1917 (TD_ATOMIC_WAVEFRONT_sum) expected 16 instances and got 1 [ERROR]Counter ID: 1918 (TD_STORE_WAVEFRONT_sum) expected 16 instances and got 1 [ERROR]Counter ID: 1919 (TD_COALESCABLE_WAVEFRONT_sum) expected 16 instances and got 1 [ERROR]Counter ID: 1920 (TD_SPI_STALL_sum) expected 16 instances and got 1 [ERROR]Counter ID: 1921 (TCP_GATE_EN1_sum) expected 16 instances and got 1 [ERROR]Counter ID: 1922 (TCP_GATE_EN2_sum) expected 16 instances and got 1 [ERROR]Counter ID: 1923 (TCP_TD_TCP_STALL_CYCLES_sum) expected 16 instances and got 1 [ERROR]Counter ID: 1924 (TCP_TCR_TCP_STALL_CYCLES_sum) expected 16 instances and got 1 [ERROR]Counter ID: 1925 (TCP_READ_TAGCONFLICT_STALL_CYCLES_sum) expected 16 instances and got 1 [ERROR]Counter ID: 1926 (TCP_WRITE_TAGCONFLICT_STALL_CYCLES_sum) expected 16 instances and got 1 [ERROR]Counter ID: 1927 (TCP_ATOMIC_TAGCONFLICT_STALL_CYCLES_sum) expected 16 instances and got 1 [ERROR]Counter ID: 1928 (TCP_VOLATILE_sum) expected 16 instances and got 1 [ERROR]Counter ID: 1929 (TCP_TOTAL_ACCESSES_sum) expected 16 instances and got 1 [ERROR]Counter ID: 1930 (TCP_TOTAL_READ_sum) expected 16 instances and got 1 [ERROR]Counter ID: 1931 (TCP_TOTAL_WRITE_sum) expected 16 instances and got 1 [ERROR]Counter ID: 1932 (TCP_TOTAL_ATOMIC_WITH_RET_sum) expected 16 instances and got 1 [ERROR]Counter ID: 1933 (TCP_TOTAL_ATOMIC_WITHOUT_RET_sum) expected 16 instances and got 1 [ERROR]Counter ID: 1934 (TCP_TOTAL_WRITEBACK_INVALIDATES_sum) expected 16 instances and got 1 [ERROR]Counter ID: 1935 (TCP_UTCL1_REQUEST_sum) expected 16 instances and got 1 [ERROR]Counter ID: 1936 (TCP_UTCL1_TRANSLATION_MISS_sum) expected 16 instances and got 1 [ERROR]Counter ID: 1937 (TCP_UTCL1_TRANSLATION_HIT_sum) expected 16 instances and got 1 [ERROR]Counter ID: 1938 (TCP_UTCL1_PERMISSION_MISS_sum) expected 16 instances and got 1 [ERROR]Counter ID: 1939 (TCP_TOTAL_CACHE_ACCESSES_sum) expected 16 instances and got 1 [ERROR]Counter ID: 1940 (TCP_TCP_LATENCY_sum) expected 16 instances and got 1 [ERROR]Counter ID: 1941 (TCP_TA_TCP_STATE_READ_sum) expected 16 instances and got 1 [ERROR]Counter ID: 1942 (TCP_TCC_READ_REQ_LATENCY_sum) expected 16 instances and got 1 [ERROR]Counter ID: 1943 (TCP_TCC_WRITE_REQ_LATENCY_sum) expected 16 instances and got 1 [ERROR]Counter ID: 1944 (TCP_TCC_READ_REQ_sum) expected 16 instances and got 1 [ERROR]Counter ID: 1945 (TCP_TCC_WRITE_REQ_sum) expected 16 instances and got 1 [ERROR]Counter ID: 1946 (TCP_TCC_ATOMIC_WITH_RET_REQ_sum) expected 16 instances and got 1 [ERROR]Counter ID: 1947 (TCP_TCC_ATOMIC_WITHOUT_RET_REQ_sum) expected 16 instances and got 1 [ERROR]Counter ID: 1948 (TCP_TCC_NC_READ_REQ_sum) expected 16 instances and got 1 [ERROR]Counter ID: 1949 (TCP_TCC_NC_WRITE_REQ_sum) expected 16 instances and got 1 [ERROR]Counter ID: 1950 (TCP_TCC_NC_ATOMIC_REQ_sum) expected 16 instances and got 1 [ERROR]Counter ID: 1951 (TCP_TCC_UC_READ_REQ_sum) expected 16 instances and got 1 [ERROR]Counter ID: 1952 (TCP_TCC_UC_WRITE_REQ_sum) expected 16 instances and got 1 [ERROR]Counter ID: 1953 (TCP_TCC_UC_ATOMIC_REQ_sum) expected 16 instances and got 1 [ERROR]Counter ID: 1954 (TCP_TCC_CC_READ_REQ_sum) expected 16 instances and got 1 [ERROR]Counter ID: 1955 (TCP_TCC_CC_WRITE_REQ_sum) expected 16 instances and got 1 [ERROR]Counter ID: 1956 (TCP_TCC_CC_ATOMIC_REQ_sum) expected 16 instances and got 1 [ERROR]Counter ID: 1957 (TCP_TCC_RW_READ_REQ_sum) expected 16 instances and got 1 [ERROR]Counter ID: 1958 (TCP_TCC_RW_WRITE_REQ_sum) expected 16 instances and got 1 [ERROR]Counter ID: 1959 (TCP_TCC_RW_ATOMIC_REQ_sum) expected 16 instances and got 1 [ERROR]Counter ID: 1960 (TCP_PENDING_STALL_CYCLES_sum) expected 16 instances and got 1 [ERROR]Counter ID: 1961 (TCA_CYCLE_sum) expected 32 instances and got 1 [ERROR]Counter ID: 1962 (TCA_BUSY_sum) expected 32 instances and got 1 [ERROR]Counter ID: 1963 (TCC_BUSY_avr) expected 32 instances and got 1 [ERROR]Counter ID: 1964 (TCC_WRREQ_STALL_max) expected 32 instances and got 1 [ERROR]Counter ID: 1965 (TCC_CYCLE_sum) expected 32 instances and got 1 [ERROR]Counter ID: 1966 (TCC_BUSY_sum) expected 32 instances and got 1 [ERROR]Counter ID: 1967 (TCC_REQ_sum) expected 32 instances and got 1 [ERROR]Counter ID: 1968 (TCC_STREAMING_REQ_sum) expected 32 instances and got 1 [ERROR]Counter ID: 1969 (TCC_NC_REQ_sum) expected 32 instances and got 1 [ERROR]Counter ID: 1970 (TCC_UC_REQ_sum) expected 32 instances and got 1 [ERROR]Counter ID: 1971 (TCC_CC_REQ_sum) expected 32 instances and got 1 [ERROR]Counter ID: 1972 (TCC_RW_REQ_sum) expected 32 instances and got 1 [ERROR]Counter ID: 1973 (TCC_PROBE_sum) expected 32 instances and got 1 [ERROR]Counter ID: 1974 (TCC_PROBE_ALL_sum) expected 32 instances and got 1 [ERROR]Counter ID: 1975 (TCC_READ_sum) expected 32 instances and got 1 [ERROR]Counter ID: 1976 (TCC_WRITE_sum) expected 32 instances and got 1 [ERROR]Counter ID: 1977 (TCC_ATOMIC_sum) expected 32 instances and got 1 [ERROR]Counter ID: 1978 (TCC_HIT_sum) expected 32 instances and got 1 [ERROR]Counter ID: 1979 (TCC_MISS_sum) expected 32 instances and got 1 [ERROR]Counter ID: 1980 (TCC_WRITEBACK_sum) expected 32 instances and got 1 [ERROR]Counter ID: 1981 (TCC_EA_WRREQ_sum) expected 32 instances and got 1 [ERROR]Counter ID: 1982 (TCC_EA_WRREQ_64B_sum) expected 32 instances and got 1 [ERROR]Counter ID: 1983 (TCC_EA_WR_UNCACHED_32B_sum) expected 32 instances and got 1 [ERROR]Counter ID: 1984 (TCC_EA_WRREQ_STALL_sum) expected 32 instances and got 1 [ERROR]Counter ID: 1985 (TCC_EA_WRREQ_IO_CREDIT_STALL_sum) expected 32 instances and got 1 [ERROR]Counter ID: 1986 (TCC_EA_WRREQ_GMI_CREDIT_STALL_sum) expected 32 instances and got 1 [ERROR]Counter ID: 1987 (TCC_EA_WRREQ_DRAM_CREDIT_STALL_sum) expected 32 instances and got 1 [ERROR]Counter ID: 1988 (TCC_TOO_MANY_EA_WRREQS_STALL_sum) expected 32 instances and got 1 [ERROR]Counter ID: 1989 (TCC_EA_WRREQ_LEVEL_sum) expected 32 instances and got 1 [ERROR]Counter ID: 1990 (TCC_EA_RDREQ_LEVEL_sum) expected 32 instances and got 1 [ERROR]Counter ID: 1991 (TCC_EA_ATOMIC_sum) expected 32 instances and got 1 [ERROR]Counter ID: 1992 (TCC_EA_ATOMIC_LEVEL_sum) expected 32 instances and got 1 [ERROR]Counter ID: 1993 (TCC_EA_RDREQ_sum) expected 32 instances and got 1 [ERROR]Counter ID: 1994 (TCC_EA_RDREQ_32B_sum) expected 32 instances and got 1 [ERROR]Counter ID: 1995 (TCC_EA_RD_UNCACHED_32B_sum) expected 32 instances and got 1 [ERROR]Counter ID: 1996 (TCC_EA_RDREQ_IO_CREDIT_STALL_sum) expected 32 instances and got 1 [ERROR]Counter ID: 1997 (TCC_EA_RDREQ_GMI_CREDIT_STALL_sum) expected 32 instances and got 1 [ERROR]Counter ID: 1998 (TCC_EA_RDREQ_DRAM_CREDIT_STALL_sum) expected 32 instances and got 1 [ERROR]Counter ID: 1999 (TCC_TAG_STALL_sum) expected 32 instances and got 1 [ERROR]Counter ID: 2000 (TCC_NORMAL_WRITEBACK_sum) expected 32 instances and got 1 [ERROR]Counter ID: 2001 (TCC_ALL_TC_OP_WB_WRITEBACK_sum) expected 32 instances and got 1 [ERROR]Counter ID: 2002 (TCC_NORMAL_EVICT_sum) expected 32 instances and got 1 [ERROR]Counter ID: 2003 (TCC_ALL_TC_OP_INV_EVICT_sum) expected 32 instances and got 1 [ERROR]Counter ID: 2004 (TCC_EA_RDREQ_DRAM_sum) expected 32 instances and got 1 [ERROR]Counter ID: 2005 (TCC_EA_WRREQ_DRAM_sum) expected 32 instances and got 1 [ERROR]Counter ID: 2006 (FETCH_SIZE) expected 32 instances and got 1 [ERROR]Counter ID: 2007 (WRITE_SIZE) expected 32 instances and got 1 [ERROR]Counter ID: 2008 (WRITE_REQ_32B) expected 32 instances and got 1 [ERROR]Counter ID: 2009 (CU_OCCUPANCY) expected 1 instances and got 8 Counter ID: 2010 (CU_UTILIZATION) expected 1 instances and got 1 [ERROR]Counter ID: 2011 (TOTAL_16_OPS) expected 1 instances and got 8 [ERROR]Counter ID: 2012 (TOTAL_32_OPS) expected 1 instances and got 8 [ERROR]Counter ID: 2013 (TOTAL_64_OPS) expected 1 instances and got 8 Counter ID: 2014 (AggSysCycles) expected 1 instances and got 1 Counter ID: 2015 (GpuUtil) expected 1 instances and got 1 Counter ID: 2016 (CpUtil) expected 1 instances and got 1 Counter ID: 2017 (SpiUtil) expected 1 instances and got 1 Counter ID: 2018 (TaUtil) expected 1 instances and got 1 Counter ID: 2019 (TcUtil) expected 1 instances and got 1 Counter ID: 2020 (EaUtil) expected 1 instances and got 1 [ERROR]Counter ID: 2021 (InstrFetchLatency) expected 1 instances and got 8 [ERROR]Counter ID: 2022 (WaveOccupancy) expected 1 instances and got 8 [ERROR]Counter ID: 2023 (WaveDuration) expected 1 instances and got 8 [ERROR]Counter ID: 2024 (WaveDepWait) expected 1 instances and got 8 [ERROR]Counter ID: 2025 (WaveIssueWait) expected 1 instances and got 8 [ERROR]Counter ID: 2026 (WaveExec) expected 1 instances and got 8 [ERROR]Counter ID: 2027 (ValuIops) expected 1 instances and got 8 [ERROR]Counter ID: 2028 (MfmaFlops) expected 1 instances and got 8 [ERROR]Counter ID: 2029 (MfmaFlopsF16) expected 1 instances and got 8 [ERROR]Counter ID: 2030 (MfmaFlopsBF16) expected 1 instances and got 8 [ERROR]Counter ID: 2031 (MfmaFlopsF32) expected 1 instances and got 8 [ERROR]Counter ID: 2032 (MfmaFlopsF64) expected 1 instances and got 8 [ERROR]Counter ID: 2033 (ScaPipeIssueUtil) expected 1 instances and got 8 [ERROR]Counter ID: 2034 (ValuPipeIssueUtil) expected 1 instances and got 8 [ERROR]Counter ID: 2035 (VmemPipeIssueUtil) expected 1 instances and got 8 [ERROR]Counter ID: 2036 (MfmaUtil) expected 1 instances and got 8 [ERROR]Counter ID: 2037 (AvgNumActiveThreads) expected 1 instances and got 8 [ERROR]Counter ID: 2038 (VmemLatency) expected 1 instances and got 8 [ERROR]Counter ID: 2039 (SmemLatency) expected 1 instances and got 8 [ERROR]Counter ID: 2040 (LdsUtil) expected 1 instances and got 8 [ERROR]Counter ID: 2041 (LdsPipeIssueUtil) expected 1 instances and got 8 [ERROR]Counter ID: 2042 (LdsLatency) expected 1 instances and got 8 [ERROR]Counter ID: 2043 (LdsBankConflict) expected 1 instances and got 8 [ERROR]Counter ID: 2044 (L1iCacheHitRate) expected 1 instances and got 8 [ERROR]Counter ID: 2045 (sL1dCacheHitRate) expected 1 instances and got 8 [ERROR]Counter ID: 2046 (vL1dBufCoalesceRate) expected 16 instances and got 1 [ERROR]Counter ID: 2047 (vL1dCacheUtil) expected 16 instances and got 1 [ERROR]Counter ID: 2048 (vL1dCacheTcbHitRate) expected 16 instances and got 1 [ERROR]Counter ID: 2049 (vL1dCacheWaveLatency) expected 16 instances and got 1 [ERROR]Counter ID: 2050 (vL1dReadFromL2Latency) expected 16 instances and got 1 [ERROR]Counter ID: 2051 (vL1dWriteToL2Latency) expected 16 instances and got 1 [ERROR]Counter ID: 2052 (vL1dRdTagConfStallRate) expected 16 instances and got 1 [ERROR]Counter ID: 2053 (vL1dWrTagConfStallRate) expected 16 instances and got 1 [ERROR]Counter ID: 2054 (vL1dAtomicTagConfStallRate) expected 16 instances and got 1 [ERROR]Counter ID: 2055 (vL1dMissReqStallRate) expected 16 instances and got 1 [ERROR]Counter ID: 2056 (vL1dDataPendRate) expected 16 instances and got 1 [ERROR]Counter ID: 2057 (vL1dDataRetStallRate) expected 16 instances and got 1 [ERROR]Counter ID: 2058 (L2CacheHitRate) expected 32 instances and got 1 [ERROR]Counter ID: 2059 (L2CacheTagRamStallRate) expected 32 instances and got 1 [ERROR]Counter ID: 2060 (EaRdLatency) expected 32 instances and got 1 [ERROR]Counter ID: 2061 (EaRdIoStallRate) expected 32 instances and got 1 [ERROR]Counter ID: 2062 (EaRdGmiStallRate) expected 32 instances and got 1 [ERROR]Counter ID: 2063 (EaRdDramStallRate) expected 32 instances and got 1 [ERROR]Counter ID: 2064 (EaWrLatency) expected 32 instances and got 1 [ERROR]Counter ID: 2065 (EaWrIoStallRate) expected 32 instances and got 1 [ERROR]Counter ID: 2066 (EaWrGmiStallRate) expected 32 instances and got 1 [ERROR]Counter ID: 2067 (EaWrDramStallRate) expected 32 instances and got 1 [ERROR]Counter ID: 2068 (EaWrStarveRate) expected 32 instances and got 1 [ERROR]Counter ID: 2069 (EaAtomicLatency) expected 32 instances and got 1 [ERROR]Counter ID: 2070 (TCP_TCP_TA_DATA_STALL_CYCLES_sum) expected 16 instances and got 1 [ERROR]Counter ID: 2071 (TCP_TCP_TA_DATA_STALL_CYCLES_max) expected 16 instances and got 1 [ERROR]Counter ID: 2072 (VFetchInsts) expected 16 instances and got 8 [ERROR]Counter ID: 2073 (VWriteInsts) expected 16 instances and got 8 [ERROR]Counter ID: 2074 (FlatVMemInsts) expected 1 instances and got 8 [ERROR]Counter ID: 2075 (LDSInsts) expected 1 instances and got 8 [ERROR]Counter ID: 2076 (FlatLDSInsts) expected 1 instances and got 8 [ERROR]Counter ID: 2077 (VALUUtilization) expected 1 instances and got 8 [ERROR]Counter ID: 2078 (VALUBusy) expected 1 instances and got 8 [ERROR]Counter ID: 2079 (SALUBusy) expected 1 instances and got 8 [ERROR]Counter ID: 2080 (FetchSize) expected 32 instances and got 1 [ERROR]Counter ID: 2081 (WriteSize) expected 32 instances and got 1 [ERROR]Counter ID: 2082 (MemWrites32B) expected 32 instances and got 1 [ERROR]Counter ID: 2083 (L2CacheHit) expected 32 instances and got 1 [ERROR]Counter ID: 2084 (MemUnitStalled) expected 16 instances and got 1 [ERROR]Counter ID: 2085 (WriteUnitStalled) expected 32 instances and got 1 [ERROR]Counter ID: 2086 (LDSBankConflict) expected 1 instances and got 8 * source formatting (clang-format v11) (#225) Co-authored-by: bwelton <bwelton@users.noreply.github.com> * cmake formatting (cmake-format) (#224) Co-authored-by: bwelton <bwelton@users.noreply.github.com> * Minor fixes * source formatting (clang-format v11) (#226) Co-authored-by: bwelton <bwelton@users.noreply.github.com> * Minor test change --------- Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: bwelton <bwelton@users.noreply.github.com>
634 خطوط
24 KiB
C++
634 خطوط
24 KiB
C++
// MIT License
|
|
//
|
|
// Copyright (c) 2023 Advanced Micro Devices, Inc. All rights reserved.
|
|
//
|
|
// Permission is hereby granted, free of charge, to any person obtaining a copy
|
|
// of this software and associated documentation files (the "Software"), to deal
|
|
// in the Software without restriction, including without limitation the rights
|
|
// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
|
|
// copies of the Software, and to permit persons to whom the Software is
|
|
// furnished to do so, subject to the following conditions:
|
|
//
|
|
// The above copyright notice and this permission notice shall be included in all
|
|
// copies or substantial portions of the Software.
|
|
//
|
|
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
|
// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
|
// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
|
|
// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
|
// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
|
|
// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
|
|
// SOFTWARE.
|
|
|
|
#include "lib/rocprofiler/counters/evaluate_ast.hpp"
|
|
#include <fmt/core.h>
|
|
|
|
#include <exception>
|
|
#include <optional>
|
|
|
|
#include <numeric>
|
|
#include <stdexcept>
|
|
#include "lib/common/synchronized.hpp"
|
|
#include "lib/common/utility.hpp"
|
|
#include "lib/rocprofiler/counters/dimensions.hpp"
|
|
#include "lib/rocprofiler/counters/parser/reader.hpp"
|
|
|
|
namespace rocprofiler
|
|
{
|
|
namespace counters
|
|
{
|
|
namespace
|
|
{
|
|
ReduceOperation
|
|
get_reduce_op_type_from_string(const std::string& op)
|
|
{
|
|
static const std::unordered_map<std::string, ReduceOperation> reduce_op_string_to_type = {
|
|
{"min", REDUCE_MIN}, {"max", REDUCE_MAX}, {"sum", REDUCE_SUM}, {"avr", REDUCE_AVG}};
|
|
|
|
ReduceOperation type = REDUCE_NONE;
|
|
if(op.empty()) return REDUCE_NONE;
|
|
const auto* reduce_op_type = rocprofiler::common::get_val(reduce_op_string_to_type, op);
|
|
if(reduce_op_type) type = *reduce_op_type;
|
|
return type;
|
|
}
|
|
|
|
std::vector<rocprofiler_record_counter_t>*
|
|
perform_reduction(ReduceOperation reduce_op, std::vector<rocprofiler_record_counter_t>* input_array)
|
|
{
|
|
rocprofiler_record_counter_t result{.id = 0, .counter_value = 0};
|
|
if(input_array->empty()) return input_array;
|
|
switch(reduce_op)
|
|
{
|
|
case REDUCE_NONE: break;
|
|
case REDUCE_MIN:
|
|
{
|
|
result =
|
|
*std::min_element(input_array->begin(), input_array->end(), [](auto& a, auto& b) {
|
|
return a.counter_value < b.counter_value;
|
|
});
|
|
break;
|
|
}
|
|
case REDUCE_MAX:
|
|
{
|
|
result =
|
|
*std::max_element(input_array->begin(), input_array->end(), [](auto& a, auto& b) {
|
|
return a.counter_value > b.counter_value;
|
|
});
|
|
break;
|
|
}
|
|
case REDUCE_SUM:
|
|
{
|
|
result = std::accumulate(input_array->begin(),
|
|
input_array->end(),
|
|
rocprofiler_record_counter_t{.id = 0, .counter_value = 0},
|
|
[](auto& a, auto& b) {
|
|
return rocprofiler_record_counter_t{
|
|
.id = a.id,
|
|
.counter_value = a.counter_value + b.counter_value};
|
|
});
|
|
break;
|
|
}
|
|
case REDUCE_AVG:
|
|
{
|
|
result = std::accumulate(input_array->begin(),
|
|
input_array->end(),
|
|
rocprofiler_record_counter_t{.id = 0, .counter_value = 0},
|
|
[](auto& a, auto& b) {
|
|
return rocprofiler_record_counter_t{
|
|
.id = a.id,
|
|
.counter_value = a.counter_value + b.counter_value};
|
|
});
|
|
result.counter_value /= input_array->size();
|
|
break;
|
|
}
|
|
}
|
|
input_array->clear();
|
|
input_array->push_back(result);
|
|
set_dim_in_rec(input_array->begin()->id, ROCPROFILER_DIMENSION_NONE, 0);
|
|
return input_array;
|
|
}
|
|
|
|
} // namespace
|
|
|
|
const std::unordered_map<std::string, EvaluateASTMap>&
|
|
get_ast_map()
|
|
{
|
|
static std::unordered_map<std::string, EvaluateASTMap> ast_map = []() {
|
|
std::unordered_map<std::string, EvaluateASTMap> data;
|
|
const auto& metric_map = counters::getMetricMap();
|
|
for(const auto& [gfx, metrics] : metric_map)
|
|
{
|
|
// TODO: Remove global XML from derrived counters...
|
|
if(gfx == "global") continue;
|
|
|
|
std::unordered_map<std::string, Metric> by_name;
|
|
for(const auto& metric : metrics)
|
|
{
|
|
by_name.emplace(metric.name(), metric);
|
|
}
|
|
|
|
auto& eval_map = data.emplace(gfx, EvaluateASTMap{}).first->second;
|
|
for(auto& [_, metric] : by_name)
|
|
{
|
|
RawAST* ast = nullptr;
|
|
auto* buf =
|
|
yy_scan_string(metric.expression().empty() ? metric.name().c_str()
|
|
: metric.expression().c_str());
|
|
yyparse(&ast);
|
|
if(!ast)
|
|
{
|
|
LOG(ERROR) << fmt::format("Unable to parse metric {}", metric);
|
|
throw std::runtime_error(fmt::format("Unable to parse metric {}", metric));
|
|
}
|
|
try
|
|
{
|
|
auto& evaluate_ast_node =
|
|
eval_map
|
|
.emplace(metric.name(),
|
|
EvaluateAST({.handle = metric.id()}, by_name, *ast, gfx))
|
|
.first->second;
|
|
evaluate_ast_node.validate_raw_ast(
|
|
by_name); // TODO: refactor and consolidate internal post-construction
|
|
// logic as a Finish() method
|
|
} catch(std::exception& e)
|
|
{
|
|
LOG(ERROR) << e.what();
|
|
throw std::runtime_error(
|
|
fmt::format("AST was not generated for {}:{}", gfx, metric.name()));
|
|
}
|
|
yy_delete_buffer(buf);
|
|
delete ast;
|
|
}
|
|
|
|
for(auto& [name, ast] : eval_map)
|
|
{
|
|
ast.expand_derived(eval_map);
|
|
}
|
|
}
|
|
|
|
return data;
|
|
}();
|
|
return ast_map;
|
|
}
|
|
|
|
std::optional<std::set<Metric>>
|
|
get_required_hardware_counters(const std::unordered_map<std::string, EvaluateASTMap>& asts,
|
|
const std::string& agent,
|
|
const Metric& metric)
|
|
{
|
|
const auto* agent_map = rocprofiler::common::get_val(asts, agent);
|
|
if(!agent_map) return std::nullopt;
|
|
const auto* counter_ast = rocprofiler::common::get_val(*agent_map, metric.name());
|
|
if(!counter_ast) return std::nullopt;
|
|
|
|
std::set<Metric> required_counters;
|
|
counter_ast->get_required_counters(*agent_map, required_counters);
|
|
return required_counters;
|
|
}
|
|
|
|
EvaluateAST::EvaluateAST(rocprofiler_counter_id_t out_id,
|
|
const std::unordered_map<std::string, Metric>& metrics,
|
|
const RawAST& ast,
|
|
std::string agent)
|
|
: _type(ast.type)
|
|
, _reduce_op(get_reduce_op_type_from_string(ast.reduce_op))
|
|
, _agent(std::move(agent))
|
|
, _reduce_dimension_set(ast.reduce_dimension_set)
|
|
, _out_id(out_id)
|
|
{
|
|
if(_type == NodeType::REFERENCE_NODE)
|
|
{
|
|
try
|
|
{
|
|
_metric = metrics.at(std::get<std::string>(ast.value));
|
|
} catch(std::exception& e)
|
|
{
|
|
throw std::runtime_error(
|
|
fmt::format("Unable to lookup metric {}", std::get<std::string>(ast.value)));
|
|
}
|
|
}
|
|
|
|
if(_type == NodeType::NUMBER_NODE)
|
|
{
|
|
_raw_value = std::get<int64_t>(ast.value);
|
|
_static_value.push_back(
|
|
{.id = 0, .counter_value = static_cast<double>(std::get<int64_t>(ast.value))});
|
|
}
|
|
|
|
for(const auto& nextAst : ast.counter_set)
|
|
{
|
|
_children.emplace_back(_out_id, metrics, *nextAst, _agent);
|
|
}
|
|
}
|
|
|
|
DimensionTypes
|
|
EvaluateAST::set_dimensions()
|
|
{
|
|
if(_dimension_types != DIMENSION_LAST)
|
|
{
|
|
return _dimension_types;
|
|
}
|
|
|
|
auto get_dim_types = [&](auto& metric) {
|
|
int dim_types = 0;
|
|
for(const auto& dim : getBlockDimensions(_agent, metric))
|
|
{
|
|
dim_types |= (dim.type() != ROCPROFILER_DIMENSION_NONE) ? (0x1 << dim.type()) : 0;
|
|
}
|
|
return static_cast<DimensionTypes>(dim_types);
|
|
};
|
|
|
|
switch(_type)
|
|
{
|
|
case NONE:
|
|
case RANGE_NODE:
|
|
case CONSTANT_NODE:
|
|
case NUMBER_NODE: break;
|
|
case ADDITION_NODE:
|
|
case SUBTRACTION_NODE:
|
|
case MULTIPLY_NODE:
|
|
case DIVIDE_NODE:
|
|
{
|
|
if(_children[0].set_dimensions() != _children[1].set_dimensions() &&
|
|
_children[0].type() != NUMBER_NODE && _children[1].type() != NUMBER_NODE)
|
|
throw std::runtime_error(fmt::format("Dimension mis-mismatch: {} and {}",
|
|
_children[0].metric(),
|
|
_children[1].metric()));
|
|
_dimension_types = (_children[0].type() != NUMBER_NODE) ? _children[0].set_dimensions()
|
|
: _children[1].set_dimensions();
|
|
}
|
|
break;
|
|
case REFERENCE_NODE:
|
|
{
|
|
_dimension_types = get_dim_types(_metric);
|
|
}
|
|
break;
|
|
case REDUCE_NODE:
|
|
{
|
|
// There is only one child node in case of REDUCE_NODE and that
|
|
// child node denotes the expression on which the reduce is applied.
|
|
// The resulting dimension of REDUCE_NODE will be the child's dimension
|
|
// minus the dimensions specified in the reduce_dimension_set.
|
|
int original_dim = static_cast<int>(_children[0].set_dimensions());
|
|
int turn_off_dims = 0;
|
|
for(auto dim : _reduce_dimension_set)
|
|
{
|
|
turn_off_dims |= (dim != ROCPROFILER_DIMENSION_NONE) ? (0x1 << dim) : 1;
|
|
}
|
|
int final_dims = _reduce_dimension_set.empty() ? ROCPROFILER_DIMENSION_NONE
|
|
: (original_dim & ~turn_off_dims);
|
|
_dimension_types = static_cast<DimensionTypes>(final_dims);
|
|
}
|
|
break;
|
|
case SELECT_NODE:
|
|
{
|
|
// TODO: future scope
|
|
}
|
|
break;
|
|
}
|
|
return _dimension_types;
|
|
}
|
|
|
|
void
|
|
EvaluateAST::get_required_counters(const std::unordered_map<std::string, EvaluateAST>& asts,
|
|
std::set<Metric>& counters) const
|
|
{
|
|
if(!_metric.empty() && children().empty() && _type != NodeType::NUMBER_NODE)
|
|
{
|
|
// Base counter
|
|
if(_metric.expression().empty())
|
|
{
|
|
counters.insert(_metric);
|
|
return;
|
|
}
|
|
|
|
// Derrived Counter
|
|
const auto* expr_ptr = rocprofiler::common::get_val(asts, _metric.name());
|
|
if(!expr_ptr) throw std::runtime_error("could not find derived counter");
|
|
expr_ptr->get_required_counters(asts, counters);
|
|
// TODO: Add guards against infinite recursion
|
|
return;
|
|
}
|
|
|
|
for(const auto& child : children())
|
|
{
|
|
child.get_required_counters(asts, counters);
|
|
}
|
|
}
|
|
|
|
bool
|
|
EvaluateAST::validate_raw_ast(const std::unordered_map<std::string, Metric>& metrics)
|
|
{
|
|
bool ret = true;
|
|
|
|
try
|
|
{
|
|
switch(_type)
|
|
{
|
|
case NONE:
|
|
case RANGE_NODE:
|
|
case CONSTANT_NODE:
|
|
case NUMBER_NODE: break;
|
|
case ADDITION_NODE:
|
|
case SUBTRACTION_NODE:
|
|
case MULTIPLY_NODE:
|
|
case DIVIDE_NODE:
|
|
{
|
|
// For arithmetic operations '+' '-' '*' '/' check if
|
|
// dimensions of both operands are matching. (handled in set_dimensions())
|
|
for(auto& child : _children)
|
|
{
|
|
child.validate_raw_ast(metrics);
|
|
}
|
|
}
|
|
break;
|
|
case REFERENCE_NODE:
|
|
{
|
|
// handled in constructor
|
|
}
|
|
break;
|
|
case REDUCE_NODE:
|
|
{
|
|
// Future TODO
|
|
// Check #1 : Should be applied on a base metric. Derived metric support will be
|
|
// added later. Check #2 : Operation should be a supported operation. Check #3 :
|
|
// Dimensions specified should be valid for this metric and GPU
|
|
|
|
// validate the members of RawAST, not the members of this class
|
|
}
|
|
break;
|
|
case SELECT_NODE:
|
|
{
|
|
// Future TODO
|
|
// Check #1 : Should be applied on a base metric. Derived metric support will be
|
|
// added later. Check #2 : Operation should be a supported operation. Check #3 :
|
|
// Dimensions specified should be valid for this metric and GPU. Check #4 :
|
|
// Dimensionindex values should be within limits for this metric and GPU.
|
|
}
|
|
break;
|
|
}
|
|
} catch(std::exception& e)
|
|
{
|
|
throw;
|
|
}
|
|
|
|
// Future TODO:
|
|
// check if there are cycles in the graph
|
|
|
|
return ret;
|
|
}
|
|
|
|
namespace
|
|
{
|
|
using property_function_t = int64_t (*)(const rocprofiler_agent_t&);
|
|
#define GEN_MAP_ENTRY(name, value) \
|
|
{ \
|
|
name, property_function_t([](const rocprofiler_agent_t& agent_info) { \
|
|
return static_cast<int64_t>(value); \
|
|
}) \
|
|
}
|
|
|
|
int64_t
|
|
get_agent_property(const std::string& property, const rocprofiler_agent_t& agent)
|
|
{
|
|
static std::unordered_map<std::string, property_function_t> props = {
|
|
GEN_MAP_ENTRY("cpu_cores_count", agent_info.cpu_cores_count),
|
|
GEN_MAP_ENTRY("simd_count", agent_info.simd_count),
|
|
GEN_MAP_ENTRY("mem_banks_count", agent_info.mem_banks_count),
|
|
GEN_MAP_ENTRY("caches_count", agent_info.caches_count),
|
|
GEN_MAP_ENTRY("io_links_count", agent_info.io_links_count),
|
|
GEN_MAP_ENTRY("cpu_core_id_base", agent_info.cpu_core_id_base),
|
|
GEN_MAP_ENTRY("simd_id_base", agent_info.simd_id_base),
|
|
GEN_MAP_ENTRY("max_waves_per_simd", agent_info.max_waves_per_simd),
|
|
GEN_MAP_ENTRY("lds_size_in_kb", agent_info.lds_size_in_kb),
|
|
GEN_MAP_ENTRY("gds_size_in_kb", agent_info.gds_size_in_kb),
|
|
GEN_MAP_ENTRY("num_gws", agent_info.num_gws),
|
|
GEN_MAP_ENTRY("wave_front_size", agent_info.wave_front_size),
|
|
GEN_MAP_ENTRY("array_count", agent_info.array_count),
|
|
GEN_MAP_ENTRY("simd_arrays_per_engine", agent_info.simd_arrays_per_engine),
|
|
GEN_MAP_ENTRY("cu_per_simd_array", agent_info.cu_per_simd_array),
|
|
GEN_MAP_ENTRY("simd_per_cu", agent_info.simd_per_cu),
|
|
GEN_MAP_ENTRY("max_slots_scratch_cu", agent_info.max_slots_scratch_cu),
|
|
GEN_MAP_ENTRY("gfx_target_version", agent_info.gfx_target_version),
|
|
GEN_MAP_ENTRY("vendor_id", agent_info.vendor_id),
|
|
GEN_MAP_ENTRY("device_id", agent_info.device_id),
|
|
GEN_MAP_ENTRY("location_id", agent_info.location_id),
|
|
GEN_MAP_ENTRY("domain", agent_info.domain),
|
|
GEN_MAP_ENTRY("drm_render_minor", agent_info.drm_render_minor),
|
|
GEN_MAP_ENTRY("hive_id", agent_info.hive_id),
|
|
GEN_MAP_ENTRY("num_sdma_engines", agent_info.num_sdma_engines),
|
|
GEN_MAP_ENTRY("num_sdma_xgmi_engines", agent_info.num_sdma_xgmi_engines),
|
|
GEN_MAP_ENTRY("num_sdma_queues_per_engine", agent_info.num_sdma_queues_per_engine),
|
|
GEN_MAP_ENTRY("num_cp_queues", agent_info.num_cp_queues),
|
|
GEN_MAP_ENTRY("max_engine_clk_ccompute", agent_info.max_engine_clk_ccompute),
|
|
};
|
|
if(const auto* func = rocprofiler::common::get_val(props, property))
|
|
{
|
|
return (*func)(agent);
|
|
}
|
|
|
|
LOG(ERROR) << fmt::format("Unsupported special property {}", property);
|
|
return 0.0;
|
|
}
|
|
} // namespace
|
|
|
|
void
|
|
EvaluateAST::read_special_counters(
|
|
const rocprofiler_agent_t& agent,
|
|
const std::set<counters::Metric>& required_special_counters,
|
|
std::unordered_map<uint64_t, std::vector<rocprofiler_record_counter_t>>& out_map)
|
|
{
|
|
for(const auto& metric : required_special_counters)
|
|
{
|
|
if(!out_map[metric.id()].empty()) out_map[metric.id()].clear();
|
|
auto& record = out_map[metric.id()].emplace_back();
|
|
set_counter_in_rec(record.id, {.handle = metric.id()});
|
|
set_dim_in_rec(record.id, ROCPROFILER_DIMENSION_NONE, 0);
|
|
|
|
record.counter_value = get_agent_property(metric.name(), agent);
|
|
}
|
|
}
|
|
|
|
std::unordered_map<uint64_t, std::vector<rocprofiler_record_counter_t>>
|
|
EvaluateAST::read_pkt(const aql::AQLPacketConstruct* pkt_gen, hsa::AQLPacket& pkt)
|
|
{
|
|
struct it_data
|
|
{
|
|
std::unordered_map<uint64_t, std::vector<rocprofiler_record_counter_t>>* data;
|
|
const aql::AQLPacketConstruct* pkt_gen;
|
|
};
|
|
|
|
std::unordered_map<uint64_t, std::vector<rocprofiler_record_counter_t>> ret;
|
|
if(pkt.empty) return ret;
|
|
it_data aql_data{.data = &ret, .pkt_gen = pkt_gen};
|
|
;
|
|
hsa_status_t status = hsa_ven_amd_aqlprofile_iterate_data(
|
|
&pkt.profile,
|
|
[](hsa_ven_amd_aqlprofile_info_type_t info_type,
|
|
hsa_ven_amd_aqlprofile_info_data_t* info_data,
|
|
void* data) {
|
|
CHECK(data);
|
|
auto& it = *static_cast<it_data*>(data);
|
|
if(info_type != HSA_VEN_AMD_AQLPROFILE_INFO_PMC_DATA) return HSA_STATUS_SUCCESS;
|
|
const auto* metric = it.pkt_gen->event_to_metric(info_data->pmc_data.event);
|
|
if(!metric) return HSA_STATUS_SUCCESS;
|
|
auto& vec = it.data->emplace(metric->id(), std::vector<rocprofiler_record_counter_t>{})
|
|
.first->second;
|
|
auto& next_rec = vec.emplace_back();
|
|
set_counter_in_rec(next_rec.id, {.handle = metric->id()});
|
|
// Actual dimension info needs to be used here in the future
|
|
set_dim_in_rec(next_rec.id, ROCPROFILER_DIMENSION_NONE, vec.size() - 1);
|
|
// Note: in the near future we need to use hw_counter here instead
|
|
next_rec.counter_value = info_data->pmc_data.result;
|
|
return HSA_STATUS_SUCCESS;
|
|
},
|
|
&aql_data);
|
|
CHECK(status == HSA_STATUS_SUCCESS);
|
|
return ret;
|
|
}
|
|
|
|
void
|
|
EvaluateAST::set_out_id(std::vector<rocprofiler_record_counter_t>& results) const
|
|
{
|
|
for(auto& record : results)
|
|
{
|
|
set_counter_in_rec(record.id, _out_id);
|
|
}
|
|
}
|
|
|
|
void
|
|
EvaluateAST::expand_derived(std::unordered_map<std::string, EvaluateAST>& asts)
|
|
{
|
|
if(_expanded) return;
|
|
_expanded = true;
|
|
for(auto& child : _children)
|
|
{
|
|
if(auto* ptr = rocprofiler::common::get_val(asts, child.metric().name()))
|
|
{
|
|
ptr->expand_derived(asts);
|
|
child = *ptr;
|
|
}
|
|
else
|
|
{
|
|
child.expand_derived(asts);
|
|
}
|
|
}
|
|
|
|
/**
|
|
* This covers cases where a derived metric is not a child at all. I.e.
|
|
* <metric name="MemWrites32B" expr=WRITE_REQ_32B>. This will expand
|
|
* WRITE_REQ_32B to its proper expression.
|
|
*/
|
|
if(!_metric.expression().empty())
|
|
{
|
|
if(auto* ptr = rocprofiler::common::get_val(asts, _metric.name()))
|
|
{
|
|
ptr->expand_derived(asts);
|
|
_children = ptr->children();
|
|
_type = ptr->type();
|
|
_reduce_op = ptr->reduce_op();
|
|
}
|
|
}
|
|
}
|
|
|
|
// convert to buffer at some point
|
|
std::vector<rocprofiler_record_counter_t>*
|
|
EvaluateAST::evaluate(
|
|
std::unordered_map<uint64_t, std::vector<rocprofiler_record_counter_t>>& results_map,
|
|
std::vector<std::unique_ptr<std::vector<rocprofiler_record_counter_t>>>& cache)
|
|
{
|
|
auto perform_op = [&](auto&& op) {
|
|
auto* r1 = _children.at(0).evaluate(results_map, cache);
|
|
auto* r2 = _children.at(1).evaluate(results_map, cache);
|
|
|
|
if(r1->size() < r2->size()) swap(r1, r2);
|
|
|
|
cache.emplace_back(std::make_unique<std::vector<rocprofiler_record_counter_t>>());
|
|
*cache.back() = *r1;
|
|
r1 = cache.back().get();
|
|
|
|
CHECK(!r1->empty() && !r2->empty());
|
|
|
|
if(r2->size() == 1)
|
|
{
|
|
// Special operation on either a number node
|
|
// or special node. This is typically a multiple/divide
|
|
// or some other type of constant op.
|
|
for(auto& val : *r1)
|
|
{
|
|
val = op(val, *r2->begin());
|
|
}
|
|
}
|
|
else if(r2->size() == r1->size())
|
|
{
|
|
// Normal combination
|
|
std::transform(r1->begin(), r1->end(), r2->begin(), r1->begin(), op);
|
|
}
|
|
else
|
|
{
|
|
throw std::runtime_error(
|
|
fmt::format("Mismatched Sizes {}, {}", r1->size(), r2->size()));
|
|
}
|
|
return r1;
|
|
};
|
|
|
|
switch(_type)
|
|
{
|
|
case NONE:
|
|
case CONSTANT_NODE:
|
|
case RANGE_NODE: break;
|
|
case NUMBER_NODE: return &_static_value;
|
|
case ADDITION_NODE:
|
|
return perform_op([](auto& a, auto& b) {
|
|
return rocprofiler_record_counter_t{
|
|
.id = a.id, .counter_value = a.counter_value + b.counter_value};
|
|
});
|
|
case SUBTRACTION_NODE:
|
|
return perform_op([](auto& a, auto& b) {
|
|
return rocprofiler_record_counter_t{
|
|
.id = a.id, .counter_value = a.counter_value - b.counter_value};
|
|
});
|
|
case MULTIPLY_NODE:
|
|
return perform_op([](auto& a, auto& b) {
|
|
return rocprofiler_record_counter_t{
|
|
.id = a.id, .counter_value = a.counter_value * b.counter_value};
|
|
});
|
|
case DIVIDE_NODE:
|
|
return perform_op([](auto& a, auto& b) {
|
|
return rocprofiler_record_counter_t{
|
|
.id = a.id,
|
|
.counter_value =
|
|
(b.counter_value == 0 ? 0 : a.counter_value / b.counter_value)};
|
|
});
|
|
case REFERENCE_NODE:
|
|
{
|
|
auto* result = rocprofiler::common::get_val(results_map, _metric.id());
|
|
if(!result)
|
|
throw std::runtime_error(
|
|
fmt::format("Unable to lookup results for metric {}", _metric.name()));
|
|
|
|
return result;
|
|
}
|
|
break;
|
|
case REDUCE_NODE:
|
|
{
|
|
auto* result = rocprofiler::common::get_val(results_map, _children[0]._metric.id());
|
|
if(!result)
|
|
throw std::runtime_error(fmt::format("Unable to lookup results for metric {}",
|
|
_children[0]._metric.name()));
|
|
|
|
if(_reduce_op == REDUCE_NONE)
|
|
throw std::runtime_error(fmt::format("Invalid Second argument to reduce(): {}",
|
|
static_cast<int>(_reduce_op)));
|
|
return perform_reduction(_reduce_op, result);
|
|
}
|
|
// Currently unsupported
|
|
case SELECT_NODE: break;
|
|
}
|
|
|
|
return nullptr;
|
|
}
|
|
|
|
} // namespace counters
|
|
} // namespace rocprofiler
|