diff options
author | Sandipan Das <sandipan.das@amd.com> | 2023-07-06 12:04:40 +0530 |
---|---|---|
committer | Arnaldo Carvalho de Melo <acme@redhat.com> | 2023-07-11 11:50:39 -0300 |
commit | 8d40f74ebf217d3b9e9b7481721e6236b857cc55 (patch) | |
tree | 2a521f30b73df49bc294f9d7298dec94d839c0f1 /tools | |
parent | 1feece2780ac2f8de45177fe53979726cee4b3d1 (diff) |
perf vendor events amd: Fix large metrics
There are cases where a metric requires more events than the number of
available counters. E.g. AMD Zen, Zen 2 and Zen 3 processors have four
data fabric counters but the "nps1_die_to_dram" metric has eight events.
By default, the constituent events are placed in a group and since the
events cannot be scheduled at the same time, the metric is not computed.
The "all metrics" test also fails because of this.
Use the NO_GROUP_EVENTS constraint for such metrics which anyway expect
the user to run perf with "--metric-no-group".
E.g.
$ sudo perf test -v 101
Before:
101: perf all metrics test :
--- start ---
test child forked, pid 37131
Testing branch_misprediction_ratio
Testing all_remote_links_outbound
Testing nps1_die_to_dram
Metric 'nps1_die_to_dram' not printed in:
Error:
Invalid event (dram_channel_data_controller_4) in per-thread mode, enable system wide with '-a'.
Testing macro_ops_dispatched
Testing all_l2_cache_accesses
Testing all_l2_cache_hits
Testing all_l2_cache_misses
Testing ic_fetch_miss_ratio
Testing l2_cache_accesses_from_l2_hwpf
Testing l2_cache_misses_from_l2_hwpf
Testing op_cache_fetch_miss_ratio
Testing l3_read_miss_latency
Testing l1_itlb_misses
test child finished with -1
---- end ----
perf all metrics test: FAILED!
After:
101: perf all metrics test :
--- start ---
test child forked, pid 43766
Testing branch_misprediction_ratio
Testing all_remote_links_outbound
Testing nps1_die_to_dram
Testing macro_ops_dispatched
Testing all_l2_cache_accesses
Testing all_l2_cache_hits
Testing all_l2_cache_misses
Testing ic_fetch_miss_ratio
Testing l2_cache_accesses_from_l2_hwpf
Testing l2_cache_misses_from_l2_hwpf
Testing op_cache_fetch_miss_ratio
Testing l3_read_miss_latency
Testing l1_itlb_misses
test child finished with 0
---- end ----
perf all metrics test: Ok
Reported-by: Ayush Jain <ayush.jain3@amd.com>
Suggested-by: Ian Rogers <irogers@google.com>
Signed-off-by: Sandipan Das <sandipan.das@amd.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ananth Narayan <ananth.narayan@amd.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Santosh Shukla <santosh.shukla@amd.com>
Link: https://lore.kernel.org/r/20230706063440.54189-1-sandipan.das@amd.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Diffstat (limited to 'tools')
3 files changed, 6 insertions, 3 deletions
diff --git a/tools/perf/pmu-events/arch/x86/amdzen1/recommended.json b/tools/perf/pmu-events/arch/x86/amdzen1/recommended.json index bf5083c1c260..4d28177325a0 100644 --- a/tools/perf/pmu-events/arch/x86/amdzen1/recommended.json +++ b/tools/perf/pmu-events/arch/x86/amdzen1/recommended.json @@ -169,8 +169,9 @@ }, { "MetricName": "nps1_die_to_dram", - "BriefDescription": "Approximate: Combined DRAM B/bytes of all channels on a NPS1 node (die) (may need --metric-no-group)", + "BriefDescription": "Approximate: Combined DRAM B/bytes of all channels on a NPS1 node (die)", "MetricExpr": "dram_channel_data_controller_0 + dram_channel_data_controller_1 + dram_channel_data_controller_2 + dram_channel_data_controller_3 + dram_channel_data_controller_4 + dram_channel_data_controller_5 + dram_channel_data_controller_6 + dram_channel_data_controller_7", + "MetricConstraint": "NO_GROUP_EVENTS", "MetricGroup": "data_fabric", "PerPkg": "1", "ScaleUnit": "6.1e-5MiB" diff --git a/tools/perf/pmu-events/arch/x86/amdzen2/recommended.json b/tools/perf/pmu-events/arch/x86/amdzen2/recommended.json index a71694a043ba..60e19456d4c8 100644 --- a/tools/perf/pmu-events/arch/x86/amdzen2/recommended.json +++ b/tools/perf/pmu-events/arch/x86/amdzen2/recommended.json @@ -169,8 +169,9 @@ }, { "MetricName": "nps1_die_to_dram", - "BriefDescription": "Approximate: Combined DRAM B/bytes of all channels on a NPS1 node (die) (may need --metric-no-group)", + "BriefDescription": "Approximate: Combined DRAM B/bytes of all channels on a NPS1 node (die)", "MetricExpr": "dram_channel_data_controller_0 + dram_channel_data_controller_1 + dram_channel_data_controller_2 + dram_channel_data_controller_3 + dram_channel_data_controller_4 + dram_channel_data_controller_5 + dram_channel_data_controller_6 + dram_channel_data_controller_7", + "MetricConstraint": "NO_GROUP_EVENTS", "MetricGroup": "data_fabric", "PerPkg": "1", "ScaleUnit": "6.1e-5MiB" diff --git a/tools/perf/pmu-events/arch/x86/amdzen3/recommended.json b/tools/perf/pmu-events/arch/x86/amdzen3/recommended.json index 988cf68ae825..3e9e1781812e 100644 --- a/tools/perf/pmu-events/arch/x86/amdzen3/recommended.json +++ b/tools/perf/pmu-events/arch/x86/amdzen3/recommended.json @@ -205,10 +205,11 @@ }, { "MetricName": "nps1_die_to_dram", - "BriefDescription": "Approximate: Combined DRAM B/bytes of all channels on a NPS1 node (die) (may need --metric-no-group)", + "BriefDescription": "Approximate: Combined DRAM B/bytes of all channels on a NPS1 node (die)", "MetricExpr": "dram_channel_data_controller_0 + dram_channel_data_controller_1 + dram_channel_data_controller_2 + dram_channel_data_controller_3 + dram_channel_data_controller_4 + dram_channel_data_controller_5 + dram_channel_data_controller_6 + dram_channel_data_controller_7", "MetricGroup": "data_fabric", "PerPkg": "1", + "MetricConstraint": "NO_GROUP_EVENTS", "ScaleUnit": "6.1e-5MiB" } ] |