summaryrefslogtreecommitdiff
path: root/tools/perf/util
AgeCommit message (Collapse)AuthorFilesLines
2015-09-01perf tools: Fix link time error with sample_reg_masks on non x86Stephane Eranian2-0/+6
This patch makes perf compile on non x86 platforms by defining a weak symbol for sample_reg_masks[] in util/perf_regs.c. The patch also moves the REG() and REG_END() macros into the util/per_regs.h header file. The macros are renamed to SMPL_REG/SMPL_REG_END to avoid clashes with other header files. Signed-off-by: Stephane Eranian <eranian@google.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1441099814-26783-1-git-send-email-eranian@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-09-01perf build: Fix Intel PT instruction decoder dependency problemWang Nan1-0/+1
I hit following building error randomly: ... /bin/sh: /path/to/kernel/buildperf/util/intel-pt-decoder/inat-tables.c: No such file or directory ... LINK /path/to/kernel/buildperf/plugin_mac80211.so LINK /path/to/kernel/buildperf/plugin_kmem.so LINK /path/to/kernel/buildperf/plugin_xen.so LINK /path/to/kernel/buildperf/plugin_hrtimer.so In file included from util/intel-pt-decoder/intel-pt-insn-decoder.c:25:0: util/intel-pt-decoder/inat.c:24:25: fatal error: inat-tables.c: No such file or directory #include "inat-tables.c" ^ compilation terminated. make[4]: *** [/path/to/kernel/buildperf/util/intel-pt-decoder/intel-pt-insn-decoder.o] Error 1 make[4]: *** Waiting for unfinished jobs.... LINK /path/to/kernel/buildperf/plugin_function.so This is caused by tools/perf/util/intel-pt-decoder/Build that, it tries to generate $(OUTPUT)util/intel-pt-decoder/inat-tables.c atomatically but forget to ensure the existance of $(OUTPUT)util/intel-pt-decoder directory. This patch fixes it by adding $(call rule_mkdir) like other similar rules. Signed-off-by: Wang Nan <wangnan0@huawei.com> Acked-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1441087005-107540-1-git-send-email-wangnan0@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-08-31perf record: Add ability to name registers to recordStephane Eranian4-1/+78
This patch modifies the -I/--int-regs option to enablepassing the name of the registers to sample on interrupt. Registers can be specified by their symbolic names. For instance on x86, --intr-regs=ax,si. The motivation is to reduce the size of the perf.data file and the overhead of sampling by only collecting the registers useful to a specific analysis. For instance, for value profiling, sampling only the registers used to passed arguements to functions. With no parameter, the --intr-regs still records all possible registers based on the architecture. To name registers, it is necessary to use the long form of the option, i.e., --intr-regs: $ perf record --intr-regs=si,di,r8,r9 ..... To record any possible registers: $ perf record -I ..... $ perf report --intr-regs ... To display the register, one can use perf report -D To list the available registers: $ perf record --intr-regs=\? available registers: AX BX CX DX SI DI BP SP IP FLAGS CS SS R8 R9 R10 R11 R12 R13 R14 R15 Signed-off-by: Stephane Eranian <eranian@google.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1441039273-16260-4-git-send-email-eranian@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-08-31perf/x86: Add list of register namesStephane Eranian1-0/+7
This patch adds a way to locate a register identifier (PERF_X86_REG_*) based on its name, e.g., AX. This will be used by a subsequent patch to improved flexibility of perf record. Signed-off-by: Stephane Eranian <eranian@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kan Liang <kan.liang@intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1441039273-16260-3-git-send-email-eranian@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-08-31perf evlist: Open event on evsel cpus and threadsKan Liang1-0/+4
An evsel may have different cpus and threads than the evlist it is in. Use it's own cpus and threads, when opening the evsel in 'perf record'. Signed-off-by: Kan Liang <kan.liang@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/1440138194-17001-1-git-send-email-kan.liang@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-08-31perf tools: Fix build on powerpc broken by pt/btsAdrian Hunter2-0/+4
It is theoretically possible to process perf.data files created on x86 and that contain Intel PT or Intel BTS data, on any other architecture, which is why it is possible for there to be build errors on powerpc caused by pt/bts. The errors were: util/intel-pt-decoder/intel-pt-insn-decoder.c: In function ‘intel_pt_insn_decoder’: util/intel-pt-decoder/intel-pt-insn-decoder.c:138:3: error: switch missing default case [-Werror=switch-default] switch (insn->immediate.nbytes) { ^ cc1: all warnings being treated as errors linux-acme.git/tools/perf/perf-obj/libperf.a(libperf-in.o): In function `intel_pt_synth_branch_sample': sources/linux-acme.git/tools/perf/util/intel-pt.c:871: undefined reference to `tsc_to_perf_time' linux-acme.git/tools/perf/perf-obj/libperf.a(libperf-in.o): In function `intel_pt_sample': sources/linux-acme.git/tools/perf/util/intel-pt.c:915: undefined reference to `tsc_to_perf_time' sources/linux-acme.git/tools/perf/util/intel-pt.c:962: undefined reference to `tsc_to_perf_time' linux-acme.git/tools/perf/perf-obj/libperf.a(libperf-in.o): In function `intel_pt_process_event': sources/linux-acme.git/tools/perf/util/intel-pt.c:1454: undefined reference to `perf_time_to_tsc' Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Cc: Wang Nan <wangnan0@huawei.com> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1441046384-28663-1-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-08-28perf evlist: Add backpointer for perf_env to evlistKan Liang2-0/+2
Add backpointer to perf_env in evlist, so we can easily access env when processing something where we have a evsel or evlist. Suggested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Kan Liang <kan.liang@intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1440755289-30939-5-git-send-email-kan.liang@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-08-28perf tools: Rename perf_session_env to perf_envKan Liang5-9/+9
As it is not necessarily tied to a perf.data file and needs using in places where a perf_session is not required. Suggested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Kan Liang <kan.liang@intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1440755289-30939-4-git-send-email-kan.liang@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-08-28perf tools: Do not change lib/api/fs/debugfs directlyJiri Olsa1-1/+0
The tracing_events_path is the variable we want to change via --debugfs-dir option, not the debugfs_mountpoint. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Reviewed-by: Matt Fleming <matt@codeblueprint.co.uk> Cc: Raphael Beamonte <raphael.beamonte@gmail.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1440596813-12844-4-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-08-28perf tools: Add tracing_path and remove unneeded functionsJiri Olsa2-53/+5
There's no need for find_tracing_dir, because perf already searches for debugfs/tracefs mount on start and populate tracing_events_path. Adding tracing_path to carry tracing dir string to be used in get_tracing_file instead of calling find_tracing_dir. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Raphael Beamonte <raphael.beamonte@gmail.com> Cc: David Ahern <dsahern@gmail.com> Cc: Matt Fleming <matt@codeblueprint.co.uk> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1440596813-12844-3-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-08-28perf buildid: Introduce sysfs/filename__sprintf_build_idMasami Hiramatsu2-0/+35
Introduce sysfs/filename__sprintf_build_id for consolidating similar code. Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: Hemant Kumar <hemant@linux.vnet.ibm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/20150815114259.13642.34685.stgit@localhost.localdomain Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-08-28perf evsel: Add a backpointer to the evlist a evsel is inArnaldo Carvalho de Melo3-0/+8
So that functions that deal primarily with an evsel to access information that concerns the whole evlist it is in. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1440677263-21954-5-git-send-email-kan.liang@intel.com Signed-off-by: Kan Liang <kan.liang@intel.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-08-26perf probe: Support probing at absolute addressWang Nan3-24/+163
It should be useful to allow 'perf probe' probe at absolute offset of a target. For example, when (u)probing at a instruction of a shared object in a embedded system where debuginfo is not avaliable but we know the offset of that instruction by manually digging. This patch enables following perf probe command syntax: # perf probe 0xffffffff811e6615 And # perf probe /lib/x86_64-linux-gnu/libc-2.19.so 0xeb860 In the above example, we don't need a anchor symbol, so it is possible to compute absolute addresses using other methods and then use 'perf probe' to create the probing points. v1 -> v2: Drop the leading '+' in cmdline; Allow uprobing at offset 0x0; Improve 'perf probe -l' result when uprobe at area without debuginfo. v2 -> v3: Split bugfix to a separated patch. Test result: # perf probe 0xffffffff8119d175 %ax # perf probe sys_write %ax # perf probe /lib64/libc-2.18.so 0x0 %ax # perf probe /lib64/libc-2.18.so 0x5 %ax # perf probe /lib64/libc-2.18.so 0xd8e40 %ax # perf probe /lib64/libc-2.18.so __write %ax # perf probe /lib64/libc-2.18.so 0xd8e49 %ax # cat /sys/kernel/debug/tracing/uprobe_events p:probe_libc/abs_0 /lib64/libc-2.18.so:0x (null) arg1=%ax p:probe_libc/abs_5 /lib64/libc-2.18.so:0x0000000000000005 arg1=%ax p:probe_libc/abs_d8e40 /lib64/libc-2.18.so:0x00000000000d8e40 arg1=%ax p:probe_libc/__write /lib64/libc-2.18.so:0x00000000000d8e40 arg1=%ax p:probe_libc/abs_d8e49 /lib64/libc-2.18.so:0x00000000000d8e49 arg1=%ax # cat /sys/kernel/debug/tracing/kprobe_events p:probe/abs_ffffffff8119d175 0xffffffff8119d175 arg1=%ax p:probe/sys_write _text+1692016 arg1=%ax # perf probe -l Failed to find debug information for address 5 probe:abs_ffffffff8119d175 (on sys_write+5 with arg1) probe:sys_write (on sys_write with arg1) probe_libc:__write (on @unix/syscall-template.S:81 in /lib64/libc-2.18.so with arg1) probe_libc:abs_0 (on 0x0 in /lib64/libc-2.18.so with arg1) probe_libc:abs_5 (on 0x5 in /lib64/libc-2.18.so with arg1) probe_libc:abs_d8e40 (on @unix/syscall-template.S:81 in /lib64/libc-2.18.so with arg1) probe_libc:abs_d8e49 (on __GI___libc_write+9 in /lib64/libc-2.18.so with arg1) Signed-off-by: Wang Nan <wangnan0@huawei.com> Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1440586666-235233-7-git-send-email-wangnan0@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-08-26perf probe: Fix error reported when offset without functionWang Nan1-3/+7
This patch fixes a bug that, when offset is provided but function is lost, parse_perf_probe_point() will give a "" string as function name, so the checking code at the end of parse_perf_probe_point() become useless. For example: # perf probe +0x1234 Failed to find symbol in kernel Error: Failed to add events. After this patch: # perf probe +0x1234 Semantic error :Offset requires an entry function. Error: Command Parse Error. Signed-off-by: Wang Nan <wangnan0@huawei.com> Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1440586666-235233-6-git-send-email-wangnan0@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-08-26perf probe: Fix list result when address is zeroWang Nan1-3/+25
When manually added uprobe point with zero address, 'perf probe -l' reports error. For example: # echo p:probe_libc/abs_0 /path/to/lib.bin:0x0 arg1=%ax > \ /sys/kernel/debug/tracing/uprobe_events # perf probe -l Error: Failed to show event list. Probing at 0x0 is possible and useful when lib.bin is not a normal shared object but is manually mapped. However, in this case kernel report: # cat /sys/kernel/debug/tracing/uprobe_events p:probe_libc/abs_0 /path/to/lib.bin:0x (null) arg1=%ax This patch supports the above kernel output. Signed-off-by: Wang Nan <wangnan0@huawei.com> Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1440586666-235233-5-git-send-email-wangnan0@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-08-26perf probe: Fix list result when symbol can't be foundWang Nan1-1/+1
'perf probe -l' reports error if it is unable find symbol through address. Here is an example. # echo 'p:probe_libc/abs_5 /lib64/libc.so.6:0x5' > /sys/kernel/debug/tracing/uprobe_events # cat /sys/kernel/debug/tracing/uprobe_events p:probe_libc/abs_5 /lib64/libc.so.6:0x0000000000000005 # perf probe -l Error: Failed to show event list Also, this situation triggers a logical inconsistency in convert_to_perf_probe_point() that, it returns ENOMEM but actually it never try strdup(). This patch removes !tp->module && !is_kprobe condition, so it always uses address to build function name if symbol not found. Test result: # perf probe -l probe_libc:abs_5 (on 0x5 in /lib64/libc.so.6) Signed-off-by: Wang Nan <wangnan0@huawei.com> Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1440586666-235233-4-git-send-email-wangnan0@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-08-26perf probe: Prevent segfault when reading probe point with absolute addressWang Nan1-4/+4
'perf probe -l' panic if there is a manually inserted probing point with absolute address. For example: # echo 'p:probe/abs_ffffffff811e6615 0xffffffff811e6615' > /sys/kernel/debug/tracing/kprobe_events # perf probe -l Segmentation fault (core dumped) This patch fix this problem by considering the situation that "tp->symbol == NULL" in find_perf_probe_point_from_dwarf() and find_perf_probe_point_from_map(). After this patch: # perf probe -l probe:abs_ffffffff811e6615 (on SyS_write+5@fs/read_write.c) And when debug info is missing: # rm -rf ~/.debug # mv /lib/modules/4.2.0-rc1+/build/vmlinux /lib/modules/4.2.0-rc1+/build/vmlinux.bak # perf probe -l probe:abs_ffffffff811e6615 (on sys_write+5) Signed-off-by: Wang Nan <wangnan0@huawei.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1440509256-193590-1-git-send-email-wangnan0@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-08-24perf tools: Add Intel PT support for decoding TRACESTOP packetsAdrian Hunter1-0/+11
A TRACESTOP packet is produced when an Intel PT trace enters a defined region of the address space at which point the tracing stops. This patch just adds decoder support. Support for specifying TRACESTOP regions is left until later. For details refer to the June 2015 or later Intel 64 and IA-32 Architectures SDM Chapter 36 Intel Processor Trace. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/1437150840-31811-25-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-08-24perf tools: Add Intel PT support for decoding CYC packetsAdrian Hunter1-5/+306
CYC packets provide even finer grain timestamp information than MTC and TSC packets. A CYC packet contains the number of CPU cycles since the last CYC packet. This patch just adds decoder support. The CPU frequency can be related to TSC using the Maximum Non-Turbo Ratio in combination with the CBR (core-to-bus ratio) packet. However more accuracy is achieved by simply interpolating the number of cycles between other timing packets like MTC or TSC. This patch takes the latter approach. Support for a default value and validation of values is provided by a later patch. Also documentation is updated in a separate patch. For details refer to the June 2015 or later Intel 64 and IA-32 Architectures SDM Chapter 36 Intel Processor Trace. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/1437150840-31811-23-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-08-24perf tools: Add Intel PT support for decoding MTC packetsAdrian Hunter2-4/+159
MTC packets provide finer grain timestamp information than TSC packets. MTC packets record time using the hardware crystal clock (CTC) which is related to TSC packets using a TMA packet. This patch just adds decoder support. Support for a default value and validation of values is provided by a later patch. Also documentation is updated in a separate patch. For details refer to the June 2015 or later Intel 64 and IA-32 Architectures SDM Chapter 36 Intel Processor Trace. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/1437150840-31811-21-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-08-24perf tools: Pass Intel PT information for decoding MTC and CYCAdrian Hunter3-10/+60
Record additional information in the AUXTRACE_INFO event in preparation for decoding MTC and CYC packets. Pass the information to the decoder. The AUXTRACE_INFO record can be extended by using the size to indicate the presence of new members. The additional information includes PMU config bit positions and the TSC to CTC (hardware crystal clock) ratio needed to decode MTC packets. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/1437150840-31811-20-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-08-24perf tools: Add new Intel PT packet definitionsAdrian Hunter3-17/+201
New features have been added to Intel PT which include a number of new packet definitions. This patch adds packet definitions for new packets: TMA, MTC, CYC, VMCS, TRACESTOP and MNT. Also another bit in PIP is defined. This patch only adds support for the definitions. Later patches add support for decoding TMA, MTC, CYC and TRACESTOP which is where those packets are explained. VMCS and the newly defined bit in PIP are used with virtualization which is not supported yet. MNT is a maintenance packet which the decoder should ignore. For details, refer to the June 2015 or later Intel 64 and IA-32 Architectures SDM Chapter 36 Intel Processor Trace. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/1437150840-31811-19-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-08-24perf tools: Fix Intel PT 'instructions' sample periodAdrian Hunter3-1/+8
The period on synthesized 'instructions' samples was being set to a fixed value, whereas the correct value is the number of instructions since the last sample, which is a value that the decoder can provide. So do it that way. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/1437150840-31811-14-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-08-24perf ordered_events: Clear the progress bar at the end of a flushArnaldo Carvalho de Melo1-0/+3
We were depending on the next screen operation after a flush() being one that would redraw the whole screen so that the progress bar would be overwritten, when that didn't happen a screen artifact of, say, a error dialog window would be overlaid on top of the progress bar, fix it by calling ui_browser__finish(), that now has a TUI implementation. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-el0fyw6duemnx62lydjzhs8c@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-08-24perf annotate: Reset the dso find_symbol cache when removing symbolsArnaldo Carvalho de Melo2-0/+12
The 'annotate' tool does some filtering in the entries in a DSO but forgot to reset the cache done in dso__find_symbol(), cauxing a SEGV: [root@zoo ~]# perf annotate netlink_poll perf: Segmentation fault -------- backtrace -------- perf[0x526ceb] /lib64/libc.so.6(+0x34960)[0x7faedfbe0960] perf(rb_erase+0x223)[0x499d63] perf[0x4213e9] perf[0x4bc123] perf[0x4bc621] perf[0x4bf26b] perf[0x4bc855] perf(perf_session__process_events+0x340)[0x4bddc0] perf(cmd_annotate+0x6bb)[0x421b5b] perf[0x479063] perf(main+0x60a)[0x42098a] /lib64/libc.so.6(__libc_start_main+0xf0)[0x7faedfbcbfe0] perf[0x420aa9] [0x0] [root@zoo ~]# Fix it by reseting the find cache when removing symbols. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Stephane Eranian <eranian@google.com> Fixes: b685ac22b436 ("perf symbols: Add front end cache for DSO symbol lookup") Link: http://lkml.kernel.org/n/tip-b2y9x46y0t8yem1ive41zqyp@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-08-22perf tools: Fix tarball build broken by pt/btsAdrian Hunter6-6/+35
Fix some include paths and add missing inat_types.h. Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/55D77696.60102@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-08-21perf probe: Try to use symbol table if searching debug info failedWang Nan1-3/+4
A problem can occur in a statically linked perf when vmlinux can be found: # perf probe --add sys_epoll_pwait probe-definition(0): sys_epoll_pwait symbol:sys_epoll_pwait file:(null) line:0 offset:0 return:0 lazy:(null) 0 arguments Looking at the vmlinux_path (7 entries long) Using /lib/modules/4.2.0-rc1+/build/vmlinux for symbols Open Debuginfo file: /lib/modules/4.2.0-rc1+/build/vmlinux Try to find probe point from debuginfo. Symbol sys_epoll_pwait address found : ffffffff8122bd40 Matched function: SyS_epoll_pwait Failed to get call frame on 0xffffffff8122bd40 An error occurred in debuginfo analysis (-2). Error: Failed to add events. Reason: No such file or directory (Code: -2) The reason is caused by libdw that, if libdw is statically linked, it can't load libebl_{arch}.so reliable. In this case it is still possible to get the address from /proc/kalksyms. However, perf tries that only when libdw returns -EBADF. This patch gives it another chance to utilize symbol table, even if libdw returns an error code other than -EBADF. After applying this patch: # perf probe -nv --add sys_epoll_pwait probe-definition(0): sys_epoll_pwait symbol:sys_epoll_pwait file:(null) line:0 offset:0 return:0 lazy:(null) 0 arguments Looking at the vmlinux_path (7 entries long) Using /lib/modules/4.2.0-rc1+/build/vmlinux for symbols Open Debuginfo file: /lib/modules/4.2.0-rc1+/build/vmlinux Try to find probe point from debuginfo. Symbol sys_epoll_pwait address found : ffffffff8122bd40 Matched function: SyS_epoll_pwait Failed to get call frame on 0xffffffff8122bd40 An error occurred in debuginfo analysis (-2). Trying to use symbols. Opening /sys/kernel/debug/tracing/kprobe_events write=1 Added new event: Writing event: p:probe/sys_epoll_pwait _text+2276672 probe:sys_epoll_pwait (on sys_epoll_pwait) You can now use it in all perf tools, such as: perf record -e probe:sys_epoll_pwait -aR sleep 1 Although libdw returns an error (Failed to get call frame), perf tries symbol table and finally gets correct address. Signed-off-by: Wang Nan <wangnan0@huawei.com> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Brendan Gregg <brendan.d.gregg@gmail.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: David Ahern <dsahern@gmail.com> Cc: He Kuang <hekuang@huawei.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kaixu Xia <xiakaixu@huawei.com> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/1440151770-129878-2-git-send-email-wangnan0@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-08-21perf tools: Initialize reference counts in map__clone()Arnaldo Carvalho de Melo1-2/+11
Map clone was written before we introduced reference counts for maps and dsos, so all that was needed was just a copy and then we would insert it into the new map_groups instance. Fix it by, after copying, initializing the map->refcnt, grabbing a struct dso refcount and resetting pointers that may be used to determine if a map, when deleted, is in a rb_tree. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Borislav Petkov <bp@suse.de> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-pd4mr80o5b9gvk50iineacec@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-08-21perf tools: Add Intel BTS supportAdrian Hunter6-4/+981
Intel BTS support fits within the new auxtrace infrastructure. Recording is supporting by identifying the Intel BTS PMU, parsing options and setting up events. Decoding is supported by queuing up trace data by thread and then decoding synchronously delivering synthesized event samples into the session processing for tools to consume. Committer note: E.g: [root@felicio ~]# perf record --per-thread -e intel_bts// ls anaconda-ks.cfg apctest.output bin kernel-rt-3.10.0-298.rt56.171.el7.x86_64.rpm libexec lock_page.bpf.c perf.data perf.data.old [ perf record: Woken up 3 times to write data ] [ perf record: Captured and wrote 4.367 MB perf.data ] [root@felicio ~]# perf evlist -v intel_bts//: type: 6, size: 112, { sample_period, sample_freq }: 1, sample_type: IP|TID|IDENTIFIER, read_format: ID, disabled: 1, enable_on_exec: 1, sample_id_all: 1, exclude_guest: 1 dummy:u: type: 1, size: 112, config: 0x9, { sample_period, sample_freq }: 1, sample_type: IP|TID|IDENTIFIER, read_format: ID, disabled: 1, exclude_kernel: 1, exclude_hv: 1, mmap: 1, comm: 1, enable_on_exec: 1, task: 1, sample_id_all: 1, mmap2: 1, comm_exec: 1 [root@felicio ~]# perf script # the navigate in the pager to some interesting place: ls 1843 1 branches: ffffffff810a60cb flush_signal_handlers ([kernel.kallsyms]) => ffffffff8121a522 setup_new_exec ([kernel.kallsyms]) ls 1843 1 branches: ffffffff8121a529 setup_new_exec ([kernel.kallsyms]) => ffffffff8122fa30 do_close_on_exec ([kernel.kallsyms]) ls 1843 1 branches: ffffffff8122fa5d do_close_on_exec ([kernel.kallsyms]) => ffffffff81767ae0 _raw_spin_lock ([kernel.kallsyms]) ls 1843 1 branches: ffffffff81767af4 _raw_spin_lock ([kernel.kallsyms]) => ffffffff8122fa62 do_close_on_exec ([kernel.kallsyms]) ls 1843 1 branches: ffffffff8122fa8e do_close_on_exec ([kernel.kallsyms]) => ffffffff8122faf0 do_close_on_exec ([kernel.kallsyms]) ls 1843 1 branches: ffffffff8122faf7 do_close_on_exec ([kernel.kallsyms]) => ffffffff8122fa8b do_close_on_exec ([kernel.kallsyms]) ls 1843 1 branches: ffffffff8122fa8e do_close_on_exec ([kernel.kallsyms]) => ffffffff8122faf0 do_close_on_exec ([kernel.kallsyms]) ls 1843 1 branches: ffffffff8122faf7 do_close_on_exec ([kernel.kallsyms]) => ffffffff8122fa8b do_close_on_exec ([kernel.kallsyms]) ls 1843 1 branches: ffffffff8122fa8e do_close_on_exec ([kernel.kallsyms]) => ffffffff8122faf0 do_close_on_exec ([kernel.kallsyms]) ls 1843 1 branches: ffffffff8122faf7 do_close_on_exec ([kernel.kallsyms]) => ffffffff8122fa8b do_close_on_exec ([kernel.kallsyms]) ls 1843 1 branches: ffffffff8122fa8e do_close_on_exec ([kernel.kallsyms]) => ffffffff8122faf0 do_close_on_exec ([kernel.kallsyms]) ls 1843 1 branches: ffffffff8122faf7 do_close_on_exec ([kernel.kallsyms]) => ffffffff8122fa8b do_close_on_exec ([kernel.kallsyms]) ls 1843 1 branches: ffffffff8122fa8e do_close_on_exec ([kernel.kallsyms]) => ffffffff8122faf0 do_close_on_exec ([kernel.kallsyms]) ls 1843 1 branches: ffffffff8122faf7 do_close_on_exec ([kernel.kallsyms]) => ffffffff8122fa8b do_close_on_exec ([kernel.kallsyms]) ls 1843 1 branches: ffffffff8122fa8e do_close_on_exec ([kernel.kallsyms]) => ffffffff8122faf0 do_close_on_exec ([kernel.kallsyms]) ls 1843 1 branches: ffffffff8122faf7 do_close_on_exec ([kernel.kallsyms]) => ffffffff8122fa8b do_close_on_exec ([kernel.kallsyms]) ls 1843 1 branches: ffffffff8122fac9 do_close_on_exec ([kernel.kallsyms]) => ffffffff8122fad2 do_close_on_exec ([kernel.kallsyms]) ls 1843 1 branches: ffffffff8122fadd do_close_on_exec ([kernel.kallsyms]) => ffffffff8120fc80 filp_close ([kernel.kallsyms]) ls 1843 1 branches: ffffffff8120fcaf filp_close ([kernel.kallsyms]) => ffffffff8120fcb6 filp_close ([kernel.kallsyms]) ls 1843 1 branches: ffffffff8120fcc2 filp_close ([kernel.kallsyms]) => ffffffff812547f0 dnotify_flush ([kernel.kallsyms]) ls 1843 1 branches: ffffffff81254823 dnotify_flush ([kernel.kallsyms]) => ffffffff8120fcc7 filp_close ([kernel.kallsyms]) ls 1843 1 branches: ffffffff8120fccd filp_close ([kernel.kallsyms]) => ffffffff81261790 locks_remove_posix ([kernel.kallsyms]) ls 1843 1 branches: ffffffff812617a3 locks_remove_posix ([kernel.kallsyms]) => ffffffff812617b9 locks_remove_posix ([kernel.kallsyms]) ls 1843 1 branches: ffffffff812617b9 locks_remove_posix ([kernel.kallsyms]) => ffffffff8120fcd2 filp_close ([kernel.kallsyms]) ls 1843 1 branches: ffffffff8120fcd5 filp_close ([kernel.kallsyms]) => ffffffff812142c0 fput ([kernel.kallsyms]) ls 1843 1 branches: ffffffff812142d6 fput ([kernel.kallsyms]) => ffffffff812142df fput ([kernel.kallsyms]) ls 1843 1 branches: ffffffff8121430c fput ([kernel.kallsyms]) => ffffffff810b6580 task_work_add ([kernel.kallsyms]) ls 1843 1 branches: ffffffff810b65ad task_work_add ([kernel.kallsyms]) => ffffffff810b65b1 task_work_add ([kernel.kallsyms]) ls 1843 1 branches: ffffffff810b65c1 task_work_add ([kernel.kallsyms]) => ffffffff810bc710 kick_process ([kernel.kallsyms]) ls 1843 1 branches: ffffffff810bc725 kick_process ([kernel.kallsyms]) => ffffffff810bc742 kick_process ([kernel.kallsyms]) ls 1843 1 branches: ffffffff810bc742 kick_process ([kernel.kallsyms]) => ffffffff810b65c6 task_work_add ([kernel.kallsyms]) ls 1843 1 branches: ffffffff810b65c9 task_work_add ([kernel.kallsyms]) => ffffffff81214311 fput ([kernel.kallsyms]) Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/1437150840-31811-9-git-send-email-adrian.hunter@intel.com [ Merged sample->time fix for bug found after first round of testing on slightly older kernel ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-08-21perf tools: Fix Intel PT timestamp handlingAdrian Hunter1-1/+1
Events that don't sample the timestamp have a timestamp value of -1. Intel PT processing wasn't taking that into account. This is particularly noticeable with Intel BTS because timestamps are not requested by default. Then, if the conversion of -1 to TSC results in a small number, the processing is unaffected. However if the conversion results in a big number, then the data is processed prematurely before relevant sideband data like mmap events, which in turn results in samples with unknown dsos. Commiter note: Since BTS wasn't upstream, I split the patch to fold the BTS part with the patch introducing it, to avoid having this bug in the commit history. PT was already upstream, so this patch contains that part. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/1440060692-5585-1-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-08-21perf tools: /proc/kcore requires CAP_SYS_RAWIO message too noisyAdrian Hunter2-2/+3
The "/proc/kcore requires CAP_SYS_RAWIO" message comes up all the time for 'perf script' if vmlinux is not found and the user isn't root, even when the kernel is not being traced and even though the message is only really relevant for annotation. Change it to pr_debug and instead put a note in the message displayed if annotation is not possible. Also, the file being accessed might not be /proc/kcore. Tools can be directed to a different location using the --kallsyms option in which case kcore is expected to be in the same directory. Adjust the message so it is not misleading in that case. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Li Zhang <zhlcindy@linux.vnet.ibm.com> Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Link: http://lkml.kernel.org/r/1440065260-8802-1-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-08-21perf script: Fix segfault using --show-mmap-eventsAdrian Hunter1-1/+1
Patch "perf script: Don't assume evsel position of tracking events" changed 'perf script' to use 'perf_evlist__id2evsel()'. That results in a segfault if there is more than 1 event and there are synthesized mmap events e.g. $ perf record -e cycles,instructions -p$$ sleep 1 $ perf script --show-mmap-events Segmentation fault (core dumped) That happens because these synthesized events have an 'id' of zero which does not match any 'evsel'. Currently, these synthesized events use the sample type of the first evsel. Change 'perf_evlist__id2evsel()' to reflect that which also makes it consistent with 'perf_evlist__event2evsel()'. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Fixes: 06b234ec26fd ("perf script: Don't assume evsel position of tracking events") Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/1440059205-1765-1-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-08-20Merge tag 'perf-core-for-mingo' of ↵Ingo Molnar26-6/+7393
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo: - Support Intel PT in several tools, enabling the use of the processor trace feature introduced in Intel Broadwell processors: (Adrian Hunter) # dmesg | grep Performance # [0.188477] Performance Events: PEBS fmt2+, 16-deep LBR, Broadwell events, full-width counters, Intel PMU driver. # perf record -e intel_pt//u -a sleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.216 MB perf.data ] # perf script # then navigate in the tool output to some area, like this one: 184 1030 dl_main (/usr/lib64/ld-2.17.so) => 7f21ba661440 dl_main (/usr/lib64/ld-2.17.so) 185 1457 dl_main (/usr/lib64/ld-2.17.so) => 7f21ba669f10 _dl_new_object (/usr/lib64/ld-2.17.so) 186 9f37 _dl_new_object (/usr/lib64/ld-2.17.so) => 7f21ba677b90 strlen (/usr/lib64/ld-2.17.so) 187 7ba3 strlen (/usr/lib64/ld-2.17.so) => 7f21ba677c75 strlen (/usr/lib64/ld-2.17.so) 188 7c78 strlen (/usr/lib64/ld-2.17.so) => 7f21ba669f3c _dl_new_object (/usr/lib64/ld-2.17.so) 189 9f8a _dl_new_object (/usr/lib64/ld-2.17.so) => 7f21ba65fab0 calloc@plt (/usr/lib64/ld-2.17.so) 190 fab0 calloc@plt (/usr/lib64/ld-2.17.so) => 7f21ba675e70 calloc (/usr/lib64/ld-2.17.so) 191 5e87 calloc (/usr/lib64/ld-2.17.so) => 7f21ba65fa90 malloc@plt (/usr/lib64/ld-2.17.so) 192 fa90 malloc@plt (/usr/lib64/ld-2.17.so) => 7f21ba675e60 malloc (/usr/lib64/ld-2.17.so) 193 5e68 malloc (/usr/lib64/ld-2.17.so) => 7f21ba65fa80 __libc_memalign@plt (/usr/lib64/ld-2.17.so) 194 fa80 __libc_memalign@plt (/usr/lib64/ld-2.17.so) => 7f21ba675d50 __libc_memalign (/usr/lib64/ld-2.17.so) 195 5d63 __libc_memalign (/usr/lib64/ld-2.17.so) => 7f21ba675e20 __libc_memalign (/usr/lib64/ld-2.17.so) 196 5e40 __libc_memalign (/usr/lib64/ld-2.17.so) => 7f21ba675d73 __libc_memalign (/usr/lib64/ld-2.17.so) 197 5d97 __libc_memalign (/usr/lib64/ld-2.17.so) => 7f21ba675e18 __libc_memalign (/usr/lib64/ld-2.17.so) 198 5e1e __libc_memalign (/usr/lib64/ld-2.17.so) => 7f21ba675df9 __libc_memalign (/usr/lib64/ld-2.17.so) 199 5e10 __libc_memalign (/usr/lib64/ld-2.17.so) => 7f21ba669f8f _dl_new_object (/usr/lib64/ld-2.17.so) 200 9fc2 _dl_new_object (/usr/lib64/ld-2.17.so) => 7f21ba678e70 memcpy (/usr/lib64/ld-2.17.so) 201 8e8c memcpy (/usr/lib64/ld-2.17.so) => 7f21ba678ea0 memcpy (/usr/lib64/ld-2.17.so) - Fix annotation of vdso (Adrian Hunter) - Fix DWARF callchains in 'perf script' (Jiri Olsa) - Fix adding probes in kernel syscalls and listing which variables can be collected at kernel syscall function lines (Masami Hiramatsu) Build Fixes: - Fix 32-bit compilation error in util/annotate.c (Adrian Hunter) - Support static linking with libdw on Fedora 22 (Andi Kleen) Infrastructure changes: - Add a helper function to probe whether cpu-wide tracing is possible (Adrian Hunter) - Move vfs_getname storage to per thread area in 'perf trace' (Arnaldo Carvalho de Melo) Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-08-20Merge branch 'perf/urgent' into perf/core, to pick up fixes before adding ↵Ingo Molnar2-2/+24
more changes Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-08-19perf tools: Make fork event processing more resilientAdrian Hunter1-2/+18
When processing a fork event, the tools lookup the parent thread by its tid. In a couple of cases, it is possible for that thread to have the wrong pid. That can happen if the data is being processed out of order, or if the (fork) event that would have removed the erroneous thread was lost. Assume the latter case, print a dump message, remove the erroneous thread, create a new one with the correct pid, and keep going. Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Tested-by: Jiri Olsa <jolsa@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/1439994561-27436-3-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-08-19perf tools: Avoid deadlock when map_groups are brokenAdrian Hunter1-0/+6
Attempting to clone map groups onto themselves will deadlock. It only happens because of other bugs, but the code should protect itself anyway. Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Tested-by: Jiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/1439994561-27436-2-git-send-email-adrian.hunter@intel.com [ Use pr_debug() instead of dump_fprintf() ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-08-17perf tools: Take Intel PT into useAdrian Hunter2-3/+6
To record an AUX area, the weak function auxtrace_record__init() must be implemented. Equally to decode an AUX area, the AUX area tracing type must be added to the perf_event__process_auxtrace_info() function. This patch makes those two changes plus hooks up default config for the intel_pt PMU. Also some brief documentation is provided for using the tools with intel_pt. Commiter note: E.g: [root@perf4 ~]# dmesg 451 [0.405807] Performance Events: PEBS fmt2+, 16-deep LBR, Broadwell events, full-width counters, Intel PMU driver. [root@perf4 ~]# perf --version perf version 4.1.g53874a [root@perf4 ~]# perf record -e intel_pt//u -a sleep 10 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.383 MB perf.data ] [root@perf4 ~]# perf evlist intel_pt//u sched:sched_switch dummy:u [root@perf4 ~]# perf report --stdio # To display the perf.data header info, please use --header/--header-only options. # # # Total Lost Samples: 0 # # Samples: 0 of event 'intel_pt//u' # Event count (approx.): 0 # # Overhead Command Shared Object Symbol # ........ ....... ............. ...... # # Samples: 393 of event 'sched:sched_switch' # Event count (approx.): 393 # # Overhead Command Shared Object Symbol # ........ .............. ................ .............. 49.62% swapper [kernel.vmlinux] [k] __schedule 10.69% rcu_sched [kernel.vmlinux] [k] __schedule 6.62% rcuos/0 [kernel.vmlinux] [k] __schedule 5.60% kworker/0:1 [kernel.vmlinux] [k] __schedule 3.56% rcuos/3 [kernel.vmlinux] [k] __schedule 3.05% kworker/u384:2 [kernel.vmlinux] [k] __schedule 2.54% kworker/2:0 [kernel.vmlinux] [k] __schedule 2.54% tuned [kernel.vmlinux] [k] __schedule <SNIP> # Samples: 0 of event 'dummy:u' # Event count (approx.): 0 # # Overhead Command Shared Object Symbol # ........ ....... ............. ...... # Samples: 28 of event 'instructions:u' # Event count (approx.): 5030172 # # Overhead Command Shared Object Symbol # ........ .......... ................... ................................ # 21.43% tuned libpython2.7.so.1.0 [.] PyEval_EvalFrameEx | ---PyEval_EvalFrameEx | |--83.33%-- PyEval_EvalCodeEx | PyEval_EvalFrameEx | | | |--60.00%-- PyEval_EvalCodeEx | | PyEval_EvalFrameEx | | PyEval_EvalFrameEx | | | --40.00%-- PyEval_EvalFrameEx | --16.67%-- PyEval_EvalFrameEx PyEval_EvalCodeEx PyEval_EvalFrameEx PyEval_EvalCodeEx PyEval_EvalFrameEx PyEval_EvalFrameEx 14.29% tuned libpython2.7.so.1.0 [.] _PyType_Lookup | ---_PyType_Lookup _PyObject_GenericGetAttrWithDict PyEval_EvalFrameEx PyEval_EvalCodeEx PyEval_EvalFrameEx PyEval_EvalCodeEx PyEval_EvalFrameEx | |--75.00%-- PyEval_EvalFrameEx | --25.00%-- PyEval_EvalCodeEx PyEval_EvalFrameEx PyEval_EvalFrameEx 3.57% irqbalance irqbalance [.] 0x0000000000004038 | ---0x4038 0x4761 0x4761 0x4761 0x49f1 0x2295 3.57% irqbalance libc-2.17.so [.] __GI_____strtoull_l_internal | ---__GI_____strtoull_l_internal 0x6f49 0x229a 3.57% irqbalance libc-2.17.so [.] __strchrnul | ---__strchrnul vfprintf __vsprintf_chk __sprintf_chk 0x2724 0x4038 0x2331 3.57% irqbalance libc-2.17.so [.] __strstr_sse42 | ---__strstr_sse42 0x71e0 0x229f # And now to some userspace ftrace on uninstrumented binaries 8-) : # Hand edited to make it a bit more compact, replacing /home/acme/bin/perf # with /bin/perf: [root@perf4 ~]# perf script perf 8921 [3] 7.310889: 1 branches:u: 0 [unknown] ([unknown]) => 7fcecadbf257 __GI___ioctl (/usr/lib64/libc-2.17.so) perf 8921 [3] 7.310889: 1 branches:u: 7fcecadbf25f __GI___ioctl (/usr/lib64/libc-2.17.so) => 481689 perf_evlist__enable (/bin/perf) perf 8921 [3] 7.310889: 1 branches:u: 481694 perf_evlist__enable (/bin/perf) => 481614 perf_evlist__enable (/bin/perf) perf 8921 [3] 7.310889: 1 branches:u: 481630 perf_evlist__enable (/bin/perf) => 4816d8 perf_evlist__enable (/bin/perf) perf 8921 [3] 7.310889: 1 branches:u: 4816de perf_evlist__enable (/bin/perf) => 48164f perf_evlist__enable (/bin/perf) perf 8921 [3] 7.310889: 1 branches:u: 481652 perf_evlist__enable (/bin/perf) => 48165f perf_evlist__enable (/bin/perf) perf 8921 [3] 7.310889: 1 branches:u: 481684 perf_evlist__enable (/bin/perf) => 41d250 ioctl@plt (/bin/perf) perf 8921 [3] 7.310889: 1 branches:u: 41d250 ioctl@plt (/bin/perf) => 7fcecadbf250 __GI___ioctl (/usr/lib64/libc-2.17.so) perf 8921 [3] 7.310889: 1 branches:u: 7fcecadbf255 __GI___ioctl (/usr/lib64/libc-2.17.so) => 0 [unknown] ([unknown]) perf 8921 [3] 7.310890: 1 branches:u: 0 [unknown] ([unknown]) => 7fcecadbf257 __GI___ioctl (/usr/lib64/libc-2.17.so) perf 8921 [3] 7.310890: 1 branches:u: 7fcecadbf25f __GI___ioctl (/usr/lib64/libc-2.17.so) => 481689 perf_evlist__enable (/bin/perf) perf 8921 [3] 7.310890: 1 branches:u: 481694 perf_evlist__enable (/bin/perf) => 481614 perf_evlist__enable (/bin/perf) perf 8921 [3] 7.310890: 1 branches:u: 481652 perf_evlist__enable (/bin/perf) => 48165f perf_evlist__enable (/bin/perf) perf 8921 [3] 7.310890: 1 branches:u: 481684 perf_evlist__enable (/bin/perf) => 41d250 ioctl@plt (/bin/perf) perf 8921 [3] 7.310890: 1 branches:u: 41d250 ioctl@plt (/bin/perf) => 7fcecadbf250 __GI___ioctl (/usr/lib64/libc-2.17.so) perf 8921 [3] 7.310890: 1 branches:u: 7fcecadbf255 __GI___ioctl (/usr/lib64/libc-2.17.so) => 0 [unknown] ([unknown]) perf 8921 [3] 7.310893: 1 branches:u: 0 [unknown] ([unknown]) => 7fcecadbf257 __GI___ioctl (/usr/lib64/libc-2.17.so) perf 8921 [3] 7.310893: 1 branches:u: 7fcecadbf25f __GI___ioctl (/usr/lib64/libc-2.17.so) => 481689 perf_evlist__enable (/bin/perf) perf 8921 [3] 7.310893: 1 branches:u: 4816a8 perf_evlist__enable (/bin/perf) => 4815f8 perf_evlist__enable (/bin/perf) perf 8921 [3] 7.310893: 1 branches:u: 4815fe perf_evlist__enable (/bin/perf) => 481614 perf_evlist__enable (/bin/perf) perf 8921 [3] 7.310893: 1 branches:u: 481652 perf_evlist__enable (/bin/perf) => 48165f perf_evlist__enable (/bin/perf) perf 8921 [3] 7.310893: 1 branches:u: 481684 perf_evlist__enable (/bin/perf) => 41d250 ioctl@plt (/bin/perf) perf 8921 [3] 7.310893: 1 branches:u: 41d250 ioctl@plt (/bin/perf) => 7fcecadbf250 __GI___ioctl (/usr/lib64/libc-2.17.so) perf 8921 [3] 7.310893: 1 branches:u: 7fcecadbf255 __GI___ioctl (/usr/lib64/libc-2.17.so) => 0 [unknown] ([unknown]) perf 8921 [3] 7.310956: 1 branches:u: 0 [unknown] ([unknown]) => 7fcecadbf257 __GI___ioctl (/usr/lib64/libc-2.17.so) perf 8921 [3] 7.310956: 1 branches:u: 7fcecadbf25f __GI___ioctl (/usr/lib64/libc-2.17.so) => 481689 perf_evlist__enable (/bin/perf) perf 8921 [3] 7.310956: 1 branches:u: 481694 perf_evlist__enable (/bin/perf) => 481614 perf_evlist__enable (/bin/perf) perf 8921 [3] 7.310956: 1 branches:u: 481630 perf_evlist__enable (/bin/perf) => 4816d8 perf_evlist__enable (/bin/perf) perf 8921 [3] 7.310956: 1 branches:u: 4816de perf_evlist__enable (/bin/perf) => 48164f perf_evlist__enable (/bin/perf) perf 8921 [3] 7.310956: 1 branches:u: 481652 perf_evlist__enable (/bin/perf) => 48165f perf_evlist__enable (/bin/perf) perf 8921 [3] 7.310956: 1 branches:u: 481684 perf_evlist__enable (/bin/perf) => 41d250 ioctl@plt (/bin/perf) perf 8921 [3] 7.310956: 1 branches:u: 41d250 ioctl@plt (/bin/perf) => 7fcecadbf250 __GI___ioctl (/usr/lib64/libc-2.17.so) perf 8921 [3] 7.310956: 1 branches:u: 7fcecadbf255 __GI___ioctl (/usr/lib64/libc-2.17.so) => 0 [unknown] ([unknown]) perf 8921 [3] 7.310961: 1 branches:u: 0 [unknown] ([unknown]) => 7fcecadbf257 __GI___ioctl (/usr/lib64/libc-2.17.so) perf 8921 [3] 7.310961: 1 branches:u: 7fcecadbf25f __GI___ioctl (/usr/lib64/libc-2.17.so) => 481689 perf_evlist__enable (/bin/perf) perf 8921 [3] 7.310961: 1 branches:u: 481694 perf_evlist__enable (/bin/perf) => 481614 perf_evlist__enable (/bin/perf) perf 8921 [3] 7.310961: 1 branches:u: 481652 perf_evlist__enable (/bin/perf) => 48165f perf_evlist__enable (/bin/perf) perf 8921 [3] 7.310961: 1 branches:u: 481684 perf_evlist__enable (/bin/perf) => 41d250 ioctl@plt (/bin/perf) perf 8921 [3] 7.310961: 1 branches:u: 41d250 ioctl@plt (/bin/perf) => 7fcecadbf250 __GI___ioctl (/usr/lib64/libc-2.17.so) perf 8921 [3] 7.310961: 1 branches:u: 7fcecadbf255 __GI___ioctl (/usr/lib64/libc-2.17.so) => 0 [unknown] ([unknown]) perf 8921 [3] 7.310968: 1 branches:u: 0 [unknown] ([unknown]) => 7fcecadbf257 __GI___ioctl (/usr/lib64/libc-2.17.so) perf 8921 [3] 7.310968: 1 branches:u: 7fcecadbf25f __GI___ioctl (/usr/lib64/libc-2.17.so) => 481689 perf_evlist__enable (/bin/perf) perf 8921 [3] 7.310968: 1 branches:u: 4816a8 perf_evlist__enable (/bin/perf) => 4815f8 perf_evlist__enable (/bin/perf) perf 8921 [3] 7.310968: 1 branches:u: 4815fe perf_evlist__enable (/bin/perf) => 481614 perf_evlist__enable (/bin/perf) perf 8921 [3] 7.310968: 1 branches:u: 481652 perf_evlist__enable (/bin/perf) => 48165f perf_evlist__enable (/bin/perf) perf 8921 [3] 7.310968: 1 branches:u: 481684 perf_evlist__enable (/bin/perf) => 41d250 ioctl@plt (/bin/perf) perf 8921 [3] 7.310968: 1 branches:u: 41d250 ioctl@plt (/bin/perf) => 7fcecadbf250 __GI___ioctl (/usr/lib64/libc-2.17.so) perf 8921 [3] 7.310968: 1 branches:u: 7fcecadbf255 __GI___ioctl (/usr/lib64/libc-2.17.so) => 0 [unknown] ([unknown]) perf 8921 [3] 7.311040: 1 branches:u: 0 [unknown] ([unknown]) => 7fcecadbf257 __GI___ioctl (/usr/lib64/libc-2.17.so) perf 8921 [3] 7.311040: 1 branches:u: 7fcecadbf25f __GI___ioctl (/usr/lib64/libc-2.17.so) => 481689 perf_evlist__enable (/bin/perf) perf 8921 [3] 7.311040: 1 branches:u: 481694 perf_evlist__enable (/bin/perf) => 481614 perf_evlist__enable (/bin/perf) perf 8921 [3] 7.311040: 1 branches:u: 481630 perf_evlist__enable (/bin/perf) => 4816d8 perf_evlist__enable (/bin/perf) perf 8921 [3] 7.311040: 1 branches:u: 4816de perf_evlist__enable (/bin/perf) => 48164f perf_evlist__enable (/bin/perf) perf 8921 [3] 7.311040: 1 branches:u: 481652 perf_evlist__enable (/bin/perf) => 48165f perf_evlist__enable (/bin/perf) perf 8921 [3] 7.311040: 1 branches:u: 481684 perf_evlist__enable (/bin/perf) => 41d250 ioctl@plt (/bin/perf) perf 8921 [3] 7.311040: 1 branches:u: 41d250 ioctl@plt (/bin/perf) => 7fcecadbf250 __GI___ioctl (/usr/lib64/libc-2.17.so) perf 8921 [3] 7.311040: 1 branches:u: 7fcecadbf255 __GI___ioctl (/usr/lib64/libc-2.17.so) => 0 [unknown] ([unknown]) perf 8921 [3] 7.311046: 1 branches:u: 0 [unknown] ([unknown]) => 7fcecadbf257 __GI___ioctl (/usr/lib64/libc-2.17.so) perf 8921 [3] 7.311046: 1 branches:u: 7fcecadbf25f __GI___ioctl (/usr/lib64/libc-2.17.so) => 481689 perf_evlist__enable (/bin/perf) perf 8921 [3] 7.311046: 1 branches:u: 481694 perf_evlist__enable (/bin/perf) => 481614 perf_evlist__enable (/bin/perf) perf 8921 [3] 7.311046: 1 branches:u: 481652 perf_evlist__enable (/bin/perf) => 48165f perf_evlist__enable (/bin/perf) perf 8921 [3] 7.311046: 1 branches:u: 481684 perf_evlist__enable (/bin/perf) => 41d250 ioctl@plt (/bin/perf) perf 8921 [3] 7.311046: 1 branches:u: 41d250 ioctl@plt (/bin/perf) => 7fcecadbf250 __GI___ioctl (/usr/lib64/libc-2.17.so) perf 8921 [3] 7.311046: 1 branches:u: 7fcecadbf255 __GI___ioctl (/usr/lib64/libc-2.17.so) => 0 [unknown] ([unknown]) perf 8921 [3] 7.311050: 1 branches:u: 0 [unknown] ([unknown]) => 7fcecadbf257 __GI___ioctl (/usr/lib64/libc-2.17.so) perf 8921 [3] 7.311050: 1 branches:u: 7fcecadbf25f __GI___ioctl (/usr/lib64/libc-2.17.so) => 481689 perf_evlist__enable (/bin/perf) : Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/1437150840-31811-8-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-08-17perf tools: Add Intel PT supportAdrian Hunter3-0/+1963
Add support for Intel Processor Trace. Intel PT support fits within the new auxtrace infrastructure. Recording is supporting by identifying the Intel PT PMU, parsing options and setting up events. Decoding is supported by queuing up trace data by cpu or thread and then decoding synchronously delivering synthesized event samples into the session processing for tools to consume. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/1437150840-31811-7-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-08-17perf tools: Add Intel PT decoderAdrian Hunter3-1/+1921
Add support for decoding an Intel Processor Trace. Intel PT trace data must be 'decoded' which involves walking the object code and matching the trace data packets. The decoder requests a buffer of binary data via a get_trace() call-back, which it decodes using instruction information which it gets via another call-back walk_insn(). Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/1437150840-31811-6-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-08-17perf tools: Add Intel PT logAdrian Hunter3-1/+208
Add a facility to log Intel Processor Trace decoding. The log is intended for debugging purposes only. The log file name is "intel_pt.log" and is opened in the current directory. The log contains a record of all packets and instructions decoded and can get very large (10 MB would be a small one). Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/1437150840-31811-5-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-08-17perf tools: Add Intel PT instruction decoderAdrian Hunter9-1/+2790
Add support for decoding instructions for Intel Processor Trace. The kernel x86 instruction decoder is copied for this. This essentially provides intel_pt_get_insn() which takes a binary buffer, uses the kernel's x86 instruction decoder to get details of the instruction and then categorizes it for consumption by an Intel PT decoder. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/1439450095-30122-1-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-08-17perf tools: Add Intel PT packet decoderAdrian Hunter4-0/+466
Add support for decoding Intel Processor Trace packets. This essentially provides intel_pt_get_packet() which takes a buffer of binary data and returns the decoded packet. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/1437150840-31811-3-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-08-17perf auxtrace: Add Intel PT as an AUX area tracing typeAdrian Hunter2-0/+2
Add the Intel Processor Trace type constant PERF_AUXTRACE_INTEL_PT. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/1437150840-31811-2-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-08-17perf tools: Add a helper function to probe whether cpu-wide tracing is possibleAdrian Hunter2-0/+25
Add a helper function to probe whether cpu-wide tracing is possible. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/1439458857-30636-2-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-08-17perf symbols: Fix annotation of vdsoAdrian Hunter1-0/+11
Older kernels attempt to prelink vdso to its virtual address. To permit annotation using objdump, the map__rip_2objdump() calculation must result in that same address which we can infer from the start and offset of the text section. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Will Deacon <will.deacon@arm.com> Link: http://lkml.kernel.org/r/1439556606-11297-1-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-08-17perf annotate: Fix 32-bit compilation error in util/annotate.cAdrian Hunter1-2/+2
Fix the following 32-bit compilation errors: util/annotate.c: In function ‘addr_map_symbol__account_cycles’: util/annotate.c:643:3: error: format ‘%lx’ expects argument of type ‘long unsigned int’, but argument 4 has type ‘u64’ [-Werror=format=] pr_debug2("BB with bad start: addr %lx start %lx sym %lx saddr %lx\n", ^ util/annotate.c:643:3: error: format ‘%lx’ expects argument of type ‘long unsigned int’, but argument 5 has type ‘u64’ [-Werror=format=] util/annotate.c:643:3: error: format ‘%lx’ expects argument of type ‘long unsigned int’, but argument 6 has type ‘u64’ [-Werror=format=] These were introduced by the patch: "perf report: Add infrastructure for a cycles histogram" Also change the 'saddr' variable from 'unsigned long' to 'u64' noting that theoretically we could be processing data captured on a 64-bit machine but processing it on a 32-bit machine. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Fixes: d4957633bf9d ("perf report: Add infrastructure for a cycles histogram") Link: http://lkml.kernel.org/r/1439536294-18241-1-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-08-13perf probe: Fix to add missed brace around if blockMasami Hiramatsu1-1/+2
The commit 75186a9b09e4 (perf probe: Fix to show lines of sys_ functions correctly) introduced a bug by a missed brace around if block. This fixes to add it. Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Fixes: 75186a9b09e4 ("perf probe: Fix to show lines of sys_ functions correctly") Link: http://lkml.kernel.org/r/20150812215541.9088.62425.stgit@localhost.localdomain Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-08-12perf report: Show call graph from reference eventsKan Liang2-2/+8
Introduce --show-ref-call-graph for perf report to print reference callgraph for no callgraph event. Here is an example. perf report --show-ref-call-graph --stdio # To display the perf.data header info, please use --header/--header-only options. # # # Total Lost Samples: 0 # # Samples: 5 of event 'cpu/cpu-cycles,call-graph=fp/' # Event count (approx.): 144985 # # Children Self Command Shared Object Symbol # ........ ........ ....... ................ ........................................ # 72.30% 0.00% sleep [kernel.vmlinux] [k] entry_SYSCALL_64_fastpath | ---entry_SYSCALL_64_fastpath | |--22.62%-- __GI___libc_nanosleep --77.38%-- [...] ...... # Samples: 6 of event 'cpu/instructions,call-graph=no/', show reference callgraph # Event count (approx.): 172780 # # Children Self Command Shared Object Symbol # ........ ........ ....... ................ ........................................ # 73.16% 0.00% sleep [kernel.vmlinux] [k] entry_SYSCALL_64_fastpath | ---entry_SYSCALL_64_fastpath | |--31.44%-- __GI___libc_nanosleep --68.56%-- [...] Signed-off-by: Kan Liang <kan.liang@intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/r/1439289050-40510-3-git-send-email-kan.liang@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-08-12perf callchain: Allow disabling call graphs per eventKan Liang2-9/+17
This patch introduce "call-graph=no" to disable per-event callgraph. Here is an example. perf record -e 'cpu/cpu-cycles,call-graph=fp/,cpu/instructions,call-graph=no/' sleep 1 perf report --stdio # To display the perf.data header info, please use --header/--header-only options. # # # Total Lost Samples: 0 # # Samples: 6 of event 'cpu/cpu-cycles,call-graph=fp/' # Event count (approx.): 774218 # # Children Self Command Shared Object Symbol # ........ ........ ....... ................ ........................................ # 61.94% 0.00% sleep [kernel.vmlinux] [k] entry_SYSCALL_64_fastpath | ---entry_SYSCALL_64_fastpath | |--97.30%-- __brk | --2.70%-- mmap64 _dl_check_map_versions _dl_check_all_versions 61.94% 0.00% sleep [kernel.vmlinux] [k] perf_event_mmap | ---perf_event_mmap | |--97.30%-- do_brk | sys_brk | entry_SYSCALL_64_fastpath | __brk | --2.70%-- mmap_region do_mmap_pgoff vm_mmap_pgoff sys_mmap_pgoff sys_mmap entry_SYSCALL_64_fastpath mmap64 _dl_check_map_versions _dl_check_all_versions ...... # Samples: 6 of event 'cpu/instructions,call-graph=no/' # Event count (approx.): 359692 # # Children Self Command Shared Object Symbol # ........ ........ ....... ................ ................................. # 89.03% 0.00% sleep [unknown] [.] 0xffff6598ffff6598 89.03% 0.00% sleep ld-2.17.so [.] _dl_resolve_conflicts 89.03% 0.00% sleep [kernel.vmlinux] [k] page_fault Signed-off-by: Kan Liang <kan.liang@intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/r/1439289050-40510-2-git-send-email-kan.liang@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2015-08-12perf callchain: Per-event type selection supportKan Liang6-3/+83
This patchkit adds the ability to set callgraph mode (fp, dwarf, lbr) per event. This in term can reduce sampling overhead and the size of the perf.data. Here is an example. perf record -e 'cpu/cpu-cycles,period=1000,call-graph=fp,time=1/,cpu/instructions,call-graph=lbr/' sleep 1 perf evlist -v cpu/cpu-cycles,period=1000,call-graph=fp,time=1/: type: 4, size: 112, config: 0x3c, { sample_period, sample_freq }: 1000, sample_type: IP|TID|TIME|CALLCHAIN|PERIOD|IDENTIFIER, read_format: ID, disabled: 1, inherit: 1, mmap: 1, comm: 1, enable_on_exec: 1, task: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1 cpu/instructions,call-graph=lbr/: type: 4, size: 112, config: 0xc0, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|CALLCHAIN|PERIOD|BRANCH_STACK|IDENTIFIER, read_format: ID, disabled: 1, inherit: 1, freq: 1, enable_on_exec: 1, sample_id_all: 1, exclude_guest: 1 Signed-off-by: Kan Liang <kan.liang@intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/r/1439289050-40510-1-git-send-email-kan.liang@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>