summaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2012-05-21Merge branch 'perf/core' of ↵Ingo Molnar361-2147/+3638
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core Fixes for perf/core: - Rename some perf_target methods to avoid double negation, from Namhyung Kim. - Revert change to use per task events with inheritance, from Namhyung Kim. - Events should start disabled till children starts running, from David Ahern. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-05-18perf evsel: Create events initially disabled -- againDavid Ahern1-1/+1
764e16a changed perf-record to create events disabled by default and enable them once perf initializations are done. This setting was dropped by 0f82ebc. Now perf events are once again generated during perf's initialization phase (e.g., generating maps). As an example, perf opens a lot of files at startup. Unpatched: perf record -e syscalls:sys_enter_open -ga -fo /tmp/perf.data -- sleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.087 MB /tmp/perf.data (~3798 samples) ] Using perf-script to look at the samples shows the perf command generating 563 of the 566 total events. Patched: perf record -e syscalls:sys_enter_open -ga -fo /tmp/perf.data -- sleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.028 MB /tmp/perf.data (~1206 samples) ] Using perf-script to look at the samples does not show perf command. Signed-off-by: David Ahern <dsahern@gmail.com> Link: http://lkml.kernel.org/r/1336968088-11531-1-git-send-email-dsahern@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-05-18Merge remote-tracking branch 'tip/perf/urgent' into perf/coreArnaldo Carvalho de Melo351-2073/+3514
Merge reason: We are going to queue up a dependent patch: "perf tools: Move parse event automated tests to separated object" That depends on: commit e7c72d8 perf tools: Add 'G' and 'H' modifiers to event parsing Conflicts: tools/perf/builtin-stat.c Conflicted with the recent 'perf_target' patches when checking the result of perf_evsel open routines to see if a retry is needed to cope with older kernels where the exclude guest/host perf_event_attr bits were not used. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-05-18perf tools: Split term type into value type and term typeJiri Olsa4-56/+96
Introducing type_val and type_term for term instead of a single type value. Currently the term type marked out the value type as well. With this change we can have future string term values being specified by user and translated into proper number along the processing. Signed-off-by: Jiri Olsa <jolsa@redhat.com> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Robert Richter <robert.richter@amd.com> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1335371102-11358-2-git-send-email-jolsa@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-05-17perf hists: Fix callchain ip printf formatJiri Olsa1-1/+1
The callchain address is stored as u64. Current code uses following format string to display callchain address: "%p\n", (void *)(long)chain->ip This way we lose upper 32 bits if we report 64 bit addresses in 32 bit environment. Fixing this to always display whole 64 bits. Note, running following to test perf endianity handling: test 1) - origin system: # perf record -a -- sleep 10 (any perf record will do) # perf report > report.origin # perf archive perf.data - copy the perf.data, report.origin and perf.data.tar.bz2 to a target system and run: # tar xjvf perf.data.tar.bz2 -C ~/.debug # perf report > report.target # diff -u report.origin report.target - the diff should produce no output (besides some white space stuff and possibly different date/TZ output) test 2) - origin system: # perf record -ag -fo /tmp/perf.data -- sleep 1 - mount origin system root to the target system on /mnt/origin - target system: # perf script --symfs /mnt/origin -I -i /mnt/origin/tmp/perf.data \ --kallsyms /mnt/origin/proc/kallsyms - complete perf.data header is displayed Signed-off-by: Jiri Olsa <jolsa@redhat.com> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1337151548-2396-8-git-send-email-jolsa@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-05-17perf target: Add uses_mmap fieldNamhyung Kim5-0/+10
If perf doesn't mmap on event (like perf stat), it should not create per-task-per-cpu events. So just use a dummy cpu map to create a per-task event for this case. Signed-off-by: Namhyung Kim <namhyung.kim@lge.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1337161549-9870-3-git-send-email-namhyung.kim@lge.com [ committer note: renamed .need_mmap to .uses_mmap ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-05-16ftrace: Remove selecting FRAME_POINTER with FUNCTION_TRACERSteven Rostedt1-1/+0
The function tracer will enable the -pg option with gcc, which requires that frame pointers. When FRAME_POINTER is defined in the kernel config it adds the gcc option -fno-omit-frame-pointer which causes some problems on some architectures. For those architectures, the FRAME_POINTER select was not set. When FUNCTION_TRACER was selected on these architectures that can not have -fno-omit-frame-pointer, the -pg option is still set. But when FRAME_POINTER is not selected, the kernel config would add the gcc option -fomit-frame-pointer. Adding this option is incompatible with -pg even on archs that do not need frame pointers with -pg. The answer to this was to just not add either -fno-omit-frame-pointer or -fomit-frame-pointer on these archs that want function tracing but do not set FRAME_POINTER. As it turns out, for archs that require frame pointers for function tracing, the same can be used. If gcc requires frame pointers with -pg, it will simply add it. The best thing to do is not select FRAME_POINTER when function tracing is selected, and let gcc add it if needed. Only add the -fno-omit-frame-pointer when something else selects FRAME_POINTER, but do not add -fomit-frame-pointer if function tracing is selected. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2012-05-16ftrace/x86: Have x86 ftrace use the ftrace_modify_all_code()Steven Rostedt3-15/+5
To remove duplicate code, have the ftrace arch_ftrace_update_code() use the generic ftrace_modify_all_code(). This requires that the default ftrace_replace_code() becomes a weak function so that an arch may override it. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2012-05-16ftrace: Make ftrace_modify_all_code() global for archs to useSteven Rostedt2-8/+15
Rename __ftrace_modify_code() to ftrace_modify_all_code() and make it global for all archs to use. This will remove the duplication of code, as archs that can modify code without stop_machine() can use it directly outside of the stop_machine() call. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2012-05-16ftrace: Return record ip addr for ftrace_location()Steven Rostedt2-7/+11
ftrace_location() is passed an addr, and returns 1 if the addr is on a ftrace nop (or caller to ftrace_caller), and 0 otherwise. To let kprobes know if it should move a breakpoint or not, it must return the actual addr that is the start of the ftrace nop. This way a kprobe placed on the location of a ftrace nop, can instead be placed on the instruction after the nop. Even if the probe addr is on the second or later byte of the nop, it can simply be moved forward. Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2012-05-16ftrace: Consolidate ftrace_location() and ftrace_text_reserved()Steven Rostedt1-40/+40
Both ftrace_location() and ftrace_text_reserved() do basically the same thing. They search to see if an address is in the ftace table (contains an address that may change from nop to call ftrace_caller). The difference is that ftrace_location() searches a single address, but ftrace_text_reserved() searches a range. This also makes the ftrace_text_reserved() faster as it now uses a bsearch() instead of linearly searching all the addresses within a page. Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2012-05-16ftrace: Speed up search by skipping pages by addressSteven Rostedt1-6/+16
As all records in a page of the ftrace table are sorted, we can speed up the search algorithm by checking if the address to look for falls in between the first and last record ip on the page. This speeds up both the ftrace_location() and ftrace_text_reserved() algorithms, as it can skip full pages when the search address is not in them. Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2012-05-16ftrace: Remove extra helper functionsSteven Rostedt1-37/+24
The ftrace_record_ip() and ftrace_alloc_dyn_node() were from the time of the ftrace daemon. Although they were still used, they still make things a bit more complex than necessary. Move the code into the one function that uses it, and remove the helper functions. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2012-05-16ftrace: Sort all function addresses, not just per pageSteven Rostedt2-13/+23
Instead of just sorting the ip's of the functions per ftrace page, sort the entire list before adding them to the ftrace pages. This will allow the bsearch algorithm to be sped up as it can also sort by pages, not just records within a page. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2012-05-16tracing: change CPU ring buffer state from tracing_cpumaskVaibhav Nagarnaik1-0/+2
According to Documentation/trace/ftrace.txt: tracing_cpumask: This is a mask that lets the user only trace on specified CPUS. The format is a hex string representing the CPUS. The tracing_cpumask currently doesn't affect the tracing state of per-CPU ring buffers. This patch enables/disables CPU recording as its corresponding bit in tracing_cpumask is set/unset. Link: http://lkml.kernel.org/r/1336096792-25373-3-git-send-email-vnagarnaik@google.com Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Laurent Chavey <chavey@google.com> Cc: Justin Teravest <teravest@google.com> Cc: David Sharp <dhsharp@google.com> Signed-off-by: Vaibhav Nagarnaik <vnagarnaik@google.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2012-05-16tracing: Check return value of tracing_dentry_percpu()Namhyung Kim1-0/+3
If tracing_dentry_percpu() failed, tracing_init_debugfs_percpu() will try to create each cpu directories on debugfs' root directory as d_percpu is NULL. Link: http://lkml.kernel.org/r/1335143517-2285-1-git-send-email-namhyung.kim@lge.com Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Signed-off-by: Namhyung Kim <namhyung.kim@lge.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2012-05-16ring-buffer: Reset head page before running self testSteven Rostedt1-0/+4
When the ring buffer does its consistency test on itself, it removes the head page, runs the tests, and then adds it back to what the "head_page" pointer was. But because the head_page pointer may lack behind the real head page (held by the link list pointer). The reset may be incorrect. Instead, if the head_page exists (it does not on first allocation) reset it back to the real head page before running the consistency tests. Then it will be put back to its original location after the tests are complete. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2012-05-16ring-buffer: Add integrity check at end of iter readSteven Rostedt1-0/+29
There use to be ring buffer integrity checks after updating the size of the ring buffer. But now that the ring buffer can modify the size while the system is running, the integrity checks were removed, as they require the ring buffer to be disabed to perform the check. Move the integrity check to the reading of the ring buffer via the iterator reads (the "trace" file). As reading via an iterator requires disabling the ring buffer, it is a perfect place to have it. If the ring buffer happens to be disabled when updating the size, we still perform the integrity check. Cc: Vaibhav Nagarnaik <vnagarnaik@google.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2012-05-16ring-buffer: Make addition of pages in ring buffer atomicVaibhav Nagarnaik1-25/+77
This patch adds the capability to add new pages to a ring buffer atomically while write operations are going on. This makes it possible to expand the ring buffer size without reinitializing the ring buffer. The new pages are attached between the head page and its previous page. Link: http://lkml.kernel.org/r/1336096792-25373-2-git-send-email-vnagarnaik@google.com Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Laurent Chavey <chavey@google.com> Cc: Justin Teravest <teravest@google.com> Cc: David Sharp <dhsharp@google.com> Signed-off-by: Vaibhav Nagarnaik <vnagarnaik@google.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2012-05-16ring-buffer: Make removal of ring buffer pages atomicVaibhav Nagarnaik2-76/+209
This patch adds the capability to remove pages from a ring buffer without destroying any existing data in it. This is done by removing the pages after the tail page. This makes sure that first all the empty pages in the ring buffer are removed. If the head page is one in the list of pages to be removed, then the page after the removed ones is made the head page. This removes the oldest data from the ring buffer and keeps the latest data around to be read. To do this in a non-racey manner, tracing is stopped for a very short time while the pages to be removed are identified and unlinked from the ring buffer. The pages are freed after the tracing is restarted to minimize the time needed to stop tracing. The context in which the pages from the per-cpu ring buffer are removed runs on the respective CPU. This minimizes the events not traced to only NMI trace contexts. Link: http://lkml.kernel.org/r/1336096792-25373-1-git-send-email-vnagarnaik@google.com Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Laurent Chavey <chavey@google.com> Cc: Justin Teravest <teravest@google.com> Cc: David Sharp <dhsharp@google.com> Signed-off-by: Vaibhav Nagarnaik <vnagarnaik@google.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2012-05-16tracing: Clean up tracing_mark_write()Steven Rostedt1-13/+11
On gcc 4.5 the function tracing_mark_write() would give a warning of page2 being uninitialized. This is due to a bug in gcc because the logic prevents page2 from being used uninitialized, and gcc 4.6+ does not complain (correctly). Instead of adding a "unitialized" around page2, which could show a bug later on, I combined page1 and page2 into an array map_pages[]. This binds the two and the two are modified according to nr_pages (what gcc 4.5 seems to ignore). This no longer gives a warning with gcc 4.5 nor with gcc 4.6. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2012-05-16Revert 'perf evlist: Fix creation of cpu map'Namhyung Kim1-3/+3
The commit 55261f46702c ("perf evlist: Fix creation of cpu map") changed to create a per-task event when no cpu target is specified. However it caused a problem since perf-task do not allow event inheritance due to scalability issues so that the result will contain samples only from parent, not from its children. So we should use perf-task-per-cpu events anyway to get the right result. Revert it. Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Analysed-by: Ingo Molnar <mingo@kernel.org> Acked-and-tested-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Namhyung Kim <namhyung.kim@lge.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1337161549-9870-2-git-send-email-namhyung.kim@lge.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-05-16perf target: Rename functions to avoid double negationNamhyung Kim5-15/+15
Rename perf_target__no_{cpu,task} to perf_target__has_{cpu,task} because it's more intuitive and easy to parse (for human beings) when used with negation. The names are came out from David Ahern. It is intended to be a mechanical substitution without any functional change. The perf_target__none remains unchanged since I couldn't find a right name and it is hardly used with negation. Signed-off-by: Namhyung Kim <namhyung.kim@lge.com> Suggested-by: David Ahern <dsahern@gmail.com> Suggested-by: Ingo Molnar <mingo@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1337161549-9870-1-git-send-email-namhyung.kim@lge.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-05-14perf/x86/ibs: Fix undefined reference to `get_ibs_caps'Robert Richter1-0/+4
Fixing i386 allnoconfig built errors: arch/x86/built-in.o: In function `amd_pmu_hw_config': perf_event_amd.c:(.text+0xc3e1): undefined reference to `get_ibs_caps' Reported-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Robert Richter <robert.richter@amd.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-05-14Merge tag 'perf-core-for-mingo' of ↵Ingo Molnar3-43/+369
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core Arjan & Linus Annotation Edition - Fix indirect calls beautifier, reported by Linus. - Use the objdump comments to nuke specificities about how access to a well know variable is encoded, suggested by Linus. - Show the number of places that jump to a target, requested by Arjan. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-05-12perf annotate browser: Add key bindings help windowArnaldo Carvalho de Melo1-7/+18
Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-1txmtzf71eqie5xcukbfxors@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-05-12perf annotate browser: Show 'jumpy' functionsArnaldo Carvalho de Melo1-5/+60
Just press 'J' and see how many places jump to jump targets. The hottest jump target appears in red, targets with more than one source have a different color than single source jump targets. Suggested-by: Arjan van de Ven <arjan@infradead.org> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-7452y0dmc02a20ooins7rn79@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-05-12perf annotate browser: Count the numbers of jump sources to a targetArnaldo Carvalho de Melo1-3/+3
Instead of simply marking an offset as a jump target. So that we can implement a new feature: showing "jumpy" targets, I.e. addresses that lots of places jump to. Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-vc7b0u5yxgrubig0q61ayhxf@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-05-12perf annotate: Introduce ->free() method in ins_opsArnaldo Carvalho de Melo2-8/+21
So that we don't special case disasm_line__free, allowing each instruction class to provide an specialized destructor, like is needed for 'lock'. Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-xxw4vs5n077tf35jsvjzylhb@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-05-12perf annotate: Augment lock instruction outputArnaldo Carvalho de Melo2-34/+109
It just chops off the 'lock' and uses the ins__find, etc machinery to call instruction specific parsers/beautifiers. Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-4913ba2dzakz5rivgumosqbh@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-05-11perf annotate: Resolve symbols using objdump comment for single op insArnaldo Carvalho de Melo1-0/+45
Starting with inc, incl, dec, decl. Requested-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-jvh0jspefr5jyn0l7qko12st@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-05-11perf annotate: Resolve symbols using objdump commentArnaldo Carvalho de Melo2-1/+119
This: mov 0x95bbb6(%rip),%ecx # ffffffff81ae8d04 <d_hash_shift> Becomes: mov d_hash_shift,%ecx Ditto for many more instructions that take two operands. Requested-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-i5opbyai2x6mn9e5yjmhx9k6@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-05-11perf annotate: Use raw form for register indirect call instructionsArnaldo Carvalho de Melo1-0/+9
callq *0x10(%rax) was being rendered in simplified mode as: callq *10 I.e. hexa, but without the 0x and omitting the register. In such cases just use the raw form. Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-m91tv004h2m1fkfgu6ovx3hb@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-05-11Merge tag 'perf-core-for-mingo' of ↵Ingo Molnar27-369/+583
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core Fixes and improvements for perf/core: - perf_target: abstraction for --uid, --pid, --tid, --cpu, --all-cpus handling, eliminating code duplicated in the tools, having constraints that apply to all of them, from Namhyung Kim - Fixes for handling fallback to cpu-clock on PPC, from David Ahern - Fix for processing events with unknown size, from Jiri Olsa - Compilation fix on 32-bit, from Jiri Olsa Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-05-10tracing: Do not enable function event with enableSteven Rostedt3-1/+7
With the adding of function tracing event to perf, it caused a side effect that produces the following warning when enabling all events in ftrace: # echo 1 > /sys/kernel/debug/tracing/events/enable [console] event trace: Could not enable event function This is because when enabling all events via the debugfs system it ignores events that do not have a ->reg() function assigned. This was to skip over the ftrace internal events (as they are not TRACE_EVENTs). But as the ftrace function event now has a ->reg() function attached to it for use with perf, it is no longer ignored. Worse yet, this ->reg() function is being called when it should not be. It returns an error and causes the above warning to be printed. By adding a new event_call flag (TRACE_EVENT_FL_IGNORE_ENABLE) and have all ftrace internel event structures have it set, setting the events/enable will no longe try to incorrectly enable the function event and does not warn. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2012-05-10perf hists browser: Use '/' for search/filter instead of 's'Arnaldo Carvalho de Melo1-2/+2
That is what is used in vi and mutt, and as well on the 'annotate' browser. Eventually we can have keymappings to make people used to other key associations more confortable. Suggested-by: Ingo Molnar <mingo@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/n/tip-fyln9286b8gx5q4n277l0djs@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-05-09Merge branch 'tip/perf/core' of ↵Ingo Molnar3-102/+24
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace into perf/core
2012-05-09perf stat: handle ENXIO error for perf_event_openDavid Ahern1-1/+6
perf stat on PPC currently fails to run: $ perf stat -- sleep 1 Error: open_counter returned with 6 (No such device or address). /bin/dmesg may provide additional information. Fatal: Not all events could be opened. The problem is that until 2.6.37 (behavior changed with commit b0a873e) perf on PPC returns ENXIO when hw_perf_event_init() fails. With this patch we get the expected behavior: $ perf stat -v -- sleep 1 cycles event is not supported by the kernel. stalled-cycles-frontend event is not supported by the kernel. stalled-cycles-backend event is not supported by the kernel. instructions event is not supported by the kernel. branches event is not supported by the kernel. branch-misses event is not supported by the kernel. ... Signed-off-by: David Ahern <dsahern@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1336490956-57145-1-git-send-email-dsahern@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-05-09perf annotate: shorten helpline so it fits in visible spaceDavid Ahern1-3/+3
Additional toggles have pushed the help line out of view on a modestly sized terminal (120 columns wide). Shorten it to just reminders. Signed-off-by: David Ahern <dsahern@gmail.com> Link: http://lkml.kernel.org/r/1336510879-64610-1-git-send-email-dsahern@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-05-09perf record: Reset event name when falling back to cpu-clockDavid Ahern1-0/+4
perf-record defaults to the H/W cycles event and if it is not supported falls back to cpu-clock. Reset the event name as well. Signed-off-by: David Ahern <dsahern@gmail.com> Link: http://lkml.kernel.org/r/1336495811-58461-1-git-send-email-dsahern@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-05-09perf top: Update event name when falling back to cpu-clockDavid Ahern1-0/+4
The 'perf top' command falls back to cpu-clock if the H/W cycles event is not supported, but the event name is not updated leading to a misleading header: PerfTop: 8 irqs/sec kernel:75.0% exact: 0.0% [1000Hz cycles], ... Update the event name when the event type is changed so that the header displays correctly: PerfTop: 794 irqs/sec kernel:100.0% exact: 0.0% [1000Hz cpu-clock], ... Signed-off-by: David Ahern <dsahern@gmail.com> Link: http://lkml.kernel.org/r/1336495789-58420-1-git-send-email-dsahern@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-05-09perf stat: handle ENXIO error for perf_event_openDavid Ahern1-1/+6
perf stat on PPC currently fails to run: $ perf stat -- sleep 1 Error: open_counter returned with 6 (No such device or address). /bin/dmesg may provide additional information. Fatal: Not all events could be opened. The problem is that until 2.6.37 (behavior changed with commit b0a873e) perf on PPC returns ENXIO when hw_perf_event_init() fails. With this patch we get the expected behavior: $ perf stat -v -- sleep 1 cycles event is not supported by the kernel. stalled-cycles-frontend event is not supported by the kernel. stalled-cycles-backend event is not supported by the kernel. instructions event is not supported by the kernel. branches event is not supported by the kernel. branch-misses event is not supported by the kernel. ... Signed-off-by: David Ahern <dsahern@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1336490956-57145-1-git-send-email-dsahern@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-05-09perf record: Fix fallback to cpu-clock on ppcDavid Ahern1-2/+6
perf-record on PPC is not falling back to cpu-clock: $ perf record -ag -fo /tmp/perf.data -- sleep 1 Error: sys_perf_event_open() syscall returned with 6 (No such device or address). /bin/dmesg may provide additional information. Fatal: No CONFIG_PERF_EVENTS=y kernel support configured? The problem is that until 2.6.37 (behavior changed with commit b0a873e) perf on PPC returns ENXIO when hw_perf_event_init() fails. With this patch we get the expected behavior: $ perf record -ag -fo /tmp/perf.data -v -- sleep 1 Old kernel, cannot exclude guest or host samples. The cycles event is not supported, trying to fall back to cpu-clock-ticks [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.151 MB /tmp/perf.data (~6592 samples) ] Signed-off-by: David Ahern <dsahern@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1336490937-57106-1-git-send-email-dsahern@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-05-09perf report: Fix format string for x86-32 compilationJiri Olsa1-1/+1
Using PRIu64 for printing out u64 nr_events to fix compilation for x86 32 bits. Cc: Arun Sharma <asharma@fb.com> Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com> Cc: Cyrill Gorcunov <gorcunov@openvz.org> Cc: Frank C. Eigler <fche@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Robert Richter <robert.richter@amd.com> Cc: Stephane Eranian <eranian@google.com> Cc: Tom Zanussi <tzanussi@gmail.com> Cc: Ulrich Drepper <drepper@gmail.com> Link: http://lkml.kernel.org/r/1335958638-5160-7-git-send-email-jolsa@redhat.com Signed-off-by: Jiri Olsa <jolsa@redhat.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2012-05-09sched, perf: Use a single callback into the schedulerPeter Zijlstra3-30/+17
We can easily use a single callback for both sched-in and sched-out. This reduces the code footprint in the scheduler path as well as removes the PMU black spot otherwise present between the out and in callback. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/n/tip-o56ajxp1edwqg6x9d31wb805@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-05-09perf/x86-ibs: Fix usage of IBS op current countRobert Richter1-12/+21
The value of IbsOpCurCnt rolls over when it reaches IbsOpMaxCnt. Thus, it is reset to zero by hardware. To get the correct count we need to add the max count to it in case we received an ibs sample (valid bit set). Signed-off-by: Robert Richter <robert.richter@amd.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1333390758-10893-13-git-send-email-robert.richter@amd.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-05-09perf/x86-ibs: Catch spurious interrupts after stopping IBSRobert Richter1-5/+7
After disabling IBS there could be still incomming NMIs with samples that even have the valid bit cleared. Mark all this NMIs as handled to avoid spurious interrupt messages. Signed-off-by: Robert Richter <robert.richter@amd.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1333390758-10893-12-git-send-email-robert.richter@amd.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-05-09perf/x86-ibs: Implement workaround for IBS erratum #420Robert Richter1-23/+39
When disabling ibs there might be the case where hardware continuously generates interrupts. This is described in erratum #420 (Instruction- Based Sampling Engine May Generate Interrupt that Cannot Be Cleared). To avoid this we must clear the counter mask first and then clear the enable bit. This patch implements this. See Revision Guide for AMD Family 10h Processors, Publication #41322. Note: We now keep track of the last read ibs config value which is then used to disable ibs. To update the config value we pass now a pointer to the functions reading it. Signed-off-by: Robert Richter <robert.richter@amd.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1333390758-10893-11-git-send-email-robert.richter@amd.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-05-09perf/x86-ibs: Extend hw period that triggers overflowRobert Richter1-2/+13
If the last hw period is too short we might hit the irq handler which biases the results. Thus try to have a max last period that triggers the sw overflow. Signed-off-by: Robert Richter <robert.richter@amd.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1333390758-10893-10-git-send-email-robert.richter@amd.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-05-09perf/x86-ibs: Trigger overflow if remaining period is too smallRobert Richter1-4/+1
There are cases where the remaining period is smaller than the minimal possible value. In this case the counter is restarted with the minimal period. This is of no use as the interrupt handler will trigger immediately again and most likely hits itself. This biases the results. So, if the remaining period is within the min range, we better do not restart the counter and instead trigger the overflow. Signed-off-by: Robert Richter <robert.richter@amd.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1333390758-10893-9-git-send-email-robert.richter@amd.com Signed-off-by: Ingo Molnar <mingo@kernel.org>