diff options
author | Alexei Starovoitov <ast@kernel.org> | 2019-11-21 17:15:15 -0800 |
---|---|---|
committer | Alexei Starovoitov <ast@kernel.org> | 2019-11-24 16:58:46 -0800 |
commit | c4781e37c6a22c39cb4a57411d14f42aca124f04 (patch) | |
tree | feeb550fccfecba072ae58bac2ca83fb52964093 /arch | |
parent | 161f3cbcda06aa70faed6b703066fedbd7653e23 (diff) |
selftests/bpf: Add BPF trampoline performance test
Add a test that benchmarks different ways of attaching BPF program to a kernel function.
Here are the results for 2.4Ghz x86 cpu on a kernel without mitigations:
$ ./test_progs -n 49 -v|grep events
task_rename base 2743K events per sec
task_rename kprobe 2419K events per sec
task_rename kretprobe 1876K events per sec
task_rename raw_tp 2578K events per sec
task_rename fentry 2710K events per sec
task_rename fexit 2685K events per sec
On a kernel with retpoline:
$ ./test_progs -n 49 -v|grep events
task_rename base 2401K events per sec
task_rename kprobe 1930K events per sec
task_rename kretprobe 1485K events per sec
task_rename raw_tp 2053K events per sec
task_rename fentry 2351K events per sec
task_rename fexit 2185K events per sec
All 5 approaches:
- kprobe/kretprobe in __set_task_comm()
- raw tracepoint in trace_task_rename()
- fentry/fexit in __set_task_comm()
are roughly equivalent.
__set_task_comm() by itself is quite fast, so any extra instructions add up.
Until BPF trampoline was introduced the fastest mechanism was raw tracepoint.
kprobe via ftrace was second best. kretprobe is slow due to trap. New
fentry/fexit methods via BPF trampoline are clearly the fastest and the
difference is more pronounced with retpoline on, since BPF trampoline doesn't
use indirect jumps.
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20191122011515.255371-1-ast@kernel.org
Diffstat (limited to 'arch')
0 files changed, 0 insertions, 0 deletions