author | Kostya Serebryany <kcc@google.com> | 2015-04-01 21:33:20 +0000 |
---|---|---|
committer | Kostya Serebryany <kcc@google.com> | 2015-04-01 21:33:20 +0000 |
commit | 01055ec7e316e4b6e1b37e9e165b66d07716830c (patch) | |
tree | 3578c38ff426bcb03896514b917230811c4fe996 /docs | |
parent | a8d688454d2a7cf1e38574b836183579b01476ff (diff) |
[fuzzer] document the -tokens flag. Also change the diagnostic output
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233842 91177308-0d34-0410-b5e6-96231b3b80d8
Diffstat (limited to 'docs')
-rw-r--r-- | docs/LibFuzzer.rst | 22 |
1 files changed, 22 insertions, 0 deletions
diff --git a/docs/LibFuzzer.rst b/docs/LibFuzzer.rst
index 354e871903..684d9def78 100644
--- a/docs/LibFuzzer.rst
+++ b/docs/LibFuzzer.rst
@@ -163,6 +163,27 @@ which will cause the fuzzer to exit on the first new synthesised input::
 
   N=100; M=4; ./pcre_fuzzer ./CORPUS -jobs=$N -workers=$M -exit_on_first=1
 
+Advanced features
+=================
+
+Tokens
+------
+
+By default, the fuzzer is not aware of complexities of the input language
+and when fuzzing e.g. a C++ parser it will mostly stress the lexer.
+It is very hard for the fuzzer to come up with something like ``reinterpret_cast<int>``
+from a test corpus that doesn't have it.
+See a detailed discussion of this topic at
+http://lcamtuf.blogspot.com/2015/01/afl-fuzz-making-up-grammar-with.html.
+
+lib/Fuzzer implements a simple technique that allows to fuzz input languages with
+long tokens. All you need is to prepare a text file containing up to 253 tokens, one token per line,
+and pass it to the fuzzer as ``-tokens=TOKENS_FILE.txt``.
+Three implicit tokens are added: ``" "``, ``"\t"``, and ``"\n"``.
+The fuzzer itself will still be mutating a string of bytes
+but before passing this input to the target library it will replace every byte ``b`` with the ``b``-th token.
+If there are less than ``b`` tokens, a space will be added instead.
+
 Fuzzing components of LLVM
 ==========================
 
@@ -188,6 +209,7 @@ clang-fuzzer
 ------------
 
 The default behavior is very similar to ``clang-format-fuzzer``.
+Clang can also be fuzzed with Tokens_ using ``-tokens=$LLVM/lib/Fuzzer/cxx_fuzzer_tokens.txt`` option.
 
 Tracking bug: https://llvm.org/bugs/show_bug.cgi?id=23057
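
The byte-to-token substitution described in the added "Tokens" section can be illustrated with a minimal Python sketch. This is not the lib/Fuzzer implementation: the function names `load_tokens` and `substitute_tokens` are hypothetical, and two details the text leaves open are assumed here, namely that token indices are zero-based and that the three implicit tokens are appended after the tokens read from the file.

```python
from typing import List

# The three implicit tokens named in the documentation.
IMPLICIT_TOKENS = [b" ", b"\t", b"\n"]
MAX_FILE_TOKENS = 253  # limit stated in the documentation


def load_tokens(text: str) -> List[bytes]:
    """Parse a tokens file (one token per line) into a token table.

    Assumption: empty lines are skipped and the implicit tokens are
    appended after the file's tokens.
    """
    tokens = [line.encode() for line in text.splitlines() if line]
    if len(tokens) > MAX_FILE_TOKENS:
        raise ValueError("at most %d tokens are allowed" % MAX_FILE_TOKENS)
    return tokens + IMPLICIT_TOKENS


def substitute_tokens(data: bytes, tokens: List[bytes]) -> bytes:
    """Replace every byte b with the b-th token (zero-based, assumed).

    Bytes with no corresponding token become a single space.
    """
    out = bytearray()
    for b in data:
        out += tokens[b] if b < len(tokens) else b" "
    return bytes(out)


if __name__ == "__main__":
    table = load_tokens("reinterpret_cast\n<\nint\n>\n")
    # Bytes 0..3 select the four file tokens; byte 200 has no token.
    print(substitute_tokens(bytes([0, 1, 2, 3, 200]), table))
```

With the 253-token limit plus the three implicit tokens, the table has at most 256 entries, so every possible byte value can map to a token; the space fallback only applies when the file holds fewer tokens than that.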