From 1a8d7143e69e6536a0b1924313e3246e26086a47 Mon Sep 17 00:00:00 2001
From: Werner Lemberg
Date: Fri, 3 Feb 2017 16:44:31 +0100
Subject: More GSoC ideas from Kostya and Alexei.

---
 gsoc.html | 207 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 200 insertions(+), 7 deletions(-)

(limited to 'gsoc.html')

diff --git a/gsoc.html b/gsoc.html
index 1423d92..0379caf 100644
--- a/gsoc.html
+++ b/gsoc.html
@@ -63,14 +63,178 @@
Improve fuzzing for FreeType
-

Description will follow.

- -

Difficulty: medium. Requirements: - C, C++, Unix build tools. Potential mentors: - Kostya Serebryany (Google), Werner Lemberg - (FreeType).

+

There are at least two fuzzers that constantly test + FreeType. One of them + is OSS-Fuzz, + and the GSoC tasks presented here aim to + increase the efficiency of this fuzzer bot.

+ +
+
Split the existing fuzz target into many
+
+

Right now we + have src/tools/ftfuzzer/ftfuzzer.cc + that contains

+ +
+extern "C"
+int LLVMFuzzerTestOneInput(const uint8_t* data,
+                           size_t size)
+{
+  ParseWhateverFontFileIGet(data, size);
+  return 0;
+}
+ +

Instead of this monolithic approach we should + split fuzzing into separate files (fuzz targets) + for every font format, for example

+ +
+src/tools/ftfuzzer/cff_fuzz.cc
+src/tools/ftfuzzer/cff2_fuzz.cc
+src/tools/ftfuzzer/cid_fuzz.cc
+ +

and every such file will have

+ +
+extern "C"
+int LLVMFuzzerTestOneInput(const uint8_t* data,
+                           size_t size)
+{
+  ParseOnlyMyFormatAndRejectAnythingElseQuickly(data, size);
+  return 0;
+}
+ +

Ideally, the build rule for cff_fuzz + will not link anything that CFF does not + need.
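A per-format target along these lines could be sketched as follows. This is only an illustration under stated assumptions: it handles OpenType-wrapped CFF fonts, which start with the sfnt version tag 'OTTO', and the helper names `LooksLikeOpenTypeCFF` and `ParseCFF` are hypothetical placeholders, not FreeType functions.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper: true if the buffer starts with the 'OTTO' tag
// used by OpenType fonts that carry CFF outlines.
static bool LooksLikeOpenTypeCFF(const uint8_t* data, size_t size)
{
  static const char kTag[4] = { 'O', 'T', 'T', 'O' };

  return size >= sizeof(kTag) &&
         std::memcmp(data, kTag, sizeof(kTag)) == 0;
}

// Hypothetical stand-in for the real work; an actual target would hand
// the buffer to FreeType here and exercise the CFF driver.
static int ParseCFF(const uint8_t* data, size_t size)
{
  (void)data;
  (void)size;
  return 0;
}

extern "C"
int LLVMFuzzerTestOneInput(const uint8_t* data,
                           size_t size)
{
  // Reject anything that is not our format before doing expensive work.
  if (!LooksLikeOpenTypeCFF(data, size))
    return 0;

  ParseCFF(data, size);
  return 0;
}
```

The cheap signature check keeps mutations that drift into other formats from consuming parsing time in this target.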

+ +

Such a split will make fuzzing more efficient for + several reasons.

+ +
    +
  • Genetic mutations will not spend time crossing + over files of different formats (e.g., trying to + add BDF genes to a Type 1 font).
  • +
  • Data-flow guided mutations will not try to + transform, say, a CID font file into a PCF font + file.
  • +
  • Some of the fuzzer's internal algorithms that + are linear in the code size will run faster.
  • +
  • Slow inputs that currently make fuzzing + inefficient will slow down only some of the targets, + not all of them.
  • +
+ +

The changes will need to be reflected in the + OSS-Fuzz + repository.

+ +

Difficulty: medium. Requirements: + C, C++, Unix build tools. Potential + mentors: Kostya Serebryany (Google), Werner + Lemberg (FreeType).

+
+
+ +
+
Prepare a public corpus of inputs
+
+

The quality of the ‘seed corpus’ is + the key to fuzzing efficiency. We should set up a + repository (e.g., on GitHub) that would hold

+ +
    +
  • small but representative sample font + files for every relevant font format (only with + permissive licenses!), and
  • +
  • fuzzed mutations of the above (this part will + need to be periodically updated as fuzzing finds + more inputs).
  • +
+ +

This corpus will be used in two ways, namely

+ +
    +
  • to seed the fuzzing process, and
  • +
  • as a regression suite (see below).
  • +
+ +

Difficulty: medium. Requirements: + Unix build tools, experience with + scripting. Potential mentors: Kostya + Serebryany (Google), Werner Lemberg + (FreeType).

+
+
+ +
+
Extend the FreeType testing process to use the + above corpus
+
+

The public corpus will allow us to use the fuzz + targets as a regression test suite. We'll need to + set up continuous integration testing (not + fuzzing) to run the fuzz targets on the corpus. + One way to achieve this is to have a GitHub mirror + of FreeType and set up Travis (or whatever other + CI is integrated with GitHub).
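Running a fuzz target over a fixed corpus without a fuzzing engine could look roughly like the sketch below. It assumes a C++17 compiler for `std::filesystem`; `RunCorpus` and `FuzzTarget` are hypothetical names introduced for illustration, and the target is passed in as a function pointer so that the same driver can serve every per-format target.

```cpp
#include <cstddef>
#include <cstdint>
#include <filesystem>
#include <fstream>
#include <iterator>
#include <vector>

// Same signature as LLVMFuzzerTestOneInput.
using FuzzTarget = int (*)(const uint8_t* data, size_t size);

// Feeds every regular file below `corpus_dir` to `target` exactly once
// and returns the number of files processed.  No mutation takes place;
// a crash or sanitizer report on any file is a regression.
static size_t RunCorpus(const std::filesystem::path& corpus_dir,
                        FuzzTarget target)
{
  size_t count = 0;

  for (const auto& entry :
       std::filesystem::recursive_directory_iterator(corpus_dir))
  {
    if (!entry.is_regular_file())
      continue;

    // Read the whole file into memory as raw bytes.
    std::ifstream in(entry.path(), std::ios::binary);
    std::vector<uint8_t> bytes((std::istreambuf_iterator<char>(in)),
                               std::istreambuf_iterator<char>());

    target(bytes.data(), bytes.size());
    ++count;
  }

  return count;
}
```

A CI job would then build each target with sanitizers enabled, call something like `RunCorpus("corpus/cff", LLVMFuzzerTestOneInput)` from a small `main`, and fail the build if the process aborts. The directory name here is only an example.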

+ +

Difficulty: medium. Requirements: + Unix build tools, experience with + scripting. Potential mentors: Kostya + Serebryany (Google), Werner Lemberg + (FreeType).

+
+
+ +
+
Analyze code coverage
+
+

Once the fuzz targets are split, the public + corpus is prepared, and the OSS-Fuzz integration + is updated, we'll need to analyze + the code + coverage provided by OSS-Fuzz to see what code + is left untested.

+ +

Then either the fuzz targets or the corpus (or + both) will need to be extended to cover that code. + The ideal end state is to have 100% line coverage + (currently, we have ~67% for the existing fuzz + target).

+ +

Difficulty: medium. Requirements: + C, C++, Unix build tools, experience with + scripting. Potential mentors: Kostya + Serebryany (Google), Werner Lemberg + (FreeType).

+
+
+ +
+
Prepare fuzzing dictionaries for the font formats + where relevant
+
+

In some cases a simple dictionary (list of tokens + used by the file format) may + have a dramatic + effect on fuzzing.
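As a rough illustration of what preparing such a dictionary involves, the sketch below formats tokens as quoted entries, one per line, in the syntax used by libFuzzer and AFL dictionaries: printable ASCII is kept, backslash and double quote are backslash-escaped, and every other byte is written as `\xNN`. The helper name `DictEntry` is hypothetical, and the choice of tokens is left to whoever studies each font format.

```cpp
#include <cstdio>
#include <string>

// Formats one token as a quoted dictionary entry.
static std::string DictEntry(const std::string& token)
{
  std::string out = "\"";

  for (unsigned char c : token)
  {
    if (c == '\\' || c == '"')
    {
      // These two characters must be backslash-escaped.
      out += '\\';
      out += static_cast<char>(c);
    }
    else if (c >= 0x20 && c < 0x7F)
    {
      // Printable ASCII is written verbatim.
      out += static_cast<char>(c);
    }
    else
    {
      // Everything else becomes a two-digit hex escape.
      char buf[5];
      std::snprintf(buf, sizeof(buf), "\\x%02X",
                    static_cast<unsigned>(c));
      out += buf;
    }
  }

  out += '"';
  return out;
}
```

For example, the TrueType table tag `glyf` becomes `"glyf"`, while the four-byte TrueType sfnt version 0x00010000 becomes `"\x00\x01\x00\x00"`.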

+ +

Difficulty: easy. Requirements: + Unix build tools, experience with + scripting. Potential mentors: Kostya + Serebryany (Google), Werner Lemberg + (FreeType).

+
+
+ +

While the subtasks themselves are quite isolated, + they have to be handled in order from top to bottom, as + shown in the descriptions above. Consequently, the + project is also suited to a group of students that + work closely together as a team.

+
Develop a test framework for checking FreeType's rendering output
@@ -103,6 +267,35 @@
+
+
Improve the ‘ftinspect’ demo program
+
+

Right now, FreeType comes with a suite of small + graphic tools to test the library, most notably + ‘ftview’ and ‘ftgrid’. The + graphics library they use, while working more or less, is + very archaic and lacks the comfort that modern GUIs + provide.

+ +

To improve this, a new demo program called + ‘ftinspect’ was started, based on the Qt + GUI toolkit. However, the development is currently + stalled, mainly for lack of time.

+ +

The idea is to finish ftinspect, + handling all aspects of the other demo + programs. Currently, it only provides the + functionality of ‘ftgrid’.

+ +

If the student prefers, the Qt toolkit could be + replaced with GTK.

+ +

Difficulty: medium. Requirements: + C, C++, Qt, Unix build tools. Potential + mentor: Werner Lemberg (FreeType).

+
+
+

Do you have more ideas? Please write to our mailing list so that we can discuss your suggestions,
@@ -112,7 +305,7 @@

-

Last update: 2-Feb-2017

+

Last update: 3-Feb-2017
