|
| 1 | +The fuzzer stores two things: explored features and the corpus. |
| 2 | + |
| 3 | +Explored features are kept in a sorted buffer of u32 values. An explored feature |
| 4 | +is anything "interesting" that happened in the code when we ran it with some |
| 5 | +input and *it did not crash*. Different "interesting things" should correspond |
| 6 | +to different u32 values, but collisions are never 100% avoidable. Explored |
| 7 | +features are not shared among different workers. |
| 8 | + |
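A minimal sketch of the sorted-buffer idea, written in Python for illustration only. The function names `new_features` and `merge_features` are hypothetical, and a plain list of integers stands in for the u32 buffer; the actual implementation may differ.

```python
from bisect import bisect_left

def new_features(explored, seen):
    """Return, in sorted order, the features in `seen` not yet in `explored`.

    `explored` must be a sorted list of feature ids (stand-in for the u32 buffer).
    """
    fresh = []
    for feature in sorted(set(seen)):
        i = bisect_left(explored, feature)  # binary search in the sorted buffer
        if i == len(explored) or explored[i] != feature:
            fresh.append(feature)
    return fresh

def merge_features(explored, fresh):
    """Return a new sorted buffer containing both old and new features."""
    return sorted(explored + fresh)

explored = [3, 10, 42]
fresh = new_features(explored, [42, 7, 7, 100])
explored = merge_features(explored, fresh)
```

Keeping the buffer sorted makes the membership check a binary search, at the cost of a re-sort or merge whenever new features are found.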
| 9 | +Currently tracked "interesting things" are: |
| 10 | + |
| 11 | +* taken edges in the CFG |
| 12 | +* addresses of cmp instructions executed |
| 13 | +* addresses of switch statements executed |
| 14 | +* indirect calls |
| 15 | + |
| 16 | +The fuzzer tries to maximize the number of unique explored features over all |
| 17 | +inputs. |
| 18 | + |
| 19 | +The corpus is a set of inputs, where each input is an array of bytes. The |
| 20 | +initial corpus is either provided by the user or generated randomly. The corpus |
| 21 | +is stored as two arrays that are shared among all workers. One array stores the |
| 22 | +inputs, densely packed one after another. The other stores some metadata and |
| 23 | +the index at which each input ends. Whenever an input explores a new feature, |
| 24 | +it is added to the corpus. The corpus never shrinks; it is only appended to. |
| 25 | + |
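The two-array layout can be sketched as follows, in Python for illustration only. The `Corpus` class and its method names are hypothetical; the per-input metadata and the sharing between workers mentioned above are left out, so only the packed bytes and the end offsets are modeled.

```python
class Corpus:
    """Append-only corpus: packed input bytes plus one end offset per input."""

    def __init__(self):
        self.data = bytearray()  # all inputs, stored back to back
        self.ends = []           # ends[i] is the offset one past input i

    def append(self, inp: bytes) -> int:
        """Add an input and return its index. Nothing is ever removed."""
        self.data += inp
        self.ends.append(len(self.data))
        return len(self.ends) - 1

    def get(self, i: int) -> bytes:
        """Recover input i: it starts where input i - 1 ended."""
        start = self.ends[i - 1] if i > 0 else 0
        return bytes(self.data[start:self.ends[i]])

    def __len__(self) -> int:
        return len(self.ends)

corpus = Corpus()
corpus.append(b"abc")
corpus.append(b"")
corpus.append(b"de")
```

Storing only end offsets is enough because each input begins where the previous one ends, so an empty input simply repeats the previous offset.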
| 26 | +All that the fuzzer does is pick a random input from the corpus, mutate it, and |
| 27 | +check whether it hits any new features. If so, it adds the mutated input to the |
| 28 | +corpus and the new features to the explored features. |
| 29 | + |
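The loop described above can be sketched roughly as below, in Python for illustration only. Everything here is an assumption rather than the real implementation: `toy_target` is a hypothetical stand-in for instrumented code that reports feature ids, `mutate` implements just two simple mutations, a Python set replaces the sorted buffer, and crash handling is omitted.

```python
import random

def toy_target(data: bytes) -> set:
    """Hypothetical instrumented target: returns the feature ids it hit."""
    features = {0}
    if len(data) > 0 and data[0] == ord("F"):
        features.add(1)
        if len(data) > 1 and data[1] == ord("U"):
            features.add(2)
    return features

def mutate(data: bytes, rng: random.Random) -> bytes:
    """Overwrite one random byte, or insert one random byte."""
    buf = bytearray(data)
    if buf and rng.random() < 0.7:
        buf[rng.randrange(len(buf))] = rng.randrange(256)
    else:
        buf.insert(rng.randint(0, len(buf)), rng.randrange(256))
    return bytes(buf)

def fuzz(target, seeds, iterations, seed=0):
    """Run the pick / mutate / check loop. `seeds` must be non-empty."""
    rng = random.Random(seed)
    corpus = list(seeds)
    explored = set()
    for s in corpus:
        explored |= target(s)
    for _ in range(iterations):
        candidate = mutate(rng.choice(corpus), rng)  # pick a random input, mutate it
        fresh = target(candidate) - explored         # features not seen before
        if fresh:
            corpus.append(candidate)                 # the corpus only grows
            explored |= fresh
    return corpus, explored

corpus, explored = fuzz(toy_target, [b""], iterations=2000, seed=1)
```

Because an input is kept only when it contributes at least one unseen feature, the corpus in this sketch can never grow faster than the set of explored features.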
| 30 | +Possible improvements: |
| 31 | + |
| 32 | +* Prioritize mutating inputs that hit rare features |
| 33 | +* Table of recently compared values used in mutations |
| 34 | +* In-place mutation to avoid copying? |
| 35 | +* Implement more mutations |
| 36 | +* Multithreading |
| 37 | +* Maybe use a hash table for explored features instead of a sorted array |
| 38 | + |