In recent articles, ForAllSecure has discussed how we were able to use our next-generation fuzzing solution, Mayhem, to discover previously unknown vulnerabilities in several open source projects, including the Netflix DIAL reference implementation, Das U-Boot, and more. In this post, we follow up on a prior article on using Mayhem to analyze stb and MATIO by reviewing three additional vulnerabilities found in another open source library. Before detailing these new vulnerabilities, we will examine some of the factors that can help identify code that is a good candidate for fuzzing.
When bug-hunting for fun, or when there's no predetermined target, taking advantage of that freedom by putting some thought into your targets can yield better results: more defects, found more quickly. When tasked with analyzing a specific project, it's often worthwhile to survey the codebase and its dependencies. Identifying the parts of the attack surface to prioritize makes the process more efficient and lets you find the low-hanging fruit more rapidly, which in turn allows fixes to happen concurrently with expanding the fuzzing campaign.
There are a few technical considerations and qualitative factors that may influence priorities. Let's discuss the questions you should ask to help assess which targets are most promising.
The first question to ask is whether you can build and run the target. This is fairly self-explanatory, but sometimes a non-trivial hurdle: dynamic analysis needs to be able to run the target! Although analyzing black-box binaries with Mayhem is possible, building the target from source allows compile-time instrumentation to be inserted for coverage gathering, which can lead to efficiency gains.
When looking at open source projects (especially C/C++ codebases), you often encounter a variety of build systems, some of which are easier to get working than others. If you were able to build and run the target but had to jump through hoops to do so, those same hoops may have dissuaded other researchers from proceeding further. This can indicate a target that has not been analyzed thoroughly, making it ripe for bug hunting.
Fuzzers are exceptional at finding issues in memory-unsafe languages such as C and C++, especially when coupled with tools like ASAN/MSAN/UBSAN. The range of security-relevant defects you can expect to find in memory-safe languages, such as Go or Rust, is smaller; however, fuzzing has still been effective at uncovering serious issues in Go and Rust projects. It is also fairly common for memory-safe languages to interface with C or C++ code via a Foreign Function Interface (FFI), or to contain security-relevant code in unsafe blocks, meaning memory issues can still arise.
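To make the memory-safety point concrete, here is a contrived example (not taken from any project mentioned in this post) of the kind of defect a fuzzer paired with AddressSanitizer surfaces immediately:

#include <stdlib.h>
#include <string.h>

// An off-by-one heap write: compiled with -fsanitize=address, this is
// reported as a heap-buffer-overflow the moment the bad write executes,
// which is exactly the class of defect fuzzers excel at reaching.
int main(void) {
    const char msg[] = "hello";
    size_t len = sizeof(msg);        // 6 bytes, including the NUL terminator
    char *buf = (char *)malloc(len);
    memcpy(buf, msg, len);           // fills the buffer exactly
    buf[len] = '!';                  // writes one byte past the allocation
    free(buf);
    return 0;
}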
To be compatible with grey-box coverage-guided fuzzers like Mayhem, a target must be able to accept input as a sequence of bytes and do something meaningful with it that affects its control flow. This byte sequence can come from a file, a network connection, the environment, etc. Understanding the different input sources and the amount of initialization / preconditions that must be satisfied before an input is accepted provides important insight into how the target will perform under fuzzing. In-process fuzzers such as LibFuzzer can simplify these issues if the target can be made compatible. Understanding what code is exposed to untrusted input is also important for evaluating the impact of any bugs found.
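As a sketch of what "compatible" means for an in-process fuzzer, a minimal LibFuzzer entry point is just a function that receives a byte buffer; target_parse below is a hypothetical stand-in for whatever library function consumes untrusted bytes in your target:

#include <stdint.h>
#include <stddef.h>

// Hypothetical stand-in for the library entry point under test.
extern "C" void target_parse(const uint8_t *data, size_t size);

// The LibFuzzer contract: consume the fuzzer-generated bytes and
// return 0 for inputs that do not crash.
extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size)
{
    target_parse(data, size);
    return 0;
}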
Due to the types of mutations performed by most grey-box fuzzers, densely packed binary formats tend to perform better than highly structured textual formats. Additional logic to improve the chances of generating an input that passes initial validation can help alleviate this, at the cost of additional effort. Projects such as libprotobuf-mutator can help map the fuzzer's randomly generated bytes to more structured input, or custom structure-aware mutators can be developed to improve efficiency. Input validation that is irrelevant to the bugs you are hunting, such as checksum verification or cryptographic operations, can hinder fuzzer progress as well; understanding where these issues may arise allows you to patch the target and avoid getting "stuck" behind these gates.
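One common way to handle such a gate is to compile it out of the build used for fuzzing. The sketch below is illustrative only; the function names and the FUZZING_BUILD macro are hypothetical, not from any particular project:

#include <stdint.h>
#include <stddef.h>

// Hypothetical checksum over the payload bytes.
static uint8_t sum_bytes(const uint8_t *data, size_t n) {
    uint8_t sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += data[i];
    return sum;
}

// Gate that normally rejects inputs whose trailing byte doesn't match the
// checksum of the payload; the fuzzing build skips it so mutated inputs
// can reach the parsing code behind it.
static bool checksum_ok(const uint8_t *data, size_t size) {
#ifdef FUZZING_BUILD
    (void)data; (void)size;
    return true;                     // accept every input while fuzzing
#else
    return size > 0 && data[size - 1] == sum_bytes(data, size - 1);
#endif
}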
Understanding the downstream users of a library can help prioritize fuzzing efforts. Bugs discovered in widely used code, or in code deployed in scenarios where patching is difficult or infrequent, usually have a higher impact. Due to the reusable nature of library code, a bug in a library can be critical, affecting a wide variety of users, compared to a bug in an individual application. Popular applications deployed in security-critical contexts are also of particular interest.
Many projects being fuzzed for the first time will manifest many defects very quickly. This is part of what makes fuzzing new targets so addictive! Projects under continuous fuzzing may still have had only a subset of their functionality tested; exploring new regions of these codebases usually uncovers issues quickly as well. Examining coverage reports of the corpora generated by previous fuzzing efforts for any major gaps in tested functionality can lead to an explosion of newly discovered defects. Analyzing results with tools such as bncov or other coverage-measuring tools can help pinpoint gaps in coverage. For more information on improving fuzz testing with coverage analysis tools, check out this blog by my colleague, Mark Griffin.
Does the target have a test suite? What coverage does it achieve? Are static analysis tools or linters used? Are these tools run continuously? Asking these questions and looking for gaps can help inform where bugs can be found and what types of bugs to expect. In addition to the target's documentation, unit tests are a good way to determine proper library usage.
As part of our efforts to integrate and analyze open source projects with Mayhem, we analyzed the open source image-parsing library FreeImage and found three previously unknown vulnerabilities in its PSD parser.
The impact of these vulnerabilities depends largely on how the host application uses the library. However, an application that uses an unpatched version of FreeImage to read PSD files from untrusted sources may allow an attacker to execute arbitrary code.
As a case study, let's evaluate FreeImage using the criteria discussed above.
From a glance at the code, documentation, and project page, we have found a good candidate to fuzz: the PSD parser inside FreeImage.
Following the same general approach for fuzzing libraries with Mayhem (described in detail here), and after consulting the documentation and the library usage shown in the test suite, we came up with the following LibFuzzer target:
#include "FreeImage.h"
extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size)
{
FIMEMORY *mem;
FIBITMAP *dib;
int length, height, bpp, y;
FreeImage_Initialise(true);
mem = FreeImage_OpenMemory(const_cast(data), size);
if (!mem)
return 0;
dib = FreeImage_LoadFromMemory(FIF_PSD, mem, PSD_DEFAULT);
if(!dib)
return 0;
bpp = FreeImage_GetBPP(dib);
length = FreeImage_GetWidth(dib);
height = FreeImage_GetHeight(dib);
for (y = 0; y < height; y++)
FreeImage_GetScanLine(dib, y);
FreeImage_Unload(dib);
FreeImage_CloseMemory(mem);
FreeImage_DeInitialise();
return 0;
}
We can compile and link this against FreeImage 3.18.0, the latest version at the time of this post, and analyze the resulting package with Mayhem.
ForAllSecure attempted to reach out to the maintainers of FreeImage. Unfortunately, they did not respond to our attempts to disclose these vulnerabilities via email, nor to our attempts to reach them via their mailing list or SourceForge issue tracker. We are disclosing details of these vulnerabilities now in accordance with our vulnerability disclosure policy. Outlined below is our disclosure timeline:
Patches for the vulnerabilities are available here.
When considering the question "what should I fuzz?", or the related question "what should I fuzz first?", doing some upfront analysis can help maximize your efficiency and results. Understanding the attack surface and security posture of a potential target helps inform decisions about what to prioritize when surveying targets to fuzz. Once a target is appropriately packaged, fuzzing is a mostly automated process in which the manual effort is exerted up front, meaning you can continue to explore new targets and code while the most promising targets are being analyzed by the fuzzer. By making good judgments and prioritizing effectively, you can find defects more quickly and parallelize the work of patching them with expanding coverage (informed by results from prior runs). The upfront effort needed to pick a promising target is rather minimal, and it can pay dividends over time.
Have questions? I’d be happy to answer them personally. I'll be at Shmoocon with my colleague Mark Griffin, who will have a 50-min speaking session. We’d love to meet. Find us at Mark’s session on “Knowing the UnFuzzed and Finding Bugs with Coverage Analysis”, as a part of the Build It! track, or at the ForAllSecure booth. See you there!