Assignment: FUZZ

Changelog:

  • 1 April 2025: add instructions regarding using AFL++, and AFL++ that is preinstalled on portal; update fbsd-indent.tar.gz to compile without -fcommon
  • 6 April 2025: adjust wording re: advice to run fuzzing with memory limit
  • 8 April 2025: adjust memory limit in examples to 128MB; in my experience 32MB is plenty for fuzzing, and I did not try compiling with a limit (I’m only concerned about running out of memory when running randomized tests), but it seems that 32MB is too low for compiling with afl-cc
  • 9 April 2025: correct afplusplus typo to aflplusplus
  • 9 April 2025: add note on fuzzing terminating possibly meaning the memory limit is too low; give command without systemd stuff or -m option for afl-tmin example command
  • 9 April 2025: add note on real-world fuzzing memory limits

In this assignment, you will use a coverage-guided “whitebox fuzzing” tool, in combination with a generic memory error detector, to find memory errors in software.

Your Task

  1. Download

  2. Get access to a copy of AFL++ as described below.

  3. Build indent with AddressSanitizer and run AFL++ to test indent long enough that it finds at least three distinct test cases that crash it, keeping a copy of the final status screen (e.g., as a screenshot) and a copy of the generated failing test cases. See the instructions below for how to do this.

  4. Run AFL++’s minimizer on one of the crashing test cases you found. See the instructions below for how to do this.

  5. Answer the following questions in a file called answers.txt:

    1. Where did the crash occur in the first crashing test case? If the cause was accessing an array out of bounds, what variable did the program neglect to keep in bounds? If the cause was something else, explain briefly.

    2. Examine some other crashing test cases and try running them with the program. Do they represent different bugs? Explain briefly how you determined this.

  6. Submit the following files:

    • a copy of answers.txt

    • a copy of a screenshot or text copy of the final AFL status screen called status.png or status.jpg or status.txt or similar

    • a copy of one of the test cases as testcase.dat

    • a minimized copy of one of the failing test cases as testcasemin.dat

    • a copy of a test case you used for question 2 in answers.txt as testcase2.dat

General resources

  1. Documentation for the two tools we will be using:

    • AddressSanitizer: a memory error detector integrated with recent versions of the C compilers GCC and Clang.
    • AFL++: a whitebox fuzzing tool derived from american fuzzy lop, a whitebox fuzzing tool written by Michał Zalewski.
  2. Neystadt, “Automated Penetration Testing with White-Box Fuzzing”.

Accessing AFL++

Via a module/Apptainer on portal

  1. If you work on portal, AFL++ is installed there via Apptainer (which provides a Docker-like container).

    To use it, run

    module load apptainer aflplusplus
    
  2. Then, to run an afl-XXX command, run a command like

    apptainer run $CONTAINERDIR/aflplusplus-4.32a.sif 
    

    to get a prompt that looks like

    Apptainer>
    

    at which you can run commands like afl-fuzz --version.

    (Later, you’ll replace afl-fuzz --version with a more useful command, as described below.)

    This prompt runs inside the container which will persist until you exit.

  3. You can also specify a memory limit for the container

    apptainer run --memory 128M --memory-swap 128M $CONTAINERDIR/aflplusplus-4.32a.sif 
    

    I would strongly recommend setting a memory limit (though perhaps larger than 128MB) when doing fuzzing.

  4. Alternately, rather than getting a separate prompt, you can run a command like afl-fuzz --version in one line using a command like:

    apptainer run $CONTAINERDIR/aflplusplus-4.32a.sif afl-fuzz --version
    

Using your own copy of AFL++

  1. Alternately, you can:

    • compile AFL++ yourself on portal (or another Linux system), as described below

    • install AFL++ on a Linux system via the package manager on many Linux distributions

    • use AFL++’s supplied Docker image

  2. If you install AFL++ from source, when we ask you to run a command like afl-fuzz or afl-cc, you’ll need to supply the full path to that binary. For example, if you have it installed in /u/mst3k/AFLplusplus, instead of typing

    afl-fuzz --version
    

    you’d run

    /u/mst3k/AFLplusplus/afl-fuzz --version
    
  3. Sometimes we’ll pass options to AFL using environment variables. For example

    AFL_SKIP_CPUFREQ=1 afl-fuzz --version
    

    runs afl-fuzz with the AFL_SKIP_CPUFREQ option.

    When running this from your own installation, you’ll still need to replace the command with the path to the executable:

    AFL_SKIP_CPUFREQ=1 /u/mst3k/AFLplusplus/afl-fuzz --version
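
The VAR=value prefix sets the variable only for that one command. A minimal sketch of the behaviour, using a made-up variable FOO and sh -c standing in for afl-fuzz (neither is part of the assignment):

```shell
# FOO is a hypothetical variable used only for illustration.
unset FOO

# Prefix form: FOO is set in the environment of this one command only.
FOO=1 sh -c 'echo "inside command: ${FOO:-unset}"'

# The calling shell itself is unaffected.
echo "after command: ${FOO:-unset}"

# The env utility has the same effect, and is handy when another program
# (rather than your shell) is the one launching the command.
env FOO=1 sh -c 'echo "via env: ${FOO:-unset}"'
```

The first and third commands report 1, while the second reports unset.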
    

Building indent with AFL and AddressSanitizer support

  1. Unpack fbsd-indent.tar.gz using a command like:

    tar -zxvf fbsd-indent.tar.gz

    Then in the fbsd-indent directory compile indent using AFL with ASAN enabled.

    If compiling with GCC, you can use something like:

    AFL_USE_ASAN=1 afl-cc *.c -o indent
    

    In this command:

    • The AFL_USE_ASAN=1 option makes this program use AddressSanitizer, a memory error detector that adds checks for out-of-bounds errors and use-after-free accesses (at a significant performance cost).

    • Setting this option supplies the -fsanitize=address option, which GCC uses to enable sanitization, and tells AFL to adjust how it instruments the application accordingly.

    Alternately, you may also build indent without AddressSanitizer support to perform the fuzzing, then rebuild it with AddressSanitizer to diagnose the memory errors found by the fuzzing. See the instructions under the Hints below. This option makes fuzzing faster, but identifies memory errors less consistently.

  2. Verify that indent works by running it on a simple C file. For example, you can try running it on the io.c that comes with indent using:

    ./indent <io.c >io.c.indented
    

    then compare the two files (io.c and io.c.indented) in a text editor.

    Note that if indent is given a filename as an argument it modifies that file.

Run the AFL fuzzer on indent

  1. AFL++ will take an initial test case and make random changes to it. It will combine different random changes based on which changes seem to cause the program to execute more different paths.

  2. You need to first supply one or more initial test cases. Create a directory called testcases. Inside it create one or more small C files to use to test indent. AFL++ will be able to find more useful test cases if your C files include things that a C indenting program will need to handle. Otherwise, the fuzzer will need to “discover” what indent will recognize by testing randomly.

  3. Since some inputs to the program may cause it to use excessive memory, we want to prevent it from causing the system to run out of memory.

    Ideally, we’d use afl-fuzz’s -m option to enforce this. This sets a resource limit on virtual memory size, which measures the number of addresses a program has asked the OS to set up, regardless of whether those addresses are valid. Unfortunately, AddressSanitizer relies on setting up a large range of addresses without allocating storage for them, and so it is not really compatible with this setting.

    (The memory AddressSanitizer reserves is used to store a table with one entry for every 8 bytes of memory, indicating whether the memory is valid to access. AddressSanitizer reserves space for all possible addresses in this table, but leaves the parts of the table that correspond to invalid addresses unallocated rather than actually requiring the operating system to allocate that space.)

    Instead, we’ll rely on lower-level support for memory limits. If you’re using the Apptainer-based installation of AFL++ on portal, use something like:

    apptainer run --memory 128M --memory-swap 128M $CONTAINERDIR/aflplusplus-4.32a.sif 
    

    to set a 128 megabyte limit (you can adjust the limit higher if needed).

    If you’re using your own installation of AFL++, you can use systemd-run to run a program with a memory limit if you are on a systemd-based Linux system (such as portal).

    For example,

    systemd-run --pty --wait  -p MemoryHigh=128M -p MemoryMax=129M -p MemorySwapMax=0M --user --same-dir COMMAND
    

    will run COMMAND such that it cannot use more than 129 megabytes of memory (or any swap space), and the OS will start trying to free its memory if it exceeds 128 megabytes. If the process exceeds the 129 megabyte limit, it will be terminated abnormally.

    To pass environment variables to a command this way, we can use the env utility:

    systemd-run --pty --wait  -p MemoryHigh=128M -p MemoryMax=129M -p MemorySwapMax=0M --user --same-dir env FOO=1 COMMAND
    

    is equivalent to

    FOO=1 COMMAND
    

    but it runs it with a memory limit.

  4. Now that you have the initial input, run afl-fuzz on indent with a command like

    AFL_SKIP_CPUFREQ=1 afl-fuzz -i testcases -o findings ./indent
    

    using apptainer or systemd-run to set an appropriate memory limit as described above. (So the actual command might look like

    AFL_SKIP_CPUFREQ=1 apptainer run --memory 32M --memory-swap 32M $CONTAINERDIR/aflplusplus-4.32a.sif afl-fuzz ...
    

    or

    systemd-run --pty --wait  -p MemoryHigh=32M -p MemoryMax=33M -p MemorySwapMax=0M --user --same-dir env AFL_SKIP_CPUFREQ=1 /path/to/afl-fuzz ...
    

    .)

    The options here:

    • AFL_SKIP_CPUFREQ=1 tells AFL not to complain that our CPU has frequency scaling. This makes fuzzing less fast/reliable (because detecting when programs hang is less consistent), so if you were doing this for a production workload, you would want to address this in another way.

    • The -o findings option specifies the directory to write output to and to use to store temporary files.

    • The -i testcases option specifies where to find the initial test cases.

  5. As the fuzzer runs, you will see a status screen which is described here.

  6. Wait until the fuzzer reports finding at least three unique crashes (and preferably at least five or six). In my testing, this took substantially less than 15 minutes. If it does not, then you should consider whether your initial cases are too simple (don’t include enough C syntax that indent will need to process) or too long (take up many kilobytes, making each test run slower and giving the tool more “boring” variations to try). If you still have difficulty after adjusting your test cases, please contact the instructor.

  7. If the fuzzer seems to terminate early, possibly your memory limit is too low; try rerunning with a higher limit.

  8. After the fuzzer generates some crashing test cases, stop it with control-C. If we were doing this for real, we’d keep running this until AFL completed a complete “cycle”, but we will not make you wait that long.

  9. Make a copy of the status screen (screenshot or text copy) and save it to a file for later.
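
For the initial test cases in step 2, one small seed could be created as shown here. This is only a sketch: the file name seed1.c and its contents are made up for illustration, and any short C file that exercises comments, preprocessor lines, declarations, and control flow would serve the same purpose.

```shell
mkdir -p testcases

# Quoting 'EOF' keeps the shell from expanding anything inside the file body.
cat > testcases/seed1.c <<'EOF'
/* block comment */
#include <stdio.h>
#define MAX(a, b) ((a) > (b) ? (a) : (b))

struct point { int x, y; };

int main(int argc, char **argv)
{
    int i; // line comment
    for (i = 0; i < argc; i++) {
        switch (i) {
        case 0: puts("zero"); break;
        default: printf("%s\n", argv[i]); break;
        }
    }
    return MAX(argc, 1) > 0 ? 0 : 1;
}
EOF

# Seeds should stay small; this prints the size in bytes.
wc -c < testcases/seed1.c
```

Keeping the seed to a few hundred bytes leaves the fuzzer fewer “boring” variations to try on each run.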

Finding Crashing Test Case

  1. Look in the findings directory that was generated by running AFL. Within this folder there are directories called crashes and hangs (AFL++ typically places them inside a subdirectory named default, as in findings/default/crashes). These contain the test cases that match the unique crashes and/or hangs reported in the status screen.

  2. View one of the crashing test cases. Note that the test case may contain some non-ASCII characters, so you may wish to check using a hex editor or od -c file rather than just viewing the test case in a text editor.

  3. Make a copy of the crashing test case and try running the indent program on it with ./indent <testcase.dat. You should notice a crash. Most likely this crash will be from a memory error. Since we built our program with AddressSanitizer, rather than being a normal segfault, the memory error will have been caught by extra code added by AddressSanitizer. See below under “Hints” for how to interpret this output.
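
When looking at several crashing test cases, it can help to group them by the one-line SUMMARY that AddressSanitizer prints at the end of each report. The following is a rough sketch, assuming the crash files live in findings/default/crashes (where recent AFL++ versions usually put them) and that ./indent was built with AddressSanitizer. The function name summarize_crashes is made up for this example.

```shell
# summarize_crashes BINARY CRASH_DIR
# Feeds every AFL crash file in CRASH_DIR to BINARY on stdin and counts how
# often each AddressSanitizer SUMMARY line appears.
summarize_crashes() {
    bin=$1
    dir=$2
    for f in "$dir"/id*; do
        [ -f "$f" ] || continue      # skip if the glob matched nothing
        # ASan reports go to stderr; discard the program's normal output.
        "$bin" < "$f" 2>&1 >/dev/null | grep '^SUMMARY:' || true
    done | sort | uniq -c | sort -rn
}

# Only run if the assumed paths exist.
if [ -x ./indent ] && [ -d findings/default/crashes ]; then
    summarize_crashes ./indent findings/default/crashes
fi
```

Test cases that share a SUMMARY line most likely hit the same faulty access, but test cases with different SUMMARY lines can still share a root cause, so the grouping is a starting point rather than an answer.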

Minimizing the Test Case

  1. The utility afl-tmin that comes with AFL will attempt to simplify a test case. It will try to “fuzz” the given test case slightly without changing what path it takes through the program in order to make it shorter. Run this utility with

    afl-tmin -i input-file -o output-file ./indent
    

    (in the container if on portal) to produce a minimized version of the testcase in input-file in output-file.

  2. Verify that the minimized test case produces a similar crash.

  3. Examine the program around where AddressSanitizer places the crash.

Hints and Debugging and Alternatives

Understanding AddressSanitizer output

  1. AddressSanitizer output includes the following information:

    • A stack trace indicating where the memory error occurred. This is the location of the bad read or write, which may not be where the bug needs to be fixed in the program.

    • If the memory error was because of trying to access the heap, the location of the code that most recently freed or malloc’d something near the location that was accessed.

    • If the memory error was because of trying to access the stack or global data, the variables that are closest in memory to the location that was accessed.

    • “Shadow bytes”, which show AddressSanitizer’s internal data structure for keeping track of the state of memory around the accessed address. This has one byte of data for every eight bytes of program memory, indicating whether that part of program memory is valid or invalid and, if it is invalid, why it is invalid. In this data structure “red zones” are memory regions allocated around objects to catch accesses just outside of an object.
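
The addresses shown next to the shadow bytes follow from simple arithmetic. As a sketch, assuming the usual x86-64 Linux layout (one shadow byte per 8 bytes of program memory, with the shadow region starting at offset 0x7fff8000; other platforms use different offsets), the shadow address for a program address can be computed as below. The function name shadow_addr and the sample address are illustrative only.

```shell
# shadow_addr ADDRESS
# Prints the address of the shadow byte describing ADDRESS, assuming
# shadow = (address >> 3) + 0x7fff8000 (typical for x86-64 Linux).
shadow_addr() {
    printf '0x%x\n' $(( ($1 >> 3) + 0x7fff8000 ))
}

# A made-up heap address of the kind that shows up in ASan reports.
shadow_addr 0x602000000010    # prints 0xc047fff8002
```

Because eight consecutive program bytes share one shadow byte, neighbouring entries in the shadow dump describe neighbouring 8-byte chunks of program memory.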

Fuzzing Without AddressSanitizer

  1. It’s also possible to disable AddressSanitizer when fuzzing, and use it only to diagnose the crashes the fuzzer finds. This will make fuzzing faster at the cost of finding a few fewer crashes.

    In this case, I’d recommend using AFL_HARDEN=1 instead of AFL_USE_ASAN=1. This will enable some fast compiler-level bounds checks.

Building AFL++ from source

  1. Get a copy of the AFL++ source code by downloading and extracting this archive or with git clone https://github.com/AFLplusplus/AFLplusplus.git

  2. If on portal, load modules for recent clang and gcc:

    module load gcccore/13.2.0 clang/17.0.6 gcc/13.3.0
    

    If not on portal, install prerequisite packages. If you are on a Debian-like system (including Ubuntu), docs/INSTALL.md gives a list of apt-get commands that will usually install them.

  3. From the directory to which you checked out AFL++ run make to build it as in

    make STATIC=1
    

    See docs/INSTALL.md in the source distribution for documentation about make options like STATIC.

  4. This will create the programs afl-cc and afl-fuzz among others in the AFL source directory.