ParaFuzz: Coverage-guided Property Fuzzing for Multicore OCaml Programs

Sumit Padhiyar, Adharsh Kamath, KC Sivaramakrishnan
IIT Madras & NIT Karnataka  |  OCaml Workshop 2021 (co-located with ICFP 2021)

Abstract

We develop ParaFuzz, an input and concurrency fuzzing tool for Multicore OCaml programs. ParaFuzz builds on top of Crowbar, which combines AFL-based grey-box fuzzing with QuickCheck, and extends it to handle parallelism. ParaFuzz provides an effective and pragmatic testing technique for the working OCaml programmer to detect concurrency bugs in parallel programs.

The Problem

Testing parallel programs is fundamentally hard because bugs depend on both inputs and thread scheduling interleavings. Assertions can fail only for specific combinations of input values and scheduling decisions. Traditional approaches fall short:

Stress Testing. Running programs repeatedly and hoping to trigger failures is unreliable — many interleavings are never explored.
Random Testing. Generates random inputs with OS scheduler perturbations, but provides insufficient control over scheduling decisions.
Model Checking. Tools like SPIN and TLC offer strong guarantees but are impractical for large projects — they work on program models, not source code.
Formal Verification. Demands extensive expertise and significant time investment, making it inaccessible for most developers.

How ParaFuzz Works

ParaFuzz synthesizes three proven methodologies into a single practical tool:

  1. Property-Based Testing (QuickCheck): Generates random test inputs systematically to explore the input space, allowing developers to express correctness properties as executable specifications.
  2. Grey-Box Fuzzing (AFL): Uses coverage-guided input generation with execution feedback. AFL mutates inputs to maximize code coverage, steering exploration toward previously unexplored paths.
  3. Parallelism Control via Effect Handlers: The key innovation. ParaFuzz replaces the OS thread scheduler with a user-level thread scheduler using OCaml's effect handlers. This gives the fuzzer explicit control over thread scheduling decisions, enabling systematic exploration of different interleavings.

Key Innovation: Effect Handlers

The core challenge in concurrency fuzzing is that the OS controls thread scheduling. ParaFuzz solves this using effect handlers, which provide a modular and composable basis for non-local control-flow mechanisms in Multicore OCaml.

ParaFuzz mocks the parallelism API using effect handlers, transforming the OS thread scheduler into a user-level thread scheduler. This retains explicit scheduling control while allowing the fuzzer to:

Design Advantages

Feature ParaFuzz Stress Testing Model Checking
Works on source code Yes Yes No (models)
Scheduling control Full None Full
No false positives Yes Yes Depends
Deterministic replay Yes No Yes
Scalable to large programs Yes Yes No
Easy to use Drop-in switch Trivial Requires modeling

ParaFuzz is a drop-in replacement for Multicore OCaml as a separate compiler switch. It produces no false positives (since it tests actual source code, not models), and supports deterministic record-and-replay for debugging discovered bugs.

Evaluation

ParaFuzz is evaluated on its effectiveness (fraction of runs that discover the bug) and efficiency (mean time to failure). Compared to stress testing and random testing approaches, ParaFuzz significantly outperforms both:

Higher effectiveness: ParaFuzz discovers concurrency bugs in a significantly larger fraction of runs compared to stress testing and random approaches.
Faster discovery: ParaFuzz achieves lower mean time to failure, finding bugs more quickly through coverage-guided exploration of interleavings.

Contributions