ParaFuzz: Coverage-guided Property Fuzzing for Multicore OCaml Programs

Sumit Padhiyar, Adharsh Kamath, KC Sivaramakrishnan

IIT Madras & NIT Karnataka | OCaml Workshop 2021 (co-located with ICFP 2021)

Talk Code

Abstract

We develop ParaFuzz, an input and concurrency fuzzing tool for Multicore OCaml programs. ParaFuzz builds on top of Crowbar, which combines AFL-based grey-box fuzzing with QuickCheck, and extends it to handle parallelism. ParaFuzz provides an effective and pragmatic testing technique for the working OCaml programmer to detect concurrency bugs in parallel programs.

The Problem

Testing parallel programs is fundamentally hard because bugs depend on both inputs and thread scheduling interleavings. Assertions can fail only for specific combinations of input values and scheduling decisions. Traditional approaches fall short:

				Stress Testing. Running programs repeatedly and hoping to trigger failures is unreliable — many interleavings are never explored.
			

				Random Testing. Generates random inputs with OS scheduler perturbations, but provides insufficient control over scheduling decisions.
			

				Model Checking. Tools like SPIN and TLC offer strong guarantees but are impractical for large projects — they work on program models, not source code.
			

				Formal Verification. Demands extensive expertise and significant time investment, making it inaccessible for most developers.
			

How ParaFuzz Works

ParaFuzz synthesizes three proven methodologies into a single practical tool:

Property-Based Testing (QuickCheck): Generates random test inputs systematically to explore the input space, allowing developers to express correctness properties as executable specifications.
Grey-Box Fuzzing (AFL): Uses coverage-guided input generation with execution feedback. AFL mutates inputs to maximize code coverage, steering exploration toward previously unexplored paths.
Parallelism Control via Effect Handlers: The key innovation. ParaFuzz replaces the OS thread scheduler with a user-level thread scheduler using OCaml's effect handlers. This gives the fuzzer explicit control over thread scheduling decisions, enabling systematic exploration of different interleavings.

Key Innovation: Effect Handlers

The core challenge in concurrency fuzzing is that the OS controls thread scheduling. ParaFuzz solves this using effect handlers, which provide a modular and composable basis for non-local control-flow mechanisms in Multicore OCaml.

ParaFuzz mocks the parallelism API using effect handlers, transforming the OS thread scheduler into a user-level thread scheduler. This retains explicit scheduling control while allowing the fuzzer to:

Yield at synchronization points where context-switching causes non-determinism: domain operations (spawn, join), atomics (get, set, CAS), mutex operations (lock, unlock), and condition variables (wait, notify, broadcast).
Let AFL pick the next thread to run from the ready queue, using coverage feedback to explore different interleavings.
Deterministically record and replay failing executions for debugging.

Design Advantages

Feature	ParaFuzz	Stress Testing	Model Checking
Works on source code	Yes	Yes	No (models)
Scheduling control	Full	None	Full
No false positives	Yes	Yes	Depends
Deterministic replay	Yes	No	Yes
Scalable to large programs	Yes	Yes	No
Easy to use	Drop-in switch	Trivial	Requires modeling

ParaFuzz is a drop-in replacement for Multicore OCaml as a separate compiler switch. It produces no false positives (since it tests actual source code, not models), and supports deterministic record-and-replay for debugging discovered bugs.

Evaluation

ParaFuzz is evaluated on its effectiveness (fraction of runs that discover the bug) and efficiency (mean time to failure). Compared to stress testing and random testing approaches, ParaFuzz significantly outperforms both:

				Higher effectiveness: ParaFuzz discovers concurrency bugs in a significantly larger fraction of runs compared to stress testing and random approaches.
			

				Faster discovery: ParaFuzz achieves lower mean time to failure, finding bugs more quickly through coverage-guided exploration of interleavings.
			

Contributions

An input and concurrency fuzzing tool that combines property-based testing (QuickCheck) with grey-box fuzzing (AFL) for parallel OCaml programs.
A novel use of effect handlers to replace the OS thread scheduler with a user-level scheduler, giving the fuzzer explicit control over thread interleavings.
A practical, drop-in compiler switch for Multicore OCaml with deterministic record-and-replay and no false positives.
Demonstration of significantly improved effectiveness and efficiency over stress testing and random testing for concurrency bug detection.