How Mutation Testing Exposes the Truth (PHP 2026 Edition)


You’ve got 85% code coverage. Your CI pipeline is green. You ship to production — and things break in ways your tests never caught. Sound familiar?

I’ve been there. And for a long time, I thought the answer was more tests. What I actually needed was better tests. That’s exactly what mutation testing taught me, and after using Infection PHP in production projects through 2025 and into 2026, I can confidently say it changed how I think about test quality entirely.

Previous article in this category: https://codecraftdiary.com/2026/04/18/laravel-testing-mistakes/


The Dirty Secret of Code Coverage

Code coverage tells you which lines were executed during your test run. It says nothing about whether your assertions are actually meaningful.

Consider this classic trap:

<?php

class OrderDiscountCalculator
{
    public function calculate(float $price, int $quantity): float
    {
        if ($quantity >= 10) {
            return $price * 0.9;
        }

        return $price;
    }
}

And a test that covers it 100%:

<?php

use PHPUnit\Framework\TestCase;

class OrderDiscountCalculatorTest extends TestCase
{
    public function testCalculate(): void
    {
        $calculator = new OrderDiscountCalculator();

        // Both branches hit — 100% coverage!
        $calculator->calculate(100.0, 15);
        $calculator->calculate(100.0, 5);
    }
}

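This test executes every line of the class and asserts absolutely nothing. If someone changes 0.9 to 0.5, nothing in it can fail, and your customers get 50% off every bulk order. That’s a very expensive bug. PHPUnit’s default settings will at least flag an assertion-free test as risky, but add one weak assertion (say, assertIsFloat() on the result) and the warning disappears while the gap stays.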

This is precisely the problem mutation testing solves.


What Is Mutation Testing?

Mutation testing works by automatically introducing small bugs — called mutants — into your source code, then running your test suite against each mutated version. If your tests catch the bug (the mutant is killed), great. If your tests still pass with the bug in place (the mutant survives), you have a gap.

Common mutations include things like:

  • Changing >= to > or <=
  • Replacing + with -
  • Flipping true to false
  • Removing entire return statements
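
To make the killed/survived mechanics concrete, here is a hand-rolled sketch in plain PHP that reuses the discount rule from OrderDiscountCalculator. Everything in it is an assumption for illustration: the three mutants are written by hand (Infection generates its own from your source), the helper name runMutants is made up, and a "test" is modelled as a closure that returns true when it passes.

```php
<?php

declare(strict_types=1);

// The discount rule from OrderDiscountCalculator, reduced to a closure.
$original = fn (float $price, int $quantity): float => $quantity >= 10 ? $price * 0.9 : $price;

// Hand-written mutants, for illustration only.
$mutants = [
    '>= becomes >'    => fn (float $price, int $quantity): float => $quantity > 10 ? $price * 0.9 : $price,
    '>= becomes <='   => fn (float $price, int $quantity): float => $quantity <= 10 ? $price * 0.9 : $price,
    '0.9 becomes 0.5' => fn (float $price, int $quantity): float => $quantity >= 10 ? $price * 0.5 : $price,
];

// Float-tolerant comparison so rounding noise cannot fail a check.
$near = fn (float $actual, float $expected): bool => abs($actual - $expected) < 0.001;

// Suite 1: executes both branches but checks nothing.
$coverageOnly = [
    function (callable $calc): bool {
        $calc(100.0, 15);
        $calc(100.0, 5);

        return true;
    },
];

// Suite 2: asserts the results, but never at the boundary.
$assertsValues = [
    fn (callable $calc): bool => $near($calc(100.0, 15), 90.0),
    fn (callable $calc): bool => $near($calc(100.0, 5), 100.0),
];

// Suite 3: adds the boundary case (exactly 10 units).
$assertsBoundary = [
    ...$assertsValues,
    fn (callable $calc): bool => $near($calc(100.0, 10), 90.0),
];

/**
 * Runs every test against every mutant; one failing test kills the mutant.
 *
 * @param array<string, callable> $mutants
 * @param list<callable>          $suite
 * @return array<string, string>  mutant name => "killed" or "survived"
 */
function runMutants(array $mutants, array $suite): array
{
    $report = [];
    foreach ($mutants as $name => $mutant) {
        $report[$name] = 'survived';
        foreach ($suite as $test) {
            if (!$test($mutant)) {
                $report[$name] = 'killed';
                break;
            }
        }
    }

    return $report;
}

foreach (['coverageOnly' => $coverageOnly, 'assertsValues' => $assertsValues, 'assertsBoundary' => $assertsBoundary] as $label => $suite) {
    echo $label, ': ', json_encode(runMutants($mutants, $suite)), PHP_EOL;
}
```

With the coverage-only suite all three mutants survive. Asserting the values kills two of them, and only the boundary check at exactly 10 units kills the ">= becomes >" mutant.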

The metric you care about is the Mutation Score Indicator (MSI) — the percentage of mutants your tests kill. A high MSI means your tests are genuinely sensitive to regressions.
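
For reference, here is a small sketch of how the scores are computed, as I understand Infection's documentation. The function name and the sample numbers are made up. The assumptions are that timed-out and errored mutants count as defeated alongside killed ones, and that the covered-code MSI leaves out mutants in code no test executes. Skipped or ignored mutants are not modelled.

```php
<?php

declare(strict_types=1);

/**
 * Computes MSI and covered-code MSI from mutant counts.
 *
 * @return array{msi: float, coveredMsi: float} percentages rounded to 2 decimals
 */
function mutationScores(int $killed, int $timedOut, int $errored, int $escaped, int $notCovered): array
{
    // Assumption: timeouts and errors count as detected, like kills.
    $defeated = $killed + $timedOut + $errored;

    // Mutants that sit in code at least one test executes.
    $covered = $defeated + $escaped;

    // Every mutant that was generated.
    $total = $covered + $notCovered;

    return [
        'msi'        => $total > 0 ? round($defeated / $total * 100, 2) : 0.0,
        'coveredMsi' => $covered > 0 ? round($defeated / $covered * 100, 2) : 0.0,
    ];
}

// Hypothetical run: 100 mutants, 20 of them in code no test touches.
$scores = mutationScores(killed: 58, timedOut: 2, errored: 1, escaped: 19, notCovered: 20);

printf("MSI: %.2f%%, covered-code MSI: %.2f%%\n", $scores['msi'], $scores['coveredMsi']);
```

In this made-up run the suite defeats 61 of 100 mutants overall, but 61 of the 80 it actually reaches. The gap between the two numbers separates "untested code" from "tested code with weak assertions".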


Getting Started with Infection PHP

Infection is the de facto mutation testing framework for PHP. It integrates cleanly with PHPUnit and runs as a Composer dev dependency.

composer require --dev infection/infection

Run it for the first time with:

./vendor/bin/infection --threads=4

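Infection will run your existing test suite first (it has to pass, and a coverage driver must be available), then start generating and testing mutants. If there is no config file yet, that first run walks you through creating one. On a modern project with --threads=4, it’s often fast enough to include in a CI pipeline.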


A Real-World Example: Catching What Coverage Misses

Let me walk you through a scenario I actually encountered on a SaaS project — a pricing engine with tiered discounts.

<?php

class TieredPricingService
{
    private const TIERS = [
        100 => 0.70, // 30% discount for 100+
        50  => 0.80, // 20% discount for 50+
        10  => 0.90, // 10% discount for 10+
    ];

    public function getPrice(float $unitPrice, int $quantity): float
    {
        foreach (self::TIERS as $minQuantity => $multiplier) {
            if ($quantity >= $minQuantity) {
                return round($unitPrice * $multiplier * $quantity, 2);
            }
        }

        return round($unitPrice * $quantity, 2);
    }
}

My original tests covered all branches, and PHPUnit reported 100% coverage. But when I ran Infection, it flagged a surviving mutant: it changed >= to > in the tier check, and nothing failed, because I had only tested with 11 units and never with exactly 10. The boundary condition was untested.

Here’s what the corrected test looked like after Infection exposed the gap:

<?php

use PHPUnit\Framework\TestCase;
use PHPUnit\Framework\Attributes\DataProvider;

class TieredPricingServiceTest extends TestCase
{
    private TieredPricingService $service;

    protected function setUp(): void
    {
        $this->service = new TieredPricingService();
    }

    #[DataProvider('pricingProvider')]
    public function testGetPrice(float $unitPrice, int $quantity, float $expected): void
    {
        $this->assertSame($expected, $this->service->getPrice($unitPrice, $quantity));
    }

    public static function pricingProvider(): array
    {
        return [
            'below first tier'         => [10.0, 5,   50.00],
            'exactly at 10 tier'       => [10.0, 10,  90.00],  // boundary — was missing!
            'above 10 tier'            => [10.0, 11,  99.00],
            'exactly at 50 tier'       => [10.0, 50,  400.00], // boundary
            'exactly at 100 tier'      => [10.0, 100, 700.00], // boundary
            'above highest tier'       => [10.0, 200, 1400.00],
        ];
    }
}

After adding boundary assertions, Infection’s MSI jumped from 61% to 94%. That’s the difference between a test suite that gives you false confidence and one that actually has your back.
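
As a rough check of why the exact-boundary rows are the ones that matter, the sketch below compares the original comparison with the >= to > mutant across a range of quantities. It is a standalone re-implementation for illustration, not Infection output: priceWith and the injected comparison closure are made-up names, and the tiers are copied from TieredPricingService.

```php
<?php

declare(strict_types=1);

// Same tiers as TieredPricingService, highest threshold first.
const TIERS = [
    100 => 0.70,
    50  => 0.80,
    10  => 0.90,
];

/**
 * getPrice() with the tier comparison injected, so the original and the
 * mutated comparison can run side by side.
 */
function priceWith(callable $reachesTier, float $unitPrice, int $quantity): float
{
    foreach (TIERS as $minQuantity => $multiplier) {
        if ($reachesTier($quantity, $minQuantity)) {
            return round($unitPrice * $multiplier * $quantity, 2);
        }
    }

    return round($unitPrice * $quantity, 2);
}

$original = fn (int $quantity, int $minQuantity): bool => $quantity >= $minQuantity;
$mutant   = fn (int $quantity, int $minQuantity): bool => $quantity > $minQuantity;

// Collect every quantity where the mutant returns a different price.
$killingQuantities = [];
for ($quantity = 1; $quantity <= 120; $quantity++) {
    if (priceWith($original, 10.0, $quantity) !== priceWith($mutant, 10.0, $quantity)) {
        $killingQuantities[] = $quantity;
    }
}

echo implode(', ', $killingQuantities), PHP_EOL; // prints "10, 50, 100"
```

Out of 120 quantities, only the three exact tier boundaries produce a different price. A test at 11, 51, or 200 units cannot tell the mutant from the original.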


Configuring Infection for Your Project

Infection is configured via infection.json5 in your project root. Here’s a production-ready config I use:

{
    "$schema": "vendor/infection/infection/resources/schema.json",
    "source": {
        "directories": ["src"],
        "excludes": ["src/Infrastructure/Migrations"]
    },
    "mutators": {
        "@default": true
    },
    "testFramework": "phpunit",
    "testFrameworkOptions": "--testsuite=unit",
    "minMsi": 85,
    "minCoveredMsi": 90,
    "threads": 4,
    "logs": {
        "text": "var/log/infection.log",
        "html": "var/log/infection.html",
        "summary": "var/log/infection-summary.log"
    }
}

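The minMsi and minCoveredMsi thresholds are important: they make Infection exit with a failure, and so fail your CI pipeline, if the mutation score drops below acceptable levels, much like a coverage gate. minMsi is measured against all generated mutants; minCoveredMsi counts only mutants in code your tests actually execute, which is why it is usually set higher.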


Integrating Into CI (GitHub Actions)

Here’s a GitHub Actions job I’ve been running since mid-2025:

mutation-testing:
  runs-on: ubuntu-latest
  needs: tests
  steps:
    - uses: actions/checkout@v4

    - name: Setup PHP
      uses: shivammathur/setup-php@v2
      with:
        php-version: '8.5'
        coverage: pcov

    - name: Install dependencies
      run: composer install --no-interaction

    - name: Run Infection
      run: ./vendor/bin/infection --threads=4 --min-msi=85 --min-covered-msi=90

One important note: Infection requires a coverage driver (Xdebug or PCOV) so it knows which tests execute each mutated line and can run only those tests against the mutant. PCOV is faster for large codebases; Xdebug gives more detail, including branch and path coverage. I use Xdebug locally and PCOV in CI.


Common Objections — And Honest Answers

“It’s too slow.” It can be on large codebases, but running with --threads and using source.excludes to skip generated code, migrations, and DTOs make a huge difference. I typically exclude everything that has no business logic.

“The MSI is too low to be useful.” Start with --min-msi=0 and just look at the HTML report. Prioritize killing mutants in your core domain logic first — that’s where bugs actually hurt.

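“It produces too many surviving mutants.” Some mutants are genuinely equivalent: they change the code but not its observable behavior, so no test can ever kill them. Infection lets you ignore these, either per mutator in the config file or with an @infection-ignore-all annotation in the code. Over time your noise floor drops significantly.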
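
If equivalent mutants sound abstract, here is a minimal, made-up example. The function largest is hypothetical, and the mutant is written by hand to mirror a > to >= boundary mutation. When both arguments are equal, either branch returns the same value, so the two versions cannot be told apart from the outside.

```php
<?php

declare(strict_types=1);

// Hypothetical helper: returns the larger of two integers.
function largest(int $a, int $b): int
{
    return $a > $b ? $a : $b;
}

// Hand-written mutant: ">" replaced with ">=".
// For $a === $b it returns $a instead of $b, which is the same value.
function largestMutant(int $a, int $b): int
{
    return $a >= $b ? $a : $b;
}

// Brute-force comparison over a grid of inputs, ties included.
$differences = 0;
for ($a = -20; $a <= 20; $a++) {
    for ($b = -20; $b <= 20; $b++) {
        if (largest($a, $b) !== largestMutant($a, $b)) {
            $differences++;
        }
    }
}

echo $differences, PHP_EOL; // prints "0"
```

No assertion on the return value can separate the two, so a surviving mutant like this one is a candidate for ignoring rather than a reason to write another test.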


What My Workflow Looks Like in 2026

My current approach on active PHP projects:

  1. PHPUnit with strict coverage for the fast feedback loop during development.
  2. Infection on every PR targeting only changed files, using --git-diff-filter to keep CI times reasonable.
  3. Full Infection run weekly on the main branch to catch gradual MSI drift.

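The --git-diff-filter flag is a game-changer for larger repos: with AM it mutates only files added or modified relative to the base branch, so mutation testing stays practical even on monorepos. If that is still too much, --git-diff-lines narrows the run to the changed lines themselves. In CI, make sure the base branch is actually fetched (a shallow checkout may not include it), and set --git-diff-base if your default branch is not the one Infection assumes.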

./vendor/bin/infection --git-diff-filter=AM --threads=8

Final Thoughts

Code coverage is a floor, not a ceiling. It tells you the minimum — which lines were touched. Mutation testing tells you something far more valuable: whether those lines are protected by tests that would actually catch a regression.

If you’re writing PHP in 2026 and you’re not using mutation testing, you’re leaving one of PHP’s most powerful quality tools completely off the table. The tooling has matured, the CI integration is straightforward, and the payoff in confidence is real.

Start with a single service class. Run Infection. Look at what survives. I promise you’ll find something surprising.


