
The Flaky Test Chronicles IV: The Teardown Tango

Mastering tearDown, Fakes, and Assertions

January 1st ʼ26 · 14 min · 2646 words

One line in tearDown(). Wrong order. 200 tests fail. Nobody knows why.

The test suite passed for three months. Then someone added a file upload test. Suddenly, 200 tests started failing randomly. The culprit? One line in tearDown() - in the wrong order.

This part covers the mechanics: tearDown() ordering, file cleanup traps, observable event side effects, and the assertion patterns that prevent false positives.


The Order of Destruction

The tearDown() method seems simple. Clean up after your test. What could go wrong?

Everything. Everything can go wrong.

The Golden Rule

Custom cleanup BEFORE parent::tearDown(). Always.

protected function tearDown(): void
{
    // 1. Your custom cleanup first
    $this->resetPaymentGateway();
    $this->clearCustomCache();

    // 2. Parent tearDown last
    parent::tearDown();
}

Why This Order Matters:

  • parent::tearDown() may clear resources your cleanup needs

  • Mockery is closed in parent tearDown

  • Database transactions are rolled back in parent tearDown
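
To make the ordering problem concrete, here is a minimal, framework-free sketch. TinyContainer, BaseCase, CleanupFirst and ParentFirst are hypothetical stand-ins that only imitate a base test case whose tearDown() flushes the container; this is not Laravel's actual code.

```php
<?php
declare(strict_types=1);

// Hypothetical stand-in for a service container.
final class TinyContainer
{
    /** @var array<string, object> */
    private array $instances = [];

    public function set(string $id, object $instance): void
    {
        $this->instances[$id] = $instance;
    }

    public function get(string $id): object
    {
        return $this->instances[$id]
            ?? throw new RuntimeException("Nothing bound for [$id]");
    }

    public function flush(): void
    {
        $this->instances = [];
    }
}

// Hypothetical base test case: its tearDown() flushes the container.
abstract class BaseCase
{
    public TinyContainer $app;

    public function __construct()
    {
        $this->app = new TinyContainer();
        $this->app->set('gateway', new stdClass());
    }

    public function tearDown(): void
    {
        $this->app->flush(); // after this, nothing can be resolved
    }
}

final class CleanupFirst extends BaseCase
{
    public function tearDown(): void
    {
        $this->app->get('gateway')->reset = true; // container still intact
        parent::tearDown();
    }
}

final class ParentFirst extends BaseCase
{
    public function tearDown(): void
    {
        parent::tearDown();
        $this->app->get('gateway')->reset = true; // throws: already flushed
    }
}
```

Calling tearDown() on CleanupFirst marks the gateway as reset, while the same call on ParentFirst throws because the service it wants to clean up no longer exists.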

Universal vs Targeted Cleanup

Not all cleanup belongs in base TestCase. The key question: does every test need this cleanup?

Universal cleanup (base TestCase):

  • Config/locale resets that affect all tests

  • Global state that leaks between tests

  • Framework-level cleanup (Mockery, transactions)

Targeted cleanup (opt-in traits):

  • Payment gateway reset (only tests that mock external services)

  • External service cleanup (only integration tests)

  • File system cleanup (only tests that write files)

The Opt-in Trait Pattern

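Laravel automatically discovers setUpXXX() and tearDownXXX() methods in traits, where XXX is the trait's class name: a trait called ResetsPaymentGateway gets its tearDownResetsPaymentGateway() method called during teardown. Use this for targeted cleanup: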

/**
 * Use this trait in tests that interact with the payment gateway
 * and need to reset its singleton state between tests.
 */
trait ResetsPaymentGateway
{
    protected function tearDownResetsPaymentGateway(): void
    {
        app()->forgetInstance(PaymentGateway::class);
        app()->forgetInstance(PaymentProcessor::class);
    }
}

Multiple test classes need this cleanup - they share the trait instead of duplicating the logic:

class PaymentGatewayTest extends TestCase
{
    use ResetsPaymentGateway;  // Opt-in: this test uses payment services

    #[Test]
    public function it_uses_correct_provider(): void
    {
        $this->mock(PaymentGateway::class)
            ->shouldReceive('process')
            ->andReturn(true);
        // ...
    }
}

Benefits:

  • Zero overhead for tests that don't need it

  • Self-documenting: trait name and docblock explain the “why”

  • Explicit dependency: you see which tests have special needs
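
The naming convention behind the opt-in trait can be sketched without the framework. The version below is a simplified illustration, not Laravel's implementation (which also handles traits used by parent classes and nested traits); HookAwareCase, HasNoHook and PaymentCase are hypothetical names.

```php
<?php
declare(strict_types=1);

// Hypothetical base class that looks for "tearDown" + trait name hooks.
abstract class HookAwareCase
{
    /** @var list<string> */
    public array $log = [];

    public function runTearDownHooks(): void
    {
        // class_uses() lists the traits used directly by the concrete class.
        foreach (class_uses($this) as $trait) {
            // "App\Testing\ResetsPaymentGateway" -> "ResetsPaymentGateway"
            $basename = basename(str_replace('\\', '/', $trait));
            $method = 'tearDown' . $basename;

            if (method_exists($this, $method)) {
                $this->{$method}();
            }
        }
    }
}

trait ResetsPaymentGateway
{
    protected function tearDownResetsPaymentGateway(): void
    {
        $this->log[] = 'payment gateway reset';
    }
}

// A trait without a matching hook method is simply skipped.
trait HasNoHook
{
}

final class PaymentCase extends HookAwareCase
{
    use ResetsPaymentGateway;
    use HasNoHook;
}

$case = new PaymentCase();
$case->runTearDownHooks();
```

After the call, $case->log contains a single entry from the ResetsPaymentGateway hook. A test class that does not use the trait never runs the cleanup, which is where the "zero overhead" benefit comes from.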


The Shared Resource Trap

Two tests. Same file path. One writes, the other overwrites. Sometimes Test A finishes first. Sometimes Test B. The result? Flaky.

// Test A
public function it_exports_orders(): void
{
    $this->exportService->export($orders, '/tmp/export.csv');
    $this->assertFileExists('/tmp/export.csv');
}

// Test B - runs in parallel
public function it_exports_invoices(): void
{
    $this->exportService->export($invoices, '/tmp/export.csv');  // Same path!
    $this->assertStringContainsString('Invoice', file_get_contents('/tmp/export.csv'));
}
// One test reads the other's file. Flaky.

The fix is isolation. Each test needs its own resource.

The Easy Way: Storage::fake()

For file operations, Storage::fake() swaps the named disk for a temporary directory and empties it on each call. It only isolates code that writes through the Storage facade:

public function it_exports_orders(): void
{
    Storage::fake('local');

    $this->exportService->export($orders, 'export.csv');

    Storage::disk('local')->assertExists('export.csv');
}

When You Need Real Paths

Sometimes you can't use fake disks - absolute paths, actual filesystem behavior, or third-party packages. Then make paths unique per test:

public function it_exports_orders(): void
{
    $path = storage_path('exports/test-' . $this->name() . '-' . uniqid() . '.csv');

    try {
        $this->exportService->export($orders, $path);

        $this->assertFileExists($path);
    } finally {
        // Runs even when the assertion fails, so no file is left behind
        if (is_file($path)) {
            unlink($path);
        }
    }
}

Or use a trait to handle the pattern:

trait CreatesUniqueTestFiles
{
    protected array $testFiles = [];

    protected function testPath(string $filename): string
    {
        $path = storage_path('test-files/' . $this->name() . '-' . uniqid() . '-' . $filename);
        $this->testFiles[] = $path;
        return $path;
    }

    protected function tearDownCreatesUniqueTestFiles(): void
    {
        foreach ($this->testFiles as $path) {
            @unlink($path);
        }
    }
}

The pattern extends beyond files. Cache keys, queue names, temporary database records - anything shared between parallel tests needs isolation.
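
For non-file resources, the same idea can be sketched as a small naming helper. isolatedName() is a hypothetical function; it assumes the suite may run under ParaTest, which exposes a per-worker number in the TEST_TOKEN environment variable, and falls back to "0" for serial runs.

```php
<?php
declare(strict_types=1);

/**
 * Builds a name that is unique per test, per parallel worker and per call.
 * Intended for cache keys, queue names and similar shared identifiers.
 */
function isolatedName(string $base, string $testName): string
{
    // Worker number when running in parallel, "0" otherwise (assumption).
    $worker = getenv('TEST_TOKEN') ?: '0';

    // Replace anything outside letters, digits and underscores.
    $safeTest = preg_replace('/[^A-Za-z0-9_]+/', '_', $testName);

    // Random suffix so repeated calls within one test do not collide either.
    $suffix = bin2hex(random_bytes(4));

    return sprintf('%s-%s-w%s-%s', $base, $safeTest, $worker, $suffix);
}

$first = isolatedName('orders-export', 'it_exports_orders');
$second = isolatedName('orders-export', 'it_exports_orders');
```

The two names share a readable prefix, which helps when debugging leftovers, but differ in their random suffix, so two tests (or two workers) never touch the same key.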


The Cache vs Mock Race

Test A caches a config value. Test B mocks the same config. The cache wins. Test B fails - but only when it runs after Test A.

// Test A - runs first, caches the value
public function it_checks_feature_flag(): void
{
    $enabled = config('feature.notifications_enabled');  // Cached!
    // ...
}

// Test B - runs second, tries to mock
public function it_disables_notifications(): void
{
    Config::shouldReceive('get')
        ->with('feature.notifications_enabled')
        ->andReturn(false);

    // Mock never triggers - cached value wins
    // Test fails randomly depending on execution order
}

The fix: clear the cache before mocking.

public function it_disables_notifications(): void
{
    // Clear before mock
    cache()->forget('feature.notifications_enabled');

    Config::shouldReceive('get')
        ->with('feature.notifications_enabled')
        ->andReturn(false);

    // Now the mock works regardless of test order
}

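This applies to any cached value you want to mock: config, service responses, computed results. If it might be cached, clear it first. The example assumes the flag is read through a cache entry keyed feature.notifications_enabled that survives between tests (a file or Redis store, for instance). For plain config values, Laravel's documentation advises against mocking the Config facade and recommends calling Config::set() in the test instead.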
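
The underlying mechanism is easy to reproduce in plain PHP. FlagReader below is a hypothetical class whose static array stands in for any cache that outlives a single test; it is a sketch of the failure mode, not of Laravel's cache.

```php
<?php
declare(strict_types=1);

final class FlagReader
{
    /** @var array<string, bool> */
    private static array $cache = [];

    /** @param callable(string): bool $source */
    public static function get(string $key, callable $source): bool
    {
        // The source is only consulted when nothing is cached for the key.
        return self::$cache[$key] ??= $source($key);
    }

    public static function forget(string $key): void
    {
        unset(self::$cache[$key]);
    }
}

// "Test A": the real source answers true, and the answer is cached.
$first = FlagReader::get('feature.notifications_enabled', fn (string $k): bool => true);

// "Test B": the source is replaced by a fake that answers false,
// but the cached entry is returned and the fake is never called.
$fake = fn (string $k): bool => false;
$stale = FlagReader::get('feature.notifications_enabled', $fake);

// Clearing the entry first lets the fake take effect.
FlagReader::forget('feature.notifications_enabled');
$fresh = FlagReader::get('feature.notifications_enabled', $fake);
```

Here $stale is still true even though the fake returns false, and only $fresh reflects the fake. That is the order-dependent failure described above in miniature.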


Observable Events: The Silent Side Effects

Model observers can wreak havoc on tests. But the solution isn't to blindly suppress all observers. The real task is identifying which ones introduce non-determinism.

The Flaky Risk Spectrum

Not all observers cause flaky tests. The risk depends on what they do:

  • High risk: External API calls, notifications, queue dispatching

  • Medium risk: Database updates to related models, cache mutations (especially in parallel tests)

  • Low/No risk: UUID generation, slug creation, simple timestamps

A typical problematic observer:

class OrderObserver
{
    public function created(Order $order): void
    {
        // High risk: sends HTTP request
        Http::post('https://analytics.example.com/track', [...]);

        // High risk: dispatches notification
        $order->user->notify(new OrderCreated($order));

        // Medium risk: mutates related model
        $order->user->increment('order_count');
    }
}

Compare this to a safe observer:

class OrderObserver
{
    public function creating(Order $order): void
    {
        // No risk: deterministic, no external deps
        $order->uuid ??= Str::uuid();
        $order->reference ??= $this->generateReference();
    }
}

Laravel Native: All-or-Nothing Control

For simple cases, Laravel provides built-in methods to silence all model events:

// Silence all events within a closure
$order = Order::withoutEvents(function () {
    return Order::factory()->create();
});

// Or use quiet methods for single operations
$order->saveQuietly();
$order->deleteQuietly();

This works when you want to silence everything. But what if your creating event sets a UUID (safe) while your created event sends a notification (risky)?

Surgical Control with ignorable-observers

The ignorable-observers package lets you suppress specific events while keeping others active:

// In setUp(): suppress only the risky events
Order::ignoreObservableEvents(['created', 'updated']);
// 'creating' still runs - UUID generation works

User::ignoreObservableEvents(['saved']);
// Notifications won't fire, but validation observers do

When you actually need to test observer behavior, temporarily re-enable:

#[Test]
public function it_sends_notification_on_order_creation(): void
{
    Notification::fake();
    Order::unignoreObservableEvents(['created']);

    try {
        $order = Order::factory()->create();

        Notification::assertSentTo($order->user, OrderCreated::class);
    } finally {
        // Restore the default even if the assertion above fails
        Order::ignoreObservableEvents(['created']);
    }
}

This pattern gives you the best of both worlds: deterministic tests by default, with the ability to test observer behavior when needed.


HTTP, Events, and Queue Fakes

Http::fake() Precedence

Laravel matches HTTP fakes in array order. The first matching pattern wins. Put wildcards last, or they swallow everything.

Before (Broken):

Http::fake([
    'api.example.com/*' => Http::response(['status' => 'ok']),      // Wildcard first
    'api.example.com/users' => Http::response(['users' => [...]]),  // Never reached!
]);

// GET api.example.com/users returns {'status': 'ok'} - wrong response
// The wildcard matched first, specific route never checked

After (Correct):

Http::fake([
    'api.example.com/users' => Http::response(['users' => [...]]),  // Specific first
    'api.example.com/*' => Http::response(['status' => 'ok']),      // Wildcard catches the rest
]);

// GET api.example.com/users returns {'users': [...]} - correct
// GET api.example.com/orders returns {'status': 'ok'} - wildcard fallback

Think of it like route definitions in Laravel: more specific routes go before catch-all routes. Same principle.
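
The "first match wins" rule can be modelled in a few lines. This is a simplified sketch, not Laravel's implementation: firstMatchingStub() is a hypothetical function, "*" is the only wildcard it understands, and responses are reduced to plain strings.

```php
<?php
declare(strict_types=1);

/**
 * Returns the response of the first stub whose pattern matches the URL.
 *
 * @param array<string, string> $stubs pattern => response label
 */
function firstMatchingStub(array $stubs, string $url): ?string
{
    foreach ($stubs as $pattern => $response) {
        // Turn "api.example.com/*" into the regex #^api\.example\.com/.*$#
        $regex = '#^' . str_replace('\*', '.*', preg_quote($pattern, '#')) . '$#';

        if (preg_match($regex, $url) === 1) {
            return $response; // stop at the first hit; later stubs are ignored
        }
    }

    return null;
}

$wildcardFirst = [
    'api.example.com/*'     => 'generic ok',
    'api.example.com/users' => 'user list',
];

$specificFirst = [
    'api.example.com/users' => 'user list',
    'api.example.com/*'     => 'generic ok',
];

$broken = firstMatchingStub($wildcardFirst, 'api.example.com/users');
$fixed = firstMatchingStub($specificFirst, 'api.example.com/users');
$fallback = firstMatchingStub($specificFirst, 'api.example.com/orders');
```

With the wildcard first, the /users request receives the generic response. With the specific pattern first, /users gets its own response and /orders still falls through to the wildcard.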

Queue/Bus Fakes

You faked the bus. The job won't run. Its internal mocks are useless.

Bus::fake();

// Job won't execute - no need to mock its internals
ProcessOrder::withChain([
    new SendConfirmation(),
    new UpdateInventory()
])->dispatch($order);

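// Assert on structure, not execution.
// Chained jobs are stored serialized on the first job, so unserialize before inspecting.
Bus::assertDispatched(ProcessOrder::class, fn ($job) =>
    unserialize($job->chained[0]) instanceof SendConfirmation
);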

Partial Fakes

Sometimes you want some jobs to run, sometimes you don't. Be explicit.

// Only fake SpecificJob - let others run
Bus::fake([SpecificJob::class]);

Event Fakes

Events firing events can get messy. Fake only what you need.

// Only fake OrderUpdated - let nested events fire
Event::fake([OrderUpdated::class]);

$order->update(['status' => 'completed']);

Event::assertDispatched(OrderUpdated::class);

Or use fakeFor for scoped faking:

Event::fakeFor(function () use ($order) {
    $order->update(['status' => 'completed']);
    Event::assertDispatched(OrderUpdated::class);
});
// Events are real again here

For centralizing fakes in your base TestCase and trait-based faking strategies, see Part 5: The Abstraction Avalanche.

Timing Matters

Fake BEFORE you trigger. Always.

// CORRECT ORDER
Notification::fake();
Mail::fake();

$order = Order::factory()->create();  // afterCreating hooks won't send real notifications

Assertion Patterns

Boolean Properties

assertEquals() compares loosely, so any “truthy” value equals true. This can mask type bugs:

// Model WITHOUT boolean cast
class User extends Model
{
    // protected $casts = ['is_active' => 'boolean']; // Missing!
}

// Test passes - you think everything is fine
$user = User::factory()->create(['is_active' => true]);
$this->assertEquals(true, $user->fresh()->is_active);  // ✓ passes: 1 == true

// But your API returns: {"id": 1, "is_active": 1}
// Frontend JavaScript breaks: if (user.is_active === true) // never executes!

// A strict assertion catches the missing cast:
$this->assertSame(true, $user->fresh()->is_active);  // ✗ fails: 1 !== true
$this->assertTrue($user->fresh()->is_active);        // ✗ fails too: assertTrue() is strict

Add the boolean cast to your model, and the strict assertions pass. Without it, they fail in the test suite instead of in production.
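
The gap between a loose and a strict comparison, and how it surfaces in JSON, can be reproduced in plain PHP. The value 1 below is an assumption standing in for what an uncast boolean column typically hydrates to; some drivers return the string "1" instead.

```php
<?php
declare(strict_types=1);

// Stand-in for an uncast boolean column value (assumption: integer 1).
$raw = 1;

// Loose comparison coerces the integer to a boolean before comparing.
$looselyTrue = ($raw == true);

// Strict comparison also checks the type, and int is not bool.
$strictlyTrue = ($raw === true);

// JSON encoding preserves the PHP type, so the frontend sees a number
// unless the value has been cast to a real boolean first.
$uncastJson = json_encode(['id' => 1, 'is_active' => $raw]);
$castJson = json_encode(['id' => 1, 'is_active' => (bool) $raw]);
```

$looselyTrue is true while $strictlyTrue is false, and the two JSON strings differ only in is_active being 1 versus true. That difference is exactly what a strict === check in JavaScript trips over.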

Money Objects

Floats are liars. 0.1 + 0.2 !== 0.3 in floating point land. I wrote about this in Financial Precision in Agriculture Fintech. The short version: use Money objects with sufficient precision, and store them that way too.

// WRONG: Float comparison nightmares
$this->assertEquals($expected, $actual);

// CORRECT: Use Money's comparison
$this->assertTrue($expectedMoney->isEqualTo($actualMoney));
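
The float problem itself is easy to demonstrate without a Money library. The sketch below swaps in plain integer minor units (cents) purely for illustration; it is not the API of any particular Money package.

```php
<?php
declare(strict_types=1);

// Binary floats cannot represent 0.1 or 0.2 exactly, so the sum drifts.
$floatSum = 0.1 + 0.2;
$floatsMatch = ($floatSum === 0.3);

// Illustration only: the same amounts as integer cents compare exactly.
$centsSum = 10 + 20;
$centsMatch = ($centsSum === 30);

// If floats are unavoidable, compare with an explicit tolerance instead.
$closeEnough = abs($floatSum - 0.3) < 1e-9;
```

$floatsMatch is false, while $centsMatch and $closeEnough are both true. An exact representation makes equality meaningful, which is what a Money object's own comparison method relies on.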

Exception Assertions

Pattern 1: Use $this->fail() for try/catch

public function it_throws_validation_exception(): void
{
    try {
        OrderValidator::validate($order);

        $this->fail('ValidationException was not thrown');
    } catch (ValidationException $e) {
        $this->assertEquals('expected', $e->getMessage());
    }
}

Pattern 2: Use expectException() (Preferred)

public function it_throws_validation_exception(): void
{
    $this->expectException(ValidationException::class);
    $this->expectExceptionMessage('expected');

    OrderValidator::validate($order);
}

Risky Test Prevention

PHPUnit marks tests as risky when they have no assertions. This often indicates a problem with the test.

Before (Risky - no assertion if exception not thrown):

#[Test]
public function it_throws_validation_exception(): void
{
    try {
        OrderValidator::validate($order);
    } catch (ValidationException $e) {
        $this->assertEquals('expected', $e->getMessage());
    }
    // If no exception is thrown, test passes with no assertions = risky!
}

After (Explicit failure if no exception):

#[Test]
public function it_throws_validation_exception(): void
{
    try {
        OrderValidator::validate($order);

        $this->fail('ValidationException was not thrown');
    } catch (ValidationException $e) {
        $this->assertEquals('expected', $e->getMessage());
    }
}

What's Next

Missed the previous part? Part 3: The Determinism Principle covers time control, random values, and DataProvider timing issues.

Part 5: The Abstraction Avalanche covers the test infrastructure itself:

  • The base TestCase architecture

  • Trait organization and naming conventions

  • Authentication helpers and Passport patterns

  • Service faking strategy

See you there.


The Flaky Test Chronicles is a series documenting what we learned from 300+ commits of test suite cleanup. Remember: parent::tearDown() goes last. Your sanity depends on it.