Restore Confidence in Test Automation – part 2

By Eduard Dubilyer, CTO & Automation Testing Expert at Skipper Soft

In the previous article, I discussed how flaky tests quietly erode trust in your automation strategy.

And I ended with this:

You don’t need to accept this as normal anymore.

Because today, we have powerful AI and ML-driven tools that help detect, isolate, and fix flakiness before it costs you a release.

But here’s the key: tools only work if you use them well.

First, What Is a Test Flakiness Tracker?

It’s a tool that connects to your CI pipeline and:

  • Tracks test results across builds
  • Flags flaky behavior (inconsistent results without code changes)
  • Highlights patterns such as async failures and shared-state issues
  • Surfaces root cause suggestions using ML

Think of it as a test reliability radar built into your dev workflow.
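To make the core idea concrete, here is a minimal sketch of the "inconsistent results without code changes" check. It is not how any particular tracker works internally; the RunRecord shape and the find_flaky_tests name are assumptions made for illustration, and a real tool would read this data from your CI provider.

```python
from collections import defaultdict
from typing import Iterable, NamedTuple


class RunRecord(NamedTuple):
    """One test result from one CI build (hypothetical shape)."""
    test_id: str     # e.g. "test_login"
    commit_sha: str  # commit the build ran against
    passed: bool


def find_flaky_tests(runs: Iterable[RunRecord]) -> set[str]:
    """Return tests that both passed and failed on the same commit."""
    outcomes: dict[tuple[str, str], set[bool]] = defaultdict(set)
    for run in runs:
        outcomes[(run.test_id, run.commit_sha)].add(run.passed)
    # Two distinct outcomes for one (test, commit) pair means the result
    # changed while the code did not.
    return {test_id for (test_id, _sha), seen in outcomes.items() if len(seen) == 2}


runs = [
    RunRecord("test_login", "abc123", True),
    RunRecord("test_login", "abc123", False),  # same commit, different result
    RunRecord("test_cart", "abc123", False),
    RunRecord("test_cart", "def456", True),    # result changed along with the code
]
print(find_flaky_tests(runs))  # prints {'test_login'}
```

Note that test_cart is not flagged: its result changed between commits, which may simply be a real fix.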

The Tools We Recommend (and Use with Clients)

Here are five of the most effective flake-tracking platforms we’ve integrated for startup teams:

1. BuildPulse

Simple GitHub CI integration, flake scoring, visual dashboards. Great for early-stage teams.

2. Launchable

ML-based flake detection + test optimization. Ideal for larger, growing test suites.

3. CircleCI Test Insights

If you’re already on CircleCI, this built-in dashboard shows flake rates and trends.

4. FlakyTestDetector (Open Source)

For teams that want to build their own script-based tracking at the CI level.

5. Playwright/Cypress Observability Tools

Native dashboards for test retries, stability, and failure reasons. Great for frontend teams.

Now Here’s Where It Fails: Passive Usage

Installing the tool isn’t the hard part.

Using it well is.

I’ve seen teams install BuildPulse or Launchable… and then ignore the results.

Or worse, they auto-quarantine flaky tests and never go back.

That’s not quality. That’s avoidance.

What We Recommend for Real Impact

Here’s how we coach teams to turn insights into action:

1. Expose Flakiness Publicly

Use a tag, dashboard widget, or Slack bot to publish the top flaky tests every week.
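As a rough sketch, the weekly summary can be as simple as counting flaky events per test and formatting the top few. The top_flaky_report function and the input list are hypothetical; how the text gets posted (Slack, dashboard, email) is left out on purpose.

```python
from collections import Counter


def top_flaky_report(flaky_events: list[str], limit: int = 5) -> str:
    """Format the most frequently flaky tests as a plain-text message.

    flaky_events holds one test ID per flaky run observed this week.
    """
    counts = Counter(flaky_events)
    lines = ["Top flaky tests this week:"]
    for rank, (test_id, count) in enumerate(counts.most_common(limit), start=1):
        lines.append(f"{rank}. {test_id}: {count} flaky run(s)")
    return "\n".join(lines)


events = ["test_login", "test_cart", "test_login", "test_search", "test_login", "test_cart"]
print(top_flaky_report(events, limit=2))
```

The returned string can be handed to whatever channel your team already reads.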

2. Set a Flake Budget

Agree on a flakiness threshold (e.g., <2%). If the suite exceeds it, a sprint item must address it.
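A budget check like this can run as a small script at the end of a pipeline or before sprint planning. The sketch below assumes the flake rate is defined as flaky runs divided by total runs, which is one reasonable definition among several; the function names and the FLAKE_BUDGET constant are illustrative.

```python
FLAKE_BUDGET = 0.02  # the 2% example threshold; pick your own


def flake_rate(flaky_runs: int, total_runs: int) -> float:
    """Share of test runs that behaved flakily (0.0 when nothing ran)."""
    if total_runs == 0:
        return 0.0
    return flaky_runs / total_runs


def is_within_budget(flaky_runs: int, total_runs: int,
                     budget: float = FLAKE_BUDGET) -> bool:
    """True while the flake rate stays strictly below the budget."""
    return flake_rate(flaky_runs, total_runs) < budget


print(is_within_budget(10, 1000))  # 1% of runs flaky: prints True
print(is_within_budget(30, 1000))  # 3% of runs flaky: prints False
```

When the check returns False, that is the trigger for the sprint item.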

3. Make It Part of PR and Retro Culture

Ask in code reviews:

“Is this test stable?”

4. Pair AI with Engineering Review

Let tools surface patterns, but have code owners confirm the fix or refactor direction.

5. Use Failure Pattern Clustering

Look for systemic issues: shared state, timing-based assertions, and environment variance.

Then, fix the pattern, not just the test.
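If your tool does not cluster failures for you, a crude approximation is to bucket failure messages by keyword. To be clear, this is a rule-based stand-in, not ML clustering, and the PATTERNS table below is a hypothetical starting point that you would tune to your own error messages.

```python
import re
from collections import defaultdict

# Hypothetical keyword rules, checked in order; adjust to your own failures.
PATTERNS = {
    "timing": re.compile(r"timeout|timed out|not visible after", re.IGNORECASE),
    "shared state": re.compile(r"already exists|duplicate key|stale", re.IGNORECASE),
    "environment": re.compile(r"connection refused|no such file", re.IGNORECASE),
}


def cluster_failures(failures: dict[str, str]) -> dict[str, list[str]]:
    """Group test IDs by the first pattern their failure message matches."""
    clusters: dict[str, list[str]] = defaultdict(list)
    for test_id, message in failures.items():
        label = next(
            (name for name, rx in PATTERNS.items() if rx.search(message)),
            "unclassified",
        )
        clusters[label].append(test_id)
    return dict(clusters)


failures = {
    "test_checkout": "TimeoutError: element not visible after 5000ms",
    "test_signup": "IntegrityError: duplicate key value violates unique constraint",
    "test_profile": "Timed out waiting for response",
    "test_upload": "ConnectionError: connection refused",
}
for label, tests in cluster_failures(failures).items():
    print(label, tests)
```

A bucket with many tests in it points to a systemic cause worth fixing once, rather than patching each test separately.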
