Analytics fundamentals

A/B Testing

A/B testing is an experiment where you show two versions of something to different groups of users to see which performs better.

What it means

An A/B test is the cleanest way to know if a change actually helps. You take two versions (A and B), randomly show each to half your traffic, measure the result, and pick the winner. The randomization is what makes it work: it removes confounding variables like time of day, traffic source, or user type.
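In practice, the random split is usually implemented as deterministic bucketing rather than a fresh coin flip on every pageview, so each user keeps seeing the same variant across visits. A minimal sketch in Python (the function name and experiment key are illustrative, not from any particular tool):

```python
import hashlib

def assign_variant(user_id: str, experiment: str, variants=("A", "B")) -> str:
    """Deterministically assign a user to a variant.

    Hashing the user id (salted with the experiment name) gives each
    user a stable bucket, so they see the same variant on every visit,
    while traffic stays evenly split across variants.
    """
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    bucket = int(digest, 16) % len(variants)
    return variants[bucket]

# The split is stable per user and roughly 50/50 over many users.
counts = {"A": 0, "B": 0}
for i in range(10_000):
    counts[assign_variant(f"user-{i}", "green-button")] += 1
```

Salting with the experiment name matters: it keeps assignments independent across experiments, so users bucketed into A for one test aren't systematically bucketed into A for the next.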

A/B tests are useful for headlines, button colors, layout changes, pricing, and onboarding flows. They're not useful for changes you can't roll back, big strategic decisions, or anything where the difference is too small to detect with your traffic volume.

The trickiest part of A/B testing isn't running the test; it's interpreting it. You need a large enough sample to detect the effect you care about, you need to wait long enough for the result to stabilize, and you need to resist "peeking" early and calling a winner before the test reaches statistical significance.

Why it matters

A/B testing replaces opinions with evidence. Without it, you make changes based on what you think will work, then attribute any change in metrics to your edit (when it might just be normal variance). With it, you know whether your change actually moved the metric or just rode the noise.

Example with real numbers

A concrete example showing how an A/B test plays out in practice.

Scenario

You want to know if a green CTA button converts better than your current orange one. You set up an A/B test: 50% of visitors see orange, 50% see green.

What it means

After 10,000 visitors per variant, the orange button converts at 4.2% and the green at 4.8%. The difference is statistically significant. Roll out green to 100% of traffic.
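Those numbers can be checked with a standard two-proportion z-test. The sketch below uses only the Python standard library, and the helper name is ours, not from a specific tool; with 420 vs. 480 conversions out of 10,000 each, it gives z ≈ 2.05 and p ≈ 0.04, just under the conventional 0.05 threshold:

```python
from math import erf, sqrt

def two_proportion_ztest(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference between two conversion rates."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)  # pooled rate under the null
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal CDF.
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
    return z, p_value

# Orange: 4.2% of 10,000 = 420 conversions; green: 4.8% = 480.
z, p = two_proportion_ztest(420, 10_000, 480, 10_000)
```

Note the result is fairly borderline: a pre-registered sample size and a fixed stopping point make a call like this much more robust.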

Common mistakes

Things people get wrong when running A/B tests.

Mistake 01

Calling a winner too early. With small samples, normal variance can look like a winner.

Mistake 02

Running too many tests at once. Overlapping experiments make it hard to attribute results.

Mistake 03

Testing changes too small to matter. Going from one shade of blue to another rarely produces detectable differences.

Mistake 04

Forgetting about novelty effects. A new design might lift conversion temporarily because it's new, then return to baseline.

How to track it

Use a dedicated A/B testing tool (PostHog, Optimizely; Google Optimize was discontinued in 2023) for proper randomization and statistical handling. For simple tests on landing pages, you can use two URLs and compare conversion rates over a set period.

Want to learn more concepts?

Browse the full glossary of product analytics terms.

Common questions about A/B testing

What is A/B testing?

A/B testing is an experiment where you show two versions of something to randomly split groups of users to determine which performs better on a chosen metric.

How long should an A/B test run?

Long enough to reach statistical significance with enough traffic. For most sites, that's at least 1-2 weeks per test. Shorter tests are usually unreliable.

How many visitors does an A/B test need?

It depends on the size of the effect you're measuring. Small effects (lifting from 4% to 4.5%) need tens of thousands of visitors per variant. Large effects (4% to 8%) need far fewer. Use a sample size calculator before starting.
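Those ballparks follow from the standard normal-approximation sample size formula for comparing two proportions. A sketch (the function name is illustrative; dedicated tools bundle equivalent calculators):

```python
from math import ceil
from statistics import NormalDist

def sample_size_per_variant(p_base, p_target, alpha=0.05, power=0.80):
    """Approximate visitors needed per variant for a two-proportion test.

    alpha is the two-sided significance level; power is the chance of
    detecting the effect if it's real. Standard normal approximation.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance = p_base * (1 - p_base) + p_target * (1 - p_target)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p_target - p_base) ** 2)

small = sample_size_per_variant(0.04, 0.045)  # tiny lift: ~25,000+ per variant
big = sample_size_per_variant(0.04, 0.08)     # doubling: only a few hundred
```

At 80% power, the 4% to 4.5% lift needs roughly 25,000 visitors per variant, while the 4% to 8% jump needs only a few hundred; real-world tests usually aim higher than this floor to absorb day-of-week effects.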

When should you not A/B test?

When you don't have enough traffic to reach significance, when the change is too small to matter, or when you can't roll back if the variant turns out worse. Sometimes user research is faster than A/B testing.

Track A/B testing automatically with Muro

Privacy-friendly product analytics that tells you what's working and what to fix next, in plain English.

$5/month after the trial. Cancel anytime.