Statistical Significance
Statistical significance indicates that an observed result is unlikely to have occurred by random chance alone.
What it means
When you compare two numbers, they're almost never exactly equal. The question is whether the difference is real (meaningful and repeatable) or just noise (would disappear with more data). Statistical significance is the formal way to answer that question.
The most common measure is the p-value: the probability of seeing a result at least as extreme as the one you observed if there were no real effect. A p-value of 0.05 means that, under pure chance, a result this extreme would show up about 5% of the time. By convention, results with p-values below 0.05 are called 'statistically significant', meaning the result is unlikely enough under chance alone that we treat the effect as real.
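One way to build intuition for what a p-value measures is to simulate the "no real effect" world directly. The sketch below is a hypothetical setup, not from the original text: a fair coin flipped 100 times, where we observe 60 heads and ask how often pure chance would produce a result at least that extreme.

```python
# A minimal sketch of what a p-value measures, using simulation.
# Hypothetical setup: a fair coin (the "no real effect" world) flipped
# 100 times. We observed 60 heads; how often does chance alone produce
# a result at least this lopsided in either direction?
import random

random.seed(0)

observed_heads = 60
flips = 100
trials = 50_000

# Count simulated null-world outcomes at least as extreme as 60/100
# (two-sided: 60 or more heads, or 40 or fewer).
extreme = 0
for _ in range(trials):
    heads = sum(random.random() < 0.5 for _ in range(flips))
    if heads >= observed_heads or heads <= flips - observed_heads:
        extreme += 1

p_value = extreme / trials
print(f"simulated p-value: {p_value:.3f}")
# Lands near the exact binomial answer of about 0.057 -- i.e. a
# 60/100 split is surprising under chance, but not quite below 0.05.
```

The simulated value approximates the exact two-sided binomial probability (about 0.057). This is the core idea behind every p-value calculation: how rare would this result be if nothing real were going on?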
Significance does not mean importance. A change can be statistically significant but practically meaningless (a 0.1% lift in a metric that doesn't matter). Conversely, a result might be practically important but not yet statistically significant because you haven't collected enough data.
Why it matters
Without statistical significance, you'll declare winners that aren't really winners. You'll roll out changes that looked good but were just lucky. Significance is what separates 'it worked' from 'we got lucky once'. This is especially important for A/B testing and product changes.
Example with real numbers
A concrete example showing how significance works in practice.
Scenario
Variant A converts at 4.2%. Variant B converts at 4.8%. The p-value is 0.04.
What it means
The 0.6 percentage point lift is statistically significant (p < 0.05). You can roll out B with confidence. If the p-value were 0.20, the difference might just be noise; collect more data before deciding.
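The scenario above doesn't state sample sizes, so the sketch below assumes a hypothetical 10,000 visitors per variant, a figure chosen because it roughly reproduces the quoted p-value of 0.04 via a standard two-proportion z-test.

```python
# A sketch of the two-proportion z-test behind the scenario above.
# The sample size (10,000 visitors per variant) is an assumption,
# not from the original text; it happens to roughly reproduce the
# quoted p-value of 0.04.
from math import erf, sqrt

def two_proportion_p_value(conv_a, n_a, conv_b, n_b):
    """Two-sided p-value for a difference in two conversion rates."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    # Pooled rate under the null hypothesis that A and B are identical.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided tail probability of the standard normal distribution.
    return 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))

# Variant A: 420/10,000 = 4.2%; Variant B: 480/10,000 = 4.8%
p = two_proportion_p_value(420, 10_000, 480, 10_000)
print(f"p-value: {p:.3f}")  # close to the 0.04 quoted in the scenario
```

Note how sensitive this is to sample size: the same 4.2% vs 4.8% split with only 1,000 visitors per variant would not come close to significance.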
Common mistakes
Things people get wrong when measuring statistical significance.
Mistake 01
Confusing statistical significance with practical importance. A statistically significant 0.1% lift on a metric that doesn't matter is still not a win.
Mistake 02
Stopping a test the moment it hits p < 0.05. Checking repeatedly and stopping at the first significant result is called 'peeking', and it inflates false positives; fix your sample size in advance or use a sequential testing method designed for repeated looks.
Mistake 03
Running too many simultaneous tests. The more tests you run, the more false positives you'll see by chance alone.
Mistake 04
Treating p = 0.06 as failure and p = 0.04 as success. The line is somewhat arbitrary.
How to track it
Most A/B testing tools calculate statistical significance automatically. For manual calculations, online calculators take your sample sizes and conversion rates and return p-values.
Related concepts
Other terms worth learning if you're studying this one.
Common questions about statistical significance
What is statistical significance?
Statistical significance means a result is unlikely to be due to random chance alone. It's commonly measured by p-value, with p < 0.05 typically considered significant.
What is a p-value?
A p-value is the probability that you'd see a result at least as extreme as the one observed by random chance if there were no real effect. A p-value of 0.05 means that, with no real effect, a result this extreme would show up about 5% of the time.
Does statistical significance mean a result is important?
No. Significance tells you a result is probably real. Importance tells you it matters. A statistically significant 0.1% lift might be too small to matter, and a meaningful trend might not yet be statistically significant.