Lesson 8: Variance reduction
In this lesson, you learn what variance reduction is and why it causes the means shown in Confidence to differ slightly from raw group averages. The treatment variant effect is interpreted in the same way as without variance reduction. The adjustment simply makes the estimate more precise.
When you look at the results for a metric in Confidence, the control variant mean and treatment variant mean shown may not be exactly the same as the raw averages for those groups. This is because variance reduction is active by default for most metrics. Understanding what this adjustment does and what it does not do is important for reading results correctly.
The core idea: use pre-experiment data to reduce noise
Every metric has natural variation. Some users will use a feature a lot; others will barely touch it. Much of this variation has nothing to do with the experiment. It reflects differences between users that were already there before the experiment started.
Variance reduction works by using each user's pre-experiment behavior on the metric to predict and cancel out this pre-existing variation. Specifically, Confidence looks at how each user behaved on the metric before they entered the experiment, and uses that data to produce a more precise estimate of the treatment variant effect.
Think of it this way: if you know that certain users were already heavy users before the experiment, you can account for that when estimating whether the treatment variant changed their behavior. Without this adjustment, that pre-existing variation adds noise to your estimate. With it, much of that noise is removed.
The pre-experiment data used for variance reduction must come from before the user entered the experiment, so it cannot be influenced by the treatment variant. This is what makes the adjustment valid.
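The idea can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not Confidence's implementation: the data is synthetic, the function name `simulate_estimates` and all parameter values are made up for illustration, and for simplicity it uses a single adjustment coefficient (the classical CUPED form) rather than one regression per group.

```python
import random
import statistics


def simulate_estimates(n_users=1000, true_effect=0.5, n_runs=100, seed=0):
    """Simulate repeated experiments; return (raw, adjusted) effect estimates."""
    rng = random.Random(seed)
    raw, adjusted = [], []
    for _ in range(n_runs):
        # x: pre-experiment value, t: 1 if treated, y: in-experiment value.
        x = [rng.gauss(10, 3) for _ in range(n_users)]
        # Balanced split; x is drawn at random, so assignment is independent of x.
        t = [i % 2 for i in range(n_users)]
        y = [xi + rng.gauss(0, 1) + true_effect * ti for xi, ti in zip(x, t)]

        # Single adjustment coefficient (assumption: classical CUPED form).
        mx, my = statistics.fmean(x), statistics.fmean(y)
        theta = (sum((a - mx) * (b - my) for a, b in zip(x, y))
                 / sum((a - mx) ** 2 for a in x))
        y_adj = [yi - theta * (xi - mx) for xi, yi in zip(x, y)]

        def diff(values):
            treated = [v for v, ti in zip(values, t) if ti == 1]
            control = [v for v, ti in zip(values, t) if ti == 0]
            return statistics.fmean(treated) - statistics.fmean(control)

        raw.append(diff(y))
        adjusted.append(diff(y_adj))
    return raw, adjusted


raw, adjusted = simulate_estimates()
print("variance of raw estimates:     ", statistics.variance(raw))
print("variance of adjusted estimates:", statistics.variance(adjusted))
```

Both sets of estimates centre on the same true effect. The adjusted ones scatter less across repeated runs, because the pre-experiment value explains most of the user-to-user variation in this synthetic data.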
What changes and what does not
Because of the adjustment, the control variant and treatment variant means shown in Confidence are not the raw group averages. They are slightly shifted versions that have been adjusted to account for pre-existing differences between groups.
However, the estimated treatment variant effect and the way you interpret it remain the same. The point estimate (the relative % change) and the confidence interval are still your best estimate of the observed treatment variant effect. The adjustment makes them more precise, not different in meaning.
The relative change is calculated as the adjusted treatment mean minus the adjusted control mean, divided by the unadjusted control variant mean, so the percentage you see is still relative to the actual (unadjusted) control baseline.
For example, suppose a metric shows a control variant mean of 195.8 and a treatment variant mean of 196.5 in Confidence. These are adjusted values. The raw group averages might have been 196.1 and 196.7. The relative change, (196.5 - 195.8) / 196.1 ≈ +0.36%, and the confidence interval are based on the adjusted means and represent a more precise estimate of the observed treatment variant effect than you would get from the raw averages alone.
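The arithmetic behind the relative change can be written out as a short sketch. The function name `relative_change` is hypothetical. The numbers are the ones from the example above, including the illustrative raw control average of 196.1.

```python
def relative_change(adj_treatment_mean, adj_control_mean, raw_control_mean):
    """Adjusted difference in means, relative to the unadjusted control mean."""
    return (adj_treatment_mean - adj_control_mean) / raw_control_mean


change = relative_change(196.5, 195.8, 196.1)
print(f"{change:+.2%}")  # prints +0.36%
```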
The variance reduction percentage
The variance reduction for a metric tells you how much of the original variance was removed by the adjustment. A variance reduction of 60% means that the adjusted estimate has 60% less variance than the raw estimate. Because the variance of an estimate shrinks in proportion to the number of users, this is equivalent to having 1 / (1 - 0.6) = 2.5 times as many users without the adjustment.
A high variance reduction means the pre-experiment data was strongly predictive of post-experiment behavior. A variance reduction of 0% means the adjustment removed no variance, as is the case when no adjustment was applied.
In Confidence, you can see the variance reduction percentage for each metric in the Detailed results view.
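To make the percentage concrete, the sketch below converts a variance reduction into the equivalent sample size multiplier. Both function names are hypothetical. The second function rests on an assumption not stated in this lesson: for a linear adjustment with a single pre-experiment covariate, the fraction of variance removed equals the squared correlation between the pre-experiment and in-experiment values, which is a standard result for that setup.

```python
def equivalent_sample_multiplier(variance_reduction):
    """How many times more users an unadjusted analysis needs for equal variance."""
    if not 0 <= variance_reduction < 1:
        raise ValueError("variance_reduction must be in [0, 1)")
    return 1 / (1 - variance_reduction)


def variance_reduction_from_correlation(rho):
    """Assumption: single-covariate linear adjustment removes rho**2 of the variance."""
    return rho ** 2


for vr in (0.0, 0.3, 0.6, 0.8):
    print(f"{vr:.0%} variance reduction ~ {equivalent_sample_multiplier(vr):.2f}x users")
```

Under that assumption, a correlation of roughly 0.77 between pre-experiment and in-experiment values is what it takes to reach a 60% variance reduction.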
Use the interactive below to see how variance reduction narrows the confidence interval compared to no adjustment, for the same sample size and metric noise.
Variance reduction and CI width
The top bar shows the CI without variance reduction; the bottom bar shows it after variance reduction. Both are centred on the same observed treatment effect. Use the direction toggle to set which way the metric should move.
Try the following:
- Set variance reduction to 0%. This is the CI you would get without any adjustment.
- Increase variance reduction to 60%. Notice how the CI narrows: that is extra precision from pre-experiment data, with no additional users.
- Now try increasing the sample size instead. Both approaches narrow the CI, but variance reduction does so without requiring any additional users.
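The same comparison can be approximated in a few lines. This is a rough sketch under simplifying assumptions (two equally sized groups, the same metric standard deviation in both, and a normal approximation for a 95% interval). The function name `ci_half_width` and the example numbers are illustrative only, not taken from Confidence.

```python
import math


def ci_half_width(sigma, n_per_group, variance_reduction=0.0, z=1.96):
    """Approximate 95% CI half-width for a difference in means."""
    # Variance of the difference between two independent group means,
    # scaled down by the fraction of variance the adjustment removes.
    variance = 2 * sigma ** 2 * (1 - variance_reduction) / n_per_group
    return z * math.sqrt(variance)


baseline = ci_half_width(sigma=10, n_per_group=1000)
with_reduction = ci_half_width(sigma=10, n_per_group=1000, variance_reduction=0.6)
with_more_users = ci_half_width(sigma=10, n_per_group=2500)

print(f"no adjustment:          +/- {baseline:.3f}")
print(f"60% variance reduction: +/- {with_reduction:.3f}")
print(f"2.5x users instead:     +/- {with_more_users:.3f}")
```

Under these assumptions the interval width scales with the square root of the remaining variance, so a 60% variance reduction narrows the interval by about 37%, which is the same narrowing as multiplying the sample size by 2.5.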
Why this matters
Variance reduction is one of the main reasons sample sizes in Confidence can be smaller than in tools that do not use this technique. It improves the precision of every estimate without requiring more users. When you see a narrow confidence interval for a metric, variance reduction is often a contributing factor.
You do not need to think about variance reduction when reading results. Just know that the adjustment is there, it makes estimates more reliable, and you interpret the numbers the same way you would without it.
Notes for nerds
The regression adjustment
The variance reduction method used in Confidence is covered in detail in the variance reduction lesson in the intro to metrics course, and its effect on required sample sizes is covered in the sample size calculation III course. In short, the method fits separate regressions of the post-experiment outcome on the pre-experiment variable for each group, then adjusts the treatment variant effect estimate accordingly. The classical CUPED formulation uses a single adjustment coefficient for both groups. In large samples, fitting separate regressions per group is never worse, and it is strictly better whenever users respond differently to treatment, which is the typical case. The two are equivalent only in a perfectly balanced 50/50 experiment (Negi and Wooldridge, 2020).
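A minimal sketch of the separate-regression idea follows. It is an illustration under stated assumptions, not Confidence's implementation: it uses one pre-experiment covariate, an ordinary least squares slope per group, and adjusts each group mean to the covariate mean of both groups combined. The helper names `ols_slope` and `adjusted_means` and the simulated data are hypothetical.

```python
import random
import statistics


def ols_slope(x, y):
    """Least-squares slope of y on x."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def adjusted_means(x_c, y_c, x_t, y_t):
    """Regression-adjusted (control, treatment) means with one slope per group."""
    x_bar = statistics.fmean(x_c + x_t)  # covariate mean over both groups
    adjusted = []
    for x, y in ((x_c, y_c), (x_t, y_t)):
        slope = ols_slope(x, y)
        # Shift the group mean to where it would be at the shared covariate mean.
        adjusted.append(statistics.fmean(y) - slope * (statistics.fmean(x) - x_bar))
    return adjusted[0], adjusted[1]


rng = random.Random(1)
n = 5000
x_c = [rng.gauss(10, 3) for _ in range(n)]
x_t = [rng.gauss(10, 3) for _ in range(n)]
y_c = [x + rng.gauss(0, 1) for x in x_c]
y_t = [1.2 * x + rng.gauss(0, 1) for x in x_t]  # heavier users respond more
control, treatment = adjusted_means(x_c, y_c, x_t, y_t)
print("adjusted effect:", treatment - control)  # true average effect is 0.2 * 10 = 2.0
```

In this synthetic data the treatment changes the slope, so a single shared coefficient would fit neither group well. Estimating the slope separately lets each group's adjustment follow its own relationship with the pre-experiment value.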
Bounds on covariate selection
In principle, any pre-experiment covariate can be included in the regression to reduce variance further. In practice, the pre-experiment measurement of the metric itself is hard to beat, and the gains from going beyond it are bounded. Even the most sophisticated feature engineering can narrow the confidence interval by at most a further 29% beyond what the simple pre-experiment metric already achieves (Ting and Hung, 2023).
Adjusted control means in multi-variant experiments
In experiments with multiple treatment variants, the variance-adjusted control variant mean shown for a given comparison uses only the data from the variants involved in that specific comparison. As a result, the adjusted control variant mean can differ slightly between comparisons. This is expected and correct. It does not indicate an error in the data.
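One way this can arise is sketched below. The mechanism is an assumption made for illustration, not a description of Confidence's implementation: the control mean is adjusted to the pre-experiment covariate mean of the two groups in the comparison, so a different comparison group shifts the reference point. The function name and the tiny data set are hypothetical, and the gap is exaggerated compared with a real randomized experiment.

```python
import statistics


def adjusted_control_mean(x_control, y_control, x_other):
    """Control mean adjusted to the covariate mean of control plus one other variant."""
    mx, my = statistics.fmean(x_control), statistics.fmean(y_control)
    slope = (sum((a - mx) * (b - my) for a, b in zip(x_control, y_control))
             / sum((a - mx) ** 2 for a in x_control))
    # Assumption: the reference point uses only the two groups being compared.
    x_bar = statistics.fmean(x_control + x_other)
    return my - slope * (mx - x_bar)


x_control, y_control = [1, 2, 3, 4], [2, 4, 6, 8]
x_variant_a = [1, 2, 3, 5]  # slightly higher pre-experiment values by chance
x_variant_b = [1, 2, 3, 3]  # slightly lower pre-experiment values by chance

print(adjusted_control_mean(x_control, y_control, x_variant_a))  # 5.25
print(adjusted_control_mean(x_control, y_control, x_variant_b))  # 4.75
```

The raw control mean is 5.0 in both cases. Only the reference point for the adjustment changes between the two comparisons.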