Lesson 3: Variance reduction
This lesson explains how variance reduction (using regression adjustment) affects the required sample size calculation. The bottom line is that variance reduction allows for a smaller sample size to achieve the same power.
As you learned in Lesson 2 in the level I course on sample size calculation, the variance of a metric affects the required sample size. The larger the natural variation in a metric across users, the larger the sample we need to detect the effect of interest (the MDE) with a given power. In this lesson, you will learn how variance reduction changes the required sample size calculation.
Variance reduction
Variance reduction has become an umbrella term for any technique that reduces the variance of the treatment effect estimator as compared to the difference-in-means estimator.
Regression adjustment
Regression adjustment was popularized for online experiments as 'CUPED' (Controlled-experiment Using Pre-Experiment Data) by Deng et al. (2013). Although that paper develops new methods for ratio metrics and points out the efficiency of using pre-exposure data of the same metric as the covariate, the idea of using regression adjustment to reduce variance in randomized experiments has been around for decades, dating back at least to the 1930s.
The idea with regression adjustment is to use covariates that can explain variation in the metric that is not due to the treatment. This is done by fitting a regression model with the metric as the outcome variable and both the treatment assignment and the covariates as predictor variables. The treatment effect is then estimated as the coefficient of the treatment assignment variable in the regression model.
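The description above can be illustrated with a small simulation. This is a minimal sketch on made-up data, not a production implementation: the sample size, the true effect of 0.5, the single pre-experiment covariate `pre`, and the data-generating model are all assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000
true_effect = 0.5  # hypothetical treatment effect, chosen for illustration

# Simulated data: a pre-experiment covariate, random assignment, and an outcome
# that depends on both (all parameters are assumptions).
pre = rng.normal(10.0, 2.0, size=n)
treat = rng.integers(0, 2, size=n)  # 0 = control, 1 = treatment
y = 1.0 + 0.8 * pre + true_effect * treat + rng.normal(0.0, 1.0, size=n)

# Difference-in-means estimator: ignores the covariate.
diff_in_means = y[treat == 1].mean() - y[treat == 0].mean()

# Regression adjustment: regress the metric on an intercept, the treatment
# dummy, and the (centered) covariate; the treatment coefficient is the estimate.
X = np.column_stack([np.ones(n), treat, pre - pre.mean()])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
adjusted = coef[1]

print(f"difference in means: {diff_in_means:.3f}")
print(f"regression adjusted: {adjusted:.3f}")
```

Both estimators target the same effect. In this setup the covariate explains most of the variation in the outcome, so the adjusted estimate will typically land closer to the true effect.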
The variance reduction factor is simply the proportion of the variance in the metric that can be explained by the covariates. The variance reduction factor is always between 0 and 1. The closer it is to 1, the more variance we can explain and the more we can reduce the variance of the treatment effect estimator. If the variance of the unadjusted (difference-in-means) treatment effect estimator is X, and the variance reduction factor is V, then the variance of the variance-reduced treatment effect estimator is X(1-V).
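The relationship X(1-V) can be checked with a small Monte Carlo experiment. The sketch below is illustrative only: the data-generating model, the sample size of 400, and the 2000 replications are assumptions, chosen so that the covariate explains 64% of the metric's variance (V = 0.64).

```python
import numpy as np

def simulate_once(rng, n=400):
    """Return (difference-in-means, regression-adjusted) estimates for one A/A test."""
    pre = rng.normal(size=n)               # covariate with variance 1
    treat = rng.permutation(n) < n // 2    # exactly half of the users treated
    y = 0.8 * pre + rng.normal(scale=0.6, size=n)  # no true effect
    dim = y[treat].mean() - y[~treat].mean()
    X = np.column_stack([np.ones(n), treat, pre])
    adj = np.linalg.lstsq(X, y, rcond=None)[0][1]
    return dim, adj

rng = np.random.default_rng(1)
draws = np.array([simulate_once(rng) for _ in range(2000)])
var_dim, var_adj = draws.var(axis=0)  # X and the variance-reduced variance

# Share of the metric's variance explained by the covariate under this model.
V = 0.8**2 / (0.8**2 + 0.6**2)
ratio = var_adj / var_dim  # should be close to 1 - V

print(f"V = {V:.2f}, 1 - V = {1 - V:.2f}, simulated ratio = {ratio:.2f}")
```

Up to simulation noise, the ratio of the two estimator variances should sit near 1 - V.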
Since the required sample size is proportional to the variance of the treatment effect estimator, variance reduction lowers the required sample size by the same proportion. For example, if a metric has 40% variance reduction, then the required sample size is 40% smaller than it would be without variance reduction. For the purpose of experimentation, variance reduction might as well be called "sample size reduction".
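To make the 40% example concrete, here is a minimal sketch of a sample size calculation. It assumes a standard two-sided z-test with equal allocation to two groups, which may differ in detail from the formula used in Lesson 2. The function name `required_n_per_group` and all input values are hypothetical.

```python
import math
from statistics import NormalDist

def required_n_per_group(sigma2, mde, alpha=0.05, power=0.8, variance_reduction=0.0):
    """Per-group sample size for a two-sided z-test with equal allocation.

    sigma2 is the metric variance, mde the absolute minimum detectable effect,
    and variance_reduction the factor V between 0 and 1.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    effective_var = sigma2 * (1 - variance_reduction)  # variance scales by (1 - V)
    return math.ceil(2 * (z_alpha + z_beta) ** 2 * effective_var / mde ** 2)

# Hypothetical metric: variance 4.0, absolute MDE 0.1.
n_plain = required_n_per_group(sigma2=4.0, mde=0.1)
n_reduced = required_n_per_group(sigma2=4.0, mde=0.1, variance_reduction=0.4)

print(n_plain, n_reduced, round(n_reduced / n_plain, 3))
```

Because the variance enters the formula as a multiplicative factor, the ratio of the two sample sizes is 1 - V (about 0.6 here), apart from rounding up to whole users.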
When the goal is to run experiments with samples as small as possible, a metric with slightly higher variance than another can still be the more efficient choice if it has a higher variance reduction factor.
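A quick numeric illustration of this point, using two made-up metrics. The variances and variance reduction factors below are hypothetical, and the comparison assumes both metrics are powered for the same absolute MDE.

```python
# Two hypothetical metrics: B has higher raw variance but more variance reduction.
metric_a = {"variance": 1.0, "variance_reduction": 0.10}
metric_b = {"variance": 1.2, "variance_reduction": 0.50}

def effective_variance(metric):
    """Variance that remains after regression adjustment: variance * (1 - V)."""
    return metric["variance"] * (1 - metric["variance_reduction"])

eff_a = effective_variance(metric_a)
eff_b = effective_variance(metric_b)

print(f"metric A effective variance: {eff_a:.2f}")
print(f"metric B effective variance: {eff_b:.2f}")
```

Metric B starts with 20% more variance, yet after adjustment it has the lower effective variance and would therefore need the smaller sample.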
How does variance reduction affect the required sample size in experiments?
What is the variance reduction factor in regression adjustment?
If a metric has a variance reduction factor of 40%, how does this impact the required sample size?
Notes for nerds
We can "switch" the estimator of the treatment effect because there are several unbiased estimators of the estimand we are interested in. Estimand is a fancy word for "causal effect of interest", and is used extensively in the economics literature. Note the wording here: it is the treatment effect estimator and its variance that we are concerned with. This variance is a function of the variance of the metric and the sample size. Technically, when we say that we "reduce the variance of a metric", what we really do is change the estimator of the treatment effect. Although there are many estimators of the treatment effect in an A/B test besides the difference in means, 'variance reduction' usually refers to the use of regression adjustment to estimate the treatment effect.
CUPED isn't exactly the same as regression adjustment. Instead of adding the covariate to a regression together with a treatment dummy variable, CUPED takes two steps: first the outcome is regressed on the covariate, and then the residuals are regressed on the treatment dummy or, equivalently, the difference in means is calculated for the residuals. Two great reads on regression-type adjustments are Negi and Wooldridge (2020) and Jin and Ba (2021).
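The two procedures can be compared side by side on simulated data. This is a sketch under assumed conditions (a single covariate, a pooled slope estimate for the first step, and made-up parameters), not a reproduction of the method in Deng et al. (2013).

```python
import numpy as np

rng = np.random.default_rng(2)
n = 5_000
pre = rng.normal(size=n)             # pre-experiment covariate
treat = rng.integers(0, 2, size=n)   # random assignment
y = 2.0 + 1.5 * pre + 0.3 * treat + rng.normal(size=n)  # hypothetical effect: 0.3

# Two-step (CUPED-style): regress the outcome on the covariate alone,
# then take the difference in means of the residualized outcome.
theta = np.cov(pre, y)[0, 1] / np.var(pre, ddof=1)  # pooled slope of y on pre
resid = y - theta * (pre - pre.mean())
cuped = resid[treat == 1].mean() - resid[treat == 0].mean()

# One-step regression adjustment: covariate and treatment dummy in one model.
X = np.column_stack([np.ones(n), treat, pre])
ols = np.linalg.lstsq(X, y, rcond=None)[0][1]

print(f"two-step estimate: {cuped:.4f}")
print(f"one-step estimate: {ols:.4f}")
```

The two estimates are not identical, because the slope on the covariate is estimated slightly differently in each procedure, but in a randomized experiment with a sample of this size the gap is typically negligible.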