Lesson 3: Sampling distribution of the difference-in-means estimator

Using probability theory, we can work out how the difference-in-means estimator will vary across all possible samples and treatment assignments without going through each of them. In fact, for large samples we even know the shape of the distribution that the estimator will have. For means and differences in means, the result that lets us do this is called the Central Limit Theorem. The Central Limit Theorem states that if the sample size is large enough, then the difference-in-means estimator will be approximately normally distributed around the true average treatment effect across random samples and treatment assignments.
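As a rough sketch, the statement can be written in symbols. The notation is our own assumption and not part of the lesson: the treatment and control sample means are \bar{Y}_T and \bar{Y}_C, the group sizes are n_T and n_C, the outcome variances in each group are \sigma_T^2 and \sigma_C^2, and \tau is the true average treatment effect. For large samples, approximately:

```latex
\hat{\tau} \;=\; \bar{Y}_T - \bar{Y}_C
\;\overset{\text{approx.}}{\sim}\;
\mathcal{N}\!\left(\tau,\; \frac{\sigma_T^2}{n_T} + \frac{\sigma_C^2}{n_C}\right)
```

The square root of the variance term is the standard error of the estimator. It shrinks as the two groups grow.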

You only observe a point estimate

Importantly, the observed difference in means in a given sample is not normally distributed, since it is just a fixed value. It is the difference-in-means estimator across random samples and treatment assignments that is approximately normally distributed. This means that if you ran the experiment many times, the differences in means you observed would be approximately normally distributed.

Simulation

In this simulation, we draw a random sample, split it randomly into treatment and control, and calculate the difference in means. We do this many times to see how the difference in means varies across random samples and treatment assignments. Note that there is no treatment effect in this simulation, so the variation in the difference in means is due only to random variation in the sample and the treatment assignment. The resulting distribution approximates the sampling distribution of the difference-in-means estimator, which is the distribution this estimator has across random samples and treatment assignments.
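The procedure described above can be sketched in a few lines of Python. This is a minimal illustration and not the code behind the lesson's simulation. The population (normal with mean 10 and standard deviation 2), the sample size of 200, and the function name simulate_diff_in_means are all assumptions chosen for the example.

```python
import numpy as np


def simulate_diff_in_means(n_sims=500, sample_size=200, seed=0):
    """Simulate the difference in means when there is no treatment effect."""
    rng = np.random.default_rng(seed)
    diffs = np.empty(n_sims)
    for i in range(n_sims):
        # Draw a random sample from a hypothetical population.
        sample = rng.normal(loc=10.0, scale=2.0, size=sample_size)
        # Randomly assign half of the sample to treatment, the rest to control.
        shuffled = rng.permutation(sample)
        treatment = shuffled[: sample_size // 2]
        control = shuffled[sample_size // 2:]
        # No effect is added to the treatment group, so any difference is noise.
        diffs[i] = treatment.mean() - control.mean()
    return diffs


diffs = simulate_diff_in_means()
print(f"mean of simulated differences: {diffs.mean():.3f}")
print(f"std of simulated differences:  {diffs.std(ddof=1):.3f}")
```

Plotting a histogram of diffs should give a bell-shaped picture centered near zero, since the true treatment effect in this sketch is zero.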

[Interactive simulation. Panels: Random Sample, Treatment, Control, Difference in means, Histogram of difference in means. Counter: Samples: 0 / 500]

The magic that probability theory and statistics bring us is that we know which distribution will be a good approximation of the 500 simulated difference-in-means estimates under the null (no treatment effect) before we have run the simulation. It works because of math!
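To make this concrete, here is a hedged sketch that first computes the standard error predicted by theory and only then runs a simulation to check it. The setup is an assumption for illustration: a skewed exponential population with mean 1 and variance 1, 100 units per group, and treatment and control groups drawn independently from that population, which stands in for sampling and then splitting.

```python
import numpy as np

rng = np.random.default_rng(1)
n_sims, n_per_group = 500, 100

# Hypothetical skewed population: exponential with mean 1, so variance is 1.
population_variance = 1.0

# Standard error predicted by theory, computed before any simulation is run.
predicted_se = np.sqrt(population_variance / n_per_group
                       + population_variance / n_per_group)

# Now simulate 500 experiments with no treatment effect.
treatment = rng.exponential(scale=1.0, size=(n_sims, n_per_group))
control = rng.exponential(scale=1.0, size=(n_sims, n_per_group))
diffs = treatment.mean(axis=1) - control.mean(axis=1)

simulated_se = diffs.std(ddof=1)
# A normal distribution puts about 95% of its mass within 1.96 SEs of its mean.
share_within = np.mean(np.abs(diffs) <= 1.96 * predicted_se)

print(f"predicted SE: {predicted_se:.4f}")
print(f"simulated SE: {simulated_se:.4f}")
print(f"share of estimates within 1.96 predicted SEs of zero: {share_within:.3f}")
```

Even though the individual outcomes are far from normal here, the simulated spread should land close to the predicted one, and the share within 1.96 standard errors should be close to 95%.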

The value of knowing the distribution of the difference-in-means estimator can't be overstated. It lets us observe one sample and still draw conclusions (make inferences) about the full population. More on that in the next lesson.

Notes for nerds

There are some technicalities in the Central Limit Theorem that we have glossed over. The theorem states that the difference-in-means estimator is approximately normally distributed around the true treatment effect if the sample size is large enough. The exact conditions under which it holds are a bit more nuanced, but for the purposes of this course, we can assume that it holds when the sample size is large enough. In principle, as long as the tails of the underlying data are not too fat (roughly, the outcome has finite variance), the Central Limit Theorem will hold.

There are ways of making inferences that are not based on the Central Limit Theorem. One example is the bootstrap, a resampling method that estimates the distribution of an estimator from the observed data without assuming a particular distribution for the data. The bootstrap is a powerful tool that can be used in many situations where the Central Limit Theorem doesn't hold. It is more computationally intensive, but there are some tricks to make it faster. See for example our blog post on bootstrap for quantiles.
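As a minimal sketch of the idea, the code below resamples each group with replacement and recomputes the difference in means many times. It shows a plain percentile bootstrap, not the faster variants mentioned above and not necessarily the method in the blog post. The data, the function name bootstrap_diff_in_means, and the choice of 2000 resamples are assumptions made for illustration.

```python
import numpy as np


def bootstrap_diff_in_means(treatment, control, n_boot=2000, seed=0):
    """Return bootstrap replicates of the difference in means."""
    rng = np.random.default_rng(seed)
    treatment = np.asarray(treatment, dtype=float)
    control = np.asarray(control, dtype=float)
    boot_diffs = np.empty(n_boot)
    for b in range(n_boot):
        # Resample each group with replacement, keeping the group sizes fixed.
        t_star = rng.choice(treatment, size=treatment.size, replace=True)
        c_star = rng.choice(control, size=control.size, replace=True)
        boot_diffs[b] = t_star.mean() - c_star.mean()
    return boot_diffs


# Hypothetical experiment data, generated only for this example.
rng = np.random.default_rng(42)
treatment = rng.normal(loc=10.5, scale=2.0, size=150)
control = rng.normal(loc=10.0, scale=2.0, size=150)

observed_diff = treatment.mean() - control.mean()
boot_diffs = bootstrap_diff_in_means(treatment, control)
# Percentile interval: the middle 95% of the bootstrap replicates.
ci_low, ci_high = np.percentile(boot_diffs, [2.5, 97.5])

print(f"observed difference in means: {observed_diff:.3f}")
print(f"95% percentile bootstrap interval: [{ci_low:.3f}, {ci_high:.3f}]")
```

The loop makes the cost visible: every replicate recomputes the estimator on a fresh resample, which is why the bootstrap is heavier than a formula-based standard error.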