Lesson 1: Binary metrics


Variance of a binary metric

Binary metrics, such as whether a user clicked a button or made a purchase, have specific properties. The variance of a binary metric is a deterministic function of the mean (proportion of ones) of the metric.

Let p represent the proportion of ones in the metric. Then the variance of the metric is calculated as Variance = p * (1 - p).

For example, assume we have a metric measuring whether a user clicked a button. The mean of the metric is the proportion of users who clicked the button.

  • If the mean is p = 0.5, the variance is 0.5 * (1 - 0.5) = 0.25

  • If the mean is p = 0.1, the variance is 0.1 * (1 - 0.1) = 0.09
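The relationship above can be checked numerically. The following is a minimal sketch, not part of any particular library: the function name binary_variance and the sample data are made up for illustration. It compares the formula with the population variance (dividing by n, not n - 1) of a small 0/1 sample.

```python
from statistics import pvariance


def binary_variance(p: float) -> float:
    """Variance of a binary (0/1) metric whose mean is p."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a proportion between 0 and 1")
    return p * (1.0 - p)


# Hypothetical data: 10 users, 1 of whom clicked the button.
clicks = [1] + [0] * 9
p = sum(clicks) / len(clicks)  # mean = proportion of ones

# The formula and the population variance of the raw data agree
# up to floating-point rounding.
formula_variance = binary_variance(p)
data_variance = pvariance(clicks)
```

Note that the sample variance (dividing by n - 1) would differ slightly from p * (1 - p) for small samples; the two converge as the sample grows.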


Variance in the treatment group

For continuous metrics, the best guess of the variance in the treatment group is typically the variance in the control group. This is not because we expect the two variances to be equal, but because it is hard to make an informed guess about how a treatment effect of size MDE (Minimum Detectable Effect) would affect the variance.

For binary metrics, the situation is different. Since the variance of a binary metric is a function of its mean, we can determine the variance in the treatment group under a treatment effect of size MDE. Let p represent the proportion of ones in the control group, and let the MDE be expressed as an absolute change in that proportion. Then the variance in the treatment group is calculated as Treatment Variance = (p + MDE) * (1 - (p + MDE)).

For example, if p = 0.3 and the MDE is 0.1, then the treatment group variance would be (0.3 + 0.1) * (1 - (0.3 + 0.1)) = 0.4 * 0.6 = 0.24
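As a rough sketch of the calculation above, the helper below computes the treatment group variance from the control proportion and an absolute MDE. The function name treatment_variance and the range check are assumptions added for illustration, not part of any specific tool.

```python
def treatment_variance(p: float, mde: float) -> float:
    """Variance of a binary metric in the treatment group.

    p   -- proportion of ones in the control group
    mde -- minimum detectable effect, as an absolute change in p
    """
    p_treatment = p + mde
    if not 0.0 <= p_treatment <= 1.0:
        raise ValueError("p + mde must be a proportion between 0 and 1")
    return p_treatment * (1.0 - p_treatment)


# The example from the text: p = 0.3 and MDE = 0.1.
control_var = 0.3 * (1.0 - 0.3)             # variance in the control group
treatment_var = treatment_variance(0.3, 0.1)  # variance under the MDE
```

With these inputs the control variance is about 0.21 and the treatment variance about 0.24, so assuming equal variances would understate the treatment group variance in this case.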



Note for nerds

For guardrail metrics, the alternative hypothesis used in sample size calculations assumes that the proportion in the treatment group is the same as in the control group.

Technically, the variance in the treatment group used in the sample size calculation should therefore depend on whether the metric is a guardrail metric or a success metric: p * (1 - p) for a guardrail metric, and (p + MDE) * (1 - (p + MDE)) for a success metric.

In practice, the difference is often small enough to ignore, but Confidence's sample size calculator corrects for this.
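To make the distinction concrete, here is a small illustrative sketch. It is not Confidence's actual implementation; the function name assumed_treatment_variance and the is_guardrail flag are hypothetical, and the MDE is assumed to be an absolute change in the proportion.

```python
def assumed_treatment_variance(p: float, mde: float, is_guardrail: bool) -> float:
    """Treatment group variance assumed under the alternative hypothesis.

    For a guardrail metric the alternative assumes the treatment
    proportion equals the control proportion p; for a success metric
    it assumes the treatment proportion is p + mde.
    """
    p_treatment = p if is_guardrail else p + mde
    if not 0.0 <= p_treatment <= 1.0:
        raise ValueError("treatment proportion must be between 0 and 1")
    return p_treatment * (1.0 - p_treatment)


# Hypothetical inputs: control proportion 0.3, absolute MDE 0.01.
guardrail_var = assumed_treatment_variance(0.3, 0.01, is_guardrail=True)
success_var = assumed_treatment_variance(0.3, 0.01, is_guardrail=False)
difference = success_var - guardrail_var
```

For a small MDE like this one the two variances differ only in the third decimal place, which is why the difference is often negligible in practice.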