Choosing a Sequential Testing Framework — Comparisons and Discussions

Mårten Schultzberg, Staff Data Scientist
Sebastian Ankargren, Senior Data Scientist

TL;DR

Sequential tests let you analyze experiments while data is still being collected without inflating false positive rates. But which sequential test should you use? The literature has developed quickly, and most leading A/B testing companies have their own favorite.

This post compares different sequential testing frameworks using simulation results. Two factors should drive your choice: whether your data infrastructure delivers data in batches or as a stream, and whether you can estimate the maximum sample size up front. Spotify uses group sequential tests (GSTs): they were originally designed for medical studies where data arrived in batches, which is similar to how our data infrastructure works.
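
To make the false positive inflation concrete, here is a minimal simulation sketch, not the setup used in the full post. It assumes a metric with unit variance and no true effect, data arriving in five equal batches, and a two-sided 5% test. As a stand-in for a GST it uses the classic Pocock constant boundary (critical value 2.413 for five looks), which is one textbook group sequential design and not necessarily the one Spotify runs. The function name and all parameter values are illustrative.

```python
import numpy as np


def simulate_false_positive_rate(critical_value, n_looks=5, batch_size=200,
                                 n_sims=20_000, seed=0):
    """Share of null experiments rejected at any of n_looks interim analyses.

    Assumption: observations are i.i.d. N(0, 1) with no true effect, so any
    rejection is a false positive.
    """
    rng = np.random.default_rng(seed)
    # The sum of batch_size standard normal draws is N(0, batch_size), so we
    # draw the batch sums directly instead of every single observation.
    batch_sums = rng.normal(0.0, np.sqrt(batch_size), size=(n_sims, n_looks))
    cumulative_sums = np.cumsum(batch_sums, axis=1)
    # Sample size available at each look: batch_size, 2 * batch_size, ...
    n_at_look = batch_size * np.arange(1, n_looks + 1)
    # z-statistic at each look, using the known unit variance.
    z = cumulative_sums / np.sqrt(n_at_look)
    # An experiment counts as rejected if any look crosses the boundary.
    rejected = (np.abs(z) > critical_value).any(axis=1)
    return rejected.mean()


# Naive peeking: reuse the fixed-sample critical value at every look.
naive_fpr = simulate_false_positive_rate(critical_value=1.96)
# Pocock boundary for 5 equally spaced looks at two-sided alpha = 0.05.
pocock_fpr = simulate_false_positive_rate(critical_value=2.413)

print(f"naive peeking false positive rate:   {naive_fpr:.3f}")
print(f"Pocock boundary false positive rate: {pocock_fpr:.3f}")
```

Under these assumptions, reusing 1.96 at every look should push the false positive rate well above the nominal 5%, while the stricter constant boundary keeps it close to 5%. The boundary value depends on knowing the number of looks in advance, which mirrors the point above about being able to plan the maximum sample size.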

Read the full post on Spotify Engineering: Choosing a Sequential Testing Framework — Comparisons and Discussions