TL;DR
Part 1 introduced the peeking problem 2.0: with longitudinal data, sequential tests can inflate false positive rates because units are not yet fully observed when they enter the analysis.
The mechanism differs from the original peeking problem, but the consequence is the same. This post presents modeling approaches that correctly handle multiple observations per experimental unit. The key insight is that users who recently entered the experiment contribute less information than users who have been in it longer, and the analysis has to account for that.
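To make the "less information" point concrete, here is a minimal sketch under a simple random-intercept model. The model is an assumption made for illustration and is not necessarily the one used in the full post. Each user has a stable user-level effect with variance `between_var`, plus independent measurement noise with variance `within_var`. The function names and default variances are hypothetical.

```python
def user_mean_variance(n_obs, between_var=1.0, within_var=4.0):
    """Variance of one user's observed mean after n_obs measurements.

    Assumes a random-intercept model: the user-level effect has variance
    between_var and each measurement adds independent noise with
    variance within_var.
    """
    if n_obs < 1:
        raise ValueError("a user needs at least one observation")
    return between_var + within_var / n_obs


def user_information(n_obs, between_var=1.0, within_var=4.0):
    """Information (inverse variance) that one user's mean carries."""
    return 1.0 / user_mean_variance(n_obs, between_var, within_var)


def pooled_mean_variance(obs_counts, between_var=1.0, within_var=4.0):
    """Variance of the inverse-variance weighted mean across users.

    obs_counts holds the number of measurements observed so far for
    each user, so recent entrants have small counts.
    """
    total_information = sum(
        user_information(k, between_var, within_var) for k in obs_counts
    )
    return 1.0 / total_information


if __name__ == "__main__":
    # A user observed once versus a user observed four times.
    print(user_information(1), user_information(4))
    # Pooled variance when half of the users have only just entered.
    print(pooled_mean_variance([1, 1, 4, 4]))
```

Under these assumptions a user with one measurement carries less information than a user with four, so an analysis that treats every user as equally informative at each interim look misstates the variance of the estimate.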
Read the full post on Spotify Engineering: Bringing Sequential Testing to Experiments with Longitudinal Data (Part 2): Sequential Testing



