What are sticky assignments?

Sticky assignments ensure that a user who has been assigned to a variant in a feature flag or experiment continues to see that same variant across sessions, devices, and app restarts. Without stickiness, a user could flip between control and treatment on every visit, corrupting the experiment and creating a confusing product experience.

The requirement sounds simple, but the implementation choices have real consequences for both experiment validity and system architecture. Store assignments in a database, and you add a network dependency to every flag evaluation. Compute them on the fly from a hash, and you need a stable user identifier. The approach you pick determines your latency, your resilience to outages, and how much state you have to manage.

How do sticky assignments work in practice?

There are two common approaches.

Stateful persistence. Store the assignment (user ID, flag key, variant) in a database or local storage. On every evaluation, look up the stored assignment. This guarantees stickiness even if the flag configuration changes, but it requires a network call (for server-side storage) or local state (for client-side storage) on every evaluation.
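
A minimal sketch of the stateful approach, using SQLite as a stand-in for the assignment store. The schema and function names are illustrative, not any particular vendor's API:

```python
import random
import sqlite3

# Illustrative assignment store: one row per (user, flag) pair.
conn = sqlite3.connect("assignments.db")
conn.execute(
    "CREATE TABLE IF NOT EXISTS assignments ("
    " user_id TEXT, flag_key TEXT, variant TEXT,"
    " PRIMARY KEY (user_id, flag_key))"
)

def get_variant(user_id: str, flag_key: str, variants: list) -> str:
    row = conn.execute(
        "SELECT variant FROM assignments WHERE user_id = ? AND flag_key = ?",
        (user_id, flag_key),
    ).fetchone()
    if row:
        return row[0]  # sticky: reuse the stored assignment
    variant = random.choice(variants)  # first evaluation: assign...
    conn.execute(
        "INSERT INTO assignments VALUES (?, ?, ?)",
        (user_id, flag_key, variant),
    )
    conn.commit()  # ...and persist it for every future evaluation
    return variant
```

Note that every evaluation after the first is a lookup, which is exactly the network or storage dependency the stateless approach avoids.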

Stateless hashing. Compute the assignment deterministically from a hash of the user ID and a salt. The same inputs always produce the same output, so the assignment is inherently sticky without storing anything. Because the hash is computed locally and requires no network round-trip, flag evaluations resolve in microseconds. The trade-off: if you change the salt, all assignments reshuffle.

Confidence uses deterministic hashing as the default assignment mechanism. A hash of the user ID and a per-flag salt maps each user to a bucket, and the bucket determines the variant. The assignment is sticky by construction: same user, same salt, same variant. No database lookup required.
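
A sketch of the idea in Python. Confidence's exact hash function and split configuration aren't specified here, so this uses SHA-256 and a 50/50 split purely to illustrate the principle:

```python
import hashlib

NUM_BUCKETS = 1000

def bucket(user_id: str, salt: str) -> int:
    # Same user ID and salt always hash to the same bucket,
    # so the assignment is sticky with no stored state.
    digest = hashlib.sha256(f"{salt}:{user_id}".encode()).digest()
    return int.from_bytes(digest[:8], "big") % NUM_BUCKETS

def assign(user_id: str, salt: str) -> str:
    # Example split: buckets 0-499 get treatment, the rest control.
    return "treatment" if bucket(user_id, salt) < 500 else "control"

# Evaluations agree across sessions, devices, and restarts.
assert assign("user-42", "my-flag-salt") == assign("user-42", "my-flag-salt")
```

Because the computation is pure, it can run inside the SDK with no database and no network dependency.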

Why does stickiness matter for experiments?

An A/B test assumes each user has a single, consistent experience throughout the experiment. If a user bounces between control and treatment, two things go wrong.

First, the user experience degrades. They see a new feature, then it disappears, then it reappears. That inconsistency can itself affect behavior, introducing noise that has nothing to do with the change being tested.

Second, the statistical analysis becomes unreliable. Most experiment analyses assume a clean split: each user is in exactly one group for the duration. Users who cross between groups dilute the measured treatment effect and can bias the estimate in either direction.
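
As a rough illustration, here is a toy simulation (all numbers invented) of how crossover dilutes a measured lift: users labeled "treatment" who sometimes see control pull the estimate toward zero.

```python
import random

random.seed(0)

def convert(saw_treatment: bool) -> bool:
    # Invented baseline: treatment lifts conversion from 0.50 to 0.60.
    return random.random() < (0.60 if saw_treatment else 0.50)

def measured_lift(crossover_rate: float, n: int = 100_000) -> float:
    # Each "treatment" user actually sees control on a fraction of
    # visits equal to crossover_rate; analysis still labels them treatment.
    treated = sum(convert(random.random() >= crossover_rate) for _ in range(n))
    control = sum(convert(False) for _ in range(n))
    return (treated - control) / n

print(measured_lift(0.0))  # ~0.10: the true effect, with perfect stickiness
print(measured_lift(0.3))  # ~0.07: crossover shrinks the measured effect
```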

At Spotify, where experiments run across mobile apps, desktop, and web, stickiness across devices matters. Confidence achieves this through the user ID: as long as the user is logged in, the same ID feeds into the same hash, producing the same assignment regardless of device or session. For anonymous users, a device-level identifier serves the same purpose until the user authenticates.
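
In code, the identifier choice might look like the sketch below. The User fields are hypothetical stand-ins, not Confidence's data model:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class User:
    device_id: str                     # always available
    account_id: Optional[str] = None   # set once the user logs in

def targeting_key(user: User) -> str:
    # Logged-in users hash on the account ID, so the same assignment
    # follows them from mobile to desktop to web. Anonymous users
    # fall back to a device-level identifier until they authenticate.
    return user.account_id or user.device_id
```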

What happens when you need to break stickiness?

Sometimes you want assignments to change. If you're running a rollout and increasing the percentage from 5% to 20%, you want new users to enter the treatment group without reshuffling the existing 5%. Confidence's bucketing system handles this: the hash maps users into 1,000 numbered buckets, and expanding a rollout adds new buckets to the treatment group without reassigning existing ones.
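
Sketching this with the same illustrative hashing scheme as above: expanding the rollout only widens the treatment range, so existing assignments survive.

```python
import hashlib

def bucket(user_id: str, salt: str, n: int = 1000) -> int:
    digest = hashlib.sha256(f"{salt}:{user_id}".encode()).digest()
    return int.from_bytes(digest[:8], "big") % n

# A 5% rollout claims buckets 0-49. Expanding to 20% adds buckets
# 50-199 and leaves 0-49 untouched, so no treated user is reassigned.
old_treatment = range(0, 50)
new_treatment = range(0, 200)

for user_id in ("alice", "bob", "carol"):
    b = bucket(user_id, "checkout-flag-salt")
    # Anyone treated at 5% is still treated at 20%.
    assert b not in old_treatment or b in new_treatment
```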

If you need to completely reshuffle assignments (for example, after discovering a bug in a variant), changing the flag's salt forces a new hash for every user, effectively resetting all assignments.
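
With the same illustrative bucket helper, a salt change reshuffles everyone: the new hash is unrelated to the old one, so assignments are effectively drawn fresh.

```python
import hashlib

def bucket(user_id: str, salt: str, n: int = 1000) -> int:
    digest = hashlib.sha256(f"{salt}:{user_id}".encode()).digest()
    return int.from_bytes(digest[:8], "big") % n

# A new salt feeds a different input into the hash, so every user
# lands in a freshly drawn bucket unrelated to the old one.
print(bucket("user-42", "flag-salt-v1"))  # assignment before the reset
print(bucket("user-42", "flag-salt-v2"))  # independent assignment after
```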