Feature Flags

What is bucket hashing?

Bucket hashing is the mechanism that maps a user into a numbered bucket, which then determines their variant in a feature flag or experiment. A hash function takes the user's identifier and a salt as input, produces a number in a fixed range (typically 0 to 999 or 0 to 9,999), and that number is the user's bucket. The bucket-to-variant mapping is defined by the flag configuration.

Bucket hashing is the engine underneath deterministic assignment. It's what makes it possible to assign millions of users to experiment variants with no stored state, no network calls, and no coordination between servers.

How does bucket hashing work?

The process has three steps.

Hash. Concatenate the user ID with a salt (unique per flag or experiment) and run it through a hash function. Spotify's salt-machine algorithm, described in their engineering blog, uses this approach to produce a uniformly distributed integer.

Map to bucket. Take the hash output modulo the number of buckets (e.g., mod 1,000) to get a bucket number between 0 and 999.

Assign variant. The flag configuration specifies which bucket ranges map to which variants. For a 50/50 A/B test: buckets 0-499 are control, 500-999 are treatment. For a 10% rollout: buckets 0-99 are treatment, 100-999 are control.
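The three steps above can be sketched in a few lines of Python. This is a minimal illustration, not Spotify's salt-machine algorithm: SHA-256 is an arbitrary choice of hash, and the function and variable names are made up for the example.

```python
import hashlib

NUM_BUCKETS = 1000  # 1,000 buckets gives 0.1% granularity per bucket

def bucket(user_id: str, salt: str) -> int:
    """Step 1 and 2: hash user_id + salt, then reduce to a bucket in [0, 999]."""
    digest = hashlib.sha256(f"{salt}:{user_id}".encode()).digest()
    # Interpret the first 8 bytes of the digest as an integer, then take it
    # modulo the bucket count.
    return int.from_bytes(digest[:8], "big") % NUM_BUCKETS

def variant(user_id: str, salt: str, ranges: dict) -> str:
    """Step 3: look up which configured bucket range the user falls into."""
    b = bucket(user_id, salt)
    for name, r in ranges.items():
        if b in r:
            return name
    raise ValueError(f"bucket {b} not covered by any variant range")

# 50/50 A/B test: buckets 0-499 are control, 500-999 are treatment.
ab_test = {"control": range(0, 500), "treatment": range(500, 1000)}
print(variant("user-123", "checkout-experiment", ab_test))
```

Note that nothing here is stored or looked up: the same user ID and salt always produce the same bucket, on any server, with no coordination.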

Because the hash function distributes users uniformly across buckets, the traffic split closely matches the configured percentages. With 1,000 buckets, the granularity is 0.1% per bucket.

Why use a salt?

The salt ensures that a user's bucket in one experiment is independent of their bucket in another. Without it, every experiment using the same hash function would assign the same users to the same bucket number, meaning the same users would always end up in "treatment" across all experiments. That's a systematic bias problem.

Spotify's coordination strategy goes further. Their experimentation coordination blog post describes how bucket reuse enables exclusive experiments (where a user can only be in one experiment at a time on a shared surface) without wasting traffic. The salt-machine algorithm makes this coordination possible: by controlling which buckets are allocated to which experiments, teams can run dozens of concurrent experiments on the same surface without overlap or interference.

How do rollouts use bucket hashing?

Bucket hashing makes gradual rollouts smooth and predictable. When you increase a rollout from 5% to 20%, you're expanding the treatment bucket range from 0-49 to 0-199. Users in buckets 0-49 stay in treatment (their experience doesn't change). Users in buckets 50-199 are newly added. Users in buckets 200-999 remain in control.
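The "sticky" property of rollout expansion follows directly from the threshold check: a bucket below 50 is also below 200. A short sketch (again using SHA-256 as an illustrative hash; names are hypothetical):

```python
import hashlib

def bucket(user_id: str, salt: str, num_buckets: int = 1000) -> int:
    digest = hashlib.sha256(f"{salt}:{user_id}".encode()).digest()
    return int.from_bytes(digest[:8], "big") % num_buckets

def in_rollout(user_id: str, salt: str, percent: float) -> bool:
    # With 1,000 buckets, a rollout at p% covers buckets 0 .. (p * 10) - 1.
    return bucket(user_id, salt) < percent * 10

users = [f"user-{i}" for i in range(10_000)]
at_5 = {u for u in users if in_rollout(u, "new-checkout", 5)}
at_20 = {u for u in users if in_rollout(u, "new-checkout", 20)}

# Every user in the 5% rollout is still in the 20% rollout.
print(at_5 <= at_20)  # True
```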

This property is critical for user experience consistency. A user who saw the new feature at 5% still sees it at 20%, at 50%, and at 100%. Confidence uses this bucket expansion model for all its rollouts, and the same buckets underpin A/B test assignments.

What makes a good bucket hashing implementation?

Three properties matter.

Uniform distribution. The hash function must spread users evenly across buckets. If certain buckets are overrepresented, your 50/50 split isn't actually 50/50. Cryptographic hash functions (SHA-256, or even MD5, which is broken for security purposes but still distributes uniformly) and fast non-cryptographic alternatives like MurmurHash all achieve this.
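Uniformity is straightforward to check empirically. The sketch below buckets 100,000 synthetic users with a SHA-256-based hash (an illustrative choice) and measures how close a 50/50 split comes to its configured percentages:

```python
import hashlib
from collections import Counter

def bucket(user_id: str, salt: str, num_buckets: int = 1000) -> int:
    digest = hashlib.sha256(f"{salt}:{user_id}".encode()).digest()
    return int.from_bytes(digest[:8], "big") % num_buckets

counts = Counter(bucket(f"user-{i}", "exp-a") for i in range(100_000))

# With 100,000 users across 1,000 buckets, each bucket should hold ~100 users,
# and buckets 0-499 should capture very close to half the population.
treatment_share = sum(c for b, c in counts.items() if b < 500) / 100_000
print(round(treatment_share, 2))
```

For a uniform hash, the observed share lands within a fraction of a percent of 0.50, which is exactly the property a 50/50 configuration relies on.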

Independence across salts. Two experiments with different salts should produce uncorrelated bucket assignments for the same user. This is what prevents systematic assignment bias.
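Independence can be checked the same way: if two experiments with different salts each put 50% of users in treatment, and the assignments are uncorrelated, about 25% of users should land in both treatments. A sketch under the same illustrative SHA-256 assumption:

```python
import hashlib

def bucket(user_id: str, salt: str, num_buckets: int = 1000) -> int:
    digest = hashlib.sha256(f"{salt}:{user_id}".encode()).digest()
    return int.from_bytes(digest[:8], "big") % num_buckets

users = [f"user-{i}" for i in range(50_000)]
# Treatment group (bucket < 500) for two experiments with different salts.
in_a = {u for u in users if bucket(u, "salt-a") < 500}
in_b = {u for u in users if bucket(u, "salt-b") < 500}

# Independent 50% assignments should overlap on roughly 25% of users.
overlap = len(in_a & in_b) / len(users)
print(round(overlap, 2))
```

An overlap far from 25% would indicate correlated assignments, i.e. the systematic bias the salt is there to prevent.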

Speed. Bucket hashing runs on every flag evaluation. At Spotify's scale, that means billions of evaluations per day. The hash computation needs to be fast enough to add negligible latency. In Confidence's local evaluation mode, the entire flag resolution (including the hash) completes in 10 to 50 microseconds.
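As a rough sense of scale, even an unoptimized hash in an interpreted language is fast. The microbenchmark below times the SHA-256-based sketch from earlier (illustrative only; absolute numbers depend entirely on hardware and runtime):

```python
import hashlib
import timeit

def bucket(user_id: str, salt: str) -> int:
    digest = hashlib.sha256(f"{salt}:{user_id}".encode()).digest()
    return int.from_bytes(digest[:8], "big") % 1000

# Average over many calls; on typical hardware a single bucket computation
# is a small fraction of an overall flag-resolution latency budget.
per_call = timeit.timeit(lambda: bucket("user-123", "exp-a"), number=100_000) / 100_000
print(f"{per_call * 1e6:.2f} microseconds per bucket computation")
```

Production systems typically reach for faster non-cryptographic hashes like MurmurHash, since cryptographic strength isn't needed for bucketing.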