Lesson 6: Interpretability

Why interpretability matters

A metric is only useful if people can understand what it means and what actions to take when it moves. The most sophisticated metric in the world won't drive good decisions if stakeholders can't interpret it.

When a metric changes, interpretability means everyone understands what user behavior changed, whether the change is good or bad, what might have caused it, and what actions to consider next. Without this shared understanding, you can't make confident decisions.

Compare "adjusted engagement index" to "average session duration per active user per week." The first leaves people guessing: what is being adjusted, what counts as engagement, and how should a 5% increase be interpreted? The second is clear: it measures how long active users spend in sessions each week. If it goes up, users are spending more time. If it goes down, they're spending less.
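To see why the second name is easy to interpret, it helps to notice that the name almost spells out its own calculation. The sketch below is one possible reading, not an official definition: it assumes an "active user" is anyone with at least one session in the week, and the function name and input shape are hypothetical.

```python
from collections import defaultdict


def avg_session_minutes_per_active_user(sessions):
    """Average weekly session time per active user.

    `sessions` is an iterable of (user_id, duration_minutes) pairs covering
    one week. Assumption: a user is "active" if they have at least one
    session in that week.
    """
    minutes_by_user = defaultdict(float)
    for user_id, minutes in sessions:
        minutes_by_user[user_id] += minutes

    if not minutes_by_user:
        # No active users this week; returning 0.0 is a choice made for
        # this sketch, not a rule from the lesson.
        return 0.0

    return sum(minutes_by_user.values()) / len(minutes_by_user)


week = [("a", 10), ("a", 20), ("b", 30)]
print(avg_session_minutes_per_active_user(week))  # 30.0
```

Two active users spent 60 minutes in total, so the metric is 30 minutes per active user. Nothing comparable can be written for "adjusted engagement index" without first asking someone what it means.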

Metric name format

Metric names should be descriptive and self-explanatory. A good metric name answers three questions: What are you measuring, for whom, and over what time period?

The "what" is the behavior or outcome: purchases, clicks, conversions, completed tasks. The "who" is the unit of analysis: per user, per session, per account. The "when" is the time window: daily, in the first week, during the trial period. Put these together and you get names like "purchase completion rate per user in first 30 days" or "weekly average pages viewed per session for returning visitors."
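The what/who/when pattern can be sketched as a small helper that refuses to build a name when one of the three parts is missing. The class and field names here are hypothetical and only illustrate the pattern; they are not part of any particular metrics tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetricName:
    """Hypothetical helper holding the three parts of a metric name."""

    what: str  # behavior or outcome, e.g. "purchase completion rate"
    who: str   # unit of analysis, e.g. "per user"
    when: str  # time window, e.g. "in first 30 days"

    def render(self) -> str:
        parts = {"what": self.what, "who": self.who, "when": self.when}
        missing = [label for label, value in parts.items() if not value.strip()]
        if missing:
            raise ValueError(f"metric name is missing: {', '.join(missing)}")
        return " ".join(value.strip() for value in parts.values())


name = MetricName("purchase completion rate", "per user", "in first 30 days")
print(name.render())  # purchase completion rate per user in first 30 days
```

The point of the check is that a name missing any of the three parts fails loudly rather than ending up on a dashboard.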

Write helpful descriptions

Every metric should have complete technical documentation: calculation logic, time windows, filters, and more. But your colleagues shouldn't need to look it up every time they see your metric. A well-written description saves everyone time.

Think of the description as your chance to help colleagues quickly understand what the metric measures and why it matters. When someone is choosing metrics for an experiment or reviewing results, they can read your description and immediately know if this metric is relevant, without diving into the full definition.

A good description answers three questions: What behavior are you measuring? Why does it matter? When would someone use this metric? The time you invest in writing a clear description pays off many times over as colleagues reuse your metric.
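One way to keep descriptions consistent is to store the three answers as separate fields and join them when displaying the metric. This is a minimal sketch: the class and field names are hypothetical, and the example text for monthly active users is illustrative rather than taken from a real metric catalog.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetricDescription:
    """Hypothetical record with one field per question a description answers."""

    measures: str        # What behavior are you measuring?
    why_it_matters: str  # Why does it matter?
    use_when: str        # When would someone use this metric?

    def summary(self) -> str:
        # Join the three answers into the short text colleagues read first.
        return " ".join([self.measures, self.why_it_matters, self.use_when])


mau = MetricDescription(
    measures="Counts unique people who used the product in the last month.",
    why_it_matters="It is our primary measure of active user base size.",
    use_when="Use it when a change could affect how many people keep using the product.",
)
print(mau.summary())
```

Keeping the fields separate makes it obvious when a description skips one of the questions.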

This matters even more as AI agents become part of experiment workflows. An agent can read the full metric definition, but when there are thousands of metrics to search through, a well-written description is far more useful than a definition: it lets the agent find the right metric by understanding its purpose rather than parsing its implementation.

Good descriptions are also the foundation of a centralized metric documentation system. When every metric has a clear definition, consistent calculation logic, and well-written context, the whole organization can share a single source of truth. Teams can discover existing metrics rather than recreating them, and when business needs change, the logic only needs to be updated in one place.

Audience communication

Different stakeholders need different levels of detail. Engineers need technical specifics: data sources, joins, edge cases, implementation notes. Product managers need to understand what the metric measures, why it matters for the product, and how to interpret changes. Executives need business impact, connections to strategic goals, and what actions to consider.

The same metric can be explained three different ways. For an engineer: "We calculate monthly active users as distinct user IDs with at least one logged event in the trailing 30-day window, excluding test accounts and automated traffic." For a product manager: "Monthly active users tells us how many unique people used the product in the last month. It's our primary measure of active user base size." For an executive: "Monthly active users grew 3% this quarter, driven by strong growth in emerging markets and improved retention in the free tier."
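The engineer's version is precise enough to translate almost directly into code. The sketch below is an illustration under stated assumptions, not a reference implementation: the event tuple shape, the `excluded_user_ids` parameter, and the choice to treat the trailing window as 30 calendar days ending on (and including) the `as_of` date are all assumptions made here.

```python
from datetime import date, timedelta


def monthly_active_users(events, as_of, excluded_user_ids=frozenset(), window_days=30):
    """Count distinct users with at least one logged event in the trailing window.

    `events` is an iterable of (user_id, event_date, is_automated) tuples.
    Assumptions: the window covers `window_days` calendar days ending on
    `as_of` (inclusive), and test accounts are passed in `excluded_user_ids`.
    """
    window_start = as_of - timedelta(days=window_days - 1)
    active_users = {
        user_id
        for user_id, event_date, is_automated in events
        if window_start <= event_date <= as_of
        and not is_automated                   # drop automated traffic
        and user_id not in excluded_user_ids   # drop test accounts
    }
    return len(active_users)


events = [
    ("u1", date(2024, 3, 30), False),
    ("u1", date(2024, 3, 15), False),     # same user, counted once
    ("u2", date(2024, 3, 2), False),      # first day of the window
    ("u3", date(2024, 3, 1), False),      # one day too old
    ("bot", date(2024, 3, 20), True),     # automated traffic
    ("test1", date(2024, 3, 20), False),  # test account
]
print(monthly_active_users(events, date(2024, 3, 31), excluded_user_ids={"test1"}))  # 2
```

Each filter in the code corresponds to a phrase in the engineer's sentence, which is exactly the level of detail the product manager and executive versions leave out.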

Technical accuracy and comprehension

Sometimes the most technically accurate description is too complex for broad communication. When this happens, use simple language for general communication, provide technical details in documentation, and highlight any important caveats.