Lesson 11: Evaluate your experiment and make a decision
In this lesson, you learn how to interpret the results of your experiment, how to make a decision based on them, and how Confidence helps you do both. You also learn how to use exploration to dig deeper into the results and to get inspiration for new hypotheses.
A good experimentation platform calculates results for you and displays the performance of each variant, taking care of the statistical details so you can focus on learning from the experiment. With those insights, you make a decision on how to proceed with the change you tested.
At this stage, your experiment has successfully run for a period of time and has no visible errors.
Congratulations! Now it's time for the fun part. You have at least one result to interpret, and often more than one. More precisely, your experiment has T × M results to interpret, where T is the number of treatment groups (excluding control) and M is the number of metrics. For an experiment with 3 treatment groups and 4 metrics, you have 12 results to interpret.
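To make the count concrete, here is a minimal sketch that lists every treatment and metric pair for the example of 3 treatment groups and 4 metrics. The treatment and metric names are made up for illustration and don't come from any real experiment.

```python
from itertools import product

# Hypothetical names, used only to illustrate the T x M count.
treatments = ["treatment_a", "treatment_b", "treatment_c"]  # T = 3, control excluded
metrics = ["conversion", "retention", "latency", "crash_rate"]  # M = 4

# Each (treatment, metric) pair is one result comparing that treatment to control.
comparisons = list(product(treatments, metrics))

print(len(comparisons))  # 12
```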
Overall decision recommendations
A good experimentation platform provides overall decision recommendations that use the outcomes of all metrics to suggest whether a specific treatment is worth rolling out.
The shipping recommendation advises you to ship a change if at least one success metric has moved significantly in the desired direction. At the same time, all guardrail metrics must be significantly non-inferior, meaning that any deterioration stays within the non-inferiority margin you set. The test must also be in a healthy state, with no significant negative changes in any of the metrics and no sign of a problem with the quality of the test.
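The rule above can be sketched as a small function. This is an illustration of the logic as described in this lesson, not how Confidence implements it. The `MetricResult` type, its field names, and `recommend_ship` are hypothetical, and the significance flags are assumed to come from the platform's own statistical tests.

```python
from dataclasses import dataclass


@dataclass
class MetricResult:
    """Hypothetical summary of one metric for one treatment versus control."""
    name: str
    role: str  # "success" or "guardrail"
    significant_improvement: bool = False
    significantly_non_inferior: bool = False
    significant_deterioration: bool = False


def recommend_ship(results, quality_checks_passed):
    """Return True only if every condition of the shipping rule holds."""
    # The test must be healthy: no quality problems ...
    if not quality_checks_passed:
        return False
    # ... and no significant negative change in any metric.
    if any(r.significant_deterioration for r in results):
        return False
    # All guardrail metrics must be significantly non-inferior.
    guardrails_ok = all(
        r.significantly_non_inferior for r in results if r.role == "guardrail"
    )
    # At least one success metric must have moved in the desired direction.
    success_moved = any(
        r.significant_improvement for r in results if r.role == "success"
    )
    return guardrails_ok and success_moved


results = [
    MetricResult("conversion", "success", significant_improvement=True),
    MetricResult("latency", "guardrail", significantly_non_inferior=True),
]
print(recommend_ship(results, quality_checks_passed=True))  # True
```

Note that a guardrail metric that is merely "not significantly worse" doesn't pass in this sketch. It has to be shown to be non-inferior, which matches the rule as stated above.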
Confidence provides overall decision recommendations on each treatment card on the results page.
Metric results
For each metric, you see a comparison between the control group and each treatment group. You can dig deeper into the results to see metric values, confidence intervals, variances, and more. If you ran your experiment with results delivered continuously, you can also view the results over time.
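To give a feel for how metric values and variances relate to a confidence interval, here is a minimal sketch of a normal-approximation interval for the difference in means between one treatment group and control. It assumes a single analysis at the end of the experiment. Confidence may use different methods, and results delivered continuously need sequential methods that this sketch doesn't cover. The function name and the input numbers are made up for illustration.

```python
from math import sqrt
from statistics import NormalDist


def difference_ci(mean_c, var_c, n_c, mean_t, var_t, n_t, confidence=0.95):
    """Two-sided normal-approximation interval for (treatment mean - control mean)."""
    diff = mean_t - mean_c
    # Standard error of the difference between two independent sample means.
    standard_error = sqrt(var_c / n_c + var_t / n_t)
    # Critical value of the standard normal distribution, about 1.96 for 95%.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return diff - z * standard_error, diff + z * standard_error


# Made-up summary statistics for a control group and one treatment group.
lower, upper = difference_ci(
    mean_c=10.0, var_c=4.0, n_c=1000,
    mean_t=10.3, var_t=4.0, n_t=1000,
)
print(f"Difference: 0.30, 95% interval: ({lower:.2f}, {upper:.2f})")
```

With these numbers the interval lies entirely above zero, which is what a significant positive change looks like for a metric where higher is better.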
Exploration
If at the end of the experiment you find things that you would like to dig deeper into, you can do exploratory analysis. Here you can add any metric and see how it performed for each of the treatment groups, and split the results by dimensions.
This type of exploration, in which you look at many metrics, perhaps until you find an "interesting" result, severely increases the risk of finding false positives: results that appear significant only by chance.
For that reason, you shouldn't use exploratory analysis to make decisions about whether an experiment was successful or not. Use it to get inspiration for new hypotheses.
Use the Explore tab to add any metric and split results by dimensions.
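To see why looking at many metrics inflates the risk of false positives, consider a minimal sketch that computes the chance of at least one significant result when none of the metrics has a true effect. It assumes that the comparisons are independent and that each is tested at a significance level of 5%. Real metrics are usually correlated, so treat the numbers as a rough illustration only.

```python
def chance_of_false_positive(num_comparisons, alpha=0.05):
    """Probability of at least one false positive among independent tests,
    assuming no metric has a true effect."""
    return 1 - (1 - alpha) ** num_comparisons


for k in (1, 5, 20):
    print(k, round(chance_of_false_positive(k), 2))
# 1 0.05
# 5 0.23
# 20 0.64
```

With 20 exploratory comparisons, you're more likely than not to see at least one "significant" result even when the treatment did nothing.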