Lesson 2: The origin of the scientific method

Summary

In this lesson, you learn about the scientific method, what makes evidence from this method different from other types of evidence, and why you should regard it as the most trustworthy way of learning about the world.

The scientific method is old. A classic illustration of its essence, and of how it differs from other ways of learning about the world, is Halley's work on comets.

Example: Halley's comet

Roughly once every 76 years, the comet now known as "Halley's comet" passes close to the Earth and can be seen with the naked eye. It has been observed since at least 240 BC, but only since 1704 have we known that it is the same comet returning again and again. From Aristotle onward, scholars believed that comets were disturbances in the Earth's atmosphere. In 1577, the Danish astronomer Tycho Brahe found that the comet visible that year was farther away than the Moon, and could therefore not be part of the atmosphere.

In 1704, Edmund Halley used the methods published by Newton to calculate the orbits of comets that had been observed in the centuries before. He found that the comets that had been seen in 1531 and 1607 had the same orbit as a comet that had been seen in 1682. Based on this observation, Halley formulated a hypothesis:

"I observed that the comets from 1531, 1607, and 1682 have the same orbit. Based on this observation, I believe that this is in fact the same comet, which moves in an elliptical orbit around the sun and returns once every 76 years. I predict that this comet will be seen again in 1758."

Or, in his own words: "Hence I dare venture to foretell, that it will return again in the year 1758."

Halley died before he could see his prediction verified. The comet returned on schedule in 1758, to the amazement of the scientific community, the public, and the British periodical The Gentleman's Magazine, which wrote: "By its appearance at this time, the truth of the Newtonian Theory of the Solar System is demonstrated to the conviction of the whole world, and the credit of the astronomers is fully established and raised far above all the wit and sneers of ignorant men."

Question: Would Halley's prediction have been as impressive if he had said, "I predict that this comet will be seen again some time between 1740 and 1770"? Would it have been as impressive if we saw comets almost every year?

The observation of the comet at the predicted time provided strong evidence for Halley's hypothesis because the prediction was very specific. Getting the next appearance of a comet right by pure luck would have been very unlikely.
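To make the role of specificity concrete, here is a rough sketch of how likely a "lucky hit" would be under different conditions. It assumes, purely for illustration, that naked-eye comets appear independently at a constant average rate (a Poisson model). The rates and the function name chance_of_lucky_hit are hypothetical and not taken from historical data, and the sketch ignores that Halley also predicted the orbit, not only the year.

```python
import math


def chance_of_lucky_hit(comets_per_year: float, window_years: float) -> float:
    """Probability that at least one comet appears in the window by chance.

    Assumes comets arrive independently at a constant average rate
    (a Poisson process), which is a simplification for illustration.
    """
    return 1.0 - math.exp(-comets_per_year * window_years)


# Hypothetical rates, chosen only to contrast the two situations.
RARE = 0.05      # about one naked-eye comet every 20 years
FREQUENT = 1.0   # about one comet per year

scenarios = {
    "specific prediction (1758), rare comets": chance_of_lucky_hit(RARE, 1),
    "vague prediction (1740-1770), rare comets": chance_of_lucky_hit(RARE, 31),
    "specific prediction (1758), frequent comets": chance_of_lucky_hit(FREQUENT, 1),
}

for name, probability in scenarios.items():
    print(f"{name}: {probability:.2f}")
```

Under these made-up rates, the specific prediction would come true by luck only about 5% of the time, while the vague prediction, or any prediction in a world full of comets, would come true by luck more often than not. A prediction that is likely to succeed even if the hypothesis is false provides little evidence for the hypothesis.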

The scientific method

The scientific method has five basic steps, plus one feedback step:

  1. Make an observation
  2. Ask a question
  3. Form a hypothesis, or testable explanation
  4. Make a prediction based on the hypothesis
  5. Test the prediction
  6. Iterate: use the results to make new hypotheses or predictions

What is an experiment?

An experiment is a procedure designed to test a hypothesis as part of the scientific method.

There are different types of experiments:

Uncontrolled experiments

An uncontrolled experiment involves forming a hypothesis, making a prediction, and then gathering data by observing a system without intervening in it. The experimenter does not control the variables in an uncontrolled experiment. A historical example is Edmund Halley's hypothesis about the orbits of comets, described above.

Controlled experiments

In a controlled experiment, you compare a treatment group with a control group to test the effect of a treatment on an outcome. Ideally, the two groups are the same except for the treatment. A historical example is the Salk polio vaccine trial, which gave roughly 600,000 children either the new vaccine or a placebo. A/B tests and rollouts are controlled experiments.
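The comparison between a treatment group and a control group can be sketched as a small simulation. Everything here is a hypothetical illustration: the function name run_controlled_experiment, the outcome distribution, and the effect size are assumptions, and a coin flip is used as one simple way to keep the two groups comparable.

```python
import random
from statistics import mean


def run_controlled_experiment(n_users: int, true_effect: float, seed: int = 0) -> float:
    """Simulate a controlled experiment and return the estimated effect.

    The estimate is the difference between the mean outcome in the
    treatment group and the mean outcome in the control group.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    treatment, control = [], []
    for _ in range(n_users):
        # Hypothetical outcome a user would have without the treatment.
        baseline = rng.gauss(10.0, 2.0)
        if rng.random() < 0.5:
            # A coin flip puts the user in the treatment group,
            # where the outcome is shifted by the true effect.
            treatment.append(baseline + true_effect)
        else:
            control.append(baseline)
    return mean(treatment) - mean(control)


estimate = run_controlled_experiment(n_users=20_000, true_effect=0.5)
print(f"estimated effect: {estimate:.2f}")
```

Because the groups differ only in the treatment, the difference in means should land close to the true effect of 0.5 that was built into the simulation. In a real experiment the true effect is unknown, which is why the estimate has to be interpreted with statistics.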

Reader exercise

What is the purpose of the scientific method?


© Copyright 2026. All rights reserved.
