Skip to main content
The measurements in your fact tables are the foundations of your metrics. These facts describe your entities and their behavior. For example, your fact tables can contain events that describe when a user bought an item, or measurements that describe the number of times a user listened to a song.

Configuration

SQL Query

You need to input the SQL that selects your facts. After you’ve entered your SQL you can click Run Query. Confidence then executes your query with a limit to check that everything is in order and show you some sample rows. Here is a checklist of things to consider when writing the query for your fact data:
  • ✅ Always have an index (also called Time partitioning in BigQuery) on the timestamp column to ensure that Confidence only queries the data within a partition, without this each query may require a full table scan.
  • ✅ If you can, also set up an index on the entity columns (also called Clustered Table in BigQuery), this helps improve the performance of the join that Confidence does between your exposures and facts.
  • ✅ If you can, perform heavy transformations in an upstream step and store the result in an intermediate table to avoid potentially redoing the transformations.
  • ✅ Remove duplicate rows (multiple rows per entity is fine) to avoid double counting in your metrics.
  • ✅ Check that the distribution of nulls, zeros and other anomalous values in your data is as you expect to ensure that your metrics are precise.
You have some placeholders at your disposal if you need to specify what partition you’re querying: {START_TIME} and {END_TIME}. These two parameters vary depending on the requirements of the metric that uses the fact table. You don’t need to use them as Confidence always filters on your timestamp column, but sometimes your table format includes the date. These placeholders come as the Timestamp type of your data warehouse.

Entities

You need to specify what entities the facts belong to. You must have at least one, but you can also have more. Typically, you might have a user that triggered the fact, but may also have related entities in the same fact if the user interacted with other entities.

Timestamp Column

The timestamp column is the column that identifies the time that the fact occurred. This column can either be a timestamp with a specific time point for the fact, or you can select a date column. Selecting a date column indicates that any metric calculation that includes this day should include this fact. It’s the same as specifying a time that is at 23:59:59.

Measurements

A measurement is the value that you aggregate into a metric. The measurement column can either be numeric or a boolean. A boolean measurement treats true as the number 1, and false as the number 0.

Dimensions

A dimension is a value that you use to categorize a fact. A typical example is assigning a “Device” or “Browser” dimension. Unlike dimensions defined by a dimension table dimensions, fact table dimensions are usually determined at runtime and can have different values for the same entity within a metric.

Data Delivery Cadence

For configuring data delivery cadence, see the data delivery cadence page.