Configuration
SQL Query
You need to input the SQL that selects your facts. After you’ve entered your SQL you can click Run Query. Confidence then executes your query with a limit to check that everything is in order and show you some sample rows. Here is a checklist of things to consider when writing the query for your fact data:- ✅ Always have an index (also called Time partitioning in BigQuery) on the timestamp column to ensure that Confidence only queries the data within a partition, without this each query may require a full table scan.
- ✅ If you can, also set up an index on the entity columns (also called Clustered Table in BigQuery), this helps improve the performance of the join that Confidence does between your exposures and facts.
- ✅ If you can, perform heavy transformations in an upstream step and store the result in an intermediate table to avoid potentially redoing the transformations.
- ✅ Remove duplicate rows (multiple rows per entity is fine) to avoid double counting in your metrics.
- ✅ Check that the distribution of nulls, zeros and other anomalous values in your data is as you expect to ensure that your metrics are precise.
You have some placeholders at your disposal if you need to specify what
partition you’re querying:
{START_TIME} and {END_TIME}. These two
parameters vary depending on the requirements of the metric that uses
the fact table. You don’t need
to use them as Confidence always filters on your timestamp column, but
sometimes your table format includes the date. These placeholders come as the
Timestamp type of your data warehouse.Entities
You need to specify what entities the facts belong to. You must have at least one, but you can also have more. Typically, you might have a user that triggered the fact, but may also have related entities in the same fact if the user interacted with other entities.Timestamp Column
The timestamp column is the column that identifies the time that the fact occurred. This column can either be a timestamp with a specific time point for the fact, or you can select a date column. Selecting a date column indicates that any metric calculation that includes this day should include this fact. It’s the same as specifying a time that is at 23:59:59.Measurements
A measurement is the value that you aggregate into a metric. The measurement column can either be numeric or a boolean. A boolean measurement treatstrue as the number 1, and false as the number 0.

