BigQuery
Authentication
To connect with BigQuery we use service accounts, a special kind of account for non-human users. See Understanding Service Accounts for more information. You will likely need privileged access to your Google infrastructure in order to create a service account.
Creating a Service Account
In your Google Cloud dashboard, go to the 'Service Accounts' page.
There you will be able to create a new service account.
Give the service account an appropriate name and continue.
Grant Causal access to BigQuery with the 'BigQuery Admin' role and continue.
It is also possible to configure a more restricted account by creating a new role for the Causal service account and granting access to particular datasets and tables.
The steps are:
The third step is unnecessary, so you can now click 'done'.
Creating a New Key
Go back to the service accounts page and find the new service account. Click on it, then open the 'keys' section and choose 'Create new key' from the 'add key' action.
Create a JSON key:
This will download the file you need to authenticate with Causal. You will be prompted to upload the file when creating a new BigQuery data source.
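Before uploading, it can be worth confirming that the downloaded file really is a service account key. The sketch below assumes the usual layout of Google's JSON keys (fields such as type, project_id, private_key and client_email). The function name check_service_account_key is illustrative and is not part of Causal or of any Google library.

```python
import json

# Fields that Google's service account JSON keys normally contain (assumed layout).
REQUIRED_FIELDS = ("type", "project_id", "private_key", "client_email")


def check_service_account_key(text):
    """Return a list of problems found in the key file contents (empty if none)."""
    try:
        key = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc}"]
    if not isinstance(key, dict):
        return ["top-level JSON value is not an object"]
    # Report every required field that is absent.
    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if name not in key]
    # A key of another kind (for example a user credential) has a different type.
    if key.get("type") not in (None, "service_account"):
        problems.append(f"unexpected type: {key['type']}")
    return problems
```

Passing the contents of the downloaded file to this function returns an empty list when nothing looks wrong. It only inspects the structure of the file and does not verify that the key is accepted by Google.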
Limits
In order to prevent the accidental import of large amounts of data, there is a limit on the number of rows (150,000) and the number of items you can have in a category (100).
If you would like these limits removed, please contact us through live chat. Please note that data sources that exceed these limits can negatively affect the performance of your model.
Tip: to reduce the size of your data, aggregate it to the granularity you intend to use in Causal (day, week, month) using SQL.
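As a rough illustration of the tip above, the following sketch builds a BigQuery query that truncates dates to the chosen granularity and sums a numeric column per period and category. The helper aggregated_query and all table and column names are hypothetical. It assumes the date column has the DATE type (DATE_TRUNC is BigQuery's truncation function for dates) and that the identifiers come from a trusted source, since they are interpolated directly into the SQL.

```python
# Granularities mentioned in the tip, mapped to BigQuery date parts.
DATE_PARTS = {"day": "DAY", "week": "WEEK", "month": "MONTH"}


def aggregated_query(table, date_col, value_col, category_col, granularity):
    """Build a query that sums value_col per period and category."""
    part = DATE_PARTS[granularity]  # raises KeyError for an unsupported granularity
    return (
        f"SELECT DATE_TRUNC({date_col}, {part}) AS period, "
        f"{category_col}, SUM({value_col}) AS {value_col} "
        f"FROM `{table}` "
        f"GROUP BY period, {category_col} "
        f"ORDER BY period"
    )


# Hypothetical table and columns, for illustration only.
print(aggregated_query("my_project.my_dataset.orders", "order_date", "revenue", "region", "month"))
```

Aggregating this way yields one row per period and category instead of one row per raw record, which is usually what keeps a source under the row limit.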
Configuration
To configure your BigQuery data source you will need to specify the query, the date column, and any variable columns. The date column should be one of BigQuery's date formats (DATE, DATETIME or TIMESTAMP) and variable columns should be numeric (INTEGER, FLOAT, NUMERIC or BIGNUMERIC).
Any columns that aren't a date or variable column will be considered a category, and should have a STRING type. An exception is the 'cohort' category, which should be a DATE, with the column header explicitly labelled Cohort.
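The column rules above can be sketched as a small check over a query's result schema. This is a minimal illustration and not Causal's actual validation: the function classify_columns and the schema dictionary (column name mapped to type name) are assumptions, and the type names are the ones used in the text above.

```python
DATE_TYPES = {"DATE", "DATETIME", "TIMESTAMP"}
NUMERIC_TYPES = {"INTEGER", "FLOAT", "NUMERIC", "BIGNUMERIC"}


def classify_columns(schema, date_column, variable_columns):
    """Map each column name to its role, raising ValueError on a type mismatch."""
    roles = {}
    for name, col_type in schema.items():
        if name == date_column:
            role, allowed = "date", DATE_TYPES
        elif name in variable_columns:
            role, allowed = "variable", NUMERIC_TYPES
        elif name == "Cohort":
            # The cohort category is the one category that must be a DATE.
            role, allowed = "cohort", {"DATE"}
        else:
            # Everything else is treated as a category and must be a STRING.
            role, allowed = "category", {"STRING"}
        if col_type not in allowed:
            raise ValueError(
                f"{name}: {role} column has type {col_type}, "
                f"expected one of {sorted(allowed)}"
            )
        roles[name] = role
    return roles
```

For example, a schema with a DATE column day, a FLOAT column revenue, a STRING column region and a DATE column Cohort, configured with day as the date column and revenue as a variable column, is classified as date, variable, category and cohort respectively.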
For more details about how to write queries for Causal data sources check out Table Format.