
Recipe for creating datasets

In this recipe, we'll walk through using Data Definition Language (DDL) to create a new dataset in BigQuery, setting the foundation for organizing and managing your data efficiently.

Configure your BigQuery Datasets Using SQL


Ingredients

  • BigQuery project

  • DDL

  • BigQuery Admin role (or another role that includes the bigquery.datasets.create permission)

  • The recipe for defining the location your code will run in

Instructions

  1. Open the BigQuery UI in your project and select the location where you’ll be setting up the datasets.

  2. Copy the code into your query window.

  3. [note] Row 1 creates the dataset only if it does not already exist. If you leave this clause out and the dataset exists, the code will fail.

  4. The options let you control everything you can see on the dataset info screen, which means you can deploy many datasets quickly.

  5. You can add this code to Python as a function and spin up hundreds of datasets with only limited Python upskilling (GitHub code to be added).
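The code the steps refer to is not reproduced on this page, so here is a minimal sketch of what it might look like, wrapped in Python as step 5 suggests. The function names, the example project and dataset names, and the choice of options are all assumptions for illustration, not the original recipe code. The generated statement uses BigQuery's CREATE SCHEMA IF NOT EXISTS DDL, so its first row matches the note in step 3.

```python
from typing import Dict, Iterable, Optional


def sql_string(value: str) -> str:
    """Return value as a single-quoted SQL string literal."""
    escaped = value.replace("\\", "\\\\").replace("'", "\\'")
    return f"'{escaped}'"


def build_create_dataset_ddl(
    project_id: str,
    dataset_name: str,
    location: str,
    description: str = "",
    default_table_expiration_days: Optional[float] = None,
    labels: Optional[Dict[str, str]] = None,
) -> str:
    """Build one CREATE SCHEMA statement (a dataset is a schema in BigQuery DDL)."""
    options = [f"location = {sql_string(location)}"]
    if description:
        options.append(f"description = {sql_string(description)}")
    if default_table_expiration_days is not None:
        options.append(
            f"default_table_expiration_days = {default_table_expiration_days}"
        )
    if labels:
        pairs = ", ".join(
            f"({sql_string(key)}, {sql_string(val)})" for key, val in labels.items()
        )
        options.append(f"labels = [{pairs}]")
    option_lines = ",\n  ".join(options)
    # Row 1: IF NOT EXISTS makes the statement safe to run again.
    return (
        f"CREATE SCHEMA IF NOT EXISTS `{project_id}.{dataset_name}`\n"
        f"OPTIONS (\n  {option_lines}\n);"
    )


def build_many_datasets_ddl(
    project_id: str, dataset_names: Iterable[str], location: str
) -> str:
    """Join one statement per dataset into a single multi-statement script."""
    return "\n\n".join(
        build_create_dataset_ddl(project_id, name, location)
        for name in dataset_names
    )


def run_in_bigquery(sql: str, project_id: str, location: str) -> None:
    """Run the script; needs the google-cloud-bigquery package and credentials."""
    from google.cloud import bigquery  # imported lazily so the builders work offline

    client = bigquery.Client(project=project_id)
    client.query(sql, location=location).result()  # wait for completion


if __name__ == "__main__":
    # Hypothetical project and dataset names, for illustration only.
    print(
        build_create_dataset_ddl(
            "my-project", "finance_raw", "EU", description="Raw finance data"
        )
    )
```

Printing the statement first lets you paste it into the query window and check it before calling run_in_bigquery, which is the only part that touches your project.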

Useful tips

  • It’s useful to pre-define your datasets and decide on a schema structure.

  • I recommend sticking with lower snake case.

  • Plan for future requirements. For example, put a department prefix on each dataset.

  • Name your datasets consistently so it’s easy to identify what each one should contain.
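One way to enforce these naming tips before any dataset is created is a small check like the sketch below. The function name and the example prefixes are hypothetical; the 1,024-character cap reflects BigQuery's documented limit on dataset names, and the lower snake case pattern is stricter than what BigQuery itself accepts.

```python
import re
from typing import Iterable

# Lower snake case: a lowercase letter first, then lowercase letters or digits,
# with single underscores between words.
SNAKE_CASE = re.compile(r"[a-z][a-z0-9]*(?:_[a-z0-9]+)*")
MAX_DATASET_NAME_LENGTH = 1024  # BigQuery dataset names can be at most this long


def is_valid_dataset_name(name: str, allowed_prefixes: Iterable[str]) -> bool:
    """Check for lower snake case plus a known department prefix."""
    if len(name) > MAX_DATASET_NAME_LENGTH or not SNAKE_CASE.fullmatch(name):
        return False
    prefix, separator, rest = name.partition("_")
    # Require something after the prefix, e.g. "finance_raw", not just "finance".
    return bool(separator) and bool(rest) and prefix in set(allowed_prefixes)


if __name__ == "__main__":
    departments = ["finance", "hr"]  # hypothetical department prefixes
    for candidate in ["finance_raw_sales", "Finance_Raw", "marketing_events"]:
        print(candidate, is_valid_dataset_name(candidate, departments))
```

Running a check like this over your planned list of names catches inconsistent casing or a missing prefix while it is still cheap to fix.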

Output

If your datasets did not already exist, they will now be present. Woop!

Advanced method

The code above can be set up so that you only need to change the first 4 lines; the code then writes and runs itself.
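The advanced code itself is not shown on this page, so the following is only one possible reading of it: a BigQuery script whose first four lines are DECLARE statements, followed by an EXECUTE IMMEDIATE that builds and runs the DDL from those variables. The variable names and the choice of which four values to expose are assumptions. The sketch generates that script from Python so it matches the earlier suggestion of wrapping the code in a function.

```python
def quote(value: str) -> str:
    """Return value as a single-quoted SQL string literal."""
    escaped = value.replace("\\", "\\\\").replace("'", "\\'")
    return f"'{escaped}'"


# The dynamic part never changes: FORMAT fills in the declared variables.
# %s inserts the raw identifier text; %T inserts a quoted string literal.
DYNAMIC_BODY = '''EXECUTE IMMEDIATE FORMAT("""
CREATE SCHEMA IF NOT EXISTS `%s.%s`
OPTIONS (location = %T, description = %T)
""", project_id, dataset_name, dataset_location, dataset_description);'''


def build_self_running_script(
    project_id: str, dataset_name: str, location: str, description: str
) -> str:
    """Return a BigQuery script where only the first four lines need editing."""
    header = "\n".join(
        [
            f"DECLARE project_id STRING DEFAULT {quote(project_id)};",
            f"DECLARE dataset_name STRING DEFAULT {quote(dataset_name)};",
            f"DECLARE dataset_location STRING DEFAULT {quote(location)};",
            f"DECLARE dataset_description STRING DEFAULT {quote(description)};",
        ]
    )
    return f"{header}\n\n{DYNAMIC_BODY}"


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    print(
        build_self_running_script(
            "my-project", "finance_raw", "EU", "Raw finance data"
        )
    )
```

The printed script can be pasted into the query window as-is; to create a different dataset you would edit only the four DECLARE lines at the top.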
