Skip to content

DBT Introduction

DBT - Data Build Tool

DBT is a tool, which helps us to manage our data transformation logic.

airbyte csv source

Everything you need are some .sql files, that hold your transformation logic, and DBT takes then care of the rest.

While DBT has a lot of features, we want to highlight some of them here:

  • LINEAGE

    DBT has knowledge of your data lineage and knows what comes first and what next. It executes your files in the correct order.

  • TESTING

    You can define and write tests and make sure, that your data quality is always at its best.

  • DOCUMENTATION

    By adding column descriptions in your code, you can make sure, that everyone is one the same page about your data.

  • JINJA

    DBT facilitates the use of the DRY principle (Don't Repeat Yourself) by using jinja templating.

One word on the DRY principle (aka jinja templating).

DBT enables the use of jinja templating. This means, that you can use variables and loops in your SQL statements.

This is especially useful, if you have to repeat the same logic over and over again.

Here is an example:

-- This is a jinja variable
-- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- 
{% set my_favorite_columns = ['column_a', 'column_b', 'column_c'] %}

-- This is a jinja loop
-- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- 
SELECT 
    {% for item in my_variable %}
        item {%- if not loop.last %},{% endif %}
    {% endfor %}
FROM 
    my_schema.my_table

-- This compiles to this:
-- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- 
SELECT 
    column_a, 
    column_b, 
    column_c
FROM
    my_schema.my_table

More on that here: DBT Macros