
DBT: Diet Trase Coffee 2020 External

File location: s3://trase-storage/diet-trase/diet_trase_coffee_2020_external.parquet

DBT model name: diet_trase_coffee_2020_external

Explore on Metabase: Full table; summary statistics

DBT details


Description

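Shareable version of the Diet Trase coffee 2020 results: a selected set of fields, export flows only (domestic flows are excluded), with mass_tonnes and fob summed over the remaining fields.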


Details

Column Type Description
year BIGINT
country_of_production VARCHAR
country_of_production_iso2 VARCHAR
port_of_export_name VARCHAR
hs6 VARCHAR
exporter_name VARCHAR
exporter_node_id BIGINT
exporter_group VARCHAR
importer_name VARCHAR
importer_group VARCHAR
country_of_first_import VARCHAR
country_of_first_import_iso2 VARCHAR
country_of_first_import_economic_bloc VARCHAR
mass_tonnes DOUBLE
fob DOUBLE

Models / Seeds

  • model.trase_duckdb.diet_trase_coffee_2020

No called script, or script source not found.

"""
Shareable version of the Diet Trase results with a selected set of fields and only including exports.
"""

import polars as pl


def model(dbt, cursor):
    dbt.config(
        materialized="external",
    )

    lf = dbt.ref("diet_trase_coffee_2020").pl(lazy=True)

    # keep export flows only: drop rows flagged as domestic
    lf = lf.filter(~pl.col("is_domestic"))

    columns = [
        "year",
        "country_of_production",
        "country_of_production_iso2",
        "port_of_export_name",
        "hs6",
        "exporter_name",
        "exporter_node_id",
        "exporter_group",
        "importer_name",
        "importer_group",
        "country_of_first_import",
        "country_of_first_import_iso2",
        "country_of_first_import_economic_bloc",
        "mass_tonnes",
        "fob",
    ]

    lf = lf.select(columns)

    # sum mass_tonnes and fob, grouped by every other selected column
    group_cols = [c for c in columns if c not in ("mass_tonnes", "fob")]
    lf = (
        lf.group_by(group_cols)
        .agg(
            [
                pl.sum("mass_tonnes").alias("mass_tonnes"),
                pl.sum("fob").alias("fob"),
            ]
        )
        .with_columns(
            [
                pl.col("mass_tonnes").round(4),
                pl.col("fob").round(2),
            ]
        )
    )

    return lf
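
The filter, group, sum, and round steps in the model above can be illustrated on a few rows without the dbt or Polars runtime. The following is a minimal plain-Python sketch, not the model's actual implementation: the helper name `aggregate_exports`, the reduced set of grouping columns, and all row values are made up for illustration.

```python
from collections import defaultdict


def aggregate_exports(rows, group_cols, value_decimals):
    """Drop domestic rows, sum each value column per group, round the sums.

    `value_decimals` maps each value column to its number of decimal places,
    mirroring round(4) for mass_tonnes and round(2) for fob in the model.
    """
    totals = defaultdict(lambda: dict.fromkeys(value_decimals, 0.0))
    for row in rows:
        if row["is_domestic"]:  # same intent as filter(~pl.col("is_domestic"))
            continue
        key = tuple(row[c] for c in group_cols)
        for col in value_decimals:
            totals[key][col] += row[col]
    return [
        {
            **dict(zip(group_cols, key)),
            **{c: round(v, value_decimals[c]) for c, v in sums.items()},
        }
        for key, sums in totals.items()
    ]


# Hypothetical rows; the real table has many more grouping columns.
rows = [
    {"exporter_name": "A", "country_of_first_import": "DE",
     "is_domestic": False, "mass_tonnes": 1.5, "fob": 100.0},
    {"exporter_name": "A", "country_of_first_import": "DE",
     "is_domestic": False, "mass_tonnes": 2.25, "fob": 250.5},
    {"exporter_name": "A", "country_of_first_import": "BR",
     "is_domestic": True, "mass_tonnes": 9.0, "fob": 0.0},
    {"exporter_name": "B", "country_of_first_import": "US",
     "is_domestic": False, "mass_tonnes": 0.12346, "fob": 10.126},
]

result = aggregate_exports(
    rows,
    group_cols=["exporter_name", "country_of_first_import"],
    value_decimals={"mass_tonnes": 4, "fob": 2},
)
for record in result:
    print(record)
```

The two export rows for exporter A collapse into a single record, the domestic row is dropped, and rounding is applied only after summation, which matches the order of operations in the model. Null handling is not covered here and may differ from Polars.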