
DBT: Diet Trase Coffee 2020 External

File location: s3://trase-storage/diet-trase/diet_trase_coffee_2020_external.parquet

DBT model name: diet_trase_coffee_2020_external

Explore on Metabase: Full table; summary statistics

Explore dependencies/lineage: link


Description

Shareable version of the results with a selected set of fields and only including exports.


Details

Column                                 Type     Description
year                                   BIGINT
country_of_production                  VARCHAR
country_of_production_iso2             VARCHAR
port_of_export_name                    VARCHAR
hs6                                    VARCHAR
exporter_name                          VARCHAR
exporter_node_id                       BIGINT
exporter_group                         VARCHAR
importer_name                          VARCHAR
importer_group                         VARCHAR
country_of_first_import                VARCHAR
country_of_first_import_iso2           VARCHAR
country_of_first_import_economic_bloc  VARCHAR
mass_tonnes                            DOUBLE
fob                                    DOUBLE
fob DOUBLE

Review the full report, including sample error records if they exist (link)

Test name                                                                                                   Test column            Last test run     Last status
accepted_values_diet_trase_coffee_2020_external_year__2020                                                  year                   2026-04-25 13:23  pass
check_trader_groups_diet_trase_coffee_2020_external_exporter_group__2020                                    (none)                 2026-04-25 13:23  pass
dbt_utils_expression_is_true_diet_trase_coffee_2020_external_mass_tonnes___0                                mass_tonnes            2026-04-25 13:23  pass
not_null_diet_trase_coffee_2020_external_country_of_production                                              country_of_production  2026-04-25 13:23  pass
relationships_diet_trase_coffee_2020_external_country_of_production__country_name__ref_postgres_countries_  country_of_production  2026-04-25 13:23  pass
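
The row-level tests listed above can be approximated in plain Python for a quick local sanity check. This is only a sketch: `check_row` and the failure labels are hypothetical names, and the `mass_tonnes` rule is assumed to be `mass_tonnes >= 0` because the test name shows only a trailing `0`. The `relationships` and `check_trader_groups` tests are omitted since they depend on reference data not shown here.

```python
from typing import Any


def check_row(row: dict[str, Any]) -> list[str]:
    """Return labels of the checks that this row fails (empty list = all pass)."""
    failures = []

    # accepted_values: year must be 2020
    if row.get("year") != 2020:
        failures.append("accepted_values_year")

    # not_null: country_of_production must be present
    if row.get("country_of_production") is None:
        failures.append("not_null_country_of_production")

    # expression_is_true: ASSUMED to be mass_tonnes >= 0.
    # A missing value is not flagged here, mirroring SQL NULL semantics.
    mass = row.get("mass_tonnes")
    if mass is not None and mass < 0:
        failures.append("expression_is_true_mass_tonnes")

    return failures


# Toy rows for illustration only, not real data
good = {"year": 2020, "country_of_production": "X", "mass_tonnes": 1.5}
bad = {"year": 2019, "country_of_production": None, "mass_tonnes": -1.0}
```

Calling `check_row(good)` returns an empty list, while `check_row(bad)` returns all three failure labels.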

Not referenced by any model or exposure.

Models

No called script, or the script source was not found.

"""
Shareable version of the Diet Trase results with a selected set of fields and only including exports.
"""

import polars as pl


def model(dbt, cursor):
    dbt.config(
        materialized="external",
    )

    lf = dbt.ref("diet_trase_coffee_2020").pl(lazy=True)

    # Keep exports only by dropping domestic flows
    lf = lf.filter(~pl.col("is_domestic"))

    columns = [
        "year",
        "country_of_production",
        "country_of_production_iso2",
        "port_of_export_name",
        "hs6",
        "exporter_name",
        "exporter_node_id",
        "exporter_group",
        "importer_name",
        "importer_group",
        "country_of_first_import",
        "country_of_first_import_iso2",
        "country_of_first_import_economic_bloc",
        "mass_tonnes",
        "fob",
    ]

    lf = lf.select(columns)

    # Sum mass_tonnes and fob, grouped by all the other selected columns
    group_cols = [c for c in columns if c not in ("mass_tonnes", "fob")]
    lf = (
        lf.group_by(group_cols)
        .agg(
            [
                pl.sum("mass_tonnes").alias("mass_tonnes"),
                pl.sum("fob").alias("fob"),
            ]
        )
        .with_columns(
            [
                pl.col("mass_tonnes").round(4),
                pl.col("fob").round(2),
            ]
        )
    )

    return lf
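
To make the effect of the model above concrete, here is a minimal standard-library sketch of the same three steps (drop domestic rows, sum `mass_tonnes` and `fob` per unique combination of the remaining columns, then round). It is not the model itself: `aggregate_exports` is a hypothetical helper, the rows are toy data, only two grouping columns are used for brevity, and it assumes no null values in the summed columns.

```python
from collections import defaultdict

VALUE_COLS = ("mass_tonnes", "fob")


def aggregate_exports(rows, group_cols):
    """Drop domestic rows, then sum mass_tonnes and fob per unique key."""
    totals = defaultdict(lambda: {col: 0.0 for col in VALUE_COLS})
    for row in rows:
        if row["is_domestic"]:
            continue  # exports only
        key = tuple(row[col] for col in group_cols)
        for col in VALUE_COLS:
            totals[key][col] += row[col]
    # Round as the model does: 4 decimals for mass, 2 for fob
    return [
        {
            **dict(zip(group_cols, key)),
            "mass_tonnes": round(sums["mass_tonnes"], 4),
            "fob": round(sums["fob"], 2),
        }
        for key, sums in totals.items()
    ]


# Toy rows for illustration only, not real data
rows = [
    {"exporter_name": "A", "country_of_first_import": "X",
     "is_domestic": False, "mass_tonnes": 1.25, "fob": 100.0},
    {"exporter_name": "A", "country_of_first_import": "X",
     "is_domestic": False, "mass_tonnes": 2.5, "fob": 50.5},
    {"exporter_name": "A", "country_of_first_import": "Y",
     "is_domestic": True, "mass_tonnes": 9.0, "fob": 1.0},
    {"exporter_name": "B", "country_of_first_import": "X",
     "is_domestic": False, "mass_tonnes": 0.5, "fob": 10.0},
]

result = aggregate_exports(rows, ["exporter_name", "country_of_first_import"])
```

The two export rows for exporter `A` collapse into a single row, the domestic row is discarded, and `is_domestic` itself does not appear in the output, which matches the column list of the shareable table.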