Skip to content

Gold Ethiopia Coffee 2020

s3://trase-storage/ethiopia/trade/bol/2020/gold/gold_ethiopia_coffee_2020.parquet

Dbt path: trase_production.main_ethiopia_coffee.gold_ethiopia_coffee_2020

Explore on Metabase: Full table; summary statistics

Containing yaml file link: trase/data_pipeline/models/ethiopia/trade/bol/2020/_schema_ethiopia_coffee.yml

Model file link: trase/data_pipeline/models/ethiopia/trade/bol/2020/gold_ethiopia_coffee_2020.py

Dbt test runs & lineage: Test results ยท Lineage

Full dbt_docs page: Open in dbt docs (includes lineage graph -at the bottom right-, tests, and downstream dependencies)

Tags: gold, ethiopia, coffee, trade, 2020, diet-trase-coffee


gold_ethiopia_coffee_2020

Description

Pre-processing done initially through an R script called by silver_ethiopia_coffee_2020 which * Loads the vendor excel file * Cleans the country names * Fixes 3 outliers based on exchange rate, and a record without FOB * Checks against comtrade in total, and per country

Then a python file called by gold_ethiopia_coffee_2020: * Cleans additional country names, using as reference the database * Gets the related trader information from the database * Note that a runbook ingested the missing traders


Details

Column Type Description
year INTEGER
exporter_label VARCHAR
exporter_node_id INTEGER
exporter_group_name VARCHAR
country_of_destination VARCHAR
mass_tonnes DOUBLE
fob DOUBLE
importer_label VARCHAR
port_of_export_label VARCHAR
hs6 VARCHAR

Models / Seeds

  • model.trase_duckdb.silver_ethiopia_coffee_2020

No called script or script source not found.

# gold_ethiopia_coffee_2020
from trase.data.ethiopia.trade.bol.y2020.gold import gold_ethiopia_coffee_2020


def model(dbt, session):
    dbt.config(materialized="external")

    # Declaring the sources so they appear in the documentation / lineage
    silver_trade_data = dbt.ref("silver_ethiopia_coffee_2020")
    df = gold_ethiopia_coffee_2020.get_gold_ethiopia_coffee_2020()

    final_data = session.sql(
        f"""
        WITH final_fields AS (
            SELECT
                YEAR::INT AS year,
                EXPORTER_NAME AS exporter_label,
                EXPORTER_ID AS exporter_node_id,
                EXPORTER_GROUP AS exporter_group_name,
                COUNTRY_DESTINATION AS country_of_destination,
                NET_WEIGHT/1000 AS mass_tonnes,
                FOB_VALUE_IN_USD AS fob,
                BUYER_NAME AS importer_label,
                CAST(NULL AS VARCHAR) AS port_of_export_label,
                HS_CODE AS hs6
            FROM df
        )
        SELECT * FROM final_fields
    """
    ).arrow()
    # Saving as arrow as pandas converts the port_of_export_label to int32 even with casting

    return final_data