DBT: Egg Production 2019
File location: s3://trase-storage/brazil/flow_constraints/production/EGG_PRODUCTION_2019.csv
DBT model name: egg_production_2019
Explore on Metabase: Full table; summary statistics
DBT details
- Lineage
  - Dbt path: trase_production.main_brazil.egg_production_2019
  - Containing yaml link: trase/data_pipeline/models/brazil/flow_constraints/production/_schema.yml
  - Model file: trase/data_pipeline/models/brazil/flow_constraints/production/egg_production_2019.py
  - Calls script: trase/data/brazil/flow_constraints/production/EGG_PRODUCTION_2019.py
- Tags: mock_model, brazil, flow_constraints, production
Description
This model was auto-generated from the .yml 'lineage' files in S3. The DBT model itself only raises an error; the script that actually creates the data lives outside the dbt project, at trase/data/brazil/flow_constraints/production/EGG_PRODUCTION_2019.py [permalink]. It was last run by Harry Biddle.
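For ad-hoc inspection outside Metabase, the published CSV can be read straight from the S3 location above. A minimal sketch, assuming the s3fs package is installed and AWS credentials for trase-storage are configured; the delimiter of the output file isn't documented here, so it is sniffed rather than assumed:

```python
import pandas as pd

# Read the published CSV straight from S3 (requires the s3fs package and
# AWS credentials for the trase-storage bucket).
df = pd.read_csv(
    "s3://trase-storage/brazil/flow_constraints/production/EGG_PRODUCTION_2019.csv",
    sep=None,  # sniff the delimiter, since it isn't documented here
    engine="python",  # needed for delimiter sniffing
    dtype={"GEOCODMUN": str},  # read municipality codes as strings, as the script does
)
print(df.head())
```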
Details
| Column | Type | Description |
|---|---|---|
Models / Seeds
- model.trase_duckdb.milk_eggs_honey_wool_production_2019
Script (trase/data/brazil/flow_constraints/production/EGG_PRODUCTION_2019.py):

```python
from trase.tools.aws.aws_helpers_cached import get_pandas_df_once
from trase.tools.aws.metadata import write_csv_for_upload

# read the IBGE source table into pandas
df = get_pandas_df_once(
    "brazil/production/statistics/ibge/milk_eggs_honey_wool/MILK_EGGS_HONEY_WOOL_PRODUCTION_2019.csv",
    sep=";",
    converters={"GEOCODMUN": str, "CHICKEN_EGGS_THOUSAND_DOZENS": int},
)

# convert thousand-dozens to egg counts and keep only the columns we need
df["EGGS"] = 12_000 * df["CHICKEN_EGGS_THOUSAND_DOZENS"]
df = df[["GEOCODMUN", "EGGS"]]
df["TYPE"] = "EGGS"
df["YEAR"] = 2019

# some quick QA: 7-digit municipality codes, one row per municipality,
# non-negative egg counts
assert all(df["GEOCODMUN"].str.len() == 7)
assert df["GEOCODMUN"].is_unique
assert all(df["EGGS"] >= 0)

# done! we let the user upload to S3
write_csv_for_upload(df, "brazil/flow_constraints/production/EGG_PRODUCTION_2019.csv")
```
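For reference, the unit conversion in the script: IBGE reports chicken egg production in thousands of dozens, so one reported unit is 12 × 1,000 = 12,000 eggs. A worked example with a made-up value:

```python
# One reported unit is a thousand dozens: 12 eggs/dozen * 1_000 dozens = 12_000 eggs.
reported_thousand_dozens = 250  # hypothetical value for one municipality
eggs = 12_000 * reported_thousand_dozens
assert eggs == 3_000_000
```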
Model file (trase/data_pipeline/models/brazil/flow_constraints/production/egg_production_2019.py):

```python
import pandas as pd


def model(dbt, cursor):
    # Registering the ref records the dependency in dbt's DAG.
    dbt.ref("milk_eggs_honey_wool_production_2019")
    # Mock model: the data is produced by the standalone script, not by dbt.
    raise NotImplementedError()
    # Never reached; keeps a DataFrame return in the function body.
    return pd.DataFrame({"hello": ["world"]})
```
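The dbt.ref call is what makes dbt record milk_eggs_honey_wool_production_2019 as an upstream dependency when it parses the model, so the lineage above is tracked even though executing the model always raises. For contrast, a minimal sketch of the conventional dbt Python model contract, which returns the DataFrame for dbt to materialize (the row below is a placeholder, not real data):

```python
import pandas as pd


def model(dbt, cursor):
    # A non-mock model would transform its upstream ref and return the result.
    dbt.config(materialized="table")
    return pd.DataFrame(
        {"GEOCODMUN": ["1100015"], "EGGS": [0], "TYPE": ["EGGS"], "YEAR": [2019]}
    )
```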