Pork Production Before 2018

s3://trase-storage/brazil/flow_constraints/production/PORK_PRODUCTION_BEFORE_2018.csv

Dbt path: trase_production.main_brazil.pork_production_before_2018

Explore on Metabase: Full table; summary statistics

Containing yaml file link: trase/data_pipeline/models/brazil/flow_constraints/production/_schema.yml

Model file link: trase/data_pipeline/models/brazil/flow_constraints/production/pork_production_before_2018.py

Calls script: trase/data/brazil/flow_constraints/production/CATTLE_CHICKEN_PORK_PRODUCTION_BEFORE_2018.py

Dbt test runs & lineage: Test results · Lineage

Full dbt_docs page: Open in dbt docs (includes the lineage graph at the bottom right, tests, and downstream dependencies)

Tags: mock_model, brazil, flow_constraints, production

pork_production_before_2018

Description

This model was auto-generated from the .yml 'lineage' files in S3. The dbt model itself only raises an error; the data was produced by a separate script, located at trase/data/brazil/flow_constraints/production/CATTLE_CHICKEN_PORK_PRODUCTION_BEFORE_2018.py [permalink]. That script was last run by Harry Biddle.

Details

Columns · Depends On · Called script code · Model code

Column	Type	Description

Models / Seeds

model.trase_duckdb.sif_2018
model.trase_duckdb.sif_2015_2017

Called script code

import pandas as pd

from trase.tools.aws import get_pandas_df
from trase.tools.aws.metadata import write_csv_for_upload


# Load both SIF extracts (semicolon-separated) and stack them into one frame.
df = pd.concat(
    [
        get_pandas_df(
            "brazil/production/statistics/sigsif/out/SIF_2015_2017.csv",
            sep=";",
        ),
        get_pandas_df(
            "brazil/production/statistics/sigsif/out/SIF_2018.csv",
            sep=";",
        ),
    ]
)

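# Align the input column names with the names used in the output files.
df = df.rename(
    columns={
        "GEOCODE": "GEOCODMUN",
        "STATE_SLAUGHTER": "STATE_OF_SLAUGHTER",
        "QUANTITY": "HEADS",
    },
)
# Keep only the grouping keys and the quantity to be summed.
df = df[["GEOCODMUN", "STATE_OF_SLAUGHTER", "TYPE", "YEAR", "HEADS"]]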

# Write one file per animal type; this page documents the PORK output.
for animal in ["CATTLE", "CHICKEN", "PORK"]:
    df_animal = df[df["TYPE"] == animal]
    # Sum HEADS per municipality, state of slaughter, type and year.
    df_animal = (
        df_animal.groupby(["GEOCODMUN", "STATE_OF_SLAUGHTER", "TYPE", "YEAR"])
        .sum()
        .reset_index()
    )
    write_csv_for_upload(
        df_animal,
        f"brazil/flow_constraints/production/{animal}_PRODUCTION_BEFORE_2018.csv",
    )
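
The aggregation step in the script above can be tried out without S3 access. The following is a minimal sketch, assuming a small in-memory frame in place of the SIF inputs. The rows are made up for illustration, and the helper name `aggregate_by_animal` is hypothetical and not part of the Trase codebase.

```python
import pandas as pd

# Made-up rows standing in for the concatenated, renamed SIF inputs (not real data).
df = pd.DataFrame(
    {
        "GEOCODMUN": ["0000001", "0000001", "0000002", "0000001"],
        "STATE_OF_SLAUGHTER": ["XX", "XX", "XX", "XX"],
        "TYPE": ["PORK", "PORK", "PORK", "CATTLE"],
        "YEAR": [2017, 2017, 2018, 2017],
        "HEADS": [10, 5, 7, 100],
    }
)


def aggregate_by_animal(df: pd.DataFrame, animal: str) -> pd.DataFrame:
    """Hypothetical helper: keep one animal type and sum HEADS per key."""
    keys = ["GEOCODMUN", "STATE_OF_SLAUGHTER", "TYPE", "YEAR"]
    return df[df["TYPE"] == animal].groupby(keys).sum().reset_index()


pork = aggregate_by_animal(df, "PORK")
print(pork)
```

In this sketch the two 2017 PORK rows for the same municipality collapse into a single row, which mirrors what the script does before writing each per-animal file.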

Model code

import pandas as pd


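def model(dbt, cursor):
    # Reference the upstream models so they appear in the lineage.
    dbt.ref("sif_2018")
    dbt.ref("sif_2015_2017")

    # Mock model: the data is produced by the called script, not by dbt.
    raise NotImplementedError()
    # Unreachable placeholder return left by the auto-generated template.
    return pd.DataFrame({"hello": ["world"]})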
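
Because the model body raises before it returns, running this model fails by design, while the two `dbt.ref` calls still name the upstream models. A minimal sketch of that behaviour, assuming a stand-in class `FakeDbt` (hypothetical, not dbt's real object) that only records `ref` calls:

```python
class FakeDbt:
    """Hypothetical stand-in for the dbt object; it only records ref() calls."""

    def __init__(self):
        self.refs = []

    def ref(self, name):
        self.refs.append(name)


def model(dbt, cursor):
    # Same statements as the mock model above, minus the unreachable return.
    dbt.ref("sif_2018")
    dbt.ref("sif_2015_2017")

    raise NotImplementedError()


fake = FakeDbt()
try:
    model(fake, cursor=None)
    raised = False
except NotImplementedError:
    raised = True

print(fake.refs, raised)
```

The sketch only illustrates the control flow; how dbt itself discovers the references in a Python model is not shown here.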