Gold Brazil Bol Coffee 2020
s3://trase-storage/brazil/coffee/trade/2020/gold/gold_brazil_bol_coffee_2020.parquet
Dbt path: trase_production.main_brazil_coffee.gold_brazil_bol_coffee_2020
Explore on Metabase: Full table; summary statistics
Containing yaml file link: trase/data_pipeline/models/brazil/coffee/trade/_schema_brazil_coffee.yml
Model file link: trase/data_pipeline/models/brazil/coffee/trade/gold_brazil_bol_coffee_2020.sql
Dbt test runs & lineage: Test results ยท Lineage
Full dbt_docs page: Open in dbt docs (includes lineage graph -at the bottom right-, tests, and downstream dependencies)
Tags: gold, brazil, coffee, trade, bol, 2020, diet-trase-coffee
gold_brazil_bol_coffee_2020
Description
Builds upon the already pre-processed general BOL data of Brazil for 2020 in trase-storage-local/brazil/trade/bol/2020/BRAZIL_BOL_2020.csv, which gets further filtered according to HS CODE and cleaned in the silver_brazil_bol_coffee_2020 model upstream.
As the original data doesn't contain FOBs's, these get imputed by taking the average FOB/kg values from Comtrade monthly data, based on destination country, month, and HS Code.
IMPORTANT: There are still no QA checks on these imputed FOB's, as the FOB value hasn't been so important at this stage as to work on this. So if it gets used, it needs to be checked better.
Details
| Column | Type | Description |
|---|---|---|
year |
INTEGER |
Comes from the YEAR field. |
exporter_label |
VARCHAR |
Comes from the EXPORTER_LABEL field. Default definition: Name (label) of the trader, which should appear as one of the synonyms of a trader in the database. It doesn't necessarily have to be the default name however. See Traders-User-Guide.md: 1.2 Synonyms. |
exporter_node_id |
BIGINT |
Trase node id of the exporter (trader) in the database |
exporter_trase_id |
VARCHAR |
Trase id of the exporter if existing, based on the CNPJ number (BR-TRADER-XXXXX) |
exporter_group_name |
VARCHAR |
Official name of the trader group, as it exists in the Trase database. See Traders-Developer-Guide.md: 1. Traders data model. |
country_of_destination |
VARCHAR |
Comes from the COUNTRY_OF_DESTINATION_NAME field. Default definition: Official name of the country, as it exists in the Trase database. |
mass_tonnes |
DOUBLE |
Comes from the VOL field, and is converted to tonnes from kilogrammes. |
fob |
DOUBLE |
FOB in USD. Doesn't exist in the source data. Gets imputed by taking the average FOB/kg values from Comtrade monthly data, based on destination country, month, and HS Code. Default definition: FOB (Free On Board) in USD. Values are reported in customs data as a measure of the financial value of the trade. |
importer_label |
VARCHAR |
Comes from the IMPORTER_LABEL field as found in the BOL data. Hasn't been checked and cleaned against Trase's database. Default definition: Name (label) of the trader, which should appear as one of the synonyms of a trader in the database. It doesn't necessarily have to be the default name however. See Traders-User-Guide.md: 1.2 Synonyms. |
port_of_export_label |
VARCHAR |
Comes from the PORT_OF_EXPORT_NAME field as found in the BOL data. Hasn't been checked and cleaned against Trase's database. |
hs6 |
VARCHAR |
Comes from the HS6 field. The values 090110X get replaced by 090111 and the values 210110X get replaced by 210111, which are the most common related HS Codes. Default definition: First 6 characters of the Harmonized System code: an internationally-agreed set of codes used to describe products in trade data. |
Models / Seeds
model.trase_duckdb.silver_brazil_bol_coffee_2020model.trase_duckdb.silver_brazil_comtrade_coffee_2020_monthly
No called script or script source not found.
-- gold_brazil_bol_coffee_2020
{{
config(materialized='external')
}}
{% set replace_eu_countries_with_bloc = False %}
WITH bol AS (
SELECT *
FROM {{ ref('silver_brazil_bol_coffee_2020') }}
),
-- Use the monthly comtrade data that contains FOB per kg average
comtrade AS (
SELECT *
FROM {{ ref('silver_brazil_comtrade_coffee_2020_monthly') }}
),
-- Calculate the average FOB per kg for each country, in case some don't have monthly values
comtrade_fob_country_average AS (
SELECT
COUNTRY_DESTINATION_ISO,
AVG(FOB_PER_KG_AVERAGE) AS FOB_PER_KG_AVERAGE
FROM
comtrade
GROUP BY COUNTRY_DESTINATION_ISO
),
-- Useful for countries that don't have values in Comtrade (i.e. Taiwan and Serbia)
comtrade_fob_world_average AS (
SELECT
AVG(FOB_PER_KG_AVERAGE) AS FOB_PER_KG_WORLD_AVERAGE
FROM
comtrade
),
bol_with_imputed_fobs AS (
SELECT
bol.*,
COALESCE(
comtrade.FOB_PER_KG_AVERAGE,
comtrade_avg.FOB_PER_KG_AVERAGE,
comtrade_world.FOB_PER_KG_WORLD_AVERAGE
) AS FOB_PER_KG_AVERAGE,
COALESCE(
bol.VOL * comtrade.FOB_PER_KG_AVERAGE,
bol.VOL * comtrade_avg.FOB_PER_KG_AVERAGE,
bol.VOL * comtrade_world.FOB_PER_KG_WORLD_AVERAGE
) AS FOB_ESTIMATED_VALUE_IN_USD
FROM
bol
LEFT JOIN comtrade
ON bol.COUNTRY_OF_DESTINATION_ISO = comtrade.COUNTRY_DESTINATION_ISO
AND bol.MONTH = comtrade.MONTH
AND bol.HS6 = comtrade.HS_CODE
LEFT JOIN comtrade_fob_country_average comtrade_avg
ON bol.COUNTRY_OF_DESTINATION_ISO = comtrade_avg.COUNTRY_DESTINATION_ISO
CROSS JOIN comtrade_fob_world_average comtrade_world
),
final_fields AS (
SELECT
YEAR::int AS year,
EXPORTER_LABEL AS exporter_label,
exporter_trader_id AS exporter_node_id,
exporter_trase_id,
-- Correct an incorrectly assigned exporter group name from the original data,
-- which additionally didn't exist in the database
CASE
WHEN exporter_group_name = 'OLAM BRASIL LTDA' THEN 'OLAM'
ELSE exporter_group_name
END AS exporter_group_name,
COUNTRY_OF_DESTINATION_NAME AS country_of_destination,
VOL/1000 AS mass_tonnes,
FOB_ESTIMATED_VALUE_IN_USD AS fob,
IMPORTER_LABEL AS importer_label,
PORT_OF_EXPORT_NAME AS port_of_export_label,
CASE
WHEN HS6 LIKE '090110X' THEN '090111'
WHEN HS6 LIKE '0901XX' THEN '090111'
WHEN HS6 LIKE '210110X' THEN '210111'
ELSE HS6
END AS hs6
FROM bol_with_imputed_fobs
)
-- If true (default), replace EU countries with the economic bloc name
-- using 'as_bool' to render the value as a boolean
{% if replace_eu_countries_with_bloc | as_bool %}
,
countries_with_economic_blocs AS (
SELECT
country_name,
UNNEST(economic_bloc) AS economic_bloc
FROM {{ ref('postgres_countries') }}
),
eu_countries AS (
SELECT DISTINCT
country_name AS eu_country,
'EUROPEAN UNION' AS european_union_country
FROM countries_with_economic_blocs
WHERE
json_extract_string(economic_bloc, '$.economic_bloc') = 'EUROPEAN UNION'
AND
country_name != 'UNITED KINGDOM'
),
final_with_eu AS (
SELECT
ff.year,
ff.exporter_label,
ff.exporter_node_id,
exporter_trase_id,
ff.exporter_group_name,
COALESCE(eu.european_union_country, ff.country_of_destination) AS country_of_destination,
ff.mass_tonnes,
ff.fob,
ff.importer_label,
ff.port_of_export_label,
ff.hs6
FROM final_fields ff
LEFT JOIN eu_countries eu
ON ff.country_of_destination = eu.eu_country
)
SELECT * FROM final_with_eu
{% else %}
-- If not, keep the normal country names
SELECT * FROM final_fields
{% endif %}