DBT: Brazil Bol 2023 Gold
File location: s3://trase-storage/brazil/trade/bol/2023/gold/brazil_bol_2023_gold.parquet
DBT model name: brazil_bol_2023_gold
Explore on Metabase: Full table; summary statistics
DBT details
- Lineage
-
Dbt path:
trase_production.main_brazil.brazil_bol_2023_gold -
Containing yaml link: trase/data_pipeline/models/brazil/trade/bol/_schema_brazil_bol.yml
-
Model file: trase/data_pipeline/models/brazil/trade/bol/2023/brazil_bol_2023_gold.sql
-
Tags:
brazil,bol,trade,2023,gold
Description
Based on the cleaned version of Brazil BOL 2023 (brazil_bol_2023_silver), only including curated fields. Find where the fields come from and how they were cleaned in the fields documentation, and all the available source fields in silver version, and the vendor field glossary in google spreadsheets.
Details
| Column | Type | Description |
|---|---|---|
commodity |
VARCHAR |
Uppercase name of the commodity (BEEF, SOY, etc.) based on the hs4 code, and the list of commodities in the Trase reference table postgres_commodities |
country_of_destination_economic_bloc |
VARCHAR |
If the country of destination is part of the EU for the year of the BOL, it sets the value "EUROPEAN UNION" here. Note that currently no other economic blocs are being identified. If this is done in the future, probably the field will turn into an array. |
country_of_destination_label |
VARCHAR |
Based on the Datamar source field PLACE_AND_PORTS_DEST_COUNTRY: "Country where the carrier delivered the cargo" |
country_of_destination_name |
VARCHAR |
Official Trase country name where the export was delivered. It is cleaned based on country_of_destination_label field Note that European Union countries are not unified as 'EUROPEAN UNION' country, although the source BOL does has 768 records with the label "EUROPEAN" instead of a specific country. |
country_of_destination_trase_id |
VARCHAR |
Official Trase country id, using the iso two letter country code based on the field country_of_destination_name |
date |
DATE |
Based on the source field DATES_LONG_HAUL_YYYYMMDD: Berthing (arrival) date of the long haul vessel: Year, month and day |
year |
INTEGER |
|
month |
INTEGER |
|
exporter_cnpj |
VARCHAR |
Based on the source field COMPANY_SHIPPER_REGISTRATION_NUMBER (the shipper is the exporter), which already only contains digits. The cleaning of this field uses stdnum.br python library to identify if its a valid cnpj, cpf, or unkown (and sets this in the exporter_type field). If its a valid cnpj it left pads with 0 until it reaches 14 characters, and if its a cpf left pads to 11 characters. |
exporter_country_label |
VARCHAR |
Based on the source field PLACE_AND_PORTS_POL_COUNTRY: country name of the exporter |
exporter_group |
VARCHAR |
Official Trase trader group name, taken from the trader reference database. First it checks the reference data based on the tax identification of the record, and then based on the exporter_label. As a trader can change its group affiliation in time, the group valid for the current BOL year is taken. |
exporter_label |
VARCHAR |
Based on the source field COMPANY_SHIPPER_SHIPPER_NAME_DETAILED: Detailed Shipper (exporter) name. |
exporter_municipality_label |
VARCHAR |
Based on the source field COMPANY_SHIPPER_CITY: Shipper (exporter) city |
exporter_municipality_name |
VARCHAR |
Official Trase name of the municipality, based on the exporter_municipality_label, and cleaning it based on the postgres_regions reference table - where it checks the municipality exists and takes the official name |
exporter_municipality_trase_id |
VARCHAR |
Official Trase id of the municipality, usually created based on an official geocode of it. It is based on the exporter_municipality_label, and cleaning it based on the postgres_regions reference table - where it checks the municipality exists and takes the id |
exporter_name |
VARCHAR |
Official Trase trader name, taken from the trader reference database. First it checks the reference data based on the tax identification of the record, and then based on the exporter_label field |
exporter_node_id |
BIGINT |
|
exporter_state_label |
VARCHAR |
Based on the source field COMPANY_SHIPPER_STATE_NAME: Shipper (exporter) state name |
exporter_state_name |
VARCHAR |
Official Trase name of the state, based on the exporter_state_label, and cleaning it based on the postgres_regions reference table - where it checks the state exists and takes the official name |
exporter_state_trase_id |
VARCHAR |
Official Trase id of the state, based on the exporter_state_label, and cleaning it based on the postgres_regions reference table - where it checks the state exists and takes the official id |
exporter_trase_id |
VARCHAR |
Official Trase trader id, taken from the trader reference database. First it checks the reference data based on the tax identification of the record, and then based on the exporter_label field. The trase id is built based on the tax number. |
exporter_type |
VARCHAR |
Based on the field exporter_cnpj, it identifies if its a valid cnpj, cpf or if its of unknown type. |
fob |
DOUBLE |
Based on the field fob_original, including imputing empty values based on COMTRADE or BOL average values for the hs6 code. Where imputed, the field fob_adjustment_type shows which kind of imputing was done. |
fob_adjustment_type |
VARCHAR |
|
fob_original |
DOUBLE |
Based on the source field MEASURES_SECEX_FOB. Though this field doesn't exist in the vendor glossary, there is a similar field in the glossary called MEASURES_FOB_VALUE_USD: Monthly average FOB value per commodity in US dollars. |
hs4 |
VARCHAR |
|
hs5 |
VARCHAR |
|
hs6 |
VARCHAR |
|
hs6_description |
VARCHAR |
|
hs8 |
VARCHAR |
|
importer_group |
VARCHAR |
|
importer_label |
VARCHAR |
|
importer_name |
VARCHAR |
|
importer_node_id |
BIGINT |
|
net_weight_kg |
DOUBLE |
Based on the source field MEASURES_WTKG: Weight in kilos |
net_weight_tonnes |
DOUBLE |
Based on the source field MEASURES_WTMT: Weight in metric tonnes |
port_of_export_label |
VARCHAR |
Based on the source field PLACE_AND_PORTS_POL_NAME: POL ‐ PORT OF LOADING ‐ Port where the LONG HAUL vessel loaded the cargo |
port_of_export_name |
VARCHAR |
Official Trase port name where the cargo was loaded. It is cleaned based on port_of_export_label field and the postgres_ports reference table, checking also the port exists in the port_of_export_country |
port_of_import_country_label |
VARCHAR |
Based on the source field PLACE_AND_PORTS_POD_COUNTRY: POD ‐ PORT OF DISCHARGE ‐ Country where the LONG HAUL vessel discharged the cargo |
port_of_import_country_name |
VARCHAR |
|
port_of_import_label |
VARCHAR |
Based on the source field PLACE_AND_PORTS_POD_NAME: POD ‐ PORT OF DISCHARGE ‐ Port where the LONG HAUL vessel discharged the cargo |
port_of_import_name |
VARCHAR |
Official Trase port name where the cargo was discharged. If there is no current existing port with that name in the port_of_import_country, it returns a NULL. It is cleaned based on port_of_import_label field and the postgres_ports reference table. |
Models / Seeds
model.trase_duckdb.brazil_bol_2023_silver
No called script or script source not found.
{{ config(materialized='external')}}
SELECT
commodity,
country_of_destination_economic_bloc,
country_of_destination_label,
country_of_destination_name,
country_of_destination_trase_id,
date,
year,
month,
exporter_cnpj,
exporter_country_label,
exporter_group,
exporter_label,
exporter_municipality_label,
exporter_municipality_name,
exporter_municipality_trase_id,
exporter_name,
exporter_node_id,
exporter_state_label,
exporter_state_name,
exporter_state_trase_id,
exporter_trase_id,
exporter_type,
fob,
fob_adjustment_type,
fob_original,
hs4,
hs5,
hs6,
hs6_description,
hs8,
importer_group,
importer_label,
importer_name,
importer_node_id,
net_weight_kg,
net_weight_tonnes,
port_of_export_label,
port_of_export_name,
port_of_import_country_label,
port_of_import_country_name,
port_of_import_label,
port_of_import_name,
FROM {{ ref('brazil_bol_2023_silver') }}