View or edit on GitHub
This page is synchronized from trase/models/brazil/soy/brazil_soy_v2_6_1_dbt_tests.md. Last modified on 2025-12-14 23:19 CET by Trase Admin.
Please view or edit the original file there; changes should be reflected here after a midnight build (CET time),
or manually triggering it with a GitHub action (link).
Brazil Soy v2.6.1 - DBT Test Results
For the v2.6.1 run of Brazil soy, we have some quite extensive quality assurance tests defined in DBT. Many failures of these tests are actually not significant. Ideally, we would improve the tests to return only true failures. However, for speed and ease, for now I'm just going through each test and commenting on whether it is an issue.
The tests were run on November 25th 2024 as follows:
trase/database/dbt/dbt test --select brazil_soy_v2_6_1
Tests which failed and need investigation or fixes
co2_gross_exposure_aggregation_check_brazil_soy_v2_6_1
[!CAUTION] WARN 8
Seeing slightly higher values in 2.6.1 ("expression" column below), is this OK? Perhaps due to the embedding of UNKNOWN values - explore.
| col_1 | expression | compare_expression | expression_difference | expression_difference_percent |
|---|---|---|---|---|
| 2013 | 99278188 | 85129428 | 14148760 | 0.14251629975357729132 |
| 2014 | 96209987 | 83550111 | 12659876 | 0.13158588203530263443 |
| 2015 | 135005601 | 123485231 | 11520370 | 0.08533253372206387200 |
| 2016 | 110112468 | 101599310 | 8513158 | 0.07731329752776043490 |
| 2017 | 144549005 | 122699184 | 21849821 | 0.15115857075598687103 |
| 2018 | 117199745 | 110155947 | 7043798 | 0.06010079629439466784 |
| 2019 | 107475409 | 86766116 | 20709293 | 0.19268866425062871824 |
| 2020 | 91455743 | 62791671 | 28664072 | 0.31342014246169319296 |
TODO: check with Carina/UNKNOWN flows
co2_gross_exposure_ranking_check_brazil_soy_v2_6_1
[!CAUTION] WARN 253
TODO: lots of changes here, need to visualise & investigate
co2_net_exposure_aggregation_check_brazil_soy_v2_6_1
[!CAUTION] WARN 8
Similar story to co2_gross_exposure_aggregation_check_brazil_soy_v2_6_1
trase=> select * from co2_net_exposure_aggregation_check_brazil_soy_v2_6_1;
col_1 | expression | compare_expression | expression_difference | expression_difference_percent
-------+------------+--------------------+-----------------------+-------------------------------
2013 | 85798308 | 74470079 | 11328229 | 0.13203324475816003271
2014 | 82934615 | 73025052 | 9909563 | 0.11948645327406415283
2015 | 116850546 | 107775035 | 9075511 | 0.07766768158704196384
2016 | 95212595 | 88448515 | 6764080 | 0.07104186163605770854
2017 | 126235906 | 107506367 | 18729539 | 0.14836934746600543272
2018 | 102673239 | 98040225 | 4633014 | 0.04512387107998024685
2019 | 93868124 | 76737816 | 17130308 | 0.18249334566439188664
2020 | 79746797 | 55152242 | 24594555 | 0.30840806057702856705
(8 rows)
TODO: check with Carina/UNKNOWN flows
co2_net_exposure_ranking_check_brazil_soy_v2_6_1
[!CAUTION] WARN 255
TODO: investigate this
deforestation_exposure_aggregation_check_brazil_soy_v2_6_1
[!CAUTION] WARN 8
deforestation_exposure_ranking_check_brazil_soy_v2_6_1
[!CAUTION] WARN 246
Tests which failed but aren't a problem
zdc_volumes_brazil_soy_v2_6_1
[!CAUTION] WARN 26
There are some differences compared to v2.6.0:
trase=> select
col_1 as zdc,
col_2 as year,
col_3 as exporter_group,
expression as volume_v2_6_1,
compare_expression as volume_v2_6_0,
expression_difference as volume_difference,
expression_difference_percent as volume_relative_error
from "trase"."dbt_test__audit"."zdc_volumes_brazil_soy_v2_6_1"
where col_2 <= 2020
order by compare_expression asc
| zdc | year | exporter_group | volume_v2_6_1 | volume_v2_6_0 | volume_difference | volume_relative_error |
|---|---|---|---|---|---|---|
| NONE | 2018 | COOPERATIVA AGROINDUSTRIAL DOS PRODUTORES RURAIS DO SUDOESTE | 405241.39068499993 | 6199.8154 | 399041.5752849999 | 0.9847009324750363 |
| UNKNOWN | 2018 | RISA | 183386.068 | 30401.628 | 152984.44 | 0.8342206235644902 |
| UNKNOWN | 2018 | EFRAIM AGRONEGOCIOS LTDA | 74572.647787 | 67185.107606 | 7387.540180999989 | 0.09906501110300975 |
| NONE | 2018 | GLENCORE | 4270258.435999999 | 4184408.466 | 85849.96999999881 | 0.020104162613730582 |
| UNKNOWN | 2018 | BIORGANICA COMERCIO DE PRODUTOS ORGANICOS | 200 | 200 | 1 | |
| UNKNOWN | 2018 | BIO SOJA INDUSTRIAS QUIMICAS E BIOLOGICAS LTDA. | 3759.2981839999998 | 3759.2981839999998 | 1 | |
| UNKNOWN | 2018 | BOM JESUS AGROPECUARIA | 30545.87750247279 | 30545.87750247279 | 1 | |
| UNKNOWN | 2018 | BR GOODS INDUSTRIA E COMERCIO DE PRODUTOS HOSPITALARES EIREL | 38.689 | 38.689 | 1 | |
| UNKNOWN | 2018 | COPALEM COOP. DE PRODUTORES RURAIS DE LUIS EDUARDO MAGALHAES | 11573.104907999998 | 11573.104907999998 | 1 | |
| UNKNOWN | 2018 | COTRIJUC - COOPERATIVA AGROPECUARIA JULIO DE CASTILHOS | 18900 | 18900 | 1 | |
| UNKNOWN | 2018 | GIRASSOL AGRICOLA | 13298.306493 | 13298.306493 | 1 | |
| UNKNOWN | 2018 | GOLDEN SUCOS S/A | 2239.0082399999997 | 2239.0082399999997 | 1 | |
| UNKNOWN | 2018 | JAPUNGU AGROINDUSTRIAL SA | 4685.098037 | 4685.098037 | 1 | |
| UNKNOWN | 2018 | J N DISTRIBUIDORA DE PRODUTOS DE HIGIENE E BELEZA | 1359.896217 | 1359.896217 | 1 | |
| UNKNOWN | 2018 | USINA RIO PARDO S/A | 19702.535 | 19702.535 | 1 | |
| UNKNOWN | 2018 | VEGRANDE VEICULOS CASAGRANDE | 8336.515 | 8336.515 | 1 | |
| UNKNOWN | 2018 | VILELA VILELA & CIA | 1000 | 1000 | 1 | |
| UNKNOWN | 2018 | ADRIANA AGRICOLA | 8384.740499 | 8384.740499 | 1 | |
| UNKNOWN | 2018 | AGRINVEST BRASIL | 79120.498 | 79120.498 | 1 | |
| UNKNOWN | 2018 | ALCOFLAME INDUSTRIA E COMERCIO DE PRODUTOS QUIMICOS E MAQUIN | 294.936108 | 294.936108 | 1 |
However, I think these are fine. The first and second rows (COOPERATIVA AGROINDUSTRIAL DOS PRODUTORES RURAIS DO SUDOESTE; RISA) are actually an improvement: we previously had traders that were not associated with the right groups.
The differences in all other rows are too small for us to be concerned about.
fob_check_brazil_soy_v2_6_1
[!CAUTION] WARN 2
Getting some errors here:
trase=> select * from "trase"."dbt_test__audit"."fob_check_brazil_soy_v2_6_1";
col_1 | col_2 | col_3 | col_4 | col_5 | col_6 | col_7 | expression | compare_expression | expression_difference | expression_difference_percent
-------+----------+---------+-------+------------+--------------+-------+--------------------+--------------------+-----------------------+-------------------------------
2020 | BR-BIO-X | UNKNOWN | | BR-XXXXXXX | SOYBEAN CAKE | BR-XX | 176578719.28572476 | | 176578719.28572476 | 1
2020 | BR-BIO-X | UNKNOWN | | BR-XXXXXXX | SOYBEANS | BR-XX | 589315293.7709252 | | 589315293.7709252 | 1
(2 rows)
Actually the data is fine - it's just that the test is failing on null country_of_first_import (col_4).
volume_check_brazil_soy_v2_6_1
[!CAUTION] WARN 2
Same as fob_check_brazil_soy_v2_6_1
brazil_soy_close_to_secomex_port_fob_v2_6_1
[!CAUTION] WARN 1
22% more FOB for Soybean oil in 2019:
trase=> select * from "trase"."dbt_test__audit"."brazil_soy_close_to_secomex_port_fob_v2_6_1";
year | product_type | secomex_fob | seipcs_fob | relative_error
------+--------------+-------------+------------------+---------------------
2019 | SOYBEAN OIL | 694673011 | 537252438.607409 | 0.22661103843084382
(1 row)
Soybean oil is relatively small (0.7% of total exports in 2019) so this is not a concern.
cnpj_check_brazil_soy_v2_6_1
[!CAUTION] WARN 10
This is about CNPJs which are missing from the cnpj.cnpj_2019 table, but they are all quite low (7% of CNPJs in 2018 but otherwise usually less than 1%).
brazil_soy_close_to_secomex_port_vol_v2_6_1
[!CAUTION] WARN 3
trase=> select * from brazil_soy_close_to_secomex_port_vol_v2_6_1;
year | product_type | secomex_volume | seipcs_volume | relative_error
------+--------------+----------------+--------------------+---------------------
2019 | SOYBEAN OIL | 1041289620 | 843680513.4798921 | 0.18977343356222823
2020 | SOYBEAN OIL | 1109714591 | 974291275.7143737 | 0.12203436485735664
2022 | SOYBEAN OIL | 2596801981 | 2236867816.8932834 | 0.13860670422321147
(3 rows)
All soybean oil which is a very tiny export volume and not important to consider.
Tests which pass
zero_deforestation_brazil_soy_accepted_values_v2_6_1
[!TIP] PASS
all_columns_not_null_brazil_soy_v2_6_1
[!TIP] PASS
brazil_soy_close_to_bills_of_lading_fob_v2_6_1
[!TIP] PASS
year_accepted_range_brazil_soy_v2_6_1
[!TIP] PASS
product_type_accepted_values_brazil_soy_v2_6_1
[!TIP] PASS
country_of_production_is_brazil_brazil_soy_v2_6_1
[!TIP] PASS
brazil_soy_close_to_bills_of_lading_vol_v2_6_1
[!TIP] PASS
decision_tree_accepted_values_brazil_soy_v2_6_1
[!TIP] PASS
not_null_brazil_soy_v2_6_1_country_of_first_import_trase_id
[!TIP] PASS
brazil_soy_matched_accepted_values_brazil_soy_v2_6_1
[!TIP] PASS