Skip to content

View or edit on GitHub

This page is synchronized from trase/models/brazil/soy/brazil_soy_v2_6_1_dbt_tests.md. Last modified on 2025-12-14 23:19 CET by Trase Admin. Please view or edit the original file there; changes should be reflected here after a midnight build (CET time), or manually triggering it with a GitHub action (link).

Brazil Soy v2.6.1 - DBT Test Results

For the v2.6.1 run of Brazil soy, we have some quite extensive quality assurance tests defined in DBT. Many failures of these tests are actually not significant. Ideally, we would improve the tests to return only true failures. However, for speed and ease, for now I'm just going through each test and commenting on whether it is an issue.

The tests were run on November 25th 2024 as follows:

trase/database/dbt/dbt test --select brazil_soy_v2_6_1

Tests which failed and need investigation or fixes

co2_gross_exposure_aggregation_check_brazil_soy_v2_6_1

[!CAUTION] WARN 8

Seeing slightly higher values in 2.6.1 ("expression" column below), is this OK? Perhaps due to the embedding of UNKNOWN values - explore.

col_1 expression compare_expression expression_difference expression_difference_percent
2013 99278188 85129428 14148760 0.14251629975357729132
2014 96209987 83550111 12659876 0.13158588203530263443
2015 135005601 123485231 11520370 0.08533253372206387200
2016 110112468 101599310 8513158 0.07731329752776043490
2017 144549005 122699184 21849821 0.15115857075598687103
2018 117199745 110155947 7043798 0.06010079629439466784
2019 107475409 86766116 20709293 0.19268866425062871824
2020 91455743 62791671 28664072 0.31342014246169319296

TODO: check with Carina/UNKNOWN flows

co2_gross_exposure_ranking_check_brazil_soy_v2_6_1

[!CAUTION] WARN 253

TODO: lots of changes here, need to visualise & investigate

co2_net_exposure_aggregation_check_brazil_soy_v2_6_1

[!CAUTION] WARN 8

Similar story to co2_gross_exposure_aggregation_check_brazil_soy_v2_6_1

trase=> select * from co2_net_exposure_aggregation_check_brazil_soy_v2_6_1;
 col_1 | expression | compare_expression | expression_difference | expression_difference_percent
-------+------------+--------------------+-----------------------+-------------------------------
  2013 |   85798308 |           74470079 |              11328229 |        0.13203324475816003271
  2014 |   82934615 |           73025052 |               9909563 |        0.11948645327406415283
  2015 |  116850546 |          107775035 |               9075511 |        0.07766768158704196384
  2016 |   95212595 |           88448515 |               6764080 |        0.07104186163605770854
  2017 |  126235906 |          107506367 |              18729539 |        0.14836934746600543272
  2018 |  102673239 |           98040225 |               4633014 |        0.04512387107998024685
  2019 |   93868124 |           76737816 |              17130308 |        0.18249334566439188664
  2020 |   79746797 |           55152242 |              24594555 |        0.30840806057702856705
(8 rows)

TODO: check with Carina/UNKNOWN flows

co2_net_exposure_ranking_check_brazil_soy_v2_6_1

[!CAUTION] WARN 255

TODO: investigate this

deforestation_exposure_aggregation_check_brazil_soy_v2_6_1

[!CAUTION] WARN 8

deforestation_exposure_ranking_check_brazil_soy_v2_6_1

[!CAUTION] WARN 246

Tests which failed but aren't a problem

zdc_volumes_brazil_soy_v2_6_1

[!CAUTION] WARN 26

There are some differences compared to v2.6.0:

trase=> select 
      col_1 as zdc,
      col_2 as year,
      col_3 as exporter_group,
      expression as volume_v2_6_1,
      compare_expression as volume_v2_6_0,
      expression_difference as volume_difference,
      expression_difference_percent as volume_relative_error
  from "trase"."dbt_test__audit"."zdc_volumes_brazil_soy_v2_6_1"
  where col_2 <= 2020
  order by compare_expression asc
zdc year exporter_group volume_v2_6_1 volume_v2_6_0 volume_difference volume_relative_error
NONE 2018 COOPERATIVA AGROINDUSTRIAL DOS PRODUTORES RURAIS DO SUDOESTE 405241.39068499993 6199.8154 399041.5752849999 0.9847009324750363
UNKNOWN 2018 RISA 183386.068 30401.628 152984.44 0.8342206235644902
UNKNOWN 2018 EFRAIM AGRONEGOCIOS LTDA 74572.647787 67185.107606 7387.540180999989 0.09906501110300975
NONE 2018 GLENCORE 4270258.435999999 4184408.466 85849.96999999881 0.020104162613730582
UNKNOWN 2018 BIORGANICA COMERCIO DE PRODUTOS ORGANICOS 200 200 1
UNKNOWN 2018 BIO SOJA INDUSTRIAS QUIMICAS E BIOLOGICAS LTDA. 3759.2981839999998 3759.2981839999998 1
UNKNOWN 2018 BOM JESUS AGROPECUARIA 30545.87750247279 30545.87750247279 1
UNKNOWN 2018 BR GOODS INDUSTRIA E COMERCIO DE PRODUTOS HOSPITALARES EIREL 38.689 38.689 1
UNKNOWN 2018 COPALEM COOP. DE PRODUTORES RURAIS DE LUIS EDUARDO MAGALHAES 11573.104907999998 11573.104907999998 1
UNKNOWN 2018 COTRIJUC - COOPERATIVA AGROPECUARIA JULIO DE CASTILHOS 18900 18900 1
UNKNOWN 2018 GIRASSOL AGRICOLA 13298.306493 13298.306493 1
UNKNOWN 2018 GOLDEN SUCOS S/A 2239.0082399999997 2239.0082399999997 1
UNKNOWN 2018 JAPUNGU AGROINDUSTRIAL SA 4685.098037 4685.098037 1
UNKNOWN 2018 J N DISTRIBUIDORA DE PRODUTOS DE HIGIENE E BELEZA 1359.896217 1359.896217 1
UNKNOWN 2018 USINA RIO PARDO S/A 19702.535 19702.535 1
UNKNOWN 2018 VEGRANDE VEICULOS CASAGRANDE 8336.515 8336.515 1
UNKNOWN 2018 VILELA VILELA & CIA 1000 1000 1
UNKNOWN 2018 ADRIANA AGRICOLA 8384.740499 8384.740499 1
UNKNOWN 2018 AGRINVEST BRASIL 79120.498 79120.498 1
UNKNOWN 2018 ALCOFLAME INDUSTRIA E COMERCIO DE PRODUTOS QUIMICOS E MAQUIN 294.936108 294.936108 1

However, I think these are fine. The first and second rows (COOPERATIVA AGROINDUSTRIAL DOS PRODUTORES RURAIS DO SUDOESTE; RISA) are actually an improvement: we previously had traders that were not associated with the right groups.

The differences in all other rows are too small for us to be concerned about.

fob_check_brazil_soy_v2_6_1

[!CAUTION] WARN 2

Getting some errors here:

trase=> select * from "trase"."dbt_test__audit"."fob_check_brazil_soy_v2_6_1";
 col_1 |  col_2   |  col_3  | col_4 |   col_5    |    col_6     | col_7 |     expression     | compare_expression | expression_difference | expression_difference_percent
-------+----------+---------+-------+------------+--------------+-------+--------------------+--------------------+-----------------------+-------------------------------
  2020 | BR-BIO-X | UNKNOWN |       | BR-XXXXXXX | SOYBEAN CAKE | BR-XX | 176578719.28572476 |                    |    176578719.28572476 |                             1
  2020 | BR-BIO-X | UNKNOWN |       | BR-XXXXXXX | SOYBEANS     | BR-XX |  589315293.7709252 |                    |     589315293.7709252 |                             1
(2 rows)

Actually the data is fine - it's just that the test is failing on null country_of_first_import (col_4).

volume_check_brazil_soy_v2_6_1

[!CAUTION] WARN 2

Same as fob_check_brazil_soy_v2_6_1

brazil_soy_close_to_secomex_port_fob_v2_6_1

[!CAUTION] WARN 1

22% more FOB for Soybean oil in 2019:

trase=> select * from "trase"."dbt_test__audit"."brazil_soy_close_to_secomex_port_fob_v2_6_1";
 year | product_type | secomex_fob |    seipcs_fob    |   relative_error
------+--------------+-------------+------------------+---------------------
 2019 | SOYBEAN OIL  |   694673011 | 537252438.607409 | 0.22661103843084382
(1 row)

Soybean oil is relatively small (0.7% of total exports in 2019) so this is not a concern.

cnpj_check_brazil_soy_v2_6_1

[!CAUTION] WARN 10

This is about CNPJs which are missing from the cnpj.cnpj_2019 table, but they are all quite low (7% of CNPJs in 2018 but otherwise usually less than 1%).

brazil_soy_close_to_secomex_port_vol_v2_6_1

[!CAUTION] WARN 3

trase=> select * from brazil_soy_close_to_secomex_port_vol_v2_6_1;
 year | product_type | secomex_volume |   seipcs_volume    |   relative_error
------+--------------+----------------+--------------------+---------------------
 2019 | SOYBEAN OIL  |     1041289620 |  843680513.4798921 | 0.18977343356222823
 2020 | SOYBEAN OIL  |     1109714591 |  974291275.7143737 | 0.12203436485735664
 2022 | SOYBEAN OIL  |     2596801981 | 2236867816.8932834 | 0.13860670422321147
(3 rows)

All soybean oil which is a very tiny export volume and not important to consider.

Tests which pass

zero_deforestation_brazil_soy_accepted_values_v2_6_1

[!TIP] PASS

all_columns_not_null_brazil_soy_v2_6_1

[!TIP] PASS

brazil_soy_close_to_bills_of_lading_fob_v2_6_1

[!TIP] PASS

year_accepted_range_brazil_soy_v2_6_1

[!TIP] PASS

product_type_accepted_values_brazil_soy_v2_6_1

[!TIP] PASS

country_of_production_is_brazil_brazil_soy_v2_6_1

[!TIP] PASS

brazil_soy_close_to_bills_of_lading_vol_v2_6_1

[!TIP] PASS

decision_tree_accepted_values_brazil_soy_v2_6_1

[!TIP] PASS

not_null_brazil_soy_v2_6_1_country_of_first_import_trase_id

[!TIP] PASS

brazil_soy_matched_accepted_values_brazil_soy_v2_6_1

[!TIP] PASS