In [ ]:
%%HTML 
<script>
    function luc21893_refresh_cell(cell) {
        if( cell.luc21893 ) return;
        cell.luc21893 = true;
        console.debug('New code cell found...' );
        
        var div = document.createElement('DIV');            
        cell.parentNode.insertBefore( div, cell.nextSibling );
        div.style.textAlign = 'right';
        var a = document.createElement('A');
        div.appendChild(a);
        a.href='#'
        a.luc21893 = cell;
        a.setAttribute( 'onclick', "luc21893_toggle(this); return false;" );

        cell.style.visibility='hidden';
        cell.style.position='absolute';
        a.innerHTML = '[show code]';        
                
    }
    function luc21893_refresh() {                
        if( document.querySelector('.code_cell .input') == null ) {            
            // it appears that I am in an exported html
            // hide this code
            var codeCells = document.querySelectorAll('.jp-InputArea')
            codeCells[0].style.visibility = 'hidden';
            codeCells[0].style.position = 'absolute';                        
            for( var i = 1; i < codeCells.length; i++ ) {
                // collapse only the even-indexed input areas
                if( i % 2 == 0 ) {
                    luc21893_refresh_cell(codeCells[i].parentNode);
                }
            }
            window.onload = luc21893_refresh;
        }                 
        else {
            // it appears that I am in a jupyter editor
            var codeCells = document.querySelectorAll('.code_cell .input')
            for( var i = 0; i < codeCells.length; i++ ) {
                // collapse only the code cells at these positions
                if( [1,3,4,5,7].includes(i) ) {
                    luc21893_refresh_cell(codeCells[i]);
                }
            }
            window.setTimeout( luc21893_refresh, 1000 )
        }        
    }
    
    function luc21893_toggle(a) {
        if( a.luc21893.style.visibility=='hidden' ) {
            a.luc21893.style.visibility='visible';        
            a.luc21893.style.position='';
            a.innerHTML = '[hide code]';
        }
        else {
            a.luc21893.style.visibility='hidden';        
            a.luc21893.style.position='absolute';
            a.innerHTML = '[show code]';
        }
    }
    
    luc21893_refresh()
</script>

Brazilian beef SEI-PCS 2.2.0 QA¶

Yan Prada Moro 16/01/2023

Introduction¶

We carry out a more extensive QA check on:

  1. Country of Import;
  2. Port of export;
  3. Exporter group;
  4. Importer group;
  5. State;
  6. Biome.

Sources:

  • For SEI-PCS model v2.2.0, we used two sources*:
    • Amazon S3 bucket ("brazil/beef/sei_pcs/v2.2.0/SEIPCS_BRAZIL_BEEF_{year}.csv")
    • Splitgraph ("trase-development/data-source:latest"."supply-chains-latest")
    *The source used is stated in each graph description.
  • For the previous version of the SEI-PCS model, we used two sources*:
    • Amazon S3 bucket ("brazil/beef/sei_pcs/v2.1.0/resubmission/BRAZIL_SEIPCS_BEEF_AGGREGATED_{year}_part_{part}.csv")
    • Splitgraph ("trase/supply-chains:latest"."supply-chains")
    *The source used is stated in each graph description.
  • For CD, we used the Amazon S3 bucket:
    • For years before 2018: "brazil/beef/trade/cd/combined/CD_COMBINED_BEEF_{year}_NEW.csv"
    • From 2018 onwards: "brazil/beef/trade/cd/disaggregated/CD_DISAGGREGATED_BEEF_{year}_NEW.csv"
  • For MDIC port, we used the Amazon S3 bucket:
    • For 2018: "brazil/trade/mdic/port/brazil_mdic_port_{year}_redownload.csv"
    • For the other years: "brazil/trade/mdic/port/brazil_mdic_port_{year}.csv"
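
The year-dependent file rules above can be sketched as small helper functions. This is only an illustration of the rules as listed: the function names are hypothetical and are not part of the trase package, and only the path templates come from the list above.

```python
def sei_pcs_key(year: int) -> str:
    """S3 key of the SEI-PCS v2.2.0 results for one year."""
    return f"brazil/beef/sei_pcs/v2.2.0/SEIPCS_BRAZIL_BEEF_{year}.csv"


def cd_key(year: int) -> str:
    """S3 key of the CD file: combined before 2018, disaggregated from 2018 onwards."""
    if year < 2018:
        return f"brazil/beef/trade/cd/combined/CD_COMBINED_BEEF_{year}_NEW.csv"
    return f"brazil/beef/trade/cd/disaggregated/CD_DISAGGREGATED_BEEF_{year}_NEW.csv"


def mdic_port_key(year: int) -> str:
    """S3 key of the MDIC port file: only 2018 uses the '_redownload' file."""
    suffix = "_redownload" if year == 2018 else ""
    return f"brazil/trade/mdic/port/brazil_mdic_port_{year}{suffix}.csv"


# Print the keys for a few example years.
for example_year in (2017, 2018, 2019):
    print(example_year, cd_key(example_year), mdic_port_key(example_year))
```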

High-level QA findings¶

Some conclusions are:

  1. Exporter names still need to be cleaned;
  2. In 2018 there are no meat preparations, beef boneless or beef offals products. In 2019 there is no beef dried salted smoked;
  3. After 2018, a large amount of exports comes from unknown biomes;
  4. A "To order" importer appears in 2018 (in both Splitgraph and Amazon S3).
In [ ]:
from trase.models.brazil.beef.qa_beef.imports.reader import *
from trase.models.brazil.beef.qa_beef.imports.plots_general import *
from trase.models.brazil.beef.qa_beef.imports.functions import *
from trase.models.brazil.beef.qa_beef.imports.main import *


import plotly.io as pio
pio.renderers.default = "plotly_mimetype+notebook"

dfs = load_downloaded_data()
sei_pcs = dfs['sei_pcs']
sei_old = dfs['sei_old']
mdic_port = dfs['mdic_port']
cd = dfs['cd']
merged_df = dfs['merged_df']

Country of import¶

First, we analyze the dynamics of flows by importing country. It is important to mention that after 2018 the import country is the country of first import, not the destination country.

SEI-PCS / CD_BoL volume comparison¶

The difference in 2018 is largely due to the problem in the new CD file for that year, as discussed in the general plot HTML.
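
As a rough sketch of the comparison behind a dumbbell chart, the gap between the two sources can be computed per year and group and ranked by size. This is not the implementation of plot_grouped_by_dumbbell_comparision: the column names YEAR and VOLUME are assumptions, the rows are made-up placeholder values, and plain dictionaries are used so the example runs without pandas (a DataFrame can be converted with to_dict("records")).

```python
from collections import defaultdict


def volume_by(rows, group_by, year_col="YEAR", volume_col="VOLUME"):
    """Sum volume per (year, group). Column names are assumptions."""
    totals = defaultdict(float)
    for row in rows:
        totals[(row[year_col], row[group_by])] += row[volume_col]
    return dict(totals)


def volume_gaps(sei_rows, cd_rows, group_by):
    """SEI-PCS minus CD volume per (year, group), largest absolute gap first."""
    sei = volume_by(sei_rows, group_by)
    cd = volume_by(cd_rows, group_by)
    gaps = {key: sei.get(key, 0.0) - cd.get(key, 0.0) for key in sei.keys() | cd.keys()}
    return sorted(gaps.items(), key=lambda item: abs(item[1]), reverse=True)


# Made-up placeholder rows, not real trade data.
toy_sei = [
    {"YEAR": 2017, "COUNTRY_OF_IMPORT": "A", "VOLUME": 100.0},
    {"YEAR": 2018, "COUNTRY_OF_IMPORT": "A", "VOLUME": 80.0},
    {"YEAR": 2018, "COUNTRY_OF_IMPORT": "B", "VOLUME": 40.0},
]
toy_cd = [
    {"YEAR": 2017, "COUNTRY_OF_IMPORT": "A", "VOLUME": 100.0},
    {"YEAR": 2018, "COUNTRY_OF_IMPORT": "A", "VOLUME": 120.0},
    {"YEAR": 2018, "COUNTRY_OF_IMPORT": "B", "VOLUME": 35.0},
]
toy_gaps = volume_gaps(toy_sei, toy_cd, "COUNTRY_OF_IMPORT")
print(toy_gaps)
```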

In [ ]:
group_by = "COUNTRY_OF_IMPORT"
color_discrete_map = COLORS[group_by]
plot_grouped_by_dumbbell_comparision(sei_pcs, cd, group_by)

Market share - Country of import¶

We can see an increase in market share for China (Mainland) and the United States.

The Netherlands, Germany, United Arab Emirates, Albania and Turkey show an increase in 2018, caused by the difficult match between BoL and SEI-PCS.
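
For reference, market share here can be read as each group's fraction of the total volume exported in a given year. The sketch below shows that calculation under stated assumptions: the YEAR and VOLUME column names are hypothetical, the rows are made-up placeholders, and this is not the code of plot_market_share.

```python
from collections import defaultdict


def market_share(rows, group_by, year_col="YEAR", volume_col="VOLUME"):
    """Return {(year, group): share of that year's total volume}."""
    group_totals = defaultdict(float)
    year_totals = defaultdict(float)
    for row in rows:
        group_totals[(row[year_col], row[group_by])] += row[volume_col]
        year_totals[row[year_col]] += row[volume_col]
    return {
        (year, group): volume / year_totals[year]
        for (year, group), volume in group_totals.items()
        if year_totals[year] > 0  # skip years with no volume at all
    }


# Made-up placeholder rows, not real trade data.
toy_rows = [
    {"YEAR": 2017, "COUNTRY_OF_IMPORT": "A", "VOLUME": 30.0},
    {"YEAR": 2017, "COUNTRY_OF_IMPORT": "B", "VOLUME": 70.0},
    {"YEAR": 2018, "COUNTRY_OF_IMPORT": "A", "VOLUME": 50.0},
    {"YEAR": 2018, "COUNTRY_OF_IMPORT": "B", "VOLUME": 50.0},
]
toy_shares = market_share(toy_rows, "COUNTRY_OF_IMPORT")
print(toy_shares)
```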

In [ ]:
plot_market_share(group_by, color_discrete_map)

Volume percentage per state per year for top countries¶

We can see a more or less stable distribution of the percentage of volume per state per year for the main importing countries. In 2018 we have a high percentage of UNKNOWN STATE, as expected. For 2019 and 2020, countries that usually import live cattle, such as Turkey and Lebanon, have a high percentage of UNKNOWN STATE.
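
One way to put a number on the UNKNOWN STATE observation is to compute, per year and group, the fraction of volume whose state is unknown. The sketch below assumes hypothetical column names STATE, YEAR and VOLUME and uses made-up placeholder rows; only the "UNKNOWN STATE" label is taken from the text above.

```python
from collections import defaultdict


def unknown_state_share(rows, group_by, state_col="STATE",
                        unknown_label="UNKNOWN STATE",
                        year_col="YEAR", volume_col="VOLUME"):
    """Return {(year, group): fraction of volume with an unknown state}."""
    total = defaultdict(float)
    unknown = defaultdict(float)
    for row in rows:
        key = (row[year_col], row[group_by])
        total[key] += row[volume_col]
        if row[state_col] == unknown_label:
            unknown[key] += row[volume_col]
    return {key: unknown[key] / volume for key, volume in total.items() if volume > 0}


# Made-up placeholder rows, not real trade data.
toy_state_rows = [
    {"YEAR": 2019, "COUNTRY_OF_IMPORT": "A", "STATE": "UNKNOWN STATE", "VOLUME": 75.0},
    {"YEAR": 2019, "COUNTRY_OF_IMPORT": "A", "STATE": "X", "VOLUME": 25.0},
    {"YEAR": 2019, "COUNTRY_OF_IMPORT": "B", "STATE": "X", "VOLUME": 10.0},
]
toy_unknown = unknown_state_share(toy_state_rows, "COUNTRY_OF_IMPORT")
print(toy_unknown)
```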

In [ ]:
plot_state_grouped_by(sei_pcs, group_by)

Port¶

Checking port dynamics.

Product volume in SEI-PCS and CD over the years¶

The most significant difference in 2018 between CD and SEI-PCS comes from the export port of Santos.

In [ ]:
group_by = "PORT"
color_discrete_map = COLORS[group_by]
plot_grouped_by_dumbbell_comparision(sei_pcs, cd, group_by)

Market share per Port over the years¶

Itajaí and Barcarena increased their market share in 2018, and Santos dropped in 2018. São Francisco decreased its market share after 2018 and Paranaguá, Santos and Rio Grande increased market share after 2018.

In [ ]:
plot_market_share(group_by, color_discrete_map)

Volume percentage per state per year for port of export¶

As expected, ports in northern Brazil export beef products from the northern states of Brazil, and the same holds for the other regions. It is interesting to note that after 2018, the ports of São Sebastião and Barcarena have the highest percentage of product quantity from an unknown state of production. If we connect this with the import country plot, it makes sense that the ports that export live cattle are the ones that have unknowns. These exports might be heading to countries like Turkey and Lebanon.

In [ ]:
plot_state_grouped_by(sei_pcs, group_by)

Logistic hubs sourcing to ports over the years¶

Nothing to highlight here.

In [ ]:
plot_number_lh(group_by, color_discrete_map)

Export percentage per port per product over the years¶

It is important to highlight that in 2018 there were no exports of meat preparations, beef boneless or beef offals, and in 2019 there is no beef dried salted smoked. I find it strange not to have these products. This data comes from the SEI-PCS Splitgraph source.
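
A check like this can be automated by comparing the set of products seen in each year with the set seen across all years. The sketch below is an assumption-laden illustration: the PRODUCT and YEAR column names are hypothetical and the rows use placeholder product labels rather than real data.

```python
def missing_products(rows, year_col="YEAR", product_col="PRODUCT"):
    """Return {year: sorted products present in some other year but absent in this one}."""
    seen = {}
    for row in rows:
        seen.setdefault(row[year_col], set()).add(row[product_col])
    all_products = set().union(*seen.values()) if seen else set()
    return {
        year: sorted(all_products - products)
        for year, products in seen.items()
        if all_products - products  # keep only years with a gap
    }


# Placeholder rows with made-up product labels, not real trade data.
toy_product_rows = [
    {"YEAR": 2017, "PRODUCT": "P1"},
    {"YEAR": 2017, "PRODUCT": "P2"},
    {"YEAR": 2017, "PRODUCT": "P3"},
    {"YEAR": 2018, "PRODUCT": "P1"},
    {"YEAR": 2019, "PRODUCT": "P1"},
    {"YEAR": 2019, "PRODUCT": "P2"},
    {"YEAR": 2019, "PRODUCT": "P3"},
]
toy_missing = missing_products(toy_product_rows)
print(toy_missing)
```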

In [ ]:
plot_hs4_grouped_by(group_by, color_discrete_map)

Exporter¶

Market share over the years¶

This data comes from the SEI-PCS Splitgraph source. We can see that Marfrig Global replaces Marfrig's exports after 2019, and Frisa Frigorifico likewise replaces Frisa's exports after 2019. Perhaps the same will happen with mercury foods and mercury figures. The strange thing is that when I ran the deforestation-free run_trader_names_matching_from_db(country="BRAZIL", commodity="BEEF"), it did not suggest any changes to these names.
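
Name pairs like these share a pattern: one name is the other with extra words appended. A simple heuristic for listing such candidates is sketched below. It is a stand-in illustration only, not the logic of run_trader_names_matching_from_db, and the function name is hypothetical.

```python
def prefix_variants(names):
    """Return (short, long) pairs where the long name starts with all words of the short one."""
    tokenised = sorted({tuple(name.upper().split()) for name in names if name.strip()})
    pairs = []
    for short in tokenised:
        for long in tokenised:
            if len(short) < len(long) and long[:len(short)] == short:
                pairs.append((" ".join(short), " ".join(long)))
    return pairs


# Exporter group names mentioned in this notebook.
candidate_pairs = prefix_variants(
    ["Marfrig", "Marfrig Global", "Frisa", "Frisa Frigorifico", "JBS"]
)
print(candidate_pairs)
```

A heuristic like this only produces candidates for manual review, since unrelated companies can also share a leading word.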

In [ ]:
group_by = "EXPORTER_GROUP"
color_discrete_map = COLORS[group_by]
plot_market_share(group_by, color_discrete_map)

Number of logistic hubs sourcing to exporter groups over the years¶

This chart only confirms the findings in the market share charts. BRF also shows strange behavior.

In [ ]:
plot_number_lh(group_by, color_discrete_map)

Export percentage per exporter group per product over the years¶

Beef product in 2018 is more concentrated in JBS and MARFRIG than in other years. Despite this, the distribution appears broadly consistent over the years.

In [ ]:
plot_hs4_grouped_by(group_by, color_discrete_map)

Biome¶

Market share over the years¶

The unknown biome increased a lot after 2018, while the Cerrado and the Amazon decreased. The Atlantic Forest has been supplying less and less beef over the years.

In [ ]:
group_by = "BIOME"
color_discrete_map = COLORS[group_by]
plot_market_share(group_by, color_discrete_map)

Export percentage per biome per product over the years¶

After 2018, the volume attributed to the unknown biome grew over the years.

In [ ]:
plot_hs4_grouped_by(group_by, color_discrete_map)

Importer¶

Product volume in SEI-PCS and CD per importer over the years¶

It is interesting to see that in SEI-PCS 2018 we have a "To Order" importer. This data comes from Amazon S3. In Splitgraph we also have "TO ORDER" as an importer, as well as "TO ORDER OF", "TO ORDER BANQUE MISR" and other importer names.
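
Importer names of this kind could be flagged with a simple pattern match. The sketch below is a minimal illustration, assuming that any name starting with the words "TO ORDER" is a placeholder rather than a real importer; the function name is hypothetical.

```python
import re

# Matches names that begin with the words "TO ORDER", in any letter case.
TO_ORDER = re.compile(r"^\s*TO\s+ORDER\b", re.IGNORECASE)


def is_placeholder_importer(name: str) -> bool:
    """True when the importer name starts with 'TO ORDER'."""
    return bool(TO_ORDER.match(name))


# Importer names quoted in this notebook, plus one ordinary name for contrast.
for importer in ("To Order", "TO ORDER OF", "TO ORDER BANQUE MISR", "JBS"):
    print(importer, is_placeholder_importer(importer))
```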

In [ ]:
group_by = "IMPORTER_GROUP"
color_discrete_map = COLORS[group_by]
plot_grouped_by_dumbbell_comparision(sei_pcs, cd, group_by)