Purpose

Many users are concerned about the recency of Trase data and whether the latest results available for a given supply chain are indeed the best data for their analyses. For instance, we are now in 2024 and the team is about to release new Brazilian soy data up to 2022. Even in the best case, Trase data will always lag behind the present, mostly because we need to wait for at least one full trade year to obtain the combination of commodity trade, production and deforestation data.

The motivation behind this analysis is to better understand the year-on-year variability in our supply chain mapping results in order to respond to user concerns about “stale” Trase data. We also explore whether some simple assumptions can predict a supply chain map in a given year. We start with Brazilian soy, one of our flagship supply chains, which has received the most scrutiny thus far and has been key in setting agendas in Brazil, the EU and UK.

Trader analysis

In this analysis, we check the year-on-year variability of the bills of lading between 2019 and 2022 in order to better understand the possible effects on the supply chain maps.

## [1] "just downloaded SEI-PCS for 2019"
## [1] "just downloaded SEI-PCS for 2020"
## [1] "just downloaded SEI-PCS for 2021"
## [1] "just downloaded SEI-PCS for 2022"
## [1] "just downloaded MDIC for 2019"
## [1] "just downloaded MDIC for 2020"
## [1] "just downloaded MDIC for 2021"
## [1] "just downloaded MDIC for 2022"

We first look at the market share over time of the top companies (those that together move 90% of the volume):

  • at the national level
  • per port

Keep in mind that this information is for maritime shipments only; we check the size of the non-maritime shipments further down.

National level analysis

First, we note that there are a lot of unknown traders, meaning that the market shares we calculate could change purely as a function of the size of the “Unknown Customer” flows.

With the above in mind, we look at the market share of top companies. Given the size of the “Unknown Customer” flows, we likely cannot reach the 90% volume of top traders.
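The report's own code is not shown, so the selection of top traders can only be illustrated here. The following is a minimal Python sketch, assuming that shares are computed against the total volume (unknown flows included) and that named traders are added from largest to smallest until a target share is reached. The function name `top_traders`, the trader labels and the volumes are all hypothetical.

```python
from collections import defaultdict

def top_traders(flows, target=0.90, unknown="UNKNOWN CUSTOMER"):
    """Pick named traders, largest first, until their cumulative share
    of the total volume reaches `target` (or the traders run out)."""
    totals = defaultdict(float)
    for trader, volume in flows:
        totals[trader] += volume
    grand_total = sum(totals.values())  # includes the unknown flows

    # Rank named traders only; unknown flows stay in the denominator.
    ranked = sorted(
        ((t, v / grand_total) for t, v in totals.items() if t != unknown),
        key=lambda pair: pair[1],
        reverse=True,
    )

    selected, cumulative = [], 0.0
    for trader, share in ranked:
        if cumulative >= target:
            break
        selected.append((trader, share))
        cumulative += share
    return selected, cumulative

# Hypothetical flows: 20% of the volume has no known trader.
flows = [
    ("TRADER_A", 40.0),
    ("TRADER_B", 30.0),
    ("UNKNOWN CUSTOMER", 20.0),
    ("TRADER_C", 10.0),
]
selected, covered = top_traders(flows)
```

In this toy example every named trader is selected and the covered share still stops at 80%, which is the situation described above: the 90% target cannot be reached when the unknown flows exceed 10% of the volume.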

We then report the mean market share, with its standard deviation across the years, for the top traders.
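As an illustration of this summary, here is a minimal Python sketch that computes the mean and the sample standard deviation of each trader's yearly market share. The trader labels and share values are hypothetical, and the use of the sample (n - 1) standard deviation is an assumption, since the report does not say which estimator was used.

```python
from statistics import mean, stdev

# Hypothetical market shares (%) per trader and year; not values from the report.
shares = {
    "TRADER_A": {2019: 10.0, 2020: 12.0, 2021: 14.0, 2022: 16.0},
    "TRADER_B": {2019: 8.0, 2020: 8.0, 2021: 8.0, 2022: 8.0},
}

def summarise(shares):
    """Return {trader: (mean share, sample standard deviation)}."""
    summary = {}
    for trader, by_year in shares.items():
        values = list(by_year.values())
        summary[trader] = (mean(values), stdev(values))
    return summary

summary = summarise(shares)
```

A standard deviation that is large relative to the mean flags a trader whose share moves a lot between years, while a value near zero (as for the hypothetical `TRADER_B`) indicates a stable share.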

We see from the slopes of the linear models (market share vs. time) that many traders do not have a flat slope, suggesting that their market share changed, upward or downward, between 2019 and 2022 at the national level.
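The slope in question can be sketched as an ordinary least-squares fit of market share against year. The Python function below is a minimal illustration and not the report's code; the name `ols_slope` and the share series are hypothetical.

```python
def ols_slope(years, shares):
    """Least-squares slope of share against year (share points per year)."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(shares) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, shares))
    sxx = sum((x - mean_x) ** 2 for x in years)
    return sxy / sxx

years = [2019, 2020, 2021, 2022]

# Hypothetical share series (%): one rising, one falling, one flat.
rising = ols_slope(years, [10.0, 12.0, 14.0, 16.0])
falling = ols_slope(years, [9.0, 8.0, 7.0, 6.0])
flat = ols_slope(years, [5.0, 5.0, 5.0, 5.0])
```

With only four yearly points per trader, a slope of this kind is a rough indicator of direction and not a robust trend estimate.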

Some companies, like Cargill, almost doubled their exports between 2019 and 2022, while others, like LDC, dropped significantly in market share. Again, we have to be careful about interpreting this information, since there is a large amount of “UNKNOWN CUSTOMER” volume and these numbers only relate to the known trades.

Port level analysis

We repeat the above analysis accounting for ports of export.

Here, we again see quite a large spread and important year-on-year increases in market share for some trader-port combinations.

In terms of consistency of sourcing, we note 21 trader-port combinations that appear in only 1 year of export, and another 27 that appear in 2 years. For instance, the ADM-São Francisco do Sul combination appears only in 2019 and 2020.

## # A tibble: 16 × 4
## # Groups:   year [4]
##     year combination          vol   pct
##    <dbl> <chr>              <dbl> <dbl>
##  1  2019 4           39883467661  47.8 
##  2  2019 3            3355678579   4.03
##  3  2019 2              31524531   0.04
##  4  2019 1             182617000   0.22
##  5  2020 4           48214552782. 49.6 
##  6  2020 3            4481806848.  4.61
##  7  2020 2             420374795   0.43
##  8  2020 1             158617312   0.16
##  9  2021 4           59750197033. 59.1 
## 10  2021 3            4132892559.  4.1 
## 11  2021 2           10347782426. 10.2 
## 12  2021 1             248784036   0.24
## 13  2022 4           55217624904  59.3 
## 14  2022 3            1817972655   1.96
## 15  2022 2           10444873456  11.2 
## 16  2022 1             279246871   0.29

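What we find is that most of the volume of the top traders goes through trader-port combinations that are present in all four years (2019-2022). Reading the combination column as the number of years in which a trader-port pair appears, these account for 48-59% of the volume each year. Combinations that appear in a single year account for less than 0.3% of the volume in any year. Interestingly, there seem to be (new?) dynamics in 2021 and 2022, when combinations present in only two years rise from under 0.5% of the volume to 10-11%.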
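A table of this shape can be built by counting the years in which each trader-port pair appears and then summing volumes per year and per count. The Python sketch below shows one way to do this under that reading of the table. The function name, the record layout and the values are hypothetical, and percentages are taken against the total volume of each year.

```python
from collections import defaultdict

def volume_share_by_persistence(records):
    """records: iterable of (year, trader, port, volume).
    Returns {(year, n_years): percent of that year's volume}, where n_years
    is the number of years in which the trader-port pair appears."""
    years_seen = defaultdict(set)
    for year, trader, port, _ in records:
        years_seen[(trader, port)].add(year)

    total_by_year = defaultdict(float)
    volume_by_class = defaultdict(float)
    for year, trader, port, volume in records:
        n_years = len(years_seen[(trader, port)])
        total_by_year[year] += volume
        volume_by_class[(year, n_years)] += volume

    return {
        key: 100 * volume / total_by_year[key[0]]
        for key, volume in volume_by_class.items()
    }

# Hypothetical records: T1-P1 appears in both years, the other pairs in one.
records = [
    (2019, "T1", "P1", 60.0),
    (2019, "T2", "P2", 40.0),
    (2020, "T1", "P1", 70.0),
    (2020, "T2", "P3", 30.0),
]
pct = volume_share_by_persistence(records)
```

Here `pct[(2019, 2)]` is the share of 2019 volume moved by pairs seen in two years, which corresponds to one row of the table.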

Country analysis

National level analysis

We then repeat the analysis using countries of destination, also tapping into the MDIC data to see if we note any differences across market shares.

The BoL has some particularities worth noting:

  • “EUROPEAN UNION” appears as a country, alongside FRANCE, GERMANY, etc.; we combine them into one “EU” group for this analysis
  • There are UNKNOWN COUNTRY labels, which also need to be looked at

## # A tibble: 1 × 5
##    year country_of_destination.name      vol     vol_tot   pct
##   <dbl> <chr>                          <dbl>       <dbl> <dbl>
## 1  2019 UNKNOWN COUNTRY             58732000 83476898788     0

We find a small volume with an unknown country of destination in 2019 (about 0.07% of that year's volume, shown as 0 in the table); technically the country label “EUROPEAN UNION” is also unknown, but at least we have some indication of the economic bloc.
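The two label issues above can be handled with a small harmonisation step before aggregating. The Python sketch below is illustrative only: `EU_LABELS` is a deliberately short, hypothetical subset (the report's full list of member-state labels is not shown), and the sample rows are made up. The last line repeats the share calculation with the 2019 figures from the table.

```python
from collections import defaultdict

# Illustrative subset of labels folded into "EU"; the full list is an assumption.
EU_LABELS = {"EUROPEAN UNION", "FRANCE", "GERMANY"}

def harmonise(label):
    """Map the bloc label and member-state labels to a single "EU" group."""
    label = label.strip().upper()
    return "EU" if label in EU_LABELS else label

def volume_by_destination(rows):
    """rows: iterable of (country label, volume)."""
    totals = defaultdict(float)
    for label, volume in rows:
        totals[harmonise(label)] += volume
    return dict(totals)

# Hypothetical rows.
rows = [
    ("France", 10.0),
    ("EUROPEAN UNION", 5.0),
    ("CHINA", 80.0),
    ("UNKNOWN COUNTRY", 5.0),
]
totals = volume_by_destination(rows)
unknown_pct = 100 * totals.get("UNKNOWN COUNTRY", 0.0) / sum(totals.values())

# Same share calculation with the 2019 values reported in the table.
pct_2019 = 100 * 58732000 / 83476898788
```

Keeping UNKNOWN COUNTRY as its own group, instead of dropping it, keeps the denominator equal to the full yearly volume.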

We then compare the BoL market share with what we can find with MDIC. Note that the top countries are selected based on the BoL volumes.
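A comparison of this kind can be sketched as follows in Python: the top countries are picked from the BoL volumes, and the share of each is then computed in both sources. The function names and all volumes are hypothetical and are not taken from the BoL or MDIC data.

```python
def shares(volumes):
    """Convert {country: volume} into {country: percent of total}."""
    total = sum(volumes.values())
    return {country: 100 * v / total for country, v in volumes.items()}

def compare_top(bol, mdic, n=2):
    """Select the top `n` countries by BoL volume and return
    {country: (BoL share, MDIC share, difference)} in percent."""
    top = sorted(bol, key=bol.get, reverse=True)[:n]
    bol_share, mdic_share = shares(bol), shares(mdic)
    return {
        c: (bol_share[c], mdic_share.get(c, 0.0), bol_share[c] - mdic_share.get(c, 0.0))
        for c in top
    }

# Hypothetical yearly volumes for one year.
bol = {"CHINA": 60.0, "EU": 25.0, "THAILAND": 15.0}
mdic = {"CHINA": 70.0, "EU": 20.0, "THAILAND": 10.0}
comparison = compare_top(bol, mdic)
```

A negative difference means the country has a smaller share in the BoL than in MDIC for that year.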

Port level analysis

We repeat the above analysis accounting for ports of export.

We do not get the same ports in the BoL and MDIC, so we analyse the time series for each source separately.

Starting with BoL:

and then MDIC:

Given the above, we see that the top companies are sourcing volumes consistently from the same ports for ~60% of exports in the BoL. These numbers could change drastically based on the number of “UNKNOWN CUSTOMERS” that could be revealed in any given year.

We cannot fully predict the traded volumes of main traders from one year to the next, but there seems to be enough consistency to move to the next phase of analysis for in-country supply chain stickiness.

Similarly for countries, volumes are difficult to predict year-on-year, especially for China, which seems to have some “missing” volumes in the BoL beyond 2019 (although its volumes also decrease in MDIC). China and the EU together represent almost 80% of the export market, and there are some key port hubs from which these countries consistently source.

Trader-port-decision branch analysis

We now focus on the trader-port-decision branch relationship to see if there are any consistent relationships year-on-year that would allow us to make predictions, e.g. predict 2022 supply chain with 2020 BoL data + relationships.
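One possible reading of "BoL data + relationships" is to take the split of each trader-port pair across decision branches in a reference year and apply it to the BoL volumes of another year. The Python sketch below illustrates that reading only; it is an assumption, not the method used in this analysis, and the function names, labels and volumes are hypothetical.

```python
from collections import defaultdict

def branch_shares(reference_rows):
    """reference_rows: iterable of (trader, port, branch, volume).
    Returns {(trader, port, branch): fraction of the pair's volume}."""
    totals = defaultdict(float)
    by_branch = defaultdict(float)
    for trader, port, branch, volume in reference_rows:
        totals[(trader, port)] += volume
        by_branch[(trader, port, branch)] += volume
    return {key: vol / totals[key[:2]] for key, vol in by_branch.items()}

def predict_branch_volumes(shares, target_volumes):
    """Apply reference-year branch shares to {(trader, port): volume}.
    Pairs with no reference-year data are left unpredicted."""
    predicted = {}
    for (trader, port, branch), share in shares.items():
        volume = target_volumes.get((trader, port))
        if volume is not None:
            predicted[(trader, port, branch)] = share * volume
    return predicted

# Hypothetical reference year and target-year BoL volumes.
reference = [
    ("TRADER_A", "PORT_X", 1, 30.0),
    ("TRADER_A", "PORT_X", 3, 10.0),
]
target = {("TRADER_A", "PORT_X"): 80.0, ("TRADER_B", "PORT_Y"): 20.0}
predicted = predict_branch_volumes(branch_shares(reference), target)
```

The hypothetical `TRADER_B`-`PORT_Y` pair gets no prediction because it is absent from the reference year, which is the weakness of this approach whenever new trader-port combinations appear.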

We first look at each company and the general decision tree branch they are associated with:

analysis here

We then look at the breakdown per exporter group, considering the companies that export the most soy (~50 Mtonnes). The exporters, ordered from the largest to the smallest export volume, are:

  • BUNGE
  • CARGILL
  • ADM
  • LD
  • COAMO
  • GAVILON
  • COFCO
  • AMAGGI
  • GLENCORE
  • OLAM
  • AMAGGI & LD
  • ENGELHART
  • CHS

BUNGE

The main ports of export of BUNGE are:

  • Santos
  • São Francisco do Sul
  • Rio Grande
  • Salvador

Branch 1 is prominent overall, but branch 3 dominates the main ports in 2021 and 2022. More ports are also used, with exports through new ports starting in 2021.

CARGILL

The main ports of export of CARGILL are:

  • Paranagua
  • Santos

Interestingly, these switch over time, starting out with more exports out of Paranagua in 2019 and then more out of Santos in 2022. There is roughly a 50:50 breakdown between branches 1 and 3.

ADM

The main ports of export of ADM are:

  • Santos mostly (with branch 3)
  • then a mix of Barcarena, Rio Grande, Paranagua and Salvador, with varying levels of branches 3 and 2

Louis Dreyfus

The main ports of export of LD are:

  • mostly Paranagua linked to branch 3
  • then Rio Grande (mostly branch 1) and Santos (mostly branch 3)

Coamo

There is little data on Coamo ports, and what is available is likely not reliable.

The other companies are skipped for now because of their export quantities.

Country-decision branch analysis

We now look at the branch breakdown for EU and China as a means to understand the uncertainty in the connections made between 2019 and 2022.

Notes and questions:

  • What would happen if we only had Branch 3 for all years? We would have a much larger breakup of flows, so having branch 1 allows us to filter and “concentrate” the flows in the LHs identified by the CNPJ and activity. Harry tested this difference, going from “lower resolution” or “less information” to “higher resolution” or “more information”, and we need to better understand the differences there.
  • Can we realistically predict a later year (e.g. 2022) with an earlier year (2019)? It seems like an exercise with little return since (1) making a prediction in the first place means repeating the calculations later on (effectively doing the work twice) and (2) there are enough changes year-on-year that predictions are not necessarily better than simply using the latest available year.
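One way to put a number on the second question is to compare a predicted flow distribution with the one later observed and report the share of volume that ends up on the wrong flow. The Python sketch below uses half the sum of absolute differences between shares. This metric is a choice made for illustration and is not one used in this analysis; the function name and the flows are hypothetical.

```python
def misallocated_share(predicted, actual):
    """Fraction of volume assigned to the wrong flow: half the sum of
    absolute differences between predicted and actual flow shares."""
    predicted_total = sum(predicted.values())
    actual_total = sum(actual.values())
    keys = set(predicted) | set(actual)
    return 0.5 * sum(
        abs(predicted.get(k, 0.0) / predicted_total - actual.get(k, 0.0) / actual_total)
        for k in keys
    )

# Hypothetical flows: the earlier year is carried forward as the prediction.
earlier_year = {"FLOW_A": 50.0, "FLOW_B": 50.0}
later_year = {"FLOW_A": 25.0, "FLOW_B": 50.0, "FLOW_C": 25.0}
error = misallocated_share(earlier_year, later_year)
```

Computing the same quantity for a prediction and for the latest available year carried forward unchanged would show whether the prediction is worth the extra work.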