Skip to content

View or edit on GitHub

This page is synchronized from trase/products/analysis/notebooks/brazil_soy_cofco_sourcing.ipynb. Last modified on 2025-12-13 00:30 CET by Trase Admin. Please view or edit the original file there; changes should be reflected here after a midnight build (CET time), or manually triggering it with a GitHub action (link).

Brazil Soy v2.6.1: COFCO sourcing

Why did COFCO's deforestation exposure increase so much in 2022?

In short:

  • COFCO's sourcing pattern is dominated by a single CNPJ (06.315.338/0015-14) which sources largely from Rio Grande du Sul
  • That state is the most deforestation-intensive state in Brazil
  • In 2022, its deforestation per ton roughly doubled
  • That CNPJ is located at PORTO ALEGRE. The model is taking that to be the logistics hub (i.e. soy silo) but it from Google Maps it looks more like a headquarters in the urban area close to the port. Hypothesis: the model is not performing well here.
from trase.tools import CNX, sps
import pandas as pd
import plotly.express as px
import plotly.graph_objects as go
import plotly.io as pio

pio.renderers.default = "png"
df = pd.read_sql(
    """
    select 
        year, 
        decision_tree,
        municipality_of_production_trase_id, 
        municipality_of_production, 
        logistics_hub_trase_id,
        state,
        exporter,
        cnpj,
        sum(volume) as volume,
        sum(soy_deforestation_5_year_total_exposure) as deforestation_exposure
    from supply_chains.brazil_soy_v2_6_1
    where exporter_group = 'COFCO'
    group by 1, 2, 3, 4, 5, 6, 7, 8
    """,
    CNX.cnx,
)

The total deforestation exposure of COFCO increased sharply in 2022:

px.bar(
    sps.consolidate(df, ["deforestation_exposure"], ["year"]),
    y="deforestation_exposure",
    x="year",
    title="Total deforestation exposure of the COFCO group per year",
)

png

This pattern is seen not only in the total exposure, but also the exposure per volume of export:

data = sps.consolidate(df, ["deforestation_exposure", "volume"], ["year"])
data["exposure_per_volume"] = data["deforestation_exposure"] / data["volume"]
px.bar(
    data,
    y="exposure_per_volume",
    x="year",
    title="Deforestation exposure per volume of the COFCO group per year",
)

png

Export volume didn't increase by that much:

px.bar(
    sps.consolidate(df, ["volume"], ["year"]),
    y="volume",
    x="year",
    title="Total export volume of the COFCO group per year",
)

How does this break down into the individual exporters? After 2022 we see COFCO International Grains disappearing from the trade record, and COFCO Brasil S.A. dominating all volume:

df["exporter_short"] = df["exporter"].str.slice(0, 30)
px.bar(
    sps.consolidate(df, ["deforestation_exposure", "volume"], ["year", "exporter_short"]),
    y="deforestation_exposure",
    x="year",
    color="exporter_short",
    title="Total deforestation exposure down by COFCO exporter",
)

png

The Trase model typically allocates by CNPJ, so let's break that down. We see that the CNPJ 06.315.338/0015-14 was always present, but became of great importance to deforestation exposure in 2022. Interestingly, it doesn't have that dominance in export volume:

# Get top 10 CNPJs by total exposure_per_volume
total_per_cnpj = sps.consolidate(df, ["deforestation_exposure", "volume"], ["cnpj"])
top_cnpjs = total_per_cnpj.sort_values("deforestation_exposure")["cnpj"][-10:].values

# Replace CNPJs not in the top 10 with "other"
data = df.assign(masked_cnpj=df["cnpj"].where(df["cnpj"].isin(top_cnpjs), "other"))

# consolidate over masked CNPJ
data = sps.consolidate(data, ["deforestation_exposure", "volume"], ["year", "masked_cnpj"])
fig = px.bar(
    data,
    y="deforestation_exposure",
    x="year",
    color="masked_cnpj",
    title="Total deforestation exposure broken down by CNPJ",
)
fig.show()

# and now by volume
px.bar(
    data,
    y="volume",
    x="year",
    color="masked_cnpj",
    title="Total export volume broken down by CNPJ",
)

png

png

Let's look at how the model allocates volume, we see that 1.3 large exporter starts to dominate. In this case the tax municipality of the CNPJ is taken as the municipality of logistics hub. We source preferentially from the logistics hub and pass the rest to an LP.

df["decision_tree_short"] = df["decision_tree"].str.slice(0, 15)
px.bar(
    sps.consolidate(df, ["deforestation_exposure"], ["year", "decision_tree_short"]),
    y="deforestation_exposure",
    x="year",
    color="decision_tree_short",
    title="Total deforestation exposure broken down by branch",
)

png

Let's put that aside for now and look just at where the exposure is coming from in terms of production areas. We see that there has been a big increase in sourcing from Rio Grande Do Sul:

# Get top 10 states of production by total exposure_per_volume
total_per_state = sps.consolidate(df, ["deforestation_exposure", "volume"], ["state"])
top_state = total_per_state.sort_values("deforestation_exposure")["state"][-50:].values

# Replace states not in the top 10 with "other"
data = df.assign(masked_state=df["state"].where(df["state"].isin(top_state), "OTHER"))

# consolidate over masked state
data = sps.consolidate(data, ["deforestation_exposure", "volume"], ["year", "masked_state"])
px.bar(
    data,
    y="deforestation_exposure",
    x="year",
    color="masked_state",
    title="Total deforestation exposure broken down by state of production",
)

png

So is Rio Grande Do Sul associated with high deforestation? In a nutshell, yes: it's the most intense deforestation per ton of any state:

df_spatial = pd.read_sql(
    """
    select
        year,
        parent_region,
        sum(numerical_value) as soy_deforestation_per_volume
    from staging_trase_earth.spatial_metrics 
    where country_slug = 'brazil' and commodity_slug = 'soy' and metric_name = 'SOY_DEFORESTATION_PER_TN_5_YEAR_TOTAL'
    group by 1, 2
    having sum(numerical_value) > 0
    """,
    CNX.cnx,
)
# Create box plot
fig = px.box(
    df_spatial, 
    x='year', 
    y='soy_deforestation_per_volume', 
    points="outliers", 
    title="Soy deforestation per volume of each Brazilian state",
)

# Add single points for 'Rio Grande Do Sul'
rio_grande = df_spatial[df_spatial['parent_region'] == "RIO GRANDE DO SUL"]
fig.add_trace(go.Scatter(
    x=rio_grande['year'], 
    y=rio_grande['soy_deforestation_per_volume'],
    mode='markers',
    marker=dict(size=10, color='red', symbol='circle-open'),
    name='Rio Grande Do Sul'
))
fig.show()

png

Investigating the CNPJ

What can we say about the CNPJ 06.315.338/0015-14? The RFB data has the following:

Field Value
id 4681923
cnpj 06315338001514
company_name COFCO INTERNATIONAL BRASIL S.A.
juridic_nature 2054
nature_name Sociedade Anônima Fechada
company_social_capital 2424237751.08
company_size 05
company_size_name DEMAIS
national_responsible_entity
id_head_branch 2
head_branch_name FILIAL
trade_name COFCO INTL
id_registration_status 2
registration_status_name ATIVA
registration_status_date 2008-05-12
id_registration_status_reason 0
reason_name SEM MOTIVO
foreign_city_name
id_country
country_name
start_activity_date 2008-05-12
cnae 4623199
cnae_name Comércio atacadista de matérias-primas agrícolas não especificadas anteriormente
cnae_secondary 3513100,4622200,4632001,4683400,7490103
address_type AVENIDA
address_street DOUTOR NILO PECANHA
address_number 2900
address_complement ANDAR 7 SALA 702
address_neighbourhood CHACARA DAS PEDRAS
postal_code 91330001
id_city 8801
city_name PORTO ALEGRE
municipality 4314902
id_state 43
state RS
phone_1_area_code 17
phone_1 31225824
phone_2_area_code 11
phone_2 35668600
fax_area_code 11
fax 35668700
email tributoscorporativo@cofcointernational.com
special_status
special_status_date
created_at 2023-11-10 02:32:32.206897

Those CNAEs are

cnae_code description
3513100 Electric energy wholesalers
4622200 Soybean wholesaler
4632001 Wholesale of processed grains and legumes
4683400 Wholesale of pesticides, fertilizers and soil conditioners

The address is

Avenida Doutor Nilo Peçanha, 2900, Andar 7 Sala 702
Chácara das Pedras
Porto Alegre - RS, 91330-001
Brasil

Looking on google maps, that doesn't seem to me to be a logistics hub, but rather a headquarters. Let's just double check that this is indeed the logistics hub used in the Trase data. In the table above the municipality is "municipality: 4314902"

data = df[(df["year"] == 2022) & (df["cnpj"] == "06315338001514")]

px.bar(
    sps.consolidate(data, ["volume"], ["logistics_hub_trase_id"]),
    y="volume",
    x="logistics_hub_trase_id",
    title="Logistics hubs of 06.315.338/0015-14 in 2022"
)

png

Right - it's BR-4314902, matching the RFB data.