View or edit on GitHub
This page is synchronized from trase/products/analysis/notebooks/brazil_soy_cofco_sourcing.ipynb. Last modified on 2025-12-13 00:30 CET by Trase Admin.
Please view or edit the original file there; changes should be reflected here after a midnight build (CET time),
or manually triggering it with a GitHub action (link).
Brazil Soy v2.6.1: COFCO sourcing
Why did COFCO's deforestation exposure increase so much in 2022?
In short:
- COFCO's sourcing pattern is dominated by a single CNPJ (06.315.338/0015-14) which sources largely from Rio Grande du Sul
- That state is the most deforestation-intensive state in Brazil
- In 2022, its deforestation per ton roughly doubled
- That CNPJ is located at PORTO ALEGRE. The model is taking that to be the logistics hub (i.e. soy silo) but it from Google Maps it looks more like a headquarters in the urban area close to the port. Hypothesis: the model is not performing well here.
from trase.tools import CNX, sps
import pandas as pd
import plotly.express as px
import plotly.graph_objects as go
import plotly.io as pio
pio.renderers.default = "png"
df = pd.read_sql(
"""
select
year,
decision_tree,
municipality_of_production_trase_id,
municipality_of_production,
logistics_hub_trase_id,
state,
exporter,
cnpj,
sum(volume) as volume,
sum(soy_deforestation_5_year_total_exposure) as deforestation_exposure
from supply_chains.brazil_soy_v2_6_1
where exporter_group = 'COFCO'
group by 1, 2, 3, 4, 5, 6, 7, 8
""",
CNX.cnx,
)
The total deforestation exposure of COFCO increased sharply in 2022:
px.bar(
sps.consolidate(df, ["deforestation_exposure"], ["year"]),
y="deforestation_exposure",
x="year",
title="Total deforestation exposure of the COFCO group per year",
)

This pattern is seen not only in the total exposure, but also the exposure per volume of export:
data = sps.consolidate(df, ["deforestation_exposure", "volume"], ["year"])
data["exposure_per_volume"] = data["deforestation_exposure"] / data["volume"]
px.bar(
data,
y="exposure_per_volume",
x="year",
title="Deforestation exposure per volume of the COFCO group per year",
)

Export volume didn't increase by that much:
px.bar(
sps.consolidate(df, ["volume"], ["year"]),
y="volume",
x="year",
title="Total export volume of the COFCO group per year",
)
How does this break down into the individual exporters? After 2022 we see COFCO International Grains disappearing from the trade record, and COFCO Brasil S.A. dominating all volume:
df["exporter_short"] = df["exporter"].str.slice(0, 30)
px.bar(
sps.consolidate(df, ["deforestation_exposure", "volume"], ["year", "exporter_short"]),
y="deforestation_exposure",
x="year",
color="exporter_short",
title="Total deforestation exposure down by COFCO exporter",
)

The Trase model typically allocates by CNPJ, so let's break that down. We see that the CNPJ 06.315.338/0015-14 was always present, but became of great importance to deforestation exposure in 2022. Interestingly, it doesn't have that dominance in export volume:
# Get top 10 CNPJs by total exposure_per_volume
total_per_cnpj = sps.consolidate(df, ["deforestation_exposure", "volume"], ["cnpj"])
top_cnpjs = total_per_cnpj.sort_values("deforestation_exposure")["cnpj"][-10:].values
# Replace CNPJs not in the top 10 with "other"
data = df.assign(masked_cnpj=df["cnpj"].where(df["cnpj"].isin(top_cnpjs), "other"))
# consolidate over masked CNPJ
data = sps.consolidate(data, ["deforestation_exposure", "volume"], ["year", "masked_cnpj"])
fig = px.bar(
data,
y="deforestation_exposure",
x="year",
color="masked_cnpj",
title="Total deforestation exposure broken down by CNPJ",
)
fig.show()
# and now by volume
px.bar(
data,
y="volume",
x="year",
color="masked_cnpj",
title="Total export volume broken down by CNPJ",
)


Let's look at how the model allocates volume, we see that 1.3 large exporter starts to dominate. In this case the tax municipality of the CNPJ is taken as the municipality of logistics hub. We source preferentially from the logistics hub and pass the rest to an LP.
df["decision_tree_short"] = df["decision_tree"].str.slice(0, 15)
px.bar(
sps.consolidate(df, ["deforestation_exposure"], ["year", "decision_tree_short"]),
y="deforestation_exposure",
x="year",
color="decision_tree_short",
title="Total deforestation exposure broken down by branch",
)

Let's put that aside for now and look just at where the exposure is coming from in terms of production areas. We see that there has been a big increase in sourcing from Rio Grande Do Sul:
# Get top 10 states of production by total exposure_per_volume
total_per_state = sps.consolidate(df, ["deforestation_exposure", "volume"], ["state"])
top_state = total_per_state.sort_values("deforestation_exposure")["state"][-50:].values
# Replace states not in the top 10 with "other"
data = df.assign(masked_state=df["state"].where(df["state"].isin(top_state), "OTHER"))
# consolidate over masked state
data = sps.consolidate(data, ["deforestation_exposure", "volume"], ["year", "masked_state"])
px.bar(
data,
y="deforestation_exposure",
x="year",
color="masked_state",
title="Total deforestation exposure broken down by state of production",
)

So is Rio Grande Do Sul associated with high deforestation? In a nutshell, yes: it's the most intense deforestation per ton of any state:
df_spatial = pd.read_sql(
"""
select
year,
parent_region,
sum(numerical_value) as soy_deforestation_per_volume
from staging_trase_earth.spatial_metrics
where country_slug = 'brazil' and commodity_slug = 'soy' and metric_name = 'SOY_DEFORESTATION_PER_TN_5_YEAR_TOTAL'
group by 1, 2
having sum(numerical_value) > 0
""",
CNX.cnx,
)
# Create box plot
fig = px.box(
df_spatial,
x='year',
y='soy_deforestation_per_volume',
points="outliers",
title="Soy deforestation per volume of each Brazilian state",
)
# Add single points for 'Rio Grande Do Sul'
rio_grande = df_spatial[df_spatial['parent_region'] == "RIO GRANDE DO SUL"]
fig.add_trace(go.Scatter(
x=rio_grande['year'],
y=rio_grande['soy_deforestation_per_volume'],
mode='markers',
marker=dict(size=10, color='red', symbol='circle-open'),
name='Rio Grande Do Sul'
))
fig.show()

Investigating the CNPJ
What can we say about the CNPJ 06.315.338/0015-14? The RFB data has the following:
| Field | Value |
|---|---|
| id | 4681923 |
| cnpj | 06315338001514 |
| company_name | COFCO INTERNATIONAL BRASIL S.A. |
| juridic_nature | 2054 |
| nature_name | Sociedade Anônima Fechada |
| company_social_capital | 2424237751.08 |
| company_size | 05 |
| company_size_name | DEMAIS |
| national_responsible_entity | |
| id_head_branch | 2 |
| head_branch_name | FILIAL |
| trade_name | COFCO INTL |
| id_registration_status | 2 |
| registration_status_name | ATIVA |
| registration_status_date | 2008-05-12 |
| id_registration_status_reason | 0 |
| reason_name | SEM MOTIVO |
| foreign_city_name | |
| id_country | |
| country_name | |
| start_activity_date | 2008-05-12 |
| cnae | 4623199 |
| cnae_name | Comércio atacadista de matérias-primas agrícolas não especificadas anteriormente |
| cnae_secondary | 3513100,4622200,4632001,4683400,7490103 |
| address_type | AVENIDA |
| address_street | DOUTOR NILO PECANHA |
| address_number | 2900 |
| address_complement | ANDAR 7 SALA 702 |
| address_neighbourhood | CHACARA DAS PEDRAS |
| postal_code | 91330001 |
| id_city | 8801 |
| city_name | PORTO ALEGRE |
| municipality | 4314902 |
| id_state | 43 |
| state | RS |
| phone_1_area_code | 17 |
| phone_1 | 31225824 |
| phone_2_area_code | 11 |
| phone_2 | 35668600 |
| fax_area_code | 11 |
| fax | 35668700 |
| tributoscorporativo@cofcointernational.com | |
| special_status | |
| special_status_date | |
| created_at | 2023-11-10 02:32:32.206897 |
Those CNAEs are
| cnae_code | description |
|---|---|
| 3513100 | Electric energy wholesalers |
| 4622200 | Soybean wholesaler |
| 4632001 | Wholesale of processed grains and legumes |
| 4683400 | Wholesale of pesticides, fertilizers and soil conditioners |
The address is
Avenida Doutor Nilo Peçanha, 2900, Andar 7 Sala 702
Chácara das Pedras
Porto Alegre - RS, 91330-001
Brasil
Looking on google maps, that doesn't seem to me to be a logistics hub, but rather a headquarters. Let's just double check that this is indeed the logistics hub used in the Trase data. In the table above the municipality is "municipality: 4314902"
data = df[(df["year"] == 2022) & (df["cnpj"] == "06315338001514")]
px.bar(
sps.consolidate(data, ["volume"], ["logistics_hub_trase_id"]),
y="volume",
x="logistics_hub_trase_id",
title="Logistics hubs of 06.315.338/0015-14 in 2022"
)

Right - it's BR-4314902, matching the RFB data.