Tac Signatories Combined
s3://trase-storage/brazil/indicators/actors/zd_commitments/out/TAC_SIGNATORIES_COMBINED.csv
Dbt path: trase_production.main_brazil.tac_signatories_combined
Explore on Metabase: Full table; summary statistics
Containing yaml file link: trase/data_pipeline/models/brazil/indicators/actors/zd_commitments/out/_schema.yml
Model file link: trase/data_pipeline/models/brazil/indicators/actors/zd_commitments/out/tac_signatories_combined.py
Calls script: trase/data/brazil/indicators/actors/zd_commitments/out/tac_signatories_combined.py
Dbt test runs & lineage: Test results · Lineage
Full dbt_docs page: Open in dbt docs (includes the lineage graph at the bottom right, tests, and downstream dependencies)
Tags: actors, brazil, indicators, out, zd_commitments
tac_signatories_combined
Description
Terms of Conduct Adjustment (TAC)
The Terms of Conduct Adjustment (Termos de Ajustamento de Conduta - TAC) are socio-environmental commitments signed with the Federal Public Prosecutor's Office (Ministério Público Federal).
TAC description
The Termo de Ajustamento de Conduta (TAC) in the Brazilian beef sector is a legal instrument, first developed in Pará in 2009 and then spreading to other states, designed to ensure companies—particularly slaughterhouses—comply with environmental and labor regulations, especially around deforestation and illegal land use in cattle supply chains.
In the beef sector, TACs emerged as a response to widespread illegal deforestation linked to cattle ranching in the Amazon. The goal was to pressure slaughterhouses to avoid buying cattle from farms with illegal activities (e.g., forced labor, deforestation) or farms non-compliant with other criteria (e.g., land tenure and environmental regularization). Companies usually had a time limit to present a compliance plan, which was approved by the Brazilian Federal Public Prosecutor's Office. Today, the Boi na Linha protocol establishes a common ground of legal and socio-environmental standards.
Are TACs signed voluntarily?
Yes, TACs are formally voluntary, but in practice there is strong institutional pressure to sign.
- Companies can refuse to sign, but may face legal action, reputational damage, or restrictions on access to public financing and major buyers.
- Many large buyers (e.g., supermarkets, meat exporters) require suppliers to have signed and comply with a TAC.
So while not mandatory per se, signing is often strategically necessary.
Who signs the TAC da Carne?
A TAC is signed by a company, and the agreement affects the entire company. However, major slaughter companies such as JBS or Marfrig, which operate in multiple states, can sign agreements covering establishments in a specific state or group of states within the Legal Amazon. TAC documents for companies with multiple establishments normally specify the tax number of a central office rather than a specific slaughterhouse (or do not specify a tax number at all). For companies with one (or a few) operational establishments, however, the tax number can be included, and it is typically the slaughterhouse's own (a 14-digit CNPJ).
This implies that even if a company opens new slaughterhouses or modifies existing CNPJs, all active establishments within the covered state still have to be engaged. See the examples in the table below and the code sketch after the summary list.
| Company | CNPJ on TAC? | TAC Scope | Geographic Limitation | Example & Notes |
|---|---|---|---|---|
| JBS | Central CNPJ (often without 14 digits) | All establishments in Legal Amazon except Pará | State-based (AC, MT, then extended) | JBS TAC – See Preâmbulo 13 & Cláusula 6.7 |
| Frigol | Central office CNPJ (outside Legal Amazon) | Only establishments in Pará | Yes | Frigol TAC – Cláusula 1.1 |
| Marfrig | Central office CNPJ (outside Legal Amazon) | Only establishments in Mato Grosso | Yes | Similar to Frigol |
| Frinorte (Tomé Açu) | Local slaughterhouse CNPJ (14 digits) | Single establishment | No | Local-only agreement, CNPJ identifies facility |
- Company-Level Responsibility:
  - The TAC typically states that it is signed by the company, not just a facility. However, it limits its applicability to establishments in a specific state or region, meaning only those suppliers are affected.
- CNPJ Usage:
  - Large companies usually do not specify all slaughterhouse CNPJs.
  - Smaller companies often do specify a single 14-digit CNPJ tied to one slaughterhouse.
  - In large companies, the headquarters CNPJ may be listed, but that does not mean all national operations are included.
- Expansion of Scope Over Time:
  - Companies like JBS have signed multiple TACs or amendments to expand coverage to other states, indicating that each state may require a separate agreement or addendum.
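To make the CNPJ mechanics above concrete, here is a minimal sketch of deriving the 8-digit company root (the level at which a central-office TAC applies) from a 14-digit establishment CNPJ. It uses the same stdnum library as the pipeline script below; the example number is hypothetical (checksum-valid, for illustration only).

```python
import stdnum.br.cnpj

def company_root(cnpj: str) -> str:
    # A 14-digit CNPJ breaks down as: an 8-digit company root,
    # a 4-digit establishment number (e.g. "0001" for the head office),
    # and 2 check digits.
    digits = stdnum.br.cnpj.compact(cnpj)  # strip ./- punctuation
    if not stdnum.br.cnpj.is_valid(digits):
        raise ValueError(f"invalid CNPJ: {cnpj}")
    return digits[:8]

# Hypothetical head-office establishment CNPJ:
print(company_root("02.916.265/0001-60"))  # -> "02916265"
```

This is the same truncation the pipeline below applies when it builds cnpj8, so all establishments of one company collapse onto a single 8-digit identifier.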
Do TACs have limited 2-year validity?
Not necessarily. Most TACs in the cattle sector do not have a fixed expiration date. Many TACs explicitly state that they are indefinite: "O presente TERMO tem prazo indeterminado" ("this agreement has an indefinite term"). This means the agreement remains valid as long as the company continues operating and has not been released from its obligations.
The 2-year idea may come from:
- Misinterpretation of monitoring or reporting intervals (e.g., "apresentar relatório a cada 2 anos", i.e., "submit a report every 2 years").
- Older or generic TAC models in other sectors (e.g., environmental remediation).
So in the cattle sector, TACs are typically valid for an indefinite period — especially for large slaughterhouses under ongoing monitoring. They remain binding unless formally rescinded or replaced.
TAC Monitoring Obligations
Companies that sign TACs must:
- Check their suppliers against lists of embargoed farms (from IBAMA and state agencies).
- Ensure that no cattle are purchased from farms involved in:
  - Illegal deforestation
  - Slave-like labor
  - Overlapping Indigenous lands or conservation units
- Provide animal movement records (GTAs) and other documentation.
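As an illustration of the first obligation, here is a minimal, hypothetical pandas sketch that flags purchases from embargoed properties. The frames and column names (gta_id, supplier_car, embargoed_car) are assumptions for the example, not the pipeline's actual inputs.

```python
import pandas as pd

# Hypothetical cattle purchases, keyed by the supplying property's
# CAR (rural environmental registry) code from the GTA paperwork.
purchases = pd.DataFrame(
    {
        "gta_id": ["GTA-001", "GTA-002", "GTA-003"],
        "supplier_car": ["PA-150060-AAAA", "PA-150060-BBBB", "MT-510340-CCCC"],
    }
)

# Hypothetical embargo list, e.g. compiled from IBAMA and state agencies.
embargoes = pd.DataFrame({"embargoed_car": ["PA-150060-BBBB"]})

# Flag purchases whose supplier appears on the embargo list;
# a TAC signatory would have to block these.
purchases["embargoed"] = purchases["supplier_car"].isin(embargoes["embargoed_car"])
print(purchases[purchases["embargoed"]])
```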
Further details can be found on the Boi na Linha website.
Details
| Column | Type | Description |
|---|---|---|
Models / Seeds
- source.trase_duckdb.trase-storage-raw.beef_exporter_subsidiaries
- model.trase_duckdb.tac_zdc_2023
- model.trase_duckdb.tac_signatories
- model.trase_duckdb.beef_cnpj_2010_new
- model.trase_duckdb.beef_cnpj_2011_new
- model.trase_duckdb.beef_cnpj_2012_new
- model.trase_duckdb.beef_cnpj_2013_new
- model.trase_duckdb.beef_cnpj_2014_new
- model.trase_duckdb.beef_cnpj_2015_new
- model.trase_duckdb.beef_cnpj_2016_new
- model.trase_duckdb.beef_cnpj_2017_new
- model.trase_duckdb.beef_auxiliary_cnpj_beef_cnpj_2018_new
- model.trase_duckdb.beef_cnpj_2019_new
- model.trase_duckdb.beef_cnpj_2020_new
Sources
- trase-storage-raw.beef_exporter_subsidiaries
import numpy as np
import pandas as pd
import stdnum.br.cnpj
import stdnum.br.cpf
from trase.tools import sps
from trase.tools.aws.aws_helpers import read_s3_parquet
from trase.tools.aws.aws_helpers_cached import get_pandas_df_once
from trase.tools.aws.metadata import write_csv_for_upload
"""
Config Section
"""
OUTPUT_FILE = "brazil/indicators/actors/zd_commitments/out/TAC_SIGNATORIES_COMBINED.csv"
FILEPATH_TAC_SIGNATORIES_IMAFLORA = "brazil/beef/indicators/silver/tac_zdc_2023.parquet"
FILEPATH_TAC_SIGNATORIES_GIBBSLAB = (
"brazil/indicators/actors/zd_commitments/out/TAC_SIGNATORIES.csv"
)
FILEPATH_RELATIONSHIPS = (
"brazil/logistics/company_relationships/BEEF_EXPORTER_SUBSIDIARIES.csv"
)
LEGAL_AMAZON_UF = ["AC", "AM", "AP", "MA", "MT", "PA", "RO", "RR", "TO"]
TARGET_COLUMNS = {
"relationship": {
"PARENT_CNPJ8": "parent_cnpj8",
"SUBSID_CNPJ8": "subsid_cnpj8",
"START_YEAR": "start_year",
"END_YEAR": "end_year",
},
"old_tac_df": {
"SH_CNPJ": "cnpj",
"YEAR_TAC_SIGNED": "signing_year",
"TAC_STATE": "state",
"APPLIES_TO_LEGAL_AMAZON": "legal_amazon",
},
"new_tac_df": {"signing_year": "signing_year", "state_uf": "state", "cnpj": "cnpj"},
}
"""
Auxiliary Functions
"""
def validate_cnpj_code(code):
    # First deduce whether the code is a CNPJ or a CPF from the length
    # of the string as it appeared in the original data.
if code == "NA":
code = "0"
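    # Pad to both candidate lengths: 14 digits for a CNPJ, 11 for a CPF.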
cnpj = code.rjust(14, "0")
is_valid_cnpj = stdnum.br.cnpj.is_valid(cnpj)
cpf = code.rjust(11, "0")
is_valid_cpf = stdnum.br.cpf.is_valid(cpf)
# we use the checksum method if it was unequivocal and fall back on the
# string-length method otherwise
if is_valid_cnpj and not is_valid_cpf:
return "valid", cnpj
elif is_valid_cpf and not is_valid_cnpj:
return "valid", cpf
    else:
        # Fall back on string length: treat codes longer than 11 digits
        # as (invalid) CNPJs and shorter ones as (invalid) CPFs.
        return ("invalid", cnpj) if len(code) > 11 else ("invalid", cpf)
def read_relationships():
# TODO: we should start using 2020-12-19-BEEF_EXPORTER_SUBSIDIARIES.csv
# However we have currently paused work on ZDCs, so I'm kicking this can down the
# road...
return get_pandas_df_once(
FILEPATH_RELATIONSHIPS,
sep=";",
dtype=str,
keep_default_na=False,
)
def process_relationships(df_relationships):
df_relationships = df_relationships.rename(
columns=TARGET_COLUMNS["relationship"], errors="raise"
)
df_relationships.loc[
df_relationships["PARENT_CLEAN"] == "MARFRIG GLOBAL FOODS", "parent_cnpj8"
] = "03853896"
df_relationships.loc[
df_relationships["PARENT_CLEAN"]
== "J.N.J. COMERCIAL IMPORTADORA E EXPORTADORA DE CARNES",
"parent_cnpj8",
] = "07664941"
assert all(df_relationships["parent_cnpj8"].str.len() == 8)
assert all(df_relationships["subsid_cnpj8"].str.len() == 8)
df_relationships = df_relationships[
list(TARGET_COLUMNS["relationship"].values())
].drop_duplicates()
return df_relationships
def read_cnpj(year):
# TODO: use Athena to read the latest RFB data rather than these pre-filtered CSVs
return get_pandas_df_once(
f"brazil/beef/auxiliary/cnpj/BEEF_CNPJ_{year}_NEW.csv",
sep=";",
dtype=str,
keep_default_na=False,
)
def process_and_concatenate_cnpjs(df_cnpjs):
df = sps.concat(df_cnpjs)
df = df.drop_duplicates()
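    # validate_cnpj_code returns a (status, padded_code) pair per row;
    # zip(*...) transposes those pairs into the two new columns.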
df["is_valid"], df["cnpj"] = zip(*df["cnpj"].apply(validate_cnpj_code))
df["is_valid"] = df["is_valid"] == "valid"
print(
f'Dropped {sum(~df["is_valid"])} CNPJ(s) from the database that are not valid'
)
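    # pop() drops the helper column and returns it as the boolean filter mask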
df = df[df.pop("is_valid")]
return df
def clean_tacs(df_tac_imaflora, df_tac_gibbs, df_cnpj, df_relationships):
df_tac_gibbs = df_tac_gibbs[TARGET_COLUMNS["old_tac_df"].keys()].rename(
columns=TARGET_COLUMNS["old_tac_df"], errors="raise"
)
df_tac_imaflora = df_tac_imaflora[TARGET_COLUMNS["new_tac_df"].keys()].rename(
columns=TARGET_COLUMNS["new_tac_df"], errors="raise"
)
# update legal amazon field
df_tac_imaflora["legal_amazon"] = np.where(
df_tac_imaflora["state"].isin(LEGAL_AMAZON_UF), True, False
)
df_tac_gibbs["legal_amazon"] = np.where(
df_tac_gibbs["state"].isin(LEGAL_AMAZON_UF), True, False
)
    # NaN and inf signing years are replaced with 2013, the most frequent year
df_tac_imaflora["signing_year"] = (
df_tac_imaflora["signing_year"].replace([np.inf, -np.inf], 2013).fillna(2013)
)
df_tac_gibbs["signing_year"] = (
df_tac_gibbs["signing_year"].fillna(0).replace("NA", 2013)
)
# convert year to int
df_tac_imaflora["signing_year"] = df_tac_imaflora["signing_year"].astype(int)
df_tac_gibbs["signing_year"] = df_tac_gibbs["signing_year"].astype(int)
    # The supply-shed dataset covers domestic trade only, so remove any CNPJs
    # that are not in our CNPJ database. This applies only to the Holly Gibbs
    # dataset.
    # -------------------------------------------------------------
before = len(df_tac_gibbs)
df_tac_gibbs = pd.merge(
df_tac_gibbs, df_cnpj["cnpj"].drop_duplicates(), validate="many_to_one"
)
print(f"Drop {len(df_tac_gibbs) - before} rows with unrecognised CNPJs")
# set cnpj8
df_tac_gibbs["cnpj8"] = df_tac_gibbs["cnpj"].str.slice(0, 8)
df_tac_imaflora["cnpj8"] = df_tac_imaflora["cnpj"].str.slice(0, 8)
# check cnpj8
assert all(
(df_tac_gibbs["cnpj8"].str.len() == 8) & df_tac_gibbs["cnpj8"].str.isdigit()
)
assert all(
(df_tac_imaflora["cnpj8"].str.len() == 8)
& df_tac_imaflora["cnpj8"].str.isdigit()
)
    # attach parent companies via the subsidiary relationships (left join)
df_tac_gibbs = pd.merge(
df_tac_gibbs,
df_relationships,
left_on="cnpj8",
right_on="subsid_cnpj8",
how="left",
validate="many_to_one",
)
# set parent_cnpj8
df_tac_gibbs["parent_cnpj8"] = df_tac_gibbs["parent_cnpj8"].mask(
df_tac_gibbs["parent_cnpj8"].isna(), df_tac_gibbs["cnpj8"]
)
df_tac_imaflora["parent_cnpj8"] = np.where(
df_tac_imaflora["cnpj8"]
!= "04748631", # I found that the only exception was CNPJ8, which has a different parent
df_tac_imaflora["cnpj8"],
"03853896", # this is the parent cnpj8 for the exception
)
# set source field
df_tac_gibbs["source"] = "Gibbs Lab"
df_tac_imaflora["source"] = "Boi na Linha"
return df_tac_gibbs, df_tac_imaflora
def process_data(df_tac_imaflora, df_tac_gibbs, df_cnpjs, df_relationships):
df_relationships = process_relationships(df_relationships)
df_cnpj = process_and_concatenate_cnpjs(df_cnpjs)
df_tac_gibbs, df_tac_imaflora = clean_tacs(
df_tac_imaflora, df_tac_gibbs, df_cnpj, df_relationships
)
# filter target fields
df_tac_gibbs = df_tac_gibbs[
[
"cnpj",
"cnpj8",
"parent_cnpj8",
"signing_year",
"state",
"legal_amazon",
"source",
]
]
df_tac_imaflora = df_tac_imaflora[
[
"cnpj",
"cnpj8",
"parent_cnpj8",
"signing_year",
"state",
"legal_amazon",
"source",
]
]
# merge datasets
df = pd.concat([df_tac_gibbs, df_tac_imaflora])
# group by CNPJ and combine source info
df["source"] = df.groupby("cnpj")["source"].transform(
lambda x: "_".join(sorted(set(x)))
)
    # keep one row per CNPJ, preferring the earliest signing year
df = df.sort_values("signing_year").drop_duplicates(subset="cnpj", keep="first")
# filter by target years
YEARS = list(range(2009, 2024))
df = df.query("signing_year in @YEARS")
df = df.drop_duplicates()
# convert types
df["signing_year"] = df["signing_year"].astype(int)
return df.reset_index(drop=True)
def main():
cnpj_years = [2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017, 2018, 2019, 2020]
df_cnpjs = [read_cnpj(year) for year in cnpj_years]
df_relationships = read_relationships()
# load tac dataset provided by holly gibbs lab
df_tac_gibbs = get_pandas_df_once(
FILEPATH_TAC_SIGNATORIES_GIBBSLAB, sep=";", dtype=str, keep_default_na=False
)
# load tac dataset provided by imaflora
df_tac_imaflora = read_s3_parquet(FILEPATH_TAC_SIGNATORIES_IMAFLORA)
# clean and combine datasets
df = process_data(df_tac_imaflora, df_tac_gibbs, df_cnpjs, df_relationships)
write_csv_for_upload(df=df, key=OUTPUT_FILE, index=False)
if __name__ == "__main__":
main()
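The dbt model file wraps process_data as a dbt Python model (see the model file link above):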
from trase.data.brazil.indicators.actors.zd_commitments.out.tac_signatories_combined import (
process_data,
)
def model(dbt, cursor):
dbt.config(materialized="external")
df_tac_imaflora = dbt.ref("tac_zdc_2023").df()
df_tac_gibbs = dbt.ref("tac_signatories").df()
df_relationships = dbt.source(
"trase-storage-raw", "beef_exporter_subsidiaries"
).df()
df_cnpjs = [
dbt.ref("beef_cnpj_2010_new").df(),
dbt.ref("beef_cnpj_2011_new").df(),
dbt.ref("beef_cnpj_2012_new").df(),
dbt.ref("beef_cnpj_2013_new").df(),
dbt.ref("beef_cnpj_2014_new").df(),
dbt.ref("beef_cnpj_2015_new").df(),
dbt.ref("beef_cnpj_2016_new").df(),
dbt.ref("beef_cnpj_2017_new").df(),
dbt.ref("beef_auxiliary_cnpj_beef_cnpj_2018_new").df(),
dbt.ref("beef_cnpj_2019_new").df(),
dbt.ref("beef_cnpj_2020_new").df(),
]
df = process_data(df_tac_imaflora, df_tac_gibbs, df_cnpjs, df_relationships)
raise NotImplementedError(
"This model has been written but what is on S3 was actually generated by running the Python script directly, not by dbt"
)
return df