This page is synchronized from doc/Trase-ID.md. Last modified on 2025-12-09 00:30 CET by Trase Admin.
Please view or edit the original file there; changes should appear here after the midnight build (CET),
or after the build is triggered manually with a GitHub action (link).
Trase ID
A Trase ID is a globally unique identifier that we assign to any entity of interest to Trase, for example slaughterhouses, companies, or ports.
Each category of Trase ID has its own formation rules. For example, a slaughterhouse in Brazil has the form "BR-BEEF-SLAUGHTERHOUSE-XXXXXXXXXXXXXX", where the Xs stand for the digits of the CNPJ-14.
A Trase ID should ideally adhere to the following rules:
- It is globally unique (no other entity of any type, in any country or supply chain, shares that ID).
- Each entity has at most one Trase ID.
- If a standard unique identifier exists, Trase should use it. For example, IBGE codes for municipalities in Brazil.
Ideally, a Trase ID can be reliably constructed from real-world data external to Trase: given a CNPJ-14 of a slaughterhouse, I can construct its Trase ID. Generally we want to avoid the situation where we ourselves have assigned a custom sequential ID.
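As a minimal sketch of that idea, the snippet below builds a Brazilian slaughterhouse Trase ID from a CNPJ-14. The function name, the punctuation handling, and the sample CNPJ are assumptions made for illustration; this is not an existing Trase helper.

```python
import re

PREFIX = "BR-BEEF-SLAUGHTERHOUSE-"


def slaughterhouse_trase_id(cnpj: str) -> str:
    """Build a slaughterhouse Trase ID from a CNPJ-14 (hypothetical helper)."""
    # Assumption: the input may be punctuated, e.g. "12.345.678/0001-95",
    # so keep only the digits before checking the length.
    digits = re.sub(r"\D", "", cnpj)
    if len(digits) != 14:
        raise ValueError(f"expected 14 digits, got {len(digits)}: {cnpj!r}")
    return PREFIX + digits


# Illustrative CNPJ, not a real slaughterhouse.
print(slaughterhouse_trase_id("12.345.678/0001-95"))
# BR-BEEF-SLAUGHTERHOUSE-12345678000195
```

Because the ID is derived purely from the external identifier, two people working from the same source data should arrive at the same Trase ID without consulting a lookup table.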
Here are the rules that we have defined for existing Trase IDs[^1]:
- World / Global
- Argentina
- Bolivia
- Brazil
- Ghana
- Ivory Coast
- China
- Colombia
- Ecuador
- Indonesia
- Peru
- Paraguay
- Tanzania
World / Global
| Entity | Example | Rules |
|---|---|---|
| Country | AR | Two-letter ISO 3166-1 alpha-2 code. Some exceptions are made to the two-letter rule, for example GB-CHA represents the Channel Islands. The unknown country is XX and the "country" called "World" is WORLD. |
| Port | WPI-25320 | From the World Port Index. |
| Petrochemical Plant | PETROCHEMICAL-00002 | Unsure what this is. |
Argentina
| Entity | Example | Rules |
|---|---|---|
| Province | AR-06 | The code AR-XX represents the unknown province. The code AR-XX-STOCK represents volume sourced from imports and stock. |
| Department | AR-06385 | The code AR-XXXXX represents the unknown department. The code AR-XXXXX-STOCK represents volume sourced from imports and stock. |
| Biome | AR-BIO-1 | The code AR-BIO-X represents the unknown biome. The code AR-BIO-X-STOCK represents volume sourced from imports and stock. |
| Trader | AR-TRADER-3066306025 | Ten-digit CUIT. |
Bolivia
| Entity | Example | Rules |
|---|---|---|
| Department | BO-03 | Based on SIIP government data. The code BO-XX is the unknown department. |
| Province | BO-0309 | Based on SIIP government data. The code BO-XXXX is the unknown province. |
| Municipality | BO-030904 | Based on SIIP government data. The code BO-XXXXXX is the unknown municipality. |
| Biome | BO-BIO-523 | The code BO-BIO-XXX is the unknown biome. |
| Crushing Facility | BO-CRUSHER-01 | Unsure how these are generated. |
| Lake | BO-LAKE-01 | Generated by this script. |
| Salt flat | BO-SALT-FLAT-01 | Generated by this script. |
| Port | BO-PORT-722 | Unsure how these are generated. The code BO-PORT-XXX represents the unknown port. |
| Silo | BO-SILO-53 | Unsure how these are generated. The code BO-SILO-XX represents the unknown silo. |
| Trader | BO-TRADER-1028385027 | Tax codes are a mix of nine and ten digits. |
Brazil
| Entity | GADM Level | Example | Rules |
|---|---|---|---|
| Region | 2 | BR-1 | Geocode as defined by the IBGE. The code 'BR-X' represents the unknown region. The code BR-IMPORTED-REGION is used to represent volume imported into Brazil. |
| State | 3 | BR-16 | Geocode as defined by the IBGE. The code 'BR-XX' represents the unknown state. The code BR-IMPORTED-STATE is used to represent volume imported into Brazil. The code BR-XX-AGGREGATED is used to represent an "aggregated" state. |
| Mesoregion | 4 | BR-1102 | Geocode as defined by the IBGE. The code 'BR-XXXX' represents the unknown mesoregion, as do codes like BR-22XX. The code BR-IMPORTED-MESOREGION is used to represent volume imported into Brazil. |
| Microregion | 5 | BR-11001 | Geocode as defined by the IBGE. The code 'BR-XXXXX' represents the unknown microregion, as do codes like BR-51XXX. The code BR-IMPORTED-MICROREGION is used to represent volume imported into Brazil. |
| Municipality | 6 | BR-3534708 | Geocode as defined by the IBGE. The code 'BR-XXXXXXX' represents the unknown municipality. The code BR-IMPORTED-MUNICIPALITY is used to represent volume imported into Brazil. |
| Slaughterhouse | N/A | BR-BEEF-SLAUGHTERHOUSE-XXXXXXXXXXXXXX | BR-BEEF-SLAUGHTERHOUSE-UNKNOWN represents the unknown slaughterhouse. |
| Biome | N/A | BR-BIO-1 | The code BR-BIO-X represents the unknown biome. The code BR-IMPORTED-BIOME is used to represent volume imported into Brazil. |
| Crushing Facility | N/A | BR-SOY-CRUSHING-02003402002461 | |
| Silo | N/A | BR-SOY-SILO-00059308000102 | |
| Trader | N/A | BR-TRADER-10610917 | First 8 digits of the CPF or CNPJ code. Typically the first digits should not be zero - this often indicates that the CPF was left-padded with zeros before taking the first digits. The code BR-TRADER-XXXXXXXX represents the unknown Brazilian trader. |
| Vessel | N/A | VESSEL-9328534 | A shipping vessel used to export soy from Brazil. These IDs come from the Bill of Lading dataset for Brazil soy. The code VESSEL-XXXXXXX represents the unknown vessel. |
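The trader rule in the table above (first 8 digits of the CPF or CNPJ, with a leading zero treated as a warning sign) could be sketched as follows. The function name, the returned flag, and the sample tax codes are assumptions for illustration only.

```python
import re
from typing import Tuple


def br_trader_trase_id(tax_code: str) -> Tuple[str, bool]:
    """Return (trase_id, suspicious) for a CPF or CNPJ (hypothetical helper)."""
    # Keep digits only, so punctuated codes are accepted too.
    digits = re.sub(r"\D", "", tax_code)
    if len(digits) < 8:
        raise ValueError(f"need at least 8 digits, got {len(digits)}: {tax_code!r}")
    root = digits[:8]
    # A leading zero is not necessarily wrong, but it often means a CPF was
    # left-padded with zeros before the first digits were taken.
    suspicious = root.startswith("0")
    return "BR-TRADER-" + root, suspicious


# Illustrative inputs: a 14-digit code, and an 11-digit code left-padded to 14.
print(br_trader_trase_id("10610917000199"))  # ('BR-TRADER-10610917', False)
print(br_trader_trase_id("00012345678909"))  # ('BR-TRADER-00012345', True)
```

Returning a flag rather than raising keeps the decision with the caller, since some legitimate codes may begin with zero.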
Ghana
The administrative divisions of the Republic of Ghana consist of four geographic terrestrial plains and 16 regions. For local government, there are a total of 261 districts including 145 ordinary districts, 109 municipal districts, and six metropolitan districts.
| Entity | Example | Rules |
|---|---|---|
| Region | GH-16 | Geocode as defined by the GADM dataset, with the trailing version number (e.g. "_2") stripped. |
| District | GH-16.09 | Geocode as defined by the GADM dataset, with the trailing version number (e.g. "_2") stripped. |
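Stripping the trailing version number can be sketched as below. This assumes the input already has the shape shown in the table, with the `GH-` prefix and a version marker such as `_2` appended. Raw GADM identifiers may look different and need further steps that are not covered here. The function name is hypothetical.

```python
import re


def strip_gadm_version(code: str) -> str:
    """Drop a trailing "_<digits>" version marker (hypothetical helper)."""
    return re.sub(r"_\d+$", "", code)


print(strip_gadm_version("GH-16.09_2"))  # GH-16.09
print(strip_gadm_version("GH-16"))       # GH-16 (no marker, returned unchanged)
```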
India
Currently, no Trase IDs have been defined for India.
| Entity | GADM Level |
|---|---|
| State or Union Territory | 1 |
| District | 2 |
| Sub-district | 3 |
The ISO 3166-2:IN standard provides some codes, but they are state level only. Likely we would want to use codes from the Local Government Directory managed by the Ministry of Panchayati Raj. The Census of India is another possibility.
Ivory Coast
| Entity | GADM Level | Example | Rules |
|---|---|---|---|
| District | 2 | CI-12 | The code CI-XX represents the unknown district. |
| Region | 3 | CI-1202 | The code CI-XXXX represents the unknown region. |
| Department | 4 | CI-120203 | The code CI-XXXXXX represents the unknown department, as do codes like CI-1202XX. The code CI-INDRCT represents indirect sourcing. |
To convert an Ivory Coast GADM geocode like CI-2.1.2_1 to a Trase ID like CI-020102, you can use the following algorithm:
import pandas as pd

def geocode_to_trase_id(series: pd.Series) -> pd.Series:
    code = series                                    # CI-2.1.2_1
    # regex=False so that "." is a literal dot, not "any character"
    code = code.str.replace(".", "0", regex=False)   # CI-20102_1
    code = code.str.slice(3, -2)                     # 20102
    code = code.str.rjust(6, "0")                    # 020102
    code = "CI-" + code                              # CI-020102
    return code.replace("CI-0000NO", "CI-XXXXXX")
An implementation in R can be found here.
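The pandas version above inserts one zero per dot and pads to six characters, so it targets department-level codes whose second and third levels are single digits. The sketch below is a scalar variant that pads each level to two digits instead. That padding rule is an assumption inferred from examples such as CI-120203, and the function name is hypothetical. It does not reproduce the CI-0000NO special case.

```python
import re


def ci_geocode_to_trase_id(geocode: str) -> str:
    """Convert e.g. "CI-2.1.2_1" to "CI-020102" (illustrative variant)."""
    # Dotted levels after "CI-", optionally followed by a "_<version>" suffix.
    match = re.fullmatch(r"CI-(\d+(?:\.\d+)*)(?:_\d+)?", geocode)
    if match is None:
        raise ValueError(f"unrecognised geocode: {geocode!r}")
    parts = match.group(1).split(".")
    if len(parts) > 3 or any(len(part) > 2 for part in parts):
        raise ValueError(f"unexpected level structure: {geocode!r}")
    # Assumption: every level is zero-padded to two digits.
    return "CI-" + "".join(part.zfill(2) for part in parts)


print(ci_geocode_to_trase_id("CI-2.1.2_1"))  # CI-020102
print(ci_geocode_to_trase_id("CI-12.2_1"))   # CI-1202
```

For the documented example both approaches return CI-020102. They would differ only for codes with a two-digit second or third level.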
China
| Entity | Example | Rules |
|---|---|---|
| Trader | CN-TRADER-2102660062 | The code sometimes includes letters, not sure why! |
Colombia
| Entity | GADM Level | Example | Rules |
|---|---|---|---|
| Department | 2 | CO-XX | |
| Municipality | 3 | CO-XXXXX | |
| Port | N/A | CO-PORT-XXXX | |
| Trader | N/A | CO-TRADER-XXXXXXXXXX | |
Ecuador
| Entity | Example | Rules |
|---|---|---|
| Trader | EC-TRADER-1391807902001 | Some Traders have "NA" for their last two digits. |
| Province | EC-01 | The code EC-XX represents the unknown province. |
| Canton | EC-0901 | The code EC-XXXX represents the unknown canton. |
| Parish | EC-090157 | The code EC-XXXXXX represents the unknown parish. |
| Pond | EC-0901-POND-9665060-588194 | The geocode of a canton, followed by the coordinates of a shrimp pond. The code EC-XXXX-POND-XXXXXXXX-XXXXXX represents the unknown pond. |
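Since a pond ID embeds the canton geocode and two coordinate values, it can be split back into its parts. The sketch below assumes a four-digit canton code and two all-digit coordinate fields, as in the example. The page does not state the coordinate system or the order of the two values, so the fields are named neutrally. The names `PondId` and `parse_pond_id` are hypothetical.

```python
import re
from typing import NamedTuple


class PondId(NamedTuple):
    canton: str  # Trase ID of the canton, e.g. "EC-0901"
    first: int   # first coordinate value (meaning not specified here)
    second: int  # second coordinate value (meaning not specified here)


POND_PATTERN = re.compile(r"EC-(\d{4})-POND-(\d+)-(\d+)")


def parse_pond_id(trase_id: str) -> PondId:
    """Split a pond Trase ID into canton and coordinates (hypothetical helper)."""
    match = POND_PATTERN.fullmatch(trase_id)
    if match is None:
        # Also rejects the unknown pond, whose fields are X placeholders.
        raise ValueError(f"not a known pond ID: {trase_id!r}")
    return PondId("EC-" + match.group(1), int(match.group(2)), int(match.group(3)))


print(parse_pond_id("EC-0901-POND-9665060-588194"))
# PondId(canton='EC-0901', first=9665060, second=588194)
```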
Ethiopia
We do not have any Trase IDs defined for Ethiopia yet, but here is how the country divides according to GADM:
| Entity | GADM Level |
|---|---|
| Region | 1 |
| Zone | 2 |
| Woreda | 3 |
Indonesia
| Entity | GADM Level | Example | Rules |
|---|---|---|---|
| Region | 2 | ID-KA | One of the seven regions of Indonesia. ID-X represents the unknown region. |
| Province | 3 | ID-11 | ID-XX represents the unknown province. |
| Kabupaten | 4 | ID-XXXX | |
| Concession | N/A | ID-PALM-CONCESSION-XXXXX | |
| Mill | N/A | ID-PALM-MILL-XXXXX | ID-PALM-MILL-X represents the unknown mill. |
| Refinery | N/A | ID-PALM-REFINERY-XXXX | ID-PALM-REFINERY-X represents the unknown refinery. |
| Port | N/A | ID-PORT-XXX | |
| Port | N/A | ID-PORT-XXXX | ID-PORT-XXXX represents the unknown port. |
| Refinery | N/A | ID-REFINERY-XXXX | |
| Trader | N/A | ID-TRADER-XXXX | |
| Concession | N/A | ID-WOOD-CONCESSION-XXXX | |
| Mill | N/A | ID-WOOD-MILL-XXXX | |
Peru
| Entity | Example | Rules |
|---|---|---|
| Trader | PE-TRADER-XXXXXXXXXXX | |
No Trase IDs have yet been defined for the administrative divisions of Peru, but here is how the country divides according to GADM:
| Entity | GADM Level |
|---|---|
| Region | 1 |
| Province | 2 |
| District | 3 |
Paraguay
| Entity | Example | Rules |
|---|---|---|
| Department | PY-XX | |
| District | PY-XXXX | |
| District | PY-XXXXX | |
| Biome | PY-BIO-X | |
| Crushing Facility | PY-CRUSHER-XX | |
| Port | PY-PORT-XXXX | |
| Silo | PY-SILO-XXX | |
| Trader | PY-TRADER-XXXXXXX | |
| Trader | PY-TRADER-XXXXXXXX | |
| Trader | PY-TRADER-XXXXXXXXX | |
Tanzania
| Entity | Example | Rules |
|---|---|---|
| Trader | TZ-TRADER-104042260 | Nine-digit Tanzanian Taxpayer Identification Number (TIN). |
No Trase IDs have yet been defined for the administrative divisions of Tanzania, but here is how the country divides according to GADM:
| Entity | GADM Level |
|---|---|
| Region | 1 |
| District | 2 |
| Ward | 3 |
| Villages and streets | 4 |
[^1]: This list was obtained via the SQL query select distinct sub_type, regexp_replace(trase_id, '\d', 'X', 'g') from nodes join node_sub_types on node_sub_types.id = sub_type_id where length(trase_id) > 2 order by 2