Archive
This page is synchronized from trase/models/brazil/beef/qa_beef/archive/README.md. Last modified on 2026-03-21 22:30 CET by Trase Admin.
Please view or edit the original file there; changes should appear here after the midnight build (CET) or after the build is triggered manually with a GitHub action.
QA code
This folder contains the QA phase 2 and phase 3 code for the beef model. The Jupyter notebooks output the HTML files used to QA the model. The HTML files are stored on GitHub at:
```bash
trase/data/brazil/beef/sei_pcs/qa_phase2_%%_plots.html
```
The Amazon S3 data is used to produce the general and specific plots, and the Postgres data is used to produce the version comparison plots. Keep this in mind when deciding which data to download in the Workflow section.
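The mapping between notebooks, data sources, and `loader.py` flags can be sketched as a small lookup. This is only an illustration built from the notebook names and flags mentioned in this README; the helper `loader_command` and the two dictionaries are hypothetical and are not part of the repository.

```python
# Data source needed by each phase 2 notebook, as described in this README.
NOTEBOOK_SOURCES = {
    "qa_phase2_general_plots.ipynb": "s3",
    "qa_phase2_specific_plots.ipynb": "s3",
    "qa_phase2_compare_versions.ipynb": "postgres",
}

# loader.py flag that downloads everything from one source (flags from this README).
SOURCE_FLAGS = {"s3": "--all_s3", "postgres": "--all_db"}


def loader_command(notebooks):
    """Return the loader.py command that covers the given notebooks (hypothetical helper)."""
    sources = {NOTEBOOK_SOURCES[name] for name in notebooks}
    if not sources:
        raise ValueError("no notebooks given")
    if sources == set(SOURCE_FLAGS):
        # Both sources are needed, so download everything.
        return "python loader.py --all True"
    # Exactly one source is needed.
    (source,) = sources
    return f"python loader.py {SOURCE_FLAGS[source]} True"
```

For example, `loader_command(["qa_phase2_compare_versions.ipynb"])` returns the Postgres-only command, so the slower S3 download can be skipped when only the version comparison is needed.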
## Structure and files
```
.
├── imports
│   ├── constants.py                  # Constants used in the QA of the beef model, e.g. HS6 codes and dataframe column names.
│   ├── functions.py                  # Auxiliary functions used to read the files, process the dataframes for the plots, etc.
│   ├── layout.py                     # Colors used for the layout of the plots.
│   ├── plots_general.py              # All the general plots used to QA the beef model.
│   ├── plots_specific.py             # All the specific plots used to QA the beef model.
│   ├── preprocess.py                 # Functions to preprocess the dataframes used in the QA.
│   └── reader.py                     # Functions to read the files in S3, the Trase database and local storage.
├── loader.py                         # The Python file that downloads the data used in the QA.
├── qa_phase2_general_plots.ipynb     # The notebook that produces the general HTML file.
├── qa_phase2_specific_plots.ipynb    # The notebook that produces the specific HTML file.
├── qa_phase2_compare_versions.ipynb  # The notebook that produces the compared versions HTML file.
└── qa_phase3_plots.ipynb             # The notebook that produces the plots for the phase 3 QA.
```
## Workflow
The workflow consists of the following steps:
- Go to `imports/constants.py` and set the `PATH` variable to the folder where the data will be downloaded and read from. The default path creates a `downloads` folder inside `qa_beef`.

- Run `loader.py` with arguments to download data from the S3 and Postgres databases. Run `python loader.py --help` to list the available arguments. As noted in the first section of this file, the Amazon S3 data is used to produce the general and specific plots, and the Postgres data is used to produce the version comparison plots. Here are some examples, depending on your goal:
  - To download all the data required to generate all the QA files, run `python loader.py --all True`
  - To download only the data from S3, run `python loader.py --all_s3 True`
  - To download only the data from Postgres, run `python loader.py --all_db True`
  - To download specific dataframes, for example SEI_PCS version 2.2 and MDIC port, run `python loader.py --sei_pcs True --mdic_port True`

  Downloading all the data usually takes 35-45 minutes and requires 10 GB of memory.

  Important: If the download fails, the most likely cause is that a file path, a file name, or a dataframe column name has changed. Check the file's last-modified date on Amazon S3: if it was modified recently, the file name or its columns have probably changed. If the file name or path on Amazon S3 has changed, update the path in `imports/reader.py`. If the columns in the file have changed, update the column names in `imports/columns.py`.

- After downloading the data from S3, run the Jupyter notebook `qa_phase2_general_plots.ipynb` to view the general plots. This notebook imports the function in `reader.py` that loads the downloaded data and the functions in `plots_general.py` that produce and display the plots. After running the notebook, you can export it as an HTML file, which is saved locally on your computer.

- Run the Jupyter notebook `qa_phase2_specific_plots.ipynb` to view the specific plots. This notebook imports the function in `reader.py` that loads the downloaded data and the functions in `plots_specific.py` that produce and display the plots. After running the notebook, you can export it as an HTML file, which is saved locally on your computer.

- After downloading the data from Postgres, run the Jupyter notebook `qa_phase2_compare_versions.ipynb` to view the version comparison plots. This notebook imports the function in `reader.py` that loads the downloaded data and the functions in `plots_general.py` and `plots_specific.py` that produce and display the plots. After running the notebook, you can export it as an HTML file, which is saved locally on your computer.

- Save the HTML files in the Trase GitHub path:

  ```bash
  trase/data/brazil/beef/sei_pcs/
  ```
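
The export step can also be scripted instead of done by hand from the notebook interface. The following is a minimal sketch, assuming `jupyter nbconvert` is installed and the data has already been downloaded; the helper names `nbconvert_command` and `export_all` are hypothetical, and the output folder is the repository path given above, interpreted relative to the current working directory, so adjust it to where you run the script.

```python
import subprocess

# Phase 2 notebooks named in this README.
PHASE2_NOTEBOOKS = [
    "qa_phase2_general_plots.ipynb",
    "qa_phase2_specific_plots.ipynb",
    "qa_phase2_compare_versions.ipynb",
]

# Folder where this README says the HTML files are stored (relative path is an assumption).
OUTPUT_DIR = "trase/data/brazil/beef/sei_pcs"


def nbconvert_command(notebook, output_dir=OUTPUT_DIR):
    """Build the nbconvert call that executes a notebook and writes it as HTML."""
    return [
        "jupyter", "nbconvert",
        "--to", "html",              # export format
        "--execute",                 # run all cells before exporting
        "--output-dir", output_dir,  # where the .html file is written
        notebook,
    ]


def export_all(notebooks=PHASE2_NOTEBOOKS, run=subprocess.run):
    """Export every notebook in turn; stops at the first failing notebook."""
    for notebook in notebooks:
        run(nbconvert_command(notebook), check=True)
```

Calling `export_all()` from the folder that contains the notebooks writes one HTML file per notebook, named after the notebook. The `run` parameter exists only so the command list can be inspected without actually launching Jupyter.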
