This page is synchronized from doc/CSV-Files.md. Last modified on 2025-12-09 00:30 CET by Trase Admin.
Please view or edit the original file there; changes should be reflected here after the midnight (CET) build,
or after manually triggering the build with a GitHub action.
CSV Files on Trase
CSV files are a mixed bag: they are very simple to understand, but also riddled with hidden traps! In general, we are moving towards replacing our use of CSVs with the Apache Parquet format.
However, the vast majority of files on AWS S3 representing tabular data are in the CSV format, and some parts of our pipeline (in particular, database ingestion) only support CSV files.
Over the years, we have developed a number of standards for using CSV files on Trase.