This page is synchronized from doc/Configuration-in-Trase.md. Last modified on 2026-05-06 16:54 CEST by Trase Admin. Please view or edit the original file there; changes should be reflected here after the midnight build (CET) or after manually triggering a build with a GitHub action.

Configuration in Trase


Trase separates configuration from code. Examples of configuration are:

  • The database host, password, etc.
  • AWS region, password, etc.
  • Trase-specific parameters such as chunk_size

Configuration is stored in a file called trase.env (standard dotenv format).

Here is an example of a trase.env file:

# Top-level settings use `TRASE_<KEY>`.
# Nested settings (e.g. postgres, etc) use `TRASE_<SECTION>__<KEY>` (double underscore).

# use the "production_readonly" service defined in pg_service.conf
TRASE_POSTGRES__SERVICE=production_readonly

# use the AWS profile "readonly"
TRASE_AWS__PROFILE_NAME=readonly

# override chunk size
TRASE_CHUNK_SIZE=100

Top-level settings use TRASE_<KEY>. Nested settings (postgres, aws, duckdb_and_dbt, metabase) use TRASE_<SECTION>__<KEY> (double underscore).
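
The naming convention can be illustrated with a short sketch. This is not Trase's actual loader: the function name `parse_trase_env` is hypothetical, and values are left as plain strings with no type conversion. It only shows how a single underscore stays inside a key while a double underscore introduces a nested section.

```python
def parse_trase_env(env, prefix="TRASE_", delimiter="__"):
    """Group TRASE_* variables into top-level and nested settings (illustrative only)."""
    settings = {}
    for name, value in env.items():
        if not name.startswith(prefix):
            continue  # ignore unrelated variables such as PATH
        # "POSTGRES__SERVICE" -> ["postgres", "service"]; "CHUNK_SIZE" -> ["chunk_size"]
        parts = name[len(prefix):].lower().split(delimiter)
        node = settings
        for section in parts[:-1]:
            node = node.setdefault(section, {})
        node[parts[-1]] = value
    return settings


example = {
    "TRASE_POSTGRES__SERVICE": "production_readonly",
    "TRASE_AWS__PROFILE_NAME": "readonly",
    "TRASE_CHUNK_SIZE": "100",
    "PATH": "/usr/bin",
}
settings = parse_trase_env(example)
print(settings)
# {'postgres': {'service': 'production_readonly'}, 'aws': {'profile_name': 'readonly'}, 'chunk_size': '100'}
```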

Interacting with the trase.env file

You have two main options to interact with the trase.env file:

  1. Edit it in a text editor. The location of the user configuration file is at:

    Linux: ~/.config/trase/trase.env
    Mac: ~/Library/Application Support/trase/trase.env
    Windows: C:\Users\MyUserName\AppData\Roaming\trase\trase\trase.env

    There are also repository-level and system-wide files; see Configuration Layers.

  2. Use the Trase CLI.
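
If you need the user-level location from option 1 in a script, the per-platform paths can be sketched as follows. The helper `user_trase_env` is hypothetical and simply mirrors the three paths listed above; it does not account for overrides such as XDG_CONFIG_HOME or APPDATA, which the real lookup may honour. In practice, trase config show reports where the configuration is actually read from.

```python
from pathlib import PurePosixPath, PureWindowsPath


def user_trase_env(platform, home):
    """Return the documented user-level trase.env path (illustrative only).

    `platform` follows sys.platform ("linux", "darwin", "win32");
    `home` is the user's home directory as a string.
    """
    if platform == "win32":
        return PureWindowsPath(home, "AppData", "Roaming", "trase", "trase", "trase.env")
    if platform == "darwin":
        return PurePosixPath(home, "Library", "Application Support", "trase", "trase.env")
    # Anything else is treated as Linux in this sketch
    return PurePosixPath(home, ".config", "trase", "trase.env")


print(user_trase_env("linux", "/home/me"))
print(user_trase_env("darwin", "/Users/me"))
print(user_trase_env("win32", r"C:\Users\MyUserName"))
```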

Using the Trase CLI

The Trase CLI provides wrapper scripts for common operations, in particular:

  • trase config show: print the current configuration and where it is being read from
  • trase config edit: open the Trase configuration file in the default editor for your operating system
  • trase config env: convert some configuration settings to environment variables
  • trase config db: interactively pick a PostgreSQL service from ~/.pg_service.conf and write it to your user trase.env
  • trase config loggers: print all of the named loggers in the codebase (and upstream libraries)
  • trase config logger: debug the logging configuration of a specific named logger

You can use trase config --help to show all the commands and trase config <command> --help for help on a specific command.

To use the Trase CLI you will need to have set up a Python environment: see the codebase README for instructions.

Database Configuration

The typical workflow for database configuration is:

  1. Set up a PostgreSQL connection service file and password file. See the codebase README for instructions.
  2. Run trase config db to select which service should be written to your user trase.env (see Using the Trase CLI).

However, this is not the only way. You can set any libpq-supported connection parameter directly in your trase.env, for example:

TRASE_POSTGRES__USER=my_user
TRASE_POSTGRES__HOST=trase-db-instance.c6jsbtgl0u2s.eu-west-1.rds.amazonaws.com
TRASE_POSTGRES__DBNAME=trase
TRASE_POSTGRES__OPTIONS=-c default_transaction_read_only=on

Or you can set it as a shell environment variable (without a file):

export TRASE_POSTGRES__USER=my_user

You can find a full list of libpq connection parameters at the PostgreSQL documentation.
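
To make the mapping concrete, here is a minimal sketch of how TRASE_POSTGRES__* settings could be turned into a libpq keyword/value connection string. How Trase actually hands these parameters to libpq is not described here, so the function `postgres_conninfo` is a hypothetical illustration; the quoting follows the libpq rule that empty values and values containing spaces are wrapped in single quotes, with backslashes and single quotes escaped by a backslash.

```python
def postgres_conninfo(env, prefix="TRASE_POSTGRES__"):
    """Build a libpq keyword/value connection string (illustrative only)."""
    parts = []
    for name in sorted(env):
        if not name.startswith(prefix):
            continue
        keyword = name[len(prefix):].lower()  # e.g. "USER" -> "user"
        value = env[name]
        # Quote empty values and values with spaces, quotes or backslashes
        if value == "" or any(c in value for c in " '\\"):
            value = "'" + value.replace("\\", "\\\\").replace("'", "\\'") + "'"
        parts.append(f"{keyword}={value}")
    return " ".join(parts)


env = {
    "TRASE_POSTGRES__USER": "my_user",
    "TRASE_POSTGRES__DBNAME": "trase",
    "TRASE_POSTGRES__OPTIONS": "-c default_transaction_read_only=on",
    "TRASE_CHUNK_SIZE": "100",  # not a postgres setting, so it is skipped
}
conninfo = postgres_conninfo(env)
print(conninfo)
# dbname=trase options='-c default_transaction_read_only=on' user=my_user
```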

Debugging Database Configuration

If your database connection is not working as you expect, try the following:

  1. Start a new Python session. Configuration is only applied once when the Python session starts.
  2. Run trase config show and confirm that the configuration is what you expect. If no service is configured, the output will include a tip to run trase config db.
  3. Run the following Python code to confirm which database you are connected to:
    from trase.tools.pcs.connect import CNX
    
    print(CNX.cnx)
    

AWS Configuration

See the codebase README for instructions.

You can find a full list of boto3 session parameters at the boto3 documentation.

Google Cloud Configuration

The Trase configuration file does not have any configuration options for Google Cloud. See the codebase README for instructions.

DuckDB Configuration

See DuckDB.

Configuration Layers

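The final configuration is built up by applying the following layers, if they exist, in order from layer 0 to layer 6. A setting in a higher-numbered layer overrides the same setting in a lower-numbered one: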

| Layer | Trase-specific | AWS / boto3 | PostgreSQL / libpq |
|-------|----------------|-------------|--------------------|
| 0 | - | boto3 configuration files ⭐️ (~/.aws/config, ~/.aws/credentials) | libpq configuration files ⭐️ (pg_service.conf, pgpass) |
| 1 | - | boto3 environment variables (AWS_SECRET_ACCESS_KEY, AWS_PROFILE, ...) | libpq environment variables (PGSERVICE, PGPASSWORD, ...) |
| 2 | Repository trase.env ⭐️: the (version-controlled!) trase.env at the root of this repository; contains default values; rarely needs to be changed | | |
| 3 | System-wide trase.env. Linux: /etc/xdg/trase/trase.env; Mac: /Library/Application Support/trase/trase.env; Windows: C:\ProgramData\trase\trase.env | | |
| 4 | User-specific trase.env ⭐️. Linux: ~/.config/trase/trase.env; Mac: ~/Library/Application Support/trase/trase.env; Windows: C:\Users\<user>\AppData\Roaming\trase\trase\trase.env | | |
| 5 | Current-working-directory trase.env: a trase.env in the current working directory (not version-controlled!) | | |
| 6 | Trase environment variables: TRASE_CHUNK_SIZE, TRASE_AWS__PROFILE_NAME, ... | | |

Why are there so many layers?

  • Layers 0 and 1 allow you to share your configuration with other consumers of boto3 or libpq on the same machine, such as psql.
  • Layer 2 provides default values for Trase (replaces the old trase/default.toml).
  • Layers 3 and 4 are for services such as JupyterHub, where there are system-wide settings with user overrides.
  • Layer 5 is convenient for overrides in Jupyter notebooks and other scripts.
  • Layer 6 is convenient for deploying Trase services to the cloud, following the twelve-factor app philosophy.

Whilst this allows for considerable flexibility, the recommended setup for a developer is to use the three starred (⭐️) layers. To see the final configuration for your system you can use the command trase config show.
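
The override behaviour of the layers can be sketched in a few lines. This is an illustration rather than Trase's implementation: the function `merge_layers` is hypothetical, the repository default of 1000 is made up, and only three of the layers are shown. It also records which layer each final value came from, similar in spirit to what trase config show reports.

```python
def merge_layers(layers):
    """Merge (name, settings) layers in order; later layers win (illustrative only)."""
    merged, origin = {}, {}
    for layer_name, values in layers:
        for key, value in values.items():
            merged[key] = value
            origin[key] = layer_name
    return merged, origin


layers = [
    # Layer 2: repository defaults (the value 1000 is a made-up example)
    ("repository trase.env", {"TRASE_CHUNK_SIZE": "1000"}),
    # Layer 4: user-specific overrides
    ("user trase.env", {
        "TRASE_POSTGRES__SERVICE": "production_readonly",
        "TRASE_CHUNK_SIZE": "100",
    }),
    # Layer 6: environment variables
    ("environment variables", {"TRASE_AWS__PROFILE_NAME": "readonly"}),
]
merged, origin = merge_layers(layers)
for key in sorted(merged):
    print(f"{key}={merged[key]}  (from {origin[key]})")
```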

Testing

Part of the test suite requires a connection to a PostgreSQL database. By default, an ephemeral PostgreSQL server will be created for each test suite run. However, you can also configure the test suite to use a PostgreSQL server that is already running:

TRASE_TEST__POSTGRES_SERVER=system

# optional connection parameters
TRASE_TEST__POSTGRES__USER=myself

This is particularly useful on Windows, where the ephemeral PostgreSQL server does not work.

Logging

Logging configuration is hardcoded in trase/config.py (the _LOGGING_CONFIG dict). It is no longer read from the configuration file (previously it was a [logging] table in trase.toml).

Applying Logging Configuration

Logging configuration is applied when the function trase.config.configure_logging runs.

This function should generally not be called in Trase code. Most Trase code is intended to be used as a library that is imported into other code, whereas configuring logging is an application-level concern.

For local development, you don't need to worry about this. The configure_logging function is executed for every invocation of Python in the Poetry environment by virtue of the sitecustomize.py file that lives at the codebase root.

In Jupyter, use the magic command %configure_logging in the first cell.

How Python Logging Works

Logging in Python can be a little confusing. There are a few key concepts:

  1. Loggers form a parent/child hierarchy determined by "dot" notation in their names (e.g. trase.tools.pcs is a child of trase.tools), and there is one "root" logger at the top.
  2. There are five default logging levels, in increasing severity: DEBUG, INFO, WARNING, ERROR, and CRITICAL. A given level also lets through all more severe levels, so ERROR also includes CRITICAL.
  3. If a logger has no level set, its level (info/debug/etc.) is taken from the first parent that has one.
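
These three concepts can be checked directly with the standard library. The logger names below are hypothetical (a "demo" hierarchy is used so that the example does not interfere with any configured trase loggers).

```python
import logging

parent = logging.getLogger("demo.tools")      # hypothetical logger names
child = logging.getLogger("demo.tools.pcs")   # child of demo.tools by dot notation

parent.setLevel(logging.ERROR)

# The child has no level of its own, so it inherits ERROR from its parent
print(child.level == logging.NOTSET)                    # True
print(logging.getLevelName(child.getEffectiveLevel()))  # ERROR

# ERROR includes CRITICAL but not WARNING
print(child.isEnabledFor(logging.WARNING))   # False
print(child.isEnabledFor(logging.CRITICAL))  # True
```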

First, we determine whether logging should occur by comparing the level of the message with the level of the logger. Only the level of the logger itself or, if it has none, of the first parent with a level is taken into account:

                            no           ┌────────────────────────────┐
                    ┌─────────────────   Does the logger have a level?
                    │                    └────────────────────────────┘
                    ▽                                   │ yes
       Find first parent with level                     ▽
                    │                    ┌────────────────────────────┐
                    └────────────────▷   Is the logging level higher?    ──────▷ stop
                                         └────────────────────────────┘   no
                                                        │
                                                        ┊  yes

Once that is decided, we look for handlers to pass the message to. Note that once we have decided, the levels of any parents are no longer relevant:

                                                        ┊
                                                        │
                                                        ▽
                   ┌─────────────────▷     Pass to handler of logger
                   │                                    │
       set current logger to parent                     │
                   △                                    ▽
                   │                     ┌────────────────────────────┐
                   └──────────────────     Is propagate set to true?     ──────▷ stop
                           yes           └────────────────────────────┘   no

It is all a bit nuanced and not necessarily intuitive at the beginning! To understand this better see the official Python documentation or Understanding Python's logging module.
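
The handler step can be demonstrated with a small standard-library sketch. The logger names and the `ListHandler` class are hypothetical helpers for this example. Note that the parent's level is set to CRITICAL, yet the parent's handler still receives the INFO message, because parent levels are not consulted once the child has decided to log.

```python
import logging


class ListHandler(logging.Handler):
    """Collect messages in a list so we can see which handler received what."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())


parent = logging.getLogger("demo2")        # hypothetical logger names
child = logging.getLogger("demo2.child")
parent_handler, child_handler = ListHandler(), ListHandler()
parent.addHandler(parent_handler)
child.addHandler(child_handler)

child.setLevel(logging.INFO)
parent.setLevel(logging.CRITICAL)  # not consulted: the child has its own level

child.info("first")        # handled by the child, then propagated to the parent
child.propagate = False
child.info("second")       # handled by the child only

print(child_handler.messages)   # ['first', 'second']
print(parent_handler.messages)  # ['first']
```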

Debugging Logging Configuration

There are two commands that will help you debug logging configuration. The command trase config loggers will print the hierarchy of all loggers:

$ trase config loggers
root WARNING
    ↳ urllib3 <NullHandler (NOTSET)>
        ↳ urllib3.poolmanager
        ↳ urllib3.util
            ↳ urllib3.util.retry
        ↳ urllib3.connection
        ↳ urllib3.response
        ↳ urllib3.connectionpool
    ↳ dotenv
        ↳ dotenv.main
...

The command trase config logger <logger name> <level> helps you understand whether a logging message will be emitted, and if so, to which handlers:

$ trase config logger trase.tools.pcs WARNING
trase.tools.pcs  has effective level WARNING, which is enabled for WARNING
trase.tools.pcs  Emit to <FileHandler ~/Library/Logs/trase/debug.log (NOTSET)>
trase            No handlers
root             No handlers

Customising Logging Configuration

Since logging is no longer read from a config file, customisation is done in Python. You can add or override loggers after configure_logging() has run:

import logging

# Log trase.tools.aws at DEBUG level
logging.getLogger("trase.tools.aws").setLevel(logging.DEBUG)

# Log trase.tools.pcs at WARNING level
logging.getLogger("trase.tools.pcs").setLevel(logging.WARNING)
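
When several loggers need adjusting at once, the same overrides can also be written declaratively with the standard library's logging.config.dictConfig in incremental mode. This is a sketch that assumes it runs after configure_logging(); incremental mode only changes levels (and propagation) and leaves existing handlers untouched.

```python
import logging
import logging.config

overrides = {
    "version": 1,
    "incremental": True,  # adjust levels only; keep the handlers already configured
    "loggers": {
        "trase.tools.aws": {"level": "DEBUG"},
        "trase.tools.pcs": {"level": "WARNING"},
    },
}
logging.config.dictConfig(overrides)

print(logging.getLevelName(logging.getLogger("trase.tools.aws").level))  # DEBUG
print(logging.getLevelName(logging.getLogger("trase.tools.pcs").level))  # WARNING
```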

Or, to fully replace the logging configuration before it is applied, edit _LOGGING_CONFIG in trase/config.py and add entries under "loggers":

# in trase/config.py _LOGGING_CONFIG["loggers"]:
"trase.tools.pcs": {"level": "DEBUG", "handlers": ["debugLogHandler"]},

The location of debug.log is as follows:

Mac: ~/Library/Logs/trase/debug.log
Linux: ~/.cache/trase/log/debug.log
Windows: C:\Users\MyUserName\AppData\Local\trase\Logs\debug.log