
This page is synchronized from doc/Configuration-in-Trase.md. Last modified on 2026-01-22 21:12 CET by Florian Gollnow. Please view or edit the original file there; changes should be reflected here after the midnight (CET) build, or after manually triggering the build with a GitHub Action.

Configuration in Trase

Interacting with the trase.toml file
Using the Trase CLI
Database Configuration
Debugging Database Configuration
AWS Configuration
Google Cloud Configuration
DuckDB Configuration
Configuration Layers
Why are there so many layers?
Testing
Logging
Applying Logging Configuration
How Python Logging Works
Debugging Logging Configuration
Examples of Logging Configuration

Trase separates configuration from code. Examples of configuration are:

The database host, password, etc.
AWS region, password, etc.
Trase-specific parameters such as chunk_size

Configuration is typically stored in a file called trase.toml.

Here is an example of a trase.toml file:

autocommit = false

# use the "github" service defined in pg_service.conf
[postgres]
service = "github"

# print pcs.connect logging at WARNING level
[logging.loggers."trase.tools.pcs.connect"]
level = "WARNING"
handlers = [ "stdoutHandler", "stderrHandler"]

# use the AWS profile "readonly"
[aws]
profile_name = "readonly"

Interacting with the trase.toml file

You have two main options to interact with the trase.toml file:

Edit it in a text editor. The location of the user configuration file is at:

Linux ~/.config/trase/trase.toml

Mac ~/Library/Application Support/trase/trase.toml

Windows C:\Users\MyUserName\AppData\Roaming\trase\trase\trase.toml

There are also default and system files; see Configuration Layers.

Use the Trase CLI.
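
If you want to locate the user configuration file from a script, the paths listed above can be sketched as follows. The function name user_config_path is hypothetical, and this is not necessarily how Trase resolves the path: for example, the sketch ignores XDG_CONFIG_HOME on Linux and assumes that APPDATA points at the Roaming folder on Windows.

```python
import os
import sys
from pathlib import Path


def user_config_path(platform: str = sys.platform) -> Path:
    """Return the user-level trase.toml location for the given platform."""
    home = Path.home()
    if platform == "darwin":
        return home / "Library" / "Application Support" / "trase" / "trase.toml"
    if platform.startswith("win"):
        # Assumption: APPDATA is the user's AppData/Roaming folder.
        appdata = Path(os.environ.get("APPDATA", home / "AppData" / "Roaming"))
        return appdata / "trase" / "trase" / "trase.toml"
    # Linux and other Unix-like systems
    return home / ".config" / "trase" / "trase.toml"


print(user_config_path())
```

The authoritative answer for your machine is still the output of trase config show.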

Using the Trase CLI

The Trase CLI provides some wrapper scripts for some common operations, in particular:

trase config show: print the current configuration and where it is being read from
trase config edit: open the Trase configuration file in the default editor for your operating system
trase config env: convert some configuration settings to environment variables
trase config db: alter which database is chosen in the user configuration file
trase config loggers: print all of the named loggers in the codebase (and upstream libraries)
trase config logger: debug the logging configuration of a specific named logger

You can use trase config --help to show all the commands and trase config <command> --help for help on a specific command.

To use the Trase CLI you will need to have set up a Python environment: see the codebase README for instructions.

Database Configuration

The typical workflow for database configuration is:

Set up a PostgreSQL connection service file and password file. See the codebase README for instructions.
Run trase config db to select which service should be in the user configuration file (see Using the Trase CLI).

However, this is not the only way. You can put any libpq-supported connection parameter in the [postgres] section of the configuration file, for example:

[postgres]
user = "my_user"
host = "trase-db-instance.c6jsbtgl0u2s.eu-west-1.rds.amazonaws.com"
dbname = "trase"
options = "-c default_transaction_read_only=on"

Or you could set it as an environment variable:

export TRASE_POSTGRES__user=my_user

You can find a full list of libpq connection parameters at the PostgreSQL documentation.
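
The environment variable form can be read as a flattened version of the TOML structure. The sketch below shows one plausible mapping; it is not Trase's actual parser. It assumes that the TRASE_ prefix is stripped, that a double underscore separates nesting levels, that keys are lower-cased, and that values stay as strings. The function name config_from_env and the chunk size value are made up for illustration.

```python
from typing import Mapping


def config_from_env(environ: Mapping[str, str], prefix: str = "TRASE_") -> dict:
    """Turn prefixed environment variables into a nested dictionary.

    Assumptions (not confirmed by the source): "__" separates nesting
    levels, keys are lower-cased, and values are kept as strings.
    """
    config: dict = {}
    for name, value in environ.items():
        if not name.startswith(prefix):
            continue
        *sections, key = name[len(prefix):].lower().split("__")
        node = config
        for section in sections:
            node = node.setdefault(section, {})
        node[key] = value
    return config


example = {
    "TRASE_POSTGRES__user": "my_user",
    "TRASE_AWS__PROFILE_NAME": "readonly",
    "TRASE_CHUNK_SIZE": "1000",  # illustrative value only
    "PGSERVICE": "github",  # no TRASE_ prefix, so it is ignored here
}
print(config_from_env(example))
```

In a real session you would pass os.environ instead of the example dictionary.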

Debugging Database Configuration

If your database connection is not working as you expect, try the following:

Start a new Python session. Configuration is only applied once when the Python session starts.
Run trase config show and confirm that the configuration is what you expect. (See Using the Trase CLI).
Run the following Python code to confirm which database you are connected to:
```
from trase.tools.pcs.connect import CNX

# Print the underlying connection to see which database this session uses
print(CNX.cnx)
```

AWS Configuration

See the codebase README for instructions.

You can find a full list of boto3 session parameters at the boto3 documentation.

Google Cloud Configuration

The Trase configuration file does not have any configuration options for Google Cloud. See the codebase README for instructions.

DuckDB Configuration

See DuckDB.

Configuration Layers

The final configuration is built up by applying the following layers, if they exist, from bottom to top:

Layer	Trase-specific	AWS / boto3	PostgreSQL / libpq
0	-	boto3 configuration files ⭐️ ~/.aws/config, ~/.aws/credentials	libpq configuration files ⭐️ pg_service.conf, pgpass
1	-	boto3 environment variables AWS_SECRET_ACCESS_KEY, AWS_PROFILE, ...	libpq environment variables PGSERVICE, PGPASSWORD, ...
2	Default trase.toml the (version-controlled!) trase/default.toml in this repository
3	System-wide trase.toml Linux: /etc/xdg/trase/trase.toml Mac: /Library/Application Support/trase/trase.toml Windows: C:\ProgramData\trase\trase.toml
4	User-specific trase.toml ⭐ Linux: ~/.config/trase/trase.toml Mac: ~/Library/Application Support/trase/trase.toml Windows: C:\Users\<user>\AppData\Roaming\trase\trase\trase.toml
5	Repository trase.toml A trase.toml at the root of the repository (not version-controlled!)
6	Current-working-directory trase.toml
7	Trase environment variables TRASE_CHUNK_SIZE, TRASE_AWS__PROFILE_NAME, ...

Why are there so many layers?

Layers 0 and 1 allow you to share your configuration with other citizens of the system that use boto3 or libpq, such as psql.
Layer 2 is necessary to provide default values for Trase.
Layers 3 and 4 are for services such as JupyterHub, where there are system-wide settings with user overrides.
Layer 5 is convenient for developers who may be working on multiple checkouts of the repository at the same time.
Layer 6 is convenient for overrides for Jupyter notebooks and other scripts.
Layer 7 is convenient for deploying Trase services to the cloud, following the twelve-factor app philosophy.

Whilst this allows for considerable flexibility, the recommended setup for a developer is to use the three starred (⭐️) entries: the boto3 and libpq configuration files in layer 0 and the user-specific trase.toml in layer 4. To see the final configuration for your system you can use the command trase config show.
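
The idea of applying layers from bottom to top can be sketched as a fold over dictionaries, where each higher layer overrides the ones below it. This is a sketch under assumptions, not Trase's implementation: it assumes nested tables are merged key by key rather than replaced wholesale, and the layer contents are hypothetical.

```python
from functools import reduce


def merge(lower: dict, upper: dict) -> dict:
    """Return a new dict in which values from `upper` override `lower`.

    Nested tables are merged key by key rather than replaced wholesale.
    """
    merged = dict(lower)
    for key, value in upper.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge(merged[key], value)
        else:
            merged[key] = value
    return merged


# Hypothetical contents for three of the layers, lowest first.
layers = [
    # layer 2: defaults
    {"chunk_size": 1000, "postgres": {"service": "default"}},
    # layer 4: user-specific trase.toml
    {"postgres": {"service": "github"}, "aws": {"profile_name": "readonly"}},
    # layer 7: Trase environment variables
    {"postgres": {"user": "my_user"}},
]

final = reduce(merge, layers, {})
print(final)
```

Here the user layer replaces the default service, while the environment layer adds a user without discarding the service chosen below it.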

Testing

Part of the test suite requires a connection to a PostgreSQL database. By default, an ephemeral PostgreSQL server will be created for each test suite run. However, you can also configure the test suite to use a PostgreSQL server that is already running:

[test]
postgres_server = "system"

    # optional connection parameters...
    [test.postgres]
    user = "myself"

This is particularly useful on Windows, where the ephemeral PostgreSQL server does not work.

Logging

You can also store logging configuration in the Trase configuration file.

Applying Logging Configuration

Logging configuration is applied when the function trase.config.configure_logging runs.

This function should generally not be called in Trase code. Most Trase code is intended to be used as a library, imported into other code, whereas configuring logging is an application-level concern.

For local development, you don't need to worry about this. The configure_logging function will be executed for every invocation of Python in the poetry environment by virtue of the file "sitecustomize.py" that lives at the codebase root.

For Jupyter, use the magic command %configure_logging in the first cell.

How Python Logging Works

Logging in Python can be a little confusing. There are a few key concepts:

Loggers operate in a parent/child hierarchy determined by "dot" notation on their name (e.g. trase.tools.pcs is a child of trase.tools), and there is one "root" logger.
There are five default logging levels, in increasing severity: DEBUG, INFO, WARNING, ERROR, and CRITICAL. Each level includes all more severe ones, so a logger set to ERROR also emits CRITICAL messages.
If not set, the logging level (info/debug/etc.) of a logger is determined by the first parent that has a logging level.

First, we determine whether logging should occur. Only the level of the logger or the first parent with a level is taken into account:

                            no           ┌────────────────────────────┐
                    ┌─────────────────   Does the logger have a level?
                    │                    └────────────────────────────┘
                    ▽                                   │ yes
       Find first parent with level                     ▽
                    │                    ┌────────────────────────────┐
                    └────────────────▷   Is the logging level higher?    ──────▷ stop
                                         └────────────────────────────┘   no
                                                        │
                                                        ┊  yes

Once that is decided, we look for handlers to pass the message to. Note that once we have decided, the levels of any parents are no longer relevant:

                                                        ┊
                                                        │
                                                        ▽
                   ┌─────────────────▷     Pass to handler of logger
                   │                                    │
       set current logger to parent                     │
                   △                                    ▽
                   │                     ┌────────────────────────────┐
                   └──────────────────     Is propagate set to true?     ──────▷ stop
                           yes           └────────────────────────────┘   no

It is all a bit nuanced and not necessarily intuitive at the beginning! To understand this better see the official Python documentation or Understanding Python’s logging module.
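
The two steps can be seen in a small experiment using only the standard library. The logger names below are hypothetical and chosen so they do not interfere with real Trase loggers; ListHandler is a helper written for this demonstration.

```python
import logging


class ListHandler(logging.Handler):
    """Collect the messages it receives so they can be inspected."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())


# Hypothetical logger names, used only for this demonstration.
parent = logging.getLogger("demo")
child = logging.getLogger("demo.tools")
grandchild = logging.getLogger("demo.tools.pcs")

parent.setLevel(logging.ERROR)
child.setLevel(logging.DEBUG)
# `grandchild` has no level of its own, so it inherits DEBUG from `child`.

handler = ListHandler()
parent.addHandler(handler)
parent.propagate = False  # stop here so nothing reaches the root logger

print(logging.getLevelName(grandchild.getEffectiveLevel()))

# Step 1: the level check uses DEBUG (from `child`), so the message passes.
# Step 2: the record propagates up to `parent`, whose handler receives it
# even though `parent` itself is set to ERROR.
grandchild.debug("reached the parent's handler")

# Turning off propagation on `child` cuts the path to the parent's handler.
child.propagate = False
grandchild.debug("dropped: no handler reachable")

print(handler.messages)
```

Only the first message ends up in the handler: the ERROR level on the parent logger plays no part once the level check has passed, whereas switching off propagation does.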

Debugging Logging Configuration

There are two commands that will help you debug logging configuration. The command trase config loggers will print the hierarchy of all loggers:

$ trase config loggers
root WARNING
    ↳ urllib3 <NullHandler (NOTSET)>
        ↳ urllib3.poolmanager
        ↳ urllib3.util
            ↳ urllib3.util.retry
        ↳ urllib3.connection
        ↳ urllib3.response
        ↳ urllib3.connectionpool
    ↳ dotenv
        ↳ dotenv.main
...

The command trase config logger <logger name> <level> helps you understand whether a logging message will be emitted, and if so, to which handlers:

$ trase config logger trase.tools.pcs WARNING
trase.tools.pcs  has effective level WARNING, which is enabled for WARNING
trase.tools.pcs  Emit to <FileHandler ~/Library/Logs/trase/debug.log (NOTSET)>
trase            No handlers
root             No handlers
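
If you are curious how such a report can be produced, the following sketch walks the logger hierarchy with the standard logging module. It is not the implementation of trase config logger, and its output format only loosely resembles the one above; the function name explain and the logger names are hypothetical.

```python
import logging


def explain(name: str, level: int) -> list[str]:
    """Describe whether a message at `level` would be emitted, and where."""
    logger = logging.getLogger(name)
    effective = logging.getLevelName(logger.getEffectiveLevel())
    wanted = logging.getLevelName(level)
    if not logger.isEnabledFor(level):
        return [f"{name} has effective level {effective}, which is disabled for {wanted}"]

    lines = [f"{name} has effective level {effective}, which is enabled for {wanted}"]
    current = logger
    while current is not None:
        # A handler can still filter by its own level.
        targets = [h for h in current.handlers if level >= h.level]
        lines.append(f"{current.name}: {targets if targets else 'No handlers'}")
        # Stop climbing as soon as propagation is switched off.
        current = current.parent if current.propagate else None
    return lines


# Hypothetical logger set up for the demonstration.
top = logging.getLogger("demo2")
top.setLevel(logging.WARNING)
top.propagate = False
top.addHandler(logging.NullHandler())

for line in explain("demo2.tools.pcs", logging.WARNING):
    print(line)
```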

Examples of Logging Configuration

Here are some example logging configurations that you can put in your trase.toml file:

Here we log at WARNING level for trase.* and DEBUG for trase.tools.aws.*, emitting to the console:

[logging.loggers]
"trase" = { level = "WARNING" }
"trase.tools.aws" = { level = "DEBUG" }
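
For comparison, here is roughly the same setup expressed directly with Python's logging.config.dictConfig. This assumes that the [logging] section follows the standard dictConfig schema, which this page does not state explicitly. The handler name console is hypothetical, as is the child logger trase.tools.aws.s3.

```python
import logging
import logging.config

LOGGING = {
    "version": 1,
    "disable_existing_loggers": False,
    "handlers": {
        # Hypothetical handler name; Trase's own handlers are not listed here.
        "console": {"class": "logging.StreamHandler", "stream": "ext://sys.stdout"},
    },
    "loggers": {
        "trase": {"level": "WARNING", "handlers": ["console"]},
        "trase.tools.aws": {"level": "DEBUG"},
    },
}

logging.config.dictConfig(LOGGING)

aws = logging.getLogger("trase.tools.aws.s3")  # hypothetical child logger
other = logging.getLogger("trase.tools.pcs")

# Loggers under trase.tools.aws inherit DEBUG; everything else under trase
# inherits WARNING.
print(aws.isEnabledFor(logging.DEBUG))
print(other.isEnabledFor(logging.DEBUG))
```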

Here we log at DEBUG level for trase.tools.pcs.*, emitting to a file debug.log:

[logging.loggers]
"trase.tools.pcs" = { level = "DEBUG", handlers = [ "debugLogHandler" ] }

The location of debug.log is as follows:

Mac	~/Library/Logs/trase/debug.log
Linux	~/.cache/trase/log/debug.log
Windows	C:\Users\MyUserName\AppData\Local\trase\Logs\debug.log
