This page is synchronized from doc/Healthchecks.md. Last modified on 2025-12-09 00:30 CET by Trase Admin.
Please view or edit the original file there; changes should be reflected here after the midnight build (CET),
or after manually triggering the build with a GitHub Action.
Healthchecks
Various parts of the Trase system run automatically, for example as GitHub Actions or cron jobs.
If any of these jobs stop running or working for whatever reason, we want to know as soon as possible so that we can fix it.
This is where our use of Healthchecks.io comes in. Healthchecks.io provides alerts if a job does not register itself as succeeding on an expected schedule. For example, we might expect that a daily job reports itself as succeeding every day at 9am: if this report does not come in, we assume it has failed and we get an alert.
You create a set of "healthchecks" on the Healthchecks.io website, each consisting of:
- A public URL, e.g. https://hc-ping.com/xxxx-xxxx
- A cron schedule on which the URL expects to be pinged, e.g. once every hour
- Something to notify if the ping is not received in time, e.g. a Slack channel
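To make the three parts above concrete, here is a toy model of a single check, written as a minimal sketch. It is not how Healthchecks.io is implemented: the class name `ToyCheck`, the fixed `period` (standing in for a cron schedule), and the `grace` allowance are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class ToyCheck:
    """Hypothetical, simplified model of one healthcheck."""

    ping_url: str                 # the public URL the job pings
    period: timedelta             # how often a ping is expected (stand-in for a cron schedule)
    grace: timedelta              # assumed extra slack before the check counts as late
    last_ping: Optional[datetime] = None

    def record_ping(self, when: datetime) -> None:
        # Called whenever the job reports success by pinging the URL.
        self.last_ping = when

    def is_late(self, now: datetime) -> bool:
        # A check that has never been pinged is treated as "not started" here.
        if self.last_ping is None:
            return False
        # Late once the expected period plus the grace allowance has elapsed.
        return now > self.last_ping + self.period + self.grace
```

In this model, a daily job that last pinged at 9am is considered late, and would trigger a notification, once a full day plus the grace allowance has passed without another ping.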
We have an account with the email trase-admin@sei.org and notifications go to this email address as well as the #alerts channel on Slack.
How to create a new healthcheck
To do this you will need access to the email address trase-admin@sei.org.
- Visit https://www.healthchecks.io
- Sign in > enter trase-admin@sei.org for the "email me a magic link" option
- Open the link in the email that was sent
- Click "Add Check"
- Edit the check as follows:
  - Set the name to something descriptive
  - Set the description to something useful, for example a link to the script that pings the healthcheck, to help a future user debug a failure
  - Set the schedule to match the scheduled job the healthcheck is for
  - Consider adding a tag to help us group the healthchecks together
How to use a healthcheck
Using the healthcheck is as simple as executing an HTTP GET request:
curl --retry 3 "https://hc-ping.com/xxxx-xxxx"
If your job is a Bash or Python script, check out the send_healthcheck_if_enabled function in the appropriate language.
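For jobs where that function is not available, the curl command above can be approximated in Python using only the standard library. The following is a minimal sketch, not the project's `send_healthcheck_if_enabled`: the name `ping_healthcheck` and its parameters are hypothetical. Unlike curl, which retries only on transient errors, this sketch retries on any network or HTTP error.

```python
import time
import urllib.request


def ping_healthcheck(url, retries=3, timeout=10, delay=1.0,
                     opener=urllib.request.urlopen):
    """Send an HTTP GET to a healthcheck ping URL; return True on success.

    Hypothetical helper. `retries` counts extra attempts after the first,
    loosely mirroring `curl --retry 3`. `opener` is injectable for testing.
    """
    for attempt in range(retries + 1):
        try:
            with opener(url, timeout=timeout) as response:
                if 200 <= response.status < 300:
                    return True
        except OSError:
            # urllib.error.URLError and HTTPError are subclasses of OSError,
            # so this covers connection failures, timeouts and HTTP errors.
            pass
        if attempt < retries:
            time.sleep(delay)  # brief pause before the next attempt
    return False


# Example usage at the end of a job (placeholder URL from this page):
#   if not ping_healthcheck("https://hc-ping.com/xxxx-xxxx"):
#       print("warning: could not reach the healthcheck URL")
```

Returning a boolean rather than raising means a temporary outage of the ping endpoint does not by itself cause an otherwise successful job to fail; whether that is the desired behaviour depends on the job.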