- Local development
- Deployment
- Testing
- django-upgrade
- Components
- Backends
- Rotating the GitHub token
- Interactive testing
- Dumping co-pilot reporting data
- Ensuring paired field state with CheckConstraints
- Auditing events
- Interfaces
Note: you will need the Bitwarden CLI tool installed in order to access the shared passwords; it is not a requirement for the rest of local development.
- Create a `.env` file; there is an existing `dotenv-sample` template that you can use to base your own `.env` file on.
- Use `bw` to log in to the Bitwarden account.
- When logged in to Bitwarden, run `scripts/dev-env.sh .env` to retrieve and write the credentials to the target environment file specified. `.env` is already in `.gitignore` to help prevent an accidental commit of credentials.
- Python v3.12.x
- virtualenv
- Pip
- Node.js v20.x (fnm is recommended)
- npm v7.x
- Postgres
Each `just` command sets up a dev environment as part of running it.
If you want to maintain your own virtualenv, make sure you have activated it before running a `just` command and it will be used instead.
Recommended: use Docker to provide PostgreSQL:
just docker/db
Double check your .env has the right config to talk to this docker instance:
DATABASE_URL=postgres://user:pass@localhost:6543/jobserver
Alternatively, you can install and configure postgresql natively for your OS following the instructions below.
Postgres.app is the easiest way to run Postgres on macOS; you can install it from Homebrew (casks) with:
brew install --cask postgres-unofficial
You will need to add its bin directory to your path for the CLI tools to work.
Postico is a popular GUI for Postgres:
brew install --cask postico
Install postgresql with your package manager of choice.
This guide explains how to set up ident (password-less) auth, and set some options for faster, but more dangerous, performance.
If you need to upgrade an installation the ArchWiki is a good reference.
You'll need a database in Postgres to work with; run:
psql -c "CREATE DATABASE jobserver"
On Linux, you'll also need to create the user with relevant permissions:
psql -c "
CREATE ROLE jobsuser PASSWORD 'pass' NOSUPERUSER CREATEDB;
GRANT ALL PRIVILEGES on database jobserver to jobsuser;
"
Copies of production can be restored to a local database using a dump pulled from production. If you do not have access to pull production backups, follow the data setup section instead of restoring a backup.
Backups can be copied with:
scp dokku4:/var/lib/dokku/data/storage/job-server/jobserver.dump jobserver.dump
If using the provided Docker db, you just need to run the following (note this will wipe your current dev db):
just docker/restore-db jobserver.dump
If using a manual install, you can restore with:
pg_restore --clean --if-exists --no-acl --no-owner -d jobserver jobserver.dump
Note: This assumes ident auth (the default in Postgres.app) is set up.
Note: `pg_restore` will throw errors in various scenarios, which can often be ignored. The important line to check for (typically at the very end) is `errors ignored on restore: N`, where `N` should match the number of errors you got.
Set up an environment:
just devenv
Run migrations:
python manage.py migrate
Build the assets:
See the Compiling assets section.
Run the dev server:
just run
Access at localhost:8000
Run `just docker-serve`.
Note: The dev server inside the container does not currently rebuild the frontend assets when changes to them are made.
This project uses Vite, a modern build tool and development server, to build the frontend assets. Vite integrates into the Django project using the django-vite package.
Vite works by compiling JavaScript files, and outputs a manifest file, the JavaScript files, and any included assets such as stylesheets or images.
For styling this project uses Tailwind CSS, and then PostCSS for post-processing.
Vite has a built-in development server which will serve the assets and reload them on save.
To run the development server:
- Update the `.env` file to `ASSETS_DEV_MODE=True`
- Run `just assets-run`
This will start the Vite dev server at localhost:5173 and inject the relevant scripts into the Django templates.
To view the compiled assets:
- Update the `.env` file to `ASSETS_DEV_MODE=False`
- Run `just assets-rebuild`
Vite builds the assets and outputs them to the `assets/dist` folder.
The Django Staticfiles app then collects the files and places them in the `staticfiles/assets` folder, with the manifest file located at `assets/dist/.vite/manifest.json`.
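As a rough illustration of how the `ASSETS_DEV_MODE` toggle and the Vite dev server fit together, here is a minimal sketch of a django-vite configuration. The setting names under `DJANGO_VITE` follow django-vite 3.x conventions; the wiring shown is an assumption, not this project's actual settings, so check the project's settings module for the real configuration.

```python
# settings.py -- illustrative sketch only, not this project's real settings.
import os

# ASSETS_DEV_MODE comes from the .env file (see the steps above).
ASSETS_DEV_MODE = os.environ.get("ASSETS_DEV_MODE", "False") == "True"

DJANGO_VITE = {
    "default": {
        # True: templates load assets from the Vite dev server (localhost:5173)
        # False: templates load the compiled assets via the manifest file
        "dev_mode": ASSETS_DEV_MODE,
        "dev_server_port": 5173,
    }
}
```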
Sometimes it's useful to have a fresh local installation or you may not have authorization to download a production backup. In that situation you can follow the steps below to set up your local copy of the site:
- Create a GitHub OAuth application.
  - The callback URL must be `http://localhost:8000/complete/github/`.
  - The other fields don't matter too much for local development.
- Register a user account on your local version of job-server by clicking Login.
- Set the `SOCIAL_AUTH_GITHUB_KEY` (aka "Client ID") and `SOCIAL_AUTH_GITHUB_SECRET` environment variables with values from that OAuth application.
- Give your user the `StaffAreaAdministrator` role by running: `python manage.py create_user <your_username> -s`
- Click on your avatar in the top right-hand corner of the site to access the Staff Area.
- Create an Org, Project, and Backend in the Staff Area.
- On your User page in the Staff Area, link it to the Backend and Org you created.
- Assign your user account to the Project with the `ProjectDeveloper` and `ProjectCollaborator` roles on the Project page within the Staff Area.
- Navigate to the Project page in the main site using the "View on Site" button.
- Create a Workspace for the Project.
- Create a JobRequest in the Workspace.
If you need one or more Jobs linked to the JobRequest you will need to create them in the database or with the Django shell.
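For example, a Job can be attached to the JobRequest from the Django shell (`python manage.py shell`). This is a minimal sketch only; the field values below (`identifier`, `action`, `status`) are assumptions for illustration, so check `jobserver/models` for the actual required fields.

```python
# Run inside `python manage.py shell`.
# Field names are assumptions -- check jobserver/models for the real ones.
from jobserver.models import Job, JobRequest

job_request = JobRequest.objects.last()  # the JobRequest created above

Job.objects.create(
    job_request=job_request,
    identifier="deadbeef12345678",  # any unique-looking identifier for local use
    action="run_all",               # hypothetical action name
    status="succeeded",
)
```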
The opentelemetry dependencies need to be upgraded as a group. To do this, bump the relevant versions in `requirements.prod.in` and then attempt to manually resolve the dependencies by upgrading a number of packages simultaneously. A recent example of this is:
$ pip-compile --resolver=backtracking --allow-unsafe --generate-hashes --strip-extras --output-file=requirements.prod.txt requirements.prod.in --upgrade-package opentelemetry-instrumentation --upgrade-package opentelemetry-exporter-otlp-proto-http --upgrade-package opentelemetry-sdk --upgrade-package opentelemetry-instrumentation-django --upgrade-package opentelemetry-instrumentation-psycopg2 --upgrade-package opentelemetry-instrumentation-requests --upgrade-package opentelemetry-instrumentation-wsgi --upgrade-package opentelemetry-semantic-conventions --upgrade-package opentelemetry-util-http --upgrade-package opentelemetry-instrumentation-dbapi --upgrade-package opentelemetry-api --upgrade-package opentelemetry-proto --upgrade-package opentelemetry-exporter-otlp-proto-common
It is currently configured to be deployed Heroku-style, and requires the environment variables defined in `dotenv-sample`.
The Bennett Institute job server is deployed to our `dokku4` instance; instructions are in INSTALL.md.
Run the unit tests:
just test
Run all of the tests (including slow tests) apart from verification tests (which hit external APIs), and run coverage, as is done in CI:
just test-ci
More details on testing can be found in TESTING.md.
`django-upgrade` is used to migrate Django code from older versions to the current version in use.
`django-upgrade` is run via `just django-upgrade`. It also gets run via `just check` and via the `pre-commit` checks.
When upgrading to a new Django minor or major version:
- Ensure `django-upgrade` has been run, and any changes `django-upgrade` makes committed.
- Update the Django version used for the invocation of `django-upgrade` in the `django-upgrade` recipe in the `justfile`.
With a valid bot token, you can run tests and have any Slack messages generated actually sent to a test channel by setting some environment variables:
export SLACK_BOT_TOKEN=...
export SLACK_TEST_CHANNEL=job-server-testing
just test-dev
Job Server uses the Slippers library to build reusable components.
To view the existing components, and see what attributes they receive, visit the UI gallery.
Job Server uses Hero Icons.
To add a new icon:
- Find the icon you need.
- Copy the SVG to a new file in `templates/_icons/`. The website will give you the SVG code rather than a file.
- Edit the properties of that file so that:
  - the `height` and `width` attributes match the values in the `viewBox`
  - the class is configurable: `class="{{ class }}"`
  - `fill` is `currentColor`, unless it's an outline icon, in which case `fill` should be `none` and `stroke` should be `currentColor`
- Map the icon file path to a name in `templates/components.yaml`
Each Backend in this project represents a job runner instance somewhere. They are Django model instances, each with a unique authentication token attached.
This has allowed us some benefits:
- API requests can be tied directly to a Backend (eg get all JobRequests for TPP; see the sketch below).
- Per-Backend API stats collection is trivial because requests are tied to a Backend via auth.
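As a sketch of the first benefit above: because JobRequests carry a reference to their Backend, per-backend queries are plain ORM filters. The `slug` lookup and relation names here are assumptions for illustration, not verified against the models.

```python
# Illustrative queries only; field/relation names are assumptions.
from jobserver.models import Backend, JobRequest

tpp = Backend.objects.get(slug="tpp")

# All JobRequests submitted to the TPP backend
tpp_job_requests = JobRequest.objects.filter(backend=tpp)
```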
- Log into the `opensafely-readonly` GitHub account (credentials are in Bitwarden).
- Go to the Personal access tokens (classic) page.
- Click on `job-server-api-token`.
- Click "Regenerate token".
- Set the expiry to 90 days.
- Copy the new token.
- ssh into `dokku4.ebmdatalab.net`
- Run: `dokku config:set job-server JOBSERVER_GITHUB_TOKEN=<the new token>`
- Log into the `opensafely-interactive-bot` GitHub account (credentials are in Bitwarden).
- Go to the opensafely-interactive-token.
- Click "Regenerate token".
- Set the expiry to 90 days.
- Copy the new token.
- ssh into `dokku4.ebmdatalab.net`
- Run: `dokku config:set job-server INTERACTIVE_GITHUB_TOKEN=<the new token>`
Job Server uses the interactive-templates repo code, imported as a Python package, to run OS Interactive analyses and to generate reports.
To facilitate local testing, the osi_run
Django management command has been created to produce a report from an Analysis Request. It's used like this:
python manage.py osi_run <analysis-request-slug>
The resulting HTML report is output into the workspaces
directory and can be released, so that it's visible within Job Server, using the osi_release
management command:
python manage.py osi_release <analysis-request-slug> <user-name> --report workspaces/<analysis-request-pk>/report.html
These two actions can be combined using the osi_run_and_release
management command:
python manage.py osi_run_and_release <analysis-request-slug> <user-name>
Alternatively, the osi_release
command can be used without running an analysis first, for fast development, using a fake report:
python manage.py osi_release <analysis-request-slug> <user-name>
Co-pilots have a report they run every few months, building on data from this service.
To produce a dump in the format they need, install db-to-sqlite via pip, pipx, or your installer of choice.
You will also need to set the `DATABASE_URL` environment variable.
Then run `just dump-co-pilot-reporting-data`.
We have various paired fields in our database models. These are often, but not limited to, fields which track who performed an action and when they performed it. It's useful to be able to ensure these related fields are in the correct state.
Enter Django's CheckConstraint, which allows us to encode that relationship at the database level.
We can set these in a model's Meta and use a `Q` object for the `condition` kwarg.
See the common patterns section below for some examples.
This example shows how you can ensure both fields are set or null. This is our most common usage at the time of writing.
With some fields that look like this:
frobbed_at = models.DateTimeField(null=True)
frobbed_by = models.ForeignKey(
"jobserver.User",
on_delete=models.CASCADE,
related_name="my_model_fobbed",
null=True
)
Your CheckConstraint which covers both states looks like this:
class Meta:
constraints = [
models.CheckConstraint(
condition=(
Q(
frobbed_at__isnull=True,
frobbed_by__isnull=True,
)
| (
Q(
frobbed_at__isnull=False,
frobbed_by__isnull=False,
)
)
),
name="%(app_label)s_%(class)s_both_frobbed_at_and_frobbed_by_set",
),
]
You can then test these constraints like so:
def test_mymodel_constraints_frobbed_at_and_frobbed_by_both_set():
MyModelFactory(frobbed_at=timezone.now(), frobbed_by=UserFactory())
def test_mymodel_constraints_frobbed_at_and_frobbed_by_neither_set():
MyModelFactory(frobbed_at=None, frobbed_by=None)
@pytest.mark.django_db(transaction=True)
def test_mymodel_constraints_missing_frobbed_at_or_frobbed_by():
with pytest.raises(IntegrityError):
MyModelFactory(frobbed_at=None, frobbed_by=UserFactory())
with pytest.raises(IntegrityError):
MyModelFactory(frobbed_at=timezone.now(), frobbed_by=None)
This is very similar to the pattern above, except we use auto_now=True
and don't allow nulls in the fields, which means we don't have to account for nulls in the constraint:
updated_at = models.DateTimeField(auto_now=True)
updated_by = models.ForeignKey(
"jobserver.User",
on_delete=models.PROTECT,
related_name="my_model_updated",
)
class Meta:
constraints = [
models.CheckConstraint(
condition=Q(updated_at__isnull=False, updated_by__isnull=False),
name="%(app_label)s_%(class)s_both_updated_at_and_updated_by_set",
),
]
The use of auto_now
also changes how we test this constraint.
It cannot be overridden when using any part of the ORM which touches save()
because it's set there.
So we lean on update()
instead:
def test_mymodel_constraints_updated_at_and_updated_by_both_set():
MyModelFactory(updated_by=UserFactory())
@pytest.mark.django_db(transaction=True)
def test_mymodel_constraints_missing_updated_at_or_updated_by():
with pytest.raises(IntegrityError):
MyModelFactory(updated_by=None)
with pytest.raises(IntegrityError):
mymodel = MyModelFactory(updated_by=UserFactory())
# use update to work around auto_now always firing on save()
MyModel.objects.filter(pk=mymodel.pk).update(updated_at=None)
We track events that we want in our audit trail with the AuditableEvent model. It avoids foreign keys so that any related model isn't blocked from being deleted.
As such, constructing these models can be a little onerous, so we have started wrapping the triggering event, eg adding a user to a project, with a function that both performs that action and sets up the AuditableEvent instance. These are currently called commands because naming things is hard, and they will, we hope, be better organised in the near future into a domain layer. In the meantime, it's just useful to know that creating AuditableEvent instances can be easier.
Since AuditableEvents have no relationships to the models they record changes in, we have to manually look up those models, where we can, for display in the UI. The presenters package exists to handle all of this. There are a few key parts to it.
AuditableEvents have a `type` field which tracks the event type they were created for.
get_presenter() takes an AuditableEvent instance and returns the relevant presenter function, or raises an UnknownPresenter exception.
Presenter functions take an AuditableEvent instance and trust that the caller is passing in one relevant to that function.
At the time of writing we only display events in the staff area so there is no way to change how presenters build their context, or what template they choose. We're aware that we might want to display them as a general feed on the site. If this turns out to be the case the author's expectation is that we will use inversion of control so the calling view can decide the context in which presenters are used. This will most likely affect the template used for each event, and where each object links to, if anywhere.
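To make the shape of this concrete, here is a heavily simplified sketch of the dispatch pattern described above. Apart from `get_presenter` and `UnknownPresenter`, every name is hypothetical; the real implementations live in the presenters package.

```python
# A simplified sketch of the presenter dispatch; not the real implementation.


class UnknownPresenter(Exception):
    """Raised when no presenter is registered for an event's type."""


def present_project_member_added(event):
    # Presenter functions trust the caller to pass an AuditableEvent of the
    # matching type; they build the context a template needs to render it.
    return {
        "template_name": "staff/audit/project_member_added.html",  # hypothetical path
        "context": {"event": event},
    }


# Hypothetical registry mapping AuditableEvent.type values to presenter functions.
PRESENTERS = {
    "project_member_added": present_project_member_added,
}


def get_presenter(event):
    # Look up the presenter for this event's type, raising for unknown types.
    try:
        return PRESENTERS[event.type]
    except KeyError:
        raise UnknownPresenter(f"No presenter for event type: {event.type}")
```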
Descriptions of interfaces between this repo or container and others. These interfaces can be changed through coordination with the relevant teams.
Except where mentioned otherwise, URLs are relative to the root of the Job Server API endpoint (https://jobs.opensafely.org/api/v2 in production).
Management commands might be used in other repos' tooling and CI. While not required, it's helpful to check for downstream impacts if you change their API.
Job Runner is a container that runs in a secure backend. It executes JobRequests initiated by users of Job Server.
This interacts with jobserver/api/jobs.py. It uses the `JobRequestAPIList` endpoint (`GET /job-requests/`) for reading `JobRequest`s. It uses the `JobAPIUpdate` endpoint (`POST /jobs/`) for updating the `Job` table.
(Current as of 2024-09.)
Refer to the documentation of jobrunner.sync for Job Runner's view of this interface.
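As a rough sketch of the client side of this interface (not Job Runner's actual code, which lives in jobrunner.sync): the Backend's authentication token is sent with each request, and the two endpoints are read from and posted to roughly as below. The header format and payload shape are assumptions for illustration.

```python
# Illustrative client calls; see jobrunner.sync for the real implementation.
import requests

API_ROOT = "https://jobs.opensafely.org/api/v2"
BACKEND_TOKEN = "..."  # the Backend's unique auth token (kept secret)

# Read the JobRequests this backend should act on.
job_requests = requests.get(
    f"{API_ROOT}/job-requests/",
    headers={"Authorization": BACKEND_TOKEN},  # header format is an assumption
).json()

# Report job state back, updating the Job table.
requests.post(
    f"{API_ROOT}/jobs/",
    headers={"Authorization": BACKEND_TOKEN},
    json=[{"identifier": "deadbeef12345678", "status": "running"}],  # illustrative payload
)
```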
Airlock is a container that runs in a secure backend. Researchers interact with it to view moderately sensitive outputs produced by Job Runner, to view log output from jobs, and to create requests to release files. Users with the OutputChecker role interact with it to review such release requests and to manage the release of files to Job Server.
Airlock refers to Job Server's permissions model to determine what users can do. The code it needs is in jobserver/api/releases.py. The endpoints it uses are `Level4TokenAuthenticationAPI` (`GET /releases/authenticate/`) and `Level4AuthorisationAPI` (`GET /releases/authorise/`). It receives the results of `build_level4_user` to determine whether a user is an OutputChecker, and which workspaces they can access. (Current as of 2024-09.)
When releases are approved, Airlock triggers creation of a Release for the associated Workspace on Job Server through the jobserver/api/releases.py `ReleaseWorkspaceAPI` endpoint (`POST /releases/workspace/{workspace_name}`).
Files are uploaded from Airlock to Job Server through the `ReleaseAPI` endpoint (`POST releases/release/{release_id}`). (Current as of 2024-09.)
Notifications of events related to release requests are triggered through the airlock_event_view endpoint (`POST /airlock/events/`), which is currently the only responsibility of the airlock app within Job Server. Depending on the event, users are notified by email, Slack, or by creating/updating GitHub issues. (Current as of 2024-09.)