Merge pull request #1811 from opensafely-core/v1
Release ehrQL `v1`
inglesp authored Dec 8, 2023
2 parents 605531e + 5286f57 commit f7f8e39
Showing 108 changed files with 1,430 additions and 855 deletions.
5 changes: 0 additions & 5 deletions .github/workflows/build-and-deploy.yml
Original file line number Diff line number Diff line change
@@ -16,7 +16,6 @@ jobs:
runs-on: ubuntu-latest
env:
image: ghcr.io/opensafely-core/ehrql
aliased_image: ghcr.io/opensafely-core/databuilder
steps:
- uses: actions/checkout@v4
with:
@@ -110,9 +109,6 @@ jobs:
--tag ${{ env.image }}:$MAJOR \
--tag ${{ env.image }}:$MINOR \
--tag ${{ env.image }}:$PATCH \
--tag ${{ env.aliased_image }}:$MAJOR \
--tag ${{ env.aliased_image }}:$MINOR \
--tag ${{ env.aliased_image }}:$PATCH
- name: Log into GitHub Container Registry
if: ${{ startsWith(steps.taggedcommit.outputs.tag, 'y') }}
@@ -122,4 +118,3 @@ jobs:
if: ${{ startsWith(steps.taggedcommit.outputs.tag, 'y') }}
run: |
docker push --all-tags ${{ env.image }}
docker push --all-tags ${{ env.aliased_image }}
2 changes: 0 additions & 2 deletions Dockerfile
@@ -128,8 +128,6 @@ FROM ehrql-base as ehrql
# comment above
COPY ehrql /app/ehrql
RUN python -m compileall /app/ehrql
COPY databuilder /app/databuilder
RUN python -m compileall /app/databuilder
COPY bin /app/bin

# The following build details will change.
1 change: 0 additions & 1 deletion databuilder/codes.py

This file was deleted.

1 change: 0 additions & 1 deletion databuilder/ehrql.py

This file was deleted.

Empty file removed databuilder/tables/__init__.py
Empty file.
Empty file.
1 change: 0 additions & 1 deletion databuilder/tables/beta/smoketest.py

This file was deleted.

1 change: 0 additions & 1 deletion databuilder/tables/beta/tpp.py

This file was deleted.

12 changes: 6 additions & 6 deletions docs/explanation/backend-tables.md
@@ -28,10 +28,10 @@ Make the core tables available for use in a dataset definition
with import statements like:

```python
from ehrql.tables.beta.core import medications, patients
from ehrql.tables.core import medications, patients
```

where the `ehrql.tables.beta.core` specifies that we are using the core tables.
where the `ehrql.tables.core` specifies that we are using the core tables.

## Backend-specific tables

@@ -54,7 +54,7 @@ For example, for TPP-specific tables,
we use `tpp` in the import statement:

```python
from ehrql.tables.beta.tpp import addresses, patients
from ehrql.tables.tpp import addresses, patients
```

:notepad_spiral: In this example,
@@ -73,10 +73,10 @@ in the [writing a dataset definition](../tutorial/writing-a-dataset-definition/i
we used the interactive ehrQL sandbox with the following statement to start with:

```python
>>> from ehrql.tables.beta.core import patients, medications
>>> from ehrql.tables.core import patients, medications
```

* `beta.core` is the *table schema*
* `core` is the *table schema*
* `patients` and `medications` are the *table names*

We also accessed *table columns*
@@ -99,6 +99,6 @@ The table schema reference explains:
* whether table columns contain at most one row per patient,
or may contain multiple rows per patient

:grey_question: Consult the [`beta.core`](../reference/schemas/beta.core.md) schema.
:grey_question: Consult the [`core`](../reference/schemas/core.md) schema.
Choose any of the tables there
and understand its structure from the schema.
4 changes: 2 additions & 2 deletions docs/explanation/measures.md
@@ -17,7 +17,7 @@ Suppose we want to know what proportion of the patients prescribed atorvastatin

```python
from ehrql import INTERVAL, case, create_measures, months, when
from ehrql.tables.beta.core import medications, patients
from ehrql.tables.core import medications, patients

# Every measure definitions file must include this line
measures = create_measures()
@@ -68,7 +68,7 @@ measures.define_measure(

You can save this file as `measure_definition.py` and then run the [`generate-measures`](../reference/cli.md#generate-measures) command on it:
```
opensafely exec ehrql:v0 generate-measures measure_definition.py --output measures.csv
opensafely exec ehrql:v1 generate-measures measure_definition.py --output measures.csv
```

### Results
6 changes: 3 additions & 3 deletions docs/explanation/output-formats.md
@@ -63,13 +63,13 @@ Instead, the output is displayed at the command line.
#### `.arrow`

```
opensafely exec ehrql:v0 generate-dataset "./dataset-definition.py" --dummy-tables "example-data/" --output "./outputs/data_extract.arrow"
opensafely exec ehrql:v1 generate-dataset "./dataset-definition.py" --dummy-tables "example-data/" --output "./outputs/data_extract.arrow"
```

#### `.csv.gz`

```
opensafely exec ehrql:v0 generate-dataset "./dataset-definition.py" --dummy-tables "example-data/" --output "./outputs/data_extract.csv.gz"
opensafely exec ehrql:v1 generate-dataset "./dataset-definition.py" --dummy-tables "example-data/" --output "./outputs/data_extract.csv.gz"
```

### Example `project.yaml`
@@ -82,7 +82,7 @@ expectations:

actions:
extract_data:
run: ehrql:v0 generate-dataset "./dataset_definition.py" --output "outputs/data_extract.arrow"
run: ehrql:v1 generate-dataset "./dataset_definition.py" --output "outputs/data_extract.arrow"
outputs:
highly_sensitive:
population: outputs/data_extract.arrow
18 changes: 9 additions & 9 deletions docs/explanation/running-ehrql.md
@@ -30,7 +30,7 @@ by allowing you to interactively query some dummy tables.
:computer:
To start the sandbox, from the `learning-ehrql` directory, run:

opensafely exec ehrql:v0 sandbox example-data
opensafely exec ehrql:v1 sandbox example-data

You will now be in a session with an interactive Python console,
and you should see something like this:
@@ -58,7 +58,7 @@ For example if you type `1 + 1` and press the return key, you should see:
To use ehrQL, you'll first need to import the tables that you want to interact with:

```pycon
>>> from ehrql.tables.beta.core import patients, medications
>>> from ehrql.tables.core import patients, medications
```

Now, you can inspect the contents of these tables, by entering the names of the tables:
@@ -189,7 +189,7 @@ and save it in your `learning-ehrql` directory:

```ehrql
from ehrql import create_dataset
from ehrql.tables.beta.core import patients, medications
from ehrql.tables.core import patients, medications
dataset = create_dataset()
@@ -214,7 +214,7 @@ Make sure you save the file!
use the command below to run your dataset definition with ehrQL.

```
opensafely exec ehrql:v0 generate-dataset dataset_definition.py --dummy-tables example-data --output output/dataset.csv
opensafely exec ehrql:v1 generate-dataset dataset_definition.py --dummy-tables example-data --output output/dataset.csv
```

:notepad_spiral: ehrQL dataset definitions are written in Python.
@@ -225,8 +225,8 @@ to run the dataset definition.

#### What each part of this command does

* `opensafely exec ehrql:v0` uses the OpenSAFELY CLI to run ehrQL.
The `v0` after the `:` refers to the version of ehrQL being used.
* `opensafely exec ehrql:v1` uses the OpenSAFELY CLI to run ehrQL.
The `v1` after the `:` refers to the version of ehrQL being used.
* `generate-dataset` instructs ehrQL to generate a dataset from the dataset definition.
* `dataset_definition.py` specifies the filename of the dataset definition to use.
* The dataset definition file is in the directory that we are running `opensafely exec`
@@ -276,7 +276,7 @@ patient_id,med_date,med_code
without the `--dummy-tables` and `--output` options:

```
opensafely exec ehrql:v0 generate-dataset dataset_definition.py
opensafely exec ehrql:v1 generate-dataset dataset_definition.py
```

By not specifying the dummy tables to use,
@@ -299,7 +299,7 @@ an error message will be displayed on the screen.
This is one example:

```
$ opensafely exec ehrql:v0 generate-dataset dataset_definition.py --dummy-tables example-data --output output/dataset.csv
$ opensafely exec ehrql:v1 generate-dataset dataset_definition.py --dummy-tables example-data --output output/dataset.csv
2023-04-21 17:53:42 [info ] Compiling dataset definition from dataset_definition.py [ehrql.main]
Failed to import 'dataset_definition.py':
```
@@ -331,7 +331,7 @@ expectations:

actions:
generate_dataset:
run: ehrql:v0 generate-dataset dataset_definition.py --dummy-tables example-data --output output/dataset.csv.gz
run: ehrql:v1 generate-dataset dataset_definition.py --dummy-tables example-data --output output/dataset.csv.gz
outputs:
highly_sensitive:
cohort: output/dataset.csv.gz
2 changes: 1 addition & 1 deletion docs/explanation/selecting-populations-for-study.md
@@ -26,4 +26,4 @@ for further details of this transfer process.
a patient did not change practice during a time period of interest.

For TPP,
there is a [method to select patients with a continuous registration](../reference/schemas/beta.tpp.md#practice_registrations.has_a_continuous_practice_registration_spanning).
there is a [method to select patients with a continuous registration](../reference/schemas/tpp.md#practice_registrations.has_a_continuous_practice_registration_spanning).
2 changes: 1 addition & 1 deletion docs/how-to/assign-multiple-columns.md
@@ -34,7 +34,7 @@ This example adds two new columns to the dataset: `asthma_meds_count` and `other

```ehrql
from ehrql import create_dataset
from ehrql.tables.beta.core import patients, medications
from ehrql.tables.core import patients, medications
asthma_codelist = ["39113311000001107", "39113611000001102"]
other_codelist = ["10000000000000001", "10000000000000002"]
14 changes: 7 additions & 7 deletions docs/how-to/dummy-data.md
@@ -22,7 +22,7 @@ You do not need to add anything to the dataset definition itself in order to gen
dataset in this way. ehrQL will use the dataset definition to set up dummy data and generate
matching patients.

By default, 500 patients will be generated in a dummy dataset. If you need to increase this
By default, ten patients will be generated in a dummy dataset. If you need to increase this
number, you can configure it in the dataset definition with:

```
@@ -50,7 +50,7 @@ For example, take this dataset definition from the tutorial:

```ehrql
from ehrql import create_dataset
from ehrql.tables.beta.core import patients, medications
from ehrql.tables.core import patients, medications
dataset = create_dataset()
@@ -80,7 +80,7 @@ And this dummy dataset, in a CSV file named `dummy.csv`:
Run the dataset definition with the dummy dataset file:

```
opensafely exec ehrql:v0 generate-dataset dataset_definition.py --dummy-data-file dummy.csv
opensafely exec ehrql:v1 generate-dataset dataset_definition.py --dummy-data-file dummy.csv
```

Now, instead of a generated dummy dataset, you'll see the data from the dummy data file that
@@ -127,14 +127,14 @@ to generate an initial dataset, and then modify it as you need.
Run the dataset definition with an output path:

```
opensafely exec ehrql:v0 generate-dataset dataset_definition.py --output dataset.csv
opensafely exec ehrql:v1 generate-dataset dataset_definition.py --output dataset.csv
```

Now you can edit `dataset.csv` as you want, and rerun the dataset definition, using it as the
dummy data file:

```
opensafely exec ehrql:v0 generate-dataset dataset_definition.py --dummy-data-file dataset.csv
opensafely exec ehrql:v1 generate-dataset dataset_definition.py --dummy-data-file dataset.csv
```

## Supply your own dummy tables
@@ -158,7 +158,7 @@ specific tables that are required.
Try this out by running the following command against the simple dataset definition above:

```
opensafely exec ehrql:v0 create-dummy-tables dataset_definition.py dummy-folder
opensafely exec ehrql:v1 create-dummy-tables dataset_definition.py dummy-folder
```

![A screenshot of VS Code, showing the terminal after the `create-dummy-tables` command was run](opensafely_exec_create_dummy_tables.png)
@@ -169,5 +169,5 @@ dataset definition requires - `patients.csv` and `medications.csv`.
Now you can run ehrQL with these generated tables instead:

```
opensafely exec ehrql:v0 generate-dataset dataset_definition.py --dummy-tables dummy-folder
opensafely exec ehrql:v1 generate-dataset dataset_definition.py --dummy-tables dummy-folder
```
6 changes: 3 additions & 3 deletions docs/how-to/dummy-measures-data.md
@@ -19,7 +19,7 @@ You do not need to add anything to the measures definition itself in order to ge
dataset in this way. ehrQL will use the measures definition to set up dummy data and generate
matching patients.

By default, 500 patients will be generated in a dummy measures output. If you need to increase this number, you can configure it in the measures definition with:
By default, ten patients will be generated in a dummy measures output. If you need to increase this number, you can configure it in the measures definition with:

```
measures.configure_dummy_data(population_size=1000)
@@ -47,7 +47,7 @@ For example, take this simple measures definition:
```python
from ehrql import create_measures, years
from ehrql.measures import INTERVAL
from ehrql.tables.beta.core import patients, clinical_events
from ehrql.tables.core import patients, clinical_events

events_in_interval = clinical_events.where(clinical_events.date.is_during(INTERVAL))
had_event = events_in_interval.exists_for_patient()
@@ -76,7 +76,7 @@ And this dummy measures, in a CSV file named `dummy_measures.csv`:
Run the measures definition with the dummy measures output file:

```
opensafely exec ehrql:v0 generate-measures measures_definition.py --dummy-data-file dummy_measures.csv
opensafely exec ehrql:v1 generate-measures measures_definition.py --dummy-data-file dummy_measures.csv
```

Now, instead of generated dummy measures output, you'll see the data from the dummy data file that you provided.
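
The documentation changes in this commit repeat two substitutions: `ehrql.tables.beta.core` and `ehrql.tables.beta.tpp` lose the `beta` segment, and the `ehrql:v0` image tag becomes `ehrql:v1`. A downstream project could apply the same renames mechanically. The following is a minimal sketch of that idea, not part of this commit: the function name `migrate_to_v1` and the pattern names are hypothetical, and it assumes that only the `core` and `tpp` schemas need rewriting, since those are the only ones the diff shows.

```python
import re

# Patterns inferred from the substitutions visible in this commit's docs changes.
# Only `core` and `tpp` are matched, because those are the only beta schemas
# that the diff shows being renamed (an assumption for other schemas).
BETA_IMPORT = re.compile(r"\behrql\.tables\.beta\.(core|tpp)\b")
V0_IMAGE = re.compile(r"\behrql:v0\b")


def migrate_to_v1(text: str) -> str:
    """Return `text` with beta table imports and the v0 image tag updated."""
    # Drop the `beta` segment but keep the schema name captured in group 1.
    text = BETA_IMPORT.sub(r"ehrql.tables.\1", text)
    # Point `opensafely exec` commands and `project.yaml` actions at v1.
    return V0_IMAGE.sub("ehrql:v1", text)
```

Calling `migrate_to_v1` on the contents of a dataset definition or a `project.yaml` returns the updated text; anything already in the `v1` form is left unchanged. Relative links to the schema reference pages (such as `beta.core.md`) are not covered by this sketch.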