Merge pull request #1811 from opensafely-core/v1

Release ehrQL `v1`
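
For projects following the documentation changes in this release, the migration appears to come down to two textual substitutions: dropping `beta.` from table imports and bumping the image tag from `v0` to `v1`. A minimal sketch of that rewrite is below. The helper name `migrate_to_v1` is hypothetical and not part of ehrQL, and the sketch assumes that every table you import has a non-beta equivalent, which this diff alone does not confirm.

```python
import re


def migrate_to_v1(text: str) -> str:
    """Hypothetical helper (not part of ehrQL): apply the two substitutions
    visible in this release's documentation diff to a piece of text."""
    # `ehrql.tables.beta.core` -> `ehrql.tables.core`, likewise for `tpp`.
    text = text.replace("ehrql.tables.beta.", "ehrql.tables.")
    # `ehrql:v0` -> `ehrql:v1`; the lookahead leaves longer tags
    # such as `ehrql:v0.1` untouched.
    return re.sub(r"ehrql:v0(?![\w.])", "ehrql:v1", text)


if __name__ == "__main__":
    print(migrate_to_v1("from ehrql.tables.beta.core import medications, patients"))
    print(migrate_to_v1("opensafely exec ehrql:v0 sandbox example-data"))
```

Review the result before committing, since plain text replacement can also touch comments or strings that mention the old names.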
opensafely-core · Dec 8, 2023 · f7f8e39
2 parents 605531e + 5286f57
commit f7f8e39
Showing 108 changed files with 1,430 additions and 855 deletions.
diff --git a/.github/workflows/build-and-deploy.yml b/.github/workflows/build-and-deploy.yml
@@ -16,7 +16,6 @@ jobs:
     runs-on: ubuntu-latest
     env:
       image: ghcr.io/opensafely-core/ehrql
-      aliased_image: ghcr.io/opensafely-core/databuilder
     steps:
       - uses: actions/checkout@v4
         with:
@@ -110,9 +109,6 @@ jobs:
             --tag ${{ env.image }}:$MAJOR \
             --tag ${{ env.image }}:$MINOR \
             --tag ${{ env.image }}:$PATCH \
-            --tag ${{ env.aliased_image }}:$MAJOR \
-            --tag ${{ env.aliased_image }}:$MINOR \
-            --tag ${{ env.aliased_image }}:$PATCH
 
       - name: Log into GitHub Container Registry
         if: ${{ startsWith(steps.taggedcommit.outputs.tag, 'y') }}
@@ -122,4 +118,3 @@ jobs:
         if: ${{ startsWith(steps.taggedcommit.outputs.tag, 'y') }}
         run: |
           docker push --all-tags ${{ env.image }}
-          docker push --all-tags ${{ env.aliased_image }}
diff --git a/Dockerfile b/Dockerfile
@@ -128,8 +128,6 @@ FROM ehrql-base as ehrql
 # comment above
 COPY ehrql /app/ehrql
 RUN python -m compileall /app/ehrql
-COPY databuilder /app/databuilder
-RUN python -m compileall /app/databuilder
 COPY bin /app/bin
 
 # The following build details will change.

diff --git a/databuilder/codes.py b/databuilder/codes.py
diff --git a/databuilder/ehrql.py b/databuilder/ehrql.py
diff --git a/databuilder/tables/__init__.py b/databuilder/tables/__init__.py
diff --git a/databuilder/tables/beta/__init__.py b/databuilder/tables/beta/__init__.py
diff --git a/databuilder/tables/beta/smoketest.py b/databuilder/tables/beta/smoketest.py
diff --git a/databuilder/tables/beta/tpp.py b/databuilder/tables/beta/tpp.py
diff --git a/docs/explanation/backend-tables.md b/docs/explanation/backend-tables.md
@@ -28,10 +28,10 @@ Make the core tables available for use in a dataset definition
 with import statements like:
 
 ```python
-from ehrql.tables.beta.core import medications, patients
+from ehrql.tables.core import medications, patients
 ```
 
-where the `ehrql.tables.beta.core` specifies that we are using the core tables.
+where the `ehrql.tables.core` specifies that we are using the core tables.
 
 ## Backend-specific tables
 
@@ -54,7 +54,7 @@ For example, for TPP-specific tables,
 we use `tpp` in the import statement:
 
 ```python
-from ehrql.tables.beta.tpp import addresses, patients
+from ehrql.tables.tpp import addresses, patients
 ```
 
 :notepad_spiral: In this example,
@@ -73,10 +73,10 @@ in the [writing a dataset definition](../tutorial/writing-a-dataset-definition/i
 we used the interactive ehrQL sandbox with the following statement to start with:
 
 ```python
->>> from ehrql.tables.beta.core import patients, medications
+>>> from ehrql.tables.core import patients, medications
 ```
 
-* `beta.core` is the *table schema*
+* `core` is the *table schema*
 * `patients` and `medications` are the *table names*
 
 We also accessed *table columns*
@@ -99,6 +99,6 @@ The table schema reference explains:
 * whether table columns contain at most one row per patient,
   or may contain multiple rows per patient
 
-:grey_question: Consult the [`beta.core`](../reference/schemas/beta.core.md) schema.
+:grey_question: Consult the [`core`](../reference/schemas/core.md) schema.
 Choose any of the tables there
 and understand its structure from the schema.
diff --git a/docs/explanation/measures.md b/docs/explanation/measures.md
@@ -17,7 +17,7 @@ Suppose we want to know what proportion of the patients prescribed atorvastatin
 
 ```python
 from ehrql import INTERVAL, case, create_measures, months, when
-from ehrql.tables.beta.core import medications, patients
+from ehrql.tables.core import medications, patients
 
 # Every measure definitions file must include this line
 measures = create_measures()
@@ -68,7 +68,7 @@ measures.define_measure(
 
 You can save this file as `measure_definition.py` and then run the [`generate-measures`](../reference/cli.md#generate-measures) command on it:
 ```
-opensafely exec ehrql:v0 generate-measures measure_definition.py --output measures.csv
+opensafely exec ehrql:v1 generate-measures measure_definition.py --output measures.csv
 ```
 
 ### Results

diff --git a/docs/explanation/output-formats.md b/docs/explanation/output-formats.md
@@ -63,13 +63,13 @@ Instead, the output is displayed at the command line.
 #### `.arrow`
 
 ```
-opensafely exec ehrql:v0 generate-dataset "./dataset-definition.py" --dummy-tables "example-data/" --output "./outputs/data_extract.arrow"
+opensafely exec ehrql:v1 generate-dataset "./dataset-definition.py" --dummy-tables "example-data/" --output "./outputs/data_extract.arrow"
 ```
 
 #### `.csv.gz`
 
 ```
-opensafely exec ehrql:v0 generate-dataset "./dataset-definition.py" --dummy-tables "example-data/" --output "./outputs/data_extract.csv.gz"
+opensafely exec ehrql:v1 generate-dataset "./dataset-definition.py" --dummy-tables "example-data/" --output "./outputs/data_extract.csv.gz"
 ```
 
 ### Example `project.yaml`
@@ -82,7 +82,7 @@ expectations:
 
 actions:
   extract_data:
-    run: ehrql:v0 generate-dataset "./dataset_definition.py" --output "outputs/data_extract.arrow"
+    run: ehrql:v1 generate-dataset "./dataset_definition.py" --output "outputs/data_extract.arrow"
     outputs:
       highly_sensitive:
         population: outputs/data_extract.arrow

diff --git a/docs/explanation/running-ehrql.md b/docs/explanation/running-ehrql.md
@@ -30,7 +30,7 @@ by allowing you to interactively query some dummy tables.
 :computer:
 To start the sandbox, from the `learning-ehrql` directory, run:
 
-    opensafely exec ehrql:v0 sandbox example-data
+    opensafely exec ehrql:v1 sandbox example-data
 
 You will now be in a session with an interactive Python console,
 and you should see something like this:
@@ -58,7 +58,7 @@ For example if you type `1 + 1` and press the return key, you should see:
 To use ehrQL, you'll first need to import the tables that you want to interact with:
 
 ```pycon
->>> from ehrql.tables.beta.core import patients, medications
+>>> from ehrql.tables.core import patients, medications
 ```
 
 Now, you can inspect the contents of these tables, by entering the names of the tables:
@@ -189,7 +189,7 @@ and save it in your `learning-ehrql` directory:
 
 ```ehrql
 from ehrql import create_dataset
-from ehrql.tables.beta.core import patients, medications
+from ehrql.tables.core import patients, medications
 
 dataset = create_dataset()
 
@@ -214,7 +214,7 @@ Make sure you save the file!
 use the command below to run your dataset definition with ehrQL.
 
 ```
-opensafely exec ehrql:v0 generate-dataset dataset_definition.py --dummy-tables example-data --output output/dataset.csv
+opensafely exec ehrql:v1 generate-dataset dataset_definition.py --dummy-tables example-data --output output/dataset.csv
 ```
 
 :notepad_spiral: ehrQL dataset definitions are written in Python.
@@ -225,8 +225,8 @@ to run the dataset definition.
 
 #### What each part of this command does
 
-* `opensafely exec ehrql:v0` uses the OpenSAFELY CLI to run ehrQL.
-  The `v0` after the `:` refers to the version of ehrQL being used.
+* `opensafely exec ehrql:v1` uses the OpenSAFELY CLI to run ehrQL.
+  The `v1` after the `:` refers to the version of ehrQL being used.
 * `generate-dataset` instructs ehrQL to generate a dataset from the dataset definition.
 * `dataset_definition.py` specifies the filename of the dataset definition to use.
     * The dataset definition file is in the directory that we are running `opensafely exec`
@@ -276,7 +276,7 @@ patient_id,med_date,med_code
 without the `--dummy-tables` and `--output` options:
 
 ```
-opensafely exec ehrql:v0 generate-dataset dataset_definition.py
+opensafely exec ehrql:v1 generate-dataset dataset_definition.py
 ```
 
 By not specifying the dummy tables to use,
@@ -299,7 +299,7 @@ an error message will be displayed on the screen.
 This is one example:
 
 ```
-$ opensafely exec ehrql:v0 generate-dataset dataset_definition.py --dummy-tables example-data --output output/dataset.csv
+$ opensafely exec ehrql:v1 generate-dataset dataset_definition.py --dummy-tables example-data --output output/dataset.csv
 2023-04-21 17:53:42 [info     ] Compiling dataset definition from dataset_definition.py [ehrql.main]
 Failed to import 'dataset_definition.py':
 ```
@@ -331,7 +331,7 @@ expectations:
 
 actions:
   generate_dataset:
-    run: ehrql:v0 generate-dataset dataset_definition.py --dummy-tables example-data --output output/dataset.csv.gz
+    run: ehrql:v1 generate-dataset dataset_definition.py --dummy-tables example-data --output output/dataset.csv.gz
     outputs:
       highly_sensitive:
         cohort: output/dataset.csv.gz

diff --git a/docs/explanation/selecting-populations-for-study.md b/docs/explanation/selecting-populations-for-study.md
@@ -26,4 +26,4 @@ for further details of this transfer process.
     a patient did not change practice during a time period of interest.
 
     For TPP,
-    there is a [method to select patients with a continuous registration](../reference/schemas/beta.tpp.md#practice_registrations.has_a_continuous_practice_registration_spanning).
+    there is a [method to select patients with a continuous registration](../reference/schemas/tpp.md#practice_registrations.has_a_continuous_practice_registration_spanning).
diff --git a/docs/how-to/assign-multiple-columns.md b/docs/how-to/assign-multiple-columns.md
@@ -34,7 +34,7 @@ This example adds two new columns to the dataset: `asthma_meds_count` and `other
 
 ```ehrql
 from ehrql import create_dataset
-from ehrql.tables.beta.core import patients, medications
+from ehrql.tables.core import patients, medications
 
 asthma_codelist = ["39113311000001107", "39113611000001102"]
 other_codelist = ["10000000000000001", "10000000000000002"]

diff --git a/docs/how-to/dummy-data.md b/docs/how-to/dummy-data.md
@@ -22,7 +22,7 @@ You do not need to add anything to the dataset definition itself in order to gen
 dataset in this way. ehrQL will use the dataset definition to set up dummy data and generate
 matching patients.
 
-By default, 500 patients will be generated in a dummy dataset. If you need to increase this
+By default, ten patients will be generated in a dummy dataset. If you need to increase this
 number, you can configure it in the dataset definition with:
 
 ```
@@ -50,7 +50,7 @@ For example, take this dataset definition from the tutorial:
 
 ```ehrql
 from ehrql import create_dataset
-from ehrql.tables.beta.core import patients, medications
+from ehrql.tables.core import patients, medications
 
 dataset = create_dataset()
 
@@ -80,7 +80,7 @@ And this dummy dataset, in a CSV file named `dummy.csv`:
 Run the dataset definition with the dummy dataset file:
 
 ```
-opensafely exec ehrql:v0 generate-dataset dataset_definition.py --dummy-data-file dummy.csv
+opensafely exec ehrql:v1 generate-dataset dataset_definition.py --dummy-data-file dummy.csv
 ```
 
 Now, instead of a generated dummy dataset, you'll see the data from the dummy data file that
@@ -127,14 +127,14 @@ to generate an initial dataset, and then modify it as you need.
 Run the dataset definition with an output path:
 
 ```
-opensafely exec ehrql:v0 generate-dataset dataset_definition.py --output dataset.csv
+opensafely exec ehrql:v1 generate-dataset dataset_definition.py --output dataset.csv
 ```
 
 Now you can edit `dataset.csv` as you want, and rerun the dataset definition, using it as the
 dummy data file:
 
 ```
-opensafely exec ehrql:v0 generate-dataset dataset_definition.py --dummy-data-file dataset.csv
+opensafely exec ehrql:v1 generate-dataset dataset_definition.py --dummy-data-file dataset.csv
 ```
 
 ## Supply your own dummy tables
@@ -158,7 +158,7 @@ specific tables that are required.
 Try this out by running the following command against the simple dataset definition above:
 
 ```
-opensafely exec ehrql:v0 create-dummy-tables dataset_definition.py dummy-folder
+opensafely exec ehrql:v1 create-dummy-tables dataset_definition.py dummy-folder
 ```
 
 ![A screenshot of VS Code, showing the terminal after the `create-dummy-tables` command was run](opensafely_exec_create_dummy_tables.png)
@@ -169,5 +169,5 @@ dataset definition requires - `patients.csv` and `medications.csv`.
 Now you can run ehrQl with these generated tables instead:
 
 ```
-opensafely exec ehrql:v0 generate-dataset dataset_definition.py --dummy-tables dummy-folder
+opensafely exec ehrql:v1 generate-dataset dataset_definition.py --dummy-tables dummy-folder
 ```
diff --git a/docs/how-to/dummy-measures-data.md b/docs/how-to/dummy-measures-data.md
@@ -19,7 +19,7 @@ You do not need to add anything to the measures definition itself in order to ge
 dataset in this way. ehrQL will use the measures definition to set up dummy data and generate
 matching patients.
 
-By default, 500 patients will be generated in a dummy measures output. If you need to increase this number, you can configure it in the measures definition with:
+By default, ten patients will be generated in a dummy measures output. If you need to increase this number, you can configure it in the measures definition with:
 
 ```
 measures.configure_dummy_data(population_size=1000)
@@ -47,7 +47,7 @@ For example, take this simple measures definition:
 ```python
 from ehrql import create_measures, years
 from ehrql.measures import INTERVAL
-from ehrql.tables.beta.core import patients, clinical_events
+from ehrql.tables.core import patients, clinical_events
 
 events_in_interval = clinical_events.where(clinical_events.date.is_during(INTERVAL))
 had_event = events_in_interval.exists_for_patient()
@@ -76,7 +76,7 @@ And this dummy measures, in a CSV file named `dummy_measures.csv`:
 Run the measures definition with the dummy measures output file:
 
 ```
-opensafely exec ehrql:v0 generate-measres measures_definition.py --dummy-data-file dummy_measures.csv
+opensafely exec ehrql:v1 generate-measres measures_definition.py --dummy-data-file dummy_measures.csv
 ```
 
 Now, instead of generated dummy measures output, you'll see the data from the dummy data file that you provided.