Convert to reusable action #5
Merged: iaindillingham merged 25 commits into main from iaindillingham/convert-to-reusable-action on Mar 23, 2022
Commits
6206021 Ignore study-related files and QA issues (iaindillingham)
43c9ed8 Configure dependabot (iaindillingham)
0dd43bf Add study as use case (iaindillingham)
d948f8f Move and rename core module (iaindillingham)
cc6e644 Move cli and replace click (iaindillingham)
5eb6c63 Remove sh (iaindillingham)
d87bbf2 Downgrade Pandas and fix tests (iaindillingham)
6524f5d Remove altair (iaindillingham)
e5b2510 Copy requirements.dev.in from cohort-joiner (iaindillingham)
0b210c6 Require ebmdatalab (iaindillingham)
8a69108 Use ebmdatalab.charts for making deciles charts (iaindillingham)
1954bfe Add generate_deciles_charts to study (iaindillingham)
31832fa Delete `get_deciles_table` (iaindillingham)
d714124 Delete `is_measure_table` decorator (iaindillingham)
56c3036 Rename measures_table to measure_table (iaindillingham)
82943ef Remove remaining typings (iaindillingham)
d209ea6 Remove mocks from tests (iaindillingham)
406eb15 Split test into clearer arrange/act/assert stages (iaindillingham)
2d8eb7c Rename deciles_chart to deciles_charts (iaindillingham)
b6ea652 Add action.yaml (iaindillingham)
67a8e77 Copy tagging new version from cohort-joiner (iaindillingham)
e125014 Update README.md (iaindillingham)
646f916 Rename module in tests (iaindillingham)
98643d3 Remove ethnicity codelist (iaindillingham)
cbd9310 Update measure ID (iaindillingham)
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,6 +1,8 @@ | ||
[flake8] | ||
extend-exclude = .direnv,.venv,venv | ||
ignore = \ | ||
E501 \ # line too long (black fixes long lines, except for long strings which may benefit from being long (eg URLs)) | ||
W503 # line break before binary operator (black disagrees) | ||
ignore = | ||
E501 | ||
W503 | ||
per-file-ignores = | ||
analysis/*:INP001 | ||
max-line-length = 88 |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,5 +1,80 @@ | ||
# Deciles Chart | ||
# deciles-charts | ||
|
||
deciles-charts generates a line chart for each [measure table][1] in an input directory. | ||
The line chart has time on the horizontal axis (`x`) and value on the vertical axis (`y`). | ||
Deciles are plotted as dashed lines; | ||
outer percentiles are plotted as dotted lines; | ||
the median is plotted as a solid line. | ||
For example, the following deciles chart was generated from dummy data: | ||
|
||
![A deciles chart generated from dummy data](img/deciles_chart_has_sbp_event_by_population.png) | ||
|
||
[Using deciles to communicate variation][2] has several advantages when compared to the alternatives. | ||
Consequently, deciles charts are used on [OpenPrescribing.net][] | ||
and in several OpenSAFELY publications, such as [Curtis _et al._ (2021)][3]. | ||
|
||
## Usage | ||
|
||
In summary: | ||
|
||
* Use [cohort-extractor][] to extract several weekly or monthly cohorts. | ||
* Use cohort-extractor to generate one or more measure tables from these cohorts. | ||
* Use deciles-charts to generate a deciles chart for each measure table. | ||
|
||
Let's walk through an example _project.yaml_. | ||
|
||
The following cohort-extractor action extracts several monthly cohorts: | ||
|
||
```yaml | ||
generate_cohort: | ||
run: > | ||
cohortextractor:latest generate_cohort | ||
--study-definition study_definition | ||
--index-date-range "2021-01-01 to 2021-06-30 by month" | ||
outputs: | ||
highly_sensitive: | ||
cohort: output/input_2021-*.csv | ||
``` | ||
|
||
The following cohort-extractor action generates one or more measure tables from these cohorts: | ||
|
||
```yaml | ||
generate_measures: | ||
run: > | ||
cohortextractor:latest generate_measures | ||
--study-definition study_definition | ||
needs: [generate_cohort] | ||
outputs: | ||
moderately_sensitive: | ||
measure: output/measure_*.csv | ||
``` | ||
|
||
Finally, the following deciles-charts reusable action generates a deciles chart for each measure table. | ||
Remember to replace `[version]` with [a deciles-charts version][4]: | ||
|
||
```yaml | ||
generate_deciles_charts: | ||
run: > | ||
deciles-charts:[version] | ||
--input_dir output | ||
--output_dir output | ||
needs: [generate_measures] | ||
outputs: | ||
moderately_sensitive: | ||
deciles_charts: output/deciles_chart_*.png | ||
``` | ||
|
||
For each measure table, there will now be a corresponding deciles chart. | ||
For example, given a measure table called `measure_has_sbp_event_by_stp_code.csv`, | ||
there will now be a corresponding deciles chart called `deciles_chart_has_sbp_event_by_stp_code.png`. | ||
|
||
## Notes for developers | ||
|
||
Please see [DEVELOPERS.md](DEVELOPERS.md). | ||
Please see [_DEVELOPERS.md_](DEVELOPERS.md). | ||
|
||
[1]: https://docs.opensafely.org/measures/ | ||
[2]: https://www.thedatalab.org/blog/2019/04/communicating-variation-in-prescribing-why-we-use-deciles/ | ||
[3]: https://www.opensafely.org/research/2021/service-restoration-observatory-1/ | ||
[4]: https://github.com/opensafely-actions/deciles-charts/tags | ||
[cohort-extractor]: https://docs.opensafely.org/actions-cohortextractor/ | ||
[OpenPrescribing.net]: https://openprescribing.net/ |
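The naming rule described in the README (a measure table called `measure_<id>.csv` produces a chart called `deciles_chart_<id>.png`) can be sketched as below. This is a minimal illustration, not the merged code: the helper name `get_chart_fname` is hypothetical, and the sketch uses `fullmatch` as an assumption of its own, whereas the merged module calls `re.match`, which only anchors the start of the name.

```python
import re

# Same pattern as MEASURE_FNAME_REGEX in analysis/deciles_charts.py.
MEASURE_FNAME_REGEX = re.compile(r"measure_(?P<id>\w+)\.csv")


def get_chart_fname(measure_fname):
    """Return the deciles chart file name for a measure table file name.

    Returns None when the name does not look like a measure table.
    Hypothetical helper: it uses `fullmatch`, so trailing text after
    `.csv` is rejected; the merged code uses `re.match` instead.
    """
    match = MEASURE_FNAME_REGEX.fullmatch(measure_fname)
    if match is None:
        return None
    return f"deciles_chart_{match.group('id')}.png"


print(get_chart_fname("measure_has_sbp_event_by_stp_code.csv"))
print(get_chart_fname("input_2021-01-01.csv"))
```

The first call prints `deciles_chart_has_sbp_event_by_stp_code.png`; the second prints `None`, because cohort files do not start with `measure_`.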
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
run: python:latest analysis/deciles_charts.py |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,86 @@ | ||
import argparse | ||
import pathlib | ||
import re | ||
|
||
import pandas | ||
from ebmdatalab import charts | ||
|
||
|
||
MEASURE_FNAME_REGEX = re.compile(r"measure_(?P<id>\w+)\.csv") | ||
|
||
|
||
def _get_denominator(measure_table): | ||
return measure_table.columns[-3] | ||
|
||
|
||
def _get_group_by(measure_table): | ||
return list(measure_table.columns[:-4]) | ||
|
||
|
||
def get_measure_tables(path): | ||
if not path.is_dir(): | ||
raise AttributeError() | ||
|
||
for sub_path in path.iterdir(): | ||
if not sub_path.is_file(): | ||
continue | ||
|
||
measure_fname_match = re.match(MEASURE_FNAME_REGEX, sub_path.name) | ||
if measure_fname_match is not None: | ||
# The `date` column is assigned by the measures framework. | ||
measure_table = pandas.read_csv(sub_path, parse_dates=["date"]) | ||
|
||
# We can reconstruct the parameters passed to `Measure` without | ||
# the study definition. | ||
measure_table.attrs["id"] = measure_fname_match.group("id") | ||
measure_table.attrs["denominator"] = _get_denominator(measure_table) | ||
measure_table.attrs["group_by"] = _get_group_by(measure_table) | ||
|
||
yield measure_table | ||
|
||
|
||
def drop_zero_denominator_rows(measure_table): | ||
mask = measure_table[measure_table.attrs["denominator"]] > 0 | ||
return measure_table[mask].reset_index(drop=True) | ||
|
||
|
||
def get_deciles_chart(measure_table): | ||
return charts.deciles_chart(measure_table, period_column="date", column="value") | ||
|
||
|
||
def write_deciles_chart(deciles_chart, path): | ||
deciles_chart.savefig(path) | ||
|
||
|
||
def parse_args(): | ||
parser = argparse.ArgumentParser() | ||
parser.add_argument( | ||
"--input_dir", | ||
required=True, | ||
type=pathlib.Path, | ||
help="Path to the input directory", | ||
) | ||
parser.add_argument( | ||
"--output_dir", | ||
required=True, | ||
type=pathlib.Path, | ||
help="Path to the output directory", | ||
) | ||
return parser.parse_args() | ||
|
||
|
||
def main(): | ||
args = parse_args() | ||
input_dir = args.input_dir | ||
output_dir = args.output_dir | ||
|
||
for measure_table in get_measure_tables(input_dir): | ||
measure_table = drop_zero_denominator_rows(measure_table) | ||
chart = get_deciles_chart(measure_table) | ||
id_ = measure_table.attrs["id"] | ||
fname = f"deciles_chart_{id_}.png" | ||
write_deciles_chart(chart, output_dir / fname) | ||
|
||
|
||
if __name__ == "__main__": | ||
main() |
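The positional lookups in `_get_denominator` and `_get_group_by` imply a column order of group-by columns, numerator, denominator, `value`, `date`. The sketch below illustrates that reading with plain lists and dicts rather than pandas; the header and rows are hypothetical, shaped like the `has_sbp_event_by_stp_code` measure from the study definition in this pull request, and the assumed column order is inferred from the code rather than confirmed here.

```python
# Hypothetical header, assuming the order: group_by..., numerator,
# denominator, value, date.
columns = ["stp_code", "has_sbp_event", "population", "value", "date"]

# Hypothetical rows; the second has a zero denominator, so its value
# (numerator / denominator) is undefined.
rows = [
    {"stp_code": "STP0", "has_sbp_event": 3, "population": 30,
     "value": 0.1, "date": "2021-01-01"},
    {"stp_code": "STP1", "has_sbp_event": 0, "population": 0,
     "value": None, "date": "2021-01-01"},
]

# Third column from the end is the denominator.
denominator = columns[-3]

# Everything before the last four columns is a group-by column.
group_by = columns[:-4]

# Plain-Python analogue of drop_zero_denominator_rows.
kept = [row for row in rows if row[denominator] > 0]

print(denominator, group_by, len(kept))
```

Dropping zero-denominator rows before charting keeps undefined values out of the percentile calculation.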
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,46 @@ | ||
from cohortextractor import Measure, StudyDefinition, codelist_from_csv, patients | ||
|
||
|
||
sbp_codelist = codelist_from_csv( | ||
"codelists/opensafely-systolic-blood-pressure-qof.csv", | ||
system="snomed", | ||
column="code", | ||
) | ||
|
||
study = StudyDefinition( | ||
default_expectations={ | ||
"date": {"earliest": "1921-01-01", "latest": "2021-01-01"}, | ||
"rate": "uniform", | ||
"incidence": 1, | ||
}, | ||
index_date="2021-01-01", | ||
population=patients.satisfying("is_registered AND NOT is_dead"), | ||
is_registered=patients.registered_as_of(reference_date="index_date"), | ||
is_dead=patients.died_from_any_cause( | ||
on_or_before="index_date", | ||
return_expectations={"incidence": 0.1}, | ||
), | ||
stp_code=patients.registered_practice_as_of( | ||
date="index_date", | ||
returning="stp_code", | ||
return_expectations={ | ||
"category": { | ||
"ratios": {f"STP{x}": 1 / 50 for x in range(50)}, | ||
}, | ||
}, | ||
), | ||
has_sbp_event=patients.with_these_clinical_events( | ||
codelist=sbp_codelist, | ||
between=["index_date", "index_date"], | ||
return_expectations={"incidence": 0.1}, | ||
), | ||
) | ||
|
||
measures = [ | ||
Measure( | ||
id="has_sbp_event_by_stp_code", | ||
numerator="has_sbp_event", | ||
denominator="population", | ||
group_by="stp_code", | ||
), | ||
] |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,10 @@ | ||
{ | ||
"files": { | ||
"opensafely-systolic-blood-pressure-qof.csv": { | ||
"id": "opensafely/systolic-blood-pressure-qof/3572b5fb", | ||
"url": "https://codelists.opensafely.org/codelist/opensafely/systolic-blood-pressure-qof/3572b5fb/", | ||
"downloaded_at": "2022-03-16 09:44:36.226715Z", | ||
"sha": "f2bc461e351499f4e5573a6f94e760b99979491e" | ||
} | ||
} | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
opensafely/systolic-blood-pressure-qof/3572b5fb |
If we have changes to the ebmdatalab decile charts code, do you think we will push updates to the library? Or have our own code here? In relation to opensafely-core/cohort-extractor#759, we might want to be able to output the intermediate deciles tables for output checking.
I think that we'll probably end up replacing that implementation with a new implementation, in this module. Writing out intermediate deciles tables would be one reason for replacing implementations (in this case, we could revert 31832fa). I decided to fall back to `charts.deciles_chart` because it's the canonical implementation.
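To make the idea of an intermediate deciles table concrete, here is a minimal sketch using only the standard library. It is not the deleted `get_deciles_table`, and it is not how `charts.deciles_chart` computes its lines; the function name `sketch_deciles_table`, the input shape (dicts with `date` and `value` keys), and the choice of `method="inclusive"` are all assumptions made for illustration.

```python
import statistics
from collections import defaultdict


def sketch_deciles_table(rows):
    """Return one record per (date, percentile) for the `value` column.

    `rows` is an iterable of dicts with `date` and `value` keys, standing
    in for the rows of a measure table. Hypothetical helper.
    """
    values_by_date = defaultdict(list)
    for row in rows:
        values_by_date[row["date"]].append(row["value"])

    table = []
    for date in sorted(values_by_date):
        # Nine cut points: the 10th, 20th, ..., 90th percentiles.
        # Raises statistics.StatisticsError if a date has too few values.
        cut_points = statistics.quantiles(
            values_by_date[date], n=10, method="inclusive"
        )
        for i, cut_point in enumerate(cut_points, start=1):
            table.append({"date": date, "percentile": 10 * i, "value": cut_point})
    return table


# Hypothetical data: eleven groups with values 0, 1, ..., 10 on one date.
rows = [{"date": "2021-01-01", "value": v} for v in range(11)]
deciles_table = sketch_deciles_table(rows)
for record in deciles_table:
    print(record)
```

A table in this shape (one row per date and percentile) could be written to CSV alongside each chart for output checking, which is the use case raised in the comment above.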