Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Convert to reusable action #5

Merged
merged 25 commits into from
Mar 23, 2022

Conversation

iaindillingham
Copy link
Member

I've changed multiple files, so it's probably best to step through each commit one-by-one. I've not grouped them especially well, because I didn't want to spend my time resolving merge conflicts 😬. Essentially, though:

  • Several commits replace the pre-reusable actions structure/dependencies with the post-reusable actions structure/dependencies, of these notice 8a69108 and 0b210c6
  • Several commits add a study as a use case
  • Several commits update tooling
  • Several commits remove unnecessary functionality and address shortcomings of the tests

Most importantly: This PR is the minimum needed to get deciles-charts working as a reusable action. I know you will have excellent suggestions for configuration (e.g. #4). I know you will find the deciles charts suboptimal; the x axis tick labels and the legend are truncated, for example. We can address these, and more, in future PRs.

Closes #1

For convenience, we delete `test_version`. It's easier if a reusable
action's version is stored in a Git tag only, rather than in a Git tag
and in the code.
We move the cli from `deciles_chart.cli` to `analysis.deciles_chart` so
that all the Python code is in the same module. Consequently, this
reusable action looks like a scripted action. We replace click, which
isn't isn't a direct dependency of python-docker, with argparse.
We don't need it and it isn't a direct dependency of python-docker.
We downgrade Pandas to 1.0.1, which is the version available on
python-docker. This version of Pandas doesn't allow an empty DataFrame
to be initialised by passing `columns` and not `data`, so we modify how
empty DataFrames are initialised to fix the tests.
Require 0.0.21, which is the version available on python-docker.
This serves as an end-to-end test of the reusable action.
`ebmdatalab.charts` has a function for computing deciles. We revert this
commit, if we remove that dependency. Until then, `get_deciles_table` is
redundant.
On reflection, this decorator is too defensive. If `DataFrame.attrs`
doesn't contain the required key, then the resulting `KeyError` should be
sufficient.
Somewhat pedantic, but consistent: in a study definition the measures
list contains one or more instances of the Measure class. Each instance
generates a single CSV file. Each CSV file is, thus, a measure
(singular) table.
This adds the tag-new-version job, which is discussed in
cohort-joiner#8.
@ccunningham101 ccunningham101 self-requested a review March 21, 2022 10:10


def get_deciles_chart(measure_table):
return charts.deciles_chart(measure_table, period_column="date", column="value")
Copy link

@ccunningham101 ccunningham101 Mar 21, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we have changes to the ebmdatalab decile charts code, do you think we will push updates to the library? Or have our own code here? In relation to opensafely-core/cohort-extractor#759, we might want to be able to output the intermediate deciles tables for output checking.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that we'll probably end up replacing that implementation with a new implementation, in this module. Writing out intermediate deciles tables would be one reason for replacing implementations (in this case, we could revert 31832fa). I decided to fall back to charts.deciles_chart because it's the canonical implementation.

@@ -0,0 +1,2 @@
opensafely/ethnicity/2020-04-27

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently we are loading the ethnicity codelist but not using it. For example studies we probably want to avoid circular dependencies of actions using other actions? So might we want to avoid using the cohort joiner in the decile chart action example study?
Since it is a small example study, it probably would not be too expensive to extract ethnicity for each month.
Or we could assume a reusable action hierarchy in which decile charts could use the cohort joiner, but the cohort joiner would not use decile charts.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should certainly remove that codelist. I don't think the example study contains a dependency on cohort-joiner, though. The path is generate_cohort > generate_measures > generate_deciles_charts.

The new ID better reflects the nature of the measure.
@iaindillingham
Copy link
Member Author

I've removed the ethnicity codelist in 98643d3 and, because I noticed it, updated the measure ID in cbd9310. Would you be happy to ✅ the PR, @ccunningham101?

@iaindillingham iaindillingham merged commit 796300c into main Mar 23, 2022
@iaindillingham iaindillingham deleted the iaindillingham/convert-to-reusable-action branch March 23, 2022 10:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Convert to a reusable action
2 participants