Specify an input glob pattern #7
Comments
Indeed, we need to specify the action twice! But we can change that. Before opening Vim, and making it so, it would be good to think about the cases where we'd have measure files in different subdirectories. What are the cases where we'd want to create deciles charts with the same configuration?
Have you encountered these cases in the wild? Or can you imagine encountering them there?
My current Depression QOF project has two study populations and two study definitions: the entire population, and those with learning disabilities and autism. I currently keep all the output separate in /output/qof and /output/lda. In my case, we would like decile charts for any variable that was grouped by practice (but not for the demographic subgroups) in both subdirectories.
I've renamed this issue to "Specify an input glob pattern", following #4. However, I haven't addressed the question: "Should one action map to one subdirectory?" 🙂

I think that one invocation of deciles-charts should map to one input glob pattern and one output subdirectory. I think that the input glob pattern shouldn't recurse. Why?

One input glob pattern and one output subdirectory allows deciles-charts to read a subset of the measure tables, writing the deciles charts to the output subdirectory. This is an improvement over current behaviour, where deciles-charts reads the set of the measure tables: if you want some, but not all, deciles charts with current behaviour, then too bad!

However, if the input glob pattern recursed, then we'd need to consider what we wrote to the output subdirectory. Using antidepressant-prescribing-lda as an example:

>>> glob.glob("output/**/measure_*practice*.csv", recursive=True)
['output/lda/joined/measure_new_antidepressant_tricyclic_practice_rate.csv',
 'output/lda/joined/measure_new_antidepressant_ssri_practice_rate.csv',
 'output/lda/joined/measure_antidepressant_other_practice_rate.csv',
 'output/lda/joined/measure_antidepressant_maoi_practice_rate.csv',
 'output/lda/joined/measure_depression_practice_rate.csv',
 'output/lda/joined/measure_antidepressant_ssri_practice_rate.csv',
 'output/lda/joined/measure_new_depression_practice_rate.csv',
 'output/lda/joined/measure_qof_practice_rate.csv',
 'output/lda/joined/measure_new_antidepressant_maoi_practice_rate.csv',
 'output/lda/joined/measure_new_antidepressant_any_practice_rate.csv',
 'output/lda/joined/measure_antidepressant_tricyclic_practice_rate.csv',
 'output/lda/joined/measure_antidepressant_any_practice_rate.csv',
 'output/lda/joined/measure_new_antidepressant_other_practice_rate.csv',
 'output/qof/joined/measure_qof_practice_rate.csv']

We'd expect a deciles chart for each of the above measure tables. Would we expect them to be written as siblings in the output subdirectory? Or would we expect subdirectories within the output subdirectory? If the latter, then how would we handle collisions?

Determining the subdirectories within the output subdirectory and handling collisions means writing more code, and makes it less clear for the user. For these reasons, I think that the input glob pattern shouldn't recurse.
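To make the collision concern concrete, here is a minimal sketch using three of the paths from the listing above. It assumes the charts would be written as siblings in a single output subdirectory and named after the measure table's file name; `find_collisions` is a hypothetical helper for illustration, not part of deciles-charts.

```python
from collections import Counter
from pathlib import PurePosixPath

# Three of the paths returned by the recursive glob above.
matched = [
    "output/lda/joined/measure_qof_practice_rate.csv",
    "output/lda/joined/measure_depression_practice_rate.csv",
    "output/qof/joined/measure_qof_practice_rate.csv",
]


def find_collisions(paths):
    """Return file names that occur under more than one directory.

    Hypothetical helper: if every chart were written to one flat output
    subdirectory, each of these names would be written more than once.
    """
    counts = Counter(PurePosixPath(p).name for p in paths)
    return sorted(name for name, n in counts.items() if n > 1)


collisions = find_collisions(matched)
print(collisions)
```

With a non-recursive pattern such as "output/lda/joined/measure_*practice*.csv", every match comes from a single directory, so file names are unique and this check is unnecessary.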
Agreed!
This replaces the `--input-dir` argument with the `--input-files` argument. Whereas the former accepts a path to a directory, the latter accepts a glob pattern. This commit addresses the substantive issue, but some tidying up would be worthwhile. Closes #7
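As a rough illustration of what an `--input-files` argument that accepts a glob pattern could look like, here is a sketch using `argparse` and `glob` from the standard library. The function names `parse_args` and `get_measure_tables` are assumptions made for this example and may not match the actual commit.

```python
import argparse
import glob
import pathlib


def parse_args(argv=None):
    # Hypothetical parser: only the --input-files flag is taken from the
    # commit message; everything else here is illustrative.
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--input-files",
        required=True,
        help="Glob pattern matching the measure tables to read",
    )
    return parser.parse_args(argv)


def get_measure_tables(pattern):
    # glob.glob defaults to recursive=False, so "**" in the pattern
    # behaves like "*" and subdirectories are not descended into.
    return sorted(pathlib.Path(p) for p in glob.glob(pattern))


args = parse_args(["--input-files", "output/lda/joined/measure_*practice*.csv"])
measure_tables = get_measure_tables(args.input_files)
```

Under this sketch, a project with two subdirectories would still invoke the action once per subdirectory, each time with its own pattern.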
Currently if you produce measures files in different subdirectories within the same project, for example:
you need to specify the action twice, because --input-dir does not resolve wildcards and does not recurse through the directory.
Do we think the action should be specified once for each subdirectory, or should the action have the ability to look across multiple subdirectories?