-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Yaml configuration input #19
Conversation
…th formatting, which is also optional.
…red, with default values of commec-dbs, to be grabbed from here, rather than updating the file paths after search handler creation.
…he databases to the correct filenames now.
…added an override such that that directory is used instead of default when specified to ensure a non-breaking update.
…t might make sense.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still need to run some more tests, but here's a few comments!
commec/setup.py
Outdated
@@ -617,6 +619,9 @@ def do_setup(self): | |||
|
|||
os.remove(filename_zipped) | |||
|
|||
# We update the default path of the configuration file to point to the newly installed databases. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Two questions here:
- Do we want to allow users to supply a configuration filepath (vs using the default one installed with the package)?
- Should we move the name of the file (also referenced in
ScreenConfig
defaults) to aconstants
module so we can import it and it only lives in one place?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Resolved as:
- No, it is nice to overwrite this (though there may be a touch of confusion when users update the package and overwrite the default; hopefully resolvable, though).
- Yes; proposed in 8a3ac8d
@@ -0,0 +1,20 @@ | |||
base_paths: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
High-level suggestion: could we put all the defaults in here (vs. tracking some here and some in ScreenConfig
)?
Could turn into something like:
base_paths:
default: commec-dbs/
databases:
benign:
cm:
path: '{default}benign_db/benign.cm'
fasta:
path: '{default}benign_db/benign.fasta'
hmm:
path: '{default}benign_db/benign.hmm'
biorisk_hmm:
path: '{default}biorisk_db/biorisk.hmm'
regulated_nt:
path: '{default}nt_blast/nt'
regulated_protein:
blast:
path: '{default}nr_blast/nr'
diamond:
path: '{default}nr_dmnd/nr.dmnd'
screen_logic:
in_fast_mode: false
skip_nt_search: false
protein_search_tool: blastx
parallelisation:
threads: 1
output_handling:
force: false
resume: false
We could then throw an error if we can't find the default config.yaml and are missing overrides from the CLI or a user-supplied config? We already raise a FileNotFoundError if the default yaml can't be found. This wouldn't require users to change their CLI calls (we're just moving around where the defaults are stored) I think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is 100% the end-game purpose of the yaml, I just didn't add all the other commands from screen yet , as just wanted to get the core of it (yaml system and db directories) up and running.
Could certainly update the parsing to include all relevant cli args in this PR though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We decided that we want to do this eventually, but this should be backlogged, and for now we just want to remove the protein_search_tool
piece (since then the config file is more cleanly just about specifying databases, to a level of specificity that was not possible before), proposed change in: 059ad43
commec/config/io_parameters.py
Outdated
"but was invalid as a yaml file:\n", | ||
e, | ||
) | ||
return {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this return value is unused? I'm a bit confused about this vs line 172
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking at it more, I think we should decide if the desired behavior here is "raise an exception" (would match with raising an exception when the file is not found) or "no YAML overrides" (probably not viable now that we made the databases argument optional)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now raises an exception as of 8409257
(since I found it a bit confusing that, when not providing `--databases`, I got a "not found" error for a place I had never told the code to look)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tested out; clearly going to be very useful!
I'm hoping @MichaelBarnett you can take a quick look over before squash + merge, since I made a few of the changes we talked about this afternoon (e.g. removing the protein_search_tool
key in the default config).
commec/setup.py
Outdated
@@ -617,6 +619,9 @@ def do_setup(self): | |||
|
|||
os.remove(filename_zipped) | |||
|
|||
# We update the default path of the configuration file to point to the newly installed databases. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Resolved as:
- No, it is nice to overwrite this (though there may be a touch of confusion when users update the package and overwrite the default; hopefully resolvable, though).
- Yes; proposed in 8a3ac8d
commec/config/io_parameters.py
Outdated
"but was invalid as a yaml file:\n", | ||
e, | ||
) | ||
return {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now raises an exception as of 8409257
@@ -0,0 +1,20 @@ | |||
base_paths: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We decided that we want to do this eventually, but this should be backlogged, and for now we just want to remove the protein_search_tool
piece (since then the config file is more cleanly just about specifying databases, to a level of specificity that was not possible before), proposed change in: 059ad43
Ran some tests, and is working as intended, nicely done on the updates here, looking forward to working with adding in the rest of the settings soon too! |
Background
Dependencies:
New features
Breaking changes
nil, commec screen should still function identically using old calling methods, where the database directory is set via screen CLI.