Unclear error messages when GCNV model paths don't resolve #534

Open
Nicolai-vKuegelgen opened this issue Jul 16, 2024 · 1 comment

Describe the bug
To run GCNV with the sv_calling_{targeted|wgs} steps, a precomputed model needs to be defined in the config.
This model consists of a 'library' name (matching the ...) as well as file path patterns for contig_ploidy and model_pattern.
If the file path pattern for contig_ploidy or model_pattern (probably either one) does not resolve to any actual files, the snappy workflow does not run and the error message states that no model matching the library name could be found.

To Reproduce
Steps to reproduce the behavior:

  1. Set up the snappy sv_calling_wgs (or targeted) step for GCNV
  2. Have a model_pattern or contig_ploidy entry in the config that will not properly resolve
  3. Run snappy
  4. See the error, which is not helpful.

Expected behavior
If a model is defined but its file paths cannot be resolved, the error message should clearly state so.

Additional context
I see 2 possible solutions for this:

  1. The config validation models for GCNV could be adapted to check if the file path patterns can be resolved to actual files
  2. The functions in the GCNV workflow need to throw an error when no files are found. (specifically: snappy_pipeline/workflows/common/gcnv/gcnv_run.py // get_model_dir_list should fail if output would be empty)
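
A rough sketch of what option 2 could look like. This is not the actual get_model_dir_list code; `resolve_model_dirs` and `ModelPathError` are hypothetical names, and the real function's signature may well differ. The only point is that an empty glob result raises an error naming both the library and the pattern, instead of silently returning an empty list:

```python
import glob
import os


class ModelPathError(ValueError):
    """Hypothetical error type: a configured model path pattern matched nothing."""


def resolve_model_dirs(pattern: str, library: str) -> list[str]:
    """Return the sorted directories matching ``pattern``.

    Hypothetical stand-in for the lookup done in the GCNV workflow.
    Raises ModelPathError instead of returning an empty list.
    """
    matches = sorted(p for p in glob.glob(pattern) if os.path.isdir(p))
    if not matches:
        raise ModelPathError(
            f"A GCNV model is configured for library {library!r}, but the path "
            f"pattern {pattern!r} did not resolve to any existing directory."
        )
    return matches
```

With something like this, a wrong model_pattern would fail with a message pointing at the pattern itself rather than at the library name.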

tedil commented Jul 16, 2024

While 1 would be nice, the problem is that -- in general -- the files may not yet exist, because they may get produced by some other rule/step/workflow upstream. This is not the case for the GCNV models at the moment, but it would be nice to have a general solution introducing some type that allows us to distinguish between static paths ("these files must exist prior to running the workflow") and dynamic paths (URLs, SRA downloads, SODAR retrieval, simple strings for requesting upstream snakemake output).
So as a short term fix, 2 is a good option I think, especially since the error can be very specific and helpful.
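
For the longer-term idea, a minimal sketch of how static and dynamic paths might be told apart, assuming plain dataclasses; `StaticPath`, `DynamicPath` and `check_before_run` are made-up names for illustration, not existing snappy types. Only static entries are checked for existence before the workflow starts:

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class StaticPath:
    """Hypothetical type: the file must exist prior to running the workflow."""
    path: str


@dataclass(frozen=True)
class DynamicPath:
    """Hypothetical type: a URL, download, or upstream output that may not exist yet."""
    location: str


def check_before_run(entries: dict) -> list[str]:
    """Return one message per missing static path; dynamic entries are skipped."""
    errors = []
    for name, entry in entries.items():
        if isinstance(entry, StaticPath) and not os.path.exists(entry.path):
            errors.append(f"{name}: static path {entry.path!r} does not exist")
    return errors
```

Config validation could then reject a missing `StaticPath` up front while leaving anything tagged as `DynamicPath` for the workflow to resolve later.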
