feat: make germline variants pon optional for mutect2 #375
Merged: ericblanc20 merged 11 commits into main from 369-make-germline-variants-pon-optional-for-mutect2 on Feb 20, 2023
…mon variants the workflow is dependent on the presence of a path to the common variants vcf. Minor bug fixes have also been made to the list of output files for mutect & scalpel. Tests have been adapted to reflect the bug fixes.
fix: log and checksums properly handled
Implementation: generic parallel wrapper, which copies all output & log files as requested by the snakemake output. The generic wrapper also creates a tar file with all chunk logs (not linked to output yet). Tests have been amended to exercise the complete creation of the parallel snakemake file.
ericblanc20 changed the title from "369 make germline variants pon optional for mutect2" to "feat: make germline variants pon optional for mutect2" on Feb 11, 2023
mbenary approved these changes on Feb 20, 2023
The options for mutect2 work as intended.
The parallelization of log-files also seems to work.
Next time, please make two different pull requests for the different tasks.
ericblanc20 deleted the 369-make-germline-variants-pon-optional-for-mutect2 branch on February 20, 2023 16:51
This was referenced Dec 9, 2024
The pull request is more ambitious than expected.
The reason is the need for keeping chunk logs, which are missing for multiple parallelised steps (such as the somatic variant annotations & filtering).
To solve this problem in a generic way, a new abstract class (currently called `ParallelMutect2BaseWrapper`) has been created for parallelisation. It is organised in the following way: the class collects the file names of all outputs & all log files that are stored in the snakemake object. The class then writes a Snakefile, such that each chunk creates all the requested output & log files. The class then merges the log files, and delegates the merging of output files to any derived class.
In this way, the class which parallelises `Mutect2` output just needs to define how to merge the different types of files produced by `Mutect2` (the vcf, of course, but also files necessary for filtering results (orientation stats, ...)). All other operations (interval splitting, Snakefile creation, dynamic resource increase, ...) are taken care of by the `ParallelMutect2BaseWrapper` parent class.
The tests have also been extensively re-worked to improve coverage (hopefully).
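
For readers who want a picture of this division of labour, here is a minimal sketch of the pattern, not the code from this pull request. The class names `ParallelBaseSketch` and `ToyVcfSketch`, the `chunkN.` file naming and the log concatenation are all assumptions made for illustration. Snakefile creation and chunk execution are left out, so the chunk files are written by hand in `demo()`.

```python
from abc import ABC, abstractmethod
from pathlib import Path
import tempfile


class ParallelBaseSketch(ABC):
    """Hypothetical stand-in for the abstract parallel wrapper."""

    def __init__(self, outputs, logs, n_chunks):
        # In the real wrapper these names would come from the snakemake object.
        self.outputs = dict(outputs)
        self.logs = dict(logs)
        self.n_chunks = n_chunks

    def chunk_path(self, chunk, path):
        """Per-chunk copy of a requested file (the naming scheme is assumed)."""
        p = Path(path)
        return p.with_name(f"chunk{chunk}.{p.name}")

    def chunk_files(self, chunk):
        """Every output and log file that one chunk is expected to create."""
        return {
            "output": {k: self.chunk_path(chunk, v) for k, v in self.outputs.items()},
            "log": {k: self.chunk_path(chunk, v) for k, v in self.logs.items()},
        }

    def merge_logs(self):
        """Generic step: concatenate the chunk logs (an assumed merge strategy)."""
        for path in self.logs.values():
            parts = []
            for chunk in range(self.n_chunks):
                text = self.chunk_path(chunk, path).read_text()
                parts.append(f"== chunk {chunk} ==\n{text}")
            Path(path).write_text("".join(parts))

    @abstractmethod
    def merge_outputs(self):
        """Tool-specific step, delegated to the derived class."""

    def run(self):
        # A real wrapper would first write and execute the per-chunk Snakefile.
        self.merge_logs()
        self.merge_outputs()


class ToyVcfSketch(ParallelBaseSketch):
    """Derived class: only knows how to merge its own output type."""

    def merge_outputs(self):
        merged = []
        for chunk in range(self.n_chunks):
            chunk_vcf = self.chunk_path(chunk, self.outputs["vcf"])
            for line in chunk_vcf.read_text().splitlines():
                # Keep header lines from the first chunk only.
                if line.startswith("#") and chunk > 0:
                    continue
                merged.append(line)
        Path(self.outputs["vcf"]).write_text("\n".join(merged) + "\n")


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        tmp = Path(tmp)
        wrapper = ToyVcfSketch({"vcf": tmp / "out.vcf"}, {"log": tmp / "out.log"}, n_chunks=2)
        # Stand in for the chunk jobs by writing their files directly.
        for chunk in range(2):
            files = wrapper.chunk_files(chunk)
            files["output"]["vcf"].write_text(f"#header\nchr1\t{chunk + 1}\n")
            files["log"]["log"].write_text(f"chunk {chunk} done\n")
        wrapper.run()
        return (tmp / "out.vcf").read_text(), (tmp / "out.log").read_text()
```

With this layout, supporting another parallelised step would only require a new subclass with its own `merge_outputs`; the log handling is inherited unchanged.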
The changes required to remove the requirements on the existence of a panel of normals, a germline resource and a list of common variants are a bit drowned in the other changes.
They are limited to `Snakefile` & `__init__.py` in the `somatic_variant_calling` workflow directory, and to the `filter` wrapper in the `mutect2` list of wrappers.
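
As a rough illustration of what "optional" can mean at the command-line level, the sketch below adds a flag only when the corresponding path is configured. The configuration keys (`panel_of_normals`, `germline_resource`, `contamination_table`) and the helper names are hypothetical and not taken from this repository. The flags themselves are standard GATK options for `Mutect2` and `FilterMutectCalls`. That the common variants feed a contamination table used during filtering is an assumption about the workflow.

```python
def mutect2_optional_args(config):
    """Build the optional part of a Mutect2 call from a config mapping.

    The key names are hypothetical; empty or missing values mean "not provided".
    """
    args = []
    pon = config.get("panel_of_normals")
    if pon:
        args += ["--panel-of-normals", pon]
    germline = config.get("germline_resource")
    if germline:
        args += ["--germline-resource", germline]
    return args


def filter_optional_args(config):
    """Assumed behaviour: pass a contamination table to FilterMutectCalls
    only when one was produced, i.e. when common variants were configured."""
    table = config.get("contamination_table")
    return ["--contamination-table", table] if table else []


# Example: only a germline resource is configured.
example = mutect2_optional_args({"panel_of_normals": "", "germline_resource": "gnomad.vcf.gz"})
```

Treating an empty string the same as a missing key keeps the calling code free of special cases when a configuration file leaves the path blank.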