feat: make germline variants pon optional for mutect2 #375

Merged: 11 commits, Feb 20, 2023

ericblanc20
Contributor

The pull request is more ambitious than expected.

The reason is the need to keep chunk logs, which are missing for several parallelised steps (such as somatic variant annotation & filtering).
To solve this problem in a generic way, a new abstract class (currently called ParallelMutect2BaseWrapper) has been created for parallelisation.
It is organised as follows: the class collects the file names of all outputs & all log files stored in the snakemake object. It then writes a Snakefile in which each chunk creates all the requested output & log files. Finally, it merges the log files and delegates the merging of output files to the derived class.
In this way, the class that parallelises Mutect2 only needs to define how to merge the different types of files produced by Mutect2: the vcf, of course, but also the files needed to filter the results (orientation stats, ...).
All other operations (interval splitting, Snakefile creation, dynamic resource increase, ...) are taken care of by the ParallelMutect2BaseWrapper parent class.
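
The division of labour described above can be illustrated with a minimal sketch. It is not the code from this pull request: the class names `ParallelBaseSketch` and `ConcatenatingWrapper`, the `chunk_<n>` directory layout, and the constructor arguments are all assumptions made for illustration, and the generated Snakefile text omits the command that each chunk would run.

```python
import abc
from pathlib import Path


class ParallelBaseSketch(abc.ABC):
    """Hypothetical stand-in for the parent class; not the actual implementation."""

    def __init__(self, outputs, logs, n_chunks, work_dir):
        self.outputs = dict(outputs)  # key -> final output path
        self.logs = dict(logs)  # key -> final log path
        self.n_chunks = n_chunks
        self.work_dir = Path(work_dir)

    def chunk_path(self, chunk, final_path):
        """Per-chunk counterpart of a final output or log file (assumed layout)."""
        return self.work_dir / f"chunk_{chunk}" / Path(final_path).name

    def render_snakefile(self):
        """One rule per chunk, each declaring every requested output and log file."""
        rules = []
        for chunk in range(self.n_chunks):
            lines = [f"rule chunk_{chunk}:", "    output:"]
            lines += [
                f'        {key}="{self.chunk_path(chunk, path)}",'
                for key, path in self.outputs.items()
            ]
            lines.append("    log:")
            lines += [
                f'        {key}="{self.chunk_path(chunk, path)}",'
                for key, path in self.logs.items()
            ]
            rules.append("\n".join(lines))
        return "\n\n".join(rules) + "\n"

    def _concatenate(self, final_path):
        """Write the chunk files, in chunk order, into the final file."""
        parts = [
            self.chunk_path(chunk, final_path).read_text()
            for chunk in range(self.n_chunks)
        ]
        Path(final_path).write_text("".join(parts))

    def merge_logs(self):
        """Generic step handled by the parent: merge the chunk logs."""
        for final_path in self.logs.values():
            self._concatenate(final_path)

    @abc.abstractmethod
    def merge_outputs(self):
        """Tool-specific step, left to the derived class."""

    def merge(self):
        self.merge_logs()
        self.merge_outputs()


class ConcatenatingWrapper(ParallelBaseSketch):
    """Toy derived class: treats every output as plain text and concatenates it."""

    def merge_outputs(self):
        for final_path in self.outputs.values():
            self._concatenate(final_path)
```

A real derived class would replace the plain concatenation in `merge_outputs` with a merge appropriate to each file type (for instance, a proper vcf merge), while everything else stays in the parent.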
The tests have also been extensively re-worked to improve coverage (hopefully).

The changes that remove the requirement for a panel of normals, a germline resource and a list of common variants are somewhat buried among the other changes.
They are limited to Snakefile & __init__.py in the somatic_variant_calling workflow directory, and to the filter wrapper in the mutect2 list of wrappers.
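
As a rough illustration of what "optional" means here, the sketch below builds the Mutect2 arguments only for the resources that are actually configured. The function names and the configuration keys (`panel_of_normals`, `germline_resource`, `common_variants`) are assumptions for this example and may not match the workflow's configuration; `--panel-of-normals` and `--germline-resource` are the GATK Mutect2 options. How the common variants list is used downstream is not stated in this pull request, so the second helper only reports whether a path was given.

```python
def mutect2_optional_args(config):
    """Return Mutect2 arguments for the optional resources that are configured.

    `config` is a plain dict; the key names are hypothetical.
    """
    args = []
    if config.get("panel_of_normals"):
        args += ["--panel-of-normals", config["panel_of_normals"]]
    if config.get("germline_resource"):
        args += ["--germline-resource", config["germline_resource"]]
    return args


def has_common_variants(config):
    """True when a path to the common variants vcf is configured (hypothetical key)."""
    return bool(config.get("common_variants"))
```

With an empty configuration the first function returns an empty list, so the Mutect2 command runs without either resource; an empty string is treated the same as a missing key.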

…mon variants

    the workflow is dependent on the presence of a path to the common variants vcf
    minor bug fixes have also been made to the list of output files for mutect & scalpel
    tests have been adapted to reflect the bug fixes
fix: log and checksums properly handled
    Implementation: generic parallel wrapper, which copies all output & log files as requested by the snakemake output
    The generic wrapper also creates a tar file with all chunks logs (not linked to output yet)
    Tests have been amended to exercise the complete creation of the parallel snakemake file
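
Bundling the chunk logs into one tar file, as mentioned in the commit message above, could look roughly like the following. This is a sketch only: the helper name `archive_chunk_logs`, the `chunk_*/*.log` layout and the gzip compression are assumptions, not details taken from the wrapper.

```python
import tarfile
from pathlib import Path


def archive_chunk_logs(work_dir, archive_path, pattern="chunk_*/*.log"):
    """Collect the per-chunk log files under `work_dir` into one gzipped tar file.

    Returns the archived paths, relative to `work_dir`.
    """
    work_dir = Path(work_dir)
    log_files = sorted(work_dir.glob(pattern))
    with tarfile.open(archive_path, "w:gz") as tar:
        for log_file in log_files:
            # Store paths relative to the working directory inside the archive.
            tar.add(log_file, arcname=str(log_file.relative_to(work_dir)))
    return [str(log_file.relative_to(work_dir)) for log_file in log_files]
```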
@ericblanc20 ericblanc20 requested a review from mbenary February 11, 2023 11:54
@ericblanc20 ericblanc20 linked an issue Feb 11, 2023 that may be closed by this pull request
@github-actions

  • Please format your Python code with black: make black
  • Please format your Snakemake code with snakefmt: make snakefmt
  • Please organize your imports with isort: make isort
  • Please ensure that your code passes flake8: make flake8

You can trigger all lints locally by running make lint

@coveralls

Coverage Status

Coverage: 84.865% (-0.01%) from 84.877% when pulling 61e66ae on 369-make-germline-variants-pon-optional-for-mutect2 into fad1ad6 on main.

@ericblanc20 ericblanc20 changed the title from "369 make germline variants pon optional for mutect2" to "feat: make germline variants pon optional for mutect2" Feb 11, 2023
Contributor

@mbenary mbenary left a comment

The options for mutect2 work as intended.
The parallelization of log-files also seems to work.
Next time, please make two different pull requests for the different tasks.

@ericblanc20 ericblanc20 merged commit 30bc591 into main Feb 20, 2023
@ericblanc20 ericblanc20 deleted the 369-make-germline-variants-pon-optional-for-mutect2 branch February 20, 2023 16:51
@tedil tedil mentioned this pull request Jun 28, 2024
This was referenced Dec 9, 2024
Successfully merging this pull request may close these issues.

Make germline variants & PON optional for mutect2
3 participants