Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Small updates to JointVcfFiltering WDL #8027

Merged
merged 3 commits into from
Sep 22, 2022
Merged

Conversation

meganshand
Copy link
Contributor

  • adds allele specific option
  • handles empty VCF inputs to ScoreVariantAnnotations

@@ -27,6 +27,8 @@ workflow JointVcfFiltering {
String indel_annotations
File? gatk_override

Boolean use_allele_specific_annotations = false
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I gave this a default, but I can also see an argument for having it be a required input since it's somewhat tied to the input annotations (which are currently required). Would love a second opinion.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. I think it makes sense to make it required.

More generally, for task-level required/optional arguments, I like the style we use in the CNV WDLs. That is, all arguments are exposed, and optional arguments have their defaults specified in the command line rather than in the parameter block (see e.g.

gatk --java-options "-Xmx~{command_mem_mb}m" PreprocessIntervals \
). This makes it easier to glance at the command line and see the defaults.

Not sure if this is the way automatic WDL generation works (haven't tried it myself), but I think it makes sense. And ideally we'd be good about using that functionality that to keep the defaults in sync between the WDL and the jar. Baby steps!

Don't worry about all that for this PR---I'll tackle it in my push. Thanks!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah makes sense, it does seem cleaner to keep the defaults in the command line.

@codecov
Copy link

codecov bot commented Sep 21, 2022

Codecov Report

Merging #8027 (5b61044) into master (ef50c08) will not change coverage.
The diff coverage is n/a.

Additional details and impacted files
@@             Coverage Diff             @@
##              master     #8027   +/-   ##
===========================================
  Coverage     86.659%   86.659%           
- Complexity     38881     38885    +4     
===========================================
  Files           2333      2333           
  Lines         182340    182340           
  Branches       20022     20021    -1     
===========================================
  Hits          158014    158014           
+ Misses         17300     17299    -1     
- Partials        7026      7027    +1     
Impacted Files Coverage Δ
.../hellbender/utils/python/PythonUnitTestRunner.java 75.410% <0.000%> (-3.279%) ⬇️
...e/hellbender/engine/spark/SparkContextFactory.java 75.000% <0.000%> (+2.778%) ⬆️

@@ -27,6 +27,8 @@ workflow JointVcfFiltering {
String indel_annotations
File? gatk_override

Boolean use_allele_specific_annotations = false
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. I think it makes sense to make it required.

More generally, for task-level required/optional arguments, I like the style we use in the CNV WDLs. That is, all arguments are exposed, and optional arguments have their defaults specified in the command line rather than in the parameter block (see e.g.

gatk --java-options "-Xmx~{command_mem_mb}m" PreprocessIntervals \
). This makes it easier to glance at the command line and see the defaults.

Not sure if this is the way automatic WDL generation works (haven't tried it myself), but I think it makes sense. And ideally we'd be good about using that functionality that to keep the defaults in sync between the WDL and the jar. Baby steps!

Don't worry about all that for this PR---I'll tackle it in my push. Thanks!

@meganshand meganshand merged commit 663dadc into master Sep 22, 2022
@meganshand meganshand deleted the ms_filtering_updates branch September 22, 2022 14:30
rsasch pushed a commit that referenced this pull request Oct 17, 2022
…input in JointVcfFiltering WDL (#8027)

* Small changes to JointVCFFiltering WDL

* making default for use_allele_specific_annotations

* addressing comments
koncheto-broad added a commit that referenced this pull request Feb 2, 2023
* Added a new suite of tools for variant filtering based on site-level annotations. (#7954)

* Adds wdl that tests joint VCF filtering tools (#7932)

* adding filtering wdl

* renaming pipeline

* addressing comments

* added bash

* renaming json

* adding glob to extract for extra files

* changing dollar signs

* small comments

* Added changes for specifying model backend and other tweaks to WDLs and environment.

* Added classes for representing a collection of labeled variant annotations.

* Added interfaces for modeling and scoring backends.

* Added a new suite of tools for variant filtering based on site-level annotations.

* Added integration tests.

* Added test resources and expected results.

* Miscellaneous changes.

* Removed non-ASCII characters.

* Added documentation for TrainVariantAnnotationsModel and addressed review comments.

Co-authored-by: meganshand <[email protected]>

* Added toggle for selecting resource-matching strategies and miscellaneous minor fixes to new annotation-based filtering tools. (#8049)

* Adding use_allele_specific_annotation arg and fixing task with empty input in JointVcfFiltering WDL (#8027)

* Small changes to JointVCFFiltering WDL

* making default for use_allele_specific_annotations

* addressing comments

* first stab

* wire through WDL changes

* fixed typo

* set model_backend input value

* add gatk_override to JointVcfFiltering call

* typo in indel_annotations

* make model_backend optional

* tabs and spaces

* make all model_backends optional

* use gatk 4.3.0

* no point in changing the table names as this is a POC

* adding new branch to dockstore

* adding in branching logic for classic VQSR vs VQSR-Lite

* implementing the separate schemas for the VQSR vs VQSR-Lite branches, including Java changes necessary to produce the different tsv files

* passing classic flag to indel run of CreateFilteringFiles

* Update GvsCreateFilterSet.wdl

cleaning up verbiage

* Removed mapping error rate from estimate of denoised copy ratios output by gCNV and updated sklearn. (#7261)

* cleanup up sloppy comment

---------

Co-authored-by: samuelklee <[email protected]>
Co-authored-by: meganshand <[email protected]>
Co-authored-by: Rebecca Asch <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants