Added separate allele-count thresholds for the normal and tumor in ModelSegments. #5556

samuelklee · 2019-01-02T20:34:57Z

Also fixed some minor style issues in argument variable names and the WDL.

This should help recover some deletions and might clear up some issues with MAF estimation when the number of hets is small. @LeeTL1220 can you run on some test cases to check the effect? (Note that the changes to fix estimation of the posterior widths, which will in turn affect similar-segment smoothing, are in another branch; we should test those changes as well.)

Note that the default threshold of zero for the tumor in matched-normal mode should ensure that the sites genotyped as het always match in the tumor and the normal. (This will ultimately make multisample segmentation, as enabled by #5524, more straightforward.) There was previously a check for this condition in the integration test; however, it wasn't actually activated by the test data. I could modify the test data to add a proper regression test, but since these test files are generated by running another tool on a test BAM in the repo, this could be misleading. I'm OK with punting in this case.
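To illustrate the point about the zero tumor threshold, here is a minimal sketch of matched-normal het genotyping with separate count thresholds. It is not the ModelSegments implementation: the `Site` class, the method names, and the simple alt-fraction window in `looksHet` are hypothetical stand-ins chosen for this example, and the sketch assumes the tumor and normal are counted at the same sites.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

public class HetGenotypingSketch {

    /** Hypothetical stand-in for one allelic-count record (not the GATK class). */
    static final class Site {
        final String contig;
        final int position;
        final int refCount;
        final int altCount;

        Site(final String contig, final int position, final int refCount, final int altCount) {
            this.contig = contig;
            this.position = position;
            this.refCount = refCount;
            this.altCount = altCount;
        }

        int total() { return refCount + altCount; }

        String key() { return contig + ":" + position; }
    }

    /** Keeps only sites whose total read count is at least the threshold. */
    static List<Site> filterByMinTotalCount(final List<Site> sites, final int minTotalCount) {
        return sites.stream()
                .filter(s -> s.total() >= minTotalCount)
                .collect(Collectors.toList());
    }

    /** Toy het call (an assumption for this sketch): alt fraction between 0.3 and 0.7. */
    static boolean looksHet(final Site s) {
        if (s.total() == 0) {
            return false;
        }
        final double altFraction = (double) s.altCount / s.total();
        return altFraction >= 0.3 && altFraction <= 0.7;
    }

    /** Genotypes hets in the normal, then returns the tumor counts at those sites. */
    static List<Site> genotypeHetsMatchedNormal(final List<Site> tumor,
                                                final List<Site> normal,
                                                final int minTumorCount,
                                                final int minNormalCount) {
        final Set<String> normalHetKeys = filterByMinTotalCount(normal, minNormalCount).stream()
                .filter(HetGenotypingSketch::looksHet)
                .map(Site::key)
                .collect(Collectors.toSet());
        // With minTumorCount == 0, no tumor site is dropped for low depth,
        // so every het site in the normal is also present in the tumor output.
        return filterByMinTotalCount(tumor, minTumorCount).stream()
                .filter(s -> normalHetKeys.contains(s.key()))
                .collect(Collectors.toList());
    }

    public static void main(final String[] args) {
        final List<Site> normal = Arrays.asList(
                new Site("chr1", 100, 20, 20),   // het
                new Site("chr1", 200, 40, 0),    // hom-ref
                new Site("chr1", 300, 15, 17));  // het
        final List<Site> tumor = Arrays.asList(
                new Site("chr1", 100, 25, 22),
                new Site("chr1", 200, 50, 1),
                new Site("chr1", 300, 3, 2));    // low depth, e.g. inside a deletion
        System.out.println(genotypeHetsMatchedNormal(tumor, normal, 0, 30).size());
        System.out.println(genotypeHetsMatchedNormal(tumor, normal, 30, 30).size());
    }
}
```

In this toy data the low-depth tumor site at chr1:300 is kept when the tumor threshold is zero and dropped when the tumor uses the same threshold as the normal, which is the kind of site a deletion would otherwise lose.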

@jonn-smith do you mind reviewing, since this resulted from your turn as liaison? Should be super quick. Thanks again for raising the issue!

codecov-io · 2019-01-04T15:50:11Z

Codecov Report

Merging #5556 into master will increase coverage by <.001%.
The diff coverage is 87.5%.

@@               Coverage Diff               @@
##              master     #5556       +/-   ##
===============================================
+ Coverage     87.082%   87.082%   +<.001%     
+ Complexity     31250     31249        -1     
===============================================
  Files           1915      1915               
  Lines         144178    144180        +2     
  Branches       15910     15910               
===============================================
+ Hits          125553    125555        +2     
  Misses         12839     12839               
  Partials        5786      5786

Impacted Files	Coverage Δ	Complexity Δ
...ute/hellbender/tools/copynumber/ModelSegments.java	`98.077% <87.5%> (-0.467%)`	`45 <3> (-1)`
...nder/utils/runtime/StreamingProcessController.java	`67.773% <0%> (+0.474%)`	`33% <0%> (ø)`	⬇️

samuelklee · 2019-01-04T15:53:16Z

@LeeTL1220 @jonn-smith apologies for the blips in getting tests to pass on Travis, but I think this branch should be OK now.

jonn-smith

One minor question, then if tests pass and @LeeTL1220 validates it (as requested), feel free to merge.

jonn-smith · 2019-01-09T21:36:59Z

src/main/java/org/broadinstitute/hellbender/tools/copynumber/ModelSegments.java

@@ -619,12 +630,14 @@ private AllelicCountCollection genotypeHets(final SampleLocatableMetadata metada

        logger.info("Genotyping heterozygous sites from available allelic counts...");

+        AllelicCountCollection filteredAllelicCounts = allelicCounts;


Is there any reason you moved the declaration/definition up here? You end up setting it a couple lines later, so it doesn't seem to make a difference.

samuelklee · 2019-01-10

Just a very minor matter of style. Now all subsequent transformations after the initial declaration operate on the new variable, so any transformations added later will have an identical format. (Really what happened is that I experimented with adding filtering steps and changing their order, but got bitten when I inadvertently reverted to the original counts in a later step due to a careless copy and paste...)
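The idiom described in this reply can be sketched as follows. This is a generic illustration rather than the code in `genotypeHets`: the class name, the use of plain integer counts, and the second (maximum-count) filter are all hypothetical and exist only to show the pattern.

```java
import java.util.List;
import java.util.stream.Collectors;

public class ChainedFilterSketch {

    /** Applies a sequence of filters, each reading from and writing to the same variable. */
    static List<Integer> filterCounts(final List<Integer> totalCounts,
                                      final int minTotalCount,
                                      final int maxTotalCount) {
        // Declare the working variable once, initialized to the unfiltered input.
        List<Integer> filtered = totalCounts;

        // Every step has the same shape: filtered = transform(filtered).
        // Steps can be reordered or added without accidentally reading the original input.
        filtered = filtered.stream()
                .filter(c -> c >= minTotalCount)
                .collect(Collectors.toList());
        filtered = filtered.stream()
                .filter(c -> c <= maxTotalCount)
                .collect(Collectors.toList());

        return filtered;
    }
}
```

Because no step refers to `totalCounts` after the first line, a copy-and-paste of any step cannot silently discard the earlier filters.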

jonn-smith

Hit wrong button.

One minor question, then if tests pass and @LeeTL1220 validates it (as requested), feel free to merge.

LeeTL1220

@samuelklee I only have a curiosity question. If you can answer @jonn-smith's question, I am fine with this being merged.

LeeTL1220 · 2019-01-10T14:21:57Z

scripts/cnv_wdl/somatic/cnv_somatic_pair_workflow.wdl

-            --number-of-burn-in-samples-copy-ratio ${default=50 num_burn_in_copy_ratio} \
-            --number-of-samples-allele-fraction ${default=100 num_samples_allele_fraction} \
-            --number-of-burn-in-samples-allele-fraction ${default=50 num_burn_in_allele_fraction} \
+            --number-of-samples-copy-ratio ${default="100" num_samples_copy_ratio} \


Not that I mind, but out of curiosity: were the quotes necessary?

samuelklee · 2019-01-10

Just another matter of style; I think these were the only numbers missing quotes in the CNV WDLs.

samuelklee · 2019-01-10T15:03:50Z

Thanks @jonn-smith and @LeeTL1220. Will go ahead and merge, but I think we should still check the effect on MAF estimates, etc. in the evaluation/real runs at some point in the future.

samuelklee assigned LeeTL1220 and jonn-smith Jan 2, 2019

samuelklee requested a review from jonn-smith January 2, 2019 20:36

samuelklee changed the title ~~Added separate allele-count thresholds for the normal and tumor in ModelSegments and fixed logic for genotyping of sites.~~ Added separate allele-count thresholds for the normal and tumor in ModelSegments. Jan 2, 2019

samuelklee force-pushed the sl_gtfix branch from 490eb25 to 7fa9c39 Compare January 3, 2019 14:14

Added separate allele-count thresholds for the normal and tumor in Mo…

301c6bc

…delSegments.

samuelklee force-pushed the sl_gtfix branch from 7fa9c39 to 301c6bc Compare January 4, 2019 14:41

jonn-smith reviewed Jan 9, 2019

jonn-smith approved these changes Jan 9, 2019

LeeTL1220 approved these changes Jan 10, 2019

samuelklee merged commit c3e9818 into master Jan 10, 2019

samuelklee deleted the sl_gtfix branch January 10, 2019 15:03

samuelklee mentioned this pull request Jan 14, 2019

Added MinibatchSliceSampler and replaced naive subsampling in ModelSegments. #5575

Merged

samuelklee mentioned this pull request Jan 27, 2022

Added numerical-stability tests and updated test data for all ModelSegments single-sample and multiple-sample modes. #7652

Merged
