
gCNV nan errors #4824

Closed
mwalker174 opened this issue May 29, 2018 · 14 comments · Fixed by #6245

Comments

@mwalker174 (Contributor)

@samuelklee @asmirnov239 @mbabadi I tried to run a 30-sample cohort through gCNV on all canonical chromosomes with 250bp bins sharded in 10k-interval blocks, but PostprocessGermlineCNVCalls gave the following error:

19:26:14.967 INFO  PostprocessGermlineCNVCalls - Analyzing shard 223...
19:26:15.107 INFO  PostprocessGermlineCNVCalls - Analyzing shard 224...
19:26:15.259 INFO  PostprocessGermlineCNVCalls - Analyzing shard 225...
19:26:15.260 INFO  PostprocessGermlineCNVCalls - Shutting down engine
[May 29, 2018 7:26:15 PM UTC] org.broadinstitute.hellbender.tools.copynumber.PostprocessGermlineCNVCalls done. Elapsed time: 3.34 minutes.
Runtime.totalMemory()=39753089024
***********************************************************************

A USER ERROR has occurred: Bad input: Validation error occurred on line %d of the posterior file: Posterior probabilities for at at least one posterior record do not sum up to one.

After inspecting the output from shard 225, it seems that the model starts producing NaN values after ~1600 warmup iterations (judging from the ELBO log). This shard corresponds to the pericentromeric region chr3:91540501-94090250.
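(A quick way to locate where the trace goes bad is to grep the shard's inference log for NaNs; the path below is just a placeholder for wherever the GermlineCNVCaller ELBO trace was captured.)

    # Hypothetical log path; point this at the file holding the shard's ELBO trace.
    grep -n -i -m 5 'nan' shard-225/gcnv-inference.log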

It would be nice to have the option to bypass this error in PostprocessGermlineCNVCalls.

Here is the model config for the shard:

 "p_active": 0.01,
 "cnv_coherence_length": 10000.0,
 "class_coherence_length": 10000.0,
 "max_copy_number": 5,
 "num_calling_processes": 1,
 "num_copy_number_states": 6,
 "num_copy_number_classes": 2
 "max_bias_factors": 5,
 "mapping_error_rate": 0.01,
 "psi_t_scale": 0.001,
 "psi_s_scale": 0.0001,
 "depth_correction_tau": 10000.0,
 "log_mean_bias_std": 0.1,
 "init_ard_rel_unexplained_variance": 0.1,
 "num_gc_bins": 20,
 "gc_curve_sd": 1.0,
 "q_c_expectation_mode": "hybrid",
 "active_class_padding_hybrid_mode": 50000,
 "enable_bias_factors": false,
 "enable_explicit_gc_bias_modeling": false,
 "disable_bias_factors_in_active_class": false
 "version": "0.7"
@ldgauthier (Contributor)

How did you guys generate the target list for hg38? I've been having some problems with regions near the centromeres for SNPs and indels as well. The centromeres.bed for hg38 from UCSC seems to include the computationally generated centromeres, but not the additional gross regions nearby that we excluded from b37. Laurent was excluding a fair amount of territory beyond the "official" centromeres for his QC based on the density of multi-allelic variant calls.

@samuelklee (Contributor)

@mbabadi We should look into ways to be more robust against NaNs, but I think we should just go ahead and blacklist these regions. This can be done from the outset of the pipeline via the -XL argument to PreprocessIntervals. Does the SV team have a canonical list we can start recommending? It looks like http://cf.10xgenomics.com/supp/genome/GRCh38/sv_blacklist.bed may also be a good option. Perhaps we can add some padding if necessary.
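For example, something along these lines (paths are placeholders; -XL/--exclude-intervals, --bin-length, and --padding are standard PreprocessIntervals arguments):

    # Sketch: exclude a blacklist BED up front when generating 250bp bins.
    gatk PreprocessIntervals \
        -R hg38.fasta \
        --bin-length 250 \
        --padding 0 \
        -XL sv_blacklist.bed \
        -O cohort.preprocessed.interval_list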

@mwalker174 (Contributor, Author)

For SVs, we are not blacklisting any regions except sometimes gaps and centromeres. Unfortunately many of the events occur in messy areas like this and I think it’s going to be a major issue if we can’t guarantee that the model will be robust in such regions.

@samuelklee (Contributor)

@mwalker174 the region you found above is included in the 10X SV blacklist. What list is the SV team currently using?

If the read-depth data is not reliable in these regions, I would not expect the model fit to be very good, even if we made the model more robust against NaNs. So I wouldn't think the results would be very useful for SV integration. Is there a way to make CNV-SV integration "Bayesian" in the sense that we could fall back on a prior in the case of missing CNV data?

@mwalker174 (Contributor, Author)

@samuelklee We aren't using any blacklists currently. I am less concerned about noisy calls because they usually don't line up well with read-pair evidence. That said, I did not get NaN errors with 1 kbp bins; perhaps we could bump up the bin sizes in regions where this happens?

@cwhelan (Member) commented May 30, 2018

@samuelklee In my experience the 10x blacklist is extremely conservative and will likely exclude many regions of common copy number variation. I'd recommend something less restrictive for general use.

@mwalker174 (Contributor, Author)

There were more NaNs in these other chunks, all of which overlap the UCSC centromere regions:

chr9:43318001-60923750
chr10:39050001-41716500
chr11:50778501-53535250
chr17:22749251-25299000
chr18:17216251-19716250

I'm going to try again with centromeres blacklisted.
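(For reference, one way to pull those intervals into a BED usable with -XL is the UCSC public MySQL server; the host name and the hg38 "centromeres" table below are from memory and worth double-checking.)

    # Sketch: dump the hg38 centromere intervals to a BED file for -XL.
    mysql --user=genome --host=genome-mysql.soe.ucsc.edu -A -N -D hg38 \
        -e 'SELECT chrom, chromStart, chromEnd FROM centromeres;' > centromeres.hg38.bed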

@mwalker174 (Contributor, Author)

Blacklisting centromeres resolved the NaN errors except for one block on chrY that roughly corresponds to region q11.23. I blacklisted that block and got no more errors.

@samuelklee (Contributor)

My guess is that these regions have unusually high coverage, which is probably yielding NaN likelihoods. @mwalker174 any way you can check this from your previous runs?
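(A rough way to check this from a CollectReadCounts TSV, assuming the usual CONTIG/START/END/COUNT columns and a placeholder file name:)

    # Sketch: mean raw count over the failing pericentromeric shard chr3:91540501-94090250.
    awk -F'\t' '$1 == "chr3" && $2 >= 91540501 && $3 <= 94090250 { n++; s += $4 }
        END { if (n) printf "intervals=%d mean_count=%.1f\n", n, s / n }' sample.counts.tsv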

Our philosophy so far has been to keep the tools relatively agnostic by allowing generic blacklisting via -XL and pushing the responsibility to the users.

@samuelklee (Contributor)

@mwalker174 can you go back and check whether high coverage was causing the NaNs?

samuelklee removed their assignment on Feb 1, 2019
@samuelklee (Contributor)

@mwalker174 does this need to be addressed?

@mwalker174 (Contributor, Author)

@samuelklee Yes. Looking forward, we will want to reduce the extent of our blacklist and interval filtering, which are currently needed to prevent these errors.

@samuelklee (Contributor) commented Oct 30, 2019

@mwalker174 let's check whether these NaNs were caused by high coverage. Perhaps we can address them along with those due to vanishing overdispersion (which were caused by large values of interval-psi-scale).
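(For reference, interval-psi-scale is exposed directly on the GermlineCNVCaller command line, so an overly large value can be dialed back without editing the config; the invocation below is only a sketch with placeholder paths, and 1e-4 is an illustrative value, not a recommendation.)

    # Sketch: rerun a shard with a reduced interval-psi-scale.
    gatk GermlineCNVCaller \
        --run-mode COHORT \
        -L shard_225.interval_list \
        --interval-merging-rule OVERLAPPING_ONLY \
        --contig-ploidy-calls ploidy-calls \
        --interval-psi-scale 1e-4 \
        -I sample_1.counts.hdf5 -I sample_2.counts.hdf5 \
        --output gcnv-cohort --output-prefix shard_225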

@samuelklee (Contributor)

This is at least partially addressed in #6245; we can reopen if there are other NaNs that we have to patch.
