-
Notifications
You must be signed in to change notification settings - Fork 594
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Strand based annotations will use both reads in an overlapping read pair #5286
Conversation
@davidbenjamin will you review? |
🍕 🎉 🎈 ! |
@takutosato Tests appear to be failing, can you take a look at this? |
@fleharty fixed the bug, the tests should pass now |
@takutosato One of the builds failed, are you sure that fixed the bug? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good! You can merge after addressing trivial comments.
populateMapIfEmpty(refReads); | ||
|
||
// assume that the mate of a forward read is a reverse read and vice versa | ||
final int numDiscardedFwdAltMates = (int) altReads.get(Strand.REV).stream().filter(ba -> ba.read.hasAttribute(StrandBiasTest.DISCARDED_MATE_READ_TAG)).count(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Extract a method int countDiscardedMates
for this logic even though it's a one-liner.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And when you extract such a method you could put in logic for when there are no reads with the given strand, thereby letting you do away withpopulateMapIfEmpty
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
@@ -21,6 +21,7 @@ | |||
|
|||
protected static final int ARRAY_DIM = 2; | |||
protected static final int ARRAY_SIZE = ARRAY_DIM * ARRAY_DIM; | |||
public static final String DISCARDED_MATE_READ_TAG = "DM"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This constant belongs in the class that applies the tag.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
// This read's mate got discarded by SomaticGenotypingEngine::clipOverlappingReads() | ||
if (read.hasAttribute(DISCARDED_MATE_READ_TAG)){ | ||
// Note that if this read is forward, then we increment its mate which we assume is reverse | ||
table[offset + (isFW ? 1 : 0)]++; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
isFW
--> isForward
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
@@ -123,4 +124,7 @@ | |||
@Argument(fullName = SMITH_WATERMAN_LONG_NAME, doc = "Which Smith-Waterman implementation to use, generally FASTEST_AVAILABLE is the right choice", optional = true) | |||
public SmithWatermanAligner.Implementation smithWatermanImplementation = SmithWatermanAligner.Implementation.JAVA; | |||
|
|||
@Argument(fullName = CORRECT_OVERLAPPING_BASE_QUALITIES_LONG_NAME) | |||
public boolean correctOverlappingBaseQualities = true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The convention for boolean arguments is that specifying them yields non-default behavior.
@@ -169,4 +171,12 @@ public double getDefaultAlleleFrequency() { | |||
@Argument(fullName = GET_AF_FROM_AD_LONG_NAME, doc="Use allelic depth to calculate tumor allele fraction; recommended for mitochondrial applications", optional = true) | |||
public boolean calculateAFfromAD = false; | |||
|
|||
/** | |||
* If set to true, count overlapping read pair as two seprate reads instead of one for {@link StrandArtifact} and {@link StrandBiasBySample} | |||
* This would break the independence assumption of reads and over-count the alt depth in these annotations. On the other hand it could |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment can be more direct about the fact that it is, in fact, the objectively correct behavior when it comes to strand artifacts.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
@@ -101,7 +102,7 @@ public CalledHaplotypes callMutations( | |||
|
|||
for( final int loc : startPosKeySet ) { | |||
final List<VariantContext> eventsAtThisLoc = getVariantContextsFromActiveHaplotypes(loc, haplotypes, false); | |||
final VariantContext mergedVC = AssemblyBasedCallerUtils.makeMergedVariantContext(eventsAtThisLoc); | |||
VariantContext mergedVC = AssemblyBasedCallerUtils.makeMergedVariantContext(eventsAtThisLoc); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can't this remain final
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes it can
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes it can
final byte altQuality = 50; | ||
final byte refQuality = 30; | ||
|
||
final File samFile = File.createTempFile("liquid-biopsy", ".bam"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This setting-up could go in a @BeforeClass
method.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried to use @BeforeClass
and ran into the issue where the @BeforeClass
method takes in parameters (numAltPairs
and refDepth
) that I later use in Assert
. Hardcoding them in the @BeforeClass
method wouldn't be ideal. I could make them class-wide variables, since there are only two, but widening the scope of variables like that isn't ideal, either. So far I just extracted the sam file creation method without adding the @BeforeClass
tag (in my latest commit). We can talk more tomorrow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@takutosato That works for me, no need to shoehorn it into @BeforeClass
.
0dd851a
to
b800fc4
Compare
…tions. 2. Add an option to turn off base quality correction before local reassembly
b800fc4
to
d3d5f3e
Compare
Thanks Takuto! 🎈 |
…tions. 2. Add an option to turn off base quality correction before local reassembly (broadinstitute#5286)
This PR includes two changes: