-
Notifications
You must be signed in to change notification settings - Fork 594
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix integer overflow bug in RMSMappingQuality #5435
Conversation
Updated RMS Mapping quality annotations to be of type long/Long instead of int/Integer. With int types, a large cohort could overflow Integer.MAX_VALUE when handling sum of squared mapping qualities. With long types this should not be a problem until we have off-world colonies. This resolves issue 5433.
I did one thing I expect may be controversial, I implemented getAttributeAsLongList() in RMSMappingQuality. It's unlikely that ints would overflow in that particular case, so it might be possible to go without doing that. The cleaner alternative would be to update VariantContext and CommonInfo, but they were in htsjdk, so I wasn't sure how to proceed there. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks lovely, thanks for the quick fix! I have one refactoring suggestion -- take it or leave it.
* that ints will overlow beyond Integer.MAX_VALUE | ||
* @return VariantContext attribute indexed by key, as list of long. | ||
*/ | ||
static private List<Long> getAttributeAsLongList(final VariantContext vc, final String key, final Long defaultValue) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with your reticence to put this into htsjdk. The type handling there is super awkward, but I don't think it's worth dealing with until it gets improved in 3.0. However, I might suggest making this method public and moving it to a utility class. GATKProtectedVariantContextUtils
has a lot of similar methods
I made the change you suggested but just realized I forgot to push before
leaving for my flight. I'll push when I get back.
…On Tue, Nov 20, 2018, 12:04 PM ldgauthier ***@***.*** wrote:
***@***.**** commented on this pull request.
Looks lovely, thanks for the quick fix! I have one refactoring suggestion
-- take it or leave it.
------------------------------
In
src/main/java/org/broadinstitute/hellbender/tools/walkers/annotator/RMSMappingQuality.java
<#5435 (comment)>:
> return Arrays.asList(squareSum,totalDP);
} catch (final NumberFormatException e) {
throw new UserException.BadInput("malformed " + GATKVCFConstants.RAW_RMS_MAPPING_QUALITY_KEY + " annotation: " + rawDataString);
}
}
+ /**
+ * Private getter function to replace VariantContext::getAttributeAsIntList in instances where there is a chance
+ * that ints will overlow beyond Integer.MAX_VALUE
+ * @return VariantContext attribute indexed by key, as list of long.
+ */
+ static private List<Long> getAttributeAsLongList(final VariantContext vc, final String key, final Long defaultValue) {
I agree with your reticence to put this into htsjdk. The type handling
there is super awkward, but I don't think it's worth dealing with until it
gets improved in 3.0. However, I might suggest making this method public
and moving it to a utility class. GATKProtectedVariantContextUtils has a
lot of similar methods
—
You are receiving this because you authored the thread.
Reply to this email directly, view it on GitHub
<#5435 (review)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AGKhCBnSuwz30W-52kw9uZTXywWUCua-ks5uxDYbgaJpZM4Yp2Gl>
.
|
Just pushed change. I also added a getAttributeAsLong function. |
Utils.nonNull(attribute); | ||
return variantContext.getAttributeAsList(attribute).stream().map( | ||
x -> { | ||
if (x == null || x == VCFConstants.MISSING_VALUE_v4) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a bug reported in htsjdk for code that's (nearly) identical to this in CommonInfo.getAttributeAsIntList()
, samtools/htsjdk#1228
It would be nice if there was a way to avoid the duplicate code so we don't have to fix the same bug everywhere.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice catch noticing this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm willing to fix things in htsjdk if that's the decision that people reach. In the meantime I corrected the equality test for the new getter functions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given that htsjdk 3.0 is going to have a huge type overhaul, I didn't want to agonize over adding a new method to htsjdk. @lbergelson thoughts?
I created Issue #5454 to migrate the attribute getters from GATKProtectedVariantContextUtils to VariantContext in HTSJDK. I assigned @lbergelson because it could conflict with samtools/htsjdk#1228 and his ongoing work on HTSJDK, but I'm happy to help if that's wanted. |
Updated RMS Mapping quality annotations to be of type long/Long instead
of int/Integer. With int types, a large cohort could overflow
Integer.MAX_VALUE when handling sum of squared mapping qualities. With
long types this should not be a problem until we have off-world
colonies. This resolves issue 5433.