Refactor two VariantEval methods to allow subclasses to override #5998

Merged
merged 1 commit into from
Jun 21, 2019

Contributor

@bbimber bbimber commented Jun 12, 2019

We have a tool, VariantQC, that extends VariantEval. This PR is a minor refactor to expose the code that creates the list of VariantStratifier and VariantEvaluator objects as protected methods, so subclasses could modify them. This should have no functional difference on VariantEval itself. We're hoping to use these changes in order to adapt our tool in response to reviewers, so if there is any way to push these changes we would appreciate it.

@bbimber
Contributor Author

bbimber commented Jun 12, 2019

FWIW, @cmnbroad is who I worked with on the earlier VariantEval changes.

@cmnbroad cmnbroad self-assigned this Jun 12, 2019
@codecov

codecov bot commented Jun 12, 2019

Codecov Report

Merging #5998 into master will decrease coverage by 6.984%.
The diff coverage is 50%.

@@               Coverage Diff               @@
##              master     #5998       +/-   ##
===============================================
- Coverage     86.929%   79.945%   -6.984%     
+ Complexity     32765     30984     -1781     
===============================================
  Files           2016      2016               
  Lines         151460    151466        +6     
  Branches       16628     16628               
===============================================
- Hits          131663    121090    -10573     
- Misses         13732     24549    +10817     
+ Partials        6065      5827      -238
| Impacted Files | Coverage Δ | Complexity Δ |
|---|---|---|
| ...lkers/varianteval/evaluators/VariantEvaluator.java | 70% <0%> (-12.353%) | 12 <0> (ø) |
| ...lbender/tools/walkers/varianteval/VariantEval.java | 88.079% <100%> (+0.04%) | 109 <1> (+1) ⬆️ |
| ...ls/walkers/varianteval/util/EvaluationContext.java | 76.316% <50%> (-4.24%) | 12 <1> (ø) |
| ...dorientation/CollectF1R2CountsIntegrationTest.java | 0.714% <0%> (-99.286%) | 1% <0%> (-14%) |
| ...kers/filters/VariantFiltrationIntegrationTest.java | 0.826% <0%> (-99.174%) | 1% <0%> (-25%) |
| .../walkers/bqsr/BaseRecalibratorIntegrationTest.java | 1.031% <0%> (-98.969%) | 1% <0%> (-7%) |
| ...s/variantutils/VariantsToTableIntegrationTest.java | 1.042% <0%> (-98.958%) | 1% <0%> (-21%) |
| ...ers/vqsr/FilterVariantTranchesIntegrationTest.java | 1.053% <0%> (-98.947%) | 1% <0%> (-5%) |
| ...on/FindBreakpointEvidenceSparkIntegrationTest.java | 1.754% <0%> (-98.246%) | 1% <0%> (-6%) |
| ...bender/tools/spark/PileupSparkIntegrationTest.java | 2.041% <0%> (-97.959%) | 2% <0%> (-13%) |
| ... and 173 more | | |

@bbimber bbimber force-pushed the VariantEvalMethods branch from c382256 to 8bc60eb Compare June 13, 2019 18:23
@cmnbroad
Collaborator

I restarted the Travis build since the one failure seems to be an unrelated transient issue.

@bbimber bbimber force-pushed the VariantEvalMethods branch from c9936e1 to a4c2910 Compare June 14, 2019 20:49
@bbimber
Contributor Author

bbimber commented Jun 14, 2019

OK, thanks. I tried to keep edits here limited and protected. I'm happy to describe more about what I'm trying to do in VariantQC if that's helpful.

Also, I have not forgotten about trying to refactor VariantQC to better handle arguments (i.e. don't pass the walker to the VariantEvaluator) and to separate out a VariantEvalEngine class, somewhat like VariantAnnotationEngine.

@cmnbroad
Collaborator

cmnbroad commented Jun 17, 2019

@bbimber Yes, it might help if you could elaborate a bit on what you're trying to do. Specifically, I'd like to find an alternative to getStratifierClasses in VariantEvalUtils that doesn't require handing out a map, which should be fairly easy, but it would help if I knew how you were going to use it. Thanks.

@bbimber
Contributor Author

bbimber commented Jun 18, 2019

@cmnbroad : if getStratifierClasses() is the only sticking point, we can drop that.

Stepping back: as you probably know, we have a tool, VariantQC, which basically sets up a number of instances of VariantEval and uses them to aggregate data as it iterates over a VCF. This allows the tool to capture data aggregated/stratified at multiple levels with one pass through the VCF.

There are two related aims:

  1. My tool needs to know the allowable stratification classes. Instead of copying and pasting the reflection code that finds those classes, this PR exposed that getter as a public method. I'm not sure I understand why this is a sticking point, but we can remove it without much of a problem for me. Like we discussed earlier, VariantEval should get refactored into a walker class and some kind of VariantEvalEngine class, and that refactor might be the time to address my concern here. This is not actively blocking us. I could also make this a protected getter on VariantEval if the public aspect is what you don't like.

  2. The remaining changes serve a different purpose. Most VariantEvaluator classes are 'dumb' in that they are instantiated with no configuration and always aggregate the same fields. In our tool, we wanted to let the user specify a list of INFO fields to aggregate (i.e. we don't know the target until runtime). We wrote an InfoFieldAggregator class, which is instantiated with the name of an INFO field. This lets our code create multiple instances of that VariantEvaluator, potentially summarizing different fields. The remaining changes are designed to enable this.
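
A minimal sketch of the second aim, for readers who want to see the shape of it. It assumes simplified stand-in classes: SketchEvaluator, SketchInfoFieldAggregator and SketchAggregatorFactory are hypothetical names, not GATK classes, and a variant is reduced to a map of INFO key/value strings. The point is only that an evaluator configured with a field name at runtime needs a constructor that accepts a name, and that several instances of the same class can then coexist.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for VariantEvaluator; only the parts needed for this sketch.
abstract class SketchEvaluator {
    private final String simpleName;

    // Lets a subclass choose its own name instead of deriving it from the class.
    protected SketchEvaluator(final String simpleName) {
        this.simpleName = simpleName;
    }

    String getSimpleName() {
        return simpleName;
    }

    // Called once per variant with that variant's INFO key/value pairs (an assumption of this sketch).
    abstract void update(Map<String, String> infoFields);
}

// Evaluator configured at runtime with the INFO field it should summarize.
class SketchInfoFieldAggregator extends SketchEvaluator {
    private final String infoField;
    private final Map<String, Integer> counts = new LinkedHashMap<>();

    SketchInfoFieldAggregator(final String infoField) {
        // Include the field in the name so two instances stay distinguishable.
        super("InfoFieldAggregator_" + infoField);
        this.infoField = infoField;
    }

    @Override
    void update(final Map<String, String> infoFields) {
        final String value = infoFields.get(infoField);
        if (value != null) {
            counts.merge(value, 1, Integer::sum); // count occurrences of each observed value
        }
    }

    Map<String, Integer> getCounts() {
        return counts;
    }
}

class SketchAggregatorFactory {
    // One evaluator instance per user-requested INFO field.
    static List<SketchInfoFieldAggregator> forFields(final List<String> fields) {
        final List<SketchInfoFieldAggregator> result = new ArrayList<>();
        for (final String field : fields) {
            result.add(new SketchInfoFieldAggregator(field));
        }
        return result;
    }
}
```

With two requested fields, the factory yields two independent aggregators that see the same variants but summarize different INFO keys.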

Collaborator

@cmnbroad cmnbroad left a comment

A few simple requests.

stratManager.set(i, ec);
}
}

protected EvaluationContext getEvaluationContext(final Set<Class<? extends VariantEvaluator>> evaluationObjects) {
Collaborator

Maybe this would be better named createEvaluationContext, and add javadoc describing what implementers should do, i.e. that it should always create a new instance (unless there is some compelling reason to be more permissive?).

Contributor Author

ok
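
A rough sketch of what the requested rename and javadoc could look like, assuming simplified stand-ins (SketchContext, SketchWalker and SketchQcWalker are hypothetical, and evaluators are reduced to names). The contract stated in the javadoc is the one suggested above: each call creates a new instance.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical, simplified stand-in for EvaluationContext.
class SketchContext {
    private final Set<String> evaluatorNames;

    SketchContext(final Set<String> evaluatorNames) {
        // Defensive copy so the context never shares state with the caller's set.
        this.evaluatorNames = new LinkedHashSet<>(evaluatorNames);
    }

    Set<String> getEvaluatorNames() {
        return evaluatorNames;
    }
}

// Hypothetical, simplified stand-in for the walker.
class SketchWalker {
    /**
     * Create the evaluation context for one stratification state.
     * Implementations should always create and return a new instance;
     * contexts must not be cached or shared between calls.
     *
     * @param evaluatorNames the evaluators the new context should contain
     * @return a newly created context
     */
    protected SketchContext createEvaluationContext(final Set<String> evaluatorNames) {
        return new SketchContext(evaluatorNames);
    }
}

// A subclass customizes the context but keeps the "always new" contract.
class SketchQcWalker extends SketchWalker {
    @Override
    protected SketchContext createEvaluationContext(final Set<String> evaluatorNames) {
        final SketchContext context = super.createEvaluationContext(evaluatorNames);
        context.getEvaluatorNames().add("InfoFieldAggregator_AC"); // illustrative extra evaluator
        return context;
    }
}
```

The "create" prefix signals to implementers that returning a cached object would be a bug, which "get" does not.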

@@ -11,6 +11,10 @@
private VariantEval walker;
private final String simpleName;

protected VariantEvaluator(String simpleName) {
this.simpleName = simpleName;
}
Collaborator

The other constructor (below) should delegate to this one now: this(getClass().getSimpleName()), and this should have javadoc.

Contributor Author

Good point, but getClass() can't be called before the supertype constructor is called? Is there a way around that?

Collaborator

I think it should be fine.
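
For reference on the delegation question: as far as the Java language rules go, the arguments of an explicit this(...) or super(...) call may not use instance methods of the object under construction, so this(getClass().getSimpleName()) is rejected by javac ("cannot reference this before supertype constructor has been called"). One way to get the same effect without delegating is to assign the field directly in the no-arg constructor. The sketch below uses hypothetical stand-in classes (NamedEvaluator, CountVariantsSketch, InfoFieldSketch), not the GATK code.

```java
// Hypothetical stand-in for VariantEvaluator, showing the two constructors only.
abstract class NamedEvaluator {
    private final String simpleName;

    // Default: name the evaluator after its concrete class.
    // Delegating with this(getClass().getSimpleName()) does not compile,
    // so the final field is assigned here directly instead.
    protected NamedEvaluator() {
        this.simpleName = getClass().getSimpleName();
    }

    // Lets a subclass supply its own name, e.g. one that includes a runtime parameter.
    protected NamedEvaluator(final String simpleName) {
        this.simpleName = simpleName;
    }

    String getSimpleName() {
        return simpleName;
    }
}

// Uses the no-arg constructor; getClass() resolves to the concrete subclass at runtime.
class CountVariantsSketch extends NamedEvaluator {
}

// Uses the named constructor to carry its configuration in the name.
class InfoFieldSketch extends NamedEvaluator {
    InfoFieldSketch(final String field) {
        super("InfoField_" + field);
    }
}
```

Both constructors assign the final field exactly once, so the compiler's definite-assignment check is still satisfied.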

-private final ArrayList<VariantEvaluator> evaluationInstances;
-private final Set<Class<? extends VariantEvaluator>> evaluationClasses;
+protected final List<VariantEvaluator> evaluationInstances;
+protected final Set<Class<? extends VariantEvaluator>> evaluationClasses;
Collaborator

Let's keep these private and add protected getEvaluationClasses and getEvaluationInstances methods to return them, with javadoc that includes stating that the return values can be null.

Contributor Author

OK. Is there a reason GATK doesn't use @Nullable more?

Contributor Author

OK, I got it now, disregard the question about @Nullable here. Nonetheless, that annotation does seem under-used across GATK.
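
A small sketch of the requested pattern: the field stays private and subclasses go through a protected getter whose javadoc documents that the result may be null. ContextSketch and QcContextSketch are hypothetical stand-ins, and the rule that an uninitialized context holds a null list is an assumption made for this sketch only; the thread does not show when the real fields are null.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-in for EvaluationContext.
class ContextSketch {
    // Stays private; subclasses read it through the protected getter below.
    private final List<String> evaluationInstances;

    ContextSketch(final boolean doInitialize) {
        // Assumption for this sketch only: an uninitialized context has no instance list.
        this.evaluationInstances = doInitialize ? new ArrayList<>() : null;
    }

    /**
     * @return the evaluator instances held by this context; may be null if the
     *         context was created without initialization
     */
    protected List<String> getEvaluationInstances() {
        return evaluationInstances;
    }
}

class QcContextSketch extends ContextSketch {
    QcContextSketch(final boolean doInitialize) {
        super(doInitialize);
    }

    // Subclass code checks for null before touching the list, as the javadoc warns.
    int addEvaluator(final String name) {
        final List<String> instances = getEvaluationInstances();
        if (instances == null) {
            return 0;
        }
        instances.add(name);
        return instances.size();
    }
}
```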

@@ -26,7 +27,7 @@ public EvaluationContext(final VariantEval walker, final Set<Class<? extends Var
private EvaluationContext(final VariantEval walker, final Set<Class<? extends VariantEvaluator>> evaluationClasses, final boolean doInitialize) {
this.walker = walker;
this.evaluationClasses = evaluationClasses;
-this.evaluationInstances = new ArrayList<VariantEvaluator>(evaluationClasses.size());
+this.evaluationInstances = new ArrayList<>(evaluationClasses.size());
Collaborator

Thanks.

*
* @return An unmodifiable map of all VariantStratifier classes
*/
public static Map<String, Class<? extends VariantStratifier>> getStratifierClasses() {
Collaborator

Since this isn't required, let's remove it. To answer your question, I would generally avoid handing out internal structures like this, especially Maps, since it's not clear from the types what they contain. There are several alternative ways to do this, and I know this code already does similar things all over the place (extreme example being bindVariantContexts), but let's not expose any more if we can avoid it. As you mentioned, the refactoring is the right way to address it. I think that would be a fair amount of work, but if we want to continue to make changes to this code it will probably be necessary.

Contributor Author

ok
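
The review does not say which alternative was intended, so the following is only one possible shape, sketched under assumptions: keep the map private and expose narrow accessors whose types say what they return. All names here (SketchStratifier, SampleSketch, FilterSketch, SketchStratifierRegistry) are hypothetical, and the stratifiers are registered by hand rather than discovered by reflection as in the real code.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

// Hypothetical stand-ins for VariantStratifier and two of its implementations.
interface SketchStratifier { }
class SampleSketch implements SketchStratifier { }
class FilterSketch implements SketchStratifier { }

final class SketchStratifierRegistry {
    // Internal structure; never returned to callers.
    private static final Map<String, Class<? extends SketchStratifier>> BY_NAME = new LinkedHashMap<>();

    static {
        // Registered by hand in this sketch; the real code finds classes by reflection.
        BY_NAME.put("Sample", SampleSketch.class);
        BY_NAME.put("Filter", FilterSketch.class);
    }

    private SketchStratifierRegistry() { }

    /** @return a read-only view of the names of all known stratifiers */
    static Set<String> getStratifierNames() {
        return Collections.unmodifiableSet(BY_NAME.keySet());
    }

    /**
     * @param name a stratifier name as returned by getStratifierNames
     * @return the stratifier class registered under that name, or empty if the name is unknown
     */
    static Optional<Class<? extends SketchStratifier>> findStratifier(final String name) {
        return Optional.ofNullable(BY_NAME.get(name));
    }
}
```

A caller that only needs to validate user-supplied stratification names can use getStratifierNames and never sees the map's key or value conventions.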

@bbimber
Contributor Author

bbimber commented Jun 19, 2019

@cmnbroad thanks for the quick review. I just pushed those changes.

@cmnbroad
Collaborator

Restarting tests...

@bbimber
Contributor Author

bbimber commented Jun 20, 2019

@cmnbroad are these codecov results an actual problem or something incorrect in how it's running?

Collaborator

@cmnbroad cmnbroad left a comment

I wouldn't worry about the code cov diffs.

* @param evaluationObjects
* @return
*/
protected EvaluationContext createEvaluationContext(final Set<Class<? extends VariantEvaluator>> evaluationObjects) {
Collaborator

It's pro forma, but please add descriptions of the param and return.

Contributor Author

ok

public Set<Class<? extends VariantEvaluator>> getEvaluationClasses() {
return evaluationClasses;
}

Collaborator

Fill out @return.

@bbimber bbimber force-pushed the VariantEvalMethods branch from f5ccb87 to fb3421a Compare June 20, 2019 20:01
@bbimber
Contributor Author

bbimber commented Jun 20, 2019

@cmnbroad OK, that's added. commits also squashed

@cmnbroad
Collaborator

Thanks @bbimber.

@cmnbroad cmnbroad merged commit 68715da into broadinstitute:master Jun 21, 2019
@bbimber bbimber deleted the VariantEvalMethods branch June 21, 2019 15:52