Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding command line exclusion lists, so that users can prune fields from output. #5226

Merged
merged 2 commits into from
Sep 29, 2018

Conversation

LeeTL1220
Copy link
Contributor

@LeeTL1220 LeeTL1220 commented Sep 26, 2018

…. Added some developer utilities for ease of testing and creating new Funcotations.
@codecov-io
Copy link

codecov-io commented Sep 26, 2018

Codecov Report

Merging #5226 into master will increase coverage by 0.007%.
The diff coverage is 95.946%.

@@               Coverage Diff               @@
##              master     #5226       +/-   ##
===============================================
+ Coverage     86.798%   86.804%   +0.007%     
- Complexity     29711     29763       +52     
===============================================
  Files           1820      1820               
  Lines         137406    137644      +238     
  Branches       15160     15177       +17     
===============================================
+ Hits          119265    119481      +216     
- Misses         12637     12651       +14     
- Partials        5504      5512        +8
Impacted Files Coverage Δ Complexity Δ
...ools/funcotator/FuncotatorArgumentDefinitions.java 86.364% <ø> (ø) 1 <0> (ø) ⬇️
.../hellbender/tools/funcotator/FuncotatorEngine.java 85.965% <ø> (ø) 35 <0> (ø) ⬇️
...dataSources/gencode/GencodeFuncotationBuilder.java 96.923% <ø> (ø) 31 <0> (ø) ⬇️
...itute/hellbender/tools/funcotator/Funcotation.java 50% <ø> (-5.556%) 3 <0> (-2)
...dataSources/vcf/VcfFuncotationFactoryUnitTest.java 98.039% <100%> (-0.056%) 29 <0> (-1)
...ncotator/dataSources/TableFuncotationUnitTest.java 97.753% <100%> (ø) 18 <1> (ø) ⬇️
...otator/dataSources/gencode/GencodeFuncotation.java 63.274% <100%> (+0.221%) 226 <0> (+1) ⬆️
...e/hellbender/tools/funcotator/FuncotatorUtils.java 80.718% <100%> (+0.26%) 180 <6> (+5) ⬆️
...tools/funcotator/FuncotatorArgumentCollection.java 100% <100%> (ø) 1 <0> (ø) ⬇️
...ataSources/gencode/GencodeFuncotationUnitTest.java 98.519% <100%> (ø) 12 <1> (ø) ⬇️
... and 20 more

/**
* List of final rendered fields to exclude from a final rendering
*/
public static final String EXCLUSION_FIELDS_LONG_NAME = "exclusion-list";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure exclusion-list is a clear name. Maybe something like "--skip-field" would be clearer.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't really bother me, but I think exclude-field is a little more self-explanatory.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like --exclude-field

Done

Copy link
Collaborator

@jonn-smith jonn-smith left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some minor requests and questions.

/**
* List of final rendered fields to exclude from a final rendering
*/
public static final String EXCLUSION_FIELDS_LONG_NAME = "exclusion-list";
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't really bother me, but I think exclude-field is a little more self-explanatory.

* @return string with the VCF representation of a {@link VcfOutputRenderer#FUNCOTATOR_VCF_FIELD_NAME} for the given
* Funcotation. Never {@code null}, but empty string is possible.
*/
public static String renderSanitizedFuncotationForVcf(final Funcotation funcotation, final List<String> includedFields) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should there be another method here to for MAF? (i.e. renderSanitizedFuncotationForMaf).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So this is an excellent question. I originally had renderSanitizedFuncotationForMaf, but realized that I did not need it.

For VCFs, we render each funcotation separately, then concatenate strings. This approach has the drawback of being less flexible in terms of ordering the fields (and setting up aliases), but that has not mattered yet. Regardless, it means that all string operations (e.g. excluding fields and sanitizing values) must be in the same method (in this case renderSanitizedFuncotationForVcf) and that method must work on a funcotation.

For MAFs, we flatten out the funcotations and put the fields into a giant map. Then we do the changes to field names and values on that map. But by the time I want to exclude fields and sanitize, the map is already made, so we do not render individual funcotations. Therefore no need for a renderSantiziedFuncotationForMaf.

If you would like to refactor to make the conventions similar, we can file an issue.

No action.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good. If anything, it's more flexible to make the VCF renderer behave more like the MAF renderer.

I'd need to review it to see how feasible that is.

I've added issue #5240 to track this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Glad you filed the issue. I would prefer the VCF renderer behave like the MAF renderer. Then we could probably share a lot more of the code.

@@ -254,7 +259,7 @@ public MafOutputRenderer(final Path outputFilePath,
defaultMap.put(MafOutputRendererConstants.FieldName_BAM_File, MafOutputRendererConstants.UNUSED_STRING);

// Cache the manual annotation string so we can pass it easily into any Funcotations:
manualAnnotationSerializedString = (manualAnnotations.size() != 0 ? MafOutputRendererConstants.FIELD_DELIMITER + String.join( MafOutputRendererConstants.FIELD_DELIMITER, manualAnnotations.values() ) + MafOutputRendererConstants.FIELD_DELIMITER : "");
//manualAnnotationSerializedString = (manualAnnotations.size() != 0 ? MafOutputRendererConstants.FIELD_DELIMITER + String.join( MafOutputRendererConstants.FIELD_DELIMITER, manualAnnotations.values() ) + MafOutputRendererConstants.FIELD_DELIMITER : "");
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove commented out code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@@ -327,12 +334,14 @@ public void write(final VariantContext variant, final FuncotationMap txToFuncota
writeString(entry.getValue());
writeString(MafOutputRendererConstants.FIELD_DELIMITER);
}
writeLine(manualAnnotationSerializedString);
// writeLine(manualAnnotationSerializedString);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove commented out code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@@ -101,4 +101,11 @@
doc = "(Advanced / DO NOT USE*) If you select this flag, Funcotator will force a conversion of variant contig names from b37 to hg19. *This option is useful in integration tests (written by devs) only."
)
public boolean forceB37ToHg19ContigNameConversion = false;

@Argument(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add to this explanation that the fields will be excluded by exact name matching only.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

public Object[][] provideForRenderSanitizedFuncotationForVcf() {

return new Object[][]{
{OutputRenderer.createFuncotationFromLinkedHashMap(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you label the test cases with comments?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea. Done

* Create a dummy gencode funcotation in the most manual way possible. This method does not check for consistency
* or reasonable values.
*
* @param hugoSymbol
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's kind of a pain, but can you fill in some descriptions for these fields?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@jonn-smith
Copy link
Collaborator

Back to you, @LeeTL1220

@LeeTL1220
Copy link
Contributor Author

@jonn-smith Back to you

Copy link
Collaborator

@jonn-smith jonn-smith left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me. Feel free to merge once tests complete / pass.

* @return string with the VCF representation of a {@link VcfOutputRenderer#FUNCOTATOR_VCF_FIELD_NAME} for the given
* Funcotation. Never {@code null}, but empty string is possible.
*/
public static String renderSanitizedFuncotationForVcf(final Funcotation funcotation, final List<String> includedFields) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good. If anything, it's more flexible to make the VCF renderer behave more like the MAF renderer.

I'd need to review it to see how feasible that is.

I've added issue #5240 to track this.

@LeeTL1220 LeeTL1220 merged commit dc59aab into master Sep 29, 2018
@LeeTL1220 LeeTL1220 deleted the ll_4359_funcotator_exclusion_list branch September 29, 2018 00:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants