Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Complex SV intervals support #8521

Merged
merged 3 commits into from
Jul 11, 2024
Merged

Complex SV intervals support #8521

merged 3 commits into from
Jul 11, 2024

Conversation

mwalker174
Copy link
Contributor

  • Complex (CPX) SV records no longer use END2/CHR2 INFO fields and instead are defined by CPX_INTERVALS in addition to normal coordinates.
  • Adds new class SVCallRecord.ComplexEventInterval for storing complex event intervals, which are associated with a particular SV type.
  • SV clustering now takes into account CPX subtype and intervals, requiring intervals to match on type and interval similarity as defined by the clustering parameters (e.g. minimum reciprocal overlap).
  • Improved reciprocal translocation (CTX) record handling. In particular, these records are now treated similarly to BNDs for clustering purposes.
  • Added unit tests for new CPX cases.

Copy link
Contributor

@epiercehoffman epiercehoffman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for working on this - complex SVs make for complex code. My primary hesitation is related to the way length is calculated for CPX events - let me know what you think. There are some other smaller comments/questions as well, and you'll need to rebase as I anticipate conflicts with my recent CTX PR.

final Integer end2;
final String chr2;
if (type == GATKSVVCFConstants.StructuralVariantAnnotationType.BND
|| type == GATKSVVCFConstants.StructuralVariantAnnotationType.CTX) {
|| type == GATKSVVCFConstants.StructuralVariantAnnotationType.CTX) {
end = record.getPositionA();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note for future: To handle complex translocations, there need to be breakpoints at CHROM:POS, CHROM:END, CHR2:POS2 (which we currently don't have), and CHR2:END2.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Made a TODO note. Would it also be possible to use the CPX intervals?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just looked back at my conversation with Xuefang about this - for annotation we discussed that we should check CPX_INTERVALS for complex CTX, but in her CTX_INV example only the INV was in CPX_INTERVALS. For the CTX we used the other positions. If the INV is on the first chromosome, then the CTX breakpoints on that chromosome are defined by CHROM:POS and CHROM:END. But if the INV is on the second chromosome, then the CTX breakpoints are not fully defined by CHR2:END2 - there would need to be a CHR2:POS2.

private final GATKSVVCFConstants.StructuralVariantAnnotationType intervalType;
private final SimpleInterval interval;

public ComplexEventInterval(final String str) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking about whether it would make sense to use this in SVAnnotate to harmonize across the different SV tools more. But in SVAnnotate I treat complex SV intervals the same way as any SV interval so that the annotation methods can be interchangeable. Maybe it could be done with some class hierarchy though? Or you could use SVSegment? The main difference is the class method to parse the interval string. I didn't come to a conclusion but wanted to put the thought out there of unifying the SV methods more if it's possible

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a great point. This class is completely redundant with SVSegment, so I've moved that to its own class. However, I've made ComplexEventInterval a subclass with encode()/decode() methods. We could delete the new class altogether, but I think we should have this separate class since each is used in different tools. This will make it easier to make independent changes later if we need to.

Cpx clustering support with testing

Fix complex subtype check

Complex event intervals

Fix typos

Another typo

Address reviewer comments
@gatk-bot
Copy link

gatk-bot commented Jul 9, 2024

Github actions tests reported job failures from actions build 9859920680
Failures in the following jobs:

Test Type JDK Job ID Logs
integration 17.0.6+10 9859920680.11 logs
integration 17.0.6+10 9859920680.0 logs

public static final Set<ComplexVariantSubtype> COMPLEX_POINT_SUBTYPES =
Collections.set(
GATKSVVCFConstants.ComplexVariantSubtype.dDUP,
GATKSVVCFConstants.ComplexVariantSubtype.dDUP_iDEL
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dDUP_iDEL events don't have POS = END because there is a deletion at the sink

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I couldn't actually find where this constant gets used though?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whoops. I was considering this but didn't end up using it

Copy link
Contributor

@epiercehoffman epiercehoffman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for making those changes!

@mwalker174 mwalker174 merged commit 23a38ce into master Jul 11, 2024
20 checks passed
@mwalker174 mwalker174 deleted the mw_cpx_end branch July 11, 2024 19:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants