Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Synchronize update of shared genotype likelihood tables. #5071

Merged
merged 3 commits into from
Oct 3, 2018
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -9,7 +9,8 @@
import java.util.Arrays;

/**
* Genotype likelihood calculator utility.
* Genotype likelihood calculator utility. This class is thread-safe since access to shared mutable state is
* synchronized.
*
* <p>
* This class provide genotype likelihood calculators with any number of alleles able given an arbitrary ploidy and allele
Expand Down Expand Up @@ -257,7 +258,7 @@ private static GenotypeAlleleCounts[] buildGenotypeAlleleCountsArray(final int p
*
* @return never {@code null}.
*/
public GenotypeLikelihoodCalculator getInstance(final int ploidy, final int alleleCount) {
public synchronized GenotypeLikelihoodCalculator getInstance(final int ploidy, final int alleleCount) {
checkPloidyAndMaximumAllele(ploidy, alleleCount);

if (calculateGenotypeCountUsingTables(ploidy, alleleCount) == GENOTYPE_COUNT_OVERFLOW) {
Expand All @@ -270,17 +271,16 @@ public GenotypeLikelihoodCalculator getInstance(final int ploidy, final int alle
}

/**
* Update of shared tables
Copy link
Collaborator

@jamesemery jamesemery Sep 10, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Upon some thought I realized that the getInstance() method which calls this one should probably be synchronized, specifically because calculateGenotypeCountUsingTables makes a check against maximumPloidy & maximumAllele which are both not thread safe. Now since this method currently only ever increases either of those fields the check won't cause problems despite being unsafe, but we should probably add a strong comment somewhere enforcing that maximumPloidy & maximumAllele can only be increased by this method so we can feel safe checking against them where this is called.

* Update of shared tables.
*
* @param requestedMaximumAllele the new requested maximum allele maximum.
* @param requestedMaximumPloidy the new requested ploidy maximum.
*/
private void ensureCapacity(final int requestedMaximumAllele, final int requestedMaximumPloidy) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would add a comment to the top of this method explaining that since each region in spark owns one HaplotypeCallerEngine (and therefore genotyping engine) which gets reused that this must be synchronized to avoid race conditions.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a comment.

private synchronized void ensureCapacity(final int requestedMaximumAllele, final int requestedMaximumPloidy) {

final boolean needsToExpandAlleleCapacity = requestedMaximumAllele > maximumAllele;
final boolean needsToExpandPloidyCapacity = requestedMaximumPloidy > maximumPloidy;

// Double check with the lock on to avoid double work.
if (!needsToExpandAlleleCapacity && !needsToExpandPloidyCapacity) {
return;
}
Expand Down Expand Up @@ -380,12 +380,9 @@ public static int computeMaxAcceptableAlleleCount(final int ploidy, final int ma
throw new GATKException("Code should never reach here.");
}


private int calculateGenotypeCountUsingTables(int ploidy, int alleleCount) {
private synchronized int calculateGenotypeCountUsingTables(int ploidy, int alleleCount) {
checkPloidyAndMaximumAllele(ploidy, alleleCount);
if (ploidy > maximumPloidy || alleleCount > maximumAllele) {
ensureCapacity(alleleCount, ploidy);
}
ensureCapacity(alleleCount, ploidy);
return alleleFirstGenotypeOffsetByPloidy[ploidy][alleleCount];
}
}