-
Notifications
You must be signed in to change notification settings - Fork 594
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Delete a bunch of unused methods #6107
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks as always for being a good boy scout, @davidbenjamin! Mostly minor comments except for one rather serious (IMO) issue. Determinism of production code aside, I think we need a thorough audit of test determinism.
@@ -37,7 +31,6 @@ | |||
public static final double LOG10_ONE_THIRD = -Math.log10(3.0); | |||
public static final double LOG_ONE_THIRD = -Math.log(3.0); | |||
public static final double INV_LOG_2 = 1.0 / Math.log(2.0); | |||
public static final double INV_LOG_10 = 1.0 / Math.log(10); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LN_1_M_EXP_THRESHOLD
appears to be unused. There are also some inconsistencies in these names (LOG
vs LN
, use of OF
, etc.) that might be worth cleaning up while you're here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
return IntStream.range(0, matrix.getRowDimension()) | ||
.mapToDouble(r -> sum(matrix.getRow(r))).toArray(); | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like sum(final double[][][] array)
and at least a few more methods in MathUtils
are only called by test code. Some of these can probably be removed; others are used for checking test results, but I'm not sure if they need to be in production code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
did a bit more deleting
@@ -292,7 +292,7 @@ private void takeEstep(final double[] artifactPriors) { | |||
private static double computeLogPosterior(final int altDepth, final int altF1R2Depth, final int depth, | |||
final double statePrior, final BetaDistributionShape afPseudoCounts, | |||
final BetaDistributionShape f1r2PseudoCounts){ | |||
Utils.validateArg(MathUtils.isAProbability(statePrior), String.format("statePrior must be a probability but got %f", statePrior)); | |||
Utils.validateArg(MathUtils.goodProbability(statePrior), String.format("statePrior must be a probability but got %f", statePrior)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think isValidProbability
and isValidLog10Probability
are probably better names for these methods, but I'll leave it up to you.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
final double[] correctedNoiseBySample = MathUtils.rowStdDevs(correctedCoverage); | ||
Arrays.stream(correctedNoiseBySample).forEach(x -> Assert.assertTrue(x < NON_GC_BIAS_NOISE_LEVEL * MEAN_READ_DEPTH)); | ||
Utils.nonNull(correctedCoverage); | ||
final Variance varianceEvaluator = new Variance(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably should just use StandardDeviation
(although this actually uses Variance
under the hood as well).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
final double mean = MathUtils.mean(dresults, 0, dresults.length); | ||
final double std = new StandardDeviation().evaluate(dresults); | ||
final double mean = Arrays.stream(results).average().getAsDouble(); | ||
final double std = new StandardDeviation().evaluate(Arrays.stream(results).mapToDouble(i->i).toArray()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just curious, is there some reason we don't typically use asDoubleStream
in such code?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No good reason.
@@ -1,6 +1,7 @@ | |||
package org.broadinstitute.hellbender.utils.locusiterator; | |||
|
|||
import org.broadinstitute.hellbender.utils.MathUtils; | |||
import org.broadinstitute.hellbender.utils.Utils; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove MathUtils
import (probably should just optimize imports across the board).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
@@ -114,7 +115,7 @@ private void makeReads() { | |||
int alignmentStart = 1; | |||
|
|||
for ( int readsThisStack : readCountsPerAlignmentStart ) { | |||
ArrayList<GATKRead> stackReads = new ArrayList<>(ArtificialReadUtils.createIdenticalArtificialReads(readsThisStack, header, "foo", 0, alignmentStart, MathUtils.randomIntegerInRange(50, 100))); | |||
ArrayList<GATKRead> stackReads = new ArrayList<>(ArtificialReadUtils.createIdenticalArtificialReads(readsThisStack, header, "foo", 0, alignmentStart, Utils.getRandomGenerator().nextInt(51) + 50)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Was/is this deterministic?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, just a cursory examination shows a number of other tests that do not reset the seed. @lbergelson @jamesemery considering your recent foray into checking determinism, perhaps it's time for a thorough audit? Any reason why we need to use a global generator?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps go ahead and extract a variable for the read length
while we're here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just noticed lots of missing final
s in this class, up to you if you want to fix those.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did a bunch of stuff
Back to @samuelklee. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for making changes, @davidbenjamin! A few comments on your refactoring of PerSampleReadStateManagerUnitTest
. OK to merge once addressed, but please double check that behavior is unchanged from the original test. Probably worth it to explicitly mention this refactoring in the commit messages, too.
} | ||
|
||
@Test(dataProvider = "PerSampleReadStateManagerTestDataProvider") | ||
public void runPerSampleReadStateManagerTest( PerSampleReadStateManagerTester test ) { | ||
logger.warn("Running test: " + test); | ||
|
||
Utils.resetRandomGenerator(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove logger.warn
line above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Resetting the seed here non-trivially changes the set of test cases. Is this desired?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would also put the resetting of the seed as close as possible to the use of the random generator to make the behavior more obvious. I would put the generation of lengths in createPerSampleReadStateManagerTests
and reset the seed at the beginning of that method. This essentially encapsulates the use of the generator in the data provider.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(I can confirm that this restores the original set of test cases, so I'm assuming the rest of your refactoring is good!)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done done done
private void makeReads() { | ||
int alignmentStart = 1; | ||
|
||
private void makeReads(List<GATKRead> reads, List<ArrayList<AlignmentStateMachine>> recordStatesByAlignmentStart) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Parameters should be final
, but I think you should get rid of this method as discussed above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
true
PerSampleReadStateManager perSampleReadStateManager = new PerSampleReadStateManager(LocusIteratorByState.NO_DOWNSAMPLING); | ||
final List<GATKRead> reads = new ArrayList<>(); | ||
final List<ArrayList<AlignmentStateMachine>> recordStatesByAlignmentStart = new ArrayList<>(); | ||
makeReads(reads, recordStatesByAlignmentStart); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suspect that a makeReads
method that modifies its parameters is unnecessary, but I admit that I haven't looked at the code in detail. I think it would be more intuitive to generate and directly assign the nested list of GATKRead
s and then flatten and map to the nested list of AlignmentStateMachine
s as needed. Probably a couple of stream statements would suffice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
// The read we get back should be literally the same read in memory as we put in | ||
Assert.assertTrue(originalRead == readFromPerSampleReadStateManager); | ||
Assert.assertTrue(originalReadsIterator.hasNext()); | ||
Assert.assertTrue(originalReadsIterator.next() == stateIterator.next().getRead()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changing state in the assert statement doesn't seem necessary. A developer might reasonably assume that deleting assert statements should remove checks of behavior without changing behavior.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
if ( readCount % removalInterval == 0 ) { | ||
originalRead = originalReadsIterator.next(); // advance to next read, since the previous one should have been discarded | ||
readCount++; | ||
Iterator<AlignmentStateMachine> secondPassStateIterator = perSampleReadStateManager.iterator(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can be final
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
// The read we get back should be literally the same read in memory as we put in (after accounting for removals) | ||
Assert.assertTrue(originalRead == readFromPerSampleReadStateManager); | ||
Assert.assertTrue(secondPassStateIterator.hasNext()); | ||
Assert.assertTrue(reads.get(readIndex) == secondPassStateIterator.next().getRead()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
new PerSampleReadStateManagerTester(thisTestReadStateCounts, removalInterval); | ||
} | ||
} | ||
final List<Integer> removeIntervals = Arrays.asList(0, 2, 3); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think removalIntervals
is more consistent. Might be worth changing names everywhere to indicate that these are interval indices, but I leave that up to you.
Reviewer said to merge once changes made
@samuelklee Did everything, need your approving review to merge. |
Right, sorry about that. Thanks for making the changes! |
@samuelklee Would you mind looking at this cruft-removal PR?