debug: fixed --keep-zeros skip/dup loci #73
Merged
When using `perbase base-depth` with `--keep-zeros`, I sometimes get missing and duplicate loci in the output, which I think is the same as issue #71. I believe this small change to the `process_regions` function in `base_depth.rs` fixes the issue, which occurs when zero-depth loci are skipped in the `result: Vec<PileupPosition>`.

The previous code created `new_result` from `result` assuming the following situation:

where the x's are the positions at which the `result` vector recorded non-zero depth. The prior code first fills in the "left" of the x's, then extends `new_result` with the x's from `result`, then fills in the "right" of the x's.
But the code produces incorrect output when the situation instead looks like:
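
To illustrate the general idea (this is not the actual diff in this PR), here is a minimal sketch of zero-filling that tolerates gaps anywhere in the range, not only to the left and right of the recorded positions. `Pos` and `fill_zeros` are hypothetical names; `Pos` is a simplified stand-in for `PileupPosition`, and the sketch assumes `result` is sorted by position with no duplicates.

```rust
/// Hypothetical, simplified stand-in for `PileupPosition`:
/// only the fields needed for this illustration.
#[derive(Debug, Clone, PartialEq)]
struct Pos {
    pos: u32,
    depth: u32,
}

/// Return exactly one entry per locus in `start..stop`, inserting a
/// zero-depth entry wherever `result` has no record.
/// Assumes `result` is sorted by `pos` and contains no duplicates.
fn fill_zeros(result: &[Pos], start: u32, stop: u32) -> Vec<Pos> {
    let mut filled = Vec::with_capacity(stop.saturating_sub(start) as usize);
    // `next` is the first locus that does not yet have an entry.
    let mut next = start;
    for p in result.iter().filter(|p| p.pos >= start && p.pos < stop) {
        // Zero-fill the (possibly empty) gap before this recorded position.
        filled.extend((next..p.pos).map(|pos| Pos { pos, depth: 0 }));
        filled.push(p.clone());
        next = p.pos + 1;
    }
    // Zero-fill whatever remains after the last recorded position.
    filled.extend((next..stop).map(|pos| Pos { pos, depth: 0 }));
    filled
}
```

Calling `fill_zeros` with records at positions 2, 3, and 6 over the range `0..8` yields eight entries, with zeros at 0, 1, 4, 5, and 7. Tracking a single `next` cursor means interior gaps are handled by the same code path as the edges, so no locus can be emitted twice or skipped.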
I've added two new test cases to `check_depths` by creating two new rstest `#[fixture]`s called `non_mate_aware_keep_zeros_positions` and `mate_aware_keep_zeros_positions`. I'm not sure if this was the best approach and I'm definitely open to suggestions.

Besides these two test cases, I've also tested on a small .bam file
wget https://ftp.1000genomes.ebi.ac.uk/vol1/ftp/phase3/data/HG00096/alignment/HG00096.chrom11.ILLUMINA.bwa.GBR.low_coverage.20120522.bam
with the following command, which was producing dups/skips before the code changes but now lists each locus once.