feat: Add option to make annotaTR less strict on Beagle AP field checks #233
The main change is the addition of the option `--warn-on-AP-error`, which results in skipping loci where checks on AP fields fail. In these cases, rather than the program quitting, we output nan values for dosages. In particular, the checks this is relevant to are:
Most of these should still never happen. We have encountered cases where values sum to more than 1, likely due to rounding errors in cases with huge numbers of alleles.
This is a somewhat dangerous flag and its use should not be encouraged. The main motivation is cases where annotaTR runs for many hours on a huge VCF file only to encounter a bad AP field at the very end and crash, or where the vast majority of AP fields are fine but a few problematic loci cause the whole run to fail.
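The strict versus warn-and-skip behavior described above can be sketched roughly as follows. This is an illustration only, not the TRTools implementation: the function name `haplotype_dosage_sketch`, its parameters, and the length-weighted dosage formula are all assumptions made for the example. The only check it models is the "values sum to more than 1" case mentioned above.

```python
import warnings

import numpy as np


def haplotype_dosage_sketch(ap_values, ref_length, alt_lengths, locus_id, strict=True):
    """Hypothetical helper, not part of TRTools.

    ap_values   -- per-ALT-allele probabilities for one haplotype
    ref_length  -- length of the reference allele (assumed dosage weight)
    alt_lengths -- lengths of the ALT alleles, same order as ap_values
    locus_id    -- label used in the error/warning message
    strict      -- if True, raise on a failed check; otherwise warn and return nan
    """
    total = float(np.sum(ap_values))
    if total > 1.0:
        # Include the locus in the message so the bad record can be tracked down.
        msg = f"AP values at locus {locus_id} sum to {total}, which is more than 1"
        if strict:
            raise ValueError(msg)
        warnings.warn(msg)
        return np.nan
    # Assumed formula: the reference allele gets the remaining probability mass.
    ref_prob = 1.0 - total
    return ref_prob * ref_length + float(np.dot(ap_values, alt_lengths))
```

With `strict=False`, a locus whose probabilities sum to more than 1 yields `np.nan` plus a warning instead of stopping the run, which is the trade-off the flag makes.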
Other specific changes:
- Added a `strict` option to `GetDosages()`. This defaults to `true`, in which case we throw `ValueError` for the cases above. If this is `false`, we output a warning and return all dosage values as `np.nan`.
- Whether or not the `strict` option is set, added info to the error/warning messages about which locus was problematic to help with tracking down those cases.
Checklist
- `fix:`. Otherwise, if it introduces a new feature, please prefix it with `feat:`. If it introduces a breaking change, please add an exclamation mark before the colon, like `feat!:`. If the scope of the PR changes because of a revision to it, please update the PR title, since the title will be used in our CHANGELOG.
- `poetry lock --no-update` to ensure the lock file stays up to date and that our dependencies are locked to their minimum versions