Synced reader does not support regions with contigs including colons #1620
Comments
This is not easy to support. The function accepts regions in multiple formats (…). The only solution I can think of is to move the responsibility for resolving ambiguities like this to the user and require full intervals.
Guess they should have thought better before allowing colons in contig names... Looks kinda necessary after all! 😅
We solved this already elsewhere, though, with additional notation such as {chra:b}:10-20, or even just by parsing from the other end so that the last colon is the one that is used. This doesn't work if you rather foolishly created a contig named "chr10:100-200", as that's ambiguous, but that's why we added the curly brace notation. The htslib APIs correctly support this, so it's possibly just an issue of the synced reader doing its own parsing rather than using the official region parsing API. (I haven't looked.)
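To make the two ideas above concrete, here is a minimal sketch of a header-free splitter that honours the curly brace notation and otherwise treats only the last colon as the separator. It is not htslib's implementation: the names split_region and parse_coords are hypothetical, as is the convention that -1 means "coordinate not given". Without braces, a contig literally named "chr10:100-200" is still read as chr10 plus an interval, which is exactly the ambiguity the braces exist to resolve.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: parse "beg", "beg-" or "beg-end" (digits only).
 * Returns 0 on success, -1 if the text is not a coordinate spec. */
static int parse_coords(const char *s, long *beg, long *end)
{
    char *rest;
    if (*s < '0' || *s > '9') return -1;      /* must start with a digit */
    *beg = strtol(s, &rest, 10);
    *end = -1;
    if (*rest == '\0') return 0;              /* "beg" */
    if (*rest != '-') return -1;
    s = rest + 1;
    if (*s == '\0') return 0;                 /* "beg-": open-ended */
    if (*s < '0' || *s > '9') return -1;
    *end = strtol(s, &rest, 10);
    return *rest == '\0' ? 0 : -1;
}

/* Hypothetical helper, not part of htslib: split a region string into a
 * contig name and optional coordinates. Returns 0 on success, -1 on
 * malformed input or if the name does not fit into contig[]. */
int split_region(const char *region, char *contig, size_t contig_size,
                 long *beg, long *end)
{
    assert(region && contig && beg && end);
    const char *name = region;
    const char *name_end = region + strlen(region);
    long b = -1, e = -1;

    if (region[0] == '{') {
        /* Explicit delimiters: the name is whatever sits inside the braces. */
        const char *close = strchr(region, '}');
        if (!close) return -1;
        name = region + 1;
        name_end = close;
        if (close[1] == ':') {
            if (parse_coords(close + 2, &b, &e) != 0) return -1;
        } else if (close[1] != '\0') {
            return -1;
        }
    } else {
        /* Only the LAST colon may separate the name from the coordinates. */
        const char *colon = strrchr(region, ':');
        if (colon && parse_coords(colon + 1, &b, &e) == 0)
            name_end = colon;
        else
            b = e = -1;   /* no usable coordinates: whole string is the name */
    }

    size_t len = (size_t)(name_end - name);
    if (len == 0 || len >= contig_size) return -1;
    memcpy(contig, name, len);
    contig[len] = '\0';
    *beg = b;
    *end = e;
    return 0;
}
```

With this sketch, "chra:b:10-20" and "{chra:b}:10-20" both yield the contig chra:b with the interval 10-20, while "{chr10:100-200}" yields the whole braced text as the contig name with no coordinates.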
However, a simplified version of it could be used, and as @jkbonfield noted, the explicit delimiter notation suggested in the appendix could also be supported. Or perhaps it should be superseded by an API function that is provided with the set of contigs in play in the file(s) to be read.
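The last suggestion, resolving a region against the contigs actually present, could look roughly like the sketch below. This is an illustration only, not an existing htslib function: match_contig, its return codes, and the plain array of names are all assumptions. It accepts a contig only if the region is exactly that name or that name followed by a colon, and it reports the case where more than one contig fits instead of guessing. It does not validate the coordinate text.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: find which known contig a region string refers to.
 * Returns the index into names[] on a unique match, -1 if nothing matches,
 * and -2 if more than one contig fits (the caller must disambiguate, for
 * example by asking for the brace notation). On a unique match *coords
 * points at the text after the separating colon, or is NULL if there is
 * no coordinate part. */
int match_contig(const char *region, const char *const *names, int n_names,
                 const char **coords)
{
    assert(region && names && coords);
    int found = -1;
    *coords = NULL;

    for (int i = 0; i < n_names; i++) {
        size_t len = strlen(names[i]);
        if (strncmp(region, names[i], len) != 0) continue;
        /* The name must be followed by end-of-string or the separator. */
        if (region[len] != '\0' && region[len] != ':') continue;
        if (found >= 0) {
            *coords = NULL;
            return -2;            /* a second candidate: ambiguous */
        }
        found = i;
        *coords = (region[len] == ':') ? region + len + 1 : NULL;
    }
    return found;
}
```

Given the contigs chr10, chr10:100-200 and chra:b, the region "chra:b:10-20" resolves uniquely, whereas "chr10:100-200" is flagged as ambiguous because it is both a full contig name and chr10 plus an interval.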
Note hts_parse_region() cannot be used because it requires the header and without the header the caller does not learn the contig name. Resolves samtools#1620
The synced_bcf_reader currently (as of version 1.17) fails with an error when it is initialised via bcf_sr_regions_init using region strings containing contig names with colons, despite the fact that colons are no longer disallowed characters in contig names since VCFv4.3.

The underlying _regions_init_string function will likely need to be updated to scan for colons from the back instead of the front to accommodate this change of spec.
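As a small illustration of why the scan direction matters, the following sketch compares the length of the contig-name part found by a front-to-back scan with the one found by a back-to-front scan. The two function names are hypothetical and the code is not taken from _regions_init_string; it only contrasts the standard strchr and strrchr calls.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: name length when the FIRST colon is the separator. */
size_t name_len_first_colon(const char *region)
{
    assert(region);
    const char *colon = strchr(region, ':');
    return colon ? (size_t)(colon - region) : strlen(region);
}

/* Hypothetical helper: name length when the LAST colon is the separator. */
size_t name_len_last_colon(const char *region)
{
    assert(region);
    const char *colon = strrchr(region, ':');
    return colon ? (size_t)(colon - region) : strlen(region);
}
```

For "chr1:10-20" both scans agree on a four-character name. For "chra:b:10-20" the front scan stops after "chra" and leaves "b:10-20" as the supposed coordinates, while the back scan keeps "chra:b" intact.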