Implement -L system, enable access to it for tools that request it #4
One limitation of the GATK interval implementation is that it doesn't handle variants with ranges (like GVCF reference blocks) correctly - and it would be hard to change it so it does. We should keep this use case in mind. The problem is that only the start position is considered, so an interval covering the middle or end of a variant range will not select it. |
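To make the difference concrete, here is a minimal sketch contrasting start-only selection with overlap-based selection. `RangeOverlapSketch`, `Span`, and both method names are hypothetical and exist only for illustration; this is not the GATK or htsjdk API, and coordinates are assumed to be 1-based and closed.

```java
// Hypothetical types for illustration only; not the GATK or htsjdk API.
final class RangeOverlapSketch {

    /** A closed, 1-based range on a single contig (assumed convention). */
    static final class Span {
        final String contig;
        final int start;
        final int end;

        Span(String contig, int start, int end) {
            if (start > end) {
                throw new IllegalArgumentException("start must be <= end");
            }
            this.contig = contig;
            this.start = start;
            this.end = end;
        }
    }

    /** Start-only selection: the variant is picked only if its start lies inside the interval. */
    static boolean selectedByStart(Span interval, Span variant) {
        return interval.contig.equals(variant.contig)
                && variant.start >= interval.start
                && variant.start <= interval.end;
    }

    /** Overlap selection: the variant is picked if any of its bases lie inside the interval. */
    static boolean selectedByOverlap(Span interval, Span variant) {
        return interval.contig.equals(variant.contig)
                && variant.start <= interval.end
                && variant.end >= interval.start;
    }
}
```

With a reference block spanning chr1:100-200 and an interval chr1:150-160, `selectedByStart` rejects the block while `selectedByOverlap` accepts it.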
Just wanted to chime in and say that it would be great to handle ranged variants well from a structural variation perspective too. |
I second Joel. I had to produce a base-level-resolution gVCF to avoid this, which is an "ugly" workaround. |
I third Joel. |
I fourth Joel on behalf of the users. Also, would it be possible to allow overlapping intervals without merging? Users have asked for this in order to handle alternative transcripts in walkers like DepthOfCoverage. |
I'm going to assign it to @kshakir for now - to implement the simplest -L (one set of intervals, provided by a file). He may choose to split this issue into smaller ones for more granular features. The approach we're going to take is to implement only the features of -L that we need. The first milestone reflects the first feature to implement. |
This may require us to decide whether to support both GATK-style and Picard-style interval files, or to standardize on a single format. We also need to improve the code that detects the interval file format (it currently detects GATK interval files by catching the exception that Picard throws on them). |
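One possible alternative to exception-driven detection is to look at the first non-blank line. The sketch below assumes that Picard-style interval lists begin with SAM-style header lines starting with `@`, and that GATK-style files hold one `contig`, `contig:pos`, or `contig:start-end` entry per line. `IntervalFormatSketch`, `Format`, and `detect` are hypothetical names, and the pattern is deliberately simplistic (for example, it does not handle contig names that contain colons).

```java
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not the existing IntervalUtils code.
final class IntervalFormatSketch {

    enum Format { PICARD_STYLE, GATK_STYLE, UNKNOWN }

    // Assumed shape of a GATK-style entry: contig, contig:pos, or contig:start-end.
    private static final Pattern GATK_LINE =
            Pattern.compile("^[^\\s:]+(:\\d+(-\\d+)?)?$");

    /** Classifies a file by its first non-blank line instead of by a parse failure. */
    static Format detect(List<String> lines) {
        for (String raw : lines) {
            String line = raw.trim();
            if (line.isEmpty()) {
                continue; // skip leading blank lines
            }
            if (line.startsWith("@")) {
                return Format.PICARD_STYLE; // SAM-style header (@HD, @SQ, ...)
            }
            if (GATK_LINE.matcher(line).matches()) {
                return Format.GATK_STYLE;
            }
            return Format.UNKNOWN; // first content line fits neither assumption
        }
        return Format.UNKNOWN; // empty input
    }
}
```

Returning `UNKNOWN` rather than guessing lets the caller report a clear error, for instance when a tab-separated file is missing its header.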
We should also support -XL (excluding intervals). |
Note that right now the hellbender engine is hardcoded to merge adjacent/overlapping intervals when doing read traversals (as required by the htsjdk API), whereas in the GATK the user had the option of not merging intervals. |
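As a rough sketch of what a user-controllable merge step could look like, the following sorts intervals and merges adjacent or overlapping ones only when asked to. `IntervalMergeSketch`, `Span`, and `prepare` are hypothetical names, coordinates are assumed to be 1-based and closed, and contigs are ordered lexicographically here rather than by a sequence dictionary.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch; not the hellbender engine code.
final class IntervalMergeSketch {

    /** A closed, 1-based range on a single contig (assumed convention). */
    static final class Span {
        final String contig;
        final int start;
        final int end;

        Span(String contig, int start, int end) {
            this.contig = contig;
            this.start = start;
            this.end = end;
        }
    }

    /**
     * Returns a sorted copy of the input. When merge is true, adjacent and
     * overlapping spans on the same contig are collapsed into one span.
     */
    static List<Span> prepare(List<Span> input, boolean merge) {
        List<Span> sorted = new ArrayList<>(input);
        // Lexicographic contig order is a simplification; a real engine would
        // presumably use the reference sequence dictionary order.
        sorted.sort(Comparator.comparing((Span s) -> s.contig)
                .thenComparingInt(s -> s.start)
                .thenComparingInt(s -> s.end));
        if (!merge) {
            return sorted;
        }
        List<Span> merged = new ArrayList<>();
        for (Span next : sorted) {
            int lastIndex = merged.size() - 1;
            Span last = merged.isEmpty() ? null : merged.get(lastIndex);
            boolean touches = last != null
                    && last.contig.equals(next.contig)
                    && next.start <= last.end + 1; // overlapping or directly adjacent
            if (touches) {
                merged.set(lastIndex,
                        new Span(last.contig, last.start, Math.max(last.end, next.end)));
            } else {
                merged.add(next);
            }
        }
        return merged;
    }
}
```

Keeping the unmerged path separate would let a walker that cares about overlapping targets, such as alternative transcripts, see each interval individually, while read traversals could still request the merged form.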
I think that, especially with the imminent switch over to the next reference, I would prefer the Picard version that includes a dictionary header. Otherwise we are setting ourselves up for a lot of support questions whose answer is (after hours of debugging) "you are using the wrong reference". |
I'll add: the relevant code has mostly been moved over already (the IntervalUtils class). The current bad/unreliable method of distinguishing between GATK and Picard interval files can be seen in the intervalFileToList() method. |
Implement -L system, enable access to it for tools that request it (define how they 'request it' - maybe by implementing an interface or calling a function or overriding some generic hook - part of this issue is to design it).
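Of the options listed, the "implementing an interface" route could look roughly like the sketch below. Every name in it (`IntervalRequestSketch`, `RequiresIntervals`, `DepthLikeTool`, `OtherTool`, `configure`) is hypothetical and only illustrates the opt-in idea; the actual design is still to be decided as part of this issue. Intervals are kept as plain strings to keep the sketch self-contained.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical design sketch; none of these names exist in the codebase.
final class IntervalRequestSketch {

    /** Opt-in marker: a tool implements this to request the -L intervals. */
    interface RequiresIntervals {
        void setIntervals(List<String> intervals);
    }

    /** Example tool that asks for intervals. */
    static final class DepthLikeTool implements RequiresIntervals {
        private List<String> intervals = Collections.emptyList();

        @Override
        public void setIntervals(List<String> intervals) {
            this.intervals = intervals;
        }

        List<String> intervals() {
            return intervals;
        }
    }

    /** Example tool that does not ask for intervals. */
    static final class OtherTool {
    }

    /**
     * Engine-side hook: hands the parsed intervals to tools that opted in.
     * Returns true if the tool accepted them.
     */
    static boolean configure(Object tool, List<String> parsedIntervals) {
        if (tool instanceof RequiresIntervals) {
            // Defensive, read-only copy so a tool cannot alter the engine's list.
            List<String> copy =
                    Collections.unmodifiableList(new ArrayList<>(parsedIntervals));
            ((RequiresIntervals) tool).setIntervals(copy);
            return true;
        }
        return false;
    }
}
```

An interface keeps the request explicit and checkable at startup, so the engine could, for example, reject -L for a tool that never asked for it.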