GAG discards UTR features #149
Comments
@marchoeppner This is by design, or by laziness perhaps. Since UTRs aren't included in NCBI's .tbl file, we chose to ignore them. Including them in the output is non-trivial, since certain filters and fixes within GAG can shift the boundaries between CDS and UTR. It's doable, it's just more complicated than Read-Them-In-And-Write-Them-Out. As long as this omission doesn't cause anybody trouble with their genome submission, fixing it is low-priority. If anyone gets errors or other flak due to the absence of UTRs, we'll move it up the queue.
Understood - maybe something for the future? We are using GAG and Annie for things other than NCBI tbl dumping, so not being able to parse all features makes things a little tricky. That being said, it is already a very useful tool as is!
We have decided to do this. Maybe next week, depending on how horribly some transcriptome submissions go ...
We will create new UTR features from scratch, rather than preserve the original ones. This is simpler and gets around the issue of fixes and filters shifting UTR boundaries.
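For anyone following along, creating UTRs from scratch could look roughly like the sketch below: take the exon and CDS intervals of one mRNA and treat whatever exonic sequence lies outside the coding span as UTR. This is not GAG's actual implementation; the function name `derive_utrs` and the tuple-based interval representation are assumptions made for illustration only.

```python
def derive_utrs(exons, cds, strand):
    """Derive UTR intervals for one mRNA from its exon and CDS intervals.

    Hypothetical helper, not part of GAG.
    exons, cds: lists of 1-based inclusive (start, end) tuples, as in GFF3.
    strand: '+' or '-'.
    Returns (five_prime_utrs, three_prime_utrs) as lists of (start, end).
    """
    if not cds:
        # No coding sequence, so there is nothing to measure UTRs against.
        return [], []

    cds_start = min(start for start, _ in cds)
    cds_end = max(end for _, end in cds)

    upstream, downstream = [], []
    for start, end in sorted(exons):
        # Exonic bases to the left of the coding span.
        if start < cds_start:
            upstream.append((start, min(end, cds_start - 1)))
        # Exonic bases to the right of the coding span.
        if end > cds_end:
            downstream.append((max(start, cds_end + 1), end))

    # On the minus strand the 5' end is at the higher coordinate.
    if strand == '-':
        return downstream, upstream
    return upstream, downstream
```

Because the UTRs are recomputed from the final exon and CDS coordinates, any boundary shifts introduced by earlier fixes and filters are picked up automatically, which is the advantage described above.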
I solved it by replacing line 246 in src/gff_reader.py:
This keeps the UTR features in the .gff output, but I think the result is not valid for the NCBI .tbl format.
I am not sure this behaviour is by design, but GAG currently ejects UTR features into the file genome.ignored.gff, so the final annotation genome.gff has no UTR annotations. This concerns features with the feature type:
five_prime_UTR
three_prime_UTR
However, these two feature types are perfectly valid within the GFF3 standard and probably shouldn't be ignored (?).
More information at: http://www.sequenceontology.org/gff3.shtml
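To check which feature types end up in each output file, a small tally like the one below may help. It is only a sketch: the helper name `count_feature_types` is made up for this example, and it assumes standard nine-column, tab-separated GFF3 lines with the feature type in the third column.

```python
from collections import Counter

# The two feature types this issue is about.
UTR_TYPES = {"five_prime_UTR", "three_prime_UTR"}


def count_feature_types(gff_lines):
    """Count GFF3 feature types (column 3) over an iterable of lines.

    Hypothetical helper, not part of GAG. Blank lines, comment or
    directive lines, and lines with fewer than nine columns are skipped.
    """
    counts = Counter()
    for line in gff_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        columns = line.split("\t")
        if len(columns) < 9:
            continue
        counts[columns[2]] += 1
    return counts


def count_utrs(gff_lines):
    """Return the total number of UTR features among the given lines."""
    counts = count_feature_types(gff_lines)
    return sum(counts[feature_type] for feature_type in UTR_TYPES)
```

Passing an open file handle for genome.ignored.gff and another for genome.gff to `count_utrs` should show whether the UTR features were moved to the ignored file.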