agat_sp_filter_by_ORF_size.pl is not isoform aware #512
Comments
Currently, in such a case, the gene should be reported in both outputs.
@GallVp
…st only if all isoforms pass the test
Thank you @Juke34 for the prompt response. I think one of the following outcomes makes sense,
I think a transcript without a CDS should pass into the good_gene_list, as I understand this is the current behaviour of agat_sp_filter_by_ORF_size.pl. If there is no ORF, the filter test does not apply. Removing such transcripts would imply that the tool treats the absence of a CDS as an ORF of length 0. Doing so would also remove all non-protein-coding transcripts.
OK, I finally updated the script to restore the originally expected behaviour. Any transcript that does not pass the test is discarded. So
becomes with output_sup24.gff
and output_NOT_sup24.gff
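The transcript-level behaviour described in this thread can be illustrated with a minimal Python sketch. This is not AGAT's actual Perl code: the function name `split_transcripts`, the tuple layout, the ORF sizes, and the gene `geneX` are all hypothetical, and the threshold of 24 with a "greater than" test is only inferred from the `output_sup24.gff` file name. Transcripts without a CDS are kept on the passing side, following the suggestion made earlier in the thread.

```python
from typing import List, Optional, Tuple

# (transcript_id, gene_id, orf_size); orf_size is None when there is no CDS.
Transcript = Tuple[str, str, Optional[int]]


def split_transcripts(
    transcripts: List[Transcript], threshold: int = 24
) -> Tuple[List[Tuple[str, str]], List[Tuple[str, str]]]:
    """Split transcripts into (passing, failing) lists of (gene_id, transcript_id).

    Assumption: the test is 'ORF size > threshold', applied per transcript.
    A transcript without a CDS is not tested and is kept as passing.
    """
    passing, failing = [], []
    for transcript_id, gene_id, orf_size in transcripts:
        if orf_size is None or orf_size > threshold:
            passing.append((gene_id, transcript_id))
        else:
            failing.append((gene_id, transcript_id))
    return passing, failing


# Hypothetical ORF sizes, chosen so that the first isoform passes and the second fails.
transcripts = [
    ("gene19851.t1", "gene19851", 120),
    ("gene19851.t2", "gene19851", 10),
    ("geneX.t1", "geneX", None),  # hypothetical transcript without a CDS
]

passing, failing = split_transcripts(transcripts)
print("pass:", passing)
print("fail:", failing)
```

With this rule a multi-isoform gene can appear in both outputs: once with its passing isoforms and once with its failing ones.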
Do you think I should add:
Thank you @Juke34. This is amazing.
I can't think of a use case where these options would be useful. Your new implementation, where the filter is applied at the transcript level and the passing and failing isoforms are separated out, works perfectly for all my needs. Thank you very much!
Thank you for your amazing work. I am a big fan of AGAT.
Describe the bug
agat_sp_filter_by_ORF_size.pl adds a multi-isoform gene to the good_gene_list if the first isoform passes the test filter, even if the second isoform fails the test filter. This behaviour is either unintended or not documented.
General (please complete the following information):
To Reproduce
Please use the below GFF to do,
mRNA gene19851.t2 of gene gene19851 should fail the test and, therefore, gene gene19851 should be removed. But that is not what happens. Rather, gene19851 is retained in the output_sup24.gff file.