I want to split one large predicted protein into its exons according to the GFF file. I have four output files produced by MetaEuk: .fas, .codon.fas, .headersMap.tsv and .gff.
In the GFF file, the CDS coordinates are relative to the assembled contig, so I could not find where each exon ends within the protein (.fas) output.
This protein contains more than one exon, and I want to split it into its individual exons.
Because the coordinates in the MetaEuk GFF file are contig-based, I am able to use them to split the sequences in the .codon.fas file, but not those in the .fas output.
I am not 100% sure I have understood your need so please correct me if I am wrong.
It seems like you wish to split each single fasta record to multiple records, one for each exon.
If so, then indeed, MetaEuk does not provide this kind of output, but it should be possible to write a script that creates this FASTA from the original FASTA file. Each exon is described in the FASTA header, separated by pipes from the other exons. The numbers given for each exon are its original coordinates on the contig. Please note the possible short overlap between exons; there is one between the first and second exons in your example. Also note that, unlike the coordinates reported in the MetaEuk header, the GFF coordinates start at index 1, as is standard for that format. The header is described here: https://github.com/soedinglab/metaeuk#the-metaeuk-header
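Such a script could look roughly like the sketch below. It is not part of MetaEuk, and it rests on several assumptions that should be checked against the header documentation linked above: that each exon field has the form `low[taken_low]:high[taken_high]:length[taken_length]`, that the bracketed "taken" nucleotide length is what actually went into the joined sequence, that this length is a multiple of 3, and that exons are listed in the same order as they appear in the protein. The function names and the example header are made up for illustration.

```python
import re

# ASSUMED exon field layout: low[taken_low]:high[taken_high]:length[taken_length]
EXON_RE = re.compile(r"^(\d+)\[(\d+)\]:(\d+)\[(\d+)\]:(\d+)\[(\d+)\]$")


def parse_exons(header):
    """Return (taken_low, taken_high, taken_nt_length) for each exon field."""
    exons = []
    for field in header.lstrip(">").strip().split("|"):
        match = EXON_RE.match(field)
        if match:  # non-exon fields (accessions, strand, score, ...) are skipped
            exons.append(
                (int(match.group(2)), int(match.group(4)), int(match.group(6)))
            )
    return exons


def split_protein(header, protein):
    """Cut a protein into one piece per exon, using the taken nucleotide lengths."""
    pieces = []
    pos = 0
    for _low, _high, nt_len in parse_exons(header):
        if nt_len % 3 != 0:
            # ASSUMPTION violated: the exon does not cover whole codons.
            raise ValueError(f"exon length {nt_len} is not a multiple of 3")
        aa_len = nt_len // 3
        pieces.append(protein[pos:pos + aa_len])
        pos += aa_len
    if pos != len(protein):
        # Fail loudly rather than silently emit misaligned exons.
        raise ValueError(
            f"exon lengths cover {pos} residues, protein has {len(protein)}"
        )
    return pieces


if __name__ == "__main__":
    # Hypothetical header and sequence, for illustration only.
    header = ">P1|contig1|+|100|1e-20|2|10|60|10[10]:21[21]:12[12]|46[46]:60[60]:15[15]"
    for i, piece in enumerate(split_protein(header, "MKTAYIAKQ"), start=1):
        print(f"exon {i}: {piece}")
```

The length check at the end is deliberate: if the exon lengths do not add up to the protein length (for example because of a stop codon or a layout that differs from the one assumed here), the script stops instead of writing wrong boundaries.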
Hello Eli,
Does MetaEuk provide any coordinate information for splitting a long coding sequence into its exons?
Thank you!