Understanding values in annotation file #261

Closed
jfourquet2 opened this issue Dec 16, 2020 · 9 comments

@jfourquet2

Hi,
To be sure I understand the fields in the output annotation file: are the elements in the "Transferred annotations fields" all the GO terms, EC numbers, etc. of all the OGs present in the "eggNOG OGs" column?
Thanks in advance for your answer!

@Cantalapiedra
Collaborator

Hi @jfourquet2 ,

could you specify which version of eggnog-mapper you are using, and also give an example of the results you are asking about?

Thank you.

Best,
Carlos

@jfourquet2
Author

Hi Carlos,
I'm using version 2.0.2-rf1 of eggNOG-mapper and, for example, I have this result:

#query_name	seed_eggNOG_ortholog	seed_ortholog_evalue	seed_ortholog_score	eggNOG OGs	narr_og_name	narr_og_cat	narr_og_desc	best_og_name	best_og_cat	best_og_desc	Preferred_name	GOs	EC	KEGG_ko	KEGG_Pathway	KEGG_Module	KEGG_Reaction	KEGG_rclass	BRITE	KEGG_TC	CAZy	BiGG_Reaction	PFAMs
SC1802-114574_CCGCGGTT-AGCGCTAG-BHLGV2DSXX_L0041.Prot_00008	877418.ATWV01000015_gene2702	1.3e-73	283.5	COG3850@1|root,COG3850@2|Bacteria	COG3850@2|Bacteria	T	phosphorelay sensor kinase activity	COG3850@2|Bacteria	T	phosphorelay sensor kinase activity		-	2.7.13.3,4.6.1.1	ko:K01768,ko:K07673,ko:K07713	ko00230,ko02020,ko02025,ko04113,ko04213,map00230,map02020,map02025,map04113,map04213	M00471,M00499,M00695	R00089,R00434	RC00295	ko00000,ko00001,ko00002,ko01000,ko01001,ko02022	-	-	-	4HB_MCP_1,AAA,AAA_2,AAA_5,CZB,Cache_1,Cache_3-Cache_2,DUF3365,DUF443,GAF,GAF_2,Guanylate_cyc,HAMP,HATPase_c,HD,HTH_8,Hemerythrin,HisKA,HisKA_3,MASE3,MCPsignal,NIT,PAS_4,PilJ,Sigma54_activ_2,Sigma54_activat,TarH,dCache_1,dCache_3

In the eggNOG OGs column, the first result line has COG3850@1|root,COG3850@2|Bacteria (followed in the next columns by COG3850@2|Bacteria, T, phosphorelay sensor kinase activity, COG3850@2|Bacteria). I wanted to know whether the functional annotation columns (GOs, EC, etc.) contain all the GO terms, EC numbers, etc. of the COGs listed in this eggNOG OGs column.

@Cantalapiedra
Collaborator

Hi @jfourquet2 ,

not sure if I understand your question, but I will try to answer. Besides the eggNOG OGs column you have another column, best_og_name, which is the OG used to retrieve the orthologs from which the annotation terms are finally obtained. So, more specifically, the annotations you see should not come from both COG3850@1|root and COG3850@2|Bacteria, but only from COG3850@2|Bacteria in this case.
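
As a rough illustration of this point, here is a minimal Python sketch that reads an annotation table and reports, for each query, all the OGs listed and the single OG in best_og_name. It is not part of eggnog-mapper: the function name `parse_annotations` is hypothetical, and the embedded table is a trimmed copy of the example from this thread, keeping only four of its columns.

```python
import csv
import io

# Trimmed copy of the example row from this thread (only 4 of the 24 columns).
tsv = (
    "#query_name\teggNOG OGs\tbest_og_name\tEC\n"
    "SC1802-114574_CCGCGGTT-AGCGCTAG-BHLGV2DSXX_L0041.Prot_00008\t"
    "COG3850@1|root,COG3850@2|Bacteria\tCOG3850@2|Bacteria\t2.7.13.3,4.6.1.1\n"
)

def parse_annotations(handle):
    """Yield one dict per query (hypothetical helper, not an eggnog-mapper API)."""
    reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        yield {
            "query": row["#query_name"],
            # Every OG found for the seed ortholog across taxonomic levels.
            "all_ogs": row["eggNOG OGs"].split(","),
            # The one OG whose orthologs supplied the annotation terms.
            "annotation_source_og": row["best_og_name"],
        }

records = list(parse_annotations(io.StringIO(tsv)))
for rec in records:
    print(rec["query"], "is annotated from", rec["annotation_source_og"])
```

For a real output file, `io.StringIO(tsv)` would be replaced by an open file handle; any extra comment lines at the top of the file would need to be skipped first.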

I hope this makes sense.

Best,
Carlos

@jfourquet2
Author

Hi Carlos,
Thanks a lot for your answer, it is very clear now! I also have another question: why are there different annotations separated by commas?
Best,
Joanna

@Cantalapiedra
Collaborator

Hi Joanna,

glad to help. There are different annotations separated by commas because there are different annotations in the eggNOG DB for your Orthologous Group. For example, check COG3850 at http://eggnog5.embl.de/ under the "Functional profile" -> "Domains" tabs.
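
When processing the file, each of these comma-separated fields can be turned into a list of individual terms. The sketch below is only an illustration: `split_terms` is a hypothetical helper, and treating "-" (or an empty field) as "no annotation" is an assumption based on the example row earlier in this thread.

```python
def split_terms(field):
    """Split a comma-separated annotation field into a list of terms.

    Assumption: '-' or an empty string means the query has no annotation
    of that type, as the GOs column of the example row suggests.
    """
    field = field.strip()
    if field in ("", "-"):
        return []
    return field.split(",")

# Values copied from the example row in this thread.
ec_numbers = split_terms("2.7.13.3,4.6.1.1")
kegg_ko = split_terms("ko:K01768,ko:K07673,ko:K07713")
gos = split_terms("-")

print(ec_numbers)
print(kegg_ko)
print(gos)
```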

I hope this makes sense.

Best,
Carlos

@jfourquet2
Author

jfourquet2 commented Dec 17, 2020

Hi Carlos,
Thanks a lot! I found the description of the eggNOG-mapper v1 output file at https://github.com/eggnogdb/eggnog-mapper/wiki/eggNOG-mapper-v1 and it states that the 11th column of the output file corresponds to best_OG|evalue|score: Best matching Orthologous Groups (only in HMM mode). Does the best_og_name column of eggNOG-mapper v2.0.2-rf1 correspond to this column of the old output file? I have not used HMM profiles (because of the version I used), so I did not understand that well...
Concerning the annotation columns (EC, GOs, etc.), is all this information contained in the eggNOG 5.0.1 database? You do not use any database other than eggNOG? I ask because I wanted to know, for example, when the PFAM database is updated, whether that update is taken into account directly by upgrading eggNOG-mapper, or whether eggNOG must first be updated after the PFAM update, after which I must update the version of eggNOG used by eggnog-mapper. Thanks a lot in advance!
Best,
Joanna

@Cantalapiedra
Collaborator

Hi Joanna,

sorry for the delay in answering.

> Thanks a lot! I found the description of the eggNOG-mapper v1 output file at https://github.com/eggnogdb/eggnog-mapper/wiki/eggNOG-mapper-v1 and it states that the 11th column of the output file corresponds to best_OG|evalue|score: Best matching Orthologous Groups (only in HMM mode). Does the best_og_name column of eggNOG-mapper v2.0.2-rf1 correspond to this column of the old output file? I have not used HMM profiles (because of the version I used), so I did not understand that well...

It should be conceptually the same. However, when using HMMER the first hit is in fact an OG. From that OG, the query is realigned to the OG members, and the best hit is used as the seed ortholog. Then, the next steps (finding the other OGs in the hierarchy, deciding which is best, etc.) are the same. I guess the "only in HMM mode" was because the evalue and score from a hit to an OG are only obtained using HMMER. With diamond, the evalue and score are from the alignment to the seed ortholog.

> Concerning the annotation columns (EC, GOs, etc.), is all this information contained in the eggNOG 5.0.1 database? You do not use any database other than eggNOG? I ask because I wanted to know, for example, when the PFAM database is updated, whether that update is taken into account directly by upgrading eggNOG-mapper, or whether eggNOG must first be updated after the PFAM update, after which I must update the version of eggNOG used by eggnog-mapper. Thanks a lot in advance!

Yes, all the annotations are from the eggNOG 5.0.1 database, unless you are using the --pfam_realign options, in which case the PFAM database is used directly. The PFAM database used for the eggNOG 5.0.1 DB is currently PFAM31, if I recall correctly. We have plans to update all the annotations soon, but I cannot confirm when that will happen. As for updating, so far the idea is that each eggnog-mapper version has an associated eggNOG database, and therefore when you update eggnog-mapper and run "emapper.py --version" you should be warned if the eggNOG DB version is not the one expected for that emapper.py version. In that case, you had better run "download_eggnog_data.py" again to update the database.
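
If you want to script that check, a minimal Python sketch could look like the following. It only wraps the "emapper.py --version" call mentioned above. The helper name `emapper_version` is hypothetical, and the exact text that emapper.py prints is not assumed here, so the output is simply shown for you to read.

```python
import shutil
import subprocess

def emapper_version(executable="emapper.py"):
    """Return whatever `<executable> --version` prints, or None if it is not on PATH.

    Hypothetical helper: the wording of the report is not parsed.
    """
    path = shutil.which(executable)
    if path is None:
        return None
    result = subprocess.run(
        [path, "--version"], capture_output=True, text=True, check=False
    )
    # Combine both streams, since a warning may be written to either one.
    return (result.stdout + result.stderr).strip()

report = emapper_version()
if report is None:
    print("emapper.py was not found on PATH")
else:
    # Read this for any warning about the eggNOG DB version; if there is one,
    # run download_eggnog_data.py again as described above.
    print(report)
```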

I hope this makes sense.

Best,
Carlos

@jfourquet2
Author

Hi Carlos,
Thank you for your detailed answer!
It helps a lot.
Best,
Joanna

@Cantalapiedra
Collaborator

Glad to be of help.
Best,
Carlos
