Taxonomy Parse errors #15

Open
chiras opened this issue Jan 15, 2019 · 2 comments

@chiras
Collaborator

chiras commented Jan 15, 2019

[01-15 14:10:34] [ReferenceDbCreator] Adding taxonomy to fasta
Use of uninitialized value in numeric eq (==) at /NCBI-Taxonomy/lib/NCBI/Taxonomy.pm line 407, <IN> line 157.

This happens for approximately 30 IDs after downloading the sequences and the taxonomy file with the command specified in #14.

@iimog
Member

iimog commented Jan 15, 2019

I'll have to dig into this. Will do it asap (probably tonight).

@iimog
Member

iimog commented Mar 15, 2019

I finally got around to digging into this problem. Apparently these messages indicate that for some taxids the lineage could not be found in the local nodes.dmp and names.dmp. As a consequence, the tax= field in the fasta file remains empty for those sequences. Krona prints a similar message at the end:

   [ WARNING ]  The following taxonomy IDs were not found in the local
                database and were set to root (if they were recently added to
                NCBI, use updateTaxonomy.sh to update the local database):
                2486830 2338477 2338481 2486579 2338462 2338482 2338485 2338470
...
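
To see which taxids are affected before rebuilding anything, the IDs can be compared against the first column of the local nodes.dmp. The following is only a rough sketch: `missing_taxids` is a hypothetical helper, not part of NCBI-Taxonomy or Krona, and it assumes a plain text file with one taxid per line. The location of nodes.dmp inside the container is not stated in this thread, so it is passed as an argument.

```shell
#!/bin/sh
# Hypothetical helper: print every taxid from a list that has no entry
# in nodes.dmp.
# Usage: missing_taxids NODES_DMP TAXID_LIST
# nodes.dmp fields are delimited by "<tab>|<tab>", so the text before the
# first tab of each line is the taxid.
missing_taxids() {
    awk -F '\t' '
        FILENAME == ARGV[1] { known[$1] = 1; next }  # nodes.dmp: remember known taxids
        NF && !($1 in known) { print $1 }            # taxid list: report unknown taxids
    ' "$1" "$2"
}
```

Any taxid printed by this function would be expected to end up with an empty tax= field until the local dumps are refreshed.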

So the local taxonomy dumps inside the docker containers need to be updated:

cd /NCBI-Taxonomy/
perl make_taxid_indizes.pl
cd /Krona/KronaTools/
./updateTaxonomy.sh

Running the same command afterwards produces no warnings. The update commands do not take very long, but running them every time a reference database is created is still not necessarily the best option. Currently the taxonomy databases are only updated when the docker image is built (beware of caching), which is probably too rare. We should probably include a script that does this automatically, or a cron job (at least for the web interface).
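
One possible shape for such a script, as a sketch only: run the two update commands from above when the last successful update is older than a given number of days, and record the time of the update in a stamp file. The function names, the stamp file and the age limit are all assumptions made for illustration; nothing like this exists in the repository yet.

```shell
#!/bin/sh
# Hypothetical helpers for refreshing the taxonomy dumps only when needed.

# Succeeds when the stamp file ($1) is missing or older than $2 days.
# find rounds file ages down to whole days, so the limit is approximate.
taxonomy_is_stale() {
    [ ! -e "$1" ] && return 0
    [ -n "$(find "$1" -mtime +"$2" -print)" ]
}

# The two update commands from this comment, run in subshells so the
# caller's working directory is left unchanged.
update_taxonomy() {
    (cd /NCBI-Taxonomy/ && perl make_taxid_indizes.pl) &&
        (cd /Krona/KronaTools/ && ./updateTaxonomy.sh)
}

# Usage: refresh_if_stale STAMP_FILE MAX_AGE_DAYS COMMAND [ARGS...]
# Runs COMMAND only if the stamp is stale; the stamp is touched only
# after COMMAND succeeds, so a failed update is retried on the next call.
refresh_if_stale() {
    stamp=$1
    max_age_days=$2
    shift 2
    if taxonomy_is_stale "$stamp" "$max_age_days"; then
        "$@" && touch "$stamp"
    fi
}
```

A cron job or the reference database creation step could then call `refresh_if_stale /var/tmp/taxonomy.stamp 7 update_taxonomy`, where both the stamp path and the seven day limit are placeholders. This way the update cost is only paid when the dumps are actually old.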
