MIDORI updates #56
Labels
enhancement
New feature or request
LW mid priority
priority for the LW developers
update
update a resource used in pema to a more recent version
According to recent emails with the MIDORI developers, it seems wise to update PEMA to use the MIDORI database from where it is now published. Hopefully this will solve a couple of issues that we have had: (1) the gaps in the taxonomic classification output when there are missing taxon nodes, and (2) some errors and discrepancies in the classifications with respect to NCBI.
Copy of the emails (latest to first):
Sorry to say that we are no longer updating the databases on the "MIDORI server".
We are updating only the databases you can download from here: http://www.reference-midori.info/download.php#
Hi Christina,
Thank you for your email.
I think PEMA is using an old MIDORI database.
I fixed this problem quite a long time ago.
In all formats except the RAW files, we have inserted missing taxonomy by creating it from a lower taxonomic rank (e.g. the class-level description was missing, so it was created from the order level in the following example: >JF502242.1.7041.7724 root_1;Eukaryota_2759;Chordata_7711;class_Crocodylia_1294634;Crocodylia_1294634;Crocodylidae_8493;Crocodylus_8500;Crocodylus intermedius_184240).
Would it be possible for you to download the recent databases from our site and perform the taxonomic assignment locally?
We are using NCBI taxonomy for all MIDORI databases.
I think those inconsistencies are happening because PEMA is using an old database (the NCBI taxonomy has been consistently revised).
If you have further questions, please write me back again.
Best regards, Ryuji
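To illustrate the header format described in the reply above, here is a minimal Python sketch that parses such a header and flags levels that were created from a lower rank. It is not part of PEMA or MIDORI: the function name `parse_midori_header` and the `RANKS` tuple are hypothetical, and the sketch assumes that every header carries exactly the eight levels in this order and that a filled-in level is marked by a rank prefix such as `class_`, as in the example.

```python
# Rank order assumed from the eight levels named in this thread.
RANKS = ("root", "superkingdom", "phylum", "class",
         "order", "family", "genus", "species")


def parse_midori_header(header):
    """Split a MIDORI-style FASTA header into an accession and its lineage.

    Hypothetical helper: assumes "<accession> <level>;<level>;..." where each
    level is "<name>_<taxid>" and there is one level per entry in RANKS.
    """
    # Split on the first whitespace only, since species names contain spaces.
    accession, lineage = header.lstrip(">").split(None, 1)
    levels = []
    for rank, field in zip(RANKS, lineage.strip().split(";")):
        # The taxid is whatever follows the last underscore.
        name, taxid = field.rsplit("_", 1)
        # Assumption: a name like "class_Crocodylia" marks a level that was
        # created from a lower rank because the original node was missing.
        filled = name.startswith(rank + "_")
        if filled:
            name = name[len(rank) + 1:]
        levels.append({"rank": rank, "name": name,
                       "taxid": int(taxid), "filled": filled})
    return accession, levels
```

Applied to the example header above, the class level would be reported with `filled` set to true and the name `Crocodylia`, while the other seven levels would be left as they are.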
Dear Dr Machida,
My name is Christina Pavloudi and I am a Post Doctoral Researcher at the CNRS.
In my previous Post Doc position, I was working for the ARMS-MBON project (my colleagues are in CC), where we were sequencing ARMS samples for COI (among other genes) and we were using PEMA for the analyses of the results.
PEMA is using MIDORI for the taxonomic assignment of COI reads, hence I am contacting you regarding an issue we came across.
At the moment, the MIDORI output does not always have the same number of columns, i.e. the same number of taxonomic levels, for all the assignments.
You can see an example in the attached file ("Example_species_notall.tsv").
For some assignments, the output has all 8 levels: root, superkingdom, phylum, class, order, family, genus, species (see attached file "Example_species_alllevels.tsv").
It would be extremely helpful, in terms of FAIRness for the ARMS-MBON project, if the MIDORI output was consistent and always contained the 8 levels, even if some columns were empty (see attached "Example_species_emptylevels.tsv"). Would you perhaps consider doing something like this for future versions of MIDORI?
Also, could I ask which taxonomy you are using in MIDORI?
As you can see in "Example_species_emptylevels_completed.tsv", for some of the species in question the missing taxonomic levels do exist (if we check WoRMS, and also the NCBI Taxonomy). Also, some of them are different from the output that is produced by MIDORI.
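The fixed eight-column layout requested in this email could be produced downstream along the following lines. This is only a sketch under stated assumptions: the layout of the attached TSV files is not shown in this issue, so the input here is a hypothetical dictionary mapping rank names to taxon names, and `to_fixed_row` and `to_tsv` are illustrative names rather than PEMA functions.

```python
# The eight levels listed in the email, in output column order.
RANKS = ("root", "superkingdom", "phylum", "class",
         "order", "family", "genus", "species")


def to_fixed_row(assignment):
    """Return one value per rank, with an empty string for any missing rank.

    `assignment` is assumed to be a dict such as {"genus": "Crocodylus"};
    this input shape is hypothetical.
    """
    return [assignment.get(rank, "") for rank in RANKS]


def to_tsv(assignments):
    """Render assignments as tab-separated text with a header line."""
    lines = ["\t".join(RANKS)]
    lines += ["\t".join(to_fixed_row(a)) for a in assignments]
    return "\n".join(lines)
```

With this approach every row has the same eight columns, so an assignment that lacks a class simply has an empty fourth field instead of shifting the remaining levels to the left.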