RDS is always 0.0 #31
Hi @barbaracania,
Could you provide some information on your data? Is it a metagenome or an isolate? Are the assemblies merged?
Thank you for your answer. My data is metagenomic, but the samples were treated with a plasmid-safe DNase, so it should contain mostly plasmid reads. I ran SPAdes on it with the --metaplasmid option. Afterwards, I only modified the contig names by removing all the information after the coverage, as otherwise Platon was not able to read the coverage from them correctly. Without the modification, the names look like this: >NODE_1_length_63294_cov_26.832935_cutoff_20_type_circular. The data was not modified in any other way.

Since it is suggested that the contigs produced by metaplasmidSPAdes should still be confirmed as plasmids by additional means, I thought of including Platon in my pipeline for this purpose.

Just to make this clear, I understand that using the --characterize option for Platon only gives information about the contigs. I used it only to get an idea about my data and also to show it to you. When I was testing the three different modes, I was not using this option. For example, when I used

platon contigs.fasta --db ~/Databases/db --output platon_accu --mode accuracy --threads 8

my contigs.tsv file starts like this:

ID Length Coverage # ORFs RDS Circular Inc Type(s) # Replication # Mobilization # OriT # Conjugation # AMRs # rRNAs # Plasmid Hits

My contigs.chromosome.fasta contains only the first two contigs from my previous post, which Platon did not identify as circular, and contigs.plasmid.fasta has everything else, including the contig on which the rRNA genes were found. When I try the sensitivity mode, I get the same results as with the accuracy mode, but the specificity mode gives me empty contigs.tsv and contigs.plasmid.fasta files, while all the contigs end up in contigs.chromosome.fasta.

From what I understood, the accuracy mode should take all the contig characteristics into consideration when deciding whether a contig comes from a plasmid or a chromosome, while the other two modes rely only on the RDS values. Since all my RDS values are 0.0, I am confused about why I am getting the results described above...
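For anyone wanting to reproduce the header clean-up described above, here is a minimal sketch in Python. It is not the script that was actually used. It assumes every header follows the SPAdes pattern shown in the example name, and the function names `trim_spades_header` and `trim_fasta_headers` are made up for illustration.

```python
import re

# Assumption: names start with NODE_<n>_length_<len>_cov_<coverage>,
# as in the example header quoted in the comment above.
_SPADES_PREFIX = re.compile(r"^(NODE_\d+_length_\d+_cov_\d+(?:\.\d+)?)")


def trim_spades_header(header: str) -> str:
    """Drop everything after the coverage field of a SPAdes contig name."""
    has_marker = header.startswith(">")
    name = header[1:] if has_marker else header
    match = _SPADES_PREFIX.match(name)
    # Names that do not match the assumed pattern are returned unchanged.
    trimmed = match.group(1) if match else name
    return ">" + trimmed if has_marker else trimmed


def trim_fasta_headers(lines):
    """Yield FASTA lines with header lines trimmed and sequence lines untouched."""
    for line in lines:
        if line.startswith(">"):
            yield trim_spades_header(line.rstrip("\n")) + "\n"
        else:
            yield line
```

Applied to a file, this would be something like `out.writelines(trim_fasta_headers(open("contigs.fasta")))`, where the file name is only a placeholder.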
Hi, So in principle, there are 2 different reasons that I can think of:
Good morning,
I took a look at the logs and from a technical perspective, everything is just fine. As mentioned above, I'm currently computing and compiling a database update which could help here, although this would require further investigation. As of today, it seems that Platon is not the right tool for your dataset. May I refer you to PlasFlow? Since Platon was initially developed with single isolates in mind, PlasFlow might provide better results, as it solely addresses metagenome data. I'll leave this open until we've released the new database version and Platon [v1.7], just to let you know.
Hi!
I am trying to use Platon 1.6 installed with BioConda to identify plasmid contigs. By running the following command:
platon contigs.fasta --db ~/Databases/db --output platon_accu --mode accuracy --threads 8 --characterize
I got the following result (I am showing the first few lines):
ID                                  Length  Coverage  # ORFs  RDS  Circular  Inc Type(s)  # Replication  # Mobilization  # OriT  # Conjugation  # AMRs  # rRNAs  # Plasmid Hits
NODE_1_length_66028_cov_26.537579   66028   26.5      50      0.0  no        0            0              0               0       0              0       0        0
NODE_1_length_63294_cov_26.832935   63294   26.8      48      0.0  no        0            0              0               0       0              0       0        0
NODE_1_length_63165_cov_26.834275   63165   26.8      48      0.0  yes       0            0              0               0       0              0       0        0
NODE_1_length_51546_cov_2.360878    51546   2.4       74      0.0  yes       0            0              0               0       0              0       0        0
NODE_2_length_32011_cov_1.484036    32011   1.5       39      0.0  yes       0            0              0               0       0              0       0        0
NODE_3_length_19747_cov_141.934964  19747   141.9     3       0.0  yes       0            0              0               0       0              0       2        0
After running the same command without "--characterize", the first two contigs are classified as chromosomal and the rest as plasmids. I am not sure whether this is a bug or whether I am misunderstanding how the RDS calculation or the classification criteria work, but the RDS value for all my contigs (over a thousand of them) is always 0.0. Moreover, it looks like rRNA genes were detected in the last contig shown, and its number of ORFs was very low, but it was still classified as a plasmid. Lastly, when I tried the sensitivity mode, I got the same results as with the accuracy mode, but with the specificity mode, all my contigs were classified as chromosomal. Is this expected behavior?
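To check the "always 0.0" observation over the whole output rather than just the first few lines, a small sketch like the one below could be used. It assumes the Platon output is tab-separated and uses the column names from the header shown above. The function name `summarize_rds` is hypothetical and not part of Platon.

```python
import csv
import io


def summarize_rds(tsv_text: str) -> dict:
    """Summarize RDS, circularity and rRNA columns of a Platon-style TSV.

    Assumption: tab-separated text whose header uses the column names
    'ID', 'RDS', 'Circular' and '# rRNAs' as shown in the post above.
    """
    rows = list(csv.DictReader(io.StringIO(tsv_text), delimiter="\t"))
    rds_values = [float(row["RDS"]) for row in rows]
    return {
        "contigs": len(rows),
        "rds_zero": sum(1 for value in rds_values if value == 0.0),
        "circular": sum(1 for row in rows if row["Circular"] == "yes"),
        "with_rrna": [row["ID"] for row in rows if int(row["# rRNAs"]) > 0],
    }
```

If `rds_zero` equals `contigs`, none of the contigs received a non-zero RDS. `with_rrna` lists the contigs that carry rRNA hits and may deserve a closer look.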