Input data metrics explanation, % of mtDNA reads of the total sequence reads that mapped to the whole mtDNA #216
Comments
Hi, yes, it is indeed 0.43 %.
Thanks so much. But the number of raw reads (pairs) for both mitochondrial and nuclear together is 216,237,628. What does the number 105400318 in the Input data metrics refer to? Is it the number of mitochondrial reads? Sorry, I'm still confused. Best regards,
105400318 is the total number of reads used. You set a max memory, so NOVOPlasty subsampled your data and used only 105400318 reads; the remaining reads are not used when you subsample. You have a large dataset, so you don't need to use the complete set.
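As a rough check of the numbers discussed here, the sketch below recomputes the percentages from the values in the "Input data metrics" block of the log. It assumes that "Organelle genome %" is aligned reads divided by the total reads used after subsampling; that assumption reproduces the logged 0.43 %, but it has not been confirmed against the NOVOPlasty source. The helper name `percent` is only illustrative.

```python
def percent(part: int, whole: int) -> float:
    """Return part as a percentage of whole."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return 100.0 * part / whole


# Values copied from the "Input data metrics" block of the log in this issue.
total_reads = 105_400_318   # reads actually used after subsampling
aligned_reads = 455_762
assembled_reads = 418_834

# Assumption: the reported percentage is aligned reads / total reads used.
organelle_pct = percent(aligned_reads, total_reads)
assembled_pct = percent(assembled_reads, total_reads)

print(f"Aligned / total reads used:   {organelle_pct:.2f} %")
print(f"Assembled / total reads used: {assembled_pct:.2f} %")
```

Because the denominator is the subsampled read count rather than the raw read count, the percentage describes the subsample, which is why the total differs from the number of reads in the raw files.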
Good point.
Dear Nicolas, Just a follow-up question on this issue. How do I know the right % of mtDNA reads of the total sequence reads that mapped to the whole mtDNA? I'm comparing the performance of NOVOPlasty with other de novo assemblers, and this information is very important. When I used different memory settings (all other settings fixed), I got the same length for the largest contig, but different numbers of assembled, aligned and total reads. Is the subsampled fraction of 99.99 % reported when Max memory is set to Null correct? If so, the number of total reads exceeds the number of reads in the raw data. I'm still confused, sorry. Settings compared: max memory Null, max memory 100, max memory 64. Kind regards,
Dear friends,
I have received a question from a reviewer regarding the % of mtDNA reads of the total sequence reads that mapped to the whole mtDNA in NOVOPlasty. Where can I find this information in the NOVOPlasty outputs, please? Is it 0.43 % (see Input data metrics below, please)?
Can someone explain the "Input data metrics", please? I'm just confused.
Below is the log file.
Kind regards,
Ali
NOVOPlasty: The Organelle Assembler
Version 4.3.1
Author: Nicolas Dierckxsens, (c) 2015-2020
Input parameters from the configuration file: *** Verify if everything is correct ***
Project:
Project name = mito_1_375
Type = mito
Genome range = 15000-18000
K-mer = 33
Max memory = 64
Extended log = 1
Save assembled reads = yes
Seed Input = NC_008434.1_Vv_complete_mitogenome16813bp.fasta
Extend seed directly = no
Reference sequence =
Variance detection =
Chloroplast sequence =
Dataset 1:
Read Length = 151
Insert size = 350
Platform = illumina
Single/Paired = PE
Combined reads =
Forward reads = /mnt/scratch/c1845371/whole_genome/data/375_R1.fastq.gz
Reverse reads = /mnt/scratch/c1845371/whole_genome/data/375_R2.fastq.gz
Store Hash =
Heteroplasmy:
Heteroplasmy =
HP exclude list =
PCR-free =
Optional:
Insert size auto = yes
Use Quality Scores =
Output path = /mnt/scratch/c1845371/whole_genome/mitochondrial_genome/mito_12/
Subsampled fraction: 24.14 %
Forward reads without pair: 13259
Reverse reads without pair: 5025
Retrieve Seed...
Initial read retrieved successfully: TCTTACACCCGCCAGATCTTGCTGTCTATCTATAGATATCATTTCCTTGATATTTTATTTTTTACCGCCTCTATAGTTCGCACCAACAAAGCCAAAAACAAAAGTTAATGTAGCTTAATTAGTAAAGCAAGGCACTGAAAATGCCAAGATG
Start Assembly...
------------Assembly 1 finished: Contigs are automatically merged in Merged_contigs file------------
Contig 01 : 16521 bp
Contig 02 : 349 bp
Contig 03 : 992 bp
Contig 04 : 385 bp
Contig 05 : 881 bp
Total contigs : 5
Largest contig : 16521 bp
Smallest contig : 349 bp
Average insert size : 337 bp
-----------------------------------------Input data metrics-----------------------------------------
Total reads : 105400318
Aligned reads : 455762
Assembled reads : 418834
Organelle genome % : 0.43 %
Average organelle coverage : 4176