Input data metrics explanation, % of mtDNA reads of the total sequence reads that mapped to the whole mtDNA #216

AliBasuony2022 · 2023-12-23T07:31:56Z

Dear friends,

I have received a question from a reviewer regarding the % of mtDNA reads of the total sequence reads that mapped to the whole mtDNA in NOVOPlasty. Where can I find this information in the NOVOPlasty outputs, please? Is it 0.43 % (see the Input data metrics below)?

Can someone explain the "Input data metrics", please? I'm a bit confused.

Below is the log file.

Kind regards,
Ali

NOVOPlasty: The Organelle Assembler
Version 4.3.1
Author: Nicolas Dierckxsens, (c) 2015-2020

Input parameters from the configuration file: *** Verify if everything is correct ***

Project:

Project name = mito_1_375
Type = mito
Genome range = 15000-18000
K-mer = 33
Max memory = 64
Extended log = 1
Save assembled reads = yes
Seed Input = NC_008434.1_Vv_complete_mitogenome16813bp.fasta
Extend seed directly = no
Reference sequence =
Variance detection =
Chloroplast sequence =

Dataset 1:

Read Length = 151
Insert size = 350
Platform = illumina
Single/Paired = PE
Combined reads =
Forward reads = /mnt/scratch/c1845371/whole_genome/data/375_R1.fastq.gz
Reverse reads = /mnt/scratch/c1845371/whole_genome/data/375_R2.fastq.gz
Store Hash =

Heteroplasmy:

Heteroplasmy =
HP exclude list =
PCR-free =

Optional:

Insert size auto = yes
Use Quality Scores =
Output path = /mnt/scratch/c1845371/whole_genome/mitochondrial_genome/mito_12/

Subsampled fraction: 24.14 %
Forward reads without pair: 13259
Reverse reads without pair: 5025

Retrieve Seed...

Initial read retrieved successfully: TCTTACACCCGCCAGATCTTGCTGTCTATCTATAGATATCATTTCCTTGATATTTTATTTTTTACCGCCTCTATAGTTCGCACCAACAAAGCCAAAAACAAAAGTTAATGTAGCTTAATTAGTAAAGCAAGGCACTGAAAATGCCAAGATG

Start Assembly...

------------Assembly 1 finished: Contigs are automatically merged in Merged_contigs file------------

Contig 01 : 16521 bp
Contig 02 : 349 bp
Contig 03 : 992 bp
Contig 04 : 385 bp
Contig 05 : 881 bp

Total contigs : 5
Largest contig : 16521 bp
Smallest contig : 349 bp
Average insert size : 337 bp

-----------------------------------------Input data metrics-----------------------------------------

Total reads : 105400318
Aligned reads : 455762
Assembled reads : 418834
Organelle genome % : 0.43 %
Average organelle coverage : 4176

ndierckx · 2023-12-25T05:15:54Z

Hi,

Yes it is indeed 0.43%

AliBasuony2022 · 2023-12-25T07:45:28Z

Thanks so much,

But the number of raw reads (pairs) for both mitochondrial and nuclear together is 216,237,628. What does the number 105400318 in the Input data metrics refer to? Is it the number of mitochondrial reads?

Sorry, I'm still confused.

Best regards,
Ali

ndierckx · 2023-12-28T07:27:06Z

105400318 is the total number of reads used. You set a max memory, so NOVOPlasty subsampled your data and only used 105400318 reads; the remaining reads are not used when it subsamples. You have a large dataset, so you don't need to use the complete set.
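
As a quick check of how the figures in the log relate to each other, here is a minimal Python sketch that parses the "Input data metrics" block and recomputes the percentages. The helper names (`parse_metrics`, `percent`) are made up for illustration and are not part of NOVOPlasty. In this log, aligned reads divided by total reads reproduces the reported 0.43 %; that NOVOPlasty computes "Organelle genome %" exactly this way is an inference from these numbers, not something confirmed in this thread.

```python
import re

# Metrics copied from the log posted above (subsampled run, Max memory = 64).
LOG_EXCERPT = """\
Total reads : 105400318
Aligned reads : 455762
Assembled reads : 418834
Organelle genome % : 0.43 %
Average organelle coverage : 4176
"""


def parse_metrics(log_text):
    """Return the integer read counts found in an 'Input data metrics' block."""
    metrics = {}
    for key in ("Total reads", "Aligned reads", "Assembled reads"):
        # Tolerate variable spacing around the colon.
        match = re.search(rf"^{re.escape(key)}\s*:\s*(\d+)", log_text, re.MULTILINE)
        if match:
            metrics[key] = int(match.group(1))
    return metrics


def percent(part, whole):
    """Express part as a percentage of whole."""
    return 100.0 * part / whole


metrics = parse_metrics(LOG_EXCERPT)
aligned_pct = percent(metrics["Aligned reads"], metrics["Total reads"])
assembled_pct = percent(metrics["Assembled reads"], metrics["Total reads"])

print(f"Aligned / total:   {aligned_pct:.2f} %")
print(f"Assembled / total: {assembled_pct:.2f} %")
```

Because both counts come from the same subsampled set of reads, the ratio describes the subsample rather than the full raw dataset.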

AliBasuony2022 · 2024-01-01T18:00:06Z

Good point.
Thanks so much, Nicolas.

AliBasuony2022 · 2024-02-06T18:27:05Z

Dear Nicolas,

Just a follow-up question on this issue.

How do I know the right % of mtDNA reads of the total sequence reads that mapped to the whole mtDNA? I'm doing a comparison between the performance of NOVOPlasty and other de novo assemblies and this information is so important.

When I used different memory settings (all other settings fixed), I got the same length for the largest contig, but different numbers of assembled, aligned, and total reads.

Is the subsampled fraction of 99.99 % correct when setting Max memory = Null? If so, the number of total reads is higher than the number of reads in the raw data. I'm still confused, sorry.

max memory Null
log_mito_1_375_12_6_max memory Null.txt

max memory 100
log_mito_1_375_12_3_max memory 100.txt

memory 64
log_mito_1_375_12_max memory 64.txt

Kind regards,
Ali

ndierckx closed this as completed Jan 4, 2024

AliBasuony2022 mentioned this issue Feb 6, 2024

Input data metrics explanation, % of mtDNA reads of the total sequence reads that mapped to the whole mtDNA_follow up #222

Open
