Update medaka in artic #250

hoelzer · 2023-02-21T23:06:20Z

Solves #247

WIP - tests need to be run.


hoelzer · 2023-02-22T08:08:27Z

nextflow run replikation/poreCov -r update-medaka-in-artic -profile slurm,singularity --krakendb GRCh38.p13_SC2_2022-03-01.tar.gz --cachedir /singularity/ --update --fastq_pass fastq_pass --samples samplesheet.csv --primerV V4.1 --medaka_model r941_min_hac_g507 -resume

[87/93c492] process > artic_ncov_wf:artic_medaka (17)                      [100%] 24 of 24 ✔

🤯

hoelzer · 2023-02-22T08:20:27Z

Works for me also w/ --medaka_model r1041_e82_400bps_sup_g615

Unfortunately, I cannot test this branch w/ --primerV V5.3.2_400. Can I somehow merge that in here @replikation @DataSpott ?

Then we could compare the results here between

--primerV V5.3.2_400 w/ old ARTIC container and medaka old model
--primerV V5.3.2_400 w/ new ARTIC container and medaka model r1041_e82_400bps_sup_g615

hoelzer · 2023-02-22T08:52:54Z

I injected the matching primer scheme via

--primerV poreCov/data/external_primer_schemes/nCoV-2019/V5.3.2_400/nCoV-2019.primer.bed

and it looks really good.

I get, from a bird's-eye view, the same mutations as with the old ARTIC container, but we're now using medaka 1.7.2 and have access to the new R1041 etc. models.

So comparing the R9 model and now R10 w/ the new container via md5 checksums of the final genome FASTAs, I only see differences for four samples in this run. For one, the new model repaired a frameshift - perfect. For the other three, we will look via pairwise whole-genome alignment and then report here.
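For anyone who wants to repeat the checksum comparison, here is a minimal sketch of how it could be scripted. It is not the exact procedure used above: the function names, the two directory arguments, and the `*.fasta` file pattern are assumptions made for illustration.

```python
import hashlib
from pathlib import Path


def md5_of(path: Path) -> str:
    """Return the hex MD5 digest of a file's bytes."""
    return hashlib.md5(path.read_bytes()).hexdigest()


def differing_fastas(old_dir, new_dir, pattern="*.fasta"):
    """List file names present in both directories whose MD5 digests differ.

    old_dir / new_dir are hypothetical folders holding the final genome
    FASTAs from the old-container run and the new-container run.
    """
    old = {p.name: md5_of(p) for p in Path(old_dir).glob(pattern)}
    new = {p.name: md5_of(p) for p in Path(new_dir).glob(pattern)}
    shared = old.keys() & new.keys()  # only compare samples found in both runs
    return sorted(name for name in shared if old[name] != new[name])
```

Note that a checksum also changes when only the FASTA header or the line wrapping differs, so a mismatch just flags a sample for a closer look.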

hoelzer · 2023-02-22T12:56:20Z

Update: for the three sequences w/ different md5 checksums (V5 primers, R9 in old container vs R10 in new container), we see that the difference is only a single base.

Either the old container w/ R9 calls an N at a single position where the new container w/ R10 calls the reference/alternative base, or vice versa:

The lower sub-image shows the mapped reads. Apparently, an A should be called instead of the reference G. The old container with the R9 model does this, apparently correctly. In contrast, the new container with the R10 model seems undecided and inserts an N at the end of the ARTIC workflow.

For the top sub-image, it's another position in the genome and the other way around: the old container with R9 calls an N, while the new container with R10 calls a base.
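A rough sketch of how such single-base differences could be listed once two consensus sequences are available as plain strings. It assumes both sequences have the same length (no indels), which is a simplification compared to the pairwise whole-genome alignment mentioned above; the function name is hypothetical.

```python
def base_differences(seq_a: str, seq_b: str):
    """Return (1-based position, base in seq_a, base in seq_b) for each mismatch.

    Assumes equal-length sequences; sequences with indels need a real
    alignment first.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences differ in length; align them first")
    diffs = []
    for pos, (a, b) in enumerate(zip(seq_a.upper(), seq_b.upper()), start=1):
        if a != b:
            diffs.append((pos, a, b))
    return diffs
```

For example, `base_differences("ACGTA", "ACNTA")` reports one mismatch at position 3, where the first sequence has G and the second has N.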

hoelzer · 2023-02-22T12:58:03Z

These are the only differences we discovered so far. Quite minor in my eyes. And bc/ we basecalled this run w/ an R10.4.1* model, it also makes full sense to me to use that model in Medaka, even though there are slight differences that "look better" with an older model such as R9. Basecalling w/ an R10 model and then analysing with an R9 model is not really justifiable either.

replikation · 2023-02-23T08:08:30Z

@DataSpott if everything is fine on your end we can merge

DataSpott · 2023-02-23T08:48:31Z

Tested it with one of our routine runs (starting from fastq-pass) and could only find a difference in one sample. This was only one more ambiguous base with the old container compared to the new container (1112 vs 1113 ambiguous bases). So on my end the run worked fine.
Tested with commit "db05fa8".
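A small sketch of how ambiguous bases in a consensus FASTA could be counted for this kind of comparison. It assumes that every sequence character other than A, C, G, or T counts as ambiguous (so N and IUPAC codes, but also gap characters); this is not necessarily how the numbers above were obtained, and the function name is hypothetical.

```python
def count_ambiguous(fasta_text: str) -> int:
    """Count sequence characters that are not A, C, G, or T (case-insensitive)."""
    count = 0
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            continue  # skip FASTA header lines
        count += sum(1 for base in line.strip().upper() if base not in "ACGT")
    return count
```

Running this on the same sample from both containers and comparing the two totals would reproduce the kind of check described here.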

martin and others added 4 commits November 1, 2022 23:57

bump artic container incl medaka v1.7.2

d906d01

bump artic container incl medaka v1.7.2, fix double medakas (thx Chri…

dff2982

…stian)

use artic v1.2.3 w/ medaka v1.6.1

f2bb773

update medaka in artic container

db05fa8

hoelzer marked this pull request as draft February 21, 2023 23:06

hoelzer marked this pull request as ready for review February 22, 2023 15:49

hoelzer requested a review from replikation February 22, 2023 15:49

replikation approved these changes Feb 23, 2023

hoelzer merged commit 08565ae into master Feb 23, 2023
