A best practices document for assembly to assembly comparisons? #109
Comments
Use Variant calling is a bit complicated. You need to:
git clone https://github.com/lh3/minimap2 # you need the latest version; not from conda
cd minimap2 && make
curl -L https://github.com/attractivechaos/k8/releases/download/v0.2.4/k8-0.2.4.tar.bz2 | tar -jxf -
cp k8-0.2.4/k8-`uname -s` k8 # or copy it to a directory on your $PATH
./minimap2 -cx asm5 --cs ref.fa query.fa | sort -k6,6 -k8,8n | ./k8 misc/paftools.js call - > var.txt
You can find a brief explanation of the output here. The documentation there will be changed in the near future.
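For a quick look at what the call step produced, the sketch below tallies the variant lines in var.txt. It assumes the layout described in the paftools documentation, where lines starting with V carry the REF allele in column 7 and the ALT allele in column 8, and it assumes a gap is written as "-". The function name is made up for illustration.

```shell
# Tally paftools.js call output by variant type.
# Assumption: "V" lines hold REF in column 7 and ALT in column 8,
# and "-" marks the missing allele of an insertion or deletion.
summarize_calls() {
  awk -F'\t' '
    $1 == "V" {
      if ($7 == "-") ins++        # nothing in the reference: insertion
      else if ($8 == "-") del++   # nothing in the query: deletion
      else snp++                  # both alleles present: substitution
    }
    END { printf "substitutions=%d insertions=%d deletions=%d\n", snp, ins, del }
  ' "$1"
}
```

Running `summarize_calls var.txt` prints a single summary line. Lines starting with R (covered regions) are ignored.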
This is great, thanks for the example code! When I ran this on my data, I actually got no SNPs/indels reported:
I used the Do you think my strains are too similar? Is there a way to tune the algorithm further? Thanks for any insight! |
It is more likely that the two strains are too divergent. You may first try:
./minimap2 -c --cs ref.fa query.fa | sort -k6,6 -k8,8n | ./k8 misc/paftools.js call -L20000 - > var.txt
and see what is happening.
Sorry, a typo. You should use |
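When no variants are reported, it can help to check how long the alignments actually are before choosing a threshold such as -L20000. The sketch below assumes the minimap2 output was first saved to a PAF file (the name aln.paf is hypothetical) and uses the alignment block length in PAF column 11 as a rough proxy for the length that paftools filters on. The function name is illustrative.

```shell
# Count PAF alignments whose block length (column 11) is at least min_len.
# Usage: count_long_alignments aln.paf 20000
count_long_alignments() {
  paf=$1
  min_len=$2
  awk -F'\t' -v min="$min_len" '
    $11 >= min { kept++ }
    END { printf "%d of %d alignments have block length >= %d\n", kept, NR, min }
  ' "$paf"
}
```

If very few alignments pass, the sequences are probably fragmented or divergent, and a smaller threshold may be worth trying.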
This did the trick:
I noticed the output format is not VCF. Do you know of a tool that converts this to VCF? I would like to overlay this in Geneious to visualize the mutation landscape. Thanks! |
Unfortunately, it will take some non-trivial effort to implement VCF in paftools.js. I will consider that, but probably not soon. By the way, I suspect that most substitutions and gaps are consensus errors, not true variants. For this dataset, you can consider to add |
It changed the output a little bit:
By the way, I am looking at a MinION and a PacBio assembly of the same strain. The MinION assembly was error-corrected with Illumina data, but I believe the PacBio one wasn't. I will try error-correcting the PacBio data to see if some of these SNPs disappear. If you do end up implementing VCF, that would be great! |
You can generate VCF with:
git clone https://github.com/lh3/htsbox
(cd htsbox && make)
minimap2 -ax asm5 wt_minion.fasta wt_pacbio.fasta | samtools sort - > sorted.bam
htsbox/htsbox pileup -q5 -S10000 -vcf wt_minion.fasta sorted.bam > diff.vcf
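Once diff.vcf exists, a rough breakdown of the records can indicate whether the differences are dominated by single-base changes or by gaps. This is a minimal sketch that relies only on the standard VCF columns (REF in column 4, ALT in column 5); the function name and the two-way classification are illustrative choices, not part of htsbox.

```shell
# Classify VCF records as single-base substitutions or anything else.
# A record counts as "snp" only if REF and every ALT allele are one base long.
vcf_type_counts() {
  awk -F'\t' '
    /^#/ { next }          # skip header lines
    $5 == "." { next }     # skip records without an alternate allele
    {
      n = split($5, alts, ",")
      is_snp = (length($4) == 1)
      for (i = 1; i <= n; i++) if (length(alts[i]) != 1) is_snp = 0
      if (is_snp) snp++; else other++
    }
    END { printf "snp=%d indel_or_complex=%d\n", snp, other }
  ' "$1"
}
```

Calling `vcf_type_counts diff.vcf` prints one summary line.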
I error-corrected the pacbio data but it didn't seem to affect the SNP calls too much. In your own work, how do you "consume" the data that minimap2 produces? Is there another visualization method you can recommend? How do you cross-reference SNP calls with annotation? Thanks for any insight you might have! |
Dear Lh3, I want to reduce a diploid assembly (from canu) to a haploid assembly using minimap2. My species seems to have 90% repeated regions, 7% heterozygosity and probably a lot of SVs (but these are difficult to infer without a good assembly). This species would be a good candidate for Assemblathon! I will copy canu_assembly.fa to a read file, canu_read.fa.
Option 1:
./minimap2 -cx asm10 ./XX/canu_assembly.fa ./XX/canu_read.fa > aln.paf
Option 1 bis: I also plan to modify the alignment options (-k19 -w19 -A1 -B9 -O16,41 -E2,1 -s200 -z200) to fit the biological features of our species.
Option 2:
./minimap2 -d ./XX/canu_assembly.mmi ./XX/canu_assembly.fa
Which option seems the smartest to you? Or would you suggest a new one? JB |
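One thing to keep in mind with option 1: because the query file is a copy of the target, every contig will at least align to itself. The sketch below is one possible way to drop those trivial records from aln.paf before looking at the remaining overlaps. It relies on the standard PAF columns (query name, start, end in columns 1, 3, 4; target name, start, end in columns 6, 8, 9); the function name is hypothetical.

```shell
# Drop PAF records in which a sequence aligns to itself over identical
# coordinates (query name/start/end equal target name/start/end).
drop_self_hits() {
  awk -F'\t' '!($1 == $6 && $3 == $8 && $4 == $9)' "$1"
}
```

For example, `drop_self_hits aln.paf > aln.noself.paf` keeps only alignments between different contigs or between different regions of the same contig.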
@lfaller probably it is too late, but just to let you know that
./minimap2 -c --cs ref.fa query.fa \
  | sort -k6,6 -k8,8n \
  | ./k8 misc/paftools.js call -f ref.fa -L20000 - > var.vcf
@jblamyatifremer the latest github master has a new
./minimap2 -cx asm20 ./XX/canu_assembly.fa ./XX/canu_read.fa > aln.paf
I tried your preset "asm20", but it appears that minimap2 (version 2.9) does not have it. Before closing this post, could you provide the full set of alignment parameter values that the preset uses? You can close this thread, at least from my point of view. |
@jblamyatifremer you need to use the github master branch. Closing... |
Hello, |
"misc/paftools.js" also in binary release. |
I downloaded paftools.js from here. When I run it, I get the following (k8 is on the path, and both are in the same folder):
I see now: I downloaded k8 into a different folder than minimap2 (which was installed with Bioconda), and I don't have the misc/ folder there. This is how my folder (added to the path) looks now:
It is minimap2 release: |
You downloaded paftools.js as HTML. You need to download the "raw" file or clone the repository. |
OK, it works now. I installed minimap2 and k8 outside of conda. |
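A saved web page instead of the script itself is easy to detect before running k8. The sketch below checks whether the start of a file looks like HTML; the function names and the 1024-byte window are illustrative choices, not part of minimap2 or k8.

```shell
# Succeed (exit status 0) if the file starts like an HTML page
# rather than a JavaScript source file.
looks_like_html() {
  head -c 1024 "$1" | grep -qiE '<!DOCTYPE html|<html'
}

# Hypothetical helper: report the result before handing the file to k8.
check_script() {
  if looks_like_html "$1"; then
    echo "$1 looks like an HTML page; download the raw file or clone the repository"
    return 1
  fi
  echo "$1 does not look like HTML"
}
```

For instance, `check_script paftools.js && k8 paftools.js` would only run the script when the check passes.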
is this an alternate for the |
I am looking for a tool to do assembly to assembly mapping for bacteria and I am very excited to have found minimap2!
Do you have a best practices document for this use case?
Or is
minimap2 -ax asm5 ref.fasta query.fasta > alignment.sam
the way to go?
Edited to add that I am specifically curious about how to annotate differences between the two assemblies (i.e. SNPs, indels, etc).
Thanks!
~Lina