Assertion `hasNodeId(node_id)' failed #43
Comments
Hi, from the error it looks like the pantranscriptome does not match the graph that was given as input (ref.xg). Could you share the command lines you used when running vg rna and vg mpmap? Thanks!
Hi, here are the full commands I used:
bgzip -c contigs.vcf > contigs.vcf.gz && tabix contigs.vcf.gz
vg mpmap -t 2 -x vg_rna.spliced.xg -g vg_rna.spliced.gcsa -d vg_rna.spliced.dist -f r1.fq.gz -f r2.fq.gz > mpmap.gamp
Thanks!
To co-generate the pantranscriptome and the spliced graph/indexes, you should use
Sorry to bother you again. I'm trying your suggestion, but now I'm getting another error I can't resolve. The output: I've tried adding the parameter --gfa ref.gfa with a similar result: [IndexRegistry]: Checking for phasing in VCF(s).
The issue is that
Have you already checked for phasing in your VCF?
Hello!
VCFs can express either phased or unphased genotypes. Phased genotypes link together the alleles at multiple loci as occurring on the same haplotype. Unphased genotypes simply assert the alleles at each locus without specifying what combination of alleles co-occur on each haplotype. The pantranscriptome is built from haplotype-specific transcripts, so you need phased genotypes in order to specify the haplotype sequences. There's more detail about phasing in the VCF format in section 1.4.2 of the file specification.
example.vcf.zip
The most common way you'd see it expressed is if the genotypes were separated with a bar (e.g.
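For anyone wanting to check this on their own file, here is a minimal Python sketch that counts how many genotype calls in a VCF are phased. It relies only on the VCF convention that `|` separates phased alleles and `/` separates unphased ones; the function names (`open_vcf`, `classify_gt`, `phasing_summary`) are made up for this illustration and are not part of vg or rpvg.

```python
import gzip
from collections import Counter


def open_vcf(path):
    """Open a plain or gzip/bgzip-compressed VCF as text."""
    return gzip.open(path, "rt") if path.endswith(".gz") else open(path, "rt")


def classify_gt(gt):
    """Classify one GT value as 'missing', 'unphased', 'phased', or 'haploid'."""
    if gt in (".", "./.", ".|."):
        return "missing"
    if "/" in gt:
        return "unphased"
    if "|" in gt:
        return "phased"
    # No separator at all: a single allele, i.e. a haploid call.
    return "haploid"


def phasing_summary(lines):
    """Count genotype classes over all samples in an iterable of VCF lines."""
    counts = Counter()
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header and blank lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 10:
            continue  # no FORMAT/sample columns on this record
        keys = fields[8].split(":")
        if "GT" not in keys:
            continue
        gt_index = keys.index("GT")
        for sample in fields[9:]:
            parts = sample.split(":")
            gt = parts[gt_index] if gt_index < len(parts) else "."
            counts[classify_gt(gt)] += 1
    return counts


# Example use (file name taken from the commands earlier in this thread):
# with open_vcf("contigs.vcf.gz") as handle:
#     print(phasing_summary(handle))
```

If the summary shows mostly `unphased` or `haploid` calls, the VCF does not describe diploid haplotypes in the way the explanation above requires.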
Thanks for your reply! [IndexRegistry]: Checking for haplotype lines in GFA. This is exactly the crux of my problem, and I find it very difficult to understand. Sorry to bother you again.
I think the issue is further upstream in this pipeline: you're not getting a phased VCF from your Minigraph-Cactus workflow. I would think that Minigraph-Cactus could (in theory) produce phased VCFs, so I'm not sure why you're not getting one. I would recommend making a support request/issue to the Minigraph-Cactus developers to see if it's possible to get a phased VCF as output. It's also a bit puzzling to me that you're getting haploid genotypes on chromosome 1 in the example you gave me, since that's a diploid chromosome in primates.
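To see where haploid calls like these occur, a rough Python sketch along the following lines tallies genotype ploidy per contig. It assumes GT is the first FORMAT key (as the VCF specification requires when GT is present) and counts a missing `.` genotype as ploidy 1; `ploidy_by_contig` is a hypothetical helper name, not a vg command.

```python
import re
from collections import Counter, defaultdict


def ploidy_by_contig(lines):
    """Map contig name -> Counter of {ploidy: number of genotype calls}."""
    result = defaultdict(Counter)
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header and blank lines
        fields = line.rstrip("\n").split("\t")
        # Require sample columns and GT as the first FORMAT key.
        if len(fields) < 10 or fields[8].split(":")[0] != "GT":
            continue
        for sample in fields[9:]:
            gt = sample.split(":")[0]
            # Alleles are separated by '/' (unphased) or '|' (phased).
            ploidy = len(re.split(r"[/|]", gt))
            result[fields[0]][ploidy] += 1
    return result
```

A contig that is expected to be diploid but shows most of its calls under ploidy 1 would be worth raising with the developers of the tool that produced the VCF.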
Ok, thank you for your reply. Wish you all the best!
Hello, sorry for the delay. I phased my VCF using vcf_phase.py (https://ppp.readthedocs.io/en/latest/PPP_pages/Functions/vcf_phase.html). With this phased VCF, the autoindex command now works.
Now I'm getting an error running rpvg: Error:
Hi, I am not sure why this is happening. Would you be able to share the data? You can send it to [email protected] Thanks!
Hi, see related issue: #46 (comment)
Thank you for your help!
Thank you!
Hi, I need help with an error running the rpvg command.
I'm getting "paths_index.cpp:73: uint32_t PathsIndex::nodeLength(uint32_t) const: Assertion `hasNodeId(node_id)' failed" error, but I have no idea what that means.
The full command is:
rpvg -t 2 -g ref.xg -p pantranscriptome.gbwt -a mpmap.gamp -o rpvg --inference-model haplotypes-transcripts
Any idea where the problem could come from: the .xg, .gbwt, or .gamp input?
Thanks!