CIGAR and query sequence are of different length with art_illumina #22

Closed
farchaab opened this issue Aug 30, 2024 · 0 comments · Fixed by #23 or #35
Labels
bug Something isn't working

farchaab commented Aug 30, 2024

MeSS version

$ mess -v
mess, version 0.9.0

Describe the bug
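Simulating reads with art_illumina produces a truncated SAM file. In some records, the read length given by the CIGAR string is wrongly printed as longer than the maximum read length, so it no longer matches the query sequence. Example (CIGAR is 152M):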

$ sed -n '82315p' NZ_CP033731.0.sam
NZ_CP033731.0-NZ_CP033731.065720	147	NZ_CP033731.0	940764	99	152M	=	940716	-198	TAAGGGAGTAGAACAAATGATTGATATTCAAATTCAAAAATGTGACTTAATTGAAGTAATGAATCAATAATCAATGAAAAGTCGAAGTTGACAATCAAGATGCATGTGAATATTTAGCGACTGAAATTGTGAATGGATTAGAAAGCTACA	CG=GGGGGCGGGGGGCCG8GGGG8GCGCG=CGGGGGCCGGC=CCJJ=CCGGCGG8CGGGCCGGGCG1GCGCGGCCJCGG8GCGGCCGCGCGCGGJGGCGGCJGJ8JGGJGJGGGJJJGJJJGGG8JJJ1JJJJJJJCGGGGGGGGGG=CC

Minimal example

mess simulate -i simulate_test.csv --fasta fasta/ --sdm apptainer --bam

Additional context
The error can be reproduced with samtools 1.20:

  • Files
    SAM

  • Command

$ gunzip NZ_CP033731.0.sam.gz
$ samtools view -Sbh NZ_CP033731.0.sam > output.bam
[E::sam_parse1] CIGAR and query sequence are of different length
[W::sam_read1_sam] Parse error at line 1367
samtools view: error reading file "NZ_CP033731.0.sam"
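
To list every affected record rather than stopping at the first parse error, a small script can repeat the same consistency check outside samtools. The following is a minimal sketch, not part of MeSS or samtools: the function names `cigar_query_length` and `find_mismatches` are hypothetical, and it assumes the standard SAM rule that the M, I, S, = and X operations consume query bases.

```python
import re

# CIGAR operations that consume query (read) bases per the SAM specification.
QUERY_CONSUMING = set("MIS=X")
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def cigar_query_length(cigar):
    """Return the number of read bases implied by a CIGAR string."""
    return sum(int(n) for n, op in CIGAR_OP.findall(cigar) if op in QUERY_CONSUMING)


def find_mismatches(lines):
    """Yield (line_number, read_name, cigar_length, seq_length) for bad records.

    Records with fewer than the 11 mandatory SAM fields (e.g. a truncated
    last line) are reported with None for both lengths.
    """
    for number, line in enumerate(lines, start=1):
        if line.startswith("@"):  # header line
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            yield number, fields[0], None, None
            continue
        cigar, seq = fields[5], fields[9]
        if cigar == "*" or seq == "*":  # nothing to compare
            continue
        expected = cigar_query_length(cigar)
        if expected != len(seq):
            yield number, fields[0], expected, len(seq)
```

Passing an open file handle for NZ_CP033731.0.sam to `find_mismatches` and printing each tuple should give the line numbers of all records where the two lengths disagree, which can then be inspected with `sed -n` as above.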