Assertion Error during MSA #10
Comments
That is a strange one! After the sequences are aligned in the MSA, the Trycycler code checks to make sure that they are the same sequences as before. I.e. if you take each sequence and remove the gaps ( I can't say what the problem is without deeper investigation. Is there any chance you could share the Ryan |
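For readers following along, the check described above can be sketched roughly as below. This is only an illustration and not Trycycler's actual code: the function names and the dict-based inputs are assumptions made for the example.

```python
def strip_gaps(aligned_seq):
    """Remove alignment gap characters ('-') from an aligned sequence."""
    return aligned_seq.replace('-', '')


def find_mismatched_seqs(original_seqs, aligned_seqs):
    """Return names of sequences whose gap-stripped alignment differs from the input.

    Both arguments are dicts mapping sequence name -> sequence string
    (a hypothetical layout chosen for this sketch).
    """
    mismatched = []
    for name, original in original_seqs.items():
        # A missing aligned sequence is treated as a mismatch.
        if strip_gaps(aligned_seqs.get(name, '')) != original:
            mismatched.append(name)
    return mismatched
```

Reporting which sequences differ, rather than asserting on the first one, can make it easier to tell whether the aligner altered a single sequence or all of them.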
Hi Ryan, Thanks for the quick response. I have attached the requested file. I think this might have to do with the size of the max indel. Thanks very much for looking into this. Best, |
This turned out to be a fairly straightforward bug: the lowercase characters in your inputs confused Trycycler. I guess I never tested it with an assembler that uses lowercase in its FASTA files! I've fixed the problem by making Trycycler explicitly convert sequences to uppercase when loading and saving FASTAs. You can grab the fixed version by either pulling from the master branch or from this fresh release: v0.4.2. Thanks for letting me know and sharing your sequence for debugging purposes! Ryan |
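The kind of normalisation described above might look something like the following sketch. It is not the code from the v0.4.2 fix; the function name and the choice to parse an iterable of lines are assumptions made to keep the example self-contained.

```python
def parse_fasta_uppercase(lines):
    """Parse FASTA-formatted lines into (header, sequence) tuples.

    Sequences are converted to uppercase so that soft-masked (lowercase)
    bases compare equal to their uppercase counterparts later on.
    """
    records = []
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith('>'):
            # Starting a new record: save the previous one, if any.
            if header is not None:
                records.append((header, ''.join(chunks).upper()))
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        records.append((header, ''.join(chunks).upper()))
    return records
```

Passing an open file handle as `lines` works as well, since iterating over a text file yields its lines.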
I have encountered the same issue with version 0.5.0, so the cause can no longer be upper/lower case. At first, I had 4 assemblies, which looked like this after reconcile:
(To achieve the above, I had to slightly increase the max_indel_size to 265 and max_add_seq to 11000.) At
I then tried removing the most obvious outlier
Still got
Unfortunately, I can't share the sequences for detailed debugging, but I do have some observations about these assemblies:
This is unlikely to help much, but here are some more outputs from the first stages of the pipeline:
and a dotplot for cluster_001 |
Merging MSA (2024-09-13 16:33:34) AssertionError. My solution: on line 675, add "#" to comment the line out. The result is OK, it passed! Then I harvested this: RRwich, OK? |
It might be okay, but I would like to understand why that assertion caused a crash. Simply commenting out the assertion line prevents the crash, but it might have allowed corrupted data through. If you can share your data, could you send me the |
Hi Ryan. I've encountered this error a couple times recently. While I cannot send the sequences, I did inspect 2_all_seqs.fasta in Vim. Specifically I used :/[^ATCG] to search for non-ATCG characters, and there were none except for the headers:
Hope this helps. |
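A scripted version of the same search can be handy for large files. The sketch below counts every character in the sequence lines that is not an uppercase A, C, G or T, skipping header lines; the function name and the `allowed` parameter are assumptions made for this example.

```python
from collections import Counter


def find_unexpected_chars(lines, allowed='ACGT'):
    """Count characters in FASTA sequence lines that are not in `allowed`.

    Header lines (starting with '>') are skipped. The check is
    case-sensitive, so lowercase bases are reported too.
    """
    unexpected = Counter()
    for line in lines:
        line = line.rstrip('\r\n')
        if line.startswith('>'):
            continue
        unexpected.update(ch for ch in line if ch not in allowed)
    return unexpected
```

An empty result only shows that the input alphabet is clean, which fits the observation above that the search found nothing apart from the headers.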
Sorry, but that doesn't shed any light on the problem 😕 If you're able to email me the file, I'll delete it once I've debugged the issue. I won't include my email here (for bot-scraping reasons), but you can find it in the license message at the top of Trycycler's source files. |
Okay, I sent you an e-mail. |
Hi Ryan. It happened again. I hesitate to send you the sequences because I think it'll just mysteriously work for you again. Taking a cue from @DingJingZhi, I simply commented out line 196 of msa.py and lines 673-675 of consensus.py to get it to work. Note that I needed to comment out two more lines in consensus.py than @DingJingZhi did. |
Very frustrating! Since your previous case worked on my computer but not on yours, I suspect this could be a machine-specific problem. Do you have access to another computer where you could try running it? Also, you could try both MUSCLE v3 and MUSCLE v5. Both worked for me on your last case, but since MUSCLE is doing the heavy lifting for Trycycler MSA, I wouldn't be surprised if the problem relates to MUSCLE somehow. Ryan |
I've been running this on AWS EC2 custom AMIs. Each time I do a run, I'm starting with a fresh machine image, with all the Bioconda stuff preinstalled. So it's a predefined, pristine sandbox each time with the same software versions. For MUSCLE, I've been using the 3.8.31 version from Bioconda. I'll try some other versions. |
...and that did it. Bioconda muscle-3.8.1551 works, but muscle-3.8.31 does not. I can upgrade and downgrade, and these results are reproducible. |
Excellent! I can see on the MUSCLE v3 download page that v3.8.425 has a 'bug fix for long sequences', and presumably v3.8.1551 is a later version and also contains that fix. The alignment @marade sent me had a very long partition, so I suspect the MUSCLE bug may be behind this Trycycler problem. Hopefully using v3.8.1551 also fixes the problem for others. I'll add a note about this to Trycycler's FAQ. |
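For anyone who wants to guard against the older MUSCLE release in a pipeline, a version comparison along these lines may help. It is a sketch under stated assumptions: the threshold 3.8.425 is taken from the download-page note quoted above, the function names are hypothetical, and the banner text in the usage note is only an assumed format, so check what your MUSCLE build actually prints.

```python
import re

# Version that the MUSCLE v3 download page lists as having a
# 'bug fix for long sequences' (per the comment above).
FIXED_IN = (3, 8, 425)


def parse_muscle_version(text):
    """Extract a dotted version such as '3.8.1551' as a tuple of ints, or None."""
    match = re.search(r'(\d+(?:\.\d+)+)', text)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split('.'))


def has_long_sequence_fix(text):
    """Return True if the version found in `text` is at least FIXED_IN."""
    version = parse_muscle_version(text)
    return version is not None and version >= FIXED_IN
```

For example, `has_long_sequence_fix('MUSCLE v3.8.31 by Robert C. Edgar')` is false while `has_long_sequence_fix('3.8.1551')` is true. Comparing integer tuples matters here because a plain string comparison would order '1551' before '425'.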
great!
|
Hi,
First just wanted to say thanks for creating such a valuable tool.
I am currently experiencing an issue when running the MSA step of the workflow. I am getting this:
MSA length: 1,756,900 bp
Traceback (most recent call last):
  File "/home/djones2/anaconda3/envs/trycycler/bin/trycycler", line 8, in <module>
    sys.exit(main())
  File "/home/djones2/anaconda3/envs/trycycler/lib/python3.8/site-packages/trycycler/main.py", line 46, in main
    msa(args)
  File "/home/djones2/anaconda3/envs/trycycler/lib/python3.8/site-packages/trycycler/msa.py", line 36, in msa
    merge_pieces(temp_dir, args.cluster_dir, seqs)
  File "/home/djones2/anaconda3/envs/trycycler/lib/python3.8/site-packages/trycycler/msa.py", line 187, in merge_pieces
    assert seqs[n] == msa_minus_dashes
AssertionError
Is there a known fix for this? Has anyone else run into this issue? Thanks in advance!