Assertion Error during MSA #10
That is a strange one! After the sequences are aligned in the MSA, the Trycycler code checks to make sure that they are the same sequences as before. I.e. if you take each sequence and remove the gaps ( I can't say what the problem is without deeper investigation. Is there any chance you could share the Ryan |
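For readers following along, the check described above can be sketched roughly as follows. This is a minimal illustration, not Trycycler's actual code: the function names `strip_gaps` and `sequences_changed_by_msa` are hypothetical, and the sketch assumes gaps are written as `-`.

```python
def strip_gaps(aligned_seq: str) -> str:
    """Remove alignment gap characters (assumed to be '-') from one MSA row."""
    return aligned_seq.replace("-", "")


def sequences_changed_by_msa(originals: dict, aligned: dict) -> list:
    """Return the names of sequences whose gap-stripped MSA row differs
    from the original, unaligned sequence.

    An empty list means the alignment only inserted gaps, which is what
    the assertion expects.
    """
    return [
        name
        for name, seq in originals.items()
        if strip_gaps(aligned[name]) != seq
    ]


originals = {"A_contig_1": "ACGTAC", "B_contig_1": "ACTAC"}
aligned = {"A_contig_1": "ACGTAC", "B_contig_1": "AC-TAC"}
print(sequences_changed_by_msa(originals, aligned))
```

The comparison is an exact string match, so any difference in case, length, or content between the two versions of a sequence is enough to trip it.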
Hi Ryan, Thanks for the quick response. I have attached the requested file. I think this might have to do with the size of the max indel--much appreciated for checking into this. Best, |
This turned out to be a fairly straightforward bug: the lowercase characters in your inputs confused Trycycler. I guess I never tested it with an assembler that uses lowercase in its FASTA files! I've fixed the problem by making Trycycler explicitly convert to uppercase when loading and saving FASTAs. You can grab the fixed version by either pulling from the master branch or from this fresh release: v0.4.2. Thanks for letting me know and sharing your sequence for debugging purposes! Ryan |
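To illustrate why case matters here, the sketch below parses FASTA text and uppercases every sequence as it is read. It is an assumption-laden illustration, not the loader Trycycler ships: `load_fasta_uppercase` is a hypothetical name, and the parser only handles plain, uncompressed FASTA text.

```python
def load_fasta_uppercase(fasta_text: str) -> list:
    """Parse FASTA text into (name, sequence) pairs, uppercasing sequences.

    Hypothetical helper for illustration only; it ignores blank lines and
    joins multi-line sequences.
    """
    records = []
    name, chunks = None, []
    for line in fasta_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                records.append((name, "".join(chunks)))
            name, chunks = line[1:].strip(), []
        else:
            # Normalising case here means soft-masked (lowercase) input
            # compares equal to the same bases in uppercase later on.
            chunks.append(line.upper())
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


records = load_fasta_uppercase(">contig_1\nacgtAC\nggTT\n>contig_2\nTTaa\n")
print(records)
```

Without the `upper()` call, a sequence stored as `acgt` and the same sequence returned as `ACGT` by another tool would fail an exact string comparison even though the bases are identical.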
I have encountered the same issue with version 0.5.0, so this isn't upper/lower case anymore. At first, I had 4 assemblies, which looked like this after reconcile:
(To achieve the above, I had to slightly increase the max_indel_size to 265 and max_add_seq to 11000.) At
I then tried removing the most obvious outlier
Still got
Unfortunately, I can't share the sequences for detailed debugging, but I do have some observations about these assemblies:
This is unlikely to help much, but here are some more outputs from the first stages of the pipeline:
and a dotplot for cluster_001 |
Merging MSA (2024-09-13 16:33:34) AssertionError. My workaround: at line 675, I added "#" to comment out the assertion. The run then passed and the results look OK. I then obtained this: RRwich, OK? |
It might be okay, but I would like to understand why that assertion caused a crash. Simply commenting out the assertion line prevents the crash, but it might have allowed corrupted data through. If you can share your data, could you send me the |
Hi Ryan. I've encountered this error a couple times recently. While I cannot send the sequences, I did inspect 2_all_seqs.fasta in Vim. Specifically I used :/[^ATCG] to search for non-ATCG characters, and there were none except for the headers:
Hope this helps. |
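A scripted version of the same search may be a useful cross-check, because a Vim pattern such as `[^ATCG]` is case-insensitive when the `ignorecase` option is set and would then skip lowercase bases. The sketch below counts every non-header character outside an allowed set, case-sensitively. `unexpected_characters` is a hypothetical helper name, and the allowed alphabet of uppercase `ACGT` is an assumption.

```python
from collections import Counter


def unexpected_characters(fasta_text: str, allowed: str = "ACGT") -> Counter:
    """Count sequence characters that are not in `allowed`.

    Header lines (starting with '>') are skipped. The comparison is
    case-sensitive, so lowercase bases are reported as unexpected.
    """
    counts = Counter()
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            continue
        counts.update(ch for ch in line.strip() if ch not in allowed)
    return counts


example = ">A_contig_1\nACGTNacg\n>B_contig_1\nAC-GT\n"
print(unexpected_characters(example))
```

To run it on a real file, read the file into a string first, for example `unexpected_characters(open("2_all_seqs.fasta").read())`.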
Sorry, but that doesn't shed any light on the problem 😕 If you're able to email me the file, I'll delete it once I've debugged the issue. I won't include my email here (for bot-scraping reasons), but you can find it in the license message at the top of Trycycler's source files. |
Okay, I sent you an e-mail. |
Hi,
First just wanted to say thanks for creating such a valuable tool.
I am currently experiencing an issue when running the MSA step of the workflow. I am getting this:
MSA length: 1,756,900 bp
Traceback (most recent call last):
  File "/home/djones2/anaconda3/envs/trycycler/bin/trycycler", line 8, in <module>
    sys.exit(main())
  File "/home/djones2/anaconda3/envs/trycycler/lib/python3.8/site-packages/trycycler/main.py", line 46, in main
    msa(args)
  File "/home/djones2/anaconda3/envs/trycycler/lib/python3.8/site-packages/trycycler/msa.py", line 36, in msa
    merge_pieces(temp_dir, args.cluster_dir, seqs)
  File "/home/djones2/anaconda3/envs/trycycler/lib/python3.8/site-packages/trycycler/msa.py", line 187, in merge_pieces
    assert seqs[n] == msa_minus_dashes
AssertionError
Is there a known fix for this? Has anyone else run into this issue? Thanks in advance!