Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Assertion failed : length(seqH) >= 1u was: 0 < 1 #3

Closed
DelphIONe opened this issue Jul 16, 2019 · 8 comments
Closed

Assertion failed : length(seqH) >= 1u was: 0 < 1 #3

DelphIONe opened this issue Jul 16, 2019 · 8 comments
Assignees

Comments

@DelphIONe
Copy link

I run this command
porechop -abi -i test_chia.fastq -b test_ch --discard_middle and I obtain

44,000 / 44,000 (100.0%)

41,578 / 44,000 reads had adapters trimmed from their start (1,989,870 bp removed)
36,169 / 44,000 reads had adapters trimmed from their end (1,879,168 bp removed)

Discarding reads containing middle adapters
180 / 44,000 (0.4%)porechop/include/seqan/align/dp_setup.h:283 Assertion failed : length(seqH) >= 1u was: 0 < 1

stack trace:
0 [0x7f6293f24071] seqan::ClassTest::fail() + 0xe
1 [0x7f6293f2bdce] seqan::Value<seqan::Score<int, seqan::Tagseqan::Simple_ >, 0>::Type seqan::setUpAndRunAlignment<int, seqan::Tagseqan::AffineGaps_, unsigned char, seqan::String<seqan::DPCell<int, seqan::Tagseqan::AffineGaps_ >, seqan::Alloc >, seqan::String<unsigned char, seqan::Alloc >, seqan::TraceSegment_<unsigned long, unsigned long>, seqan::Alloc, seqan::Tagseqan::Default_, seqan::String<seqan::SimpleType<unsigned char, seqan::Dna5_>, seqan::Alloc >, seqan::String<seqan::SimpleType<unsigned char, seqan::Dna5_>, seqan::Alloc >, int, seqan::Tagseqan::Simple_, seqan::GlobalAlignment_<seqan::FreeEndGaps_<seqan::False, seqan::False, seqan::False, seqan::False> >, seqan::DPBandConfig<seqan::Tagseqan::BandOff_ >, seqan::FreeEndGaps_<seqan::True, seqan::True, seqan::True, seqan::True>, seqan::TracebackOn<seqan::TracebackConfig_<seqan::Tagseqan::SingleTrace_, seqan::Tagseqan::GapsLeft_ > > >(seqan::DPContext<seqan::DPCell_<int, seqan::Tagseqan::AffineGaps_ >, unsigned char, seqan::String<seqan::DPCell_<int, seqan::Tagseqan::AffineGaps_ >, seqan::Alloc >, seqan::String<unsigned char, seqan::Alloc > >&, seqan::String<seqan::TraceSegment_<unsigned long, unsigned long>, seqan::Alloc >&, seqan::DPScoutState_<seqan::Tagseqan::Default_ >&, seqan::String<seqan::SimpleType<unsigned char, seqan::Dna5_>, seqan::Alloc > const&, seqan::String<seqan::SimpleType<unsigned char, seqan::Dna5_>, seqan::Alloc > const&, seqan::Score<int, seqan::Tagseqan::Simple_ > const&, seqan::AlignConfig2<seqan::GlobalAlignment_<seqan::FreeEndGaps_<seqan::False, seqan::False, seqan::False, seqan::False> >, seqan::DPBandConfig<seqan::Tagseqan::BandOff_ >, seqan::FreeEndGaps_<seqan::True, seqan::True, seqan::True, seqan::True>, seqan::TracebackOn<seqan::TracebackConfig_<seqan::Tagseqan::SingleTrace_, seqan::Tagseqan::GapsLeft_ > > > const&) + 0x87
2 [0x7f6293f2a150] seqan::Value<seqan::Score<int, seqan::Tagseqan::Simple_ >, 0>::Type seqan::setUpAndRunAlignment<seqan::TraceSegment<unsigned long, unsigned long>, seqan::Alloc, seqan::Tagseqan::Default_, seqan::String<seqan::SimpleType<unsigned char, seqan::Dna5_>, seqan::Alloc >, seqan::String<seqan::SimpleType<unsigned char, seqan::Dna5_>, seqan::Alloc >, int, seqan::Tagseqan::Simple_, seqan::GlobalAlignment_<seqan::FreeEndGaps_<seqan::False, seqan::False, seqan::False, seqan::False> >, seqan::DPBandConfig<seqan::Tagseqan::BandOff_ >, seqan::FreeEndGaps_<seqan::True, seqan::True, seqan::True, seqan::True>, seqan::TracebackOn<seqan::TracebackConfig_<seqan::Tagseqan::SingleTrace_, seqan::Tagseqan::GapsLeft_ > >, seqan::Tagseqan::AffineGaps_ >(seqan::String<seqan::TraceSegment_<unsigned long, unsigned long>, seqan::Alloc >&, seqan::DPScoutState_<seqan::Tagseqan::Default_ >&, seqan::String<seqan::SimpleType<unsigned char, seqan::Dna5_>, seqan::Alloc > const&, seqan::String<seqan::SimpleType<unsigned char, seqan::Dna5_>, seqan::Alloc > const&, seqan::Score<int, seqan::Tagseqan::Simple_ > const&, seqan::AlignConfig2<seqan::GlobalAlignment_<seqan::FreeEndGaps_<seqan::False, seqan::False, seqan::False, seqan::False> >, seqan::DPBandConfig<seqan::Tagseqan::BandOff_ >, seqan::FreeEndGaps_<seqan::True, seqan::True, seqan::True, seqan::True>, seqan::TracebackOn<seqan::TracebackConfig_<seqan::Tagseqan::SingleTrace_, seqan::Tagseqan::GapsLeft_ > > > const&, seqan::Tagseqan::AffineGaps_ const&) + 0x77
3 [0x7f6293f291ab] int seqan::globalAlignment<seqan::String<seqan::SimpleType<unsigned char, seqan::Dna5_>, seqan::Alloc >, seqan::Tagseqan::ArrayGaps_, int, seqan::Tagseqan::Simple_, true, true, true, true, seqan::Tagseqan::Default_, seqan::Tagseqan::Gotoh_ >(seqan::Align<seqan::String<seqan::SimpleType<unsigned char, seqan::Dna5_>, seqan::Alloc >, seqan::Tagseqan::ArrayGaps_ >&, seqan::Score<int, seqan::Tagseqan::Simple_ > const&, seqan::AlignConfig<true, true, true, true, seqan::Tagseqan::Default_ > const&, seqan::Tagseqan::Gotoh_ const&) + 0xa5
4 [0x7f6293f281f8] int seqan::globalAlignment<seqan::String<seqan::SimpleType<unsigned char, seqan::Dna5_>, seqan::Alloc >, seqan::Tagseqan::ArrayGaps_, int, seqan::Tagseqan::Simple_, true, true, true, true, seqan::Tagseqan::Default_ >(seqan::Align<seqan::String<seqan::SimpleType<unsigned char, seqan::Dna5_>, seqan::Alloc >, seqan::Tagseqan::ArrayGaps_ >&, seqan::Score<int, seqan::Tagseqan::Simple_ > const&, seqan::AlignConfig<true, true, true, true, seqan::Tagseqan::Default_ > const&) + 0x79
5 [0x7f6293f275e0] adapterAlignment + 0x1bb
6 [0x7f6294173e20] ffi_call_unix64 + 0x4c
7 [0x7f629417388b] ffi_call + 0x2eb
8 [0x7f629416e01a] _ctypes_callproc + 0x49a
9 [0x7f6294161fcb] /usr/lib/python3.5/lib-dynload/_ctypes.cpython-35m-x86_64-linux-gnu.so(+0x9fcb)
10 [0x5c20e7] PyObject_Call + 0x47
11 [0x53b656] PyEval_EvalFrameEx + 0x4ed6
12 [0x53b294] PyEval_EvalFrameEx + 0x4b14
13 [0x53b294] PyEval_EvalFrameEx + 0x4b14
14 [0x53b294] PyEval_EvalFrameEx + 0x4b14
15 [0x540b0b] PyEval_EvalCodeEx + 0x13b
16 [0x4ec3f7] /usr/bin/python3.5()
17 [0x5c20e7] PyObject_Call + 0x47
18 [0x538cab] PyEval_EvalFrameEx + 0x252b

Abandon (core dumped)

Can you help me please ?
Thanks,

@qbonenfant
Copy link
Collaborator

qbonenfant commented Jul 16, 2019

Hi,
I manged to recreate your problem, it seems that some of your read are too short after trimming, and no middle sequence is left.
Your dataset must contains some fairly short sequences, which is natively not well supported by the current Porechop version.

The only solution that come to my mind right now is to filter your input dataset to remove reads with length <= 300 ( 2x the current porechop sampling window, if you want to be on the safe side, but 2 times the inferred adapter length should be enough) which may end up being too short anyway. Depending on your application you may prefer to adjust the sampling window by using --end_size XX to a more forgiving size (90 / 100 for example).

I hope this may help you, keep us informed if it did.

Quentin Bonenfant

@qbonenfant qbonenfant self-assigned this Jul 16, 2019
@DelphIONe
Copy link
Author

Hi,
Thanks for your reply.
The Porechop version (not abi version) works fine on the same file.
Moreover, I have the same result if I launch porechop abi without discard_middle option.

@qbonenfant
Copy link
Collaborator

qbonenfant commented Jul 17, 2019

Hi,
Which version of Porechop (non ABI) are you using ? Some behaviors may differ from current version. Our contribution only add a step right after reading the adapter database, and should not interfer with native code.
Also, removing --discard_middle do not skip the middle adapter ressearch, so it won't change the outcome in this case. If you want to try without middle adapter ressearch, try.--no_split instead.

Quentin

@DelphIONe
Copy link
Author

I use Porechop 0.2.4
The command porechop (abi) with no_split option works fine! I don't know if it help you ?

@qbonenfant
Copy link
Collaborator

Hi,

Since we do use the same version of Porechop, the thing i can think of is that the adapter inference finds a better (or at least fits the reads slightly better) adapter than the ones in the database, and some reads that were not trimmed before now are.

If using --no_split do correct your problem, that confirm my previous hypothesis about your reads, which is that some of them are too short and completely deleted after ends trimming.

You can try it yourself on a subset of very short reads using -v 2 so you will see what porechop is really trimming and what is left after that. You will see that in some cases, porechop will delete the whole sequence if it thinks it is only composed of adapters, leading to the error you encounter.
If times allows it, i will try to patch this in the future.

Unless you really need to keep very short reads, i do recommend that you filter them from your fasta / fastq files if you are sure your dataset is not trimmed.

@DelphIONe
Copy link
Author

Ok, thanks. One question out of subject but... do you have tested your tool on amplicon run ? I'm wondering to one over represented sequence can be confused with adapter. Do you see what I mean ?

@qbonenfant
Copy link
Collaborator

We did tried our tool on amplicon run, and yes we do face this problem. We are still investigating if it is possible to separate adapter from the sequences on those run. We currently need to collect more datasets to extend our knowledge on this kind of data.

I hope i answered all of your questions.

Quentin

@DelphIONe
Copy link
Author

Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants