core dumped #64

Open
minjinhan opened this issue Aug 1, 2023 · 5 comments

Comments

@minjinhan

Hi,
We recently used spaln (version 3.4.13f) to align CDS sequences (~25,000 sequences) to a genome (~450 Mb) on a CentOS 8 system (our server has 1 TB of memory), but we encountered the following problems:

  1. The run is interrupted with a "core dumped" error, for example:
    (1) double free or corruption (!prev)
    Aborted (core dumped)

    (2)15046 > 179: Vmf out of range !
    151127305 > 481: Vmf out of range !
    free(): invalid size
    Aborted (core dumped)

    (3) 33753347 > 3443: Vmf out of range !
    64576 > 14242: Vmf out of range !
    Segmentation fault (core dumped)

    (4) malloc(): invalid size (unsorted)
    Aborted (core dumped)

  2. More than 2,000 CDS sequences did not produce any result. The resulting files are empty, but the following information is displayed on the terminal. I do not understand what this information means, and I do not know why these sequences did not produce mapping hits.

    BMnr21259 > 0 2664 Contig29_118 13717 13720 5.17 0.00 1310 1380 57 56 130 124
    BMnr21250 < 0 1368 Contig49_118 18167 18147 17.75 5.20 658 711 28 27 68 62
    BMnr21295 > 0 1242 Contig32_118 14382 14387 0.00 9.28 642 664 26 25 61 57
    .....

We generated the genomic database as follows:
spaln -W -KD -E -t10 Sample118A.gf

We have tried the following commands, but the "core dumped" error still occurs:
spaln -Q4 -O0,5,6,7 -M5 -t120 -d Sample118A BMnr_CDS.fas
spaln -Q7 -O0,5,6,7 -M5 -t120 -d Sample118A BMnr_CDS.fas
spaln -Q4 -O0,5,6,7 -M300 -t120 -d Sample118A BMnr_CDS.fas
spaln -Q7 -O0,5,6,7 -M300 -t120 -d Sample118A BMnr_CDS.fas
spaln -Q4 -O0,5,6,7 -po -yX1 -M500 -S3 -LS -t120 -d Sample118A BMnr_CDS.fas
spaln -Q7 -O0,5,6,7 -po -yX1 -M500 -S3 -LS -t120 -d Sample118A BMnr_CDS.fas
....

We have also tried running with different query sequences, and it seems that only some of them cause the core dump. But there are so many CDS sequences that it is hard to figure out exactly which of the 25,000 queries trigger it.
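One possible way to narrow down the offending queries is to bisect the query file: run a batch, and if it aborts, split it in half and test each half. The sketch below is only an illustration under stated assumptions. The helper names (`read_fasta`, `find_crashing`, `spaln_crashes`) are hypothetical, the spaln flags are copied from the commands above with `-t16` substituted, and a crash is assumed to show up as the process being killed by a signal (a negative return code on POSIX).

```python
import subprocess
import tempfile
from pathlib import Path


def read_fasta(text):
    """Split FASTA text into a list of records, each starting with '>'."""
    records, current = [], []
    for line in text.splitlines():
        if line.startswith(">") and current:
            records.append("\n".join(current) + "\n")
            current = []
        if line.strip():
            current.append(line)
    if current:
        records.append("\n".join(current) + "\n")
    return records


def find_crashing(records, crashes):
    """Return the records that crash on their own, found by recursive halving.

    `crashes(batch)` must return True when running that batch aborts.
    """
    if not records or not crashes(records):
        return []
    if len(records) == 1:
        return records
    mid = len(records) // 2
    return find_crashing(records[:mid], crashes) + find_crashing(records[mid:], crashes)


def spaln_crashes(batch, db="Sample118A", threads=16):
    """Run spaln on one batch (assumed flags); True if it was killed by a signal."""
    with tempfile.TemporaryDirectory() as tmp:
        query = Path(tmp) / "batch.fas"
        query.write_text("".join(batch))
        cmd = ["spaln", "-Q7", "-O0,5,6,7", "-M5", f"-t{threads}", "-d", db, str(query)]
        result = subprocess.run(cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    # On POSIX, a negative return code means termination by a signal
    # such as SIGABRT or SIGSEGV.
    return result.returncode < 0


# Example use (requires spaln and the genome database to be available):
#   records = read_fasta(Path("BMnr_CDS.fas").read_text())
#   for rec in find_crashing(records, spaln_crashes):
#       print(rec.splitlines()[0])
```

If the crash only appears when certain queries run together (for example because of memory corruption across threads), both halves of a failing batch may pass, and this search will then report nothing for that batch.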

Any suggestions?

Thanks,
Min-Jin

@ogotoh
Owner

ogotoh commented Aug 4, 2023

Thank you for your report.

I wonder whether some of the options you set might be causing the trouble.

  1. The -E option is obsolete and should not be used.
  2. The argument to the -t option should not be larger than the number of cores on your system. Even if your machine is equipped with many cores, too many threads will quickly exhaust the available memory. Please try a moderate number, say 16 or 32.
  3. The argument to the -M option should also be moderate. I expect this value (the expected number of close paralogs) to be only a few for DNA queries.
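As a rough illustration of points 2 and 3, the thread count can be capped at both the core count and a moderate ceiling before the command line is built. This is only a sketch: the helper names `spaln_threads` and `spaln_command` and the default cap of 32 are assumptions, and the remaining flags are copied from the commands in the original report.

```python
import os


def spaln_threads(requested, cap=32):
    """Clamp the requested thread count to the core count and a moderate cap."""
    cores = os.cpu_count() or 1
    return max(1, min(requested, cores, cap))


def spaln_command(db, query, requested_threads=120, max_paralogs=5):
    """Build a spaln argument list with a moderate -t and -M (assumed defaults)."""
    threads = spaln_threads(requested_threads)
    return ["spaln", "-Q7", "-O0,5,6,7", f"-M{max_paralogs}",
            f"-t{threads}", "-d", db, query]


# Example: print(" ".join(spaln_command("Sample118A", "BMnr_CDS.fas")))
```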

< There are >2000 CDS sequences that did not give the result.

It is not unusual for some queries to produce no results and instead show messages like your examples. This happens when, for some reason, the query is not similar to any part of the genomic sequence.

Although I don’t think they are the major cause of your trouble, I have found a few minor bugs. I have just uploaded a new version, spaln2.4.13g, that fixes them.

Osamu,

@minjinhan
Author

minjinhan commented Aug 8, 2023 via email

@ogotoh
Owner

ogotoh commented Aug 10, 2023

Dear Min-Jin,

I am trying a larger scale test to find when spaln fails. Please wait a while till I can figure out the source of the trouble.

Because of the nature of the queue used by spaln, it is normal that only a small fraction of the CPUs is actually working at the end of a run. A small number of queries may account for the prolonged execution.
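The tail effect can be illustrated with a toy model of a shared work queue, in which each worker takes the next job as soon as it becomes free. This is a generic simulation, not spaln's actual scheduler; the function names and the job durations are made up for illustration.

```python
import heapq


def simulate_queue(durations, workers):
    """Return (makespan, sorted finish time of each worker) for a shared queue."""
    free_at = [0.0] * workers
    heapq.heapify(free_at)
    for d in durations:
        start = heapq.heappop(free_at)  # the earliest free worker takes the job
        heapq.heappush(free_at, start + d)
    finish = sorted(free_at)
    return finish[-1], finish


def busy_workers(finish, t):
    """Count the workers still running at time t."""
    return sum(1 for f in finish if f > t)


# One slow query (10 time units) among twelve fast ones, on 4 workers.
makespan, finish = simulate_queue([10.0] + [1.0] * 12, workers=4)
print(makespan, finish, busy_workers(finish, 5.0))
```

In this model three workers run out of jobs at time 4, while the single slow job keeps one worker busy until time 10, so most CPUs sit idle for the last part of the run.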

Osamu,

@ogotoh
Owner

ogotoh commented Sep 13, 2023

Dear Min-Jin,

I have just uploaded a new version of spaln, ver3.0.0. Compared with previous versions, the computation speed has been considerably improved, partly due to modified algorithms and vectorization. For DNA queries, the speed-up is most prominent with the -LS (local similarity) option. Please try the new version and, if possible, let me know what you think of it.

Osamu,

@minjinhan
Author

minjinhan commented Sep 13, 2023 via email
