
ERROR: Uncontrolled exit resulting from an unexpected error #267

Closed
xushaoyi opened this issue Jul 22, 2020 · 8 comments

Labels
error Help required for a GTDB-Tk error.

@xushaoyi commented Jul 22, 2020

Hi, sorry to disturb you. I have run into a problem when using GTDB-Tk to classify bins. Here are the details.
The command is: gtdbtk classify_wf --cpus 24 -x fa --genome_dir temp/binning/bin_refinment_maxbin_matbat_70_10/metawrap_70_10_bins --out_dir temp/binning/bin_classify_gtdbtk_maxbin_matbat_70_10
The error details are as follows:
[2020-07-22 21:10:53] INFO: Done.
[2020-07-22 21:10:59] INFO: Aligning markers in 239 genomes with 24 threads.
[2020-07-22 21:10:59] INFO: Processing 237 genomes identified as bacterial.
[2020-07-22 21:11:08] INFO: Read concatenated alignment for 30,238 GTDB genomes.

==> Aligned 0/237 (0%) genomes [?it/s, ETA ?]
[2020-07-22 21:11:09] ERROR: Uncontrolled exit resulting from an unexpected error.

================================================================================
EXCEPTION: AttributeError
MESSAGE: 'NoneType' object has no attribute 'terminate'


Traceback (most recent call last):
File "/public/home/luhuijie/lib/miniconda2/envs/gtdbtk/lib/python3.8/site-packages/gtdbtk/external/hmm_aligner.py", line 115, in align_marker_set
p_worker.start()
File "/public/home/luhuijie/lib/miniconda2/envs/gtdbtk/lib/python3.8/multiprocessing/process.py", line 121, in start
self._popen = self._Popen(self)
File "/public/home/luhuijie/lib/miniconda2/envs/gtdbtk/lib/python3.8/multiprocessing/context.py", line 224, in _Popen
return _default_context.get_context().Process._Popen(process_obj)
File "/public/home/luhuijie/lib/miniconda2/envs/gtdbtk/lib/python3.8/multiprocessing/context.py", line 277, in _Popen
return Popen(process_obj)
File "/public/home/luhuijie/lib/miniconda2/envs/gtdbtk/lib/python3.8/multiprocessing/popen_fork.py", line 19, in init
self._launch(process_obj)
File "/public/home/luhuijie/lib/miniconda2/envs/gtdbtk/lib/python3.8/multiprocessing/popen_fork.py", line 70, in _launch
self.pid = os.fork()
OSError: [Errno 12] Cannot allocate memory

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/public/home/luhuijie/lib/miniconda2/envs/gtdbtk/lib/python3.8/site-packages/gtdbtk/main.py", line 509, in main
gt_parser.parse_options(args)
File "/public/home/luhuijie/lib/miniconda2/envs/gtdbtk/lib/python3.8/site-packages/gtdbtk/main.py", line 752, in parse_options
self.align(options)
File "/public/home/luhuijie/lib/miniconda2/envs/gtdbtk/lib/python3.8/site-packages/gtdbtk/main.py", line 334, in align
markers.align(options.identify_dir,
File "/public/home/luhuijie/lib/miniconda2/envs/gtdbtk/lib/python3.8/site-packages/gtdbtk/markers.py", line 476, in align
user_msa = hmm_aligner.align_marker_set(cur_genome_files,
File "/public/home/luhuijie/lib/miniconda2/envs/gtdbtk/lib/python3.8/site-packages/gtdbtk/external/hmm_aligner.py", line 130, in align_marker_set
p.terminate()
File "/public/home/luhuijie/lib/miniconda2/envs/gtdbtk/lib/python3.8/multiprocessing/process.py", line 133, in terminate
self._popen.terminate()
AttributeError: 'NoneType' object has no attribute 'terminate'

@aaronmussig (Member) commented

Hello,

You're out of memory:

File "/public/home/luhuijie/lib/miniconda2/envs/gtdbtk/lib/python3.8/multiprocessing/popen_fork.py", line 70, in _launch
self.pid = os.fork()
OSError: [Errno 12] Cannot allocate memory

Try running it again and monitor how much memory is free on the server. If it's still a problem can you please include the following info:

  • CPU: grep -m 1 "^model name" /proc/cpuinfo && grep -c "^processor" /proc/cpuinfo
  • RAM: grep "^MemTotal" /proc/meminfo
  • OS: cat /etc/os-release
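To see how close the machine gets to exhaustion during the run, one option is a small watcher that reads /proc/meminfo directly (a Linux-only sketch; the 30-second interval and mem.log filename are arbitrary choices, not anything GTDB-Tk requires):

```shell
# Print currently available memory; run it in a loop in the background
# while GTDB-Tk is working to see how low free memory gets.
mem_avail() {
  awk '/^MemAvailable:/ {printf "%.1f GB available\n", $2 / 1024 / 1024}' /proc/meminfo
}
mem_avail
# e.g.: while true; do mem_avail >> mem.log; sleep 30; done &
```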

@aaronmussig added the "error" label (Help required for a GTDB-Tk error.) Jul 23, 2020
@xushaoyi (Author) commented


Thanks! I retried after submitting the job to a server, and it worked, but a new issue came up, related to pplacer. The details are below. I used the PBS system to submit the job with the following script:
#!/bin/bash
#PBS -N test-gtdbtk
#PBS -l nodes=4:ppn=24
#PBS -j oe
#PBS -l walltime=1000:00:00
cd $PBS_O_WORKDIR
gtdbtk classify_wf --cpus 24 -x fa --genome_dir temp/binning/bin_refinment_maxbin_matbat_70_10/metawrap_70_10_bins --out_dir temp/binning/bin_classify_gtdbtk_maxbin_matbat_70_10
# To run: qsub <script name>

The error was as follows:
EXCEPTION: PplacerException
MESSAGE: An error was encountered while running pplacer.


Traceback (most recent call last):
File "/public/home/lizhen/lib/anaconda3/envs/gtdbtk/lib/python3.8/site-packages/gtdbtk/main.py", line 509, in main
gt_parser.parse_options(args)
File "/public/home/lizhen/lib/anaconda3/envs/gtdbtk/lib/python3.8/site-packages/gtdbtk/main.py", line 754, in parse_options
self.classify(options)
File "/public/home/lizhen/lib/anaconda3/envs/gtdbtk/lib/python3.8/site-packages/gtdbtk/main.py", line 482, in classify
classify.run(genomes,
File "/public/home/lizhen/lib/anaconda3/envs/gtdbtk/lib/python3.8/site-packages/gtdbtk/classify.py", line 450, in run
classify_tree = self.place_genomes(user_msa_file,
File "/public/home/lizhen/lib/anaconda3/envs/gtdbtk/lib/python3.8/site-packages/gtdbtk/classify.py", line 193, in place_genomes
pplacer.run(self.pplacer_cpus, 'wag', pplacer_ref_pkg, pplacer_json_out,
File "/public/home/lizhen/lib/anaconda3/envs/gtdbtk/lib/python3.8/site-packages/gtdbtk/external/pplacer.py", line 92, in run
raise PplacerException(
gtdbtk.exceptions.PplacerException: An error was encountered while running pplacer.

I checked the pplacer.bac120 file; the details are:
Running pplacer v1.1.alpha19-0-g807f6f3 analysis on temp/binning/bin_classify_gtdbtk_maxbin_matbat_70_10/align/gtdbtk.bac120.user_msa.fasta...
Didn't find any reference sequences in given alignment file. Using supplied reference alignment.
Pre-masking sequences... sequence length cut from 5040 to 5040.
Determining figs... figs disabled.
Allocating memory for internal nodes... done.
Caching likelihood information on reference tree...

@aaronmussig (Member) commented

This is a known issue when running pplacer on HPCs / using scheduler systems (see the FAQ).

Briefly, the bacterial tree requires ~110 GB of RAM. The HPC assumes each pplacer thread uses the full amount of memory, so it multiplies that by the number of threads you're using and concludes you're out of memory, i.e. it thinks you're using 110 x 24 = 2.6 TB of RAM.

Ensure that you're able to access at least 110 GB of RAM, and try running GTDB-Tk with --pplacer_cpus 1.
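Concretely, the earlier command would become something like this (the paths are the ones from the original post):

```shell
# A single pplacer thread stops the scheduler from multiplying the ~110 GB
# by the thread count; the identify/align steps still use all 24 CPUs.
gtdbtk classify_wf --cpus 24 --pplacer_cpus 1 -x fa \
    --genome_dir temp/binning/bin_refinment_maxbin_matbat_70_10/metawrap_70_10_bins \
    --out_dir temp/binning/bin_classify_gtdbtk_maxbin_matbat_70_10
```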

@xushaoyi (Author) commented


Thanks! I tried running GTDB-Tk with --pplacer_cpus 1, but it still failed. Here is a strange phenomenon: on the PBS system I requested 4 nodes, but it seemed only one node was used, and once its memory reached the maximum the process was killed.

@aaronmussig (Member) commented

Thanks for including that info. Since pplacer runs in a single thread and I assume there is no distributed memory configuration, each node can provide at most 80 GB.

A couple of things for you to try:

  1. Try submitting your job to a high-memory queue using the appropriate PBS command (request ~120-150 GB).
  2. Use the --scratch_dir argument to specify a path where pplacer can map the memory to disk.

If neither of those methods works, you won't be able to meet the minimum RAM requirement to run GTDB-Tk for now. In that case, you should check out KBase and run it there; note that the R95 update is pending for their web app.
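A sketch of the adjusted PBS script, assuming the site exposes a high-memory queue (the queue name `highmem`, the `mem` request, and the scratch path are placeholders to adapt to your cluster):

```shell
#!/bin/bash
#PBS -N test-gtdbtk
#PBS -q highmem            # hypothetical high-memory queue name
#PBS -l nodes=1:ppn=24     # pplacer cannot span nodes, so request one
#PBS -l mem=150gb          # comfortably above the ~110 GB pplacer needs
#PBS -j oe
#PBS -l walltime=1000:00:00
cd $PBS_O_WORKDIR
gtdbtk classify_wf --cpus 24 --pplacer_cpus 1 \
    --scratch_dir /tmp/gtdbtk_scratch \
    -x fa \
    --genome_dir temp/binning/bin_refinment_maxbin_matbat_70_10/metawrap_70_10_bins \
    --out_dir temp/binning/bin_classify_gtdbtk_maxbin_matbat_70_10
```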

@xushaoyi (Author) commented


Thanks for the suggestions. I tried step 2 and added the --scratch_dir argument to the command line. It ran for a long time, with output as follows; it seems the procedure is stuck in a loop. Is this normal?

==> Step 5 of 9: Caching likelihood information on reference tree (6.53/151.86 GB, 4.30%).
==> Step 5 of 9: Caching likelihood information on reference tree (6.54/151.86 GB, 4.30%).
==> Step 5 of 9: Caching likelihood information on reference tree (6.53/151.86 GB, 4.30%).
[... the same "Step 5 of 9" progress line repeated many more times ...]
==> Step 5 of 9: Caching likelihood information on reference tree (6.53/151.86 GB, 4.30%).
==> Step 5 of 9: Caching likelihood

@aaronmussig (Member) commented

Glad to hear it worked; pplacer does take a performance hit when using that flag. The loop is normal and is only written to stdout to indicate progress; it's not present in the gtdbtk.log file.

The intended behaviour is to overwrite each line when new metrics are available, but it appears your system writes each line instead of flushing the existing one.
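That overwriting behaviour can be reproduced with a plain shell sketch: each update starts with a carriage return (\r), which an interactive terminal interprets as "rewind to the start of the line", while a redirected log file (as under PBS) keeps every update:

```shell
# In a terminal, each \r makes the new text replace the old; in a log
# file all three updates survive and look like separate lines.
progress_demo() {
  for i in 1 2 3; do
    printf '\rStep %d of 3: Caching likelihood information' "$i"
  done
  printf '\n'
}
progress_demo
```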

@xushaoyi (Author) commented


Thanks for your patient responses! Though after several hours the process didn't move on, and the progress only changed from 4.90% to 4.30%, so I stopped it. Anyway, I will try again. Thanks for your help!
