This repository has been archived by the owner on Jan 3, 2024. It is now read-only.

cellfinder crashes as it maxes out RAM [BUG] #52

Closed
agriffa0201 opened this issue May 11, 2022 · 6 comments · Fixed by #53

@agriffa0201

Describe the bug
Hello, I recently installed cellfinder following the instructions and ran it on a cleared whole brain (2 channels, 500+ GB each). It completed the registration successfully, but crashed shortly afterwards during cell detection. The issue appears to be an inability to allocate further arrays in memory (manually checking resource use with Task Manager confirmed that RAM was saturated before the crash).

File "C:\anaconda52\x64\envs\napari-env\lib\site-packages\cellfinder_core\detect\detect.py", line 132, in main
mp_tile_processor.get_tile_mask, args=(np.array(plane),)
numpy.core._exceptions._ArrayMemoryError: Unable to allocate 208. MiB for an array with shape (10774, 10098) and data type uint16
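As a sanity check, 208 MiB corresponds exactly to one full-resolution uint16 plane of this dataset, so the failure happens once there is no room left for even a single extra plane:

rows, cols, bytes_per_uint16 = 10774, 10098, 2
print(rows * cols * bytes_per_uint16 / 2**20)  # ~207.5 MiB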

I have tried limiting the memory with the --max-ram setting, but even with --max-ram=250 (half of the total RAM), cellfinder ends up using all of the memory anyway and eventually crashes.

File "C:\anaconda52\x64\envs\napari-env\lib\site-packages\tifffile\tifffile.py", line 10649, in read_array
result = numpy.empty(count, dtype) if out is None else out
numpy.core._exceptions._ArrayMemoryError: Unable to allocate 208. MiB for an array with shape (108795852,) and data type uint16

When I try to use cellfinder-napari on a subset of the images (16 GB in total), detection works fine.
Is there a way to actually limit RAM usage by the cell detection step?

To Reproduce
I have attached my conda environment .yaml (renamed to .txt): cellfinder.txt. I ran:
conda activate napari-env
cellfinder -s W:\nobackup\garber\grifalbe\20220202_FULL\C1 -b W:\nobackup\garber\grifalbe\20220202_FULL\C2 -o W:\nobackup\garber\grifalbe\test_full_brain2 -v 3 1 1 --orientation ipr --debug --atlas allen_mouse_25um --max-ram 250

Expected behavior
I hoped to run cell detection on the whole brain, and that --max-ram would limit RAM usage for all steps of the pipeline, preventing it from crashing.

Log file
I am attaching the log of the last failed run.
cellfinder_2022-05-11_10-39-52.log

Desktop:
Windows Version 10.0.19042
500 GB RAM
2x Intel(R) Xeon(R) CPU E5-4669 v4 @ 2.20GHz, 2195 MHz, 12 Core(s)
(no GPUs)


@agriffa0201 agriffa0201 added the bug Something isn't working label May 11, 2022
@adamltyson
Member

Something has gone wrong here; the cell detection step doesn't normally use that much RAM (I've analysed 500 GB images on a laptop with 16 GB of RAM).

Could you try downgrading cellfinder-core to see if some recent changes are to blame?

pip install cellfinder-core==0.2.8

@agriffa0201
Author

Hello,
I downgraded cellfinder-core as you suggested, and managed to fully process the entire brain without issues (RAM usage seemed stable at 80GB).
It took 5 days to process this one brain (1 TB in total) with the 25 µm Allen atlas; is this a reasonable time?
I noticed that for the classification it only used 3 of the available cores (while for detection it used all of them). Is there a way to influence this so that this step is also parallelised more and therefore takes less time?
(my original command was cellfinder -s /path/to/signal -b /path/to/background -o /path/to/output -v 3 1 1 --orientation ipr --debug --atlas allen_mouse_25um --max-ram 400)

2022-05-13 21:50:09 PM - DEBUG - MainProcess cells.py:184 - Removing artifacts
2022-05-13 21:50:48 PM - DEBUG - MainProcess system.py:134 - Determining the maximum number of CPU cores to use
2022-05-13 21:50:48 PM - DEBUG - MainProcess system.py:143 - Number of CPU cores available is: 22
2022-05-13 21:50:48 PM - DEBUG - MainProcess system.py:173 - Setting number of processes to: 22
2022-05-13 21:50:48 PM - DEBUG - MainProcess tf.py:35 - Setting maximum number of threads for tensorflow to: 22
2022-05-13 21:50:50 PM - DEBUG - MainProcess tf.py:26 - No GPUs found, using CPU.
2022-05-13 21:50:50 PM - DEBUG - MainProcess prep.py:53 - No model or weights supplied, so using the default
2022-05-13 21:50:50 PM - DEBUG - MainProcess prep.py:74 - Reading config file: C:\anaconda52\x64\envs\napari-env\lib\site-packages\cellfinder_core\config\cellfinder.conf.custom
2022-05-13 21:50:50 PM - INFO - MainProcess main.py:167 - Running cell classification
2022-05-13 21:50:52 PM - DEBUG - MainProcess system.py:134 - Determining the maximum number of CPU cores to use
2022-05-13 21:50:52 PM - DEBUG - MainProcess system.py:143 - Number of CPU cores available is: 22
2022-05-13 21:50:52 PM - DEBUG - MainProcess system.py:162 - Forcing the number of processes to 3 based on other considerations.
2022-05-13 21:50:52 PM - DEBUG - MainProcess system.py:173 - Setting number of processes to: 3

@adamltyson
Member

5 days is a very long time. Is your data on a local or a network drive? There is a lot of data read/write within cellfinder, so the faster the storage, the quicker it is. If you can put your data on a local SSD, you may see speedups.

How many cells do you have? If you have many tens or hundreds of thousands, cellfinder isn't currently very efficient (see brainglobe/cellfinder#356).

Lastly, there isn't currently a way to speed up the classification. It only uses 3 cores, but the GPU is the bottleneck at this point.

@dstansby the newest version of cellfinder-core seems to be causing problems here. Any ideas?

@dstansby dstansby self-assigned this May 24, 2022
@dstansby
Member

I'll look into this and do some memory profiling across the changes we've been making recently.

@dstansby
Member

Here are some rough memory profiles I ran using the small dataset from the tests:

Version 0.2.8

[memory profile plot for 0.2.8]

Version 0.3.0

[memory profile plot for 0.3.0]

It looks like there are two things to investigate:

  1. In 0.2.8, detect.main() didn't increase memory usage at all. In 0.3.0, memory increases a little during detect.main().
  2. In 0.2.8, classify.main() seems to release some memory, but in 0.3.0 this doesn't happen.

Note that I haven't done multiple runs here, but this gives me a starting place for further investigations.
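
For anyone wanting to reproduce this kind of trace, here is a minimal sketch using the memory_profiler package (an assumption; the plots above may have been produced with a different tool). dummy_workload is a hypothetical stand-in for the cellfinder-core call of interest, e.g. detect.main() on a small test dataset:

import numpy as np
from memory_profiler import memory_usage

def dummy_workload():
    # Hypothetical stand-in for the real workload: allocate and process some planes.
    planes = [np.zeros((2048, 2048), dtype=np.uint16) for _ in range(50)]
    return sum(int(p.sum()) for p in planes)

# Sample resident memory every 0.1 s while the workload runs.
usage = memory_usage((dummy_workload, (), {}), interval=0.1)
print(f"peak memory: {max(usage):.0f} MiB")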

@dstansby
Member

I think I've found the issue - in cellfinder-core 0.2.8 only n_processes planes are read into memory at one time. In 0.3.0 every plane is read into memory before being submitted to the processing queue 😬 . Sorry about this - I'll try and implement a fix. I'm also going to transfer this issue to the cellfinder-core repo (if I can).
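
To illustrate the difference in strategy (a rough sketch only, not the actual cellfinder-core code; read_plane and process_plane are hypothetical stand-ins for the real I/O and tile-processing steps):

from multiprocessing import Pool
import numpy as np

def read_plane(index, shape=(1024, 1024)):
    # Stand-in for reading one 2D plane from disk.
    return np.full(shape, index, dtype=np.uint16)

def process_plane(plane):
    # Stand-in for the per-plane tile/threshold work.
    return int(plane.max())

def detect_eager(n_planes, n_processes):
    # 0.3.0-style behaviour: every plane is read before any work is queued,
    # so peak memory grows with the total number of planes.
    planes = [read_plane(i) for i in range(n_planes)]
    with Pool(n_processes) as pool:
        return pool.map(process_plane, planes)

def detect_bounded(n_planes, n_processes):
    # 0.2.8-style behaviour: only n_processes planes are read and in flight
    # at any one time, so peak memory stays roughly constant.
    results = []
    with Pool(n_processes) as pool:
        for start in range(0, n_planes, n_processes):
            batch = [read_plane(i) for i in range(start, min(start + n_processes, n_planes))]
            results.extend(pool.map(process_plane, batch))
    return results

if __name__ == "__main__":
    assert detect_eager(8, 2) == detect_bounded(8, 2)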

@dstansby dstansby transferred this issue from brainglobe/cellfinder Jun 10, 2022
willGraham01 pushed a commit that referenced this issue Aug 24, 2023
move TODOs to issues #52, #53