Rapidtide failing due to memory issue #10

tsalo · 2024-09-03T20:22:09Z

What happened?

It's probably something I can fix with my fmripost-rapidtide call or my Docker settings

What command did you use?

docker run -ti --rm \
    -v /Users/taylor/Documents/nipreps/fmripost-rapidtide/src/fmripost_rapidtide:/opt/conda/envs/fmripost_rapidtide/lib/python3.11/site-packages/fmripost_rapidtide \
    -v /Users/taylor/Documents/datasets/ds005115/derivatives/code:/code \
    -v /Users/taylor/Documents/datasets/ds005115/derivatives:/data \
    nipreps/fmripost-rapidtide:main \
    /data/fmriprep-mni6-func \
    /data/fmripost-rapidtide-mni6-func-noraw \
    participant \
    --participant_label 01 \
    --n_cpus 1 \
    --skip-bids-validation \
    --notrack \
    --work /data/work-mni6-func-noraw \
    --write-graph \
    --stop-on-first-crash

What version of fMRIPost-Rapidtide are you running?

main

How are you running fMRIPost-Rapidtide?

Docker

Is your data BIDS valid?

Yes

Are you reusing any previously computed results?

Anatomical derivatives

Please copy and paste any relevant log output.

Traceback (most recent call last):
	  File "/opt/conda/envs/fmripost_rapidtide/bin/rapidtide", line 23, in <module>
	    rapidtide_workflow.rapidtide_main(rapidtide_parser.process_args(inputargs=None))
	  File "/opt/conda/envs/fmripost_rapidtide/lib/python3.11/site-packages/rapidtide/workflows/rapidtide.py", line 444, in rapidtide_main
	    nim, nim_data, nim_hdr, thedims, thesizes = tide_io.readfromnifti(fmrifilename)
	                                                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
	  File "/opt/conda/envs/fmripost_rapidtide/lib/python3.11/site-packages/rapidtide/io.py", line 58, in readfromnifti
	    nim_data = nim.get_fdata()
	               ^^^^^^^^^^^^^^^
	  File "/opt/conda/envs/fmripost_rapidtide/lib/python3.11/site-packages/nibabel/dataobj_images.py", line 373, in get_fdata
	    data = np.asanyarray(self._dataobj, dtype=dtype)
	           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
	  File "/opt/conda/envs/fmripost_rapidtide/lib/python3.11/site-packages/nibabel/arrayproxy.py", line 457, in __array__
	    arr = self._get_scaled(dtype=dtype, slicer=())
	          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
	  File "/opt/conda/envs/fmripost_rapidtide/lib/python3.11/site-packages/nibabel/arrayproxy.py", line 426, in _get_scaled
	    scaled = scaled.astype(np.promote_types(scaled.dtype, dtype), copy=False)
	             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
	numpy.core._exceptions._ArrayMemoryError: Unable to allocate 689. MiB for an array with shape (91, 109, 91, 100) and data type float64

Additional information / screenshots

No response

tsalo · 2024-09-03T21:15:17Z

Setting --nprocs in the rapidtide call allowed it to get further along. I guess rapidtide was automatically trying to use all my computer's CPUs, but wasn't allowed to? Not sure. Here's the new error:

Traceback (most recent call last):
	  File "/opt/conda/envs/fmripost_rapidtide/bin/rapidtide", line 23, in <module>
	    rapidtide_workflow.rapidtide_main(rapidtide_parser.process_args(inputargs=None))
	  File "/opt/conda/envs/fmripost_rapidtide/lib/python3.11/site-packages/rapidtide/workflows/rapidtide.py", line 1596, in rapidtide_main
	    outcorrarray, dummy, dummy = tide_util.allocshared(internalcorrshape, rt_floatset)
	                                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
	  File "/opt/conda/envs/fmripost_rapidtide/lib/python3.11/site-packages/rapidtide/util.py", line 1026, in allocshared
	    outarray_shared = mp.RawArray("d", thesize)
	                      ^^^^^^^^^^^^^^^^^^^^^^^^^
	  File "/opt/conda/envs/fmripost_rapidtide/lib/python3.11/multiprocessing/context.py", line 130, in RawArray
	    return RawArray(typecode_or_type, size_or_initializer)
	           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
	  File "/opt/conda/envs/fmripost_rapidtide/lib/python3.11/multiprocessing/sharedctypes.py", line 61, in RawArray
	    obj = _new_value(type_)
	          ^^^^^^^^^^^^^^^^^
	  File "/opt/conda/envs/fmripost_rapidtide/lib/python3.11/multiprocessing/sharedctypes.py", line 41, in _new_value
	    wrapper = heap.BufferWrapper(size)
	              ^^^^^^^^^^^^^^^^^^^^^^^^
	  File "/opt/conda/envs/fmripost_rapidtide/lib/python3.11/multiprocessing/heap.py", line 331, in __init__
	    block = BufferWrapper._heap.malloc(size)
	            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
	  File "/opt/conda/envs/fmripost_rapidtide/lib/python3.11/multiprocessing/heap.py", line 309, in malloc
	    (arena, start, stop) = self._malloc(size)
	                           ^^^^^^^^^^^^^^^^^^
	  File "/opt/conda/envs/fmripost_rapidtide/lib/python3.11/multiprocessing/heap.py", line 192, in _malloc
	    return self._new_arena(size)
	           ^^^^^^^^^^^^^^^^^^^^^
	  File "/opt/conda/envs/fmripost_rapidtide/lib/python3.11/multiprocessing/heap.py", line 166, in _new_arena
	    arena = Arena(length)
	            ^^^^^^^^^^^^^
	  File "/opt/conda/envs/fmripost_rapidtide/lib/python3.11/multiprocessing/heap.py", line 89, in __init__
	    self.buffer = mmap.mmap(self.fd, self.size)
	                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
	OSError: [Errno 12] Cannot allocate memory

tsalo · 2024-09-03T21:25:04Z

@bbfrederick have you profiled rapidtide based on the input file size in the past? E.g., rapidtide requires X times the size of the input file's array?

tsalo · 2024-09-04T14:50:32Z

I had an old rapidtide run on my laptop actually. The memusage's Self Max RSS column peaks at 8.069 GB, and my input file's data array is ~1.5 GB, so a scalar of ~5.5x might be roughly what rapidtide needs. @bbfrederick does that sound right?

bbfrederick · 2024-09-04T17:44:34Z

That's probably about right. I don't know how strongly the scalar depends on the various options, but that's not a bad place to start. ~Are you running this in a container, on bare metal on your local machine, or on a cluster?~ (nvm - I see you are running it in Docker) --nprocs sets the number of cpus to "all the cpus on the machine" if you give it an argument of -1, it will use all the cpus on the node, so on clusters, and Docker containers, where you aren't able to use all of them, you need to set it explicitly. Because of the way python forks processes, that increases your memory usage some (although the majority of allocations are done in shared memory, so the memory load is not linear with the number of cpus). BTW, Github is seeming kind of broken to me - very long load times, can’t save comments on the issues page.

…

On Sep 4, 2024, at 10:50 AM, Taylor Salo ***@***.***> wrote: I had an old rapidtide run on my laptop actually. The memusage's Self Max RSS column peaks at 8.069 GB, and my input file's data array is ~1.5 GB, so a scalar of ~5.5x might be roughly what rapidtide needs. @bbfrederick <https://github.com/bbfrederick> does that sound right? — Reply to this email directly, view it on GitHub <#10 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AAH5DU5XULZ4TUM6RSX2GDLZU4M45AVCNFSM6AAAAABNS3WY4CVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDGMRZGI3TSMZWHA>. You are receiving this because you were mentioned.

tsalo · 2024-09-04T18:56:06Z

I'm testing it out on my local machine (an old MacBook Pro) with Docker. I have Docker limited to 6 GB of memory because anything more would probably make my machine unusable while it's running.

I have nprocs set to 1. Are there any other tricks I should employ to save memory?

bbfrederick · 2024-09-05T13:23:22Z

You can try --spcalculation - that does all internal math in single precision to save RAM.

Also, call rapidtide with --memprofile - this will help figure out where the problem is.

bbfrederick · 2024-09-05T14:44:02Z

One thing that has always annoyed me about RAM use - macOS and linux python implementations seem to handle internal memory allocation very differently - if you run rapidtide on macOS directly RAM use goes up AND DOWN over time. But in any linux environment, including Docker containers running on macOS, RAM use only goes up, and doesn't ever seem to reuse deallocated RAM. By the end of a run on a linux system, rapidtide can have something like 50GB of VMEM when processing an HCP-YA resting state dataset.

BTW, rapidtide shouldn't need all its RAM to be resident at once. Does maximizing Docker's swap space help?

tsalo · 2024-09-05T21:18:33Z

--spcalculation helped a ton. It got all the way to the GLM filtering step before running out of memory this time.

I tried using --memprofile, but it failed early on:

Stdout:
	memprofiler does not exist
	pfftw does not exist
	no aggressive optimization
	will not use numba even if present
	garbage collection is on
Stderr:
	starting rapidtide UNKNOWN
	using 1 MKL threads
	Traceback (most recent call last):
	  File "/opt/conda/envs/fmripost_rapidtide/bin/rapidtide", line 23, in <module>
	    rapidtide_workflow.rapidtide_main(rapidtide_parser.process_args(inputargs=None))
	  File "/opt/conda/envs/fmripost_rapidtide/lib/python3.11/site-packages/rapidtide/workflows/rapidtide.py", line 394, in rapidtide_main
	    tide_util.logmem("before reading in fmri data")
	  File "/opt/conda/envs/fmripost_rapidtide/lib/python3.11/site-packages/rapidtide/util.py", line 193, in logmem
	    outvals.append(formatmemamt(rcusage.ru_maxrss - lastmaxrss_parent))
	                                                    ^^^^^^^^^^^^^^^^^
	NameError: name 'lastmaxrss_parent' is not defined

I know it's a problem that memprofiler isn't installed in the Docker image, but the error seems unrelated to that.

I'll try increasing the swap space next time.

bbfrederick · 2024-09-06T00:03:14Z

That last failure seems to be because memprofiler wasn't installed but it just barreled along and tried to use it anyway. I'll check that and fix it.

If --spcalculation got you that far, then we can do it in two parts. Set the --noglm option, and it will hopefully complete without error (but not do the filtering). Then you can run retroglm, which takes a completed rapidtide analysis and runs the filter - it gives identical results to running it in one go.

tsalo · 2024-09-06T12:56:23Z

That's a great idea! I'll try that out.

EDIT: I'll need an interface for retroglm anyway, since we'll eventually want to loop over output spaces and denoise in each.

tsalo · 2024-09-06T13:34:23Z

Is retroglm what we will use to apply the regressor to other spaces (e.g., CIFTIs)?

bbfrederick · 2024-09-06T15:00:18Z

I hadn't thought about it, but yes, that could certainly work. You'd just need to resample the delay map and the mask to the target space.

tsalo · 2024-09-06T20:30:38Z

The more I think about it, the more I'm convinced that we should run rapidtide in boldref space and then warp the outputs to other spaces, both because it's more memory-efficient and because it's easier to warp from boldref to other output spaces using fMRIPrep transforms. I'm trying that out locally and it seems to be working.

bbfrederick · 2024-09-09T14:06:01Z

Is there a gray/white/CSF segmentation in boldref space by default? If not, we'd need to generate one.

tsalo · 2024-09-09T14:13:10Z

I can just warp the T1w/T2w-space one into boldref space.

tsalo added the bug Something isn't working label Sep 3, 2024

tsalo mentioned this issue Sep 6, 2024

Combining rapidtide regressors with other confounds #16

Open

tsalo linked a pull request Sep 20, 2024 that will close this issue

Split regressor estimation and denoising steps #17

Draft

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Rapidtide failing due to memory issue #10

Rapidtide failing due to memory issue #10

tsalo commented Sep 3, 2024

tsalo commented Sep 3, 2024

tsalo commented Sep 3, 2024

tsalo commented Sep 4, 2024

bbfrederick commented Sep 4, 2024 via email •

edited

Loading

tsalo commented Sep 4, 2024

bbfrederick commented Sep 5, 2024 •

edited

Loading

bbfrederick commented Sep 5, 2024

tsalo commented Sep 5, 2024

bbfrederick commented Sep 6, 2024 •

edited

Loading

tsalo commented Sep 6, 2024 •

edited

Loading

tsalo commented Sep 6, 2024

bbfrederick commented Sep 6, 2024

tsalo commented Sep 6, 2024

bbfrederick commented Sep 9, 2024

tsalo commented Sep 9, 2024

Rapidtide failing due to memory issue #10

Rapidtide failing due to memory issue #10

Comments

tsalo commented Sep 3, 2024

What happened?

What command did you use?

What version of fMRIPost-Rapidtide are you running?

How are you running fMRIPost-Rapidtide?

Is your data BIDS valid?

Are you reusing any previously computed results?

Please copy and paste any relevant log output.

Additional information / screenshots

tsalo commented Sep 3, 2024

tsalo commented Sep 3, 2024

tsalo commented Sep 4, 2024

bbfrederick commented Sep 4, 2024 via email • edited Loading

tsalo commented Sep 4, 2024

bbfrederick commented Sep 5, 2024 • edited Loading

bbfrederick commented Sep 5, 2024

tsalo commented Sep 5, 2024

bbfrederick commented Sep 6, 2024 • edited Loading

tsalo commented Sep 6, 2024 • edited Loading

tsalo commented Sep 6, 2024

bbfrederick commented Sep 6, 2024

tsalo commented Sep 6, 2024

bbfrederick commented Sep 9, 2024

tsalo commented Sep 9, 2024

bbfrederick commented Sep 4, 2024 via email •

edited

Loading

bbfrederick commented Sep 5, 2024 •

edited

Loading

bbfrederick commented Sep 6, 2024 •

edited

Loading

tsalo commented Sep 6, 2024 •

edited

Loading