Reduce memory usage for metric calculations #269

Closed
3 tasks
jbteves opened this issue Apr 22, 2019 · 14 comments
Labels
enhancement (issues describing possible enhancements to the project), help wanted (issues where we're particularly excited for contributions)

Comments

@jbteves
Collaborator

jbteves commented Apr 22, 2019

Summary

RAM usage appears very high during metric calculations. In order to reduce memory usage, we should look for places where we can write images to disk and free the memory they were using.

Additional Detail

There are actually two places where memory usage is high, per a quick test by @dowdlelt:

  1. PCA
  2. Metric Calculation.

However, as @tsalo has pointed out on Gitter, PCA is implemented in sklearn, so its memory usage may be beyond our control. Since we have more control over the metric calculation, we should try to curb its memory usage wherever possible, especially because, as of issue #254, there is at least one user who cannot run tedana to completion because its memory usage is so large (see the figure linked there).

Next Steps

  • Identify where files can be written to disk and reloaded later (see the sketch after this list)
  • Implement these changes in code
  • Determine new memory requirements per these changes
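
As a starting point for the first bullet, here is a minimal sketch of the write-to-disk-and-reload pattern, assuming nibabel-based I/O; the array, affine, and filename are hypothetical, not tedana's actual variables:

    import numpy as np
    import nibabel as nib

    # Hypothetical intermediate result (e.g., an echo-wise metric map).
    data = np.random.rand(64, 64, 32, 100).astype(np.float32)
    affine = np.eye(4)

    # Write the intermediate image to disk, then drop the in-memory copy.
    nib.save(nib.Nifti1Image(data, affine), "intermediate.nii.gz")
    del data  # the buffer can now be garbage-collected

    # ... later, reload it only when it is actually needed ...
    data = nib.load("intermediate.nii.gz").get_fdata(dtype=np.float32)
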
@jbteves added the enhancement and help wanted labels on Apr 22, 2019
@tsalo
Member

tsalo commented Jun 18, 2019

We've had another user report a memory problem (see here), so I'd like to move this issue up in terms of priority. I've asked him to give some information about his dataset so that we can simulate some data that are large enough to replicate the issue. Once I have that, I'll dig into this.

@dowdlelt
Collaborator

I know tedana has masking and unmasking features - is it operating on a vector of values within the mask, or is it still holding all the ~zero values outside of the mask/brain in memory? In the original MEICA, the data were actually cropped around the edges to remove anything outside of the brain. It was annoying (output dims differed from input), but effective in reducing memory usage.

I know that the clustering steps require that the brain is organized as 3D, but perhaps the heaviest-lifting steps could be done on voxels × time data, if they are not already. I think the tradeoff of internally flipping the data back and forth would be worth it for the lower memory usage.
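
For reference, a minimal sketch of that voxels × time round trip (the array names and shapes are hypothetical):

    import numpy as np

    data_4d = np.random.rand(64, 64, 32, 200)   # X x Y x Z x time
    mask = data_4d[..., 0] > 0.5                # boolean brain mask, X x Y x Z

    # Operate on voxels x time; only in-mask voxels are carried around.
    data_2d = data_4d[mask]                     # shape (n_voxels, n_time)

    # Flip back to 4D for steps (e.g., clustering) that need spatial structure.
    data_4d_again = np.zeros(data_4d.shape, dtype=data_4d.dtype)
    data_4d_again[mask] = data_2d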

We could always reproduce the original MEICA implementation: cut away the slabs of data that lie outside of the brain mask, then add zero-filled slabs back at the end so that the data have the same dimensions again.
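
And a rough sketch of that MEICA-style crop-and-refill idea (hypothetical names; assumes a boolean mask):

    import numpy as np

    data = np.random.rand(64, 64, 32, 200)      # hypothetical 4D dataset
    mask = np.zeros((64, 64, 32), dtype=bool)
    mask[10:50, 12:48, 5:28] = True             # hypothetical brain mask

    # Bounding box of the mask: drop the empty slabs on each axis.
    xs = np.flatnonzero(mask.any(axis=(1, 2)))
    ys = np.flatnonzero(mask.any(axis=(0, 2)))
    zs = np.flatnonzero(mask.any(axis=(0, 1)))
    box = (slice(xs[0], xs[-1] + 1),
           slice(ys[0], ys[-1] + 1),
           slice(zs[0], zs[-1] + 1))
    cropped = data[box]

    # ... run the memory-heavy steps on the smaller array ...

    # Pad zero slabs back so the output matches the input dimensions.
    restored = np.zeros_like(data)
    restored[box] = cropped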

@tsalo
Member

tsalo commented Jun 19, 2019

I believe that it operates on the masked data already. The mask is then used to unmask the data before writing it out, so that shouldn't be a problem here (and we shouldn't have to crop the unmasked images).

@jbteves
Collaborator Author

jbteves commented Jun 19, 2019

From here it looks like it's just a simple 2D array, and it should be masked from here in the optimal combination step. So that's as memory-efficient as we can get, unless Python is passing copies; but I don't think NumPy arrays work that way.
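
(For anyone curious, a toy check that NumPy passes references rather than copies:)

    import numpy as np

    def scale_in_place(arr):
        arr *= 2   # mutates the caller's array: only a reference was passed

    a = np.ones(3)
    b = a                          # second reference to the same buffer
    scale_in_place(a)
    print(a)                       # [2. 2. 2.]
    print(np.shares_memory(a, b))  # True: no hidden copy was made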

@tsalo
Member

tsalo commented Jun 19, 2019

True, although I guess we could split arrays up for voxel-wise metric calculation, or maybe do a better job of deleting unused variables. Plus, we should make sure that int and bool arrays are actually stored as such (see the sketch below). I'm not sure what else there is to do.
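
For instance, a quick sketch of that dtype check (the arrays here are made up):

    import numpy as np

    mask = np.array([1.0, 0.0, 1.0])     # float64 mask: 8 bytes per element
    mask = mask.astype(bool)             # 1 byte per element

    counts = np.arange(300)              # default int64 on most platforms
    counts = counts.astype(np.int16)     # 2 bytes per element is plenty here

    print(mask.dtype, counts.dtype)      # bool int16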

@jbteves
Collaborator Author

jbteves commented Jun 19, 2019

I just quadruple-checked, and it looks like NumPy arrays are indeed passed by reference, so that's not introducing any overhead; pointers are cheap. So the only options remaining are basically:

  1. See if the typing is correct; make it a smaller type if not
  2. Try to free memory as quickly as possible
  3. See if we can architect away some of the more expensive calculations (very difficult)
  4. Rewrite intensive parts in C/C++ (extremely difficult)

I'll see if I can figure out some ways to free up the memory, but since I'm not as familiar with the code, I'm not sure how well that will go.

One thing worth noting is that we should try to keep copying to a minimum. I bet it takes a lot of time for the interpreter to find a nice big block of contiguous memory to copy into. Working in-place where possible will save both memory and time (a sketch follows below). I'm not sure that we can improve on what we have already, but it's good to keep in mind for future development.

@tsalo do you know if the user had the latest version, where the automask is used by default? That saves a lot of memory.
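
To illustrate option 2 and the copy-minimization note, a minimal sketch (not tedana code; the arrays are made up):

    import numpy as np

    data = np.random.rand(1000, 1000)

    # In-place arithmetic reuses the existing buffer instead of allocating
    # a new contiguous block for the result.
    data -= data.mean(axis=0)          # rather than data = data - data.mean(axis=0)
    np.multiply(data, 2.0, out=data)   # explicit out= avoids a temporary too

    # Drop references to intermediates as soon as possible so the allocator
    # can reclaim their memory.
    gram = data @ data.T
    diag = gram.diagonal().copy()      # copy out the small piece we need
    del gram                           # the big intermediate can now be freed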

@emdupre
Member

emdupre commented Jun 19, 2019

Let's not think about C/C++ quite yet, if possible 😸 Although Cython is lovely, it'll increase the barrier to entry pretty dramatically.

I wonder if we could focus on the masking first. I think passing arrays around will help quite a bit, particularly if we're zeroing everything outside the brain. This will interact with the integration of post-tedana AROMA, but I think we can worry about that after memory usage is down.

If PCA is a problem, we can think about alternative decompositions. The simplest first step might just be setting copy=False in the PCA call.
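
For reference, a minimal sketch of that change; copy=False is a real sklearn parameter, but the shapes and component count here are made up:

    import numpy as np
    from sklearn.decomposition import PCA

    X = np.random.rand(5000, 300).astype(np.float32)  # e.g., voxels x time

    # copy=False lets fit work on X directly instead of duplicating it first.
    pca = PCA(n_components=50, copy=False)
    comps = pca.fit_transform(X)
    # Caveat: X may be overwritten by fit, so it should not be reused as-is.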

@rmarkello
Member

I agree with @emdupre that setting copy=False should help somewhat and that it's definitely worth trying to find areas of the code where copies, rather than views, of the imaging data are accidentally being made before diving into anything too crazy.

It's also worth noting that the imaging files in question from the Neurostars issue @tsalo linked to, when loaded into memory, are going to be ~4.25 GB each:

>>> import numpy as np
>>> np.zeros((92, 92, 50, 1350)).nbytes / (1024 ** 3)
4.2566657066345215

(Using np.zeros generates an accurate estimate of memory usage without actually consuming the memory, since the zeroed pages are never touched.)

That's already ~13 GB for all three time series loaded into memory in the catd array. If even just one copy of that array is made, you've basically hit your maximum on a 32 GB machine... These are pretty hefty data that are definitely pushing traditional use cases.

@jbteves
Collaborator Author

jbteves commented Jun 19, 2019

I guess that didn't translate well; the C/C++ suggestion was 99% joking. Even if we had a bunch of programmers lying around, communally maintaining C/C++ is pretty difficult, and testing it effectively is even harder.

Masking already works pretty well: either the user supplies a mask, or a substantial amount of data gets "ignored" by the automask, so it's quite efficient. The fact that the user's files are big points back to the related issue #267: users should understand that files this large will require beefy machines. We can't really compress the data further.

PCA is a pretty memory-expensive operation, if I recall the details correctly, but it's also a pretty widely used approach. I don't see a reason to ditch it without a very attractive alternative.

@emdupre
Member

emdupre commented Jun 19, 2019

Sorry for being unclear -- I don't think we need to ditch it just yet. The copy=False option prevents a copy from being created in memory to run the PCA on. I vote we try that first.

Definitely think the tie-in with #267 is important, too!

@jbteves
Collaborator Author

jbteves commented Jun 19, 2019

I didn't think you were suggesting dumping it right away; I'm just stating for the record that there are many compelling reasons to stick with PCA.

Trying copy=False sounds good. Documentation here for reference.

@jbteves
Collaborator Author

jbteves commented Jun 19, 2019

Note: another option would be to detect whether RAM usage will be high (or to catch memory errors) and fall back to memory-mapped arrays, which I think is the strategy that AFNI employs when it runs into similar problems.
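
A minimal sketch of that fallback (the shape is taken from the Neurostars example above; the temp-file handling is an assumption):

    import tempfile
    import numpy as np

    shape, dtype = (92, 92, 50, 1350), np.float32

    try:
        data = np.empty(shape, dtype=dtype)   # keep it in RAM when possible
    except MemoryError:
        # Fall back to a disk-backed array with the same ndarray interface.
        tmp = tempfile.NamedTemporaryFile(suffix=".dat", delete=False)
        data = np.memmap(tmp.name, dtype=dtype, mode="w+", shape=shape)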

@eurunuela
Collaborator

eurunuela commented Sep 15, 2019

I've been pretty busy and haven't taken the time to properly read the code; I hope I can do that before the hackathon. May I ask whether the calculation of the metrics is done voxel-wise, or is it a matrix computation?

In the development of my own algorithms, I've had some memory issues with Python that I managed to solve by using memmaps. I believe this approach could be helpful here. Basically, the idea is to keep big matrices on disk and access them from there, rather than loading them into Python. This can have a big impact when doing voxel-wise analysis, given that one can access the data of a single voxel in that huge matrix with little to no impact on memory. The easiest way to work with this approach is to save the matrix with NumPy's save function and then load it with the mmap_mode option in load. The files that get stored are actually not very big, from what I have seen so far.
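
A minimal example of that save/load pattern (hypothetical sizes and filename):

    import numpy as np

    big = np.random.rand(100000, 1350).astype(np.float32)
    np.save("big_matrix.npy", big)
    del big                                  # nothing large left in RAM

    # Reopen as a memmap: data are read lazily from disk, page by page.
    big = np.load("big_matrix.npy", mmap_mode="r")
    one_voxel = np.asarray(big[42])          # only this row is actually read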

See the documentation here.

I hope it can be helpful!


@tsalo changed the title from "Reduce Memory Usage for Metric Calculations" to "Reduce memory usage for metric calculations" on Nov 18, 2019
@stale

stale bot commented Feb 16, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions to tedana :tada:!
