Reduce memory usage for metric calculations #269
Comments
We've had another user report a memory problem (see here), so I'd like to move this issue up in priority. I've asked him to give some information about his dataset so that we can simulate data that are large enough to replicate the issue. Once I have that, I'll dig into this.
I know tedana has masking and unmasking features - is it operating on a vector of values within the mask, or is it still holding all the ~zero values outside of the mask/brain in memory? In the original MEICA, the data were actually cropped around the edges to remove anything outside of the brain. It was annoying (output dims differed from input), but effective in reducing memory usage. I know that the clustering steps require that the brain is organized as 3D, but perhaps the heaviest-lifting steps could be done on Vox X Time data, if they are not already (see the sketch below). I think the tradeoff of internally flipping the data back and forth would be worth it for the lower memory usage. We could always reproduce the original MEICA implementation of cutting away slabs of data that are outside of the brain mask and then adding zero-filler slabs back at the end so that the data have the same dimensions again.
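For illustration, here is a minimal numpy sketch of the voxels x time layout being discussed. All shapes and variable names are made up; this is not tedana's actual masking code.

```python
import numpy as np

# Hypothetical 4D echo data (x, y, z, t) and a boolean brain mask (x, y, z).
data_4d = np.random.rand(64, 64, 32, 100)
mask = np.zeros((64, 64, 32), dtype=bool)
mask[16:48, 16:48, 8:24] = True  # pretend this box is the brain

# Heavy lifting happens on the much smaller voxels-x-time matrix.
data_2d = data_4d[mask]  # shape: (n_brain_voxels, n_timepoints)

# ... metric calculations on data_2d ...

# Only when a full volume is needed (e.g., clustering or writing to disk)
# do we scatter values back into the original field of view.
unmasked = np.zeros(data_4d.shape, dtype=data_2d.dtype)
unmasked[mask] = data_2d
```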
I believe that it operates on the masked data already. The mask is then used to unmask the data before writing it out, so that shouldn't be a problem here (and we shouldn't have to crop the unmasked images).
True, although I guess we could split arrays up for voxel-wise metric calculation (see the sketch below) or maybe do a better job of deleting unused variables. Plus, we should make sure that …
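As a hedged illustration of the "split arrays up" idea (not tedana code; the function name and the metric are hypothetical), a voxel-wise metric can be computed in chunks so that only one block of intermediates is in memory at a time:

```python
import numpy as np

def voxelwise_metric_in_chunks(data_2d, chunk_size=10000):
    """Toy example: compute a per-voxel temporal variance in chunks.

    `data_2d` is assumed to be voxels x time; only `chunk_size` rows of
    intermediates exist at any one time, which bounds peak memory.
    """
    n_voxels = data_2d.shape[0]
    out = np.empty(n_voxels, dtype=np.float64)
    for start in range(0, n_voxels, chunk_size):
        stop = min(start + chunk_size, n_voxels)
        out[start:stop] = data_2d[start:stop].var(axis=1)
    return out
```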
I just quadruple-checked, and it looks like numpy arrays are indeed passed by reference, so that's not introducing any overhead; pointers are cheap. So the only options remaining are basically:
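As an aside, the pass-by-reference point above is easy to verify (purely illustrative; this is not one of the options mentioned):

```python
import numpy as np

def identity(arr):
    # No copy is made when an ndarray is passed into a function;
    # the parameter is just another reference to the same buffer.
    return arr

x = np.zeros((1000, 1000))
print(np.shares_memory(x, identity(x)))  # True
```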
Let's not think about C/C++ quite yet, if possible 😸 Although cython is lovely, it'll increase the barrier to entry pretty dramatically. I wonder if we could focus on the masking first. I think passing arrays around will help quite a bit, particularly if we're zeroing everything outside the brain. This will interact with the integration of post-tedana AROMA, but I think we can worry about that after memory usage is down. If PCA is a problem, we can think about alternative decompositions (one sketch follows below). The simplest first step might just be setting …
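One concrete example of an "alternative decomposition" in that spirit (purely illustrative, not a proposal for the pipeline) is scikit-learn's IncrementalPCA, which fits in batches rather than holding the full working set at once. The shapes here are made up:

```python
import numpy as np
from sklearn.decomposition import IncrementalPCA

# Hypothetical voxels x time matrix; in practice this would be the masked data.
X = np.random.rand(50000, 300)

# Fits on batch_size rows at a time instead of the whole matrix at once.
ipca = IncrementalPCA(n_components=50, batch_size=5000)
components = ipca.fit_transform(X)
```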
I agree with @emdupre that setting that would be a simple first step. It's also worth noting that the imaging files in question from the Neurostars issue @tsalo linked to, when loaded into memory, are going to be ~4.25 GB each:

```python
>>> import numpy as np
>>> np.zeros((92, 92, 50, 1350)).nbytes / (1024 ** 3)
4.2566657066345215
```

(Using the default float64 dtype.) That's already ~13 GB for all three time series loaded into memory in the …
I guess that didn't translate well; the C/C++ was 99% joking. Even if we had a bunch of programmers lying around, communally maintaining C/C++ is pretty difficult, and even more difficult to test effectively. Masking already works pretty well; the user either gives a mask or a substantial amount of data gets "ignored" by the automask, so it's quite efficient. The fact that the user's files are big points back to the related issue #267, so that users understand that files that large will require beefy machines. We can't really compress the data further. PCA is a pretty memory-expensive operation if I recall the details correctly, but it's also a pretty widely used approach. I don't see a reason to ditch it without a very attractive alternative.
Sorry for being unclear -- I don't think we need to ditch it just yet. The … Definitely think the tie-in with #267 is important, too!
I didn't think you were suggesting dumping it right away; I'm just stating for the record that there are many compelling reasons to stick with PCA. Trying …
Note: another option would be to detect whether RAM usage will be high / catch memory errors and use memory-mapped arrays, which I think is the strategy that AFNI employs when it runs into similar problems.
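A rough sketch of what that fallback could look like (names are hypothetical, and this is not how AFNI actually implements it):

```python
import tempfile

import numpy as np

def allocate_workspace(shape, dtype=np.float64):
    """Try an in-memory array first; fall back to a disk-backed memmap."""
    try:
        # Note: on some platforms allocation is lazy (overcommit), so this
        # check is approximate.
        return np.empty(shape, dtype=dtype)
    except MemoryError:
        tmp = tempfile.NamedTemporaryFile(suffix=".dat", delete=False)
        return np.memmap(tmp.name, dtype=dtype, mode="w+", shape=shape)
```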
I've been pretty busy and haven't taken the time to properly read the code. I hope I can do that before the hackathon. May I ask whether the calculation of the metrics is done voxel-wise? Or is it a matrix computation?

In the development of my own algorithms I've had some memory issues with Python that I managed to solve by using memmaps. I believe this approach could be helpful. Basically, the idea is to save big matrices on disk rather than loading them into Python, and just access them as needed. This can have a big impact when doing voxel-wise analysis, given that one could just access the data of one voxel in that huge matrix, with little to no impact at all on memory.

The easiest way of working with this approach is to save the matrix with Numpy's … See the documentation here:

I hope it can be helpful!

Edit: Paragraphs.
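A minimal sketch of that access pattern with numpy's memory-mapping support (the file name and shapes are made up):

```python
import numpy as np

# Save a large voxels x time matrix to disk once (hypothetical file name).
big = np.random.rand(50000, 300)
np.save("optcom_data.npy", big)
del big  # free the in-memory copy

# Re-open it memory-mapped: nothing is read until it is indexed.
data = np.load("optcom_data.npy", mmap_mode="r")

# Voxel-wise access pulls only the touched rows into RAM.
voxel_ts = data[12345, :]
```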
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions to tedana :tada:!
Summary
RAM usage appears very high during metric calculations. In order to reduce memory usage, we should look for places where we can write images to disk and free the memory they were using.
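As a hedged illustration of the write-to-disk-and-free pattern (not current tedana code; the file name and shapes are hypothetical):

```python
import gc

import nibabel as nib
import numpy as np

# Hypothetical intermediate volume and affine.
intermediate = np.random.rand(64, 64, 32, 10).astype(np.float32)
affine = np.eye(4)

# Write the intermediate out, then drop the in-memory copy before the
# next memory-hungry step.
nib.save(nib.Nifti1Image(intermediate, affine), "intermediate.nii.gz")
del intermediate
gc.collect()

# If it is needed again later, nibabel loads lazily via an array proxy;
# data only come back into RAM when get_fdata() or dataobj slicing is used.
img = nib.load("intermediate.nii.gz")
subset = img.dataobj[..., 0]  # reads just one volume
```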
Additional Detail
There are actually two places where memory usage is high per a quick test by @dowdlelt:
However, as @tsalo has pointed out in Gitter, PCA is implemented in sklearn, so we may not be able to bring its memory usage down much. Since we have more control over the metric calculation, we should attempt to curb its memory usage where possible, especially since, as of issue #254, there is at least one user who cannot run tedana to completion because its memory usage is so large. (See figure here.)

Next Steps