ENH: Serialize+parallelize 4D `apply()` into 3D+t and add 'low memory' loading #215

oesteban · 2024-07-23T11:00:31Z

Re-implements serialization (i.e., splitting a 4D transformation into 3D transformations). For time-dependent transforms (e.g., when also slice-time correcting), parallelization will require a different partition (spatial and/or temporal windowing).

3D transform chains that result from composing several transformations (e.g., affine and deformation fields in spatial normalization) should not be split into their components. This is in contrast to lists of 3D transforms, such as head-motion correcting affines, where each transform applies to one timepoint. These lists should be considered 4D, and in the future they may also integrate slice-timing correction.

oesteban · 2024-07-31T07:21:35Z

~~Dropping parallelization for a future PR.~~ Added through #220. This is ready for review.

codecov · 2024-07-31T09:04:04Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 94.76%. Comparing base (7c9eaed) to head (063e1f0).
Report is 45 commits behind head on master.

Additional details and impacted files

@@            Coverage Diff             @@
##           master     #215      +/-   ##
==========================================
+ Coverage   94.39%   94.76%   +0.36%     
==========================================
  Files          15       15              
  Lines        1713     1756      +43     
  Branches      323      328       +5     
==========================================
+ Hits         1617     1664      +47     
+ Misses         79       76       -3     
+ Partials       17       16       -1

Flag	Coverage Δ
unittests	`94.76% <100.00%> (+0.36%)`	⬆️

oesteban · 2024-08-05T13:49:38Z

@effigies I'm not planning on adding any more features within this PR, it is safe to start reviewing :)

effigies

Overall the code looks like it should work. I think we're pushing a surprising amount of process control logic into this library, and would suggest seeing if it can be avoided. In particular, I don't think we should choose the number of threads, and I would try not to use multiprocessing.

The tests, I'm assuming, are mostly copied from their original locations? It's a lot to look over in detail. If there are new bits among large blocks of old, could you highlight them for review?

effigies · 2024-08-05T15:04:50Z

nitransforms/resampling.py

+    cval: float = 0.0,
+    prefilter: bool = True,
+    output_dtype: np.dtype = None,
+    serialize_nvols: int = SERIALIZE_VOLUME_WINDOW_WIDTH,


This parameter is missing from the docstring, and the whole concept of serialization here is non-obvious. To the best of my understanding, the idea is that you want to parallelize over volumes, but only if you hit a threshold that justifies the overhead of multiprocessing.

Have you compared the performance of multiprocessing versus async workers? If you're able to async or thread, you will have much less overhead and might not need this additional concept.

Overall, that is a great interpretation.

One thing is not so explicit: if you want real 4D interpolation (e.g., combined head-motion and slice-timing correction), you need to set `serialize_nvols = np.inf`, because the data can't be split by volume. (We've talked about doing this in a moving window, in which case this parameter would map onto the width of that window.)

Happy to document this better.

I have added this parameter's description to the docstring, as it was missing. Let me know what you think.
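
A minimal sketch of the threshold idea discussed in this thread, assuming serialization is switched on once the number of volumes reaches `serialize_nvols`. The helper name `should_serialize` is hypothetical and not part of nitransforms; `math.inf` stands in for `np.inf` so the snippet has no dependencies.

```python
import math


def should_serialize(n_volumes: int, serialize_nvols: float) -> bool:
    """Hypothetical helper: decide whether to resample volume by volume.

    Assumption: serialization pays off only when the number of volumes
    reaches the ``serialize_nvols`` threshold.
    """
    return n_volumes >= serialize_nvols


# A long BOLD run exceeds the (illustrative) threshold of 8 volumes
long_run = should_serialize(200, 8)
# A short series stays below it and is resampled in one go
short_run = should_serialize(4, 8)
# An infinite threshold disables serialization (true 4D interpolation)
true_4d = should_serialize(200, math.inf)
```

With an infinite threshold the comparison is never satisfied, which matches the suggestion above for transforms that cannot be split by volume.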

oesteban · 2024-08-05T15:24:27Z

> are mostly copied from their original locations?

Yes, and added parameters to exercise new branches.

> It's a lot to look over in detail. If there are new bits among large blocks of old, could you highlight them for review?

Indeed, thanks for taking the time. I'll first see how to remove the process pool and then make a final pass to identify areas that may require focused attention.

Thanks a lot :)

Co-authored-by: Chris Markiewicz <[email protected]>

oesteban · 2024-08-07T07:46:08Z

nitransforms/resampling.py

+    # Number of data volumes
+    data_nvols = 1 if spatialimage.ndim < 4 else spatialimage.shape[-1]
+    # Number of transforms: transforms chains (e.g., affine + field, are a single transform)
+    xfm_nvols = 1 if transform.ndim < 4 else len(transform)


This is new in this PR. Julien introduced the concept of `transform.ndim` to differentiate 3D transforms that require four parameters (e.g., displacement fields are typically stored as 4D NIfTIs, but are still 3D transforms) from 4D transformations (e.g., head-motion correction affines).

Following the logic of the previous version, the output is a 4D array if:

- The input image is 4D and the transform is 3D (e.g., a BOLD series and its coregistration to a T1w image); the same transform is applied to every volume of the series.
- The input image is 3D and the transform is 4D (e.g., mapping a fieldmap onto all volumes of a BOLD series); the same moving image is resampled through each transform in the series.
- The input image is 4D and the transform is 4D (e.g., head-motion correction); each transform applies to one volume.
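
The three cases above can be sketched as a small counting function. The first two assignments mirror the quoted diff; the function name `count_resamplings`, its arguments, and the mismatch check are assumptions made for illustration and are not taken from the library.

```python
def count_resamplings(data_shape: tuple, xfm_ndim: int, xfm_len: int = 1) -> int:
    """Hypothetical helper: number of 3D resamplings needed for one apply() call."""
    # Number of data volumes (3D images count as a single volume)
    data_nvols = 1 if len(data_shape) < 4 else data_shape[-1]
    # Number of transforms (a 3D chain counts as a single transform)
    xfm_nvols = 1 if xfm_ndim < 4 else xfm_len
    # Assumption: when both are series, their lengths must match
    if data_nvols != xfm_nvols and min(data_nvols, xfm_nvols) > 1:
        raise ValueError("Number of volumes and number of transforms differ.")
    return max(data_nvols, xfm_nvols)


# 4D image, 3D transform (e.g., BOLD series to T1w)
case_a = count_resamplings((64, 64, 32, 100), xfm_ndim=3)
# 3D image, 4D transform (e.g., fieldmap onto every BOLD volume)
case_b = count_resamplings((64, 64, 32), xfm_ndim=4, xfm_len=100)
# 4D image, 4D transform (e.g., head-motion correction)
case_c = count_resamplings((64, 64, 32, 100), xfm_ndim=4, xfm_len=100)
```

Any result greater than one means the output has a fourth dimension of that length.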

All these 4D transforms can be called "3D+t" if the coordinates in time are not moving (e.g., no slice-timing correction). If the transform is 3D+t, then we can serialize (i.e., apply the transforms one by one) and use an embarrassingly parallel approach for concurrency.
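
As an illustration of the serialized, embarrassingly parallel approach, the sketch below splits a 4D array along its last axis and processes each volume concurrently with `asyncio` and worker threads. It is not the code in this PR: `shift_volume` is a hypothetical stand-in for the per-volume resampling step, and `apply_serialized` and `max_concurrent` are illustrative names.

```python
import asyncio

import numpy as np


def shift_volume(volume: np.ndarray, offset: int) -> np.ndarray:
    """Hypothetical stand-in for resampling one 3D volume."""
    return np.roll(volume, offset, axis=0)


async def apply_serialized(
    data: np.ndarray, offsets: list, max_concurrent: int = 4
) -> np.ndarray:
    """Apply one 'transform' per volume, at most ``max_concurrent`` at a time."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def worker(t: int) -> np.ndarray:
        async with semaphore:
            # Run the blocking per-volume work in a thread
            return await asyncio.to_thread(shift_volume, data[..., t], offsets[t])

    # gather() returns results in submission order, so timepoints stay aligned
    volumes = await asyncio.gather(*(worker(t) for t in range(data.shape[-1])))
    return np.stack(volumes, axis=-1)


data = np.arange(2 * 2 * 2 * 3, dtype=float).reshape(2, 2, 2, 3)
resampled = asyncio.run(apply_serialized(data, offsets=[0, 1, 0]))
```

Because each volume depends only on its own transform, no coordination between workers is needed beyond stacking the results at the end. This would not hold for transforms that move coordinates in time.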

Julien also implemented 4D transforms. This was new in his PR and would require more thorough testing (especially of coordinates that move in time). This code is executed if `serialize_4d` is not `True`.

oesteban · 2024-08-10T07:49:12Z

Seems to be working and the issues have been addressed - merging. Happy to get back to this in future post-merge reviews.

oesteban force-pushed the enh/reenable-parallelization-apply-214 branch from 0b2302f to b922fa5 Compare July 23, 2024 11:01

oesteban and others added 5 commits July 23, 2024 13:01

wip: initiate implementation

b922fa5

enh: draft implementation of serialize 4d

6064b8c

fix: passes more tests, more suggestions in progress

e47a476

fix: pass tests

1616a35

fix: pass tests, serialization implemented

6292daf

jmarabotto mentioned this pull request Jul 25, 2024

ENH: reenable-parallelization-apply-214 (builds on PR #215, solves Issue #214) #217

Merged

oesteban force-pushed the enh/reenable-parallelization-apply-214 branch from b922fa5 to 86b3d11 Compare July 29, 2024 16:14

oesteban added 8 commits July 29, 2024 18:14

wip: initiate implementation

86b3d11

Merge remote-tracking branch 'jmarabotto/patch/oesteban-pr' into enh/…

23daabb

…reenable-parallelization-apply-214

enh: integrating @jmarabotto's code

79e5cad

fix: ensure output dtype when resampling

e0bde09

fix: resolve some failing tests

fbb0451

fix: ensure `__len__` is defined for all transforms

0153472

maint: reorganize tests around the spun-off apply

06a1c01

oesteban changed the title ~~ENH: Serialize 4D apply() into 3D+t including parallelization~~ ENH: Serialize 4D apply() into 3D+t Jul 31, 2024

oesteban force-pushed the enh/reenable-parallelization-apply-214 branch from de81e23 to 8dd883d Compare July 31, 2024 07:19

sty: format changed files

8dd883d

oesteban requested a review from effigies July 31, 2024 07:21

Merge remote-tracking branch 'upstream/master' into enh/reenable-para…

9f91e2f

…llelization-apply-214

oesteban added 4 commits August 1, 2024 08:49

enh: expand test coverage

4c06174

enh: prepare code for easy parallelization with a process pool executor

754785f

Resolves: #214.

enh: create process pool

38bb388

Merge pull request #220 from nipy/enh/reenable-parallelization-apply-…

8ba34c9

…214-parallel ENH: Parallelize serialized 3D+t transforms

oesteban force-pushed the enh/reenable-parallelization-apply-214 branch from ed1ffc8 to 026a10a Compare August 1, 2024 08:08

oesteban added 2 commits August 1, 2024 10:08

enh: expand test coverage

026a10a

sty: add type annotations

7dcc78d

oesteban force-pushed the enh/reenable-parallelization-apply-214 branch from a659355 to 7dcc78d Compare August 2, 2024 06:53

oesteban and others added 2 commits August 2, 2024 09:49

enh: implement a memory limitation mechanism in loading data

79305a9

Resolves: #218. Co-authored-by: Chris Markiewicz <[email protected]>

Merge pull request #221 from nipy/enh/reenable-parallelization-apply-…

7c7608f

…214-dtypes ENH: Implement a memory limitation mechanism in loading data

oesteban changed the title ~~ENH: Serialize 4D apply() into 3D+t~~ ENH: Serialize+parallelize 4D apply() into 3D+t and add 'low memory' loading Aug 5, 2024

effigies reviewed Aug 5, 2024

oesteban force-pushed the enh/reenable-parallelization-apply-214 branch from 436ae1d to b42b172 Compare August 6, 2024 07:02

enh: port from process pool into asyncio concurrent

063e1f0

Co-authored-by: Chris Markiewicz <[email protected]>

oesteban force-pushed the enh/reenable-parallelization-apply-214 branch from b42b172 to 063e1f0 Compare August 6, 2024 07:42

oesteban commented Aug 7, 2024

oesteban merged commit b141a8e into master Aug 10, 2024
14 checks passed

oesteban deleted the enh/reenable-parallelization-apply-214 branch August 10, 2024 07:51
