
⚡️ Make memory estimates data-dependent #1499

Merged
sgiavasis merged 27 commits into develop from enh/memory_data-dependent on May 21, 2021

Conversation

@shnizzedy (Member) commented Apr 28, 2021

Fixes

Fixes #1480 by @shnizzedy
Fixes #1511 by @shnizzedy

Description

Adds an optional `mem_x` parameter that takes a tuple of `(multiplier, Node_property_containing_filename)` to set memory estimates dynamically.

`mem_gb` remains a constant, such that:

- if `mem_x` is not set, `node.mem_gb = mem_gb`
- if `mem_x` is set, `node.mem_gb = mem_gb + mem_x[0] * size_of(get_attr(node.inputs, mem_x[1]))`
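The resolution logic above can be sketched in a few lines. This is an illustrative sketch only: `resolve_mem_gb` and `get_data_size` are hypothetical stand-ins for the behavior described in this PR, not the actual C-PAC implementation.

```python
# Illustrative sketch of the mem_x resolution rule; helper names and the
# default value are placeholders, not C-PAC's actual code.
import operator
from functools import reduce

DEFAULT_MEM_GB = 0.2  # assumed default constant


def get_data_size(shape):
    """Stand-in for a data-size function: product of the image
    dimensions (e.g. x * y * z * t for a 4D NIfTI)."""
    return reduce(operator.mul, shape, 1)


def resolve_mem_gb(mem_gb=DEFAULT_MEM_GB, mem_x=None, inputs=None):
    """mem_gb stays constant; mem_x=(multiplier, input_name) adds a
    data-dependent term once the named input is known."""
    if mem_x is None:
        return mem_gb
    multiplier, input_name = mem_x
    return mem_gb + multiplier * get_data_size(inputs[input_name])
```

With `mem_x` unset the estimate is just the constant; with `mem_x` set it scales with the named input's data size.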

Technical details

Tests

```python
def test_Workflow(tmpdir):
    example_filepath = os.path.join(data_path, 'example4d.nii.gz')
    get_sample_data = lambda filepath: filepath
    pass_in_filepath = IdentityInterface(fields=['x'])
    pass_in_filepath.inputs.x = example_filepath
    node_1 = Node(pass_in_filepath, 'pass_in_filepath')
    # This node just returns the filepath given. We're testing the
    # just-in-time memory allocation
    node_2 = Node(Function(['filepath'], ['filepath'], get_sample_data),
                  name='get_sample_data',
                  inputs=['filepath'],
                  outputs=['filepath'])
    assert node_2.mem_gb == DEFAULT_MEM_GB
    node_2 = Node(Function(['filepath'], ['filepath'], get_sample_data),
                  name='get_sample_data',
                  inputs=['filepath'],
                  outputs=['filepath'],
                  mem_x=(0.1, 'filepath'))
    with pytest.raises(FileNotFoundError):
        node_2.mem_gb
    wf = Workflow('example_workflow', base_dir=tmpdir)
    wf.connect(node_1, 'x', node_2, 'filepath')
    assert wf.get_node('get_sample_data').inputs.filepath is Undefined
    out = wf.run()
    assert list(out.nodes)[0].mem_gb == DEFAULT_MEM_GB + get_data_size(
        example_filepath) * 0.1
```
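The `get_data_size` helper used in the final assertion isn't shown in this test. One way such a helper could obtain image dimensions without loading voxel data is to parse the NIfTI-1 header directly; the sketch below is hypothetical (little-endian only, no magic/byte-order checks) and is not C-PAC's actual implementation — in practice nibabel would handle this.

```python
# Hypothetical sketch: read dim[] from a NIfTI-1 header to compute a data
# size (product of dimensions). Assumes a little-endian header; real code
# should validate sizeof_hdr/magic and byte order (or just use nibabel).
import gzip
import struct


def nifti_data_size(filepath):
    """Product of image dimensions (x*y*z*t*...) from a NIfTI-1 header,
    for .nii or .nii.gz files."""
    opener = gzip.open if filepath.endswith('.gz') else open
    with opener(filepath, 'rb') as f:
        header = f.read(56)
    # dim[8] is 8 little-endian shorts at byte offset 40; dim[0] is the
    # number of dimensions, dim[1:] the size of each dimension.
    dim = struct.unpack('<8h', header[40:56])
    size = 1
    for d in dim[1:1 + dim[0]]:
        size *= d
    return size
```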

Screenshots

Checklist

  • My pull request has a descriptive title (not a vague title like Update index.md).
  • My pull request targets the develop branch of the repository.
  • My commit messages follow best practices.
  • My code follows the established code style of the repository.
  • I added tests for the changes I made (if applicable).
  • I updated the changelog.
  • I added or updated documentation (if applicable) ― 📝 Document mem_x fcp-indi.github.io#253
  • I tried running the project locally and verified that there are no
    visible errors.

Developer Certificate of Origin

Version 1.1

Copyright (C) 2004, 2006 The Linux Foundation and its contributors.
1 Letterman Drive
Suite D4700
San Francisco, CA, 94129

Everyone is permitted to copy and distribute verbatim copies of this
license document, but changing it is not allowed.


Developer's Certificate of Origin 1.1

By making a contribution to this project, I certify that:

(a) The contribution was created in whole or in part by me and I
    have the right to submit it under the open source license
    indicated in the file; or

(b) The contribution is based upon previous work that, to the best
    of my knowledge, is covered under an appropriate open source
    license and I have the right under that license to submit that
    work with modifications, whether created in whole or in part
    by me, under the same open source license (unless I am
    permitted to submit under a different license), as indicated
    in the file; or

(c) The contribution was provided directly to me by some other
    person who certified (a), (b) or (c) and I have not modified
    it.

(d) I understand and agree that this project and the contribution
    are public and that a record of the contribution (including all
    personal information I submit with it, including my sign-off) is
    maintained indefinitely and may be redistributed consistent with
    this project or the open source license(s) involved.

@shnizzedy shnizzedy self-assigned this Apr 28, 2021
@shnizzedy (Member, Author) commented:

Right now the function's only considering the time dimension of files because that's all I've varied in calculating the estimates. I plan on also varying physical dimensions and updating the function and estimates before marking this PR ready for review.

@shnizzedy (Member, Author) commented:

Example (seems to be working):

Estimate in code:

```python
func_resample = pe.Node(
    interface=fsl.FLIRT(),
    name='{}_flirt'.format(functional_at_resolution_key),
    mem_gb=0.521,
    mem_x=(0.0098, 'in_file')
)
```

In callback.log:

```json
{"id": "cpac_sub-105923_ses-retest.nuisance_regressors_Regressor-2_225.Functional_2mm_flirt", "hash": "caf85785cf4dca97a3b2f6692406ac9b", "start": "2021-04-28T15:52:00.444716", "finish": "2021-04-28T15:52:56.886663", "runtime_threads": 1, "runtime_memory_gb": 2.983402251953125, "estimated_memory_gb": 3.461, "num_threads": 1}
{"id": "cpac_sub-105923_ses-retest.nuisance_regressors_Regressor-1_225.Functional_2mm_flirt", "hash": "caf85785cf4dca97a3b2f6692406ac9b", "start": "2021-04-28T15:52:02.494853", "finish": "2021-04-28T15:53:00.245118", "runtime_threads": 1, "runtime_memory_gb": 2.95299530078125, "estimated_memory_gb": 3.461, "num_threads": 1}
```
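The logged estimate is consistent with the `mem_gb + multiplier * data_size` rule if the data size here is 300 (e.g. 300 timepoints) — that size is an inference from the numbers, not something stated in the log:

```python
# Arithmetic check of the logged estimate; the data size of 300 is an
# assumption inferred from the numbers above, not read from the log.
mem_gb = 0.521
multiplier = 0.0098
assumed_data_size = 300
estimate = mem_gb + multiplier * assumed_data_size
print(round(estimate, 3))  # matches estimated_memory_gb of 3.461 in the log
```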

@shnizzedy shnizzedy force-pushed the enh/memory_data-dependent branch from 9600b21 to acb446f Compare May 12, 2021 15:44
@shnizzedy (Member, Author) commented:

I'm going through and cleaning up the hacked-together bits now, but here's what I've put together that seems to be working (Paul at Columbia's trying it out on their cluster now):
The image I sent to Paul falls back on a multiplicand that is the size of the functional files I know he's using. I've removed that fallback from the branch and am testing out that the workflow-level memoized value is available everywhere I think it should be. Still to do:

for 1.8.1
    • Finish cleaning up the code changes.
    • Edit the docs and put in a PR.
    • Put in some reasonable CI test(s).
    • Take out strings in the `mem_x`s that I've set.
for 1.8.2+
    • Run more variety of data shapes through C-PAC to tune the estimates.
Some design choices I'm on the fence about:
  1. Is it worthwhile to be able to specify per-node inputs, or is per-workflow good enough? I'm leaning toward the latter for simplicity's sake, but I could be convinced otherwise. Not pressing since both seem to work currently, but the documentation would be simpler and the question of which to use if both are defined goes away if we just use the largest functional scan.
  2. Is a coefficient times (x × y × z × t) the right approach or should there be separate coefficients for spatial dimensions and time? Or some other approach to tying the memory estimates to data size?

― me, 📧 "Setting data-dependent memory-estimates"
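The second design question above can be sketched as two candidate models; the coefficients below are made-up placeholders for illustration, not tuned estimates:

```python
# Two candidate memory models for design question 2; coefficients are
# illustrative placeholders, not values fitted to real runs.
def estimate_single_coeff(shape, coeff, base=0.0):
    """One coefficient on the full voxel-timepoint count:
    base + coeff * (x * y * z * t)."""
    x, y, z, t = shape
    return base + coeff * (x * y * z * t)


def estimate_split_coeffs(shape, c_spatial, c_time, base=0.0):
    """Separate coefficients for spatial size and time:
    base + c_spatial * (x * y * z) + c_time * t."""
    x, y, z, t = shape
    return base + c_spatial * (x * y * z) + c_time * t
```

The split model can fit data whose memory cost scales differently in space and time, at the price of needing more varied data shapes to tune.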

@shnizzedy shnizzedy force-pushed the enh/memory_data-dependent branch 2 times, most recently from 741fde5 to efd96eb Compare May 14, 2021 17:14
@shnizzedy shnizzedy force-pushed the enh/memory_data-dependent branch from efd96eb to 6881068 Compare May 14, 2021 17:32
@shnizzedy shnizzedy force-pushed the enh/memory_data-dependent branch from f247a00 to 09bb6f0 Compare May 18, 2021 19:50
@shnizzedy shnizzedy force-pushed the enh/memory_data-dependent branch from 09bb6f0 to 44ac6a6 Compare May 19, 2021 14:36
@shnizzedy shnizzedy marked this pull request as ready for review May 19, 2021 15:10
@sgiavasis sgiavasis merged commit c85c95b into develop May 21, 2021
@shnizzedy shnizzedy deleted the enh/memory_data-dependent branch May 24, 2021 13:28
@shnizzedy shnizzedy restored the enh/memory_data-dependent branch June 1, 2021 20:45
@shnizzedy shnizzedy deleted the enh/memory_data-dependent branch June 1, 2021 21:15
@shnizzedy shnizzedy mentioned this pull request Aug 12, 2021
@shnizzedy shnizzedy mentioned this pull request Sep 21, 2021
shnizzedy pushed a commit to shnizzedy/C-PAC that referenced this pull request Feb 25, 2022
⚡️ Make memory estimates data-dependent