
MultiFab Fixture Cleanup via FabArray::clear #214

Merged: 3 commits merged into AMReX-Codes:development from test-clear-mfab on Nov 1, 2023

Conversation

@ax3l (Member) commented Oct 21, 2023:

Using a context manager and calling `clear` ensures that we no longer hold device memory by the time we reach `AMReX::Finalize`, even when an exception is raised in a test. This avoids segfaults for failing tests.

Related to #81
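
For context, a minimal sketch of the pattern, reconstructed from the description and the diff fragments quoted below; the pyAMReX module name and the fixture parameters are assumptions, not verbatim PR code:

    import amrex.space3d as amr  # assumed pyAMReX module name
    import pytest


    @pytest.fixture(scope="function", params=[(1, 0), (3, 1)])  # (num_components, num_ghost); assumed values
    def mfab(boxarr, distmap, request):
        class MfabContextManager:
            def __enter__(self):
                num_components = request.param[0]
                num_ghost = request.param[1]
                self.mfab = amr.MultiFab(boxarr, distmap, num_components, num_ghost)
                self.mfab.set_val(0.0, 0, num_components)
                return self.mfab

            def __exit__(self, exc_type, exc_value, traceback):
                # free device memory before AMReX::Finalize, even if the test body raised
                self.mfab.clear()
                del self.mfab

        with MfabContextManager() as mfab:
            yield mfab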

Commit: Using a context manager and calling `clear` ensures that we will not hold device memory anymore once we hit `AMReX::Finalize`, even in the situation where an exception is raised in a test. This avoids segfaults for failing tests.

@ax3l added the enhancement (New feature or request) and backend: cuda (Specific to CUDA execution (GPUs)) labels on Oct 21, 2023

Commit: Clear out memory safely on runtime errors.

@ax3l added the bug (Something isn't working) label on Oct 21, 2023
Review thread on the fixture code:

        del self.mfab

    with MfabContextManager() as mfab:
        yield mfab

Contributor commented:

Could this be simplified?

@pytest.fixture(scope="function")
def mfab(boxarr, distmap, request):
    num_components = request.param[0]
    num_ghost = request.param[1]
    # construct before the try block so `mfab` is always bound in `finally`
    mfab = amr.MultiFab(boxarr, distmap, num_components, num_ghost)
    mfab.set_val(0.0, 0, num_components)
    try:
        yield mfab
    finally:
        # release device memory even if the test raised
        mfab.clear()
        del mfab

@ax3l (Member, Author) commented Oct 24, 2023:
As far as I know, that will cache the MultiFab between runs and break our memory pool assumptions...

Worth trying again, though.

Contributor commented:

Interesting. The fixture is marked function scope, so it shouldn't cache a value. Is it pytest that's caching the value?
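
As a side note (my illustration, not from the thread), a function-scoped fixture is set up and torn down once per test, so pytest itself should not cache the object across tests:

    import pytest

    calls = []

    @pytest.fixture(scope="function")
    def resource():
        calls.append("setup")
        yield object()
        calls.append("teardown")

    def test_a(resource):
        pass

    def test_b(resource):
        pass

    # after both tests: calls == ["setup", "teardown", "setup", "teardown"],
    # i.e. the fixture is rebuilt for every test rather than cached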

@ax3l (Member, Author) commented:

I thought so, too... maybe I misunderstood when caching kicks in.

Review thread on the context manager's __enter__:

    def __enter__(self):
        num_components = request.param[0]
        num_ghost = request.param[1]
        self.mfab = amr.MultiFab(boxarr, distmap, num_components, num_ghost)

Contributor commented:

pylint would probably warn about this.

@ax3l (Member, Author) commented:

Oh really? Linting is still coming in #208 😅

CodeQL has not complained yet; what should be avoided here?

Contributor commented:

Pylint will complain about attributes being declared/created in a non-init function.
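
For illustration (my sketch, not code from this PR): pylint's attribute-defined-outside-init check (W0201) is usually silenced by predeclaring the attribute in __init__; `request`, `boxarr`, and `distmap` are assumed to come from the enclosing fixture scope as above:

    class MfabContextManager:
        def __init__(self):
            # predeclare so pylint's W0201 (attribute-defined-outside-init) stays quiet
            self.mfab = None

        def __enter__(self):
            num_components = request.param[0]
            num_ghost = request.param[1]
            self.mfab = amr.MultiFab(boxarr, distmap, num_components, num_ghost)
            self.mfab.set_val(0.0, 0, num_components)
            return self.mfab

        def __exit__(self, exc_type, exc_value, traceback):
            # release device memory, then reset the attribute instead of deleting it
            self.mfab.clear()
            self.mfab = None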

@ax3l (Member, Author) commented Nov 1, 2023:

@sayerhs I wonder if I should just merge this for now and refactor it later, since it fixes issues we see on Perlmutter...?

@ax3l merged commit 3c73a42 into AMReX-Codes:development on Nov 1, 2023 (1 check passed)
@ax3l deleted the test-clear-mfab branch on Nov 1, 2023 at 17:21
Labels: backend: cuda (Specific to CUDA execution (GPUs)), bug (Something isn't working), enhancement (New feature or request)