Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

No pinned flags in setup_trans #142

Merged
merged 1 commit into from
Sep 2, 2024
Merged

Conversation

lukasm91
Copy link
Collaborator

@lukasm91 lukasm91 commented Sep 2, 2024

I suggest that we remove pinned flags from setup_trans, mainly because there are no allocations in setup_trans anymore that are in the critical path.

The pinned flag used to be here for a few intermediate buffers that were copied forth and back in each iteration. Those are now gone.

If we are not removing pinned, we have to add it to a few more files to make sure we use the same API for deallocation. This is one of the reasons trans_release/trans_end is not working right now, and the error messages are far from intuitive.

@wdeconinck
Copy link
Collaborator

I definitely don't like the "-gpu=pinned" flag and would like to avoid it if possible.
Did you also verify with an experiment that this modification has no negative impact on performance within the iterations loop?

@marsdeno marsdeno self-requested a review September 2, 2024 12:07
Copy link
Collaborator

@marsdeno marsdeno left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tested this in the run-up to merge of GPU code path to develop, and my commit must somehow have got lost somewhere, sorry.
In any case in my tests things worked fine with no effect on performance.

@lukasm91
Copy link
Collaborator Author

lukasm91 commented Sep 2, 2024

Yes, I confirm, also didn't see an impact on performance, which makes sense because setup_trans has no allocations that are moved between CPU and GPU in the critical path (this was different in earlier versions because the CPU-GPU staging buffers where allocated in this routine).

@wdeconinck
Copy link
Collaborator

Does this now mean that this flag is also no longer required on the link line?

@wdeconinck wdeconinck merged commit 51cd73c into ecmwf-ifs:develop Sep 2, 2024
11 checks passed
@lukasm91
Copy link
Collaborator Author

lukasm91 commented Sep 2, 2024

Does this now mean that this flag is also no longer required on the link line?

In the current code, yes, because the only transfers in the critical path that benefit from pinned memory are copying input/output buffers from outside ectrans, and those have to be allocated accordingly, but outside of ectrans. If we were to optimize the "no cuda aware MPI case" on GPU, we would need to compile and link that file with -gpu=pinned, but that is probably a stretch, and is also not a regression.

@lukasm91 lukasm91 deleted the no-pinned branch September 3, 2024 08:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants