fix deadlock in add_srcdata via new require_source_components() function #1521
@mochen4 has been seeing some deadlocks when using near2far adjoints on multiple processors. I think this PR fixes the underlying problem.
`fields::add_srcdata` was calling `fields::require_component(c)` to ensure that the fields corresponding to the source component were added, but `require_component` is a collective function: it must be called from all processes with the same `c`. Because of the low-level way in which `add_srcdata` is constructed, however, the list of sources can be different on each processor, and in particular sources were only added on processors whose chunks overlap with the DFT near-field region.

To solve this, I did some refactoring and added a new `fields::require_source_components()` function which can be called collectively after all of the sources are added, allocating the necessary fields. Python now calls this by default after it adds sources, just in case one of the sources used `add_srcdata` (an `IndexedSource` in Python).
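To illustrate the failure mode and the fix, here is a toy, MPI-free sketch in Python. It is not Meep's implementation: the function names, the list-of-ranks model, and the example components are all hypothetical. It only shows why per-source calls to a collective function can diverge between processes, and how taking the union of needed components across processes first (standing in for an all-reduce) makes every process issue the same calls.

```python
# Toy, MPI-free model: each "rank" is just an entry in a list.
# All names here are hypothetical and are not Meep's API.

def local_call_sequences(sources_per_rank):
    """Old pattern: each rank requires components only for its local sources."""
    return [sorted(set(srcs)) for srcs in sources_per_rank]


def collective_call_sequences(sources_per_rank):
    """New pattern: take the union over all ranks first (a stand-in for an
    all-reduce), then have every rank require the same components."""
    needed = set()
    for srcs in sources_per_rank:
        needed |= set(srcs)
    agreed = sorted(needed)
    return [list(agreed) for _ in sources_per_rank]


def is_collective_safe(call_sequences):
    """A collective call is only safe if every rank makes identical calls."""
    return all(seq == call_sequences[0] for seq in call_sequences)


# Assumed scenario: rank 0's chunks overlap the near-field region and hold
# sources, while rank 1 holds none.
sources_per_rank = [["Ex", "Hy"], []]

old_calls = local_call_sequences(sources_per_rank)
new_calls = collective_call_sequences(sources_per_rank)

print("per-source calls safe:", is_collective_safe(old_calls))
print("collective calls safe:", is_collective_safe(new_calls))
```

In the old pattern rank 0 would enter the collective call while rank 1 never does, which is the kind of mismatch that can hang a real MPI run. In the new pattern both ranks agree on the component list before requiring anything.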