Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Long-range actors should be script interface objects #4454

Closed
20 tasks done
jngrad opened this issue Feb 16, 2022 · 0 comments · Fixed by #4506
Closed
20 tasks done

Long-range actors should be script interface objects #4454

jngrad opened this issue Feb 16, 2022 · 0 comments · Fixed by #4506

Comments

@jngrad
Copy link
Member

jngrad commented Feb 16, 2022

Many electrostatic and magnetostatic methods implemented for the GPU have the following design:

IF ELECTROSTATICS and MMM1D_GPU:
    cdef class MMM1DGPU(ElectrostaticInteraction):
        cdef shared_ptr[Mmm1dgpuForce] sip
        cdef EspressoSystemInterface * interface

        def __cinit__(self):
            self.interface = &EspressoSystemInterface.Instance()
            self.sip = make_shared[Mmm1dgpuForce](dereference(self.interface))

        def _deactivate_method(self):
            dereference(self.sip).deactivate()

The ScriptInterface framework provides a convenient interface to store a shared pointer to a core class. It has many advantages:

  • no need to manage a shared pointer, this is done by the script interface
  • no need to store a raw pointer to the EspressoSystemInterface global variable
  • no need to re-declare (and maintain) the core class in the .pxd file
  • the Cython class can be converted to a regular Python class

The CPU implementations can also benefit from such a refactoring:

  • no need to duplicate class declarations in the .pxd files
  • all electrostatics and magnetostatics core global variables can be removed

Roadmap

Rewrite electrostatic and magnetostatic methods as C++ classes

  • improve interface segregation and encapsulate algorithms in classes
  • do dependency inversion (e.g. dipoles.hpp should include all magnetostatic methods, instead of all magnetostatic methods including dipoles.hpp)
  • extract common code (P3M data structs and utility functions, FFT code)
  • move actor source files to the electrostatics or magnetostatics modules
  • remove the Actors framework in the core
  • split electrostatic methods from magnetostatic methods
    • the core should have three new folders: electrostatics, magnetostatics, p3m (for common P3M code and FFT)
    • the core should no longer have folders actors, electrostatics_magnetostatics
  • split ScaFaCoS electrostatic methods from ScaFaCoS magnetostatic methods

Write the script interface

  • write ScriptInterface classes for electrostatic methods
  • write ScriptInterface classes for magnetostatic methods
  • write ScriptInterface classes for ScaFaCoS methods
  • sort out issues related to MPI errors when throwing exceptions on worker nodes, e.g. in the constructors and tuning functions (see Script interface object constructors are not thread-safe #4449)
  • sort out issues related to MPI deadlock when mixing boost::mpi communication with MpiCallbacks communication

Roll out changes to the core

  • remove all electrostatics/magnetostatics global variables in the core
  • add boost::optional<boost::variant<Method1,Method2,Method3>> magnetostatics_actor global variables in the core
  • implement boost visitors for the energy, force and pressure calculation
  • make DLC a stand alone actor instead of an extension to break dependencies in the core and lift ambiguities in the script interface (see ELC misclassified as electrostatic extension #2155)
    • e.g. what should be the state of DipolarP3M when removing the DLC "extension"? The DipolarP3M force and energy kernels are not const, therefore the actor will be altered and no longer be in sync with the DP3M python class. Making the DP3M actor an argument of DLC removes the ambiguity. In addition, the Cython Actors class no longer has to keep track of active actors vs. active extensions.
  • clarify when on_coulomb_change(), parameter broadcasts and long-range actor sanity checks are really needed; right now they are called in too many places (e.g. sanity checks are run at each integration step during tuning) and parameter broadcasts override each other when an actor is adapted by ELC or DLC. Parameter broadcasts need to be removed entirely by making the tuning code synchronous, sanity checks should only be executed when an actor is active, and the on_coulomb_change() should only be called when activating, deactivating or modifying the state of an actor.
  • runtime errors triggered by long-range actor sanity checks are queued and their handling is sometimes deferred to the integration loop, which is too late (fixed by Improve exception handling and fix various regressions #4479)
  • break dependencies in the core: at the moment, coulomb.hpp is indirectly included in around 60 files, half of which are totally unrelated to electrostatics; to break this chain of includes, we need to make the BondedCoulombSR kernels non-inline (i.e. remove its #include "coulomb_inline.hpp" and move it to a .cpp file) and split the pair criteria into individual classes
  • decouple magnetostatics events from electrostatics events (for example, modifying the DipolarP3M parameters should not reinitialize the P3MGPU parameters)
@jngrad jngrad added this to the Espresso 4.2 milestone Feb 16, 2022
@jngrad jngrad self-assigned this Feb 16, 2022
@kodiakhq kodiakhq bot closed this as completed in #4506 May 31, 2022
kodiakhq bot added a commit that referenced this issue May 31, 2022
Fixes #4454

Description of changes:
- apply the SOLID principles
   - split electrostatics from magnetostatics
   - factor out code duplication
   - extract P3M/DP3M tuning code
   - hide ELC details in private class methods (the energy/force correction logic used to be exposed in `coulomb.cpp`)
   - hide Coulomb implementation details by no longer indirectly including `coulomb.hpp` in 30 unrelated files in the core
- write clear pre-conditions as event hooks
   - these events are triggered upon box size change, periodicity change, node grid change and cell structure change
      - partial fix for #4428
      - setting a `node_grid` incompatible with a long-range actor is no longer possible (the old `node_grid` is restored)
   - these preconditions always run once before the integration loop starts
   - write test cases to check for these events
   - bugfix: charge neutrality checks now run once before the integration loop starts, instead of once at actor activation
- encapsulate the state of long-range methods in classes managed by the script interface
   - bugfix: the cached `DipolarP3M` energy correction is now invalidated when the box length changes
   - remove global variables for CPU methods (partial fix for #2628) and introduce one global variable for each type of long-range actor: dipolar solver, Coulomb solver, Coulomb extension
  - convert Cython files for long-range actors to Python files using the `ScriptInterface` framework
     - long-range actors and their python object now have the same lifetime, regardless of whether the actor is active or not (partial fix for #4220)
  - the list of active long-range actors can no longer end up in an invalid state (partial fix for #4219):
     - updating an active actor with an invalid parameter now automatically restores the original parameter
     - inserting an invalid long-range actor in the list of actors now automatically removes the actor from the list of active actors, even when the exception happens during tuning
- API changes:
   - `ELC` and `DLC` are no longer extensions, instead they take another actor as argument
   - `P3MGPU` is now able to execute the CPU energy and pressure kernels using parameters tuned for the GPU force kernel
      - closes #239
      - P3M-based reaction scripts run with 2x speed-up when reaction steps take place every 1000 integration steps
   - `P3MGPU` is now supported by `ELC`
   - long-range actor parameters are now immutable, since changing one parameter (e.g. the prefactor) would usually invalidate the tuned parameters
   - the charge neutrality check sensitivity can now be modified via `actor.charge_neutrality_tolerance`; this is mostly relevant to actors coupled to `ICC`, whose induced charges can have value spanning several orders of magnitude
- under-the-hood changes:
   - CPU-based long-range actors no longer rely on the `MpiCallbacks` framework
   - the subtracted Coulomb bonded interaction and pair criteria no longer indirectly include `coulomb.hpp`
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant