Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WorkChain: Protect public methods from being subclassed #5779

Merged
merged 2 commits into from
Nov 28, 2022

Conversation

sphuber
Copy link
Contributor

@sphuber sphuber commented Nov 17, 2022

Fixes #5467

The paradigm of the WorkChain requires a user in an implementation to define the workflow logic through classmethods of the WorkChain subclass. While this gives great flexibility and choice to the user, there is a risk that a user inadvertently chooses a method name that already exists on the WorkChain base class. Usually the super is not called in this scenario and so the functionality is broken.

The typical example is where the user uses the run method as a step in the outline of the WorkChain. The work chain will still run, however, only that one step in the outline is called. Since the logic to continue to the next step in the outline is defined in WorkChain.run, which is overridden and now no longer called, the rest of the work chain is skipped without any warning or error message, leaving the user scratching their head as to what happened.

Here we protect this and other public methods on the WorkChain class to prevent them from being overridden in subclasses. This is accomplished by adding the Protect class as a metaclass. Since the WorkChain already has the metaclass plumpy.ProcessStateMachineMeta, which it inherits from its Process base class, and all metaclasses need to share the same base, Protect also subclasses the ProcessStateMachineMeta class.

The Protect class provides the final classmethod which can be used to decorate a method in the WorkChain class that should be protected. If a subclass implements it, as soon as the class is imported, a RuntimeError is raised mentioning that the method cannot be overridden.

The test test_report_dbloghandler had to be fixed because it actually suffered from the very problem that is being fixed. It used the run method to setup the test, but since the check was never being called, the test always passed, even though the code self._backend in the check is incorrect.

@sphuber
Copy link
Contributor Author

sphuber commented Nov 17, 2022

@danielhollas I think this fix should have saved us some debugging time in #5775 😅

@danielhollas
Copy link
Collaborator

@sphuber did you mean to tag @janssenhenning ?

In any case, this is cool Python magic. 👍

I would additionally propose to rename the run method to something less generic so that the potential clash is minimized. Not sure if that has any potential to break things though.

@unkcpz
Copy link
Member

unkcpz commented Nov 17, 2022

@sphuber thanks! Before going into details, there are some github hints in the review panel, like below. What are they, is that indicate some glitches need to be taken care of?
image

@danielhollas
Copy link
Collaborator

Edit, now I understand, the run method is part of public API, so cannot be changed. Sorry for the confusion.

@sphuber
Copy link
Contributor Author

sphuber commented Nov 17, 2022

@sphuber did you mean to tag @janssenhenning ?

I did indeed 😅 apologies.

I would additionally propose to rename the run method to something less generic so that the potential clash is minimized. Not sure if that has any potential to break things though.

Not a bad suggestion, but this would require changes in plumpy for sure. Not sure how trivial this is. Would leave this for a separate discussion.

@sphuber
Copy link
Contributor Author

sphuber commented Nov 17, 2022

@sphuber thanks! Before going into details, there are some github hints in the review panel, like below. What are they, is that indicate some glitches need to be taken care of?

These have been there for a long long time. I am not sure how they got enabled, but I have the feeling this is GHA trying to be clever. It seems to be picking up on exceptions that get raised in our tests, but these tests are doing that on purpose and so that is exactly what we expect.

@sphuber sphuber force-pushed the fix/5467/protect-workchain-methods branch 2 times, most recently from e396edf to 902a95b Compare November 18, 2022 09:20
The paradigm of the `WorkChain` requires a user in an implementation to
define the workflow logic through classmethods of the `WorkChain`
subclass. While this gives great flexibility and choice to the user,
there is a risk that a user inadvertently chooses a method name that
already exists on the `WorkChain` base class. Usually the `super` is not
called in this scenario and so the functionality is broken.

The typical example is where the user uses the `run` method as a step in
the outline of the `WorkChain`. The work chain will still run, however,
only that one step in the outline is called. Since the logic to continue
to the next step in the outline is defined in `WorkChain.run`, which is
overridden and now no longer called, the rest of the work chain is
skipped without any warning or error message, leaving the user
scratching their head as to what happened.

Here we protect this and other public methods on the `WorkChain` class
to prevent them from being overridden in subclasses. This is
accomplished by adding the `Protect` class as a metaclass. Since the
`WorkChain` already has the metaclass `plumpy.ProcessStateMachineMeta`,
which it inherits from its `Process` base class, and all metaclasses
need to share the same base, `Protect` also subclasses the
`ProcessStateMachineMeta` class.

The `Protect` class provides the `final` classmethod which can be used
to decorate a method in the `WorkChain` class that should be protected.
If a subclass implements it, as soon as the class is imported, a
`RuntimeError` is raised mentioning that the method cannot be
overridden.

The test `test_report_dbloghandler` had to be fixed because it actually
suffered from the very problem that is being fixed. It used the `run`
method to setup the test, but since the `check` was never being called,
the test always passed, even though the code `self._backend` in the
`check` is incorrect.
@sphuber sphuber force-pushed the fix/5467/protect-workchain-methods branch from 902a95b to bb67487 Compare November 18, 2022 22:27
@sphuber sphuber requested a review from mbercx November 22, 2022 13:24
Copy link
Member

@unkcpz unkcpz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sphuber thanks!
Can you explain why has to create Protect class rather than using @typing.final directly, I think the reason is you make it a runtime check here. But didn't understand the full trick here.

@unkcpz
Copy link
Member

unkcpz commented Nov 26, 2022

If I understand correctly, the Protect serve two purposes, 1. protect all public methods from ProcessStateMachineMeta, 2. provide final decorator for other workchain class.
I wonder if is it more clear to separate these two purposes by decoupling the decorator and using it in the Protect class?

@sphuber
Copy link
Contributor Author

sphuber commented Nov 27, 2022

Can you explain why has to create Protect class rather than using @typing.final directly, I think the reason is you make it a runtime check here. But didn't understand the full trick here.

Indeed, the typing.final decorator only reports an error during type checking but we need an error to be raised during runtime. Most of the workchain developers will not be running active type checking on their plugins and so we need something more aggressive than typing.final

@sphuber
Copy link
Contributor Author

sphuber commented Nov 27, 2022

If I understand correctly, the Protect serve two purposes, 1. protect all public methods from ProcessStateMachineMeta, 2. provide final decorator for other workchain class.
I wonder if is it more clear to separate these two purposes by decoupling the decorator and using it in the Protect class?

Not quite. The public methods of ProcessStateMachineMeta are not automatically protected from being overwritten. Only those that are explicitly marked by the final decorator are protected. The meta class is necessary to have the check automatically applied as soon as a new class of WorkChain is defined.

@sphuber sphuber requested a review from unkcpz November 27, 2022 22:18
Copy link
Member

@unkcpz unkcpz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the explanation! All good to me.

@sphuber sphuber merged commit 7929257 into aiidateam:main Nov 28, 2022
@sphuber sphuber deleted the fix/5467/protect-workchain-methods branch November 28, 2022 10:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Protect methods from WorkChain being override
3 participants