Add periodic checks by compute services that their claimed Tasks are actioned by at least one active AlchemicalNetwork; halt execution and drop claim for those Tasks that are not #301

Open
dotsdl opened this issue Sep 11, 2024 · 1 comment


dotsdl commented Sep 11, 2024

Currently, compute services like our SynchronousComputeService continue executing a claimed Task even if that Task is no longer wanted by any user and is no longer actioned on any active AlchemicalNetwork. Execution continues until it has either succeeded or failed. This is largely a waste of compute resources, and it carries an additional opportunity cost if many such Tasks saturate limited resources.

Instead, we would like compute services to periodically check that their currently claimed Tasks are still actioned by at least one active AlchemicalNetwork. For any Task that is not, the compute service should immediately drop its claim and attempt to halt execution.
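As a rough sketch of what one round of this check could look like, assuming Tasks are identified by strings: all names here (`find_unactioned_tasks`, `check_claims`, and the `get_actioning_networks`, `drop_claim`, and `halt_execution` callables) are hypothetical placeholders, not existing alchemiscale APIs.

```python
from typing import Callable, Iterable, List, Set


def find_unactioned_tasks(
    claimed_tasks: Iterable[str],
    get_actioning_networks: Callable[[str], Set[str]],
) -> List[str]:
    """Return the claimed Tasks that no active AlchemicalNetwork actions.

    `get_actioning_networks` is a hypothetical lookup that returns the set of
    active AlchemicalNetworks currently actioning the given Task.
    """
    return [task for task in claimed_tasks if not get_actioning_networks(task)]


def check_claims(
    claimed_tasks: Iterable[str],
    get_actioning_networks: Callable[[str], Set[str]],
    drop_claim: Callable[[str], None],
    halt_execution: Callable[[str], None],
) -> List[str]:
    """Run one round of the periodic check and return the Tasks acted on."""
    unactioned = find_unactioned_tasks(claimed_tasks, get_actioning_networks)
    for task in unactioned:
        drop_claim(task)      # release the claim first
        halt_execution(task)  # then attempt to stop the running work
    return unactioned
```

A compute service could call something like `check_claims` on a timer or between polling intervals; how often to check is a tradeoff between wasted compute and load on the server.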

To do this in the SynchronousComputeService, ProtocolUnit execution will likely need to happen in a subprocess, since ProtocolUnits executed in-process cannot be cleanly halted when the conditions above are met. The calling process can then send SIGTERM or SIGKILL to the subprocess.
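A minimal sketch of the subprocess approach, using only the Python standard library. The function name `run_until_unwanted` and the `still_actioned` callback are assumptions for illustration; how a ProtocolUnit would actually be launched as a command and how its results would be passed back are left open here.

```python
import subprocess
from typing import Callable, Sequence, Tuple


def run_until_unwanted(
    cmd: Sequence[str],
    still_actioned: Callable[[], bool],
    poll_interval: float = 1.0,
    grace_period: float = 5.0,
) -> Tuple[str, int]:
    """Run `cmd` in a subprocess, halting it if `still_actioned()` is False.

    `still_actioned` is a hypothetical callback standing in for the check
    that the Task is actioned by at least one active AlchemicalNetwork.
    Returns ("completed" or "halted", return code of the subprocess).
    """
    proc = subprocess.Popen(cmd)
    while True:
        try:
            # Wait up to poll_interval for the child to finish on its own.
            return "completed", proc.wait(timeout=poll_interval)
        except subprocess.TimeoutExpired:
            pass  # still running; fall through to the check

        if not still_actioned():
            proc.terminate()  # sends SIGTERM on POSIX
            try:
                proc.wait(timeout=grace_period)
            except subprocess.TimeoutExpired:
                proc.kill()  # sends SIGKILL on POSIX
                proc.wait()
            return "halted", proc.returncode
```

For example, `run_until_unwanted([sys.executable, "-c", "import time; time.sleep(60)"], lambda: False)` would terminate the sleeping child after the first poll. Trying SIGTERM first gives the child a chance to clean up; SIGKILL is the fallback if it does not exit within the grace period.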
