Ensure that ProcessNode.get_builder_restart fully restores all inputs including non_db ones #4089
Comments
I see what you are trying to do, but I would advise against it, and in any case, as you have now noticed, it is not possible. So the exception is more or less intentional. By exposing inputs in the top-level namespace, you are overriding all ports that already exist if they have the same name. In your case, you are overriding the [...]

I am not sure if we should add support for this, because it isn't really what it is supposed to be doing. I would indeed expect that adding a [...]

Thinking about this, maybe there is a more generic flaw or problem with [...]

@greschd what do you think about this?
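The overriding behaviour described in this comment can be illustrated with a toy model that uses plain dictionaries in place of AiiDA's port namespaces. The `expose_inputs` function below is a hypothetical stand-in written for this illustration, not the AiiDA implementation, and the port values are just labels saying which spec owns the port.

```python
def expose_inputs(target_ports, source_ports, namespace=None):
    """Toy stand-in: copy ports from a child spec into a parent spec."""
    if namespace is None:
        # Top-level exposing: ports with the same name are overwritten.
        target_ports.update(source_ports)
    else:
        # Namespaced exposing: the child's ports are kept separate.
        target_ports[namespace] = dict(source_ports)
    return target_ports


# Hypothetical port tables; the strings only record the owner of each port.
workchain_ports = {'metadata': 'owned by workchain'}
calcjob_ports = {'code': 'owned by calcjob', 'metadata': 'owned by calcjob'}

top_level = expose_inputs(dict(workchain_ports), calcjob_ports)
namespaced = expose_inputs(dict(workchain_ports), calcjob_ports, namespace='sub')

print(top_level['metadata'])   # the workchain's own port has been replaced
print(namespaced['metadata'])  # the workchain's own port is untouched
```

With a namespace, both `metadata` ports coexist; without one, the last one exposed wins.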
Sorry for jumping in, but I thought a bit about this issue and will briefly add my idea.
And of course, in the WC you then need to place "options_for_calcjob" into the [...]
Thanks for the replies. I understand the concerns and potential risks, and indeed I just wanted to see if it was possible not to have to use builder.mycode.metadata.options.resources and such, which I found not very user-friendly because it is too deeply nested. Excluding metadata was indeed my go-to workaround in this case, before trying to play with set_option directly. I had a little trouble forwarding it to the job underneath (Dict/dict/AttributeDict issues, mainly, all sorted out), but I now have a working way to pass run_opts to the workchain, which will become metadata for the job. I agree that excluding metadata when using expose_inputs might be a good idea, and so might adding a more explicit error when someone tries to forward them as I did. But that's not really critical either way, as the current solution satisfies me.
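A minimal sketch of the workaround described here, using plain dictionaries: a flat `run_opts` mapping given to the workchain is placed under `metadata.options` of the inputs for the job. The helper name `build_job_inputs` and the placeholder values are assumptions made for illustration; only the nesting `metadata.options.resources` comes from the discussion.

```python
def build_job_inputs(exposed_inputs, run_opts):
    """Return job inputs with ``run_opts`` nested under ``metadata.options``."""
    job_inputs = dict(exposed_inputs)
    metadata = dict(job_inputs.get('metadata', {}))
    # Shallow copy into a plain dict before handing the options to the job.
    metadata['options'] = dict(run_opts)
    job_inputs['metadata'] = metadata
    return job_inputs


# Placeholder values; in a real workchain these would come from self.inputs.
run_opts = {'resources': {'num_machines': 1, 'num_mpiprocs_per_machine': 1}}
job_inputs = build_job_inputs({'code': 'testcode'}, run_opts)
print(job_inputs['metadata']['options']['resources'])
```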
I agree with @bosonie that it's often nicer to expose some inputs at the top level, but as you've noticed it needs some care as to how these inputs interact with the existing inputs. Just for completeness' sake, note that you can also mix top-level and namespaced inputs. Here's a complete example:

    from aiida import orm
    from aiida.engine import WorkChain, CalcJob

    class CJ(CalcJob):
        @classmethod
        def define(cls, spec):
            super().define(spec)
            spec.input('a', valid_type=orm.Int)
            spec.input('b', valid_type=orm.Float)
            spec.input('c', valid_type=orm.Int)

    class WC(WorkChain):
        @classmethod
        def define(cls, spec):
            super().define(spec)
            # 'a' and 'b' are exposed at the top level, everything else under 'calcjob'
            spec.expose_inputs(CJ, include=('a', 'b'))
            spec.expose_inputs(CJ, namespace='calcjob', exclude=('a', 'b'))
            spec.outline(cls.run)

        def run(self):
            self.report(self.exposed_inputs(CJ, namespace='calcjob'))

That could be run with

    from aiida.engine import run

    run(WC,
        a=orm.Int(1),
        b=orm.Float(1.2),
        calcjob={
            'code': orm.Code.get(label='testcode'),
            'c': orm.Int(3),
            'metadata': {
                'options': {
                    'resources': {
                        'num_machines': 1,
                        'num_mpiprocs_per_machine': 1
                    }
                }
            }
        })

and the [...]

    from collections import ChainMap
    from aiida import orm
    from aiida.engine import WorkChain, CalcJob

    class CJ(CalcJob):
        @classmethod
        def define(cls, spec):
            super().define(spec)
            spec.input('a', valid_type=orm.Int)
            spec.input('b', valid_type=orm.Float)
            spec.input('c', valid_type=orm.Int)
            spec.inputs.dynamic = True

    class WC(WorkChain):
        @classmethod
        def define(cls, spec):
            super().define(spec)
            spec.expose_inputs(CJ, include=('a', 'b'))
            spec.expose_inputs(CJ, namespace='calcjob', exclude=('a', 'b'))
            spec.inputs['calcjob'].dynamic = True
            spec.outline(cls.run)

        def run(self):
            self.report(
                ChainMap(self.exposed_inputs(CJ, namespace='calcjob'),
                         self.inputs.calcjob))

Back to the question at hand, how to deal with the [...]

In general, we can't know if a particular input should be shared (top-level, potentially overriding existing input ports) or not, and thus we leave this choice up to the developer. What makes this specific case troublesome, however, is that [...]

If we decide that [...]

Excluding [...]
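As a side note on the `ChainMap` used in the second example: lookups go through the wrapped mappings in order, so a key present in the first mapping wins over the same key in a later one. The dictionaries below are made-up stand-ins for the exposed inputs and the full `calcjob` namespace, given different values on purpose to make the precedence visible.

```python
from collections import ChainMap

exposed = {'c': 3, 'code': 'testcode'}      # stand-in for the exposed inputs
everything = {'c': 99, 'extra': 'dynamic'}  # stand-in for self.inputs.calcjob

merged = ChainMap(exposed, everything)

print(merged['c'])      # found in `exposed`, the first mapping
print(merged['extra'])  # only present in the second mapping
print(sorted(merged))   # union of the keys of both mappings
```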
And to make @bosonie's approach simpler: we could add a "rename" option to "expose_inputs", like

    spec.expose_inputs(CalcJob, rename=[('metadata', 'calcjob_metadata')])

that would rename the port and automatically translate back in [...]

If I understood correctly, you'd want it to work with a nested port:

    spec.expose_inputs(CalcJob, rename=[('metadata.options', 'calcjob_options')])

Not sure how simple that would be to implement.
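To get a feeling for what the "translate back" step of such a rename option might involve, here is a small sketch on plain nested dictionaries. It assumes the `rename` list has the `(original, renamed)` form proposed in this comment and that dots separate nested port names; `set_path` and `translate_back` are hypothetical helpers, and the `rename` keyword itself is only a proposal here, not an existing argument.

```python
import copy


def set_path(data, dotted, value):
    """Set ``value`` at a dotted path, creating intermediate dicts as needed."""
    *parents, last = dotted.split('.')
    for key in parents:
        data = data.setdefault(key, {})
    data[last] = value


def translate_back(inputs, rename):
    """Map renamed workchain inputs back to the names the child process expects."""
    result = copy.deepcopy(inputs)  # leave the caller's inputs untouched
    for original, renamed in rename:
        if renamed in result:
            set_path(result, original, result.pop(renamed))
    return result


rename = [('metadata.options', 'calcjob_options')]
wc_inputs = {'code': 'testcode', 'calcjob_options': {'resources': {'num_machines': 1}}}
print(translate_back(wc_inputs, rename))
```

On dictionaries the nested case is only a few lines; the harder part in practice would presumably be renaming the port in the spec while keeping its validation, which this sketch does not attempt.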
Thanks for all the input, guys. I agree with a number of the points made:
As for solutions, I am not a fan of the solution proposed by @bosonie because it makes the options an actual input to the workchain instead of the calculation job. The whole point of the exposed inputs functionality (among other things of course) was to get rid of having to wrap options in an actual node and then unwrap them. You lose the auto-completion and validation of the [...]

Finally, I think that having a namespace should not be seen as a burden, but actually as an aid to remind one that those inputs are not meant for the called process, but for one of the sub processes. I think it is worth reminding ourselves that exposing does not really mean "re-purposing". The wrapper workchain is merely allowing the ports of its child process to "poke through" its own interface, and it is not claiming those ports to be its own.
This depends a bit on what the intention of the wrapping process is: if the goal is to be "opaque" to the user, such that the user shouldn't really have to care about which sub-process will be called, I can see that the namespace is a bit of an eyesore. All that's to say, I think there are valid reasons to use either full namespacing, top-level-only exposing, or a mixed approach. To make the [...]
I think that is fine, but I think this makes the solutions to wrapping a [...]

    @classmethod
    def define(cls, spec):
        spec.expose_inputs(SomeCalcJob, exclude=('metadata',))
        spec.expose_inputs(SomeCalcJob, include=('metadata',), namespace='sub')

    inputs = {
        'code': ...,
        'parameters': ...,
        'sub': {
            'metadata': {
                'options': {}
            }
        }
    }
    submit(WrapperWorkChain, **inputs)

or something like

    @classmethod
    def define(cls, spec):
        spec.expose_inputs(SomeCalcJob, exclude=('metadata',))
        spec.input('calcjob_options', valid_type=orm.Dict)

    inputs = {
        'code': ...,
        'parameters': ...,
        'calcjob_options': orm.Dict(dict={}),
    }
    submit(WrapperWorkChain, **inputs)

I find that both those options are more complex than:

    @classmethod
    def define(cls, spec):
        spec.expose_inputs(SomeCalcJob, namespace='sub')

    inputs = {
        'sub': {
            'code': ...,
            'parameters': ...,
            'metadata': {
                'options': {}
            }
        }
    }
    submit(WrapperWorkChain, **inputs)

With the latter, note how you can literally take the input dictionary as one would have passed it straight to the [...]

The second option I still consider to be the worst because, as said before, you lose the entire specification of the options defined on the [...]
Yeah, I agree with your style assessment here. Personally, I'd use 1) only if there are multiple sub-workchains that all share the same input (e.g., [...]

To clarify what I was proposing though: I would consider all three options correct, even if maybe not recommended. What I think should eventually (after deprecation, and maybe with an override switch) produce an error is straight up

    @classmethod
    def define(cls, spec):
        spec.expose_inputs(SomeCalcJob)

because it shadows an existing [...]

The option [...] would of course still be allowed, because there's no existing [...]

In terms of implementation, we could add an [...]
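A rough sketch of the check proposed here (an error when exposing would shadow an existing port, with an override switch), again on plain dictionaries rather than real port namespaces. `ShadowedPortError`, `expose_checked` and the `allow_shadowing` switch are hypothetical names chosen for illustration; none of them is claimed to exist in AiiDA.

```python
class ShadowedPortError(ValueError):
    """Raised when an exposed port would replace an existing one."""


def expose_checked(target_ports, source_ports, exclude=(), allow_shadowing=False):
    """Toy top-level expose that refuses to overwrite existing ports by default."""
    exposed = {k: v for k, v in source_ports.items() if k not in exclude}
    clashes = sorted(set(exposed) & set(target_ports))
    if clashes and not allow_shadowing:
        raise ShadowedPortError(f'exposing would shadow existing ports: {clashes}')
    target_ports.update(exposed)
    return target_ports


workchain_ports = {'metadata': 'workchain'}
calcjob_ports = {'code': 'calcjob', 'metadata': 'calcjob'}

# Excluding the clashing port passes the check.
print(expose_checked(dict(workchain_ports), calcjob_ports, exclude=('metadata',)))

# Exposing everything at the top level is rejected unless explicitly allowed.
try:
    expose_checked(dict(workchain_ports), calcjob_ports)
except ShadowedPortError as exc:
    print(exc)
```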
One problem with [...] On the other hand, wrapping it into a [...] Is it possible to have [...]
I think this may actually be possible, because the [...] Then there is just the trick that the [...]
Yes, but a more complicated workchain may have exposed multiple [...] I think one solution can be to be [...] Or we can have the linkage stored under some attribute of the [...] Then we can ditch passing [...]
Actually, this won't work, even if we record the explicit link (which [...]
Good point. Then I think the only way out is to record these [...]
I have a simple workchain to wrap a calculation. I used expose_inputs to forward everything, and it was using a namespace until recently.
But I would like to get rid of the namespace in this case, and simply forward every input to the calculation job, to get rid of the extra layer of indirection. But metadata.options seems to be impossible to forward directly, as validation for the options key only works for calculations.
The offending line seems to be:
set_option is only defined in calcjob.py, but not in workchain.py (or process.py). Adding it manually seems to just work in my case. I'm not sure whether this has unwanted side effects, though; I have only tried a local job so far.