Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add the options argument to all get_builder_from_protocol #846

Merged
merged 2 commits into from
Oct 19, 2022

Conversation

sphuber
Copy link
Contributor

@sphuber sphuber commented Oct 7, 2022

Fixes #844

Often a user will need to use the same options for all the jobs that will be run in a nested workflow, such as specifying the queue_name or the account, etc. Currently, the user will have to manually specify those on the builder, or in the overrides, but this means that the user knows exactly where in the nesting the jobs are.

The options argument is added to all get_builder_from_protocol implementations, which takes a dictionary just as one would pass to the metadata.options input for a CalcJob. The implementation ensures that these inputs are recursively set on all the inputs of all the CalcJobs that are part of the nested namespace.

Often a user will need to use the same options for all the jobs that
will be run in a nested workflow, such as specifying the `queue_name` or
the `account`, etc. Currently, the user will have to manually specify
those on the builder, or in the `overrides`, but this means that the
user knows exactly where in the nesting the jobs are.

The `options` argument is added to all `get_builder_from_protocol`
implementations, which takes a dictionary just as one would pass to the
`metadata.options` input for a `CalcJob`. The implementation ensures
that these inputs are recursively set on all the inputs of all the
`CalcJob`s that are part of the nested namespace.
@sphuber sphuber requested review from mbercx and ltalirz October 7, 2022 16:34
@sphuber
Copy link
Contributor Author

sphuber commented Oct 7, 2022

@ltalirz quick implementation of the first feature request. If you could verify this satisfies your requirements and works in the wild, that would be great.

Copy link
Member

@ltalirz ltalirz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot @sphuber , this is exactly what I was looking for!

I will put a note here when I've tested it, but it looks fine to me.

@mbercx
Copy link
Member

mbercx commented Oct 9, 2022

I can definitely see the use case, have thought about implementing something similar in the past. I had some reservations at the time, since there are many inputs one might want to set for all calculation jobs (e.g. parallellization, settings, parameters, ...) and I didn't want the input signature to be extended too much. We also hadn't fully agreed on the get_builder_from_protocol() API yet (#745) and the options aren't necessarily the same for all calculations in the work chain (e.g. resources, you might want to run the relaxation on 4 nodes, but the NSCF on a single node). So any serious user of the work chain would still have to figure out where the calcjob options are in the work chain hierarchy, which is easier now that the builder representation shows the hierarchy and where the options are.

That said, I don't want to block this feature, it can definitely be useful for people that just begin to use the work chain, and the options are one of the inputs you pretty much always have to override.

@sphuber
Copy link
Contributor Author

sphuber commented Oct 9, 2022

The points you raise are indeed why we at the time didn't implement this feature. There simply doesn't seem to be a single solution that fits all use-cases, which is what I mentioned to @ltalirz as to why this didn't exist yet. That being said though, currently it is quite difficult to use the get_builders_from_protocol for novice users precisely because they most likely will have to set some options for all calcjobs and they will have to find them in the namespace. It think @ltalirz mentioned, rightfully so, that we should try and make it as easy to use for the majority of use cases (novice users) while still allowing expert users to tune the behavior. This was also the main design goal of the common workflows. So even though for some workflows you might want to use different resources, expert users that know they want to do this, will therefore know that there even are different calcjobs and so they will be more likely to find them in the namespace. Overall then, I think this change will make life of the average user easier.

Regarding the parallel with the parallelization, settings and parameters. The parameters are essentially entirely taken care of by the protocols, so these don't really count I think. The parallelization and settings are really more for advanced users, and so again I think it might be reasonable to expect them to understand the workchain in more detail and be able to address the calcjobs in the various namespaces.

@ltalirz
Copy link
Member

ltalirz commented Oct 19, 2022

@mbercx what is the final verdict here?

@sphuber
Copy link
Contributor Author

sphuber commented Oct 19, 2022

Verdict is that it is not a 100% sure that this is the best solution, but let's merge it for now as it brings considerable advantages at the risk we may have to change things forcefully in the future.

@sphuber sphuber merged commit 0496b5f into main Oct 19, 2022
@sphuber sphuber deleted the feature/844/builder-from-protocol-options branch October 19, 2022 09:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add options argument to all get_builder_from_protocol
3 participants