-
Notifications
You must be signed in to change notification settings - Fork 192
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Ensure that ProcessNode.get_builder_restart
fully restores all inputs including metadata inputs
#5801
Ensure that ProcessNode.get_builder_restart
fully restores all inputs including metadata inputs
#5801
Commits on Dec 14, 2022
-
ProcessBuilder
: Fix bug in_recursive_merge
The `_recursive_merge` method could raise a `KeyError` for a process builder that contains a dynamic port namespace. In this case it would be possible to merge in a value that contains a nested dictionay which would try to access directly `dictionary[key]`, where `dictionary` is the namespace of the builder, and so hit a `KeyError` since this nested namespace didn't exist yet in the dynamical namespace. The solution is to explicitly check if the `key` already exists in the builder's namespace, and if not, we assig the entire `value` to it, as there is no nothing to recurse into.
Configuration menu - View commit details
-
Copy full SHA for 3b579ae - Browse repository at this point
Copy the full SHA 3b579aeView commit details -
ProcessBuilder
: Move_prune
method to standalone utilityThe method is a generic utility to operate on mappings and is not specific to the `ProcessBuilder`. It needs to be used elsewhere soon so therefore it is moved to a standalone function in a new utility module `aiida.engine.processes.utils`.
Configuration menu - View commit details
-
Copy full SHA for 86265ae - Browse repository at this point
Copy the full SHA 86265aeView commit details -
Port
: Add theis_metadata
keywordIn the engine redesign from `v0.x` to `v1.0` the process interface needed a way to distinguish inputs that are lined up to the process node as nodes themselves and those that are stored directly on the process node, for example as an attribute. To this end the `non_db` keyword was added to the `InputPort`. This keyword was used for the `metadata` input namespace, to designate that these inputs would not be `Data` instances. The name is quite unfortunate, however, since the inputs of these ports actually do get stored in the database, contrary to what the keyword suggests. The most straightforward solution would be to rename the keyword, however, for better or worse, the keyword has been adopted by (a limited amount of) plugin packages. A known application is to pass in `Group` instances which are not storable as nodes, but can be passed through a `non_db` port. Hypothetically, users may have used the port as well to pass in sensitive data that should never be stored. Renaming the port or changing its behavior is therefore likely to break existing code. Instead, the `is_metadata` keyword is added to the `InputPort` and `PortNamespace` through the `WithMetadata` mixin. When set to `True` this keyword functions as the original intention of the `non_db` flag and an `is_metadata` port signals that its value will be stored in the database but directly on the `ProcessNode` instead of being linked as a `Data` node. The naming makes sense, because the only use of this keyword is by the `metadata` input namespace that all `Process` classes have. The inputs in this namespace are stored, through custom logic in the `Process` and `CalcJob` class, in the attributes of the node or dedicated columns of the node database model, such as the label and the description. This addition leaves the behavior of `non_db` inputs unchanged and so plugin packages that use this keyword should continue to function as before.
Configuration menu - View commit details
-
Copy full SHA for 48cf609 - Browse repository at this point
Copy the full SHA 48cf609View commit details -
Process
: Store JSON-serializable metadata inputs on the nodeThe promise of AiiDA's provenance is that all inputs to a `Process` are stored as nodes in the provenance graph linked to a node representing the process. From the very early beginning, however, there needed to be exceptions of inputs to processes that were not nodes. Notable examples were the various "options" set for calculation jobs. Even in AiiDA v0.x these "settings" as they were called back then, were stored in the attributes of the node. In the AiiDA v1.0 redesign, where all inputs of a process are defined through the process spec, these non-node inputs were implemented by allowing input ports to be made non-database storable, indicated by the `non-db` argument in the port declaration. These inputs would be passed to the `Process` instance and would be available during its lifetime, but would not be stored in the database. Once again there are exceptions as certain inputs defined by `aiida-core` are stored on the node, but in various places. Notable examples are the `label` and `description` of the process, and the `metadata.options` of the `CalcJob` class. This historical decision has as a direct result in that it is difficult if not impossible in certain cases to reconstruct the exact input dictionary that was used to run a `Process` from the data stored on the `ProcessNode`. From a provenance point of view, this is a huge weakpoint and is what is being corrected here. The input ports marked `non_db=True` on the base process classes provided by `aiida-core`, `Process` and `CalcJob` were changed to use the new `is_metadata` keyword instead in the previous commit, to remove the inconsistency between the naming and behavior. In this commit, all inputs that correspond to `is_metadata` ports and are JSON-serializable are stored in the attributes of the process node under the key `metadata_inputs`. All `is_metadata` input ports that are defined on process base classes by `aiida-core`, such as `Process` and `CalcJob` *are* JSON serializable. However, plugin packages can implement process classes with ports that accept inputs that are not JSON serializable, which is why this additional condition has to be added. But all inputs defined by `aiida-core` should be covered.
Configuration menu - View commit details
-
Copy full SHA for 06d53a6 - Browse repository at this point
Copy the full SHA 06d53a6View commit details -
ProcessBuilder
: Include metadata inputs inget_builder_restart
The `get_builder_restart` method on the `ProcessNode` base class would return a `ProcessBuilder` with the inputs set to those attached to that node instance. The `CalcJobNode` would override this to add the metadata options as well. This would be ok for `CalcJobNode`s, but if a restart builder was created from a `WorkChainNode` that calls a `CalcJobNode` it would have also received options, but those would not be restored. One could think that when calling `get_restart_builder` on a `WorkChainNode` that we can just go down the callstack, find all the `CalcJobNode`s and set the options in the respective input namespaces. But this would not exactly reproduce the original inputs, as the options that a calculation job has received could have been a modified version of the original options passed to the workchain, changed in the logic of the workchain. Instead, now that the exact metadata inputs for each process are stored in the attribute of the node, added in the previous commit, it is this dictionary that is used to restore the exact original inputs. It not only addresses the problem of incorrect `CalcJob` options, but it also restores any other metadata inputs such as the `label` and `description`.
Configuration menu - View commit details
-
Copy full SHA for d46d77b - Browse repository at this point
Copy the full SHA d46d77bView commit details