Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Dataset-scope tailoring of EnsureCommandParameterization validation #200

Closed
mih opened this issue Dec 14, 2022 · 2 comments · Fixed by #260
Closed

Dataset-scope tailoring of EnsureCommandParameterization validation #200

mih opened this issue Dec 14, 2022 · 2 comments · Fixed by #260

Comments

@mih
Copy link
Member

mih commented Dec 14, 2022

A dataset often provides a critical piece of information to be able to validate properly and in-depth. For example, the name of an existing remote can only be validated once a dataset is determined in which this remote exists.

Constraints have for_dataset() that is meant to be used for this purpose.

EnsureCommandParameterization would need to call this method for any constraint in order to obtain the most adequate variant.

It could make sense to have the validation order be significant, i.e. have no Dataset identified, until a dataset= argument is processed, only only from there on call for_dataset() on the subsequent constraints.

It may also be necessary to support an alternative name declaration for a "dataset" command, in case one and the same command has two arguments of this type.

For that reason, it may also be beneficial to ignore the name of an argument and go by constraint-type (i.e. EnsureDataset).

Combined with validation order rules (a class author sets the order of arguments in _validator_) it may be sensible to start tailoring constraints when a first dataset-type (not name) argument is found, tailor all constraints from there on, and switch the dataset for tailoring, whenever another dataset-type argument is found.

@mih
Copy link
Member Author

mih commented Feb 13, 2023

With #234 in place, the situation presents itself to me as follows:

A simple implicit or explicit order-based tailoring is not sufficient. Any particular command may take more than one Dataset-type parameter (e.g. a context dataset (something to work in), plus a dataset target (something to create/delete/modify). Any other command parameter might be related to the former or the latter. Imposing particular naming patterns on all DataLad commands is needlessly complex and likely to cause incomprehensible behavior.

Instead, I propose this:

EnsureCommandParameterization receives a new constructor argument tailor_for_datasets. The value for this parameter is a mapping of parameter names to parameter names. The keys are Dataset-type parameters that provide a context for tailoring other constraints to (see Constraint.for_dataset()), and the values are the parameters that shall have their constraints be tailored for the respective dataset context.

Based on this input, EnsureCommandParameterization can determine which constraints to evaluate first, and which constraints to tailor to what context.

@mih
Copy link
Member Author

mih commented Feb 15, 2023

I decided to not address this in #234 directly, but have that be merged first.

mih added a commit to mih/datalad-next that referenced this issue Feb 24, 2023
This exposes a, so far dormant, feature of the `Constraint` class to
parameter validation: constraint tailoring for particular datasets.

`EnsureCommandParameterization` learned a `tailor_for_dataset` parameter
that can be used to identify which parameters' constraints should be
tailored for which `Dataset` instances. Tailoring will only be actually
done under the following conditions:

- the dataset-providing parameters need to evaluate to a
  DatasetParameter instance, typically via `EnsureDataset`
  (all regular conditions apply, e.g. a default `None` not being processed
  etc)
- the to-be-tailored parameter `Constraint` needs to implement its
  `for_dataset()` method to perform a tailoring.

A test with some custom-made constraints is included. At the moment,
no production-ready constraints do implement `for_dataset()`. However,
now that a consumer for that is available, it makes more sense to
address related issues, such as:

- datalad#193
- datalad#131

Closes datalad#200
@mih mih closed this as completed in #260 Feb 24, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant