Export: support partial depseudonymization when deducing pseudo rules from dataset metadata #4

kschulst · 2021-04-24T04:26:33Z

Right now, if we specify that the export should depseudonymize, the depseudonymization is applied for all pseudo rules that are provided. This is done using the pseudoRules parameter, which accepts a list of pseudo rules (name, pattern, func), each of which potentially matches multiple fields.
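To illustrate why a single rule can affect several fields, here is a minimal sketch of a pseudo rule as a (name, pattern, func) triple. It assumes the pattern is a glob over field names, which this issue does not confirm; the field names and the func value are made up for the example.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase


@dataclass(frozen=True)
class PseudoRule:
    name: str     # rule name
    pattern: str  # ASSUMPTION: treated as a glob over field names
    func: str     # pseudo function identifier; an opaque string in this sketch


def matching_fields(rule, fields):
    """Return the fields whose name matches the rule's pattern."""
    return [f for f in fields if fnmatchcase(f, rule.pattern)]


# Hypothetical rule and field names, for illustration only.
rule = PseudoRule(name="ids", pattern="*fnr", func="func-a")
fields = ["fnr", "spouse_fnr", "address"]
print(matching_fields(rule, fields))
```

Here one rule matches both fnr and spouse_fnr, so applying the rule depseudonymizes both fields.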

In the export endpoint, if we don't explicitly specify which pseudo rules to use, we try to retrieve these rules from the dataset metadata. Deducing pseudo rules from the dataset metadata is presumably going to be the main use case. However, in these cases:

we don't have any mechanism to specify only a subset of the rules to be applied
we don't have any mechanism to specify only a subset of the fields to be depseudonymized

Thus, the suggestion is to introduce two new parameters: pseudoRulesFilter and pseudoFieldsFilter.

To summarize, depseudonymization during export would be specified by the following parameters:

pseudoRules - if not present, then deduce these from the dataset path
pseudoRulesPath - optional explicit path to deduce pseudo rules from (Export: support retrieving pseudo rules from another dataset path #2)
pseudoRulesFilter - a list of named pseudo rules that should be considered
pseudoFieldsFilter - a list of globs that address the fields that should be considered. This gives the user more control over which fields get depseudonymized, since a pseudo rule might match multiple fields
depseudo - whether or not the export should depseudonymize. Only required if pseudo rules should be deduced from the dataset path and no pseudo filters have been specified. If either of the above filter parameters is present, the export should assume this property to be true.
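A rough sketch of how these parameters could interact, not the actual implementation. The function name resolve_depseudo, the load_rules callback, the glob interpretation of rule patterns, and the "first matching rule wins" behaviour are all assumptions made for illustration. The sketch also reads the proposal as saying that an explicit pseudoRules list, like either filter, implies depseudo.

```python
from fnmatch import fnmatchcase
from typing import NamedTuple


class PseudoRule(NamedTuple):
    name: str
    pattern: str  # ASSUMPTION: a glob over field names
    func: str


def resolve_depseudo(fields, dataset_path, load_rules, pseudo_rules=None,
                     pseudo_rules_path=None, pseudo_rules_filter=None,
                     pseudo_fields_filter=None, depseudo=False):
    """Return {field: rule} for the fields that should be depseudonymized.

    load_rules is a hypothetical callback: path -> list of PseudoRule.
    """
    # Explicit rules or any filter imply depseudo (one reading of the proposal).
    implied = any(p is not None for p in
                  (pseudo_rules, pseudo_rules_filter, pseudo_fields_filter))
    if not (depseudo or implied):
        return {}

    # Explicit rules win; otherwise deduce them from the explicit rules path,
    # falling back to the dataset path itself.
    rules = pseudo_rules
    if rules is None:
        rules = load_rules(pseudo_rules_path or dataset_path)

    # pseudoRulesFilter: keep only the named rules.
    if pseudo_rules_filter is not None:
        rules = [r for r in rules if r.name in pseudo_rules_filter]

    plan = {}
    for field in fields:
        # pseudoFieldsFilter: skip fields that match none of the globs.
        if pseudo_fields_filter is not None and not any(
                fnmatchcase(field, glob) for glob in pseudo_fields_filter):
            continue
        for rule in rules:
            if fnmatchcase(field, rule.pattern):
                plan[field] = rule  # ASSUMPTION: first matching rule wins
                break
    return plan


# Hypothetical metadata lookup and field names, for illustration only.
def load_rules(path):
    return [PseudoRule("ids", "*fnr", "func-a"),
            PseudoRule("names", "*name", "func-b")]


fields = ["fnr", "spouse_fnr", "name", "address"]
plan = resolve_depseudo(fields, "/some/dataset", load_rules,
                        pseudo_rules_filter=["ids"],
                        pseudo_fields_filter=["spouse_*"])
print(sorted(plan))
```

With both filters set, only spouse_fnr is selected even though the ids rule also matches fnr. With no filters and depseudo left unset, nothing is depseudonymized.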

kschulst changed the title ~~Export: Support partial depseudonymization when deducing pseudo rules from dataset metadata~~ Export: support partial depseudonymization when deducing pseudo rules from dataset metadata Apr 24, 2021
