Export: support partial depseudonymization when deducing pseudo rules from dataset metadata #4

Open
kschulst opened this issue Apr 24, 2021 · 0 comments

Comments

kschulst commented Apr 24, 2021

Right now, if we specify that the export should depseudonymize, depseudonymization is applied for all pseudo rules that are provided. The rules are supplied through the pseudoRules parameter, which accepts a list of pseudo rules (name, pattern, func), each of which potentially matches multiple fields.
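To illustrate how a single pseudo rule can match multiple fields, here is a minimal sketch. It assumes that a rule's pattern is a glob matched against field names; the `PseudoRule` class, the `matching_fields` helper, the field names and the func string are all hypothetical and not taken from the actual implementation.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase


@dataclass(frozen=True)
class PseudoRule:
    name: str     # rule name
    pattern: str  # ASSUMPTION: a glob matched against field names
    func: str     # pseudo function identifier (hypothetical value below)


def matching_fields(rule, fields):
    """Return the fields that the rule's pattern matches."""
    return [f for f in fields if fnmatchcase(f, rule.pattern)]


# Hypothetical rule and dataset fields, for illustration only.
rule = PseudoRule("id-numbers", "*fnr", "hypothetical-func(key1)")
fields = ["person/fnr", "person/spouse/fnr", "person/name"]

# One rule, two matching fields: both would be depseudonymized today.
print(matching_fields(rule, fields))
```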

In the export endpoint, if we don't explicitly specify which pseudo rules to use, we try to retrieve the rules from the dataset metadata. Deducing pseudo rules from the dataset metadata is presumably going to be the main use case. However, for these cases:

  • we don't have any mechanism to specify that only a subset of the rules should be applied
  • we don't have any mechanism to specify that only a subset of the fields should be depseudonymized

Thus, the suggestion is to introduce two new parameters: pseudoRulesFilter and pseudoFieldsFilter.

To summarize, depseudonymization during export would be specified by the following parameters:

  • pseudoRules - if not present, then deduce these from the dataset path
  • pseudoRulesPath - optional explicit path to deduce pseudo rules from (Export: support retrieving pseudo rules from another dataset path #2)
  • pseudoRulesFilter - a list of named pseudo rules that should be considered
  • pseudoFieldsFilter - a list of globs that address the fields that should be considered. Gives the user more control over which fields get depseudonymized, since a pseudo rule might match multiple fields
  • depseudo - whether or not the export should depseudonymize. Only required if pseudo rules should be deduced from the dataset path and no pseudo filters have been specified. If either of the above parameters is present, then the export should assume this property to be true.
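The intended interplay of the two filters and the depseudo flag could look roughly like the sketch below. It is one possible reading of the proposal, not the actual implementation: the function names are hypothetical, rules are assumed to be already deduced from the dataset metadata, patterns and filters are assumed to be globs, an empty filter is treated as "not specified", and "first matching rule wins" is an assumption made here.

```python
from fnmatch import fnmatchcase


def effective_depseudo(depseudo, rules_filter, fields_filter):
    """A filter being present implies depseudo=True (per the proposal)."""
    if rules_filter or fields_filter:
        return True
    return bool(depseudo)


def fields_to_depseudonymize(fields, rules, rules_filter=None,
                             fields_filter=None, depseudo=None):
    """Map each selected field to the func of the rule that matched it.

    `rules` is a list of (name, pattern, func) tuples.
    """
    if not effective_depseudo(depseudo, rules_filter, fields_filter):
        return {}
    selected = {}
    for name, pattern, func in rules:
        # pseudoRulesFilter: only consider the named rules.
        if rules_filter and name not in rules_filter:
            continue
        for field in fields:
            if field in selected:
                continue  # ASSUMPTION: first matching rule wins
            if not fnmatchcase(field, pattern):
                continue
            # pseudoFieldsFilter: only consider fields matching some glob.
            if fields_filter and not any(
                    fnmatchcase(field, g) for g in fields_filter):
                continue
            selected[field] = func
    return selected


# Hypothetical rules and fields, for illustration only.
rules = [("id-numbers", "*fnr", "func-a"), ("names", "*name", "func-b")]
fields = ["person/fnr", "person/spouse/fnr", "person/name"]

print(fields_to_depseudonymize(fields, rules, rules_filter=["names"]))
print(fields_to_depseudonymize(fields, rules, fields_filter=["person/spouse/*"]))
```

With neither filter nor depseudo set, the sketch selects nothing. Setting only pseudoFieldsFilter narrows a rule that matches several fields down to the ones the user actually wants depseudonymized.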