-
Notifications
You must be signed in to change notification settings - Fork 16
[#743] Store provided identity data in application database #834
Conversation
General question about this. This information gets stored for any request including erasure? It seems wrong to store personal information in the database if the request was to remove personal information. |
Yes, including erasures. I agree it's counter-intuitive, however it's required for audit reasons. Fidesops operators must be able to show which identities were processed for any type of privacy request in the event of an audit. The individual regulation will specify the timeline that the operator must be able to show history for (e.g. GDPR and CCPA will specify different timelines) so we'll need to build that into the Policy layer. Notably this PII is different to any type of data collected in the traversal, as it was provided by the user up front. Traversal data is still stored in the cache. |
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice work on this @seanpreston ! Couple small things
@@ -222,6 +231,40 @@ def cache_identity(self, identity: PrivacyRequestIdentity) -> None: | |||
value, | |||
) | |||
|
|||
def persist_identity(self, db: Session, identity: PrivacyRequestIdentity) -> None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we need to persist the identity every time we create
a privacy request, can we call this method from def create():
in this same file?
This is also more in line with our pattern of deleting the identities within the def delete()
method
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is possible, we currently don't handle identity data within that method at all, in favour of caching it separately from the model. The reason I left it that way was because we don't need the identity data in the PrivacyRequest
table, and those ORM overrides should mainly focus on what that table needs. In the event of deletion, we do need to clear the foreign keys in order to process deleted of PrivacyRequest
s.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it, OK we can leave as is for now, thanks!
class ProvidedIdentityType(EnumType): | ||
"""Enum for privacy request identity types""" | ||
|
||
email = "email" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are the enums for identity types supported by fidesops, so I'm thinking we should use the enums here, too: https://github.com/ethyca/fidesops/blob/main/src/fidesops/service/drp/drp_fidesops_mapper.py#L26
Something like:
DRP_TO_FIDESOPS_SUPPORTED_IDENTITY_PROPS_MAP: Dict[str, str] = {
"email": ProvidedIdentityType.email.value,
"phone_number": ProvidedIdentityType.phone_number.value,
}
}, | ||
) | ||
|
||
def get_persisted_identity(self) -> PrivacyRequestIdentity: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we have a way to get by hashed value yet?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great spot! I had intended to use this method but you're right — it wouldn't be useful for that because it generates a new salt
each time without the option to refer to ProvidedIdentity.salt
. Will fix 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The method now uses a static SALT
. This way we can consistently hash values we're searching for to see if they exist in a hashed form in the table.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yay, this looks good to me now.
Approved and confirming that I've checked down revision of migrations. Thanks for these changes @seanpreston ! |
* adds identity fields to PrivacyRequest model * store identity data inside database * update changelog * add identities in test data command * store identities provided via the DRP creation endpoint * black + isort * store provided identity data in request creation from onetrust * remove deprecated migration * adds new provided identity table * use new provided identity table * add docstring, remove comment * update DRP privacy request creation to use ProvidedIdentity model * update identity creation in test data command * use persisted identity in OneTrust * update test to use persisted identity * isort update * use enums * optionally receive a salt in hash_value cmd * use a constant salt for provided identity hashing * remove import * use typehints * update typedef * use enum in dict
🚨 This PR contains a migration — please check the downrev before merging
Purpose
Store provided identity data in the application DB alongside the privacy request to enable:
One concern I have is that we're not removing this data upon request completion (but are upon request deletion). This is something we'll need to build into the Policy layer, since GDPR, CCPA et al all specify different constraints around accountability and data retention.
Changes
ProvidedIdentity
table with:encrypted_value
— an encrypted column in the database which houses the value of the identityhashed_value
— a oneway hashed version of the field to be used for exact match searches later onChecklist
CHANGELOG.md
fileCHANGELOG.md
file is being appended toUnreleased
section in an appropriate category. Add a new category from the list at the top of the file if the needed one isn't already there.Run Unsafe PR Checks
label has been applied, and checks have passed, if this PR touches any external servicesTicket
Fixes #743