Return persisted identities in `get_request_status` view #860

seanpreston · 2022-07-12T11:19:20Z

Purpose

This PR updates the way we return identities to use the data stored in the database, rather than the cache.

Changes

Return identity data from ProvidedIdentity table
Persist identity data for privacy requests created by test fixtures

Checklist

Ticket

Fixes NA

pattisdr

Just some more cleanup needed @seanpreston

pattisdr · 2022-07-12T16:58:34Z

src/fidesops/api/v1/endpoints/privacy_request_endpoints.py

@@ -491,7 +491,7 @@ def get_request_status(
        # Conditionally include the cached identity data in the response if
        # it is explicitly requested
        for item in paginated.items:  # type: ignore
-            item.identity = item.get_cached_identity_data()
+            item.identity = item.get_persisted_identity().dict()


Also, downloading privacy requests as a CSV (above) still uses the cached identity. Both should pull from the same source, since they are supposed to be the same data in different formats. Otherwise, I can see the UI showing the identities while the identity columns come out blank when someone downloads the CSV.

Creating the request body for a webhook, creating the requests for saas configs retrieve/update statements, and feeding the initial seed data into the traversal all still use the cache, not the database.

Do we do this because it's sometimes easier to access the cache, since we don't always have a readily available session, especially in the traversal? I'm a little worried about the identity being stored in different locations, with some code pulling from one and some from the other.
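To illustrate the point about both formats sharing one source, here is a minimal sketch in which the JSON response and the CSV export go through a single helper. Everything except the method name `get_persisted_identity` is an assumption made for illustration: `FakePrivacyRequest`, `identity_for_display`, `to_json_items`, and `to_csv` are hypothetical names, not fidesops code, and the stand-in method returns a plain dict rather than an object with `.dict()`.

```python
import csv
import io
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Optional


@dataclass
class FakePrivacyRequest:
    """Hypothetical stand-in for PrivacyRequest; not the real fidesops model."""

    id: str
    # Stands in for the ProvidedIdentity data stored in the application DB.
    persisted: Dict[str, Optional[str]] = field(default_factory=dict)

    def get_persisted_identity(self) -> Dict[str, Optional[str]]:
        # Simplification: returns a plain dict instead of a schema object.
        return dict(self.persisted)


def identity_for_display(request: FakePrivacyRequest) -> Dict[str, Optional[str]]:
    """The single place where every output format obtains identity data."""
    return request.get_persisted_identity()


def to_json_items(requests: Iterable[FakePrivacyRequest]) -> List[dict]:
    """Build the items of a JSON-style response."""
    return [{"id": r.id, "identity": identity_for_display(r)} for r in requests]


def to_csv(requests: Iterable[FakePrivacyRequest]) -> str:
    """Build a CSV export from the same identity source as to_json_items."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["id", "email", "phone_number"])
    for r in requests:
        identity = identity_for_display(r)
        writer.writerow(
            [r.id, identity.get("email") or "", identity.get("phone_number") or ""]
        )
    return buffer.getvalue()
```

Because both functions call `identity_for_display`, switching the source (cache or database) in one place changes both outputs together, which is the property being asked for here.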

> Do we do this because it's easier to access the cache sometimes, we don't always have a readily available session?

For now, yes. Ideally I'd like everything to use the same source of truth for identity data, but that's a larger refactor for exactly this reason: the DB connection isn't piped into everywhere that would need it yet. I've made this ticket to be actioned as a follow-up.

pattisdr · 2022-07-12T17:09:31Z

tests/fixtures/application_fixtures.py

+    pr.cache_identity(identity_kwargs)
+    pr.persist_identity(


Bug: Policy webhooks can return derived_identities. Both PrivacyRequest.trigger_policy_webhook and privacy_request_endpoints > resume_privacy_request update the identity graph, but neither persists the data to the database; they only add it to the Redis cache.

Would it be useful to have a method that persists the identity in both the cache and the database at the same time? I'd like to avoid the mismatches we have now, where both are updated in some places and only one in others.
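One possible shape for such a combined method is sketched below, purely as an illustration. Only the method names `cache_identity` and `persist_identity` come from the fixture diff above; `PrivacyRequestSketch`, `InMemoryCache`, `InMemoryDB`, `store_identity`, and the cache key format are all hypothetical. Note that the reply that follows argues for keeping the two writes separate for now.

```python
from typing import Dict, List, Optional, Tuple


class InMemoryCache:
    """Hypothetical stand-in for the Redis cache."""

    def __init__(self) -> None:
        self.data: Dict[str, str] = {}

    def set(self, key: str, value: str) -> None:
        self.data[key] = value

    def get(self, key: str) -> Optional[str]:
        return self.data.get(key)


class InMemoryDB:
    """Hypothetical stand-in for a DB session that stores identity rows."""

    def __init__(self) -> None:
        self.rows: List[Tuple[str, str, str]] = []

    def add_identity(self, request_id: str, field_name: str, value: str) -> None:
        self.rows.append((request_id, field_name, value))


class PrivacyRequestSketch:
    """Hypothetical model; only the two method names mirror the diff above."""

    def __init__(self, request_id: str, cache: InMemoryCache, db: InMemoryDB) -> None:
        self.id = request_id
        self.cache = cache
        self.db = db

    def cache_identity(self, identity: Dict[str, Optional[str]]) -> None:
        # The key format here is an assumption, not the real fidesops format.
        for name, value in identity.items():
            if value is not None:
                self.cache.set(f"id-{self.id}-identity-{name}", value)

    def persist_identity(self, identity: Dict[str, Optional[str]]) -> None:
        for name, value in identity.items():
            if value is not None:
                self.db.add_identity(self.id, name, value)

    def store_identity(self, identity: Dict[str, Optional[str]]) -> None:
        """Hypothetical combined write: database first, then cache."""
        self.persist_identity(identity)
        self.cache_identity(identity)
```

Writing to the database before the cache means that if the first write raises, the cache is never updated with a value the database lacks. The cost, as the reply points out, is that every caller needs a DB connection.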

> Would it be useful to have a method that both persists the identity in the cache and in the database at the same time?

I'm torn here. On the one hand, it's nice to have consistency; on the other, as you rightly suggest above, it means we'll need to plumb the DB connection into more places. I wonder whether it would be better to have the execution update the cache with identity data at the very start, before the traversal, so that we can guarantee the traversal always uses what the user provided when the privacy request was created. That way the internals can still use the cache and benefit from the speed, and less refactoring is required. What do you think?

> Bug: Policy webhooks can have derived_identities returned. Neither PrivacyRequest.trigger_policy_webhook nor privacy_request_endpoints > resume_privacy_request which both update the identity graph, persist the data to the database, they only add it to the redis cache.

We should separate these concerns for now. ProvidedIdentity is useful for facilitating request search based on the exact identity provided. We don't currently want to search on derived identities or return them to the UI, so we should be careful about when we update the ProvidedIdentity table: that's what gets displayed in the UI, and it doesn't currently support anything beyond email and phone_number.

> That way the internals can still use the cache and benefit from the speed, and less refactoring is required. What do you think?

Thinking about this more, I agree. It's in line with our original design, in which we query everything up front, build the graph, and execute it. We're not regularly querying the database as we execute the traversal, which I think is good for performance.

> We should separate these concerns for now. The ProvidedIdentity is useful for facilitating request search based on the exact identity provided, we don't currently want to search based on derived identities

OK, that makes sense
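The approach agreed on in this thread (copy the persisted identity into the cache once, before the traversal starts, then let the internals read only from the cache) might look roughly like the sketch below. All names are hypothetical: `seed_identity_cache`, `traversal_seed_data`, the `load_persisted` callable, and the key format are illustrative assumptions, and a plain dict stands in for the Redis cache.

```python
from typing import Callable, Dict, Optional

IdentityLoader = Callable[[str], Dict[str, Optional[str]]]


def seed_identity_cache(
    request_id: str, load_persisted: IdentityLoader, cache: Dict[str, str]
) -> None:
    """Read the user-provided identity from the DB once and copy it to the cache."""
    identity = load_persisted(request_id)  # the only database read in this sketch
    for name, value in identity.items():
        if value is not None:
            # The key format is an assumption made for this example.
            cache[f"{request_id}:identity:{name}"] = value


def traversal_seed_data(request_id: str, cache: Dict[str, str]) -> Dict[str, str]:
    """Collect seed values for the traversal from the cache only (no DB access)."""
    prefix = f"{request_id}:identity:"
    return {
        key[len(prefix):]: value
        for key, value in cache.items()
        if key.startswith(prefix)
    }
```

With this split, only the code that kicks off execution needs a DB session; the traversal keeps its current cache-only access pattern.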

pattisdr · 2022-07-13T12:35:37Z

@seanpreston thanks for your response to my comments. This all makes sense. The one thing I would add is code comments in the places where we're writing to just the cache and not the database. I'd want to note why we're doing this and make it clear it's intentional.

pattisdr · 2022-07-13T14:45:17Z

Just waiting on the changelog!

pattisdr · 2022-07-13T15:14:37Z

CHANGELOG.md

+## Changed
+* Changed wording on Admin UI login page [#774](https://github.com/ethyca/fidesops/pull/774)
+* Fixed typos in Admin UI [#774](https://github.com/ethyca/fidesops/pull/774)
+* Update clipboard icon in Admin UI [#838](https://github.com/ethyca/fidesops/pull/838)
+* Return identity data from application DB, instead of cache [#860](https://github.com/ethyca/fidesops/pull/860)
+* Update admin ui to be served from the root route `/` [#720](https://github.com/ethyca/fidesops/pull/720)


Bad merge @seanpreston

Duplicates most of "changed" section above

Sean Preston added 4 commits July 12, 2022 13:17

persist identities for privacy requests created by fixtures, return p…

47700a5

…ersisted identity in get_request_status view

updates CHANGELOG

de53e3b

update test to not change the foreign key

fcaf76c

Merge branch 'main' into return-provided-identity

bfe2a45

seanpreston changed the title ~~Return persist identities in get_request_status view~~ Return persisted identities in get_request_status view Jul 12, 2022

seanpreston assigned pattisdr Jul 12, 2022

pattisdr reviewed Jul 12, 2022

seanpreston mentioned this pull request Jul 13, 2022

Read identity data from the app DB everywhere #867

Open

write CSVs based on persisted identities

dfa25f2

add comments to codebase explaining not to persist data

86cd310

Merge branch 'main' into return-provided-identity

c205bf1

pattisdr reviewed Jul 13, 2022

Sean Preston added 2 commits July 13, 2022 17:31

update expected return values

65ce89d

correct bad merge

f98a3ef

pattisdr approved these changes Jul 13, 2022

pattisdr merged commit 6e6011b into main Jul 13, 2022

pattisdr deleted the return-provided-identity branch July 13, 2022 15:48

sanders41 pushed a commit that referenced this pull request Sep 22, 2022

Return persisted identities in get_request_status view (#860)

7929220
