Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Preserve deleted journalist UUIDs in database to attribute replies #5467

Closed
emkll opened this issue Aug 26, 2020 · 16 comments
Closed

Preserve deleted journalist UUIDs in database to attribute replies #5467

emkll opened this issue Aug 26, 2020 · 16 comments
Labels
goals: journalist experience needs/discussion queued up for discussion at future team meeting. Use judiciously.

Comments

@emkll
Copy link
Contributor

emkll commented Aug 26, 2020

Description

In https://github.com/freedomofpress/securedrop/pull/5178/files, we modified the model to default to a username and uuid set to deleted when a journalist is deleted. This is because the API will return the journalist's information via the /replies endpoint.

It is therefore impossible (both at the client and the server-side) to distinguish replies sent by distinct deleted journalists.

When deleting a journalist, we should consider:

  • preserving their UUID
  • setting their username to the deleted keyword
  • clearing out all other information such as password, firstname and lastname

In the Admin interface user management page, we can either:

  • display (but grey out) deleted journalists
  • hide journalists that have their username set to deleted
    Deleted users wouldn't need to appear in the Journalist Interface, as grayed out/deactivated

User Stories

As a journalist, I would like to differenciate replies that were written by different former colleagues

@prateekj117
Copy link
Contributor

Ok. This would also require changes in #5284. I will take a look at this.

@rmol
Copy link
Contributor

rmol commented Aug 26, 2020

@prateekj117 We're going to handle this one, Prateek. It involves changes on the client as well that will have to be coordinated.

@prateekj117
Copy link
Contributor

Cool.

@rmol
Copy link
Contributor

rmol commented Aug 26, 2020

@emkll 👍 I think keeping and scrubbing the Journalist record makes sense. The only problem I can see with it is that username has a unique constraint, so we can't set it to deleted. And to distinguish replies in the client, it would be useful to not just clear out first and last names. I'd suggest populating username, first_name, and last_name randomly using our word lists.

The advantages of keeping the Journalist record are:

  • It requires less work on the client to keep journalist records in sync. Our current approach of spoofing the info of deleted journalists on replies returned in API responses has caused problems, and I think it'll be easier to resolve those if we're just modifying the Journalist records.
  • It doesn't require logic scattered around the code base to update the related entities that we keep, like Reply.
  • It will make it easier to distinguish replies from multiple deleted journalists, instead of attributing them all to the deleted account that's currently being created in the client.

The last thing I'd suggest is that we consider not scrubbing the journalist records at all, except for auth info -- replacing the ability to delete these accounts with a way to disable them. I'd be curious to know whether news organizations actually prefer being able to completely remove a journalist account, when it means losing history. Even if we're scrubbing or fully removing Journalist records, there's the possibility that journalists might identify themselves in the text of replies, which makes all of this erasure look a little silly. 🙂

@eloquence
Copy link
Member

The last thing I'd suggest is that we consider not scrubbing the journalist records at all, except for auth info -- replacing the ability to delete these accounts with a way to disable them.

Definitely worth doing at least some informal user research on. I suspect we may want to offer both - account deletion and account locking (the latter is arguably possible via creds rotation, but a hard account lock seems preferable).

To the extent that we offer deletion, it should IMO operate as such (remove/scramble account-level data). I don't have strong opinions on how to do the scrambling, but it seems to me that nulling fields that can be blanked and using a schema like <deleted-$uuid> for non-nullable fields might be easiest to reason about/query.

@eloquence
Copy link
Member

eloquence commented Sep 14, 2020

We discussed this a bit more today (between Kev, Allie, Mickael, John, Jen, and myself).

There was some concern that implementing this ticket would preserve data that the admin might reasonably expect to be removed (i.e. the relationship between an account record and all data associated with it).

We agreed broadly that it would be useful to get some more news org input into user deletion/locking, and will ping a few orgs to better understand their needs as a high level.

One possible path that emerged is to offer admins two options:

  1. Lock a user, which would preserve their information but prevent logins;

  2. Delete a user, which would fully delete the user and all data they created (i.e. replies). This would of course need to be made clearer in the UI(s) for deleting a user.

If we did this, then this ticket would become moot, as there would be no record to preserve.

@eloquence
Copy link
Member

eloquence commented Sep 24, 2020

There's an emerging consensus that we want to proceed per @emkll's original description, i.e. preserve the UUID and ID of deleted users (and preserve associations with other records such as replies and seen/unseen data). A user locking feature that would also preserve user info is something we can consider separately.

This therefore seems like a good candidate issue for SecureDrop 1.7.0 to help clean up our user deletion story across the board.

Open implementation questions I'm aware of:

  1. Do we want to hide deleted users from the Admin UI?

My view is that yes, such users should be hidden from the list of users in the admin UI as they only serve an internal tracking purpose and cannot be resurrected.

  1. Do we want to replace the username of a deleted user with a keyword, add a column, or do something else?

My sense is that an is_deleted Boolean in the journalists table may be cleaner than proliferating the use of reserved keywords; we can set other values to an empty string.

  1. Do we want to add a migration that creates a placeholder deleted user? The purpose would be to ensure that a foreign key to a journalist ID is always present.

My sense is that yes, we do want to create such a placeholder user, to ensure we can tighten referential integrity checks in future (#5503).

@eloquence eloquence added needs/discussion queued up for discussion at future team meeting. Use judiciously. and removed needs/discussion queued up for discussion at future team meeting. Use judiciously. labels Sep 30, 2020
@sssoleileraaa
Copy link
Contributor

sssoleileraaa commented Oct 8, 2020

  1. Do we want to hide deleted users from the Admin UI?

My view is also yes since there is nothing useful for an admin to do with these accounts and we should not allow deleted user info to be updated. It should all remain null or empty strings except for the uuid and id.

  1. Do we want to replace the username of a deleted user with a keyword, add a column, or do something else?

Yup, a deleted bool column that we can check versus a "deleted" username string is cleanest.

  1. Do we want to add a migration that creates a placeholder deleted user? The purpose would be to ensure that a foreign key to a journalist ID is always present.

While technically a NULL journalist_id does not violate the foreign key constraint rule, it is a way of losing referential integrity (no where is it explicit that a NULL id maps to a "deleted" UUID. This is just hard to follow.

--

I can't think of anything else to add. Since the client is already preserving deleted accounts, we can start implementation client side first. Also, we probably want to get the security review started ASAP. Okay, talk to you all more about this stuff tomorrow.

@eloquence eloquence added this to the 1.7.0 milestone Oct 8, 2020
@eloquence
Copy link
Member

We discussed this a bit more at today's tech meeting, and reached preliminary agreement that the approach described above (preserve UUIDs/IDs, blank everything else, add a deleted Boolean, create a placeholder account for legacy deleted users) makes sense. I'm therefore adding this issue to the 1.7.0 milestone.

This is contingent on a final discussion about any threat model implications. It may make sense to do a small implementation spike on this first, just to get a full shared understanding of what the proposed database/behavior changes would be.

@eloquence
Copy link
Member

We'll have to decide if this is really going to make it to 1.7.0 given the complexity; if so, we need to start implementation on it soon. Per comment in freedomofpress/securedrop-client#1143 (comment) this could be one of the changes that'll move us towards more consistent user management on SecureDrop Workstations.

@sssoleileraaa
Copy link
Contributor

It doesn't look like we've mentioned this in the issue yet, but we'll need to provide a way for admins to fully deleted shell journalist accounts that no longer have any references. We could also add this as a check when source records are deleted.

@eloquence
Copy link
Member

This is unlikely to be doable for 1.7.0 but remains high priority, so adding to 1.8.0 milestone for now.

@eloquence eloquence modified the milestones: 1.7.0, 1.8.0 Jan 6, 2021
@eloquence eloquence modified the milestones: 1.8.0, 1.9.0 Jan 25, 2021
@eloquence
Copy link
Member

Bumping to 1.9.0 for now, we'll prioritize working through the list in the order outlined by @creviera in freedomofpress/securedrop-client#1143 (comment)

@eloquence eloquence modified the milestones: 2.0.0, 2.1.0 Apr 26, 2021
@eloquence
Copy link
Member

(Bumped to 2.1.0 milestone given the number of significant changes already on 2.0.0 and the remaining time until release.)

@eloquence
Copy link
Member

We discussed this further today (2021-12-17) with @creviera @zenmonkeykstop @gonzalo-bulnes @conorsch @cfm @rocodes.

We agreed that for now we'll not preserve entire rows for deleted users, but instead associate replies with a single global deleted user. (We'll need to do that anyway for historical accounts.)

We are still interested in giving admins more fine-grained control over:

  • locking accounts
  • scrubbing user data
  • deleting accounts and associated data

But we can prioritize and threat model that separately. A single global deleted user will give us more parity between the behavior of the client and the server (making it easier to reason about expected behavior), and will allow us to enforce referential integrity on foreign keys (#5503).

We'll open a new issue for this global deleted user, and work on that in the immediate future. Closing this issue for now; we can re-open, or create a more fully scoped issue in future.

@zenmonkeykstop
Copy link
Contributor

(Moving issue out of 2.2.0 milestone as there's a PR in flight for it)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
goals: journalist experience needs/discussion queued up for discussion at future team meeting. Use judiciously.
Projects
None yet
Development

No branches or pull requests

6 participants