API changes for listing/marking source conversation items that have been seen by journalists #5513

Merged
merged 4 commits into from
Sep 24, 2020

Conversation

Contributor

@rmol rmol commented Sep 17, 2020

Status

Ready for review

Description of Changes

Add new API endpoint for listing or marking source conversation items that have been seen by a journalist.

Add utility method to mark a heterogeneous collection of Submission and Reply objects seen.

Add Submission.is_file and Submission.is_message to encapsulate the characterization based on filename.

Fixes #5475.
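As a rough illustration of what "characterization based on filename" could look like, here is a minimal sketch. The class name `SubmissionSketch` is hypothetical, and the filename suffixes (`msg.gpg`, `doc.gz.gpg`, `doc.zip.gpg`) are assumptions about the naming convention, not taken from this PR's diff.

```python
class SubmissionSketch:
    """Hypothetical stand-in for the Submission model; only the filename matters here."""

    def __init__(self, filename: str) -> None:
        self.filename = filename

    @property
    def is_file(self) -> bool:
        # Assumed suffixes for uploaded documents.
        return self.filename.endswith(("doc.gz.gpg", "doc.zip.gpg"))

    @property
    def is_message(self) -> bool:
        # Assumed suffix for text messages.
        return self.filename.endswith("msg.gpg")
```

Keeping the suffix checks in two properties means callers can ask `submission.is_file` instead of repeating string tests on the filename.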

Testing

  • Run the dev server with make dev.

  • Get an API token as the journalist user:

    export TOKEN=$(curl -o- -X POST -H "Content-Type: application/json" --data '{"username":"journalist","passphrase":"correct horse battery staple profanity oil chewy","one_time_code":"047889"}' http://localhost:8081/api/v1/token | jq -r .token)
    
  • List current submissions:

    curl -X GET -H "Content-Type: application/json" -H "Authorization: Token $TOKEN" http://localhost:8081/api/v1/submissions
    

    All should have a seen_by property, currently an empty list.

  • Mark a message seen:

    curl -X POST -H "Content-Type: application/json" -H "Authorization: Token $TOKEN" --data '{"messages": ["e711d29d-d0a1-440f-80e4-6642b77ec3ea"]}' http://localhost:8081/api/v1/seen
    

    Replace the UUID with one from one of the messages in the response from the /submissions endpoint, of course.

  • Mark the same message seen again. There should be no error.

  • Retrieve /submissions again. The message's seen_by list should now contain your journalist account's UUID.

  • Visit the source of that message in the journalist interface. The message you marked should not be listed in bold text. Clicking "Select unread" should not select that message.

  • Visit the source list in the journalist interface. Select the source of the message you marked and click "Download Unread". The zip file you receive should not contain the seen message.

  • Via the source interface, upload a file, then retrieve /submissions again, and mark the file seen:

    curl -X POST -H "Content-Type: application/json" -H "Authorization: Token $TOKEN" --data '{"files": ["4fe4b7f5-3031-4651-9092-888951681cf7"]}' http://localhost:8081/api/v1/seen
    

    Replace the UUID with that of the file you just uploaded, of course.

  • Repeat the checks in the journalist interface that you performed for a seen message, confirming that the file you marked seen is always considered read.

  • Get another API token, this time as the dellsberg journalist account.

  • Retrieve the list of all replies:

    curl -X GET -H "Content-Type: application/json" -H "Authorization: Token $TOKEN" http://localhost:8081/api/v1/replies
    
  • Mark one of them read:

    curl -X POST -H "Content-Type: application/json" -H "Authorization: Token $TOKEN" --data '{"replies": ["8924a691-3b0f-45c1-bfed-b0b91e264855"]}' http://localhost:8081/api/v1/seen
    
  • Hit /replies again and confirm that the dellsberg account UUID appears in the reply's seen_by list.
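For anyone who prefers to script the `/seen` calls above rather than type them, the following is a minimal sketch using only the Python standard library. The helper name `build_seen_request` is hypothetical; the URL, headers, and the `files` / `messages` / `replies` keys mirror the curl commands in this test plan.

```python
import json
import urllib.request
from typing import Iterable

API_ROOT = "http://localhost:8081/api/v1"  # dev server address used in the test plan


def build_seen_request(
    token: str,
    files: Iterable[str] = (),
    messages: Iterable[str] = (),
    replies: Iterable[str] = (),
) -> urllib.request.Request:
    """Build (but do not send) a POST to /seen for the given item UUIDs."""
    body = {}
    for key, uuids in (("files", files), ("messages", messages), ("replies", replies)):
        uuids = list(uuids)
        if uuids:  # only include the keys that were actually supplied
            body[key] = uuids
    return urllib.request.Request(
        f"{API_ROOT}/seen",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Token {token}",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send it to the running dev server; sending the same request twice should not produce an error, per the step above.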

Deployment

Depends on the database changes in #5505.

Checklist

If you made changes to the server application code:

  • Linting (make lint) and tests (make test) pass in the development container

If you made non-trivial code changes:

  • I have written a test plan and validated it for this PR

@rmol rmol marked this pull request as draft September 18, 2020 13:57
@sssoleileraaa sssoleileraaa force-pushed the 5474-seen-tables branch 16 times, most recently from 1c3e733 to ee10a74 Compare September 22, 2020 23:17
Contributor

@sssoleileraaa sssoleileraaa left a comment

did a first pass

securedrop/create-dev-data.py (resolved)
securedrop/journalist_app/api.py (outdated, resolved)
@sssoleileraaa
Contributor

sssoleileraaa commented Sep 23, 2020

Not sure how the API is going to look once you're done, but just as a reminder, we were also considering updating the current replies and submissions endpoints as an option:

[Option 2] Update current endpoints

"replies": [
        {
            "filename": <filename>,
            ...
            "seen_by": [<journalist_uuid>, ]
        },
        {
            "filename": <filename>,
            ...
            "seen_by": [<journalist_uuid>, ]
        },
]

"submissions": [
        {
            "is_read": <is_read>,
            "filename": <filename>,
            ...
            "seen_by": [<journalist_uuid>, ],
            "opened_by": [<journalist_uuid>, ]
        },
        {
            "is_read": <is_read>,
            "filename": <filename>,
            ...
            "seen_by": [<journalist_uuid>, ],
            "opened_by": [<journalist_uuid>, ]
        },
]
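To make the proposed shape concrete, here is a small sketch of how a client might consume items that carry a `seen_by` list. The function name `filenames_seen_by` is hypothetical, and it assumes the JSON has already been parsed into a list of dicts with the `filename` and `seen_by` keys shown above.

```python
from typing import Any, Dict, List


def filenames_seen_by(items: List[Dict[str, Any]], journalist_uuid: str) -> List[str]:
    """Return the filenames of items whose seen_by list contains journalist_uuid."""
    return [
        item["filename"]
        for item in items
        # Treat a missing seen_by key the same as an empty list.
        if journalist_uuid in item.get("seen_by", [])
    ]
```

The same helper would work for entries under either `replies` or `submissions`, since both carry `filename` and `seen_by` in this proposal.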

@sssoleileraaa
Contributor

sssoleileraaa commented Sep 23, 2020

just a thought, we could rename "is_read" to "downloaded_from_web" or "seen_from_web"

@rmol
Contributor Author

rmol commented Sep 23, 2020

> Not sure how the API is going to look once you're done, but just as a reminder, we were also considering updating the current replies and submissions endpoints as an option

That is already present, as suggested in the test plan. Look at the to_json methods of Submission and Reply in models.py.

@rmol
Contributor Author

rmol commented Sep 23, 2020

> just a thought, we could rename "is_read" to "downloaded_from_web"

I know we've been playing pretty fast and loose with the API and versioning (at least it's mostly been additions 😐 ), but is_read is in production. It's possible that someone is using the API for something other than the workstation, so we should at least pretend we care about API stability.

@rmol rmol force-pushed the 5475-unseen-academicals branch from 078a022 to ecfe91f Compare September 23, 2020 17:47
@lgtm-com

lgtm-com bot commented Sep 23, 2020

This pull request introduces 2 alerts when merging ecfe91f into 3d13faf - view on LGTM.com

new alerts:

  • 1 for Unused import
  • 1 for Variable defined multiple times

@rmol rmol force-pushed the 5475-unseen-academicals branch from ecfe91f to b16332e Compare September 23, 2020 18:00
@lgtm-com

lgtm-com bot commented Sep 23, 2020

This pull request introduces 2 alerts when merging b16332e into 3d13faf - view on LGTM.com

new alerts:

  • 1 for Unused import
  • 1 for Variable defined multiple times

@rmol rmol force-pushed the 5475-unseen-academicals branch from b16332e to 5bdc7e0 Compare September 23, 2020 18:08
@lgtm-com

lgtm-com bot commented Sep 23, 2020

This pull request introduces 1 alert when merging 5bdc7e0 into 3d13faf - view on LGTM.com

new alerts:

  • 1 for Variable defined multiple times

@rmol rmol force-pushed the 5475-unseen-academicals branch 3 times, most recently from d24f4dd to c98374b Compare September 23, 2020 22:17
@rmol rmol marked this pull request as ready for review September 23, 2020 22:19
@lgtm-com

lgtm-com bot commented Sep 23, 2020

This pull request introduces 3 alerts when merging c98374b into 3d13faf - view on LGTM.com

new alerts:

  • 3 for Unused import

@rmol rmol force-pushed the 5475-unseen-academicals branch from c98374b to 38e866c Compare September 23, 2020 22:31
@conorsch conorsch mentioned this pull request Sep 23, 2020
5 tasks
Allie Crevier and others added 4 commits September 23, 2020 20:11
Add new API endpoint for listing or marking source conversation items
that have been seen by a journalist.

Add utility method to mark a heterogeneous collection of Submission
and Reply objects seen.

Add Submission.is_file and Submission.is_message to encapsulate the
characterization based on filename.

With the addition of journalist_app.utils.mark_seen, it can replace
the helper methods in tests/utils/db_helper.py, and the ad hoc logic
to mark things seen in journalist_app/col.py.
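
A utility that marks a mixed collection of submissions and replies as seen could be sketched roughly as follows. This is an in-memory illustration only, not the code in journalist_app/utils.py: the dataclasses, the `(kind, item id, journalist id)` tuples, and the `seen` set are all hypothetical stand-ins for the real models and seen tables.

```python
from dataclasses import dataclass
from typing import Iterable, Set, Tuple, Union


@dataclass(frozen=True)
class Submission:
    id: int
    is_file: bool  # False means the submission is a message


@dataclass(frozen=True)
class Reply:
    id: int


SeenRow = Tuple[str, int, int]  # (kind, item id, journalist id); hypothetical layout


def mark_seen(
    targets: Iterable[Union[Submission, Reply]],
    journalist_id: int,
    seen: Set[SeenRow],
) -> None:
    """Record that journalist_id has seen every target, dispatching on its type."""
    for target in targets:
        if isinstance(target, Submission):
            kind = "file" if target.is_file else "message"
        elif isinstance(target, Reply):
            kind = "reply"
        else:
            raise TypeError(f"Cannot mark {target!r} seen")
        # Adding to a set makes repeated calls harmless, so marking the
        # same item twice does not raise an error or create a duplicate.
        seen.add((kind, target.id, journalist_id))
```

In a database-backed version, the set would presumably be replaced by inserts guarded against duplicate rows.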
@rmol rmol force-pushed the 5475-unseen-academicals branch from 38e866c to 3542b67 Compare September 24, 2020 00:11
@rmol rmol requested review from conorsch and emkll as code owners September 24, 2020 00:11
@rmol rmol changed the base branch from 5474-seen-tables to develop September 24, 2020 00:12
@rmol rmol changed the title from "[WIP] API changes for listing/marking source conversation items that have been seen by journalists" to "API changes for listing/marking source conversation items that have been seen by journalists" Sep 24, 2020
Contributor

@sssoleileraaa sssoleileraaa left a comment

first pass through the code - going to start manually testing

@@ -212,7 +211,23 @@ def __init__(self, source: Source, filename: str) -> None:
     def __repr__(self) -> str:
         return '<Submission %r>' % (self.filename)

-    def to_json(self) -> 'Dict[str, Union[str, int, bool]]':
+    @property
+    def is_file(self) -> bool:
Contributor

thought about doing this myself in #5505 but the pr was already so large. nice change.

    def to_json(self) -> "Dict[str, Union[str, int, bool]]":
        seen_by = {
            f.journalist.uuid for f in SeenFile.query.filter(SeenFile.file_id == self.id)
            if f.journalist
Contributor

i see, so this won't include the fact that there is a deleted journalist who saw a file. gonna keep reviewing and will cycle back to this. could be an issue because we still send "deleted" uuids for replies when journalists are deleted. this takes a different course.

Contributor Author

It does not. I didn't see a sensible way to include those records in seen_by here. We do have Submission.seen, which will reflect them. Until we start keeping deleted journalists around, we can't distinguish between those records in the seen tables anyway. Since the only purpose of those rows at this point is to establish the global seen state, I thought Submission.seen was the right answer.

Contributor

There are also replies we have to worry about which don't have a global seen state, but I definitely see why you made this choice. Agreed that implementing #5467 + global user account for data migration will make this work fine.

Contributor Author

Replies do have a global seen state: if one exists, its creator has seen it. SeenReply can tell us who else has seen it, but only if journalist_id is not null, so again, I didn't see any point in including records where it is.
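
The distinction discussed in this thread, between a per-journalist `seen_by` list and a global seen state, can be illustrated with a small in-memory sketch. `SeenRecord`, `seen_by_uuids`, and `globally_seen` are hypothetical names; a `None` journalist stands in for a seen-table row whose journalist has been deleted.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class SeenRecord:
    item_id: int
    journalist_uuid: Optional[str]  # None once the journalist account is deleted


def seen_by_uuids(records: List[SeenRecord], item_id: int) -> Set[str]:
    """UUIDs of existing journalists who saw the item; orphaned rows are skipped."""
    return {
        r.journalist_uuid
        for r in records
        if r.item_id == item_id and r.journalist_uuid is not None
    }


def globally_seen(records: List[SeenRecord], item_id: int) -> bool:
    """True if anyone, including a since-deleted journalist, saw the item."""
    return any(r.item_id == item_id for r in records)
```

Under these assumptions, an item seen only by a deleted journalist has an empty `seen_by` set but still counts as seen globally.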

Contributor

@conorsch conorsch left a comment

Detailed test plan checks out, functionality matches expectations. Visual review of the diff was smooth, as well. As discussion during review indicates, we should queue up some "v2" refinements for the API.

Development

Successfully merging this pull request may close these issues.

Add API endpoints for seen_by
3 participants