Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Retry fixing blocked UI on client start. #944

Merged
merged 8 commits into from
Mar 25, 2020

Conversation

ntoll
Copy link
Contributor

@ntoll ntoll commented Mar 17, 2020

Description

Fixes #846 and replaces #870.

While rebasing #870 with current master I got into all sorts of difficulties. As a result I abandoned the work and re-created the work from #870 but based from current master. I also fixed a couple of issues to do with deleting sources.

Please see the conversation on #870 for the discussion of the approach of this fix. This PR simply exists to avoid rebase hell.

Test Plan

The unit tests have been copied over from the original PR. I've also removed some source deletion based unit tests which no longer worked because the approach to deletion is completely different. Please check the tests are comprehensive and exercise the code in the correct way.

I also tested this branch with 1000 sources and it works as expected.

Checklist

If these changes modify code paths involving cryptography, the opening of files in VMs or network (via the RPC service) traffic, Qubes testing in the staging environment is required. For fine tuning of the graphical user interface, testing in any environment in Qubes is required. Please check as applicable:

  • I have tested these changes in the appropriate Qubes environment
  • I do not have an appropriate Qubes OS workstation set up (the reviewer will need to test these changes)
  • These changes should not need testing in Qubes

If these changes add or remove files other than client code, packaging logic (e.g., the AppArmor profile) may need to be updated. Please check as applicable:

  • I have submitted a separate PR to the packaging repo
  • No update to the packaging logic (e.g., AppArmor profile) is required for these changes
  • I don't know and would appreciate guidance

securedrop_client/gui/widgets.py Outdated Show resolved Hide resolved
Copy link
Contributor

@sssoleileraaa sssoleileraaa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also if you have trouble with the self.source_widgets[source.uuid].update() being part of the new-sources-to-add loop, I think it should be moved to the existing-widgets loop and changed to list_widget.update() -- you'll see what I mean when you get there or you'll come up with a better solution.

@ntoll
Copy link
Contributor Author

ntoll commented Mar 18, 2020

@creviera 👍 on re-using this code but with modification given the more recent changes that have happened on master.

@sssoleileraaa
Copy link
Contributor

@ntoll i think this change looks good after you respond to #944 (comment)

@ntoll
Copy link
Contributor Author

ntoll commented Mar 25, 2020

@creviera / @redshiftzero please never make me rebase anything like this again. 😛 Please review... I have:

  • simplified the recursive function signature
  • ensured the chunked method of populating the source list only happens if there are NO sources in the source list. For all other states, the current behaviour is retained.
  • updated various tests to reflect the above changes.
  • screamed, "GIT! Why GIT, why...?" on several occasions as I tried to unpick / rebase / check / test and generally get frustrated with computers.

That's it..!

securedrop_client/gui/widgets.py Show resolved Hide resolved
securedrop_client/gui/widgets.py Outdated Show resolved Hide resolved
securedrop_client/gui/widgets.py Outdated Show resolved Hide resolved
@sssoleileraaa
Copy link
Contributor

I did a first pass at the code and am moving on to the test plan to test with 1000 sources, and will track benchmark results as I go, and then can follow up with a more thorough code review.

@redshiftzero
Copy link
Contributor

added a commit with requested changes, let's now test this thoroughly ensuring that:

  • ordering of the source list matches the JI
  • there are no other regressions

btw I think there are other scalability issues with set_snippet which we'll need to investigate (existing on master).

Copy link
Contributor

@sssoleileraaa sssoleileraaa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here are my test results so far. I'm still testing.


First test

First I wanted to see how long updating the source list would take if there were already source widgets in the source list that had yet to be removed:

STR

  1. NUM_SOURCES=5 make dev
  2. run the client see the 5 sources in the list and keep it open
  3. NUM_SOURCES=1000 make dev (with the client still open)

Expected

Once the server is rebooted with 1000 sources, the next sync will start pulling down new sources.

Actual

The sync times out on a thousand sources, even after a client restart, the sync times out on a 1000 sources. Local database still only has 5 sources.

Next test

Eventually I just deleted the local database and restarted the client. Here's what I saw:

  • The black screen no longer appeared for the first ~5 seconds
  • The "Nothing to see just yet!" screen showed for the first ~53 seconds and then the 1000 sources showed up in the source list with the skeleton messages, which start updating on-the-fly, from bottom to top (this is a separate issue we need to fix so that we start decrypting top to bottom).
  • The next sync timed out so that the source list no longer updated but messages continue to be decrypted and to replace skeleton messages.
  • After minutes of waiting for the sync to recover, I decided it wouldn't recover and restarted the client.The sync timed out again. Messages were no longer decrypted until the next sync, so they were stuck as skeletons.

Here's how long updating the source list took in between syncs, for some reason it started increasing in duration:

(in seconds)

3.5008578300476074
3.46498441696167
3.5008578300476074
3.7121920585632324
5.3824193477630615
5.535599946975708
(the next sync times out)

@sssoleileraaa
Copy link
Contributor

Okay so I'm going to create at least two followup issues: (1) for the sync timeout issue I mentioned in my test results, and (2) for the issue when old widgets haven't been removed yet from the source list and we end up using a slower add method.

I'm still looking into why it takes ~53 seconds before we see the source list populated.

@sssoleileraaa
Copy link
Contributor

I'm still looking into why it takes ~53 seconds before we see the source list populated.

The plan is to address the slowness issue (issue still needs to be created I think) outside of this PR. Looks like the failing test opens up two client instances when you run it by itself, which is not happening on master, so looking into why this is happening.

@sssoleileraaa
Copy link
Contributor

Update: Turns out he multiple client windows is a result of when a test marked as flaky fails and runs again.

@sssoleileraaa sssoleileraaa self-requested a review March 25, 2020 22:58
Copy link
Contributor

@sssoleileraaa sssoleileraaa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm with the functional test change

@redshiftzero redshiftzero merged commit 0b304e9 into freedomofpress:master Mar 25, 2020
@ntoll
Copy link
Contributor Author

ntoll commented Mar 26, 2020

Folks... many many thanks for the comprehensive testing and checking. I really appreciate it. 👍 Here's hoping nobody ever has 1000s of sources..! 😉

FWIW, perhaps in future, when folks are working on the same piece of code, it'd be good to coordinate in some way (stacked branches or small incremental changes done in a clearly defined order with clearly defined roles for who covers what aspect of the work..?) so we don't end up with the rebase hell I found myself in yesterday (it took several attempts to get it right, and as @creviera noticed, there were still some parts of the code I'd missed to remove). Having said that, our difference in time-zones and remote working didn't help the situation, and I think we're in a relatively good place considering where we could have been.

Stonking stuff..! 🚀

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Client starts with blacked out / blocked UI with many sources
4 participants