Investigate behaviour when files fail to download #1633

Open
rocodes opened this issue Mar 2, 2023 · 9 comments
Labels: bug (Something isn't working), needs/reproducing

@rocodes
Contributor

rocodes commented Mar 2, 2023

Description

With stormy Tor weather, users may be experiencing more file download failures than usual, and there are some reports that the file download error is not appearing (file shows "Downloading" forever).

Filing to remind myself to investigate since this may be a regression.

Possibly related: #1095

Other relevant issues in the "improve download experience" list:

@rocodes rocodes self-assigned this Mar 2, 2023
@rocodes rocodes added bug Something isn't working needs/reproducing labels Mar 2, 2023
@rocodes
Contributor Author

rocodes commented Mar 3, 2023

I was initially excited because @creviera pointed me to the timeout logic in the client and reminded me that we may need to adjust it, and that the estimated timeout values could be why users report success downloading files in the Tails Journalist Interface but not in the client.

However, I've just done some very quick testing with files as large as 50MB, and aside from sync failures, where the server becomes fully unreachable (and it may be time to restart tor and hope for a better circuit), I have stayed well inside the download timeout window, even for 50MB files.

I will look into this further next week, since files in the 10-50MB range should be very manageable for the client. We may also want to implement more sophisticated download/retry logic.
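For context, a size-aware download timeout generally has the following shape. This is a minimal sketch with made-up constants and a hypothetical function name, not the client's actual logic:

```python
import math

# Illustrative assumptions, not the client's real values: a worst-case
# sustained Tor throughput, plus a floor so small files still get a sane timeout.
ASSUMED_MIN_BYTES_PER_SECOND = 50_000
MINIMUM_TIMEOUT_SECONDS = 25

def estimate_download_timeout(size_in_bytes: int) -> int:
    """Scale the timeout with file size instead of using one fixed value."""
    return max(MINIMUM_TIMEOUT_SECONDS,
               math.ceil(size_in_bytes / ASSUMED_MIN_BYTES_PER_SECOND))

# e.g. estimate_download_timeout(50_000_000) == 1000 (seconds, for a 50MB file)
```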

@rocodes
Contributor Author

rocodes commented Mar 7, 2023

Update: after looking more closely at relevant recent Tor metrics, I do think we can alleviate some pain by adjusting the timeout length, and I will put in a PR to do this. Going forward, there are other improvements we could make (backoff strategy? better error reporting?) - I'm still going to investigate the issue of error notifications not being shown, to see if I can repro.
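To illustrate the backoff idea mentioned above, here is a hedged sketch (hypothetical names, not code from the client) of retrying a failing download with exponentially increasing waits:

```python
import time

class DownloadError(Exception):
    """Stand-in for whatever exception the transport raises on a failed fetch."""

def download_with_backoff(fetch, max_attempts: int = 4, base_delay: float = 2.0):
    """Call fetch() until it succeeds, doubling the wait after each failure."""
    for attempt in range(max_attempts):
        try:
            return fetch()
        except DownloadError:
            if attempt == max_attempts - 1:
                raise  # out of retries; surface the failure to the UI
            time.sleep(base_delay * 2 ** attempt)  # waits 2s, 4s, 8s, ...
```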

@hoyla

hoyla commented Oct 8, 2023

Hi, might I ask if the issue of error notifications is being investigated? I can't remember ever seeing a "download failed" message, and I am still seeing a high proportion of large downloads showing as "downloading" without ever completing (even after hours - way beyond any likely timeout threshold). cf. #1451

With no progress indicator or evidence that a download has stopped, and no option to cancel the current download attempt, the only action available to me is to assume that it has failed, then quit and relaunch the app to either reattempt the download or discover that the file has in fact downloaded after all.

I should note that download via the old web-based Journalist interface is consistently more reliable - and, of course, using that I can see both progress and failure. So this is unlikely to be simply down to bad Tor connections on my side.

@rocodes
Contributor Author

rocodes commented Oct 13, 2023

Hi @hoyla - implementing progress and cancel options for downloads is still on our radar, and some sort of UX improvement is tentatively scoped for the next SDW release. But unfortunately, I'm not sure how quickly it will happen. We're a small crew these days and have recently been focusing on the upcoming SecureDrop server release, but hoping to put more time towards this and other workstation issues towards the end of the year. I'm sorry it's giving you grief.

@rocodes
Contributor Author

rocodes commented Jul 18, 2024

Automatic retries for stalled/failed downloads have been implemented and released: #2006

The next steps are to add progress/speed indications (#1104) and the ability to cancel downloads (#1568).
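For context on what the progress work involves at the transport level, here is a sketch assuming a plain chunked byte stream, with hypothetical names; the client's real download path goes through the SDK and Tor:

```python
from typing import BinaryIO, Callable, Iterable

def stream_to_file(
    chunks: Iterable[bytes],
    out: BinaryIO,
    total_bytes: int,
    on_progress: Callable[[int, int], None],
) -> None:
    """Write a download chunk by chunk, reporting (received, total) after each.

    A cancellation check would slot into the same loop, which is one reason
    progress (#1104) and cancel (#1568) tend to be built together.
    """
    received = 0
    for chunk in chunks:
        out.write(chunk)
        received += len(chunk)
        on_progress(received, total_bytes)
```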

@rocodes rocodes added this to the 0.16.0 milestone Sep 3, 2024
@rocodes
Contributor Author

rocodes commented Sep 3, 2024

Since this is a ticket about investigating file download failure behaviour, I am including the following user reports.

  • Larger files fail to download more frequently, and the problem increases with file size. It is significant enough to warrant concentrating more testing on larger (>5MB) file downloads.
  • Sometimes, very large files fail to download but the UI does not report "Download failed." (This makes for a nasty UX when combined with the lack of a download progress indicator.)
  • There is an additional issue with restoring download functionality after a failed download, which I will file as a separate ticket.

@deeplow
Contributor

deeplow commented Sep 3, 2024

Reproduction attempt. Client version 0.13.1. To reproduce, I uploaded an 11MB file through the Source Interface and then followed each of the scenarios below. Is there an easy server modification / test that triggers failure for one particular file? (One possible approach is sketched after the scenarios.)

Scenario 1: shut down server

  • client application:
    • Clicked "Download file"
    • It showed "Downloading" with the "loading ring" on the left.
  • [shut down SecureDrop server]
  • client application
    • after some seconds, a message popped up at the top informing me that the file had failed to download. However, the "DOWNLOADING" text remained, with the little animation.
    • [restarted server -- it reset because of make dev-tor, so I didn't manage to try downloading a second file]

Scenario 2: disconnect from network on sys-net

  • [started the SecureDrop server again]
  • client application:
    • Clicked "Download file"
    • It showed "Downloading" with the "loading ring" on the left.
  • [disconnected from network on sys-net]
  • client application:
    • after some seconds, a message popped up at the top informing me that the server could not be reached. However, the "DOWNLOADING" text remained with the little animation.
  • [reconnected to network]
    • after some seconds, a message popped up at the top informing me that the server could not be reached (again). The "DOWNLOADING" text remained with the little animation.

Scenario 3: pause VM running server

The result was the same as scenario 2.
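On the question above about an easy way to trigger a failure for one particular file: absent a server-side hook, one local approximation is a stand-in HTTP endpoint that truncates the response body mid-stream. This is a hypothetical sketch, not part of the SecureDrop test suite:

```python
# Serve a file but close the connection halfway through, simulating a
# stalled/failed download for whatever client is pointed at it.
from http.server import BaseHTTPRequestHandler, HTTPServer

PAYLOAD = b"x" * 11_000_000  # roughly the 11MB test file used above

class TruncatingHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Length", str(len(PAYLOAD)))
        self.end_headers()
        self.wfile.write(PAYLOAD[: len(PAYLOAD) // 2])  # stop at 50%
        self.connection.close()  # client sees an incomplete body

if __name__ == "__main__":
    HTTPServer(("127.0.0.1", 8000), TruncatingHandler).serve_forever()
```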

@legoktm
Member

legoktm commented Sep 3, 2024

From the original description:

and there are some reports that the file download error is not appearing (file shows "Downloading" forever).

I'll have a PR for this shortly.

legoktm added a commit that referenced this issue Sep 3, 2024
DownloadDecryptionException is specially handled in
Controller.on_file_download_failure(), so if we raise a different
exception class, the download animation will never be stopped because
file_missing is never emitted.

It is worth a proper re-examination of our exception hierarchy and
reducing the amount of indirection in the download flow, but that can be
done outside of a hotfix.

More debug logging is added to trace when slots are triggered.

Refs #1633.
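Paraphrasing the commit message as code, the failure mode looks roughly like this; a sketch using the names from the message above, not the actual Controller source:

```python
class DownloadDecryptionException(Exception):
    """Stand-in for the client's exception class of the same name."""

class ControllerSketch:
    """Illustrates the control flow the commit message describes."""

    def on_file_download_failure(self, exception: Exception) -> None:
        if isinstance(exception, DownloadDecryptionException):
            self._emit_file_missing()  # the UI stops the "Downloading" spinner
        # Any other exception class skips the branch above: file_missing is
        # never emitted and the spinner runs forever - hence the hotfix
        # raises the expected class so this path is always taken on failure.

    def _emit_file_missing(self) -> None:
        print("file_missing emitted")  # placeholder for the Qt signal
```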
@legoktm
Member

legoktm commented Sep 3, 2024

Since this is a ticket about investigating file download failure behaviour, I am including the following user reports.

  • Larger files fail to download more frequently, and the problem increases with file size. It is significant enough to warrant concentrating more testing on larger (>5MB) file downloads.

There are multiple reasons this could fail, and unfortunately the code has only gotten more complicated since we added partial retries to help mitigate this. I think we ultimately need better logging that admins can provide to us, giving us indications of where the failures are coming from.

  • Sometimes, very large files fail to download but the UI does not report "Download failed." (This makes for a nasty UX when combined with the lack of a download progress indicator.)

#2196 is kind of related, but doesn't actually address this case. I think our exception hierarchy is too complex, and as a result some exceptions slip through or are handled differently, causing inconsistent UX. I think this is a good task for 0.16.0.
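One shape a simplification could take - purely a hypothetical sketch, not a design from the maintainers - is a single base class carrying a user-facing message, so the UI handles every failure through one path:

```python
class DownloadFailure(Exception):
    """Hypothetical single base class for all download errors."""

    def __init__(self, user_message: str) -> None:
        super().__init__(user_message)
        self.user_message = user_message  # what the UI should display

class DecryptionFailure(DownloadFailure):
    pass

class ServerUnreachable(DownloadFailure):
    pass

# The UI then needs exactly one handler (names here are illustrative):
#   except DownloadFailure as e:
#       show_error(e.user_message)
#       stop_download_animation()
```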

zenmonkeykstop pushed a commit that referenced this issue Sep 3, 2024 (cherry picked from commit 3b183fb)
legoktm added a commit that referenced this issue Sep 3, 2024 (cherry picked from commit 3b183fb)