Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding retry feature to mobile apiFetch handler #1686

Merged
merged 8 commits into from
Jan 31, 2020
Merged

Conversation

etoledom
Copy link
Contributor

@etoledom etoledom commented Dec 16, 2019

This PR is a simple native-only solution to retry networking calls when they fail with unknown errors. Fixing point 2 of #1593 .

The idea is to catch the error before it returns to the caller, and retrying it if the error is unknown.

There are many kind of errors (both 4xx and 5xx) and we probably don't want to retry, so I decided to go simplistic and focus on caching the most common errors on mobile, that is no internet connection or time out for bad connectivity.

This approach has the following problems (with our only networking call so far [image object] as an example). Consider this scenario:

  • An image block requested the image object and a network call was triggered.
  • In the case the network call failed because of network issues, we will retry it.
  • In the mean time, the Image block was deleted from the post.
  • ⚠️ apiFetch has no way to know that this request is not needed anymore, and will continue retrying forever (until the request succeeds or Gutenberg is closed).
  • To solve this, I decided to add a maximum amount of retries (20 for now).
  • ⚠️ After all the retries attempt fails, there's no way to get the image object for this particular image ever again in the current Gutenberg session.

I think this is still better than never retrying anything, but it's also not the ideal.

The values I have chosen out of trial and error:

  • Retry each 2 seconds.
  • 20 retries max.

Not big feelings about those, but it seemed good enough to not send requests too often, and keep retrying for a decent amount of time.


Android side work left to do

UPDATE: Android side has been added

For this to work, the error returned from the parent apps needs to include the HTTP Status code. WPiOS needed a small change allow that.

I noticed that WPAndroid doesn't return status code either, so we will need a fix for this on there too. @mchowning could you take a look at it?

To test:

PR submission checklist:

  • I have considered adding unit tests where possible.
  • I have considered if this change warrants user-facing release notes and have added them to RELEASE-NOTES.txt if necessary.

@etoledom etoledom added the [Type] Enhancement Improves a current area of the editor label Dec 16, 2019
@etoledom etoledom added this to the 1.20 milestone Dec 16, 2019
@etoledom etoledom self-assigned this Dec 16, 2019
@etoledom etoledom marked this pull request as ready for review December 16, 2019 18:11
@etoledom etoledom requested a review from mchowning December 16, 2019 18:11
@mchowning
Copy link
Contributor

I noticed that WPAndroid doesn't return status code either, so we will need a fix for this on there too. @mchowning could you take a look at it?

On it! 😄

@SergioEstevao SergioEstevao self-requested a review December 18, 2019 21:56
Copy link
Contributor

@SergioEstevao SergioEstevao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Test this on WP-IOS main app and it's working correctly! :shipit:

@koke
Copy link
Member

koke commented Dec 19, 2019

I'm not sure how effective this would be. I think the underlying problem is that Gutenberg would never make a second attempt, rather than needing a retry system on the API layer. With this, if a request fails it will retry 20 times every two seconds, so if you're offline when this starts, and don't get back online within 40 seconds, we're back to where we started. I think the changes we need should happen at the store level, so it doesn't give up on resolving the first time it fails.

Also, this is only OK now because we're discarding everything except path and assuming a GET request, but this would be dangerous if we started supporting any POST request, which shouldn't be blindly retried.

If we want to implement a network-level retry, I'd probably integrate it with reachability and/or use an incremental delay (like we do on PingHubManager, although that's for establishing a connection).

As a side note, we probably should update our fetch handler to behave like window.fetch():

A fetch() promise only rejects when a network error is encountered (which is usually when there’s a permissions issue or similar). A fetch() promise does not reject on HTTP errors (404, etc.). Instead, a then handler must check the Response.ok and/or Response.status properties.

I see some other middleware that relies on having access to the response status as well, so it might bite us in the future.

@etoledom
Copy link
Contributor Author

etoledom commented Dec 19, 2019

Thank you @koke for your comment!

I think the changes we need should happen at the store level, so it doesn't give up on resolving the first time it fails.

I agree with that! That's where I first started experimenting. But the current architecture made it quite challenging (from what I understood).

@youknowriad proposed a middleware based solution. This is not a middleware but close. And probably the same problems we have with this approach, we will have with a middleware on apiFetch.

I still think that a store level solution is the best option, but that probably will take more people involved and (maybe?) quite bigger changes.

I'd probably integrate it with reachability and/or use an incremental delay

That sounds like a great improvement to what is proposed here! Indeed 40 seconds of retrying might not be enough on many cases. Ideally we will still cover unstable or intermittent connectivity and timeouts.

@mchowning
Copy link
Contributor

mchowning commented Dec 20, 2019

👋 @etoledom

I have pushed the Android side work and opened related PRs for WPAndroid and FluxC. PR description has also been updated to reflect that.

In addition, I also merged in the latest from develop in order to include the Android crash fix.

@etoledom etoledom modified the milestones: 1.20, 1.22 Jan 20, 2020
@etoledom
Copy link
Contributor Author

etoledom commented Jan 30, 2020

Added an extra delay on every retry cycle. Now it will simply start with 1 second delay, and adding one second on every retry. I think this is enough for this first stage but always open to suggestions.

Also returning the error object directly in case we encounter an error code between 400 and 600, instead of rejecting.

@etoledom
Copy link
Contributor Author

Updated WPiOS and WPAndroid with the latest commit.

@etoledom
Copy link
Contributor Author

@mchowning would you mind reviewing the latest changes? Thanks!

Copy link
Contributor

@mchowning mchowning left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! 👍

@etoledom etoledom merged commit 772b610 into develop Jan 31, 2020
@youknowriad youknowriad deleted the issue/network-retry branch February 3, 2020 08:26
@hypest
Copy link
Contributor

hypest commented Feb 6, 2020

Check that image doesn't load

On Android (am not able to test it on iOS as it seems that, at least on the v1.22.0 release branches, the WPiOS Media Library won't allow selecting an image while offline), I see an empty/white box in the image block. I wonder if we can improve the UX here and display a better icon and/or message about the problem (no connectivity, though we should be able to put a version of the image, the one the media library has localy?). Can you open a ticket if there's none already @mchowning ?

screenshot-1580993920933

@designsimply
Copy link
Contributor

I tested with WPiOS 14.2.0.2 TestFlight beta and everything works well for me except I'm not sure how to find the image size settings. Should image size settings be available in 14.2. beta?

IMG_3333

Tested with WPiOS 14.2.0.2 TestFlight beta on iPhone 6S iOS 13.3.1.

@mchowning
Copy link
Contributor

mchowning commented Feb 20, 2020

I tested with WPiOS 14.2.0.2 TestFlight beta and everything works well for me except I'm not sure how to find the image size settings. Should image size settings be available in 14.2. beta?

👋 @designsimply ! The image size options were still behind a dev flag in 14.2 (which uses Gutenberg-Mobile 1.22.1 ), so they would have been hidden in release/beta builds. We have changed that for the Gutenberg-Mobile 1.23.0 release, so you should start seeing those image size options in the 14.3 release builds.

@mchowning
Copy link
Contributor

I see an empty/white box in the image block. I wonder if we can improve the UX here and display a better icon and/or message about the problem (no connectivity, though we should be able to put a version of the image, the one the media library has localy?). Can you open a ticket if there's none already @mchowning ?

👋 @hypest ! First of all, let's just pretend that responding to your (now 14-day old) comment didn't completely slip my mind. 😞

In testing the failure to download scenario I'm not seeing the behavior you are describing. In both the current Google Store beta build and 1.23 builds (both release and debug), I'm seeing the following when I select an image in my media library while in airplane mode. Do you have any idea what we might be doing differently?

Although the UX I'm seeing certainly isn't as good as showing the thumbnail we had downloaded for the media library, it's a lot better than what you were seeing.

Screen Shot 2020-02-20 at 5 45 57 PM

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
[Type] Enhancement Improves a current area of the editor
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants