Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Duplicate media on upload after retry #13427

Open
ashiagr opened this issue Nov 19, 2020 · 8 comments
Open

Duplicate media on upload after retry #13427

ashiagr opened this issue Nov 19, 2020 · 8 comments

Comments

@ashiagr
Copy link
Contributor

ashiagr commented Nov 19, 2020

Expected behavior

Retrying upload for a previously upload failed image should not result in duplicate images in the media library.

Actual behavior

If you upload more than one images at once and upload fails for an image, then on retrying to upload makes duplicate copies in the media library.

(Tested on a self-hosted site. Might also be reproducible on a simple WP site.)

Steps to reproduce the behavior

  1. Launch app
  2. Open a draft post in the Gutenberg Editor
  3. Select image block
  4. Select more than one images for upload
  5. Enable airplane mode, wait for ~1 min and then disable it while images are uploading so that images upload fails for at least one of the image (I also re-surfaced the app from background once)
  6. Click retry on the failed image
  7. Wait for the upload to complete
  8. Go to media library and notice duplicate copy of the failed image

Attempt 1: https://cloudup.com/izjkk0AITDc (Duplicate media on timestamp 1:14)

Attempt 2: https://cloudup.com/iydwecO-FxC (Duplicate media on timestamp 1:53)

Tested on [Pixel3 API 27], WPAndroid [16.2-rc-1]
@malinajirka
Copy link
Contributor

This is probably not related, but I thought I'd better drop a comment just to be sure. When we were dealing with posts being duplicated after retry, we came to a conclusion that it cannot be prevented without making changes to the API.

If the app sends a POST request and goes offline, the server will still process the request. However, the app will never receive a response so it assumes the request failed. Retry will send the same request again - however, since the post doesn't exist on the server yet, it doesn't have remote ID so it's treated by the server as a different post => duplicated post gets created.

I think it'll probably not be related to this issue, since images/files upload takes a while and the device would need to go offline in the exact right moment (the last byte was sent but the server didn't send a response yet).

@ashiagr
Copy link
Contributor Author

ashiagr commented Nov 20, 2020

Thanks @malinajirka !

If the app sends a POST request and goes offline, the server will still process the request. However, the app will never receive a response so it assumes the request failed. Retry will send the same request again

This is exactly what is happening in this case. This is how I verified:

  1. Create draft post (POST_ID = 0).
  2. Select several images for upload.
  3. Wait for few secs while images are uploading, then enable airplane mode.
  4. Notice in the desktop browser that one or more of the images get uploaded in the selected site -> media section (response not received by the app as it is in the offline mode).
  5. Disable airplane mode.
  6. Wait till we know upload status for all the images (images uploaded in step 4 included in the failed upload count).
  7. Retry upload (request included images uploaded in the step 4).
  8. Wait till all images are uploaded.
  9. Go to site's media section and notice that images that were uploaded in the step 4 are duplicated.

I could see duplicate images issue even for a post with id.

duplicate images

duplicate images for post with id

we came to a conclusion that it cannot be prevented without making changes to the API.

Looks like same conclusion applies in this case.

@ashiagr ashiagr removed their assignment Jun 21, 2021
@ravishanker
Copy link
Contributor

Investigated this issue to see if anything can be done on the client side to fix this issue. Tried a couple of strategies to see if it would help

  1. Fetch the media and compare with failed media ids before uploading again
  2. Delete the media item before retry upload again

Both the approaches while may seem to help solve the issue are found to be undue overhead for the task. Need to find out if there's an easier and better option on the backend!

cc: @ashiagr, @malinajirka

@ravishanker
Copy link
Contributor

@hannahtinkler for your reference about this issue

@hannahtinkler
Copy link

Hi @ravishanker!

Fetch the media and compare with failed media ids before uploading again

With regards to this, would the app know the IDs for the failed media items? I may have misunderstood, but if we're talking about the 'ID' field in the example response then I'd expect this would be generated in MySQL so the app wouldn't be aware of it until a successful upload response is returned. You might be referring to something on the app side that is pre-determined and a fixed value though?

If you're looking for minimal-change solution, you could create the media item with an alternative status (e.g. pending or future). Once you receive a successful response back from the upload endpoint, you can make a request to update the status to publish. If the connection fails during the upload, you could make a request to the list media endpoint for items with the alternative status to see which succeeded upload. For those that succeeded, update the status to publish and don't retry the upload - for those that failed, retry the upload. It adds additional requests to the process but I'm not sure there's any solution that wouldn't add requests that wouldn't also be highly specific to this problem case.

Otherwise, I don't see a nice way to do this unless the API and app can agree on an identifier by which to reference the upload. Since you can't guarantee receiving the API response, the app would have to be responsible for setting this. If you can assign and persist a unique identifier to a media item prior to upload, you could provide a way for either the app or the API to identify if a specific item has already been uploaded, either by storing the unique identifier as postmeta on the media item and either:

  • Adding/modifying an endpoint to allow lookup by postmeta value (if there are no results, the app knows it should retry an upload)
  • Modifying the API upload behaviour to reject uploads with the same unique identifier as postmeta (if given in the request)
  • Adding an endpoint that will only upload the media item if an item with the unique identifier as meta doesn't already exist, otherwise returning the previously created record (although this feels very use-case specific and not especially RESTful 😅 )

Obviously media items have a GUID (which feels analogous to a unique identifier), but at the moment it changes per retry and there's also no behaviour in the API or database to prevent duplicate GUIDs from existing. Adding this behaviour would be likely to cause issues, which makes me think postmeta would be a better place to store a unique identifier for this purpose - I suspect the performance wouldn't be amazing though due to the volume of data a postmeta query would have to filter through.

Other options that came to mind that I like less:

  • Compare all uploads on the post with the file size and a hash of the new upload to identify if it already exists (similar to your suggestion and I agree, it seems like overhead, especially where posts might have lots of uploads)
  • Compare the file name with files already uploaded to the post to identify if it already exists (but I think the risk of people uploading different files with the same name is too high to make this viable, e.g. IMG_1.PNG, untitled.doc)
  • Breaking up the upload behaviour so that the app would create an attachment first, then upload a piece of media to it. Since you'll have the attachment ID from the first request you could check on the upload using the API before retrying, but these are more extensive changes and still risks random records being created that aren't needed (e.g. in a scenario where connection is lost during the attachment creation so the app never gets the attachment ID)

@ravishanker
Copy link
Contributor

ravishanker commented Feb 15, 2022

Looked at implementing a couple of options. Particularly, deleting image/s before uploading on retry. Also, marking status with connection error on media. Haven't been able to do it satisfactorily and other priorities have taken over meanwhile.

@erricgunawan
Copy link

Possibly related/similar issue:

When I add new image, the image will duplicate other attached images

1-star Android app review

@antonis
Copy link
Contributor

antonis commented Apr 29, 2022

I confirm that the issue is still reproducible on 19.7

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

7 participants