handling timeout or lost connection on the last chunk upload #2304

Closed
moscicki opened this issue Oct 12, 2014 · 1 comment

Comments

@moscicki
Contributor

In relation to #2290, here is an issue that requires clarification: the upload of the last chunk takes much longer than the others, and the time to complete this operation grows proportionally to the total file size (this is how the server is implemented, i.e. it concatenates the chunks). We are talking about files larger than 8 GB, so you see the point, right?

The possibility of a timeout on a proxy, or of the connection being cut for whatever reason, is hence also proportionally greater for the last chunk. This means that as a client you send the request, it may succeed internally and the final (big) file may ultimately be created, but you may still see an error back (e.g. Gateway Timeout) or Connection Lost. As a client you have no option but to try the whole chunked upload again, because you have no way of telling what happened on the server (and the chunk upload context on the server may be gone by now).

So the first thing you do is resend the last chunk.

If this succeeds you are fine: the previous upload of the last chunk genuinely failed and you simply retried it.

However, in the contrary case (the final file already created and the previous chunk-upload context on the server lost), at this point you should be expecting an error from the server, right? Is this what you foresee in the client-server protocol specification? Because you may only expect the server to throw an error back at you if you have some guarantees about the order of the chunked upload.
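
To make the ambiguity concrete, here is a minimal client-side sketch (Python with the requests library; the chunk URL and placeholder data are made up for illustration and are not the actual ownCloud naming scheme):

```python
import requests

CHUNK_URL = "https://server/remote.php/file-chunking-42-10-9"  # hypothetical URL
last_chunk = b"<bytes of the final chunk>"                      # placeholder data

def put_chunk(url, data, timeout=300):
    """PUT one chunk; return the HTTP status code, or None if the
    connection was lost / a proxy timed out before any reply arrived."""
    try:
        return requests.put(url, data=data, timeout=timeout).status_code
    except requests.RequestException:
        return None

# The last chunk triggers the server-side concatenation, which for an 8 GB
# file can easily take longer than any proxy timeout.
status = put_chunk(CHUNK_URL, last_chunk)

if status is None or status == 504:
    # Ambiguous outcome: the final file may or may not exist on the server.
    # All the current protocol lets the client do is resend the chunk...
    retry_status = put_chunk(CHUNK_URL, last_chunk)
    # ...but if the upload context is already gone, the client cannot tell
    # "file was already assembled" apart from a genuine upload failure.
```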

My take on the full picture of the chunked-upload ordering semantics (a minimal client-side sketch follows the list):

  1. client must guarantee that the first chunk is sent first and that no other chunks are sent before this one is acknowledged by the server ("BEGIN-CHUNK-UPLOAD")
  2. client may/should send chunks in order (in the future possibly out of order or in parallel)
  3. client must guarantee that the last chunk is sent after all the others have completed, and that no more chunks are sent after the last chunk (important in case of retrying a chunk upload). This marks "END-CHUNK-UPLOAD".
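
A sketch of what guarantees 1)-3) could look like on the client side (Python; the chunk URL scheme is illustrative only, not the actual protocol):

```python
import requests

def upload_chunked(base_url, chunks, timeout=300):
    """Illustration of the ordering guarantees only; URLs/naming are made up."""
    # 1) The first chunk is sent first, and nothing else goes out before the
    #    server acknowledges it ("BEGIN-CHUNK-UPLOAD").
    requests.put(f"{base_url}-0", data=chunks[0], timeout=timeout).raise_for_status()

    # 2) Middle chunks are sent in order (could be out of order or parallel
    #    in the future).
    for i in range(1, len(chunks) - 1):
        requests.put(f"{base_url}-{i}", data=chunks[i], timeout=timeout).raise_for_status()

    # 3) The last chunk is sent only after all others have completed, and
    #    nothing is sent after it ("END-CHUNK-UPLOAD").
    requests.put(f"{base_url}-{len(chunks) - 1}", data=chunks[-1],
                 timeout=timeout).raise_for_status()
```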

If guarantee 1) holds, then the server can and should throw an error if it sees a chunked-upload PUT request before "BEGIN-CHUNK-UPLOAD", and the client may implement a suitable fallback.
If guarantee 3) holds, then the server may report a protocol error, which indicates that the client did something illegal and should retry the whole upload from scratch. However, in this case there should be some error code that indicates this (and distinguishes it from a "standard" problem of uploading the last chunk). Do you have such an error code?
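
For illustration only, here is how a server could distinguish these cases with a dedicated error; returning 409 with an application-level "reason", and the context methods used, are assumptions, not the actual server API:

```python
# Hypothetical server-side handling of a chunk PUT request.
def handle_chunk_put(upload_contexts, transfer_id, chunk_no, is_last):
    ctx = upload_contexts.get(transfer_id)
    if ctx is None:
        # Chunk arrived outside any known upload context: either before
        # "BEGIN-CHUNK-UPLOAD" or after the context was already torn down.
        # A dedicated reason lets the client tell this apart from a
        # transient failure and restart the whole upload cleanly.
        return 409, {"reason": "no-active-chunked-upload"}
    if is_last and ctx.missing_chunks(before=chunk_no):   # hypothetical method
        # Last chunk sent before all others completed: protocol error.
        return 409, {"reason": "out-of-order-last-chunk"}
    ctx.store(chunk_no)                                    # hypothetical method
    return 201, {}
```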

Could you please confirm that 1), 2), 3) is the expected behaviour of the client? And what is the special error code that indicates the out-of-order protocol error (or another permanent server error state that cannot be recovered from within the current chunked-upload context)?

@ogoffart
Contributor

Duplicate of #2074.

This is really a server problem. The way we want to solve it is that the server returns an id in a header and we can query this id at regular intervals. (This is actually already implemented in one of the branches of the client.)
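
A rough sketch of what that could look like from the client side (the header name, poll endpoint, and response shape are assumptions for illustration; the real implementation lives in a client branch):

```python
import time
import requests

def upload_last_chunk_and_poll(chunk_url, data, poll_url, timeout=300):
    """Send the last chunk; if the reply carries a job id, poll it until
    the server reports that the final file has been assembled."""
    resp = requests.put(chunk_url, data=data, timeout=timeout)
    job_id = resp.headers.get("OC-JobId")      # header name is hypothetical
    if job_id is None:
        return resp.status_code                # old-style synchronous reply

    while True:
        status = requests.get(f"{poll_url}?id={job_id}", timeout=30)
        if status.ok and status.json().get("done"):   # response shape is hypothetical
            return 200
        time.sleep(5)                          # query the id at regular intervals
```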
