When I was working on #4839, I checked how the server would handle an upload that was interrupted (e.g. by a network disconnection), and the result was not good. The server tries to read the entire request body into memory at once:

https://github.com/opencv/cvat/blob/e5d01359aa09521c9ca87f0ec77a6dede097211b/cvat/apps/engine/mixins.py#L94
If the upload is interrupted, that read raises an exception, and all data that has already been transmitted is discarded. This is obviously suboptimal: the whole point of TUS is to support upload resumption, so if some data has been transmitted successfully, it should be written to the file, and the upload should then be able to resume from that point.
The current behavior is also problematic from another perspective: by reading the entire upload into memory, we might potentially consume an unbounded amount of memory.
I think the right thing to do here is to read the uploaded file in small chunks, and write each chunk to the destination file before handling the next one. That will solve both problems.
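For illustration, here is a minimal sketch of what the chunked approach could look like. This is not the actual CVAT code; the helper name, chunk size, and error handling are assumptions. It relies on Django's `HttpRequest` exposing a file-like `read()` method, and appends each chunk to the destination file before reading the next one, so anything written before a disconnect is preserved:

```python
from django.http.request import UnreadablePostError  # OSError subclass raised on a broken connection

# Sketch of the proposed fix, not the actual CVAT implementation.
# Assumption: the destination file already exists (TUS creates it when the
# upload is initiated), and `request` is a Django HttpRequest.
CHUNK_SIZE = 1024 * 1024  # 1 MiB per read; the exact size is an assumption

def append_request_body(request, file_path, offset):
    """Append the request body to file_path starting at offset.

    Returns the number of bytes actually written, so the TUS offset can be
    advanced even if the connection drops partway through the request.
    """
    bytes_written = 0
    with open(file_path, 'r+b') as f:
        f.seek(offset)
        while True:
            try:
                chunk = request.read(CHUNK_SIZE)
            except OSError:
                # The client disconnected mid-request; keep what was already
                # written so the client can resume from the new offset.
                break
            if not chunk:
                break
            f.write(chunk)
            bytes_written += len(chunk)
    return bytes_written
```

Because each chunk is written out before the next one is read, memory use stays bounded by the chunk size regardless of how large the request body is, and a disconnect only loses the chunk that was in flight.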
> The current behavior is also problematic from another perspective: by reading the entire upload into memory, we might potentially consume an unbounded amount of memory.
Okay, that wasn't quite accurate: the amount of memory is bounded by the DATA_UPLOAD_MAX_MEMORY_SIZE Django setting, which is currently set to 100 MiB. Still, we could avoid consuming even that much by reading the request body in chunks.
Come to think of it, this also means that we don't accept TUS upload requests with a chunk size greater than 100 MiB. That's not great, but also not terrible, since neither the UI nor the SDK should send chunks bigger than that.
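For context, the setting in question looks roughly like this (the value is taken from the discussion above; treat it as illustrative rather than a quote of the actual settings file):

```python
# settings.py (illustrative): Django buffers the whole body in memory when
# request.body is accessed, and rejects larger bodies outright. Reading the
# body incrementally via request.read() avoids buffering it all at once.
DATA_UPLOAD_MAX_MEMORY_SIZE = 100 * 2 ** 20  # 100 MiB
```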