Skip to content

Commit

Permalink
Fix TUS offset queries in production environments
Browse files Browse the repository at this point in the history
Previously, `mod_wsgi` would convert `HEAD` requests into `GET`, which
would be rejected, so clients were unable to resume an upload that failed
midway through.

To make use of this, update the SDK code to enable upload resumption.
  • Loading branch information
SpecLad committed Oct 31, 2022
1 parent 16c4c46 commit d85658d
Show file tree
Hide file tree
Showing 3 changed files with 40 additions and 12 deletions.
2 changes: 2 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -61,6 +61,8 @@ non-ascii paths while adding files from "Connected file share" (issue #4428)
(<https://github.com/opencv/cvat/issues/5048>)
- Changing an object causes current z layer to be set to the maximum (<https://github.com/opencv/cvat/pull/5145>)
- Create manifest with cvat/server docker container command (<https://github.com/opencv/cvat/pull/5172>)
- Fixed upload resumption in production environments
(<https://github.com/opencv/cvat/issues/4839>)

### Security
- TDB
Expand Down
38 changes: 26 additions & 12 deletions cvat-sdk/cvat_sdk/core/uploading.py
Original file line number Diff line number Diff line change
Expand Up @@ -13,6 +13,7 @@
import urllib3

from cvat_sdk.api_client.api_client import ApiClient, Endpoint
from cvat_sdk.api_client.exceptions import ApiException
from cvat_sdk.api_client.rest import RESTClientObject
from cvat_sdk.core.helpers import StreamWithProgress, expect_status
from cvat_sdk.core.progress import ProgressReporter
Expand Down Expand Up @@ -211,18 +212,31 @@ def get_offset(self):
This is different from the instance attribute 'offset' because this makes an
http request to the tus server to retrieve the offset.
"""
# FIXME: traefik changes HEAD to GET for some reason, and it breaks the protocol

# Assume we are starting from scratch. This effectively disallows us to resume
# old file uploading
return 0

# resp = self._api_client.rest_client.HEAD(self.url, headers=self.headers)
# offset = resp.headers.get("upload-offset")
# if offset is None:
# msg = "Attempt to retrieve offset fails with status {}".format(resp.status_code)
# raise tus_uploader.TusCommunicationError(msg, resp.status_code, resp.content)
# return int(offset)
try:
resp = self._api_client.rest_client.HEAD(self.url, headers=self.headers)
except ApiException as ex:
if ex.status == 405: # Method Not Allowed
# In CVAT up to version 2.2.0, HEAD requests were internally
# converted to GET by mod_wsgi, and subsequently rejected by the server.
# For compatibility with old servers, we'll handle such rejections by
# restarting the upload from the beginning.
return 0

raise tus_uploader.TusCommunicationError(
f"Attempt to retrieve offset failed with status {ex.status}",
ex.status,
ex.body,
) from ex

offset = resp.headers.get("upload-offset")
if offset is None:
raise tus_uploader.TusCommunicationError(
f"Attempt to retrieve offset failed with status {resp.status}",
resp.status,
resp.data,
)

return int(offset)

# Add headers required by CVAT server
headers = {}
Expand Down
12 changes: 12 additions & 0 deletions mod_wsgi.conf
Original file line number Diff line number Diff line change
Expand Up @@ -2,3 +2,15 @@ LoadModule xsendfile_module /usr/lib/apache2/modules/mod_xsendfile.so
XSendFile On
XSendFilePath ${HOME}/data/
XSendFilePath ${HOME}/static/

# The presence of an Apache output filter (mod_xsendfile) causes mod_wsgi
# to internally convert HEAD requests to GET before passing them to the
# application, for reasons explained here:
# <http://blog.dscpl.com.au/2009/10/wsgi-issues-with-http-head-requests.html>.
# However, we need HEAD requests passed through as-is, because the TUS
# protocol requires them. It should be safe to disable this functionality in
# our case, because mod_xsendfile does not examine the response body (it
# either passes it through or discards it entirely based on the headers),
# so it shouldn't matter whether the application omits the body in response
# to a HEAD request.
WSGIMapHEADToGET Off

0 comments on commit d85658d

Please sign in to comment.