Concurrent downloads of remote media can lead to orphaned files in storage providers. #8692

Open
matrixbot opened this issue Dec 18, 2023 · 0 comments

matrixbot commented Dec 18, 2023

This issue has been migrated from #8692.


When running multiple media workers, it is possible for two workers to end up both fetching and persisting the same remote media. matrix-org/synapse#8682 fixed this so that it a) no longer throws an error to the client and b) deletes the duplicated media from disk. However, by that point the media is already queued up to be uploaded to S3.

We could change it to upload to storage providers only after persisting to the DB. However, that runs the risk of having media in the DB that hasn't been uploaded to a storage provider if, e.g., the worker dies halfway through the upload.

I think the "fix" here is to add a delete_file function to the storage provider interface that can be called to delete a file (or cancel its upload if it hasn't finished uploading).
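
To make the proposal concrete, here is a rough sketch of what a delete_file hook could look like. This is not Synapse's actual storage provider class: the simplified StorageProvider interface, the InMemoryProvider class, the upload_delay parameter and the demo paths are all hypothetical stand-ins, and a real provider (e.g. for S3) would need to issue a remote delete instead of touching a local set.

```python
import asyncio
from abc import ABC, abstractmethod


class StorageProvider(ABC):
    """Simplified, hypothetical stand-in for a storage provider interface."""

    @abstractmethod
    async def store_file(self, path: str) -> None:
        """Upload the file at `path` to the backing store."""

    @abstractmethod
    async def delete_file(self, path: str) -> None:
        """Delete `path`, or cancel its upload if it is still in flight."""


class InMemoryProvider(StorageProvider):
    """Toy provider that 'uploads' by adding the path to a set after a delay."""

    def __init__(self, upload_delay: float = 0.0) -> None:
        self.files: set[str] = set()
        self._uploads: dict[str, asyncio.Task] = {}
        self._cancelled: set[str] = set()
        self._delay = upload_delay

    async def _upload(self, path: str) -> None:
        await asyncio.sleep(self._delay)  # stands in for the network transfer
        self.files.add(path)

    async def store_file(self, path: str) -> None:
        task = asyncio.create_task(self._upload(path))
        self._uploads[path] = task
        try:
            await task
        except asyncio.CancelledError:
            # Swallow the cancellation only if delete_file asked for it.
            if path not in self._cancelled:
                raise
        finally:
            self._uploads.pop(path, None)
            self._cancelled.discard(path)

    async def delete_file(self, path: str) -> None:
        task = self._uploads.get(path)
        if task is not None:
            # Upload still running: cancel it so nothing is written later.
            self._cancelled.add(path)
            task.cancel()
        # Upload already finished (or never happened): remove any stored copy.
        self.files.discard(path)


async def demo() -> set[str]:
    provider = InMemoryProvider(upload_delay=0.05)

    # Case 1: the duplicate was fully uploaded before we noticed.
    await provider.store_file("remote/a")
    await provider.delete_file("remote/a")

    # Case 2: the duplicate is still uploading when we notice.
    upload = asyncio.create_task(provider.store_file("remote/b"))
    await asyncio.sleep(0.01)
    await provider.delete_file("remote/b")
    await upload

    return provider.files


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

The worker that loses the race to persist the media would call delete_file for its own copy, so neither a finished nor an in-flight upload leaves an orphan behind in the provider.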

matrixbot changed the title from "Dummy issue" to "Concurrent downloads of remote media can lead to orphaned files in storage providers." on Dec 21, 2023
matrixbot reopened this on Dec 21, 2023