Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DRIVERS-2807 Update GridFS spec to add delete_by_name and rename_by_name #1702

Merged
merged 4 commits into from
Oct 31, 2024

Conversation

GromNaN
Copy link
Member

@GromNaN GromNaN commented Oct 30, 2024

Fix DRIVERS-2808 and DRIVERS-2807

Please complete the following before merging:

@GromNaN GromNaN changed the title DRIVERS-2808 DRIVERS-2807 Add specification for GridFS renameByName a… Add specification for GridFS renameByName and deleteByName Oct 30, 2024
@GromNaN GromNaN changed the title Add specification for GridFS renameByName and deleteByName Update GridFS spec to add delete_by_name and rename_by_name Oct 30, 2024
@GromNaN GromNaN changed the title Update GridFS spec to add delete_by_name and rename_by_name Update GridFS spec to add delete_by_name and rename_by_name Oct 30, 2024
@GromNaN GromNaN requested a review from alcaeus October 30, 2024 16:29
@GromNaN GromNaN marked this pull request as ready for review October 30, 2024 16:29
@GromNaN GromNaN requested a review from a team as a code owner October 30, 2024 16:29
**Implementation details:**

Drivers MUST first find the `_id` field of all files collection documents with the given filename. Drivers MUST then
delete all chunks with `files_id` in the found ids that was just deleted. Drivers MUST then delete all files collection
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't we want to delete the files documents first and then the chunks? Doing the opposite increases the chance of reading a phantom corrupt file. This is the same reason we always upload the chunks before saving the files document.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are correct. Files should be deleted before chunks to ensure there is no partial files. But if the operation that deletes chunks fails, then there is orphan chunks.

Spec updated.

@alcaeus alcaeus requested a review from ShaneHarvey October 31, 2024 09:07
@GromNaN GromNaN force-pushed the DRIVERS-2808 branch 2 times, most recently from c1ce40a to 841a729 Compare October 31, 2024 13:05

Drivers MUST first find the `_id` field of all files collection documents with the given filename. Drivers MUST then
delete all files collection documents with the found ids. Drivers MUST then delete all chunks with `files_id` in the
found ids that was just deleted.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ids that was just deleted. -> ids that were just deleted.

Or use client.bulk_write() to delete both in one command :). We can leave that as a future optimization.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or use client.bulk_write() to delete both in one command :). We can leave that as a future optimization.

This would require server version 8.0 with the new client bulk write API to perform operations on both collections.

@ShaneHarvey ShaneHarvey changed the title Update GridFS spec to add delete_by_name and rename_by_name DRIVERS-2807 Update GridFS spec to add delete_by_name and rename_by_name Oct 31, 2024
Copy link
Member

@ShaneHarvey ShaneHarvey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM after the small grammar fix.

@GromNaN GromNaN merged commit 22f068a into master Oct 31, 2024
5 checks passed
@GromNaN GromNaN deleted the DRIVERS-2808 branch October 31, 2024 17:44
Comment on lines +857 to +858
To rename all revisions of a stored file with the specified filename, drivers SHOULD provide the method
`rename_by_name`:
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One thing I did not think about is if there is already a newer file with the new filename, then it will stay the last revision for this filename.

We can't change the date of the renamed files, as that would change the order of the revisions.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct that seems like expected behavior to me.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants