Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make Snapshots streamable by not requiring a known size up front. #601

Open
otoolep opened this issue Jun 21, 2024 · 6 comments
Open

Make Snapshots streamable by not requiring a known size up front. #601

otoolep opened this issue Jun 21, 2024 · 6 comments

Comments

@otoolep
Copy link
Contributor

otoolep commented Jun 21, 2024

https://pkg.go.dev/github.com/hashicorp/raft#SnapshotMeta

I am implementing (not for the first time) my own Snapshot Store in rqlite. I am reading a Snapshot data from an uncompressed source, but would like to compress it on the fly when a Snapshot it the data is read from the io.ReadCloser returned by Snapshot store.Open().

Since the source data is not itself compressed, but compressed bytes will be served by the io.ReadCloser, I do not know the size of the Snapshot data ahead of time. This forces me to compress the data first in a scratch area.

Any ideas for avoiding this?

@banks
Copy link
Member

banks commented Jun 21, 2024

Yeah we also jump through similar hoops in Vault currently where we actually have to read through the snapshot twice when we send it to a follower just because we don't know the size up front but have to fill in the size in the header.

It would be nicer if the InstallSnapshot RPC allowed for something like chunked encoding so that we could stream a large snapshot without knowing the size upfront but that would be a decent amount of work so the team that integrated raft into Vault choose to do the more expensive double read. It's not in a super performance critical part of the code so it's a little harder for us to prioritize the more complete fix, but if you'd be interested in proposing something I think we could make a strong case for making time to support and review it since it would impact Vault directly too!

I'm going to edit the issue title to reflect that this is a useful enhancement/feature request.

@banks banks changed the title Any way to avoid setting SnapshotMeta.Size? Make Snapshots streamable by not requiring a known size up front. Jun 21, 2024
@otoolep
Copy link
Contributor Author

otoolep commented Jun 21, 2024

Two paths occur to me:

  • some sort of chunked streaming as you say. This is probably the most polished implementation, and would be the most robust.
  • a simple flag in https://pkg.go.dev/github.com/hashicorp/raft#InstallSnapshotRequest which indicates that size should be ignored, and that transfer should be considered complete when the receiver receives EOF. In that case it would be up to the Snapshot store itself (the Sink implementation specifically) to validate the data received.

@otoolep
Copy link
Contributor Author

otoolep commented Jun 21, 2024

@banks --- in practise does Vault snapshot data ever get to 10+ GB? Do you see that in the field?

@banks
Copy link
Member

banks commented Jun 21, 2024

Yep for sure we do. And reading the snapshot from BoltDB means a bunch of disk IO if the file is larger than RAM available for page cache too so it would be a solid improvement for Vault... but typically because users with DBs that larger are using large servers too that probably do still have the entire file in page cache it's not something they perceive as a big problem performance wise. Snapshot install is also quite rare and they typically have disks. with plenty of spere IOPs because raft can only consume a relatively small number with a single serial writer...

@banks
Copy link
Member

banks commented Jun 21, 2024

should be ignored, and that transfer should be considered complete when the receiver receives EOF

This could be an option. I think we'd need to be careful to design that so that there is a trailer with a checksum etc. to ensure transient network EOF isn't mistaken for actual end of content etc. Also careful buffer size handling etc. so it's not a DOS vector, but that could work.

@otoolep
Copy link
Contributor Author

otoolep commented Jun 21, 2024

OK, thanks, let me think more about this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants