Store large objects hashes in the link object payload #263

Closed
roman-khimov opened this issue May 30, 2023 · 4 comments
Labels
enhancement Improving existing functionality I2 Regular impact S2 Regular significance U3 Regular

@roman-khimov
Member

  1. No matter which limit we choose in "Object header size limitation" (#262), it will be too low. We can't have arbitrarily large headers, and even a 4M header only allows for 8T objects (4M of 32-byte hashes is 131072 children of 64M each). Yes, that's big, but an HDD image can easily be larger than that. And a 4M header is already impractical to work with.
  2. Typical object payloads are 64M, and a link payload of that size is enough for a roughly 128T object. That seems good enough for today, even though it doesn't blow your mind.
  3. While I still think link objects should be simple, we can at least theoretically make them large as well, effectively removing this limit altogether.
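
The figures in the list above can be checked with a short calculation. This is only a sketch: it assumes "M" and "T" mean MiB and TiB, that each child hash takes 32 bytes, and that every child part is a full 64M. The name `max_object_size` is made up for illustration.

```python
HASH_SIZE = 32   # bytes per child object hash (assumed)
MiB = 1 << 20
TiB = 1 << 40

def max_object_size(hash_storage_bytes: int, part_size: int = 64 * MiB) -> int:
    """Largest object addressable when child hashes fill hash_storage_bytes."""
    children = hash_storage_bytes // HASH_SIZE
    return children * part_size

print(max_object_size(4 * MiB) // TiB)    # hashes in a 4M header -> 8
print(max_object_size(64 * MiB) // TiB)   # hashes in a 64M link payload -> 128
```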

We can't break old objects, so the split header can remain (and arguably it is more efficient for smaller objects with 2-50 parts), but we need a new link object format with hashes stored in its payload. We can even do this without many changes to the wire protocol: just specify that a missing "children" list means the hashes are encoded in the payload. Encode them without any wrappers, as plain 32-byte values one after another.
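
A minimal sketch of what that encoding could look like, assuming 32-byte hashes. The function names (`encode_children`, `decode_children`, `children_of`) are hypothetical and not part of any existing API; the fallback rule mirrors the "missing children list" idea above.

```python
import hashlib

HASH_SIZE = 32  # bytes per child hash

def encode_children(hashes):
    """Concatenate child hashes into a link payload, no wrappers."""
    for h in hashes:
        if len(h) != HASH_SIZE:
            raise ValueError("child hash must be exactly 32 bytes")
    return b"".join(hashes)

def decode_children(payload):
    """Split a link payload back into 32-byte child hashes."""
    if len(payload) % HASH_SIZE != 0:
        raise ValueError("payload length is not a multiple of 32")
    return [payload[i:i + HASH_SIZE] for i in range(0, len(payload), HASH_SIZE)]

def children_of(header_children, payload):
    """Old objects keep the header list; an empty list means 'look in the payload'."""
    if header_children:
        return list(header_children)
    return decode_children(payload)

# Example: three made-up child hashes survive a round trip.
hashes = [hashlib.sha256(bytes([i])).digest() for i in range(3)]
payload = encode_children(hashes)
print(len(payload))                          # 96
print(children_of([], payload) == hashes)    # True
```

With this layout the payload length alone gives the number of children, so no count field is needed.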

@cthulhu-rider
Contributor

Overall looks good, but

Encode them without any wrappers, plain 32 bytes values one by one

will tie our hands if the format needs to be extended in the future. If for some reason there is no desire to define a clear structure for this data (which I consider penny-pinching rather than a justified saving), then let's at least reserve the first byte as a marker: let 0 mean hash concatenation.
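
As a rough illustration of the marker-byte idea, assuming 32-byte hashes and that marker value 0 is the only one defined so far (the names `FORMAT_CONCAT`, `encode_link_payload` and `decode_link_payload` are hypothetical):

```python
HASH_SIZE = 32      # bytes per child hash
FORMAT_CONCAT = 0   # marker: the rest of the payload is concatenated hashes

def encode_link_payload(hashes):
    """Prefix the concatenated hashes with a one-byte format marker."""
    for h in hashes:
        if len(h) != HASH_SIZE:
            raise ValueError("child hash must be exactly 32 bytes")
    return bytes([FORMAT_CONCAT]) + b"".join(hashes)

def decode_link_payload(payload):
    """Read the marker first, then parse the body according to it."""
    if not payload:
        raise ValueError("empty link payload")
    marker, body = payload[0], payload[1:]
    if marker != FORMAT_CONCAT:
        # A future format would get its own branch here.
        raise ValueError(f"unsupported link payload format {marker}")
    if len(body) % HASH_SIZE != 0:
        raise ValueError("body length is not a multiple of 32")
    return [body[i:i + HASH_SIZE] for i in range(0, len(body), HASH_SIZE)]

hashes = [bytes([i]) * HASH_SIZE for i in range(2)]
print(decode_link_payload(encode_link_payload(hashes)) == hashes)  # True
```

The cost is a single byte per link object, and a decoder can reject formats it doesn't know instead of misreading them as hashes.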

@roman-khimov
Member Author

This can be tied to the version of the object itself.

@cthulhu-rider
Contributor

It really can. I don't think it's more convenient, but it's better than nothing.

@roman-khimov
Member Author

See also #264.

@roman-khimov roman-khimov added enhancement Improving existing functionality U4 Nothing urgent S2 Regular significance I2 Regular impact U3 Regular and removed U4 Nothing urgent labels Dec 20, 2023
@carpawell carpawell self-assigned this Jan 24, 2024
carpawell added a commit that referenced this issue Jan 24, 2024
It describes future protocol version's link object payload. Child objects list
will be moved from the header to the payload. This is done due to the header
size restrictions. Closes #263.

Signed-off-by: Pavel Karpy <[email protected]>
carpawell added a commit that referenced this issue Jan 25, 2024
@roman-khimov roman-khimov added this to the v2.16.0 milestone Jan 30, 2024
carpawell added a commit that referenced this issue Jan 31, 2024
carpawell added a commit that referenced this issue Jan 31, 2024
carpawell added a commit that referenced this issue Feb 2, 2024