-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix: Change tablet size calculation to not depend on the right key. #5656
Conversation
In badger, the right key might be a badger specific key that Dgraph cannot understand. To deal with these keys, a table is included in the size calculation if the next table starts with the same key. Related to DGRAPH-1358 and #5408.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 1 files reviewed, 3 unresolved discussions (waiting on @manishrjain, @martinmr, and @vvbalaji-dgraph)
a discussion (no related file):
See if we can simple UT.
worker/draft.go, line 1278 at r1 (raw file):
// same predicate. // We could later specifically iterate over these tables to get their estimated sizes. previousLeft = left.Attr
Lets simplify. Something like this:
if (need to update) {
updateSize()
}
prevLeft = left.Attr
prevSize = int64(tinfo.EstimatedSz)
worker/draft.go, line 1289 at r1 (raw file):
} // The last table has not been counted. Assign it to the predicate at the left of the table. updateSize(previousLeft, previousSize)
This will always be on over-estimate, which I think is probably ok.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor comment on simplifying it.
Reviewable status: 0 of 1 files reviewed, 3 unresolved discussions (waiting on @manishrjain, @martinmr, and @vvbalaji-dgraph)
@parasssh @martinmr The For tables other than level 0, they wouldn't usually have an internal key as the right-most key. So what do we gain by this change? What is the benefit of checking for the An important thing to remember is that all the versions of a single key will be stored in a single file. So if you have 100,000 versions of |
The thought process was that So, I guess, in that case, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 1 files reviewed, 3 unresolved discussions (waiting on @manishrjain, @parasssh, and @vvbalaji-dgraph)
a discussion (no related file):
Previously, parasssh wrote…
See if we can simple UT.
Hard to do a manual test because we'd have to wait five minutes. I tested this by live loading the 1 million dataset and waiting for table sizes to be calculated.
worker/draft.go, line 1278 at r1 (raw file):
Previously, parasssh wrote…
Lets simplify. Something like this:
if (need to update) { updateSize() } prevLeft = left.Attr prevSize = int64(tinfo.EstimatedSz)
Done.
worker/draft.go, line 1289 at r1 (raw file):
Previously, parasssh wrote…
This will always be on over-estimate, which I think is probably ok.
Yes, I thought it was ok to do.
…graph-io#5656) In badger, the right key might be a badger specific key that Dgraph cannot understand. To deal with these keys, a table is included in the size calculation if the next table starts with the same key. Related to DGRAPH-1358 and dgraph-io#5408.
In badger, the right key might be a badger specific key that Dgraph
cannot understand. To deal with these keys, a table is included in
the size calculation if the next table starts with the same key.
Related to DGRAPH-1358 and #5408.
This change is