-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Optimize mutation and delta application #2987
Conversation
… information about posting list version to detect if we are writing to a lower version. And create bigger fewer txns. This change leads to a 50% drop in latency (from 39ms to 19.5ms).
} | ||
// tags = append(tags, tag.Upsert(x.KeyError, perr.Error())) | ||
ms := x.SinceInMilliseconds(start) | ||
ostats.RecordWithTags(context.Background(), tags, x.LatencyMs.M(ms)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Error return value of ostats.RecordWithTags
is not checked (from errcheck
)
…. So, no assumptions must be made about the order in which they get executed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 4 of 9 files at r1, 3 of 5 files at r2.
Reviewable status: 7 of 8 files reviewed, 4 unresolved discussions (waiting on @manishrjain)
posting/list.go, line 80 at r2 (raw file):
} func (l *List) MaxVersion() uint64 {
So far MaxVersion is only used inside the posting package so it'd be better to call it maxVersion so that it's not exported.
We can always rename it back to MaxVersion if it turns out other packages need it.
posting/mvcc.go, line 106 at r2 (raw file):
txn.Unlock() // TODO: Simplify this. All we need to do is to get the PL for the key, and if it has the
Is this TODO still accurate?
posting/mvcc.go, line 112 at r2 (raw file):
var idx int for idx < len(keys) {
What's the purpose of having the writer.Update statement wrapped in a for loop?
At first, I thought it was so that if writer.Update fails, you can retry from the point the previous call
of writer.Update stopped but I see that CommitToDisk returns if writer.Update returns an error.
I think a short comment explaining the reason would be helpful.
worker/draft.go, line 33 at r2 (raw file):
ostats "go.opencensus.io/stats" tag "go.opencensus.io/tag"
package is already called tag (https://github.com/census-instrumentation/opencensus-go/blob/master/tag/context.go#L16) so you can just do
"go.opencensus.io/tag"
or perhaps you meant to call this package otag
like that rest of the packages imported from opencensus.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 4 of 8 files reviewed, 4 unresolved discussions (waiting on @martinmr)
posting/list.go, line 80 at r2 (raw file):
Previously, martinmr (Martin Martinez Rivera) wrote…
So far MaxVersion is only used inside the posting package so it'd be better to call it maxVersion so that it's not exported.
We can always rename it back to MaxVersion if it turns out other packages need it.
Done.
posting/mvcc.go, line 106 at r2 (raw file):
Previously, martinmr (Martin Martinez Rivera) wrote…
Is this TODO still accurate?
Nope. Removed.
posting/mvcc.go, line 112 at r2 (raw file):
Previously, martinmr (Martin Martinez Rivera) wrote…
What's the purpose of having the writer.Update statement wrapped in a for loop?
At first, I thought it was so that if writer.Update fails, you can retry from the point the previous call
of writer.Update stopped but I see that CommitToDisk returns if writer.Update returns an error.I think a short comment explaining the reason would be helpful.
Add a comment. writer.Update could return early if txn gets too big. In that case, we would still have to process the remaining keys.
worker/draft.go, line 33 at r2 (raw file):
Previously, martinmr (Martin Martinez Rivera) wrote…
package is already called tag (https://github.com/census-instrumentation/opencensus-go/blob/master/tag/context.go#L16) so you can just do
"go.opencensus.io/tag"
or perhaps you meant to call this package
otag
like that rest of the packages imported from opencensus.
Done. Looks like it was auto-imported like this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
golangcibot is still complaining about the return value of ostats.RecordWithTags
not being used but I'll let you decide what to do with that warning.
Reviewable status: 4 of 8 files reviewed, all discussions resolved (waiting on @martinmr)
No need to check that error. |
Mutation and Txn Commit/Abort application happen in a serial order (among other less frequent operations). So, the latency of each operation has a cascading effect on the overall latency of transaction. This PR optimizes mutation application by dividing up the task of each mutation proposal among goroutines. For a batch of 2000 NQuads, this would be divided up among 4 goroutines. For commits, this PR makes 2 changes. It removes relying upon a Badger lookup to determine if the commit is being done at a lower version than the latest version of the key, instead, it picks this information up from the posting list. Secondly, it switches how txns are run to create fewer larger transactions instead of one txn per key-value. With this PR, mutate latency goes from original 13.5ms to 7.9ms (>40% drop). Commit latency goes from 39ms to 19.7ms (50% drop). Either of these changes shouldn't have any side-effects and should be NOOPs in terms of functionality. To make these measurements, this PR also adds metrics around each of these functions. BREAKING: With these changes, the mutations within a single call are rearranged. So, no assumptions must be made about the order in which they get executed. Changes: * Allow mutations to happen concurrently * Optimize how txns get committed. In particular, use the already known information about posting list version to detect if we are writing to a lower version. And create bigger fewer txns. This change leads to a 50% drop in latency (from 39ms to 19.5ms). * Bring >= back. * Unlock in defer dumbass. * Address Martin's comments
Mutation and Txn Commit/Abort application happen in a serial order (among other less frequent operations). So, the latency of each operation has a cascading effect on the overall latency of transaction. This PR optimizes mutation application by dividing up the task of each mutation proposal among goroutines. For a batch of 2000 NQuads, this would be divided up among 4 goroutines. For commits, this PR makes 2 changes. It removes relying upon a Badger lookup to determine if the commit is being done at a lower version than the latest version of the key, instead, it picks this information up from the posting list. Secondly, it switches how txns are run to create fewer larger transactions instead of one txn per key-value. With this PR, mutate latency goes from original 13.5ms to 7.9ms (>40% drop). Commit latency goes from 39ms to 19.7ms (50% drop). Either of these changes shouldn't have any side-effects and should be NOOPs in terms of functionality. To make these measurements, this PR also adds metrics around each of these functions. BREAKING: With these changes, the mutations within a single call are rearranged. So, no assumptions must be made about the order in which they get executed. Changes: * Allow mutations to happen concurrently * Optimize how txns get committed. In particular, use the already known information about posting list version to detect if we are writing to a lower version. And create bigger fewer txns. This change leads to a 50% drop in latency (from 39ms to 19.5ms). * Bring >= back. * Unlock in defer dumbass. * Address Martin's comments
Mutation and Txn Commit/Abort application happen in a serial order (among other less frequent operations). So, the latency of each operation has a cascading effect on the overall latency of transaction.
This PR optimizes mutation application by dividing up the task of each mutation proposal among goroutines. For a batch of 2000 NQuads, this would be divided up among 4 goroutines.
For commits, this PR makes 2 changes. It removes relying upon a Badger lookup to determine if the commit is being done at a lower version than the latest version of the key, instead picking this information up from the posting list. Secondly, it switches how txns are run to create fewer larger transactions instead of one txn per key-value.
With this PR, mutate latency goes from original 13.5ms to 7.9ms (>40% drop). Commit latency goes from 39ms to 19.7ms (50% drop). Either of these changes shouldn't have any side-effects and should be NOOPs in terms of functionality.
To make these measurements, this PR also adds metrics around each of these functions.
This change is