Optimize mutation and delta application #2987

manishrjain · 2019-02-07T23:36:27Z

Mutation and Txn Commit/Abort application happen in serial order (along with other, less frequent operations), so the latency of each operation has a cascading effect on the overall latency of a transaction.

This PR optimizes mutation application by dividing up the task of each mutation proposal among goroutines. For a batch of 2000 NQuads, this would be divided up among 4 goroutines.

For commits, this PR makes two changes. First, it no longer relies on a Badger lookup to determine whether the commit is being done at a lower version than the latest version of the key; it picks this information up from the posting list instead. Second, it changes how txns are run, creating fewer, larger transactions instead of one txn per key-value.

With this PR, mutate latency drops from 13.5ms to 7.9ms (a >40% drop), and commit latency drops from 39ms to 19.7ms (a ~50% drop). Neither change should have any side effects; both should be NOOPs in terms of functionality.

To make these measurements, this PR also adds metrics around each of these functions.

golangcibot · 2019-02-07T23:39:07Z

worker/draft.go

+				}
+				// tags = append(tags, tag.Upsert(x.KeyError, perr.Error()))
+				ms := x.SinceInMilliseconds(start)
+				ostats.RecordWithTags(context.Background(), tags, x.LatencyMs.M(ms))


Error return value of ostats.RecordWithTags is not checked (from errcheck)

martinmr

Reviewed 4 of 9 files at r1, 3 of 5 files at r2.
Reviewable status: 7 of 8 files reviewed, 4 unresolved discussions (waiting on @manishrjain)

posting/list.go, line 80 at r2 (raw file):

}

func (l *List) MaxVersion() uint64 {

So far MaxVersion is only used inside the posting package so it'd be better to call it maxVersion so that it's not exported.

We can always rename it back to MaxVersion if it turns out other packages need it.

posting/mvcc.go, line 106 at r2 (raw file):

	txn.Unlock()

	// TODO: Simplify this. All we need to do is to get the PL for the key, and if it has the

Is this TODO still accurate?

posting/mvcc.go, line 112 at r2 (raw file):

	var idx int
	for idx < len(keys) {

What's the purpose of having the writer.Update statement wrapped in a for loop?

At first, I thought it was so that if writer.Update fails, you can retry from the point the previous call
of writer.Update stopped but I see that CommitToDisk returns if writer.Update returns an error.

I think a short comment explaining the reason would be helpful.

worker/draft.go, line 33 at r2 (raw file):

	ostats "go.opencensus.io/stats"
	tag "go.opencensus.io/tag"

The package is already called tag (https://github.com/census-instrumentation/opencensus-go/blob/master/tag/context.go#L16), so you can just do

"go.opencensus.io/tag"

Or perhaps you meant to call this package otag, like the rest of the packages imported from opencensus.

manishrjain

Reviewable status: 4 of 8 files reviewed, 4 unresolved discussions (waiting on @martinmr)

posting/list.go, line 80 at r2 (raw file):

Previously, martinmr (Martin Martinez Rivera) wrote…

So far MaxVersion is only used inside the posting package so it'd be better to call it maxVersion so that it's not exported.

We can always rename it back to MaxVersion if it turns out other packages need it.

Done.

posting/mvcc.go, line 106 at r2 (raw file):

Previously, martinmr (Martin Martinez Rivera) wrote…

Is this TODO still accurate?

Nope. Removed.

posting/mvcc.go, line 112 at r2 (raw file):

Previously, martinmr (Martin Martinez Rivera) wrote…

What's the purpose of having the writer.Update statement wrapped in a for loop?

At first, I thought it was so that if writer.Update fails, you can retry from the point the previous call
of writer.Update stopped but I see that CommitToDisk returns if writer.Update returns an error.

I think a short comment explaining the reason would be helpful.

Added a comment. writer.Update could return early if the txn gets too big. In that case, we would still have to process the remaining keys.
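The shape of that loop can be sketched as follows. This is an illustration under stated assumptions, not the code in posting/mvcc.go: `batchWriter`, its `update` method, and the `maxPerTxn` limit are hypothetical stand-ins for a writer whose update call may stop before it has consumed every key.

```go
package main

import "fmt"

// batchWriter is a hypothetical stand-in for a transactional writer.
// Each call to update fills at most maxPerTxn keys into one transaction.
type batchWriter struct {
	maxPerTxn int
	txns      [][]string // keys written, grouped by transaction
}

// update writes keys starting at index start into a single transaction and
// returns the index of the first key it did NOT write. It returns early
// once the transaction holds maxPerTxn keys.
func (w *batchWriter) update(keys []string, start int) (int, error) {
	if w.maxPerTxn < 1 {
		return start, fmt.Errorf("maxPerTxn must be positive, got %d", w.maxPerTxn)
	}
	end := start + w.maxPerTxn
	if end > len(keys) {
		end = len(keys)
	}
	w.txns = append(w.txns, keys[start:end])
	return end, nil
}

// commitAll keeps calling update until every key has been processed.
// A real error aborts immediately; an early return just means "txn full",
// so the loop resumes from where the previous call stopped.
func commitAll(w *batchWriter, keys []string) error {
	idx := 0
	for idx < len(keys) {
		next, err := w.update(keys, idx)
		if err != nil {
			return err
		}
		idx = next
	}
	return nil
}

func main() {
	w := &batchWriter{maxPerTxn: 3}
	keys := []string{"a", "b", "c", "d", "e", "f", "g"}
	if err := commitAll(w, keys); err != nil {
		fmt.Println("commit failed:", err)
		return
	}
	fmt.Println(len(w.txns), "transactions for", len(keys), "keys")
}
```

The loop is therefore not a retry mechanism: errors still end the commit, and only the "transaction is full" case causes another iteration.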

worker/draft.go, line 33 at r2 (raw file):

Previously, martinmr (Martin Martinez Rivera) wrote…

The package is already called tag (https://github.com/census-instrumentation/opencensus-go/blob/master/tag/context.go#L16), so you can just do

"go.opencensus.io/tag"

Or perhaps you meant to call this package otag, like the rest of the packages imported from opencensus.

Done. Looks like it was auto-imported like this.

martinmr

golangcibot is still complaining about the return value of ostats.RecordWithTags not being used but I'll let you decide what to do with that warning.

Reviewable status: 4 of 8 files reviewed, all discussions resolved (waiting on @martinmr)

manishrjain · 2019-02-08T00:37:44Z

No need to check that error.
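If the linter warning ever needs to go away without changing behavior, one common Go idiom is to discard the error explicitly with the blank identifier, which documents the intent. The sketch below is hypothetical: `recordLatency` stands in for a metrics call that returns an error, and whether a given errcheck configuration accepts the blank assignment depends on its settings.

```go
package main

import (
	"errors"
	"fmt"
)

// recordLatency is a hypothetical stand-in for a metrics call that
// returns an error the caller may not care about.
func recordLatency(name string, ms float64) error {
	if name == "" {
		return errors.New("empty metric name")
	}
	return nil
}

// bestEffortRecord discards the error on purpose: failing to record a
// metric should never affect the operation being measured.
func bestEffortRecord(name string, ms float64) {
	_ = recordLatency(name, ms)
}

func main() {
	bestEffortRecord("mutate", 7.9)
	bestEffortRecord("", 19.7) // fails internally; the caller is unaffected
	fmt.Println("done")
}
```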

Mutation and Txn Commit/Abort application happen in a serial order (among other less frequent operations). So, the latency of each operation has a cascading effect on the overall latency of transaction.

This PR optimizes mutation application by dividing up the task of each mutation proposal among goroutines. For a batch of 2000 NQuads, this would be divided up among 4 goroutines.

For commits, this PR makes 2 changes. It removes relying upon a Badger lookup to determine if the commit is being done at a lower version than the latest version of the key, instead, it picks this information up from the posting list. Secondly, it switches how txns are run to create fewer larger transactions instead of one txn per key-value.

With this PR, mutate latency goes from original 13.5ms to 7.9ms (>40% drop). Commit latency goes from 39ms to 19.7ms (50% drop). Either of these changes shouldn't have any side-effects and should be NOOPs in terms of functionality.

To make these measurements, this PR also adds metrics around each of these functions.

BREAKING: With these changes, the mutations within a single call are rearranged. So, no assumptions must be made about the order in which they get executed.

Changes:

* Allow mutations to happen concurrently
* Optimize how txns get committed. In particular, use the already known information about posting list version to detect if we are writing to a lower version. And create bigger fewer txns. This change leads to a 50% drop in latency (from 39ms to 19.5ms).
* Bring >= back.
* Unlock in defer dumbass.
* Address Martin's comments

manishrjain added 4 commits February 6, 2019 20:34

Allow mutations to happen concurrently

0400357

Revert some unnecessary changes.

f6f88fd

Optimize how txns get committed. In particular, use the already known…

025cac8

… information about posting list version to detect if we are writing to a lower version. And create bigger fewer txns. This change leads to a 50% drop in latency (from 39ms to 19.5ms).

Merge master

06f3417

golangcibot reviewed Feb 7, 2019

manishrjain added 7 commits February 7, 2019 15:39

Revert unnecessary changes

561810a

Revert unnecessary changes

5955dc4

Revert unnecessary changes

4ec07fd

Bring >= back.

273daa0

decrease vert space

b35b0f1

Unlock in defer dumbass.

c4ccfae

With these changes, the mutations within a single call are rearranged…

3c1972e

…. So, no assumptions must be made about the order in which they get executed.

martinmr suggested changes Feb 8, 2019

Address Martin's comments

d461b8f

manishrjain commented Feb 8, 2019

martinmr reviewed Feb 8, 2019

manishrjain merged commit 2197155 into master Feb 8, 2019

manishrjain deleted the mrjn/mutate-concurrently branch February 8, 2019 00:37
