plan/statistics: concurrent build columns #2713

alivxxx · 2017-02-23T02:43:34Z

PTAL @coocood @hanfei19910905 @zimulala @tiancaiamao @winoros

hanfei1991 · 2017-02-23T05:53:18Z

Any bench result to prove the improvement?

alivxxx · 2017-02-24T03:19:47Z

Benchmarked on my laptop with a table of 9 columns, 4 single-column indexes, 3 double-column indexes, and 100,000 randomly generated rows.
Before the change, the analyze time was 12.47s; after the change, it was 6.36s.
I found that when columns are built concurrently, a single index scan is 4 times slower than when building sequentially. So with a concurrency of 8, the overall speedup is only about 2 times.
@hanfei1991
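
As a quick check of the numbers above, the following sketch computes the measured speedup and the rough estimate implied by "8 workers, each scan 4 times slower". The function names `speedup` and `estimatedSpeedup` are hypothetical helpers for illustration only and are not part of the PR; the estimate is a simple model, not a measurement.

```go
package main

import "fmt"

// speedup returns how many times faster the "after" run is than the "before" run.
func speedup(before, after float64) float64 {
	return before / after
}

// estimatedSpeedup is a rough model (an assumption, not a measurement):
// `concurrency` workers run in parallel, but each unit of work
// takes `slowdown` times longer than it does sequentially.
func estimatedSpeedup(concurrency, slowdown float64) float64 {
	return concurrency / slowdown
}

func main() {
	// Analyze times reported in the comment above.
	fmt.Printf("measured:  %.2fx\n", speedup(12.47, 6.36))
	// Concurrency of 8, single index scan 4 times slower.
	fmt.Printf("estimated: %.2fx\n", estimatedSpeedup(8, 4))
}
```

Both figures come out close to 2, which matches the reported "only 2 times faster".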

coocood · 2017-02-28T01:55:03Z

Execution time is not important for analyze table; what we need is for it to have minimal impact on the system.
So I think we should not make analyze concurrent.

hanfei1991 · 2017-03-07T07:59:13Z

Execution time is very important to users, @coocood.

hanfei1991 · 2017-03-07T08:10:51Z

plan/statistics/statistics.go

@@ -395,12 +396,13 @@ func (t *Table) build4SortedColumn(sc *variable.StatementContext, offset int, re
 	}
 	var valuesPerBucket, lastNumber, bucketIdx int64 = 1, 0, 0
 	knowCount := true


We don't need knowCount in a concurrent environment.

coocood · 2017-03-07T08:48:22Z

@hanfei1991
At least we should keep the original non-concurrent execution path and make it configurable.

ngaut · 2017-03-07T08:49:19Z

Can we use a session variable to control the concurrency?

zimulala · 2017-03-07T09:39:19Z

PTAL @lamxTyler

hanfei1991 · 2017-03-07T15:53:51Z

plan/statistics/statistics_test.go

@@ -190,21 +190,21 @@ func (s *testStatisticsSuite) TestTable(c *C) {
 	c.Check(count, Equals, int64(1))
 	count, err = col.LessRowCount(sc, types.NewIntDatum(20000))
 	c.Check(err, IsNil)
-	c.Check(count, Equals, int64(19980))
+	c.Check(count, Equals, int64(19984))


Why did the test result change?

Because there was a bug in calculating bucketIdx after merging buckets.

hanfei1991 · 2017-03-07T15:56:42Z

Well, we can add a concurrency argument for analyze, just like what we do for join concurrency, if you don't think it will add too much complexity. Now the question is: what should the default value be?
@coocood @ngaut
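
One way to picture the concurrency argument discussed above is a helper that splits the column offsets into groups, one group per worker. This is only a sketch under assumptions: `splitOffsets` is a hypothetical name, and the PR's actual grouping logic may differ. With a concurrency of 1 it yields a single group, which corresponds to the original sequential build.

```go
package main

import "fmt"

// splitOffsets is a hypothetical helper (not taken from the PR): it divides
// column offsets into at most `concurrency` contiguous groups. Every group
// has groupSize elements except possibly the last, which may be smaller.
func splitOffsets(offsets []int, concurrency int) [][]int {
	if concurrency < 1 {
		concurrency = 1 // treat invalid values as sequential execution
	}
	if len(offsets) == 0 {
		return nil
	}
	// Ceiling division so that no more than `concurrency` groups are produced.
	groupSize := (len(offsets) + concurrency - 1) / concurrency
	var groups [][]int
	for start := 0; start < len(offsets); start += groupSize {
		end := start + groupSize
		if end > len(offsets) {
			end = len(offsets)
		}
		groups = append(groups, offsets[start:end])
	}
	return groups
}

func main() {
	offsets := []int{0, 1, 2, 3, 4, 5, 6, 7, 8} // e.g. a table with 9 columns
	fmt.Println(splitOffsets(offsets, 1))       // one group: sequential build
	fmt.Println(splitOffsets(offsets, 4))       // several groups: concurrent build
}
```

Whatever default is chosen, the same code path can serve both modes, since the sequential case is just the one-group case.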

tiancaiamao

I agree with @coocood: concurrency introduces too much complexity, which is our dangerous enemy. It brings less benefit than it costs.

tiancaiamao · 2017-03-08T03:42:03Z

plan/statistics/statistics.go

+		go b.buildMultiColumns(t, offsets, i*groupSize, isSorted, doneCh)
+	}
+	for range splittedOffsets {
+		err := <-doneCh


If an error happens, this function returns and abandons doneCh.
The worker goroutines are still waiting to write to doneCh and would block forever, causing a goroutine leak.

I was following the logic at https://github.com/pingcap/tidb/blob/master/domain/domain.go#L103.

@tiancaiamao The capacity of the channel is len(splittedOffsets), so sends to it will never block. PTAL
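
The buffered-channel argument in the reply above can be illustrated with a minimal sketch. The names `buildGroup` and `buildAll` and the `failID` parameter are hypothetical stand-ins, not the PR's code. The assumption is that each worker sends exactly one result and that the channel has room for one result per worker, so a send can always complete even if the receiver has already returned on an earlier error.

```go
package main

import (
	"errors"
	"fmt"
)

// buildGroup is a hypothetical stand-in for a worker such as buildMultiColumns.
// It sends exactly one result on doneCh; group `failID` simulates a failure.
func buildGroup(id, failID int, doneCh chan<- error) {
	if id == failID {
		doneCh <- errors.New("build failed")
		return
	}
	doneCh <- nil
}

// buildAll starts one worker per group and returns on the first error.
// doneCh is buffered with capacity `groups`, so workers that finish after
// buildAll has returned can still complete their send and exit, which
// avoids the goroutine leak raised in the review.
func buildAll(groups, failID int) error {
	doneCh := make(chan error, groups)
	for i := 0; i < groups; i++ {
		go buildGroup(i, failID, doneCh)
	}
	for i := 0; i < groups; i++ {
		if err := <-doneCh; err != nil {
			return err // early return is safe: remaining sends go into the buffer
		}
	}
	return nil
}

func main() {
	fmt.Println(buildAll(4, -1)) // no group fails
	fmt.Println(buildAll(4, 2))  // group 2 fails
}
```

With an unbuffered channel, the early return would leave the remaining workers blocked on their send, which is the scenario described in the review comment.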

coocood · 2017-03-13T03:41:37Z

sessionctx/variable/sysvar.go

@@ -598,6 +599,7 @@ var defaultSysVars = []*SysVar{
 	{ScopeSession, TiDBSkipConstraintCheck, "0"},
 	{ScopeSession, TiDBSkipDDLWait, "0"},
 	{ScopeSession, TiDBOptAggPushDown, "ON"},
+	{ScopeSession, BuildStatsConcurrencyVar, "4"},


I think the default should be 1.

To minimize the performance impact and reduce the risk.

hanfei1991 · 2017-03-13T07:04:48Z

LGTM

hanfei1991 · 2017-03-14T07:08:29Z

@coocood @tiancaiamao PTAL

shenli · 2017-03-22T07:24:27Z

@lamxTyler Please resolve the conflicts.

shenli · 2017-03-22T07:51:37Z

Rest LGTM

shenli

LGTM

plan/statistics: concurrent build columns

b5ef1a0

plan/statistics: fix data race

e5209dc

alivxxx closed this Mar 1, 2017

alivxxx deleted the xhb/concurrent-build-stats branch March 1, 2017 14:23

hanfei1991 restored the xhb/concurrent-build-stats branch March 7, 2017 07:58

hanfei1991 reopened this Mar 7, 2017

hanfei1991 reviewed Mar 7, 2017

alivxxx added 2 commits March 7, 2017 16:11

Merge branch 'master' of github.com:pingcap/tidb into xhb/concurrent-…

a0c0d2c

…build-stats

address comment and fix bug of caculation of bucketIdx after merge

ae85dca

hanfei1991 reviewed Mar 7, 2017

tiancaiamao reviewed Mar 8, 2017

alivxxx added 8 commits March 8, 2017 21:19

address comment

51e27eb

fix typo

30982c5

add session variable to control concurrency

3a3634a

fix typo

ff334b9

fix session variables

65f963e

Merge branch 'master' of github.com:pingcap/tidb into xhb/concurrent-…

1d719a8

…build-stats

plan: change Sv to Ctx

2c2a3ec

plan/statistics, sessionctx: refactor code

d8d4691

coocood reviewed Mar 13, 2017

zimulala added the status/LGT1 Indicates that a PR has LGTM 1. label Mar 20, 2017

Merge branch 'master' of github.com:pingcap/tidb into xhb/concurrent-…

39f98cc

…build-stats

shenli approved these changes Mar 22, 2017

Merge branch 'master' into xhb/concurrent-build-stats

9a53627

shenli merged commit 60bcd98 into master Mar 22, 2017

shenli deleted the xhb/concurrent-build-stats branch March 22, 2017 14:16
