-
Notifications
You must be signed in to change notification settings - Fork 5.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
statistics: refactor package. #2913
Conversation
6ac64fc
to
4d46d80
Compare
We should find a safe way to do DDL in bootstrap stage. |
3c8e5f1
to
3802853
Compare
@lamxTyler PTAL |
statistics/statscache.go
Outdated
tbl = statistics.PseudoTable(tableInfo) | ||
} | ||
SetStatisticsTableCache(tableID, tbl, version) | ||
log.Errorf("Error occured when convert pb table for table id %d", tableID) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This log should be updated.
statistics/statistics.go
Outdated
return errors.Trace(err) | ||
} | ||
for _, col := range t.Columns { | ||
col.saveToStorage(ctx, t.Info.ID, false) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why ignore the error?
bootstrap.go
Outdated
index tbl(table_id) | ||
);` | ||
|
||
// CreateStatsBucketsTable stores the histogram info for every table columns. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Need to point out that the count
is not the prefix sum?
@@ -68,6 +68,43 @@ type Column struct { | |||
Repeats []int64 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The comment in line 63-65 can be removed? Because we do not serialize it now.
statistics/statistics.go
Outdated
if err != nil { | ||
return nil, errors.Trace(err) | ||
} | ||
func colStatsFromStorage(ctx context.Context, tableID int64, colID int64, tp *types.FieldType, distinct int64, isIndex bool) (*Column, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why sperate the function (c *Column) saveToStorage
and this function? It is better to put them in the same place as the Table
do.
statistics/statistics.go
Outdated
return nil, errors.Trace(err) | ||
bucketSize := len(rows) | ||
if bucketSize == 0 { | ||
return nil, nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why return nil? Even if it is an empty table, currently it will have at least 1 bucket. bucketSize
is 0 may mean that this row is added after analyze.
statistics/statistics.go
Outdated
if err != nil { | ||
return nil, errors.Trace(err) | ||
} | ||
table.Indices = append(table.Indices, col) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should also check if this index is deleted.
statistics/statistics.go
Outdated
} | ||
} | ||
} | ||
return table, nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if there are newly added columns?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that load table from storage in this way needs too much check. May be it is better to iterate through the table info to load stats.
@shenli It's safe to create table when bootstrap. |
statistics/statistics.go
Outdated
var col *Column | ||
for _, idxInfo := range info.Indices { | ||
if idxID == idxInfo.ID { | ||
col, err = colStatsFromStorage(ctx, info.ID, idxID, nil, distinct, true) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will handle all problems like this in next PR.
} | ||
} | ||
if indexCount != len(info.Indices) { | ||
return nil, errors.New("The number of indices doesn't match with the schema") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it need .
at the end?
(update: golint said no.
statistics/statistics.go
Outdated
} else { | ||
count = c.Numbers[i] - c.Numbers[i-1] | ||
} | ||
// set value |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add .
at the end.
Rest LGTM. |
LGTM |
statistics/statistics.go
Outdated
Info: info, | ||
Count: count, | ||
} | ||
selSQL := fmt.Sprintf("select * from mysql.stats_columns where table_id = %d", info.ID) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be easier to read and have better compatibility if we list the columns in the SQL.
bootstrap.go
Outdated
count bigint(64) NOT NULL, | ||
repeats bigint(64) NOT NULL, | ||
value blob NOT NULL, | ||
index tbl(table_id, col_id, index_id, bucket_id) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can be unique.
bootstrap.go
Outdated
distinct_ratio double(64) NOT NULL DEFAULT 0, | ||
use_count_to_estimate tinyint(2) NOT NULL DEFAULT 0, | ||
version bigint(64) unsigned NOT NULL DEFAULT 0, | ||
index tbl(table_id) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
unique tbl(table_id, col_id, index_id)?
statistics/statistics.go
Outdated
Indices []*Column | ||
Count int64 // Total row count in a table. | ||
Pseudo bool | ||
// TableStatsFromStorage load table stats info from storage. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/load/loads
statistics/statistics.go
Outdated
if err != nil { | ||
return nil, errors.Trace(err) | ||
} | ||
// indexCount and columnCount records the number of indices and columns in table stats. If the number don't match with |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/records/record
statistics/statistics.go
Outdated
return nil | ||
} | ||
|
||
func colStatsFromStorage(ctx context.Context, tableID int64, colID int64, tp *types.FieldType, distinct int64, isIndex bool) (*Column, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should use different structs to represent column stats and index stats.
index stats may contain multiple columns.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will be changed in next pr
statistics/statscache_test.go
Outdated
testKit := testkit.NewTestKit(c, store) | ||
testKit.MustExec("use test") | ||
testKit.MustExec("create table t (c1 int, c2 int)") | ||
for i := 0; i < 100000; i++ { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test may take too much time.
statistics/statscache_test.go
Outdated
c.Assert(err, IsNil) | ||
statsTbl2, err := statistics.TableStatsFromStorage(se, tableInfo, 100000) | ||
c.Assert(err, IsNil) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Check count of stats.
bootstrap.go
Outdated
@@ -116,14 +116,38 @@ const ( | |||
UNIQUE KEY name (name) | |||
) ENGINE=InnoDB DEFAULT CHARSET=utf8 STATS_PERSISTENT=0 COMMENT='help topics';` | |||
|
|||
// CreateStatsMetaTable store's the meta of table statistics. | |||
// CreateStatsMetaTable stores the meta of table statistics. | |||
CreateStatsMetaTable = `CREATE TABLE if not exists mysql.stats_meta ( | |||
version bigint(64) unsigned NOT NULL, | |||
table_id bigint(64) NOT NULL, | |||
modify_count bigint(64) NOT NULL DEFAULT 0, | |||
count bigint(64) unsigned NOT NULL DEFAULT 0, | |||
index idx_ver(version) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Need an index for table_id.
|
||
// SaveToStorage saves stats table to storage. | ||
func (t *Table) SaveToStorage(ctx context.Context) error { | ||
_, err := ctx.(sqlexec.SQLExecutor).Execute("begin") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Check this error.
table := &Table{ | ||
Info: info, | ||
Count: count, | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can use table
as an argument directly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this way is clearer.
} | ||
} | ||
} | ||
if indexCount != len(info.Indices) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This condition can move to line366 that can make code clearer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can't do this. The check must be after loop.
7d7abba
to
915a8b1
Compare
return nil | ||
} | ||
go func(do *Domain) { | ||
ticker := time.NewTicker(lease) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add the function of Stop
for the ticker
.
1e973ef
to
9ad30a3
Compare
@zimulala PTAL |
5b9fc71
to
fc8c164
Compare
bootstrap.go
Outdated
);` | ||
|
||
// CreateStatsColsTable stores the statistics of table columns. | ||
CreateStatsColsTable = `CREATE TABLE if not exists mysql.stats_columns ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about name the table stats_histgram
?
Because it contains both index and column.
type Table struct { | ||
Info *model.TableInfo | ||
Columns []*Column | ||
Indices []*Column |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about use a different type for indices?
An index may contain multiple columns, it has a different algorithm to calculate partial column condition.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will change it in next pr.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have a different opinion. What you said is true, but they have more in common than difference, especially when you consider how they are used, for example, get row count in a range, once you encode the indices, the rest is the same. If we use a different type, will the same logic be written twice? I do not know if this is worth the complexity.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@lamxTyler
We can extract the same logic into a type like histogram
, then wrap it with Column
and Index
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
LGTM |
change the way of store stats info.