sql: add trie tree based workload index recommendations #106525
Conversation
The first commit is from #105017.
Force-pushed from b199950 to 5c2a6aa.
Force-pushed from d16e6ac to 10fcdab.
This is great! I think you need some unit tests for index_trie.go. It would be good to also add some more complex logictests with multiple tables, several index recommendations that overlap in different ways, different sets of storing columns, etc.
Otherwise, most of my comments below are nits.
Reviewed 11 of 11 files at r1, 6 of 7 files at r2, 1 of 1 files at r3, all commit messages.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @msirek and @qiyanghe1998)
-- commits
line 44 at r3:
nit: "to use the remove the" <-- needs a rewrite
pkg/sql/opt/workloadindexrec/index_trie.go
line 21 at r3 (raw file):
// TrieNode is an implementation of the node of a Trie-tree.
//
// TrieNode stores the indexed columns, storing columns, father node and the indexed
nit: father -> parent
pkg/sql/opt/workloadindexrec/index_trie.go
line 22 at r3 (raw file):
//
// TrieNode stores the indexed columns, storing columns, father node and the indexed
// column represented by the node (used for assign storings).
nit: for assign -> to assign
pkg/sql/opt/workloadindexrec/index_trie.go
line 23 at r3 (raw file):
// TrieNode stores the indexed columns, storing columns, father node and the indexed
// column represented by the node (used for assign storings).
type TrieNode struct {
Does this struct and all of its elements need to be exported?
pkg/sql/opt/workloadindexrec/index_trie.go
line 25 at r3 (raw file):
type TrieNode struct {
	Children map[tree.IndexElem]*TrieNode
	Storing  map[string]bool
nit: since the only bool value stored is true, it's slightly more efficient to use struct{}
pkg/sql/opt/workloadindexrec/index_trie.go
line 26 at r3 (raw file):
	Children map[tree.IndexElem]*TrieNode
	Storing  map[string]bool
	Fa       *TrieNode
nit: Fa -> Parent
pkg/sql/opt/workloadindexrec/index_trie.go
line 27 at r3 (raw file):
	Storing map[string]bool
	Fa      *TrieNode
	Col     tree.IndexElem
nit: seems like there is some inefficiency with storing two copies of this structure (one here and one in the map). Could we use a pointer and/or eliminate one copy?
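Putting the struct nits above together, a rough sketch of what the node could end up looking like (purely illustrative, not the final code; tree here is pkg/sql/sem/tree, and the names mirror the later indexTrieNode in this PR):
Code snippet:
// indexTrieNode sketch: unexported type, parent pointer instead of Fa, and
// struct{} values so the storing map behaves as a set.
type indexTrieNode struct {
	children map[tree.IndexElem]*indexTrieNode
	storing  map[string]struct{}
	parent   *indexTrieNode
	col      tree.IndexElem // indexed column represented by this node
}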
pkg/sql/opt/workloadindexrec/index_trie.go
line 39 at r3 (raw file):
}

// NewTrie returns a new trie tree
nit: end comment with a period.
pkg/sql/opt/workloadindexrec/index_trie.go
line 96 at r3 (raw file):
// removeStoringCoveredByLeaf checks whether the storing is covered by the leaf nodes
// by depth-first search (DFS).
func removeStoringCoveredByLeaf(node *TrieNode, restStoring map[string]bool) bool {
Looks like this function is all-or-nothing: Either all storing columns are covered by one leaf node path, or we do not remove any. Would it be worth removing a subset of the storing columns if only some of them are covered? (Feel free to just leave a TODO -- this doesn't need to block this PR)
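To make the all-or-nothing behavior concrete, here is a small self-contained sketch (with a simplified node type keyed by plain column name, not the real TrieNode) of a check that succeeds only when a single leaf path covers every remaining storing column:
Code snippet:
// node is an illustrative stand-in; the real trie keys children by
// tree.IndexElem rather than by column name.
type node struct {
	children map[string]*node
}

// coveredByLeaf reports whether some root-to-leaf path under n covers every
// column in rest. It is all-or-nothing: a partial cover returns false.
func coveredByLeaf(n *node, rest map[string]bool) bool {
	if len(rest) == 0 {
		// Nothing left to cover, even if we have not reached a leaf yet.
		return true
	}
	if len(n.children) == 0 {
		// Leaf reached with columns still uncovered.
		return false
	}
	for col, child := range n.children {
		// Temporarily remove the child's indexed column from the remaining
		// set, recurse, then restore it so sibling subtrees are unaffected.
		wasPresent := rest[col]
		delete(rest, col)
		covered := coveredByLeaf(child, rest)
		if wasPresent {
			rest[col] = true
		}
		if covered {
			return true
		}
	}
	return false
}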
pkg/sql/opt/workloadindexrec/index_trie.go
line 98 at r3 (raw file):
func removeStoringCoveredByLeaf(node *TrieNode, restStoring map[string]bool) bool {
	// nothing else we need to cover for the storing
	// even if we have not reached the leaf node.
nit: Start comment with a capital letter and end comment with a period (here and below).
pkg/sql/opt/workloadindexrec/index_trie.go
line 158 at r3 (raw file):
	if len(node.Storing) > 0 {
		// Assign the storing of node to the shallowLeaf, some columns
		// may be covered along the path from node to the shallowLeaf
This is a little bit confusing. It might help to include an ascii art example to show what's happening.
pkg/sql/opt/workloadindexrec/index_trie.go
line 187 at r3 (raw file):
}

// collectAllLeaves collects all the indexes represented by the leaf nodes recursively
nit: end comment with period.
pkg/sql/opt/workloadindexrec/workload_indexrecs.go
line 64 at r3 (raw file):
}

// collectIndexRecs collects all the index recommendations stored in the system.statement_statistics
nit: comment should be <= 80 columns wide
pkg/sql/opt/workloadindexrec/workload_indexrecs.go
line 78 at r3 (raw file):
}

// Since Alter index only makes invisible indexes visible, skip it for now.
nit: I'd move this comment down to where you are ignoring the ALTER
pkg/sql/opt/workloadindexrec/workload_indexrecs.go
line 88 at r3 (raw file):
	for ok, err = indexRecs.Next(ctx); ok; ok, err = indexRecs.Next(ctx) {
		if !ok || err != nil {
I think we should just return err if it's not nil. Also, I think ok is typically false if there is an error, so you should probably check the value of err outside of this loop too.
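Roughly what that could look like (the iterator type below is just an illustrative stand-in for the internal executor's row iterator):
Code snippet:
import "context"

// rowIter is an illustrative stand-in for the row iterator used here.
type rowIter interface {
	Next(ctx context.Context) (bool, error)
}

func collectRows(ctx context.Context, it rowIter) error {
	var ok bool
	var err error
	for ok, err = it.Next(ctx); ok; ok, err = it.Next(ctx) {
		// ... process the current row; no need to re-check ok or err here,
		// since Next reports ok == false whenever err is non-nil ...
	}
	// Surface the iterator error once, after the loop.
	return err
}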
pkg/sql/opt/workloadindexrec/workload_indexrecs.go
line 96 at r3 (raw file):
		indexStr, ok := index.(*tree.DString)
		if !ok {
			fmt.Println(index.String() + " is not a string!")
This should just be an error that gets returned
pkg/sql/opt/workloadindexrec/workload_indexrecs.go
line 102 at r3 (raw file):
		indexStrArr := r.FindStringSubmatch(string(*indexStr))
		if indexStrArr == nil {
			fmt.Println(string(*indexStr) + " is not a valid index recommendation!")
should this be an error? Or do you just want to log some info? (in which case use log -- fmt.Println is just for debugging)
pkg/sql/opt/workloadindexrec/workload_indexrecs.go
line 113 at r3 (raw file):
		stmts, err := p.Parse(indexStrArr[2])
		if err != nil {
			fmt.Println(indexStrArr[2] + " is not a valid index operation!")
just return err here
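Putting these three comments together, the loop body could end up roughly like this (the error messages and the surrounding return signature are assumptions; errors.Newf refers to github.com/cockroachdb/errors):
Code snippet:
indexStr, ok := index.(*tree.DString)
if !ok {
	return nil, errors.Newf("expected a string, found %T", index)
}
indexStrArr := r.FindStringSubmatch(string(*indexStr))
if indexStrArr == nil {
	// Alternatively, log and skip this row if a malformed recommendation
	// should not fail the whole builtin.
	return nil, errors.Newf("%s is not a valid index recommendation", string(*indexStr))
}
stmts, err := p.Parse(indexStrArr[2])
if err != nil {
	return nil, err
}
// ... use stmts as before ...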
pkg/sql/opt/workloadindexrec/workload_indexrecs.go
line 147 at r3 (raw file):
}

// extractIndexCovering pushs down the storing part of the internal nodes: find whether it
nit: pushs -> pushes
pkg/sql/sem/builtins/generator_builtins.go
line 287 at r3 (raw file):
		tree.ParamTypes{},
		types.String,
		makeWorkloadIndexRecsGeneratorFactory(-1, -1),
nit: add comments like makeWorkloadIndexRecsGeneratorFactory(-1 /* tsID */, -1 /* bgtId */),
pkg/sql/sem/builtins/generator_builtins.go
line 1123 at r3 (raw file):
// makeWorkloadIndexRecsGeneratorFactory uses the arrayValueGenerator to
// return all the index recommendations as an array of strings.
// tsId, bgtId mean the index for the arguments
nit: I'd call these args timestampIdx and budgetIdx. End comment with period.
pkg/sql/sem/builtins/generator_builtins.go
line 1133 at r3 (raw file):
			ts = tree.MustBeDTimestampTZ(args[tsId])
		} else {
			ts = tree.DTimestampTZ{}
I don't think this is the minimum value. You might want to use tree.MinSupportedTime.
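For example, something along these lines (assuming DTimestampTZ's embedded time.Time field, so that no candidate is filtered out when the caller omits the timestamp):
Code snippet:
} else {
	// Default to the earliest supported timestamp rather than the zero value.
	ts = tree.DTimestampTZ{Time: tree.MinSupportedTime}
}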
pkg/sql/sem/builtins/generator_builtins.go
line 1137 at r3 (raw file):
		if bgtId != -1 {
			budget, err = humanizeutil.ParseBytes(string(tree.MustBeDString(args[bgtId])))
You're not really using the budget yet, right? I would just return an error for now if someone tries to specify a budget.
pkg/sql/sem/builtins/generator_builtins.go
line 1142 at r3 (raw file):
			}
		} else {
			budget = (1 << 63) - 1
nit: can use math.MaxInt64 to make it clearer what this is
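Combining the two comments above, a possible shape (the exact error text and return signature are assumptions; errors.Newf is from github.com/cockroachdb/errors):
Code snippet:
if bgtId != -1 {
	// The budget is not used yet, so reject it explicitly for now instead of
	// parsing and silently ignoring it.
	return nil, errors.Newf("the budget argument is not supported yet")
}
// math.MaxInt64 makes the "no budget" default obvious.
budget = math.MaxInt64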
pkg/sql/logictest/testdata/logic_test/workload_indexrecs
line 46 at r3 (raw file):
# get workload index-recs with time filter
It would be good to add another test for a more recent timestamp that doesn't include the index rec.
pkg/sql/logictest/testdata/logic_test/workload_indexrecs
line 132 at r3 (raw file):
statement ok
nit: add a comment to explain the purpose of this test
Previously, rytaft (Rebecca Taft) wrote…
According to my RFC, I need to assign the storing part to the shallowest leaf node. Here is an example (see the tree below): a stores (b, c, d). The shallowest leaf node in this tree is the c in the left sub-tree of root a. If we assign the storing part (b, c, d) to it, there will be a duplicate c in both the indexed columns and the storing columns. To avoid that, we need to remove all the indexed columns along the path from the shallowest leaf node to the current node with the storing part. In this example, we can remove b and c from the storing part (b, c, d), since the path from the left c to a covers b and c. To implement this, I always record the shallowest leaf node while traversing the tree. If we keep the parent node and the column information, we can easily walk up from the shallowest leaf node to the current node and recover the whole path. Otherwise, it is very hard and time-consuming to get the path top-down.
Previously, qiyanghe1998 wrote…
The previous example is like:
Code snippet:
  a store (b, c, d)
 / \
b   c
|   |
c   e
|
f
Previously, rytaft (Rebecca Taft) wrote…
That is a good question; here is one example. We remove b and c due to the leaf node path through c, and then assign d to the leaf node e. The final indexes are (a, b, c) and (a, e) storing (d). Then, if the original index (a) storing (b, c, d) is required by some query, we need an extra join between the two indexes we built, or between (a, e) storing (d) and the primary index. To avoid the additional join, I implemented the function with an all-or-nothing approach. I admit there may be room for further optimization; I will leave a TODO here.
Code snippet:
  a store (b, c, d)
 / \
b   e
|
c
Force-pushed from 419dd0f to d2b761a.
Previously, rytaft (Rebecca Taft) wrote…
All the second workload_index_recs queries (time filter only, just like the line below) after the first replacement test are designed to miss some index recs. The timestamp of the first replacement is "2023-06-15 15:10:10+00:00", which is earlier than the time filter ('2023-07-05 15:10:10+00:00'::TIMESTAMPTZ - '2 weeks'::interval) = "2023-06-21 15:10:10+00:00". As you can verify, none of the second workload_index_recs queries contain the first replacement "replacement : CREATE INDEX t1_i2 ON t1(i) storing (k); DROP INDEX t1_i;".
Previously, qiyanghe1998 wrote…
I will add some comments to all the second workload_index_recs queries.
Previously, rytaft (Rebecca Taft) wrote…
The general idea is to go through the path from the shallowest leaf node to the current node and remove from the storing part of the current node those columns that are covered by the path. In the example below, the shallowest leaf node for a is the left leaf c. What I am doing is removing b and c (along the path from the left c to a) from the storing part (b, c, d) of a. Finally, assign d (what remains after removing b and c) to the storing part of c. Then the left c (with storing d) will represent the index (a, b, c) storing (d) instead of (a, b, c) storing (b, c, d).
Code snippet:
  a store (b, c, d)
 / \
b   c
|   |
c   e
|
f
Force-pushed from cf475ce to 7ddce77.
Reviewed 4 of 4 files at r8, all commit messages.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @mgartner, @msirek, and @qiyanghe1998)
pkg/sql/opt/workloadindexrec/index_trie.go
line 54 at r8 (raw file):
			parent: nil,
			col:    indexedColumn{},
		},
nit: I think this can just be root: &indexTrieNode{} since everything has the default value (in general, you don't need to explicitly assign the data members that get the default value)
pkg/sql/opt/workloadindexrec/index_trie_test.go
line 288 at r7 (raw file):
Previously, qiyanghe1998 wrote…
Done
Thanks. This is better, but it would be even more helpful if you printed the actual test case and/or index columns so someone looking at the failure doesn't need to work too hard to figure out what happened.
pkg/sql/opt/workloadindexrec/index_trie_test.go
line 129 at r8 (raw file):
// i store (k)        i
// |              ->  |
// j                  j store (k)
nit: fix spacing here
pkg/sql/opt/workloadindexrec/index_trie_test.go
line 159 at r8 (raw file):
// i   i DESC store (j)        i   i DESC
// |   |                       |   |
// j   k                  ->   j   k store (j)
nit: fix spacing
pkg/sql/opt/workloadindexrec/index_trie_test.go
line 203 at r8 (raw file):
//  i store (j)         i
//  / \                / \
// j   k       ->    j   k
nit: fix spacing
Force-pushed from 9aafce1 to b7fddc0.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @mgartner, @msirek, and @rytaft)
pkg/sql/opt/workloadindexrec/index_trie.go
line 54 at r8 (raw file):
Previously, rytaft (Rebecca Taft) wrote…
nit: I think this can just be root: &indexTrieNode{} since everything has the default value (in general, you don't need to explicitly assign the data members that get the default value)
Done
pkg/sql/opt/workloadindexrec/index_trie_test.go
line 288 at r7 (raw file):
Previously, rytaft (Rebecca Taft) wrote…
Thanks. This is better, but it would be even more helpful if you printed the actual test case and/or index columns so someone looking at the failure doesn't need to work too hard to figure out what happened.
Done
pkg/sql/opt/workloadindexrec/index_trie_test.go
line 129 at r8 (raw file):
Previously, rytaft (Rebecca Taft) wrote…
nit: fix spacing here
Done
pkg/sql/opt/workloadindexrec/index_trie_test.go
line 159 at r8 (raw file):
Previously, rytaft (Rebecca Taft) wrote…
nit: fix spacing
Done
pkg/sql/opt/workloadindexrec/index_trie_test.go
line 203 at r8 (raw file):
Previously, rytaft (Rebecca Taft) wrote…
nit: fix spacing
Done
Reviewed 2 of 4 files at r8, 2 of 2 files at r9, 1 of 1 files at r10, all commit messages.
Reviewable status: complete! 1 of 0 LGTMs obtained (and 1 stale) (waiting on @msirek and @qiyanghe1998)
pkg/sql/opt/workloadindexrec/index_trie.go
line 159 at r7 (raw file):
Previously, qiyanghe1998 wrote…
What I mean is the shallowest leaf node inside node's sub-tree. As you can see, I initialize dep as the maximum value of int16 so that it will record the leaf node with the minimum depth. Since my algorithm collects the indexes represented by all the leaf nodes, pushing the storing down to only a non-leaf node may cause some indexes to not be covered by our final output indexes.
Thanks for the explanation!
pkg/sql/opt/workloadindexrec/index_trie.go
line 165 at r10 (raw file):
	var shallowLeaf *indexTrieNode
	// largest depth
	dep := math.MaxInt32
math.MaxInt64?
Reviewed 2 of 2 files at r9, 1 of 1 files at r10, all commit messages.
Reviewable status: complete! 2 of 0 LGTMs obtained (waiting on @msirek and @qiyanghe1998)
Force-pushed from 84408cb to b34644f.
Reviewed 2 of 2 files at r11, all commit messages.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @qiyanghe1998)
Force-pushed from b34644f to ede9fe7.
bors r+
Build failed (retrying...):
bors r-
Canceled.
Your new tests are flaking. Looks like it's due to non-determinism in the STORING part of the recommendation:
Also need to run
Force-pushed from ede9fe7 to 0f2b7e6.
Thanks, all of them are fixed.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @mgartner and @rytaft)
bors r+
Build succeeded:
sql: add trie tree based workload index recommendations
This commit adds the trie and the logic for getting the workload index
recommendations. In addition, it fills the gap between built-in functions and
backend implementation for workload index recommendations. The whole process
consists of collecting candidates and finding representative indexes.
All the index recommendations in the table system.statement_statistics
(satisfying some time requirement) will be collected as the candidates and then
inserted into the trie.
The trie is designed for all the indexes of one table. The indexed columns will
be regarded as the key to insert into the tree in their original order. The
storing part will be attached to the node after the insertion of indexed
columns.
The general idea of finding representative indexes is to use all the indexes
represented by the leaf nodes. One optimization is to remove the storings that
are covered by some leaf nodes. Next, we will push down all the storings
attached to the internal nodes to the shallowest leaf nodes (you can find the
reasons in the RFC). Finally, all the indexes represented by the leaf nodes
will be returned.
As for DROP INDEX, since we collect all the indexes represented by the leaf
nodes (a superset of dropped indexes), we can directly drop all of them.
Release note (sql change): the new builtin functions workload_index_recs() and
workload_index_recs(timestamptz) return workload-level index recommendations
(a column of strings, each string representing an index recommendation) derived
from the statement-level index recommendations (used as candidates) in
system.statement_statistics. If the timestamptz is given, only candidates
generated after that timestamptz are considered.
Epic: None