-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
sql: fix inefficient memory usage for SHOW CREATE ALL TABLES #61127
sql: fix inefficient memory usage for SHOW CREATE ALL TABLES #61127
Conversation
5e69021
to
710671a
Compare
710671a
to
901b314
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 6 of 6 files at r1.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @rafiss and @RichardJCai)
pkg/sql/sem/builtins/generator_builtins.go, line 359 at r1 (raw file):
makeGeneratorOverload( tree.ArgTypes{ {"dbName", types.String},
database_name
pkg/sql/sem/builtins/generator_builtins.go, line 1581 at r1 (raw file):
// Start implements the tree.ValueGenerator interface. func (s *showCreateAllTablesGenerator) Start(ctx context.Context, txn *kv.Txn) error { ids, err := getTopologicallySortedTableIDs(ctx, s.ie, txn, s.dbName, s.timestamp)
That's still not cutting it though -- you're accumulating all the identifiers in RAM before starting anything.
What is needed is to open the internal executor iterator in Start
and then consume one row from that per invocation of Next
Then close the iterator in the SRF close function.
pkg/sql/sem/builtins/generator_builtins.go, line 1613 at r1 (raw file):
} createStmtStr := string(tree.MustBeDString(createStmt)) s.curr = tree.NewDString(fmt.Sprintf("%s;", createStmtStr))
createStmtStr + ";"
is more efficient and also clearer
pkg/sql/sem/builtins/generator_builtins.go, line 1621 at r1 (raw file):
alterStmt := tree.MustBeDString(s.alterArr[s.alterArrIdx]) s.curr = tree.NewDString( fmt.Sprintf("%s;", alterStmt),
ditto
901b314
to
74b4744
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @knz and @rafiss)
pkg/sql/sem/builtins/generator_builtins.go, line 359 at r1 (raw file):
Previously, knz (kena) wrote…
database_name
Done.
pkg/sql/sem/builtins/generator_builtins.go, line 1581 at r1 (raw file):
Previously, knz (kena) wrote…
That's still not cutting it though -- you're accumulating all the identifiers in RAM before starting anything.
What is needed is to open the internal executor iterator in
Start
and then consume one row from that per invocation ofNext
Then close the iterator in the SRF close function.
As discussed offline, I didn't change this and it still holds the ids in ram. We use the same amount of memory to hold the ids / descriptors in ram when creating the vtable for crdb_internal.create_statements.
Added comments about the reasoning here. I did try to add some memory accounting, although I'm not sure if I'm doing it correctly.
The second commit runs into a panic with the memory accounting because of Close()
not being called.
The panic happens when testing for local-vec-off due to what seems like ValuesGenerator.Close()
is not being called. Not familiar with the execution part of the code base but it seems like in project_set.go
ConsumerClosed
is not being called for the local-vec-off.
Will keep investigating.
pkg/sql/sem/builtins/generator_builtins.go, line 1613 at r1 (raw file):
Previously, knz (kena) wrote…
createStmtStr + ";"
is more efficient and also clearer
Done.
pkg/sql/sem/builtins/generator_builtins.go, line 1621 at r1 (raw file):
Previously, knz (kena) wrote…
ditto
Done.
Requesting review from @cockroachdb/sql-execution as well since y'all are probably more familiar with the memory accounting and may be able to read more into why local-vec-off is panicking due to the aforementioned:
|
Try the following diff:
It fixed the panic I observed on |
74b4744
to
51b4582
Compare
Thanks that seems to have fixed it. |
51b4582
to
e8221fc
Compare
@yuzefovich Can you double check that the changes I made to Also @rafiss / @knz can I get another look at this? Specifically the memory monitoring stuff is worth a good look since I'm not familiar with it. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changes in project_set.go
LGTM. Also, FYI Raphael is on vacation for week and a half.
Reviewed 1 of 3 files at r2, 1 of 8 files at r3.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @knz, @rafiss, @RichardJCai, and @yuzefovich)
pkg/sql/sem/builtins/generator_builtins.go, line 1789 at r3 (raw file):
// Close implements the tree.ValueGenerator interface. func (s *showCreateAllTablesGenerator) Close(ctx context.Context) { s.acc.Clear(ctx)
This should be s.acc.Close(ctx)
. The reasoning is that we instantiate a new memory account in the constructor, so showCreateAllTablesGenerator
is its owner and should close the account once itself is being closed.
pkg/sql/sem/builtins/show_create_all_tables_builtin.go, line 99 at r3 (raw file):
// Account for memory of map. sizeOfKeyValue := int64(unsafe.Sizeof(tid)) + int64(len(refs))*sizeOfInt64 sizeOfMap += sizeOfKeyValue
Just a note here that using map
has an overhead. At some point I did some experiments to measure that overhead and came down to approximately 64 bytes per entry in the map:
cockroach/pkg/sql/rowexec/aggregator.go
Lines 192 to 196 in 2e04b55
// hashAggregatorSizeOfBucketsItem is a guess on how much space (in bytes) | |
// each item added to 'buckets' map of hashAggregator takes up in the map | |
// (i.e. it is memory internal to the map, orthogonal to "key-value" pair | |
// that we're adding to the map). | |
hashAggregatorSizeOfBucketsItem = 64 |
pkg/sql/sem/builtins/show_create_all_tables_builtin.go, line 119 at r3 (raw file):
// The sort relies on creating a new array for the ids. acc.Grow(ctx, int64(len(ids))*sizeOfInt64)
Grow
will return an error when the memory budget has been exceeded, so we should its return value here and in topologicalSort
.
pkg/sql/sem/builtins/show_create_all_tables_builtin.go, line 128 at r3 (raw file):
// the two arrays should have the same length. if len(ids) != len(topologicallyOrderedIDs) { errors.AssertionFailedf("show_create_all_tables_builtin failed. "+
Seems like a missing return
.
pkg/sql/sem/builtins/show_create_all_tables_builtin.go, line 152 at r3 (raw file):
AS OF SYSTEM TIME %s WHERE database_name = $1 AND is_virtual = FALSE
nit: uneven indentation (likely spaces vs tabs)
e8221fc
to
24c9d44
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @knz, @rafiss, and @yuzefovich)
pkg/sql/sem/builtins/generator_builtins.go, line 1789 at r3 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
This should be
s.acc.Close(ctx)
. The reasoning is that we instantiate a new memory account in the constructor, soshowCreateAllTablesGenerator
is its owner and should close the account once itself is being closed.
Done
pkg/sql/sem/builtins/show_create_all_tables_builtin.go, line 99 at r3 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
Just a note here that using
map
has an overhead. At some point I did some experiments to measure that overhead and came down to approximately 64 bytes per entry in the map:.cockroach/pkg/sql/rowexec/aggregator.go
Lines 192 to 196 in 2e04b55
// hashAggregatorSizeOfBucketsItem is a guess on how much space (in bytes) // each item added to 'buckets' map of hashAggregator takes up in the map // (i.e. it is memory internal to the map, orthogonal to "key-value" pair // that we're adding to the map). hashAggregatorSizeOfBucketsItem = 64
Added a var for extra overhead per item, just to double check, is 64 bytes per entry correct? Seems a bit high.
pkg/sql/sem/builtins/show_create_all_tables_builtin.go, line 119 at r3 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
Grow
will return an error when the memory budget has been exceeded, so we should its return value here and intopologicalSort
.
Done.
pkg/sql/sem/builtins/show_create_all_tables_builtin.go, line 128 at r3 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
Seems like a missing
return
.
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @knz, @rafiss, @RichardJCai, and @yuzefovich)
pkg/sql/sem/builtins/show_create_all_tables_builtin.go, line 99 at r3 (raw file):
Previously, RichardJCai (Richard Cai) wrote…
Added a var for extra overhead per item, just to double check, is 64 bytes per entry correct? Seems a bit high.
Yes, in my experiments 64 bytes overhead was per each k-v entry to the map, and 64 bytes doesn't include the memory of k
or v
. Well, map
s are expensive, and that's why we implemented a vectorized hash table :)
It is possible that the number is higher than it actually is (I think I expected it to be more like 32 bytes), and 64 bytes value was making the memory account (as reported by mon.BoundAccount
) to be what I was observing in the heap profile, but we tend to forget to account for some things, so possibly the "extra" overhead hides missing accounting somewhere else.
24c9d44
to
e974b23
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cool! just have some small comments
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @knz, @RichardJCai, and @yuzefovich)
pkg/sql/sem/builtins/generator_builtins.go, line 378 at r4 (raw file):
showCreateAllTablesGeneratorType, makeShowCreateAllTablesGenerator, `Returns a rows CREATE table statements followed by
nit: the grammar is mistaken for "Returns a rows CREATE table statements..." missing word maybe?
pkg/sql/sem/builtins/show_create_all_tables_builtin.go, line 137 at r4 (raw file):
} } acc.Shrink(ctx, memoryUsed)
we don't actually clear out the seen
map here right? why are we shrinking here? (the mem accounting stuff is also a bit unfamiliar to me, so just asking)
i wonder if it would be cleaner if topologicalSort
returned a cleanup()
function callback, and the code here could just call the callback rather than having to know about how the topologicalSort uses memory internally. then we wouldn't need the memoryUsed
counter i think
pkg/sql/sem/builtins/show_create_all_tables_builtin.go, line 146 at r4 (raw file):
len(ids), len(topologicallyOrderedIDs)) } acc.Shrink(ctx, int64(len(ids))*sizeOfInt64)
maybe there should be a comment saying what is shrinking here?
pkg/sql/sem/builtins/show_create_all_tables_builtin.go, line 210 at r4 (raw file):
tid int64, dependsOnIDs map[int64][]int64, seen map[int64]bool,
this would use less memory as a map[int64]struct{}
(and using _, isPresent := seen[tid]
) to check if a key is present.
pkg/sql/sem/builtins/show_create_all_tables_builtin.go, line 249 at r4 (raw file):
func getCreateStatement( ctx context.Context, ie sqlutil.InternalExecutor, txn *kv.Txn, id int64, ts string, dbName string, ) (tree.Datum, error) {
is it safer to return a (tree.DString, error)
here and return an error if the Datum is the wrong type? maybe that's just extraneous though -- if the existing approach is conventional for using internalExecutor we can stick with it
pkg/sql/sem/builtins/show_create_all_tables_builtin.go, line 282 at r4 (raw file):
dbName string, statementType string, ) (tree.Datum, error) {
same comment about DString
e974b23
to
5108e57
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @knz, @rafiss, @RichardJCai, and @yuzefovich)
pkg/sql/sem/builtins/show_create_all_tables_builtin.go, line 137 at r4 (raw file):
Previously, rafiss (Rafi Shamim) wrote…
we don't actually clear out the
seen
map here right? why are we shrinking here? (the mem accounting stuff is also a bit unfamiliar to me, so just asking)i wonder if it would be cleaner if
topologicalSort
returned acleanup()
function callback, and the code here could just call the callback rather than having to know about how the topologicalSort uses memory internally. then we wouldn't need thememoryUsed
counter i think
I'm working under the assumption here that the seen map gets GC'ed after the call the topologicalSort above, so I thought it would be alright to shrink it after it gets GC'ed. Let me know your thoughts on this.
IMO adding a cleanup() function callback wouldn't be clean because of how the function calls itself recursively. If we added another function that called the recursive function that returned a callback, I think the recursive function for topologicalSort would still have to return or use a int ptr to keep track of how much memory it uses.
pkg/sql/sem/builtins/show_create_all_tables_builtin.go, line 146 at r4 (raw file):
Previously, rafiss (Rafi Shamim) wrote…
maybe there should be a comment saying what is shrinking here?
Done.
pkg/sql/sem/builtins/show_create_all_tables_builtin.go, line 210 at r4 (raw file):
Previously, rafiss (Rafi Shamim) wrote…
this would use less memory as a
map[int64]struct{}
(and using_, isPresent := seen[tid]
) to check if a key is present.
Good call, done.
pkg/sql/sem/builtins/show_create_all_tables_builtin.go, line 249 at r4 (raw file):
Previously, rafiss (Rafi Shamim) wrote…
is it safer to return a
(tree.DString, error)
here and return an error if the Datum is the wrong type? maybe that's just extraneous though -- if the existing approach is conventional for using internalExecutor we can stick with it
I did some digging into this, it seems that the conventional approach for using IE is to use tree.MustBeDString and panic when it isn't the case. Not sure what the reasoning is here for panicking instead of returning an error though
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
one final comment then lgtm!
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @knz, @RichardJCai, and @yuzefovich)
pkg/sql/sem/builtins/show_create_all_tables_builtin.go, line 137 at r4 (raw file):
Previously, RichardJCai (Richard Cai) wrote…
I'm working under the assumption here that the seen map gets GC'ed after the call the topologicalSort above, so I thought it would be alright to shrink it after it gets GC'ed. Let me know your thoughts on this.
IMO adding a cleanup() function callback wouldn't be clean because of how the function calls itself recursively. If we added another function that called the recursive function that returned a callback, I think the recursive function for topologicalSort would still have to return or use a int ptr to keep track of how much memory it uses.
well i think technically the seen
map is on the heap until the end of this getTopologicallySortedTableIDs function. if you want the memory to be available to GC any earlier than that, you'd need to do seen = make(map[int64]struct{})
again to remove the reference to the old map.
also, i just realized, why do we need memoryUsed
? can't we just look at the size of seen
after the topological sort is done and shrink by that amount?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @knz, @RichardJCai, and @yuzefovich)
pkg/sql/sem/builtins/show_create_all_tables_builtin.go, line 137 at r4 (raw file):
technically
Oops yep you're right for point one, I'll update seen to nil to remove the old reference.
For the second point on memoryUsed, the amount we're growing for the map is not actually the exact amount the map uses.
The reason being, we want to allocate and track memory for each entry we are adding when we add it to the map, however we can't exactly estimate the overhead so we roughly estimate it. @yuzefovich came up with a 64 byte overhead per entry.
When we shrink we needa shrink by the exact amount we allocated which will not be the exact amount we allocated to seen.
Let me know if this logic makes sense to you.
5108e57
to
b15e9cb
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @knz, @RichardJCai, and @yuzefovich)
pkg/sql/sem/builtins/show_create_all_tables_builtin.go, line 137 at r4 (raw file):
Previously, RichardJCai (Richard Cai) wrote…
technically
Oops yep you're right for point one, I'll update seen to nil to remove the old reference.
For the second point on memoryUsed, the amount we're growing for the map is not actually the exact amount the map uses.
The reason being, we want to allocate and track memory for each entry we are adding when we add it to the map, however we can't exactly estimate the overhead so we roughly estimate it. @yuzefovich came up with a 64 byte overhead per entry.
When we shrink we needa shrink by the exact amount we allocated which will not be the exact amount we allocated to seen.
Let me know if this logic makes sense to you.
Nvm I misunderstood what you meant, updated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
comment on comments but lgtm! nice stuff
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @knz, @RichardJCai, and @yuzefovich)
pkg/sql/sem/builtins/show_create_all_tables_builtin.go, line 209 at r5 (raw file):
// be enough. // memoryUsed is used to determine how much memory is used by the seen map. // memoryUsed should be Shrunk after the sort is finished.
nit: remove the memoryUsed comment now
b15e9cb
to
87cc164
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 8 files at r3.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @knz, @RichardJCai, and @yuzefovich)
pkg/sql/sem/builtins/show_create_all_tables_builtin.go, line 138 at r6 (raw file):
// Clear memory used by the seen map. seen = nil
Not sure if you're done updating this stuff, but len(seen)
will return 0 after seen = nil
.
87cc164
to
f84f70a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @knz, @RichardJCai, and @yuzefovich)
pkg/sql/sem/builtins/show_create_all_tables_builtin.go, line 138 at r6 (raw file):
Previously, yuzefovich (Yahor Yuzefovich) wrote…
Not sure if you're done updating this stuff, but
len(seen)
will return 0 afterseen = nil
.
Oops my brain turned off for a moment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @knz, @RichardJCai, and @yuzefovich)
pkg/sql/sem/builtins/show_create_all_tables_builtin.go, line 140 at r7 (raw file):
sizeOfSeen := len(seen) seen = nil acc.Shrink(ctx, int64(sizeOfSeen)*sizeOfInt64)
not quite sure, but should this be acc.Shrink(ctx, int64(sizeOfSeen)*(sizeOfInt64+mapEntryOverhead))
?
since on line 229 we are growing by sizeOfInt64+mapEntryOverhead
for each entry that is added.
f84f70a
to
7e5e400
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @knz, @rafiss, @RichardJCai, and @yuzefovich)
pkg/sql/sem/builtins/show_create_all_tables_builtin.go, line 140 at r7 (raw file):
Previously, rafiss (Rafi Shamim) wrote…
not quite sure, but should this be
acc.Shrink(ctx, int64(sizeOfSeen)*(sizeOfInt64+mapEntryOverhead))
?since on line 229 we are growing by
sizeOfInt64+mapEntryOverhead
for each entry that is added.
yep it should🤦
crdb_internal.show_create_all_tables is now a generator that returns create / alter statements row by row. The id sort is now done on only the table ids. The topological sort is done using a map of int64 -> int64 meaning the tableMetadata is no longer kept in ram during the sort. Memory is now proportional to the number of tables in the database. The amount of memory used in this builtin for storing table ids is the same as the amount of memory needed when generating the underlying crdb_internal.create_statements. This is technically still not constrained so we should make it clear that the command should not be run on dbs with an excessive number of tables. Release justification: fix for inefficient memory usage, low risk, builtin currently not in release version of crdb Release note (sql change): SHOW CREATE ALL TABLES is updated to be more memory efficient, however this should still not be run on a database with an excessive number of tables. Users should not run this on a database with more than 10000 tables (arbitrary but tested number)..
Release justification: None Release note: None
7e5e400
to
6141e97
Compare
Thanks for the reviews! bors r=rafiss |
Build succeeded: |
crdb_internal.show_create_all_tables is now a generator that returns
create / alter statements row by row.
The id sort is now done on only the table ids.
The topological sort is done using a map of int64 -> int64
meaning the tableMetadata is no longer kept in ram during the sort.
Memory is now proportional to the number of tables in the database.
The amount of memory used in this builtin for storing table ids is
the same as the amount of memory needed when generating the underlying
crdb_internal.create_statements. This is technically still not constrained
so we should make it clear that the command should not be run on dbs with
an excessive number of tables.
Release justification: fix for inefficient memory usage, low risk, builtin
currently not in release version of crdb
Release note (sql change): SHOW CREATE ALL TABLES is updated to be more
memory efficient, however this should still not be run on a database with
an excessive number of tables. Users should not run this on a
database with more than 10000 tables (arbitrary but tested number)