-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
sql,server: change indexusagestats subsystem to issue cluster RPC fanout #66639
Conversation
4061552
to
e5445a2
Compare
d55bae2
to
1dafb7e
Compare
556e47b
to
2623363
Compare
Thank you for updating your pull request. Before a member of our team reviews your PR, I have some potential action items for you:
🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan. |
2623363
to
47b69e0
Compare
14580f3
to
c94a165
Compare
c94a165
to
d1e6011
Compare
66451: sql: introduce index usage stats subsystem r=mgartner a=Azhng First commit: **roachpb: add protobuf definition for index usage statistics** Release note: None Second commit: **sql: introduce index usage stats subsystem** This commit introduce a new index usage stats subsystem living inside the sql package. This subsystem is responsible for asynchronously collecting index usage statistics from the planner and its is created within the sql.Server. Release note: None Related PR: * 1/3: Current PR * 2/3: #66639 * 3/3: #66640 67426: builtin: add {de,}compress builtins r=pbardea a=pbardea ``` [email protected]:26257/movr> select compress('hello there', 'gzip'); compress ------------------------------------------------------------------------------------------------------------------------ \037\213\010\000\000\000\000\000\000\377\312H\315\311\311W(\311H-J\005\004\000\000\377\377P\351\022m\013\000\000\000 (1 row) [email protected]:26257/movr> select decompress(compress('hello there', 'gzip'), 'gzip'); decompress --------------- hello there (1 row) [email protected]:26257/movr> select length(repeat('hi there', 100)::bytes); length ---------- 800 (1 row) [email protected]:26257/movr> select length(compress(repeat('hi there', 100)::bytes, 'gzip')); length ---------- 39 (1 row) [email protected]:26257/movr> select decompress('not compressed', 'gzip'); ERROR: decompress(): failed to decompress: gzip: invalid header [email protected]:26257/movr> select decompress('bad codec', 'zip'); ERROR: decompress(): only 'gzip' codec is supported for decompress() SQLSTATE: 22023 ``` This commits adds builtins to compress and decompress bytes. The only codec currently supported is `gzip`. Release note (sql change): New builtins, compress(data, codec) and decompress(data, codec), are added which can compress and decompress bytes with the specified codec. Gzip is the only currently supported codec. 67540: dev: infer and pass in an appropriate `dev` config to bazel invocations r=rail a=rickystewart Passing `--config=dev` for `dev` builds is a good choice, and `--config=devdarwinx86_64` is required for most dev builds on macOS machines. Automatically pass that flag unless the user requests not to. Release note: None 67584: parser: move scan.go to new scanner package r=knz,otan a=rafiss Fixes #64710 See individual commits. This extracts an interface for sqlSymType so that scan.go can keep using it. This allows scan.go to live in its own package so that other components can use its functionality without needing to pull in a large dependency on sql/sem/tree. Release note: None 67614: backfill: fix backfills for indexes on non-null virtual columns r=mgartner a=mgartner Refactoring in #65514 broke backfills for indexes that include non-null virtual columns. This specific case was not tested so it was not caught earlier. This commit fixes the issue and adds a test. Fixes #67528 There is no release note because this bug was not included in any prior releases. Release note: None Co-authored-by: Azhng <[email protected]> Co-authored-by: Paul Bardea <[email protected]> Co-authored-by: Ricky Stewart <[email protected]> Co-authored-by: Rafi Shamim <[email protected]> Co-authored-by: Marcus Gartner <[email protected]>
d1e6011
to
5692451
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
your "follow up"on the PR description is pointing to itself :)
Reviewed 2 of 21 files at r3, 31 of 31 files at r4.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @Azhng and @mgartner)
pkg/server/index_usage_stats_test.go, line 215 at r4 (raw file):
// When being stressed, there's possibility that it might takes a while for // the stats payload to be ingested. // time.Sleep(3 * time.Second)
nit: remove unnecessary comments
pkg/server/index_usage_stats_test.go, line 218 at r4 (raw file):
// Check local node stats. // Fetch stats reader from each individual
nit: period
pkg/sql/idxusage/cluster_idx_usage_stats.go, line 18 at r4 (raw file):
) // ClusterIndexUsageStats implements the Reader interface and is able to returns
nit: return
5692451
to
9c627f0
Compare
9c627f0
to
897bc93
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed!
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @maryliag and @mgartner)
Friendly ping @mgartner |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 2 of 21 files at r3, 9 of 31 files at r4, 4 of 9 files at r5, 4 of 5 files at r6.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @Azhng and @maryliag)
pkg/server/index_usage_stats.go, line 55 at r6 (raw file):
statsReader := s.sqlServer.pgServer.SQLServer.GetLocalIndexStatisticsReader() return serializeIndexUsageStats(req, statsReader)
Should req
be localReq
here?
pkg/server/index_usage_stats.go, line 58 at r6 (raw file):
} statusServer, err := s.dialNode(ctx, requestedNodeID)
nit: maybe name statusServer
to statusClient
or client
to make it clear its not of type statusServer
and that it's a client to a remote status server
pkg/server/index_usage_stats.go, line 63 at r6 (raw file):
} return statusServer.IndexUsageStatistics(ctx, localReq)
Should localReq
be req
here?
pkg/server/index_usage_stats.go, line 87 at r6 (raw file):
var combinedError error errFn := func(_ roachpb.NodeID, nodeFnError error) { if nodeFnError != nil {
nit: iterateNodes should not call this function with a nil
error, so you can remove this check.
pkg/server/index_usage_stats.go, line 115 at r6 (raw file):
} err := reader.ForEach(idxusage.IteratorOptions{
Creating a reader interface and implementing a local and cluster implementation of the interface only for the purpose of using either here seems over-engineered for what needs to be accomplished: aggregating index usage stats response from multiple nodes. Also, the cluster implementation just uses a local implementation which has all the complexity and overhead of dealing with concurrent reads and write. Would it be simpler to remove the interface, implement a single struct for local index usage stats, and perform aggregation of the results here?
pkg/server/index_usage_stats.go, line 118 at r6 (raw file):
SortedTableID: req.OrderedTableID, SortedIndexID: req.OrderedIndexID, Max: max,
It'd be nice to not fetch more than max
usage stats from remote nodes in the first place, if possible. Or at least to fetch only up to max
from each node because we'll never need more than that. If that's a lot more effort, maybe leave a TODO somewhere?
pkg/server/serverpb/status.proto, line 1126 at r6 (raw file):
// by the RPC. If this field is left unset, then all stats data will be // returned at once. oneof max {
Did you try using optional
here instead of oneof
?
pkg/sql/idxusage/cluster_idx_usage_stats.go, line 26 at r6 (raw file):
// usage stats, therefore, we do not call LocalIndexUsageStats.Start() method. storage *LocalIndexUsageStats statusServer serverpb.SQLStatusServer
nit: this field is never used
b6597e8
to
3437ee3
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @Azhng, @maryliag, and @mgartner)
pkg/server/index_usage_stats.go, line 87 at r6 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
nit: iterateNodes should not call this function with a
nil
error, so you can remove this check.
done.
pkg/server/index_usage_stats.go, line 115 at r6 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
Creating a reader interface and implementing a local and cluster implementation of the interface only for the purpose of using either here seems over-engineered for what needs to be accomplished: aggregating index usage stats response from multiple nodes. Also, the cluster implementation just uses a local implementation which has all the complexity and overhead of dealing with concurrent reads and write. Would it be simpler to remove the interface, implement a single struct for local index usage stats, and perform aggregation of the results here?
I agree this is a bit awkward. I think the root cause here is that this RPC endpoint is consumed by two different clients:
- The code that populates
crdb_internal
virtual table. This client needs an aggregated reader so it can populate the virtual table. - The DB Console. This client needs the serialized data returning from the RPC endpoint. Preferably aggregated. If not, the DB Console will aggregated in the frontend.
The max
param was introduced here to distinguish between those two type of use case. But this does introduce quite a bit of awkwardness.
I think a better approach would be: this endpoint does not perform aggregation at all. It returns all the statistics directly to the caller without aggregation. Then the code responsible for crdb_internal
virtual table would be the one performing aggregation.
Then, the DB Console will be calling to a different endpoint, (possibly the v2 status API that's being built right now) and that API will be responsible to fetching data from the crdb_internal
virtual table.
I updated the code to reflect this.
pkg/server/index_usage_stats.go, line 118 at r6 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
It'd be nice to not fetch more than
max
usage stats from remote nodes in the first place, if possible. Or at least to fetch only up tomax
from each node because we'll never need more than that. If that's a lot more effort, maybe leave a TODO somewhere?
Removed max
from the request protobuf now since this API will be primary used for populating the crdb_internal
virtual table as mentioned in above comment. There's not quite necessary to limit number of rows returning from this endpoint.
pkg/server/serverpb/status.proto, line 1126 at r6 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
Did you try using
optional
here instead ofoneof
?
Removed max
param here. See the comment above.
pkg/sql/idxusage/cluster_idx_usage_stats.go, line 26 at r6 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
nit: this field is never used
Done.
This commit introduce crdb_internal.index_usage_stats virtual table that is backed by new clusterindexusagestats package. This new package implements a variant of the indexusagestats interface and serves the data by issuing cluster RPC fanout. Addresses cockroachdb#64740 Followup to cockroachdb#66639 Release note (sql change): introduce crdb_internal.index_usage_statistics virtual table to surface index usage statistics. sql.metrics.index_usage_stats.enabled cluster setting can be used to turn on/off the subsystem. It is default to true.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 15 of 16 files at r7.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @Azhng, @maryliag, and @mgartner)
pkg/server/index_usage_stats.go, line 63 at r6 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
Should
localReq
bereq
here?
Looking at this again, and I think localReq
is correct. This is basically telling another node "give me your local stats", is that correct? Might be worth a comment for clarity.
pkg/server/index_usage_stats.go, line 115 at r6 (raw file):
Previously, Azhng (Archer Zhang) wrote…
I agree this is a bit awkward. I think the root cause here is that this RPC endpoint is consumed by two different clients:
- The code that populates
crdb_internal
virtual table. This client needs an aggregated reader so it can populate the virtual table.- The DB Console. This client needs the serialized data returning from the RPC endpoint. Preferably aggregated. If not, the DB Console will aggregated in the frontend.
The
max
param was introduced here to distinguish between those two type of use case. But this does introduce quite a bit of awkwardness.I think a better approach would be: this endpoint does not perform aggregation at all. It returns all the statistics directly to the caller without aggregation. Then the code responsible for
crdb_internal
virtual table would be the one performing aggregation.Then, the DB Console will be calling to a different endpoint, (possibly the v2 status API that's being built right now) and that API will be responsible to fetching data from the
crdb_internal
virtual table.I updated the code to reflect this.
Sounds simpler to me. 👍
pkg/server/index_usage_stats_test.go, line 195 at r7 (raw file):
}() // Records a non-full scan.
nit add: "over t_a_idx".
pkg/server/index_usage_stats_test.go, line 199 at r7 (raw file):
require.NoError(t, err) // Records an zigzag join.
nit add: "that scans both t_a_idx and t_b_idx"
pkg/server/index_usage_stats_test.go, line 203 at r7 (raw file):
require.NoError(t, err) // Record a full scan.
nit add: "over t_b_idx"
pkg/server/index_usage_stats_test.go, line 213 at r7 (raw file):
// When being stressed, there's possibility that it might takes a while for // the stats payload to be ingested. // time.Sleep(3 * time.Second)
Is this leftover?
pkg/server/index_usage_stats_test.go, line 260 at r7 (raw file):
statsEntries++ switch stats.Key.IndexID { case 2: // t@t_a_idx
nit: could this instead be case indexKeyA.IndexID
?
pkg/sql/conn_executor.go, line 272 at r7 (raw file):
pool *mon.BytesMonitor // indexUsageStats tracks the index usage statistics.
nit: Make it clear in this comment that it only tracks usage for the gateway node of a query, not the entire cluster.
Related issue: cockroachdb#64740 Release note: None
3437ee3
to
0a95d40
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @maryliag and @mgartner)
pkg/server/index_usage_stats.go, line 63 at r6 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
Looking at this again, and I think
localReq
is correct. This is basically telling another node "give me your local stats", is that correct? Might be worth a comment for clarity.
Yes. I'll add a comment.
pkg/server/index_usage_stats.go, line 115 at r6 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
Sounds simpler to me. 👍
Done.
pkg/server/index_usage_stats_test.go, line 213 at r7 (raw file):
Previously, mgartner (Marcus Gartner) wrote…
Is this leftover?
Ah yes. I cleaned this up earlier but probably messed up the rebase and it sneaked back. 🤦
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 16 files at r7, 3 of 3 files at r8.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @maryliag and @mgartner)
pkg/server/index_usage_stats.go, line 63 at r6 (raw file):
Previously, Azhng (Archer Zhang) wrote…
Yes. I'll add a comment.
Thanks!
TFTR! bors r+ |
Build succeeded: |
This commit introduce crdb_internal.index_usage_stats virtual table that is backed by new clusterindexusagestats package. This new package implements a variant of the indexusagestats interface and serves the data by issuing cluster RPC fanout. Addresses cockroachdb#64740 Followup to cockroachdb#66639 Release note (sql change): introduce crdb_internal.index_usage_statistics virtual table to surface index usage statistics. sql.metrics.index_usage_stats.enabled cluster setting can be used to turn on/off the subsystem. It is default to true.
This commit introduce crdb_internal.index_usage_stats virtual table that is backed by new clusterindexusagestats package. This new package implements a variant of the indexusagestats interface and serves the data by issuing cluster RPC fanout. Addresses cockroachdb#64740 Followup to cockroachdb#66639 Release note (sql change): introduce crdb_internal.index_usage_statistics virtual table to surface index usage statistics. sql.metrics.index_usage_stats.enabled cluster setting can be used to turn on/off the subsystem. It is default to true.
This commit introduce crdb_internal.index_usage_stats virtual table that is backed by new clusterindexusagestats package. This new package implements a variant of the indexusagestats interface and serves the data by issuing cluster RPC fanout. Addresses cockroachdb#64740 Followup to cockroachdb#66639 Release note (sql change): introduce crdb_internal.index_usage_statistics virtual table to surface index usage statistics. sql.metrics.index_usage_stats.enabled cluster setting can be used to turn on/off the subsystem. It is default to true.
This commit introduce crdb_internal.index_usage_stats virtual table that is backed by new clusterindexusagestats package. This new package implements a variant of the indexusagestats interface and serves the data by issuing cluster RPC fanout. Addresses cockroachdb#64740 Followup to cockroachdb#66639 Release note (sql change): introduce crdb_internal.index_usage_statistics virtual table to surface index usage statistics. sql.metrics.index_usage_stats.enabled cluster setting can be used to turn on/off the subsystem. It is default to true.
66640: sql: introduce crdb_internal.index_usage_stats virtual table r=Azhng a=Azhng This commit introduce crdb_internal.index_usage_stats virtual table that is backed by new clusterindexusagestats package. This new package implements a variant of the indexusagestats interface and serves the data by issuing cluster RPC fanout. Release note (sql change): introduce crdb_internal.index_usage_statistics virtual table to surface index usage statistics. sql.metrics.index_usage_stats.enabled cluster setting can be used to turn on/off the subsystem. It is default to true. sql.metrics.index_usage_stats.reset_interval can change the reset interval of the collected statistics. It is default to 1 hour. Addresses #64740 Followup to #66639 Co-authored-by: Azhng <[email protected]>
This commit introduce crdb_internal.index_usage_stats virtual table that is backed by new clusterindexusagestats package. This new package implements a variant of the indexusagestats interface and serves the data by issuing cluster RPC fanout. Addresses cockroachdb#64740 Followup to cockroachdb#66639 Release note (sql change): introduce crdb_internal.index_usage_statistics virtual table to surface index usage statistics. sql.metrics.index_usage_stats.enabled cluster setting can be used to turn on/off the subsystem. It is default to true.
sql: introduce crdb_internal.index_usage_stats virtual table
This commit introduce crdb_internal.index_usage_stats virtual
table that is backed by new clusterindexusagestats package. This
new package implements a variant of the indexusagestats interface
and serves the data by issuing cluster RPC fanout.
Addresses #64740
Followup to #66451
Release note (sql change): introduce crdb_internal.index_usage_statistics
virtual table to surface index usage statistics.
sql.metrics.index_usage_stats.enabled cluster setting can be used to
turn on/off the subsystem. It is default to true.