-
Notifications
You must be signed in to change notification settings - Fork 3.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
spanconfig: introduce the spanconfig.SQLWatcher #69661
Conversation
92a3a47
to
f02635e
Compare
pkg/sql/opt/optbuilder/mutation_builder.go, line 654 at r1 (raw file):
nit: stray whitespace |
ddae485
to
8840e09
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the general structure here is coming together and this is ready for some eyes. I have everything in a single commit right now but I can break apart the rangefeed setup and full reconciliation into separate commits if it'll make things easier to review.
I still need to add a configuration that runs the data driven tests for secondary tenants as some of the stuff around named zones is tenant specific.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @irfansharif, @nvanbenschoten, and @RichardJCai)
8840e09
to
c22533f
Compare
Friendly ping on this one now that we're all back from breather week @irfansharif @ajwerner @nvanbenschoten |
Will take a look today, catching up. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't have a clear vision on how checkpointing is going to fit in. I was sort of expecting the event API to involve timestamps and batches of updates in some form or another.
Reviewed 4 of 23 files at r1, 5 of 23 files at r2, 1 of 1 files at r3, all commit messages.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @arulajmani, @irfansharif, @nvanbenschoten, and @RichardJCai)
-- commits, line 10 at r3:
uber-nit: "if any at all"?
-- commits, line 13 at r3:
abiilty
.github/CODEOWNERS, line 96 at r3 (raw file):
/pkg/ccl/backupccl/ @cockroachdb/bulk-prs /pkg/ccl/importccl/ @cockroachdb/bulk-prs /pkg/ccl/spanconfigccl/ @cockroachdb/multiregion
🤔
pkg/ccl/spanconfigccl/testdata/databases, line 42 at r3 (raw file):
constraints: [] voterconstraints: [] leasepreferences: []
these might be easier to read if you omitted empty values.
pkg/ccl/spanconfigccl/testdata/full_reconciliation_named_zones_deleted, line 44 at r3 (raw file):
rangeminbytes: 134217728 rangemaxbytes: 536870912
you know what, what if you formatted it differently so it's like a list of configuration literals and then a mapping from spans to those literals. This here is too hard to visually differentiate.
pkg/spanconfig/spanconfig.go, line 37 at r3 (raw file):
// SQLWatcher emits SQL span configuration updates. type SQLWatcher interface {
I think this would be better phrased as:
SQLWatcher emits span configs that might come from updates made to zone configurations at the SQL layer.
pkg/spanconfig/spanconfig.go, line 38 at r3 (raw file):
// SQLWatcher emits SQL span configuration updates. type SQLWatcher interface { WatchForSQLUpdates(ctx context.Context) (<-chan Update, error)
This needs commentary. How do these two methods interact? (I imagine they don't). I'm surprised to not see timestamps anywhere on here. Please fill out the description of the semantics here.
pkg/spanconfig/spanconfig.go, line 61 at r3 (raw file):
Entry roachpb.SpanConfigEntry // Deleted is true if the span config entry has been deleted. Deleted bool
It feels like this might be a bit cleaner as
type Update struct {
Span roachpb.Span
Config roachpb.SpancConfig
Deleted bool
}
pkg/spanconfig/spanconfigsqlwatcher/sql_watcher.go, line 312 at r3 (raw file):
Quoted 10 lines of code…
if err := sql.DescsTxn( ctx, s.execCfg, func(ctx context.Context, txn *kv.Txn, descsCol *descs.Collection) error { var err error updates, err = s.handleIDUpdate(ctx, id, txn, descsCol) return err }); err != nil { return err }
extreme nit: I've taken to wrapping these like:
if err := sql.DescsTxn(ctx, s.execCfg, func(
ctx context.Context, txn *kv.Txn, descsCol *descs.Collection,
) (err error) {
updates, err = s.handleIDUpdate(ctx, id, txn, descsCol)
return err
}); err != nil {
return err
}
however, while I'm here, this feels like the perfect seam for an interface boundary. I think that I would have done something like:
type Reconciler interface {
Reconcile(context.Context, descpb.IDs) ([]spanconfig.Update, error)
}
That should make it much easier to test this logic, which is super stateful in terms of events and what not and that logic which is really a function of the database. Consider splitting these two areas apart.
pkg/sql/catalog/descs/collection.go, line 286 at r3 (raw file):
// the transaction inside the database referenced by the given database ID. It // first checks the collections cached descriptors before defaulting to a key-value scan. func (tc *Collection) GetAllTableDescriptorsInDatabase(
Note that it does not hydrate the descriptors. The alternative here is to fetch the IDs and then the descriptors. I was thinking just today that I wished namespace had type information.
This patch introduces the spanconfig.SQLTranslator which translates a descriptor and its corresponding zone configurations into its constituent span configurations. The zone config and span config protos look similar, but they differ from each other in a couple of ways: - zone configurations correspond to {descriptor, zone} IDs whereas span configurations correspond to keyspans. - zone configurations have a notion of inheritance which span configurations do not. Instead, they're fully hydrated and flattened out. When the SQLTranslator is given a {zone,descriptor} ID it first descends the zone configuration hierarchy starting from the ID to accumulate IDs of all leaf objects. Leaf objects are tables and named zones (other than RANGE DEFAULT) which have actual span configurations associated with them. For every one of these accumulated IDs, it then generates <span, span config> tuples by following up the inheritance chain to fully hydrate the span config. The SQLTranslator also accounts for and negotiates subzones in this process. Concretely, for the following zone configuration hierarchy: ``` CREATE DATABASE db; CREATE TABLE db.t1(); ALTER DATABASE db CONFIGURE ZONE USING num_replicas=7; ALTER TABLE db.t1 CONFIGURE ZONE USING num_voters=5; ``` The SQLTranslator produces the following translation (represented as a diff against RANGE DEFAULT for brevity): ``` /Table/5{3-4} num_replicas=7 num_voters=5 ``` Split out from cockroachdb#69661. Release note: None
This patch introduces the spanconfig.SQLTranslator which translates a descriptor and its corresponding zone configurations into its constituent span configurations. The zone config and span config protos look similar, but they differ from each other in a couple of ways: - zone configurations correspond to {descriptor, zone} IDs whereas span configurations correspond to keyspans. - zone configurations have a notion of inheritance which span configurations do not. Instead, they're fully hydrated and flattened out. When the SQLTranslator is given a {zone,descriptor} ID it first descends the zone configuration hierarchy starting from the ID to accumulate IDs of all leaf objects. Leaf objects are tables and named zones (other than RANGE DEFAULT) which have actual span configurations associated with them. For every one of these accumulated IDs, it then generates <span, span config> tuples by following up the inheritance chain to fully hydrate the span config. The SQLTranslator also accounts for and negotiates subzones in this process. Concretely, for the following zone configuration hierarchy: ``` CREATE DATABASE db; CREATE TABLE db.t1(); ALTER DATABASE db CONFIGURE ZONE USING num_replicas=7; ALTER TABLE db.t1 CONFIGURE ZONE USING num_voters=5; ``` The SQLTranslator produces the following translation (represented as a diff against RANGE DEFAULT for brevity): ``` /Table/5{3-4} num_replicas=7 num_voters=5 ``` Split out from cockroachdb#69661. Release note: None
This patch introduces the spanconfig.SQLTranslator which translates a descriptor and its corresponding zone configurations into its constituent span configurations. The zone config and span config protos look similar, but they differ from each other in a couple of ways: - zone configurations correspond to {descriptor, zone} IDs whereas span configurations correspond to keyspans. - zone configurations have a notion of inheritance which span configurations do not. Instead, they're fully hydrated and flattened out. When the SQLTranslator is given a {zone,descriptor} ID it first descends the zone configuration hierarchy starting from the ID to accumulate IDs of all leaf objects. Leaf objects are tables and named zones (other than RANGE DEFAULT) which have actual span configurations associated with them. For every one of these accumulated IDs, it then generates <span, span config> tuples by following up the inheritance chain to fully hydrate the span config. The SQLTranslator also accounts for and negotiates subzones in this process. Concretely, for the following zone configuration hierarchy: ``` CREATE DATABASE db; CREATE TABLE db.t1(); ALTER DATABASE db CONFIGURE ZONE USING num_replicas=7; ALTER TABLE db.t1 CONFIGURE ZONE USING num_voters=5; ``` The SQLTranslator produces the following translation (represented as a diff against RANGE DEFAULT for brevity): ``` /Table/5{3-4} num_replicas=7 num_voters=5 ``` Split out from cockroachdb#69661. Release note: None
This patch introduces the spanconfig.SQLTranslator which translates a descriptor and its corresponding zone configurations into its constituent span configurations. The zone config and span config protos look similar, but they differ from each other in a couple of ways: - zone configurations correspond to {descriptor, zone} IDs whereas span configurations correspond to keyspans. - zone configurations have a notion of inheritance which span configurations do not. Instead, they're fully hydrated and flattened out. When the SQLTranslator is given a {zone,descriptor} ID it first descends the zone configuration hierarchy starting from the ID to accumulate IDs of all leaf objects. Leaf objects are tables and named zones (other than RANGE DEFAULT) which have actual span configurations associated with them. For every one of these accumulated IDs, it then generates <span, span config> tuples by following up the inheritance chain to fully hydrate the span config. The SQLTranslator also accounts for and negotiates subzones in this process. Concretely, for the following zone configuration hierarchy: ``` CREATE DATABASE db; CREATE TABLE db.t1(); ALTER DATABASE db CONFIGURE ZONE USING num_replicas=7; ALTER TABLE db.t1 CONFIGURE ZONE USING num_voters=5; ``` The SQLTranslator produces the following translation (represented as a diff against RANGE DEFAULT for brevity): ``` /Table/5{3-4} num_replicas=7 num_voters=5 ``` Split out from cockroachdb#69661. Release note: None
This patch introduces the spanconfig.SQLTranslator which translates a descriptor and its corresponding zone configurations into its constituent span configurations. The zone config and span config protos look similar, but they differ from each other in a couple of ways: - zone configurations correspond to {descriptor, zone} IDs whereas span configurations correspond to keyspans. - zone configurations have a notion of inheritance which span configurations do not. Instead, they're fully hydrated and flattened out. When the SQLTranslator is given a {zone,descriptor} ID it first descends the zone configuration hierarchy starting from the ID to accumulate IDs of all leaf objects. Leaf objects are tables and named zones (other than RANGE DEFAULT) which have actual span configurations associated with them. For every one of these accumulated IDs, it then generates <span, span config> tuples by following up the inheritance chain to fully hydrate the span config. The SQLTranslator also accounts for and negotiates subzones in this process. Concretely, for the following zone configuration hierarchy: ``` CREATE DATABASE db; CREATE TABLE db.t1(); ALTER DATABASE db CONFIGURE ZONE USING num_replicas=7; ALTER TABLE db.t1 CONFIGURE ZONE USING num_voters=5; ``` The SQLTranslator produces the following translation (represented as a diff against RANGE DEFAULT for brevity): ``` /Table/5{3-4} num_replicas=7 num_voters=5 ``` Split out from cockroachdb#69661. Release note: None
65924: docs: RFC for db-console Bazel build r=koorosh a=koorosh Described proposal and changes required for `pkg/ui` UI projects to establish Bazel build workflow and integration with other bazel rules. Release note: None 71254: cli: identify log source more easily when running merge-logs r=ajwerner,cameronnunez,rauchenstein a=jtsiros resolves #55395 Previously, merge-logs output was prefixed by short machine name by default which made it difficult to identify the originating node when looking at the merged results. This change adds support for ${fpath} in the template when using the --prefix arg to include unique file path contents as part of the prefix. The file path prefix excludes common tokens across all file paths and tokens that contain specific delimiters. Release note (cli change): adds support for "${fpath}" in --prefix arg 71359: spanconfig: introduce the spanconfig.SQLTranslator r=irfansharif,nvanbenschoten,ajwerner a=arulajmani This patch introduces the spanconfig.SQLTranslator which translates zone configurations to span configurations. The zone config and span config protos look similar, but they differ from each other in a couple of ways: - zone configurations correspond to {descriptor, zone} IDs whereas span configurations correspond to keyspans. - zone configurations have a notion of inheritance which span configurations do not. Instead, they're fully hydrated and flattened out. When the SQLTranslator is given a {zone,descriptor} ID it generates span configurations for all objects under the given ID in the zone configuration hierarchy. The SQLTranslator fills in any missing fields by following up the inheritance chain to fully hydrate span configs. It also accounts for subzones (and subzone spans) when constructing (span, span config) tuples. Split out from #69661. Epic: CRDB-10489 Release note: None Co-authored-by: Andrii Vorobiov <[email protected]> Co-authored-by: Jon Tsiros <[email protected]> Co-authored-by: arulajmani <[email protected]>
The SQLWatcher is intended to incrementally watch for events on system.zones and system.descriptors. It establishes rangefeeds on these tables from a given checkpoint to do so. It buffers events until the next checkpoint. These buffered events, along with the new checkpoint timestamp, are passed to the handler callback. The SQLWatcher uses an eventBuffer helper struct to buffer rangefeed updates. This thing maintains the notion of a checkpoint timestamp, whcih is the minimum timestamp for {zone,descriptor} checkpoint events. These interfaces aren't hooked up to interact with eachother yet. That's forthcoming in a future patch. References cockroachdb#67679 Release note: None
5d718c0
to
91d9dc7
Compare
@irfansharif I rebased this thing from the SQLTranslator PR like we spoke about earlier. I think there's still a few comments left in here that I might need to address and I need to switch out the Hopefully nothing major will change in the interface definitions here, so if you end up starting on the Reconciler before I send out the cleaned up SQLWatcher PR hopefully this rebased PR can help you for your SQLWatcher needs. |
This patch introduces the SQLWatcher, which is intended to incrementally watch for updates to system.zones and system.descriptors. It does so by establishing rangefeeds at a given timestamp. The SQLWatcher periodically invokes a callback with a list of updates that have been observed in the window (previous checkpointTS, checkpointTS]. The checkpointTS is also provided to the callback. Internally, the SQLWatcher uses a buffer to keep track of events generated by the SQLWatcher's rangefeeds. It also tracks the individual frontier timestamps of both the rangefeeds. This helps to maintain the notion of the combined frontier timestamp, which is computed as the minimum of the two. This combined frontier timestamp serves as the checkpoint to the SQLWatcher's callback function. This interface isn't hooked up to anything yet. It'll be used by the sponconfig.Reconciler soon to perform partial reconciliation once full reconciliation is done. It is intended that the IDs from the updates produced by the SQLWatcher will be fed into the SQLTranslator. References cockroachdb#67679 Carved from cockroachdb#69661 Release note: None
This patch introduces the SQLWatcher, which is intended to incrementally watch for updates to system.zones and system.descriptors. It does so by establishing rangefeeds at a given timestamp. The SQLWatcher periodically invokes a callback with a list of updates that have been observed in the window (previous checkpointTS, checkpointTS]. The checkpointTS is also provided to the callback. Internally, the SQLWatcher uses a buffer to keep track of events generated by the SQLWatcher's rangefeeds. It also tracks the individual frontier timestamps of both the rangefeeds. This helps to maintain the notion of the combined frontier timestamp, which is computed as the minimum of the two. This combined frontier timestamp serves as the checkpoint to the SQLWatcher's callback function. This interface isn't hooked up to anything yet. It'll be used by the sponconfig.Reconciler soon to perform partial reconciliation once full reconciliation is done. It is intended that the IDs from the updates produced by the SQLWatcher will be fed into the SQLTranslator. References cockroachdb#67679 Carved from cockroachdb#69661 Release note: None
This patch introduces the SQLWatcher, which is intended to incrementally watch for updates to system.zones and system.descriptors. It does so by establishing rangefeeds at a given timestamp. The SQLWatcher periodically invokes a callback with a list of updates that have been observed in the window (previous checkpointTS, checkpointTS]. The checkpointTS is also provided to the callback. Internally, the SQLWatcher uses a buffer to keep track of events generated by the SQLWatcher's rangefeeds. It also tracks the individual frontier timestamps of both the rangefeeds. This helps to maintain the notion of the combined frontier timestamp, which is computed as the minimum of the two. This combined frontier timestamp serves as the checkpoint to the SQLWatcher's callback function. This interface isn't hooked up to anything yet. It'll be used by the sponconfig.Reconciler soon to perform partial reconciliation once full reconciliation is done. It is intended that the IDs from the updates produced by the SQLWatcher will be fed into the SQLTranslator. References cockroachdb#67679 Carved from cockroachdb#69661 Release note: None
This patch introduces the SQLWatcher, which is intended to incrementally watch for updates to system.zones and system.descriptors. It does so by establishing rangefeeds at a given timestamp. The SQLWatcher invokes a callback from time to time with a list of updates that have been observed in the window (previous checkpointTS, checkpointTS]. The checkpointTS is also provided to the callback. Internally, the SQLWatcher uses a buffer to keep track of events generated by the SQLWatcher's rangefeeds. It also tracks the individual frontier timestamps of both the rangefeeds. This helps to maintain the notion of the combined frontier timestamp, which is computed as the minimum of the two. This combined frontier timestamp serves as the checkpoint to the SQLWatcher's callback function. This interface isn't hooked up to anything yet. It'll be used by the sponconfig.Reconciler soon to perform partial reconciliation once full reconciliation is done. It is intended that the IDs from the updates produced by the SQLWatcher will be fed into the SQLTranslator. References cockroachdb#67679 Carved from cockroachdb#69661 Release note: None
This patch introduces the SQLWatcher, which is intended to incrementally watch for updates to system.zones and system.descriptors. It does so by establishing rangefeeds at a given timestamp. The SQLWatcher invokes a callback from time to time with a list of updates that have been observed in the window (previous checkpointTS, checkpointTS]. The checkpointTS is also provided to the callback. Internally, the SQLWatcher uses a buffer to keep track of events generated by the SQLWatcher's rangefeeds. It also tracks the individual frontier timestamps of both the rangefeeds. This helps to maintain the notion of the combined frontier timestamp, which is computed as the minimum of the two. This combined frontier timestamp serves as the checkpoint to the SQLWatcher's callback function. This interface isn't hooked up to anything yet. It'll be used by the sponconfig.Reconciler soon to perform partial reconciliation once full reconciliation is done. It is intended that the IDs from the updates produced by the SQLWatcher will be fed into the SQLTranslator. References cockroachdb#67679 Carved from cockroachdb#69661 Release note: None
This patch introduces the SQLWatcher, which is intended to incrementally watch for updates to system.zones and system.descriptors. It does so by establishing rangefeeds at a given timestamp. The SQLWatcher invokes a callback from time to time with a list of updates that have been observed in the window (previous checkpointTS, checkpointTS]. The checkpointTS is also provided to the callback. Internally, the SQLWatcher uses a buffer to keep track of events generated by the SQLWatcher's rangefeeds. It also tracks the individual frontier timestamps of both the rangefeeds. This helps to maintain the notion of the combined frontier timestamp, which is computed as the minimum of the two. This combined frontier timestamp serves as the checkpoint to the SQLWatcher's callback function. This interface isn't hooked up to anything yet. It'll be used by the sponconfig.Reconciler soon to perform partial reconciliation once full reconciliation is done. It is intended that the IDs from the updates produced by the SQLWatcher will be fed into the SQLTranslator. References cockroachdb#67679 Carved from cockroachdb#69661 Release note: None
This patch introduces the SQLWatcher, which is intended to incrementally watch for updates to system.zones and system.descriptors. It does so by establishing rangefeeds at a given timestamp. The SQLWatcher invokes a callback from time to time with a list of updates that have been observed in the window (previous checkpointTS, checkpointTS]. The checkpointTS is also provided to the callback. Internally, the SQLWatcher uses a buffer to keep track of events generated by the SQLWatcher's rangefeeds. It also tracks the individual frontier timestamps of both the rangefeeds. This helps to maintain the notion of the combined frontier timestamp, which is computed as the minimum of the two. This combined frontier timestamp serves as the checkpoint to the SQLWatcher's callback function. This interface isn't hooked up to anything yet. It'll be used by the sponconfig.Reconciler soon to perform partial reconciliation once full reconciliation is done. It is intended that the IDs from the updates produced by the SQLWatcher will be fed into the SQLTranslator. References cockroachdb#67679 Carved from cockroachdb#69661 Release note: None
71968: spanconfig: introduce the spanconfig.SQLWatcher r=irfansharif,ajwerner a=arulajmani This patch introduces the SQLWatcher, which is intended to incrementally watch for updates to system.zones and system.descriptors. It does so by establishing rangefeeds at a given timestamp. The SQLWatcher periodically invokes a callback with a list of updates that have been observed in the window (previous checkpointTS, checkpointTS]. The checkpointTS is also provided to the callback. Internally, the SQLWatcher uses a buffer to keep track of events generated by the SQLWatcher's rangefeeds. It also tracks the individual frontier timestamps of both the rangefeeds. This helps to maintain the notion of the combined frontier timestamp, which is computed as the minimum of the two. This combined frontier timestamp serves as the checkpoint to the SQLWatcher's callback function. This interface isn't hooked up to anything yet. It'll be used by the sponconfig.Reconciler soon to perform partial reconciliation once full reconciliation is done. It is intended that the IDs from the updates produced by the SQLWatcher will be fed into the SQLTranslator. References #67679 Carved from #69661 Release note: None Co-authored-by: arulajmani <[email protected]>
This patch introduces a few more span config interfaces/structs, namely
the SQLReconciler, SQLWatcher, and an eventBuffer helper struct for the
SQLWatcher.
The SQLReconciler is responsible for reconciling SQL descriptors and
their zone configurations to span configurations. It simply constructs
the implied span configuration state and is agnostic to the actual span
configuration state in KV.
The SQLWatcher is intended to incrementally process zone configuration
updates in the SQL layer. It does so by establishing a rangefeed on
system.zones and system.descriptors from a given checkpoint. It buffers
events until the next checkpoint. The list of buffered descriptor IDs,
along with the new checkpoint, are passed to a handler callback.
The SQLWatcher uses an eventBuffer helper struct to buffer rangefeed
updates. This thing maintains the notion of a checkpoint timestamp,
which is minimum timestamp for a {zones, descriptor} event.
This flow doesn't quite work end to end yet because rangefeeds don't
expose checkpoint events yet. These interfaces aren't hooked up to
anything yet either.
References #67679
Release justification: non-production code changes
Release note: None