sql: break gossip dep for table descriptors / table leases #47150
48435: sql: trim down gossip deprecation calls r=asubiotto a=tbg This series of commits revisits all of the calls to `DeprecatedGossip.Deprecated()` and removes all of them. Instead, a few specialized accessors remain, namely access to the system config and the corresponding trigger channel, as well as resolving between nodeID, storeID, and addresses. This gives us a precise picture: we need #47150 to avoid the system config gossip, and we need to fix #48432. After that, we *should* be able to start SQL with a nil Gossip (though we would only do so for tenants, due to the many "nonessential" but important things lost when running without Gossip, such as DistSQL). cc @ajwerner Co-authored-by: Tobias Schottdorf <[email protected]>
I suppose I should file a follow-up issue if I want to close this. It allows a nil gossip to be passed to the lease manager, or rather, it did until we introduced the …
48778: sql: support tenant creation and deletion, track in system.tenants table r=nvanbenschoten a=nvanbenschoten

Most of #47904. This commit implements the initial structure for tenant creation and deletion. It does so by introducing a new system table to track tenants in a multi-tenant cluster and two new builtin functions to manipulate this table and the overall multi-tenant state.

The change starts by introducing a new `system.tenants` table with the following schema:

```
CREATE TABLE system.tenants (
    id     INT8 NOT NULL PRIMARY KEY,
    active BOOL NOT NULL DEFAULT true,
    info   BYTES
);
```

The `id` column is self-explanatory. The `active` column is used to coordinate tenant destruction in an asynchronous job. The `info` column is an opaque byte slice to allow users to associate arbitrary information with specific tenants. I don't know exactly how this third field will be used (mapping tenants back to CockroachCloud user IDs?), but it seems like a good idea to add some flexibility since we do intend to eventually expose this externally.

The table is given an ID of 8, which means that it falls within the "system config span". I believe this is ok, because the entire concept of the "system config span" should be going away with #47150. The table is also only exposed to the system tenant and is never created for secondary tenants.

The change then introduces two new builtin functions: `crdb_internal.create_tenant` and `crdb_internal.destroy_tenant`. These do what you would expect: creating and destroying tenant keyspaces, along with updating metadata in `system.tenants`.

48830: opt: correctly ignore NULL in array indirection r=mjibson a=mjibson

Fixes #48826. Release note (bug fix): fix an error that could occur when using NULL in some array indirections.

49081: sql: fix bug where columns couldn't be dropped after a pk change r=rohany a=rohany

Fixes #49079. Release note (bug fix): Fix a bug where columns of a table could not be dropped after a primary key change.
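To make the create/destroy semantics concrete, here is a toy, in-memory Go sketch of the lifecycle the `active` column coordinates. All names (`tenantRegistry`, `createTenant`, `destroyTenant`) are hypothetical illustrations, not CockroachDB APIs; the real builtins operate on the `system.tenants` table inside a transaction.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// tenantRecord mirrors a row of the system.tenants table described above.
type tenantRecord struct {
	id     int64
	active bool
	info   []byte
}

// tenantRegistry is a toy stand-in for system.tenants, illustrating how the
// active flag separates "mark destroyed" from the async keyspace cleanup.
type tenantRegistry struct {
	mu      sync.Mutex
	tenants map[int64]*tenantRecord
}

func newTenantRegistry() *tenantRegistry {
	return &tenantRegistry{tenants: make(map[int64]*tenantRecord)}
}

// createTenant plays the role of crdb_internal.create_tenant: insert a new,
// active record (the real builtin also initializes the tenant keyspace).
func (r *tenantRegistry) createTenant(id int64) error {
	r.mu.Lock()
	defer r.mu.Unlock()
	if _, ok := r.tenants[id]; ok {
		return errors.New("tenant already exists")
	}
	r.tenants[id] = &tenantRecord{id: id, active: true}
	return nil
}

// destroyTenant plays the role of crdb_internal.destroy_tenant: flip the
// record to inactive; an asynchronous job would later clear the keyspace
// and delete the row.
func (r *tenantRegistry) destroyTenant(id int64) error {
	r.mu.Lock()
	defer r.mu.Unlock()
	t, ok := r.tenants[id]
	if !ok || !t.active {
		return errors.New("tenant not active")
	}
	t.active = false
	return nil
}

func main() {
	r := newTenantRegistry()
	fmt.Println(r.createTenant(10) == nil)  // create succeeds
	fmt.Println(r.destroyTenant(10) == nil) // destroy marks inactive
	fmt.Println(r.destroyTenant(10) != nil) // second destroy is rejected
}
```

The point of the two-step design is that destruction can return quickly while the (potentially large) keyspace deletion happens in a job that retries on failure.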
49082: kv: remove special handling of MergeTrigger in batcheval r=nvanbenschoten a=nvanbenschoten

Discovered with @ajwerner while investigating #48770.

This commit removes the special treatment that we used to give the MergeTrigger: running it before local lock cleanup. This was originally added back when the MergeTrigger held a snapshot of the RHS's data, including intents that needed to be resolved synchronously with the trigger. This snapshot was removed shortly afterwards in bb7426a, but the special handling during batch evaluation persisted, leading to confusing code.

While here, avoid copying around the transaction Result on the hot path where there are no commit triggers. This was not free, as is demonstrated by the following benchmark:

```
func BenchmarkMergeTxnCommitResult(b *testing.B) {
	locks := []roachpb.Span{{Key: roachpb.Key("a")}, {Key: roachpb.Key("b")}}
	txn := &roachpb.Transaction{LockSpans: locks}
	txnResult := result.FromEndTxn(txn, false /* alwaysReturn */, false)
	txnResult.Local.UpdatedTxns = []*roachpb.Transaction{txn}
	txnResult.Local.ResolvedLocks = roachpb.AsLockUpdates(txn, locks)
	for i := 0; i < b.N; i++ {
		var res result.Result
		if err := res.MergeAndDestroy(txnResult); err != nil {
			b.Fatal(err)
		}
	}
}

// BenchmarkMergeTxnCommitResult-16    12777922    86.2 ns/op    0 B/op    0 allocs/op
```

Co-authored-by: Nathan VanBenschoten <[email protected]>
Co-authored-by: Matt Jibson <[email protected]>
Co-authored-by: Rohan Yadav <[email protected]>
One thing claimed at the outset here was that we'd be able to stop gossipping the descriptor table. This unfortunately is not true given the way that the KV layer computes the zone config for a given key. We can stop gossipping `namespace`, but it's so much smaller than `descriptor` that it's not clear there's a win there. That's not to say that there isn't a win to be had here. We don't need to set the …

@nvanbenschoten: does needing to gossip the whole descriptor table make you as sad as it makes me? I don't really have a great set of alternatives. We could maintain a data structure that maps IDs to their parents, which would likely be a couple of orders of magnitude smaller, but still … If we weren't so constrained by the implementation we have today, what would we do? Would we attach zone configuration state to range state and then have some mechanism to reconcile mismatches between the expectation set at a global level and what exists at the range level?
Fixes cockroachdb#47909. This commit starts by adding two RPCs to the Internal service:

```
service Internal {
  ...
  rpc RangeLookup (RangeLookupRequest) returns (RangeLookupResponse) {}
  rpc NodeInfo    (NodeInfoRequest)    returns (stream NodeInfoResponse) {}
}

// RangeLookupRequest is a request to proxy a RangeLookup through a Tenant
// service. Its fields correspond to a subset of the args of kv.RangeLookup.
message RangeLookupRequest { ... }

// NodeInfoRequest is a request to establish an indefinite stream on a Tenant
// service that provides an initial NodeInfoResponse and a NodeInfoResponse
// whenever the collection of KV nodes in a cluster changes. It effectively
// proxies any updates to NodeDescriptors in the KV gossip network back to the
// client of the request.
message NodeInfoRequest {}
```

The commit then introduces a new `kvtenant.Proxy` object. Proxy mediates the communication of cluster-wide state to sandboxed SQL-only tenant processes through a restricted interface. A Proxy is seeded with a set of one or more network addresses that reference existing KV nodes in the cluster (or a load balancer which fans out to some/all KV nodes). On startup, it establishes contact with one of these nodes to learn about the topology of the cluster and bootstrap the rest of SQL <-> KV network communication.

Proxy has two main roles. First, Proxy is capable of providing information on each of the KV nodes in the cluster in the form of NodeDescriptors. This obviates the need for SQL-only tenant processes to join the cluster-wide gossip network. In doing so, it satisfies the `NodeDescStore` interface and can be used as an `AddressResolver` with a small adapter.

Second, Proxy is capable of providing Range addressing information in the form of RangeDescriptors through delegated RangeLookup requests. This is necessary because SQL-only tenants are restricted from reading Range Metadata keys directly. Instead, the RangeLookup requests are proxied through existing KV nodes while being subject to additional validation (e.g. is the Range being requested owned by the requesting tenant?). In doing so, it satisfies the `RangeDescriptorDB` interface and can be used to delegate all DistSender/RangeCache descriptor lookups to KV nodes.

With this commit, we can mostly run a SQL-only tenant process without joining the KV cluster's gossip network. This works if I comment out a few of the uses of gossip due to cockroachdb#49692 and cockroachdb#47150 in SQL. Notably, with the call to `DeprecatedRegisterSystemConfigChannel` in `sql.Server.Start` removed, I can remove `Gossip` from `makeSQLServerArgs` entirely and things "just work".
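The ownership validation on proxied RangeLookups can be sketched as a simple span check. Everything below is a toy illustration: the key encoding (a marker byte plus a big-endian tenant ID) and the function names are assumptions, not the real CockroachDB tenant key layout, but the invariant is the same — a key is only visible to the tenant whose prefix it carries.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
)

// tenantPrefix returns a hypothetical key prefix for a tenant's keyspace.
// Because the encoding is big-endian, prefixes sort in tenant-ID order,
// so tenant id's span is [tenantPrefix(id), tenantPrefix(id+1)).
func tenantPrefix(id uint64) []byte {
	p := make([]byte, 9)
	p[0] = 0xfe // assumed tenant-keyspace marker byte (illustrative only)
	binary.BigEndian.PutUint64(p[1:], id)
	return p
}

// keyOwnedByTenant is the kind of check a KV node could apply to a proxied
// RangeLookup: reject the request unless the looked-up key falls inside the
// requesting tenant's span.
func keyOwnedByTenant(id uint64, key []byte) bool {
	lo, hi := tenantPrefix(id), tenantPrefix(id+1)
	return bytes.Compare(key, lo) >= 0 && bytes.Compare(key, hi) < 0
}

func main() {
	// A key inside tenant 5's keyspace.
	k := append(tenantPrefix(5), []byte("/table/1")...)
	fmt.Println(keyOwnedByTenant(5, k)) // owned by tenant 5
	fmt.Println(keyOwnedByTenant(6, k)) // not owned by tenant 6
}
```

Doing this check on the KV side is what lets tenants delegate Range Metadata reads without being able to learn about other tenants' ranges.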
50520: kv/kvclient: introduce new tenant Proxy r=nvanbenschoten a=nvanbenschoten

Closes #47909. This commit starts by adding two RPCs to the Internal service:

```proto
service Internal {
  ...
  rpc RangeFeed          (RangeFeedRequest)          returns (stream RangeFeedEvent)          {}
  rpc GossipSubscription (GossipSubscriptionRequest) returns (stream GossipSubscriptionEvent) {}
}

// RangeLookupRequest is a request to proxy a RangeLookup through a Tenant
// service. Its fields correspond to a subset of the args of kv.RangeLookup.
message RangeLookupRequest { ... }

// GossipSubscriptionRequest initiates a game of telephone. It establishes an
// indefinite stream that proxies gossip information overheard by the recipient
// node back to the caller. Gossip information is filtered down to just those
// identified by a key matching any of the specified patterns.
//
// Upon establishment of the stream, all existing information that matches one
// or more of the patterns is returned. After this point, only new information
// matching the patterns is returned.
message GossipSubscriptionRequest { ... }
```

The commit then introduces a new `kvtenant.Proxy` object. Proxy mediates the communication of cluster-wide state to sandboxed SQL-only tenant processes through a restricted interface. A Proxy is seeded with a set of one or more network addresses that reference existing KV nodes in the cluster (or a load balancer which fans out to some/all KV nodes). On startup, it establishes contact with one of these nodes to learn about the topology of the cluster and bootstrap the rest of SQL <-> KV network communication.

Proxy has two main roles. First, Proxy is capable of providing information on each of the KV nodes in the cluster in the form of NodeDescriptors. This obviates the need for SQL-only tenant processes to join the cluster-wide gossip network. In doing so, it satisfies the `NodeDescStore` interface and can be used as an `AddressResolver` with a small adapter.

Second, Proxy is capable of providing Range addressing information in the form of RangeDescriptors through delegated RangeLookup requests. This is necessary because SQL-only tenants are restricted from reading Range Metadata keys directly. Instead, the RangeLookup requests are proxied through existing KV nodes while being subject to additional validation (e.g. is the Range being requested owned by the requesting tenant?). In doing so, it satisfies the `RangeDescriptorDB` interface and can be used to delegate all DistSender/RangeCache descriptor lookups to KV nodes.

With this commit, we can mostly run a SQL-only tenant process without joining the KV cluster's gossip network. This works if I comment out a few of the uses of gossip due to #49692 and #47150 in SQL. Notably, with the call to `DeprecatedRegisterSystemConfigChannel` in `sql.Server.Start` removed, I can remove `Gossip` from `makeSQLServerArgs` entirely and things "just work".

Co-authored-by: Nathan VanBenschoten <[email protected]>
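The pattern-based filtering that GossipSubscription describes can be sketched in a few lines of Go. This is a toy model under assumed conventions: gossip keys here are plain strings like "node:1", and patterns use shell-style globs via the standard library's `path.Match`; the real gossip pattern syntax and key encoding may differ.

```go
package main

import (
	"fmt"
	"path"
)

// matchesAny reports whether a gossip key matches any subscription pattern.
// Patterns here are shell-style globs (e.g. "node:*"); this is an assumed
// syntax for illustration, not the real implementation.
func matchesAny(patterns []string, key string) bool {
	for _, p := range patterns {
		if ok, _ := path.Match(p, key); ok {
			return true
		}
	}
	return false
}

// filterGossip keeps only the gossip entries a subscriber should see,
// mirroring how a GossipSubscription stream forwards just the matching
// subset of cluster-wide gossip.
func filterGossip(patterns []string, entries map[string]string) map[string]string {
	out := make(map[string]string)
	for k, v := range entries {
		if matchesAny(patterns, k) {
			out[k] = v
		}
	}
	return out
}

func main() {
	entries := map[string]string{
		"node:1":    "desc-1",
		"node:2":    "desc-2",
		"system-db": "config",
		"store:1":   "store-desc",
	}
	// Subscribe to node descriptors and the system config only.
	got := filterGossip([]string{"node:*", "system-db"}, entries)
	fmt.Println(len(got)) // store:1 is filtered out
}
```

The filtering matters for tenancy: a SQL-only tenant should see node addressing and config updates without receiving the rest of the gossip network's contents.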
51836: sql: only hook up database cache to gossip if system tenant r=ajwerner a=ajwerner We won't have access to gossip on secondary tenants. This database cache should be gone in the next week but this should unblock progress. Touches #47150 Release note: None Co-authored-by: Andrew Werner <[email protected]>
I'm closing this and will open a different issue for eliminating some of the calls to …
@ajwerner please only re-close this after you've updated the remaining reference to this issue in the source code (in lease.go). We keep X-anchored-telemetry issues open until the product doesn't refer to them in "live" problems.
54256: sql: make multi-tenancy errors report to telemetry r=asubiotto a=knz Fixes #48164. Issues referenced, I also added the X-anchored-telemetry label to them on github (we keep those issues open on github until their reference in the code is removed): #54254 #54255 #54250 #54251 #54252 #49854 #48274 #47150 #47971 #47970 #47900 #47925 Co-authored-by: Raphael 'kena' Poss <[email protected]>
The errors have been removed. I'm closing this in favor of the newer issue #70560 which covers the more general problem of not gossiping the system config span. |
Table descriptor versions are leased by nodes. We ensure that at most the latest two versions of a table are leased. Schema changes which increment the version of a table descriptor wait until all leases on previous versions of the table are dropped. Gossip is the mechanism by which nodes are informed of their need to drop the table descriptor. In order to make this work, we ensure that all DDL statements in explicit transactions occur as the first writing statement, so that we can synthetically anchor the transaction record to the system config span (`(*kv.Txn).SetSystemConfigTrigger()`). This is an awkward restriction and should be lifted.
Additionally, this dependency on gossip is an impediment to separating the SQL layer from the gossip network.
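The two-version invariant described above can be modeled with a small Go sketch. Everything here (`leaseSet`, `bumpVersion`, etc.) is an illustrative toy, not the real lease manager: the point is only that a schema change may publish version v+1 once no lease on any version older than v remains outstanding.

```go
package main

import "fmt"

// leaseSet is a toy model of table-descriptor leasing: nodes hold leases on
// at most the latest two versions, and a version bump must wait until all
// leases on versions older than the current one are released.
type leaseSet struct {
	version int         // latest published descriptor version
	leases  map[int]int // version -> outstanding lease count
}

func newLeaseSet() *leaseSet {
	return &leaseSet{version: 1, leases: map[int]int{}}
}

// acquire takes a lease on the latest version (what a node does before
// using a descriptor) and returns the leased version.
func (l *leaseSet) acquire() int { l.leases[l.version]++; return l.version }

// release drops a lease on version v (in the real system, nodes learn they
// must do this via gossip; breaking that dependency is this issue's goal).
func (l *leaseSet) release(v int) { l.leases[v]-- }

// canBumpVersion reports whether a schema change may publish version+1:
// no leases may remain on versions older than the current one.
func (l *leaseSet) canBumpVersion() bool {
	for v, n := range l.leases {
		if v < l.version && n > 0 {
			return false
		}
	}
	return true
}

// bumpVersion models the schema change publishing a new version; it fails
// (and in reality would wait and retry) while old-version leases remain.
func (l *leaseSet) bumpVersion() bool {
	if !l.canBumpVersion() {
		return false
	}
	l.version++
	return true
}

func main() {
	ls := newLeaseSet()
	v1 := ls.acquire()            // a node leases v1
	fmt.Println(ls.bumpVersion()) // true: v1 is current, publish v2
	ls.acquire()                  // a node leases v2
	fmt.Println(ls.bumpVersion()) // false: the v1 lease is still out
	ls.release(v1)
	fmt.Println(ls.bumpVersion()) // true: only v2 leased, publish v3
}
```

In this model, the "wait" in the schema changer corresponds to polling `canBumpVersion` until gossip (or its replacement) has convinced every node to release its old-version leases.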