Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql: break gossip dep for table descriptors / table leases #47150

Closed
jordanlewis opened this issue Apr 7, 2020 · 8 comments · Fixed by #48159
Closed

sql: break gossip dep for table descriptors / table leases #47150

jordanlewis opened this issue Apr 7, 2020 · 8 comments · Fixed by #48159
Assignees
Labels
A-multitenancy Related to multi-tenancy

Comments

@jordanlewis
Copy link
Member

jordanlewis commented Apr 7, 2020

Table descriptors versions are leased by nodes. We ensure that at most, the latest two versions of a table are leased. Schema changes which increment the version of a table descriptor wait until all leases to previous versions of the table are dropped. Gossip is the mechanism by which nodes are informed of their need to drop the table descriptor. In order to make this work, we ensure that all DDL statements in explicit transactions occur as the first writing statement so that we can synthetically anchor the transaction record to the system config span ((*kv.Txn).SetSystemConfigTrigger()).

This is an awkward restriction and should be lifted.

ERROR: unimplemented: schema change statement cannot follow a statement that has written in the same transaction
SQLSTATE: 0A000
HINT: You have attempted to use a feature that is not yet implemented.
See: https://github.com/cockroachdb/cockroach/issues/26508

Additionally, this dependency on gossip is an impediment to separating the SQL layer from the gossip network.

@tbg tbg added A-multitenancy Related to multi-tenancy A-multitenancy-blocker labels Apr 22, 2020
craig bot pushed a commit that referenced this issue May 7, 2020
48435: sql: trim down gossip deprecation calls r=asubiotto a=tbg

This series of commits revisits all of the calls to `DeprecatedGossip.Deprecated()` and removes all of them. Instead, a few specialized accessors remain, namely access to the system config and the corresponding trigger channel, as well as resolving between nodeID, storeID, and addresses.

This gives us a precise picture: we need #47150 to avoid the system config gossip, and we need to fix #48432. After that, we *should* be able to start SQL with a nil Gossip (though we would only do so for tenants, due to the many "nonessential" but important things lost when running without Gossip, such as DistSQL).

cc @ajwerner 

Co-authored-by: Tobias Schottdorf <[email protected]>
@craig craig bot closed this as completed in 91b2880 May 12, 2020
@nvanbenschoten
Copy link
Member

@ajwerner should this be closed yet? You said in #48159 that the PR

Relates to but does not fully resolve #47150. Another PR will follow-up and disable the gossip of the descriptors table and lift the enforcement of the system config transaction anchor.

@ajwerner
Copy link
Contributor

I suppose I should file a follow-up issue if I want to close this. It allows a nil gossip to be passed to the lease manager, or rather, it did until we introduced the DeprecatedGossip thing. Now, if a cluster is bootstrapped at 20.2, the lease manager should not ever look at gossip so the title of the issue is done.

@ajwerner ajwerner reopened this May 13, 2020
nvanbenschoten added a commit to nvanbenschoten/cockroach that referenced this issue May 13, 2020
Most of cockroachdb#47904.

First commit from cockroachdb#48504.
Second and third commit from cockroachdb#48773.

This commit implements the initial structure for tenant creation and
deletion. It does so by introducing a new system table to track tenants
in a multi-tenant cluster and two new builtin functions to manipulate
this table and the overall multi-tenant state.

The change starts by introducing a new `system.tenants` table with the
following schema:
```
CREATE TABLE system.tenants (
    id     INT8 NOT NULL PRIMARY KEY,
    active BOOL NOT NULL DEFAULT true,
    info   BYTES
);
```

The `id` column is self-explanatory. The `active` column is used to
coordinate tenant destruction in an asynchronous job. The `info` column
is an opaque byte slice to allow users to associate arbitrary
information with specific tenants. I don't know exactly how this third
field will be used (mapping tenants back to CockroachCloud user IDs?),
but it seems like a good idea to add some flexibility since we do intend
to eventually expose this externally.

The table is given an ID of 8, which means that it falls within the
"system config span". I believe this is ok, because the entire concept
of the "system config span" should be going away with cockroachdb#47150. The table
is also only exposed to the system-tenant and is never created for
secondary tenants.

The change then introduces two new builtin functions:
`crdb_internal.create_tenant` and `crdb_internal.destroy_tenant`. These
do what you would expect - creating and destroying tenant keyspaces,
along with updating metadata in system.tenants.
nvanbenschoten added a commit to nvanbenschoten/cockroach that referenced this issue May 14, 2020
Most of cockroachdb#47904.

First commit from cockroachdb#48504.
Second and third commit from cockroachdb#48773.

This commit implements the initial structure for tenant creation and
deletion. It does so by introducing a new system table to track tenants
in a multi-tenant cluster and two new builtin functions to manipulate
this table and the overall multi-tenant state.

The change starts by introducing a new `system.tenants` table with the
following schema:
```
CREATE TABLE system.tenants (
    id     INT8 NOT NULL PRIMARY KEY,
    active BOOL NOT NULL DEFAULT true,
    info   BYTES
);
```

The `id` column is self-explanatory. The `active` column is used to
coordinate tenant destruction in an asynchronous job. The `info` column
is an opaque byte slice to allow users to associate arbitrary
information with specific tenants. I don't know exactly how this third
field will be used (mapping tenants back to CockroachCloud user IDs?),
but it seems like a good idea to add some flexibility since we do intend
to eventually expose this externally.

The table is given an ID of 8, which means that it falls within the
"system config span". I believe this is ok, because the entire concept
of the "system config span" should be going away with cockroachdb#47150. The table
is also only exposed to the system-tenant and is never created for
secondary tenants.

The change then introduces two new builtin functions:
`crdb_internal.create_tenant` and `crdb_internal.destroy_tenant`. These
do what you would expect - creating and destroying tenant keyspaces,
along with updating metadata in system.tenants.
craig bot pushed a commit that referenced this issue May 14, 2020
48778: sql: support tenant creation and deletion, track in system.tenants table r=nvanbenschoten a=nvanbenschoten

Most of #47904.

This commit implements the initial structure for tenant creation and deletion. It does so by introducing a new system table to track tenants in a multi-tenant cluster and two new builtin functions to manipulate this table and the overall multi-tenant state.

The change starts by introducing a new `system.tenants` table with the following schema:
```
CREATE TABLE system.tenants (
    id     INT8 NOT NULL PRIMARY KEY,
    active BOOL NOT NULL DEFAULT true,
    info   BYTES
);
```

The `id` column is self-explanatory. The `active` column is used to coordinate tenant destruction in an asynchronous job. The `info` column is an opaque byte slice to allow users to associate arbitrary information with specific tenants. I don't know exactly how this third field will be used (mapping tenants back to CockroachCloud user IDs?), but it seems like a good idea to add some flexibility since we do intend to eventually expose this externally.

The table is given an ID of 8, which means that it falls within the "system config span". I believe this is ok, because the entire concept of the "system config span" should be going away with #47150. The table is also only exposed to the system-tenant and is never created for secondary tenants.

The change then introduces two new builtin functions: `crdb_internal.create_tenant` and `crdb_internal.destroy_tenant`. These do what you would expect - creating and destroying tenant keyspaces, along with updating metadata in system.tenants.

48830: opt: correctly ignore NULL in array indirection r=mjibson a=mjibson

Fixes #48826

Release note (bug fix): fix an error that could occur when using NULL
in some array indirections.

49081: sql: fix bug where columns couldn't be dropped after a pk change r=rohany a=rohany

Fixes #49079.

Release note (bug fix): Fix a bug where columns of a table could not
be dropped after a primary key change.

49082: kv: remove special handling of MergeTrigger in batcheval r=nvanbenschoten a=nvanbenschoten

Discovered with @ajwerner while investigating #48770.

This commit removes the special treatment that we used to give
the MergeTrigger - running it before local lock cleanup. This
was originally added back when the MergeTrigger held a snapshot
of the RHS's data, including intents that needed to be resolved
synchronously with the trigger. This snapshot was removed shortly
afterwards in bb7426a, but the special handling during batch
evaluation persisted, leading to confusing code.

While here, avoid copying around the transaction Result on the
hot path where there are no commit triggers. This was not free,
as is demonstrated by the following benchmark:
```
func BenchmarkMergeTxnCommitResult(b *testing.B) {
	locks := []roachpb.Span{{Key: roachpb.Key("a")}, {Key: roachpb.Key("b")}}
	txn := &roachpb.Transaction{LockSpans: locks}
	txnResult := result.FromEndTxn(txn, false /* alwaysReturn */, false)
	txnResult.Local.UpdatedTxns = []*roachpb.Transaction{txn}
	txnResult.Local.ResolvedLocks = roachpb.AsLockUpdates(txn, locks)

	for i := 0; i < b.N; i++ {
		var res result.Result
		if err := res.MergeAndDestroy(txnResult); err != nil {
			b.Fatal(err)
		}
	}
}

// BenchmarkMergeTxnCommitResult-16   12777922   86.2 ns/op   0 B/op   0 allocs/op
```

Co-authored-by: Nathan VanBenschoten <[email protected]>
Co-authored-by: Matt Jibson <[email protected]>
Co-authored-by: Rohan Yadav <[email protected]>
@ajwerner
Copy link
Contributor

One thing claimed at the outset here was that we'd be able to stop gossipping the descriptor table. This unfortunately is not true given the way that the KV layer computes the zone config for a given key. We can stop gossipping namespace but it's so much smaller than descriptor it's not clear there's a win there. That's not to say that there isn't a win to be had here. We don't need to set the SystemConfigTrigger nearly as often. We only need to set it when writing zone configs, cluster settings, creating tables, and re-parenting tables. Not for any other table modifications.

@nvanbenschoten does this need to gossip the whole descriptor table make you as sad as it makes me? I don't really have a great set of alternatives. We could maintain a data structure that maps IDs to their parents which would likely be a couple of orders of magnitude smaller but still O(descriptors).

If we weren't so constrained by the implementation we have today, what would we do? Would we attach zone configuration state to range state and then have some mechanism to reconcile mismatches between the expectation set at a global level and what exists at the range level?

nvanbenschoten added a commit to nvanbenschoten/cockroach that referenced this issue Jul 2, 2020
Fixes cockroachdb#47909.

This commit starts by adding two RPCs to the Internal service:
```
service Internal {
	...
	rpc RangeLookup (RangeLookupRequest) returns (RangeLookupResponse)     {}
	rpc NodeInfo    (NodeInfoRequest)    returns (stream NodeInfoResponse) {}
}

// RangeLookupRequest is a request to proxy a RangeLookup through a Tenant
// service. Its fields correspond to a subset of the args of kv.RangeLookup.
message RangeLookupRequest {
    ...
}

// NodeInfoRequest is a request to establish an indefinite stream on a Tenant
// service that provides an initial NodeInfoResponse and a NodeInfoResponse
// whenever the collection of KV nodes in a cluster changes. It effectively
// proxies any updates to NodeDescriptors in the KV gossip network back to the
// client of the request.
message NodeInfoRequest {}
```

The commit then introduces new `kvtenant.Proxy` object. Proxy mediates
the communication of cluster-wide state to sandboxed SQL-only tenant
processes through a restricted interface. A Proxy is seeded with a set
of one or more network addresses that reference existing KV nodes in the
cluster (or a load-balancer which fans out to some/all KV nodes). On
startup, it establishes contact with one of these nodes to learn about
the topology of the cluster and bootstrap the rest of SQL <-> KV network
communication.

Proxy has two main roles:

First, Proxy is capable of providing information on each of the KV nodes
in the cluster in the form of NodeDescriptors. This obviates the need
for SQL-only tenant processes to join the cluster-wide gossip network.
In doing so, it satisfies the `NodeDescStore` interface and can be used
as an `AddressResolver` with a small adapter.

Second, Proxy is capable of providing Range addressing information in
the form of RangeDescriptors through delegated RangeLookup requests.
This is necessary because SQL-only tenants are restricted from reading
Range Metadata keys directly. Instead, the RangeLookup requests are
proxied through existing KV nodes while being subject to additional
validation (e.g. is the Range being requested owned by the requesting
tenant?). In doing so, it satisfies the `RangeDescriptorDB` interface
and can be used to delegate all DistSender/RangeCache descriptor lookups
to KV nodes.

With this commit, we can mostly run a SQL-only tenant process without
joining the KV cluster's gossip network. This works if I comment out a
few of the uses of gossip due to cockroachdb#49692 and cockroachdb#47150 in SQL. Notably,
with the call to `DeprecatedRegisterSystemConfigChannel` in `sql.Server.Start`
removed, I can remove `Gossip` from `makeSQLServerArgs` entirely and
things "just work".
nvanbenschoten added a commit to nvanbenschoten/cockroach that referenced this issue Jul 13, 2020
Fixes cockroachdb#47909.

This commit starts by adding two RPCs to the Internal service:
```
service Internal {
	...
	rpc RangeFeed          (RangeFeedRequest)          returns (stream RangeFeedEvent)          {}
	rpc GossipSubscription (GossipSubscriptionRequest) returns (stream GossipSubscriptionEvent) {}
}

// RangeLookupRequest is a request to proxy a RangeLookup through a Tenant
// service. Its fields correspond to a subset of the args of kv.RangeLookup.
message RangeLookupRequest {
    ...
}

// GossipSubscriptionRequest initiates a game of telephone. It establishes an
// indefinite stream that proxies gossip information overheard by the recipient
// node back to the caller. Gossip information is filtered down to just those
// identified by a key matching any of the specified patterns.
//
// Upon establishment of the stream, all existing information that matches one
// or more of the patterns is returned. After this point, only new information
// matching the patterns is returned.
message GossipSubscriptionRequest {
    ...
}
```

The commit then introduces new `kvtenant.Proxy` object. Proxy mediates
the communication of cluster-wide state to sandboxed SQL-only tenant
processes through a restricted interface. A Proxy is seeded with a set
of one or more network addresses that reference existing KV nodes in the
cluster (or a load-balancer which fans out to some/all KV nodes). On
startup, it establishes contact with one of these nodes to learn about
the topology of the cluster and bootstrap the rest of SQL <-> KV network
communication.

Proxy has two main roles:

First, Proxy is capable of providing information on each of the KV nodes
in the cluster in the form of NodeDescriptors. This obviates the need
for SQL-only tenant processes to join the cluster-wide gossip network.
In doing so, it satisfies the `NodeDescStore` interface and can be used
as an `AddressResolver` with a small adapter.

Second, Proxy is capable of providing Range addressing information in
the form of RangeDescriptors through delegated RangeLookup requests.
This is necessary because SQL-only tenants are restricted from reading
Range Metadata keys directly. Instead, the RangeLookup requests are
proxied through existing KV nodes while being subject to additional
validation (e.g. is the Range being requested owned by the requesting
tenant?). In doing so, it satisfies the `RangeDescriptorDB` interface
and can be used to delegate all DistSender/RangeCache descriptor lookups
to KV nodes.

With this commit, we can mostly run a SQL-only tenant process without
joining the KV cluster's gossip network. This works if I comment out a
few of the uses of gossip due to cockroachdb#49692 and cockroachdb#47150 in SQL. Notably,
with the call to `DeprecatedRegisterSystemConfigChannel` in `sql.Server.Start`
removed, I can remove `Gossip` from `makeSQLServerArgs` entirely and
things "just work".
nvanbenschoten added a commit to nvanbenschoten/cockroach that referenced this issue Jul 15, 2020
Fixes cockroachdb#47909.

This commit starts by adding two RPCs to the Internal service:
```
service Internal {
	...
	rpc RangeFeed          (RangeFeedRequest)          returns (stream RangeFeedEvent)          {}
	rpc GossipSubscription (GossipSubscriptionRequest) returns (stream GossipSubscriptionEvent) {}
}

// RangeLookupRequest is a request to proxy a RangeLookup through a Tenant
// service. Its fields correspond to a subset of the args of kv.RangeLookup.
message RangeLookupRequest {
    ...
}

// GossipSubscriptionRequest initiates a game of telephone. It establishes an
// indefinite stream that proxies gossip information overheard by the recipient
// node back to the caller. Gossip information is filtered down to just those
// identified by a key matching any of the specified patterns.
//
// Upon establishment of the stream, all existing information that matches one
// or more of the patterns is returned. After this point, only new information
// matching the patterns is returned.
message GossipSubscriptionRequest {
    ...
}
```

The commit then introduces new `kvtenant.Proxy` object. Proxy mediates
the communication of cluster-wide state to sandboxed SQL-only tenant
processes through a restricted interface. A Proxy is seeded with a set
of one or more network addresses that reference existing KV nodes in the
cluster (or a load-balancer which fans out to some/all KV nodes). On
startup, it establishes contact with one of these nodes to learn about
the topology of the cluster and bootstrap the rest of SQL <-> KV network
communication.

Proxy has two main roles:

First, Proxy is capable of providing information on each of the KV nodes
in the cluster in the form of NodeDescriptors. This obviates the need
for SQL-only tenant processes to join the cluster-wide gossip network.
In doing so, it satisfies the `NodeDescStore` interface and can be used
as an `AddressResolver` with a small adapter.

Second, Proxy is capable of providing Range addressing information in
the form of RangeDescriptors through delegated RangeLookup requests.
This is necessary because SQL-only tenants are restricted from reading
Range Metadata keys directly. Instead, the RangeLookup requests are
proxied through existing KV nodes while being subject to additional
validation (e.g. is the Range being requested owned by the requesting
tenant?). In doing so, it satisfies the `RangeDescriptorDB` interface
and can be used to delegate all DistSender/RangeCache descriptor lookups
to KV nodes.

With this commit, we can mostly run a SQL-only tenant process without
joining the KV cluster's gossip network. This works if I comment out a
few of the uses of gossip due to cockroachdb#49692 and cockroachdb#47150 in SQL. Notably,
with the call to `DeprecatedRegisterSystemConfigChannel` in `sql.Server.Start`
removed, I can remove `Gossip` from `makeSQLServerArgs` entirely and
things "just work".
craig bot pushed a commit that referenced this issue Jul 15, 2020
50520: kv/kvclient: introduce new tenant Proxy r=nvanbenschoten a=nvanbenschoten

Closes #47909.

This commit starts by adding two RPCs to the Internal service:
```proto
service Internal {
    ...
    rpc RangeFeed          (RangeFeedRequest)          returns (stream RangeFeedEvent)          {}
    rpc GossipSubscription (GossipSubscriptionRequest) returns (stream GossipSubscriptionEvent) {}
}

// RangeLookupRequest is a request to proxy a RangeLookup through a Tenant
// service. Its fields correspond to a subset of the args of kv.RangeLookup.
message RangeLookupRequest {
    ...
}

// GossipSubscriptionRequest initiates a game of telephone. It establishes an
// indefinite stream that proxies gossip information overheard by the recipient
// node back to the caller. Gossip information is filtered down to just those
// identified by a key matching any of the specified patterns.
//
// Upon establishment of the stream, all existing information that matches one
// or more of the patterns is returned. After this point, only new information
// matching the patterns is returned.
message GossipSubscriptionRequest {
    ...
}
```

The commit then introduces new `kvtenant.Proxy` object. Proxy mediates the communication of cluster-wide state to sandboxed SQL-only tenant processes through a restricted interface. A Proxy is seeded with a set of one or more network addresses that reference existing KV nodes in the cluster (or a load-balancer which fans out to some/all KV nodes). On startup, it establishes contact with one of these nodes to learn about the topology of the cluster and bootstrap the rest of SQL <-> KV network communication.

Proxy has two main roles:

First, Proxy is capable of providing information on each of the KV nodes in the cluster in the form of NodeDescriptors. This obviates the need for SQL-only tenant processes to join the cluster-wide gossip network. In doing so, it satisfies the `NodeDescStore` interface and can be used as an `AddressResolver` with a small adapter.

Second, Proxy is capable of providing Range addressing information in the form of RangeDescriptors through delegated RangeLookup requests. This is necessary because SQL-only tenants are restricted from reading Range Metadata keys directly. Instead, the RangeLookup requests are proxied through existing KV nodes while being subject to additional validation (e.g. is the Range being requested owned by the requesting tenant?). In doing so, it satisfies the `RangeDescriptorDB` interface and can be used to delegate all DistSender/RangeCache descriptor lookups to KV nodes.

With this commit, we can mostly run a SQL-only tenant process without joining the KV cluster's gossip network. This works if I comment out a few of the uses of gossip due to #49692 and #47150 in SQL. Notably, with the call to `DeprecatedRegisterSystemConfigChannel` in `sql.Server.Start` removed, I can remove `Gossip` from `makeSQLServerArgs` entirely and things "just work".

Co-authored-by: Nathan VanBenschoten <[email protected]>
craig bot pushed a commit that referenced this issue Jul 24, 2020
51836: sql: only hook up database cache to gossip if system tenant r=ajwerner a=ajwerner

We won't have access to gossip on secondary tenants. This database cache should
be gone in the next week but this should unblock progress.

Touches #47150

Release note: None

Co-authored-by: Andrew Werner <[email protected]>
@ajwerner
Copy link
Contributor

ajwerner commented Aug 5, 2020

I'm closing this and will open a different issue for eliminating some of the calls to SetSystemConfigTrigger

@ajwerner ajwerner closed this as completed Aug 5, 2020
@knz knz added the X-anchored-telemetry The issue number is anchored by telemetry references. label Sep 11, 2020
@knz knz reopened this Sep 11, 2020
@knz
Copy link
Contributor

knz commented Sep 11, 2020

@ajwerner please only re-close this after you've updated the remaining reference to this issue in the source code (in lease.go)

(we keep X-anchored-telemetry issues open until the product doesn't refer to them in "live" problems)

craig bot pushed a commit that referenced this issue Sep 14, 2020
54256: sql: make multi-tenancy errors report to telemetry r=asubiotto a=knz

Fixes #48164. 

Issues referenced, I also added the X-anchored-telemetry label to them on github (we keep those issues open on github until their reference in the code is removed):

#54254 
#54255 
#54250 
#54251 
#54252 
#49854 
#48274
#47150
#47971
#47970
#47900 
#47925 


Co-authored-by: Raphael 'kena' Poss <[email protected]>
@ajwerner ajwerner removed the X-anchored-telemetry The issue number is anchored by telemetry references. label Sep 22, 2021
@ajwerner
Copy link
Contributor

The errors have been removed. I'm closing this in favor of the newer issue #70560 which covers the more general problem of not gossiping the system config span.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-multitenancy Related to multi-tenancy
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants