sql: upgrade multi-tenant clusters #48436

Closed
tbg opened this issue May 5, 2020 · 16 comments
Labels
A-cc-enablement Pertains to current CC production issues or short-term projects A-multitenancy Related to multi-tenancy A-server-start-drain Pertains to server startup and shutdown sequences C-investigation Further steps needed to qualify. C-label will change. T-server-and-security DB Server & Security X-server-triaged-202105

Comments

@tbg
Member

tbg commented May 5, 2020

We need to think about how multi-tenant clusters are upgraded across cluster versions. The minimal plan is that tenants will reach out to the KV layer at bootstrap time to learn the active cluster version.

However, with our existing tools, naively the resulting process to upgrade a multi-tenant cluster is as follows:

  1. roll the KV layer into the new binary (do not finalize)
  2. roll the SQL processes into new binary
  3. finalize update
  4. roll SQL processes again

We need 4) because the tenants would not learn about the version bump (they're not hooked up to gossip). This is awkward. Instead, the SQL tenants can be instructed to re-pull the cluster version, replacing 4 by "run a command against all pods".
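
For illustration only, the "re-pull" step could look roughly like the sketch below; the refreshClusterVersion name, the GetClusterVersion call into KV, and the settings handle's SetActiveVersion method are assumptions, not an existing API.

// Hypothetical handler behind a command run against each SQL pod: ask the
// KV layer for its currently active cluster version and adopt it, instead
// of restarting the SQL process.
func (s *SQLServer) refreshClusterVersion(ctx context.Context) error {
	cv, err := s.tenantConnect.GetClusterVersion(ctx) // assumed RPC into KV
	if err != nil {
		return err
	}
	return s.settings.Version.SetActiveVersion(ctx, cv)
}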

Of note: since the KV layer never reaches out to tenants (a design principle), the safety of finalizing an upgrade relies on the operator, i.e., the operator ensures that tenants are running a compatible binary. This is fine for now, since we are the operator, and it can be improved later.

There's a connection to #39182 in that the KV layer is also expected to gain a "Connect" RPC. However, the more I look at it, the weaker the connection gets, since long-running migrations are essentially a KV concept. In fact, multi-tenancy provides evidence that we should think twice about coupling long-running migrations too closely to SQL tenants. This is true even though we always run a system-tenant SQL instance within each KV node - that is due to existing constraints, and not a pit we necessarily want to dig deeper into.
cc @irfansharif

Not putting the blocker label since we can add this into the new binary before the first actual upgrade.

Epic: CRDB-8505

@tbg tbg added the A-multitenancy Related to multi-tenancy label May 5, 2020

@tbg
Member Author

tbg commented Nov 24, 2020

Here's what I think we should do to close this issue:

  1. start tenants at their binary's minimum supported version (at the time of writing, the code uses the binary version):

    // TODO(tbg): this has to be passed in. See the upgrade strategy in:
    // https://github.com/cockroachdb/cockroach/issues/47919
    if err := clusterversion.Initialize(ctx, st.Version.BinaryVersion(), &st.SV); err != nil {
        return err
    }

  2. give the tenant's gossip connector access to the KV layer's active cluster version:

    func (c *Connector) runGossipSubscription(ctx context.Context) {
        for ctx.Err() == nil {
            client, err := c.getClient(ctx)
            if err != nil {
                continue
            }
            stream, err := client.GossipSubscription(ctx, &roachpb.GossipSubscriptionRequest{
                Patterns: gossipSubsPatterns,
            })
            if err != nil {
                log.Warningf(ctx, "error issuing GossipSubscription RPC: %v", err)
                c.tryForgetClient(ctx, client)
                continue
            }
            for {
                e, err := stream.Recv()
                // ...

    and hook that up to its version setting (rough sketch below).
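
A rough sketch of both pieces, assuming the settings handle exposes BinaryMinSupportedVersion and SetActiveVersion and the connector holds a reference to the tenant's settings (the updateClusterVersion name is made up):

// (1) In the tenant server's start-up path, initialize at the minimum
// supported version rather than the binary version, so that a new SQL
// binary can still join an un-finalized KV cluster.
if err := clusterversion.Initialize(
	ctx, st.Version.BinaryMinSupportedVersion(), &st.SV,
); err != nil {
	return err
}

// (2) When the gossip subscription above delivers an update for the
// cluster version key, ratchet the tenant's active version forward.
func (c *Connector) updateClusterVersion(ctx context.Context, cv clusterversion.ClusterVersion) {
	if err := c.settings.Version.SetActiveVersion(ctx, cv); err != nil {
		log.Warningf(ctx, "error setting active cluster version: %v", err)
	}
}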

Then, the upgrade process works as follows:

  1. roll the SQL processes into the new binary (they'll keep operating at min supported version)
  2. roll the KV layer into the new binary

Now the KV layer should auto-bump the cluster version (a manual strategy can also be carried out, as usual) and when that has happened, the new version will be picked up by the SQL pods asynchronously.

From the point of view of long-running migrations, SQL pods don't exist. We'll simply have to be careful not to write any long-running migrations that try to perform work on data that is on SQL pods. This shouldn't be much of an issue until they become stateful.

@tbg
Member Author

tbg commented Nov 25, 2020

To clarify (since this came up in a DM), the story for patch upgrades is much simpler, as cluster versions can't change in patch upgrades (at least today they can't). We can do pretty much “whatever we like”, but for consistency with a major-version upgrade I would say that rolling the SQL pods first and then the KV layer is preferred.

Another aspect of multi-tenancy and versions that we have not discussed yet and which adds a fair bit of complexity is around our (eventual) desire to let tenants decide when to upgrade their SQL pod's binary.

This means that we need to be able to fully upgrade the KV cluster while retaining compatibility with the previous binary. This is at odds with the current semantics of cluster versions, as "fully" activating, say, v21.2 implies a promise that no participants in the cluster are still running an older binary.

With delayed pod binary upgrades, we need to acknowledge that the SQL/KV split affects the version compatibility story. We essentially need the KV layer at binary (and version) X to remain fully compatible with a SQL layer at the previous version (and, perhaps, at some point, multiple previous versions).

I think this is mostly true today; typically new functionality that lives in KV does not affect SQL code. One notable area where we're going to have to be extremely careful though is the transaction protocol. This is because the TxnCoordSender and DistSender live on the SQL side. For example, if we introduced new behavior of this kind in the evaluation of an EndTxnRequest:

func evalTxnRequest(stuff *Stuff) (resp EndTxnResponse) {
  // Hypothetically gate new KV-side behavior on a cluster version...
  if version.IsActive(VersionSomeNewThing) {
    // ...and expect the client (TxnCoordSender) to act on this flag.
    resp.MustFizzleTheGlobberOrThingsWillBreak = true
  }
  // Do other things.
  return resp
}

and the TxnCoordSender is supposed to do something when that bool is set, we will have a problem in a multi-tenant cluster in which a tenant is running the binary that does not understand the flag, but the KV layer fully upgrades and starts expecting everyone to know about the flag.
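
To make the hazard concrete, the client-side counterpart (continuing the hypothetical names above) only exists in the new binary; a tenant pod still running the previous binary has no such branch and silently drops the flag:

// New-binary TxnCoordSender: honors the flag set by the KV layer. An
// old-binary tenant never reaches this code, violating the expectation
// implied by the finalized cluster version.
func (tc *TxnCoordSender) handleEndTxn(resp *EndTxnResponse) {
	if resp.MustFizzleTheGlobberOrThingsWillBreak {
		tc.fizzleTheGlobber()
	}
}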

This could be sand in our gears for certain functionality, but I think we would be able to cope. Mostly, I am concerned that without adequate protection, we might end up violating the required guarantees accidentally. I don't see an obvious way to set up safeguards against that (a mixed-version test alone doesn't cut it, though it will certainly be necessary).

@andy-kimball will we already need 21.1 clusters to fully upgrade while coexisting with 20.2 tenant binaries? If so, we need to adhere to the new rules right now.

@alyshanjahani-crl
Collaborator

alyshanjahani-crl commented Nov 26, 2020

Now the KV layer should auto-bump the cluster version (a manual strategy can also be carried out, as usual) and when that has happened, the new version will be picked up by the SQL pods asynchronously.

Given that this is asynchronous, what guarantee do we have that a SQL pod (that hasn't picked up the new cluster version yet) won't handle a connection that could have compatibility issues with the upgraded (and finalized) KV pods?

@alyshanjahani-crl
Collaborator

alyshanjahani-crl commented Nov 26, 2020

Trying to confirm my understanding here: how does a SQL pod determine its active cluster version?

This snippet of the current implementation that you shared above suggests it uses the binary version.

if err := clusterversion.Initialize(ctx, st.Version.BinaryVersion(), &st.SV); err != nil {
	return err
}

However, the initial approach outlined in the issue description suggests that the SQL processes need to be restarted after KV finalization so that they can learn about the cluster version bump, implying that they get the active cluster version from the KV pods.

  1. roll the KV layer into the new binary (do not finalize)
  2. roll the SQL processes into new binary
  3. finalize update
  4. roll SQL processes again

@andy-kimball
Contributor

@andy-kimball will we already need 21.1 clusters to fully upgrade while coexisting with 20.2 tenant binaries? If so, we need to adhere to the new rules right now.

I think we should start adhering as soon as we can, if only to get in the habit and begin to discover the issues. But I don't think it needs to be a hard requirement for 21.1. At least begin socializing this requirement with the KV team and get people thinking about it.

@tbg
Member Author

tbg commented Dec 2, 2020

We discussed this in the KV weekly yesterday, so we're all aware. We also discussed some strategies to avoid breakage here, which I'm jotting down for posterity (without judging their merit or sorting them in any way):

  • disallow IsActive in batcheval package
  • attach a SQL pod version at the KV boundary (i.e. at the Internal or Tenant service) and require that this is what's used for feature gating on all affected request paths
  • put a marker in the context for requests coming in through the tenant boundary; when the marker is observed by IsActive, it fatals (i.e. this is a sort of runtime lint; see the sketch below)
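
A minimal sketch of that last idea, with made-up helper names (ContextWithTenantBoundaryMarker, checkNotTenantBoundary); the real hooks would live where tenant requests enter the KV layer and inside IsActive:

// Hypothetical marker attached to the context when a request crosses the
// tenant boundary into KV.
type tenantBoundaryKey struct{}

func ContextWithTenantBoundaryMarker(ctx context.Context) context.Context {
	return context.WithValue(ctx, tenantBoundaryKey{}, true)
}

// Hypothetical check called from IsActive: fatal if a version gate is
// consulted on a tenant-originating request path (a sort of runtime lint).
func checkNotTenantBoundary(ctx context.Context) {
	if marked, _ := ctx.Value(tenantBoundaryKey{}).(bool); marked {
		log.Fatalf(ctx, "version gate consulted on a tenant-originating request path")
	}
}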

We don't expect there to be anything in 21.1 that doesn't allow 20.2 SQL pods. Non-blocking txns are possibly the most likely candidate to get in the way here, but we'll sort it out.

@tbg
Member Author

tbg commented Dec 15, 2020

Just to tie the threads together for those who work at CRL, there was a discussion here about long-running migrations on the SQL side. Serverless forces us into this conversation early. I'll report on progress here (and probably spin out an issue at some point).

@tbg
Member Author

tbg commented Dec 22, 2020

Suggestion: we make set cluster setting version = x work on tenants as well. This mostly means:

a) we have to store the tenant cluster version somewhere in KV (easy enough to add)

b) hook up set cluster setting version = x to function on tenants, and have it call into the migration manager

	db *kv.DB
}

// Helper captures all the primitives required to fully specify a migration.
type Helper struct {
	*Manager
}
just as it does on the system tenant

c) only run the subset of migrations that is tagged as "sql migrations"; these can now run as the tenant. The migration manager will skip anything that's not tagged as such (i.e. most versions in the bump).

This seems to enable everything we want: long-running SQL migrations are written just like any other migration (with the exception of being tagged as sql migrations), and they will automatically be invoked on tenants when the tenant bumps its version.
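
For illustration, the tagging could look roughly like this; the TenantSQL field and shouldRun helper are made up, not the actual migration package API:

// Hypothetical migration descriptor with a tag marking it as a "sql
// migration", i.e. one that also runs on secondary tenants when they bump
// their version; untagged migrations are skipped there.
type Migration struct {
	Version   clusterversion.ClusterVersion
	TenantSQL bool
	Run       func(ctx context.Context, h *Helper) error
}

func (m *Manager) shouldRun(mig Migration, isSystemTenant bool) bool {
	// The system tenant runs every migration; secondary tenants only run
	// those tagged as sql migrations.
	return isSystemTenant || mig.TenantSQL
}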

We should be able to set up acceptance/mixed-versions (or a variant thereof) to also exercise the upgrade pattern discussed in this thread, where the KV layer fully finalizes first and tenants are upgraded at a later point.

@tbg
Member Author

tbg commented Feb 19, 2021

@lunevalex as far as I can tell, this issue is urgent and is officially on nobody's plate w.r.t. planning. @ajwerner is doing something about it right now and I'm helping as much as I can, but I don't think we can rely on sql-schema, of all places, to own this, and we can't expect the work to just stop here, as this raises bigger questions about our versioning story. Which area/PM owns this and is responsible for getting it staffed?

@andy-kimball
Contributor

andy-kimball commented Feb 19, 2021

This seems to fall under the “Server” area, which has traditionally fallen under KV ownership, but recently had a skeleton team spun up under Raphael’s tech leadership. However, as we’re all aware, we don’t have sufficient staffing in that team yet, so we’ll have to cast about for pinch hitters to step in and help out, like @ajwerner. I consider myself responsible for making sure we find the resources needed to do any work required for Serverless, but I first need to know it’s needed. I had mistakenly thought this issue was about supporting older versions of SQL pods with newer versions of KV, which is not something we urgently need.

Setting aside the mixed version work, which we don’t yet need, isn’t Andrew merging the DB work needed to support major version upgrade of multi-tenant clusters next week? Is there more DB work needed beyond that to support plain vanilla upgrades of free-tier host clusters from 20.2 to 21.1?

@tbg
Member Author

tbg commented Feb 19, 2021

Thanks Andy. Andrew is working on #60730, which I'll help him with. I think we will be in OK shape after that PR, given that, as you indicate, we are willing to force all SQL pods onto the new version as part of the version upgrade. I will then assign this issue to the server backlog, with the expectation that no more work will be requested for the upcoming release cycle.

@andy-kimball
Contributor

I actually would want more work in the upcoming 21.2 cycle, in order to support mixed KV / SQL version scenarios. But that’s only because of a difficult planning conundrum: I don’t need mixed version support until maybe Q1 of 2022. But I have to ask for that now, in order to feed into the 21.2 planning process, for delivery in November, so that I have it for Q1 2022. And then it may actually turn out that I don’t need it until Q2, or even at all if it turns out there’s light customer demand. Trying to anticipate my needs a year in advance for fast-changing product requirements is nigh impossible, so I’m forced into being defensive in my asks, and risk asking for things I won’t actually turn out to need.

@lunevalex
Collaborator

@tbg thanks for flagging this. The PM that covers upgrades is @johnrk. @johnrk, can you please roll this up into the 21.2 problems to solve for operator experience, based on the comments above?

@johnrk-zz

@lunevalex, yes, I'll definitely incorporate this into our Problems to Solve for Deployments & Operations. Thanks!

@dt dt removed the GA-blocker label May 3, 2021
@knz knz added the C-investigation Further steps needed to qualify. C-label will change. label May 28, 2021
@jlinder jlinder added T-kv KV Team T-server-and-security DB Server & Security labels Jun 16, 2021
@exalate-issue-sync exalate-issue-sync bot removed the T-kv KV Team label Jul 13, 2021
@knz knz added A-server-start-drain Pertains to server startup and shutdown sequences A-cc-enablement Pertains to current CC production issues or short-term projects labels Jul 29, 2021
@ajwerner
Contributor

Closed by #60730.
