setting ‹"cluster.preserve_downgrade_option"› to ‹"20.2"› failed: cannot set cluster.preserve_downgrade_option to ‹20.2› (cluster version is 21.1) #68335

Closed
nick-jones opened this issue Aug 2, 2021 · 9 comments
Labels
A-configurability Pertains to cluster settings, CLI flags, env vars etc C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. O-community Originated from the community T-multitenant Issues owned by the multi-tenant virtual team X-blathers-triaged blathers was able to find an owner

Comments

@nick-jones
nick-jones commented Aug 2, 2021

Describe the problem

After upgrading our clusters from 20.2 to 21.1, every node now appears to try to set cluster.preserve_downgrade_option on startup, and fails with the warning below.

+ exec /cockroach/cockroach start --logtostderr=WARNING --certs-dir /cockroach/cockroach-certs --advertise-addr cockroachdb-0.cockroachdb.***.svc.cluster.local --http-addr 0.0.0.0 --join cockroachdb-0.cockroachdb,cockroachdb-1.cockroachdb,cockroachdb-2.cockroachdb --cache 25% --max-sql-memory 25%
Flag --logtostderr has been deprecated, use --log instead to specify 'sinks: {stderr: {filter: ...}}'.
W210801 11:35:17.090732 66 server/settingsworker.go:48 ⋮ [n?] 1  setting ‹"cluster.preserve_downgrade_option"› to ‹"20.2"› failed: cannot set cluster.preserve_downgrade_option to ‹20.2› (cluster version is 21.1)
W210801 11:35:17.090732 66 server/settingsworker.go:48 ⋮ [n?] 1 +(1) attached stack trace
W210801 11:35:17.090732 66 server/settingsworker.go:48 ⋮ [n?] 1 +  -- stack trace:
W210801 11:35:17.090732 66 server/settingsworker.go:48 ⋮ [n?] 1 +  | github.com/cockroachdb/cockroach/pkg/clusterversion.registerPreserveDowngradeVersionSetting.func1
W210801 11:35:17.090732 66 server/settingsworker.go:48 ⋮ [n?] 1 +  |    /go/src/github.com/cockroachdb/cockroach/pkg/clusterversion/setting.go:259
W210801 11:35:17.090732 66 server/settingsworker.go:48 ⋮ [n?] 1 +  | github.com/cockroachdb/cockroach/pkg/settings.(*StringSetting).Validate
W210801 11:35:17.090732 66 server/settingsworker.go:48 ⋮ [n?] 1 +  |    /go/src/github.com/cockroachdb/cockroach/pkg/settings/string.go:65
W210801 11:35:17.090732 66 server/settingsworker.go:48 ⋮ [n?] 1 +  | github.com/cockroachdb/cockroach/pkg/settings.(*StringSetting).set
W210801 11:35:17.090732 66 server/settingsworker.go:48 ⋮ [n?] 1 +  |    /go/src/github.com/cockroachdb/cockroach/pkg/settings/string.go:79
W210801 11:35:17.090732 66 server/settingsworker.go:48 ⋮ [n?] 1 +  | github.com/cockroachdb/cockroach/pkg/settings.updater.Set
W210801 11:35:17.090732 66 server/settingsworker.go:48 ⋮ [n?] 1 +  |    /go/src/github.com/cockroachdb/cockroach/pkg/settings/updater.go:92
W210801 11:35:17.090732 66 server/settingsworker.go:48 ⋮ [n?] 1 +  | github.com/cockroachdb/cockroach/pkg/server.processSystemConfigKVs.func1
W210801 11:35:17.090732 66 server/settingsworker.go:48 ⋮ [n?] 1 +  |    /go/src/github.com/cockroachdb/cockroach/pkg/server/settingsworker.go:47
W210801 11:35:17.090732 66 server/settingsworker.go:48 ⋮ [n?] 1 +  | github.com/cockroachdb/cockroach/pkg/server.processSystemConfigKVs
W210801 11:35:17.090732 66 server/settingsworker.go:48 ⋮ [n?] 1 +  |    /go/src/github.com/cockroachdb/cockroach/pkg/server/settingsworker.go:53
W210801 11:35:17.090732 66 server/settingsworker.go:48 ⋮ [n?] 1 +  | github.com/cockroachdb/cockroach/pkg/server.(*Server).refreshSettings
W210801 11:35:17.090732 66 server/settingsworker.go:48 ⋮ [n?] 1 +  |    /go/src/github.com/cockroachdb/cockroach/pkg/server/settingsworker.go:69
W210801 11:35:17.090732 66 server/settingsworker.go:48 ⋮ [n?] 1 +  | github.com/cockroachdb/cockroach/pkg/server.(*Server).PreStart
W210801 11:35:17.090732 66 server/settingsworker.go:48 ⋮ [n?] 1 +  |    /go/src/github.com/cockroachdb/cockroach/pkg/server/server.go:1494
W210801 11:35:17.090732 66 server/settingsworker.go:48 ⋮ [n?] 1 +  | github.com/cockroachdb/cockroach/pkg/cli.runStart.func4.2
W210801 11:35:17.090732 66 server/settingsworker.go:48 ⋮ [n?] 1 +  |    /go/src/github.com/cockroachdb/cockroach/pkg/cli/start.go:587
W210801 11:35:17.090732 66 server/settingsworker.go:48 ⋮ [n?] 1 +  | github.com/cockroachdb/cockroach/pkg/cli.runStart.func4
W210801 11:35:17.090732 66 server/settingsworker.go:48 ⋮ [n?] 1 +  |    /go/src/github.com/cockroachdb/cockroach/pkg/cli/start.go:710
W210801 11:35:17.090732 66 server/settingsworker.go:48 ⋮ [n?] 1 +  | runtime.goexit
W210801 11:35:17.090732 66 server/settingsworker.go:48 ⋮ [n?] 1 +  |    /usr/local/go/src/runtime/asm_amd64.s:1374
W210801 11:35:17.090732 66 server/settingsworker.go:48 ⋮ [n?] 1 +Wraps: (2) cannot set cluster.preserve_downgrade_option to ‹20.2› (cluster version is 21.1)
W210801 11:35:17.090732 66 server/settingsworker.go:48 ⋮ [n?] 1 +Error types: (1) *withstack.withStack (2) *errutil.leafError
W210801 11:35:17.095099 66 2@gossip/gossip.go:1491 ⋮ [n?] 2  no incoming or outgoing connections
CockroachDB node starting at 2021-08-01 11:35:22.216472524 +0000 UTC (took 5.4s)
build:               CCL v21.1.6 @ 2021/07/20 15:30:39 (go1.15.11)
webui:               https://0.0.0.0:8080
sql:                 postgresql://root@cockroachdb-0.cockroachdb.***.svc.cluster.local:26257?sslmode=verify-full&sslrootcert=%2Fcockroach%2Fcockroach-certs%2Fca.crt
RPC client flags:    /cockroach/cockroach <client cmd> --host=cockroachdb-0.cockroachdb.***.svc.cluster.local:26257 --certs-dir=/cockroach/cockroach-certs
logs:                /cockroach/cockroach-data/logs
temp dir:            /cockroach/cockroach-data/cockroach-temp515401059
external I/O path:   /cockroach/cockroach-data/extern
store[0]:            path=/cockroach/cockroach-data
storage engine:      pebble
status:              restarted pre-existing node
clusterID:           88707280-6f4a-4158-8dda-0ffcf42d40ba
nodeID:              1

To Reproduce

We upgraded our CockroachDB clusters, following these steps for each cluster:

  • SET CLUSTER SETTING cluster.preserve_downgrade_option = '20.2'; executed prior to upgrade
  • Upgraded all nodes in cluster to v21.1
  • Waited 24h+
  • RESET CLUSTER SETTING cluster.preserve_downgrade_option; executed

Expected behavior

Once the setting has been reset, a node should not try to re-apply cluster.preserve_downgrade_option on startup. Instead, each node attempts to set it for no apparent reason.

Additional data / screenshots

Environment:

  • CockroachDB version: v21.1.6
  • Server OS: Linux

Additional context

Jira issue: CRDB-8992
Epic: CRDB-6671

@nick-jones nick-jones added the C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. label Aug 2, 2021
@blathers-crl
blathers-crl bot commented Aug 2, 2021

Hello, I am Blathers. I am here to help you get the issue triaged.

Hoot - a bug! Though bugs are the bane of my existence, rest assured the wretched thing will get the best of care here.

I have CC'd a few people who may be able to assist you:

  • @cockroachdb/storage (found keywords: pebble)

If we have not gotten back to your issue within a few business days, you can try the following:

  • Join our community slack channel and ask on #cockroachdb.
  • Try to find someone from here if you know they worked closely on the area and CC them.

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan.

@blathers-crl blathers-crl bot added O-community Originated from the community X-blathers-triaged blathers was able to find an owner labels Aug 2, 2021
@darinpp
Contributor

darinpp commented Aug 2, 2021

@tbg Can you take a look at this? Seems like it tries to apply cached settings when they aren't applicable anymore.

@Nican
Nican commented Aug 14, 2021

@tbg I am seeing a similar warning and am wondering what it means.
I upgraded from 20.2 to 21.1 some time ago, and recently upgraded from 21.1.6 to 21.1.7.

W210814 20:03:19.815560 28 server/settingsworker.go:48 ⋮ [n?] 28  setting ‹"cluster.preserve_downgrade_option"› to ‹"20.2"› failed: cannot set cluster.preserve_downgrade_option to ‹20.2› (cluster version is 21.1)

What does it mean?

@ajwerner
Contributor

This warning is not a big deal. The stack trace makes it look scary. It means that these nodes think that the preserve downgrade option should be set because that's what they had on disk but they've learned that the version is already newer than that. I believe this should only happen once. We could special case away this warning, but I don't think it's a major thing.

This would happen if you upgrade manually but don't clear the preserve downgrade option setting. You can clear it with SET CLUSTER SETTING cluster.preserve_downgrade_option = DEFAULT.
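The rejection in the warning can be illustrated with a minimal Go sketch. It assumes the setting's validation accepts an empty (default) value and otherwise requires the requested value to equal the active cluster version. The function name validatePreserveDowngrade and the plain-string version comparison are hypothetical simplifications for illustration, not CockroachDB's actual code; only the error text is taken from the log above.

```go
package main

import "fmt"

// validatePreserveDowngrade is a hypothetical, simplified stand-in for the
// setting's validation hook. An empty value (the default) is always accepted;
// any other value must equal the currently active cluster version.
func validatePreserveDowngrade(requested, active string) error {
	if requested == "" {
		return nil
	}
	if requested != active {
		return fmt.Errorf(
			"cannot set cluster.preserve_downgrade_option to %s (cluster version is %s)",
			requested, active)
	}
	return nil
}

func main() {
	// A stale "20.2" read from disk is rejected once the cluster is at 21.1.
	fmt.Println(validatePreserveDowngrade("20.2", "21.1"))
	// The reset (empty) value passes validation.
	fmt.Println(validatePreserveDowngrade("", "21.1"))
}
```

Under these assumptions, a node that still holds "20.2" locally fails validation on every start until that local value is gone, while the cluster-wide setting itself stays empty.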

@ajwerner
Contributor

I think what I'd say is that we should refuse to upgrade if you've got the preserve_downgrade_option set.

@nick-jones
Author

nick-jones commented Sep 22, 2021

@ajwerner

> This warning is not a big deal. The stack trace makes it look scary. It means that these nodes think that the preserve downgrade option should be set because that's what they had on disk but they've learned that the version is already newer than that. I believe this should only happen once. We could special case away this warning, but I don't think it's a major thing.
>
> This would happen if you upgrade manually but don't clear the preserve downgrade option setting. You can clear it with SET CLUSTER SETTING cluster.preserve_downgrade_option = DEFAULT.

In our case the error occurs every single time a node starts. We cleared preserve_downgrade_option weeks ago, albeit via RESET CLUSTER SETTING cluster.preserve_downgrade_option rather than the statement you've specified.

As an example:

sh-4.4# cockroach sql
#
# Welcome to the CockroachDB SQL shell.
# All statements must be terminated by a semicolon.
# To exit, type: \q.
#
# Server version: CockroachDB CCL v21.1.7 (x86_64-unknown-linux-gnu, built 2021/08/09 17:55:28, go1.15.14) (same version as client)
# Cluster ID: 7ce2364a-782e-4cec-992c-c0c4dc40709f
#
# Enter \? for a brief introduction.
#
root@cockroachdb-proxy:26257/defaultdb> SHOW CLUSTER SETTING cluster.preserve_downgrade_option;
  cluster.preserve_downgrade_option
-------------------------------------

(1 row)

Time: 4ms total (execution 3ms / network 1ms)

root@cockroachdb-proxy:26257/defaultdb> SELECT NOW();
               now
---------------------------------
  2021-09-22 12:57:23.193367+00
(1 row)

Time: 2ms total (execution 0ms / network 1ms)

And then restarting a node in this cluster:

$ kubectl --context=prod-aws --namespace=<snip> delete pod cockroachdb-0
$ kubectl --context=prod-aws --namespace=<snip> logs cockroachdb-0 -c cockroachdb | head -n5 
++ hostname -f
+ exec /cockroach/cockroach start --logtostderr=WARNING --certs-dir /cockroach/cockroach-certs --advertise-addr cockroachdb-0.cockroachdb.<snip>.svc.cluster.local --http-addr 0.0.0.0 --join cockroachdb-0.cockroachdb,cockroachdb-1.cockroachdb,cockroachdb-2.cockroachdb --cache 25% --max-sql-memory 25%
Flag --logtostderr has been deprecated, use --log instead to specify 'sinks: {stderr: {filter: ...}}'.
W210922 12:59:07.247931 12 server/settingsworker.go:48 ⋮ [n?] 1  setting ‹"cluster.preserve_downgrade_option"› to ‹"20.2"› failed: cannot set cluster.preserve_downgrade_option to ‹20.2› (cluster version is 21.1)
W210922 12:59:07.247931 12 server/settingsworker.go:48 ⋮ [n?] 1 +(1) attached stack trace

@ajwerner
Contributor

Interesting. Thanks for the report. I found the bug and will file a separate issue. We don't clear unset settings from our on-disk cache 🙁.
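A rough Go sketch of how a cache that never drops unset settings could produce this symptom follows. It assumes that resetting a setting removes it from the snapshot a node receives, and that the node merges each snapshot into a local cache. The type settingsCache and both refresh functions are hypothetical names for illustration; this is neither CockroachDB's actual code nor necessarily how the fix was implemented.

```go
package main

import "fmt"

// settingsCache is a hypothetical stand-in for a node's locally persisted
// copy of cluster settings.
type settingsCache map[string]string

// refreshKeepStale overwrites cached values with those in the snapshot but
// never removes keys missing from it, so a reset setting lingers.
func refreshKeepStale(cache settingsCache, snapshot map[string]string) {
	for k, v := range snapshot {
		cache[k] = v
	}
}

// refreshAndPrune also deletes cached keys that are absent from the
// snapshot, which is one possible way to avoid replaying a reset setting.
func refreshAndPrune(cache settingsCache, snapshot map[string]string) {
	for k := range cache {
		if _, ok := snapshot[k]; !ok {
			delete(cache, k)
		}
	}
	for k, v := range snapshot {
		cache[k] = v
	}
}

func main() {
	const key = "cluster.preserve_downgrade_option"
	snapshot := map[string]string{} // the setting was reset, so it is absent

	stale := settingsCache{key: "20.2"}
	refreshKeepStale(stale, snapshot)
	fmt.Printf("without pruning: %q\n", stale[key])

	pruned := settingsCache{key: "20.2"}
	refreshAndPrune(pruned, snapshot)
	_, present := pruned[key]
	fmt.Println("after pruning, still cached:", present)
}
```

With the first variant the cached "20.2" survives every refresh and would be replayed on each restart, which matches the report that the warning appears every time a node starts.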

@ajwerner
Contributor

Filed #70567.

@knz knz added A-configurability Pertains to cluster settings, CLI flags, env vars etc T-multitenant Issues owned by the multi-tenant virtual team labels Aug 11, 2023
@knz
Contributor

knz commented Oct 3, 2023

This was fixed by #111475.

@knz knz closed this as completed Oct 3, 2023
5 participants