Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

kvserver: assertion leaks batch request to sanitized logs #62765

Closed
andreimatei opened this issue Mar 29, 2021 · 1 comment
Closed

kvserver: assertion leaks batch request to sanitized logs #62765

andreimatei opened this issue Mar 29, 2021 · 1 comment
Assignees
Labels
A-kv-server Relating to the KV-level RPC server C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. GA-blocker

Comments

@andreimatei
Copy link
Contributor

This new assertion contains the error twice, and one of the copies some inadvertently ends up being considered "safe".

return nil, makeNonDeterministicFailure("writing at %s below closed ts: %s (%s)",

F210326 12:02:19.574549 195 kv/kvserver/store_raft.go:524 ⋮ [n9,s9,r9/8:‹×›] 619  writing at 1616760138.160904109,1 below closed ts: 1616760138.508101756,0 (ConditionalPut [/Table/13/1/2021-03-26T12:02:14.844713Z/644512934977306627/0,/Min), ConditionalPut [/Table/13/1/2021-03-26T12:02:14.844713Z/644512934977306627/2/1,/Min), ConditionalPut [/Table/13/1/2021-03-26T12:02:14.844713Z/644512934977306627/3/1,/Min), ConditionalPut [/Table/13/1/2021-03-26T12:02:14.844713Z/644512934977306627/4/1,/Min), ConditionalPut [/Table/13/1/2021-03-26T12:02:14.844713Z/644512934977306627/6/1,/Min), [txn: 36f6bd89]): ‹×›: writing at 1616760138.160904109,1 below closed ts: ‹×› (‹ConditionalPut [/Table/13/1/2021-03-26T12:02:14.844713Z/644512934977306627/0,/Min), ConditionalPut [/Table/13/1/2021-03-26T12:02:14.844713Z/644512934977306627/2/1,/Min), ConditionalPut [/Table/13/1/2021-03-26T12:02:14.844713Z/644512934977306627/3/1,/Min), ConditionalPut [/Table/13/1/2021-03-26T12:02:14.844713Z/644512934977306627/4/1,/Min), ConditionalPut [/Table/13/1/2021-03-26T12:02
teamcity-2816384-1616738656-21-n10cpu4-0009> F210326 12:02:19.574549 195 kv/kvserver/store_raft.go:524 ⋮ [n9,s9,r9/8:‹×›] 619 |:14.844713Z/644512934977306627/6/1,/Min), [txn: 36f6bd89]›)
@andreimatei andreimatei added C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. A-kv-server Relating to the KV-level RPC server GA-blocker labels Mar 29, 2021
@andreimatei andreimatei self-assigned this Mar 29, 2021
@blathers-crl
Copy link

blathers-crl bot commented Mar 29, 2021

Hi @andreimatei, please add branch-* labels to identify which branch(es) this release-blocker affects.

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan.

andreimatei added a commit to andreimatei/cockroach that referenced this issue Apr 5, 2021
In cockroachdb#62655 we see that there appears to be something wrong with the Raft
closed timestamp. That issue shows an assertion failure about a command
trying to write below a timestamp that was promised to have been closed
by a previous command. This patch includes a little bit more info in
that assertion (the current lease) and adds another two assertions:
- an assertion that the closed timestamp itself does not regress. This
assertion already existed in stageTrivialReplicatedEvalResult(), but
that comes after the failure we've seen. The point of copying this
assertion up is to ascertain whether we're screwing up by regressing the
closed timestamp, or whether a particular request/command is at fault
for not obeying the closed timestamp.
- an assertion against closed ts regression when copying the replica
state from a staging batch to the replica.

All these assertions can be disabled with an env var if things go bad.

Fixes cockroachdb#62765

Release note: None
andreimatei added a commit to andreimatei/cockroach that referenced this issue Apr 12, 2021
In cockroachdb#62655 we see that there appears to be something wrong with the Raft
closed timestamp. That issue shows an assertion failure about a command
trying to write below a timestamp that was promised to have been closed
by a previous command. This patch includes a little bit more info in
that assertion (the current lease) and adds another two assertions:
- an assertion that the closed timestamp itself does not regress. This
assertion already existed in stageTrivialReplicatedEvalResult(), but
that comes after the failure we've seen. The point of copying this
assertion up is to ascertain whether we're screwing up by regressing the
closed timestamp, or whether a particular request/command is at fault
for not obeying the closed timestamp.
- an assertion against closed ts regression when copying the replica
state from a staging batch to the replica.

All these assertions can be disabled with an env var if things go bad.

Fixes cockroachdb#62765

Release note: None
andreimatei added a commit to andreimatei/cockroach that referenced this issue Apr 19, 2021
In cockroachdb#62655 we see that there appears to be something wrong with the Raft
closed timestamp. That issue shows an assertion failure about a command
trying to write below a timestamp that was promised to have been closed
by a previous command. This patch includes a little bit more info in
that assertion (the current lease) and adds another two assertions:
- an assertion that the closed timestamp itself does not regress. This
assertion already existed in stageTrivialReplicatedEvalResult(), but
that comes after the failure we've seen. The point of copying this
assertion up is to ascertain whether we're screwing up by regressing the
closed timestamp, or whether a particular request/command is at fault
for not obeying the closed timestamp.
- an assertion against closed ts regression when copying the replica
state from a staging batch to the replica.

All these assertions can be disabled with an env var if things go bad.

Fixes cockroachdb#62765

Release note: None
andreimatei added a commit to andreimatei/cockroach that referenced this issue Apr 20, 2021
In cockroachdb#62655 we see that there appears to be something wrong with the Raft
closed timestamp. That issue shows an assertion failure about a command
trying to write below a timestamp that was promised to have been closed
by a previous command. This patch includes a little bit more info in
that assertion (the current lease) and adds another two assertions:
- an assertion that the closed timestamp itself does not regress. This
assertion already existed in stageTrivialReplicatedEvalResult(), but
that comes after the failure we've seen. The point of copying this
assertion up is to ascertain whether we're screwing up by regressing the
closed timestamp, or whether a particular request/command is at fault
for not obeying the closed timestamp.
- an assertion against closed ts regression when copying the replica
state from a staging batch to the replica.

All these assertions can be disabled with an env var if things go bad.

Fixes cockroachdb#62765

Release note: None
andreimatei added a commit to andreimatei/cockroach that referenced this issue Apr 21, 2021
In cockroachdb#62655 we see that there appears to be something wrong with the Raft
closed timestamp. That issue shows an assertion failure about a command
trying to write below a timestamp that was promised to have been closed
by a previous command. This patch includes a little bit more info in
that assertion (the current lease) and adds another two assertions:
- an assertion that the closed timestamp itself does not regress. This
assertion already existed in stageTrivialReplicatedEvalResult(), but
that comes after the failure we've seen. The point of copying this
assertion up is to ascertain whether we're screwing up by regressing the
closed timestamp, or whether a particular request/command is at fault
for not obeying the closed timestamp.
- an assertion against closed ts regression when copying the replica
state from a staging batch to the replica.

All these assertions can be disabled with an env var if things go bad.

Fixes cockroachdb#62765

Release note: None
@craig craig bot closed this as completed in a175614 Apr 22, 2021
andreimatei added a commit to andreimatei/cockroach that referenced this issue Apr 22, 2021
In cockroachdb#62655 we see that there appears to be something wrong with the Raft
closed timestamp. That issue shows an assertion failure about a command
trying to write below a timestamp that was promised to have been closed
by a previous command. This patch includes a little bit more info in
that assertion (the current lease) and adds another two assertions:
- an assertion that the closed timestamp itself does not regress. This
assertion already existed in stageTrivialReplicatedEvalResult(), but
that comes after the failure we've seen. The point of copying this
assertion up is to ascertain whether we're screwing up by regressing the
closed timestamp, or whether a particular request/command is at fault
for not obeying the closed timestamp.
- an assertion against closed ts regression when copying the replica
state from a staging batch to the replica.

All these assertions can be disabled with an env var if things go bad.

Fixes cockroachdb#62765

Release note: None
andreimatei added a commit to andreimatei/cockroach that referenced this issue Apr 22, 2021
In cockroachdb#62655 we see that there appears to be something wrong with the Raft
closed timestamp. That issue shows an assertion failure about a command
trying to write below a timestamp that was promised to have been closed
by a previous command. This patch includes a little bit more info in
that assertion (the current lease) and adds another two assertions:
- an assertion that the closed timestamp itself does not regress. This
assertion already existed in stageTrivialReplicatedEvalResult(), but
that comes after the failure we've seen. The point of copying this
assertion up is to ascertain whether we're screwing up by regressing the
closed timestamp, or whether a particular request/command is at fault
for not obeying the closed timestamp.
- an assertion against closed ts regression when copying the replica
state from a staging batch to the replica.

All these assertions can be disabled with an env var if things go bad.

Fixes cockroachdb#62765

Release note: None
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-kv-server Relating to the KV-level RPC server C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. GA-blocker
Projects
None yet
Development

No branches or pull requests

1 participant