-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
kvserver: RPC dialback does not resolve failover/non-system/blackhole-recv
#99104
Labels
A-kv-server
Relating to the KV-level RPC server
branch-master
Failures and bugs on the master branch.
branch-release-23.1
Used to mark GA and release blockers, technical advisories, and bugs for 23.1
C-bug
Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior.
GA-blocker
T-kv
KV Team
Comments
erikgrinaker
added
C-bug
Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior.
A-kv-server
Relating to the KV-level RPC server
T-kv
KV Team
labels
Mar 21, 2023
Hi @nvanbenschoten, please add branch-* labels to identify which branch(es) this release-blocker affects. 🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf. |
andrewbaptist
added a commit
to andrewbaptist/cockroach
that referenced
this issue
Mar 29, 2023
Fixes: cockroachdb#99104 Informs cockroachdb#84289 As part of the previous fix for partition handling, we tracked the state of a previous attempt and use that result on the next attempt. However if there were multiple connections, we may only block system traffic connections and not default class connections. This change addresses that by ensuring a failed dialback attempt is remembered until we are able to successfully connect back to the pinging node. Epic: none Release note: None
erikgrinaker
added
branch-master
Failures and bugs on the master branch.
branch-release-23.1
Used to mark GA and release blockers, technical advisories, and bugs for 23.1
labels
Mar 30, 2023
craig bot
pushed a commit
that referenced
this issue
Mar 30, 2023
99429: ccl/sqlproxyccl: add PROXY protocol support via CLI flag to sqlproxy r=pjtatlow a=jaylim-crl This commits adds a new `require-proxy-protocol` flag to `mt start-proxy`, and that changes the sqlproxy's behavior to support the PROXY protocol. When the flag is set, the protocol will be enforced on the SQL listener, and supported on a best-effort basis on the HTTP listener. If the PROXY protocol isn't used, but is enforced, the connection will be rejected. The rationale behind doing best-effort basis on the HTTP listener is that some healthcheck systems don't support the protocol. This work is needed for the AWS PrivateLink work in CockroachCloud, which requires the use of the PROXY protocol. Release note: None Epic: none Release justification: SQL Proxy change only. Changes are needed for the AWS PrivateLink work in CockroachCloud. 99840: rpc: Fix blackhole recv r=erikgrinaker a=andrewbaptist Fixes: #99104 Informs #84289 As part of the previous fix for partition handling, we tracked the state of a previous attempt and use that result on the next attempt. However if there were multiple connections, we may only block system traffic connections and not default class connections. This change addresses that by ensuring a failed dialback attempt is remembered until we are able to successfully connect back to the pinging node. Epic: none Release note: None Co-authored-by: Jay <[email protected]> Co-authored-by: Andrew Baptist <[email protected]>
andrewbaptist
added a commit
to andrewbaptist/cockroach
that referenced
this issue
Mar 31, 2023
As part of the fix of cockroachdb#99104, a cast without a nil check was introduced. This PR addresses that by only casting if it is known to be not nil. Epic: none Fixes: cockroachdb#100275 Release note: None
andrewbaptist
added a commit
to andrewbaptist/cockroach
that referenced
this issue
Mar 31, 2023
As part of the fix of cockroachdb#99104, a cast without a nil check was introduced. This PR addresses that by only casting if it is known to be not nil. Epic: none Fixes: cockroachdb#100275 Release note: None
andrewbaptist
added a commit
that referenced
this issue
Mar 31, 2023
Fixes: #99104 Informs #84289 As part of the previous fix for partition handling, we tracked the state of a previous attempt and use that result on the next attempt. However if there were multiple connections, we may only block system traffic connections and not default class connections. This change addresses that by ensuring a failed dialback attempt is remembered until we are able to successfully connect back to the pinging node. Epic: none Release note: None
craig bot
pushed a commit
that referenced
this issue
Mar 31, 2023
99312: sqlsmith: add DEFAULT expressions to newly added columns r=mgartner a=mgartner Sqlsmith now builds `ALTER TABLE .. ADD COLUMN .. DEFAULT` statements with default expressions that have different types than the column type. This is allowed if the default expression type can be assignment-casted to the column's type. Fixes #98133 Release note: None 99348: testutils: move default test tenant message r=rharding6373 a=herkolategan In order to reduce logging noise but still inform test authors of the default test tenant, the message has been moved to where there is a `testing.TB` interface. Epic: CRDB-18499 99835: opt/execbuilder: add panic catching to buildRoutinePlanGenerator r=mgartner a=mgartner This commit adds a panic catcher to callback functions created in execbuilder and invoked during evaluation of UDFs and correlated subqueries. It matches the panic catcher logic in `buildApplyJoin`. Fixes #98786 Release note: None 100267: roachtest: own autoupgrade to TestEng r=renatolabs a=tbg Discussed in #99479. Epic: none Release note: None 100286: roachtest: prevent aws roachtest panic r=rail a=msbutler After #99723 merged as a bandaid for #98783, the aws roachtest nightly began to panic because of a different roachtest papercut #96655. Specifically, because roachtest filters which tests run on which cloud within the evaluation of the test closure, tests meant to run on gce will still get registered in an AWS run. During the registration of the gce test `restore/tpce/400GB/gce/nodes=4/cpus=8/lowmem` _on aws_, the aws test harness panics because the aws roachprod implementation does not have a low memory cpu configuration. This patch prevents this panic and should be reverted once the pr #99402 merges. Epic: None Release note: None 100294: tenantcapabilitiestestutils: add a missing default case r=ajwerner a=ajwerner The test should fail if we ever add a new type of capability and use it in the data driven test but don't update the test to handle it. Epic: none Follow-up from #100217 (review) Release note: None 100296: rpc: correctly check for nil before cast r=ajwerner a=andrewbaptist As part of the fix of #99104, a cast without a nil check was introduced. This PR addresses that by only casting if it is known to be not nil. Epic: none Fixes: #100275 Release note: None Co-authored-by: Marcus Gartner <[email protected]> Co-authored-by: Herko Lategan <[email protected]> Co-authored-by: Tobias Grieger <[email protected]> Co-authored-by: Michael Butler <[email protected]> Co-authored-by: ajwerner <[email protected]> Co-authored-by: Andrew Baptist <[email protected]>
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Labels
A-kv-server
Relating to the KV-level RPC server
branch-master
Failures and bugs on the master branch.
branch-release-23.1
Used to mark GA and release blockers, technical advisories, and bugs for 23.1
C-bug
Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior.
GA-blocker
T-kv
KV Team
In #94778, we added RPC dialback to resolve asymmetric network partitions. This fixed the
failover/non-system/blackhole-send
roachtest, but thefailover/non-system/blackhole-recv
test still shows persistent unavailability. We need to find out why and fix it.Jira issue: CRDB-25705
The text was updated successfully, but these errors were encountered: