server: retry ambiguous kv/sql operations on node startup #97710

Merged: 2 commits, Apr 3, 2023

@aliher1911 aliher1911 commented Feb 27, 2023

Previously, a node could fail on startup if the node itself was needed to restore quorum on some ranges. The failure was caused by the circuit breaker on a live node failing fast when talking back to the restarting node, instead of waiting for the raft operation to run again (and succeed).
This PR adds explicit retries. It also adds assertions, enabled in race builds, that fire if startup code reaches the dist sender or the internal SQL executor without being wrapped in the startup retry helper.

Release note: None

Fixes #74714

@aliher1911 aliher1911 self-assigned this Feb 27, 2023

blathers-crl bot commented Feb 27, 2023

It looks like your PR touches production code but doesn't add or edit any test code. Did you consider adding tests to your PR?

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.

@cockroach-teamcity
Member

This change is Reviewable

@aliher1911 aliher1911 force-pushed the fix_startup_kv_failures branch 14 times, most recently from 00e80fb to d343b4e Compare March 6, 2023 10:55
@aliher1911 aliher1911 force-pushed the fix_startup_kv_failures branch 7 times, most recently from 6113f94 to 93a605d Compare March 8, 2023 18:20
@aliher1911 aliher1911 force-pushed the fix_startup_kv_failures branch 2 times, most recently from 026cd69 to 59aaae0 Compare March 21, 2023 17:52
@aliher1911 aliher1911 changed the title from "<pkg>: <short description - lowercase, no final period>" to "server: retry ambiguous kv/sql operations on node startup" Mar 21, 2023
@aliher1911 aliher1911 force-pushed the fix_startup_kv_failures branch 3 times, most recently from 0f94ee3 to 20bfb84 Compare March 22, 2023 10:35

craig bot commented Mar 31, 2023

Timed out.

@aliher1911
Contributor Author

bors r=tbg


craig bot commented Mar 31, 2023

Build failed (retrying...):


craig bot commented Mar 31, 2023

Build failed (retrying...):


craig bot commented Mar 31, 2023

Build failed (retrying...):


craig bot commented Mar 31, 2023

Build failed (retrying...):


craig bot commented Apr 1, 2023

Build failed:

@aliher1911
Contributor Author

bors r=tbg


craig bot commented Apr 3, 2023

Build failed (retrying...):


craig bot commented Apr 3, 2023

Build failed (retrying...):


craig bot commented Apr 3, 2023

Build failed (retrying...):


craig bot commented Apr 3, 2023

Build succeeded:


blathers-crl bot commented Apr 3, 2023

Encountered an error creating backports. Some common things that can go wrong:

  1. The backport branch might have already existed.
  2. There was a merge conflict.
  3. The backport branch contained merge commits.

You might need to create your backport manually using the backport tool.


error creating merge commit from 6e14949 to blathers/backport-release-22.1-97710: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict []

you may need to manually resolve merge conflicts with the backport tool.

Backport to branch 22.1.x failed. See errors above.


error creating merge commit from fb09177 to blathers/backport-release-22.2-97710: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict []

you may need to manually resolve merge conflicts with the backport tool.

Backport to branch 22.2.x failed. See errors above.


error setting reviewers, but backport branch blathers/backport-release-23.1-97710 is ready: POST https://api.github.com/repos/cockroachdb/cockroach/pulls/100458/requested_reviewers: 422 Reviews may only be requested from collaborators. One or more of the teams you specified is not a collaborator of the cockroachdb/cockroach repository. []

Backport to branch 23.1.x failed. See errors above.



Labels
backport-23.1.x Flags PRs that need to be backported to 23.1
Development

Successfully merging this pull request may close these issues.

server: KV writes in startup path sensitive to circuit breaker errors
3 participants