Fix bug in SwitchTraffic that wasn't respecting --dry_run for readonly and replica tablets during a resharding event #12992
…icSwitch dry run for reads during resharding tries to actually switch reads and fails Signed-off-by: Rohit Nayak <[email protected]>
Thanks @austenLacy for the bug report and fix! Can you please fix the DCO for your commit so it passes CI? Also, I updated the e2e tests to reproduce this failure and confirm your fix, at 0567da2. Can you also cherry-pick that commit into your PR? Previously, dry-run for switching read traffic was only tested for MoveTables; I added it for Reshard as well.
Signed-off-by: austenLacy <[email protected]>
8b842bb to 99540f4
Thanks for the e2e test, @rohit-nayak-ps. I just signed off my original commit and cherry-picked yours in.
Thanks, @austenLacy ! ❤️
I was unable to backport this Pull Request to the following branches: |
…donly and replica tablets during a resharding event (vitessio#12992) * use switcher struct when switching shard reads during a reshard event Signed-off-by: austenLacy <[email protected]> * Create failing test for bug reported in vitessio#12992, where a TrafficSwitch dry run for reads during resharding tries to actually switch reads and fails Signed-off-by: Rohit Nayak <[email protected]> --------- Signed-off-by: austenLacy <[email protected]> Signed-off-by: Rohit Nayak <[email protected]> Co-authored-by: Rohit Nayak <[email protected]> Signed-off-by: 'Stanislav Maksimov' <[email protected]>
…donly and replica tablets during a resharding event (vitessio#12992) * use switcher struct when switching shard reads during a reshard event Signed-off-by: austenLacy <[email protected]> * Create failing test for bug reported in vitessio#12992, where a TrafficSwitch dry run for reads during resharding tries to actually switch reads and fails Signed-off-by: Rohit Nayak <[email protected]> --------- Signed-off-by: austenLacy <[email protected]> Signed-off-by: Rohit Nayak <[email protected]> Co-authored-by: Rohit Nayak <[email protected]>
Description
There's a bug when resharding and switching traffic: SwitchTraffic does not respect the --dry_run flag for rdonly and replica tablets.
❌ Does not respect --dry_run without the fix in traffic_switcher.go
vtctlclient Reshard -- --tablet_types=rdonly,replica SwitchTraffic --dry_run customer.cust2cust
I0426 14:42:58.221613 9828 main.go:96] I0426 14:42:58.221425 traffic_switcher.go:399] About to switchShardReads: [], [RDONLY REPLICA], 0
E0426 14:42:58.223739 9828 main.go:96] E0426 14:42:58.223117 traffic_switcher.go:401] switchShardReads failed: Code: INVALID_ARGUMENT keyspace customer is not locked (no locksInfo)
E0426 14:42:58.224562 9828 main.go:96] E0426 14:42:58.223567 vtctl.go:2264] keyspace customer is not locked (no locksInfo)
I0426 14:42:58.236510 9828 main.go:96] I0426 14:42:58.236337 vtctl.go:2266] Workflow Status: Reads Not Switched. Writes Not Switched
Following vreplication streams are running for workflow customer.cust2cust:
id=1 on -80/zone1-0000000301: Status: Running. VStream Lag: 0s.
id=1 on 80-/zone1-0000000400: Status: Running. VStream Lag: 0s.
Reshard Error: rpc error: code = Unknown desc = keyspace customer is not locked (no locksInfo)
E0426 14:42:58.251351 9828 main.go:105] remote error: rpc error: code = Unknown desc = keyspace customer is not locked (no locksInfo)
✅ Does respect --dry_run with the fix in traffic_switcher.go
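For reviewers less familiar with the dry-run plumbing, here is a minimal, self-contained sketch of the kind of mistake this PR fixes (per the commit message, "use switcher struct when switching shard reads during a reshard event"). It is not the Vitess code: every type and function name below is hypothetical and only mirrors the idea that a live switcher and a dry-run switcher sit behind one interface. It also assumes that the "keyspace customer is not locked" error appears because a dry run never takes the keyspace lock.

```go
package main

import (
	"errors"
	"fmt"
)

// trafficSwitcher is a hypothetical stand-in for the worker that
// actually mutates serving state. It refuses to act without a lock.
type trafficSwitcher struct {
	keyspaceLocked bool
	readsSwitched  bool
}

func (ts *trafficSwitcher) switchShardReads(tabletTypes []string) error {
	if !ts.keyspaceLocked {
		return errors.New("keyspace is not locked")
	}
	ts.readsSwitched = true
	return nil
}

// readSwitcher is a hypothetical interface implemented by both the
// live path and the dry-run path.
type readSwitcher interface {
	switchShardReads(tabletTypes []string) error
}

// dryRunSwitcher only records what would happen; it never touches state.
type dryRunSwitcher struct {
	log []string
}

func (d *dryRunSwitcher) switchShardReads(tabletTypes []string) error {
	d.log = append(d.log, fmt.Sprintf("Switch reads for tablet types %v", tabletTypes))
	return nil
}

// switchReadsBuggy ignores the switcher it was handed and calls the
// real worker directly, so a dry run still attempts a real switch.
func switchReadsBuggy(ts *trafficSwitcher, sw readSwitcher, tabletTypes []string) error {
	return ts.switchShardReads(tabletTypes)
}

// switchReadsFixed dispatches through the interface, so the dry-run
// implementation is honored.
func switchReadsFixed(ts *trafficSwitcher, sw readSwitcher, tabletTypes []string) error {
	return sw.switchShardReads(tabletTypes)
}

func main() {
	ts := &trafficSwitcher{} // a dry run never took the lock (assumption)
	dry := &dryRunSwitcher{}
	types := []string{"RDONLY", "REPLICA"}

	fmt.Println("buggy path error:", switchReadsBuggy(ts, dry, types))
	fmt.Println("fixed path error:", switchReadsFixed(ts, dry, types))
	fmt.Println("dry-run log:", dry.log)
}
```

In this sketch the buggy path fails on the lock check, while the fixed path only appends to the dry-run log and leaves the serving state untouched.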
Testing on the primary
On v15 I wasn't able to replicate the issue of primary tablets going to NOT SERVING, because SwitchTraffic did respect the dry run flag when dealing with the primary.
Related Issue(s)
Checklist
Deployment Notes
None