Implement HA E2e for downgrades #13696
c2b754c to 95a49f8
tests/e2e/cluster_downgrade_test.go
Outdated
prefixArgs := []string{e2e.CtlBinPath, "--endpoints", strings.Join(epc.EndpointsV3(), ",")}
t.Log("Write keys to ensure wal snapshot is created so cluster version set is snapshotted")
var err error
e2e.ExecuteWithTimeout(t, 20*time.Second, func() {
for i := 0; i < 10; i++ {
In https://github.com/etcd-io/etcd/pull/13686/files, you are removing snapshotCount from the newCluster method.
I assume this will cause this method to stop working...
Those two PRs will conflict, as #13686 removes the need to set snapshotCount and write keys to force a snapshot. It should not break the tests; I will just need to rebase the second change and fix the conflict.
for i := 0; i < len(epc.Procs); i++ {
t.Logf("Upgrading member %d", i)
stopEtcd(t, epc.Procs[i])
startEtcd(t, epc.Procs[i], currentEtcdBinary)
Shall we validate here that the cluster is not upgraded while a quorum is not yet upgraded (1st iteration),
and that the cluster gets upgraded once a quorum (including the leader?) is upgraded?
Yes, I considered it here before. One note: all members (not just a quorum) need to be upgraded before the cluster version is updated. The cluster version indicates the minimal server version in the cluster.
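To illustrate the rule described in the reply above, here is a minimal sketch of "cluster version = lowest member version" during a rolling upgrade. It is not etcd's actual implementation: the `version` type, `less`, and `clusterVersion` are hypothetical names introduced only for this example, and the 3.5 to 3.6 versions are made-up sample data.

```go
package main

import "fmt"

// version is a hypothetical major.minor pair used only for this sketch.
type version struct{ major, minor int }

// less reports whether v is an older version than o.
func (v version) less(o version) bool {
	if v.major != o.major {
		return v.major < o.major
	}
	return v.minor < o.minor
}

// clusterVersion returns the lowest version among the members.
// ok is false when there are no members.
func clusterVersion(members []version) (lowest version, ok bool) {
	if len(members) == 0 {
		return version{}, false
	}
	lowest = members[0]
	for _, m := range members[1:] {
		if m.less(lowest) {
			lowest = m
		}
	}
	return lowest, true
}

func main() {
	// Three members start on the old version and are upgraded one by one.
	members := []version{{3, 5}, {3, 5}, {3, 5}}
	for i := range members {
		members[i] = version{3, 6}
		cv, _ := clusterVersion(members)
		fmt.Printf("after upgrading member %d: cluster version %d.%d\n", i, cv.major, cv.minor)
	}
}
```

Under this model the reported cluster version only moves once the last member is upgraded, which is why a test would need to assert "unchanged" after every iteration except the final one, not just after the first.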
tests/framework/e2e/cluster.go
Outdated
@@ -491,3 +493,25 @@ func (epc *EtcdProcessCluster) WithStopSignal(sig os.Signal) (ret os.Signal) {
}
return ret
}
func (epc *EtcdProcessCluster) Leader(ctx context.Context) (EtcdProcess, error) {
nit: empty line to separate from import.
What worries me is that this method is relatively 'heavy' (3 connections made)
and it's easy to write a very slow test that recalculates 'leader' before each iteration.
Mitigations:
- call it 'FindLeaderHeavy'
- cache the leader as part of EtcdProcessCluster and have 2 methods: Leader() and RefreshLeader(). Leader would only call RefreshLeader when there is no populated cache... The unit test owner would need to call RefreshLeader in all the cases when the leader is expected to change.
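The caching mitigation suggested above could look roughly like the following. This is only a sketch under assumptions: `leaderCache` and `leaderFinder` are hypothetical names, the leader is represented as a plain string rather than an `EtcdProcess`, and the expensive lookup (the three connections) is replaced by an injected function so the example is self-contained.

```go
package main

import "fmt"

// leaderFinder is a hypothetical stand-in for the expensive lookup
// that connects to every member to find the current leader.
type leaderFinder func() (string, error)

// leaderCache remembers the last leader found so that repeated
// Leader calls do not repeat the expensive lookup.
// It is not safe for concurrent use.
type leaderCache struct {
	find   leaderFinder
	leader string
	valid  bool
}

// RefreshLeader always performs the lookup and updates the cache.
// Tests call it whenever the leader is expected to have changed.
func (c *leaderCache) RefreshLeader() (string, error) {
	l, err := c.find()
	if err != nil {
		c.valid = false
		return "", err
	}
	c.leader, c.valid = l, true
	return l, nil
}

// Leader returns the cached leader and only calls RefreshLeader
// when the cache has not been populated yet.
func (c *leaderCache) Leader() (string, error) {
	if c.valid {
		return c.leader, nil
	}
	return c.RefreshLeader()
}

func main() {
	lookups := 0
	c := &leaderCache{find: func() (string, error) {
		lookups++
		return fmt.Sprintf("member-%d", lookups), nil
	}}

	first, _ := c.Leader()
	second, _ := c.Leader() // served from the cache
	fmt.Println(first, second, "lookups:", lookups)

	refreshed, _ := c.RefreshLeader() // forces a new lookup
	fmt.Println(refreshed, "lookups:", lookups)
}
```

The trade-off is the one already named in the comment: the cache makes repeated calls cheap, but correctness then depends on the test author remembering to refresh after every step that can trigger an election.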
I think 'heavy' here is very relative. Creating 3 connections could indeed be considered heavy; however, it is still lighter than all the other methods that create a whole process.
As this function is only needed for this test (we are looking for a log generated by the leader), I have removed it from Cluster.
9a088c6 to 963a78d
Codecov Report
@@ Coverage Diff @@
## main #13696 +/- ##
==========================================
- Coverage 72.76% 72.59% -0.18%
==========================================
Files 467 467
Lines 38278 38278
==========================================
- Hits 27854 27788 -66
- Misses 8623 8681 +58
- Partials 1801 1809 +8
Please read https://github.com/etcd-io/etcd/blob/main/CONTRIBUTING.md#contribution-flow.
part of #13168