Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

etcdctl: allow move-leader to connect to multiple endpoints #14307

Closed
wants to merge 2 commits into from

Conversation

tjungblu
Copy link
Contributor

@tjungblu tjungblu commented Aug 3, 2022

Re-opening closed PR #11775 which was originaly authored by benmoss.
Then again opened PR #12757 which was authored by zerodayz.

The mustClientForCmd function is responsible for parsing environment
variables and flags into configuration data. A change was made in #9382
to call Fatal if a flag is provided multiple times. This means that we
cannot call the mustClientForCmd function more than once,
since it will think that flags parsed the first time are now
being redefined and error out.

Some people have commented about this in #8380 but I don't think
there's an open issue for it.

Signed-off-by: Thomas Jungblut [email protected]

tracked under our BZ1918413

Re-opening closed PR etcd-io#11775 which was originaly authored by benmoss.
Then again opened PR etcd-io#12757 which was authored by zerodayz.

The mustClientForCmd function is responsible for parsing environment
variables and flags into configuration data. A change was made in etcd-io#9382
to call Fatal if a flag is provided multiple times. This means that we
cannot call the mustClientForCmd function more than once,
since it will think that flags parsed the first time are now
being redefined and error out.

Some people have commented about this in etcd-io#8380 but I don't think
there's an open issue for it.

Signed-off-by: Thomas Jungblut <[email protected]>
@@ -109,6 +109,10 @@ func testCtlV3MoveLeader(t *testing.T, cfg e2e.EtcdProcessClusterConfig) {
[]string{cx.epc.EndpointsV3()[leadIdx]},
fmt.Sprintf("Leadership transferred from %s to %s", types.ID(leaderID), types.ID(transferee)),
},
{ // request to all endpoints
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I've added this test case to unmodified main and both TestCtlV3MoveLeaderSecure and TestCtlV3MoveLeaderInsecure passed for me. I think it should fail without the actual code change.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You need to set the environment variable ETCDCTL_ENDPOITNS to trigger the issue. Please update the test case accordingly.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

$ export ETCDCTL_ENDPOINTS=http://www.google.com:2379

$ echo $ETCDCTL_ENDPOINTS
http://www.google.com:2379

$ etcdctl  move-leader --endpoints=http://127.0.0.1:2379,http://127.0.0.1:22379,http://127.0.0.1:32379 8211f1d0f64f3269
{"level":"fatal","ts":"2022-08-09T02:59:50.324+0800","caller":"flags/flag.go:85","msg":"conflicting environment variable is shadowed by corresponding command-line flag (either unset environment variable or disable flag))","environment-variable":"ETCDCTL_ENDPOINTS","stacktrace":"go.etcd.io/etcd/pkg/v3/flags.verifyEnv\n\t/Users/wachao/go/src/github.com/ahrtr/etcd/pkg/flags/flag.go:85\ngo.etcd.io/etcd/pkg/v3/flags.SetPflagsFromEnv\n\t/Users/wachao/go/src/github.com/ahrtr/etcd/pkg/flags/flag.go:63\ngo.etcd.io/etcd/etcdctl/v3/ctlv3/command.clientConfigFromCmd\n\t/Users/wachao/go/src/github.com/ahrtr/etcd/etcdctl/ctlv3/command/global.go:103\ngo.etcd.io/etcd/etcdctl/v3/ctlv3/command.mustClientFromCmd\n\t/Users/wachao/go/src/github.com/ahrtr/etcd/etcdctl/ctlv3/command/global.go:150\ngo.etcd.io/etcd/etcdctl/v3/ctlv3/command.transferLeadershipCommandFunc\n\t/Users/wachao/go/src/github.com/ahrtr/etcd/etcdctl/ctlv3/command/move_leader_command.go:46\ngithub.com/spf13/cobra.(*Command).execute\n\t/Users/wachao/go/gopath/pkg/mod/github.com/spf13/[email protected]/command.go:860\ngithub.com/spf13/cobra.(*Command).ExecuteC\n\t/Users/wachao/go/gopath/pkg/mod/github.com/spf13/[email protected]/command.go:974\ngithub.com/spf13/cobra.(*Command).Execute\n\t/Users/wachao/go/gopath/pkg/mod/github.com/spf13/[email protected]/command.go:902\ngo.etcd.io/etcd/etcdctl/v3/ctlv3.Start\n\t/Users/wachao/go/src/github.com/ahrtr/etcd/etcdctl/ctlv3/ctl.go:113\ngo.etcd.io/etcd/etcdctl/v3/ctlv3.MustStart\n\t/Users/wachao/go/src/github.com/ahrtr/etcd/etcdctl/ctlv3/ctl.go:117\nmain.main\n\t/Users/wachao/go/src/github.com/ahrtr/etcd/etcdctl/main.go:32\nruntime.main\n\t/Users/wachao/software/go/src/runtime/proc.go:255"}

Copy link
Contributor Author

@tjungblu tjungblu Sep 7, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @ahrtr I have refactored the test so it fails on that condition again. The issue seems deeper in pflags however, will come up with another fix here.

There are many other conflicting variables here: #8380 (comment)

Signed-off-by: Thomas Jungblut <[email protected]>
@codecov-commenter
Copy link

Codecov Report

Merging #14307 (319968f) into main (ae36a57) will decrease coverage by 0.33%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##             main   #14307      +/-   ##
==========================================
- Coverage   75.67%   75.34%   -0.34%     
==========================================
  Files         456      456              
  Lines       37039    37071      +32     
==========================================
- Hits        28030    27931      -99     
- Misses       7272     7378     +106     
- Partials     1737     1762      +25     
Flag Coverage Δ
all 75.34% <100.00%> (-0.34%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
etcdctl/ctlv3/command/move_leader_command.go 73.68% <100.00%> (+2.63%) ⬆️
server/etcdserver/api/rafthttp/peer_status.go 87.87% <0.00%> (-12.13%) ⬇️
server/etcdserver/api/membership/store.go 50.00% <0.00%> (-10.00%) ⬇️
server/proxy/grpcproxy/register.go 81.39% <0.00%> (-9.31%) ⬇️
server/auth/simple_token.go 80.00% <0.00%> (-8.47%) ⬇️
server/etcdserver/api/rafthttp/peer.go 87.01% <0.00%> (-8.45%) ⬇️
client/pkg/v3/fileutil/lock_linux.go 72.22% <0.00%> (-8.34%) ⬇️
api/v3rpc/rpctypes/error.go 84.61% <0.00%> (-5.87%) ⬇️
server/etcdserver/api/v3election/election.go 66.66% <0.00%> (-5.13%) ⬇️
server/storage/wal/file_pipeline.go 90.69% <0.00%> (-4.66%) ⬇️
... and 27 more

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@tjungblu
Copy link
Contributor Author

tjungblu commented Sep 7, 2022

just to bring this back to a real case in OpenShift with etcd 3.5.3. Our operator does set those env variables properly for the customer to "just" use, however on etcdctl move-leader we find this behaviour:

sh-4.4# etcdctl member list 
12ec9b75f0525453, started, ci-ln-434szdt-72292-smgq7-master-0, https://10.0.0.3:2380, https://10.0.0.3:2379, false
4d49ceb72e485bf8, started, ci-ln-434szdt-72292-smgq7-master-1, https://10.0.0.5:2380, https://10.0.0.5:2379, false
5d8710fcccf32982, started, ci-ln-434szdt-72292-smgq7-master-2, https://10.0.0.4:2380, https://10.0.0.4:2379, false

sh-4.4# etcdctl move-leader 4d49ceb72e485bf8
{"level":"fatal","ts":"2022-09-07T12:55:05.921Z","caller":"flags/flag.go:85","msg":"conflicting environment variable is shadowed by corresponding command-line flag (either unset environment variable or disable flag))","environment-variable":"ETCDCTL_CERT","stacktrace":"go.etcd.io/etcd/pkg/v3/flags.verifyEnv\n\t/go/src/go.etcd.io/etcd/pkg/flags/flag.go:85\ngo.etcd.io/etcd/pkg/v3/flags.SetPflagsFromEnv\n\t/go/src/go.etcd.io/etcd/pkg/flags/flag.go:63\ngo.etcd.io/etcd/etcdctl/v3/ctlv3/command.clientConfigFromCmd\n\t/go/src/go.etcd.io/etcd/etcdctl/ctlv3/command/global.go:129\ngo.etcd.io/etcd/etcdctl/v3/ctlv3/command.transferLeadershipCommandFunc\n\t/go/src/go.etcd.io/etcd/etcdctl/ctlv3/command/move_leader_command.go:56\ngithub.com/spf13/cobra.(*Command).execute\n\t/go/pkg/mod/github.com/spf13/[email protected]/command.go:856\ngithub.com/spf13/cobra.(*Command).ExecuteC\n\t/go/pkg/mod/github.com/spf13/[email protected]/command.go:960\ngithub.com/spf13/cobra.(*Command).Execute\n\t/go/pkg/mod/github.com/spf13/[email protected]/command.go:897\ngo.etcd.io/etcd/etcdctl/v3/ctlv3.Start\n\t/go/src/go.etcd.io/etcd/etcdctl/ctlv3/ctl.go:107\ngo.etcd.io/etcd/etcdctl/v3/ctlv3.MustStart\n\t/go/src/go.etcd.io/etcd/etcdctl/ctlv3/ctl.go:111\nmain.main\n\t/go/src/go.etcd.io/etcd/etcdctl/main.go:59\nruntime.main\n\t/usr/lib/golang/src/runtime/proc.go:225"}

sh-4.4# env | grep cert
ETCDCTL_CERT=/etc/kubernetes/static-pod-certs/secrets/etcd-all-certs/etcd-peer-ci-ln-434szdt-72292-smgq7-master-0.crt
ETCDCTL_CACERT=/etc/kubernetes/static-pod-certs/configmaps/etcd-serving-ca/ca-bundle.crt
ETCDCTL_KEY=/etc/kubernetes/static-pod-certs/secrets/etcd-all-certs/etcd-peer-ci-ln-434szdt-72292-smgq7-master-0.key

which should not happen, ideally. There was no actual flag supplied here, thus there was no env variable to be shadowed in the first place.

The stacktrace pretty printed is:

go.etcd.io/etcd/pkg/v3/flags.verifyEnv
	/go/src/go.etcd.io/etcd/pkg/flags/flag.go:85
go.etcd.io/etcd/pkg/v3/flags.SetPflagsFromEnv
	/go/src/go.etcd.io/etcd/pkg/flags/flag.go:63
go.etcd.io/etcd/etcdctl/v3/ctlv3/command.clientConfigFromCmd
	/go/src/go.etcd.io/etcd/etcdctl/ctlv3/command/global.go:129
go.etcd.io/etcd/etcdctl/v3/ctlv3/command.transferLeadershipCommandFunc
	/go/src/go.etcd.io/etcd/etcdctl/ctlv3/command/move_leader_command.go:56
github.com/spf13/cobra.(*Command).execute
	/go/pkg/mod/github.com/spf13/[email protected]/command.go:856
github.com/spf13/cobra.(*Command).ExecuteC
	/go/pkg/mod/github.com/spf13/[email protected]/command.go:960
github.com/spf13/cobra.(*Command).Execute
	/go/pkg/mod/github.com/spf13/[email protected]/command.go:897
go.etcd.io/etcd/etcdctl/v3/ctlv3.Start
	/go/src/go.etcd.io/etcd/etcdctl/ctlv3/ctl.go:107
go.etcd.io/etcd/etcdctl/v3/ctlv3.MustStart
	/go/src/go.etcd.io/etcd/etcdctl/ctlv3/ctl.go:111
main.main
	/go/src/go.etcd.io/etcd/etcdctl/main.go:59
runtime.main
	/usr/lib/golang/src/runtime/proc.go:225

I still assume this is fixed in master, yet we would need this in 3.5...

@tjungblu tjungblu changed the base branch from main to release-3.5 September 7, 2022 13:23
@tjungblu tjungblu changed the base branch from release-3.5 to main September 7, 2022 13:23
@tjungblu
Copy link
Contributor Author

tjungblu commented Sep 7, 2022

given this is only reproducible for 3.5, here's the real patch #14434.

closing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

4 participants