OpenStack etcd-manager authentication expires: no more backups after 1hr #9730

kciredor · 2020-08-11T14:22:27Z

Hi,

With reference to etcd-manager issue: kopeio/etcd-manager#337

1. What kops version are you running? The command kops version, will display
this information.

1.17.9

2. What Kubernetes version are you running? kubectl version will print the
version if a cluster is running or provide the Kubernetes version specified as
a kops flag.

1.17.1

3. What cloud provider are you using?

OpenStack

4. What commands did you run? What is the simplest way to reproduce this issue?

Create a Kubernetes cluster on OpenStack with kops, using the defaults ('the usual').

5. What happened after the commands executed?

The cluster is created, and the kops Swift container receives etcd-manager backups every 15 minutes. After 1 hour, backups stop coming in. Tailing the etcd-manager-* pods shows a large number of "Authentication failed" responses on read calls to the kops Swift container.

6. What did you expect to happen?

Etcd-manager backups keep coming in after 1 hour as well.

7. Please provide your cluster manifest. Execute
kops get --name my.example.com -o yaml to display your cluster manifest.
You may want to remove your cluster name and other sensitive information.

Defaults, no changes.

8. Please run the commands with most verbose logging by adding the -v 10 flag.
Paste the logs into this report, or in a gist and provide the gist link here.

N/A

9. Anything else do we need to know?

OpenStack authentication tokens are usually set to expire after a certain amount of time. This is when you should 'reauth'. For my OpenStack cloud provider this expiry value is set to 60 minutes. That's how I confirmed this as the root cause.

I can see reauth logic in the code, but it does not seem to be working properly.

etcd-manager commit 8faecdad725d05f9c7375461cbf4f3dbbec6e527 on Thu Oct 17 2019 "avoid reauthentication loops (#1746)" was pulled in by kops commit c67cdaa6e4a4cfc7eb0e7d1ae2c3920eea5daa97 on Sun May 31 2020 "Update vendored kops version to 1.17.0"

I've tried debugging and patching etcd-manager myself for a while now, for example by forcing NewSwiftClient on every Swift call, or by forcing Reauth to happen all the time.

Not exactly sure where to file the bug report, so I'll update this issue with a reference to etcd-manager's issue and back as well. At least this kops issue can be used to track etcd-manager fixes so they can be pulled in.

Any thoughts?


kciredor · 2020-08-11T14:35:35Z

Perhaps @justinsb knows how to proceed (because of his commits to the kops OpenStack code and to etcd-manager as well)? 👍

olemarkus · 2020-08-11T18:20:30Z

Unfortunately, I don't have swift available, but I suspect it is a matter of adding authOption.AllowReauth = true after this line.
https://github.com/kubernetes/kops/blob/master/util/pkg/vfs/swiftfs.go#L49

Then etcd-manager needs to do an update on their part. But it is fairly trivial to push a custom etcd-manager image if you are able to make that change and build kops master + etcd-manager yourself.

olemarkus · 2020-08-11T18:21:18Z

/kind bug
/area provider/openstack

kciredor · 2020-08-13T08:58:51Z

Not exactly sure why my attempts at fixing this failed and yours works, but, yours works great @olemarkus ;-) 👍 Backups keep coming in now.

Btw.. It's not exactly easy to run a custom etcd-manager: the kops cluster spec does not allow swapping the image, because the template is hardcoded in kops. This means you have to run a custom kops and roll the masters.

justinsb · 2020-08-29T12:27:51Z

I merged the etcd-manager patch, but I realized we should have fixed the code here first, because it is vendored into etcd-manager. I think the problem arises when we configure authentication from env variables. I sent #9836

olemarkus · 2020-09-01T08:26:37Z

This has been fixed for the master branch
/close

k8s-ci-robot · 2020-09-01T08:26:50Z

@olemarkus: Closing this issue.

In response to this:

This has been fixed for the master branch
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

kciredor mentioned this issue Aug 11, 2020

Backups not coming in after OpenStack token expires: Reauth bug? kopeio/etcd-manager#337

Open

k8s-ci-robot added kind/bug Categorizes issue or PR as related to a bug. area/provider/openstack Issues or PRs related to openstack provider labels Aug 11, 2020

kciredor mentioned this issue Aug 17, 2020

Fixes usage of OpenStack Swift reauthentication kopeio/etcd-manager#338

Merged

justinsb closed this as completed in kopeio/etcd-manager#338 Aug 29, 2020

justinsb reopened this Aug 29, 2020

justinsb mentioned this issue Aug 29, 2020

Always use OpenStack Swift reauthentication #9836

Merged

k8s-ci-robot closed this as completed Sep 1, 2020
