aws-iam-authenticator: nodes of role master fail to build 17% of the time #9580

Closed
johngmyers opened this issue Jul 16, 2020 · 2 comments · Fixed by #9581

johngmyers commented Jul 16, 2020

1. What kops version are you running? The command kops version will display
this information.

Private build off of master branch

2. What Kubernetes version are you running? kubectl version will print the
version if a cluster is running or provide the Kubernetes version specified as
a kops flag.

1.19.0-rc.1

3. What cloud provider are you using?

AWS

4. What commands did you run? What is the simplest way to reproduce this issue?

Terminated master. Got unlucky.

5. What happened after the commands executed?

New master failed to come up. Saw in logs:

Jul 16 05:29:34 ip-172-20-32-91 nodeup[1886]: I0716 05:29:34.932799    1886 user.go:97] Creating user "kube-apiserver-healthcheck"
Jul 16 05:29:34 ip-172-20-32-91 nodeup[1886]: I0716 05:29:34.932891    1886 user.go:99] running command: useradd -u 10012 -s /sbin/nologin -d /etc/kubernetes/kube-apiserver-healthcheck/secrets kube-apiserver-healthcheck
Jul 16 05:29:34 ip-172-20-32-91 groupadd[1915]: group added to /etc/group: name=docker, GID=998
Jul 16 05:29:35 ip-172-20-32-91 groupadd[1915]: group added to /etc/gshadow: name=docker
Jul 16 05:29:35 ip-172-20-32-91 groupadd[1915]: new group: name=docker, GID=998
Jul 16 05:29:35 ip-172-20-32-91 useradd[1916]: new group: name=user, GID=10012
Jul 16 05:29:35 ip-172-20-32-91 useradd[1916]: new user: name=user, UID=10012, GID=10012, home=/var/etcd, shell=/sbin/nologin, from=none
Jul 16 05:29:35 ip-172-20-32-91 useradd[1918]: failed adding user 'kube-apiserver-healthcheck', data deleted

Subsequent attempts all failed:

Jul 16 05:29:46 ip-172-20-32-91 nodeup[1886]: W0716 05:29:46.302066    1886 executor.go:128] error running task "UserTask/kube-apiserver-healthcheck" (9m48s remaining to succeed): error creating user: exit status 4
Jul 16 05:29:46 ip-172-20-32-91 nodeup[1886]: Output: useradd: UID 10012 is not unique

6. What did you expect to happen?

nodeup to complete

7. Please provide your cluster manifest. Execute
kops get --name my.example.com -o yaml to display your cluster manifest.
You may want to remove your cluster name and other sensitive information.

n/a

8. Please run the commands with the most verbose logging by adding the -v 10 flag.
Paste the logs into this report, or into a gist and provide the gist link here.

n/a

9. Anything else we need to know?

EtcdBuilder doesn't specify a uid for the "user" user it creates. If it needs such a user, it should specify one.
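
To make that concrete, here is a minimal Go sketch (a hypothetical helper, not the actual nodeup UserTask code) of the useradd invocations in the log above: a task with a pinned uid passes -u and can fail with "UID is not unique", while a task without one lets useradd pick the next free UID, so the etcd "user" user's uid depends on what was created before it.

```go
package main

import (
	"fmt"
	"strconv"
)

// useraddArgs is a hypothetical helper, not the real nodeup code. It builds
// the argument list for useradd the way the log above shows it being run.
// A uid of 0 means "let useradd choose", which is what happens for the etcd
// "user" user today.
func useraddArgs(name, home string, uid int) []string {
	var args []string
	if uid != 0 {
		args = append(args, "-u", strconv.Itoa(uid)) // pinned, e.g. 10012
	}
	return append(args, "-s", "/sbin/nologin", "-d", home, name)
}

func main() {
	// Pinned uid: fails with "UID 10012 is not unique" if something already took it.
	fmt.Println(useraddArgs("kube-apiserver-healthcheck", "/etc/kubernetes/kube-apiserver-healthcheck/secrets", 10012))
	// No uid: useradd assigns the next free UID, so the result depends on ordering.
	fmt.Println(useraddArgs("user", "/var/etcd", 0))
}
```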

@johngmyers (Member, Author)

This was probably triggered by trying to use uid 10011 for kops-controller in my private build. That might be the uid the "user" user is normally assigned.

@johngmyers (Member, Author)

Looks like when the "user" user is created after another UserTask, it gets assigned the next UID. So this is a race which happens when there are at least three UserTasks, which means it probably also happens to people who use aws-iam-authenticator.
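
As a rough check on the 17% figure, here is a toy Go model, not the real nodeup executor. It assumes useradd without -u picks a uid just above the highest existing one, that the highest pre-existing regular uid on a fresh host is about 1000, and that the three UserTasks run in a random order; the pinned uids 10011 (kops-controller, per the previous comment) and 10012 (kube-apiserver-healthcheck, per the log) come from above. Only the ordering 10011, then "user", then 10012 collides, i.e. 1 of 6 orderings, about 17%.

```go
package main

import "fmt"

// task models a nodeup UserTask in this toy example; uid == 0 means the task
// does not pin a uid and useradd chooses one.
type task struct {
	name string
	uid  int
}

// collides reports whether running useradd for the tasks in this order would
// make a pinned useradd fail with "UID ... is not unique". Auto-assignment is
// modeled as "highest existing UID + 1", an assumption about useradd defaults.
func collides(order []task) bool {
	taken := map[int]bool{}
	maxUID := 1000 // assumed highest pre-existing regular uid on a fresh host
	for _, t := range order {
		uid := t.uid
		if uid == 0 {
			uid = maxUID + 1 // auto-assigned, like the etcd "user" user
		}
		if taken[uid] {
			return true
		}
		taken[uid] = true
		if uid > maxUID {
			maxUID = uid
		}
	}
	return false
}

// permute returns every ordering of the tasks.
func permute(ts []task) [][]task {
	if len(ts) <= 1 {
		return [][]task{append([]task{}, ts...)}
	}
	var out [][]task
	for i := range ts {
		rest := append(append([]task{}, ts[:i]...), ts[i+1:]...)
		for _, p := range permute(rest) {
			out = append(out, append([]task{ts[i]}, p...))
		}
	}
	return out
}

func main() {
	tasks := []task{
		{"kops-controller", 10011},            // pinned (private build)
		{"kube-apiserver-healthcheck", 10012}, // pinned
		{"user", 0},                           // etcd user, no uid specified
	}
	all := permute(tasks)
	bad := 0
	for _, order := range all {
		if collides(order) {
			bad++
		}
	}
	fmt.Printf("%d of %d orderings collide (~%.0f%%)\n", bad, len(all), 100*float64(bad)/float64(len(all)))
}
```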

johngmyers changed the title from "Etcd user created with uid conflicting with kube-apiserver-healthcheck" to "aws-iam-authenticator: nodes of role master fail to build 17% of the time" on Jul 16, 2020