Deployments fail with User "system:anonymous" cannot get replicationcontrollers error #2918

Closed
xjvs opened this issue Jun 8, 2015 · 24 comments
Assignees: liggitt
Labels: component/apps, kind/bug, priority/P1

Comments

xjvs commented Jun 8, 2015

Running the latest code (commit 4518893), the docker-registry fails to deploy with the following error:

F0608 09:39:33.433826       1 deployer.go:65] User "system:anonymous" cannot get replicationcontrollers in project "default"

steps:

  1. $ cd go/src/github.com/openshift/origin/examples/sample-app

  2. $ openshift start &> logs/openshift.log &
    wait for the server to finish starting

  3. $ oadm registry --create --credentials=openshift.local.config/master/openshift-registry.kubeconfig --config=openshift.local.config/master/admin.kubeconfig

    admin.kubeconfig
    deploymentconfigs/docker-registry
    services/docker-registry
    
  4. $ oc status --config=openshift.local.config/master/admin.kubeconfig

    In project default
    
    service docker-registry (172.30.178.242:5000)
      docker-registry deploys docker.io/openshift/origin-docker-registry:v0.5.4 
        #1 deployment failed 9 minutes ago
    
  5. $ docker ps -a

    CONTAINER ID        IMAGE                              COMMAND                CREATED             STATUS                        PORTS               NAMES
    e127ff84a12c        openshift/origin-deployer:v0.5.4   "/usr/bin/openshift-   12 minutes ago      Exited (255) 12 minutes ago                       k8s_deployment.9bc06e45_docker-registry-1-deploy_default_3da7842a-0dc2-11e5-bb2f-5254000035f7_9db1f105
    
  6. $ docker logs e127ff84a12c

    F0608 09:39:33.433826       1 deployer.go:65] User "system:anonymous" cannot get replicationcontrollers in project "default"
    

liggitt commented Jun 8, 2015

Make sure you have the latest origin-deployer Docker image. It now reads a service account API token to talk to the API.
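
For reference, a minimal sketch of how the deployer is expected to authenticate once the image picks up that change, assuming the standard service-account mount path; the API server address and API version segment below are placeholders, not taken from this setup:

    # inside the deployer container, the token should be mounted at the standard path
    TOKEN_FILE=/var/run/secrets/kubernetes.io/serviceaccount/token
    cat "$TOKEN_FILE"        # should print a JWT if the secret mount worked

    # the deployer then sends that token as a Bearer credential, roughly:
    curl -k -H "Authorization: Bearer $(cat $TOKEN_FILE)" \
        "https://$APISERVER/api/v1beta3/namespaces/default/replicationcontrollers"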

liggitt self-assigned this Jun 8, 2015

xjvs commented Jun 8, 2015

Unfortunately, I already have the latest code and images, and I double-checked that with:

# docker pull docker.io/openshift/origin-deployer:v0.5.4
v0.5.4: Pulling from docker.io/openshift/origin-deployer
6941bfcbbfca: Already exists 
41459f052977: Already exists 
fd44297e2ddb: Already exists 
739e4ff80aed: Already exists 
4fa32e2bc2d3: Already exists 
533823e74a13: Already exists 
e3c842312d83: Already exists 
f76fd3fa0c94: Already exists 
5cc3bd75c547: Already exists 
fd0b95d8c32f: Already exists 
e56ba82ad9b0: Already exists 
94fd191bf8e2: Already exists 
Digest: sha256:ebe81b6a3894887710556cc75c306e561bc13f8c8c1b1965d9b6d53eed2bbdb0
Status: Image is up to date for docker.io/openshift/origin-deployer:v0.5.4

and:

$ docker images | grep origin
docker.io/openshift/origin-deployer          latest              0423b758c1fe        3 hours ago         357.9 MB
docker.io/openshift/origin-deployer          v0.5.4              94fd191bf8e2        3 days ago          356.6 MB
docker.io/openshift/origin-pod               v0.5.4              5b2a4faa885e        3 days ago          1.105 MB
docker.io/openshift/origin-sti-builder       latest              ad8f11fd1110        6 days ago          356.5 MB
docker.io/openshift/origin-docker-registry   latest              b95b3434fbec        6 days ago          303.9 MB
docker.io/openshift/origin-sti-builder       v0.5.3              538db5e112fb        10 days ago         356.1 MB
docker.io/openshift/origin-deployer          v0.5.3              331eec1be546        10 days ago         356.1 MB
docker.io/openshift/origin-docker-registry   v0.5.3              b435e983568a        10 days ago         303.9 MB
docker.io/openshift/origin-pod               v0.5.3              d8401fe18462        10 days ago         1.105 MB
docker.io/openshift/origin-haproxy-router    latest              339ed87d09ea        7 weeks ago         357.7 MB
docker.io/openshift/origin-pod               latest              ab45010051b5        7 weeks ago         957.9 kB

Besides that, I tried tagging the latest origin-deployer image as v0.5.4 before starting the OpenShift server, which resulted in the same error:

$ docker rmi docker.io/openshift/origin-deployer:v0.5.4
$ docker tag 0423b758c1fe docker.io/openshift/origin-deployer:v0.5.4
$ openshift start &> logs/openshift.log &


liggitt commented Jun 8, 2015

Make sure the "deployer" service account in your namespace has API tokens generated before running deployments (osc describe serviceaccount deployer). That should happen automatically when a namespace is first created, but can take a few seconds.
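
A hedged sketch of that check, polling until the deployer service account reports a token before creating the registry (the config path follows the sample-app setup above; the CLI is invoked as oc here, matching the rest of this thread):

    CFG=openshift.local.config/master/admin.kubeconfig
    # wait until the deployer service account lists at least one API token
    until oc describe serviceaccount deployer --config=$CFG | grep -q 'deployer-token-'; do
        echo "waiting for deployer service account tokens..."
        sleep 2
    done
    oc describe serviceaccount deployer --config=$CFG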

danmcp added the kind/bug, priority/P2, and component/apps labels Jun 8, 2015

xjvs commented Jun 9, 2015

It seems that missing API tokens is not the reason; I confirmed the tokens were generated before deploying the registry, see the log below (step 7):

  1. $ git pull

  2. $ git rev-parse HEAD

    a675c2998516527782877c647cadd6eedf979e19
    
  3. $ make clean build

  4. $ docker images | awk '$1 ~ /origin-/ && $2 ~ /0.5.4/'

    docker.io/openshift/origin-sti-builder       v0.5.4              92e7b4c518ca        4 days ago          356.6 MB
    docker.io/openshift/origin-deployer          v0.5.4              94fd191bf8e2        4 days ago          356.6 MB
    docker.io/openshift/origin-docker-registry   v0.5.4              aedf766dadaa        4 days ago          303.9 MB
    docker.io/openshift/origin-haproxy-router    v0.5.4              5f88336d2b99        4 days ago          369 MB
    docker.io/openshift/origin-pod               v0.5.4              5b2a4faa885e        4 days ago          1.105 MB
    
  5. $ for i in $(docker images | awk '$1 ~ /origin-/ && $2 ~ /v0.5.4/ { print $1":"$2 }'); do docker pull $i > /dev/null; done

  6. $ cd examples/sample-app/ ; openshift start &> logs/openshift.log &

  7. $ oc describe serviceaccount deployer --config=openshift.local.config/master/admin.kubeconfig

    Name:       deployer
    Labels:     <none>
    Secrets:    {  deployer-token-3wyxh    }
                {  deployer-dockercfg-co37z    }
    
    Tokens:     deployer-token-1kzdu
                deployer-token-3wyxh
    
  8. $ oadm registry --create --credentials=openshift.local.config/master/openshift-registry.kubeconfig --config=openshift.local.config/master/admin.kubeconfig

    deploymentconfigs/docker-registry
    services/docker-registry
    
  9. $ oc status --config=openshift.local.config/master/admin.kubeconfig

    In project default

    service docker-registry (172.30.70.181:5000)
      docker-registry deploys docker.io/openshift/origin-docker-registry:v0.5.4 
        #1 deployment failed 23 seconds ago
    
    service kubernetes (172.30.0.2:443)
    
    service kubernetes-ro (172.30.0.1:80)
    
    To see more information about a Service or DeploymentConfig, use 'oc describe service <name>' or 'oc describe dc <name>'.
    You can use 'oc get all' to see lists of each of the types described above.
    
  10. $ docker ps -a

    CONTAINER ID        IMAGE                              COMMAND                CREATED             STATUS                        PORTS               NAMES
    6fe289744f25        openshift/origin-deployer:v0.5.4   "/usr/bin/openshift-   30 seconds ago      Exited (255) 19 seconds ago                       k8s_deployment.81cd2f34_docker-registry-1-deploy_default_ae0e2aa1-0e4c-11e5-9565-5254000035f7_6eda7005   
    a869877989ee        openshift/origin-pod:v0.5.4        "/pod"                 32 seconds ago      Exited (0) 12 seconds ago                         k8s_POD.dc58c433_docker-registry-1-deploy_default_ae0e2aa1-0e4c-11e5-9565-5254000035f7_b3af86a4          
    
  11. $ docker logs 6fe289744f25

    F0609 02:10:33.098779       1 deployer.go:65] User "system:anonymous" cannot get replicationcontrollers in project "default"
    


liggitt commented Jun 9, 2015

Can you docker inspect the failed deployer container and pastebin the result?
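
If only the mount information is of interest, a narrower variant of that request might look like this (the .HostConfig.Binds field name is an assumption based on the inspect output quoted later in this thread):

    # full output, as requested
    docker inspect <container-id>

    # or just the bind mounts, where the serviceaccount secret should show up
    docker inspect -f '{{ .HostConfig.Binds }}' <container-id>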


xjvs commented Jun 9, 2015

Here is the container inspect result:
http://www.fpaste.org/230236/

liggitt changed the title from "failed to deploy docker-registry" to "Deployments fail with User "system:anonymous" cannot get replicationcontrollers error" Jun 9, 2015

liggitt commented Jun 10, 2015

A few questions:

  1. What OS are you running this on?
  2. What version of docker do you have?

Can you try running from master with --latest-images=true (ensure you have the latest openshift/origin-deployer pulled)? Additional logging was added to the deployer image to help diagnose this.
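
A sketch of that run, assuming the flag is passed to openshift start as in the earlier steps (the log path just mirrors the sample-app commands above):

    # make sure the freshly built :latest deployer image is present
    docker pull docker.io/openshift/origin-deployer:latest

    # start the server using the :latest infrastructure images
    openshift start --latest-images=true &> logs/openshift.log &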


xjvs commented Jun 10, 2015

Hi @liggitt, thanks for taking a look at this. I've just confirmed it's caused by the Docker version (RHEL 7.1 x86_64 + Docker 1.6.0). With '--latest-images=true', I see the following logs:

  1. $ docker logs 32a7c7b5f593

    E0610 03:32:45.773822       1 clientcmd.go:128] Error reading BEARER_TOKEN_FILE "/var/run/secrets/kubernetes.io/serviceaccount/token": open /var/run/secrets/kubernetes.io/serviceaccount/token: no such file or directory
    E0610 03:32:46.027594       1 clientcmd.go:146] Error reading BEARER_TOKEN_FILE "/var/run/secrets/kubernetes.io/serviceaccount/token": open /var/run/secrets/kubernetes.io/serviceaccount/token: no such file or directory
    F0610 03:32:46.039666       1 deployer.go:65] User "system:anonymous" cannot get replicationcontrollers in project "default"
    
  2. $ docker inspect 32a7c7b5f593

    ...
    "HostConfig": {
        "Binds": [
            "/root/go/src/github.com/openshift/origin/examples/sample-app/openshift.local.volumes/pods/52b4d006-0f21-11e5-9452-525400e80351/volumes/kubernetes.io~secret/deployer-token-qwlib:/var/run/secrets/kubernetes.io/serviceaccount:ro",
            "/root/go/src/github.com/openshift/origin/examples/sample-app/openshift.local.volumes/pods/52b4d006-0f21-11e5-9452-525400e80351/containers/deployment/32a7c7b5f593235dd3eeb530e437516b48881029c6c058977891f2c32509944c:/dev/termination-log"
        ],
    ...
    
  3. $ cat /root/go/src/github.com/openshift/origin/examples/sample-app/openshift.local.volumes/pods/52b4d006-0f21-11e5-9452-525400e80351/volumes/kubernetes.io~secret/deployer-token-qwlib/token

    eyJhbGciOiJSUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJrdWJlcm5ldGVzL3NlcnZpY2VhY2NvdW50Iiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9uYW1lc3BhY2UiOiJkZWZhdWx0Iiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9zZWNyZXQubmFtZSI6ImRlcGxveWVyLXRva2VuLXF3bGliIiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9zZXJ2aWNlLWFjY291bnQubmFtZSI6ImRlcGxveWVyIiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9zZXJ2aWNlLWFjY291bnQudWlkIjoiM2VhNTc5NDItMGYyMS0xMWU1LTk0NTItNTI1NDAwZTgwMzUxIiwic3ViIjoic3lzdGVtOnNlcnZpY2VhY2NvdW50OmRlZmF1bHQ6ZGVwbG95ZXIifQ.Kvi-zGA7AyVlKErRRyKy8bGlw8tXPYwzOJCk0SWyRUrXBiE9x01mllhSC9B1k3IIBv0NSjZAGXUOrC_T3QoP5w6H6VlVecNVz51ryPyhjxob9eGycLOHyW0ny7BZN2PqmJuRhjICGuZyMMWXj9r2Vw8r26_Zcx93xwC5Zvfb3SPA6Fa59MbsoJrem5yOyObxoWrR-k6n-nCIFCsnxx39Y9rlbQKUU-VA630Tn5NPr0XRulL9t75DjZ9BgtCgne_chJOg4c_1ECJ8Nz7f0HKSo47CooIhZ9V2RnIUOqCvPz4Dcq569Qm03I3VqqaiT2Ng-NmudQO0hAxtTjWV8F70ZA
    
  4. $ getenforce

    Permissive

After I upgraded Docker to 1.6.2, the issue no longer occurs.
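
A hedged way to reproduce the masking behaviour outside of OpenShift is to bind-mount a file under /var/run/secrets and check whether it is visible inside a throwaway container (busybox is just a convenient small image; any image would do):

    # create a fake "secret" on the host
    mkdir -p /tmp/fake-secret
    echo dummy-token > /tmp/fake-secret/token

    # bind it to the same path the kubelet uses for serviceaccount secrets
    docker run --rm \
        -v /tmp/fake-secret:/var/run/secrets/kubernetes.io/serviceaccount:ro \
        busybox ls -l /var/run/secrets/kubernetes.io/serviceaccount

    # on the affected RHEL docker 1.6.0 the listing comes back empty because
    # /var/run/secrets is masked; on upstream docker (or 1.6.2-8) the token file appears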


liggitt commented Jun 10, 2015

For reference, this was the RHEL issue with secret mounts being masked: #2921 (comment)


liggitt commented Jun 10, 2015

Closing this issue, since it is resolved by the RHEL docker 1.6.2-8 fix.

It is still possible to create a build or deployment pod immediately after creating a project before the serviceaccount API tokens have been generated, and no token will be mounted. The QPS changes in #3003 greatly narrowed the window where that was possible, but I will open another issue to track resolving that.

liggitt closed this as completed Jun 10, 2015

xjvs commented Jun 10, 2015

Yes, thanks very much. Actually I just figured out it's https://bugzilla.redhat.com/show_bug.cgi?id=1229319 . It wasn't the version upgrade itself that solved the issue; switching from the RHEL Docker (1.6.0) to upstream Docker (1.6.2) solved it, and I confirmed upstream 1.6.0 works fine too.


liggitt commented Jun 10, 2015

Opened #3035 to track the issue with pods created before API tokens have been generated.


gravis commented Jun 10, 2015

I'm having the same issue with the docker registry, but not only there.
When I trigger a deployment manually, it fails with the error:

F0610 18:32:48.935073       1 deployer.go:65] User "system:anonymous" cannot get replicationcontrollers in project "myproject"

I'm using the openshift/origin:v0.6 docker image in boot2docker (docker 1.6.2), and oc deploy redis-master --latest to deploy.

Thanks


gravis commented Jun 10, 2015

For the record, it started to fail with v0.5.4, not v0.6. v0.5.3 is working fine.


liggitt commented Jun 10, 2015

There are three separate issues that exhibit the same symptom ("User "system:anonymous" cannot get replicationcontrollers"):

  1. rhel mounted /var/run/secrets which masked the secret - this issue, fixed in rhel docker 1.6.2-8
  2. boot2docker issue with secret mounts (which use tmpfs) - tracked in Docker 1.7 cannot mount secrets #3072
  3. the service account token not being ready when the deployer pod is created - tracked in Prevent pod admission before service account tokens exist #3035
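
A hedged triage checklist mapping onto the three causes above (command paths and names follow the sample-app setup earlier in this thread; the rpm package name is an assumption for RHEL-family hosts):

    # 1. RHEL secret masking: check the installed docker package level
    rpm -q docker    # affected: RHEL docker 1.6.0; fixed in docker-1.6.2-8

    # 2. boot2docker/tmpfs secret mounts: confirm the serviceaccount bind is listed
    #    for the failed deployer container, then check it is visible inside a container
    docker inspect <deployer-container> | grep serviceaccount

    # 3. token not generated yet: confirm the deployer service account has tokens
    oc describe serviceaccount deployer --config=openshift.local.config/master/admin.kubeconfig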


liggitt commented Jun 10, 2015

@gravis the issue with boot2docker is actually a secret mounting issue... we just started using secrets in 0.5.4 in a way that makes the failures noticeable.


gravis commented Jun 11, 2015

We're also seeing this with a fresh Atomic installation (CentOS), which ships with Docker 1.6.0 and makes it impossible to run OpenShift out of the box :(


ncdc commented Jun 11, 2015

I believe the CentOS Docker tracks upstream and doesn't include our patches. I'd be surprised if "rhel mounted /var/run/secrets which masked the secret" applies to the CentOS RPM.


gravis commented Jun 11, 2015

Could you translate that for Red Hat noobs please :)
The http://www.projectatomic.io/download/ page doesn't mention any "rhel" version.
Thanks


ncdc commented Jun 11, 2015

We have some patches to Docker in our Fedora and RHEL RPMs. One of these patches caused an issue with some of the 1.6.x RHEL RPMs, and it was fixed in 1.6.2-8. Again, this was for RHEL only. We probably need to spin up a CentOS VM and see if we can reproduce your issue.


gravis commented Jun 11, 2015

Ok thanks for the explanation. We have a strong Debian background, and all these flavors are pretty new to us :)


liggitt commented Jun 11, 2015

@gravis can you open a separate issue to track the issue with secrets on CentOS, and include the results of docker inspect <container> and oc get pod <pod> -o json?


rhatdan commented Jun 11, 2015

CentOS and RHEL have the same patches, even though CentOS does not use the secrets.


ncdc commented Jun 11, 2015

@rhatdan I thought CentOS was pure upstream?
