Can not apply cluster, "sudo: a password is required" #1209

Closed
sunshine69 opened this issue Apr 25, 2020 · 6 comments

@sunshine69
Contributor

Hi team,

I am playing with the Docker image 0.6.0 and also with the latest develop branch built as a Docker image; both lead to the same error message.

Let me describe the steps I did:

  • Create 1 master and 2 nodes running Ubuntu 18.04. The master has 2 GB of RAM and 2 cores, enough to make kubeadm happy; the nodes have 1.5 GB of RAM.
  • Set up the admin user ubuntu with an SSH key so the user can SSH in using the key. The user can also sudo without a password.
  • Start the epiphany Docker container with the docker option --net host so it stays on the same network as the three nodes.
  • Run
epicli init -p any newcluster1
cd newcluster1
vim newcluster1.yaml
## Set all node counts to 0 except the Kubernetes master and two nodes; I just want to experiment with this first. Save and quit vim.
epicli apply -f newcluster1.yaml
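Before running apply, the sudo prerequisite from the steps above can be checked on every node. The following is only a minimal sketch: `check_remote_sudo` and the `SSH_BIN` override are hypothetical names introduced here, and it assumes the nodes are reachable by the host names shown in the inventory.

```shell
# Sketch: verify that the admin user can run sudo without a password on each node.
# check_remote_sudo is a hypothetical helper; SSH_BIN (also hypothetical) lets
# you substitute another ssh-compatible command.
check_remote_sudo() {
    user="$1"
    key="$2"
    shift 2
    failed=0
    for host in "$@"; do
        # BatchMode=yes stops ssh from prompting for input;
        # sudo -n fails instead of asking for a password.
        if "${SSH_BIN:-ssh}" -i "$key" -o BatchMode=yes "$user@$host" \
            sudo -n true </dev/null 2>/dev/null; then
            echo "$host: ok"
        else
            echo "$host: FAILED"
            failed=1
        fi
    done
    return "$failed"
}
```

For example, `check_remote_sudo ubuntu /path/to/key master1 node1 node2` (the key path is a placeholder) prints one line per node and returns non-zero if any node fails. Note that this only covers the cluster nodes, not the user inside the epicli container.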

It runs for a short while and then fails with this error:

10:57:53 INFO cli.engine.ansible.AnsibleCommand - TASK [preflight_facts : Store preflight facts] *************************************************************
10:57:53 INFO cli.engine.ansible.AnsibleCommand - fatal: [master1]: FAILED! => {"msg": "Failed to get information on remote file (/shared/build/newcluster1/vault//../preflight_facts.yml): sudo: a password is required\n"}
10:57:53 INFO cli.engine.ansible.AnsibleCommand - 
10:57:53 INFO cli.engine.ansible.AnsibleCommand - NO MORE HOSTS LEFT *****************************************************************************************
10:57:53 INFO cli.engine.ansible.AnsibleCommand - 
10:57:53 INFO cli.engine.ansible.AnsibleCommand - PLAY RECAP *************************************************************************************************
10:57:53 INFO cli.engine.ansible.AnsibleCommand - master1                    : ok=14   changed=0    unreachable=0    failed=1    skipped=8    rescued=0    ignored=0   
10:57:53 INFO cli.engine.ansible.AnsibleCommand - node1                      : ok=12   changed=0    unreachable=0    failed=0    skipped=10   rescued=0    ignored=0   
10:57:53 INFO cli.engine.ansible.AnsibleCommand - node2                      : ok=12   changed=0    unreachable=0    failed=0    skipped=10   rescued=0    ignored=0   
10:57:53 INFO cli.engine.ansible.AnsibleCommand - 
10:57:53 ERROR cli.engine.ansible.AnsibleCommand - Error running: "ansible-playbook -i /shared/build/newcluster1/inventory /shared/build/newcluster1/ansible/preflight.yml"
10:57:53 INFO cli.engine.ansible.AnsibleCommand - Retry running playbook: 1/1
10:58:03 INFO cli.engine.ansible.AnsibleRunner - Run done in 28570ms
10:58:03 ERROR epicli - Failed running playbook after 1 retries
10:58:09 INFO dump_debug_info - Error dump has been written to: /shared/build/newcluster1/epicli_error_20200425-105803.dump
10:58:09 WARNING dump_debug_info - This dump might contain sensitive information. Check before sharing.

I wonder what I did wrong. Or is this a known bug?

Please note that the Docker tag 0.4.2 does not have this issue; it builds the cluster just fine.

I will share the dump file if requested.

Thanks team.

@sunshine69
Contributor Author

The reason I want to run the latest is that it has the option apply --skip-config, which in my understanding allows me to play more with the Ansible files generated after the first apply without them being overwritten again by epicli.

Also at some stage we need to upgrade anyway.

@sk4zuzu
Contributor

sk4zuzu commented Apr 25, 2020

Hi @sunshine69!

So far I have failed to reproduce the issue :(. I tried the steps below (following the docs here):

$ docker pull epiphanyplatform/epicli:0.6.0
$ docker run -it -v `pwd`:/shared --rm epiphanyplatform/epicli:0.6.0
epiuser@(redacted):/shared$ epicli apply -f any1.yml

where any1.yml is just a standard config file which looks like:

kind: epiphany-cluster
title: "Epiphany cluster Config"
provider: any
name: "any1"
specification:
  name: any1
  admin_user:
    name: ubuntu
    key_path: /shared/id_rsa
  components:
    kubernetes_master:
      count: 1
      machines:
        - default-k8s-master1
    kubernetes_node:
      count: 2
      machines:
        - default-k8s-node1
        - default-k8s-node2
    logging:
      count: 0
    monitoring:
      count: 0
    kafka:
      count: 0
    postgresql:
      count: 0
    load_balancer:
      count: 0
    rabbitmq:
      count: 0
---
kind: configuration/shared-config
title: Shared configuration that will be visible to all roles
name: default
specification:
  use_ha_control_plane: false
  promote_to_ha: false
provider: any
---
kind: infrastructure/machine
provider: any
name: default-k8s-master1
specification:
  hostname: x1a1
  ip: 10.20.2.10
---
kind: infrastructure/machine
provider: any
name: default-k8s-node1
specification:
  hostname: x1b1
  ip: 10.20.2.20
---
kind: infrastructure/machine
provider: any
name: default-k8s-node2
specification:
  hostname: x1b2
  ip: 10.20.2.21

The --net=host argument makes no difference in my environment.

Could you provide more info about which operating system you run that Docker container on, and maybe the exact steps you use to enter it? Do you modify the image in any way?

Thanks for reporting the issue!

@sunshine69
Contributor Author

Right, it might be the epiuser used as default. To retain ownership of my current shared volume, I create a user with the same uid inside the image, and when I run it I don't use epiuser but that newly created user. I do not think I enabled sudo for that user.
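One quick way to test this hypothesis inside the container is to ask sudo to run non-interactively. The snippet below is a sketch, not part of epicli; `check_passwordless_sudo` is a hypothetical helper name, and its optional argument exists only so another sudo-compatible command can be substituted.

```shell
# Sketch: report whether the current user can use sudo without a password.
# check_passwordless_sudo is a hypothetical helper name.
check_passwordless_sudo() {
    sudo_cmd="${1:-sudo}"
    # -n (non-interactive) makes sudo exit with an error instead of prompting,
    # which is the situation behind "sudo: a password is required".
    if "$sudo_cmd" -n true 2>/dev/null; then
        echo "passwordless sudo: yes"
    else
        echo "passwordless sudo: no"
        return 1
    fi
}
```

Running `check_passwordless_sudo` as the user that launches epicli should print `passwordless sudo: yes`; if it prints `no`, tasks that escalate privileges inside the container can be expected to fail in the same way.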

Let me look at that and repeat the process again. I will post update here.

Thanks a lot for looking into this.

Kind regards

@sunshine69
Contributor Author

Yes, confirmed. If the user running epicli inside the epiphany container has passwordless sudo, or if I just use the default user epiuser, the issue is gone. (I had to change epiuser's uid and gid to match my current user, because on my Ubuntu the first user defaulted not to 1000 but to 1001.)

However, I got another error: gpg-agent is missing on the Ubuntu system. I might create a PR later to add these missing packages to the Ansible code, once I have the full list.

The Ubuntu system I have is very minimal.

@sk4zuzu
Contributor

sk4zuzu commented Apr 28, 2020

Hi @sunshine69!

I carefully reviewed the code and found that it is really unnecessary for sudo to be required for delegate_to: localhost type of Ansible tasks.

Here's the pull request that is going to fix that: #1217. :)
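For context: a task with delegate_to: localhost runs on the machine that executes ansible-playbook (here, inside the epicli container), so any privilege escalation applied to it calls sudo there rather than on the cluster node. To see which generated tasks are affected, something like the following sketch can help. `find_localhost_delegations` is a hypothetical helper name, and the sketch says nothing about how #1217 actually changes these tasks.

```shell
# Sketch: list Ansible tasks that delegate to localhost under a directory.
# find_localhost_delegations is a hypothetical helper name.
find_localhost_delegations() {
    dir="$1"
    # -r recurses into subdirectories, -n prints line numbers, and the pattern
    # tolerates any amount of whitespace after "delegate_to:".
    grep -rnE 'delegate_to:[[:space:]]*localhost' "$dir"
}
```

With the paths from the log above, this would be run as `find_localhost_delegations /shared/build/newcluster1/ansible`.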

Thanks.

@sunshine69
Contributor Author

Thanks, I will test it tomorrow.

@sunshine69 sunshine69 reopened this Apr 28, 2020
@sk4zuzu sk4zuzu changed the title from 'Can not apply cluster - fatal: [master1]: FAILED! => {"msg": "Failed to get information on remote file (/shared/build/newcluster1/vault//../preflight_facts.yml): sudo: a password is required' to 'Can not apply cluster, "sudo: a password is required".' Apr 28, 2020
@sk4zuzu sk4zuzu changed the title from 'Can not apply cluster, "sudo: a password is required".' to 'Can not apply cluster, "sudo: a password is required"' Apr 28, 2020
@sk4zuzu sk4zuzu closed this as completed Apr 30, 2020