[BUG] Epicli is failing in air-gapped infra mode #2653

Closed
11 of 18 tasks
romsok24 opened this issue Oct 1, 2021 · 4 comments

@romsok24
Contributor

romsok24 commented Oct 1, 2021

Describe the bug

When you follow the instructions described in this EPI user guide, the deployment fails at the task:
TASK [preflight_facts : PREFLIGHT_FACTS | Decide what should be repository url]
with the following error:

fatal: [hostname_1]: FAILED! => {
    "msg": "The task includes an option with an undefined variable. The error was: {'repository_url': '{{ _repository_url }}', 'repository_hostname': '{{ _repository_hostname }}', 'resolved_repository_hostname': '{{ _resolved_repository_hostname }}'}: {{ ( 'http://' ~ hostvars[registered_masters[0]].repository_hostname ~ '/epirepo' ) if _reconstruct_repository_url else repository_url }}: {{ custom_repository_url | default(local_repository_url, true) }}: http://{{ hostvars[groups.repository[0]].ansible_default_ipv4.address }}/epirepo: 'dict object' has no attribute 'address'\n\nThe error appears to be in '/shared/build/dwuzer/ansible/roles/preflight_facts/tasks/kubernetes/get-repository-url.yml': line 12, column 3, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n\n- name: PREFLIGHT_FACTS | Decide what should be repository url\n  ^ here\n"
}
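
The failing template chain ends in hostvars[groups.repository[0]].ansible_default_ipv4.address, and the message says the dict has no attribute 'address'. A minimal Python sketch of that last step follows, assuming the facts are available as a plain dict. The function name and the `repository_facts` parameter are hypothetical and are not part of epicli or Ansible.

```python
from typing import Any, Dict


def local_repository_url(repository_facts: Dict[str, Any]) -> str:
    """Build the local repository URL from a host's gathered facts.

    `repository_facts` is a hypothetical stand-in for
    hostvars[groups.repository[0]] from the error message.
    """
    default_ipv4 = repository_facts.get("ansible_default_ipv4", {})
    address = default_ipv4.get("address")
    if not address:
        # Same condition that surfaces in the task as
        # "'dict object' has no attribute 'address'".
        raise ValueError(
            "ansible_default_ipv4 has no 'address'; "
            "check that a default route is configured on the repository host"
        )
    return f"http://{address}/epirepo"
```

With an address present the function returns the URL; with an empty ansible_default_ipv4 it raises an error that names the likely cause instead of an undefined-variable message.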

How to reproduce
Steps to reproduce the behavior:

  1. Prepare the offline environment as described here
  2. Use the epicli prepare --os ubuntu-18.04 version
  3. Ensure the environment is isolated from the internet (in my case, I used a Hyper-V internal-only v-switch)
  4. Execute epicli apply -f <your-name>.yml --no-infra --offline-requirements /requirementsoutput/

Expected behavior
The environment is deployed with the epicli command without errors.

Config files
If applicable, add config files to help explain your problem.

Environment

  • Cloud provider: None ( on prem installation using Hyper-V VMs )
  • OS: Ubuntu 18.04.4 LTS

epicli version: 1.0.1

Additional context
According to my investigation, the problem is that no default IPv4 address is gathered, so ansible_default_ipv4 has no address attribute. This follows from how Ansible produces the ansible_default_ipv4 value: it inspects the address of the interface linked with the default route (see the output of the ip r command). In my case, the environment was prepared as described in the user guide using a second interface connected to the internet, and that interface was then disconnected from the VM. This left the deployment machine without a default route, which in turn caused epicli to fail when deploying the offline infra.
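
To illustrate the dependency on the default route, here is a rough Python sketch that looks for a default entry in the text printed by ip r. It only parses a string passed to it. The function name and the sample outputs are made up for illustration, and Ansible's own fact-gathering logic may work differently.

```python
from typing import Dict, Optional


def find_default_route(ip_route_output: str) -> Optional[Dict[str, str]]:
    """Return attributes of the first default route in `ip route` output, or None."""
    for line in ip_route_output.splitlines():
        tokens = line.split()
        if not tokens or tokens[0] != "default":
            continue
        route: Dict[str, str] = {}
        # Attributes appear as "key value" pairs, e.g. "via 10.0.0.1 dev eth0".
        for key, value in zip(tokens[1:], tokens[2:]):
            if key in ("via", "dev", "src"):
                route[key] = value
        return route
    return None


# Hypothetical outputs: one host with a default route, one without.
WITH_DEFAULT = (
    "default via 192.168.1.1 dev eth0 proto dhcp src 192.168.1.10 metric 100\n"
    "192.168.1.0/24 dev eth0 proto kernel scope link src 192.168.1.10\n"
)
WITHOUT_DEFAULT = "192.168.1.0/24 dev eth0 proto kernel scope link src 192.168.1.10\n"

print(find_default_route(WITH_DEFAULT))
print(find_default_route(WITHOUT_DEFAULT))
```

The second call returns None, which corresponds to the situation described here: the interface is still up and has an address, but there is no default route.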

TL;DR
A preflight check should be added to ensure that the default route is defined on the host.


DoD checklist

  • Changelog
    • updated
    • not needed
  • COMPONENTS.md
    • updated
    • not needed
  • Schema
    • updated
    • not needed
  • Backport tasks
    • created
    • not needed
  • Documentation
    • added
    • updated
    • not needed
  • Feature has automated tests
  • Automated tests passed (QA pipelines)
    • apply
    • upgrade
    • backup/restore
  • Idempotency tested
  • All conversations in PR resolved
@to-bar
Contributor

to-bar commented Oct 22, 2021

Epiphany code assumes that the default route is configured on each target host, so this is a prerequisite.
IMO we need to do 2 things:

  1. Add the above prerequisite to the documentation
  2. Add a preflight check to ensure the default route is configured (as @romsok24 proposed)

Info regarding ansible_default_ipv4:
https://medium.com/opsops/ansible-default-ipv4-is-not-what-you-think-edb8ab154b10

@plirglo plirglo self-assigned this Jan 11, 2022
@przemyslavic przemyslavic self-assigned this Feb 9, 2022
@przemyslavic
Collaborator

@plirglo @to-bar I ran into an issue when installing epicli for any provider on a machine with a public address.

11:46:05 INFO cli.engine.ansible.AnsibleCommand - TASK [preflight : Validate if ansible_default_ipv4.address matches address from inventory] ***
11:46:05 ERROR cli.engine.ansible.AnsibleCommand - fatal: [repository]: FAILED! => {"assertion": "ansible_default_ipv4.address == ansible_host", "changed": false, "evaluated_to": false, "msg": "ansible_default_ipv4.address is 10.1.11.4 but inventory uses ip: 20.126.xxx.yyy. Check default routing configuration, read more in troubleshooting document."}

Shouldn't we also support such a scenario, especially since the installation went fine after removing the routing checks?
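
The failure in the log above can be reproduced outside Ansible with a small Python sketch of the same comparison. The function name is hypothetical, and 203.0.113.10 is a placeholder documentation address standing in for the masked public IP.

```python
from typing import Tuple


def default_ipv4_matches_inventory(
    default_ipv4_address: str, ansible_host: str
) -> Tuple[bool, str]:
    """Mirror the assertion 'ansible_default_ipv4.address == ansible_host'."""
    if default_ipv4_address == ansible_host:
        return True, "ok"
    return False, (
        f"ansible_default_ipv4.address is {default_ipv4_address} "
        f"but inventory uses ip: {ansible_host}"
    )


# 10.1.11.4 comes from the log above; 203.0.113.10 is a made-up placeholder
# for the masked public address.
ok, message = default_ipv4_matches_inventory("10.1.11.4", "203.0.113.10")
print(ok, message)
```

On a cloud VM the default-route interface usually carries the private address while the inventory holds the public one, so a strict equality check like this fails even though the host is reachable.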

@to-bar
Contributor

to-bar commented Feb 9, 2022

The scenario with the any provider but without specification.cloud is not supported for VMs in the cloud.

@przemyslavic
Copy link
Collaborator

OK, then moving to DoD.
