[BUG] Backup/recovery commands fail when executed directly after upgrade #2907

Closed
12 of 20 tasks
przemyslavic opened this issue Jan 19, 2022 · 1 comment
przemyslavic commented Jan 19, 2022

Describe the bug
Backup/recovery commands fail when executed directly after upgrade because no vars/specification are generated.
The following scenarios work:

  • apply->backup->recovery
  • apply->upgrade->re-apply->backup->recovery

The following scenario fails:

  • apply->upgrade->backup

How to reproduce
Steps to reproduce the behavior:

  1. deploy any cluster < v1.3
  2. upgrade to v1.3
  3. execute epicli backup

Expected behavior
Backup/recovery should be successful or there should be an assertion to check if apply was executed first.
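
A minimal sketch of such an assertion, assuming the role vars are expected at `<build_dir>/ansible/roles/<role>/vars/main.yml` (the layout seen in the error output in this issue). The function names `missing_role_vars` and `assert_apply_was_run` are hypothetical and not part of epicli; the actual fix may look different.

```python
from pathlib import Path
from typing import Iterable, List


def missing_role_vars(build_dir: Path, roles: Iterable[str]) -> List[Path]:
    """Return the expected role vars files that do not exist under build_dir.

    Assumed layout (taken from the error output in this issue):
    <build_dir>/ansible/roles/<role>/vars/main.yml
    """
    expected = (
        build_dir / "ansible" / "roles" / role / "vars" / "main.yml"
        for role in roles
    )
    return [path for path in expected if not path.is_file()]


def assert_apply_was_run(build_dir: Path, roles: Iterable[str]) -> None:
    """Fail early with a readable message instead of a late Ansible error."""
    missing = missing_role_vars(build_dir, roles)
    if missing:
        listing = "\n".join(f"  {path}" for path in missing)
        raise RuntimeError(
            "Role vars are missing; run 'epicli apply' before backup/recovery:\n"
            + listing
        )
```

Called at the start of backup/recovery, for example `assert_apply_was_run(Path("/shared/build/<cluster>"), ["rabbitmq"])`, a check like this would stop the run before Ansible starts.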

Config files

Environment

  • Cloud provider: [all]
  • OS: [all]

epicli version: all epicli versions

Additional context

Could not find or access 'roles/rabbitmq/vars/main.yml'
Searched in:
    /shared/build/10to13awrhflannel/ansible/vars/roles/rabbitmq/vars/main.yml
    /shared/build/10to13awrhflannel/ansible/roles/rabbitmq/vars/main.yml
    /shared/build/10to13awrhflannel/ansible/vars/roles/rabbitmq/vars/main.yml
    /shared/build/10to13awrhflannel/ansible/roles/rabbitmq/vars/main.yml on the Ansible Controller.
If you are using a module and expect the file to exist on the remote, see the remote_src option

One more note about backup/recovery: it should be run only after apply, so the correct paths are apply->backup->recovery and apply->upgrade->re-apply->backup->recovery. When executed directly after upgrade, it fails because there is no specification/vars. Do we accept this approach? Otherwise we still have some work to do.

Failed examples:
The task includes an option with an undefined variable. The error was: {{ dirs_to_archive | default([])
                   | map('regex_replace', '//*$', '')
                   | select
                   | map('regex_replace', '$', '/')
                   | list }}: ['{{ component_vars.specification.storage.data_directory }}/snapshots/{{ prometheus_snapshot_name }}/']: 'dict object' has no attribute 'specification'
The error appears to be in '/shared/build/10to13awrhflannel/ansible/roles/backup/tasks/common/create_snapshot_archive.yml': line 28, column 3, but may
be elsewhere in the file depending on the exact syntax problem.
The offending line appears to be:

- name: Reconstruct the paths_to_archive list
  ^ here
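
The `'dict object' has no attribute 'specification'` error above surfaces deep inside a Jinja expression. As a hedged illustration only, the same lookup can be guarded so that the first missing key is reported explicitly. The helper `get_nested` and the sample value `/var/lib/prometheus` are hypothetical and not part of epicli.

```python
from typing import Any, Mapping


def get_nested(data: Mapping[str, Any], dotted_key: str) -> Any:
    """Walk a dotted key path and raise a readable error at the first missing part."""
    current: Any = data
    walked = []
    for part in dotted_key.split("."):
        walked.append(part)
        if not isinstance(current, Mapping) or part not in current:
            path = ".".join(walked)
            raise KeyError(
                f"missing '{path}'; was 'epicli apply' run after the upgrade?"
            )
        current = current[part]
    return current


# Hypothetical vars as they might look after a successful apply.
component_vars = {"specification": {"storage": {"data_directory": "/var/lib/prometheus"}}}
data_directory = get_nested(component_vars, "specification.storage.data_directory")
```

With an empty `component_vars`, the call raises a `KeyError` that names `specification` as the missing key rather than failing inside the archive task.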


DoD checklist

  • Changelog
    • updated
    • not needed
  • COMPONENTS.md
    • updated
    • not needed
  • Schema
    • updated
    • not needed
  • Backport tasks
    • created
    • not needed
  • Documentation
    • added
    • updated
    • not needed
  • Feature has automated tests
  • Automated tests passed (QA pipelines)
    • apply
    • upgrade
    • backup/restore
  • Idempotency tested
  • All conversations in PR resolved
  • Solution meets requirements and is done according to design doc
  • Usage compliant with license

przemyslavic commented Mar 2, 2022

✔️ Fixed
Tested scenarios:

  • apply v1.0.2 -> upgrade to 2.0.0dev -> backup -> recovery
  • apply v1.3.0 -> upgrade to 2.0.0dev -> backup -> recovery

Reproduced issue #2997 as reported by @rafzei when testing upgrade from v1.3.0.

seriva closed this as completed Mar 3, 2022