CI staging runs are timing out while waiting for VM reboot #4440

Closed
conorsch opened this issue May 16, 2019 · 3 comments


@conorsch
Contributor

Description

The staging-test-with-rebase CI job has been timing out frequently, blocking review and merge.

Steps to Reproduce

Observe example CI failures:

Expected Behavior

CI should pass if no bugs are found. CI should not halt early because VMs did not successfully reboot.

Actual Behavior

CI frequently fails while waiting for the staging VMs to come back up after a reboot. Example output:

    RUNNING HANDLER [common : Wait for server to come back.] ***********************
    fatal: [app-staging -> localhost]: FAILED! => {"changed": false, "elapsed": 300, "msg": "Timeout when waiting for search string OpenSSH in 192.168.121.45:22"}
    fatal: [mon-staging -> localhost]: FAILED! => {"changed": false, "elapsed": 300, "msg": "Timeout when waiting for search string OpenSSH in 192.168.121.81:22"}
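
For readers unfamiliar with this kind of check, the failure message above is consistent with a handler that polls port 22 until the service banner contains the string `OpenSSH`, giving up after 300 seconds. The following is a minimal Python sketch of that polling logic, not the project's actual handler; the function name `wait_for_banner` and its parameters are hypothetical and chosen only for illustration.

```python
import socket
import time


def wait_for_banner(host, port, search="OpenSSH", timeout=300.0,
                    connect_timeout=5.0, sleep=1.0):
    """Poll host:port until the service banner contains `search`.

    Returns True once the string is seen, False if `timeout` seconds elapse.
    All names and defaults here are illustrative assumptions.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port),
                                          timeout=connect_timeout) as conn:
                # An SSH server sends its identification string first,
                # so a single read is enough to see the banner.
                banner = conn.recv(256).decode("ascii", errors="replace")
                if search in banner:
                    return True
        except OSError:
            # Connection refused, unreachable, or timed out: the VM is
            # probably still rebooting, so retry until the deadline.
            pass
        time.sleep(sleep)
    return False
```

Calling `wait_for_banner("192.168.121.45", 22)` would mirror the 300-second wait seen in the log. A `False` result corresponds to the timeout reported there: the VM never presented an SSH banner within the window.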
    

Comments

The GCP CI image hasn't changed since 2018-10. On every start, however, cloud-init settings will upgrade all apt packages. Perhaps an upstream version mismatch in the Xenial repos (for the GCP CI host, not for the staging VMs themselves) is causing problems with the libvirt config.

@conorsch
Contributor Author

The temporary fix in #4435 yielded at least one passing build (https://circleci.com/gh/freedomofpress/securedrop/27875). Will re-run it in an attempt to shake out problems.

@conorsch
Contributor Author

Another successful build: https://circleci.com/gh/freedomofpress/securedrop/27883

@conorsch
Contributor Author

More successful builds:

The fix has been merged into develop via #4432, and will shortly be backported to release/0.13, so I'm marking this resolved. Please open a new issue if CI problems rear their heads again.
