
Converts config tests from Serverspec to TestInfra #1616

Merged
79 commits merged on Mar 16, 2017

Conversation

conorsch
Contributor

All SecureDrop tests are now Python-based, with the config tests using testinfra, which is itself a pytest module. The config test conversion allows us to remove the Ruby-based Serverspec tooling, which was not running as part of CI (see #1067), and was underutilized by developers.

Items of note:

To test, read the new docs included in this PR and if anything isn't clear, yell at me for writing bad docs. You should be able to get passing tests on development, build, app-staging, and mon-staging.

Closes #1580.

Conor Schaefer and others added 30 commits March 10, 2017 14:05
The ServerSpec tests were checking for hard-coded pip versions in the
development environment that were out of date. The coverage dependencies
were bumped in bd795e2; this change
updates the tests accordingly.
Only adds the testinfra version; doesn't remove the spectests version
yet. Some of the versions for pip dependencies have changed; tracking
those down now.
Mostly these are simple checks to ensure the SSH and iptables config are
in line with system defaults, since SSH hardening and SSH over Tor only
apply to the staging and prod environments.
The testinfra invocation is tedious to type by hand, so let's wrap it up
in a script and use that.

The testinfra wrapper script now targets the development VM only.
We can update the script to accept arguments again once we're finished
porting the development spec tests to testinfra.
Mostly these are checks that the development environment is configured
correctly, e.g. app files are served out of /vagrant rather than
/var/www, and apache and other apt packages are absent, since we're
running the dev servers out of the local app code.
Using the `pytest.mark.parametrize` decorator to ensure that all tests
are run, even if some of the tests fail. This is a recommended best
practice with pytest, and avoids the use of `for` loops, which would
halt testing on the first failure. When testing many different pip
versions in a loop, you want to know all the versions that are wrong,
otherwise you have to patch and retest iteratively.
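As a rough illustration of the pattern (not the PR's actual pins), a
parametrized pip check looks something like the sketch below, written
against testinfra's `host` fixture; the package names and versions are
placeholders.

```python
import pytest

# Illustrative pins only -- the real values live in the test files/vars.
PIP_PINS = [
    ("pytest", "3.0.6"),
    ("coverage", "4.3.4"),
]

@pytest.mark.parametrize("package,version", PIP_PINS)
def test_pip_package_version(host, package, version):
    # Each (package, version) pair becomes its own test case, so a single
    # stale pin doesn't hide the others the way a for-loop would.
    frozen = host.run("pip freeze").stdout
    assert "{}=={}".format(package, version) in frozen
```
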
Fairly straightforward. Lazily hard-coding test vars right now,
specifically for development.
These were the bare packages I needed locally to get the testinfra tests
chugging along. We may want to reorganize this someplace else later; just
getting it into git in the meantime.
Only implemented for the development machine at present.
These tests should also run on the app-staging machine,
but we can add that logic to the test wrapper later.
Ran into a bit of trouble with the Firefox version pinned as part of the
Selenium config, since that broke checks for "all upgrades have been
installed." Added a logic branch in the tests that permits Firefox
if it's the only package in the list of upgradeables.
Includes regression checks for old fingerprints on expired FPF signing
keys. Not explicitly testing for files created by `securedrop-keyring`
package, but the fingerprint checks do include that filepath, so if it's
missing, the corresponding output in `apt-key finger` will change and
cause the test to fail.
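A hedged sketch of the check, with placeholder fingerprints (the real test
pins the current FPF release key and asserts the expired ones are gone):

```python
import pytest

# Placeholder fingerprints -- substitute the real FPF signing key values.
FINGERPRINTS = [
    ("0000 0000 0000 0000 0000  0000 0000 0000 0000 0000", True),   # current key
    ("1111 1111 1111 1111 1111  1111 1111 1111 1111 1111", False),  # expired key
]

@pytest.mark.parametrize("fingerprint,expected", FINGERPRINTS)
def test_fpf_apt_key_fingerprints(host, fingerprint, expected):
    # `apt-key finger` output also covers keys shipped by securedrop-keyring,
    # so a missing keyring file shows up as a changed fingerprint listing.
    output = host.run("apt-key finger").stdout
    assert (fingerprint in output) == expected
```
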
Simple checks for dotfiles that enforce the auto-tmux functionality for
login shells on SecureDrop machines. It'd really be simpler if we
switched to `byobu`, which has the `byobu-enable` command for exactly
this use case.
Major workaround in place here: checking the tor service running and
enabled states via low-level, sysv-specific shell-outs, rather than
leveraging TestInfra's `Service` module. The Service module incorrectly
assumes the tor service will be managed inside `/etc/init/tor`, as an
Upstart service, which isn't true. Read the Service source code and
reimplemented the checks described there.

The service checks will break if run against a systemd-style host, but
that's fine.
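Roughly, the workaround looks like the sketch below (the PR-era tests use
the Command fixture directly rather than `host.run`, and the rc*.d glob is
an assumption about how the "enabled" check is expressed):

```python
def test_tor_service_running(host):
    # Shell out to the SysV init script instead of using the Service module,
    # which wrongly assumes an Upstart job at /etc/init/tor.
    assert host.run("service tor status").rc == 0

def test_tor_service_enabled(host):
    # "Enabled" for SysV means start symlinks exist in the rc*.d directories.
    # As noted above, this will not work on a systemd host.
    symlinks = host.run("find /etc/rc*.d -name 'S*tor'").stdout
    assert symlinks.strip() != ""
```
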
Reasonable checks for the most part, leaning heavily on the Command and
File modules since TestInfra doesn't boast as many built-in types as
Serverspec does.

The swap check is marked as "expected to fail", via the
`pytest.mark.xfail` decorator, pending updates to the Ansible config.
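For reference, an xfail-marked check of this shape (the swap-off assertion
is an assumption about what the test verifies):

```python
import pytest

@pytest.mark.xfail(reason="Swap handling not yet updated in the Ansible config")
def test_swap_disabled(host):
    # With swap off, `swapon -s` prints at most a header line.
    summary = host.run("swapon -s").stdout.strip()
    assert len(summary.splitlines()) <= 1
```
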
Haven't fully baked this through a prod install yet. Figure I'll take a
second pass on that as I move through. I'm also not 100% sold that my design
layout is the way to go; really experimenting for now.
In particular, had to change $@ -> ${@:2} so that if you pass
something like `tests.sh mon-staging`, testinfra won't bomb out because
it thinks `mon-staging` should be a filename.
Parametrization of the `paxtest` checks slows down the test run
significantly, but they're currently disabled by default because the
`paxtest` command isn't installed. We should update the app-test role to
install paxtest to make sure it's available on staging hosts.

Setting the PaX flag checks as permissible failures due to PaX flags
being out of sync between:

  * Ansible playbook in SD repo
  * post-install kernel hook in securedrop-grsec metapackage

The Ansible config should be updated so that the flags match.
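A sketch of how the paxtest checks can be skipped until the binary lands
(the mode name is illustrative):

```python
import pytest

@pytest.mark.parametrize("mode", ["blackbox"])
def test_paxtest(host, mode):
    # Skip rather than fail while the app-test role doesn't install paxtest.
    if not host.exists("paxtest"):
        pytest.skip("paxtest is not installed on this host")
    assert host.run("paxtest {}".format(mode)).rc == 0
```
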
Substantial changes here. For one, the Document Interface has been
renamed to the Journalist Interface, so many tests needed logic updating
to check the correct filepath and content.

Most other checks are straightforward. We're a bit heavy handed on exact
matches for the configs, but I'm comfortable with that: configs for
these services should be well thought out, and must include updates to
the test suite, as well.
The combined test file was approaching 400 lines, much of which was
variable declaration via the pytest parametrization marker, and it had
become difficult to navigate and update. Splitting the tests into smaller
chunks makes it more obvious where testing logic should land.
These are new, and haven't been tested before. We recently had a report
that Journalist Interface logging is broken, and that's indeed true.
Added an xfail marker on a test for that to track the regression once
patched.
Very simple test logic, using a substring search for a large multiline
block. More problematic is the non-DRY reuse of hardcoded vars; planning
on cleaning up that logic once the Serverspec -> Testinfra sprint has
slowed down, and the new test layout is stable.
Simple check to confirm all IPv6 traffic is dropped.
Had to make some revisions to how we were doing iptables comparisons. Since
iptables is highly dependent on the order of the rules, it is not enough to
simply check iteratively whether each line exists - order matters. So now I am
comparing the entire iptables output against an expected output.
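In sketch form, the comparison becomes a single whole-output assertion (the
expected ruleset shown here is a truncated stand-in for the per-host value):

```python
EXPECTED_IPTABLES = """-P INPUT DROP
-P FORWARD DROP
-P OUTPUT DROP"""  # truncated illustration; the real ruleset is much longer

def test_iptables_ruleset(host):
    # Compare the entire ruleset in order, rather than probing rule by rule.
    with host.sudo():
        actual = host.run("iptables -S").stdout
    assert actual.strip() == EXPECTED_IPTABLES.strip()
```
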
Superseded by replacement testinfra tests.
Modifies the wrapper script used to trigger the Testinfra test suite so
that it exports an environment variable
SECUREDROP_TESTINFRA_TARGET_HOST, which is used in the
`testinfra/conftest.py` config file to import YAML vars files specific
to that host.
Enables per-host variable customization via exporting to the `pytest`
namespace. Reads in YAML files based on hostname, and makes the vars
declared there available to all tests that want them.

Requires an environment variable to be exported in the testinfra wrapper
script, which is an acceptable compromise to DRY out the tests.
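A minimal sketch of what `testinfra/conftest.py` does, assuming the vars
files sit next to it in a `vars/` directory (the directory name and the
`securedrop_test_vars` attribute are assumptions). Note the
`pytest_namespace` hook was current in 2017 but has since been removed from
pytest:

```python
import os

import pytest
import yaml


def securedrop_import_testinfra_vars(hostname):
    """Load the YAML vars file matching the target host's name."""
    filepath = os.path.join(os.path.dirname(__file__), "vars", hostname + ".yml")
    with open(filepath) as f:
        return yaml.safe_load(f)


def pytest_namespace():
    # The wrapper script exports SECUREDROP_TESTINFRA_TARGET_HOST; the vars
    # then become available to every test as pytest.securedrop_test_vars.
    hostname = os.environ["SECUREDROP_TESTINFRA_TARGET_HOST"]
    return {"securedrop_test_vars": securedrop_import_testinfra_vars(hostname)}
```
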
Allows importing custom vars, e.g. filepaths for app code docroots,
on a per-host basis for use in config test suite.
Importing from YAML vars files to customize the test inputs, so the same
testing logic can be reused across multiple hosts.
Reading from the per-host YAML vars files for development and
app-staging machines. Allows reuse of the tests with different checks,
making the entire test suite more DRY.
@conorsch
Contributor Author

@redshiftzero Thanks for the detailed review! You're right, the docs should provide better guidance on creating and provisioning the VMs beforehand—the old serverspec docs did so, and that language should not have been removed. Will re-add.

Regarding the dev error, that sounds like a provider-specific issue, e.g. libvirt vs. virtualbox. There's no need to track that explicit mode on the dev VM, so I'll simply remove that check, because the failure is not helpful to developers.

Regarding the staging failures, that sounds like the VMs were not rebooted prior to running the tests, since nearly all the networking checks fail. We removed the automatic reboot at the end of the playbooks from the staging VMs, simply to get them up and running faster, but I'm not convinced that was the right approach. (Prod VMs always reboot at the end of a playbook run.) For now, I'll document the need to reboot the hosts in the developer docs, but we should consider enforcing the reboot in staging, same as prod.

@redshiftzero
Contributor

redshiftzero commented Mar 15, 2017

Ah, makes sense. Adding a note to the docs works 👍. To confirm, I rebooted both staging VMs and reran the testinfra suites, which reduced the failures to three. Remaining failures on app-staging are here and for mon-staging are here.

It looks like two of the failures are due to the hardcoded iptables rulesets in the tests - I seem to have different uids on debian-tor on both VMs and for postfix on mon-staging:

vagrant@mon-staging:~$ id -u postfix
107
vagrant@mon-staging:~$ id -u debian-tor
106
vagrant@app-staging:~$ id -u redis
107
vagrant@app-staging:~$ id -u debian-tor
106

If you look at the line by line difference in the iptables rules in the links above you'll see that the differing uids are the only issue, so if this is resolved these two iptables tests should pass.

The old Serverspec scaffolding logic automatically created the VMs prior
to testing, but the Testinfra setup does not. Removed statements about
the VMs automatically being created.

Also adds a note stating that the staging VMs must be rebooted prior to
running the config tests. This is because we've removed the automatic
reboot at the end of the staging playbook, simply to get those machines
up and running faster. (Perhaps we should re-add it.) The prod playbook
retains the reboot-after-run logic.
@conorsch
Contributor Author

It looks like two of the failures are due to the hardcoded iptables rulesets in the tests - I seem to have different uids on debian-tor on both VMs and for postfix on mon-staging:

Drat, tried to KISS on those. Fortunately @msheiny wrote logic to infer those values dynamically, same as we were doing in the Serverspec tests, so I'll remove the skip markers on those and iron them out.

Conor Schaefer added 3 commits March 15, 2017 11:47
The tests were reporting failures on the HidServAuth token not matching
the specified regex, which was overly strict and didn't include the `+`
character in the set of allowed chars. Let's permit any characters and
validate the length of the token, as well as its position within the
entire HidServAuth declaration in the `hostname` file for Tor
services.
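A loose sketch of the relaxed check (the service path and the surrounding
pattern are assumptions; only the 22-character token length and its
placement are actually pinned):

```python
import re

def test_tor_auth_token(host):
    # Accept any token characters (including '+'), but require the expected
    # length and position within the declaration.
    hostname_file = host.file("/var/lib/tor/services/ssh/hostname")  # assumed path
    pattern = r"^[a-z2-7]{16}\.onion \S{22} # client: \w+$"
    assert re.search(pattern, hostname_file.content_string, re.M)
```
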
The Jinja logic was being skipped pending a CI-compatible change, but we
need to look up the uids and gids dynamically, since they're not
consistent across builds. Fortunately @msheiny already wrote the logic,
so this change just removes the "skip" marker on the test and deletes
the YAML vars in favor of the Jinja template.
Using the Jinja-based templating logic, same as mon-staging, to look up
the necessary uids and gids dynamically, for interpolation in the
iptables rulesets. As described in the developer docs, these config
tests focus on testing implementation, rather than behavior, but they're
a good start toward the testing baseline we want to deliver.
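The effect, reduced to a sketch (the real change interpolates the uid via
Jinja into the per-host vars files rather than inside the test itself):

```python
def test_tor_uid_in_iptables_rules(host):
    # Look the uid up on the host under test instead of hardcoding it,
    # since it differs between builds.
    tor_uid = host.run("id -u debian-tor").stdout.strip()
    with host.sudo():
        rules = host.run("iptables -S").stdout
    assert "--uid-owner {}".format(tor_uid) in rules
```
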
@conorsch
Contributor Author

@redshiftzero Updated the networking tests to include dynamic vars for UIDs and GIDs. All tests should now pass for you on development, app-staging, and mon-staging. Let me know if you can get any to fail!

Conor Schaefer added 2 commits March 16, 2017 10:45
Different virtualization providers, e.g. VirtualBox versus libvirt, may
mount the shared directory with slightly different umasks. Let's not
police that, since the important thing is the ownership in the
development environment and the fact that the app code is present and
its associated services are active.
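In other words, the check is reduced to something like this sketch (the
path and owner are assumptions about the dev VM layout):

```python
def test_app_code_present(host):
    # Check presence and ownership, but not the mode, since VirtualBox and
    # libvirt mount the /vagrant share with different umasks.
    code_dir = host.file("/vagrant/securedrop")
    assert code_dir.is_directory
    assert code_dir.user == "vagrant"
```
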
We require the tmux package to be installed, since it's used in the user
environment to provide resilience in the event that an interactive SSH
session is disconnected prematurely (as sometimes happens with Tor).

PR #1623 updates the Ansible config to ensure tmux is present, and this
test will track regressions in case it ever disappears from the base
image.
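The check itself is essentially a one-liner, along these lines:

```python
def test_tmux_installed(host):
    # Guards against tmux disappearing from the base image; PR #1623 makes
    # the Ansible config install it.
    assert host.package("tmux").is_installed
```
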
@msheiny
Contributor

msheiny commented Mar 16, 2017

Okay running through now...

  • Development
  • app-staging - all passing post-reboot
  • mon-staging - all passing post-reboot

@msheiny
Contributor

msheiny commented Mar 16, 2017

assert '** No agent available.' == 'app-staging-10.0.1.2 is available.'

I think this was failing last week too when I wrote the test... I'm still waiting for my staging environment to come up to confirm it's still happening.

@msheiny
Contributor

msheiny commented Mar 16, 2017

Okay, I got everything passing on my end under libvirt. Had to reboot the staging boxes pre-test, as @conorsch recommended.

@conorsch
Contributor Author

Thanks @msheiny! Going in.
