Backported changes for 0.5.2 bugfix release #2940

redshiftzero · 2018-01-26T21:27:13Z

Status

Ready for review

Description of Changes

These are the changes for the 0.5.2 release (no rc commit yet). I took a slightly different approach than used in the past for bugfix releases based on some useful suggestions from @conorsch and @dachary. The approach I took was to first cherry-pick -x from individual PRs into backport branches named backport-pr-<PR number>, then I merged these PRs into this branch. A nice feature of this is that we can easily see which cherry picked commits came from which PR in the git history. Note however that there's one PR I didn't create a backport branch for... PR #2922, which consists of two small commits on top of modifications in other backported changes, so cherry picking directly into this branch seemed like the cleanest approach.

Comments welcome especially if you think this is a weird thing to do, happy to change it up for next time

Testing

Does Circle CI pass?
Are there other changes you feel should be brought in for 0.5.2?

Deployment

QA will discover this

Checklist

If you made changes to the app code:

Unit and functional tests pass on the development VM

If you made changes to the system configuration:

Configuration tests pass

Add ar, it_IT, tr, zh_Hant to the YAML prompt used by securedrop-admin sdconfig (cherry picked from commit 1c7002e)

(cherry picked from commit 81f4c90)

@msheiny

WIP committed by @msheiny during collaboration. Branch adopted by @conorsch for tackling #2478. (cherry picked from commit 41d9ddf)

Previously we configured both postfix and procmail inside the "ossec-server" role, which was essentially shoehorning the config. It's appropriate to target the "securedrop_monitor_server" group with those config items, but technically there's not literally OSSEC server, but rather a separate service that deserves its own configuration logic. Tidies up some of the vars. We've documented use of the `ossec_from_address` publicly, so we can't simply drop reference to it. Since the associated logic now resides in the "postfix" role, it should be prefixed with the `postfix_` namespace. That's done, old values of `ossec_from_address` set at the site level, if set, will be honored. (cherry picked from commit 6aaf88a)

The old logic was written back in the Ansible v1 days, and was never pretty to look at. As of Ansible v2 we should prefer use of "become" and "become_user", since the "sudo" calls have been deprecated. Implements a portion of the conversion described in #2742. (cherry picked from commit d1b0c31)

@msheiny

In order to support cross-host variable delegation, we'll need the play to target *both* the Application and Monitor servers. We can then dynamically open ports on the firewall as necessary in order to support registration. Reuses @msheiny's `iptables` Ansible module implementation, which is remarkably clean. Love the state=absent functionality. Rather than use `delegate_to`, we target both hosts and use the boolean vars `ossec_is_server` and `ossec_is_client` (both defaulting to False) in order to determine which host the task should execute on. For some tasks, we want both hosts to execute, thus the combined play target. Uses a rather gnarly hostvars-based retrieval to map a registered var result across both hosts, ensuring the same value is accessible to both play hosts, to coordinate the firewall rule management. Left an in-line comment to guide future maintainers. Removes most tags because we can better handle tasks at the import task level now, which was difficult to do before because the logic was so spaghetti. (cherry picked from commit 58b29c8)

BEGINS combined ossec role FIX: begins ossec common role REMOVES duplicated default ossec vars files PARITAL: ports ossec-agent tasks to common role MOVES registration flow into common role MOVES testing pubkeys for ossec server role MOVES ossec-server config tasks into common role MOVES send0encrypted-alarms tempalte MOVES (and ports) handlers into ossec comon role FIX MOVES ossec-server-config-tasks There's a functional change sneakily hiding in here: we had a double conditional on the SSL key creation tasks, when really we only want to skip creation if the relevant files exist. Prunes down the OSSEC-related handlers. We don't really need to fuss over whether the OSSEC agent or server is restarted: if inter-machine authentication changed, then we want to bounce *both* services. Fortunately we can do so by restarting the "ossec" service on both hosts, regardless of whether it's OSSEC server or client. (cherry picked from commit c4a1178)

We've consolidated the "ossec-agent" and "ossec-server" roles into "ossec", and excised the postfix logic from the "ossec-server" role into a discrete "postfix" role. Playbooks must have these updated accordingly. We're nearly at the point where we can standardize on a single playbook for prod and staging. Onward! (cherry picked from commit 77af2f2)

The previous iptables rules were swapped and caused the port to be blocked during registration. (cherry picked from commit 4c012fa)

Without this explicitly set, ansible will append the iptables to the bottom of the chain. Since we have some explicit DROP lines at the bottom of the INPUT chain, this means that the exception gets added AFTER a DROP line. Meaning... that the line basically never got evaluated. This commit resolves that issue by always inserting at the top of the chain (both INPUT and OUTPUT). There is a following clean-up task that will remove this afterwards as usual. (cherry picked from commit 345aea4)

The Postfix logic assumes that the "ossec" user and group will exist, and sets logfiles accordingly. Therefore let's make sure that Postfix is configured *after* OSSEC. The cleaner approach would be to set a `postfix_user` var and set the default value to "ossec", but rather than add additional logic into our config to solve an ordering problem, I'd rather focus on making what we have work—and the current change set is focused on resolving the ossec registration logic—and then deferring to community-maintained roles for common services, e.g. Postfix, which would allow overriding via vars. (cherry picked from commit 5493923)

Current logic doesnt always work as intended since a machine's agent name won't match up exactly if it's been registered multiple times. Instead, lets just look for the IP address and the `is available` string. (cherry picked from commit 963cb6b)

If the agent connection isnt working, lets go ahead and purge all existing agents from the app and monitor servers prior to re-registration. This helps shake out a bunch of weird connection issues that popped up during QA. (cherry picked from commit 99a538b)

Since the monitor server should only have one agent connected, multiple agents showing up is a sign of problems. In that scenario, lets force the re-register and cleanup logic. (cherry picked from commit 97e1b1a)

Partial reversion of 3da5605, in which several config tests were intentionally disabled for CI optimization. Now that we've resolved the unreliable OSSEC registration flow, the strict "xfail" on the OSSEC connectivity test was reporting failure: because it was unexpectedly passing! Renabled the OSSEC test, but left the other changes from 3da5605 in place. We'll get to those in their own time. (cherry picked from commit 4d3e94c)

In typical TDD fashion, let's create a failing test and then get it to pass. The test is currently failing, which is good! We'll want to update *all* SecureDrop-related playbooks to contain max_fail_percentage=0. (cherry picked from commit 61672a9)

Target name is intentionally a bit verbose in order to discrimate from "ansible-lint", which is a dedicated tool for linting Ansible playbooks and the roles called therefrom, ensuring adherence to best practices. That's a separate tool, and one we use elsewhere, whereas the new test suite being introduced ensures SecureDrop-specific config choices that we've added in this repository. Hooked up the new test to the general "make lint" target, as well. The new tests are *fast*, so no concerns with CI delays. (cherry picked from commit 1ebd8a8)

The primary concern is for the "securedrop-prod.yml" playbook, which is what Admins use to configure a production SecureDrop instance. In that playbook, any error encountered during provisioning should cause an immediate failure, aborting execution and reporting an error message for review. If the max_fail_percentage is *not* set to 0, debugging errors becomes challenging, often involving lots of scrollback to piece together what specifically went wrong. In order to provide a sane strategy for testing adherence to this requirement, we'll test all playbooks (defined as "YAML file in ansible-base directory, and not the old prod-specific.yml vars file") and ensure the option is set on each and every play. That's a bit aggressive, but the tests are extremely fast, and will likely pay off as we bring new contributors into the project who may not be familiar with this rather esoteric Ansible option. (cherry picked from commit 1421b40)

Also removes associated "upgrade" role, which was designed for the migration to SecureDrop 0.3, released on 2015-02-12. These files have not been used since then, and so should be removed from the repository. As a side note, these files were first written, and last touched, by the late James Dolan. RIP, James. I cannot express what a profound pleasure it was to work with you. You are missed, dearly. (cherry picked from commit 077002b)

@trishnaguha

Based on feedback from @trishnaguha, we've determined we need `any_errors_fatal=yes` in addition to `max_fail_percentage` to ensure fast fail behavior from Ansible. Tests have been updated, will modify playbooks subsequently. (cherry picked from commit a04fe03)

We also want `any_errors_fatal=yes` on all playbooks in the repository; simply using `max_fail_percentage` is insufficient. Tests have already been updated to track this change, and manual testing confirms the "fail fast" behavior desired for minimizing Admin frustration. (cherry picked from commit 60358b5)

… TEMPORARY until Ansible in `securedrop/requirements/ansible.in` is upgraded to 2.4 (cherry picked from commit 8d7345b)

redshiftzero · 2018-01-26T22:03:29Z

Ah right. Builds failing due to #2884, backporting changes from #2886...

ghost · 2018-01-26T22:08:45Z

@redshiftzero This looks really good (except for the angry CI but that's a different story ;-). No conflicts at all ?

redshiftzero · 2018-01-26T22:10:39Z

Nope no conflicts - it all went very smoothly. Now to appease CI 😇

The gettext 0.19.* has a lot of useful features compared to gettext 0.18.*, including support for .desktop files. Previously we used Zesty to install gettext 0.19.*, but it is now EOL. Instead, we can install Xenial to install a modern gettext. (cherry picked from commit 985146e) Conflicts: securedrop/Dockerfile Resolved by favoring syntax used on master - simply replaced 'zesty' with 'xenial'

(cherry picked from commit bd48593)

(cherry picked from commit 27eeb61)

Currently the cleanup and merging (combining app and testinfra results) of junit files is attached to the testinfra runner script. This is problematic because if the testinfra process dies unexpectedly, we wont have any results from the app tests as well. (cherry picked from commit 323d932)

Ensure developers are able to access raw test runner results for debugging. (cherry picked from commit d50519f)

In english, this should make it much simpler for a developer to get status in CircleCI when either the testinfra or application tests fail at an execution level. Previously this data was hard to parse and/or required manually reading thru an artifact. Fixes #2800 (cherry picked from commit 39f2d82)

forgot about this scenario in my previous commit :| (cherry picked from commit 5e17d18)

I suspect that circleci wants an <error> tag instead of a <failure> one to properly register a test as failing in the UI. We shall see. (cherry picked from commit f7ee43f)

(cherry picked from commit 3e3e417)

This is failing on the parsing script in CI. Needs to be investigated and tweaked further before re-introduction. (cherry picked from commit 565b20b)

Use Xenial to install gettext 0.19.* (cherry picked from commit cd608b8) Conflicts: securedrop/Dockerfile Favored master in conflict resolution

Conflicts: .circleci/config.yml Conflict due to one of the CI jobs being removed - resolved by modifying pip safety check in the existing CI job.

The tests weren't flake8 compliant, which CI did not catch. There were two separate tests identically named, so one got clobbered and effectively we weren't testing both required attributes. The test file now *does* pass flake8, which I confirmed via manual invocation against it. Opened a separate issue, #2933, to track improving CI to avoid this problem in the future. (cherry picked from commit 69668db)

Two options are required to ensure fail-fast behavior from Ansible: * max_fail_percentage=0 * any_errors_fatal=yes See #2885 for details. During merge of #2922, due to closely related changes to playbook "play" blocks in #2748, the options were removed as they landed in the "develop" branch. Here they are re-added, and the `make ansible-config-lint` target is happy again. (cherry picked from commit 91dbc08)

codecov-io · 2018-01-27T01:32:58Z

Codecov Report

❗ No coverage uploaded for pull request base (release/0.5.2@0648361). Click here to learn what that means.
The diff coverage is n/a.

@@               Coverage Diff                @@
##             release/0.5.2    #2940   +/-   ##
================================================
  Coverage                 ?   85.52%           
================================================
  Files                    ?       31           
  Lines                    ?     1914           
  Branches                 ?      213           
================================================
  Hits                     ?     1637           
  Misses                   ?      228           
  Partials                 ?       49

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 0648361...ac50052. Read the comment docs.

redshiftzero · 2018-01-27T01:39:03Z

Yay backporting #2796 did the trick. Doing the merges this time I did encounter a couple of minor conflicts - where this happened (.circleci/config.yml and securedrop/Dockerfile) I've noted in the commit messages how they were resolved.

ghost · 2018-01-27T10:20:16Z

The GitHub display is inconvenient for review. Fortunately your consistent branch naming allows for a onliner to help review.

git --no-pager log --format='%H %s' --graph origin/master..origin/backported-changes-for-0.5.2 | perl -p -e 's/"/ /g; if (/\w+\s+Merge branch .backport-pr-(\d+). into backported-changes-for-.*/) { s|\w+\s+Merge branch .backport-pr-(\d+). into backported-changes-for-.*|[Pull request $1](https://github.com/freedomofpress/securedrop/pull/$1)|; } else { s|(\w+)\s+(.*)|[$2](https://github.com/freedomofpress/securedrop/commit/$1)|;} s/^\|\\ $//; s/^\|/ /;'

ghost · 2018-01-27T10:44:17Z

How is Use Xenial to install gettext 0.19.* different from the original commit ?

commit=4227c99cb02ce31a16b9924c1f29d57036c86e5f ; picked_from=$(git show --no-patch --pretty=%b $commit  | perl -ne 'print if(s/.*cherry pick.* from commit (\w+).*/$1/)') ; diff -u --ignore-matching-lines '^[^+-]'   <(git show $picked_from) <(git show $commit)
--- /dev/fd/63	2018-01-27 11:42:08.348420006 +0100
+++ /dev/fd/62	2018-01-27 11:42:08.340419907 +0100
@@ -40,19 +47,19 @@
      - apt
 -
 diff --git a/securedrop/Dockerfile b/securedrop/Dockerfile
-index ac875036..340bd96e 100644
+index d579da5e..909b4034 100644
 --- a/securedrop/Dockerfile
 +++ b/securedrop/Dockerfile
-@@ -19,10 +19,10 @@ RUN curl -LO https://launchpad.net/~ubuntu-mozilla-security/+archive/ubuntu/ppa/
+@@ -23,10 +23,10 @@ RUN curl -LO https://launchpad.net/~ubuntu-mozilla-security/+archive/ubuntu/ppa/
  #
  # This can be removed when upgrading to something more recent than trusty
  #
--RUN echo deb http://archive.ubuntu.com/ubuntu/ zesty main > /etc/apt/sources.list.d/zesty.list && \
-+RUN echo deb http://archive.ubuntu.com/ubuntu/ xenial main > /etc/apt/sources.list.d/xenial.list && \
-     apt-get update && \
-     apt-get install -y gettext && \
--    rm /etc/apt/sources.list.d/zesty.list && \
-+    rm /etc/apt/sources.list.d/xenial.list && \
-     apt-get update
+-RUN echo deb http://archive.ubuntu.com/ubuntu/ zesty main > /etc/apt/sources.list.d/zesty.list
++RUN echo deb http://archive.ubuntu.com/ubuntu/ xenial main > /etc/apt/sources.list.d/xenial.list
+ RUN apt-get update
+ RUN apt-get install -y gettext
+-RUN rm /etc/apt/sources.list.d/zesty.list
++RUN rm /etc/apt/sources.list.d/xenial.list
+ RUN apt-get update
  
  COPY requirements requirements

ghost · 2018-01-27T10:52:03Z

I think backporting #2822 would avoid the two conflicts. However I also don't think it's worth the effort. This only impacts a developer use case that is not even mainline.

ghost · 2018-01-27T10:57:52Z

Verified this branch is based on master:

$ git --no-pager log --oneline origin/master..origin/backported-changes-for-0.5.2 | wc -l
43
$ git --no-pager log --oneline origin/master..origin/release/0.5.2
$

ghost

All is well, good for 0.5.2 Q/A !

ghost · 2018-01-27T11:00:25Z

I'm merging this even though I'm not 100% sure these are all that's needed for 0.5.2. If more PRs need backporting, they can be added later.

Loic Dachary and others added 22 commits January 26, 2018 11:06

i18n: add new languages securedrop-confiure.yml

9f5e80f

Add ar, it_IT, tr, zh_Hant to the YAML prompt used by securedrop-admin sdconfig (cherry picked from commit 1c7002e)

Replace aging PyCrypto library with PyCryptodome

84f55d1

(cherry picked from commit 81f4c90)

Partial implementation of OSSEC iptables fix

7e31b60

WIP committed by @msheiny during collaboration. Branch adopted by @conorsch for tackling #2478. (cherry picked from commit 41d9ddf)

Add fix for iptables OSSEC monitor temporary rules

dc6c963

The previous iptables rules were swapped and caused the port to be blocked during registration. (cherry picked from commit 4c012fa)

Clean-up systems with multiple agents

0ae8af1

Since the monitor server should only have one agent connected, multiple agents showing up is a sign of problems. In that scenario, lets force the re-register and cleanup logic. (cherry picked from commit 97e1b1a)

Closes #2927, do not fail CI on vulnerable package :(. This change is…

314a773

… TEMPORARY until Ansible in `securedrop/requirements/ansible.in` is upgraded to 2.4 (cherry picked from commit 8d7345b)

redshiftzero requested a review from conorsch as a code owner January 26, 2018 21:27

redshiftzero requested a review from a user January 26, 2018 21:27

redshiftzero requested review from heartsucker and msheiny as code owners January 26, 2018 21:27

redshiftzero changed the title ~~Backported changes for 0.5.2 bugfix release~~ [wip] Backported changes for 0.5.2 bugfix release Jan 26, 2018

redshiftzero and others added 18 commits January 26, 2018 15:53

Remove Travis CI and run pages-layout tests on Circle CI

55dfaa5

(cherry picked from commit bd48593)

Docs: Remove Travis CI from developer documentation.

7130593

(cherry picked from commit 27eeb61)

Dump raw application and testinfra output in CI

652b274

Ensure developers are able to access raw test runner results for debugging. (cherry picked from commit d50519f)

CI - also dump raw artifacts in test-with-rebase

bdf075d

forgot about this scenario in my previous commit :| (cherry picked from commit 5e17d18)

CI - Potential fix for Circle to parse errors

d2e1ed4

I suspect that circleci wants an <error> tag instead of a <failure> one to properly register a test as failing in the UI. We shall see. (cherry picked from commit f7ee43f)

Fix syntax on application tests pytest command

a5272d0

(cherry picked from commit 3e3e417)

CI - Temporarily skip junit for test script calls

afa4a29

This is failing on the parsing script in CI. Needs to be investigated and tweaked further before re-introduction. (cherry picked from commit 565b20b)

Merge pull request #2886 from freedomofpress/use-xenial-gettext

c4114d5

Use Xenial to install gettext 0.19.* (cherry picked from commit cd608b8) Conflicts: securedrop/Dockerfile Favored master in conflict resolution

Merge branch 'backport-pr-2796' into backported-changes-for-0.5.2

f51b764

Merge branch 'backport-pr-2903' into backported-changes-for-0.5.2

5bfbb85

Merge branch 'backport-pr-2748' into backported-changes-for-0.5.2

4783065

Merge branch 'backport-pr-2922' into backported-changes-for-0.5.2

302a72e

Merge branch 'backport-pr-2928' into backported-changes-for-0.5.2

3d34c37

Conflicts: .circleci/config.yml Conflict due to one of the CI jobs being removed - resolved by modifying pip safety check in the existing CI job.

redshiftzero force-pushed the backported-changes-for-0.5.2 branch from a4a65f5 to ac50052 Compare January 27, 2018 00:55

redshiftzero changed the title ~~[wip] Backported changes for 0.5.2 bugfix release~~ Backported changes for 0.5.2 bugfix release Jan 27, 2018

ghost added this to the 0.5.2 milestone Jan 27, 2018

ghost approved these changes Jan 27, 2018

View reviewed changes

ghost merged commit eeacc1d into release/0.5.2 Jan 27, 2018

This pull request was closed.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Backported changes for 0.5.2 bugfix release #2940

Backported changes for 0.5.2 bugfix release #2940

redshiftzero commented Jan 26, 2018 •

edited by ghost

Loading

redshiftzero commented Jan 26, 2018

ghost commented Jan 26, 2018

redshiftzero commented Jan 26, 2018

codecov-io commented Jan 27, 2018 •

edited

Loading

redshiftzero commented Jan 27, 2018

ghost commented Jan 27, 2018 •

edited by ghost

Loading

ghost commented Jan 27, 2018

ghost commented Jan 27, 2018 •

edited by ghost

Loading

ghost commented Jan 27, 2018

ghost left a comment

ghost commented Jan 27, 2018

Backported changes for 0.5.2 bugfix release #2940

Backported changes for 0.5.2 bugfix release #2940

Conversation

redshiftzero commented Jan 26, 2018 • edited by ghost Loading

Status

Description of Changes

Testing

Deployment

Checklist

If you made changes to the app code:

If you made changes to the system configuration:

redshiftzero commented Jan 26, 2018

ghost commented Jan 26, 2018

redshiftzero commented Jan 26, 2018

codecov-io commented Jan 27, 2018 • edited Loading

Codecov Report

redshiftzero commented Jan 27, 2018

ghost commented Jan 27, 2018 • edited by ghost Loading

ghost commented Jan 27, 2018

ghost commented Jan 27, 2018 • edited by ghost Loading

ghost commented Jan 27, 2018

ghost left a comment

Choose a reason for hiding this comment

ghost commented Jan 27, 2018

redshiftzero commented Jan 26, 2018 •

edited by ghost

Loading

codecov-io commented Jan 27, 2018 •

edited

Loading

ghost commented Jan 27, 2018 •

edited by ghost

Loading

ghost commented Jan 27, 2018 •

edited by ghost

Loading