Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Backported changes for 0.5.2 bugfix release #2940

Merged
43 commits merged into from
Jan 27, 2018

Conversation

redshiftzero
Copy link
Contributor

@redshiftzero redshiftzero commented Jan 26, 2018

Status

Ready for review

Description of Changes

These are the changes for the 0.5.2 release (no rc commit yet). I took a slightly different approach than used in the past for bugfix releases based on some useful suggestions from @conorsch and @dachary. The approach I took was to first cherry-pick -x from individual PRs into backport branches named backport-pr-<PR number>, then I merged these PRs into this branch. A nice feature of this is that we can easily see which cherry picked commits came from which PR in the git history. Note however that there's one PR I didn't create a backport branch for... PR #2922, which consists of two small commits on top of modifications in other backported changes, so cherry picking directly into this branch seemed like the cleanest approach.

Comments welcome especially if you think this is a weird thing to do, happy to change it up for next time

Testing

Does Circle CI pass?
Are there other changes you feel should be brought in for 0.5.2?

Deployment

QA will discover this

Checklist

If you made changes to the app code:

  • Unit and functional tests pass on the development VM

If you made changes to the system configuration:

Loic Dachary and others added 22 commits January 26, 2018 11:06
Add ar, it_IT, tr, zh_Hant to the YAML prompt used by
securedrop-admin sdconfig

(cherry picked from commit 1c7002e)
WIP committed by @msheiny during collaboration. Branch adopted by
@conorsch for tackling #2478.

(cherry picked from commit 41d9ddf)
Previously we configured both postfix and procmail inside the
"ossec-server" role, which was essentially shoehorning the config. It's
appropriate to target the "securedrop_monitor_server" group with those
config items, but technically there's not literally OSSEC server, but
rather a separate service that deserves its own configuration logic.

Tidies up some of the vars. We've documented use of the
`ossec_from_address` publicly, so we can't simply drop reference to it.
Since the associated logic now resides in the "postfix" role, it should
be prefixed with the `postfix_` namespace. That's done, old values of
`ossec_from_address` set at the site level, if set, will be honored.

(cherry picked from commit 6aaf88a)
The old logic was written back in the Ansible v1 days, and was never
pretty to look at. As of Ansible v2 we should prefer use of "become" and
"become_user", since the "sudo" calls have been deprecated.

Implements a portion of the conversion described in #2742.

(cherry picked from commit d1b0c31)
In order to support cross-host variable delegation, we'll need the play
to target *both* the Application and Monitor servers. We can then
dynamically open ports on the firewall as necessary in order to support
registration.

Reuses @msheiny's `iptables` Ansible module implementation, which is
remarkably clean. Love the state=absent functionality.

Rather than use `delegate_to`, we target both hosts and use the boolean
vars `ossec_is_server` and `ossec_is_client` (both defaulting to False)
in order to determine which host the task should execute on. For some
tasks, we want both hosts to execute, thus the combined play target.

Uses a rather gnarly hostvars-based retrieval to map a registered var
result across both hosts, ensuring the same value is accessible to both
play hosts, to coordinate the firewall rule management. Left an in-line
comment to guide future maintainers.

Removes most tags because we can better handle tasks at the import task
level now, which was difficult to do before because the logic was so
spaghetti.

(cherry picked from commit 58b29c8)
BEGINS combined ossec role
FIX: begins ossec common role
REMOVES duplicated default ossec vars files
PARITAL: ports ossec-agent tasks to common role
MOVES registration flow into common role
MOVES testing pubkeys for ossec server role
MOVES ossec-server config tasks into common role
MOVES send0encrypted-alarms tempalte
MOVES (and ports) handlers into ossec comon role
FIX MOVES ossec-server-config-tasks

There's a functional change sneakily hiding in here: we had a double
conditional on the SSL key creation tasks, when really we only want to
skip creation if the relevant files exist.

Prunes down the OSSEC-related handlers. We don't really need to fuss
over whether the OSSEC agent or server is restarted: if inter-machine
authentication changed, then we want to bounce *both* services.
Fortunately we can do so by restarting the "ossec" service on both
hosts, regardless of whether it's OSSEC server or client.

(cherry picked from commit c4a1178)
We've consolidated the "ossec-agent" and "ossec-server" roles into
"ossec", and excised the postfix logic from the "ossec-server" role into
a discrete "postfix" role. Playbooks must have these updated
accordingly.

We're nearly at the point where we can standardize on a single playbook
for prod and staging. Onward!

(cherry picked from commit 77af2f2)
The previous iptables rules were swapped and caused the port to be
blocked during registration.

(cherry picked from commit 4c012fa)
Without this explicitly set, ansible will append the iptables to the
bottom of the chain. Since we have some explicit DROP lines at the
bottom of the INPUT chain, this means that the exception gets added
AFTER a DROP line. Meaning... that the line basically never got
evaluated.

This commit resolves that issue by always inserting at the top of the
chain (both INPUT and OUTPUT). There is a following clean-up task that
will remove this afterwards as usual.

(cherry picked from commit 345aea4)
The Postfix logic assumes that the "ossec" user and group will exist,
and sets logfiles accordingly. Therefore let's make sure that Postfix is
configured *after* OSSEC.

The cleaner approach would be to set a `postfix_user` var and set the
default value to "ossec", but rather than add additional logic into our
config to solve an ordering problem, I'd rather focus on making what we
have work—and the current change set is focused on resolving the ossec
registration logic—and then deferring to community-maintained roles for
common services, e.g. Postfix, which would allow overriding via vars.

(cherry picked from commit 5493923)
Current logic doesnt always work as intended since a machine's agent
name won't match up exactly if it's been registered multiple times.
Instead, lets just look for the IP address and the `is available`
string.

(cherry picked from commit 963cb6b)
If the agent connection isnt working, lets go ahead and purge all
existing agents from the app and monitor servers prior to
re-registration. This helps shake out a bunch of weird connection issues
that popped up during QA.

(cherry picked from commit 99a538b)
Since the monitor server should only have one agent connected, multiple
agents showing up is a sign of problems. In that scenario, lets force
the re-register and cleanup logic.

(cherry picked from commit 97e1b1a)
Partial reversion of 3da5605, in which several config tests were
intentionally disabled for CI optimization. Now that we've resolved the
unreliable OSSEC registration flow, the strict "xfail" on the OSSEC
connectivity test was reporting failure: because it was unexpectedly
passing!

Renabled the OSSEC test, but left the other changes from 3da5605
in place. We'll get to those in their own time.

(cherry picked from commit 4d3e94c)
In typical TDD fashion, let's create a failing test and then get it to
pass. The test is currently failing, which is good! We'll want to update
*all* SecureDrop-related playbooks to contain max_fail_percentage=0.

(cherry picked from commit 61672a9)
Target name is intentionally a bit verbose in order to discrimate from
"ansible-lint", which is a dedicated tool for linting Ansible playbooks
and the roles called therefrom, ensuring adherence to best practices.
That's a separate tool, and one we use elsewhere, whereas the new test
suite being introduced ensures SecureDrop-specific config choices
that we've added in this repository.

Hooked up the new test to the general "make lint" target, as well. The
new tests are *fast*, so no concerns with CI delays.

(cherry picked from commit 1ebd8a8)
The primary concern is for the "securedrop-prod.yml" playbook, which is
what Admins use to configure a production SecureDrop instance. In that
playbook, any error encountered during provisioning should cause an
immediate failure, aborting execution and reporting an error message for
review.

If the max_fail_percentage is *not* set to 0, debugging errors becomes
challenging, often involving lots of scrollback to piece together what
specifically went wrong.

In order to provide a sane strategy for testing adherence to this
requirement, we'll test all playbooks (defined as "YAML file in
ansible-base directory, and not the old prod-specific.yml vars file")
and ensure the option is set on each and every play. That's a bit
aggressive, but the tests are extremely fast, and will likely pay off
as we bring new contributors into the project who may not be familiar
with this rather esoteric Ansible option.

(cherry picked from commit 1421b40)
Also removes associated "upgrade" role, which was designed for the
migration to SecureDrop 0.3, released on 2015-02-12. These files
have not been used since then, and so should be removed from the
repository.

As a side note, these files were first written, and last touched, by the
late James Dolan. RIP, James. I cannot express what a profound pleasure
it was to work with you. You are missed, dearly.

(cherry picked from commit 077002b)
Based on feedback from @trishnaguha, we've determined we need
`any_errors_fatal=yes` in addition to `max_fail_percentage` to ensure
fast fail behavior from Ansible. Tests have been updated, will modify
playbooks subsequently.

(cherry picked from commit a04fe03)
We also want `any_errors_fatal=yes` on all playbooks in the repository; simply
using `max_fail_percentage` is insufficient. Tests have already been updated
to track this change, and manual testing confirms the "fail fast" behavior
desired for minimizing Admin frustration.

(cherry picked from commit 60358b5)
… TEMPORARY until Ansible in `securedrop/requirements/ansible.in` is upgraded to 2.4

(cherry picked from commit 8d7345b)
@redshiftzero redshiftzero requested a review from a user January 26, 2018 21:27
@redshiftzero
Copy link
Contributor Author

Ah right. Builds failing due to #2884, backporting changes from #2886...

@redshiftzero redshiftzero changed the title Backported changes for 0.5.2 bugfix release [wip] Backported changes for 0.5.2 bugfix release Jan 26, 2018
@ghost
Copy link

ghost commented Jan 26, 2018

@redshiftzero This looks really good (except for the angry CI but that's a different story ;-). No conflicts at all ?

@redshiftzero
Copy link
Contributor Author

Nope no conflicts - it all went very smoothly. Now to appease CI 😇

redshiftzero and others added 18 commits January 26, 2018 15:53
The gettext 0.19.* has a lot of useful features compared to gettext 0.18.*,
including support for .desktop files. Previously we used Zesty to install
gettext 0.19.*, but it is now EOL. Instead, we can install Xenial to install
a modern gettext.

(cherry picked from commit 985146e)

Conflicts:
	securedrop/Dockerfile

Resolved by favoring syntax used on master - simply replaced 'zesty' with 'xenial'
Currently the cleanup and merging (combining app and testinfra results)
of junit files is attached to the testinfra runner script. This is
problematic because if the testinfra process dies unexpectedly, we wont
have any results from the app tests as well.

(cherry picked from commit 323d932)
Ensure developers are able to access raw test runner results for
debugging.

(cherry picked from commit d50519f)
In english, this should make it much simpler for a developer to get
status in CircleCI when either the testinfra or application tests fail
at an execution level. Previously this data was hard to parse and/or
required manually reading thru an artifact.

Fixes #2800

(cherry picked from commit 39f2d82)
forgot about this scenario in my previous commit :|

(cherry picked from commit 5e17d18)
I suspect that circleci wants an <error> tag instead of a <failure> one
to properly register a test as failing in the UI. We shall see.

(cherry picked from commit f7ee43f)
This is failing on the parsing script in CI. Needs to be investigated
and tweaked further before re-introduction.

(cherry picked from commit 565b20b)
Use Xenial to install gettext 0.19.*

(cherry picked from commit cd608b8)

Conflicts:
	securedrop/Dockerfile

Favored master in conflict resolution
Conflicts:
	.circleci/config.yml

Conflict due to one of the CI jobs being removed - resolved by
modifying pip safety check in the existing CI job.
The tests weren't flake8 compliant, which CI did not catch. There were
two separate tests identically named, so one got clobbered and
effectively we weren't testing both required attributes.

The test file now *does* pass flake8, which I confirmed via manual
invocation against it. Opened a separate issue, #2933, to track
improving CI to avoid this problem in the future.

(cherry picked from commit 69668db)
Two options are required to ensure fail-fast behavior from Ansible:

  * max_fail_percentage=0
  * any_errors_fatal=yes

See #2885 for details. During merge of #2922, due to closely related
changes to playbook "play" blocks in #2748, the options were removed as
they landed in the "develop" branch. Here they are re-added, and the
`make ansible-config-lint` target is happy again.

(cherry picked from commit 91dbc08)
@redshiftzero redshiftzero force-pushed the backported-changes-for-0.5.2 branch from a4a65f5 to ac50052 Compare January 27, 2018 00:55
@codecov-io
Copy link

codecov-io commented Jan 27, 2018

Codecov Report

❗ No coverage uploaded for pull request base (release/0.5.2@0648361). Click here to learn what that means.
The diff coverage is n/a.

Impacted file tree graph

@@               Coverage Diff                @@
##             release/0.5.2    #2940   +/-   ##
================================================
  Coverage                 ?   85.52%           
================================================
  Files                    ?       31           
  Lines                    ?     1914           
  Branches                 ?      213           
================================================
  Hits                     ?     1637           
  Misses                   ?      228           
  Partials                 ?       49

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 0648361...ac50052. Read the comment docs.

@redshiftzero redshiftzero changed the title [wip] Backported changes for 0.5.2 bugfix release Backported changes for 0.5.2 bugfix release Jan 27, 2018
@redshiftzero
Copy link
Contributor Author

Yay backporting #2796 did the trick. Doing the merges this time I did encounter a couple of minor conflicts - where this happened (.circleci/config.yml and securedrop/Dockerfile) I've noted in the commit messages how they were resolved.

@ghost
Copy link

ghost commented Jan 27, 2018

The GitHub display is inconvenient for review. Fortunately your consistent branch naming allows for a onliner to help review.

git --no-pager log --format='%H %s' --graph origin/master..origin/backported-changes-for-0.5.2 | perl -p -e 's/"/ /g; if (/\w+\s+Merge branch .backport-pr-(\d+). into backported-changes-for-.*/) { s|\w+\s+Merge branch .backport-pr-(\d+). into backported-changes-for-.*|[Pull request $1](https://github.com/freedomofpress/securedrop/pull/$1)|; } else { s|(\w+)\s+(.*)|[$2](https://github.com/freedomofpress/securedrop/commit/$1)|;} s/^\|\\ $//; s/^\|/ /;'

@ghost
Copy link

ghost commented Jan 27, 2018

How is Use Xenial to install gettext 0.19.* different from the original commit ?

commit=4227c99cb02ce31a16b9924c1f29d57036c86e5f ; picked_from=$(git show --no-patch --pretty=%b $commit  | perl -ne 'print if(s/.*cherry pick.* from commit (\w+).*/$1/)') ; diff -u --ignore-matching-lines '^[^+-]'   <(git show $picked_from) <(git show $commit)
--- /dev/fd/63	2018-01-27 11:42:08.348420006 +0100
+++ /dev/fd/62	2018-01-27 11:42:08.340419907 +0100
@@ -40,19 +47,19 @@
      - apt
 -
 diff --git a/securedrop/Dockerfile b/securedrop/Dockerfile
-index ac875036..340bd96e 100644
+index d579da5e..909b4034 100644
 --- a/securedrop/Dockerfile
 +++ b/securedrop/Dockerfile
-@@ -19,10 +19,10 @@ RUN curl -LO https://launchpad.net/~ubuntu-mozilla-security/+archive/ubuntu/ppa/
+@@ -23,10 +23,10 @@ RUN curl -LO https://launchpad.net/~ubuntu-mozilla-security/+archive/ubuntu/ppa/
  #
  # This can be removed when upgrading to something more recent than trusty
  #
--RUN echo deb http://archive.ubuntu.com/ubuntu/ zesty main > /etc/apt/sources.list.d/zesty.list && \
-+RUN echo deb http://archive.ubuntu.com/ubuntu/ xenial main > /etc/apt/sources.list.d/xenial.list && \
-     apt-get update && \
-     apt-get install -y gettext && \
--    rm /etc/apt/sources.list.d/zesty.list && \
-+    rm /etc/apt/sources.list.d/xenial.list && \
-     apt-get update
+-RUN echo deb http://archive.ubuntu.com/ubuntu/ zesty main > /etc/apt/sources.list.d/zesty.list
++RUN echo deb http://archive.ubuntu.com/ubuntu/ xenial main > /etc/apt/sources.list.d/xenial.list
+ RUN apt-get update
+ RUN apt-get install -y gettext
+-RUN rm /etc/apt/sources.list.d/zesty.list
++RUN rm /etc/apt/sources.list.d/xenial.list
+ RUN apt-get update
  
  COPY requirements requirements

@ghost ghost added this to the 0.5.2 milestone Jan 27, 2018
@ghost
Copy link

ghost commented Jan 27, 2018

I think backporting #2822 would avoid the two conflicts. However I also don't think it's worth the effort. This only impacts a developer use case that is not even mainline.

@ghost
Copy link

ghost commented Jan 27, 2018

Verified this branch is based on master:

$ git --no-pager log --oneline origin/master..origin/backported-changes-for-0.5.2 | wc -l
43
$ git --no-pager log --oneline origin/master..origin/release/0.5.2
$ 

Copy link

@ghost ghost left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All is well, good for 0.5.2 Q/A !

@ghost
Copy link

ghost commented Jan 27, 2018

I'm merging this even though I'm not 100% sure these are all that's needed for 0.5.2. If more PRs need backporting, they can be added later.

@ghost ghost merged commit eeacc1d into release/0.5.2 Jan 27, 2018
This pull request was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants