-
Notifications
You must be signed in to change notification settings - Fork 688
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Backported changes for 0.5.2 bugfix release #2940
Conversation
Add ar, it_IT, tr, zh_Hant to the YAML prompt used by securedrop-admin sdconfig (cherry picked from commit 1c7002e)
(cherry picked from commit 81f4c90)
Previously we configured both postfix and procmail inside the "ossec-server" role, which was essentially shoehorning the config. It's appropriate to target the "securedrop_monitor_server" group with those config items, but technically there's not literally OSSEC server, but rather a separate service that deserves its own configuration logic. Tidies up some of the vars. We've documented use of the `ossec_from_address` publicly, so we can't simply drop reference to it. Since the associated logic now resides in the "postfix" role, it should be prefixed with the `postfix_` namespace. That's done, old values of `ossec_from_address` set at the site level, if set, will be honored. (cherry picked from commit 6aaf88a)
In order to support cross-host variable delegation, we'll need the play to target *both* the Application and Monitor servers. We can then dynamically open ports on the firewall as necessary in order to support registration. Reuses @msheiny's `iptables` Ansible module implementation, which is remarkably clean. Love the state=absent functionality. Rather than use `delegate_to`, we target both hosts and use the boolean vars `ossec_is_server` and `ossec_is_client` (both defaulting to False) in order to determine which host the task should execute on. For some tasks, we want both hosts to execute, thus the combined play target. Uses a rather gnarly hostvars-based retrieval to map a registered var result across both hosts, ensuring the same value is accessible to both play hosts, to coordinate the firewall rule management. Left an in-line comment to guide future maintainers. Removes most tags because we can better handle tasks at the import task level now, which was difficult to do before because the logic was so spaghetti. (cherry picked from commit 58b29c8)
BEGINS combined ossec role FIX: begins ossec common role REMOVES duplicated default ossec vars files PARITAL: ports ossec-agent tasks to common role MOVES registration flow into common role MOVES testing pubkeys for ossec server role MOVES ossec-server config tasks into common role MOVES send0encrypted-alarms tempalte MOVES (and ports) handlers into ossec comon role FIX MOVES ossec-server-config-tasks There's a functional change sneakily hiding in here: we had a double conditional on the SSL key creation tasks, when really we only want to skip creation if the relevant files exist. Prunes down the OSSEC-related handlers. We don't really need to fuss over whether the OSSEC agent or server is restarted: if inter-machine authentication changed, then we want to bounce *both* services. Fortunately we can do so by restarting the "ossec" service on both hosts, regardless of whether it's OSSEC server or client. (cherry picked from commit c4a1178)
We've consolidated the "ossec-agent" and "ossec-server" roles into "ossec", and excised the postfix logic from the "ossec-server" role into a discrete "postfix" role. Playbooks must have these updated accordingly. We're nearly at the point where we can standardize on a single playbook for prod and staging. Onward! (cherry picked from commit 77af2f2)
The previous iptables rules were swapped and caused the port to be blocked during registration. (cherry picked from commit 4c012fa)
Without this explicitly set, ansible will append the iptables to the bottom of the chain. Since we have some explicit DROP lines at the bottom of the INPUT chain, this means that the exception gets added AFTER a DROP line. Meaning... that the line basically never got evaluated. This commit resolves that issue by always inserting at the top of the chain (both INPUT and OUTPUT). There is a following clean-up task that will remove this afterwards as usual. (cherry picked from commit 345aea4)
The Postfix logic assumes that the "ossec" user and group will exist, and sets logfiles accordingly. Therefore let's make sure that Postfix is configured *after* OSSEC. The cleaner approach would be to set a `postfix_user` var and set the default value to "ossec", but rather than add additional logic into our config to solve an ordering problem, I'd rather focus on making what we have work—and the current change set is focused on resolving the ossec registration logic—and then deferring to community-maintained roles for common services, e.g. Postfix, which would allow overriding via vars. (cherry picked from commit 5493923)
Current logic doesnt always work as intended since a machine's agent name won't match up exactly if it's been registered multiple times. Instead, lets just look for the IP address and the `is available` string. (cherry picked from commit 963cb6b)
If the agent connection isnt working, lets go ahead and purge all existing agents from the app and monitor servers prior to re-registration. This helps shake out a bunch of weird connection issues that popped up during QA. (cherry picked from commit 99a538b)
Since the monitor server should only have one agent connected, multiple agents showing up is a sign of problems. In that scenario, lets force the re-register and cleanup logic. (cherry picked from commit 97e1b1a)
Partial reversion of 3da5605, in which several config tests were intentionally disabled for CI optimization. Now that we've resolved the unreliable OSSEC registration flow, the strict "xfail" on the OSSEC connectivity test was reporting failure: because it was unexpectedly passing! Renabled the OSSEC test, but left the other changes from 3da5605 in place. We'll get to those in their own time. (cherry picked from commit 4d3e94c)
In typical TDD fashion, let's create a failing test and then get it to pass. The test is currently failing, which is good! We'll want to update *all* SecureDrop-related playbooks to contain max_fail_percentage=0. (cherry picked from commit 61672a9)
Target name is intentionally a bit verbose in order to discrimate from "ansible-lint", which is a dedicated tool for linting Ansible playbooks and the roles called therefrom, ensuring adherence to best practices. That's a separate tool, and one we use elsewhere, whereas the new test suite being introduced ensures SecureDrop-specific config choices that we've added in this repository. Hooked up the new test to the general "make lint" target, as well. The new tests are *fast*, so no concerns with CI delays. (cherry picked from commit 1ebd8a8)
The primary concern is for the "securedrop-prod.yml" playbook, which is what Admins use to configure a production SecureDrop instance. In that playbook, any error encountered during provisioning should cause an immediate failure, aborting execution and reporting an error message for review. If the max_fail_percentage is *not* set to 0, debugging errors becomes challenging, often involving lots of scrollback to piece together what specifically went wrong. In order to provide a sane strategy for testing adherence to this requirement, we'll test all playbooks (defined as "YAML file in ansible-base directory, and not the old prod-specific.yml vars file") and ensure the option is set on each and every play. That's a bit aggressive, but the tests are extremely fast, and will likely pay off as we bring new contributors into the project who may not be familiar with this rather esoteric Ansible option. (cherry picked from commit 1421b40)
Also removes associated "upgrade" role, which was designed for the migration to SecureDrop 0.3, released on 2015-02-12. These files have not been used since then, and so should be removed from the repository. As a side note, these files were first written, and last touched, by the late James Dolan. RIP, James. I cannot express what a profound pleasure it was to work with you. You are missed, dearly. (cherry picked from commit 077002b)
Based on feedback from @trishnaguha, we've determined we need `any_errors_fatal=yes` in addition to `max_fail_percentage` to ensure fast fail behavior from Ansible. Tests have been updated, will modify playbooks subsequently. (cherry picked from commit a04fe03)
We also want `any_errors_fatal=yes` on all playbooks in the repository; simply using `max_fail_percentage` is insufficient. Tests have already been updated to track this change, and manual testing confirms the "fail fast" behavior desired for minimizing Admin frustration. (cherry picked from commit 60358b5)
… TEMPORARY until Ansible in `securedrop/requirements/ansible.in` is upgraded to 2.4 (cherry picked from commit 8d7345b)
@redshiftzero This looks really good (except for the angry CI but that's a different story ;-). No conflicts at all ? |
Nope no conflicts - it all went very smoothly. Now to appease CI 😇 |
The gettext 0.19.* has a lot of useful features compared to gettext 0.18.*, including support for .desktop files. Previously we used Zesty to install gettext 0.19.*, but it is now EOL. Instead, we can install Xenial to install a modern gettext. (cherry picked from commit 985146e) Conflicts: securedrop/Dockerfile Resolved by favoring syntax used on master - simply replaced 'zesty' with 'xenial'
(cherry picked from commit bd48593)
(cherry picked from commit 27eeb61)
Currently the cleanup and merging (combining app and testinfra results) of junit files is attached to the testinfra runner script. This is problematic because if the testinfra process dies unexpectedly, we wont have any results from the app tests as well. (cherry picked from commit 323d932)
Ensure developers are able to access raw test runner results for debugging. (cherry picked from commit d50519f)
forgot about this scenario in my previous commit :| (cherry picked from commit 5e17d18)
I suspect that circleci wants an <error> tag instead of a <failure> one to properly register a test as failing in the UI. We shall see. (cherry picked from commit f7ee43f)
(cherry picked from commit 3e3e417)
This is failing on the parsing script in CI. Needs to be investigated and tweaked further before re-introduction. (cherry picked from commit 565b20b)
Use Xenial to install gettext 0.19.* (cherry picked from commit cd608b8) Conflicts: securedrop/Dockerfile Favored master in conflict resolution
Conflicts: .circleci/config.yml Conflict due to one of the CI jobs being removed - resolved by modifying pip safety check in the existing CI job.
The tests weren't flake8 compliant, which CI did not catch. There were two separate tests identically named, so one got clobbered and effectively we weren't testing both required attributes. The test file now *does* pass flake8, which I confirmed via manual invocation against it. Opened a separate issue, #2933, to track improving CI to avoid this problem in the future. (cherry picked from commit 69668db)
Two options are required to ensure fail-fast behavior from Ansible: * max_fail_percentage=0 * any_errors_fatal=yes See #2885 for details. During merge of #2922, due to closely related changes to playbook "play" blocks in #2748, the options were removed as they landed in the "develop" branch. Here they are re-added, and the `make ansible-config-lint` target is happy again. (cherry picked from commit 91dbc08)
a4a65f5
to
ac50052
Compare
Codecov Report
@@ Coverage Diff @@
## release/0.5.2 #2940 +/- ##
================================================
Coverage ? 85.52%
================================================
Files ? 31
Lines ? 1914
Branches ? 213
================================================
Hits ? 1637
Misses ? 228
Partials ? 49 Continue to review full report at Codecov.
|
Yay backporting #2796 did the trick. Doing the merges this time I did encounter a couple of minor conflicts - where this happened ( |
How is Use Xenial to install gettext 0.19.* different from the original commit ? commit=4227c99cb02ce31a16b9924c1f29d57036c86e5f ; picked_from=$(git show --no-patch --pretty=%b $commit | perl -ne 'print if(s/.*cherry pick.* from commit (\w+).*/$1/)') ; diff -u --ignore-matching-lines '^[^+-]' <(git show $picked_from) <(git show $commit) --- /dev/fd/63 2018-01-27 11:42:08.348420006 +0100 +++ /dev/fd/62 2018-01-27 11:42:08.340419907 +0100 @@ -40,19 +47,19 @@ - apt - diff --git a/securedrop/Dockerfile b/securedrop/Dockerfile -index ac875036..340bd96e 100644 +index d579da5e..909b4034 100644 --- a/securedrop/Dockerfile +++ b/securedrop/Dockerfile -@@ -19,10 +19,10 @@ RUN curl -LO https://launchpad.net/~ubuntu-mozilla-security/+archive/ubuntu/ppa/ +@@ -23,10 +23,10 @@ RUN curl -LO https://launchpad.net/~ubuntu-mozilla-security/+archive/ubuntu/ppa/ # # This can be removed when upgrading to something more recent than trusty # --RUN echo deb http://archive.ubuntu.com/ubuntu/ zesty main > /etc/apt/sources.list.d/zesty.list && \ -+RUN echo deb http://archive.ubuntu.com/ubuntu/ xenial main > /etc/apt/sources.list.d/xenial.list && \ - apt-get update && \ - apt-get install -y gettext && \ -- rm /etc/apt/sources.list.d/zesty.list && \ -+ rm /etc/apt/sources.list.d/xenial.list && \ - apt-get update +-RUN echo deb http://archive.ubuntu.com/ubuntu/ zesty main > /etc/apt/sources.list.d/zesty.list ++RUN echo deb http://archive.ubuntu.com/ubuntu/ xenial main > /etc/apt/sources.list.d/xenial.list + RUN apt-get update + RUN apt-get install -y gettext +-RUN rm /etc/apt/sources.list.d/zesty.list ++RUN rm /etc/apt/sources.list.d/xenial.list + RUN apt-get update COPY requirements requirements |
I think backporting #2822 would avoid the two conflicts. However I also don't think it's worth the effort. This only impacts a developer use case that is not even mainline. |
Verified this branch is based on master: $ git --no-pager log --oneline origin/master..origin/backported-changes-for-0.5.2 | wc -l 43 $ git --no-pager log --oneline origin/master..origin/release/0.5.2 $ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All is well, good for 0.5.2 Q/A !
I'm merging this even though I'm not 100% sure these are all that's needed for 0.5.2. If more PRs need backporting, they can be added later. |
Status
Ready for review
Description of Changes
These are the changes for the 0.5.2 release (no
rc
commit yet). I took a slightly different approach than used in the past for bugfix releases based on some useful suggestions from @conorsch and @dachary. The approach I took was to firstcherry-pick -x
from individual PRs into backport branches namedbackport-pr-<PR number>
, then I merged these PRs into this branch. A nice feature of this is that we can easily see which cherry picked commits came from which PR in the git history. Note however that there's one PR I didn't create a backport branch for... PR #2922, which consists of two small commits on top of modifications in other backported changes, so cherry picking directly into this branch seemed like the cleanest approach.Comments welcome especially if you think this is a weird thing to do, happy to change it up for next time
Testing
Does Circle CI pass?
Are there other changes you feel should be brought in for 0.5.2?
Deployment
QA will discover this
Checklist
If you made changes to the app code:
If you made changes to the system configuration: