Add network troubleshooting guide #16

Merged
merged 8 commits into from
Mar 19, 2020
Changes from 6 commits
215 changes: 215 additions & 0 deletions docs/admin/troubleshooting_network.rst
@@ -2,3 +2,218 @@ Troubleshooting network issues
==============================
Contributor

The document is titled "Troubleshooting network issues", but really it strikes me as a "Troubleshooting connection issues" guide, given that step 1 is debugging network, and step 2 is debugging client-login. Sounds like splitting hairs, but on a full read-through, when I reached the end of "Step 1: Verify you are connected to the internet," it felt like I was done "Troubleshooting network issues", when of course there are additional steps that will almost certainly be relevant to Admins.

Member Author

I agree, "connection issues" is friendlier and encompasses more, will use that.


.. include:: ../includes/top-warning.rst

Before troubleshooting network issues, we recommend reading about the
:ref:`networking architecture <Networking Architecture>`
of SecureDrop Workstation. If you are in a hurry, this guide offers quick
diagnostic and remedial steps.

Step 1: Verify you are connected to the Internet
------------------------------------------------

You can use both wireless and wired networks in Qubes. You can manage network
access through the network manager, which you can find in the area populated
with icons in the top right corner of your Qubes desktop, known as the *system
tray*.

The network manager looks like this for a wired connection:

**[SCREENSHOT: network manager wired icon]**

It looks like this for a wireless connection:

**[SCREENSHOT: network manager wireless icon]**

It looks like this when you are not connected to the Internet at all:

**[SCREENSHOT: no connection icon]**

When a network connection is lost, Qubes will display an alert like the
following:

**[SCREENSHOT: lost connection notification]**

Common causes for lost connections include fully or partly unplugged network cables,
lost power to networking equipment, and ISP service outages. When you see a lost
connection notification, it is most likely due to one of these causes.

If the network manager shows that you are connected to the Internet, you can
verify whether your connection is working by opening a terminal in ``sys-net``:

**[SCREENSHOT: Q widget with VM list and "Run terminal" expanded]**

1. Click the "Q" icon in the system tray (top right area).
2. A list of running VMs should appear. Select ``sys-net`` from the list, and
   click **Run Terminal**.
3. In the terminal window, type the command ``ping google.com``.
Contributor

Consider using -c 5 in the command to save the trouble of explaining ctrl+c to stop the forever-ping command.

Member Author @eloquence, Mar 13, 2020

👍

Contributor

Maybe ping 8.8.8.8 to avoid DNS gotchas? 🤔

Member Author

I think that's a great additional step we can suggest here, will add.


You should see a sequence of lines starting with ``64 bytes from`` and ending with
the number of milliseconds it took to complete the request. If you do not see
similar output, your network access may be misconfigured, or the Internet may be
wholly or partially unreachable.

If you have verified that you are able to connect to the Internet using
``sys-net``, but you are experiencing other connectivity issues, move on to the
next step.
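
A failed ``ping google.com`` can mean either that name resolution is broken or
that the network itself is unreachable. The following is a minimal sketch, not
part of the original guide, for telling the two apart in a ``sys-net`` terminal.
It assumes that ICMP is not blocked on your network; ``8.8.8.8`` is used only as
a well-known public IP address, and the function name ``diagnose`` is
hypothetical.

```shell
# Interpret the exit codes of two ping runs (0 means replies were received).
diagnose() {
    ip_status=$1      # exit code of: ping -c 5 8.8.8.8
    name_status=$2    # exit code of: ping -c 5 google.com
    if [ "$ip_status" -ne 0 ]; then
        echo "no connectivity"
    elif [ "$name_status" -ne 0 ]; then
        echo "connectivity ok, DNS lookup failing"
    else
        echo "connectivity and DNS ok"
    fi
}

# Example usage in a sys-net terminal:
#   ping -c 5 8.8.8.8;    ip=$?
#   ping -c 5 google.com; name=$?
#   diagnose "$ip" "$name"
```

The ``-c 5`` option stops ``ping`` after five requests, so there is no need to
interrupt it manually.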

.. important::

   Not all VMs in Qubes OS have Internet access. For example, opening the Qubes
   menu (top left) and clicking **Terminal Emulator** opens a ``dom0`` terminal
   without Internet access. See our :ref:`networking architecture <Networking Architecture>`
   overview for additional background.
Contributor

The "important" box is great info, feels a bit too late here, since I've already finished debugging. Consider moving between paragraphs beginning:

  • "Common causes for lost connections" and
  • "If the network manager shows"

Placed there, it's a helpful primer for why I'm using sys-net specifically to debug.

Member Author

👍


Step 2: Troubleshooting login issues
------------------------------------
Issues logging in may not be network-related. If you are experiencing
connectivity issues before or after logging in, you can skip ahead to the next section.

Make sure that your username, passphrase, and two-factor code are correct.

.. important::

   After a failed login, wait for a new two-factor code from your app before
   trying again.
Contributor

👍


You can reveal the passphrase by clicking the "eye" icon next to it in the login
dialog (ensure you are in a fully private setting before doing so). Check for
extra characters at the beginning at the end, or subtle differences like
rocodes marked this conversation as resolved.
Contributor

"at the beginning and end"

Member Author

👍

capitalization.
Contributor

Consider "like capitalization or whitespace". Given that we've been automatically generating diceware passphrases for SD accounts for some time now, perhaps mentioning whitespace is too confusing.

Member Author

Added "Note that the spaces between words in SecureDrop passphrases are part of the passphrase" to make it super explicit.


If you use the two-factor app on your phone for other websites and services,
make sure that you have selected the correct user account.
Contributor

Consider adding "It should be labeled SecureDrop." At least, I haven't seen a 2FA app that doesn't display the label by default.

Member Author

👍


If you have access to a Tails-based *Journalist Workstation*, verify whether you
can access SecureDrop from Tails.

If you are certain that your credentials are correct but you are unable to log
in, proceed to the next step.

Step 3: Verify that all required VMs are running
------------------------------------------------
The following VMs must be running for all actions requiring network connectivity
to work (e.g., logging in, checking for messages, downloading documents, replying
to sources, starring sources, deleting sources):

- ``sd-app``
- ``sd-gpg``
- ``sd-log``
- ``sd-proxy``
- ``sd-whonix``
- ``sys-firewall``
- ``sys-net``
- ``sys-whonix`` (during updates)
Contributor

Great list. The autostart functionality is sufficient for most use cases, but we should probably be enforcing these VMs as running in the project code. Will file an issue to track post-pilot. Regardless, the list here should remain as useful debugging info.


You can verify whether a VM is running or not by clicking the "Q" icon in the
system tray (top right). Only VMs that are currently running will appear in the
list:

**[SCREENSHOT: Q widget with VM list]**

If a required VM is not running, you can launch it from the Qube Manager. Open
the Qube Manager by clicking **Open Qube Manager** in the menu above. A window
like the following should appear:

**[SCREENSHOT: Qube manager screenshot]**

To start a VM, select it from the list, right-click it, and click **Start/Resume
Qube**. Alternatively, you can click the "Play" button in the toolbar.

In ordinary use, VMs required by SecureDrop should be started on boot or when
they are needed. If you repeatedly experience problems with a necessary VM not
running, or if an error message is displayed when attempting to start the VM,
please contact us for assistance.

If all required VMs are running, proceed to the next step.
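
The check above can also be sketched from a ``dom0`` terminal. This is an
illustration under stated assumptions, not part of the original guide: it
assumes that ``qvm-ls --running --raw-list`` prints the names of running VMs one
per line on your Qubes OS version, and the function name ``missing_vms`` is
hypothetical. ``sys-whonix`` is left out because it is only needed during
updates.

```shell
# VMs the guide lists as required for network-related actions.
required="sd-app sd-gpg sd-log sd-proxy sd-whonix sys-firewall sys-net"

# Print each required VM that does not appear in the list of running VMs.
# $1: names of running VMs, separated by spaces or newlines.
missing_vms() {
    running=" $(printf '%s' "$1" | tr '\n' ' ') "
    for vm in $required; do
        case "$running" in
            *" $vm "*) ;;          # running, nothing to report
            *) echo "$vm" ;;       # not running
        esac
    done
}

# Example usage in dom0 (assumed qvm-ls options, see above):
#   missing_vms "$(qvm-ls --running --raw-list)"
```

Empty output means that every required VM was found in the list.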

Step 4: Verify that required VMs have connectivity
--------------------------------------------------
In step 1, you have already verified that you can connect to the
Internet using ``sys-net``. Now, test whether ``sys-firewall``, ``sd-whonix``
and ``sd-proxy`` are working.

First, open a terminal in ``sys-firewall`` and run the ``ping google.com`` command.
You should see similar as in ``sys-net`` before.
Contributor

There's a word missing here; try

You should see similar (output|messages)

Member Author

👍


Now, open a terminal in ``sd-whonix`` and run the following command:

``curl -s https://check.torproject.org/ | cat | grep -m 1 "Congratulations"``
Contributor

Nit: remove | cat, curl writes to stdout by default.

Member Author

As discussed, I was consistently getting Failed writing body errors in curl, which seem to be intermittent if grep closes the read pipe before the request fully finished. Piping through cat seemed to resolve. Will leave in for now but will test once more in Qubes later, as well.


This command contacts a service intended for web browsers to verify whether your
Tor connection is working.

You should see the text "Congratulations. This browser is configured to use Tor."
or a similar message on the terminal.

If this command fails, proceed to the next step.
Contributor

If this command fails

Several failure cases will be non-obvious, indicated by a lack of a message, or a non-zero exit code. See:

user@host:~$ # Expecting a fully working connection
user@host:~$ curl -s https://check.torproject.org | grep -m 1 "Congratulations" 
      Congratulations. This browser is configured to use Tor.
user@host:~$ # After disconnecting network cable
user@host:~$ curl -s https://check.torproject.org | grep -m 1 "Congratulations" 
user@host:~$ echo $?
1
user@host:~$ sudo systemctl restart tor
user@host:~$ # After reconnecting network cable
user@host:~$ curl -s https://check.torproject.org | grep -m 1 "Congratulations" 
      Congratulations. This browser is configured to use Tor.
user@host:~$ # Stopping tor to simulate broken tor config
user@host:~$ sudo systemctl stop tor
user@host:~$ curl -s https://check.torproject.org | grep -m 1 "Congratulations" 
user@host:~$ echo $?
1

So consider stating explicitly:

If this command does not display a "Congratulations" message, proceed to the next step.

or similar wording so that empty output doesn't appear to be a false negative.

Member Author

👍


If the command succeeds in ``sd-whonix``, run exactly the same command in
``sd-proxy``. If it only fails in ``sd-proxy``, your workstation may be
misconfigured, or the proxy may have crashed. In that case, skip ahead to step 6.
We also recommend that you contact us, so we can help identify the root cause.
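
Because the Tor check in this step prints nothing at all when it fails, empty
output is easy to misread. The following sketch, which is not part of the
original guide, wraps the same check so that both outcomes produce a message.
The function name ``report_tor_check`` and the wording of its messages are
hypothetical.

```shell
# Report explicitly whether the Tor check page confirms a working connection.
# $1: the HTML returned by https://check.torproject.org/ (may be empty)
report_tor_check() {
    case "$1" in
        *Congratulations*)
            echo "Tor check passed"
            ;;
        *)
            echo "Tor check failed: no Congratulations message"
            ;;
    esac
}

# Example usage in an sd-whonix or sd-proxy terminal:
#   page=$(curl -s https://check.torproject.org/)
#   report_tor_check "$page"
```

Capturing the page in a variable before inspecting it also avoids closing the
pipe while ``curl`` is still writing.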

Step 5: Restart Tor
-------------------
If you have narrowed down the problem to ``sd-whonix``, try restarting Tor.

To do so, right-click the Tor icons in the top right corner of your Qubes
desktop. They look like this:
Contributor

It'd be a shame not to reference the GUI icons, but given that we were using the terminal in sd-whonix in Step 4, it might be simpler for an Admin to run sudo systemctl restart tor. We can take a closer look once the screenshots are proposed, but on my system, the sd-whonix / sys-whonix combination makes for a very confusing tooltray experience. If I were following the instructions, I might end up bouncing tor in sys-whonix rather than sd-whonix without realizing it.

Member Author

You're right, it's a massive simplification in this context to just do it on the terminal. If Tor issues are frequent it'd be great to be able to teach journalists how to use the widget, but given the current UX issues with it, we may just want to steer clear for now, also pending a final decision whether we'll continue to use Whonix.

Cross-ref:


**[SCREENSHOT: sdwdate-gui widget]**

One of the two icons should display the option **sd-whonix**. Select that option
from the menu with your mouse, and click **Tor control panel**:

**[SCREENSHOT: sdwdate-gui widget with Tor control panel option shown]**

You should now see the following dialog:

**[SCREENSHOT: Tor control panel screenshot]**

Click **Restart Tor** in the bottom right of this dialog. You should see a
progress bar which will indicate when Tor is available again. If this does not
resolve the issue, proceed to the next step.

Step 6: Restart ``sd-proxy`` and ``sd-whonix``
----------------------------------------------
Restart ``sd-proxy`` and ``sd-whonix`` to attempt to restore connectivity:

1. Exit the SecureDrop app if it is running.
2. Click the "Q" icon in the system tray (top right).
3. Click **Open Qube Manager**.
4. Right-click ``sd-proxy`` in the list of VMs. Click **Shutdown qube**.
5. Right-click ``sd-whonix`` in the list of VMs. Click **Shutdown qube**.
6. Right-click ``sd-proxy`` in the list of VMs. Click **Start/Resume qube**.
   The ``sd-whonix`` VM should start automatically.
Contributor

👍 Super clear.


If this does not resolve the issue, proceed to the next step.
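
The restart sequence in this step could also be run from a ``dom0`` terminal.
The following is a sketch under stated assumptions, not part of the original
guide: it uses the Qubes OS command-line tools ``qvm-shutdown`` (with
``--wait``) and ``qvm-start``, and it relies on ``sd-whonix`` starting
automatically with ``sd-proxy``, as described above. The helper names ``run``
and ``restart_proxy_chain`` and the ``DRY_RUN`` variable are hypothetical.

```shell
# Run a command, or only print it when DRY_RUN is set to a non-empty value.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Shut down sd-proxy and sd-whonix, then start sd-proxy again.
# Exit the SecureDrop app before calling this.
restart_proxy_chain() {
    run qvm-shutdown --wait sd-proxy
    run qvm-shutdown --wait sd-whonix
    run qvm-start sd-proxy
}

# Example usage in dom0:
#   DRY_RUN=1; restart_proxy_chain    # print the commands only
#   DRY_RUN=;  restart_proxy_chain    # actually restart the VMs
```

Printing the commands first makes it possible to confirm the order before any
VM is shut down.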

Step 7: Restart ``sys-net`` and ``sys-firewall``
------------------------------------------------

.. note::

   You will temporarily lose all Internet connectivity in Qubes OS during this
   step.

Using the same procedure as in the previous step, shut down ``sd-proxy``,
``sd-whonix`` and ``sys-whonix`` (in this order). Attempt to shut down
``sys-firewall``. You may see an error message telling you that other VMs still
require access to ``sys-firewall``. Save your work in those VMs, shut them
down, and attempt to shut down ``sys-firewall`` again.

Finally, shut down ``sys-net``. The network manager icon should disappear.

Now, start ``sys-whonix``, which will bring up ``sys-net`` and ``sys-firewall``
at the same time. Start ``sd-proxy``, which will bring up ``sd-whonix``.

If this does not resolve the issue, please contact us for assistance.

Examining logs
--------------
You may wish to examine system logs on your own, or with our guidance. You can
examine consolidated syslogs from all SecureDrop-related VMs in the ``sd-log``
VM. They can be found in the default user's ``~/QubesIncomingLogs`` directory.

In addition, you may want to examine ``/var/log/syslog`` in ``sys-net`` and
``sys-firewall``.
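
As a starting point for reading the logs, the following sketch searches a log
directory for lines that mention errors. It is an illustration only and not part
of the original guide: the layout of ``~/QubesIncomingLogs`` is not described
here, so the search is simply recursive, and the function name ``find_errors``
is hypothetical.

```shell
# Print every line containing "error" (case-insensitive) from all files
# under the given directory, without file name prefixes.
# $1: directory to search, e.g. ~/QubesIncomingLogs in sd-log
find_errors() {
    # grep exits non-zero when nothing matches; treat that as "no errors".
    grep -rih "error" "$1" || true
}

# Example usage in an sd-log terminal:
#   find_errors ~/QubesIncomingLogs | tail -n 20
```

Piping the result through ``tail`` limits the output to the most recently
printed matches.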
55 changes: 55 additions & 0 deletions docs/admin/workstation_architecture.rst
@@ -0,0 +1,55 @@
SecureDrop Workstation Architecture
===================================

.. include:: ../includes/top-warning.rst

.. _Networking Architecture:

SecureDrop Workstation networking architecture
----------------------------------------------
One key security feature of Qubes OS is that it enables users to configure the
appropriate level of network access for each VM. For example, you could have a
VM for password storage that has no network access, a work VM that is firewalled
to only connect to work servers, and a personal VM that always uses Tor.

In the context of SecureDrop Workstation, these capabilities are used to
minimize the risk that an adversary who is able to exploit a security
Contributor

nit: "minimize the risk of an adversary who [...]" or "minimize the risk that an adversary who [...] can exfiltrate documents"

Member Author @eloquence, Mar 18, 2020

Agreed, that was also very inaccessible writing, let me know what you think of the new wording in 3e082f9

Contributor

LGTM!

vulnerability in order to exfiltrate documents or private keys. Specifically,
the following VMs have no network access:

- ``sd-app``, which runs the SecureDrop Client, and holds decrypted messages,
  replies, and documents.
- ``sd-viewer``, which is the template for disposable VMs used for opening
  documents from the SecureDrop Client.
- ``sd-gpg``, which holds the *Submission Private Key* required to decrypt
  messages, replies, and documents.
- ``sd-devices``, which passes exported documents through to USB devices like
  printers and encrypted flash drives.

By design, the Qubes OS host domain, ``dom0``, also does not have Internet
access.

.. note:
Contributor

s/:/::/, doesn't display as written

Member Author

👍, good catch :)


   If you attempt to directly access the network in any of these VMs, it will
   not work. That is the expected behavior.

Because the SecureDrop Client must connect to the SecureDrop
*Application Server* in order to send or retrieve messages, documents, and
replies, it can communicate through Qubes-internal system calls with another
Contributor

Consider:

"...it can communicate through Qubes' internal Remote Procedure Call (RPC) mechanism with another..."

Member Author

👍

VM, ``sd-proxy``, which can only access the open Internet through the Tor
network, using the separate ``sd-whonix`` VM.

Like all networked VMs, ``sd-whonix`` uses the ``sys-firewall`` service to
connect to the network, which is provided via ``sys-net``. All four VMs must be
running for the client to successfully connect to the server.
Contributor

Nit: s/client/Client/ for consistency with the rest of the doc.

Member Author

changed to SecureDrop Client (client is IMO correct but I don't think we should ever use it as a technical term, only as an application name we can change globally)


.. important:
Contributor

As above, s/:/::/, doesn't display as written

Member Author

👍


   The ``sd-whonix`` VM contains a sensitive authentication token required to
   access the SecureDrop API via Tor, and should not be attached to VMs that are
   unrelated to SecureDrop.

Qubes OS ships with a Whonix service called ``sys-whonix``. When troubleshooting
network issues specific to SecureDrop, ``sys-whonix`` is only relevant during
updates of the Whonix VMs (e.g., while the preflight updater is running).
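
The chain of networked VMs described above (``sd-proxy`` to ``sd-whonix`` to
``sys-firewall`` to ``sys-net``) can be inspected from ``dom0``. The following
is a sketch under stated assumptions, not part of the original document: it
reads each VM's ``netvm`` property with ``qvm-prefs``, and it assumes that a VM
without a network VM reports either an empty value or ``None``. The function
names ``netvm_of`` and ``print_chain`` are hypothetical.

```shell
# Look up the VM that provides networking to the given VM (run in dom0).
netvm_of() {
    qvm-prefs "$1" netvm
}

# Follow the netvm property from one VM until no further network VM is set,
# and print the resulting chain on a single line.
print_chain() {
    vm=$1
    chain=$vm
    while :; do
        next=$(netvm_of "$vm") || break
        case "$next" in
            ""|None) break ;;      # assumed values for "no network VM"
        esac
        chain="$chain -> $next"
        vm=$next
    done
    echo "$chain"
}

# Example usage in dom0:
#   print_chain sd-proxy
```

Running the same function on ``sd-app`` or ``sd-gpg`` should print only the VM
name, since those VMs have no network access.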
1 change: 1 addition & 0 deletions docs/index.rst
@@ -33,6 +33,7 @@ submitted documents with a reasonable level of security.
admin/troubleshooting_network
admin/provisioning_usb
admin/known_issues
admin/workstation_architecture


* :ref:`genindex`