Tool to help journalists analyze and sanitize metadata #543

Closed
runasand opened this issue Sep 8, 2014 · 9 comments

Comments

@runasand
Contributor

runasand commented Sep 8, 2014

As @garrettr said in #519 where we decided to remove MAT from SecureDrop: "I think we're going to start building a tool to help journalists analyze and sanitize metadata after 0.3 is released."

@runasand runasand mentioned this issue Sep 8, 2014
@diracdeltas
Contributor

MAT is in Tails. Should we just add instructions for journalists to use it? (I haven't tried using it)

@psivesely
Contributor

This is something that would still be good to add instructions for. Maybe someone will do it at this hackathon (2016 Aaron Swartz Day).

I'm adding the Reading Room label to this one because I believe this is something that could be automatically done by the reading room client. A submission is downloaded, then in a DispVM it is authenticated, decrypted, decompressed, and wiped of metadata in that order.
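The ordering described above can be sketched as a fixed list of stages that a submission passes through, aborting on the first failure. This is only an illustration: every function name here is hypothetical, the HMAC check and the no-op `decrypt` and `wipe_metadata` stages are stand-ins for whatever the reading room client would actually use, and nothing here is taken from the SecureDrop codebase.

```python
import gzip
import hashlib
import hmac
from typing import Callable, List, Tuple

Stage = Callable[[bytes], bytes]

# Assumption: a demo key for illustration only, not a real secret.
DEMO_KEY = b"demo-key-not-a-real-secret"


def authenticate(blob: bytes) -> bytes:
    """Stand-in: verify a 32-byte HMAC-SHA256 tag prefixed to the payload."""
    tag, payload = blob[:32], blob[32:]
    expected = hmac.new(DEMO_KEY, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("authentication failed")
    return payload


def decrypt(blob: bytes) -> bytes:
    """Placeholder: a real client would decrypt here; this returns input as-is."""
    return blob


def decompress(blob: bytes) -> bytes:
    """Undo gzip compression."""
    return gzip.decompress(blob)


def wipe_metadata(blob: bytes) -> bytes:
    """Placeholder: a real client would sanitize the document here."""
    return blob


# The order matters: each stage only sees data the previous stage accepted.
PIPELINE: List[Tuple[str, Stage]] = [
    ("authenticate", authenticate),
    ("decrypt", decrypt),
    ("decompress", decompress),
    ("wipe_metadata", wipe_metadata),
]


def process_submission(blob: bytes) -> bytes:
    """Run every stage in order and stop at the first failure."""
    for name, stage in PIPELINE:
        try:
            blob = stage(blob)
        except Exception as exc:
            raise RuntimeError(f"stage {name!r} failed") from exc
    return blob
```

Keeping the stages in one ordered list makes it hard to accidentally decompress or parse data that has not yet been authenticated.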

@psivesely
Contributor

See #497. Metadata may be useful for a journalist trying to verify the authenticity of documents. Therefore, automatic stripping of metadata may not be appropriate. Further, it should be noted MAT is not a perfect solution that wipes all metadata. That said, the journalist should use MAT on any appropriate documents if they intend to take them off the airgap to be published.

Closing, because this was implemented at some point (I can't tell when because of the migration to .rst from .md).

@redshiftzero
Contributor

My understanding was that this issue was to create a tool (like MAT, but better!) for journalists to anonymize their documents? Something not to be done automatically, but to have installed in Tails along with some written documentation on how to use it effectively to keep sources safe.

@psivesely
Contributor

Okay, I misunderstood. Going to reopen. Also, I have some more thoughts on the matter.

The best tool I can think of for this would be to take the Qubes PDF converter idea and extend it to all photo and document types. Though its design intention is to take a possibly malicious document and produce a trusted one with the same contents, I believe it would also do an excellent job of removing metadata. ImageMagick may add certain metadata when it reconstitutes the RGB bitmaps into the respective formats; however, this should be more predictable, less important, and easier to scrub. (E.g., ImageMagick might include the time of reconstitution, which is not nearly as bad as leaking the actual document creation time, but should still be removed.)

I've just started diving into the design of the reading room (RR) today. Here's the workflow I'm imagining for how a journalist removes documents for publication:
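As a toy illustration of why flattening to raw pixels and rebuilding drops metadata, here is a sketch using the binary PPM (P6) format, whose header can carry free-text `#` comments. This is not how the Qubes converter is implemented; the function names are hypothetical, and the sketch only handles 8-bit P6 files. The point is that only width, height, and RGB bytes survive the round trip, so anything else in the original container is discarded.

```python
from typing import List, Tuple


def _read_header_tokens(data: bytes, count: int) -> Tuple[List[bytes], int]:
    """Read `count` whitespace-separated header tokens, skipping '#' comments."""
    tokens: List[bytes] = []
    pos = 0
    while len(tokens) < count:
        while pos < len(data) and data[pos:pos + 1].isspace():
            pos += 1
        if pos >= len(data):
            raise ValueError("truncated header")
        if data[pos:pos + 1] == b"#":
            # A comment runs to the end of the line; this is where metadata hides.
            while pos < len(data) and data[pos:pos + 1] not in (b"\n", b"\r"):
                pos += 1
            continue
        start = pos
        while pos < len(data) and not data[pos:pos + 1].isspace():
            pos += 1
        tokens.append(data[start:pos])
    return tokens, pos


def flatten_ppm(data: bytes) -> Tuple[int, int, bytes]:
    """Reduce a P6 file to (width, height, raw RGB bytes) and nothing else."""
    tokens, pos = _read_header_tokens(data, 4)
    magic, width, height, maxval = tokens[0], int(tokens[1]), int(tokens[2]), int(tokens[3])
    if magic != b"P6" or maxval != 255:
        raise ValueError("only 8-bit P6 images are handled in this sketch")
    if not data[pos:pos + 1].isspace():
        raise ValueError("malformed header")
    size = width * height * 3
    raster = data[pos + 1:pos + 1 + size]
    if len(raster) != size:
        raise ValueError("truncated pixel data")
    return width, height, raster


def rebuild_ppm(width: int, height: int, raster: bytes) -> bytes:
    """Write a fresh P6 file with a minimal, fully predictable header."""
    return b"P6\n%d %d\n255\n" % (width, height) + raster
```

Note that this only removes container-level metadata. Anything encoded in the pixels themselves (visible text, watermarks, steganography) passes through unchanged.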

  • A USB is plugged into the RR machine. There is already a USB DispVM that has been assigned the USB controller via VT-d.
  • The journalist selects documents they would like to remove from the RR. By default, metadata and malware removal will be performed automatically (to the best of our ability). However, we will expose some option with an appropriate warning allowing them to move a copy of the raw document off the machine.
  • The documents will be moved via a qrexec3 protocol from the storage VM to the USB VM, and onto the USB.
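The "sanitize by default, raw only behind a warning" rule in the workflow above could look roughly like the following. This is a minimal sketch under stated assumptions: `export_document`, `ExportResult`, and the `warning_acknowledged` flag are hypothetical names, and the `sanitize` callable stands in for whatever metadata and malware removal step the RR would use.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ExportResult:
    data: bytes
    sanitized: bool


def export_document(
    data: bytes,
    sanitize: Callable[[bytes], bytes],
    *,
    raw: bool = False,
    warning_acknowledged: bool = False,
) -> ExportResult:
    """Sanitize by default; allow a raw copy only after an explicit acknowledgement."""
    if raw:
        if not warning_acknowledged:
            raise PermissionError("raw export requires acknowledging the warning")
        return ExportResult(data=data, sanitized=False)
    return ExportResult(data=sanitize(data), sanitized=True)
```

Making `raw` and `warning_acknowledged` keyword-only means a caller cannot bypass sanitization by accident with a positional argument.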

I think it's best we stop putting additional burdens on journalists and adding to our now ~200 pages of documentation. We need to automate as much as possible, and stop relying on journalists as much as possible to practice good opsec.

@redshiftzero
Contributor

Great workflow for exporting documents @fowlslegs. Also it looks like the developer is not currently maintaining MAT and is recommending not to use it:

[Screenshot, 2016-11-08]

@redshiftzero redshiftzero added this to the 1.0 milestone Dec 6, 2016
@redshiftzero
Contributor

FYI it turns out:

  1. qvm-convert-pdf does convert images (to PDFs) as well using DispVMs, though the "convert to trusted PDF" option does not appear unless you add the .PDF suffix to the file
  2. there is actually already a variant of this for images, qvm-convert-img (not installed by default, but I tried it out and it works great), that you can install in Qubes to go directly from e.g. PNG to trusted PNG using the same approach of opening the file in a DispVM
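For anyone scripting this, a small helper that picks between the two tools by file extension might look like the sketch below. It only builds the command line and does not run anything. The argument forms are assumptions, not confirmed in this thread: `qvm-convert-pdf <file>` and `qvm-convert-img <src> <dst>`. The `.trusted` output naming and the list of image suffixes are also made up for illustration, so check the tools' own usage output before relying on this.

```python
from pathlib import Path
from typing import List

# Assumption: an illustrative subset of image types, not an official list.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}


def build_convert_command(src: str) -> List[str]:
    """Return the argv for the Qubes converter matching the file's suffix."""
    path = Path(src)
    suffix = path.suffix.lower()
    if suffix == ".pdf":
        # Assumed usage: qvm-convert-pdf <file>
        return ["qvm-convert-pdf", str(path)]
    if suffix in IMAGE_SUFFIXES:
        # Assumed usage: qvm-convert-img <src> <dst>; output name is hypothetical.
        trusted = path.with_name(f"{path.stem}.trusted{path.suffix}")
        return ["qvm-convert-img", str(path), str(trusted)]
    raise ValueError(f"no converter known for {suffix or 'files without a suffix'}")
```

The returned list could be passed to `subprocess.run(cmd, check=True)` inside a Qubes AppVM where the tools are installed.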

@redshiftzero redshiftzero removed this from the 1.0 milestone May 11, 2017
@redshiftzero
Contributor

Just tried to redact a PDF using MAT on Tails 3, and PDFs are no longer a supported file type due to this bug found last year (it looks like PDF support has been disabled for a while). However, if someone fixed this bug, PDFs would likely become supported again...

@redshiftzero
Contributor

We're going to use a Qubes-based strategy to help journalists strip metadata from SecureDrop submissions. Followup: freedomofpress/securedrop-workstation#26
