Allow Users to Access Auto-Redacted Email Addresses in Responses #10

Closed
RichardTaylor opened this issue Jan 5, 2011 · 31 comments
Labels
easier-admin (Make issues easier to resolve), f:redaction, f:request-analysis, improvement (Improves existing functionality: UI tweaks, refactoring, performance, etc), professional, stale (Issues with no activity for 12 months), user-experience, x:uk

Comments

@RichardTaylor

Emails are redacted from responses to prevent harvesting by spammers.

These could be revealed to all users who prove they're human via a CAPTCHA.

Currently there are regular support requests to the WhatDoTheyKnow team asking for such addresses to be revealed.

@sebbacon
Contributor

sebbacon commented Jan 5, 2011

Could this be an earned right as per https://github.com/sebbacon/alaveteli/issues#issue/9, or would CAPTCHA be necessary?

@RichardTaylor
Author

Current experience on WhatDoTheyKnow suggests there needs to be a way to turn off email redaction on whole responses (including attachments), to deal with attachments containing the contact details of many people. This could be available only to those who've proved they're human, or to trusted users and the requestor.

This might be in addition to another workflow where clicking on a single redacted email address presents the user with a CAPTCHA which they can complete in order to access it.

The problem with an earned right is that some of those browsing the site and just reading the material might not be registered users; some might legitimately want email addresses on their first visit - so I think a CAPTCHA is needed.

@hsenag
Collaborator

hsenag commented Jun 22, 2011

There's also some relevant discussion on the duplicate https://github.com/sebbacon/alaveteli/issues/47

@skenaja
Collaborator

skenaja commented Jul 8, 2011

Suggested UI improvement for this, that could be implemented without CAPTCHA or special privs:

"When redacting email addresses, leaving the second part uncensored if a .gov.uk domain name could improve readability at no privacy cost"

e.g. an address at hmrc.gov.uk would be shown as "[hmrc.gov.uk email address]" rather than the current "[email address]"
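A minimal sketch of the suggestion above, assuming a simple regular expression is good enough to spot addresses. The constant `EMAIL_PATTERN`, the helper `redact_emails` and the sample addresses are hypothetical and are not taken from the Alaveteli codebase.

```ruby
# Hypothetical pattern: captures the domain part so it can be inspected.
EMAIL_PATTERN = /\b[A-Za-z0-9._%+-]+@([A-Za-z0-9.-]+\.[A-Za-z]{2,})\b/

# Hypothetical helper: keeps the domain visible only for .gov.uk addresses,
# and falls back to the generic placeholder for everything else.
def redact_emails(text)
  text.gsub(EMAIL_PATTERN) do
    domain = Regexp.last_match(1).downcase
    if domain.end_with?(".gov.uk")
      "[#{domain} email address]"
    else
      "[email address]"
    end
  end
end

# Made-up addresses, for illustration only.
puts redact_emails("Contact foi.example@hmrc.gov.uk or jane@example.com.")
```

The local part is always dropped, so this would only reveal which organisation an address belongs to.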

@tmtmtmtm
Contributor

tmtmtmtm commented Feb 4, 2012

Or, just turn off this 'feature' entirely. The utility value of it is tiny compared to the hassle it creates. In the arms race of dealing with spam, hiding email addresses is an antiquated, failed method. The email addresses in question are being given out publicly, and almost certainly exist in the spammers' databases already anyway.

@hsenag
Collaborator

hsenag commented Feb 4, 2012

On 04/02/2012 19:46, Tony Bowden wrote:

Or, just turn off this 'feature' entirely. The utility value of it is tiny compared to the hassle it creates. In the arms race of dealing with spam, hiding email addresses is an antiquated, failed method. The email addresses in question are being given out publicly, and almost certainly exist in the spammers' databases already anyway.

It does mean that when people get replies saying "contact X at Y
address", it gives us an opportunity to keep the communication through
the site where we can - e.g. by telling the user that it's another
authority's normal request address, or changing the contact address for
the authority where appropriate.

Ganesh

@tmtmtmtm
Contributor

tmtmtmtm commented Feb 4, 2012

Now that we have the 'who is this to?' switching thingy we could maybe auto-add other email addresses found in the message to it. It would need to have a very basic level of smartness to cope with someone providing a list of hundreds of emails, but it could certainly do the reverse lookup type thing. In general, though, I think these examples are more just small unexpected side-benefits from something that's fatally flawed in the first place, and not really of sufficient value to justify the larger problem.

@sebbacon sebbacon closed this as completed Feb 5, 2012
@sebbacon
Contributor

sebbacon commented Feb 5, 2012

I agree with @tmtmtm about the utility of this as an anti-spam feature. So, will the "auto-add" idea do the trick WRT not encouraging off-site communication?

@sebbacon sebbacon reopened this Feb 5, 2012
@RichardTaylor
Author

I'm adding a comment as we've recently had a series of requests from users on WhatDoTheyKnow who've been asking for contact details for schools, and they've had to ask the team to provide them with uncensored versions of the responses.

There is perhaps a policy question of whether we ought to be hosting publicly accessible huge lists of email addresses for schools.

While not noted above [obviously] we still want redaction of the request specific @whatdotheyknow.com email address if it's present in any incoming or outgoing message.

We could partially address Ganesh's comment above by replacing any emails which are other authorities' request addresses with a link to the relevant body. [The request address for the current body is already specially treated and replaced by text saying it's the request address]

If email addresses in the body of messages were initially redacted, and the process for accessing the unredacted version urged users to keep correspondence on the site where possible / appropriate, that might help.
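The idea of replacing another authority's request address with a pointer to that body could look roughly like the sketch below. Everything in it is an assumption for illustration: the lookup table `KNOWN_REQUEST_ADDRESSES`, the helper `mask_with_body_links`, the placeholder wording and the example addresses and paths are hypothetical, and a real site would presumably look addresses up in its database of public bodies.

```ruby
# Hypothetical lookup table: request address => [body name, body page path].
KNOWN_REQUEST_ADDRESSES = {
  "foi@council-a.example" => ["Council A", "/body/council_a"]
}.freeze

# Deliberately simple address matcher, for illustration only.
EMAIL_RE = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/

# Known request addresses become a pointer to the body's page;
# every other address gets the generic placeholder.
def mask_with_body_links(text, known = KNOWN_REQUEST_ADDRESSES)
  text.gsub(EMAIL_RE) do |address|
    name, path = known[address.downcase]
    name ? "[#{name} request address: #{path}]" : "[email address]"
  end
end

puts mask_with_body_links("Please write to FOI@council-a.example or bob@other.example")
```

Pointing at the body's page rather than revealing the address keeps the follow-up correspondence on the site.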

@TomSteinberg

Someone needs to ask directly - what is the morality of posting huge lists
of email addresses for public institutions?

Maybe even needs a blog post.

Tom

On 11 December 2012 23:10, RichardTaylor wrote:

I'm adding a comment as we've recently had a series of requests from users
on WhatDoTheyKnow who've been asking for contact details for schools, and
they've had to ask the team to provide them with uncensored versions of the
responses.

There is perhaps a policy question of if we ought be hosting publicly
accessible huge lists of email addresses for schools.

While not noted above [obviously] we still want redaction of the request
specific @whatdotheyknow.com email address if it's present in any
incoming or outgoing message.

We could partially address Ganesh's comment above by replacing any emails
which are other authority's request addresses with a link to the relevant
body. [The request address for the current body is already specially
treated and replaced by text saying it's the request address]

If email addresses in the body of messages were initially redacted and the
process for accessing the unredacted version urged users to keep
correspondence on the site where possible / appropriate that might help



@hsenag
Collaborator

hsenag commented Oct 27, 2013

This is a recurring problem - we've taken to discouraging requests for email addresses through the site, though not very consistently yet.

Perhaps an admin-level button to unredact either a specific address or all addresses from a specific message/attachment would do.

@RichardTaylor
Author

Redacted email addresses are still prompting emails asking administrators to unredact them.

In some of the recent cases the request itself was asking for a contact email address for a particular purpose - then when the response came it was hidden.

In other cases one authority was suggesting a requestor contact another body (so detecting the address as another body's request address and pointing the user to that body's page might have helped).

Occasionally users do not understand that WhatDoTheyKnow is doing the redaction, and think the authorities have failed to provide the information they requested. (In some cases those responding on behalf of public bodies have been confused by our redaction too.)

@RichardTaylor
Author

We've got a user on WhatDoTheyKnow who's made 50 requests for email contact details

https://www.whatdotheyknow.com/admin/users/102969

They've started to ask us to unredact the responses.

It's going to take lots of volunteer admin time to unredact them. (It may even be the volunteer administrators are unable/unwilling to help due to the volume of requests and other higher priority tasks)

(I made this comment on #3033 but here looks like a better place for it)

@RichardTaylor
Author

The user mentioned above has now made 150 such requests and it does appear the volunteer team are unable/unwilling to help with the requested unredactions of the responses due to the volume of requests and other higher priority tasks.

@RichardTaylor
Author

Today WhatDoTheyKnow has been contacted by someone with a role in a national representative body who wants to use FOI to obtain contact details for a specific officer within each local council.

Without the ability to turn off email address redaction on a user or request basis we can't really help.

As this is a professional user who we're not currently able to serve I'll add the professional tag. Happy for anyone to remove it if it's out of scope for the "professional" project.

@RichardTaylor
Author

A WhatDoTheyKnow user from a major national charity has made 157 requests for the contact person for a particular type of contract in each council.

User:
https://www.whatdotheyknow.com/admin/users/127375

They've asked us to unredact an email address; we're unlikely to have the volunteer capacity to unredact 157 of them.

If we had the ability to turn off email address redaction (apart from the request address) for an individual user we would use it in this case.

@RichardTaylor
Author

Just to note another WDTK user has made a request for contact email addresses for officers with a certain responsibility to lots of bodies and is asking the WDTK admin team to unredact the responses for them.

https://www.whatdotheyknow.com/admin/users/111360

Again if we had the ability to turn off email address redaction (apart from the request address) for an individual user I suspect we would use it in this case.

@garethrees
Member

@garethrees garethrees added the f:redaction, f:request-analysis and improvement (Improves existing functionality: UI tweaks, refactoring, performance, etc) labels May 29, 2018
@mdeuk
Collaborator

mdeuk commented Aug 5, 2018

We've had a public authority get in touch over at WhatDoTheyKnow to seek the rationale behind the auto-redacted email addresses. After explaining the reasoning, the authority has intimated they may reconsider whether or not to respond to requests made via WDTK.

Being able to allow the user who makes a request direct access to this data would be incredibly useful, as there are many cases (e.g. non-FOI) where the user needs the contact details that have been provided in order to conclude business elsewhere. It would also reduce the support workload, so it would likely benefit administrators of other Alaveteli sites too.

@garethrees
Member

Being able to allow the user who makes a request direct access to this data

Do we mean "automatically" here, as in, if we're showing to a request owner we just don't apply these masks?

OR, do we mean have a button which requests access, and then an admin decides "yeah, that's sensible" and "unlocks" the message?

Turning it off per-user has also been suggested.
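To make the options above concrete, here is a rough sketch of how they might combine into one decision. It assumes, purely for illustration, that admins always see unmasked text, that a per-user switch exists, and that a request owner sees unmasked text only once an admin has unlocked the message. `Viewer`, `Message`, their fields and `mask_emails_for?` are hypothetical names and not Alaveteli's models. The request-specific address mentioned earlier in the thread would still need masking separately.

```ruby
# Hypothetical value objects standing in for the application's models.
Viewer  = Struct.new(:id, :admin, :redaction_disabled, keyword_init: true)
Message = Struct.new(:request_owner_id, :unlocked_by_admin, keyword_init: true)

# Returns true when general email masks should still be applied.
# Masking is the default; every exception is spelled out explicitly.
def mask_emails_for?(viewer, message)
  return true if viewer.nil?                 # anonymous readers get masked text
  return false if viewer.admin               # admins can already reach the raw message
  return false if viewer.redaction_disabled  # per-user switch set by an admin
  # Request owner, but only once an admin has unlocked this message.
  return false if message.unlocked_by_admin && viewer.id == message.request_owner_id
  true
end

owner  = Viewer.new(id: 1, admin: false, redaction_disabled: false)
locked = Message.new(request_owner_id: 1, unlocked_by_admin: false)
puts mask_emails_for?(owner, locked)
```

Dropping the `unlocked_by_admin` check would give the "automatic for the request owner" variant instead.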

@RichardTaylor
Author

Another case relating to the 38 requests at https://www.whatdotheyknow.com/admin/users/165418 - we've been asked today to act to provide redacted email addresses from 11 of them.

@RichardTaylor
Author

Having dealt with the specific case in the above comment I'm returning to note that even administrators don't always have easy access to unredacted messages through the web system.

Currently there's a need to download the whole raw message and open it in mail software to access an unredacted version of an attachment. If the material in question is in the plain text of a raw email it can be obtained via a message admin page, but an extra step is needed to decode it if it is base64 encoded (for the base64 issue see #1106).
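For reference, the extra decoding step can be done with core Ruby alone. This is a minimal sketch, assuming the base64 section has already been copied out of the raw email and that the decoded text is UTF-8; `decode_base64_body` is a hypothetical helper name and the sample text is made up.

```ruby
# Decode a base64-encoded MIME body using only core Ruby.
# "m" is the base64 directive for String#unpack1 / Array#pack.
def decode_base64_body(encoded)
  # Assumption: the part's charset is UTF-8; other charsets need transcoding.
  encoded.unpack1("m").force_encoding(Encoding::UTF_8)
end

# Made-up body text, encoded the way a mail client would (line-wrapped base64).
encoded = ["Please contact someone@example.org for details.\n"].pack("m")
puts decode_base64_body(encoded)
```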

@RichardTaylor
Author

Someone has written to WhatDoTheyKnow to say:

It is wrong to deny the submitter of a request whatever email addreses are sent by the supplier of information.

If the requester is logged-on to your site, they should be able to see ALL email addresses and not have to request additional, time consuming, assistance.

@MattK1234
Collaborator

Someone contacted the WhatDoTheyKnow administration team today asking for email addresses to be revealed in a request. Investigation showed the authority hadn't provided any email addresses in the first place, however the user comments:

There is something of an irony when a website dedicated to freedom of information subsequently edits (automaticaly or otherwise) information received in that pursuit. I accept such an action may be well intentioned but the effect may defeat the purpose of the original request.

I shall submit an appropriate Freedom of Information request to [Public Body]. However, as the response will be automatically edited there appears to be no value in undertaking the task via the What Do They Know website.

@RichardTaylor
Author

The WDTK admin team are being asked to unredact email addresses provided in response to requests which can currently be found via:

https://www.whatdotheyknow.com/search/%22%20Temporary%20Accommodation%20-%20Nightly%20Rates%22/all

Manual extraction and provision of the email addresses is likely to stretch, or be beyond the capacity of, the volunteer team. It's also not a great way to spend resources if a technical solution could be found.

@RichardTaylor
Author

+1 A WhatDoTheyKnow.com user wanted an email address from a correspondence thread because they considered the ICO's form for escalating cases to the ICO required it. (We do advise users that a simple email to the ICO with a link to the request on WhatDoTheyKnow will probably suffice and there's no need to complete their form).

See also: Automated appeals to ICO #2819

@RichardTaylor
Author

There have been a couple of cases on WhatDoTheyKnow recently where email address redaction has failed on PDFs; and on review we're actually happy with that as it was fair to release and publish the email addresses.

@RichardTaylor
Author

We have had further correspondence to WhatDoTheyKnow from a user unhappy with our redaction of email addresses:

...your site's interference with communication between requester and responder. Because that's what this redacting of email addresses is

This prompted me to consider why we remove non-request-address emails. WhatDoTheyKnow's public statement on this is at:

https://www.whatdotheyknow.com/help/how#email_addresses

Spam prevention is the only reason given.

The benefit of keeping correspondence on-site isn't mentioned. It is a benefit though, and the design of any system for revealing addresses without admin intervention should encourage keeping correspondence on-site - just as human admins do when corresponding with those seeking addresses.

We are also reducing the privacy impact of running our service by redacting email addresses. Spam prevention is a subset of this.

It is possible email redaction reduces the number of GDPR rights based requests compared to the number we might otherwise be getting.

If we are only seeking to keep correspondence on-site we could stop trying to redact emails from Excel attachments.

If we were concerned about misuse of large releases of email addresses for spam, we perhaps should have other policies, e.g. not hosting requests/responses for such material, and not manually providing access to unredacted versions of such files on request. Perhaps our concerns relate to automated, indiscriminate harvesting of email addresses by spammers?

@laurentS
Contributor

We are in the process of adding this behaviour to madada.fr. I've started by disabling general email censor rules for site admins (the logic being that they have access to the information anyway) but we'd probably want to make this change for requesters as well. Like others have commented here, our main use cases for showing them:

  • public bodies ask us to contact them on a different email. Accessing the given address via the admin page is time consuming, especially when we have to do this hundreds of times (we currently have a backlog of over 300 such cases). So this is mainly an efficiency gain.
  • requesters want access to the email address for a variety of reasons. Again, we save user support time.

I'd be happy to work on moving this code from our model_patches upstream, though I'm conscious that my approach is probably far from optimal. I think the assumption that emails must be censored is so deep in the codebase that undoing it requires adjusting a bunch of methods. Also, given our current main use case, I've just looked at uncensoring the main email body, nothing more, but if I were to upstream this change, it should be easier to apply it throughout attachments.

@garethrees
Member

garethrees commented Aug 4, 2023

Thanks @laurentS, I think an addition like this – for admins only as a first step – would be great progress!

I think it would be clearer to just not apply any masks in the case of an admin user. If you're an admin and want to get a user's view, you could then just open the request in a private browsing window to see what they see [1]. I'd also retain the default of applying masks and be explicit in disabling them, as that's "safer" behaviour for calling this method elsewhere.

<% if !@user.nil? && @user.is_admin? %>
  <p class="unmasked_message_body_notice">UNMASKED VERSION</p>
  <%= incoming_message.get_body_for_html_display(@collapse_quotes, mask: false) %>
<% else %>
  <%# Use the implicit default option of `mask: true` %>
  <%= incoming_message.get_body_for_html_display(@collapse_quotes) %>
<% end %>

I think the above approach would reduce some of the customisation required. I'd probably do something like this:

def get_body_for_html_display(collapse_quoted_sections = true, mask: true)
  if mask
    # Default: use the existing (masked) accessors
    text = get_main_body_text_unfolded
    folded_quoted_text = get_main_body_text_folded
  else
    # Explicit opt-out: use the new *_unmasked accessors
    text = get_main_body_text_unfolded_unmasked
    folded_quoted_text = get_main_body_text_folded_unmasked
  end

  # current implementation…
end

That should then minimise disruption to existing methods, and just add the extra *_unmasked versions for when we're displaying to admins.
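As a standalone illustration of the "mask by default, opt out explicitly" pattern described above, here is a small runnable sketch. `SketchMessage`, `body_for_display` and the regular expression are hypothetical stand-ins and are much simpler than Alaveteli's `IncomingMessage` and its censor rules.

```ruby
# Hypothetical stand-in for a message model; not Alaveteli's IncomingMessage.
class SketchMessage
  # Deliberately simple address matcher, for illustration only.
  EMAIL_RE = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/

  def initialize(raw_body)
    @raw_body = raw_body
  end

  # Masked by default; callers must opt out explicitly with `mask: false`.
  def body_for_display(mask: true)
    mask ? @raw_body.gsub(EMAIL_RE, "[email address]") : @raw_body
  end
end

message = SketchMessage.new("Reply to officer@council.example please.")
puts message.body_for_display               # what ordinary users would see
puts message.body_for_display(mask: false)  # what an admin view could request
```

Because the keyword defaults to `true`, any existing caller that does not pass `mask:` keeps the current masked behaviour.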

Footnotes

  1. With the exception of reduced prominence requests, but that's an issue as it is so we're not losing anything here.

@HelenWDTK HelenWDTK added the stale Issues with no activity for 12 months label Nov 19, 2024
@HelenWDTK
Contributor

This issue has been automatically closed due to a lack of discussion or resolution for over 12 months.
Should we decide to revisit this issue in the future, it can be reopened.

@HelenWDTK HelenWDTK closed this as not planned Won't fix, can't repro, duplicate, stale Nov 19, 2024