Data collection improvements to enable better transparency reporting #938

Closed
3 tasks
RichardTaylor opened this issue Dec 1, 2021 · 14 comments
Labels
administrative-task non-developer Tasks suitable for non-developers

Comments

@RichardTaylor

Work on: "Manual Transparency Report for 2021 Annual Report" #910 threw up some areas where better data collection would be desirable:

  • Improved record keeping for how cases are closed on the GDPR spreadsheet. This is mainly a policy/practice point on the use of the existing "Decision" and "Erased?" fields.
    • Maybe change "Decision" to "Final decision"?
    • Consider an additional option for the "Erased?" field for cases where some material has been removed.
    • Maybe we don't need both a "Decision" and "Erased" field, but rather just the "Final Decision" field, and the "Done" field records if the decision has been actioned?
  • Count edits to outgoing message bodies as censor/redaction/takedown events. Consider also potential double counting: there may be edits to outgoing messages which have subsequently been removed from public view. See Request for Stats for the WDTK Transparency Report 2021 - 2 #922
  • Work out why there are hide events with no recorded reason, as noted at Request for Stats for the WDTK Transparency Report 2021 - 2 #922 (comment). (Would the status of the request not help us fill in the gaps here?)
    • Do we need a system for hiding material without notifying users? (This is one situation in which admins might not use the canned responses in the system, which I suspect are related to the recorded hide events / reasons.)
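
A minimal sketch of the double-counting concern in the list above, in Python. It assumes each outgoing message can be identified by an ID and that we can list which messages were edited and which were later hidden; the function name and inputs are hypothetical and do not reflect how Alaveteli stores events.

```python
def count_redaction_events(edited_message_ids, hidden_message_ids):
    """Count takedown-style events without counting a message twice.

    A message that was edited and later hidden from public view is
    counted once, under "hidden", rather than under both headings.
    """
    edited = set(edited_message_ids)
    hidden = set(hidden_message_ids)
    return {
        "hidden": len(hidden),
        "edited_only": len(edited - hidden),  # edited but still public
        "total": len(edited | hidden),        # each message counted once
    }


# Illustrative IDs only: message 3 was edited and then hidden.
counts = count_redaction_events([1, 2, 3], [3, 4])
```

Reporting "edited_only" separately keeps the edit figure meaningful even when some edited messages were subsequently removed.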
@RichardTaylor
Author

  • Consistent labelling of on-site correspondence threads, and/or inbox threads, where the substance of an FOI request/response has been materially impacted by a takedown.
  • More specific labelling of ICO correspondence in the support mailbox to enable easier identification of complaints about us to the ICO, and their outcomes.
  • More specific labelling of support correspondence involving takedown requests from public bodies, and specifically and separately identifying those from the police / law enforcement. (Could consider expanding the GDPR tracking spreadsheet to track all substantive takedown requests, including those not relating to personal data?).
  • More specific labelling of support correspondence involving requests for user data, and specifically and separately identifying those from the police / law enforcement.

@mdeuk
Collaborator

mdeuk commented Dec 1, 2021

Work on: "Manual Transparency Report for 2021 Annual Report" #910 threw up some areas where better data collection would be desirable:

* [ ]  Improved record keeping for how cases are closed on the GDPR spreadsheet. This is mainly a policy/practice point on the use of the existing "Decision" and "Erased?" fields.
  
  * Maybe change "Decision" to "Final decision"?
  * Consider an additional option for the "Erased?" field for cases where some material has been removed.
  * Maybe we don't need both a "Decision" and "Erased" field, but rather just the "Final Decision" field, and the "Done" field records if the decision has been actioned?

For this to work correctly, you'd need a categorisation (which we already do, in the form of selecting the case type), and then a closure reason (e.g. Resolved - comply), followed by a sub-category which confirms what we've done. If we wished to be precise, two sub-categories may be best.

That could look like:

Category: Erasure (Art 17)
Closure reason: Resolved
Closure reason 1: Comply in full
Closure reason 2: All data removed

That's a very early suggestion, and isn't a final answer! I'd like to give some thought to how we balance the need for improvement, against the need to reduce complexity - automating things would likely help.
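
As a rough illustration of the category / closure reason / two sub-category structure suggested above, here is a small Python sketch. The class and field names are hypothetical, and the "Comply in part" and "Some data removed" values are invented examples rather than agreed options.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class CaseClosure:
    category: str        # e.g. "Erasure (Art 17)"
    closure_reason: str  # e.g. "Resolved"
    sub_reason_1: str    # e.g. "Comply in full"
    sub_reason_2: str    # e.g. "All data removed"


def summarise(cases):
    """Count closed cases by their full closure classification."""
    return Counter(
        (c.category, c.closure_reason, c.sub_reason_1, c.sub_reason_2)
        for c in cases
    )


cases = [
    CaseClosure("Erasure (Art 17)", "Resolved", "Comply in full", "All data removed"),
    CaseClosure("Erasure (Art 17)", "Resolved", "Comply in full", "All data removed"),
    # Hypothetical values for a partial removal:
    CaseClosure("Erasure (Art 17)", "Resolved", "Comply in part", "Some data removed"),
]
summary = summarise(cases)
```

Keeping each level as a separate field means a report can group by any one of them, which is the kind of breakdown the transparency report needs.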

The existing setup is heavily bodged from what was there in the beginning, so it doesn't really do what we always need it to do. We do have an issue of metadata overall, and a lack of consistency in terms of audit logs, which is a key thing to have when handling these cases.

Linking to mysociety/whatdotheyknow-private#239 and mysociety/whatdotheyknow-private#238

  • More specific labelling of support correspondence involving requests for user data, and specifically and separately identifying those from the police / law enforcement.

Agreed. The tracker should be set up to track these types of cases (which we've been categorising with code 'LG') - along with service complaints, as they fall under broadly the same handling mechanism. @sallytay do you have any thoughts on this?

Being able to log these consistently will help considerably with our records management and compliance mechanisms, as we'll have everything on a system that we can run reports against so that everything is kept on track.

@mdeuk mdeuk added the non-developer Tasks suitable for non-developers label Dec 1, 2021
@RichardTaylor
Author

The existing setup is heavily bodged from what was there in the beginning

We could go for a fresh start, a new sheet, possibly with a wider scope to cover all takedown requests, requests for user data and complaints?

@mdeuk
Collaborator

mdeuk commented Dec 2, 2021

The existing setup is heavily bodged from what was there in the beginning

We could go for a fresh start, a new sheet, possibly with a wider scope to cover all takedown requests, requests for user data and complaints?

Possibly, but we need to think carefully about that.

I have an idea of sorts, though I do need to flesh it out a bit…

@RichardTaylor
Author

Suggestion from report for data moving forward from @mdeuk

Could we perhaps collect some metadata within Alaveteli when generating a ban, e.g. similar to how we set a prominence reason on a request (a dropdown of pre-defined options, then a freeform text box)?

This might allow us to automate production of this statistic with a degree of certainty.

Originally posted by @sallytay in #925 (comment)
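
To make the "dropdown plus freeform text" idea concrete, a minimal Python sketch follows. Alaveteli itself is a Ruby on Rails application, so this is only a language-neutral illustration: the `BanReason` options, the `BanRecord` fields and the function name are all hypothetical and are not part of Alaveteli.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class BanReason(Enum):
    # Hypothetical pre-defined options for the dropdown.
    SPAM = "spam"
    ABUSE = "abuse"
    OTHER = "other"


@dataclass
class BanRecord:
    user_id: int
    reason: BanReason  # chosen from the pre-defined list
    note: str = ""     # freeform detail entered by the admin


def bans_by_reason(records):
    """Produce the per-reason ban counts for a report."""
    return Counter(record.reason for record in records)


records = [
    BanRecord(1, BanReason.SPAM),
    BanRecord(2, BanReason.SPAM, note="Advertising links in requests"),
    BanRecord(3, BanReason.OTHER, note="See support thread"),
]
ban_counts = bans_by_reason(records)
```

Because the reason comes from a fixed list, the count can be produced automatically, while the freeform note preserves detail that does not fit a category.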

@RichardTaylor
Author

On the subject of better data on why censor rules were put in place:

mysociety/alaveteli#6487
mysociety/alaveteli#4626

@sallytay
Contributor

sallytay commented Dec 8, 2021

My thoughts on this:
More specific labelling of support correspondence involving requests for user data, and specifically and separately identifying those from the police / law enforcement.
Agreed. The tracker should be set up to track these types of cases (which we've been categorising with code 'LG') - along with service complaints, as they fall under broadly the same handling mechanism.
Being able to log these consistently will help considerably with our records management and compliance mechanisms, as we'll have everything on a system that we can run reports against so that everything is kept on track.

Yes, I agree it would be good to track these. I like the Police Request label in the inbox - it might also be good to have a specific Police Request for User Data label, to ensure we are not capturing other police requests at the same time. We could then also add a Request for User Data label for other requests not made by the police?

As the number of cases is pretty low I'm happy to set up a basic tracker, and don't mind taking on the responsibility to log them so we can keep track of the outcomes as well. I'm not sure it will be as sophisticated as the GDPR tracker, but it would definitely give us the data that was needed for the Transparency Report.

Sally

@sallytay
Contributor

sallytay commented Dec 8, 2021

ICO Correspondence Data:

Clearer labelling within the inbox is needed, but we also can't use the thread count for accurate numbers, as the ICO casework system doesn't seem to use threading.

My plan is to set up a basic tracking spreadsheet that would record cases we report to the ICO, along with any instances where we've been reported to the ICO. Again, I'm happy to pick up the admin burden of this, as ultimately it will save me time when doing next year's Transparency Report and may prove useful throughout the year.

Spreadsheet Content would be along the lines of:
Date Sent to ICO
Date Response Received
ICO case reference number
Who we have reported
Outcome (to include link to decision notice if there is one)
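
For illustration, the columns listed above could be expressed as a simple CSV layout. This Python sketch uses only the standard library; the function names are hypothetical, the sample rows are placeholders rather than real cases, and the "Outcome" column is assumed to hold free text including any decision notice link.

```python
import csv
import io

# Column names taken from the list above.
FIELDS = [
    "Date Sent to ICO",
    "Date Response Received",
    "ICO case reference number",
    "Who we have reported",
    "Outcome",
]


def write_tracker(rows):
    """Serialise tracker rows to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def open_cases(rows):
    """Return rows where no ICO response has been recorded yet."""
    return [row for row in rows if not row["Date Response Received"]]


# Placeholder rows for demonstration only.
rows = [
    dict(zip(FIELDS, ["2022-01-10", "2022-03-01", "REF-A", "Example Body A", "Decision notice issued"])),
    dict(zip(FIELDS, ["2022-02-14", "", "REF-B", "Example Body B", ""])),
]
tracker_csv = write_tracker(rows)
awaiting = open_cases(rows)
```

A helper like `open_cases` shows how a flat log of this shape can also answer "what are we still waiting on?" during the year, not just at report time.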

I've added this to my next sprint; you can then feed back on it, and it can be started in the new year to make sure we have a good set of data for 2022.

Sally

@sallytay
Contributor

Update:

I've made two very basic spreadsheets to help keep a log of ICO referrals by us and police requests for information.
https://drive.google.com/drive/folders/1_lrkmO_kRVCh2quNDy0DMrm7UOQpUSEw

I don't think they need to be any more than this at the moment but any suggestions welcomed.

I'm happy to take responsibility for logging cases, to relieve the admin burden, but obviously anyone can add to them.

Next step is to look at the inbox labels and then to work through the other suggestions on this ticket.

Sally

@sallytay
Contributor

I'm in the process of breaking these down into separate tickets for different types of tasks. All suggestions from this ticket will be added to the new tickets.

Data Collection Improvements for Transparency Report 2022 - Support Inbox Labelling #972

There will also be:
GDPR Spreadsheet Improvements
System data collection tickets

Sally

@sallytay
Contributor

GDPR Improvements transferred to ticket #974

@sallytay
Contributor

System data collection now transferred to a new ticket #975

@mdeuk
Collaborator

mdeuk commented Jun 6, 2023

Related:

  • mysociety/whatdotheyknow-private#60
  • mysociety/whatdotheyknow-private#204

@HelenWDTK
Contributor

Closing this, as a lot of this has been implemented and tracker issues are logged elsewhere. Specific issues relating to the 2023 report can be noted on #1536


4 participants