Calculated homepage generating data quality issues #690

Closed
mdeuk opened this issue May 30, 2020 · 1 comment

Comments

@mdeuk
Collaborator

mdeuk commented May 30, 2020

A WhatDoTheyKnow user has tweeted:

Lots of orgs. in this data have websites listed as Hotmail, Gmail, BTInternet etc. (by which I mean the mail provider's own website, not a subdomain), e.g.

https://www.whatdotheyknow.com/body/poppleton_road_primary_school_york
another is:

https://www.whatdotheyknow.com/body/plover_primary_school_doncaster

This appears to be a data quality issue caused by public body website addresses being derived from the 'calculated homepage' (i.e. guessed from the domain of the recorded email address).
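To illustrate how this kind of derivation goes wrong, here is a minimal Python sketch. It is not the project's actual code: the function name `calculated_homepage`, the `https://www.` prefix and the example addresses are all assumptions made for illustration.

```python
from typing import Optional


def calculated_homepage(request_email: str) -> Optional[str]:
    """Guess a homepage URL from the domain of an email address.

    Hypothetical helper for illustration only; the real derivation
    in the application may differ in its details.
    """
    _, sep, domain = request_email.strip().rpartition("@")
    if not sep or not domain:
        # No usable domain, so there is nothing to guess from.
        return None
    return "https://www." + domain.lower()


# A body with its own domain gets a plausible guess...
print(calculated_homepage("head@exampleschool.sch.uk"))
# ...but a body using a free mail provider gets the provider's own site.
print(calculated_homepage("exampleschool@hotmail.com"))
```

Any body whose recorded address is on a shared mail provider ends up with that provider's website as its guessed homepage.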

Looking at an export from a few weeks back, and doing some mental maths based on a list of popular free/ISP email domains, we have some 2,211 bodies where this may be an issue.

  • 22 bodies with sky.com
  • 38 bodies with blueyonder.co.uk / virgin.net
  • 113 bodies with talktalk.net / talktalkbusiness.net / tiscali.co.uk
  • 115 bodies with aol.com
  • 196 bodies with yahoo.com / yahoo.co.uk / ymail.com
  • 426 bodies with hotmail.com / hotmail.co.uk / live.co.uk
  • 570 bodies with gmail.com / googlemail.com
  • 731 bodies with btclick.com / btconnect.com / btinternet.com / btopenworld.com / talk21.com
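For anyone wanting to reproduce this tally against a fresh export, a rough Python sketch follows. The domain groupings are taken from the list above; the group labels, the function names and the idea of feeding in a plain list of request email addresses are assumptions, as the export format isn't described here.

```python
from collections import Counter

# Domain groups as listed above; the group labels are arbitrary.
PROVIDER_GROUPS = {
    "sky": {"sky.com"},
    "virgin": {"blueyonder.co.uk", "virgin.net"},
    "talktalk": {"talktalk.net", "talktalkbusiness.net", "tiscali.co.uk"},
    "aol": {"aol.com"},
    "yahoo": {"yahoo.com", "yahoo.co.uk", "ymail.com"},
    "hotmail": {"hotmail.com", "hotmail.co.uk", "live.co.uk"},
    "gmail": {"gmail.com", "googlemail.com"},
    "bt": {"btclick.com", "btconnect.com", "btinternet.com",
           "btopenworld.com", "talk21.com"},
}

# Reverse lookup: domain -> group label.
DOMAIN_TO_GROUP = {
    domain: group
    for group, domains in PROVIDER_GROUPS.items()
    for domain in domains
}


def email_domain(address):
    """Return the lower-cased domain of an address, or '' if there is none."""
    _, sep, domain = address.strip().rpartition("@")
    return domain.lower() if sep else ""


def count_by_provider(emails):
    """Count addresses per provider group, ignoring all other domains."""
    counts = Counter()
    for address in emails:
        group = DOMAIN_TO_GROUP.get(email_domain(address))
        if group is not None:
            counts[group] += 1
    return counts


# Made-up sample addresses, not real data from the export.
sample = ["a@gmail.com", "b@GoogleMail.com", "c@york.gov.uk", "d@btinternet.com"]
counts = count_by_provider(sample)
print(counts, sum(counts.values()))
```

Matching is on the exact domain only, so subdomains of these providers would not be counted.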

There are likely to be some instances where the body is defunct, but I haven't yet explored this possibility.

@HelenWDTK
Contributor

Closing in favour of mysociety/alaveteli#6434
