Request for Stats for the WDTK Transparency Report 2021 - 5 #925
Comments
There's a user count at |
In addition to the above user information, would it also be possible to get the following: total number of users banned for site misuse (not including those banned for spamming)? |
I've added in the total number of users using the link suggested: https://www.whatdotheyknow.com/admin/stats However, this is a 'real time' figure, whereas the rest of the data is being taken from 1 November 2020 to 31 October 2021. Should we have the figure as at 31 October, if that's possible? |
The data should ideally be from the correct point in time, of course. I suspect that whether or not spam accounts are included in the user count would introduce a more significant error than a couple of weeks' delay in the count. |
from_date = Time.zone.parse('2020-11-01').at_beginning_of_day
to_date = Time.zone.parse('2021-10-31').at_end_of_day
period = from_date..to_date
users_created_before_cutoff = User.where('created_at <= ?', to_date)
users_created_in_period = User.where(created_at: period)
users_updated_in_period = User.where(updated_at: period)
# Total number of users created before the end date
users_created_before_cutoff.count
# => 241329
# Total number of users created before the end date who've confirmed their email
users_created_before_cutoff.where(email_confirmed: true).count
# => 222694
# Total number of users created before the end date who are still active
# i.e. confirmed their email, have not been banned, and have not closed
# their account.
users_created_before_cutoff.active.count
# => 212553
# ---
# Total number of users created within the given period.
users_created_in_period.count
# => 26405
# Number of users created within the given period who are still active
# i.e. confirmed their email, have not been banned, and have not closed
# their account.
users_created_in_period.active.count
# => 22847
# Number of users created within the given period who are marked
# as banned for spamming. (The query below filters on ban_text only;
# it does not check email_confirmed.)
#
# Note that these are *identified* spammers; there are likely to be
# significantly more in reality
users_created_in_period.where(ban_text: 'Banned for spamming').count
# => 3392
# Number of users created within the given period who have been banned
# for some other reason than obvious spam.
users_created_in_period.
  where.not(ban_text: '').
  where.not(ban_text: 'Banned for spamming').
  count
# => 126
# Number of users created within the given period who have been
# anonymised
users_created_in_period.where(name: '[Name Removed]').count
# => 43
# ---
# Number of users where the last update was in the period who are marked
# as banned for spamming.
#
# Note that these are *identified* spammers; there are likely to be
# significantly more in reality
users_updated_in_period.where(ban_text: 'Banned for spamming').count
# => 3936 (EDIT: initially recorded here as 3392, but I must have mistakenly copied the stat from the users_created_in_period version)
# Number of users where the last update was in the period who have been
# banned for some other reason than obvious spam.
users_updated_in_period.
  where.not(ban_text: '').
  where.not(ban_text: 'Banned for spamming').
  count
# => 166
# Number of users where the last update was in the period who have been
# anonymised
users_updated_in_period.where(name: '[Name Removed]').count
# => 127 |
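For anyone wanting to sanity-check how the categories above relate to each other without a Rails console, here is a minimal plain-Ruby sketch of the same filtering logic. It is an illustration only: `SketchUser`, `tally_users`, and the sample records are hypothetical stand-ins, not Alaveteli code, and the figures it produces have nothing to do with the real stats quoted above. It assumes, as the queries do, that a blank `ban_text` means "not banned".

```ruby
# Hypothetical in-memory stand-in for the User model (illustration only).
SketchUser = Struct.new(:name, :ban_text, :created_at, keyword_init: true)

SPAM_BAN_TEXT = 'Banned for spamming'
ANONYMISED_NAME = '[Name Removed]'

# Tally users created within `period` into the categories used above.
# A nil or empty ban_text is treated as "not banned", mirroring
# where.not(ban_text: ''), which also excludes NULL values in SQL.
def tally_users(users, period)
  in_period = users.select { |user| period.cover?(user.created_at) }
  banned = in_period.reject { |user| user.ban_text.to_s.empty? }
  spam, other = banned.partition { |user| user.ban_text == SPAM_BAN_TEXT }

  {
    created: in_period.size,
    banned_for_spam: spam.size,
    banned_other: other.size,
    anonymised: in_period.count { |user| user.name == ANONYMISED_NAME }
  }
end

# Same reporting window as the queries above, expressed in UTC for simplicity.
SAMPLE_PERIOD = (Time.utc(2020, 11, 1)..Time.utc(2021, 10, 31, 23, 59, 59)).freeze

# Made-up sample records; the last one falls outside the period.
SAMPLE_USERS = [
  SketchUser.new(name: 'A', ban_text: '', created_at: Time.utc(2021, 1, 5)),
  SketchUser.new(name: 'B', ban_text: SPAM_BAN_TEXT, created_at: Time.utc(2021, 2, 1)),
  SketchUser.new(name: 'C', ban_text: 'Vexatious use', created_at: Time.utc(2021, 3, 1)),
  SketchUser.new(name: ANONYMISED_NAME, ban_text: nil, created_at: Time.utc(2021, 4, 1)),
  SketchUser.new(name: 'E', ban_text: SPAM_BAN_TEXT, created_at: Time.utc(2019, 6, 1))
].freeze

puts tally_users(SAMPLE_USERS, SAMPLE_PERIOD).inspect
```

Because "banned for spam" and "banned for another reason" are produced by partitioning the same banned set, the two counts cannot overlap, which is the property the pair of `where.not` calls relies on.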
I've now added these figures to the draft report |
Suggestion from @mdeuk, arising from the report, for collecting this data going forward: could we perhaps collect some metadata within Alaveteli when generating a ban, e.g. similar to how we set a prominence reason on a request (a dropdown of pre-defined options, then a free-form text box)? This might allow us to automate production of this statistic with a degree of certainty. |
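To make the suggestion a little more concrete, a rough plain-Ruby sketch of the idea follows: a fixed set of ban reasons plus an optional free-form note, so that counts per reason can be produced without matching on `ban_text` strings. Everything here is hypothetical (`BAN_REASONS`, `BanRecord`, `build_ban`, and `bans_by_reason` are invented names, and the reason keys are examples only); it is not how Alaveteli currently records bans.

```ruby
# Hypothetical pre-defined reasons, as might be offered in a dropdown.
BAN_REASONS = {
  'spam' => 'Banned for spamming',
  'misuse' => 'Banned for site misuse',
  'other' => 'Other'
}.freeze

# A ban pairs one pre-defined reason with an optional free-form note.
BanRecord = Struct.new(:reason, :note, keyword_init: true)

# Build a ban, rejecting any reason that is not in the pre-defined list.
def build_ban(reason:, note: '')
  unless BAN_REASONS.key?(reason)
    raise ArgumentError, "unknown ban reason: #{reason}"
  end

  BanRecord.new(reason: reason, note: note)
end

# Count bans per reason, including reasons that have no bans yet.
def bans_by_reason(bans)
  counts = BAN_REASONS.keys.map { |key| [key, 0] }.to_h
  bans.each { |ban| counts[ban.reason] += 1 }
  counts
end

example_bans = [
  build_ban(reason: 'spam'),
  build_ban(reason: 'spam', note: 'Link farm in profile text'),
  build_ban(reason: 'misuse', note: 'Repeated abusive requests')
]

puts bans_by_reason(example_bans).inspect
```

Restricting the reason to a closed list is what would make the statistic reliable; the free-form note stays available for context but is never used for counting.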
The published Transparency report can be found at https://www.mysociety.org/2021/12/16/whatdotheyknow-transparency-report/ |
Is it possible to get the stats below from the site for the period 1 November 2020 - 31 October 2021?
User Information
Total number of WDTK users
Number of new users
Number of banned users
Number of user accounts anonymised at user request in 2021
Deadlines are:
The annual report is scheduled to go out on 16 December
Design is scheduled to be completed by 9 December
Ideally this means that copy should be ready by 2 December
Linked to:
#910
Sally