2022 Moderation Report - Data Collection 5 (Users) #1498
garethrees changed the title from "2022 Moderation Report - Data Collection 5" to "2022 Moderation Report - Data Collection 5 (Users)" on Nov 14, 2022
User accounts
require 'csv'
from_date = Time.zone.parse('2021-11-01').at_beginning_of_day
to_date = Time.zone.parse('2022-10-31').at_end_of_day
period = from_date..to_date
users_created_before_cutoff = User.where('created_at <= ?', to_date)
users_created_in_period = User.where(created_at: period)
# Total number of users created before the end date who've confirmed their email
active_total = users_created_before_cutoff.where(email_confirmed: true).count
# Number of users created within the given period who are still active
# i.e. confirmed their email, have not been banned, and have not closed
# their account.
new_in_period = users_created_in_period.active.count
csv = CSV.generate do |out|
  out << ['User accounts', 'Total']
  out << [
    'WhatDoTheyKnow users with activated accounts', active_total
  ]
  out << [
    "New user accounts activated in #{to_date.year}", new_in_period
  ]
end
puts csv
# User accounts,Total
# WhatDoTheyKnow users with activated accounts,239540
# New user accounts activated in 2022,16217
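The three scripts in this thread repeat the same CSV-building pattern. As a minimal sketch, assuming nothing beyond Ruby's standard `csv` library, the pattern could be pulled into a small helper. `report_csv` is a hypothetical name and is not part of the original scripts; the figures are copied from the output above.

```ruby
require 'csv'

# Hypothetical helper (not in the original scripts): builds a two-column CSV
# string from a header pair and a list of [label, value] rows.
def report_csv(header, rows)
  CSV.generate do |out|
    out << header
    rows.each { |row| out << row }
  end
end

# Figures copied from the output posted above.
user_report = report_csv(
  ['User accounts', 'Total'],
  [
    ['WhatDoTheyKnow users with activated accounts', 239_540],
    ['New user accounts activated in 2022', 16_217]
  ]
)

puts user_report
```

In the Rails console the literal numbers would be replaced by the query results, for example `active_total` and `new_in_period`.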
Banned users
require 'csv'
from_date = Time.zone.parse('2021-11-01').at_beginning_of_day
to_date = Time.zone.parse('2022-10-31').at_end_of_day
period = from_date..to_date
users_updated_in_period = User.where(updated_at: period)
# Number of users where the last update was in the period who are marked
# as banned for spamming.
#
# Note that these are *identified* spammers; there are likely to be
# significantly more in reality.
banned_spam =
  users_updated_in_period.where(ban_text: 'Banned for spamming').count
# Number of users where the last update was in the period who have been
# banned for some other reason than obvious spam.
banned_other =
  users_updated_in_period.
    where.not(ban_text: '').
    where.not(ban_text: 'Banned for spamming').
    count
total_banned = banned_spam + banned_other
csv = CSV.generate do |out|
  out << ["Reason for banning users in #{to_date.year}", 'Total']
  out << [
    'Spam', banned_spam
  ]
  out << [
    'Other site misuse', banned_other
  ]
  out << [
    "Total number of users banned in #{to_date.year}", total_banned
  ]
end
puts csv
# Reason for banning users in 2022,Total
# Spam,2160
# Other site misuse,300
# Total number of users banned in 2022,2460
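The split between "Spam" and "Other site misuse" above depends only on the value of `ban_text`. The following plain-Ruby sketch mirrors that rule on made-up sample values so the categories are easy to check without a database. `ban_category` and the sample strings are hypothetical; a missing value is treated like an empty string, which matches how a SQL `!=` comparison excludes NULL.

```ruby
SPAM_TEXT = 'Banned for spamming'

# Hypothetical classifier mirroring the queries above: no ban text means the
# user is not banned, the exact spam message means :spam, anything else :other.
def ban_category(ban_text)
  return nil if ban_text.nil? || ban_text.empty?

  ban_text == SPAM_TEXT ? :spam : :other
end

# Made-up sample values, for illustration only.
ban_texts = [
  'Banned for spamming',
  '',
  nil,
  'Vexatious requests',
  'Banned for spamming'
]

# Drop unbanned users, then count each category.
counts = ban_texts.map { |text| ban_category(text) }.compact.tally
total_banned = counts.values.sum

puts "Spam: #{counts[:spam]}, other: #{counts[:other]}, total: #{total_banned}"
```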
Anonymised users
require 'csv'
from_date = Time.zone.parse('2021-11-01').at_beginning_of_day
to_date = Time.zone.parse('2022-10-31').at_end_of_day
period = from_date..to_date
users_created_in_period = User.where(created_at: period)
users_updated_in_period = User.where(updated_at: period)
# Number of users created within the given period who have been
# anonymised
created_anon = users_created_in_period.where(name: '[Name Removed]').count
# Number of users where the last update was in the period who have been
# anonymised
updated_anon = users_updated_in_period.where(name: '[Name Removed]').count
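# NOTE: a user both created and last updated within the period matches both
# queries, so this sum can count the same account twice.
total_anonymised = created_anon + updated_anon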
csv = CSV.generate do |out|
  out << ['Anonymisation*', 'Total']
  out << [
    "Accounts anonymised in #{to_date.year}", total_anonymised
  ]
end
puts csv
# Anonymisation*,Total
# Accounts anonymised in 2022,139
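Adding `created_anon` and `updated_anon` can count one account twice when it was both created and last updated inside the period. The sketch below illustrates the difference between summing the two counts and counting distinct accounts, using plain Ruby structs and made-up records rather than the real `User` model. `Account` and all sample data are hypothetical.

```ruby
# Hypothetical stand-in for a user record; the real scripts query ActiveRecord.
Account = Struct.new(:id, :name, :created_at, :updated_at)

period = Time.utc(2021, 11, 1)..Time.utc(2022, 10, 31, 23, 59, 59)

# Made-up sample records, for illustration only.
accounts = [
  # Created and last updated inside the period: matches both conditions.
  Account.new(1, '[Name Removed]', Time.utc(2022, 1, 5), Time.utc(2022, 3, 1)),
  # Created before the period, updated inside it: matches only the second.
  Account.new(2, '[Name Removed]', Time.utc(2019, 6, 1), Time.utc(2022, 2, 2)),
  # Not anonymised, so it is ignored.
  Account.new(3, 'Alice Example', Time.utc(2022, 4, 4), Time.utc(2022, 4, 4))
]

anonymised = accounts.select { |a| a.name == '[Name Removed]' }
created = anonymised.select { |a| period.cover?(a.created_at) }
updated = anonymised.select { |a| period.cover?(a.updated_at) }

summed = created.size + updated.size  # account 1 is counted twice
distinct = (created | updated).size   # array union counts each account once

puts "summed: #{summed}, distinct: #{distinct}"
```

In the Rails console, one way to get the distinct figure would be to combine the two scopes with ActiveRecord's `or`, for example `User.where(created_at: period).or(User.where(updated_at: period))`, before filtering on the name. Whether the posted total is affected depends on the live data.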
Can we get stats from the site for the items below, covering the period 1 November 2021 - 31 October 2022?
User Information
Total number of WDTK users
Number of new users
Number of banned users
Number of user accounts anonymised at user's request in 2022
Data collection as per last year's ticket, #925
Deadlines are:
Data collection by 24 November
The report is scheduled to be compiled week beginning 27 November
The annual report is scheduled to go out on 15 December