2022 Moderation Report - Data Collection 5 (Users) #1498

Closed
4 tasks done
sallytay opened this issue Nov 9, 2022 · 3 comments
sallytay commented Nov 9, 2022

Can we get stats from the site for the items below, covering the period 1 November 2021 - 31 October 2022?
User Information

  • Total number of WDTK users

  • Number of new users

  • Number of banned users

  • Number of user accounts anonymised at user's request in 2022

Data collection as per last year's ticket #925

Deadlines are:
Data collection by 24 November
The report is scheduled to be compiled week beginning 27 November
The annual report is scheduled to go out on 15 December
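The reporting period in the request runs across two calendar years, which matters when figures are later labelled "in 2022". A minimal plain-Ruby sketch of that period, assuming inclusive bounds at both ends; `in_period?` is a hypothetical helper used only for illustration:

```ruby
require 'date'

# Reporting period from the request, inclusive at both ends.
from_date = Date.new(2021, 11, 1)
to_date = Date.new(2022, 10, 31)
period = from_date..to_date

# Hypothetical helper: does a date fall inside the reporting period?
def in_period?(period, date)
  period.cover?(date)
end

# Number of calendar days covered, counting both end dates.
days_in_period = (to_date - from_date).to_i + 1

puts days_in_period
puts in_period?(period, Date.new(2021, 12, 25)) # a 2021 date inside the period
puts in_period?(period, Date.new(2022, 11, 1))  # a 2022 date outside the period
```

The last two checks illustrate that the period includes the final two months of 2021 and excludes the final two months of 2022.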

garethrees changed the title from "2022 Moderation Report - Data Collection 5" to "2022 Moderation Report - Data Collection 5 (Users)" on Nov 14, 2022
@garethrees
Member

User accounts

require 'csv'

from_date = Time.zone.parse('2021-11-01').at_beginning_of_day
to_date = Time.zone.parse('2022-10-31').at_end_of_day
period = from_date..to_date

users_created_before_cutoff = User.where('created_at <= ?', to_date)
users_created_in_period = User.where(created_at: period)

# Total number of users created on or before the end date who've confirmed
# their email. No filter on banned or closed accounts is applied here.
active_total = users_created_before_cutoff.where(email_confirmed: true).count

# Number of users created within the given period who are still active
# i.e. confirmed their email, have not been banned, and have not closed
# their account.
new_in_period = users_created_in_period.active.count

# Build the report; the result is named separately from the block argument
# so the two are not confused.
report = CSV.generate do |csv|
  csv << ['User accounts', 'Total']
  csv << ['WhatDoTheyKnow users with activated accounts', active_total]
  csv << ["New user accounts activated in #{to_date.year}", new_in_period]
end

puts report
# User accounts,Total
# WhatDoTheyKnow users with activated accounts,239540
# New user accounts activated in 2022,16217
| User accounts | Total |
| --- | --- |
| WhatDoTheyKnow users with activated accounts | 239540 |
| New user accounts activated in 2022 | 16217 |
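The two figures above use different filters: the total checks only email confirmation, while the new-user count relies on the `active` scope (confirmed, not banned, not closed, per the comment in the script). A minimal plain-Ruby sketch of that difference on made-up in-memory records; `UserRecord`, its fields, and `active?` are assumptions for illustration and not the application's schema or scope definition:

```ruby
# Hypothetical stand-in for a user row; field names are illustrative only.
UserRecord = Struct.new(:created_at, :email_confirmed, :banned, :closed,
                        keyword_init: true)

# Assumed reading of "active": confirmed, not banned, not closed.
def active?(user)
  user.email_confirmed && !user.banned && !user.closed
end

from_time = Time.utc(2021, 11, 1, 0, 0, 0)
to_time = Time.utc(2022, 10, 31, 23, 59, 59)
period = from_time..to_time

# Made-up sample data.
users = [
  UserRecord.new(created_at: Time.utc(2020, 5, 1), email_confirmed: true,
                 banned: false, closed: false),
  UserRecord.new(created_at: Time.utc(2022, 1, 15), email_confirmed: true,
                 banned: false, closed: false),
  UserRecord.new(created_at: Time.utc(2022, 3, 2), email_confirmed: false,
                 banned: false, closed: false),
  UserRecord.new(created_at: Time.utc(2022, 6, 9), email_confirmed: true,
                 banned: true, closed: false),
  UserRecord.new(created_at: Time.utc(2022, 12, 1), email_confirmed: true,
                 banned: false, closed: false)
]

# Total: created on or before the cutoff with a confirmed email.
active_total = users.count { |u| u.created_at <= to_time && u.email_confirmed }

# New: created inside the period and still active.
new_in_period = users.count { |u| period.cover?(u.created_at) && active?(u) }

puts "total=#{active_total} new=#{new_in_period}"
```

In this sample the banned user counts towards the total but not towards the new-user figure, which is the asymmetry worth keeping in mind when comparing the two rows.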

@garethrees
Member

Banned users

require 'csv'

from_date = Time.zone.parse('2021-11-01').at_beginning_of_day
to_date = Time.zone.parse('2022-10-31').at_end_of_day
period = from_date..to_date

users_updated_in_period = User.where(updated_at: period)

# Number of users whose last update was in the period and who are marked
# as banned for spamming.
#
# Note that these are *identified* spammers; there are likely to be
# significantly more in reality.
banned_spam =
  users_updated_in_period.where(ban_text: 'Banned for spamming').count

# Number of users whose last update was in the period and who have been
# banned for some reason other than obvious spam.
banned_other =
  users_updated_in_period.
    where.not(ban_text: '').
    where.not(ban_text: 'Banned for spamming').
    count

total_banned = banned_spam + banned_other

# Build the report; the result is named separately from the block argument
# so the two are not confused.
report = CSV.generate do |csv|
  csv << ["Reason for banning users in #{to_date.year}", 'Total']
  csv << ['Spam', banned_spam]
  csv << ['Other site misuse', banned_other]
  csv << ["Total number of users banned in #{to_date.year}", total_banned]
end

puts report
# Reason for banning users in 2022,Total
# Spam,2160
# Other site misuse,300
# Total number of users banned in 2022,2460
| Reason for banning users in 2022 | Total |
| --- | --- |
| Spam | 2160 |
| Other site misuse | 300 |
| Total number of users banned in 2022 | 2460 |
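The split above depends entirely on the value of `ban_text`: empty means not banned, the exact string 'Banned for spamming' means spam, and anything else is treated as other misuse. A minimal plain-Ruby sketch of that classification, assuming the same three-way rule; `ban_category` and the sample ban texts are hypothetical:

```ruby
SPAM_BAN_TEXT = 'Banned for spamming'

# Hypothetical classifier mirroring the three-way split used in the query.
def ban_category(ban_text)
  return :not_banned if ban_text.nil? || ban_text.empty?
  return :spam if ban_text == SPAM_BAN_TEXT

  :other
end

# Made-up sample values, not real data from the site.
ban_texts = [
  '',
  'Banned for spamming',
  'Vexatious use of the site',
  nil,
  'Banned for spamming'
]

counts = ban_texts.map { |text| ban_category(text) }.tally

banned_spam = counts.fetch(:spam, 0)
banned_other = counts.fetch(:other, 0)
total_banned = banned_spam + banned_other

puts "spam=#{banned_spam} other=#{banned_other} total=#{total_banned}"
```

Because the spam check is an exact string match, a spam ban recorded with slightly different wording would land in the "other" bucket under this rule.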

@garethrees
Member

Anonymised users

require 'csv'

from_date = Time.zone.parse('2021-11-01').at_beginning_of_day
to_date = Time.zone.parse('2022-10-31').at_end_of_day
period = from_date..to_date

users_created_in_period = User.where(created_at: period)
users_updated_in_period = User.where(updated_at: period)

# Number of users created within the given period who have been
# anonymised
created_anon = users_created_in_period.where(name: '[Name Removed]').count

# Number of users where the last update was in the period who have been
# anonymised
updated_anon = users_updated_in_period.where(name: '[Name Removed]').count

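# Note: an anonymised user who was both created and last updated within the
# period appears in both counts, so this sum may count some accounts twice.
total_anonymised = created_anon + updated_anon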

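# Build the report; the result is named separately from the block argument
# so the two are not confused.
report = CSV.generate do |csv|
  csv << ['Anonymisation*', 'Total']
  csv << ["Accounts anonymised in #{to_date.year}", total_anonymised]
end

puts report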
# Anonymisation*,Total
# Accounts anonymised in 2022,139
| Anonymisation* | Total |
| --- | --- |
| Accounts anonymised in 2022 | 139 |
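The script adds the "created in period" and "updated in period" counts together, so an account that satisfies both conditions contributes twice. A minimal plain-Ruby sketch of the difference between summing the two sets and counting distinct accounts; `AnonUser` and the sample rows are made up for illustration:

```ruby
ANON_NAME = '[Name Removed]'

# Hypothetical stand-in for a user row; field names are illustrative only.
AnonUser = Struct.new(:id, :name, :created_at, :updated_at, keyword_init: true)

period = Time.utc(2021, 11, 1)..Time.utc(2022, 10, 31, 23, 59, 59)

# Made-up sample data.
users = [
  # Created before the period, last updated (anonymised) within it.
  AnonUser.new(id: 1, name: ANON_NAME,
               created_at: Time.utc(2019, 1, 1), updated_at: Time.utc(2022, 2, 1)),
  # Created and last updated within the period.
  AnonUser.new(id: 2, name: ANON_NAME,
               created_at: Time.utc(2021, 12, 1), updated_at: Time.utc(2022, 3, 1)),
  # Not anonymised at all.
  AnonUser.new(id: 3, name: 'Example User',
               created_at: Time.utc(2022, 1, 1), updated_at: Time.utc(2022, 1, 2))
]

anonymised = users.select { |u| u.name == ANON_NAME }
created_anon = anonymised.select { |u| period.cover?(u.created_at) }
updated_anon = anonymised.select { |u| period.cover?(u.updated_at) }

# Summing the two sets counts user 2 twice.
summed = created_anon.size + updated_anon.size

# De-duplicating by id counts each account once.
distinct = (created_anon + updated_anon).uniq(&:id).size

puts "summed=#{summed} distinct=#{distinct}"
```

In an ActiveRecord query, one way to get the distinct figure would be to combine the two date conditions with `or` before filtering on the name, so each row is matched at most once.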
