
Implementation Plan: Content moderation metrics #3760

Merged
merged 16 commits into main from mod_metrics
Mar 19, 2024
Conversation

@dhruvkb dhruvkb commented Feb 6, 2024

Fixes

Fixes #1970 by @sarayourfriend

Description

This PR adds an implementation plan for content moderation metrics based on reports and decisions. It includes a few more metrics than the project proposal called for, but they felt appropriate to track.

Reviewers:

Testing Instructions

Please read the IP and submit your thoughts. This proposal is now in the decision round.

Checklist

  • My pull request has a descriptive title (not a vague title like Update index.md).
  • My pull request targets the default branch of the repository (main) or a parent feature branch.
  • My commit messages follow best practices.
  • My code follows the established code style of the repository.
  • I added or updated tests for the changes I made (if applicable).
  • I added or updated documentation (if applicable).
  • I tried running the project locally and verified that there are no visible errors.
  • I ran the DAG documentation generator (if applicable).

Developer Certificate of Origin

Developer Certificate of Origin
Version 1.1

Copyright (C) 2004, 2006 The Linux Foundation and its contributors.
1 Letterman Drive
Suite D4700
San Francisco, CA, 94129

Everyone is permitted to copy and distribute verbatim copies of this
license document, but changing it is not allowed.


Developer's Certificate of Origin 1.1

By making a contribution to this project, I certify that:

(a) The contribution was created in whole or in part by me and I
    have the right to submit it under the open source license
    indicated in the file; or

(b) The contribution is based upon previous work that, to the best
    of my knowledge, is covered under an appropriate open source
    license and I have the right under that license to submit that
    work with modifications, whether created in whole or in part
    by me, under the same open source license (unless I am
    permitted to submit under a different license), as indicated
    in the file; or

(c) The contribution was provided directly to me by some other
    person who certified (a), (b) or (c) and I have not modified
    it.

(d) I understand and agree that this project and the contribution
    are public and that a record of the contribution (including all
    personal information I submit with it, including my sign-off) is
    maintained indefinitely and may be redistributed consistent with
    this project or the open source license(s) involved.

@dhruvkb dhruvkb added 🟨 priority: medium Not blocking but should be addressed soon 🌟 goal: addition Addition of new feature 📄 aspect: text Concerns the textual material in the repository 🧱 stack: api Related to the Django API 🧱 stack: documentation Related to Sphinx documentation 🧭 project: implementation plan An implementation plan for a project labels Feb 6, 2024

github-actions bot commented Feb 6, 2024

Full-stack documentation: https://docs.openverse.org/_preview/3760

Please note that GitHub Pages takes a little time to deploy newly pushed code. If the links above don't work or you see old versions, wait 5 minutes and try again.

You can check the GitHub pages deployment action list to see the current status of the deployments.

New files ➕:


@dhruvkb dhruvkb left a comment


This looks good enough to ask for reviews now.

@dhruvkb dhruvkb marked this pull request as ready for review February 11, 2024 10:13
@dhruvkb dhruvkb requested a review from a team as a code owner February 11, 2024 10:14
@sarayourfriend
Collaborator

Just a heads up @dhruvkb, from the PR description:

🚧 WIP. No reviewers have been decided yet.

@dhruvkb dhruvkb marked this pull request as draft February 13, 2024 05:46

dhruvkb commented Feb 13, 2024

Drafting for a short bit to address potential changes from #3760 (comment) and determine reviewers.

@dhruvkb dhruvkb requested review from stacimc and removed request for obulat February 13, 2024 13:24
@dhruvkb dhruvkb marked this pull request as ready for review February 13, 2024 13:24

@stacimc stacimc left a comment


Excited for this -- having just worked on the bulk moderation decisions IP, I'm immediately thinking this will be so useful for maintainers in those bulk moderation use cases, in addition to being useful for moderators :)

My questions largely relate to the way the ModerationDecision and MediaReport models have been changed in previous IPs as part of this milestone, which I think complicates some of the events and metrics described here. The IP describing ModerationDecision and updates to the Report model has been approved, and the Bulk Moderation IP which extends the decision model is about to be approved, so I think the models are solid at this point.

- number of decisions (time-series, real-time)
<!-- TODO: - per media type (time-series, real-time) -->
- number of resolutions (time-series, real-time)
<!-- TODO: - per media type (time-series, real-time) -->
Collaborator


Should these TODOs be added in, or do you mean that they wouldn't be added in the first pass but would come later?

I imagine that a lot (if not all) of these metrics would be useful to break down by media type, violation type, and even provider/creator/source. Particularly in the deferred metrics, maybe the Django views could support filters along those lines? In that case it might be helpful to include all metrics in the Django views, even those which have time-series on the cloudwatch dashboards.
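
For illustration, a rough sketch of what such a filterable metrics view could look like. The model and field names here (MediaReport, media_type, reason, source) are hypothetical placeholders rather than the actual Openverse schema:

from django.http import JsonResponse
from django.views import View

from .models import MediaReport  # hypothetical placeholder model


class ReportMetricsView(View):
    """Return report counts, optionally narrowed via query parameters."""

    # The filterable fields are assumptions for the sake of the example.
    FILTER_PARAMS = ("media_type", "reason", "source")

    def get(self, request):
        queryset = MediaReport.objects.all()
        # e.g. GET /metrics/reports/?media_type=image&reason=sensitive
        for param in self.FILTER_PARAMS:
            value = request.GET.get(param)
            if value:
                queryset = queryset.filter(**{param: value})
        return JsonResponse({"count": queryset.count()})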

Member Author


The distinction between media types was kept as a TODO because I was still working out their implementation. Since the report and decision models are already separated by media type, all the metrics we want can be broken down on that basis.

For the real-time metrics, we can construct dashboard graphs specific to our needs, but I'm not sure if CloudWatch allows us to dig into the metrics and apply more specific filters.
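
One relevant detail: CloudWatch can slice a metric by dimensions on dashboards and in queries, but only by dimensions that were attached when the data point was published; it cannot retroactively break a metric down by an attribute it never received. A minimal boto3 sketch, where the namespace and dimension names are illustrative assumptions rather than the actual Openverse configuration:

import boto3

cloudwatch = boto3.client("cloudwatch")

# Emit a single decision event, tagged with its media type so dashboards
# can later filter or group on that dimension.
cloudwatch.put_metric_data(
    Namespace="Openverse/Moderation",  # assumed namespace
    MetricData=[
        {
            "MetricName": "DecisionCount",
            "Dimensions": [{"Name": "MediaType", "Value": "image"}],
            "Value": 1,
            "Unit": "Count",
        }
    ],
)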

@openverse-bot

Based on the medium urgency of this PR, the following reviewers are being gently reminded to review this PR:

@AetherUnbound
This reminder is being automatically generated due to the urgency configuration.

Excluding weekend days [1], this PR was ready for review 4 day(s) ago. PRs labelled with medium urgency are expected to be reviewed within 4 weekday(s) [2].

@dhruvkb, if this PR is not ready for a review, please draft it to prevent reviewers from getting further unnecessary pings.

Footnotes

  1. Specifically, Saturday and Sunday.

  2. For the purpose of these reminders we treat Monday - Friday as weekdays. Please note that the operation that generates these reminders runs at midnight UTC on Monday - Friday. This means that depending on your timezone, you may be pinged outside of the expected range.

@dhruvkb dhruvkb marked this pull request as draft February 20, 2024 07:07
@dhruvkb dhruvkb requested a review from stacimc February 27, 2024 13:28
@dhruvkb dhruvkb marked this pull request as ready for review February 27, 2024 15:48

@AetherUnbound AetherUnbound left a comment


This is fantastic, and very clear! I have a few points of clarification, but generally this seems straightforward 😄 I'm very excited for this and I'm glad we could fit it in with existing tools so easily!


@stacimc stacimc left a comment


The updates look fantastic @dhruvkb! Very thorough and the code samples are excellent.

# accuracy of reports
# ===================
total_reports = reports_in_range.count()
confirmed_reports = reports_in_range.filter(
    decision__action__in=[
        'marked_sensitive',
        'deindexed_sensitive',
        'deindexed_copyright',
    ]
).count()
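
The ratio itself would presumably be derived from these two counts; a one-line sketch, guarding the empty-range case:

# Guard against an empty reporting window to avoid ZeroDivisionError.
accuracy = confirmed_reports / total_reports if total_reports else 0.0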
Collaborator


This is a minor thing, but the more I think about it, the more the term 'accuracy' doesn't feel quite right -- specifically because it excludes deduplicated reports. We define deduplicating a report as an acknowledgement that it was accurate but requires no action (e.g. an already sensitive-marked work gets reported for sensitive content).

I totally understand why the duplication counts would be broken out into a separate metric, though. Maybe we could just change "accuracy" to something like "actionable"/"actioned"?

Collaborator


Just adding that I don't think the language necessarily needs to be finalized at this stage. I would be totally fine with just mentioning that in the issues that are created.

Happy to approve whenever this is moved out of the Clarification Round!

Member Author


I tried very hard to find a better name for it but couldn't, for these reasons:

  • duplicates can exist for both accurate and inaccurate reports, so "accuracy" isn't the best term for this
  • marking a report as a duplicate is an action, so not counting them under "actionable"/"actioned" isn't fair either

I'll move this proposal ahead into the decision round if we can keep this as an open question to find a better, more suitable name for this metric.
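
One way to keep the open question from blocking implementation would be to compute the buckets separately and leave the label to the dashboard. A rough sketch extending the earlier snippet, where the "deduplicated" action value is a hypothetical placeholder:

# Action values are assumptions; the real choices come from the
# ModerationDecision model.
CONFIRMED_ACTIONS = ("marked_sensitive", "deindexed_sensitive", "deindexed_copyright")

total = reports_in_range.count()
confirmed = reports_in_range.filter(decision__action__in=CONFIRMED_ACTIONS).count()
deduplicated = reports_in_range.filter(decision__action="deduplicated").count()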


@AetherUnbound AetherUnbound left a comment


Fully on board, no blocking objections! Great work on this Dhruv! 😄


@stacimc stacimc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great to me!

Co-authored-by: Staci Mullins <[email protected]>
@dhruvkb dhruvkb merged commit 0a4c34a into main Mar 19, 2024
38 checks passed
@dhruvkb dhruvkb deleted the mod_metrics branch March 19, 2024 06:03