Create sensitive and deleted media models for decisions #4544
…n-media through model
This PR has migrations. Please rebase it before merging to ensure that conflicting migrations are not introduced.
Makes sense, and looks good to me! I tested this locally and confirmed that the deindexed image gets deleted from the API database, and the sensitive image gets a record in the sensitive table.
The blocking change is to fix the issue with non-performant bulk decision creation.
I think we should exclude fixing the backfill from this PR and instead just fix the media admin so that it's working again. And then address the backfill (and lay groundwork for #3840, which needs this anyway) in a separate PR.
through_model = {
    "image": ImageDecisionThrough,
    "audio": AudioDecisionThrough,
}[self.media_type]
Bit of a nit, but I've seen this pattern a few times in our code and I don't really understand it. Why not use match/case or if/else for this? The inline object approach is a little too clever (and I never understood defining a static object inline of a function body like this either).
Both match and if/else require explicitly raising if self.media_type doesn't match... but isn't that better? It's certainly easier to read and understand to me (and you know the old phrase about how often code is read compared to written, I'm sure).
match self.media_type:
    case IMAGE:
        through_model = ImageDecisionThrough
    case AUDIO:
        through_model = AudioDecisionThrough
    case _:
        raise ValueError(f"Unknown media type {self.media_type}")

if self.media_type == IMAGE:
    through_model = ImageDecisionThrough
elif self.media_type == AUDIO:
    through_model = AudioDecisionThrough
else:
    raise ValueError(f"Unknown media type {self.media_type}")
Even better would be to configure it on the admin itself, along with the media type.
Alternatively, if you want to remove all explicit configuration:
through_model = getattr(media_obj, f"{self.media_type}decisionthrough_set").model
But it really would be better if it was just a @property of media_obj._meta or something...
through_model = media_obj._meta.decision_through_model
Anyway, any of those would be expected and easier to understand when reading, I think. (Except the getattr one, that's similarly too clever and it's basically not even worth including as is, but would be improved if it didn't need getattr and could just be media_obj.decisionthrough_set.model if the media type was removed from the name, which would simplify other code too, not just here).
The reason I use the = {}[] pattern is that it is the most compact of all the alternatives and also raises an exception when none of the keys match the input.
I've done this a few times in this file itself. Would you prefer I change this pattern across the entire file, keep this as is in the interest of consistency, or just change it here to one of the alternatives you've suggested?
If the dict-key-access approach is preferred for any reason at all (it is nice it implicitly throws if the key doesn't exist), I'd at the very least think defining the dictionary outside the runtime scope of the function is a reasonable requirement, if only so that it's in a shared location. It's far-fetched to me to establish a pattern of defining otherwise static dictionaries, especially one encoding relationships between static objects, entirely dynamically in the runtime of a function. From a performance perspective it's fine here, but in a tight loop it's just silly, right? From a shared data/relationship encoding perspective (and discoverability, clarity, etc) it's definitely the worst option I can think of 😕. Just seems like an antipattern to me 🤷 I also don't think compactness is necessarily a virtue, and certainly not in Python, which actively resists compactness in my experience.
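A minimal sketch of the "define the dictionary outside the function" variant described above. The classes are plain stand-ins for the real Django through models, and the names DECISION_THROUGH_MODELS and get_through_model are hypothetical, not names from this PR.

```python
# Stand-ins for the real Django through models; only their identity matters here.
class ImageDecisionThrough:
    pass


class AudioDecisionThrough:
    pass


# Hypothetical module-level mapping: built once at import time and shared by
# every caller, instead of being rebuilt inside each function call.
DECISION_THROUGH_MODELS = {
    "image": ImageDecisionThrough,
    "audio": AudioDecisionThrough,
}


def get_through_model(media_type):
    """Return the through model for media_type; raises KeyError if unknown."""
    return DECISION_THROUGH_MODELS[media_type]
```

The lookup still raises implicitly on an unknown media type, as the inline version does, but the relationship now lives in one importable place.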
So my request here, to clarify, is to move the dictionary to a static location, out of the function, or define the relationship in some other concrete manner that isn't specific to this function. That can be either: changing the field names on the models so that they can be referenced generically (without the media type prefix), or adding a class property to the base class that resolves these models for the media type based on the fields, or something else like that. Also, a follow-up issue to remove that pattern anywhere it's been added and replace it with the generic approach (whether that's statically defined dicts or the dynamic-but-shared approach of class properties).
In general: these static relationships between media type and the data models should not be defined within a local function context, even ignoring all issues with legibility, ergonomics, and performance of this local dict approach. At the very least, this static relationship should be defined statically, and in a shared location, so that new code automatically references it, which reduces the risk of someone just copy/pasting this function-local definition of the relationship.
The dict-accessor pattern is fine on its own, it's the inline dict definition I think is an anti-pattern (though I think match/case and an explicit raise of ValueError is clearer than KeyError, but that's an aesthetic judgement, I know, as at the end of the day it's more or less intelligible as the same underlying problem).
Edit: I realise I'm blocking this PR that fixes a bug in the admin on a code-style/quality issue. I do think this needs to change and believe it's an anti-pattern, but won't block the PR. I'll write an issue to address this more widely later today instead.
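One possible shape for the "class property on the base class" option mentioned above, sketched with plain Python classes rather than Django models. Media, Image, Audio, and get_decision_through_model are hypothetical stand-ins for illustration; the real models and their field names may differ.

```python
# Stand-ins for the real Django through models.
class ImageDecisionThrough:
    pass


class AudioDecisionThrough:
    pass


class Media:
    """Hypothetical base class; each subclass declares its own through model."""

    decision_through_model = None

    @classmethod
    def get_decision_through_model(cls):
        # Raise an explicit ValueError instead of relying on a KeyError.
        if cls.decision_through_model is None:
            raise ValueError(f"No decision through model for {cls.__name__}")
        return cls.decision_through_model


class Image(Media):
    decision_through_model = ImageDecisionThrough


class Audio(Media):
    decision_through_model = AudioDecisionThrough
```

With this layout, calling code asks the media class (or an instance of it) for its through model and never needs the media type string.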
After you delete a media object using a deindexed_[...] action, the admin tries to open the same change_view form for the same media object. However, because this media object has been deleted, you get an error on the line if tags := media_obj.tags:. To prevent this, I added the redirect to change_view:
media_obj = self.get_object(request, object_id)
if media_obj is None:
    return redirect(f"admin:api_{self.media_type}_changelist")
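To illustrate why the guard matters, here is a framework-free sketch of the same control flow. FakeMedia and FakeMediaAdmin are hypothetical stand-ins, not the real Django admin classes, and the returned tuples only imitate a redirect or a rendered form.

```python
class FakeMedia:
    """Stand-in for a media object with tags."""

    def __init__(self, tags=None):
        self.tags = tags or []


class FakeMediaAdmin:
    """Stand-in for the media admin; stores objects in a dict."""

    media_type = "image"

    def __init__(self, objects):
        self._objects = dict(objects)

    def get_object(self, object_id):
        # Returns None when the object no longer exists.
        return self._objects.get(object_id)

    def delete(self, object_id):
        self._objects.pop(object_id, None)

    def change_view(self, object_id):
        media_obj = self.get_object(object_id)
        if media_obj is None:
            # Without this guard, media_obj.tags would raise AttributeError.
            return ("redirect", f"admin:api_{self.media_type}_changelist")
        return ("render", list(media_obj.tags))
```

After delete is called for an object, change_view for the same ID returns the changelist redirect instead of failing on the tags access.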
If you do not select any report and submit a decision (e.g., mark_sensitive), you get no indication of the error except for a warning in the logs (which the moderator using the Admin UI will probably not see). This can be a follow-up issue since this PR is critical, but we should add the error display ("report_id" is required) to the form.
@stacimc pointed out that the backfill doesn't need to "perform" the action at all, because it's just creating the decisions for actions that have already been performed. Glad we removed it already! I'll re-review this today.
Based on the critical urgency of this PR, the following reviewers are being gently reminded to review this PR: @sarayourfriend. Excluding weekend days, this PR was ready for review 1 day(s) ago. PRs labelled with critical urgency are expected to be reviewed within 1 weekday(s). @dhruvkb, if this PR is not ready for a review, please draft it to prevent reviewers from getting further unnecessary pings.
LGTM, tests fine locally and I'm unblocking on my requested changes to get the fix into the admin.
Fixes
Fixes #4513 by @krysal
Description
This PR ensures that SensitiveImage/SensitiveAudio and DeletedImage/DeletedAudio models are created for every decision.
Testing Instructions
DeletedImage object.
SensitiveMedia object.
Repeat these steps for audio.