-
Notifications
You must be signed in to change notification settings - Fork 467
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
✨ Track record/account delete and update data in subject status #2804
✨ Track record/account delete and update data in subject status #2804
Conversation
type: 'string', | ||
format: 'datetime', | ||
description: | ||
'Last update timestamp of the record the subject is associated with', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel like the language here is quite convoluted. would love suggestions to make it clearer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps something along the lines of: "Timestamp at which the record or account was last updated."
@@ -11127,7 +11145,7 @@ export const schemaDict = { | |||
type: 'object', | |||
description: | |||
'Logs account status related events on a repo subject. Normally captured by automod from the firehose and emitted to ozone for historical tracking.', | |||
required: ['timestamp', 'status', 'active'], | |||
required: ['timestamp', 'active'], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TODO: if this PR doesn't make it into the main branch, we'd wanna port this change over to the parent PR.
// When deactivated accounts are re-activated, we receive the event with just the active flag set to true | ||
// so we want to make sure that the recordStatus is not set to an outdated value | ||
if (currentStatus?.recordStatus !== 'active' && event.meta?.active) { | ||
status.recordStatus = 'active' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
default for recordStatus
is null, may be we should fallback to that? although, doing this is maybe helpful if we want to see which accounts reactivated
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd recommend reflecting the exact hosting status here (where active is either null
or 'active'
, slight preference for the latter), and then if we want to keep tract of reactivations we do so with a separate field such as reactivatedAt
. That allows the status to not be dependent on past statuses, and if we want to know if something is active we don't need to check for multiple different values.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah the null
value is basically "we haven't tracked any hosting status for this" so I think safer option is to check for actual deactivated
value for status.
|
||
if (event.action === 'tools.ozone.moderation.defs#accountEvent') { | ||
const status: Partial<ModerationSubjectStatusRow> = { | ||
recordStatus: `${event.meta?.status}`, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Based on the types here, as written this could end-up containing the string 'undefined'
.
} | ||
|
||
if (event.meta?.tombstone) { | ||
status.recordStatus = 'tombstoned' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see a couple values for recordStatus
in here that aren't reflected in the lexicon.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(P.S. is this separate from 'deleted'
? Historically we have used the term "tombstoned" in a couple different ways, and I'm not sure which one is meant here.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
to my understanding, the tombstoned
event is only emitted on identity and this is "technically" equivalend as "deleted" but not exactly the same thing.
"recordUpdatedAt": { | ||
"type": "string", | ||
"format": "datetime", | ||
"description": "Last update timestamp of the record the subject is associated with" | ||
}, | ||
"recordDeletedAt": { | ||
"type": "string", | ||
"format": "datetime", | ||
"description": "Timestamp referencing when the record the subject is associated with was deleted" | ||
}, | ||
"recordStatus": { | ||
"type": "string", | ||
"description": "Status of the record the subject is associated with. Statuses are different when the subject references an account vs. a record", | ||
"knownValues": ["takendown", "suspended", "deleted", "deactivated"] | ||
}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These names work alright if the subject is a record, but if the subject is a repository or chat message, using the term "record" here doesn't seem like the best fit. Would there be any issue with using something like subjectUpdatedAt
/subjectDeletedAt
instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess I was using record
as a generic term here without the context of atproto records. I'm also not sure about subject
prefix either since we already have updatedAt
column on subject, we will end up with subject.updatedAt
and subject.subjectUpdatedAt
which feels odd. I think I'm onboard with the term hosting
just from technical standpoint.
"recordStatus": { | ||
"type": "string", | ||
"description": "Status of the record the subject is associated with. Statuses are different when the subject references an account vs. a record", | ||
"knownValues": ["takendown", "suspended", "deleted", "deactivated"] | ||
}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is quite a complex field, I think there are a few potential issues with it:
- It's called "record status" but it may contain the hosting status of an account.
- The hosting statuses of records and accounts are mixed together, which could make it seem like a record can be marked as takendown here, which isn't the case.
- Potential confusion between the ozone status and the hosting status: the terms "takendown" and "suspended" are also terms used by ozone overlapping with the
takendown
andsuspendedUntil
fields.
Here's one idea: we could combine the three new fields into something more like this (using typescript just to illustrate the concept):
type SubjectStatusView = {
// ...
hosting?: AccountHosting | RecordHosting
}
type AccountHosting = {
$type: 'tools.ozone.moderation.defs#accountHosting',
status: 'active' | 'takendown' | 'suspended' | 'deleted' | 'deactivated'
createdAt: Date
updatedAt: Date
deletedAt?: Date
suspendedAt?: Date
deactivatedAt?: Date
takendownAt?: Date
reactivatedAt?: Date
}
type RecordHosting = {
$type: 'tools.ozone.moderation.defs#recordHosting'
status: 'active' | 'deleted'
createdAt: Date
updatedAt: Date
deletedAt?: Date
}
This also gives us a way to make certain fields required together if we want, e.g. if hosting
is present then hosting.createdAt
can always be present too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is nice!
@@ -28,6 +28,31 @@ | |||
"format": "datetime", | |||
"description": "Search subjects reviewed after a given timestamp" | |||
}, | |||
"recordDeletedAfter": { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My qualm with these fields is just that they are named "record" when they really may apply to any subject, e.g. an account rather than a record.
await db.schema | ||
.alterTable('moderation_subject_status') | ||
.addColumn('recordStatus', 'varchar') | ||
.execute() | ||
await db.schema | ||
.alterTable('moderation_subject_status') | ||
.addColumn('recordDeletedAt', 'varchar') | ||
.execute() | ||
await db.schema | ||
.alterTable('moderation_subject_status') | ||
.addColumn('recordUpdatedAt', 'varchar') | ||
.execute() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd recommend distinguishing these as e.g. hostingStatus
, hostingDeletedAt
, etc.
type: 'string', | ||
format: 'datetime', | ||
description: | ||
'Last update timestamp of the record the subject is associated with', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps something along the lines of: "Timestamp at which the record or account was last updated."
const timestamp = event.meta?.timestamp | ||
? `${event.meta?.timestamp}` | ||
: event.createdAt |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The types here are a little funky to wrangle. To avoid ending up with a string like 'true'
here, could we go with something more explicit like:
const timestamp = event.meta?.timestamp | |
? `${event.meta?.timestamp}` | |
: event.createdAt | |
const timestamp = typeof event.meta?.timestamp === 'string' | |
? event.meta.timestamp | |
: event.createdAt |
// When deactivated accounts are re-activated, we receive the event with just the active flag set to true | ||
// so we want to make sure that the recordStatus is not set to an outdated value | ||
if (currentStatus?.recordStatus !== 'active' && event.meta?.active) { | ||
status.recordStatus = 'active' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd recommend reflecting the exact hosting status here (where active is either null
or 'active'
, slight preference for the latter), and then if we want to keep tract of reactivations we do so with a separate field such as reactivatedAt
. That allows the status to not be dependent on past statuses, and if we want to know if something is active we don't need to check for multiple different values.
…oto into ozone-track-record-delete-and-update
…oto into ozone-track-record-delete-and-update
"status": { | ||
"type": "string", | ||
"knownValues": ["takendown", "suspended", "deleted", "deactivated"] | ||
}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From a hosting perspective individual records can't be deactivated, takendown, or suspended— only accounts can. They can be deleted, though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I see the values I'd expect reflected in the implementation, so this probably just needs to be updated.
"status": { | ||
"type": "string", | ||
"knownValues": ["takendown", "suspended", "deleted", "deactivated"] | ||
}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see the value unknown
used in the implementation—may be worth adding here and to the same field in #recordHosting
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice! Let's just add a changeset here before merge.
this is going to be merged into #2661 and I'll add the changeset there since that's the one landing on main. |
* ✨ Add events for account and record update/delete/deactivation * ✨ Add handle change event * ✨ Reduce account events to 2 types and record events to 1 * ✨ Store metadata from account, identity and record events * ✨ Add created event for record * ✨ Add ndd the new events to allowed types in emitEvent * ✨ Use string value for record op and add tombstone flag to identity event * ✨ Add active flag on account events * ✨ Change accountStatus -> status to match with firehose event * ✨ Make active flag required * 🚨 fix prettier style issue * ✨ Track record/account delete and update data in subject status (#2804) * ✨ Store deleted/updated event data in subject_status * 🐛 Fix query for recordDeletedAt and recordUpdatedAt * ✨ Add tombstoned status * ✨ Move from record to hosting term * ✅ Add tests for hosting params * ✨ Update lexicons for hostingStatuses * ✅ Update snapshots * ✅ Update snapshots * ✅ Update snapshots * ✨ Adjust hosting statuses * 📝 Add changeset
This PR updates moderation subject status with 3 new properties
recordUpdatedAt
recordDeletedAt
andrecordStatus
. as account/record events are received, these help moderators query subject statuses with delete/update time range and record status.