fix: moderation persist scores [AI-696] #418

Merged
merged 2 commits into from
Dec 3, 2024
4 changes: 3 additions & 1 deletion packages/aila/src/features/moderation/AilaModeration.ts
@@ -71,6 +71,7 @@ export class AilaModeration implements AilaModerationFeature {
appSessionId: chatId,
messageId: lastAssistantMessage.id,
categories: moderationResult.categories,
+ scores: moderationResult.scores,
justification: moderationResult.justification,
lesson: lessonPlan,
});
@@ -195,10 +196,11 @@ export class AilaModeration implements AilaModerationFeature {
messages: Message[];
lessonPlan: LooseLessonPlan;
retries: number;
- }) {
+ }): Promise<ModerationResult> {
if (retries < 1) {
return {
categories: [],
+ scores: undefined,
justification: "Failed to parse moderation response",
};
}
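As a rough illustration of the fallback in this hunk, the sketch below shows a retry wrapper that gives up once `retries` drops below 1 and returns a result with `scores: undefined`. The `moderateWithRetries` name, the `attempt` callback, and the simplified `ModerationResult` shape (with `categories` as `string[]`) are assumptions made for the example, not the PR's actual code.

```typescript
// Hypothetical stand-ins for the real types; only the field names come from the diff.
type ModerationScores = { l: number; v: number; u: number; s: number; p: number; t: number };

interface ModerationResult {
  categories: string[]; // assumed element type, the real schema is not shown here
  scores?: ModerationScores | undefined;
  justification?: string | undefined;
}

// `attempt` is an assumed callback that resolves to null when the
// moderation response could not be parsed.
async function moderateWithRetries(
  attempt: () => Promise<ModerationResult | null>,
  retries: number,
): Promise<ModerationResult> {
  if (retries < 1) {
    // No attempts left: return an explicit fallback with no scores.
    return {
      categories: [],
      scores: undefined,
      justification: "Failed to parse moderation response",
    };
  }
  const result = await attempt();
  // On a parse failure, try again with one fewer retry.
  return result ?? moderateWithRetries(attempt, retries - 1);
}
```

With the explicit `Promise<ModerationResult>` return type, the compiler checks that both the fallback and the successful path have the same shape, which is presumably why the annotation was added alongside `scores`.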
@@ -123,10 +123,7 @@ export class OpenAiModerator extends AilaModerator {
);

const log = aiLogger("aila:moderation:response");
- log.info(
- "Moderation response: ",
- JSON.stringify(moderationResponse, null, 2),
- );
+ log.info(JSON.stringify(moderationResponse));
Comment from the PR author (collaborator):

@mfdowland we've been logging the full response, so I'll see if I can get the old scores exported!

Comment from the PR author (collaborator):

Actually, they were excluded in prod, so it looks like we won't be getting these.
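For reference, the logging change in this hunk swaps pretty-printed JSON for the compact form. The sketch below only illustrates the difference between the two `JSON.stringify` calls; the payload is a made-up example, not a real moderation response.

```typescript
// Hypothetical payload, for illustration only.
const moderationResponse = { scores: { l: 5, v: 4 }, justification: "ok" };

// Old call: indented with 2 spaces, so the output spans several lines.
const pretty = JSON.stringify(moderationResponse, null, 2);

// New call: no indentation, so the whole payload stays on a single line.
const compact = JSON.stringify(moderationResponse);

console.log(pretty.split("\n").length); // 7
console.log(compact); // {"scores":{"l":5,"v":4},"justification":"ok"}
```

A single-line payload is generally easier to search and parse in log tooling, although the PR does not state that as the motivation.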


const response = moderationResponseSchema.safeParse(
JSON.parse(moderationResponse.choices[0]?.message.content ?? "null"),
@@ -152,6 +149,7 @@ export class OpenAiModerator extends AilaModerator {

return {
justification,
+ scores,
categories: categories.filter((category) => {
/**
* We only want to include the category if the parent category scores below a certain threshold.
3 changes: 3 additions & 0 deletions packages/core/src/models/moderations.ts
@@ -34,13 +34,15 @@ export class Moderations {
appSessionId,
messageId,
categories,
+ scores,
justification,
lesson,
}: {
userId: string;
appSessionId: string;
messageId: string;
categories: ModerationResult["categories"];
+ scores: ModerationResult["scores"];
justification?: string;
lesson: Snapshot;
}): Promise<Moderation> {
@@ -58,6 +60,7 @@ export class Moderations {
userId,
categories,
justification,
+ scores,
appSessionId,
messageId,
lessonSnapshotId,
21 changes: 12 additions & 9 deletions packages/core/src/utils/ailaModeration/moderationSchema.ts
@@ -44,19 +44,21 @@ export const moderationCategoriesSchema = z.array(

const likertScale = z.number().int().min(1).max(5);

+ const moderationScoresSchema = z.object({
+ l: likertScale.describe("Language and discrimination score"),
+ v: likertScale.describe("Violence and crime score"),
+ u: likertScale.describe("Upsetting, disturbing and sensitive score"),
+ s: likertScale.describe("Nudity and sex score"),
+ p: likertScale.describe("Physical activity and safety score"),
+ t: likertScale.describe("Toxic score"),
+ });
+
/**
* Schema for the moderation response from the LLM.
* Note: it's important that 'categories' is the last field in the schema
*/
export const moderationResponseSchema = z.object({
- scores: z.object({
- l: likertScale.describe("Language and discrimination score"),
- v: likertScale.describe("Violence and crime score"),
- u: likertScale.describe("Upsetting, disturbing and sensitive score"),
- s: likertScale.describe("Nudity and sex score"),
- p: likertScale.describe("Physical activity and safety score"),
- t: likertScale.describe("Toxic score"),
- }),
+ scores: moderationScoresSchema,
justification: z.string().describe("Add justification for your scores."),
categories: moderationCategoriesSchema,
});
@@ -65,8 +67,9 @@ export const moderationResponseSchema = z.object({
* Schema for the moderation result, once parsed from the moderation response
*/
export const moderationResultSchema = z.object({
- categories: moderationCategoriesSchema,
justification: z.string().optional(),
+ scores: moderationScoresSchema.optional(),
+ categories: moderationCategoriesSchema,
});

export type ModerationResult = z.infer<typeof moderationResultSchema>;
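To make the constraints in `moderationScoresSchema` concrete, here is a dependency-free sketch of the same checks: six keys (`l`, `v`, `u`, `s`, `p`, `t`), each an integer from 1 to 5. The `parseScores` and `isLikert` helpers are hypothetical names and are not part of the PR, which relies on zod for this. Unlike zod's `safeParse`, the sketch simply returns `undefined` when validation fails.

```typescript
// The six score keys used by the schema in this diff.
const SCORE_KEYS = ["l", "v", "u", "s", "p", "t"] as const;
type ScoreKey = (typeof SCORE_KEYS)[number];
type ModerationScores = Record<ScoreKey, number>;

// Mirrors z.number().int().min(1).max(5).
function isLikert(value: unknown): value is number {
  return (
    typeof value === "number" &&
    Number.isInteger(value) &&
    value >= 1 &&
    value <= 5
  );
}

// Hypothetical helper: returns the validated scores, or undefined if any
// key is missing or out of range. Unknown keys are dropped.
function parseScores(input: unknown): ModerationScores | undefined {
  if (typeof input !== "object" || input === null) return undefined;
  const record = input as Record<string, unknown>;
  const scores = {} as ModerationScores;
  for (const key of SCORE_KEYS) {
    const value = record[key];
    if (!isLikert(value)) return undefined;
    scores[key] = value;
  }
  return scores;
}
```

For example, `parseScores({ l: 5, v: 4, u: 3, s: 5, p: 5, t: 0 })` returns `undefined` because `0` falls outside the 1 to 5 range. This matches the optional `scores` field in `moderationResultSchema`, where a missing value is allowed.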
@@ -0,0 +1,2 @@
+ -- AlterTable
+ ALTER TABLE "public"."moderations" ADD COLUMN "scores" JSONB;
1 change: 1 addition & 0 deletions packages/db/prisma/schema.prisma
@@ -789,6 +789,7 @@ model Moderation {
appSessionId String @map("app_session_id")
messageId String @map("message_id")
categories Json[]
+ scores Json? @map("scores") @db.JsonB
justification String?
lessonSnapshotId String? @map("lesson_snapshot_id")
// A user's comment in relation to the moderation. Likely they are contesting it.