fix(moderation prompt): lower false positives in toxic category #102
Playwright e2e tests
To view traces locally, unzip the report and run: npx playwright show-report ~/Downloads/playwright-report
Quality Gate passed
Makes sense to me 👍
🎉 This PR is included in version 1.7.0 🎉 The release is available on GitHub release. Your semantic-release bot 📦🚀
Description
I tried a few variants locally (prompt/model combinations), and this was the best I found. (Note: gpt-4o-mini is no good for this!)
Later I will explore some more in-depth changes, but those would carry more risk, so I want to do that when there's a dataset (and more time) to give the confidence required.
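For illustration only, here is a minimal sketch of how a labelled dataset could be used to compare variants on false positives in the toxic category. Everything in it is an assumption rather than code from this PR: the `Example` type, the `false_positive_rate` helper, the two stand-in classifiers, and the three sample rows are all hypothetical. A real comparison would swap the stand-ins for calls to the actual moderation prompt and model.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class Example:
    """One hand-labelled row of a hypothetical evaluation dataset."""
    text: str
    is_toxic: bool  # ground-truth label assigned by a human reviewer


def false_positive_rate(
    classify: Callable[[str], bool], examples: Iterable[Example]
) -> float:
    """Return the share of non-toxic examples that `classify` flags as toxic."""
    negatives = [ex for ex in examples if not ex.is_toxic]
    if not negatives:
        raise ValueError("need at least one non-toxic example")
    flagged = sum(1 for ex in negatives if classify(ex.text))
    return flagged / len(negatives)


# Hypothetical stand-ins for two prompt/model variants. A real variant
# would send the text to the moderation model and parse its response.
def flags_everything(text: str) -> bool:
    return True


def keyword_variant(text: str) -> bool:
    return "idiot" in text.lower()


# Tiny made-up dataset, for demonstration only.
DATASET = [
    Example("Photosynthesis converts light into chemical energy.", False),
    Example("The Battle of Hastings took place in 1066.", False),
    Example("You are an idiot and nobody likes you.", True),
]

if __name__ == "__main__":
    for name, variant in [
        ("flags_everything", flags_everything),
        ("keyword_variant", keyword_variant),
    ]:
        print(name, false_positive_rate(variant, DATASET))
```

Tracking this number per variant (alongside recall on the toxic rows) would make it possible to check that a riskier prompt change lowers false positives without missing genuinely toxic content.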
Issue(s)
https://www.notion.so/oaknationalacademy/Fix-moderation-bug-where-all-categories-get-flagged-3c93a1e819bb438f971187cb8be653bc
How to test