
[META] Add new guardrail type based on LLM #2463

Closed · opened by ylwu-amzn on May 21, 2024 · 0 comments

Labels: enhancement (New feature or request), v2.15.0

ylwu-amzn (Collaborator) commented:
Currently we have only one type of guardrail, which checks whether model input/output is toxic using stop words and regex. This is limited because it can only do exact matching against stop words and regular expressions. We plan to build a new type of guardrail that uses an LLM to detect whether input/output is toxic.
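
For context, here is a rough sketch of how the existing stop-word/regex guardrail is attached to a model at registration time, and what an LLM-based variant might look like. The field names and values below (`local_regex`, `stop_words_index`, `model`, `response_validation_regex`, etc.) are illustrative assumptions based on the existing guardrail config, not the final API; the actual design is worked out in the linked implementation PR.

```
# Sketch of the existing stop-word/regex guardrail (field names are assumptions):
POST /_plugins/_ml/models/_register
{
  "name": "remote_model_with_guardrails",
  "function_name": "remote",
  "connector_id": "<connector_id>",
  "guardrails": {
    "type": "local_regex",
    "input_guardrail": {
      "stop_words": [
        { "index_name": "stop_words_index", "source_fields": ["title"] }
      ],
      "regex": ["(?i)prohibited_phrase"]
    },
    "output_guardrail": {
      "regex": ["(?i)prohibited_phrase"]
    }
  }
}
```

An LLM-based guardrail could replace the exact-match rules with a call to a separate guardrail model that judges the text, for example:

```
# Hypothetical LLM-based guardrail; every field name here is an assumption,
# not the shipped API:
"guardrails": {
  "type": "model",
  "input_guardrail": {
    "model_id": "<guardrail_llm_model_id>",
    "response_validation_regex": "(?i)accept"
  },
  "output_guardrail": {
    "model_id": "<guardrail_llm_model_id>",
    "response_validation_regex": "(?i)accept"
  }
}
```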

- May 21, 2024: ylwu-amzn added the "enhancement" label
- May 21, 2024: dhrubo-os moved this to In Progress in ml-commons projects
- Jun 2, 2024: jngz-es mentioned this issue (5 tasks)
- Jun 14, 2024: moved from In Progress to Done in ml-commons projects
- Aug 30, 2024: added to the OpenSearch Project Roadmap under 2.15.0 (release window: June 10–25, 2024)