Patronus Lynx Integration #622

Merged: 1 commit, Jul 11, 2024

4 changes: 2 additions & 2 deletions README.md
@@ -220,7 +220,7 @@ NeMo Guardrails comes with a set of [built-in guardrails](https://docs.nvidia.co

> **NOTE**: The built-in guardrails are only intended to enable you to get started quickly with NeMo Guardrails. For production use cases, further development and testing of the rails are needed.

Currently, the guardrails library includes guardrails for: [jailbreak detection](https://docs.nvidia.com/nemo/guardrails/user_guides/guardrails-library.html#jailbreak-detection-heuristics), [output moderation](https://docs.nvidia.com/nemo/guardrails/user_guides/guardrails-library.html#self-check-output), [fact-checking](https://docs.nvidia.com/nemo/guardrails/user_guides/guardrails-library.html#fact-checking), [sensitive data detection](https://docs.nvidia.com/nemo/guardrails/user_guides/guardrails-library.html#presidio-based-sensitive-data-detection), [hallucination detection](https://docs.nvidia.com/nemo/guardrails/user_guides/guardrails-library.html#hallucination-detection) and [input moderation using ActiveFence](https://docs.nvidia.com/nemo/guardrails/user_guides/guardrails-library.html#activefence) and [hallucination detection for RAG applications using Got It AI's TruthChecker API](docs/user_guides/guardrails-library.md#got-it-ai).
Currently, the guardrails library includes guardrails for: [jailbreak detection](https://docs.nvidia.com/nemo/guardrails/user_guides/guardrails-library.html#jailbreak-detection-heuristics), [output moderation](https://docs.nvidia.com/nemo/guardrails/user_guides/guardrails-library.html#self-check-output), [fact-checking](https://docs.nvidia.com/nemo/guardrails/user_guides/guardrails-library.html#fact-checking), [sensitive data detection](https://docs.nvidia.com/nemo/guardrails/user_guides/guardrails-library.html#presidio-based-sensitive-data-detection), [hallucination detection](https://docs.nvidia.com/nemo/guardrails/user_guides/guardrails-library.html#hallucination-detection), [input moderation using ActiveFence](https://docs.nvidia.com/nemo/guardrails/user_guides/guardrails-library.html#activefence), [hallucination detection for RAG applications using Got It AI's TruthChecker API](docs/user_guides/guardrails-library.md#got-it-ai), and [RAG hallucination detection using Patronus Lynx](docs/user_guides/guardrails-library.md#patronus-lynx-based-rag-hallucination-detection).

## CLI

@@ -283,7 +283,7 @@ Evaluating the safety of a LLM-based conversational application is a complex tas

## How is this different?

There are many ways guardrails can be added to an LLM-based conversational application. For example: explicit moderation endpoints (e.g., OpenAI, ActiveFence), critique chains (e.g. constitutional chain), parsing the output (e.g. guardrails.ai), individual guardrails (e.g., LLM-Guard), hallucination detection for RAG applications (e.g., Got It AI).
There are many ways guardrails can be added to an LLM-based conversational application. For example: explicit moderation endpoints (e.g., OpenAI, ActiveFence), critique chains (e.g. constitutional chain), parsing the output (e.g. guardrails.ai), individual guardrails (e.g., LLM-Guard), hallucination detection for RAG applications (e.g., Got It AI, Patronus Lynx).

NeMo Guardrails aims to provide a flexible toolkit that can integrate all these complementary approaches into a cohesive LLM guardrails layer. For example, the toolkit provides out-of-the-box integration with ActiveFence, AlignScore and LangChain chains.

80 changes: 80 additions & 0 deletions docs/user_guides/advanced/patronus-lynx-deployment.md
@@ -0,0 +1,80 @@
# Host Patronus Lynx

## vLLM

Lynx is fully open source, so you can host it however you like. One simple way is using vLLM.

1. Get access to Patronus Lynx on Hugging Face. See [here](https://huggingface.co/PatronusAI/Patronus-Lynx-70B-Instruct) for the 70B-parameter variant and [here](https://huggingface.co/PatronusAI/Patronus-Lynx-8B-Instruct) for the 8B-parameter variant. The examples below use the 70B model, but no additional configuration is needed to deploy the smaller model; simply swap the model name references to the 8B variant.

2. Log in to Hugging Face

```bash
huggingface-cli login
```

3. Install vLLM and spin up a server hosting Patronus Lynx

```bash
pip install vllm
python -m vllm.entrypoints.openai.api_server --port 5000 --model PatronusAI/Patronus-Lynx-70B-Instruct
```

This will launch the vLLM inference server on `http://localhost:5000/`. Since the server exposes an OpenAI-compatible API, you can send it a cURL request to make sure it works:

```bash
curl http://localhost:5000/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "PatronusAI/Patronus-Lynx-70B-Instruct",
"messages": [
{"role": "user", "content": "What is a hallucination?"},
]
}'
```

4. Create a model called `patronus_lynx` in your `config.yml` file, setting the host and port to match the vLLM server above. If vLLM is running on a different server from `nemoguardrails`, replace `localhost` with the vLLM server's address. Check out the guide [here](../guardrails-library.md#patronus-lynx-based-rag-hallucination-detection) for more information.
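
For example, a minimal `patronus_lynx` entry, assuming the vLLM server from step 3 is reachable at `http://localhost:5000`:

```yaml
models:
  # ... your existing main model ...

  - type: patronus_lynx
    engine: vllm_openai
    parameters:
      openai_api_base: "http://localhost:5000/v1"
      model_name: "PatronusAI/Patronus-Lynx-70B-Instruct"
```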

## Ollama

You can also run Patronus Lynx 8B on your personal computer using Ollama!

1. Install Ollama: https://ollama.com/download.

2. Get access to a GGUF-quantized version of Lynx 8B on Hugging Face. Check it out [here](https://huggingface.co/PatronusAI/Lynx-8B-Instruct-Q4_K_M-GGUF).

3. Download the GGUF model file from the repository [here](https://huggingface.co/PatronusAI/Lynx-8B-Instruct-Q4_K_M-GGUF/blob/main/patronus-lynx-8b-instruct-q4_k_m.gguf). This may take a few minutes.
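
If you prefer the command line, the same file can be fetched with `huggingface-cli` (a sketch, assuming a recent `huggingface_hub` release and that you are logged in via `huggingface-cli login`):

```bash
huggingface-cli download PatronusAI/Lynx-8B-Instruct-Q4_K_M-GGUF \
  patronus-lynx-8b-instruct-q4_k_m.gguf --local-dir .
```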

4. Create a file called `Modelfile` with the following contents:

```bash
FROM "./patronus-lynx-8b-instruct-q4_k_m.gguf"
PARAMETER stop "<|im_start|>"
PARAMETER stop "<|im_end|>"
TEMPLATE """
<|im_start|>system
{{ .System }}<|im_end|>
<|im_start|>user
{{ .Prompt }}<|im_end|>
<|im_start|>assistant
"""
```

Ensure that the `FROM` field correctly points to the `patronus-lynx-8b-instruct-q4_k_m.gguf` file you downloaded in Step 3.

5. Run `ollama create patronus-lynx-8b -f Modelfile`.

6. Run `ollama run patronus-lynx-8b`. You should now be able to chat with `patronus-lynx-8b`!
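
As a quick sanity check, you can also query the model through Ollama's HTTP API (a sketch, assuming Ollama is listening on its default port `11434`):

```bash
curl http://localhost:11434/api/generate -d '{
  "model": "patronus-lynx-8b",
  "prompt": "What is a hallucination?",
  "stream": false
}'
```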

7. Create a model called `patronus_lynx` in your `config.yml` file, like this:

```yaml
models:
...

- type: patronus_lynx
engine: ollama
model: patronus-lynx-8b
parameters:
base_url: "http://localhost:11434"
```

Check out the guide [here](../guardrails-library.md#patronus-lynx-based-rag-hallucination-detection) for more information.
71 changes: 71 additions & 0 deletions docs/user_guides/guardrails-library.md
@@ -12,6 +12,7 @@ NeMo Guardrails comes with a library of built-in guardrails that you can easily
- [AlignScore-based Fact Checking](#alignscore-based-fact-checking)
- [LlamaGuard-based Content Moderation](#llama-guard-based-content-moderation)
- [Presidio-based Sensitive data detection](#presidio-based-sensitive-data-detection)
- [Patronus Lynx-based RAG Hallucination Detection](#patronus-lynx-based-rag-hallucination-detection)
- BERT-score Hallucination Checking - *[COMING SOON]*

3. Third-Party APIs
@@ -638,6 +639,76 @@ rails:

If you want to implement a completely different sensitive data detection mechanism, you can override the default actions [`detect_sensitive_data`](https://github.com/NVIDIA/NeMo-Guardrails/tree/develop/nemoguardrails/library/sensitive_data_detection/actions.py) and [`mask_sensitive_data`](https://github.com/NVIDIA/NeMo-Guardrails/tree/develop/nemoguardrails/library/sensitive_data_detection/actions.py).

### Patronus Lynx-based RAG Hallucination Detection

NeMo Guardrails supports hallucination detection in RAG systems using [Patronus AI](https://www.patronus.ai/)'s Lynx model. The model is hosted on Hugging Face and comes in a 70B-parameter variant (see [here](https://huggingface.co/PatronusAI/Patronus-Lynx-70B-Instruct)) and an 8B-parameter variant (see [here](https://huggingface.co/PatronusAI/Patronus-Lynx-8B-Instruct)).

Lynx checks three conditions to determine whether the `bot_message` is faithful to the `relevant_chunks`:

- Information in the `bot_message` is contained in the `relevant_chunks`
- There is no extra information in the `bot_message` that is not in the `relevant_chunks`
- The `bot_message` does not contradict any information in the `relevant_chunks`
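
For illustration only (the texts below are made up), the second check would flag an answer that adds details not supported by the retrieved chunks:

```yaml
relevant_chunks: "The Eiffel Tower is 330 metres tall."
bot_message: "The Eiffel Tower is 330 metres tall and receives 20 million visitors a year."
# The visitor figure does not appear in the chunks, so Lynx should report a hallucination.
```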

#### Setup

Since Patronus Lynx is fully open source, you can host it however you like. You can find a guide to hosting Lynx with vLLM or Ollama [here](./advanced/patronus-lynx-deployment.md).

#### Usage

Here is how to configure your bot to use Patronus Lynx to check for RAG hallucinations in your bot output:

1. Add a model of type `patronus_lynx` in `config.yml`. The example below uses vLLM to run Lynx:

```yaml
models:
...

- type: patronus_lynx
engine: vllm_openai
parameters:
openai_api_base: "http://localhost:5000/v1"
model_name: "PatronusAI/Patronus-Lynx-70B-Instruct" # "PatronusAI/Patronus-Lynx-8B-Instruct"
```

2. Add the `patronus lynx check output hallucination` guardrail to your output rails in `config.yml`:

```yaml
rails:
output:
flows:
- patronus lynx check output hallucination
```

3. Add a prompt for `patronus_lynx_check_output_hallucination` in the `prompts.yml` file:

```yaml
prompts:
- task: patronus_lynx_check_output_hallucination
content: |
Given the following QUESTION, DOCUMENT and ANSWER you must analyze ...
...
```

We recommend basing your Lynx hallucination detection prompt on the provided example [here](https://github.com/NVIDIA/NeMo-Guardrails/tree/develop/examples/configs/patronusai/prompts.yml).

Under the hood, the `patronus lynx check output hallucination` rail runs the `patronus_lynx_check_output_hallucination` action, which you can find [here](https://github.com/NVIDIA/NeMo-Guardrails/tree/develop/nemoguardrails/library/patronusai/actions.py). It returns whether a hallucination was detected (`True` or `False`) and, when available, a reasoning trace explaining the decision. The bot's response is blocked if `hallucination` is `True`. Note: if Lynx's output cannot be parsed and no hallucination decision is found, the action defaults to returning `True` for hallucination.

Here's the `patronus lynx check output hallucination` flow, showing how the action is executed:

```colang
define bot inform answer unknown
"I don't know the answer to that."

define flow patronus lynx check output hallucination
$patronus_lynx_response = execute patronus_lynx_check_output_hallucination
$hallucination = $patronus_lynx_response["hallucination"]
# The Reasoning trace is currently unused, but can be used to modify the bot output
$reasoning = $patronus_lynx_response["reasoning"]

if $hallucination
bot inform answer unknown
stop
```
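
For completeness, here is one way to exercise the rail end to end from Python. This is a sketch: it assumes the example configuration under `examples/configs/patronusai` (plus a running Lynx server), and it passes the retrieved chunks via a context message, which is how `relevant_chunks` reaches the action.

```python
from nemoguardrails import LLMRails, RailsConfig

# Load the guardrails configuration (config.yml + prompts.yml).
config = RailsConfig.from_path("./examples/configs/patronusai")
rails = LLMRails(config)

response = rails.generate(messages=[
    # The context message supplies `relevant_chunks` to the
    # patronus_lynx_check_output_hallucination action.
    {"role": "context", "content": {"relevant_chunks": "Paris is the capital of France."}},
    {"role": "user", "content": "What is the capital of France?"},
])
print(response["content"])
```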

## Third-Party APIs

1 change: 1 addition & 0 deletions docs/user_guides/llm-support.md
@@ -35,6 +35,7 @@ If you want to use an LLM and you cannot see a prompt in the [prompts folder](ht
| ActiveFence moderation _(LLM independent)_ | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |
| Llama Guard moderation _(LLM independent)_ | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |
| Got It AI RAG TruthChecker _(LLM independent)_ | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |
| Patronus Lynx RAG Hallucination detection _(LLM independent)_ | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |

Table legend:
- :heavy_check_mark: - Supported (_The feature is fully supported by the LLM based on our experiments and tests_)
15 changes: 15 additions & 0 deletions examples/configs/patronusai/config.yml
@@ -0,0 +1,15 @@
models:
- type: main
engine: openai
model: gpt-3.5-turbo-instruct

- type: patronus_lynx
engine: vllm_openai
parameters:
openai_api_base: "http://localhost:5000/v1"
model_name: "PatronusAI/Patronus-Lynx-70B-Instruct" # "PatronusAI/Patronus-Lynx-8B-Instruct"

rails:
output:
flows:
- patronus lynx check output hallucination
32 changes: 32 additions & 0 deletions examples/configs/patronusai/prompts.yml
@@ -0,0 +1,32 @@
prompts:
- task: patronus_lynx_check_output_hallucination
content: |
Given the following QUESTION, DOCUMENT and ANSWER you must analyze the provided answer and determine whether it is faithful to the contents of the DOCUMENT.

The ANSWER must not offer new information beyond the context provided in the DOCUMENT.

The ANSWER also must not contradict information provided in the DOCUMENT.

Output your final score by strictly following this format: "PASS" if the answer is faithful to the DOCUMENT and "FAIL" if the answer is not faithful to the DOCUMENT.

Show your reasoning.

--
QUESTION (THIS DOES NOT COUNT AS BACKGROUND INFORMATION):
{{ user_input }}

--
DOCUMENT:
{{ provided_context }}

--
ANSWER:
{{ bot_response }}

--

Your output should be in JSON FORMAT with the keys "REASONING" and "SCORE".

Ensure that the JSON is valid and properly formatted.

{"REASONING": ["<your reasoning as bullet points>"], "SCORE": "<final score>"}
14 changes: 14 additions & 0 deletions nemoguardrails/library/patronusai/__init__.py
@@ -0,0 +1,14 @@
# SPDX-FileCopyrightText: Copyright (c) 2023 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
# SPDX-License-Identifier: Apache-2.0
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
110 changes: 110 additions & 0 deletions nemoguardrails/library/patronusai/actions.py
@@ -0,0 +1,110 @@
# SPDX-FileCopyrightText: Copyright (c) 2023 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
# SPDX-License-Identifier: Apache-2.0
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

import logging
import re
from typing import List, Optional, Tuple

from langchain.llms.base import BaseLLM

from nemoguardrails.actions import action
from nemoguardrails.actions.llm.utils import llm_call
from nemoguardrails.context import llm_call_info_var
from nemoguardrails.llm.params import llm_params
from nemoguardrails.llm.taskmanager import LLMTaskManager
from nemoguardrails.llm.types import Task
from nemoguardrails.logging.explain import LLMCallInfo

log = logging.getLogger(__name__)


def parse_patronus_lynx_response(
response: str,
) -> Tuple[bool, List[str] | None]:
"""
Parses the response from the Patronus Lynx LLM and returns a tuple of:
- Whether the response is hallucinated or not.
- A reasoning trace explaining the decision.
"""
log.info(f"Patronus Lynx response: {response}.")
# Default to hallucinated
hallucination, reasoning = True, None
reasoning_pattern = r'"REASONING":\s*\[(.*?)\]'
score_pattern = r'"SCORE":\s*"?\b(PASS|FAIL)\b"?'

reasoning_match = re.search(reasoning_pattern, response, re.DOTALL)
score_match = re.search(score_pattern, response)

if score_match:
score = score_match.group(1)
if score == "PASS":
hallucination = False
if reasoning_match:
reasoning_content = reasoning_match.group(1)
reasoning = re.split(r"['\"],\s*['\"]", reasoning_content)

return hallucination, reasoning


@action()
async def patronus_lynx_check_output_hallucination(
llm_task_manager: LLMTaskManager,
context: Optional[dict] = None,
patronus_lynx_llm: Optional[BaseLLM] = None,
) -> dict:
"""
Check the bot response for hallucinations based on the given chunks
using the configured Patronus Lynx model.
"""
user_input = context.get("user_message")
bot_response = context.get("bot_message")
provided_context = context.get("relevant_chunks")

if (
not provided_context
or not isinstance(provided_context, str)
or not provided_context.strip()
):
log.error(
"Could not run Patronus Lynx. `relevant_chunks` must be passed as a non-empty string."
)
return {"hallucination": False, "reasoning": None}

check_output_hallucination_prompt = llm_task_manager.render_task_prompt(
task=Task.PATRONUS_LYNX_CHECK_OUTPUT_HALLUCINATION,
context={
"user_input": user_input,
"bot_response": bot_response,
"provided_context": provided_context,
},
)

stop = llm_task_manager.get_stop_tokens(
task=Task.PATRONUS_LYNX_CHECK_OUTPUT_HALLUCINATION
)

# Initialize the LLMCallInfo object
llm_call_info_var.set(
LLMCallInfo(task=Task.PATRONUS_LYNX_CHECK_OUTPUT_HALLUCINATION.value)
)

with llm_params(patronus_lynx_llm, temperature=0.0):
result = await llm_call(
patronus_lynx_llm, check_output_hallucination_prompt, stop=stop
)

hallucination, reasoning = parse_patronus_lynx_response(result)
print(f"Hallucination: {hallucination}, Reasoning: {reasoning}")
return {"hallucination": hallucination, "reasoning": reasoning}
12 changes: 12 additions & 0 deletions nemoguardrails/library/patronusai/flows.co
@@ -0,0 +1,12 @@
define bot inform answer unknown
"I don't know the answer to that."

define flow patronus lynx check output hallucination
$patronus_lynx_response = execute patronus_lynx_check_output_hallucination
$hallucination = $patronus_lynx_response["hallucination"]
# The Reasoning trace is currently unused, but can be used to modify the bot output
$reasoning = $patronus_lynx_response["reasoning"]

if $hallucination
bot inform answer unknown
stop
2 changes: 2 additions & 0 deletions nemoguardrails/library/patronusai/requirements.txt
@@ -0,0 +1,2 @@
# The minimal set of requirements to run Patronus Lynx on vLLM.
vllm==0.2.7
3 changes: 3 additions & 0 deletions nemoguardrails/llm/types.py
@@ -41,6 +41,9 @@ class Task(Enum):
SELF_CHECK_OUTPUT = "self_check_output"
LLAMA_GUARD_CHECK_INPUT = "llama_guard_check_input"
LLAMA_GUARD_CHECK_OUTPUT = "llama_guard_check_output"
PATRONUS_LYNX_CHECK_OUTPUT_HALLUCINATION = (
"patronus_lynx_check_output_hallucination"
)

SELF_CHECK_FACTS = "fact_checking"
CHECK_HALLUCINATION = "check_hallucination"