Update evaluate ragas component as reusable component
RobbeSneyders committed Dec 11, 2023
1 parent 08fd581 commit 95c28ea
Showing 9 changed files with 87 additions and 23 deletions.
@@ -9,15 +9,20 @@ RUN apt-get update && \
COPY requirements.txt /
RUN pip3 install --no-cache-dir -r requirements.txt

# Set the working directory to the component folder
WORKDIR /component/src
# Install Fondant
# This is split from other requirements to leverage caching
ARG FONDANT_VERSION=main
RUN pip3 install fondant[component,aws,azure,gcp]@git+https://github.com/ml6team/fondant@${FONDANT_VERSION}

# Copy over src-files
COPY src/ .
# Set the working directory to the component folder
WORKDIR /component
COPY src/ src/

FROM base as test
COPY tests/ tests/
RUN pip3 install --no-cache-dir -r tests/requirements.txt
ARG OPENAI_KEY
ENV OPENAI_KEY=${OPENAI_KEY}
RUN python -m pytest tests

FROM base
55 changes: 55 additions & 0 deletions components/evaluate_ragas/README.md
@@ -0,0 +1,55 @@
# retriever_eval_ragas

### Description
Component that evaluates the retriever using RAGAS

### Inputs / outputs

**This component consumes:**

- text: string
- retrieved_chunks: list<item: string>

**This component produces no fixed fields.** The fields it writes correspond to the RAGAS metrics you choose to compute (the component spec allows additional properties).

### Arguments

The component takes the following arguments to alter its behavior:

| argument | type | description | default |
| -------- | ---- | ----------- | ------- |
| module | str | Module from which the LLM is imported. Defaults to langchain.llms | langchain.llms |
| llm_name | str | Name of the selected LLM | / |
| llm_kwargs | dict | Arguments of the selected LLM | / |
| metrics | list | RAGAS metrics to compute | / |

### Usage

You can add this component to your pipeline using the following code:

```python
from fondant.pipeline import Pipeline


pipeline = Pipeline(...)

dataset = pipeline.read(...)

dataset = dataset.apply(
"evaluate_ragas",
arguments={
# Add arguments
# "module": "langchain.llms",
# "llm_name": ,
# "llm_kwargs": {},
# "metrics": [],
}
)
```
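
For illustration, here is a hypothetical filled-in call that mirrors the values used in this component's tests. Passing the metrics as a `produces` schema on `apply` is an assumption here, so check it against your Fondant version:

```python
import pyarrow as pa

# Continues from the snippet above; `dataset` is the result of `pipeline.read(...)`.
# Each produced field name doubles as the RAGAS metric to compute.
dataset = dataset.apply(
    "evaluate_ragas",
    arguments={
        "module": "langchain.llms",
        "llm_name": "OpenAI",
        "llm_kwargs": {"openai_api_key": "sk-..."},  # replace with your own key
    },
    produces={
        "context_precision": pa.float32(),
        "context_relevancy": pa.float32(),
    },
)
```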

### Testing

You can run the tests using Docker with BuildKit. The test stage reads an `OPENAI_KEY` build argument (used for the RAGAS evaluation), so pass your OpenAI API key along. From this directory, run:
```
docker build . --target test --build-arg OPENAI_KEY=$OPENAI_KEY
```
@@ -1,24 +1,22 @@
#metadata: to be matched w/ docker image
name: retriever_eval_ragas
description: Component that evaluates the retriever using RAGAS
image: ghcr.io/ml6team/retriever_eval:dev
image: fndnt/retriever_eval:dev
tags:
- Data writing
- Text processing

consumes:
text: #TODO: same as previous component produces
text:
type: string
retrieved_chunks:
type: array
items:
type: string

produces:
#TODO: add/retrieve chosen metrics to compute
context_precision:
type: float32
context_relevancy:
type: float32
additionalProperties: true
# Overwrite with metrics to be computed by ragas
# (https://docs.ragas.io/en/latest/concepts/metrics/index.html)


args:
module:
1 change: 1 addition & 0 deletions components/evaluate_ragas/requirements.txt
@@ -0,0 +1 @@
ragas==0.0.21
@@ -1,3 +1,5 @@
import typing as t

import pandas as pd
from datasets import Dataset
from fondant.component import PandasTransformComponent
@@ -12,15 +14,15 @@ def __init__(
module: str,
llm_name: str,
llm_kwargs: dict,
metrics: list,
produces: t.Dict[str, t.Any],
**kwargs,
) -> None:
"""
Args:
module: Module from which the LLM is imported. Defaults to langchain.llms
llm_name: Name of the selected llm
llm_kwargs: Arguments of the selected llm
metrics: RAGAS metrics to compute.
produces: RAGAS metrics to compute.
kwargs: Unhandled keyword arguments passed in by Fondant.
"""
self.llm = self.extract_llm(
@@ -29,7 +31,9 @@ def __init__(
model_kwargs=llm_kwargs,
)
self.gpt_wrapper = LangchainLLM(llm=self.llm)
self.metric_functions = self.extract_metric_functions(metrics=metrics)
self.metric_functions = self.extract_metric_functions(
metrics=list(produces.keys()),
)
self.set_llm(self.metric_functions)

# import the metric functions selected
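The hunk above derives the metric list from the keys of the `produces` mapping (`metrics=list(produces.keys())`) and then imports the matching RAGAS metric objects. The committed helper is collapsed in this view; a minimal sketch of how such a lookup could work, with the helper name and body assumed:

```python
import importlib


def extract_metric_functions(metrics: list) -> list:
    """Resolve ragas metric objects (e.g. context_precision) from their names."""
    module = importlib.import_module("ragas.metrics")
    return [getattr(module, metric) for metric in metrics]
```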
@@ -1,4 +1,7 @@
import os

import pandas as pd
import pyarrow as pa
from main import RetrieverEval


@@ -49,8 +52,11 @@ def test_transform():
component = RetrieverEval(
module="langchain.llms",
llm_name="OpenAI",
llm_kwargs={"openai_api_key": ""},
metrics=["context_precision", "context_relevancy"],
llm_kwargs={"openai_api_key": os.environ["OPENAI_KEY"]},
produces={
"context_precision": pa.float32(),
"context_relevancy": pa.float32(),
},
)

output_dataframe = component.transform(input_dataframe)
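For reference, a hypothetical minimal `input_dataframe` matching the fields this component consumes (`text` and `retrieved_chunks`); the committed fixture is collapsed in this view and may differ:

```python
import pandas as pd

# One question with its retrieved chunks, matching the component spec's
# consumed fields: text (string) and retrieved_chunks (list of strings).
input_dataframe = pd.DataFrame(
    {
        "text": ["What is the capital of France?"],
        "retrieved_chunks": [
            ["Paris is the capital of France.", "France borders Belgium."]
        ],
    }
)
```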
File renamed without changes.
5 changes: 0 additions & 5 deletions components/retriever_eval_ragas/requirements.txt

This file was deleted.
