Add RunEvaluator #4

Merged
vowelparrot merged 23 commits into main from vwp/evaluators on Jun 2, 2023
Conversation

@vowelparrot (Contributor) commented Jun 1, 2023

See: langchain-ai/langchain#5618 for example LangChain implementations

I think we'll want some simple non-LangChain completion function evaluators if we want this core interface to be of much use outside the OSS project. I also don't want to land the StringEvaluator class itself, but I'm putting it up as an example of one approach (contrasting with the ones above).

from typing import Optional, Tuple

from langchain.chat_models import ChatOpenAI
from langchain.evaluation.qa import QAEvalChain
from langchainplus_sdk.client import LangChainPlusClient
from langchainplus_sdk.evaluation.evaluator import StringEvaluator

client = LangChainPlusClient()

# Gather all non-errored runs recorded against the dataset's examples.
dataset_name = "calculator-example-dataset-2"
examples = client.list_examples(dataset_name=dataset_name)
runs = [
    run
    for example in examples
    for run in client.list_runs(reference_example=example.id, error=False)
]

# Grade each run's output against the reference answer with a QA eval chain.
chain = QAEvalChain.from_llm(llm=ChatOpenAI())


def evaluate(input_: str, output_: str, answer: Optional[str]) -> Tuple[str, Optional[float]]:
    result = chain({"query": input_, "result": output_, "answer": answer})
    # QAEvalChain's LLMChain output is under "text"; no numeric score here.
    return result["text"], None


# The keys select which run/example fields are passed to the grading function.
evaluator = StringEvaluator(
    evaluation_name="Correctness",
    input_key="input",
    prediction_key="output",
    answer_key="output",
    grading_function=evaluate,
)

for run in runs:
    client.evaluate_run(run, evaluator)
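
For context, a minimal sketch of the kind of "simple non-LangChain" evaluator mentioned above, reusing the client, runs, and StringEvaluator interface from the snippet; the exact_match grading function is a hypothetical stand-in, not part of this PR:

def exact_match(input_: str, output_: str, answer: Optional[str]) -> Tuple[str, Optional[float]]:
    # Grade without any LangChain dependency: compare the run's output
    # to the reference answer and return a label plus a numeric score.
    if answer is None:
        return "NO_REFERENCE", None
    correct = output_.strip().lower() == answer.strip().lower()
    return ("CORRECT" if correct else "INCORRECT"), float(correct)


exact_match_evaluator = StringEvaluator(
    evaluation_name="Exact Match",
    input_key="input",
    prediction_key="output",
    answer_key="output",
    grading_function=exact_match,
)

for run in runs:
    client.evaluate_run(run, exact_match_evaluator)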

@vowelparrot marked this pull request as draft on June 1, 2023 01:55
@vowelparrot changed the base branch from vwp/first_commit to main on June 1, 2023 13:58
@vowelparrot marked this pull request as ready for review on June 2, 2023 18:32
@vowelparrot force-pushed the vwp/evaluators branch 2 times, most recently from 7d330d9 to 4bd5c36 on June 2, 2023 19:10
@agola11 (Contributor) left a comment:
looks good!

@vowelparrot merged commit 0159529 into main on Jun 2, 2023
@vowelparrot deleted the vwp/evaluators branch on June 2, 2023 21:12