Add RunEvaluator #4

Merged
vowelparrot merged 23 commits into main from vwp/evaluators on Jun 2, 2023
Conversation

@vowelparrot (Contributor) commented Jun 1, 2023

See: langchain-ai/langchain#5618 for example LangChain implementations

I think we'll want some simple non-LangChain completion function evaluators if we want this core interface to be of much use outside the OSS project. I also don't want to land the StringEvaluator class itself, but I'm putting it up as an example of one approach (contrasting with the ones above).

from typing import Optional, Tuple

from langchain.chat_models import ChatOpenAI
from langchain.evaluation.qa import QAEvalChain
from langchainplus_sdk.client import LangChainPlusClient
from langchainplus_sdk.evaluation.evaluator import StringEvaluator

client = LangChainPlusClient()

# Gather all non-errored runs recorded against the dataset's examples.
dataset_name = "calculator-example-dataset-2"
examples = client.list_examples(dataset_name=dataset_name)
runs = [
    run
    for example in examples
    for run in client.list_runs(reference_example=example.id, error=False)
]

# Grade each run's output against the reference answer with a QA eval chain.
chain = QAEvalChain.from_llm(llm=ChatOpenAI())


def evaluate(input_: str, output_: str, answer: Optional[str]) -> Tuple[str, Optional[float]]:
    result = chain({"query": input_, "result": output_, "answer": answer})
    # QAEvalChain's LLMChain output is under "text"; no numeric score here.
    return result["text"], None


# The keys select which run/example fields are passed to the grading function.
evaluator = StringEvaluator(
    evaluation_name="Correctness",
    input_key="input",
    prediction_key="output",
    answer_key="output",
    grading_function=evaluate,
)

for run in runs:
    client.evaluate_run(run, evaluator)
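
For context, a minimal sketch of the kind of "simple non-LangChain" evaluator mentioned above, reusing the client, runs, and StringEvaluator interface from the snippet; the exact_match grading function is a hypothetical stand-in, not part of this PR:

def exact_match(input_: str, output_: str, answer: Optional[str]) -> Tuple[str, Optional[float]]:
    # Grade without any LangChain dependency: compare the run's output
    # to the reference answer and return a label plus a numeric score.
    if answer is None:
        return "NO_REFERENCE", None
    correct = output_.strip().lower() == answer.strip().lower()
    return ("CORRECT" if correct else "INCORRECT"), float(correct)


exact_match_evaluator = StringEvaluator(
    evaluation_name="Exact Match",
    input_key="input",
    prediction_key="output",
    answer_key="output",
    grading_function=exact_match,
)

for run in runs:
    client.evaluate_run(run, exact_match_evaluator)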

@vowelparrot marked this pull request as draft on June 1, 2023 01:55
@vowelparrot changed the base branch from vwp/first_commit to main on June 1, 2023 13:58
@vowelparrot marked this pull request as ready for review on June 2, 2023 18:32
@vowelparrot force-pushed the vwp/evaluators branch 2 times, most recently from 7d330d9 to 4bd5c36 on June 2, 2023 19:10
@agola11 (Contributor) left a comment:
looks good!

@vowelparrot merged commit 0159529 into main on Jun 2, 2023
@vowelparrot deleted the vwp/evaluators branch on June 2, 2023 21:12