Commit

Minor refinement to wording.
Signed-off-by: Dean Wampler <[email protected]>
deanwampler committed Oct 15, 2024
1 parent 3f1f7b1 commit 11df544
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion docs/index.markdown
@@ -23,7 +23,7 @@ Welcome to the **The AI Alliance** project for **Trust and Safety Evaluations**.
Much like other software, generative AI (&ldquo;GenAI&rdquo;) [_Models_]({{site.baseurl}}/glossary/#model) and the [_AI Systems_]({{site.baseurl}}/glossary/#ai-system) that use them need to be trusted and useful to their users.

-[_Evaluation_]({{site.baseurl}}/glossary/#evaluation) aims to provide the evidence for gaining users’ trust in models and systems. More specifically, evaluation refers to the capability of measuring and quantifying how a model or system responds to inputs. Are the responses within acceptable bounds, for example free of hate speech and [_Hallucinations_]({{site.baseurl}}/glossary/#hallucination), useful to user, cost-effective, etc.?
+[_Evaluation_]({{site.baseurl}}/glossary/#evaluation) aims to provide the evidence for gaining users’ trust in models and systems. More specifically, evaluation refers to the capability of measuring and quantifying how a model or system responds to inputs. Are the responses within acceptable bounds, for example free of hate speech and [_Hallucinations_]({{site.baseurl}}/glossary/#hallucination), are they useful to users, cost-effective, etc.?

There are many organizations working on evaluations for safety, broadly defined, and other kinds of measurements, as well as [_Benchmarks_]({{site.baseurl}}/glossary/#benchmark) that aggregate some evaluations and [_Leaderboards_]({{site.baseurl}}/glossary/#leaderboard) that let you see how some models and systems do against benchmarks, without having to execute these benchmarks yourself.
