Commit

Removing benchmark uid from content, using one in class instead.
wpietri committed Jul 23, 2024
1 parent 033a69c commit bca20a1
Showing 3 changed files with 1 addition and 8 deletions.
@@ -1,8 +1,6 @@
[general_purpose_ai_chat_benchmark]
name = "Benchmark for general purpose AI chat model"

-uid = "general_purpose_ai_chat_benchmark"
-
version = "0.5"

tagline = "Evaluates the safety of a fine-tuned LLM for general purpose, low-risk chat use in the English language and North American or Western European cultural context by a typical adult user. Evaluation is currently only of a limited range of hazards focused on physical harms, criminal activity, hate speech, and sexual abuse."
2 changes: 1 addition & 1 deletion src/modelbench/templates/test_report.html
@@ -49,7 +49,7 @@ <h2>Run Details</h2>
<article class="mlc--card__border mlc--card__grid mlc--card__box-shadow">
<div>
<h6 class="mlc--test-detail-header">Benchmark UID</h6>
-<p>{{ content(benchmark_score.benchmark_definition, "uid") }}</p>
+<p>{{ benchmark_score.benchmark_definition.uid }}</p>
</div>
<div>
<h6 class="mlc--test-detail-header">Benchmark Version</h6>
5 changes: 0 additions & 5 deletions tests/test_record.py
@@ -70,11 +70,6 @@ def test_benchmark_definition():
assert j["uid"] == "general_purpose_ai_chat_benchmark-0.5"
assert "safe_cae_hazard-0.5" in [i["uid"] for i in j["hazards"]]

-# TODO: make sure the benchmark hazards list test
-
-
-# TODO remove benchmark UID from content YAML
-

def test_hazard_score():
ve = ValueEstimate.make(1.0, 100000)
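
The pattern this commit moves to, reading the UID from the benchmark object instead of looking it up in the content file, can be sketched roughly as follows. Only the attribute access `benchmark_definition.uid` and the expected value `general_purpose_ai_chat_benchmark-0.5` come from the diff. The class body and the rule that derives the UID from the class name and version are assumptions for illustration, not the modelbench implementation.

```python
import re


class GeneralPurposeAiChatBenchmark:
    """Hypothetical stand-in for a benchmark definition class."""

    # Assumed to live on the class rather than in the content file.
    version = "0.5"

    @property
    def uid(self) -> str:
        # Assumed rule: snake_case class name, a hyphen, then the version.
        name = type(self).__name__
        snake = re.sub(r"(?<!^)(?=[A-Z])", "_", name).lower()
        return f"{snake}-{self.version}"


benchmark_definition = GeneralPurposeAiChatBenchmark()
# Mirrors the template expression {{ benchmark_score.benchmark_definition.uid }}
print(benchmark_definition.uid)
```

With the UID computed in one place, the content file no longer needs its own `uid` entry, and the template and the JSON record cannot drift apart.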