add docs
bittersweet1999 committed Dec 18, 2023
1 parent 9dd44de commit 7cbfadf
Showing 3 changed files with 14 additions and 2 deletions.
2 changes: 0 additions & 2 deletions configs/eval_subjective_judge_pandalm.py
@@ -1,5 +1,3 @@
-from os import getenv as gv
-
 from mmengine.config import read_base
 with read_base():
     from .models.qwen.hf_qwen_7b_chat import models as hf_qwen_7b_chat
7 changes: 7 additions & 0 deletions docs/en/advanced_guides/subjective_evaluation.md
@@ -144,6 +144,13 @@ The `-r` parameter allows the reuse of model inference and GPT-4 evaluation resu
 The response of JudgeLLM will be output to `output/.../results/timestamp/xxmodel/xxdataset/.json`.
 The evaluation report will be output to `output/.../summary/timestamp/report.csv`.

+OpenCompass supports many JudgeLLMs; in fact, any model can be used as a JudgeLLM in OpenCompass configs.
+Popular open-source JudgeLLMs are listed here:
+
+1. Auto-J, refer to `configs/models/judge_llm/auto_j`
+2. JudgeLM, refer to `configs/models/judge_llm/judgelm`
+3. PandaLM, refer to `configs/models/judge_llm/pandalm`
+
 ## Practice: AlignBench Evaluation

 ### Dataset
7 changes: 7 additions & 0 deletions docs/zh_cn/advanced_guides/subjective_evaluation.md
@@ -142,6 +142,13 @@ python run.py configs/eval_subjective_score.py -r
 The JudgeLLM's evaluation responses are saved to `output/.../results/timestamp/xxmodel/xxdataset/.json`
 The evaluation report is output to `output/.../summary/timestamp/report.csv`

+OpenCompass already supports many JudgeLLMs; in fact, any model supported by OpenCompass can be used as a JudgeLLM.
+Popular open-source JudgeLLMs are listed below:
+
+1. Auto-J, refer to `configs/models/judge_llm/auto_j`
+2. JudgeLM, refer to `configs/models/judge_llm/judgelm`
+3. PandaLM, refer to `configs/models/judge_llm/pandalm`
+
 ## Practice: AlignBench Subjective Evaluation

 ### Dataset Preparation
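As a rough illustration of the JudgeLLM list added in this commit, the sketch below maps each listed config directory to the relative import prefix that a config file under `configs/` would use inside `read_base()`, mirroring the `.models.qwen.hf_qwen_7b_chat` import in the first changed file. The helper `relative_import_prefix` and the table `JUDGE_LLM_CONFIG_DIRS` are hypothetical names, not part of OpenCompass, and the model file names inside each directory are not given in the diff, so they stay as a placeholder.

```python
from pathlib import PurePosixPath

# Directories copied verbatim from the list added to the docs.
# The table itself is a hypothetical convenience, not an OpenCompass API.
JUDGE_LLM_CONFIG_DIRS = {
    "Auto-J": "configs/models/judge_llm/auto_j",
    "JudgeLM": "configs/models/judge_llm/judgelm",
    "PandaLM": "configs/models/judge_llm/pandalm",
}


def relative_import_prefix(name: str, config_root: str = "configs") -> str:
    """Turn a listed config directory into a dotted relative-import prefix.

    Assumes the importing config file lives directly under `config_root`,
    as `configs/eval_subjective_judge_pandalm.py` does in this commit.
    Raises KeyError if `name` is not one of the listed JudgeLLMs.
    """
    directory = PurePosixPath(JUDGE_LLM_CONFIG_DIRS[name])
    relative = directory.relative_to(config_root)
    return "." + ".".join(relative.parts)


if __name__ == "__main__":
    for judge_name in JUDGE_LLM_CONFIG_DIRS:
        prefix = relative_import_prefix(judge_name)
        # <model_file> is a placeholder: the diff does not name the files.
        print(f"{judge_name}: from {prefix}.<model_file> import models")
```

Keeping the directories in one table makes it easy to switch the judge in a config by changing a single name, though the actual model file to import still has to be looked up in the corresponding directory.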
