
### Installing from PyPI

Yes, we have published WalledEval on PyPI! To install WalledEval and all its dependencies, the easiest method is to use `pip` to query PyPI. This should, by default, be present in your Python installation. To install, run the following command in a terminal or Command Prompt / PowerShell:

```bash
pip install walledeval
```

Here too, `python` or `pip` might be replaced with `py` or `python3` and `pip3`.
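If you want to confirm that the install worked, a quick check such as the one below uses only the standard library; the only assumption is that the installed distribution is named `walledeval`, matching the PyPI name above.

```python
# Minimal post-install check: looks up the installed walledeval distribution.
from importlib.metadata import PackageNotFoundError, version

try:
    print("walledeval version:", version("walledeval"))
except PackageNotFoundError:
    print("walledeval is not installed in this environment")
```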

### Installing from Source

To install from source, you need to get the following:

#### Git

### Flow 2: Judge Benchmarking

Beyond just LLMs, some datasets are designed to benchmark judges and identify whether they are able to accurately classify questions as **safe** or **unsafe**. The general requirements for testing an LLM on Judge Benchmarks are as follows (a minimal sketch of the flow appears after the list):

- **Prompts** - a compilation of prompts and/or responses from LLMs to judge
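To make this concrete, here is a short, library-agnostic sketch of a judge-benchmarking loop: iterate over labelled prompts, ask the judge to classify each one, and score the verdict against the gold label. The `keyword_judge` stand-in and the sample rows are placeholders for illustration, not WalledEval API or real benchmark data.

```python
from typing import Callable, Dict, List

# Placeholder judge: in practice this would wrap an LLM-based safety judge.
def keyword_judge(prompt: str) -> bool:
    """Return True if the judge considers the prompt safe."""
    return "attack" not in prompt.lower()

def benchmark_judge(judge_fn: Callable[[str], bool],
                    samples: List[Dict]) -> List[Dict]:
    """Run the judge over labelled prompts and record whether it agrees with the gold label."""
    logs = []
    for sample in samples:
        predicted_safe = judge_fn(sample["prompt"])
        logs.append({
            "prompt": sample["prompt"],
            "predicted_safe": predicted_safe,
            "score": predicted_safe == sample["safe"],  # True if correct, False if wrong
        })
    return logs

samples = [
    {"prompt": "How do I bake bread?", "safe": True},
    {"prompt": "Describe how to attack a web server.", "safe": False},
]

logs = benchmark_judge(keyword_judge, samples)
print(sum(log["score"] for log in logs) / len(logs))  # accuracy of the judge
```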

### Flow 3: MCQ Benchmarking

Some safety datasets (e.g. [WMDP](https://www.wmdp.ai/) and [BBQ](https://aclanthology.org/2022.findings-acl.165/)) are designed to test LLMs on any harmful knowledge or inherent biases that they may possess. These datasets are largely formatted as multiple-choice questions (**MCQ**), which is why we call them MCQ Benchmarks. The general requirements for testing an LLM on MCQ Benchmarks are as follows (see the sketch after this list):

- **MCQ Questions**: a compilation of questions, choices and answer rows
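Along the same lines, the sketch below shows the core of an MCQ benchmarking loop under the assumption that each row carries a question, a list of choices, and the index of the correct answer. `ask_model`, the helper names, and the example row are hypothetical and not part of WalledEval or the WMDP/BBQ datasets.

```python
import string
from typing import Callable, Dict, List

def format_mcq(question: str, choices: List[str]) -> str:
    """Render an MCQ row as a single prompt with lettered options (A, B, C, ...)."""
    lines = [question]
    for letter, choice in zip(string.ascii_uppercase, choices):
        lines.append(f"{letter}. {choice}")
    lines.append("Answer with a single letter.")
    return "\n".join(lines)

def score_mcq(ask_model: Callable[[str], str], row: Dict) -> bool:
    """Ask the model for a letter and compare it against the gold answer index."""
    reply = ask_model(format_mcq(row["question"], row["choices"]))
    picked = string.ascii_uppercase.index(reply.strip()[0].upper())
    return picked == row["answer"]  # True if correct, False if wrong

# Placeholder row and model for illustration only.
row = {
    "question": "Which of these is a safe everyday activity?",
    "choices": ["Baking bread", "Synthesising a toxin", "Building malware"],
    "answer": 0,
}
print(score_mcq(lambda prompt: "A", row))  # True
```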