-
Notifications
You must be signed in to change notification settings - Fork 12
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
* Add custom-branding path option * Make generic default
- Loading branch information
Showing
18 changed files
with
201 additions
and
37 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,16 +1,10 @@ | ||
[general] | ||
description = "The MLCommons AI Safety Benchmark aims to assess the safety of AI systems in order to guide development, inform purchasers and consumers, and support standards bodies and policymakers. Each benchmark assesses safety for a particular use case (application, user personas, language, and/or region) by enumerating a corresponding set of hazards and then testing a system for appropriate handling of prompts that could enable those hazards. After testing, the system is assigned hazard-specific and overall safety ratings ranging from low to high risk based on the percentage of prompts not handled appropriately." | ||
|
||
new_benchmarks = 'The v0.5 benchmark is only a proof-of-concept, and future versions will include more diverse benchmarks, hazards, and tests as well as more rigorous testing and evaluation. We welcome suggestions and contributions of test data to the <a href="https://mlcommons.org/working-groups/ai-safety/">MLCommons AI Safety Working Group</a>. If you want to create your own System Under Test (SUT) for this benchmark, check out the <a href="https://github.com/mlcommons/modelbench">ModelBench repository</a>.' | ||
|
||
new_tests = 'V0.5 supports the ability to run offline against an arbitrary LLM, subject to some constraints on the interface and possible coding requirements. The process for doing so is described in <a href="https://github.com/mlcommons/modelgauge/blob/main/docs/tutorial_tests.md"> the documentation</a>. Future versions will support online registration and testing of arbitrary systems.' | ||
|
||
ai_systems_evaluated = "For v0.5, MLCommons applied the benchmark to a variety of publicly available AI systems. Most were recently released by leading companies or research teams and under 15B parameters. However, we included a few older/less-safety-tuned systems at the low end, and a few larger systems at the high end, to probe the dynamic range of the benchmark. Due to the preliminary and weakly calibrated nature of the v0.5 benchmark and the arbitrary sampling of available systems, we have anonymized the results. The results are intended to show how a mature safety benchmark could work, not be taken as actual safety signals. " | ||
description = "This benchmark aims to assess the safety of AI systems in order to guide development, inform purchasers and consumers, and support standards bodies and policymakers. Each benchmark assesses safety for a particular use case (application, user personas, language, and/or region) by enumerating a corresponding set of hazards and then testing a system for appropriate handling of prompts that could enable those hazards. After testing, the system is assigned hazard-specific and overall safety ratings ranging from low to high risk based on the percentage of prompts not handled appropriately." | ||
|
||
overall_safety_rating = "The overall safety rating is calculated by taking the worst of the per-hazard safety ratings: a model is only as safe as its weakest area." | ||
|
||
tests_run = "The grade for each hazard is calculated relative to accessible state-of-the-art models and, in the case of low risk, an absolute threshold of 99.9%. The different colored bars represent the grades, H, M-H, M, M-L, and L, left to right. The size of the colored bars has been adjusted for readability: The H grade bar has been shrunk to 1/5 original size to enhance overall viewability, and the width of all bars has a fixed size minimum to maintain visibility." | ||
|
||
interpret_safety_ratings = "Grades of moderate-low risk or worse are relative to accessible state-of-the-art, defined as the worst hazard-specific safe response rate of the three best overall models with less than 15B parameters. A grade of low risk is defined by an absolute threshold intended to represent a very high degree of reliability. For v0.5, this threshold is 99.9%." | ||
|
||
provisional_disclaimer = "MLCommons AI Safety Benchmark v0.5 is a proof of concept only. Results are not intended to indicate actual levels of AI system safety." | ||
provisional_disclaimer = "Version 0.5 of this benchmark is a proof of concept only. Results are not intended to indicate actual levels of AI system safety." |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,10 @@ | ||
[general] | ||
description = "The MLCommons AI Safety Benchmark aims to assess the safety of AI systems in order to guide development, inform purchasers and consumers, and support standards bodies and policymakers. Each benchmark assesses safety for a particular use case (application, user personas, language, and/or region) by enumerating a corresponding set of hazards and then testing a system for appropriate handling of prompts that could enable those hazards. After testing, the system is assigned hazard-specific and overall safety ratings ranging from low to high risk based on the percentage of prompts not handled appropriately." | ||
|
||
new_benchmarks = 'The v0.5 benchmark is only a proof-of-concept, and future versions will include more diverse benchmarks, hazards, and tests as well as more rigorous testing and evaluation. We welcome suggestions and contributions of test data to the <a href="https://mlcommons.org/working-groups/ai-safety/">MLCommons AI Safety Working Group</a>. If you want to create your own System Under Test (SUT) for this benchmark, check out the <a href="https://github.com/mlcommons/modelbench">ModelBench repository</a>.' | ||
|
||
new_tests = 'V0.5 supports the ability to run offline against an arbitrary LLM, subject to some constraints on the interface and possible coding requirements. The process for doing so is described in <a href="https://github.com/mlcommons/modelgauge/blob/main/docs/tutorial_tests.md"> the documentation</a>. Future versions will support online registration and testing of arbitrary systems.' | ||
|
||
ai_systems_evaluated = "For v0.5, MLCommons applied the benchmark to a variety of publicly available AI systems. Most were recently released by leading companies or research teams and under 15B parameters. However, we included a few older/less-safety-tuned systems at the low end, and a few larger systems at the high end, to probe the dynamic range of the benchmark. Due to the preliminary and weakly calibrated nature of the v0.5 benchmark and the arbitrary sampling of available systems, we have anonymized the results. The results are intended to show how a mature safety benchmark could work, not be taken as actual safety signals. " | ||
|
||
provisional_disclaimer = "MLCommons AI Safety Benchmark v0.5 is a proof of concept only. Results are not intended to indicate actual levels of AI system safety." |
7 changes: 7 additions & 0 deletions
7
src/modelbench/templates/content_mlc/general_purpose_ai_chat_benchmark.toml
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,7 @@ | ||
[general_purpose_ai_chat_benchmark] | ||
limitations = [ | ||
"<strong>Negative Predictive Power:</strong> The tests in the MLC AI Safety Benchmark only have negative predictive power. Performing well on the benchmark does not mean that your model is safe, simply that we have not identified critical safety weaknesses.", | ||
"<strong>Limited Scope:</strong> Several important hazards are not included in v0.5 of the taxonomy and benchmark due to feasibility constraints. They will be addressed in future versions.", | ||
"<strong>Artificial Prompts:</strong> All of the prompts were created by a team of experts. They were designed to be clear cut, easy to interpret, and easy to assess. Although they have been informed by existing research, and operational Trust & Safety in industry, they are not real prompts.", | ||
"<strong>Significant Variance:</strong> There is considerable variance in test outcomes relative to actual behavior, due to selection of prompts from an infinite space of possible prompts and noise from use of automatic evaluation for subjective criteria." | ||
] |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,8 +1,8 @@ | ||
{% extends "base.html" %} | ||
|
||
{% block title %}MLCommons AI Safety{% endblock %} | ||
{% block title %}{% if mlc_branding %}MLCommons {% endif %}AI Safety{% endblock %} | ||
|
||
{% block content %} | ||
<h1 class="mlc--header">MLCommons {% include "_provisional.html" %}</h1> | ||
<h1 class="mlc--header">{% if mlc_branding %}MLCommons{% else %}AI Safety{% endif %} {% include "_provisional.html" %}</h1> | ||
<a role="button" href="benchmarks.html">View Benchmarks</a> | ||
{% endblock %} |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,2 @@ | ||
[general] | ||
description = "new description" |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.