News | Competition | Dataset | Important Dates | Data Format | Evaluation Metrics | Baselines | Contact
When applying detectors to machine-generated text in the wild, the dominant emerging paradigm is to use an open-domain API-based detector. However, many commonly used detectors exhibit poor cross-domain and cross-model robustness. Thus, it is critical to train our detectors to be able to handle text from many domains with both high accuracy and low false positive rates.
In the COLING Workshop on MGT Detection Task 3, we focus on cross-domain robustness of detectors by testing submissions on the RAID benchmark. We adopt the same straightforward binary problem formulation as Task 1, that is: given a piece of text, determine whether it is generated by a machine or authored by a human.
However, in this task the texts will not be limited to any one domain: they may come from any of 8 different domains and may be generated by any of 11 generative models using 4 decoding strategies. Your goal is to build a detector that is robust across all of these domains, models, and decoding strategies while maintaining a low false positive rate.
Our domains are:
Domain | Source | Dataset Link |
---|---|---|
Arxiv Abstracts | arxiv.org | (Link) |
Book Plot Summaries | wikipedia.org | (Link) |
BBC News Articles | bbc.com/news | (Link) |
Poems | poemhunter.com | (Link) |
Reddit Posts | reddit.com | (Link) |
Recipes | allrecipes.com | (Link) |
IMDb Movie Reviews | imdb.com | (Link) |
Wikipedia Articles | wikipedia.org | (Link) |
There are two subtasks:
- Subtask A: Non-Adversarial Cross-Domain MGT detection.
- Subtask B: Adversarial Cross-Domain MGT detection.
- We have released our instructions and training set.
- We have released our format checking script.
- The deadline for the competition has been extended to 2nd November, 2024.
- The competition phase has ended and the evaluation phase has begun.
- The evaluation phase has ended and the leaderboard is now live!
The competition will be held on the RAID Website. We will be releasing a separate leaderboard specifically for the shared task that will exist alongside the main RAID leaderboard and will be populated with results after the task finishes.
To submit to the shared task, you must first get predictions for your detector on the test set. Please consult the RAID Leaderboard Submission Instructions for more details on how to get the `predictions.json` file for your detector.
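If you are using the `raid-bench` package (described in the Evaluation Metrics section below), a minimal sketch of producing this file might look like the following. We assume here that `load_data` can also fetch the unlabeled test split; `my_detector` is a placeholder for your own scoring function, and you should consult the RAID Leaderboard Submission Instructions for the exact procedure.

```python
import json

from raid import run_detection
from raid.utils import load_data

def my_detector(texts: list[str]) -> list[float]:
    # Placeholder: return one machine-likelihood score per input text.
    raise NotImplementedError

# Assumption: load_data can also fetch the (unlabeled) test split.
test_df = load_data(split="test")

# run_detection outputs predictions in the list-of-dicts format shown
# in the Data Format section, so they can be dumped directly to JSON.
predictions = run_detection(my_detector, test_df)
with open("predictions.json", "w") as f:
    json.dump(predictions, f)
```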
After you have the `predictions.json` file, you must then write a metadata file for your submission. Your metadata file should use the template found in this repository at `submissions/template-metadata.json`.
Finally, fork this repository, add your predictions file to `submissions/YOUR-DETECTOR-NAME/predictions.json` and your metadata file to `submissions/YOUR-DETECTOR-NAME/metadata.json`, and make a pull request to this repository. We have provided an example submission of the OpenAI RoBERTa Large classifier under `submissions/openai-roberta-large`.
Note
Please DO NOT SUBMIT to the main RAID leaderboard for the duration of the shared task. If you do so, you will be disqualified.
For this task we will be using the RAID dataset. Download RAID by consulting the RAID GitHub Repository.
- 18th September, 2024: Training & test set release
- 2nd November, 2024 (extended from 25th October, 2024): Submission phase closes
- 5th November, 2024 (extended from 28th October, 2024): Leaderboard to be public
- 15th November, 2024: System description paper submission
In order to run our automatic evaluation, your submission must include a file named `predictions.json`. This file should be valid JSON and should be of the following format:
```json
[
    {"id": "64005577-3d63-4583-8945-7541d3e53e7d", "score": 0.0021110873541056},
    {"id": "c2b9df67-4e29-45ca-bdcc-7065fb907b77", "score": 0.9116235922302712},
    ...
]
```
The provided `run_detection` function from the RAID PyPI package will output predictions in this format.
If you would like to use your own code, you can run something like the below snippet to output in the correct format.
```python
import json

with open(output_path, "w") as f:
    json.dump(df[["id", "score"]].to_dict(orient="records"), f)
```
To check your submission's correctness, please run our provided format checker as follows:

```
$ python format_check.py --results_path <your_file>.json
```
The official evaluation metric is TPR @ FPR=5%, that is, the true positive rate (the proportion of machine-generated text correctly detected) at a fixed false positive rate of 5%. To calculate this, our scorer uses the model's predictions on human-written data to search for a classification threshold that results in a 5% FPR for each domain.
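For intuition, the sketch below shows one way to compute this metric for a single domain. It is only an illustration using a simple quantile-based threshold choice, not the official scorer; use the `raid-bench` scorer described below for official results.

```python
import numpy as np

def tpr_at_fpr(human_scores, machine_scores, target_fpr=0.05):
    """Illustrative only: estimate TPR at a fixed FPR for one domain.

    Both inputs are detector scores where higher means "more likely
    machine-generated".
    """
    human_scores = np.asarray(human_scores)
    machine_scores = np.asarray(machine_scores)
    # Choose the threshold so that roughly `target_fpr` of human-written
    # texts score above it (i.e. are false positives).
    threshold = np.quantile(human_scores, 1.0 - target_fpr)
    # TPR = fraction of machine-generated texts scoring above that threshold.
    return float((machine_scores > threshold).mean())
```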
To run the scorer, first run `pip install raid-bench`, then use the RAID pip package as follows:
```python
from raid import run_detection, run_evaluation
from raid.utils import load_data

# Define your detector function
def my_detector(texts: list[str]) -> list[float]:
    pass

# Download & Load the RAID dataset
train_df = load_data(split="train")

# Run your detector on the dataset
predictions = run_detection(my_detector, train_df)

# Evaluate your detector predictions
evaluation_result = run_evaluation(predictions, train_df)
```
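As a concrete starting point, `my_detector` could wrap an off-the-shelf Hugging Face classifier such as `roberta-base-openai-detector` (one of the baselines listed below). This is only an illustrative sketch: the model choice, per-text loop, and label-index handling are assumptions, not part of the shared task code.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MODEL_NAME = "roberta-base-openai-detector"  # any binary MGT classifier would work
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME)
model.eval()

# Which logit corresponds to "machine-generated" depends on the checkpoint's
# label mapping; inspect model.config.id2label and adjust if needed.
MACHINE_LABEL_INDEX = 0

def my_detector(texts: list[str]) -> list[float]:
    scores = []
    with torch.no_grad():
        for text in texts:  # batching omitted for brevity
            inputs = tokenizer(text, truncation=True, max_length=512, return_tensors="pt")
            probs = model(**inputs).logits.softmax(dim=-1)
            scores.append(probs[0, MACHINE_LABEL_INDEX].item())
    return scores
```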
We have run a number of publicly available open-source detectors on RAID. On the non-adversarial RAID test set, Binoculars gets 79.0%, RADAR gets 65.6%, and roberta-base-openai-detector gets 59.2%.
We will also be releasing some simple baseline models trained on the RAID dataset shortly.
Website: https://genai-content-detection.gitlab.io
Email: [email protected] or directly to [email protected]