A list of deep learning models capable of distinguishing computer-generated text from human-written text. Each model lives in its own folder.
- chatgpt-roberta: A recent model (2023) focused on distinguishing ChatGPT-generated text from human-written text. The authors started from the RoBERTa-base model (a masked language model) and then fine-tuned it on curated data. PAPER CODE.
- openai-roberta-base: A model trained in 2019 to distinguish GPT-2-generated text from human-written text. The authors started from the RoBERTa-base model (a masked language model) and then fine-tuned it on curated data. PAPER CODE.
- openai-roberta-large: A model trained in 2019 to distinguish GPT-2-generated text from human-written text. The authors started from the RoBERTa-large model (a masked language model) and then fine-tuned it on curated data. PAPER CODE.
Each folder contains a run.sh script; just run it. The script builds a virtual environment and installs all the dependencies.
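A minimal sketch of what such a bootstrap script typically does, under stated assumptions: the file names requirements.txt and detector.py are illustrative, not taken from the actual repositories.

```shell
#!/bin/sh
# Sketch of a run.sh-style bootstrap: build a venv and install dependencies.
set -e

python3 -m venv .venv        # create the virtual environment
. .venv/bin/activate         # activate it for this shell

# The real scripts would install the model's dependencies here, e.g.:
#   pip install -r requirements.txt
# and then launch the detector, e.g.:
#   python detector.py

python --version             # confirm the venv's interpreter is active
```

Running the provided run.sh from inside the model's folder does all of this for you; the sketch is only to show what to expect.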
Inside the chatgpt-roberta-detector folder:
$ ./run.sh
Inside the openai-roberta-detector folder:
This runs the base version:
$ ./run-base.sh
This runs the large version:
$ ./run-large.sh
The openai-roberta-detector only works with Python 3.7, so python3.7 must be installed. (This is a hard requirement due to the transformers library version it depends on.)
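To verify the required interpreter is on your PATH before running the openai scripts, a quick check in plain POSIX shell (nothing repo-specific):

```shell
# Check that the required Python 3.7 interpreter is available.
if command -v python3.7 >/dev/null 2>&1; then
    python3.7 --version
else
    echo "python3.7 not found - install it before running the openai-roberta scripts"
fi
```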
The chatgpt-roberta-detector works with the latest versions of the required libraries, so it should run out of the box.
For both models, the scripts create virtual environments, so the python3-venv package must be installed.
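To confirm that the venv module is usable before running either script (on Debian/Ubuntu it ships in the python3-venv package), a quick sanity check:

```shell
# Verify that `python3 -m venv` works; if not, the venv package is missing.
if python3 -m venv --help >/dev/null 2>&1; then
    echo "venv available"
else
    echo "venv missing - on Debian/Ubuntu: sudo apt install python3-venv"
fi
```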