
TOCSIN

This repository contains the code for the paper "Zero-Shot Detection of LLM-Generated Text using Token Cohesiveness". Parts of the code and data are borrowed from Fast-DetectGPT.

Data

The following folders are used for the experiments:

  • ./exp_Open_source_model -> experiments on open-source model generations (Five_models.sh).
  • ./exp_API-based_model -> experiments on ChatGPT, GPT-4, and Gemini generations (API-based.sh).

Loading models

If you want to load models locally, place the files for the bart-base model in the 'facebook' directory.

For experiments with open-source LLMs, please download the models and arrange their directories in the following layout (a loading sketch follows the list):

gpt2-xl: './gpt2-xl'
opt-2.7b: 'facebook/opt-2.7b'
gpt-neo-2.7B: 'EleutherAI/gpt-neo-2.7B'
gpt-j-6B: 'EleutherAI/gpt-j-6B'
gpt-neox-20b: 'EleutherAI/gpt-neox-20b'
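
As a rough sketch of how these local directories might be loaded (assuming the repository relies on Hugging Face transformers, as Fast-DetectGPT does; the exact loading code lives in the repository's scripts), the directory names above can be passed directly as model paths:

```python
# Minimal sketch, assuming Hugging Face `transformers` is installed.
# The paths mirror the directory layout described above; adjust as needed.
from transformers import AutoModelForCausalLM, AutoModelForSeq2SeqLM, AutoTokenizer

model_path = "EleutherAI/gpt-neo-2.7B"  # or "./gpt2-xl", "facebook/opt-2.7b", ...

tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path)
model.eval()

# The bart-base files placed under the 'facebook' directory resolve as a local path:
bart = AutoModelForSeq2SeqLM.from_pretrained("facebook/bart-base")
bart.eval()
```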

Environment

  • Python 3.8
  • PyTorch 2.1.0
  • GPU: NVIDIA A40 with 48 GB memory
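
A quick sanity check of the environment (a minimal sketch; only the Python and PyTorch versions listed above come from this README):

```python
import torch

print(torch.__version__)              # expected: 2.1.0
print(torch.cuda.is_available())      # expected: True on a CUDA-capable GPU
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))  # e.g. "NVIDIA A40"
```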

Demo

Please run one of the following commands for a demo:

sh Five_models.sh

for experiments with open-source LLMs, or

sh API-based.sh

for experiments with API-based LLMs.

Citation

If you find this work useful, you can cite it with the following BibTeX entry:

@inproceedings{
anonymous2024zeroshot,
title={Zero-Shot Detection of {LLM}-Generated Text using Token Cohesiveness},
author={Anonymous},
booktitle={The 2024 Conference on Empirical Methods in Natural Language Processing},
year={2024},
url={https://openreview.net/forum?id=sbBAqZnszt}
}
