The official implementation of the paper “LOKI: A Comprehensive Synthetic Data Detection Benchmark using Large Multimodal Models”


LOKI
A Comprehensive Synthetic Data Detection Benchmark using Large Multimodal Models


🔥 Takeaways

- Diverse modalities: Our dataset includes high-quality multimodal data generated by recent popular synthetic models, covering video, image, 3D, text, and audio.
- Heterogeneous categories: Our dataset includes 26 detailed categories across different modalities, such as specialized satellite and medical images; texts on philosophy and ancient Chinese; and audio data such as singing voices, environmental sounds, and music.
- Multi-level tasks: LOKI includes basic "Synthetic or Real" labels, suitable for fundamental question settings like true/false and multiple-choice questions. It also incorporates fine-grained anomaly annotations for inferential explanations, enabling tasks like abnormal detail selection and abnormal explanation, to probe LMMs' capabilities in explainable synthetic data detection.
- Multimodal synthetic data evaluation framework: We propose a comprehensive evaluation framework that supports inputs in various data formats and over 25 mainstream multimodal models.

📚 Contents

- Installation
- Data Preparation
- Model Preparation
- Evaluation
- Acknowledgement
- Citations

Installation

Please clone our repository and change into its folder:

git clone https://github.com/opendatalab/LOKI.git
cd LOKI

Switch to the dev branch, create a new Python environment, and install the requirements:

git checkout dev
conda create -n loki python=3.10
conda activate loki
pip install -e .
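
Optionally, you can sanity-check the environment. This assumes PyTorch is pulled in as a dependency (which the accelerate-based launcher used below implies); the check only confirms GPU visibility, and accelerate config is the standard one-time setup for multi-process launches:

python -c "import torch; print(torch.cuda.is_available())"
accelerate config  # one-time interactive setup for multi-GPU launches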

Data Preparation

LOKI contains media data across 5 modalities: video, image, 3D, text and audio.

To evaluate the performance of LMMs on each modality, you first need to download the data from Hugging Face.

Then, unzip the dataset and place it under the current folder.
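
As a minimal sketch of the download step, assuming the dataset is hosted as a Hugging Face dataset repo (the repo ID and archive name below are hypothetical; use the link in this README) and that huggingface_hub is installed:

# Hypothetical repo ID -- replace with the actual LOKI dataset ID from the README link
huggingface-cli download opendatalab/LOKI --repo-type dataset --local-dir ./media_data
# Archive name is illustrative; unzip each modality archive in place
unzip media_data/image.zip -d media_data/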

Your media_data folder should look like:

├── 3D
├── image
└── video

Model Preparation

Our evaluation framework supports more than 20 mainstream foundation models; see here for the full model list.

Most models can be run off the shelf with our framework; for models that require a special environment setup, we refer readers to here for more information.

Evaluation

Now, start evaluating!

The configs folder contains configurations for the models and LOKI tasks, which are read and used by run.py.
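
For illustration only, a model config might look like the sketch below; every field name here is an assumption, so consult the actual files under configs/models for the real schema:

# Illustrative sketch, not the real schema -- see configs/models for shipped configs
model_name: phi_3.5_vision
model_path: microsoft/Phi-3.5-vision-instruct  # Hugging Face model ID
generation:
  max_new_tokens: 256
  temperature: 0.0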

For example, to evaluate the Phi-3.5-Vision model on LOKI's image judgement task, your command should be:

accelerate launch  --num_processes=4 --main_process_port=12005 run.py --model_config_path configs/models/phi_3.5_vision_config.yaml --task_config_path configs/tasks/image/image_tf_loki.yaml --batch_size 1 
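
The flags above are standard accelerate options. For a single-GPU run, the same entry point should work with one process (the port flag can be dropped; the batch size stays illustrative):

accelerate launch --num_processes=1 run.py --model_config_path configs/models/phi_3.5_vision_config.yaml --task_config_path configs/tasks/image/image_tf_loki.yaml --batch_size 1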

Acknowledgement

Parts of our framework's design philosophy are adapted from lmms-eval.

Citations

@article{ye2024loki,
  title={LOKI: A Comprehensive Synthetic Data Detection Benchmark using Large Multimodal Models},
  author={Ye, Junyan and Zhou, Baichuan and Huang, Zilong and Zhang, Junan and Bai, Tianyi and Kang, Hengrui and He, Jun and Lin, Honglin and Wang, Zihao and Wu, Tong and others},
  journal={arXiv preprint arXiv:2410.09732},
  year={2024}
}
