DALL-Eval: Probing the Reasoning Skills and Social Biases of Text-to-Image Generation Models (ICCV 2023)
- Authors: Jaemin Cho, Abhay Zala, and Mohit Bansal (UNC Chapel Hill)
- Paper
Please see ./paintskills for our DETR-based visual reasoning skill evaluation.
(Optional) Please see https://github.com/aszala/PaintSkills-Simulator for our 3D Simulator implementation.
Please see ./biases for our social (gender and skin tone) bias evaluation.
Please see ./quality for our image quality evaluation based on FID score.
Please see ./retrieval for our image-text alignment evaluation with CLIP-based R-precision.
Please see ./captioning for our image-text alignment evaluation with VL-T5 captioning.
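For readers unfamiliar with R-precision, the following is a minimal illustrative sketch of the top-1 variant, not the code in ./retrieval. It assumes that image and caption embeddings have already been computed (for example with CLIP), that row i of each list is a matching pair, and that the candidate pool for each image is every caption in the list. The function name `r_precision` and the helper `_normalize` are hypothetical.

```python
import math


def _normalize(vec):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def r_precision(image_embs, text_embs):
    """Fraction of images whose paired caption is the top-1 match.

    Assumption: image_embs[i] and text_embs[i] belong to the same pair,
    and every caption in text_embs is a retrieval candidate.
    """
    images = [_normalize(v) for v in image_embs]
    texts = [_normalize(v) for v in text_embs]
    hits = 0
    for i, img in enumerate(images):
        # Cosine similarity between this image and every candidate caption.
        sims = [sum(a * b for a, b in zip(img, txt)) for txt in texts]
        best = max(range(len(sims)), key=sims.__getitem__)
        hits += int(best == i)
    return hits / len(images)
```

With toy embeddings, `r_precision([[1, 0], [0, 1]], [[2, 0], [0, 3]])` scores every pair as a correct retrieval, while swapping the two captions scores none. The scripts in ./retrieval may use a different candidate pool and batching.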
We provide inference scripts for DALLE-small (DALLE-pytorch), minDALL-E, X-LXMERT, and Stable Diffusion.
We thank the developers of DETR, DALLE-pytorch, minDALL-E, X-LXMERT, and Stable Diffusion for their public code release.
Please cite our paper if you use our dataset in your work:
@inproceedings{Cho2023DallEval,
title = {DALL-Eval: Probing the Reasoning Skills and Social Biases of Text-to-Image Generation Models},
author = {Jaemin Cho and Abhay Zala and Mohit Bansal},
year = {2023},
booktitle = {ICCV},
}