- Contents
- 0. Why am I curating this repository?
- 1. Chart Captioning
- 🔥 2. Chart Question Answering
- 3. Chart Reverse Engineering
- 4. Natural Language to Visualization
- 5. Visualization Design
- 6. Visualization Agents & Automatic Judge
- 7. Empirical Study on LLM's Chart Understanding & Chart Generation
- 🔥 8. Visualization for Interpreting, Evaluating, and Improving LLM
- 9. Generic Multimodal Large Language Model
- 10. Related Survey Papers
- 11. Others
0. Why am I curating this repository?
- I've found that existing VisxLLM paper-list repositories are updated infrequently (they were often created for a survey paper and then abandoned). I will gradually enrich this repository and keep it updated.
- Feel free to open an issue or a pull request to add a new paper you appreciate.
- Star and watch this repo for future updates 😁.
- I strongly recommend the tutorial LLM4Vis: Large Language Models for Information Visualization delivered by Prof. Hoque.
- Papers under the same category are listed in reverse chronological order.
1. Chart Captioning
Natural Language Dataset Generation Framework for Visualizations Powered by Large Language Models
Hyung-Kwon Ko, Hyeon Jeon, Gwanmo Park, Dae Hyun Kim, Nam Wook Kim, Juho Kim, Jinwook Seo
CHI 2024 • Code
VisText: A Benchmark for Semantically Rich Chart Captioning
Benny J. Tang, Angie Boggust, Arvind Satyanarayan
ACL 2023 Outstanding paper • Code
--- Pre-LLM ---
Chart-to-Text: A Large-Scale Benchmark for Chart Summarization
Shankar Kantharaj, Rixie Tiffany Leong, Xiang Lin, Ahmed Masry, Megh Thakkar, Enamul Hoque, Shafiq Joty
ACL 2022 • Code
🔥 2. Chart Question Answering
Advancing Multimodal Large Language Models in Chart Question Answering with Visualization-Referenced Instruction Tuning
Xingchen Zeng, Haichuan Lin, Yilin Ye, Wei Zeng
VIS 2024 • Code
Distill Visual Chart Reasoning Ability from LLMs to MLLMs
Wei He, Zhiheng Xi, Wanxu Zhao, Xiaoran Fan, Yiwen Ding, Zifei Shan, Tao Gui, Qi Zhang, Xuanjing Huang
arXiv, 24 Oct 2024 • Code
SynChart: Synthesizing Charts from Language Models
Mengchen Liu, Qixiu Li, Dongdong Chen, Dong Chen, Jianmin Bao, Yunsheng Li
arXiv, 25 Sep 2024
ChartMoE: Mixture of Expert Connector for Advanced Chart Understanding
Zhengzhuo Xu, Bowen Qu, Yiyan Qi, Sinan Du, Chengjin Xu, Chun Yuan, Jian Guo
arXiv, 5 Sep 2024 • Code
On Pre-training of Multimodal Language Models Customized for Chart Understanding
Wan-Cyuan Fan, Yen-Chun Chen, Mengchen Liu, Lu Yuan, Leonid Sigal
arXiv, 19 Jul 2024
ChartGemma: Visual Instruction-tuning for Chart Reasoning in the Wild
Ahmed Masry, Megh Thakkar, Aayush Bajaj, Aaryaman Kartha, Enamul Hoque, Shafiq Joty
arXiv, 4 Jul 2024 • Code
TinyChart: Efficient Chart Understanding with Visual Token Merging and Program-of-Thoughts Learning
Liang Zhang*, Anwen Hu*, Haiyang Xu, Ming Yan, Yichen Xu, Qin Jin†, Ji Zhang, Fei Huang
EMNLP 2024 • Code
OneChart: Purify the Chart Structural Extraction via One Auxiliary Token
Jinyue Chen*, Lingyu Kong*, Haoran Wei, Chenglong Liu, Zheng Ge, Liang Zhao, Jianjian Sun, Chunrui Han, Xiangyu Zhang
ACM MM 2024 (Oral) • Homepage
Representing Charts as Text for Language Models: An In-Depth Study of Question Answering for Bar Charts
Victor Soares Bursztyn, Jane Hoffswell, Eunyee Koh, Shunan Guo
VIS 2024 Short Paper
ChartInstruct: Instruction Tuning for Chart Comprehension and Reasoning
Ahmed Masry, Mehrad Shahmohammadi, Md Rizwan Parvez, Enamul Hoque, Shafiq Joty
Findings of ACL 2024 • Code
ChartAssistant: A Universal Chart Multimodal Language Model via Chart-to-Table Pre-training and Multitask Instruction Tuning
Fanqing Meng, Wenqi Shao, Quanfeng Lu, Peng Gao, Kaipeng Zhang, Yu Qiao, Ping Luo
Findings of ACL 2024 • Code
MMC: Advancing Multimodal Chart Understanding with Large-scale Instruction Tuning
Fuxiao Liu, Xiaoyang Wang, Wenlin Yao, Jianshu Chen, Kaiqiang Song, Sangwoo Cho, Yaser Yacoob, Dong Yu
NAACL 2024 • Code
Synthesize Step-by-Step: Tools, Templates and LLMs as Data Generators for Reasoning-Based Chart VQA
Zhuowan Li, Bhavan Jasani, Peng Tang, Shabnam Ghadar
CVPR 2024
ChartLlama: A Multimodal LLM for Chart Understanding and Generation
Yucheng Han, Chi Zhang, Xin Chen, Xu Yang, Zhibin Wang, Gang Yu, Bin Fu, Hanwang Zhang
arXiv, 27 Nov 2023 • Homepage
--- Pre-LLM ---
UniChart: A Universal Vision-language Pretrained Model for Chart Comprehension and Reasoning
Ahmed Masry*, Parsa Kavehzadeh*, Xuan Long Do, Enamul Hoque, Shafiq Joty
EMNLP 2023 • Code
DePlot: One-Shot Visual Language Reasoning by Plot-to-Table Translation
Fangyu Liu, Julian Martin Eisenschlos, Francesco Piccinno, Syrine Krichene, Chenxi Pang, Kenton Lee, Mandar Joshi, Wenhu Chen, Nigel Collier, Yasemin Altun
Findings of ACL 2023 • Code
MatCha: Enhancing Visual Language Pretraining with Math Reasoning and Chart Derendering
Fangyu Liu, Francesco Piccinno, Syrine Krichene, Chenxi Pang, Kenton Lee, Mandar Joshi, Yasemin Altun, Nigel Collier, Julian Martin Eisenschlos
ACL 2023 • Code
CharXiv: Charting Gaps in Realistic Chart Understanding in Multimodal LLMs
Zirui Wang, Mengzhou Xia, Luxi He, Howard Chen, Yitao Liu, Richard Zhu, Kaiqu Liang, Xindi Wu, Haotian Liu, Sadhika Malladi, Alexis Chevalier, Sanjeev Arora, Danqi Chen
NeurIPS 2024 Benchmark • Homepage
ChartInsights: Evaluating Multimodal Large Language Models for Low-Level Chart Question Answering
Yifan Wu, Lutao Yan, Leixian Shen, Yunhai Wang, Nan Tang, Yuyu Luo
Findings of EMNLP 2024 • Code
ChartX & ChartVLM: A Versatile Benchmark and Foundation Model for Complicated Chart Reasoning
Renqiu Xia, Bo Zhang, Hancheng Ye, Xiangchao Yan, Qi Liu, Hongbin Zhou, Zijun Chen, Min Dou, Botian Shi, Junchi Yan, Yu Qiao
arXiv, 19 Feb 2024
ChartBench: A Benchmark for Complex Visual Reasoning in Charts
Zhengzhuo Xu*, Sinan Du*, Yiyan Qi, Chengjin Xu, Chun Yuan†, Jian Guo
arXiv, 26 Dec 2023 • Homepage
--- Pre-LLM ---
ChartQA: A Benchmark for Question Answering about Charts with Visual and Logical Reasoning
Ahmed Masry, Xuan Long Do, Jia Qing Tan, Shafiq Joty, Enamul Hoque
Findings of ACL 2022 • Code
InfographicVQA
Minesh Mathew, Viraj Bagal, Rubèn Pérez Tito, Dimosthenis Karatzas, Ernest Valveny, C. V. Jawahar
WACV 2022 • Homepage
PlotQA: Reasoning over Scientific Plots
Pritha Ganguly, Nitesh Methani, Mitesh Khapra, Pratyush Kumar
WACV 2020 • Code
DVQA: Understanding Data Visualizations via Question Answering
Kushal Kafle, Brian Price, Scott Cohen, Christopher Kanan
CVPR 2018 • Code
FigureQA: An Annotated Figure Dataset for Visual Reasoning
ICLR 2018 Workshop track • Code
VizAbility: Enhancing Chart Accessibility with LLM-based Conversational Interaction
Joshua Gorniak, Yoon Kim, Donglai Wei, Nam Wook Kim
UIST 2024
3. Chart Reverse Engineering
ChartMimic: Evaluating LMM’s Cross-Modal Reasoning Capability via Chart-to-Code Generation
Chufan Shi, Cheng Yang, Yaxin Liu, Bo Shui, Junjie Wang, Mohan Jing, Linran Xu, Xinyu Zhu, Siheng Li, Yuxiang Zhang, Gongye Liu, Xiaomei Nie, Deng Cai, Yujiu Yang
arXiv, 14 Jun 2024 • Homepage
Is GPT-4V(ision) All You Need for Automating Academic Data Visualization? Exploring Vision-Language Models’ Capability in Reproducing Academic Charts
Zhehao Zhang, Weicheng Ma, Soroush Vosoughi
Findings of EMNLP 2024 • Code
--- Pre-LLM ---
InvVis: Large-scale data embedding for invertible visualization
Huayuan Ye, Chenhui Li, Yang Li and Changbo Wang
VIS 2023
Deep colormap extraction from visualizations
Lin-Ping Yuan, Wei Zeng, Siwei Fu, Zhiliang Zeng, Haotian Li, Chi-Wing Fu, Huamin Qu
TVCG 2022
Chartem: Reviving Chart Images with Data Embedding
Jiayun Fu, Bin Zhu, Weiwei Cui, Song Ge, Yun Wang, Haidong Zhang, He Huang, Yuanyuan Tang, Dongmei Zhang, and Xiaojing Ma
TVCG 2021
Reverse-Engineering Visualizations: Recovering Visual Encodings from Chart Images
Jorge Poco and Jeffrey Heer
EuroVis 2017
4. Natural Language to Visualization
VisEval: A Benchmark for Data Visualization in the Era of Large Language Models
Nan Chen, Yuge Zhang, Jiahang Xu, Kan Ren, and Yuqing Yang
VIS 2024 • Code
ChartGPT: Leveraging LLMs to Generate Charts from Abstract Natural Language
Yuan Tian, Weiwei Cui, Dazhen Deng, Xinjing Yi, Yurun Yang, Haidong Zhang, Yingcai Wu
TVCG 2024
LLM4Vis: Explainable Visualization Recommendation using ChatGPT
Lei Wang, Songheng Zhang, Yun Wang, Ee-Peng Lim, Yong Wang
EMNLP 2023 Industry • Code
LIDA: A Tool for Automatic Generation of Grammar-Agnostic Visualizations and Infographics using Large Language Models
Victor Dibia
ACL 2023 Demo • Code
--- Pre-LLM ---
Natural Language to Visualization by Neural Machine Translation
Yuyu Luo, Nan Tang, Guoliang Li, Jiawei Tang, Chengliang Chai, Xuedi Qin
VIS 2021 • Code
nvBench: A Large-Scale Synthesized Dataset for Cross-Domain Natural Language to Visualization Task
Yuyu Luo, Jiawei Tang, Guoliang Li
Workshop on NL VIZ 2021 at IEEE VIS 2021 • Code
5. Visualization Design
DataNarrative: Automated Data-Driven Storytelling with Visualizations and Texts
Mohammed Saidul Islam, Enamul Hoque, Shafiq Joty, Md Tahmid Rahman Laskar, Md Rizwan Parvez
EMNLP 2024 • Code
Beyond Generating Code: Evaluating GPT on a Data Visualization Course
Zhutian Chen, Chenyang Zhang, Qianwen Wang, Jakob Troidl, Simon Warchol, Johanna Beyer
EduVis 2024 • Code
SNIL: Generating Sports News from Insights with Large Language Models
Liqi Cheng, Dazhen Deng, Xiao Xie, Rihong Qiu, Mingliang Xu, Yingcai Wu
TVCG 2024
Large Language Models estimate fine-grained human color-concept associations
Kushin Mukherjee, Timothy T. Rogers, Karen B. Schloss
arXiv, 4 May 2024
NL2Color: Refining Color Palettes for Charts with Natural Language
Chuhan Shi, Weiwei Cui, Chengzhong Liu, Chengbo Zheng, Haidong Zhang, Qiong Luo, and Xiaojuan Ma
VIS 2023
DracoGPT: Extracting Visualization Design Preferences from Large Language Models
Huichen Will Wang, Mitchell Gordon, Leilani Battle, and Jeffrey Heer
VIS 2024
Bavisitter: Integrating Design Guidelines into Large Language Models for Visualization Authoring
Jiwon Choi, Jaeung Lee, Jaemin Jo
VIS 2024 Short Paper
User-Adaptive Visualizations: An Exploration with GPT-4
F. Yanez and C. Nobre
MLVis 2024
6. Visualization Agents & Automatic Judge
The Visualization JUDGE: Can Multimodal Foundation Models Guide Visualization Design Through Visual Perception?
Matthew Berger, Shusen Liu
arXiv, 5 Oct 2024
MatPlotAgent: Method and Evaluation for LLM-Based Agentic Scientific Data Visualization
Zhiyu Yang, Zihan Zhou, Shuo Wang, Xin Cong, Xu Han, Yukun Yan, Zhenghao Liu, Zhixing Tan, Pengyuan Liu, Dong Yu, Zhiyuan Liu, Xiaodong Shi, Maosong Sun
Findings of ACL 2024 • Code
WaitGPT: Monitoring and Steering Conversational LLM Agent in Data Analysis with On-the-Fly Code Visualization
Liwenhan Xie, Chengbo Zheng, Haijun Xia, Huamin Qu, Chen Zhu-Tian
UIST 2024
AVA: Towards Autonomous Visualization Agents through Visual Perception-Driven Decision-Making
Shusen Liu, Haichao Miao, Zhimin Li, Matthew Olson, Valerio Pascucci, P.-T. Bremer
EuroVis 2024
AVA: An automated and AI-driven intelligent visual analytics framework
Jiazhe Wang, Xi Li, Chenlu Li, Di Peng, Arran Zeyu Wang, Yuhui Gu, Xingui Lai, Haifeng Zhang, Xinyue Xu, Xiaoqing Dong, Zhifeng Lin, Jiehui Zhou, Xingyu Liu, Wei Chen
Visual Informatics 2024
LEVA: Using large language models to enhance visual analytics
Yuheng Zhao, Yixing Zhang, Yu Zhang, Xinyi Zhao, Junjie Wang, Zekai Shao, Cagatay Turkay, Siming Chen
TVCG 2024
7. Empirical Study on LLM's Chart Understanding & Chart Generation
How Aligned are Human Chart Takeaways and LLM Predictions? A Case Study on Bar Charts with Varying Layouts
Huichen Will Wang, Jane Hoffswell, Sao Myat Thazin Thane, Victor S Bursztyn, Cindy Xiong Bearfield
VIS 2024
An Empirical Evaluation of the GPT-4 Multimodal Language Model on Visualization Literacy Tasks
Alexander Bendeck, John Stasko
VIS 2024
Visualization Literacy of Multimodal Large Language Models: A Comparative Study
Zhimin Li, Haichao Miao, Valerio Pascucci, Shusen Liu
arXiv, 24 Jun 2024
Enhancing Data Literacy On-demand: LLMs as Guides for Novices in Chart Interpretation
Kiroong Choe, Chaerin Lee, S. Lee, Jiwon Song, Aeri Cho, Nam Wook Kim, Jinwook Seo
VIS 2024
Promises and Pitfalls: Using Large Language Models to Generate Visualization Items
Yuan Cui, Lily W. Ge, Yiren Ding, Lane Harrison, Fumeng Yang, and Matthew Kay
VIS 2024
How Good (Or Bad) Are LLMs at Detecting Misleading Visualizations?
Leo Yu-Ho Lo, Huamin Qu
VIS 2024
Exploring the Capability of LLMs in Performing Low-Level Visual Analytic Tasks on SVG Data Visualizations
Zhongzheng Xu, Emily Wall
VIS 2024 Short Paper
Evaluating the Semantic Profiling Abilities of LLMs for Natural Language Utterances in Data Visualization
Hannah K Bako, Arshnoor Buthani, Xinyi Liu, Kwesi A Cobbina, Zhicheng Liu
VIS 2024 Short Paper • Code
🔥 8. Visualization for Interpreting, Evaluating, and Improving LLM
If you are interested in this topic, you can find some interesting interactive papers/demos in the VISxAI workshop.
LVLM-Interpret: An Interpretability Tool for Large Vision-Language Models
Gabriela Ben Melech Stan, Estelle Aflalo, Raanan Yehezkel Rohekar, Anahita Bhiwandiwalla, Shao-Yen Tseng, Matthew Lyle Olson, Yaniv Gurwicz, Chenfei Wu, Nan Duan, Vasudev Lal
arXiv, 3 Apr 2024 • Homepage
LLM Comparator: Interactive Analysis of Side-by-Side Evaluation of Large Language Models
Minsuk Kahng, Ian Tenney, Mahima Pushkarna, Michael Xieyang Liu, James Wexler, Emily Reif, Krystal Kallarackal, Minsuk Chang, Michael Terry, and Lucas Dixon
VIS 2024 • Code • Demo
Towards Dataset-scale and Feature-oriented Evaluation of Text Summarization in Large Language Model Prompts
Sam Yu-Te Lee, Aryaman Bahukhandi, Dongyu Liu, Kwan-Liu Ma
VIS 2024
LLM Attributor: Attribute LLM's Generated Text to Training Data
Seongmin Lee, Zijie J. Wang, Aishwarya Chakravarthy, Alec Helbling, ShengYun Peng, Mansi Phute, Duen Horng (Polo) Chau, Minsuk Kahng
VIS 2024 Poster • Code
Can Large Language Models Explain Their Internal Mechanisms?
Nada Hussein, Asma Ghandeharioun, Ryan Mullins, Emily Reif, Jimbo Wilson, Nithum Thain and Lucas Dixon
VISxAI Workshop 2024 • Demo
ExplainPrompt: Decoding the language of AI prompts
Shawn Simister
VISxAI Workshop 2024 • Demo
AttentionViz: A Global View of Transformer Attention
Catherine Yeh, Yida Chen, Aoyu Wu, Cynthia Chen, Fernanda Viégas, Martin Wattenberg
VIS 2023 • Demo
Do Machine Learning Models Memorize or Generalize?
Adam Pearce, Asma Ghandeharioun, Nada Hussein, Nithum Thain, Martin Wattenberg and Lucas Dixon
VISxAI Workshop 2023 • Demo
9. Generic Multimodal Large Language Model
Refer to Awesome-Multimodal-Large-Language-Models for a more comprehensive list. This repo only records papers that bring insights to LLMxVis.
ControlMLLM: Training-Free Visual Prompt Learning for Multimodal Large Language Models
Mingrui Wu, Xinyue Cai, Jiayi Ji, Jiale Li, Oucheng Huang, Gen Luo, Hao Fei, Guannan Jiang, Xiaoshuai Sun, Rongrong Ji
NeurIPS 2024 • Code
MM1.5: Methods, Analysis & Insights from Multimodal LLM Fine-tuning
Haotian Zhang, Mingfei Gao, Zhe Gan, Philipp Dufter, Nina Wenzel, Forrest Huang, Dhruti Shah, Xianzhi Du, Bowen Zhang, Yanghao Li, Sam Dodge, Keen You, Zhen Yang, Aleksei Timofeev, Mingze Xu, Hong-You Chen, Jean-Philippe Fauconnier, Zhengfeng Lai, Haoxuan You, Zirui Wang, Afshin Dehghan, Peter Grasch, Yinfei Yang
arXiv, 30 Sep 2024
Law of Vision Representation in MLLMs
Shijia Yang, Bohan Zhai, Quanzeng You, Jianbo Yuan, Hongxia Yang, Chenfeng Xu
arXiv, 29 Aug 2024
EAGLE: Exploring The Design Space for Multimodal LLMs with Mixture of Encoders
Min Shi*, Fuxiao Liu*, Shihao Wang, Shijia Liao, Subhashree Radhakrishnan, De-An Huang, Hongxu Yin, Karan Sapra, Yaser Yacoob, Humphrey Shi, Bryan Catanzaro, Andrew Tao, Jan Kautz, Zhiding Yu, Guilin Liu
arXiv, 28 Aug 2024 • Code
Cambrian-1: A Fully Open, Vision-Centric Exploration of Multimodal LLMs
Shengbang Tong*, Ellis Brown*, Penghao Wu*, Sanghyun Woo, Manoj Middepogu, Sai Charitha Akula, Jihan Yang, Shusheng Yang, Adithya Iyer, Xichen Pan, Austin Wang, Rob Fergus, Yann LeCun, Saining Xie
arXiv, 24 Jun 2024 • Code
DeCo: Decoupling Token Compression from Semantic Abstraction in Multimodal Large Language Models
Linli Yao, Lei Li, Shuhuai Ren, Lean Wang, Yuanxin Liu, Xu Sun, Lu Hou
arXiv, 31 May 2024 • Code
Eyes Wide Shut? Exploring the Visual Shortcomings of Multimodal LLMs
Shengbang Tong, Zhuang Liu, Yuexiang Zhai, Yi Ma, Yann LeCun, Saining Xie
CVPR 2024
InternVL: Scaling up Vision Foundation Models and Aligning for Generic Visual-Linguistic Tasks
Zhe Chen, Jiannan Wu, Wenhai Wang, Weijie Su, Guo Chen, Sen Xing, Muyan Zhong, Qinglong Zhang, Xizhou Zhu, Lewei Lu, Bin Li, Ping Luo, Tong Lu, Yu Qiao, Jifeng Dai
CVPR 2024
Shikra: Unleashing Multimodal LLM’s Referential Dialogue Magic
arXiv, 27 Jun 2023 • Code
10. Related Survey Papers
Natural Language Generation for Visualizations: State of the Art, Challenges and Future Directions
Enamul Hoque, Mohammed Saidul Islam
arXiv, 29 Sep 2024
Generative AI for visualization: State of the art and future directions
Yilin Ye, Jianing Hao, Yihan Hou, Zhan Wang, Shishi Xiao, Yuyu Luo, Wei Zeng
Visual Informatics 2024
Datasets of Visualization for Machine Learning
Can Liu, Ruike Jiang, Shaocong Tan, Jiacheng Yu, Chaofan Yang, Hanning Shao, Xiaoru Yuan
arXiv, 23 Jul 2024
Foundation models meet visualizations: Challenges and opportunities
Weikai Yang, Mengchen Liu, Zheng Wang, Shixia Liu
CVM 2024
From Detection to Application: Recent Advances in Understanding Scientific Tables and Figures
Jiani Huang, Haihua Chen, Fengchang Yu, Wei Lu
ACM Computing Surveys 2024
From Pixels to Insights: A Survey on Automatic Chart Understanding in the Era of Large Foundation Models
Kung-Hsiang Huang, Hou Pong Chan, Yi R. Fung, Haoyi Qiu, Mingyang Zhou, Shafiq Joty, Shih-Fu Chang, Heng Ji
arXiv, 18 Mar 2024
Leveraging large models for crafting narrative visualization: a survey
Yi He, Shixiong Cao, Yang Shi, Qing Chen, Ke Xu, Nan Cao
arXiv, 25 Jan 2024
Natural Language Interfaces for Tabular Data Querying and Visualization: A Survey
Weixu Zhang, Yifei Wang, Yuanfeng Song, Victor Junqiu Wei, Yuxing Tian, Yiyan Qi, Jonathan H. Chan, Raymond Chi-Wing Wong, Haiqin Yang
arXiv, 27 Oct 2023
AI4VIS: Survey on Artificial Intelligence Approaches for Data Visualization
Aoyu Wu, Yun Wang, Xinhuan Shu, Dominik Moritz, Weiwei Cui, Haidong Zhang, Dongmei Zhang, and Huamin Qu
TVCG 2022 • Homepage
Chart Question Answering: State of the Art and Future Directions
Enamul Hoque, Parsa Kavehzadeh, Ahmed Masry
EuroVis 2022
Towards Natural Language Interfaces for Data Visualization: A Survey
Leixian Shen, Enya Shen, Yuyu Luo, Xiaocong Yang, Xuming Hu, Xiongshuai Zhang, Zhiwei Tai, Jianmin Wang
TVCG 2022
11. Others
VisAnatomy: An SVG Chart Corpus with Fine-Grained Semantic Labels
Chen Chen, Hannah K. Bako, Peihong Yu, John Hooker, Jeffrey Joyal, Simon C. Wang, Samuel Kim, Jessica Wu, Aoxue Ding, Lara Sandeep, Alex Chen, Chayanika Sinha, Zhicheng Liu
arXiv, 16 Oct 2024 • Homepage
Multimodal Chart Retrieval: A Comparison of Text, Table and Image Based Approaches
Averi Nowak, Francesco Piccinno, Yasemin Altun
NAACL 2024