Chart question answering (CQA) is a newly proposed visual question answering (VQA) task where an algorithm must answer questions about data visualizations, e.g. bar charts, pie charts, and line graphs. CQA requires capabilities that natural-image VQA algorithms lack: fine-grained measurements, optical character recognition, and handling out-of-vocabulary words in both questions and answers. Without modifications, state-of-the-art VQA algorithms perform poorly on this task. Here, we propose a novel CQA algorithm called parallel recurrent fusion of image and language (PReFIL). PReFIL first learns bimodal embeddings by fusing question and image features and then intelligently aggregates these learned embeddings to answer the given question. Despite its simplicity, PReFIL greatly surpasses state-of-the-art systems and human baselines on both the FigureQA and DVQA datasets. Additionally, we demonstrate that PReFIL can be used to reconstruct tables by asking a series of questions about a chart.
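To make the last sentence of the abstract concrete, here is a minimal sketch of how a table might be rebuilt by querying a chart QA model once per cell. It is not the paper's implementation: `reconstruct_table`, the `answer_fn` callable, and the question template are all hypothetical, and the sketch assumes the row and column labels are already known (in practice they could themselves be obtained by asking questions).

```python
from typing import Callable, Dict, List

# Hypothetical interface: any chart QA model wrapped as
# (chart_image_path, question) -> answer string.
AnswerFn = Callable[[str, str], str]


def reconstruct_table(
    chart: str,
    row_labels: List[str],
    column_labels: List[str],
    answer_fn: AnswerFn,
) -> Dict[str, Dict[str, str]]:
    """Rebuild a table by asking one question per (row, column) cell.

    The question template below is an assumption made for illustration;
    a real model would need questions phrased like its training data.
    """
    table: Dict[str, Dict[str, str]] = {}
    for row in row_labels:
        table[row] = {}
        for col in column_labels:
            question = f"What is the value of {col} for {row}?"
            table[row][col] = answer_fn(chart, question)
    return table
```

Any model that can be wrapped as `answer_fn(chart, question)` would fit this loop; the number of model calls grows with the number of cells, so batching would matter for larger charts.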
Hi @pratikkotian04, thanks for bringing up this topic. We are also quite interested in extending the question answering capabilities from text and tables to charts and other kinds of content. It's a larger topic though, and right now I can't give you an estimate of when question answering on bar charts, pie charts, and other visualizations will be supported by Haystack.
Did you have a look at the code accompanying the paper you just quoted? I found it here: https://github.com/kushalkafle/PREFIL Since the paper is from 2020 and we're already in 2022, there may be more recent research on this topic.
Hi @julian-risch, I am using Haystack for text and table question answering at my organization, and it would be helpful if I could continue using Haystack for chart question answering as well.
@pratikkotian04 We definitely have QA on charts on our list, yes. 👍 At the moment, we are working on an epic that will integrate image documents, in addition to text documents, into Haystack. In that context, we will implement an ImageRetriever and also an ImageToText node. Here is the link: #2418. Once this epic is done, we will be able to tackle QA on charts in Q3 2022. Stay tuned!