A curated list of image text translation papers and benchmark datasets.
- Shaolin Zhu∗, Shangjie Li∗, Yikun Lei, Deyi Xiong†. PEIT: Bridging the Modality Gap with Pre-trained Models for End-to-End Image Translation[C]. ACL 2023. code:[code].
- Zhibin Lan, Jiawei Yu, Xiang Li, Wen Zhang, Jian Luan, Bin Wang, Degen Huang, Jinsong Su. Exploring Better Text Image Translation with Multimodal Codebook[C]. ACL 2023. code:[code].
- Cong Ma, Xu Han, Linghui Wu, Yaping Zhang, Yang Zhao, Yu Zhou, Chengqing Zong. Modal Contrastive Learning based End-to-End Text Image Machine Translation[J]. TASLP 2023.
- Cong Ma, Yaping Zhang, Mei Tu, Yang Zhao, Yu Zhou, Chengqing Zong. Multi-teacher Knowledge Distillation for End-to-End Text Image Machine Translation[C]. ICDAR 2023.
- Cong Ma, Yaping Zhang, Mei Tu, Yang Zhao, Yu Zhou, Chengqing Zong. E2TIMT: Efficient and Effective Modal Adapter for Text Image Machine Translation[C]. ICDAR 2023.
- Cong Ma, Yaping Zhang, Mei Tu, Yang Zhao, Yu Zhou, Chengqing Zong. CCIM: Cross-modal Cross-lingual Interactive Image Translation[C]. EMNLP 2023.
- Yanzhi Tian, Xiang Li, Zeming Liu, Yuhang Guo, Bin Wang. In-Image Neural Machine Translation with Segmented Pixel Sequence-to-Sequence Model[C]. EMNLP 2023.
- Cong Ma, Yaping Zhang, Mei Tu, Xu Han, Linghui Wu, Yang Zhao, Yu Zhou. Improving End-to-End Text Image Translation From the Auxiliary Text Translation Task[C]. ICPR 2022. code:[code].
- Zhuo Chen, Fei Yin, Qing Yang, Cheng-Lin Liu. Cross-lingual text image recognition via multi-hierarchy cross-modal mimic. IEEE Transactions on Multimedia 2022.
- Zhiyang Zhang, Yaping Zhang, Yupu Liang, Lu Xiang, Yang Zhao, Yu Zhou, Chengqing Zong. [LayoutDIT: Layout-Aware End-to-End Document Image Translation with Multi-Step Conductive Decoder](https://aclanthology.org/2023.findings-emnlp.673.pdf)[C]. EMNLP 2023.
- Zhiyang Zhang, Yaping Zhang, Lu Xiang, Yang Zhao, Yu Zhou, Chengqing Zong. A Novel Dataset and Benchmark Analysis on Document Image Translation[C]. CCMT 2023.
- Ryota Hinami, Shonosuke Ishiwatari, Kazuhiko Yasuda, Yusuke Matsui. Towards Fully Automated Manga Translation. AAAI 2021.
- Jain P, Firat O, Ge Q, et al. Image translation network[J]. NAACL workshop 2021.
- Tonghua Su, Shuchen Liu, Shengjie Zhou. RTNet: An end-to-end method for handwritten text image translation. ICDAR 2021.
- Elman Mansimov, Mitchell Stern, M. Chen, Orhan Firat, Jakob Uszkoreit, Puneet Jain. Towards End-to-End In-Image Neural Machine Translation[C]. NLPBT 2020.
- Chuang Yang, Kai Zhuang, Mulin Chen, Haozhao Ma, Xu Han, Tao Han, Changxing Guo, Han Han, Bingxuan Zhao, Qi Wang. Traffic Sign Interpretation in Real Road Scene.
### Benchmark Datasets
Dataset | Description | Competition Paper |
---|---|---|
ICDAR 2015 | 1000 training images and 500 testing images | paper |
ICDAR 2013 | 229 training images and 233 testing images | paper |
ICDAR 2011 | 229 training images and 255 testing images | paper |
ICDAR 2005 | 1001 training images and 489 testing images | paper |
ICDAR 2003 | 181 training images and 251 testing images (word level and character level) | paper |