ABINet++: Autonomous, Bidirectional and Iterative Language Modeling for Scene Text Spotting

The official code of ABINet++.

(Figure: the ABINet++ framework)

Runtime Environment

  • Use the pre-built docker image as follows:
$ git clone git@github.com:FangShancheng/ABINet-PP.git
$ docker run --gpus all --rm -ti --shm-size=128g --ipc=host -v "$(pwd)"/ABINet-PP:/workspace/ABINet-PP fangshancheng/adet /bin/bash
$ cd ABINet-PP
$ python setup.py build develop
  • Or build your custom environment from docker/Dockerfile (a sketch of this route is shown below)
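
If you take the second route, a minimal sketch of building and entering the custom environment is shown below; the image tag abinet-pp is an arbitrary choice, and the mount assumes the commands are run from the repository root:

$ docker build -t abinet-pp -f docker/Dockerfile .
$ docker run --gpus all --rm -ti --shm-size=128g --ipc=host -v "$(pwd)":/workspace/ABINet-PP abinet-pp /bin/bash
$ cd /workspace/ABINet-PP
$ python setup.py build develop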

Datasets

Training and evaluation datasets should be placed in the datasets folder:

datasets
├── syntext1
├── syntext2
├── mlt2017
├── totaltext
├── CTW1500
├── icdar2015
├── ChnSyn
├── ReCTS
├── LSVT
├── ArT
├── evaluation
│   ├── gt_ctw1500.zip
│   ├── gt_icdar2015.zip
│   └── gt_totaltext.zip
└── WikiText-103-n96.csv

A more detailed description of these datasets and their download links can be found at AdelaiDet. Additional links: WikiText-103-n96.csv and gt_icdar2015.zip (in a format compatible with this repository).
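
Before training, it can help to sanity-check that the expected layout is in place. The snippet below is only a convenience check that mirrors the tree above; adjust it if you use a subset of the datasets:

    # Report any missing dataset directories or files (simple layout check).
    for d in syntext1 syntext2 mlt2017 totaltext CTW1500 icdar2015 ChnSyn ReCTS LSVT ArT evaluation; do
        [ -d "datasets/$d" ] || echo "missing: datasets/$d"
    done
    [ -f datasets/WikiText-103-n96.csv ] || echo "missing: datasets/WikiText-103-n96.csv"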

Pretrained Models

Training

English recognition

  • Pretraining:

    CUDA_VISIBLE_DEVICES=0,1,2,3 python tools/train_net.py \
        --config-file configs/ABINet/Pretrain.yaml \
        --num-gpus 4
    
  • Finetuning on TotalText:

    CUDA_VISIBLE_DEVICES=0,1,2,3 python tools/train_net.py \
        --config-file configs/ABINet/TotalText.yaml \
        --num-gpus 4 \
        MODEL.WEIGHTS weights/abinet/model_pretrain.pth
    
  • Finetuning on CTW1500:

    CUDA_VISIBLE_DEVICES=0,1,2,3 python tools/train_net.py \
        --config-file configs/ABINet/CTW1500.yaml \
        --num-gpus 4 \
        MODEL.WEIGHTS weights/abinet/model_pretrain.pth
    
  • Finetuning on ICDAR2015:

    CUDA_VISIBLE_DEVICES=0,1,2,3 python tools/train_net.py \
        --config-file configs/ABINet/ICDAR2015.yaml \
        --num-gpus 4 \
        MODEL.WEIGHTS weights/abinet/model_pretrain.pth
    
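
The finetuning commands above assume four GPUs. With fewer GPUs, the batch size and learning rate can usually be overridden on the command line through the detectron2-style config keys; the values below are illustrative only, not the settings used in the paper:

    # Illustrative single-GPU run with a reduced batch size and scaled-down learning rate.
    CUDA_VISIBLE_DEVICES=0 python tools/train_net.py \
        --config-file configs/ABINet/TotalText.yaml \
        --num-gpus 1 \
        MODEL.WEIGHTS weights/abinet/model_pretrain.pth \
        SOLVER.IMS_PER_BATCH 2 \
        SOLVER.BASE_LR 0.00025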

Chinese recognition

  • Pretraining:

    CUDA_VISIBLE_DEVICES=0,1,2,3 python tools/train_net.py \
        --config-file configs/ABINet/Pretrain-chn.yaml \
        --num-gpus 4
    
  • Finetuning on ReCTS:

    CUDA_VISIBLE_DEVICES=0,1,2,3 python tools/train_net.py \
        --config-file configs/ABINet/ReCTS.yaml \
        --num-gpus 4 \
        MODEL.WEIGHTS weights/abinet/model_pretrain_chn.pth
    
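
If a pretraining or finetuning run is interrupted, tools/train_net.py is expected to accept the standard detectron2 --resume flag, which continues from the latest checkpoint in the output directory (this assumes the default detectron2 argument parser is used):

    # Resume an interrupted Chinese pretraining run from its last checkpoint.
    CUDA_VISIBLE_DEVICES=0,1,2,3 python tools/train_net.py \
        --config-file configs/ABINet/Pretrain-chn.yaml \
        --num-gpus 4 \
        --resume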

Evaluation

  • Evaluate on Totaltext:

    python tools/train_net.py \
        --config-file configs/ABINet/TotalText.yaml \
        --eval-only \
        MODEL.WEIGHTS weights/abinet/model_totaltext.pth
    
  • Evaluate on CTW1500:

    python tools/train_net.py \
        --config-file configs/ABINet/CTW1500.yaml \
        --eval-only \
        MODEL.WEIGHTS weights/abinet/model_ctw1500.pth
    
  • Evaluate on ICDAR2015:

    python tools/train_net.py \
        --config-file configs/ABINet/ICDAR2015.yaml \
        --eval-only \
        MODEL.WEIGHTS weights/abinet/model_icdar2015.pth
    
  • Evaluate on ReCTS:

    python tools/train_net.py \
        --config-file configs/ABINet/ReCTS.yaml \
        --eval-only \
        MODEL.WEIGHTS weights/abinet/model_rects.pth
    

    For ReCTS evaluation, you need to submit the predicted JSON file to the official website.
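
To evaluate the three English benchmarks in one go, a simple loop over the config/weight pairs listed above can be used (the naming below just mirrors those pairs):

    # Run --eval-only for TotalText, CTW1500, and ICDAR2015 in sequence.
    for name in TotalText CTW1500 ICDAR2015; do
        weights=$(echo "$name" | tr '[:upper:]' '[:lower:]')
        python tools/train_net.py \
            --config-file configs/ABINet/${name}.yaml \
            --eval-only \
            MODEL.WEIGHTS weights/abinet/model_${weights}.pth
    done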

Demo

  • For TotalText

    mkdir -p output/abinet/totaltext-vis
    python demo/demo.py \
        --config-file configs/ABINet/TotalText.yaml \
        --input datasets/totaltext/test_images/* \
        --output output/abinet/totaltext-vis \
        --opts MODEL.WEIGHTS weights/abinet/model_totaltext.pth
    
  • For CTW1500

    mkdir -p output/abinet/ctw1500-vis
    python demo/demo.py \
        --config-file configs/ABINet/CTW1500.yaml \
        --input datasets/CTW1500/ctwtest_text_image/* \
        --output output/abinet/ctw1500-vis \
        --opts MODEL.WEIGHTS weights/abinet/model_ctw1500.pth
    
  • For ICDAR2015

    mkdir -p output/abinet/icdar2015-vis
    python demo/demo.py \
        --config-file configs/ABINet/ICDAR2015.yaml \
        --input datasets/icdar2015/test_images/* \
        --output output/abinet/icdar2015-vis \
        --opts MODEL.WEIGHTS weights/abinet/model_icdar2015.pth
    
  • For ReCTS (Chinese)

    # Note: wget on a Google Drive share link may fetch an HTML page rather than the file;
    # if so, download simsun.ttc and chn_cls_list manually from these links.
    wget https://drive.google.com/file/d/1dcR__ZgV_JOfpp8Vde4FR3bSR-QnrHVo/view?usp=sharing -O simsun.ttc
    wget https://drive.google.com/file/d/1wqkX2VAy48yte19q1Yn5IVjdMVpLzYVo/view?usp=sharing -O chn_cls_list
    mkdir -p output/abinet/rects-vis
    python demo/demo.py \
        --config-file configs/ABINet/ReCTS.yaml \
        --input datasets/ReCTS/ReCTS_test_images/* \
        --output output/abinet/rects-vis \
        --opts MODEL.WEIGHTS weights/abinet/model_rects.pth
    
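
For a quick qualitative spot check, the glob passed to --input can be narrowed instead of visualizing the whole test set; the snippet below simply takes the first 20 TotalText images (an arbitrary number):

    # Visualize only the first 20 TotalText test images.
    mkdir -p output/abinet/totaltext-vis
    python demo/demo.py \
        --config-file configs/ABINet/TotalText.yaml \
        --input $(ls datasets/totaltext/test_images/* | head -n 20) \
        --output output/abinet/totaltext-vis \
        --opts MODEL.WEIGHTS weights/abinet/model_totaltext.pth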

BibTeX

@ARTICLE{9960802,  
    author={Fang, Shancheng and Mao, Zhendong and Xie, Hongtao and Wang, Yuxin and Yan, Chenggang and Zhang, Yongdong},  
    journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},   
    title={ABINet++: Autonomous, Bidirectional and Iterative Language Modeling for Scene Text Spotting},   
    year={2022},  
    volume={},  
    number={},  
    pages={1-18},  
    doi={10.1109/TPAMI.2022.3223908}
}

@inproceedings{fang2021read,
    title={Read like humans: Autonomous, bidirectional and iterative language modeling for scene text recognition},
    author={Fang, Shancheng and Xie, Hongtao and Wang, Yuxin and Mao, Zhendong and Zhang, Yongdong},
    booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
    pages={7098--7107},
    year={2021}
}

License

This project is free for academic research purposes only.

Feel free to contact [email protected] if you have any questions.
