DocFigure

A dataset for scientific document figure classification

How to get the dataset

The scientific document images are taken from articles published in CVPR, ECCV and ICCV. We do not hold any copyright on these figure images. We provide a Python script for downloading the PDF files from IEEE and CVF. Please make sure that you have access to these websites.

Convert all the PDF files to image files. This step needs PDFBox, so download it first:

git clone https://github.com/jobinkv/DocFigure.git
cd DocFigure
wget http://mirrors.estointernet.in/apache/pdfbox/2.0.14/pdfbox-app-2.0.14.jar
python readAnotation.py

This will create a folder of sub-images inside the images folder.
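
For readers who want to see what the PDF-to-image step looks like on its own, here is a minimal sketch that builds the PDFBox `PDFToImage` command line for each PDF. It is not taken from `readAnotation.py`, which may do this differently; the function names, the `pdfs`/`images` directory names, the PNG format and the 150 dpi value are all assumptions made for illustration.

```python
import subprocess
from pathlib import Path

PDFBOX_JAR = "pdfbox-app-2.0.14.jar"  # the jar fetched by the wget step


def pdf_to_image_command(pdf_path, out_dir, dpi=150, jar=PDFBOX_JAR):
    """Build the PDFBox PDFToImage command line for one PDF (hypothetical helper)."""
    pdf_path = Path(pdf_path)
    # PDFBox appends the page number and the extension to this prefix.
    prefix = Path(out_dir) / (pdf_path.stem + "_")
    return [
        "java", "-jar", str(jar), "PDFToImage",
        "-format", "png",       # assumed output format
        "-dpi", str(dpi),       # assumed resolution
        "-outputPrefix", str(prefix),
        str(pdf_path),
    ]


def convert_all(pdf_dir, out_dir, dry_run=True):
    """Return the commands for every PDF in pdf_dir; run them unless dry_run."""
    pdfs = sorted(Path(pdf_dir).glob("*.pdf"))
    commands = [pdf_to_image_command(p, out_dir) for p in pdfs]
    if not dry_run:
        Path(out_dir).mkdir(parents=True, exist_ok=True)
        for cmd in commands:
            subprocess.run(cmd, check=True)  # requires Java on the PATH
    return commands
```

Calling `convert_all("pdfs", "images")` only returns the command lists, which is handy for checking paths; pass `dry_run=False` to actually run the conversion.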

Trained Models

Trained model link

To test the trained model, run:

python testTrainedModel.py --trainedFigClassModel '/downloaded/path/to/epoch_9_loss_0.04706_testAcc_0.96867_X_resnext101_docSeg.pth' --inputImage '/path/of/inputimage/for/testing'
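
To classify several figures in one go, the command above can be wrapped in a small loop. The sketch below only builds and optionally runs that same command once per image. It assumes `--inputImage` takes a single image file and that the inputs are PNG or JPEG files; the helper names and the `figures` directory are hypothetical.

```python
import subprocess
import sys
from pathlib import Path

IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}  # assumed input formats


def build_test_commands(model_path, image_dir, script="testTrainedModel.py"):
    """Build one testTrainedModel.py command per image in image_dir."""
    images = sorted(
        p for p in Path(image_dir).iterdir()
        if p.suffix.lower() in IMAGE_SUFFIXES
    )
    return [
        [
            sys.executable, script,
            "--trainedFigClassModel", str(model_path),
            "--inputImage", str(image),
        ]
        for image in images
    ]


def run_all(model_path, image_dir):
    """Run the test script on every image, stopping at the first failure."""
    for cmd in build_test_commands(model_path, image_dir):
        subprocess.run(cmd, check=True)
```

For example, `run_all("/downloaded/path/to/model.pth", "figures")` would invoke the script once for each image found in `figures`.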