
how to improve the performance in detecting small words? #52

Closed
jycloud opened this issue Nov 29, 2017 · 23 comments

@jycloud

jycloud commented Nov 29, 2017

No description provided.

@eragonruan
Owner

@jycloud some potential solutions to this problem.

  • multi-scale testing; this is the most direct way.
  • a smaller anchor size when training.
  • feature fusion, since small words may disappear from the feature map after several pooling ops.
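For the first option, a minimal sketch of what multi-scale testing could look like (hypothetical helper names; `detect_fn` stands in for the real network call and returns boxes in the resized frame):

```python
import numpy as np

def multi_scale_detect(image_hw, detect_fn, short_sides=(600, 900, 1200)):
    """Run the detector with the image resized to several short-side
    lengths, then map every box back to the original coordinate frame
    so the results can be merged (e.g. by NMS).

    detect_fn(scale) must return an (N, 5) array of x1, y1, x2, y2,
    score in the resized image's coordinates."""
    h, w = image_hw
    merged = []
    for target in short_sides:
        scale = target / min(h, w)           # resize so short side == target
        boxes = detect_fn(scale).astype(np.float64)
        boxes[:, :4] /= scale                # back to original coordinates
        merged.append(boxes)
    return np.vstack(merged)
```

Small words that vanish at the default scale often survive at the larger ones, at the cost of extra forward passes.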

@jycloud
Author

jycloud commented Nov 30, 2017

@eragonruan
1. Multi-scale testing is a simple approach, but the CNN model always resizes input pictures to 1000x600. If I split the pictures first, there is a chance of cutting through a character, which may give bad results.
2. ANCHOR_SCALES == 16. In the training phase, do you mean I should reset it to 14 or smaller?
3. I don't think so. I use this code to detect scanned pictures of paper documents. My results show that most (about 80%~90%) small words (about 20x450 pixels) are detected, but some (more than 10%) are not.

@eragonruan
Owner

@jycloud the model can take input of any size (the short side should be longer than 300); 600x1000 is just a default setting.
No, the anchor sizes range from 11 to 273.

@jycloud
Author

jycloud commented Nov 30, 2017

@eragonruan
Where can I set the model input size?
In generate_anchors.py I found the anchor size setting. The small words' height is more than 11 pixels, so I guess the key is not the anchor size. The error looks like this:
[screenshot: detection result with missed rows]
It seems every document always misses two or three rows in detection.
How can I optimize this?
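For context, the anchors in this repo keep a fixed 16 px width and vary only the height. The following is a hypothetical reconstruction of such a height series (spaced geometrically from 11 to 273 px, matching the range mentioned above); the exact values in generate_anchors.py may differ:

```python
import numpy as np

# Hypothetical reconstruction: 10 anchor heights spaced geometrically
# between 11 and 273 px, all paired with the fixed 16 px width.
heights = np.round(np.geomspace(11, 273, num=10)).astype(int)

WIDTH = 16
cx = cy = 7.5                                  # centre of the 16x16 base cell
# Anchors as (x1, y1, x2, y2) around the base cell centre.
anchors = np.array([[cx - (WIDTH - 1) / 2, cy - (h - 1) / 2,
                     cx + (WIDTH - 1) / 2, cy + (h - 1) / 2]
                    for h in heights])
```

Since the smallest height is already 11 px, rows that are ~20 px tall should be covered by the anchor set, which supports the point that the anchor sizes are not the bottleneck here.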

@eragonruan
Owner

How many rows do you have? I only keep 1000 proposals during testing. In your case, each 20x450 row needs about 30 proposals, so if your document has more than 33 rows, the model may miss something.
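The arithmetic behind that estimate, assuming each proposal covers one 16 px-wide anchor slice:

```python
# A 450 px-wide row split into 16 px-wide slices needs about 29
# proposals (roughly 30, as stated above).
proposals_per_row = 450 // 16 + 1        # 29

budget = 1000                            # proposals kept at test time
max_rows = budget // 30                  # ~33 rows before rows get dropped
```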

@jycloud
Author

jycloud commented Dec 1, 2017

@eragonruan where can I modify the proposals parameter?

@eragonruan
Owner

@jycloud
Author

jycloud commented Dec 4, 2017

@eragonruan I reset __C.TEST.RPN_POST_NMS_TOP_N to 2000, but the results did not change.

@jycloud
Author

jycloud commented Dec 4, 2017

@eragonruan I have decided to retrain the model on my own dataset, hoping that solves the problem. By the way, in the training phase, is the default setting to resize pictures to (1000, 600) or (600, 1000)?

@eragonruan
Owner

@jycloud the short side is resized to 600 pixels. After resizing, if the long side is longer than 1200 pixels, the long side is resized to 1200 pixels.
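That policy can be sketched as a pure function (assuming it follows the usual Faster R-CNN im_scale logic; the actual code may differ in details):

```python
def target_scale(h, w, short_max=600, long_max=1200):
    """Scale so the short side becomes short_max, unless that would
    push the long side past long_max, in which case the scale is
    capped so the long side equals long_max."""
    scale = short_max / min(h, w)
    if max(h, w) * scale > long_max:
        scale = long_max / max(h, w)
    return scale

# Example: a 900x2400 scan is capped by the long side.
h, w = 900, 2400
s = target_scale(h, w)
new_h, new_w = round(h * s), round(w * s)    # (450, 1200)
```

Note that for very wide scans the long-side cap can shrink the short side well below 600 px, which is another way small rows can get lost.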

@jycloud
Author

jycloud commented Dec 4, 2017

@eragonruan Thank you very much.

@guddulrk

@eragonruan where can I modify the number of iterations (150000)? I am training the model on a CPU, so it is taking too much time; I want to reduce the iterations.

Thanks

@eragonruan
Owner

@guddulrk

Thanks, mate.

@guddulrk

@eragonruan should I change stepsize and snapshot_iter, since SOLVER: Adam is set on #L10?

```
restore: 0
SOLVER: Adam
OHEM: False
RPN_BATCHSIZE: 300
BATCH_SIZE: 300
LOG_IMAGE_ITERS: 100
DISPLAY: 10
SNAPSHOT_ITERS: 1000
HAS_RPN: True
LEARNING_RATE: 0.00001
MOMENTUM: 0.9
GAMMA: 0.1
STEPSIZE: 70000
IMS_PER_BATCH: 1
```

@eragonruan
Owner

eragonruan commented Dec 12, 2017

@guddulrk use the latest code. STEPSIZE is used to adjust the learning rate; SNAPSHOT_ITERS controls how often the model is saved.
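Assuming the usual step-decay schedule, the config values shown earlier (LEARNING_RATE: 0.00001, GAMMA: 0.1, STEPSIZE: 70000) imply:

```python
BASE_LR, GAMMA, STEPSIZE = 1e-5, 0.1, 70000

def lr_at(iteration):
    # Step decay: the learning rate is multiplied by GAMMA once every
    # STEPSIZE iterations.
    return BASE_LR * GAMMA ** (iteration // STEPSIZE)

# lr stays at 1e-5 up to iteration 69999, drops to 1e-6 at 70000, etc.
```

So for a short CPU run of a few thousand iterations, STEPSIZE never fires and the learning rate stays constant at the base value.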

@guddulrk

@eragonruan thank you so much.

@guddulrk

@eragonruan I am training the model on a new dataset but am getting nan values, as shown below:

```
speed: 17.140s / iter
iter: 3390 / 5000, total loss: nan, model loss: nan, rpn_loss_cls: nan, rpn_loss_box: nan, lr: 0.000010
```

Can you please help me sort out this issue?
Thanks

@eragonruan
Owner

@guddulrk this may be caused by your training data; check here for more detail.
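One frequent data problem behind nan losses is degenerate ground-truth boxes (zero or negative size, or coordinates outside the image). A hypothetical sanity check, not part of this repo:

```python
def bad_boxes(boxes, img_h, img_w):
    """Return the indices of ground-truth boxes (x1, y1, x2, y2) that
    are degenerate or fall outside the image bounds; such boxes can
    produce nan losses in RPN training."""
    bad = []
    for i, (x1, y1, x2, y2) in enumerate(boxes):
        if x2 <= x1 or y2 <= y1:                       # zero/negative size
            bad.append(i)
        elif x1 < 0 or y1 < 0 or x2 > img_w or y2 > img_h:
            bad.append(i)                              # outside the image
    return bad
```

Running something like this over every annotation file before training is a cheap way to rule out the data as the cause.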

@kyosocan

@eragonruan Thanks for your code.
Now I'm trying to change the anchor width from 16 to 10; should I change the _feat_stride parameter as well?
Looking forward to your reply, thank you!

@eragonruan
Owner

@kyosocan I don't think 10 is an option; the anchor width is determined by the feature layer you choose.
For the VGG net: conv5_3 ==> 16, conv4_3 ==> 8.
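The stride (and hence the 16 px anchor width) is just the product of the downsampling steps in front of the chosen layer; in VGG16, conv4_3 sits after three 2x2 max-pools and conv5_3 after four:

```python
def stride_after(num_pools, pool_factor=2):
    # Cumulative stride = product of the pooling factors applied
    # before the chosen feature layer.
    return pool_factor ** num_pools

# VGG16: conv4_3 follows pool1-pool3, conv5_3 follows pool1-pool4,
# giving strides of 8 and 16 respectively.
```

That is why an anchor width of 10 cannot be set directly: only strides that the backbone actually produces (8, 16, ...) are available.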

@kyosocan

@eragonruan Thank you very much for your reply. Now I see how to adjust this parameter. In addition, would ResNet be better than VGG16 in this case?

@eragonruan
Owner

@kyosocan sorry, I did not try a model with ResNet, but it's worth trying.
