
Question about YoLoV4 vs EfficientDet #5311

Open
liminghuiv opened this issue Apr 25, 2020 · 7 comments

@liminghuiv

I noticed that EfficientDet D0-D7 use image resolutions of 512, 640, 768, 896, 1024, 1280, 1280, and 1536, while YOLOv4 uses 416, 512, and 608.

@AlexeyAB, does that mean that if YOLOv4 increased its input resolution and adjusted the corresponding settings, it could perform even better than EfficientDet D7?

Thanks.

@WongKinYiu
Collaborator

I think yes. We follow the same approach as EfficientNet to optimize our anchors.

Training with anchors optimized for 416:

  • test on 320x320: 38.4 AP
  • test on 416x416: 41.5 AP
  • test on 512x512: 42.4 AP

Training with anchors optimized for 512:

  • test on 320x320: 37.7 AP
  • test on 416x416: 41.2 AP
  • test on 512x512: 43.0 AP
  • test on 608x608: 43.5 AP

Increasing the image resolution, together with the corresponding settings, helps improve AP.
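For context, the anchor optimization mentioned above is commonly done by clustering the training-set box sizes with k-means under a 1 − IoU distance (darknet exposes something similar via its anchor-calculation mode). Here is a minimal sketch; the function names and synthetic data are my own illustration, not code from this repo:

```python
# Hedged sketch: k-means anchor generation over (width, height) box sizes,
# using 1 - IoU as the distance metric, as is common for YOLO-style anchors.
import random

def iou_wh(box, anchor):
    """IoU of two (w, h) pairs, both placed at the origin."""
    inter = min(box[0], anchor[0]) * min(box[1], anchor[1])
    union = box[0] * box[1] + anchor[0] * anchor[1] - inter
    return inter / union

def kmeans_anchors(boxes, k=9, iters=100, seed=0):
    """Cluster (w, h) sizes; return k anchors sorted by area."""
    rng = random.Random(seed)
    anchors = rng.sample(boxes, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for b in boxes:
            best = max(range(k), key=lambda i: iou_wh(b, anchors[i]))
            clusters[best].append(b)
        for i, c in enumerate(clusters):
            if c:  # new centroid = mean width/height of the cluster
                anchors[i] = (sum(w for w, _ in c) / len(c),
                              sum(h for _, h in c) / len(c))
    return sorted(anchors, key=lambda a: a[0] * a[1])
```

The resulting anchors depend on the training resolution, which is why anchors optimized for 416 and for 512 give different AP curves above.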

By the way, EfficientDet is very powerful for inference on VPU or TPU...
See FPS information posted on #5079 (comment)

@liminghuiv
Author

liminghuiv commented Apr 25, 2020

Thanks, @WongKinYiu. That's encouraging. I am interested in trying it. Any suggestion on the settings?

@WongKinYiu
Collaborator


A larger input resolution means larger objects need to be detected. Currently YOLOv4 uses P3-P5 to detect objects. I think for input resolutions of 640-1024 we need P3-P6, and for 1280-1536 we need P3-P7.

From the table, we can see that EfficientDet gets better results on large objects as the input resolution increases.
[image: table of EfficientDet results by object size and input resolution]
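The pyramid-level argument above follows from simple stride arithmetic: P3-P7 conventionally correspond to strides 8, 16, 32, 64, and 128. A minimal sketch (the stride values are the conventional ones, assumed here, not taken from this thread):

```python
# Hedged sketch: feature-map grid size per pyramid level for a square input.
# Conventional strides for P3..P7; higher levels have larger receptive fields.
STRIDES = {"P3": 8, "P4": 16, "P5": 32, "P6": 64, "P7": 128}

def grid_sizes(resolution):
    """Side length of the feature map at each pyramid level."""
    return {lvl: resolution // s for lvl, s in STRIDES.items()}

for res in (416, 512, 640, 1024, 1280, 1536):
    print(res, grid_sizes(res))
```

With only P3-P5 at a 1536x1536 input, the coarsest grid is still 48x48 cells, so very large objects span many cells; adding P6 and P7 gives 24x24 and 12x12 grids better matched to them.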

@xevolesi

Hi, @WongKinYiu!
Could you please give some insight into anchor-box generation or optimization methods?
I've seen many issues where AlexeyAB said not to change the standard anchor boxes, but it's not clearly explained why we shouldn't change them.
Where can I read/watch/listen about the anchor-box generation procedure and why I shouldn't change the standard anchors? Could you please tell me? Thanks.

@AlexeyAB
Owner

I've seen many issues where AlexeyAB said not to change the standard anchor boxes, but it's not clearly explained why we shouldn't change them.

Because many people generate anchors but do not change the masks, as described here: https://github.com/AlexeyAB/darknet#how-to-improve-object-detection
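For context, in a darknet cfg each [yolo] layer lists all anchors, and its mask selects which anchor indices that layer actually predicts. A hedged fragment; the anchor values are quoted from the default yolov4.cfg as I recall it, so verify against your own cfg:

```
# Each [yolo] layer lists ALL anchors; `mask` picks the indices that layer
# predicts. If you regenerate anchors, keep the masks consistent: smallest
# anchors on the fine (stride-8) head, largest on the coarse (stride-32) head.
[yolo]
mask = 0,1,2
anchors = 12,16, 19,36, 40,28, 36,75, 76,55, 72,146, 142,110, 192,243, 459,401

[yolo]
mask = 3,4,5
anchors = 12,16, 19,36, 40,28, 36,75, 76,55, 72,146, 142,110, 192,243, 459,401

[yolo]
mask = 6,7,8
anchors = 12,16, 19,36, 40,28, 36,75, 76,55, 72,146, 142,110, 192,243, 459,401
```

Regenerating anchors without re-checking which indices belong on which head is the mistake the linked README section warns about.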

@xevolesi

Hi, @AlexeyAB !
Thanks for the clarification. I read those recommendations about anchor boxes twelve times, as you suggested in some of the issues. =)) But I still don't understand how you tell whether anchor boxes are good or bad. I saw issues where you say anchor boxes are good or bad after seeing the point cloud generated by the k-means procedure. Is this related to

If many of the calculated anchors do not fit under the appropriate layers - then just try using all the default anchors.

or something else?

@beizhengren

@AlexeyAB how to change the masks?
