OpenCV DNN is faster than Darknet? Different Sizes = Different Predictions. #5144
Comments
Can you show a screenshot of the FPS for both cases? In general, OpenCV-dnn can be slightly faster, since it is optimized for inference only.
it is due to different resizing approaches: #232 (comment)
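The linked comment is not reproduced here, so the following is only a generic illustration of how two common resizing strategies lay out the same frame differently. It is an assumption that a plain stretch and an aspect-preserving letterbox are the approaches meant; the function names are hypothetical and neither is taken from darknet or OpenCV.

```python
def stretch_size(src_w, src_h, net_w, net_h):
    """Plain resize: the image fills the whole network input, aspect ratio is not kept."""
    return net_w, net_h


def letterbox_size(src_w, src_h, net_w, net_h):
    """Aspect-preserving resize: scale to fit, the rest of the input is padding."""
    scale = min(net_w / src_w, net_h / src_h)
    return round(src_w * scale), round(src_h * scale)


# Frame size quoted in the thread (352x244) and the 416x416 network input.
print(stretch_size(352, 244, 416, 416))
print(letterbox_size(352, 244, 416, 416))
```

With a stretch the frame covers all 416 rows, while with a letterbox it covers fewer rows and the remainder is padding, so the same object reaches the network at a different size and shape. That alone could plausibly shift confidence values.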
This is normal.
What cfg-file do you use? Use this cfg: https://drive.google.com/open?id=15WhN7W8UZo7-4a0iLkx11Z7_sDVHU4l1
You can use the same fixed network resolution, width=832 height=416, for both Training and Detection.
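Both 832 and 416 are multiples of 32, the constraint the question also mentions. As a minimal sketch (the helper names are hypothetical, not part of darknet), this is one way to pick a network size that stays on a multiple of 32 while roughly following the aspect ratio of the source frames:

```python
def snap_to_32(value):
    """Round to the nearest multiple of 32, never going below 32."""
    return max(32, int(round(value / 32)) * 32)


def net_size_for(src_w, src_h, target_w):
    """Choose a network size near the source aspect ratio for a chosen width."""
    net_w = snap_to_32(target_w)
    net_h = snap_to_32(net_w * src_h / src_w)
    return net_w, net_h


# Frame size quoted in the thread (352x244), for two candidate widths.
print(net_size_for(352, 244, 416))
print(net_size_for(352, 244, 832))
```

This only suggests candidate sizes; whether a given size fits in GPU memory or improves accuracy still has to be tested.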
You can increase subdivisions= only up to batch=. Otherwise you should use another cfg-file:
It is better to train from scratch.
You must add a separate class, "all possible objects". In the training images you should place many different objects in this area and label them with this class. The more training images, the better.
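On the subdivisions remark above: as far as I understand the cfg semantics, darknet loads batch / subdivisions images onto the GPU at a time, so once subdivisions equals batch only one image is processed per pass and raising it further cannot save more memory. A small sketch of that arithmetic, where batch=64 is an assumed value for illustration and the upper bound simply mirrors the advice above:

```python
def mini_batch(batch, subdivisions):
    """Images processed per pass when batch is split into subdivisions chunks."""
    if not 1 <= subdivisions <= batch:
        raise ValueError("subdivisions must be between 1 and batch")
    return batch // subdivisions


BATCH = 64  # assumed value for illustration; check your own cfg

for subdivisions in (16, 32, 64):
    print(subdivisions, mini_batch(BATCH, subdivisions))
```

Under that assumption, subdivisions=64 already means one image per pass, which would explain why going from 64 to 128 did not help with the out-of-memory error.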
Hello, I'm sorry for the delay.
I used the same settings and weights, with a 416x416 network (the value used in training).
FPS = ~1000/109 => ~9.17 FPS in darknet.
And here is the result from OpenCV:
FPS = 1000/25 => 40 FPS in OpenCV.
The test was performed with more than 60 images. The measured time is basically constant for both systems.
For darknet, time is measured as follows: Lines 1567 to 1570 in afb4cc4
And I use a command line like:
For my code, I did the following measurement in a loop over all files (single thread):
In both codes the measurement is done item by item, not in batch.
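The original measurement code is not shown in the thread, so here is only a minimal sketch of item-by-item timing and the ms-to-FPS conversion used above. `run_once` is a hypothetical stand-in for a single forward pass, not a darknet or OpenCV call.

```python
import time


def mean_ms_per_item(run_once, items):
    """Time run_once on each item separately and return the mean latency in ms."""
    durations = []
    for item in items:
        start = time.perf_counter()
        run_once(item)
        durations.append((time.perf_counter() - start) * 1000.0)
    return sum(durations) / len(durations)


def fps_from_ms(ms_per_item):
    """Convert a per-item latency in milliseconds to frames per second."""
    return 1000.0 / ms_per_item


# Latencies quoted in the thread: 109 ms (darknet) and 25 ms (OpenCV).
print(round(fps_from_ms(109), 2))
print(fps_from_ms(25))
```

For the comparison to be fair, both programs should wrap the same step (only the forward pass, or the forward pass plus pre- and post-processing) between the two timestamps.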
I used yolov3.cfg. Thank you.
GTX 1660 6GB
I think it's an endless task. I gave it a 1-minute video and waited several minutes.
In OpenCV I used the same scheme as described: a single-thread loop. The prediction time is 25 ms (40 FPS), but counting the other tasks (decode and NMS) the total dropped to 27 FPS.
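A quick back-of-the-envelope check of those numbers, assuming the stages run one after another in the single-thread loop (the function name is hypothetical):

```python
def overhead_ms(inference_ms, total_fps):
    """Per-frame time not spent in inference, for a strictly sequential loop."""
    return 1000.0 / total_fps - inference_ms


# 25 ms inference and 27 FPS end to end, as quoted above.
print(round(overhead_ms(25, 27), 1))
```

Under that assumption, decode and NMS together take roughly 12 ms per frame, about half the inference time.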
One more thing: I use darknet and OpenCV on Linux, so I compiled both myself. For OpenCV, I compiled version 4.2.0.
Maybe you calculate FPS incorrectly in OpenCV-dnn. Or maybe something else is different. What Backend, Target and Width/Height parameters do you use in OpenCV-dnn?
I think there's nothing wrong with the FPS in OpenCV. I process 60 frames in 1.5 seconds, measured from the start of the program to the end of the program, results included. This is the function I use to determine the FPS of the network. 'result' is a Mat object, so it is no longer linked to the network. There is no call to the network before these statements.
I'll pull the changes from your repository and compile it again. Maybe there's something new. I'll paste the code here:
Decode frame of video. My video is encoded with H.264, and the system decodes it to some other basic format (like BGR or RGB). This is the time to grab a single frame.
By the way, you should put these lines in CMakeLists.txt:
Without these lines, my system can't find these packages.
WOW!
After git pull and rebuild, I get much better results:
Now it's basically the same as OpenCV. To make sure I didn't screw up the first build, I went back to the 92e6e8e version and compiled it again, and the FPS dropped to 8 again. So that version has poor FPS.
I have prepared the images and I'm trying to train with your specifications. But these weights are recognized as a model already trained for 500k iterations, so no iterations run on my dataset (I want to train for 40k iterations). What should I do?
Just train with flag
Thank you!
@AlexeyAB I trained my model with the cfg and weights you indicated. Two more things:
Hello @AlexeyAB,
First, thank you for your work on the darknet improvements and especially for your documentation, along with my many thanks to @pjreddie.
I'm sorry if it gets too long, I'm not an expert with DNN.
I was given the task of detecting some specific objects in low-quality images, most of them CIF (352x244).
I did what the ritual says: I set up the data and cfg files as you teach and trained with ~1200 images annotated with 20 classes, plus ~600 images for testing. I know that's not a good number, but it's what I have for now.
I set the number of iterations to 50000 and the network size to 416x416. Training took a few days on a GTX 1660.
The network converged to an avg loss of 0.12. That looks like a good number to me, but the mAP was only around 35%. That is a separate problem, though: some classes showed AP = 0% because of the low number of examples (I will fix that).
Validating the predictions, I found two things:
My system uses Java as a base, so I used OpenCV and imported the model into OpenCV DNN. To my surprise, OpenCV (running on CUDA) was 4x faster than Darknet. I used the same network size and weights. However, the predictions are a little different! The confidence values are not the same.
Does that make sense? Do you know why?
I changed the size of the network after training, as you suggest. At some resolutions the network detects more objects, and with higher confidence. However, increasing the resolution further did not always increase the accuracy. In some cases an object is not detected at a high network size (thresh = 0.25). Look (made with OpenCV):
At 416 there is no cinto, but it is present at almost all other resolutions. The confidence of cinto is 95,25% at 736px and 63,74% at 992px. The classes chapeu and marcha only appear at some resolutions.
Is there any way to improve detection with a fixed net size?
And I have some questions:
I have an important object that is very thin in width. Its AP came out low. Will increasing the network width help? Do I have to train again? I used a fixed 416x416, but I think I could use the same aspect ratio as CIF (fixed to some multiple of 32).
I tried to train with a higher resolution, but I ran out of memory. I increased subdivisions to 64 and then 128 and it didn't help. Can I increase it further? Will I lose precision?
In the future I will have to train more classes. How can I use my current weights as a base, even when increasing the size of the network?
There is a certain area of the image where I would like to detect any foreign object. I cannot train on all possible objects. Is there any way to train a pattern and report whether something else is there?
Thank you very much.