Feature-request: YOLOv4-tiny (detector) #59
Comments
Hi @AlexeyAB :)
Hi @AlexeyAB,
@mive93 Thanks for the YOLOv4-tiny implementation. I've tested it on a Jetson Nano with JetPack 4.4, TensorRT v7.1, and 416 input size.
For FP32, profile results:
For FP16, profile results:
Hi @JasonDoingGreat, thanks :)
If needed I can test it on the Xavier or TX2.
@mive93 Thanks! Yes, please test it on AGX or NX with max_N.
Here it is.
@AlexeyAB @ceccocats @mive93, you have single-handedly destroyed the reputation of Google, Facebook and NVIDIA. This is extraordinary.
@mive93 Hi,
Hi @AlexeyAB,
Is there any accuracy degradation when you convert Darknet weights to tkDNN format? What about accuracy loss when inferring in FP16 or INT8 mode? Is there any way to fine-tune the models in FP16 or INT8 mode, or perform quantization-aware training beforehand? Thanks
Hi @mmaaz60,
For the FP16 mode the drop from full precision is negligible, while the drop from full precision to INT8 is heavy. I hope I covered all your doubts.
Closing for now,
@mive93 Hi,
Hi @AlexeyAB, sorry for the late reply. Anyway, if you need some tests I am available to do some; I also have a Xavier NX now ;)
Hi @mive93 |
Hi @MohammadKassemZein |
@mive93 Nice! Thank you.
@mive93 Below are the results on Xavier NX.
Nice @MohammadKassemZein :) How did you collect those results?
@mive93 I used your framework (tkDNN) on the Jetson NX.
Yeah, I guessed :) Sorry, I was vague. Have you activated jetson_clocks?
I was using MODE 15W 2CORE (which I guess gives the highest clocking for GPU and CPU).
Hi again @mive93 |
Hi @MohammadKassemZein |
Feature-request: YOLOv4-tiny (detector)
Many other features from Darknet were added previously. Only 1 feature is still required: support for `groups=` and `group_id=` in the `[route]` layer.
So if the input is WxHxC, it divides the input into 2 groups of WxHx(C/2) each (there are 2 groups: 0 and 1), and loads the 2nd group (group_id=1) of size WxHx(C/2).
If there are many layers specified in the `layers=` parameter, then this is done for each of the input layers specified in `layers=`, and the results are then concatenated across channels.
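For reference, a hedged sketch of how this looks in a Darknet cfg. The first block matches the grouped-route form used in yolov4-tiny.cfg; the second block, with multiple input layers, is only an illustration of the "many layers" case described above and is not copied from any shipped cfg:

```
# Split the previous layer's WxHxC output into 2 channel groups
# and forward only the second half (group_id=1) -> output WxHx(C/2).
[route]
layers=-1
groups=2
group_id=1

# Illustrative only: with several input layers, the same split is
# applied to each listed layer, and the selected halves are then
# concatenated across channels.
[route]
layers=-1,-3
groups=2
group_id=1
```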