Inference and training time is far slower than Caffe2? #103

Tangshitao · 2018-07-06T05:37:37Z

I measured the inference and training time per batch, using the same configuration as the Caffe2 version of Detectron. The results are as follows:
model | inference time | train time
Res-50-fpn | 0.182s+0.006s | 1.013s/iter
This is far slower than the speed reported for the official Detectron when tested in the same environment. What do you think the problem might be?
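
For anyone reproducing these numbers, here is a minimal sketch of how per-iteration time could be measured. It is not the timing code used by either implementation; `time_per_iter`, `step`, and `sync` are hypothetical names introduced for illustration. The optional `sync` hook matters on GPUs because kernels are launched asynchronously, so the clock should only be read after pending work has finished.

```python
import time
from typing import Callable, Optional


def time_per_iter(
    step: Callable[[], None],
    iters: int = 20,
    warmup: int = 5,
    sync: Optional[Callable[[], None]] = None,
) -> float:
    """Return the mean wall-clock seconds per call of `step`.

    `step` is a hypothetical callable that runs one inference or
    training iteration. `sync`, if given, is called before the clock
    is started and before it is stopped, so queued GPU work is included.
    """
    # Warm-up iterations are excluded from the measurement.
    for _ in range(warmup):
        step()
    if sync is not None:
        sync()

    start = time.perf_counter()
    for _ in range(iters):
        step()
    if sync is not None:
        sync()
    return (time.perf_counter() - start) / iters
```

With PyTorch, `torch.cuda.synchronize` could be passed as `sync`. Running the same measurement once with 1 GPU and once with 8 GPUs makes it easier to tell whether a slowdown comes from the model code or from multi-GPU communication.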

gdshen · 2018-07-11T11:34:10Z

I tested Res-50-C4 and got 0.18s/im, about the same as the figure reported for Detectron.

Tangshitao · 2018-07-25T06:33:54Z

How many GPUs did you use for inference? In my tests the speed is almost the same as the official one when using 1 GPU, and much slower when using 8 GPUs.

happyharrycn · 2018-08-08T19:16:32Z

I have to point out a caveat when comparing run-time performance for multi-GPU training. The Caffe2 timings assume peer-to-peer access across all GPUs (see their issue #32). Many GPU servers do not provide this, and training slows down on them because of the extra communication cost.
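
A quick way to check whether a machine meets that assumption is to list the GPU pairs that cannot reach each other directly. The sketch below is only an illustration: `non_p2p_pairs` and `non_p2p_pairs_torch` are hypothetical helper names, and the PyTorch variant assumes a PyTorch build that provides `torch.cuda.can_device_access_peer`.

```python
from typing import Callable, List, Tuple


def non_p2p_pairs(
    num_gpus: int, can_access: Callable[[int, int], bool]
) -> List[Tuple[int, int]]:
    """Return ordered (src, dst) GPU pairs without peer-to-peer access.

    `can_access(i, j)` should report whether device i can directly
    access the memory of device j.
    """
    return [
        (i, j)
        for i in range(num_gpus)
        for j in range(num_gpus)
        if i != j and not can_access(i, j)
    ]


def non_p2p_pairs_torch() -> List[Tuple[int, int]]:
    """Same check using PyTorch (assumed to be installed with CUDA)."""
    import torch  # imported lazily so the helper above has no dependencies

    return non_p2p_pairs(
        torch.cuda.device_count(), torch.cuda.can_device_access_peer
    )
```

An empty list from `non_p2p_pairs_torch()` means every pair of GPUs has peer access. Any pair it reports has to exchange data over a slower path, which would be consistent with the 8-GPU slowdown described above. `nvidia-smi topo -m` shows the same topology from the command line.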
