This repository has been archived by the owner on Jan 26, 2022. It is now read-only.

Inference and training time is far slower than Caffe2? #103

Open
Tangshitao opened this issue Jul 6, 2018 · 3 comments

Comments

@Tangshitao

I tested the inference and training time per batch. The configuration is the same as the Caffe2 version of Detectron. The results are as follows:

model | inference time | train time
Res-50-FPN | 0.182s+0.006s | 1.013s/iter

This is far slower than the speed reported for the official Detectron under the same environment. What do you think the problem might be?
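For anyone reproducing these numbers, a minimal timing sketch is below (the `model` and `data_loader` names are placeholders, not this repo's actual objects). The key detail is synchronizing CUDA before reading the clock, since kernels launch asynchronously, and skipping the first warm-up batches.

```python
import time
import torch

# Hypothetical sketch: `model` and `data_loader` are placeholders, not this
# repo's actual objects. CUDA kernels launch asynchronously, so we synchronize
# before reading the clock; otherwise the measured per-batch time is misleading.
def time_inference(model, data_loader, warmup=10, iters=100):
    model.eval()
    times = []
    with torch.no_grad():
        for i, (images, _) in enumerate(data_loader):
            if i >= warmup + iters:
                break
            images = images.cuda(non_blocking=True)
            torch.cuda.synchronize()
            start = time.time()
            model(images)
            torch.cuda.synchronize()
            if i >= warmup:  # skip warm-up batches (cuDNN autotune, allocator caching)
                times.append(time.time() - start)
    print("mean inference time: %.3fs per batch" % (sum(times) / len(times)))
```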

@gdshen

gdshen commented Jul 11, 2018

I tested Res-50-C4, and it gives 0.18s/im, much the same as the number claimed in Detectron.

@Tangshitao
Author

How many GPUs do you use for inference? My results show the speed is almost the same as the official one when using 1 GPU, and much slower when using 8 GPUs.

@happyharrycn

I have to point out an issue when comparing run-time performance for multi-GPU training. The Caffe2 timing assumes peer-to-peer access across all GPUs (see their issue #32). This is typically not the case on many GPU servers, where training slows down due to the extra communication cost.
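One way to check whether this affects your machine: `nvidia-smi topo -m` shows the GPU interconnect topology, and a rough sketch like the one below (device ids and transfer size are arbitrary, not from this repo) times GPU-to-GPU copies. The measured bandwidth drops sharply when peer-to-peer access is unavailable and the copy is staged through host memory.

```python
import time
import torch

# Rough sketch (device ids and transfer size are arbitrary): time repeated
# GPU-to-GPU copies. Bandwidth drops sharply when peer-to-peer access is
# unavailable and the copy has to be staged through host memory.
def measure_gpu_copy_bandwidth(src=0, dst=1, size_mb=256, iters=20):
    n = size_mb * 1024 * 1024 // 4           # number of float32 elements
    x = torch.randn(n, device="cuda:%d" % src)
    y = torch.empty(n, device="cuda:%d" % dst)
    y.copy_(x)                               # warm-up: excludes one-time setup cost
    torch.cuda.synchronize(src)
    torch.cuda.synchronize(dst)
    start = time.time()
    for _ in range(iters):
        y.copy_(x)
    torch.cuda.synchronize(src)
    torch.cuda.synchronize(dst)
    elapsed = time.time() - start
    print("cuda:%d -> cuda:%d: %.1f GB/s" % (src, dst, size_mb * iters / 1024.0 / elapsed))

if __name__ == "__main__":
    if torch.cuda.device_count() >= 2:
        measure_gpu_copy_bandwidth()
```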
