This repository has been archived by the owner on Jan 26, 2022. It is now read-only.
I measured the inference and training time per batch, using the same configuration as the Caffe2 version of Detectron. The results are as follows:
model | inference time | train time
Res-50-fpn | 0.182s+0.006s | 1.013s/iter
This is far slower than the speed reported for the official Detectron, tested in the same environment. What do you think the problem might be?
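For anyone reproducing numbers like these, here is a minimal sketch of one way to time a single iteration. It is not the timer used by either codebase. The helper name `seconds_per_iter` and its parameters are hypothetical; `step` stands for whatever runs one batch, and `sync` is an optional callable (for GPU work, `torch.cuda.synchronize` would be a typical choice) so that asynchronous kernels are included in the measurement.

```python
import time
from typing import Callable, Optional


def seconds_per_iter(
    step: Callable[[], object],
    iters: int = 20,
    warmup: int = 5,
    sync: Optional[Callable[[], object]] = None,
) -> float:
    """Return the average wall-clock seconds per call of `step`.

    Hypothetical helper for illustration only.
    """
    # Warm-up iterations are run but not timed, so one-off costs
    # (memory allocation, autotuning) do not skew the average.
    for _ in range(warmup):
        step()

    # Wait for pending asynchronous work before starting the clock.
    if sync is not None:
        sync()
    start = time.perf_counter()

    for _ in range(iters):
        step()

    # Wait again so the timed region covers all launched work.
    if sync is not None:
        sync()
    return (time.perf_counter() - start) / iters
```

Without the synchronization calls, GPU timings can look faster than they are, because kernel launches return before the work finishes.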
How many GPUs do you use for inference? In my tests the speed is almost the same as the official one with 1 GPU, and much slower with 8 GPUs.
One caveat when comparing run-time performance for multi-GPU training: the Caffe2 timing assumes peer-to-peer access across all GPUs (see their issue #32). Many GPU servers do not provide this, and training slows down there because of the extra communication cost.
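A rough way to check whether a machine has full peer-to-peer access is sketched below. It assumes a PyTorch build that provides `torch.cuda.can_device_access_peer`; the helper `missing_p2p_pairs` is a hypothetical name introduced here for illustration, not part of either Detectron codebase.

```python
from itertools import permutations
from typing import Callable, List, Tuple


def missing_p2p_pairs(
    num_gpus: int, can_access: Callable[[int, int], bool]
) -> List[Tuple[int, int]]:
    """List ordered (src, dst) GPU pairs without peer-to-peer access.

    Hypothetical helper: `can_access(src, dst)` reports whether device
    `src` can directly access memory on device `dst`.
    """
    return [
        (src, dst)
        for src, dst in permutations(range(num_gpus), 2)
        if not can_access(src, dst)
    ]


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        torch = None

    if torch is not None and torch.cuda.is_available():
        n = torch.cuda.device_count()
        # Assumption: this PyTorch version exposes can_device_access_peer.
        blocked = missing_p2p_pairs(n, torch.cuda.can_device_access_peer)
        if blocked:
            print(f"{len(blocked)} of {n * (n - 1)} GPU pairs lack P2P access: {blocked}")
        else:
            print(f"All {n} GPUs have peer-to-peer access to each other.")
    else:
        print("No CUDA-enabled PyTorch found; nothing to check.")
```

If the list is non-empty, multi-GPU timings on that machine are probably not directly comparable to numbers measured with full peer-to-peer access.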