ResNet-50 is slower on Volta since #8302 #9874
Comments
@piiswrong @zheng-da - please take a look, this degradation may be related to your commit.
Are the speeds that you mention averages? If so, averaged over how many batches?
It's averaged over 1200 batches, ignoring the first 100 batches.
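For readers who want to reproduce this kind of measurement, here is a minimal sketch of averaging throughput while discarding warm-up batches. The function name `average_throughput`, its parameters, and the choice to divide total samples by total time are assumptions made for illustration; this is not the benchmark script used in this issue.

```python
def average_throughput(batch_seconds, batch_size, warmup_batches=100):
    """Return mean samples/s over the batches that follow the warm-up.

    batch_seconds: per-batch wall-clock durations in seconds (hypothetical input).
    batch_size: number of samples processed per batch.
    warmup_batches: leading batches to ignore (100 in the comment above).
    """
    measured = batch_seconds[warmup_batches:]
    if not measured:
        raise ValueError("no batches left after warm-up")
    # Total samples divided by total time, so slow batches are weighted correctly.
    return batch_size * len(measured) / sum(measured)


# Tiny illustrative run: two slow warm-up batches, then four steady batches.
timings = [1.0, 1.0, 0.5, 0.5, 0.5, 0.5]
steady_rate = average_throughput(timings, batch_size=10, warmup_batches=2)
print(f"{steady_rate:.1f} samples / s")
```

Dropping the first batches keeps one-time costs such as memory allocation and kernel autotuning out of the average.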
I think I may know the potential cause of this problem. I'll fix it next week.
I searched all commits in PR #8302 and I think I have found the ones that cause the perf issue, but I failed to fix the problem. I created a branch that contains them: https://github.com/zheng-da/incubator-mxnet/tree/refactor_bn Basically, the commits that refactor BatchNorm cause the issue. @Caenorst could you help look into the issue? Thanks
Is it known which part of the commit is the problem?
@lanking520 requesting to close this issue due to lack of activity
@Caenorst Please feel free to reopen this issue if you are still facing this failure. Closing it for now.
Description
I ran the minimum reproducible example with the setup below at two different versions (before and after #8302).
Here are the results:
d03182f (before #8302):
- real data: 5644 samples / s
- synthetic data: 5971 samples / s
c3e3a83 (after #8302):
- real data: 5461 samples / s
- synthetic data: 5740 samples / s
Latest:
- real data: 5425 samples / s
- synthetic data: 5817 samples / s
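To put the numbers above in relative terms, the short sketch below computes the percentage change against the d03182f baseline. The throughput figures are copied from the results above; the dictionary layout and the helper `relative_change` are only illustrative.

```python
# Throughput in samples / s, copied from the results listed above.
results = {
    "d03182f (before #8302)": {"real": 5644, "synthetic": 5971},
    "c3e3a83 (after #8302)": {"real": 5461, "synthetic": 5740},
    "latest": {"real": 5425, "synthetic": 5817},
}


def relative_change(baseline, value):
    """Return the change from baseline to value as a percentage."""
    return 100.0 * (value - baseline) / baseline


baseline = results["d03182f (before #8302)"]
changes = {
    version: {kind: relative_change(baseline[kind], speed)
              for kind, speed in speeds.items()}
    for version, speeds in results.items()
    if speeds is not baseline
}

for version, per_kind in changes.items():
    for kind, pct in per_kind.items():
        print(f"{version:24s} {kind:10s} {pct:+.2f}%")
```

On these figures the slowdown relative to d03182f is between roughly 2.6% and 3.9%, depending on the version and the data source.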
@ptrendx @DickJC123 @mkolod
Environment info (Required)
CPUs: Intel Xeon E5-2698 v4 (x2)
GPUs: Nvidia V100 (x8)
Build info (Required if built from source)
From the default config.mk (in make/config.mk) added:
Minimum reproducible example