Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize x86 conv3d_ndhwc using data packing approach. #4866

Merged
merged 1 commit into from
Feb 13, 2020

Conversation

alexgl-github
Copy link
Contributor

Add tuneable conv3d_ndhwc schedule

Thanks for contributing to TVM! Please refer to guideline https://docs.tvm.ai/contribute/ for useful information and tips. After the pull request is submitted, please request code reviews from Reviewers by @ them in the pull request thread.

@alexgl-github
Copy link
Contributor Author

@anijain2305 Please take a look

@kevinthesun
Copy link
Contributor

Thank you for this work! It would be great if you can provide benchmarking data comparing tvm conv3d performance VS existing solution(tensorflow + mkldnn?) to see where current implementation stands.

Copy link
Contributor

@anijain2305 anijain2305 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor comments.

Overall looks good to me. Currently, you have used NCHW schedule of Conv2D. you should also try Conv2d NCHWc schedule. That schedule gives best performance for Conv2D, and has potential here as well.

https://github.com/apache/incubator-tvm/blob/master/topi/python/topi/nn/conv2d.py#L421

topi/python/topi/nn/util.py Outdated Show resolved Hide resolved
topi/python/topi/x86/conv3d.py Outdated Show resolved Hide resolved
Copy link
Contributor

@anijain2305 anijain2305 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good to go from my side! If you can, please add some TF comparison.

@alexgl-github
Copy link
Contributor Author

alexgl-github commented Feb 13, 2020

Thank you for this work! It would be great if you can provide benchmarking data comparing tvm conv3d performance VS existing solution(tensorflow + mkldnn?) to see where current implementation stands.

@kevinthesun
Below is benchmark results for certain data/kernel combinations, run on 2 core Intel "ivybridge" vs TF 1.15 + mkldnn. X: value means speedup (or slowdown) of TVM model vs same TF model

TVM: 0.007 sec; X: 1.645; TF: 0.011 sec; input_shape=(1, 16, 256, 256, 1) ; kernel_shape=(1, 3, 3, 1, 8)
TVM: 0.019 sec; X: 1.218; TF: 0.023 sec; input_shape=(1, 16, 256, 256, 1) ; kernel_shape=(1, 7, 7, 1, 8)
TVM: 0.054 sec; X: 1.490; TF: 0.080 sec; input_shape=(1, 16, 256, 256, 8) ; kernel_shape=(1, 3, 3, 8, 16)
TVM: 0.262 sec; X: 0.869; TF: 0.228 sec; input_shape=(1, 16, 256, 256, 8) ; kernel_shape=(1, 7, 7, 8, 16)
TVM: 0.013 sec; X: 1.290; TF: 0.016 sec; input_shape=(1, 16, 256, 256, 1) ; kernel_shape=(3, 3, 3, 1, 8)
TVM: 0.114 sec; X: 1.148; TF: 0.131 sec; input_shape=(1, 16, 256, 256, 1) ; kernel_shape=(7, 7, 7, 1, 8)
TVM: 0.146 sec; X: 1.058; TF: 0.154 sec; input_shape=(1, 16, 256, 256, 8) ; kernel_shape=(3, 3, 3, 8, 16)
TVM: 2.432 sec; X: 0.591; TF: 1.436 sec; input_shape=(1, 16, 256, 256, 8) ; kernel_shape=(7, 7, 7, 8, 16)

Copy link
Contributor

@kevinthesun kevinthesun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. As suggestion from @anijain2305, we might also want to try similar data layout as conv2d_NCHWc to see whether we can get more performance improvement.

@kevinthesun kevinthesun merged commit 8d94587 into apache:master Feb 13, 2020
@kevinthesun
Copy link
Contributor

Thanks @alexgl-github @anijain2305

alexwong pushed a commit to alexwong/tvm that referenced this pull request Feb 26, 2020
alexwong pushed a commit to alexwong/tvm that referenced this pull request Feb 28, 2020
zhiics pushed a commit to neo-ai/tvm that referenced this pull request Mar 2, 2020
@alexgl-github alexgl-github deleted the conv3d_packed branch November 3, 2020 22:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants