
Improve parallelisation strategy for conv2d #15171

Closed
pavlejosipovic opened this issue Nov 18, 2024 · 2 comments
Assignees
Labels
bug (Something isn't working) · CNNs · op_cat: conv2D (2D convolution for CNNs) · P1

Comments

@pavlejosipovic
Contributor

Conv2d determines the number of work items along the height and width dimensions and then tries to map them onto an array of Tensix cores in 1D or 2D, depending on the selected sharding strategy. If this number of work items is a prime number (or close to one), conv2d ends up mapping the work onto a single core or a handful of cores.
Besides poor performance, this also leads to out-of-memory issues: conv2d may use only a small number of cores to store the sharded tensors, and a single core has to process a large chunk of the work.

To improve on this, we need to pad the number of work items so that the workload can be distributed effectively over the grid of Tensix cores.
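To illustrate the problem and the proposed fix, here is a minimal sketch (the function names are illustrative only, not the actual tt-metal API): without padding, the per-core split must divide the work-item count exactly, so a prime count like 127 can only use 1 core on an 8x8 grid; padding the count up to a multiple of the core count restores full utilisation.

```python
import math

def cores_used_without_padding(num_work_items: int, grid_size: int) -> int:
    # Without padding, every core must get the same whole number of work
    # items, so the op can only use a divisor of the work-item count.
    for cores in range(min(num_work_items, grid_size), 0, -1):
        if num_work_items % cores == 0:
            return cores
    return 1

def pad_work_items(num_work_items: int, grid_size: int) -> int:
    # Pad the work-item count up to the nearest multiple of the core count
    # so the workload divides evenly across the whole grid.
    cores = min(num_work_items, grid_size)
    per_core = math.ceil(num_work_items / cores)
    return per_core * cores

# 127 is prime: unpadded it maps to a single core of an 8x8 (64-core) grid,
# while padding to 128 lets all 64 cores take 2 work items each.
print(cores_used_without_padding(127, 64))  # -> 1
print(pad_work_items(127, 64))              # -> 128
```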

@pavlejosipovic pavlejosipovic added bug Something isn't working CNNs op_cat: conv2D 2D convolution for CNNs P1 labels Nov 18, 2024
@pavlejosipovic pavlejosipovic self-assigned this Nov 18, 2024
pavlejosipovic pushed a commit that referenced this issue Nov 18, 2024
@ayerofieiev-tt ayerofieiev-tt moved this from Todo to In Progress in PyTorch 2.0 TT-NN Compiler Nov 21, 2024
pavlejosipovic pushed a commit that referenced this issue Nov 25, 2024
pavlejosipovic pushed a commit that referenced this issue Nov 27, 2024
@kmabeeTT
Contributor

Hi @pavlejosipovic - We hit a compile error in tt-mlir after uplifting to the latest tt-metal that includes this change, because the new field enable_channels_padding on determine_parallel_config() was added but we don't set it. What is the guidance for setting this field? FYI @LPanosTT

kmabeeTT added a commit to tenstorrent/tt-mlir that referenced this issue Nov 28, 2024
@pavlejosipovic
Contributor Author

Hi @pavlejosipovic - We hit a compile error in tt-mlir after uplifting to the latest tt-metal that includes this change, because the new field enable_channels_padding on determine_parallel_config() was added but we don't set it. What is the guidance for setting this field? FYI @LPanosTT

For conv ops it should be set to true, and for max_pool it should be set to false.
Hopefully we will enable this feature for max_pool too, and add auto-sharding support, so you won't have to call this function at all.
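Per that guidance, a downstream caller can select the flag based on the op kind. A hypothetical helper (the function name and op-kind strings are assumptions for illustration, not tt-mlir or tt-metal API):

```python
def channels_padding_for_op(op_kind: str) -> bool:
    # Guidance from this thread: conv ops should enable channels padding,
    # max_pool should not (the feature is not yet enabled for it).
    return op_kind in ("conv2d", "conv_transpose2d")

print(channels_padding_for_op("conv2d"))      # -> True
print(channels_padding_for_op("max_pool2d"))  # -> False
```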

kmabeeTT added a commit to tenstorrent/tt-mlir that referenced this issue Nov 28, 2024
 - due to tt-metal changes from tenstorrent/tt-metal#15171

(cherry picked from commit 14cbe04)
@github-project-automation github-project-automation bot moved this from In Progress to Done in PyTorch 2.0 TT-NN Compiler Dec 2, 2024
Projects
Status: Done