
Adding multi-layer perceptron in ops #6053

Merged: 9 commits from ops/mlp merged into pytorch:main on May 19, 2022

Conversation

@datumbox (Contributor) commented on May 19, 2022:

We should avoid using ViT's MLP block from Swin:

from .vision_transformer import MLPBlock

The specific layer is very common and has been previously requested at #4333

This PR:

  • Adds a generic MLP block to TorchVision (usage sketch below).
  • Handles BC on ViT.
  • Replaces MLPBlock with MLP on Swin and patches the unreleased weights (weights uploaded on S3 and manifold).
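
For illustration, a minimal usage sketch of the new generic block. This is hedged: the parameter names follow this PR's docstring and the code excerpts below, but the exact defaults may differ from the final API.

import torch
from torchvision.ops import MLP

# A ViT/Swin-style 2-layer MLP: Linear -> GELU -> Dropout -> Linear -> Dropout.
mlp = MLP(
    in_channels=96,
    hidden_channels=[384, 96],
    activation_layer=torch.nn.GELU,
    inplace=None,  # GELU takes no inplace argument, so no inplace kwarg is forwarded to it
    dropout=0.0,
)
out = mlp(torch.randn(1, 56 * 56, 96))  # output shape: (1, 3136, 96)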

References:

Proof that the new API doesn't break Swin. The minor differences are expected due to the known non-deterministic behaviour of some kernels:

srun -p dev --cpus-per-task=96 -t 24:00:00 --gpus-per-node=1 torchrun --nproc_per_node=1 train.py --model swin_t --test-only -b 1 --weights="Swin_T_Weights.IMAGENET1K_V1"
Test:  Acc@1 81.476 Acc@5 95.780

srun -p dev --cpus-per-task=96 -t 24:00:00 --gpus-per-node=1 torchrun --nproc_per_node=1 train.py --model swin_s --test-only -b 1 --weights="Swin_S_Weights.IMAGENET1K_V1"
Test:  Acc@1 83.182 Acc@5 96.366

srun -p dev --cpus-per-task=96 -t 24:00:00 --gpus-per-node=1 torchrun --nproc_per_node=1 train.py --model swin_b --test-only -b 1 --weights="Swin_B_Weights.IMAGENET1K_V1"
Test:  Acc@1 83.584 Acc@5 96.636

cc @ankitade

@datumbox mentioned this pull request on May 19, 2022
@datumbox requested a review from NicolasHug on May 19, 2022, 15:12
Review thread on torchvision/models/vision_transformer.py (resolved).

Review thread on torchvision/ops/misc.py (outdated), docstring excerpt:
hidden_channels (List[int]): List of the hidden channel dimensions
out_channels (int): Number of channels of the output
norm_layer (Callable[..., torch.nn.Module], optional): Norm layer that will be stacked on top of the convolution layer. If ``None`` this layer won't be used. Default: ``None``
activation_layer (Callable[..., torch.nn.Module], optional): Activation function which will be stacked on top of the normalization layer (if not None), otherwise on top of the conv layer. If ``None`` this layer won't be used. Default: ``torch.nn.ReLU``

A Member commented:
Annotations in docstring make me sad :'(

Contributor Author (@datumbox) replied:
I know how you feel about this. :( I think all the callables across TorchVision are annotated like that to provide info on what they are supposed to return.
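
For context, these annotated callables are factories that build a module for each hidden layer. Below is a minimal sketch of how they are typically passed, assuming the parameter names from the docstring excerpt above (functools.partial pre-binds any extra arguments):

import functools
import torch
from torchvision.ops import MLP

mlp = MLP(
    in_channels=128,
    hidden_channels=[256, 64],
    # called as norm_layer(hidden_dim) to build each normalization module
    norm_layer=functools.partial(torch.nn.LayerNorm, eps=1e-6),
    activation_layer=torch.nn.ReLU,
    dropout=0.1,
)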

Review thread on torchvision/ops/misc.py (outdated, resolved).
@NicolasHug (Member) left a comment:
API LGTM, thanks @datumbox

@jdsgomes (Contributor) left a comment:

LGTM, feel free to merge after the same changes are done in Swin. If you prefer, I can have a second look once that's done.

@datumbox changed the title from "[WIP] Adding multi-layer perceptron in ops" to "Adding multi-layer perceptron in ops" on May 19, 2022

 from ..ops.stochastic_depth import StochasticDepth
 from ..transforms._presets import ImageClassification, InterpolationMode
 from ..utils import _log_api_usage_once
 from ._api import WeightsEnum, Weights
 from ._meta import _IMAGENET_CATEGORIES
 from ._utils import _ovewrite_named_param
-from .convnext import Permute
-from .vision_transformer import MLPBlock
+from .convnext import Permute  # TODO: move Permute on ops
Contributor Author (@datumbox) commented:

This is a straight move from convnext to ops (no weight patching needed), but to avoid doing everything in a single PR I plan to do it in a follow-up.

@datumbox merged commit 77cad12 into pytorch:main on May 19, 2022
@datumbox deleted the ops/mlp branch on May 19, 2022, 18:15
facebook-github-bot pushed a commit that referenced this pull request Jun 1, 2022
Summary:
* Adding an MLP block.

* Adding documentation

* Update typos.

* Fix inplace for Dropout.

* Apply recommendations from code review.

* Making changes on pre-trained models.

* Fix linter

Reviewed By: datumbox, NicolasHug

Differential Revision: D36760914

fbshipit-source-id: 331d2ebbf9bb1782695c14bb6ee5e158847ba356

Code excerpt under discussion (from the MLP layer construction):

in_dim = hidden_dim

# After the loop over the hidden dimensions: project to the final dimension (hidden_channels[-1]).
# Note that Dropout is applied after this last Linear as well, which is what the thread below discusses.
layers.append(torch.nn.Linear(in_dim, hidden_channels[-1], bias=bias))
layers.append(torch.nn.Dropout(dropout, **params))

@thomasbbrunner commented on Aug 4, 2022:
It is not very clear to me why there's a Dropout layer after the last layer. I saw that it was present in the previous MLPBlock class, but no other implementation of MLP with dropout (that I could find) has a dropout layer on the output, including the one in the multimodal package.

Maybe this was something specific to the use case of MLPBlock? If so, it should not be in this class.

@datumbox (Contributor, Author) commented on Aug 4, 2022:

You are right, there are various implementations of MLP: some don't have dropout at all, some have it in the middle but not at the end, and some have it everywhere. If you check the references, you will see that all of these patterns exist. Our implementation looks the way it does because it replaces the MLP layers used in existing models such as ViT and Swin. We also try to support more complex variations with more than 2 linear layers. Your observation is correct, though: if one wanted to avoid having dropout at the end, the current implementation wouldn't let them. Since that variant is also valid, perhaps it's worth making this update in a non-BC-breaking way, with a new boolean that controls whether Dropout appears at the end. WDYT?

@thomasbbrunner replied:

I think your suggestion sounds very nice. What would the default value be for the boolean? I guess that setting it to True (with dropout) would cause no breaking changes. At the same time, I would say that not having dropout in the last layer is the more common (default) configuration? Also, I'd be interested in working on this, whichever option is chosen.

Contributor Author (@datumbox) replied:

> I guess that setting it to True (with dropout) would cause no breaking changes.

Yes, you are right, we will need to maintain BC. Note that using True is the "default" setup in TorchVision at the moment, as literally all existing models require dropout everywhere.

> I'd be interested in working on this

Sounds great, let me recommend the following. Could you start an issue, summarizing what you said here and providing a few references of the usage of MLP with a middle dropout but without the final one? Providing a few examples from real-world vision architectures will help build a stronger case. Once we clarify the details on the issue, we can discuss a potential PR. 😃
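
For illustration only, a minimal sketch of what such a flag could look like. The name dropout_last and its placement are assumptions for discussion, not an agreed or existing API:

# hypothetical tail of the MLP layer construction, not the actual torchvision code
layers.append(torch.nn.Linear(in_dim, hidden_channels[-1], bias=bias))
if dropout_last:  # would default to True to keep backwards compatibility
    layers.append(torch.nn.Dropout(dropout, **params))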

@thomasbbrunner replied:

Ok! I am a bit short on time at the moment, but will have more time in the upcoming weeks. Nevertheless, I'm interested in this and will be working on it!
