
[RFC] Layers API in Quickvision #61

Open
oke-aditya opened this issue Dec 1, 2020 · 14 comments
Labels
feature A new feature request

Comments

@oke-aditya
Owner

🚀 Feature

A layers API for a set of re-usable layers.

Motivation

Most models tend to re-use layers. Note that these are different from the components we currently have.
Components are FPNs, backbones, and heads such as the YOLO head, etc.

Layers are fundamentally small and less complicated,
e.g. activation functions and small building blocks such as a CSP layer, simple Transformer layers,
or a Squeeze-Excite block.

We commonly need to re-use layers across various tasks. E.g. heads and FPNs are specific to object detection / segmentation, but layers are quite generic and can be re-used in any NN architecture.

Pitch

We do have a layers folder; I think it's time to populate it.
One minor question: should we decompose layers into activations, blocks, etc.?
I think we can avoid this for now and introduce it later if the API needs it.

Alternatives

I think this feature addition is great; it will allow us to easily decompose models, and to re-use, test, and maintain code.

Additional context

Let's start simple by adding activations and simple blocks; we can move further from there.

cc @zhiqwang @hassiahk

@oke-aditya oke-aditya added the feature A new feature request label Dec 1, 2020
@oke-aditya oke-aditya changed the title from [RFC] to [RFC] Layers API in Quickvision on Dec 1, 2020
@zhiqwang
Contributor

zhiqwang commented Dec 1, 2020

It seems that this will make this repo a little complicated 🤔 Although detection models (yolo/ssd/retinanet) share abstract modules (Backbone, Head, Anchors, and PostProcess), their commonality is limited in a sense (maybe I'm wrong here). More specifically, unifying these three single-stage detectors seems a little difficult. (One method to handle this is something like the registration mechanism in detectron2.) But I'm open to this feature here.

Or, another thing to be determined: which layers should we add here?

@oke-aditya
Owner Author

Yes, what happens in places is that our model.py file gets too lengthy and creates tons of code duplication for common blocks and activation functions.

If we could collectively place them under layers, it would be easier to maintain and test.

For detection and segmentation models, if we decompose them as you mentioned,
it is often Backbone, Neck (optional, e.g. FPN), Head (RetinaHead, FRCNN head), and then PostProcess.

These are somewhat high-level blocks, and this abstraction is achieved by components, making them re-usable, but they are still model specific. E.g. DETR backbones won't work with YOLO, and a YOLO head would not work for FRCNN.

Let me illustrate where layers can be helpful.

Layers can provide simple blocks that we see in model.py quite often,
e.g. a Mish layer or a Swish layer (neither is in PyTorch yet),
or blocks such as a dilated ResNet or a simple MLP block.

Maybe the word layers doesn't make sense?
But the idea is to keep these generic.

Registration mechanisms are confusing, and are not even used in torchvision / bolts.
We can simply do:
from quickvision.layers import Mish, Swish
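
For illustration, a minimal sketch of what such layers could look like (the module placement and exact docstrings are assumptions, not a final design):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Mish(nn.Module):
    """Mish activation: x * tanh(softplus(x)). Not in PyTorch at the time of writing."""

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x * torch.tanh(F.softplus(x))


class Swish(nn.Module):
    """Swish / SiLU activation: x * sigmoid(x)."""

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x * torch.sigmoid(x)
```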

@oke-aditya
Owner Author

The question is how to determine whether a certain block is a layer, a component, or should be part of a model.

A high-level logic:

If we are pretty sure that a block (a PyTorch nn.Sequential or similar container) is going to be used very often,
e.g. activation functions or a Squeeze-Excite block, it should rather be a layer.

Backbones, Necks, and Heads are components, no doubt, as they are model specific.

Model-specific layers are usually something innovative from the paper and appear very rarely in the literature.
If we find that they get used often, cause code repetition, or can be made generic, we can move them.
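
As a concrete example of the "layer" category, here is a minimal Squeeze-Excite block sketch (the reduction ratio and exact layout are illustrative assumptions):

```python
import torch
import torch.nn as nn

class SqueezeExcite(nn.Module):
    """Squeeze-and-Excitation block: re-weights channels via a small bottleneck MLP."""

    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        # Squeeze to per-channel statistics, excite, then rescale the input.
        scale = self.fc(self.pool(x).view(b, c)).view(b, c, 1, 1)
        return x * scale
```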

@zhiqwang
Contributor

zhiqwang commented Dec 1, 2020

Registration mechanisms are confusing, and are not even used in torchvision / bolts.
We can simply do:
from quickvision.layers import Mish, Swish

Agreed here. Is torch.hub an option, like DETR does?

@oke-aditya
Owner Author

oke-aditya commented Dec 1, 2020

Yes, DETR is using torch.hub. I could not implement it from scratch, and DETR's transfer learning differs from its training setup, so it becomes quite hard to replicate.

torch.hub is a very nice approach and can reduce extreme pains when a model implementation is hard.
I won't recommend it when it bundles dependencies, but when it doesn't, it is a sleek way to provide a good implementation close to the authors'.

Our discussion about YOLO did have this point (#48), and we decided to avoid hub there 😄
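
For reference, loading DETR through torch.hub looks roughly like this (the entrypoint name comes from the DETR repo's hubconf, to the best of my knowledge):

```python
import torch

# Pulls the model definition and pretrained weights straight from the DETR repository.
model = torch.hub.load("facebookresearch/detr", "detr_resnet50", pretrained=True)
model.eval()
```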

@oke-aditya
Owner Author

I suggest that when people add new models, they might not have an idea of how all of this works.
Let them write the model as follows:

model_folder
----- __init__.py (whatever we can import)
----- model.py (define whatever model you need)
----- model_factory.py (allows us to initialize the model with possible backbones etc.)
----- engine.py (all your training logic)
----- utils.py (post-processing, helpers)

I don't think people will need any additional files; if they do, they can create them, but these are the bare minimum.

Currently we don't have a model.py file since we are either using torchvision or torch.hub.
But the YOLO in #48 will have one, and we should try to keep it simple.

Later we can refactor as needed, since we have to maintain the repo,
so we can shift blocks out of model.py wherever necessary.

@oke-aditya
Owner Author

I understand that decomposition distributes our code here and there somewhat, but I think it will only go up to this level.

But this keeps flexibility: again, authors who want to add a model can simply write all the code in one folder, and we can refactor it.
The only API exposed is through model_factory as create_xyz_model, so for the end user it is less pain.
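
A rough sketch of what such a model_factory.py entrypoint could look like (create_xyz_model, the backbone list, and the classification-head swap are all hypothetical examples, not the actual API):

```python
import torch.nn as nn
import torchvision

_SUPPORTED_BACKBONES = ("resnet18", "resnet34", "resnet50")  # hypothetical list

def create_xyz_model(backbone: str = "resnet18", num_classes: int = 10,
                     pretrained: bool = True) -> nn.Module:
    """The only API exposed to the end user: build a model with the chosen backbone."""
    if backbone not in _SUPPORTED_BACKBONES:
        raise ValueError(f"Unsupported backbone: {backbone}")
    model = getattr(torchvision.models, backbone)(pretrained=pretrained)
    # Replace the final classification head for the requested number of classes.
    model.fc = nn.Linear(model.fc.in_features, num_classes)
    return model
```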

@zhiqwang
Contributor

zhiqwang commented Dec 1, 2020

One question here: adding Mish or something like it works for me, but how can we convince users to choose our implementation?

In other words, people can simply use ReLU6, or a more experimental repo like https://github.com/digantamisra98/Mish, since they first proposed this op.

Or equivalently, should we determine the principles for adding a new layer?

@oke-aditya
Owner Author

oke-aditya commented Dec 1, 2020

Again @zhiqwang, the aim is not to go into such fine detail; we use layers that are used by papers, and most use a simple version of Mish, not the experimental variants.

These are really hard decisions for the end user too. We can keep simpler versions, e.g. it can simply be x * torch.tanh(F.softplus(x)) (I think).

The possibilities are really endless.

We can provide stuff that is used by papers, or something common. But if we get sufficient user requests we can add these (we are already adding layers which PyTorch didn't, so there is some limit to how much users might demand).

@digantamisra98

@oke-aditya You can add support for Echo. We will be releasing a beta version soon (New Year's Eve) containing optimizers, activations, and attention layers. Subsequent releases will contain all custom layers (Conv, regularizers, etc.).

@zhiqwang
Contributor

zhiqwang commented Dec 2, 2020

Again @zhiqwang, the aim is not to go into such fine detail; we use layers that are used by papers, and most use a simple version of Mish, not the experimental variants.

These are really hard decisions for the end user too. We can keep simpler versions, e.g. it can simply be x * torch.tanh(F.softplus(x)) (I think).

The possibilities are really endless.

Let's go ahead and see what happens next!

@oke-aditya
Owner Author

@digantamisra98 People are free to use Echo along with this, but adding Echo would pull in its sub-dependencies, which would probably bulk up this library. It would also be hard to keep the API of this library consistent with Echo's implementations. So as of now, I'm not quite sure how it would work out.

@digantamisra98

@oke-aditya Absolutely understandable.

@oke-aditya oke-aditya mentioned this issue Dec 7, 2020
@oke-aditya
Owner Author

Another point about this API:
I think we should discourage the use of the functional API for our layers.
Reasons:

  1. Blocks such as MLP or CSPNet do not have any functional API.
  2. The class-based API does the same job, and these are not C++-exported APIs.
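
To illustrate the class-only approach, a minimal MLP block sketch (names and defaults are illustrative; there is no functional counterpart to mirror):

```python
import torch
import torch.nn as nn

class MLP(nn.Module):
    """Simple class-based MLP block with one hidden layer."""

    def __init__(self, in_features: int, hidden_features: int, out_features: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_features, hidden_features),
            nn.ReLU(inplace=True),
            nn.Linear(hidden_features, out_features),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)
```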
