
[RFC] Layers API in Quickvision #61

Open
oke-aditya opened this issue Dec 1, 2020 · 14 comments
Labels
feature A new feature request

Comments

@oke-aditya
Owner

🚀 Feature

A layers API for a set of re-usable layers.

Motivation

Most models tend to re-use layers. Note that these are different from the components we currently have.
Components are FPNs, backbones, and heads such as the YOLO head, etc.

Layers are fundamentally small and less complicated,
e.g. activation functions and small building blocks such as a CSP layer, simple Transformer layers,
or a Squeeze-Excite block.

We commonly need to re-use layers across various tasks. E.g. heads and FPNs are specific to object detection / segmentation, but layers are quite generic and can be re-used in any NN architecture.

Pitch

We do have a layers folder; I think it's time to populate it.
One minor question: should we decompose layers into activations, blocks, etc.?
I think we can avoid this for now and introduce it later if the API needs it.

Alternatives

I think this feature addition is great; it will allow us to easily decompose models, and to re-use, test, and maintain code.

Additional context

Let's start simple by adding activations and simple blocks; we can move further from there.

cc @zhiqwang @hassiahk

@oke-aditya oke-aditya added the feature A new feature request label Dec 1, 2020
@oke-aditya oke-aditya changed the title from [RFC] to [RFC] Layers API in Quickvision on Dec 1, 2020
@zhiqwang
Contributor

zhiqwang commented Dec 1, 2020

It seems that this will make this repo a little complicated 🤔 Although detection models (yolo/ssd/retinanet) share abstract modules (Backbone, Head, Anchors, and PostProcess), their commonality is limited in a sense (maybe I'm wrong here). More specifically, unifying these three single-stage detectors seems a little difficult. (One method to handle this is something like the registration mechanism in detectron2.) But I'm open to this feature here.

Or, another thing to be determined: which layers should we add here?

@oke-aditya
Owner Author

Yes, what happens in places is that our model.py file gets too lengthy and creates tons of code duplication for common blocks and activation functions.

If we could collectively place them under layers, it would be easier to maintain and test.

For detection and segmentation models, if we decompose them as you mentioned,
it is often Backbone, Neck (optional, e.g. FPN), Head (RetinaHead, FRCNN head), and then PostProcess.

These are somewhat high-level blocks, and this abstraction is achieved by components, making them re-usable, but they are still model specific. E.g. DETR backbones won't work with YOLO, and a YOLO head would not work for FRCNN.

Let me illustrate where layers can be helpful.

Layers can provide simple blocks that we see in model.py quite often,
e.g. a Mish layer or a Swish layer (neither is in PyTorch yet),
or blocks such as a dilated ResNet or a simple MLP block.

Maybe the word layers doesn't make sense?
But the idea is to keep these generic.

Registration mechanisms are confusing, and are not even used in torchvision / bolts.
We can simply do:
from quickvision.layers import Mish, Swish
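
For illustration, a minimal sketch of what such layers could look like (the module placement and exact docstrings are assumptions, not a final design):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Mish(nn.Module):
    """Mish activation: x * tanh(softplus(x)). Not in PyTorch at the time of writing."""

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x * torch.tanh(F.softplus(x))


class Swish(nn.Module):
    """Swish / SiLU activation: x * sigmoid(x)."""

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x * torch.sigmoid(x)
```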

@oke-aditya
Owner Author

The question is how to determine whether a certain block is a layer, a component, or should be part of a model.

A high-level logic:

If we are pretty sure that a block (a PyTorch nn.Sequential or similar container) is going to be used very often,
e.g. activation functions or a Squeeze-Excite block, it should rather be a layer.

Backbones, Necks, and Heads are components, no doubt, as they are model specific.

Model-specific layers are usually something innovative from the paper and appear very rarely in the literature.
If we find that they get used often, cause code repetition, or can be made generic, we can move them.
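
As a concrete example of the "layer" category, here is a minimal Squeeze-Excite block sketch (the reduction ratio and exact layout are illustrative assumptions):

```python
import torch
import torch.nn as nn

class SqueezeExcite(nn.Module):
    """Squeeze-and-Excitation block: re-weights channels via a small bottleneck MLP."""

    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        # Squeeze to per-channel statistics, excite, then rescale the input.
        scale = self.fc(self.pool(x).view(b, c)).view(b, c, 1, 1)
        return x * scale
```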

@zhiqwang
Contributor

zhiqwang commented Dec 1, 2020

Registration mechanisms are confusing, and are not even used in torchvision / bolts.
We can simply do:
from quickvision.layers import Mish, Swish

Agreed here. Is torch.hub an option, like DETR does?

@oke-aditya
Owner Author

oke-aditya commented Dec 1, 2020

Yes, DETR is using torch.hub. I could not implement it from scratch, and DETR's transfer learning differs from its training setup, so it becomes quite hard to replicate.

torch.hub is a very nice approach and can reduce extreme pains when a model implementation is hard.
I won't recommend it when it bundles dependencies, but when it doesn't, it is a sleek way to provide a good implementation close to the authors'.

Our discussion about YOLO did have this point (#48), and we decided to avoid hub there 😄
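
For reference, loading DETR through torch.hub looks roughly like this (the entrypoint name comes from the DETR repo's hubconf, to the best of my knowledge):

```python
import torch

# Pulls the model definition and pretrained weights straight from the DETR repository.
model = torch.hub.load("facebookresearch/detr", "detr_resnet50", pretrained=True)
model.eval()
```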

@oke-aditya
Owner Author

I suggest that when people add new models, they might not have an idea of how all of this works.
Let them write the model as follows:

model_folder
----- __init__.py (whatever we can import)
----- model.py (define whatever model you need)
----- model_factory.py (allows us to initialize the model with possible backbones etc.)
----- engine.py (all your training logic)
----- utils.py (post-processing, helpers)

I don't think people will need any additional files; if they do, they can create them, but these are the bare minimum.

Currently we don't have a model.py file since we are either using torchvision or torch.hub.
But the YOLO in #48 will have one, and we should try to keep it simple.

Later we can refactor as needed, since we have to maintain the repo,
so we can shift blocks out of model.py wherever necessary.

@oke-aditya
Owner Author

I understand that decomposition distributes our code here and there somewhat, but I think it will only go up to this level.

But this keeps flexibility: again, authors who want to add a model can simply write all the code in one folder, and we can refactor it.
The only API exposed is through model_factory as create_xyz_model, so for the end user it is less pain.
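
A rough sketch of what such a model_factory.py entrypoint could look like (create_xyz_model, the backbone list, and the classification-head swap are all hypothetical examples, not the actual API):

```python
import torch.nn as nn
import torchvision

_SUPPORTED_BACKBONES = ("resnet18", "resnet34", "resnet50")  # hypothetical list

def create_xyz_model(backbone: str = "resnet18", num_classes: int = 10,
                     pretrained: bool = True) -> nn.Module:
    """The only API exposed to the end user: build a model with the chosen backbone."""
    if backbone not in _SUPPORTED_BACKBONES:
        raise ValueError(f"Unsupported backbone: {backbone}")
    model = getattr(torchvision.models, backbone)(pretrained=pretrained)
    # Replace the final classification head for the requested number of classes.
    model.fc = nn.Linear(model.fc.in_features, num_classes)
    return model
```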

@zhiqwang
Contributor

zhiqwang commented Dec 1, 2020

One question here: adding Mish or something like it works for me, but how can we convince users to choose our implementation?

In other words, people can simply use ReLU6, or a more experimental repo like https://github.com/digantamisra98/Mish, since they first proposed this op.

Or equivalently, should we determine the principles for adding a new layer?

@oke-aditya
Owner Author

oke-aditya commented Dec 1, 2020

Again @zhiqwang, the aim is not to go into such fine detail; we use layers that are used by papers, and most use a simple version of Mish, not the experimental variants.

These are really hard decisions for the end user too. We can keep simpler versions, e.g. it can simply be x * torch.tanh(F.softplus(x)) (I think).

The possibilities are really endless.

We can provide stuff that is used by papers, or something common. But if we get sufficient user requests we can add these (we are already adding layers which PyTorch didn't, so there is some limit to how much users might demand).

@digantamisra98

@oke-aditya You can add support for Echo. We will be releasing a beta version soon (New Year's Eve) containing optimizers, activations, and attention layers. Subsequent releases will contain all custom layers (Conv, regularizers, etc.).

@zhiqwang
Contributor

zhiqwang commented Dec 2, 2020

Again @zhiqwang, the aim is not to go into such fine detail; we use layers that are used by papers, and most use a simple version of Mish, not the experimental variants.

These are really hard decisions for the end user too. We can keep simpler versions, e.g. it can simply be x * torch.tanh(F.softplus(x)) (I think).

The possibilities are really endless.

Let's go ahead and see what happens next!

@oke-aditya
Owner Author

@digantamisra98 People are free to use Echo along with this, but adding Echo would pull in its sub-dependencies, which would probably bulk up this library. It would also be hard to keep the API of this library consistent with Echo's implementations. So as of now, I'm not quite sure how it would work out.

@digantamisra98

@oke-aditya Absolutely understandable.

@oke-aditya oke-aditya mentioned this issue Dec 7, 2020
@oke-aditya
Owner Author

Another point about this API:
I think we should discourage the use of the functional API for our layers.
Reasons:

  1. Blocks such as MLP or CSPNet do not have any functional API.
  2. The class-based API does the same job, and these are not C++-exported APIs.
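
To illustrate the class-only approach, a minimal MLP block sketch (names and defaults are illustrative; there is no functional counterpart to mirror):

```python
import torch
import torch.nn as nn

class MLP(nn.Module):
    """Simple class-based MLP block with one hidden layer."""

    def __init__(self, in_features: int, hidden_features: int, out_features: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_features, hidden_features),
            nn.ReLU(inplace=True),
            nn.Linear(hidden_features, out_features),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)
```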
