
RFC: TensorFlow Official Model Garden Redesign #130

Merged · 7 commits into tensorflow:master on Mar 6, 2020

Conversation

rachellj218
Member

@rachellj218 rachellj218 commented Aug 3, 2019

This RFC will be open for comment until Friday, August 16th, 2019.

TensorFlow Official Model Garden Redesign

Status Proposed
RFC# 130
Author(s) Jing Li ([email protected]), Hongkun Yu ([email protected]), Xiaodan Song ([email protected])
Sponsor Edd Wilder-James ([email protected])
Updated 2019-08-02

Objective

This document presents a proposal to redesign TensorFlow official model garden.
We aim to provide a central and reliable place to contain popular examples,
state-of-the-art models and tutorials to demonstrate the best practice in TF2.0
and illustrate real-world use cases.

@zhenhuaw-me zhenhuaw-me left a comment

I think this proposal is really cool, as many TF users are suffering from the existing model zoo :) Generally, will we maintain multiple branches (or tags) to keep the code/models consistent with specific TF versions? For example, having branch r2.0 or tag v2.0.0 tracking the TF branch/tag. (I know that TF will provide API stability, though...)


## Motivation

The current [TF official model garden](https://github.com/tensorflow/models/tree/master/official)


Will the new model garden still live at https://github.com/tensorflow/models/tree/master/official, or become a "root" repo? Putting such an important resource in a subdirectory may confuse (new) users, I guess.


I have this same question.

Member Author


The current plan is still to keep the model garden in its current directory. But we are going to work with TF Hub to provide a unified UI offering both pretrained models and links to the model code; hopefully that will make it easier for users to find. Thanks!


@zhenhuaw-me zhenhuaw-me Aug 12, 2019


I am not sure what others think, but to me the model garden directory is https://github.com/tensorflow/models/tree/master, and it feels hard to understand what the official subdirectory means, because in the current directory hierarchy the MobileNets are published in research/slim, which I also take to be official...

Anyway, UI and links will be great! Thank you for reply.

| Directory | Subdirectories | | Explanations |
:-------------- |:---------------------|:--|:------------------------------ |
| modeling | | | Common modeling libraries |
| | layers | | Common modules/layers, not built-in tensorflow layers yet.|

@zhenhuaw-me zhenhuaw-me Aug 5, 2019


Putting layers here seems dangerous to me. By "not built-in tensorflow layers yet", I assume they are layers introduced by newly published papers. If we put the layers here and the layers are later included in TensorFlow, will we rewrite the related code in this model garden? If not, this will become another slim/contrib.layers of TF 1.x. Maybe push hard to get the implementations into TF instead?

Contributor


How are these layers different from tf-addons?

Member

@seanpmorgan seanpmorgan Aug 7, 2019


Splintering the TF ecosystem with multiple repos that contain new layers/optimizers seems counter-productive. Addons is already going to manage the burden of graduating layers into core, so why have two repos for the same thing?

cc @karmel @facaiy for visibility

Member


With this, we are going in the direction of TF1.x again.



@rachellj218 , I believe we discussed having a process to ensure any broadly useful layers/etc. make it to Addons and/or tf.text. We should update the doc to reflect that, as it seems to be a common concern; maybe add a section describing what you would like the evaluation/graduation process to be?

Member Author


Thanks for raising this! As karmel@ mentioned, we do have a plan to graduate the common layers to tensorflow/addons; I clarified this in the RFC. Thanks!

I removed the optimizers/ subdirectory. I agree it is better to add those to Addons directly.


Thanks for your clarification! In the latest RFC:

Temporary folder to hold common modules/layers, not built-in tensorflow layers yet during refactoring. Broadly used layers will be graduated to tensorflow/addons and/or tf.text.

It seems to me that graduation will become another SLIM. Graduation will create different versions of one network based on different APIs: say a network uses three layers A, B, and C first introduced in the model garden; as these three layers move to TF Addons one by one, we end up with four versions of the network code. That is pretty confusing. And since the interfaces of these layers will probably change, it won't be as easy as rewriting one line. Eventually, the effort to maintain this code will be huge.

What about implementing the layers directly in TF Addons, which I think is much more flexible than the TF repo?

Thanks, hoping that my concern won't be too annoying...

Member


I agree with this concern. IMO it's less of a headache if they are moved from the temporary folder to Addons prior to a model-garden pip release. We can promise a quick review and patch release for official model additions. To ensure this we could have a model-garden team member on addons team.

Member


I think we need a consolidation stage for modeling components. This is not about developing infra but about writing models. We will guarantee that layers moved to tf-addons are removed from the model garden. It also depends on how you define "layers", as some of them are not common components like the ones in tf-addons now.

Member


To ensure this we could have a model-garden team member on addons team.

+1, good idea. We can work in close coordination.

Maybe the model garden can put its private layers in each model's module. And when you find that some layers need to be shared by two or more models, I believe that is a good time to move those layers to tf.addons, rather than modeling.layers. What do you think?
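The single-import-site indirection being discussed here could look like the following minimal sketch. GroupNormalization is used only as a stand-in example of a "graduating" layer, and the fallback class is invented for illustration; none of this is the actual model garden code.

```python
# Hypothetical sketch of the graduation pattern discussed above: models
# import a layer from ONE place, so moving it to tensorflow_addons later
# is a one-line change instead of a rewrite of every model.
try:
    # Preferred once the layer has graduated to Addons.
    from tensorflow_addons.layers import GroupNormalization as GroupNorm
except ImportError:
    # Transitional local copy inside the model garden (illustrative only).
    class GroupNorm:
        def __init__(self, groups=32):
            self.groups = groups

        def __call__(self, inputs):
            # A real implementation would normalize within each group.
            return inputs

layer = GroupNorm(groups=16)
print(layer.groups)  # models depend on the alias, not on where it lives
```

With this layout, removing the local copy after graduation breaks nothing, since every model references the alias rather than the implementation's origin.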

| modeling | | | Common modeling libraries |
| | layers | | Common modules/layers, not built-in tensorflow layers yet.|
| | networks | | Well-known networks built on top of layers, e.g. transformer |
| | optimziers | | New or customized optimizers |


Similar to layers, I think we are creating a TF dialect here.

Member


Same concern: Addons has additional optimizers that can be graduated to core.

Any optimizer/layer that isn't useful as a supplementary package can just be committed to the model garden repository, IMO.

Member Author


Removed optimizers subdirectory.


These models will have continuous convergence and performance testing to
make sure there are no regressions. In general, we won’t deprecate these models unless:
* the model isn’t compatible with the TF APIs any more and has to be replaced by a new version


Does this imply that the models will only target the latest version of TF?

Member Author


We plan to do a model garden release for each major TF release, e.g. TF 2.1, 2.2. Thanks!


Cool! Thanks for your reply.

@ewilderj ewilderj added the RFC: Proposed RFC Design Document label Aug 5, 2019

@jayfurmanek jayfurmanek left a comment


This is a great idea! As it is now, it's hard to know which models work against which version of TF. Getting some focus here is good. The TF Hub integration is good as well.


We are going to reorganize the official model directory to provide:

* common libraries, mainly two types:

@jayfurmanek jayfurmanek Aug 5, 2019


Moving common code into a central location is always a good design practice, but let's be careful not to create another SLIM here. There are still models (Deeplab anyone) that are chained to SLIM.

Remember, part of what would be valuable here is showing good examples on how to write a model, not necessarily good examples on how to write a model garden with intertwined and complicated dependencies.

Member


I agree on the last part. models/research is a typical example of this. The whole garden there is so complex and huge that it is nowhere near usable for a project unless you:

  • just want the model off the shelf without any modification, or
  • know the ins and outs of each and every part of it


Same concern. Heavy software engineering design may not suit a deep learning model garden these days.

Personally, a model garden is simply somewhere a user can obtain a whole runnable model from a single subdirectory. Sharing a common library, or anything similar, sounds like an SDK built on top of TF to me. Maybe we can have a TF SDK, and then the garden.

Member Author


Thank you for raising this concern! We definitely need a good balance between implementation duplication and complicated dependencies. The current design splits the modeling process into two stages: 1) common networks (in the modeling directory), such as ResNet and Transformer, so that these networks can be reused in a specific model/task, e.g. ResNet-50, Mask R-CNN, sequence models; 2) a specific model/task along with a public dataset (e.g. in the NLP and vision directories), where models are defined in their own subdirectories and data preprocessing, eval, and other utils for the same type of tasks/datasets can be shared.

If you have better suggestions, we are definitely open to them. Thanks again!
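As a toy illustration of that two-stage split, the sketch below separates a reusable network from a task-specific model that composes it. All class and parameter names here are hypothetical, not the actual model garden API.

```python
# Hypothetical sketch of the two-stage design: a reusable network lives
# under modeling/, and a task-specific model composes it rather than
# re-implementing it.
class TransformerEncoder:
    """Stage 1: a common network (would live in modeling/networks)."""
    def __init__(self, num_layers, hidden_size):
        self.num_layers = num_layers
        self.hidden_size = hidden_size


class TextClassificationTask:
    """Stage 2: a task that reuses the network (would live under nlp/)."""
    def __init__(self, encoder, num_classes):
        self.encoder = encoder        # shared, not duplicated
        self.num_classes = num_classes


encoder = TransformerEncoder(num_layers=12, hidden_size=768)
task = TextClassificationTask(encoder, num_classes=2)
print(task.encoder.hidden_size)  # → 768
```

The point of the split is that a second task (say, question answering) would take the same encoder object instead of carrying its own copy of the network code.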



The current design splits the modeling process into two stages: 1) common networks (in the modeling directory), such as ResNet and Transformer, so that these networks can be reused in a specific model/task, e.g. ResNet-50, Mask R-CNN, sequence models

How will these be different from https://www.tensorflow.org/api_docs/python/tf/keras/applications ?

Contributor


Also, Keras applications are moving to an external repo per the Keras migration RFC at #202, right?



hyperparameter definition in a consistent style.
* Model category related common library, e.g. primitives as basic building
block for NLP models. We will follow the fundamental design of Keras
layer/network/model to define and utilize model building blocks.
Contributor


Note: Network is a private class, so objects will need to subclass either Layer or Model.

Also note that whenever possible, we should prefer the pattern of Functional model + custom layers, rather than subclassed models (which have less functionality).

Member Author


Thanks for the suggestions! We do plan to follow the pattern of functional model + custom layers.
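The recommended pattern (a Functional model composed of custom Layer subclasses, rather than a subclassed Model) could look like this minimal sketch. The ScaledDense layer is a made-up example, not a model garden layer.

```python
import tensorflow as tf

# A custom layer (subclassing tf.keras.layers.Layer) composed inside a
# Functional model, per the pattern recommended in this thread.
class ScaledDense(tf.keras.layers.Layer):
    def __init__(self, units, scale=1.0, **kwargs):
        super().__init__(**kwargs)
        self.dense = tf.keras.layers.Dense(units)
        self.scale = scale

    def call(self, inputs):
        # Scale the dense projection; a real layer would do something useful.
        return self.scale * self.dense(inputs)

inputs = tf.keras.Input(shape=(16,))
outputs = ScaledDense(4, scale=0.5)(inputs)
model = tf.keras.Model(inputs, outputs)  # Functional, not subclassed
print(model.output_shape)
```

The Functional form keeps the extra functionality (serialization, summary, plotting, whole-model saving) that subclassed models partially lose, while the custom behavior stays encapsulated in the layer.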

| | | EfficientNet | |
| | | MnasNet | |
| | | ... | |
| | object_detection | | |
Member


Unet is mainly used for segmentation. So instead of a single object_detection directory, make two subdirectories, object_detection and segmentation. segmentation would then contain:

  • DeepLab v3/v3+
  • Unet
  • FastSCNN
    ...

Member Author


Thanks! Updated to reflect the suggestion. We are still debating the best structure for vision models; this may be subject to change in the near future.

We are going to provide the pretrained models for research exploration and
real-world application development. The plan is to integrate with [TensorFlow Hub](https://www.tensorflow.org/hub),
where users can access the Hub modules and SavedModel for pretrained checkpoints and links to the code in the model
garden.
Member


Please avoid the use of configs as much as possible. The code should be flexible enough to change parameters in the code itself. Changing things through a config file makes sense for some people but not for all. It is easy to maintain changes in the code, but it is hard to track changes in a config file, and hard to validate those changes as well.

Member Author


Thanks for your suggestion! We plan to follow the practice in the Cloud TPU model repo, for example, https://github.com/tensorflow/tpu/blob/master/models/official/mnasnet/mnasnet_main.py#L674. A default parameter dictionary will be provided, but users can overwrite the default values or extend it with new key/value pairs via a string, dict, or YAML.
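A minimal sketch of that override pattern follows. The function and key names are invented for illustration; the actual mnasnet code linked above uses its own params utilities, and YAML input would be handled similarly to the dict case.

```python
# Hypothetical sketch: defaults live in code, and callers may override
# them with a dict or a "key=value,key=value" string.
DEFAULTS = {"learning_rate": 0.1, "batch_size": 1024, "dropout": 0.2}

def with_overrides(defaults, overrides):
    """Return a copy of `defaults` updated with `overrides`."""
    params = dict(defaults)
    if isinstance(overrides, str):
        # Parse "k=v,k=v" and cast each value to the default's type.
        pairs = dict(kv.split("=", 1) for kv in overrides.split(",") if kv)
        overrides = {k: type(defaults[k])(v) for k, v in pairs.items()}
    params.update(overrides or {})
    return params

params = with_overrides(DEFAULTS, "learning_rate=0.05,batch_size=512")
print(params["learning_rate"])  # → 0.05
```

Keeping defaults in code while accepting string/dict overrides gives both camps what they asked for: parameters are visible and versioned in the code, yet still tweakable from the command line.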


| modeling | | | Common modeling libraries |
| | layers | | Common modules/layers, not built-in tensorflow layers yet.|
| | networks | | Well-known networks built on top of layers, e.g. transformer |
| | optimziers | | New or customized optimizers |
Contributor


Suggested change
| | optimziers | | New or customized optimizers |
| | optimizers | | New or customized optimizers |

Member Author

Thanks! We do plan to graduate the common layers to tensorflow/addons; I clarified this in the RFC.

The optimizers/ subdirectory was removed from the RFC. I agree it is better to add those to Addons directly.

@martinwicke
Member

martinwicke commented Aug 21, 2019 via email

@ewilderj
Contributor

@rachellj218 @martinwicke What was the fate of this RFC? Did you have a design review meeting that accepted it? Could you post the notes?

@bhack
Contributor

bhack commented Oct 15, 2019

It is a little bit off topic, but new projects like Detectron2 or high-level task APIs + models create competitive pressure on the Garden/Hub.
Please take super care about modernizing the old TF Object Detection API and similar high-level task approaches in the Garden design.

* make training on both GPU and TPU an easy switch.
* provide reusable components for research and production.
* End-to-end solutions
* provide seamless end-to-end training and inference solutions, where inference covers serving on TPU, GPU, mobile and edge devices.
Contributor


Will this include Google Coral devices?

@bhack bhack mentioned this pull request Nov 19, 2019
@theadactyl
Contributor

@rachellj218 @martinwicke pinging again for notes and/or resolution for this RFC.

@bhack
Contributor

bhack commented Nov 23, 2019

Can we extend the deadline if needed? It is still set to August 16th, 2019.

@bhack
Contributor

bhack commented Feb 7, 2020

The garden needs gardeners.. It is turning into a jungle again.
Not much different from the current zoo 😄

@bhack
Contributor

bhack commented Mar 5, 2020

We really need closer coordination between Hub and the Garden. Models are spread across too many repositories/places... we really need an Italian-style garden 😄
Many non-Google models/paper references or third-party implementations are published in PyTorch on GitHub, so this is a super strategic topic for pushing TensorFlow.
Honestly, it is incomprehensible that it is not treated as a top priority.

@rachellj218 rachellj218 requested a review from ematejska as a code owner March 6, 2020 00:54
@rachellj218
Member Author

Thank you for providing feedback, @bhack! Here're the differences between TF Hub and Model Garden:

  • TensorFlow Hub (https://tfhub.dev) is the main entry point (a search engine for all TensorFlow pre-trained models) for TF users who want to find a pre-trained model, and it provides a pathway to (1) code examples and (2) tutorials (e.g., Colab notebooks) for all TensorFlow 1 & 2 models (TF, TFLite, TF.js, Coral).
  • Model Garden (https://github.com/tensorflow/models) provides TensorFlow 2 model implementations and E2E modeling solutions with code examples. The pretrained checkpoints are exported to TFHub for user access.
    The TFHub team is actively working on improving user experience.

We do realize that the current model garden design isn't good enough. We are actively working on redesigning the model garden. We will make an announcement on our new model garden redesign once ready.

@bhack
Contributor

bhack commented Mar 6, 2020

I see code examples in both of the two bullet points, and models are in other repos. I think the entry point needs to be unified.
I don't think we need any duplicated elements between the Garden and Hub, but rather coordination between teams toward a unified goal.
See "Pre-made Models with Explicit Support for Modification" in https://ai.googleblog.com/2020/03/toward-human-centered-design-for-ml.html

@saberkun
Member

saberkun commented Mar 6, 2020

I see code examples in both of the two bullet points, and models are in other repos. I think the entry point needs to be unified.
I don't think we need any duplicated elements between the Garden and Hub, but rather coordination between teams toward a unified goal.
See "Pre-made Models with Explicit Support for Modification" in https://ai.googleblog.com/2020/03/toward-human-centered-design-for-ml.html

There should not be any duplicated elements. Hub is the entry point, offering links to the code examples, and the Garden is the place holding solid modeling code for a selected collection of models.

Contributor

@ewilderj ewilderj left a comment


This should have been merged some time ago. The design is undergoing iteration from this original RFC. Thank you everyone for your conversation and contribution.

@ewilderj ewilderj merged commit 69b7275 into tensorflow:master Mar 6, 2020
@ematejska ematejska added RFC: Accepted RFC Design Document: Accepted by Review and removed RFC: Proposed RFC Design Document labels Apr 21, 2020