RFC: Standalone Keras Repository #202

qlzh727 · 2020-02-06T20:18:42Z

Comment period is open till 2/25/2020.

Standalone Keras Repository

Status	Proposed
RFC #	202
Author(s)	Qianli Zhu ([email protected]), Francois Chollet ([email protected])
Sponsor	Karmel Allison ([email protected])
Updated	2020-02-05

Objective

Move the Keras code from the TensorFlow main GitHub repository to its own
repository, with TensorFlow as a dependency.

Rebase to the HEAD of tf/community.

Proofreading, etc.

rfcs/20200205-standalone-keras-repository.md

bhack · 2020-02-07T20:54:00Z

rfcs/20200205-standalone-keras-repository.md

+
+### Two-stage change process
+
+For any change that is affecting both TensorFlow and Keras, the change


In this case we have again the cache sharing problem.

Can you detail what is the cache you are mentioning?

#202 (comment)

rfcs/20200205-standalone-keras-repository.md

bhack · 2020-02-07T20:58:05Z

rfcs/20200205-standalone-keras-repository.md

+
+## Questions and Discussion Topics
+
+1. Tools for issue tracking: we can't rely on Google-internal bug tracking tool 


This another critical point that need to be solved. We cannot have a weaker issue tracking than TF. Check the full thread at #29

Thanks for the notice. I will check with infra team and support team to see if they have any suggestions for improving this situation. Will leave this open for discussion in the design meeting.

rfcs/20200205-standalone-keras-repository.md

bhack · 2020-02-07T22:31:36Z

Please announce this on the TF dev mailinglist(s) as usual. /cc @ewilderj

pavanky · 2020-02-07T23:13:20Z

Will the ci still run keras tests when changes are made to the tensorflow repository ?

rfcs/20200205-standalone-keras-repository.md

AakashKumarNain · 2020-02-09T05:34:01Z

rfcs/20200205-standalone-keras-repository.md

+* pip package management. Keras will now follow the `tf-estimator` approach. 
+"pip install tensorflow" should also install Keras (from PyPI) as well.
+There are more details for the pip package in the
+[Improved pip package structure](https://github.com/tensorflow/community/pull/182) RFC.


Can we also considerconda this time? Anaconda is a huge part of Data Science and ML ecosystem and tf.keras not being a part of it doesn't feel right.

@gunan from TF build team, and also @seanpmorgan who mentioned this in the SIG-addon meeting.

Discussed this item on the design review meeting. So far this is not active target we are supporting. The meeting conclusion are:

Conda uses a different toolchain, with no ABI compatibility between TF+Conda. Because we don’t have C interfaces, any custom ops built for TF will not work for Conda, and vice versa. There is a set of people who publish TF for Conda. But tf addons, io, etc. are not published for Conda. There’s no easy way around this because it would require doubling the TF release burden entirely.
Keras in theory should be easier? Insofar as it is Python only (currently). But Keras will in the future include C(++). That should be fine as long as there is a Python interface to the exposed parts, or C parts are used internally.

rfcs/20200205-standalone-keras-repository.md

gabrieldemarmiesse · 2020-02-10T17:59:10Z

rfcs/20200205-standalone-keras-repository.md

+* Replace the usage with another alternative TF public API.
+* Make the functionality a new TF public API.
+
+**Note that the open-source community is encouraged to contribute to this effort.**


Very nice description of the problem. Since there is a lot of decision making going on there, I believe it would be nice for people in charge of decisions to open issues describing what they want to do for all the unwanted imports we'll have. It's hard to help when you don't know what will be decided to resolve the problem (here option 1? 2? or 3?).

If we don't do that, I believe the open-source community won't help much just because of a lack of guidance.

Co-Authored-By: Frédéric Branchaud-Charron <[email protected]>

googlebot · 2020-02-10T18:50:27Z

We found a Contributor License Agreement for you (the sender of this pull request), but were unable to find agreements for all the commit author(s) or Co-authors. If you authored these, maybe you used a different email address in the git commits than was used to sign the CLA (login here to double check)? If these were authored by someone else, then they will need to sign a CLA as well, and confirm that they're okay with these being contributed to Google.
In order to pass this check, please resolve this problem and then comment @googlebot I fixed it.. If the bot doesn't comment, it means it doesn't think anything has changed.

ℹ️ Googlers: Go here for more info.

rfcs/20200205-standalone-keras-repository.md

alextp · 2020-02-13T23:37:31Z

rfcs/20200205-standalone-keras-repository.md

+* We expect the majority of the code development/contribution from GitHub
+and the dev tools / tests / scripts should focus on the GitHub development use
+case. See below for more details.
+* Keras CI/presubmit build for the GitHub repo should target a stable PIP 


This has the disadvantage of slowing down keras's adoption of new TF features and APIs.

This document should clarify when would CI against tf stable be expected to break, etc.

(also important to highlight the release process for TF)

It's not supposed to slow down adoption of new features compared to targeting an unpinned nightly all the time. Maybe it's not clear in the document, but if a pull request requires a new feature in tensorflow nightly, the person making the pull request can change the targeted version of tensorflow in the CI. So we don't have to wait at all.

Do I get it right or you were referring to another problem?

rfcs/20200205-standalone-keras-repository.md

gabrieldemarmiesse · 2020-02-14T13:25:30Z

rfcs/20200205-standalone-keras-repository.md

+developer community.
+
+In addition, by getting the Keras team at Google to start developing Keras
+using the same public tools and infrastructure as third-party developers,


I would split this point into multiple sub-points, because I believe that this argument is not sufficiently highlighted.

Github is not made to manage monorepos like tensorflow. Keras issues and pull requests get mixed up with the tensorflow core ones.

The source of truth being in github is very important because it allows the community to manage the repo. Currently, some commits pop up without any public pull request or public user, making it impossible for non-googlers to understand what is the rational behind a commit, or even who to contact if it caused a bug: e.g. tensorflow/tensorflow@9698ae1

As long as keras is in tensorflow, it's hard for us to have any impact on the processes/tests/tools that keras use internally. Imagine for example that we want to use Isort, or pytest, or github actions... In all cases it's a lot more work (code) and a lot more people who need to review and give us a green light. With a repository with the size of tensorflow, it's just unfeasable unless we have a face-to-face meeting.

Keras issues and pull requests get mixed up with the tensorflow core ones.

before the split, we had the same issue. Not sure how to fix that.

x10000year · 2020-02-18T15:21:54Z

If keras is going to be moved to a standalone repo, are v1 apis (tf.get_variable(), tf.variable_scope(), etc.) for building static graph still available in the tensorflow repo? Can we build custom models without using keras? The v1 apis are really very simple and flexible to use for building static graph, while the keras api forces people to use OOP style and gives a lot of pain for industrial engineers who are familiar with static graphs and functional style.

qlzh727 · 2020-02-18T21:36:51Z

If keras is going to be moved to a standalone repo, are v1 apis (tf.get_variable(), tf.variable_scope(), etc.) for building static graph still available in the tensorflow repo? Can we build custom models without using keras? The v1 apis are really very simple and flexible to use for building static graph, while the keras api forces people to use OOP style and gives a lot of pain for industrial engineers who are familiar with static graphs and functional style.

All the public API symbols will still remain available for v1.

1. Udpate the motivation to emphasize the modularity of TF. 2. Update the section for reverse dependency from TF to keras.

These numbers might help us decide whether the split PR execise is too much a overhead or not.

rfcs/20200205-standalone-keras-repository.md

qlzh727 · 2020-03-04T18:52:07Z

@ematejska, I have updated RFC with latest result from the design meeting. Would u like to have the meeting notes also being posted here?

ematejska · 2020-03-04T18:57:57Z

@qlzh727 Yes please.

ematejska · 2020-03-04T18:59:04Z

Also, if you could update the status of the RFC to Accepted, that would be appreciated. Thanks.

qlzh727 · 2020-03-04T19:07:14Z

Meeting notes from the design review on Mar 03 (taken by @karmel)

Notes

Dependencies between TF and Keras. Two ways: Keras on non-public TF; TF backwards on Keras. Majority of TF -> Keras dependencies are in higher level TF APIs (old tf.layers; feature columns).
- For v1 things, eg layers, we will freeze a copy of necessary APIs (eg base layer v1) in core TF. This has the advantage of also preventing accidental changes.
- Higher cost; requires a refactor. Relying on existing unit tests to tell us if we break something.
- Lazy loading is easier in many ways.
- But, deceptively easy; may lead to integration pains down the line.
- If we can decouple future changes to Keras from tf.layers, we will be happier.
- Can we move v1 keras layer to tf core? Ends up being tricky-- graph/eager is the split in Keras versus v1/v2. Version selecting.
- If we make a copy, what is the scope of the copy? Strictly to break the circular dependencies. Trim down current base_layer to reduce the amount of code copied to TF.
- So, concretely, what would be copied for tf.layers? There is a very long blaze dep list right now, but doesn’t need to be. Eg mixed precision imports optimizers.
- Split out legacy tf.layers into Keras? In tf we re-export that. Requires we inject into TF namespaces from other packages, but maybe that’s not so bad.
- Can tf.feature_column, tf.layers be separated out and depend on keras and TF?
- tf.lite depends on some private utils from Keras saving. Some of these can be forked. They are at least public APIs in general.
- Suggestion: move tf.feature_column to Estimator; move tf.layers to Keras (in some sort of legacy quarantine)
- Unit tests move either to the top-most package, or into a separate integration test package. Distribution strategies is one of the largest clients here. Would be inconvenient if those tests lived in Keras (since change has to be split); moving to integration tests outside TF seems a better solution long-term. Also allows “import tensorflow as tf”, which is preferable. Can be run on nightly and head.
Conclusions:
- tf.feature_column to Estimator
- RNN Cells to Keras
- tf.layers to Keras or frozen; whichever is less painful.
- Integration tests to new repo
- TPU is just instance checks; resolve with tf.types
- TF Lite either fork utils or move to tf core
- No LazyLoader
Keras deps on private TF symbols
- Head cases (count >= 5) we can resolve by using different APIs in Keras
- Long tail (count < 5). What to do with those? Many are in dataset APIs.
- apassos: We are less concerned about these private usages; we can address over time. We will need to get owners of private APIs on board.
- wicke: I don’t think we should split before we have a CL that gets us to where estimator is today. This list is too long; we need to narrow the list down to fewer of these before we can realistically split.
- scottzhu: currently I am addressing the most-used of these. Generated CL from mihai’s script that converts all public. Can take care of head cases and leave longer tail.
- raw_ops for ops.
- CondBranchFuncGraph-- why do we need this? Pretty ugly check. Seems to be mostly to raise a better error. Likely what this means is that some amount of FuncGraph needs to be made public and documented so that we can allow this. Or improve error messages from core; we are solving this for Keras, but other users will have similar problems. A new public API that says, I am in a conditional branch of the graph.
- Much of the long-tail will be case-by-case
- How do we avoid a proliferation of experimental APIs that we depend on? That’s better than private, since at least API owners see and govern changes to experimental. We generally require global tap, and one-release deprecation before removing.
Source of truth is Github
- What happens if external change to Keras can’t be brought into google3 because something internal breaks? It will happen, and it will be painful. It will eat up time, and has for Swift, MLIR, etc. We must add tests to OSS keras every time this happens. But that’s the cost of having it be OSS first.
- That’s a cost, but also a good forcing function-- our APIs should be stable.
- We may have to have a policy that says, if you are on private APIs, we may break you on syncs.
- Syncing should ideally be frequent, but may fail and require manual intervention.
- When are we ready to make the split?

apassos: no circular deps; triaging private symbols + preventing backsliding; for the gaps, we have a plan.

We did all this work to bring in Keras, now we’re separating it back out. In was critical for 2.0. What will we do at the next big change?
In many ways, the way we did the 2.0 change was too entangled. And would have been better if it had been more structured. But now we’ve learned and grown, and we can use that going forward.
From OSS community: conda packages?
- Conda uses a different toolchain, with no ABI compatibility between TF+Conda. Because we don’t have C interfaces, any custom ops built for TF will not work for Conda, and vice versa. There is a set of people who publish TF for Conda. But tf addons, io, etc. are not published for Conda. There’s no easy way around this because it would require doubling the TF release burden entirely.
- Keras in theory should be easier? Insofar as it is Python only (currently). But Keras will in the future include C(++). That should be fine as long as there is a Python interface to the exposed parts, or C parts are used internally.
How are we going to manage issues and PRs?
- PRs cannot span the two repos. Must be submitted separately, without breaking CI.
- For issues, will we transfer between the repos? We can crosslink if the issue moves to TF. Issue in Keras will stay, and link to the TF issue.
- We will have to reset permissions for keras repo.
- Automatically pull in PRs to CLs?

qlzh727 · 2020-03-04T19:27:51Z

Also, if you could update the status of the RFC to Accepted, that would be appreciated. Thanks.

Done.

bhack · 2020-03-04T19:50:41Z

We need to handle cross repositories PR/ISSUE but we don't have this feature naturally in Github. See https://codetree.com/guides/managing-issues-across-multiple-github-repositories

bhack · 2020-05-28T07:53:52Z

What Is the timeline to reactivate the standalone repository?

qlzh727 and others added 5 commits February 5, 2020 14:21

Merge pull request #1 from tensorflow/master

26cafee

Rebase to the HEAD of tf/community.

Initial commit for RFC standalone Keras repository

7af3f0b

Update wording and nits.

0145eea

Update 20200205-standalone-keras-repository.md

0405d80

Merge pull request #2 from fchollet/patch-1

7064794

Proofreading, etc.

qlzh727 requested review from ematejska, ewilderj, martinwicke and theadactyl as code owners February 6, 2020 20:18

googlebot added the cla: yes label Feb 6, 2020

Update the RFC nubmer

3cce978

pavithrasv reviewed Feb 7, 2020

View reviewed changes

rfcs/20200205-standalone-keras-repository.md Outdated Show resolved Hide resolved

rfcs/20200205-standalone-keras-repository.md Show resolved Hide resolved

gabrieldemarmiesse reviewed Feb 7, 2020

View reviewed changes

rfcs/20200205-standalone-keras-repository.md Show resolved Hide resolved

gabrieldemarmiesse reviewed Feb 7, 2020

View reviewed changes

rfcs/20200205-standalone-keras-repository.md Outdated Show resolved Hide resolved

gabrieldemarmiesse reviewed Feb 7, 2020

View reviewed changes

rfcs/20200205-standalone-keras-repository.md Show resolved Hide resolved

bhack reviewed Feb 7, 2020

View reviewed changes

gabrieldemarmiesse reviewed Feb 7, 2020

View reviewed changes

rfcs/20200205-standalone-keras-repository.md Outdated Show resolved Hide resolved

gabrieldemarmiesse reviewed Feb 7, 2020

View reviewed changes

rfcs/20200205-standalone-keras-repository.md Outdated Show resolved Hide resolved

bhack mentioned this pull request Feb 7, 2020

Criteria for doing API reviews and design reviews in TF #185

Merged

bhack mentioned this pull request Feb 7, 2020

RFC: TensorFlow Official Model Garden Redesign #130

Merged

terrytangyuan reviewed Feb 8, 2020

View reviewed changes

rfcs/20200205-standalone-keras-repository.md Outdated Show resolved Hide resolved

rfcs/20200205-standalone-keras-repository.md Show resolved Hide resolved

AakashKumarNain reviewed Feb 9, 2020

View reviewed changes

gabrieldemarmiesse reviewed Feb 10, 2020

View reviewed changes

rfcs/20200205-standalone-keras-repository.md Show resolved Hide resolved

gabrieldemarmiesse reviewed Feb 10, 2020

View reviewed changes

rfcs/20200205-standalone-keras-repository.md Show resolved Hide resolved

Dref360 reviewed Feb 10, 2020

View reviewed changes

rfcs/20200205-standalone-keras-repository.md Outdated Show resolved Hide resolved

gabrieldemarmiesse reviewed Feb 10, 2020

View reviewed changes

Update rfcs/20200205-standalone-keras-repository.md

3f0dc92

Co-Authored-By: Frédéric Branchaud-Charron <[email protected]>

gabrieldemarmiesse mentioned this pull request Feb 11, 2020

RFC: Improved pip package structure #182

Merged

alextp suggested changes Feb 13, 2020

View reviewed changes

gabrieldemarmiesse reviewed Feb 14, 2020

View reviewed changes

Dref360 mentioned this pull request Feb 14, 2020

Fix sequence padding documentation keras-team/keras-preprocessing#279

Merged

qlzh727 added 2 commits February 18, 2020 10:04

Update wording wrt to pure python project.

6872453

Update the sectoin for TF to keras dependency.

7144521

qlzh727 added 5 commits February 25, 2020 11:06

Address the comment from @Apassos

ecbc169

1. Udpate the motivation to emphasize the modularity of TF. 2. Update the section for reverse dependency from TF to keras.

Update the states for python change history.

ce8a20f

These numbers might help us decide whether the split PR execise is too much a overhead or not.

Updating the section for CI and presubmit.

9123f7f

Update style and fix typo.

11f3828

Adding a section for Alternative considered.

5de7dfe

crccw reviewed Feb 26, 2020

View reviewed changes

rfcs/20200205-standalone-keras-repository.md Outdated Show resolved Hide resolved

Fix typo for LasyLoader

132f542

ematejska added the RFC: Proposed RFC Design Document label Feb 28, 2020

Update the RFC wrt to the design meeting outcome.

3cb07db

ematejska added RFC: Accepted RFC Design Document: Accepted by Review and removed RFC: Proposed RFC Design Document labels Mar 4, 2020

Update the RFC status to 'Accepted'

1343f17

ematejska approved these changes Mar 4, 2020

View reviewed changes

ematejska merged commit 1ee72f3 into tensorflow:master Mar 4, 2020

bhack mentioned this pull request Apr 4, 2020

experimental API policy #218

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

RFC: Standalone Keras Repository #202

RFC: Standalone Keras Repository #202

qlzh727 commented Feb 6, 2020 •

edited

Loading

bhack Feb 7, 2020

gabrieldemarmiesse Feb 10, 2020

bhack Feb 10, 2020

bhack Feb 7, 2020

qlzh727 Feb 10, 2020

bhack commented Feb 7, 2020

pavanky commented Feb 7, 2020

AakashKumarNain Feb 9, 2020

qlzh727 Feb 10, 2020

qlzh727 Mar 4, 2020

gabrieldemarmiesse Feb 10, 2020

googlebot commented Feb 10, 2020

alextp Feb 13, 2020

alextp Feb 13, 2020

gabrieldemarmiesse Feb 14, 2020

gabrieldemarmiesse Feb 14, 2020

Dref360 Feb 14, 2020

x10000year commented Feb 18, 2020 •

edited

Loading

qlzh727 commented Feb 18, 2020

qlzh727 commented Mar 4, 2020

ematejska commented Mar 4, 2020

ematejska commented Mar 4, 2020

qlzh727 commented Mar 4, 2020

qlzh727 commented Mar 4, 2020

bhack commented Mar 4, 2020

bhack commented May 28, 2020 •

edited

Loading


		### Two-stage change process

		For any change that is affecting both TensorFlow and Keras, the change


		## Questions and Discussion Topics

		1. Tools for issue tracking: we can't rely on Google-internal bug tracking tool

RFC: Standalone Keras Repository #202

RFC: Standalone Keras Repository #202

Conversation

qlzh727 commented Feb 6, 2020 • edited Loading

Standalone Keras Repository

Objective

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

bhack commented Feb 7, 2020

pavanky commented Feb 7, 2020

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

googlebot commented Feb 10, 2020

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

x10000year commented Feb 18, 2020 • edited Loading

qlzh727 commented Feb 18, 2020

qlzh727 commented Mar 4, 2020

ematejska commented Mar 4, 2020

ematejska commented Mar 4, 2020

qlzh727 commented Mar 4, 2020

apassos: no circular deps; triaging private symbols + preventing backsliding; for the gaps, we have a plan.

qlzh727 commented Mar 4, 2020

bhack commented Mar 4, 2020

bhack commented May 28, 2020 • edited Loading

qlzh727 commented Feb 6, 2020 •

edited

Loading

x10000year commented Feb 18, 2020 •

edited

Loading

bhack commented May 28, 2020 •

edited

Loading