RFC: Standalone Keras Repository #202
Rebase to the HEAD of tf/community.
Proofreading, etc.
### Two-stage change process

For any change that is affecting both TensorFlow and Keras, the change
In this case we have again the cache sharing problem.
Can you detail what is the cache you are mentioning?
## Questions and Discussion Topics

1. Tools for issue tracking: we can't rely on Google-internal bug tracking tool
This is another critical point that needs to be solved. We cannot have weaker issue tracking than TF. Check the full thread at #29
Thanks for the notice. I will check with infra team and support team to see if they have any suggestions for improving this situation. Will leave this open for discussion in the design meeting.
Please announce this on the TF dev mailing list(s) as usual. /cc @ewilderj
Will the CI still run Keras tests when changes are made to the TensorFlow repository?
* pip package management. Keras will now follow the `tf-estimator` approach.
"pip install tensorflow" should also install Keras (from PyPI) as well.
There are more details for the pip package in the
[Improved pip package structure](https://github.com/tensorflow/community/pull/182) RFC.
Can we also consider `conda` this time? `Anaconda` is a huge part of the Data Science and ML ecosystem, and `tf.keras` not being a part of it doesn't feel right.
@gunan from TF build team, and also @seanpmorgan who mentioned this in the SIG-addon meeting.
Discussed this item in the design review meeting. So far this is not an active target that we are supporting. The meeting conclusions are:
- Conda uses a different toolchain, with no ABI compatibility between TF and Conda. Because we don't have C interfaces, any custom ops built for TF will not work for Conda, and vice versa. There is a set of people who publish TF for Conda, but tf addons, io, etc. are not published for Conda. There's no easy way around this because it would require doubling the TF release burden entirely.
- Keras in theory should be easier, insofar as it is Python only (currently). But Keras will in the future include C(++). That should be fine as long as there is a Python interface to the exposed parts, or the C parts are used internally.
* Replace the usage with another alternative TF public API.
* Make the functionality a new TF public API.

**Note that the open-source community is encouraged to contribute to this effort.**
Very nice description of the problem. Since there is a lot of decision making going on there, I believe it would be nice for people in charge of decisions to open issues describing what they want to do for all the unwanted imports we'll have. It's hard to help when you don't know what will be decided to resolve the problem (here option 1? 2? or 3?).
If we don't do that, I believe the open-source community won't help much just because of a lack of guidance.
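To make the "unwanted imports" concrete, here is a minimal sketch of how they might be listed so that each one can be triaged into an issue. It is not part of the RFC or of any TensorFlow tooling: the helper name `find_private_tf_imports` is hypothetical, and it assumes that "private" means anything imported from the `tensorflow.python` package.

```python
import ast
from typing import List, Tuple

# Assumption for this sketch: private TF symbols are those under this package.
PRIVATE_PREFIX = "tensorflow.python"


def find_private_tf_imports(source: str) -> List[Tuple[int, str]]:
    """Return (line number, module name) for each import of a private TF module."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            # `import a.b.c` or `import a.b.c as d`
            modules = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            # `from a.b import c` (absolute imports only)
            modules = [node.module]
        else:
            continue
        for module in modules:
            if module == PRIVATE_PREFIX or module.startswith(PRIVATE_PREFIX + "."):
                hits.append((node.lineno, module))
    return sorted(hits)
```

Running something like this over each Keras source file would give a list of locations that could be turned into one tracking issue per private symbol, with the chosen resolution recorded on the issue.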
Co-Authored-By: Frédéric Branchaud-Charron <[email protected]>
We found a Contributor License Agreement for you (the sender of this pull request), but were unable to find agreements for all the commit author(s) or Co-authors. If you authored these, maybe you used a different email address in the git commits than was used to sign the CLA (login here to double check)? If these were authored by someone else, then they will need to sign a CLA as well, and confirm that they're okay with these being contributed to Google. ℹ️ Googlers: Go here for more info. |
* We expect the majority of the code development/contribution from GitHub | ||
and the dev tools / tests / scripts should focus on the GitHub development use | ||
case. See below for more details. | ||
* Keras CI/presubmit build for the GitHub repo should target a stable PIP |
This has the disadvantage of slowing down Keras's adoption of new TF features and APIs.
This document should clarify when CI against TF stable would be expected to break, etc.
(also important to highlight the release process for TF)
It's not supposed to slow down adoption of new features compared to targeting an unpinned nightly all the time. Maybe it's not clear in the document, but if a pull request requires a new feature in TensorFlow nightly, the person making the pull request can change the targeted version of TensorFlow in the CI. So we don't have to wait at all.
Did I get that right, or were you referring to another problem?
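As an illustration of the "change the targeted version in the CI" idea, here is a rough sketch of how a CI script could turn a pin stored in the repository into a pip requirement. The pin format (`stable:<version>` or `nightly:<version>`) and the helper `pip_requirement` are assumptions made up for this example; the RFC does not specify how the pin is stored. Only the PyPI package names `tensorflow` and `tf-nightly` are real.

```python
# Hypothetical pin format: "<channel>" or "<channel>:<version>".
PACKAGES = {"stable": "tensorflow", "nightly": "tf-nightly"}


def pip_requirement(pin: str) -> str:
    """Translate a pin such as 'stable:2.1.0' into a pip requirement string."""
    channel, _, version = pin.strip().partition(":")
    if channel not in PACKAGES:
        raise ValueError(f"unknown channel: {channel!r}")
    package = PACKAGES[channel]
    # Without a version, fall back to the latest release of that channel.
    return f"{package}=={version}" if version else package
```

Under this scheme, a pull request that needs a newer TensorFlow feature would edit the pin in the same change, so the reviewer sees both the code and the version bump together.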
developer community.

In addition, by getting the Keras team at Google to start developing Keras
using the same public tools and infrastructure as third-party developers,
I would split this point into multiple sub-points, because I believe that this argument is not sufficiently highlighted.
- GitHub is not made to manage monorepos like TensorFlow. Keras issues and pull requests get mixed up with the TensorFlow core ones.
- The source of truth being in GitHub is very important because it allows the community to manage the repo. Currently, some commits pop up without any public pull request or public user, making it impossible for non-Googlers to understand the rationale behind a commit, or even who to contact if it caused a bug: e.g. tensorflow/tensorflow@9698ae1
- As long as Keras is in TensorFlow, it's hard for us to have any impact on the processes/tests/tools that Keras uses internally. Imagine, for example, that we want to use isort, or pytest, or GitHub Actions... In all cases it's a lot more work (code) and a lot more people who need to review and give us a green light. With a repository the size of TensorFlow, it's just infeasible unless we have a face-to-face meeting.
> Keras issues and pull requests get mixed up with the tensorflow core ones.

Before the split, we had the same issue. Not sure how to fix that.
If Keras is going to be moved to a standalone repo, are the v1 APIs (tf.get_variable(), tf.variable_scope(), etc.) for building static graphs still available in the TensorFlow repo? Can we build custom models without using Keras? The v1 APIs are really very simple and flexible to use for building static graphs, while the Keras API forces people to use an OOP style and causes a lot of pain for industrial engineers who are familiar with static graphs and a functional style.
All the public API symbols will still remain available for v1.
1. Update the motivation to emphasize the modularity of TF.
2. Update the section for reverse dependency from TF to Keras.
These numbers might help us decide whether the split PR exercise is too much of an overhead or not.
@ematejska, I have updated the RFC with the latest results from the design meeting. Would you like the meeting notes to be posted here as well?
@qlzh727 Yes please.
Also, if you could update the status of the RFC to Accepted, that would be appreciated. Thanks.
Meeting notes from the design review on Mar 03 (taken by @karmel) Notes
apassos: no circular deps; triaging private symbols + preventing backsliding; for the gaps, we have a plan.
Done.
We need to handle cross-repository PRs and issues, but GitHub doesn't support this natively. See https://codetree.com/guides/managing-issues-across-multiple-github-repositories
What is the timeline to reactivate the standalone repository?
Comment period is open till 2/25/2020.
Standalone Keras Repository
Objective
Move the Keras code from the TensorFlow main GitHub repository to its own
repository, with TensorFlow as a dependency.