Road Map for Docker Containers
This is the same roadmap document that I'm using internally, with the internal bits taken out.
I am forcing these containers to get continuous support by using them for TF's internal CI: if they don't work, then our tests don't work. While I'm getting that ready during Q4 and Q1, I'm explicitly avoiding features that the TF team is not going to use; those would be dead on arrival unless we set up more testing for them, which I don't have the cycles to consider yet.
TF Nightly Milestone - Q4 Q1
Goal: Replicable container build of our tf-nightly Ubuntu packages
Containers can build tf-nightly package
SIG Build repository explains how to build tf-nightly package in Containers
Documentation exists on how to make changes to the containers
Suite of Kokoro jobs exists that publishes 80%-identical-to-now tf-nightly via containers
TF-nightly is officially built with the new containers
Documentation exists on how to use and debug containers
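The intended flow for the milestone above might look roughly like this; the image name, tag, and Bazel target here are assumptions for illustration, not the official ones:

```shell
# Sketch only: the image name, tag, and build target are assumptions.
# Pull the SIG Build image and build the tf-nightly wheel inside it,
# mounting a TensorFlow checkout from the host.
docker pull tensorflow/build:latest-python3.9
docker run -it -v "$PWD":/tf/tensorflow -w /tf/tensorflow \
    tensorflow/build:latest-python3.9 \
    bazel build //tensorflow/tools/pip_package:build_pip_package
```

The same container would then serve both interactive debugging (drop the build command and run bash) and the Kokoro jobs, which is what makes the builds replicable.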
Release Test Milestone - Q4 Q1
Goal: Replicable container builds of our release tests, supporting each release
Containers can run same-as-now Nightly release tests
SIG Build repository explains how to run release tests as we do
Suite of CI jobs exists that matches current rel/nightly jobs
Existing release jobs replaced (but reversible if needed) by Container-based equivalent
Containers may be maintained and updated separately for TF release branches
Containers used for nightly/release libtensorflow and ci_sanity (now "code_check") jobs
CI & RBE Milestone - Q4 Q1/Q2
Goal: The main tests and our RBE tests use the same Docker container, updated in one place
Containers support internal presubmit/continuous build behavior
Containers are used in internal buildcop-monitored, DevInfra-owned presubmit/continuous jobs
Containers can be used in RBE
Containers are used as RBE environment for internal buildcop-monitored, DevInfra-owned jobs
DevInfra's GitHub-side presubmit tests use the containers
Containers are published on gcr.io
There is an easy way to verify that a change to the containers will not break the whole internal test suite
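One lightweight way to gate container changes would be a presubmit smoke test that builds the candidate image and checks the toolchain before the full internal suite runs against it. A hypothetical GitHub Actions sketch, where the file paths, image name, and job layout are all assumptions:

```yaml
# Hypothetical presubmit sketch; paths, image names, and jobs are assumptions.
name: container-smoke-test
on:
  pull_request:
    paths:
      - "dockerfiles/**"
jobs:
  smoke:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - name: Build candidate image
        run: docker build -t tf-sigbuild-candidate -f dockerfiles/devel.Dockerfile .
      - name: Check the build toolchain is intact
        run: docker run tf-sigbuild-candidate bazel version
```

A fast check like this would not replace the internal suite, but it would catch obviously broken images before they reach it.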
Forward Planning Milestone - Q2
Goal: Establish clear plan for any future work related to these containers. This is internal team planning stuff so I've removed it.
Downstream & OSS Milestone - Q2/Q3
Goal: Downstream users and custom-op developers use the same containers as our CI
SIG Addons / SIG IO use these Containers (or derivative) instead of old custom-op ones
Custom-op documentation migrated to SIG Build repository
Resolve: what to do about inconvenient default packages for e.g. SIG Addons (keras-nightly, etc.)
Resolve: what to do about inconveniently large image sizes for e.g. GPU content not needed
Docker-related documentation on tensorflow.org replaced with these containers
"devel" containers deprecated in favor of SIG Build containers
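For the downstream SIGs, a derivative image could start from the SIG Build container and adjust the inconvenient defaults noted above. A hypothetical sketch, where the base image tag and the packages touched are assumptions:

```dockerfile
# Hypothetical derivative image for a downstream SIG such as SIG Addons.
# The base tag and the packages removed/installed here are assumptions.
FROM tensorflow/build:latest-python3.9

# Swap out a default package that conflicts with the SIG's own pins,
# e.g. the pre-installed keras-nightly.
RUN pip uninstall -y keras-nightly && \
    pip install tensorflow-addons
```

Deriving rather than forking keeps the SIG images on the same toolchain as the CI containers, which is the point of this milestone.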
Thanks for sharing the roadmap.
Steps that mention "internal/our" requirements can be a bit hard to follow from outside, but I think that is expected.
Looking at the new GitHub Actions we have here in the repository, it is very clear what we are doing and when on the OSS side, within the limits of what we have orchestrated with GitHub Actions.
When we mix OSS recipes/code with internal, non-visible steps (e.g. orchestration, args like commits for nightly, etc.), the machinery can be a little hard to follow unless the non-visible part is compensated for by some documentation details (e.g. what event/cron starts the scripts, what the script chain is, what the args are, etc.).
But such compensating documentation is generally at constant risk of becoming outdated, since internal teams have direct visibility into internal changes, so their operations are not directly impacted by stale public documentation.
Since GitHub Actions rely on a well-known and popular YAML dialect, and GitHub users/contributors/developers are generally skilled in it, do you think it would be possible to set up TF-owned self-hosted GitHub Actions runners on Google Cloud? That would give us a complete overview of the TF OSS build and orchestration, and probably also a bit of autonomy for the SIG, without adding too much overhead to the system.