This highlights the main OSS efforts of the TFX team in 2019 and the first half of 2020. If you're interested in contributing to one of these areas, contributions are always welcome, especially in areas that extend TFX to infrastructure not currently in wide use at Google.
- Democratize access to machine learning (ML) best practices, tools, and code.
- Enable users to easily run production ML pipelines on public clouds, on premises, and in heterogeneous computing environments.
- Help enterprises realize large-scale production ML capabilities similar to what we have available at Google. We recognize that every enterprise has unique infrastructure challenges, and we want TFX to be open and adaptable to those challenges.
- Stimulate innovation: Machine learning is a rapidly evolving, innovative field, and we want TFX to help researchers and engineers both realize and contribute to that innovation. Likewise, we want TFX to be interoperable with other ML efforts in the open source community.
- Usability: We want the path to deploying a model in production to be as frictionless as possible, from the initial efforts of building the model to the final touches of deploying it in production.
- Encourage the discovery and reuse of external contributions.
- Participate in and extend support for other OSS efforts, initially: Apache Beam, ML Metadata, Kubeflow, TensorBoard, and TensorFlow 2.0.
- Align ML framework support with Kubeflow Pipelines.
- Extend portability across additional cluster computing frameworks, orchestrators, and data representations.
- Better distributed training support (DistributionStrategy).
- Better telemetry for users to understand the behavior of components in a TFX pipeline.
- Complete support for TensorFlow 2.x functionality, including tf.distribute and Keras without Estimator (a minimal sketch follows this list).
- Improve the testing capabilities for OSS developers.
- Increased interoperability with Kubeflow Pipelines, with a focus on providing more flexibility through a unified DSL and converging on pipeline presentation and orchestration semantics.
- Support for training on continuously arriving data and more advanced orchestration semantics.
- New template in TFX OSS to ease creation of TFX pipelines.
- More pipeline code examples, including DIY orchestrators and custom components.
- Work with ML Metadata to publish standard ontology types and showcase them through TFX.
- Support mobile and edge devices by integrating with tf.lite.
- Formalize Special Interest Groups (SIGs) for specific aspects of TFX to accelerate community innovation and collaboration.
- Early access to new features.
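
As a rough illustration of the tf.distribute and Keras-without-Estimator items above, the sketch below shows the kind of training entry point this support targets: a native Keras model built and trained under `tf.distribute.MirroredStrategy` and exported as a SavedModel. The `run_fn` name, the `fn_args` object with only a `serving_model_dir` attribute, and the in-memory placeholder data are assumptions for illustration, not the TFX Trainer's actual contract.

```python
import tensorflow as tf


def _build_keras_model() -> tf.keras.Model:
  """A small native Keras model (no Estimator involved)."""
  inputs = tf.keras.Input(shape=(10,), name='features')
  hidden = tf.keras.layers.Dense(16, activation='relu')(inputs)
  outputs = tf.keras.layers.Dense(1, activation='sigmoid')(hidden)
  model = tf.keras.Model(inputs=inputs, outputs=outputs)
  model.compile(optimizer='adam',
                loss='binary_crossentropy',
                metrics=['accuracy'])
  return model


def run_fn(fn_args):
  """Trains and exports a model, in the spirit of a TFX Trainer entry point.

  `fn_args` is assumed to expose only `serving_model_dir` here; a real
  pipeline would also provide training and eval data locations.
  """
  # Build the model inside a distribution strategy scope so variables are
  # mirrored across the available devices.
  strategy = tf.distribute.MirroredStrategy()
  with strategy.scope():
    model = _build_keras_model()

  # Placeholder in-memory data standing in for the pipeline's examples.
  features = tf.random.uniform((256, 10))
  labels = tf.cast(tf.random.uniform((256, 1)) > 0.5, tf.float32)
  dataset = tf.data.Dataset.from_tensor_slices((features, labels)).batch(32)

  model.fit(dataset, epochs=1)

  # Export a SavedModel for downstream serving (e.g. TensorFlow Serving).
  model.save(fn_args.serving_model_dir, save_format='tf')


if __name__ == '__main__':
  import types
  run_fn(types.SimpleNamespace(serving_model_dir='/tmp/keras_tfx_sketch'))
```

This is roughly the shape of user code the TensorFlow 2.x support aims at: none of the model building, training, or export goes through `tf.estimator`.
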
- Q1 2020
  - New ComponentSpec and standard artifact types published.
  - Allow pipelines to be parameterized with `RuntimeParameters` (see the sketch after this list).
  - Enabled warm-starting for Estimator-based trainers.
- Q4 2019
  - Added limited support for TF.Keras through `tf.keras.estimator.model_to_estimator()`.
- Q3 2019
  - Support for local orchestration through Apache Beam.
  - Experimental support for interactive development in Jupyter notebooks.
  - Experimental support for the TFX CLI released.
  - Multiple public RFCs published to the tensorflow/community project.
- Q2 2019
  - Support for Python 3.
  - Support for Apache Spark and Apache Flink runners (with examples).
  - Custom executors (with examples).
- Q1 2019
  - TFX end-to-end pipeline, config, and orchestration initial release.
  - ML Metadata initial release.
- Q3 2018
  - TensorFlow Data Validation initial release.
- Q1 2018
  - TensorFlow Model Analysis initial release.
- Q1 2017
  - TensorFlow Transform initial release.
- Q1 2016
  - TensorFlow Serving initial release.
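
To make the Q1 2020 `RuntimeParameters` item above concrete, the sketch below declares two runtime parameters with `tfx.orchestration.data_types.RuntimeParameter`. The parameter names and default values are made up for illustration, and wiring them into actual pipeline or component arguments is omitted.

```python
from typing import Text

from tfx.orchestration import data_types

# Resolved when a pipeline run is launched, not when the pipeline is defined.
pipeline_root = data_types.RuntimeParameter(
    name='pipeline-root',
    default='/tmp/tfx/pipelines',  # hypothetical default location
    ptype=Text,
)

# Numeric parameters are supported as well, e.g. for training length.
train_steps = data_types.RuntimeParameter(
    name='train-steps',
    default=1000,
    ptype=int,
)
```

These objects can then be passed where pipeline or component arguments are expected, so the same pipeline definition can be launched with different values per run; at the time this landed, it was aimed primarily at the Kubeflow Pipelines orchestrator.
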