Compare to other ML e2e platforms #58

elgalu · 2018-06-18T03:34:30Z

First of all congratulations for releasing all this hard work to the public!

I went through the examples to figure out how exactly this project differs from others, but I only saw some minor technical differences.

Could you provide a summary of why you decided to create a completely new ML pipeline instead of joining some of the other ongoing efforts?

mateiz · 2018-06-18T17:51:56Z

Our blog post at https://databricks.com/blog/2018/06/05/introducing-mlflow-an-open-source-machine-learning-platform.html has the overall motivation, though it might be good to write more direct comparisons on the website at some point. (I personally don't like having that because they invariably get out of date and then people are worried that the website compared against an older version of their platform.)

In a nutshell, I think there are two main goals in MLflow that are different from several of the platforms you list:

1. MLflow is meant to be an "open" platform in the sense that it's easy to bring in any ML library, existing code, existing deployment tools, etc, whereas a lot of the projects you mentioned are focused on a specific set of libraries (for example, TensorFlow and PyTorch) or a specific deployment environment (for example, Kubernetes). We want to allow people to string together workflows out of any component that some other team has used to implement an ML task. If some team across the world wrote a great classifier using a 25-year old R library, that's awesome: you should be able to call it as easily as you can call the latest Spark or PyTorch release.

2. The specific functions supported are different, and in particular, MLflow focuses more on the ML lifecycle and less on the task of deploying jobs to a specific execution platform (such as Kubernetes). For instance, we've spent more time on an experiment tracking UI/API, on a project packaging format that's easy to share through Git, and on designing a multi-flavor model format that allows deployment to quite a few different tools (local serving, SageMaker, Azure ML and Spark ML in the current release). On the other hand, many of these platforms focus specifically on deploying jobs to Kubernetes, which we don't currently try to do.

Basically, we didn't find anything that supported the large-scale, multi-library experimentation and deployment workflow that we saw people wanting to do, so we decided to focus on that.

In general though, MLflow should also be pretty complementary to many of the tools you listed. For example, you can deploy your jobs to Kubernetes using one of these but use MLflow Tracking to track experiments or MLflow Models as a format for deploying the model. MLflow's goal is mainly to let you manage the ML lifecycle regardless of which tools you use to train or run the model.
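To make the "multi-flavor model format" idea above a bit more concrete, here is a minimal sketch using only the Python standard library. It is an illustration of the concept, not MLflow's actual file format or API: the `model.json` descriptor, the `save_model` and `pick_flavor` helpers, and the file names inside each flavor are all hypothetical.

```python
import json
import tempfile
from pathlib import Path


# Hypothetical descriptor layout, for illustration only; this is not MLflow's real format.
def save_model(model_dir: Path, flavors: dict) -> Path:
    """Write a descriptor listing every flavor the saved model can be loaded as."""
    model_dir.mkdir(parents=True, exist_ok=True)
    descriptor = model_dir / "model.json"
    descriptor.write_text(json.dumps({"flavors": flavors}, indent=2))
    return descriptor


def pick_flavor(model_dir: Path, supported: list) -> str:
    """Return the first flavor, in the tool's order of preference, that the model provides."""
    flavors = json.loads((model_dir / "model.json").read_text())["flavors"]
    for name in supported:
        if name in flavors:
            return name
    raise ValueError(f"no supported flavor among {sorted(flavors)}")


def demo() -> tuple:
    with tempfile.TemporaryDirectory() as tmp:
        model_dir = Path(tmp) / "classifier"
        save_model(model_dir, {
            "sklearn": {"pickled_model": "model.pkl"},
            "python_function": {"loader_module": "my_loader"},
        })
        # A tool that understands the library-specific flavor picks it directly.
        native = pick_flavor(model_dir, ["sklearn", "python_function"])
        # A tool that does not know the library falls back to the generic flavor.
        generic = pick_flavor(model_dir, ["spark", "python_function"])
        return native, generic
```

Calling `demo()` saves one model with two flavors and shows two different tools each selecting the flavor they understand, which is the property that lets a single saved model target several deployment tools.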

rquintino · 2018-06-21T15:54:09Z

An outsider's view, but I agree with @mateiz: Polyaxon has some similarities, but the others above focus on completely different problems and are not really ML experimentation platforms for the ML development lifecycle.

I've been following pretty much every experimentation framework for quite some time, and we are actually using and evolving our own. IMO it pretty much changes the way you do ML, for the better :)

I still find MLflow one of the best open source efforts so far. It was very refreshing to see this published and a lot of our needs/ideas validated (we're grabbing some new ideas and will contribute if possible).

Some of the ideas we were actually already using on our platform (local storage, experiments as a group of runs but without blocking comparison of pretty much anything to anything, everything gets stored, minimum overhead), plus job queue/scale-out/Docker, notifications, and saving a huge amount of metadata at a huge scale (thousands up to millions of runs). We're currently targeting a better UI (here MLflow is better :) ).
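As a rough illustration of the storage model described above (local storage, experiments as groups of runs, everything stored, any run comparable with any other), here is a minimal standard-library sketch. It is not code from MLflow or from any platform mentioned in this thread; the directory layout and the `log_run`, `all_runs` and `best_run` helpers are hypothetical.

```python
import json
import tempfile
import uuid
from pathlib import Path


def log_run(root: Path, experiment: str, params: dict, metrics: dict) -> str:
    """Store one run as a JSON file under its experiment's directory."""
    run_id = uuid.uuid4().hex
    exp_dir = root / experiment
    exp_dir.mkdir(parents=True, exist_ok=True)
    record = {"run_id": run_id, "experiment": experiment,
              "params": params, "metrics": metrics}
    (exp_dir / f"{run_id}.json").write_text(json.dumps(record))
    return run_id


def all_runs(root: Path) -> list:
    """Load every run from every experiment, so any run can be compared with any other."""
    return [json.loads(p.read_text()) for p in sorted(root.glob("*/*.json"))]


def best_run(root: Path, metric: str) -> dict:
    """Return the run with the highest value of `metric`, regardless of experiment."""
    candidates = [r for r in all_runs(root) if metric in r["metrics"]]
    return max(candidates, key=lambda r: r["metrics"][metric])


def demo() -> tuple:
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        log_run(root, "baseline", {"lr": 0.1}, {"accuracy": 0.81})
        log_run(root, "baseline", {"lr": 0.01}, {"accuracy": 0.84})
        log_run(root, "deeper-net", {"lr": 0.01, "layers": 6}, {"accuracy": 0.88})
        best = best_run(root, "accuracy")
        return len(all_runs(root)), best["experiment"], best["params"]
```

Here experiments only group runs on disk; `best_run` deliberately ignores that grouping, so runs from different experiments are compared directly.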

some references for similar work or workflows:

(must read, this is the feeling & inspiration )
http://blog.niland.io/how-we-conduct-research-at-niland/

this one, very recent:
https://machinelearningmastery.com/controlled-experiments-in-machine-learning/

others:
https://www.wandb.com/
https://azuremarketplace.microsoft.com/en-us/marketplace/apps/Microsoft.MachineLearningExperimentation?tab=Overview
https://github.com/williamFalcon/test-tube
http://artemis-ml.readthedocs.io/en/latest/experiments.html
https://www.comet.ml/
https://kaixhin.github.io/FGLab/
https://github.com/IDSIA/sacred
https://mllg.github.io/batchtools/
https://neptune.ml/
https://docs.skymind.ai/docs/welcome
https://github.com/mitdbg/modeldb
https://pythonhosted.org/Sumatra/index.html
https://github.com/christiansch/pythia
https://modelchimp.com/
https://github.com/ucbrise/flor
http://vfx.ai/2017/11/machine-learning-labs/
RQ

mateiz · 2018-06-22T05:06:52Z

Glad you like it, Rui! We're still early on, so we'd love input on what to improve or what will make it easier to run. We've also tried to design MLflow in a fairly modular way, where you can pick up some pieces but not others in your own platform.

kirk86 · 2019-08-20T14:46:50Z

@elgalu That's an ongoing issue that I see on GitHub with many other libraries as well. For instance, I see that MLflow is highly influenced by Sacred, which was influenced by Sumatra, but it's a shame that people don't contribute to existing libraries. Even at this point I still find it hard to see the differences between MLflow and Sacred, for instance (not meant to be a criticism). Not only that, but some old libraries like Sumatra still have features that I haven't seen in any of the new libraries being offered.

hernanborre · 2019-09-06T20:10:04Z

Hi Mateiz, what a great reply!

I'm deciding what stack to adopt for my current employer and I'm having a hard time figuring out if it's possible (or if it even makes sense) to have TFX on the models side but adopt MLflow to manage libraries, artifacts, lifecycle, etc.

What are your thoughts on this?

Thanks in advance,
Best regards,
Hernán

mateiz · 2019-09-06T21:58:10Z

Yup, it should be possible to do that. MLflow already supports saving and managing TensorFlow models, as well as automatic logging of metrics that you send to TensorBoard. Which other pieces of TFX are important to you? We might be able to add built-in integrations if needed, or you can just use them alongside the MLflow APIs.

hernanborre · 2019-09-09T17:38:11Z

Thanks a lot for your quick reply @mateiz!

So basically the idea is to have a relatively common pipeline implemented and to use it for a variety of applications (from tabular data to images or NLP). Since I'm building up the machine learning area in this company, we are still discussing with the stakeholders which use cases will be tackled first. However, I've been working in ML/DS for a while now and I know the importance of defining a pipeline to be able to reuse and share models, data prep and data validation.

My only fear about integrating MLflow and TFX is that some things might get out of control in one of the tools at some point.

mateiz added the area/docs Documentation issues label Jun 28, 2018

durandom mentioned this issue Oct 12, 2018

[WIP] Experiment tracking proposal kubeflow/community#195

Closed

apurva-koti closed this as completed Jul 26, 2019

jdlesage added a commit to jdlesage/mlflow that referenced this issue Dec 23, 2019

Merge pull request mlflow#58 from criteo-forks/master-sync-05-12

2960d56

Master sync 05 12

dbczumar referenced this issue in dbczumar/mlflow Apr 28, 2022

Merge pull request #58 from databricks/disable_conda_compat

8fe823c

Disable conda compatibility tests
