This repository has been archived by the owner on Sep 13, 2023. It is now read-only.

Update README.md #220

Merged · 10 commits · Jun 21, 2022
61 changes: 37 additions & 24 deletions README.md
[![codecov](https://codecov.io/gh/iterative/mlem/branch/main/graph/badge.svg?token=WHU4OAB6O2)](https://codecov.io/gh/iterative/mlem)
[![PyPi](https://img.shields.io/pypi/v/mlem.svg?label=pip&logo=PyPI&logoColor=white)](https://pypi.org/project/mlem)

MLEM helps you package and deploy machine learning models.
It saves model metadata in a standard, human-readable format that can be used in a variety of deployment scenarios, such as real-time serving through a REST API or a batch processing task.
MLEM lets you keep Git as the single source of truth for code and models.

- **Run your ML models anywhere:**
Wrap models as a Python package or Docker Image, or deploy them to Heroku (SageMaker, Kubernetes, and more platforms coming soon).
Switch between platforms transparently, with a single command.

- **Simple text file to save model metadata:**
Automatically include Python requirements and input data specifications in a deployment-ready format.
Use the same format with any ML framework.
> **Review thread (resolved)**
>
> **@jorgeorpinel (author):** Is the format YAML? Where can we see the schema?
>
> **Contributor:** Yes, YAML. Do you think it's worth showing the schema somewhere? You aren't supposed to write those YAML files, only read them, if you need to for some reason.
>
> **Contributor:** The `.mlem` YAML schema depends on the type of the object; model, data, deployment, etc. each have a different schema. IIRC, this is the reason why we cannot easily create the schema the way DVC can. We can take a look later, but this is not the priority right now.
>
> **Contributor:** Created an issue to address this later: https://github.com/iterative/mlem/issues/306
>
> **@jorgeorpinel (author):** OK, np. Putting a pseudo-schema spec somewhere in the mlem.ai docs would be nice though ⏳

- **Stick to your training workflow:**
MLEM doesn't ask you to rewrite model training code.
Just add two lines around your Python code: one to import the library and one to save the model (see the sketch after this list).

- **Git as a single source of truth:** we use plain text to save metadata for models that can be saved and versioned.
- **Reuse existing Git and Github/Gitlab infrastructure** for model management instead of installing separate model management software.
- **Unify model and software deployment.** Deploy models using the same processes and code you use to deploy software.
- **Developer-first experience:**
Use the CLI when you feel like DevOps and the API when you feel like a developer.
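
For instance, here is a minimal sketch of that two-line workflow. The scikit-learn model, the `models/rf` path, and the `sample_data` argument reflect one plausible setup with a recent MLEM release, not the only way to do it:

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

from mlem.api import save  # line one: import the library

X, y = load_iris(return_X_y=True, as_frame=True)
model = RandomForestClassifier().fit(X, y)

# Line two: save the model. MLEM stores the model binary and writes a
# human-readable metadata file next to it.
save(model, "models/rf", sample_data=X)
```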

## Why is MLEM special?

The main reason to use MLEM instead of other tools is to adopt a **GitOps approach**, helping you manage model lifecycles in Git:

- **Git as a single source of truth:** MLEM writes model metadata to a plain text file that can be versioned in a Git repo along with your code (see the loading sketch after this list).
> **Review thread (resolved)**
>
> **@jorgeorpinel (author), May 4, 2022:** Consider: this whole section could be merged in with the intro, and the bullet lists combined (not as-is).
>
> **@jorgeorpinel (author):** UPDATE: See @dberenbaum's recommendation: the first section's bullets should explain the basic workflow (numbered list) rather than the benefits, and the rest should be moved to "Why MLEM?" (bullet list).

- **Unify model and software deployment:** Release models using the same processes used for software updates (branching, pull requests, etc.).
- **Reuse existing Git infrastructure:** Use familiar hosting like GitHub or GitLab for model management, instead of having separate services.
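
As an illustrative sketch of that Git-centric flow, a saved model can be loaded straight from a repository. The repo URL and model path below are hypothetical, and the `project`/`rev` parameters assume a recent MLEM release:

```python
from mlem.api import load

# Load the model directly from a Git repo, pinned to a revision, so the
# repo stays the single source of truth for code and models alike.
model = load("models/rf", project="https://github.com/example/repo", rev="main")
print(model.predict([[5.1, 3.5, 1.4, 0.2]]))
```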

## Installation

MLEM requires Python 3. Install it with pip:

```console
$ python -m pip install mlem
```

To install the bleeding-edge version, use:

```console
$ python -m pip install git+https://github.com/iterative/mlem
```

## Anonymized Usage Analytics

To help us better understand how MLEM is used and improve it, MLEM captures and reports anonymized usage statistics.

### What
MLEM's analytics record the following information per event:

- MLEM version (e.g., `0.1.2+5fb5a3.mod`) and OS version (e.g., `MacOS 10.16`)
- Command name and exception type (e.g., `ls, ValueError` or `get, MLEMRootNotFound`)
- Country, city (e.g., `RU, Moscow`)
- A random user_id (generated with [uuid](https://docs.python.org/3/library/uuid.html))

### Implementation
Analytics are collected in a separate background process and fail fast (immediately and silently, if there is no network connection) to avoid delaying any execution.

> The code is viewable in [analytics.py](https://github.com/iterative/mlem/blob/main/mlem/analytics.py).
> MLEM's analytics are sent through Iterative's proxy to Google BigQuery over HTTPS.

### Opting out
MLEM analytics have no performance impact and help the entire community, so leaving them on is appreciated.
However, to opt out, set the environment variable `MLEM_NO_ANALYTICS=true` or add `no_analytics: true` to `.mlem/config.yaml`.
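
For example, in a POSIX shell you can disable analytics for the current session like this:

```console
$ export MLEM_NO_ANALYTICS=true
```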

> This will disable it for the project.
> We'll add an option to opt out globally soon.