iterative · aguschin · Mar 7, 2023
diff --git a/content/docs/gto/get-started-dvc.md b/content/docs/gto/get-started-dvc.md
@@ -0,0 +1,176 @@
+# Get Started DVC
+
+To leverage concepts of Model and Data Registries in a more explicit way, you
+can denote the `type` of each output. This will let you browse models and data
+separately, address them by `name` in `dvc get`, and eventually, see them in DVC
+Studio.
+
+Let's start with marking a tracked artifact (file) as a `model`.
+
+If you're using `dvc add` to track your artifact, you'll need to run:
+
+```dvc
+# all CLI options are optional; `--type model` tells DVC this is an ML model
+$ dvc add models/mymodel.pkl \
+    --type model \
+    --name def-detector \
+    --description "glass defect image classifier" \
+    --label "algo=cnn" \
+    --label "owner=aguschin" \
+    --label "project=prod-qual-002"
+```
+
+<details>
+
+### Besides tracking the file as usual, this will add it to a top-level section called `registry` in your `dvc.yaml`
+
+```yaml
+# dvc.yaml
+registry:
+  def-detector: # just like with plots, this could be a path or any string ID
+    type: model
+    description: glass defect image classifier
+    labels:
+      - algo=cnn
+      - owner=aguschin
+      - project=prod-qual-002
+    path: models/mymodel.pkl # specify the path if the key above is an alias
+```
+
+If you want this to be in a separate file (say, `artifacts.yaml`), you can tell
+DVC to use it with:
+
+```yaml
+# dvc.yaml
+registry: artifacts.yaml
+```
+
+</details>
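
The key-or-path convention in the `registry` section can be illustrated with a minimal Python sketch. It assumes the `registry` mapping has already been loaded from `dvc.yaml` into a plain dict; `resolve_artifact` is a hypothetical helper for illustration, not part of DVC.

```python
# Hypothetical helper, not part of DVC: it only mirrors the `registry`
# layout shown in the YAML example.
def resolve_artifact(registry, key):
    """Return a normalized view of one registry entry."""
    entry = registry[key] or {}
    labels = {}
    for item in entry.get("labels", []):
        # Labels look like "algo=cnn"; split on the first "=" only.
        name, _, value = item.partition("=")
        labels[name] = value
    return {
        "name": key,
        # Assumption: without an explicit `path`, the key itself is the path.
        "path": entry.get("path", key),
        "type": entry.get("type"),
        "labels": labels,
    }


registry = {
    "def-detector": {
        "type": "model",
        "description": "glass defect image classifier",
        "labels": ["algo=cnn", "owner=aguschin", "project=prod-qual-002"],
        "path": "models/mymodel.pkl",
    },
    "data/data.xml": {"type": "data"},
}

model = resolve_artifact(registry, "def-detector")
print(model["path"])  # models/mymodel.pkl
print(resolve_artifact(registry, "data/data.xml")["path"])  # data/data.xml
```

An entry keyed by an alias carries its own `path`, while an entry keyed by a path needs nothing more.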
+
+If you're producing your models in a DVC pipeline, you can edit the `registry`
+section or `artifacts.yaml` yourself (or simply run the same `dvc add` command,
+which will do that for you) and then reference the output by ID or path in
+`deps` or `outs`:
+
+```yaml
+# dvc.yaml
+stages:
+  train:
+    cmd: python train.py
+    deps:
+      - data.xml
+    outs:
+      - def-detector # or "models/mymodel.pkl" instead
+```
+
+You can also set the type when logging an artifact with DVCLive, which will
+also add your model to the `registry` section in `dvc.yaml`:
+
+```py
+# you can pass `name`, `description`, `labels` as well
+live.log_artifact(artifact, "path", type="model")
+```
+
+This will make them appear in [Studio Model Registry](https://dvc.org/doc/studio/user-guide/model-registry/what-is-a-model-registry):
+
+![](https://user-images.githubusercontent.com/6797716/223443152-84f57b79-3395-4965-97f9-edc81896a1dc.png)
+
+and show them as models in `dvc ls`:
+
+```dvc
+$ dvc ls --registry  # add `--type model` to see models only
+ Path           Name                   Type     Labels                       Description
+ mymodel.pkl                           model
+ data.xml       stackoverflow-dataset  data     data-registry,get-started    imported code
+ data/data.xml  another-dataset        data     data-registry,get-started    imported
+```
+
+The same way you specify `type`, you can specify `description`, `labels` and
+`name`. Defining a human-readable `name` (which should be unique) is useful when
+you have complex folder structures or if your artifact can have different paths
+during the project lifecycle.
+
+You can use `name` to address the object in `dvc get`:
+
+```dvc
+$ dvc get $REPO def-detector -o model.pkl
+```
+
+Now, you usually need a specific model version rather than one from the `main`
+branch. You can keep track of the model's lineage by
+[registering semantic versions and promoting your models](/doc/gto/get-started)
+(or other artifacts) to stages such as `dev` or `production` with GTO. GTO
+operates by creating Git tags such as `[email protected]` or `mymodel#prod`.
+Knowing the right Git tag, you can get the model locally:
+
+```dvc
+$ dvc get $REPO mymodel.pkl --rev [email protected]
+```
+
+Check out the
+[GTO User Guide](/doc/gto/user-guide/#getting-artifacts-in-systems-downstream)
+to learn how to get the Git tag of the `latest` version or of the version
+currently promoted to a stage like `prod`.
+
+<details>
+
+### Getting `latest` or what's in `prod` directly with DVC [extra for now]
+
+(This can be implemented, but for now we decided not to - let's wait and see)
+
+You can also use shortcuts in `dvc get`:
+
+```dvc
+$ dvc get $REPO def-detector@latest  # download the latest version
+$ dvc get $REPO def-detector#prod    # download what's in prod
+```
+
+</details>
+
+## Getting models in CI/CD
+
+Git tags are a great way to
+[kick off a CI/CD](/doc/gto/user-guide/#acting-in-cicd) pipeline in which we can
+consume our model. You can use the
+[GTO GitHub Action](https://github.com/iterative/gto-action) to interpret the
+Git tag that triggered the workflow and act based on that. If you simply need to
+download the model to CI, you can also use this Action with the `download`
+option:
+
+```yaml
+steps:
+  - uses: actions/checkout@v3
+  - id: gto
+    uses: iterative/gto-action@v1
+    with:
+      download: True # you can provide a specific destination path here instead of `True`
+```
+
+This means that if the Git tag that triggered the workflow registers a version
+or promotes it to a stage (like `[email protected]` or `mymodel#prod`), the Action
+will run `dvc get . mymodel`.
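
To make the tag interpretation step concrete, here is a minimal Python sketch of splitting such a tag into its parts. It assumes two tag shapes: `<name>@<version>` for a registration, and `<name>#<stage>` (optionally followed by a numeric `#<counter>`) for a promotion. `parse_tag` and the sample tag `mymodel@v1.2.3` are hypothetical illustrations; this is not the code the GTO Action runs.

```python
import re

# Assumed tag shapes (illustrative only):
#   "<name>@<version>"           registers a version
#   "<name>#<stage>[#<counter>]" promotes a version to a stage
TAG_RE = re.compile(
    r"^(?P<name>[^@#]+)"
    r"(?:@(?P<version>[^@#]+)|#(?P<stage>[^@#]+)(?:#(?P<counter>\d+))?)$"
)


def parse_tag(tag):
    """Split a Git tag into artifact name and event; None if it does not match."""
    match = TAG_RE.match(tag)
    if match is None:
        return None
    parts = match.groupdict()
    event = "registration" if parts["version"] else "promotion"
    return {
        "name": parts["name"],
        "event": event,
        "version": parts["version"],
        "stage": parts["stage"],
    }


print(parse_tag("mymodel@v1.2.3"))
print(parse_tag("mymodel#prod"))
print(parse_tag("release-2023"))  # None: not an artifact tag
```

A CI job could use the returned `name` to decide which artifact to download and skip tags that do not match either shape.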
+
+## Restricting which types are allowed [extra for now]
+
+To restrict which `type`s can be used, you can add the following to your
+`.dvc/config`:
+
+```
+# .dvc/config
+types: [model, data]
+```
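
As a rough sketch of what such a restriction could amount to, the Python snippet below checks registry entries against an allow-list. It assumes the registry and the allowed types are already loaded into plain Python objects; `find_disallowed` is a hypothetical helper, and treating entries without a `type` as acceptable is an assumption rather than documented behavior.

```python
# Hypothetical check, not a DVC feature: flag entries whose `type` is not allowed.
def find_disallowed(registry, allowed_types):
    """Map artifact name -> offending type for every entry outside the allow-list."""
    allowed = set(allowed_types)
    return {
        name: entry["type"]
        for name, entry in registry.items()
        # Assumption: entries without a `type` are not treated as violations.
        if entry.get("type") is not None and entry["type"] not in allowed
    }


registry = {
    "def-detector": {"type": "model"},
    "another-dataset": {"type": "data"},
    "report": {"type": "notebook"},
    "untyped-file": {},
}

print(find_disallowed(registry, ["model", "data"]))  # {'report': 'notebook'}
```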
+
+## Models and Experiments
+
+After you run `dvc exp push` to push your experiment that updates your model,
+you'll see a commit candidate to be registered:
+
+![](https://user-images.githubusercontent.com/6797716/223444959-d8ddd1a0-5582-405f-9ab0-807e1a0c9489.png)
+
+Please note it's usually a good idea to merge your experiment before registering
+a semantic version to avoid creating dangling commits (not reachable from any
+branch).
+
+In the future, you'll also be able to compare the newly pushed model version
+(even one not registered with a semantic version) with the latest one on the
+Model Details Page, or use a button to open the main repo view with "compare"
+enabled:
+
+![](https://user-images.githubusercontent.com/6797716/223445799-7ae65e58-6a9e-42a8-890a-f04839349873.png)
diff --git a/content/docs/sidebar.json b/content/docs/sidebar.json
@@ -656,6 +656,11 @@
         "label": "Get Started",
         "source": "get-started.md"
       },
+      {
+        "slug": "get-started-dvc",
+        "label": "Get Started for DVC",
+        "source": "get-started-dvc.md"
+      },
       {
         "slug": "user-guide",
         "label": "User Guide",