diff --git a/content/docs/use-cases/data-registries.md b/content/docs/use-cases/data-registries.md
index 76c01098315..206ff21b9c2 100644
--- a/content/docs/use-cases/data-registries.md
+++ b/content/docs/use-cases/data-registries.md
@@ -22,10 +22,10 @@ _middleware_ between ML projects and cloud storage. Here are its advantages:
 - Reusability: reproduce and organize _feature stores_ with a simple CLI
   (`dvc get` and `dvc import` commands, similar to software package management
   systems like `pip`).
-- Persistence: the DVC registry-controlled
-  [remote storage](/doc/command-reference/remote) (e.g. an S3 bucket) improves
-  data security. There are less chances someone can delete or rewrite a model,
-  for example.
+- Availability and persistence: the
+  [remote storage](/doc/command-reference/remote) configured in the DVC registry
+  (e.g. an S3 bucket) works as a proxy that endures your data is always
+  available and that it can outlive the registry itself.
 - Storage optimization: track data
   [shared](/doc/use-cases/sharing-data-and-model-files) by multiple projects
   centralized in a single location (with the ability to create distributed
@@ -34,10 +34,11 @@ _middleware_ between ML projects and cloud storage. Here are its advantages:
 - Data as code: leverage Git workflow such as commits, branching, pull requests,
   reviews, and even CI/CD for your data and models lifecycle. Think "Git for
   cloud storage", but without ad-hoc conventions.
-- Security: registries can be setup to have read-only remote storage (e.g. an
-  HTTP location). Git versioning of
-  [DVC metafiles](/doc/user-guide/dvc-files-and-directories) allows us to track
-  and audit data changes.
+- Data Security: there are less chances someone can delete or rewrite datasets
+  or models, preserving data integrity. Registries can even be setup to use
+  read-only remote storage (e.g. an HTTP location). Additionally, versioning
+  [DVC metafiles](/doc/user-guide/dvc-files-and-directories) with Git enables
+  following and auditing data changes.
 
 ## Building registries