diff --git a/content/docs/start/index.md b/content/docs/start/index.md
index 060edd3e9a..a31786869e 100644
--- a/content/docs/start/index.md
+++ b/content/docs/start/index.md
@@ -49,13 +49,15 @@ Changes to be committed:
 $ git commit -m "Initialize DVC"
 ```
 
-DVC features can be grouped into functional components. We'll explore them one
-by one in the next few sections:
+Now you're ready to DVC!
 
-- [**Data versioning**](/doc/start/data-versioning) is the base layer of DVC for
-  large files, datasets, and machine learning models. It looks like a regular
-  Git workflow, but without storing large files in the repo (think "Git for
-  data"). Data is stored separately, which allows for efficient sharing.
+DVC's features can be grouped into functional components. We'll explore them one
+by one in the next few pages:
+
+- [**Data versioning**](/doc/start/data-versioning) (try this next) is the base
+  layer of DVC for large files, datasets, and machine learning models. Use a
+  regular Git workflow, but without storing large files in the repo (think "Git
+  for data"). Data is stored separately, which allows for efficient sharing.
 
 - [**Data access**](/doc/start/data-access) shows how to use data artifacts from
   outside of the project and how to import data artifacts from another DVC
diff --git a/content/docs/use-cases/data-registries.md b/content/docs/use-cases/data-registries.md
index 2848f75877..3cdaa4fb2d 100644
--- a/content/docs/use-cases/data-registries.md
+++ b/content/docs/use-cases/data-registries.md
@@ -30,7 +30,7 @@ Advantages of data registries:
   management and optimizes space requirements.
 - **Data as code**: leverage Git workflows such as commits, branching, pull
   requests, reviews, and even CI/CD for your data and models lifecycle. Think
-  "Git for cloud storage", but without ad-hoc conventions.
+  "Git for cloud storage".
 - **Security**: registries can be setup with read-only remote storage (e.g. an
   HTTP server).
 
diff --git a/content/docs/user-guide/dvc-files-and-directories.md b/content/docs/user-guide/dvc-files-and-directories.md
index 5d687dafb4..d7760b6272 100644
--- a/content/docs/user-guide/dvc-files-and-directories.md
+++ b/content/docs/user-guide/dvc-files-and-directories.md
@@ -72,14 +72,15 @@ An _output entry_ (`outs`) can have these fields:
   HTTP, S3, or Azure [external outputs](/doc/user-guide/managing-external-data);
   and a special _checksum_ for HDFS and WebHDFS.
 - `size`: Size of the file or directory (sum of all files).
-- `nfiles`: If a directory, number of files inside.
+- `nfiles`: If this output is a directory, the number of files inside
+  (recursive).
 - `cache`: Whether or not this file or directory is cached (`true` by default,
   if not present). See the `--no-commit` option of `dvc add`.
 - `persist`: Whether the output file/dir should remain in place while
   `dvc repro` runs. By default outputs are deleted when `dvc repro` starts (if
   this value is not present).
-- `desc`: User description for this output. This doesn't affect any DVC
-  operations.
+- `desc` (optional): User description for this output. This doesn't affect any
+  DVC operations.
 
 A _dependency entry_ (`deps`) can have these fields:
 
@@ -91,7 +92,8 @@ A _dependency entry_ (`deps`) can have these fields:
   HTTP, S3, or Azure external dependencies; and a special _checksum_ for HDFS
   and WebHDFS. See `dvc import-url` for more information.
 - `size`: Size of the file or directory (sum of all files).
-- `nfiles`: If a directory, number of files inside.
+- `nfiles`: If this dependency is a directory, the number of files inside
+  (recursive).
 - `repo`: This entry is only for external dependencies created with
   `dvc import`, and can contains the following fields:
 
diff --git a/content/docs/user-guide/how-to/merge-conflicts.md b/content/docs/user-guide/how-to/merge-conflicts.md
index 89c003b9ab..dd4408ef15 100644
--- a/content/docs/user-guide/how-to/merge-conflicts.md
+++ b/content/docs/user-guide/how-to/merge-conflicts.md
@@ -36,9 +36,8 @@ stages:
 
 ## `dvc.lock`
 
-There's no need to resolve lock file conflicts manually. You can safely delete
-this file and then use `dvc repro` after merging `dvc.yaml` to regenerate this
-file.
+There's no need to resolve lock file conflicts manually. You can safely
+overwrite this file by using `dvc repro` after merging `dvc.yaml`.
 
 > `dvc commit` can also be a good option, but only for the specific case where
 > the `HEAD` version is chosen.
diff --git a/content/docs/user-guide/setup-google-drive-remote.md b/content/docs/user-guide/setup-google-drive-remote.md
index e86e7961fb..cecff6b626 100644
--- a/content/docs/user-guide/setup-google-drive-remote.md
+++ b/content/docs/user-guide/setup-google-drive-remote.md
@@ -215,7 +215,7 @@ individually.
 
 If you use multiple GDrive remotes, by default they will be sharing the same
 `.dvc/tmp/gdrive-user-credentials.json` file. It can be overridden with the
-`gdrive_user_credentials_file` setting:
+`gdrive_user_credentials_file` parameter:
 
 ```dvc
 $ dvc remote modify myremote gdrive_user_credentials_file \