From 9f7b3d97db31f0376f445502d80ad9084c25a91f Mon Sep 17 00:00:00 2001 From: Jorge Orpinel Date: Wed, 5 Aug 2020 21:14:07 -0500 Subject: [PATCH 1/5] cmd: remove unnecessary commas in get and import --- content/docs/command-reference/get.md | 2 +- content/docs/command-reference/import.md | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/content/docs/command-reference/get.md b/content/docs/command-reference/get.md index d09a375804..b3b54c94d9 100644 --- a/content/docs/command-reference/get.md +++ b/content/docs/command-reference/get.md @@ -39,7 +39,7 @@ downloading, DVC will try to copy the target data from its cache). The `path` argument is used to specify the location of the target to download within the source repository at `url`. `path` can specify any file or directory in the source repo, either tracked by DVC (including paths inside tracked -directories), or by Git. Note that DVC-tracked targets should be found in a +directories) or by Git. Note that DVC-tracked targets should be found in a `dvc.yaml` or `.dvc` file of the project. ⚠️ The project should have a default diff --git a/content/docs/command-reference/import.md b/content/docs/command-reference/import.md index c504be21c1..7b1a131e91 100644 --- a/content/docs/command-reference/import.md +++ b/content/docs/command-reference/import.md @@ -42,7 +42,7 @@ downloading, DVC will try to copy the target data from its cache). The `path` argument is used to specify the location of the target to download within the source repository at `url`. `path` can specify any file or directory in the source repo, either tracked by DVC (including paths inside tracked -directories), or by Git. Note that DVC-tracked targets should be found in a +directories) or by Git. Note that DVC-tracked targets should be found in a `dvc.yaml` or `.dvc` file of the project. 
⚠️ The project should have a default From c0f7748857e8a019d11245a69bda6aa4ad89f9d2 Mon Sep 17 00:00:00 2001 From: Jorge Orpinel Date: Wed, 5 Aug 2020 21:14:27 -0500 Subject: [PATCH 2/5] cmd: fix typo in add --- content/docs/command-reference/remote/add.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/content/docs/command-reference/remote/add.md b/content/docs/command-reference/remote/add.md index adc643d7aa..524eb9ce1d 100644 --- a/content/docs/command-reference/remote/add.md +++ b/content/docs/command-reference/remote/add.md @@ -24,8 +24,8 @@ or even a directory in the local file system. (See all the supported remote storage types in the examples below.) If `url` is a relative path, it will be resolved against the current working directory, but saved **relative to the config file location** (see LOCAL example below). Whenever possible, DVC will -create a remote directory if it doesn't exists yet. (It won't create an S3 -bucket though, and will rely on default access settings.) +create a remote directory if it doesn't exist yet. (It won't create an S3 bucket +though, and will rely on default access settings.) 
> If you installed DVC via `pip` and plan to use cloud services as remote > storage, you might need to install these optional dependencies: `[s3]`, From 308a23dbaac6eb782c254b731974d465a7dd0e54 Mon Sep 17 00:00:00 2001 From: Jorge Orpinel Date: Thu, 6 Aug 2020 02:44:13 -0500 Subject: [PATCH 3/5] cmd: remote copy edits per https://github.com/iterative/dvc.org/pull/1617#discussion_r465675597 --- content/docs/command-reference/remote/add.md | 3 +- .../docs/command-reference/remote/modify.md | 86 +++++++++++-------- 2 files changed, 50 insertions(+), 39 deletions(-) diff --git a/content/docs/command-reference/remote/add.md b/content/docs/command-reference/remote/add.md index 524eb9ce1d..7fca263aa7 100644 --- a/content/docs/command-reference/remote/add.md +++ b/content/docs/command-reference/remote/add.md @@ -131,7 +131,8 @@ For example: ```dvc $ dvc remote add -d myremote s3://mybucket/path/to/dir -$ dvc remote modify myremote endpointurl https://object-storage.example.com +$ dvc remote modify myremote endpointurl \ + https://object-storage.example.com ``` > See `dvc remote modify` for a full list of S3 API parameters. 
diff --git a/content/docs/command-reference/remote/modify.md b/content/docs/command-reference/remote/modify.md index 279a791719..ce9211a97d 100644 --- a/content/docs/command-reference/remote/modify.md +++ b/content/docs/command-reference/remote/modify.md @@ -90,19 +90,19 @@ these settings, you could use the following options: $ dvc remote modify myremote region us-east-2 ``` -- `profile` - credentials profile name to use to access S3: +- `profile` - credentials profile name to access S3: ```dvc $ dvc remote modify myremote profile myprofile ``` -- `credentialpath` - credentials path to use to access S3: +- `credentialpath` - credentials path to access S3: ```dvc $ dvc remote modify myremote credentialpath /path/to/my/creds ``` -- `endpointurl` - endpoint URL to use to access S3: +- `endpointurl` - endpoint URL to access S3: ```dvc $ dvc remote modify myremote endpointurl https://myendpoint.com @@ -168,21 +168,24 @@ these settings, you could use the following options: for specific grantees\*\*. Grantee can read object and its metadata. ```dvc - $ dvc remote modify myremote grant_read id=aws-canonical-user-id,id=another-aws-canonical-user-id + $ dvc remote modify myremote grant_read \ + id=aws-canonical-user-id,id=another-aws-canonical-user-id ``` - `grant_read_acp`\* - grants `READ_ACP` permissions at object level access control list for specific grantees\*\*. Grantee can read the object's ACP. ```dvc - $ dvc remote modify myremote grant_read_acp id=aws-canonical-user-id,id=another-aws-canonical-user-id + $ dvc remote modify myremote grant_read_acp \ + id=aws-canonical-user-id,id=another-aws-canonical-user-id ``` - `grant_write_acp`\* - grants `WRITE_ACP` permissions at object level access control list for specific grantees\*\*. Grantee can modify the object's ACP. 
```dvc - $ dvc remote modify myremote grant_write_acp id=aws-canonical-user-id,id=another-aws-canonical-user-id + $ dvc remote modify myremote grant_write_acp \ + id=aws-canonical-user-id,id=another-aws-canonical-user-id ``` - `grant_full_control`\* - grants `FULL_CONTROL` permissions at object level @@ -190,7 +193,8 @@ these settings, you could use the following options: grant_read_acp + grant_write_acp ```dvc - $ dvc remote modify myremote grant_full_control id=aws-canonical-user-id,id=another-aws-canonical-user-id + $ dvc remote modify myremote grant_full_control \ + id=aws-canonical-user-id,id=another-aws-canonical-user-id ``` > \* `grant_read`, `grant_read_acp`, `grant_write_acp` and @@ -221,7 +225,8 @@ For example: ```dvc $ dvc remote add myremote s3://path/to/dir -$ dvc remote modify myremote endpointurl https://object-storage.example.com +$ dvc remote modify myremote endpointurl \ + https://object-storage.example.com ``` S3 remotes can also be configured entirely via environment variables: @@ -250,7 +255,8 @@ For more information about the variables DVC supports, please visit - `connection_string` - connection string. ```dvc - $ dvc remote modify --local myremote connection_string "my-connection-string" + $ dvc remote modify --local myremote connection_string \ + "my-connection-string" ``` > The connection string contains sensitive user info. Therefore, it's safer to @@ -274,8 +280,8 @@ a full guide on using Google Drive as DVC remote storage. [possible formats](/doc/user-guide/setup-google-drive-remote#url-format). ```dvc - $ dvc remote modify myremote \ - url gdrive://0AIac4JZqHhKmUk9PDA/dvcstore + $ dvc remote modify myremote url \ + gdrive://0AIac4JZqHhKmUk9PDA/dvcstore ``` - `gdrive_client_id` - Client ID for authentication with OAuth 2.0 when using a @@ -415,13 +421,13 @@ more information. $ dvc remote modify myremote oss_endpoint endpoint ``` -- `oss_key_id` - OSS key ID to use to access a remote. +- `oss_key_id` - OSS key ID to access the remote. 
```dvc $ dvc remote modify myremote --local oss_key_id my-key-id ``` -- `oss_key_secret` - OSS secret key for authorizing access into a remote. +- `oss_key_secret` - OSS secret key for authorizing access into the remote. ```dvc $ dvc remote modify myremote --local oss_key_secret my-key-secret @@ -440,41 +446,43 @@ more information. - `url` - remote location URL. ```dvc - $ dvc remote modify myremote url ssh://user@example.com:1234/path/to/remote + $ dvc remote modify myremote url \ + ssh://user@example.com:1234/absolute/path ``` -- `user` - username to use to access a remote. The order in which dvc searches - for username: - - 1. `user` specified in one of the dvc configs; - 2. `user` specified in the url(e.g. `ssh://user@example.com/path`); - 3. `user` specified in `~/.ssh/config` for remote host; - 4. current user; +- `user` - username to access the remote. ```dvc $ dvc remote modify --local myremote user myuser ``` -- `port` - port to use to access a remote. The order in which dvc searches for - port: + The order in which DVC picks the username: - 1. `port` specified in one of the dvc configs; - 2. `port` specified in the url(e.g. `ssh://example.com:1234/path`); - 3. `port` specified in `~/.ssh/config` for remote host; - 4. default ssh port 22; + 1. `user` parameter set with this command (found in `.dvc/config`); + 2. User defined in the URL (e.g. `ssh://user@example.com/path`); + 3. User defined in `~/.ssh/config` for this host (URL); + 4. Current user + +- `port` - port to access the remote. ```dvc $ dvc remote modify myremote port 2222 ``` -- `keyfile` - path to private key to use to access a remote. + The order in which DVC decides the port number: + + 1. `port` parameter set with this command (found in `.dvc/config`); + 2. Port defined in the URL (e.g. `ssh://example.com:1234/path`); + 3. Port defined in `~/.ssh/config` for this host (URL); + 4. Default SSH port 22 + +- `keyfile` - path to private key to access the remote. 
```dvc $ dvc remote modify myremote keyfile /path/to/keyfile ``` -- `password` - a private key passphrase or a password to use to use when - accessing a remote. +- `password` - a private key passphrase or a password to access the remote. ```dvc $ dvc remote modify --local myremote password mypassword @@ -484,8 +492,8 @@ more information. > safer to add them with the `--local` option, so they're written to a > Git-ignored config file. -- `ask_password` - ask for a private key passphrase or a password to use when - accessing a remote. +- `ask_password` - ask for a private key passphrase or a password to access the + remote. ```dvc $ dvc remote modify myremote ask_password true @@ -509,7 +517,7 @@ more information. ### Click for HDFS -- `user` - username to use to access a remote. +- `user` - username to access the remote. ```dvc $ dvc remote modify --local myremote user myuser @@ -524,7 +532,7 @@ more information. ### Click for HTTP -- `auth` - authentication method to use when accessing a remote. The accepted +- `auth` - authentication method to use when accessing the remote. The accepted values are: - `basic` - @@ -551,15 +559,17 @@ more information. ``` - `user` - username to use when the `auth` parameter is set to `basic` or - `digest`. The order in which DVC searches for username: - - 1. `user` specified in one of the DVC configs; - 2. `user` specified in the url(e.g. `http://user@example.com/path`); + `digest`. ```dvc $ dvc remote modify --local myremote user myuser ``` + The order in which DVC picks the username: + + 1. `user` parameter set with this command (found in `.dvc/config`); + 2. User defined in the URL (e.g. `http://user@example.com/path`); + - `password` - password to use for any `auth` method. 
```dvc From 56349a7ba703fbaec1e69757b236d54676366d46 Mon Sep 17 00:00:00 2001 From: Jorge Orpinel Date: Thu, 6 Aug 2020 12:23:26 -0500 Subject: [PATCH 4/5] guide: .dvcignore copy edit --- content/docs/user-guide/dvcignore.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/docs/user-guide/dvcignore.md b/content/docs/user-guide/dvcignore.md index 69af83366d..7305abe11b 100644 --- a/content/docs/user-guide/dvcignore.md +++ b/content/docs/user-guide/dvcignore.md @@ -16,7 +16,7 @@ similar to `.gitignore` in Git. - You need to create the `.dvcignore` file. It can be placed in the root of the project or inside any subdirectory (see also [remarks](#Remarks) below). - Populate it with [patterns](https://git-scm.com/docs/gitignore) that you would - like to ignore. You can find useful file templates + like to ignore. You can find useful templates [here](https://github.com/github/gitignore). - Each line should contain only one pattern. - During execution of commands that traverse directories, DVC will ignore From dbb9b9462e7b3c959f7bf7e87b3363014857ea35 Mon Sep 17 00:00:00 2001 From: Jorge Orpinel Date: Thu, 6 Aug 2020 12:29:44 -0500 Subject: [PATCH 5/5] cmd: init copy edits --- content/docs/command-reference/init.md | 38 +++++++++++++------------- 1 file changed, 19 insertions(+), 19 deletions(-) diff --git a/content/docs/command-reference/init.md b/content/docs/command-reference/init.md index 365499f15b..2c4115fb75 100644 --- a/content/docs/command-reference/init.md +++ b/content/docs/command-reference/init.md @@ -32,45 +32,45 @@ advanced scenarios: ### Initializing DVC in subdirectories `--subdir` must be provided to initialize DVC in a subdirectory of a Git -repository. DVC still expects to find the Git repository (will check all -directories up to the system root to find `.git/`). This options does not affect -any config files, `.dvc/` directory is created the same way as in the default -mode. 
This way multiple DVC projects can be initialized in a single -Git repository, providing isolation between projects. +repository. DVC still expects to find a Git root (will check all directories up +to the system root to find `.git/`). This option does not affect any config +files, `.dvc/` directory is created the same way as in the default mode. This +way multiple DVC projects can be initialized in a single Git +repository, providing isolation between projects. #### When is this useful? This option is mostly useful in the scenario of a -[monorepo](https://en.wikipedia.org/wiki/Monorepo) (Git repository split into -several project directories), but can also be used with other patterns when such -isolation is needed. `dvc init --subdir` mitigates the issues of initializing -DVC in the Git repo root: +[monorepo](https://en.wikipedia.org/wiki/Monorepo) (Git repo split into several +project directories), but can also be used with other patterns when such +isolation is needed. `dvc init --subdir` mitigates possible limitations of +initializing DVC in the Git repo root: - Repository maintainers might not allow a top level `.dvc/` directory, - especially if DVC is being used by several sub-projects (monorepo). + especially if DVC is already being used by several sub-projects (monorepo). - DVC [internals](/doc/user-guide/dvc-files-and-directories) (config file, cache - directory, etc.) are shared across different sub-projects. This forces all of - them to use the same DVC settings and + directory, etc.) would be shared across different subdirectories. This forces + all of them to use the same DVC settings and [remote storage](/doc/command-reference/remote). - By default, DVC commands like `dvc pull` and `dvc repro` explore the whole DVC repository to find DVC-tracked data and pipelines to work with. This can be inefficient for large monorepos. 
-- Other commands such as `dvc status` and `dvc metrics show` would produce - unexpected results if not constrained to a single project scope. +- Commands such as `dvc status` and `dvc metrics show` would produce unexpected + results if not constrained to a single project scope. #### How does it affect DVC commands? The project root is found by DVC by looking for `.dvc/` from the current working directory, up. It defines the scope of action for most DVC -commands (e.g. `dvc repro`, `dvc pull`, `dvc metrics diff`), meaning that only -`dvc.yaml`, `.dvc` files, etc. inside the project are usable by the commands. +commands (e.g. `dvc repro`, `dvc pull`, `dvc metrics diff`, etc.) meaning that +only `dvc.yaml`, `.dvc` files, etc. inside the project are usable by the +commands. -With `--subdir`, the project root will be found before the Git root, making sure -the scope of DVC commands run here is constrained to this project alone, even if -there are more DVC-related files elsewhere in the repo. +With `--subdir`, the project root will be found before the Git root, causing the +scope of DVC commands run here to be constrained to this project alone. If there are multiple `--subdir` projects, but not nested, e.g.: