docs: misc updates #1669

Merged
merged 6 commits into from
Aug 6, 2020
2 changes: 1 addition & 1 deletion content/docs/command-reference/get.md
@@ -39,7 +39,7 @@ downloading, DVC will try to copy the target data from its <abbr>cache</abbr>).
The `path` argument is used to specify the location of the target to download
within the source repository at `url`. `path` can specify any file or directory
in the source repo, either tracked by DVC (including paths inside tracked
directories), or by Git. Note that DVC-tracked targets should be found in a
directories) or by Git. Note that DVC-tracked targets should be found in a
`dvc.yaml` or `.dvc` file of the project.

⚠️ The project should have a default
2 changes: 1 addition & 1 deletion content/docs/command-reference/import.md
@@ -42,7 +42,7 @@ downloading, DVC will try to copy the target data from its <abbr>cache</abbr>).
The `path` argument is used to specify the location of the target to download
within the source repository at `url`. `path` can specify any file or directory
in the source repo, either tracked by DVC (including paths inside tracked
directories), or by Git. Note that DVC-tracked targets should be found in a
directories) or by Git. Note that DVC-tracked targets should be found in a
`dvc.yaml` or `.dvc` file of the project.

⚠️ The project should have a default
38 changes: 19 additions & 19 deletions content/docs/command-reference/init.md
@@ -32,45 +32,45 @@ advanced scenarios:
### Initializing DVC in subdirectories

`--subdir` must be provided to initialize DVC in a subdirectory of a Git
repository. DVC still expects to find the Git repository (will check all
directories up to the system root to find `.git/`). This option does not affect
any config files, `.dvc/` directory is created the same way as in the default
mode. This way multiple <abbr>DVC projects</abbr> can be initialized in a single
Git repository, providing isolation between projects.
repository. DVC still expects to find a Git root (will check all directories up
to the system root to find `.git/`). This option does not affect any config
files; the `.dvc/` directory is created the same way as in the default mode. This
way multiple <abbr>DVC projects</abbr> can be initialized in a single Git
repository, providing isolation between projects.

#### When is this useful?

This option is mostly useful in the scenario of a
[monorepo](https://en.wikipedia.org/wiki/Monorepo) (Git repository split into
several project directories), but can also be used with other patterns when such
isolation is needed. `dvc init --subdir` mitigates the issues of initializing
DVC in the Git repo root:
[monorepo](https://en.wikipedia.org/wiki/Monorepo) (Git repo split into several
project directories), but can also be used with other patterns when such
isolation is needed. `dvc init --subdir` mitigates possible limitations of
initializing DVC in the Git repo root:

- Repository maintainers might not allow a top level `.dvc/` directory,
especially if DVC is being used by several sub-projects (monorepo).
especially if DVC is already being used by several sub-projects (monorepo).

- DVC [internals](/doc/user-guide/dvc-files-and-directories) (config file, cache
directory, etc.) are shared across different sub-projects. This forces all of
them to use the same DVC settings and
directory, etc.) would be shared across different subdirectories. This forces
all of them to use the same DVC settings and
[remote storage](/doc/command-reference/remote).

- By default, DVC commands like `dvc pull` and `dvc repro` explore the whole
<abbr>DVC repository</abbr> to find DVC-tracked data and pipelines to work
with. This can be inefficient for large monorepos.

- Other commands such as `dvc status` and `dvc metrics show` would produce
unexpected results if not constrained to a single project scope.
- Commands such as `dvc status` and `dvc metrics show` would produce unexpected
results if not constrained to a single project scope.

#### How does it affect DVC commands?

The <abbr>project</abbr> root is found by DVC by looking for `.dvc/` from the
current working directory, up. It defines the scope of action for most DVC
commands (e.g. `dvc repro`, `dvc pull`, `dvc metrics diff`), meaning that only
`dvc.yaml`, `.dvc` files, etc. inside the project are usable by the commands.
commands (e.g. `dvc repro`, `dvc pull`, `dvc metrics diff`, etc.) meaning that
only `dvc.yaml`, `.dvc` files, etc. inside the project are usable by the
commands.

With `--subdir`, the project root will be found before the Git root, making sure
the scope of DVC commands run here is constrained to this project alone, even if
there are more DVC-related files elsewhere in the repo.
With `--subdir`, the project root will be found before the Git root, causing the
scope of DVC commands run here to be constrained to this project alone.
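
The upward search described above can be illustrated with a minimal Python sketch. This is not DVC's actual implementation; the function name `find_project_root` is hypothetical, and the only assumption taken from the text is that the project root is the nearest directory, starting at the working directory and moving up, that contains `.dvc/`.

```python
from pathlib import Path
from typing import Optional


def find_project_root(start: Path) -> Optional[Path]:
    """Return the nearest directory, from `start` upward, that contains `.dvc/`.

    Hypothetical helper for illustration only; DVC's real lookup may differ.
    """
    start = start.resolve()
    # Check the starting directory first, then each parent up to the system root.
    for candidate in (start, *start.parents):
        if (candidate / ".dvc").is_dir():
            return candidate
    return None  # no DVC project found on the way up
```

Because the search stops at the first match, a `.dvc/` directory created with `--subdir` is reached before any `.dvc/` at the Git root, which is what keeps commands scoped to the subproject.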

If there are multiple `--subdir` projects, but not nested, e.g.:

7 changes: 4 additions & 3 deletions content/docs/command-reference/remote/add.md
@@ -24,8 +24,8 @@ or even a directory in the local file system. (See all the supported remote
storage types in the examples below.) If `url` is a relative path, it will be
resolved against the current working directory, but saved **relative to the
config file location** (see LOCAL example below). Whenever possible, DVC will
create a remote directory if it doesn't exists yet. (It won't create an S3
bucket though, and will rely on default access settings.)
create a remote directory if it doesn't exist yet. (It won't create an S3 bucket
though, and will rely on default access settings.)
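
As a rough illustration of the relative-path rule above, the following Python sketch resolves a relative `url` against the working directory and then rewrites it relative to the config file's directory. It is not DVC's code: the function name `url_as_saved_in_config` and the example paths are assumptions made for this sketch.

```python
import os


def url_as_saved_in_config(url: str, cwd: str, config_dir: str) -> str:
    """Sketch of how a relative local `url` could end up stored in the config.

    Hypothetical helper: resolves `url` against `cwd`, then expresses the
    result relative to `config_dir` (the directory holding the config file).
    """
    if os.path.isabs(url):
        return url  # absolute paths are kept unchanged in this sketch
    absolute = os.path.normpath(os.path.join(cwd, url))
    return os.path.relpath(absolute, start=config_dir)


# Example (assumed layout): run from /repo/src with the config in /repo/.dvc
print(url_as_saved_in_config("../storage", "/repo/src", "/repo/.dvc"))
```

The point of the rewrite is that the stored value stays valid regardless of which directory later commands are run from.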

> If you installed DVC via `pip` and plan to use cloud services as remote
> storage, you might need to install these optional dependencies: `[s3]`,
@@ -131,7 +131,8 @@ For example:

```dvc
$ dvc remote add -d myremote s3://mybucket/path/to/dir
$ dvc remote modify myremote endpointurl https://object-storage.example.com
$ dvc remote modify myremote endpointurl \
https://object-storage.example.com
```

> See `dvc remote modify` for a full list of S3 API parameters.
86 changes: 48 additions & 38 deletions content/docs/command-reference/remote/modify.md
@@ -90,19 +90,19 @@ these settings, you could use the following options:
$ dvc remote modify myremote region us-east-2
```

- `profile` - credentials profile name to use to access S3:
- `profile` - credentials profile name to access S3:

```dvc
$ dvc remote modify myremote profile myprofile
```

- `credentialpath` - credentials path to use to access S3:
- `credentialpath` - credentials path to access S3:

```dvc
$ dvc remote modify myremote credentialpath /path/to/my/creds
```

- `endpointurl` - endpoint URL to use to access S3:
- `endpointurl` - endpoint URL to access S3:

```dvc
$ dvc remote modify myremote endpointurl https://myendpoint.com
@@ -168,29 +168,33 @@ these settings, you could use the following options:
for specific grantees\*\*. Grantee can read object and its metadata.

```dvc
$ dvc remote modify myremote grant_read id=aws-canonical-user-id,id=another-aws-canonical-user-id
$ dvc remote modify myremote grant_read \
id=aws-canonical-user-id,id=another-aws-canonical-user-id
```

- `grant_read_acp`\* - grants `READ_ACP` permissions at object level access
control list for specific grantees\*\*. Grantee can read the object's ACP.

```dvc
$ dvc remote modify myremote grant_read_acp id=aws-canonical-user-id,id=another-aws-canonical-user-id
$ dvc remote modify myremote grant_read_acp \
id=aws-canonical-user-id,id=another-aws-canonical-user-id
```

- `grant_write_acp`\* - grants `WRITE_ACP` permissions at object level access
control list for specific grantees\*\*. Grantee can modify the object's ACP.

```dvc
$ dvc remote modify myremote grant_write_acp id=aws-canonical-user-id,id=another-aws-canonical-user-id
$ dvc remote modify myremote grant_write_acp \
id=aws-canonical-user-id,id=another-aws-canonical-user-id
```

- `grant_full_control`\* - grants `FULL_CONTROL` permissions at object level
access control list for specific grantees\*\*. Equivalent of grant_read +
grant_read_acp + grant_write_acp

```dvc
$ dvc remote modify myremote grant_full_control id=aws-canonical-user-id,id=another-aws-canonical-user-id
$ dvc remote modify myremote grant_full_control \
id=aws-canonical-user-id,id=another-aws-canonical-user-id
```

> \* `grant_read`, `grant_read_acp`, `grant_write_acp` and
@@ -221,7 +225,8 @@ For example:

```dvc
$ dvc remote add myremote s3://path/to/dir
$ dvc remote modify myremote endpointurl https://object-storage.example.com
$ dvc remote modify myremote endpointurl \
https://object-storage.example.com
```

S3 remotes can also be configured entirely via environment variables:
@@ -250,7 +255,8 @@ For more information about the variables DVC supports, please visit
- `connection_string` - connection string.

```dvc
$ dvc remote modify --local myremote connection_string "my-connection-string"
$ dvc remote modify --local myremote connection_string \
"my-connection-string"
```

> The connection string contains sensitive user info. Therefore, it's safer to
@@ -274,8 +280,8 @@ a full guide on using Google Drive as DVC remote storage.
[possible formats](/doc/user-guide/setup-google-drive-remote#url-format).

```dvc
$ dvc remote modify myremote \
url gdrive://0AIac4JZqHhKmUk9PDA/dvcstore
$ dvc remote modify myremote url \
gdrive://0AIac4JZqHhKmUk9PDA/dvcstore
```

- `gdrive_client_id` - Client ID for authentication with OAuth 2.0 when using a
@@ -415,13 +421,13 @@ more information.
$ dvc remote modify myremote oss_endpoint endpoint
```

- `oss_key_id` - OSS key ID to use to access a remote.
- `oss_key_id` - OSS key ID to access the remote.

```dvc
$ dvc remote modify myremote --local oss_key_id my-key-id
```

- `oss_key_secret` - OSS secret key for authorizing access into a remote.
- `oss_key_secret` - OSS secret key for authorizing access into the remote.

```dvc
$ dvc remote modify myremote --local oss_key_secret my-key-secret
@@ -440,41 +446,43 @@ more information.
- `url` - remote location URL.

```dvc
$ dvc remote modify myremote url ssh://user@example.com:1234/path/to/remote
$ dvc remote modify myremote url \
ssh://user@example.com:1234/absolute/path
```

- `user` - username to use to access a remote. The order in which dvc searches
for username:

1. `user` specified in one of the dvc configs;
2. `user` specified in the url (e.g. `ssh://user@example.com/path`);
3. `user` specified in `~/.ssh/config` for remote host;
4. current user;
- `user` - username to access the remote.

```dvc
$ dvc remote modify --local myremote user myuser
```

- `port` - port to use to access a remote. The order in which dvc searches for
port:
The order in which DVC picks the username:

1. `port` specified in one of the dvc configs;
2. `port` specified in the url(e.g. `ssh://example.com:1234/path`);
3. `port` specified in `~/.ssh/config` for remote host;
4. default ssh port 22;
1. `user` parameter set with this command (found in `.dvc/config`);
2. User defined in the URL (e.g. `ssh://user@example.com/path`);
3. User defined in `~/.ssh/config` for this host (URL);
4. Current user

- `port` - port to access the remote.

```dvc
$ dvc remote modify myremote port 2222
```

- `keyfile` - path to private key to use to access a remote.
The order in which DVC decides the port number:

1. `port` parameter set with this command (found in `.dvc/config`);
2. Port defined in the URL (e.g. `ssh://example.com:1234/path`);
3. Port defined in `~/.ssh/config` for this host (URL);
4. Default SSH port 22
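
The precedence order listed above can be sketched in Python as a first-match lookup. This is an illustration under stated assumptions, not DVC's implementation: the function name `pick_port` and its parameters are hypothetical, and the value from `~/.ssh/config` is passed in rather than read from disk.

```python
from typing import Optional
from urllib.parse import urlsplit

DEFAULT_SSH_PORT = 22


def pick_port(
    url: str,
    config_port: Optional[int] = None,
    ssh_config_port: Optional[int] = None,
) -> int:
    """Pick the SSH port following the order described in the list above.

    Hypothetical helper: `config_port` stands for the `port` parameter in the
    DVC config, `ssh_config_port` for a value found in `~/.ssh/config`.
    """
    url_port = urlsplit(url).port  # None when the URL carries no port
    for candidate in (config_port, url_port, ssh_config_port):
        if candidate is not None:
            return candidate
    return DEFAULT_SSH_PORT


# The explicit config value wins over the port embedded in the URL.
print(pick_port("ssh://example.com:1234/path", config_port=2222))
```

The same first-match pattern would apply to the username lookup, with the current user as the final fallback.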

- `keyfile` - path to private key to access the remote.

```dvc
$ dvc remote modify myremote keyfile /path/to/keyfile
```

- `password` - a private key passphrase or a password to use to use when
accessing a remote.
- `password` - a private key passphrase or a password to access the remote.

```dvc
$ dvc remote modify --local myremote password mypassword
@@ -484,8 +492,8 @@ more information.
> safer to add them with the `--local` option, so they're written to a
> Git-ignored config file.

- `ask_password` - ask for a private key passphrase or a password to use when
accessing a remote.
- `ask_password` - ask for a private key passphrase or a password to access the
remote.

```dvc
$ dvc remote modify myremote ask_password true
@@ -509,7 +517,7 @@ more information.

### Click for HDFS

- `user` - username to use to access a remote.
- `user` - username to access the remote.

```dvc
$ dvc remote modify --local myremote user myuser
@@ -524,7 +532,7 @@ more information.

### Click for HTTP

- `auth` - authentication method to use when accessing a remote. The accepted
- `auth` - authentication method to use when accessing the remote. The accepted
values are:

- `basic` -
@@ -551,15 +559,17 @@ more information.
```

- `user` - username to use when the `auth` parameter is set to `basic` or
`digest`. The order in which DVC searches for username:

1. `user` specified in one of the DVC configs;
2. `user` specified in the url (e.g. `http://user@example.com/path`);
`digest`.

```dvc
$ dvc remote modify --local myremote user myuser
```

The order in which DVC picks the username:

1. `user` parameter set with this command (found in `.dvc/config`);
2. User defined in the URL (e.g. `http://user@example.com/path`);

- `password` - password to use for any `auth` method.

```dvc
2 changes: 1 addition & 1 deletion content/docs/user-guide/dvcignore.md
@@ -16,7 +16,7 @@ similar to `.gitignore` in Git.
- You need to create the `.dvcignore` file. It can be placed in the root of the
project or inside any subdirectory (see also [remarks](#Remarks) below).
- Populate it with [patterns](https://git-scm.com/docs/gitignore) that you would
like to ignore. You can find useful file templates
like to ignore. You can find useful templates
[here](https://github.com/github/gitignore).
- Each line should contain only one pattern.
- During execution of commands that traverse directories, DVC will ignore