-
Notifications
You must be signed in to change notification settings - Fork 394
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Remove remote details from the command reference
- Loading branch information
1 parent
6ed975b
commit 2f2f9cf
Showing
2 changed files
with
0 additions
and
524 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -83,274 +83,6 @@ Use `dvc config` to unset/change the default remote as so: | |
|
||
- `-v`, `--verbose` - displays detailed tracing information. | ||
|
||
## Supported storage types | ||
|
||
These are the possible remote storage (protocols) DVC can work with: | ||
|
||
<details> | ||
|
||
### Click for local remote | ||
|
||
A "local remote" is a directory in the machine's file system. | ||
|
||
> While the term may seem contradictory, it doesn't have to be. The "local" part | ||
> refers to the machine where the project is stored, so it can be any directory | ||
> accessible to the same system. The "remote" part refers specifically to the | ||
> project/repository itself. | ||
Using an absolute path (recommended): | ||
|
||
```dvc | ||
$ dvc remote add myremote /tmp/my-dvc-storage | ||
$ cat .dvc/config | ||
... | ||
['remote "myremote"'] | ||
url = /tmp/my-dvc-storage | ||
... | ||
``` | ||
|
||
> Note that the absolute path `/tmp/my-dvc-storage` is saved as is. | ||
Using a relative path: | ||
|
||
```dvc | ||
$ dvc remote add myremote ../my-dvc-storage | ||
$ cat .dvc/config | ||
... | ||
['remote "myremote"'] | ||
url = ../../my-dvc-storage | ||
... | ||
``` | ||
|
||
> Note that `../my-dvc-storage` has been resolved relative to the `.dvc/` dir, | ||
> resulting in `../../my-dvc-storage`. | ||
</details> | ||
|
||
<details> | ||
|
||
### Click for Amazon S3 | ||
|
||
> **Note!** Before adding a new remote be sure to login into AWS services and | ||
> follow instructions at | ||
> [Create a Bucket](https://docs.aws.amazon.com/AmazonS3/latest/gsg/CreatingABucket.html) | ||
> to create your bucket. | ||
```dvc | ||
$ dvc remote add myremote s3://bucket/path | ||
``` | ||
|
||
By default DVC expects your AWS CLI is already | ||
[configured](https://docs.aws.amazon.com/cli/latest/userguide/cli-chap-configure.html). | ||
DVC will be using default AWS credentials file to access S3. To override some of | ||
these settings, use the options described in `dvc remote modify`. | ||
|
||
We use the `boto3` library to communicate with AWS. The following API methods | ||
are performed: | ||
|
||
- `list_objects_v2`, `list_objects` | ||
- `head_object` | ||
- `download_file` | ||
- `upload_file` | ||
- `delete_object` | ||
- `copy` | ||
|
||
So, make sure you have the following permissions enabled: | ||
|
||
- `s3:ListBucket` | ||
- `s3:GetObject` | ||
- `s3:PutObject` | ||
- `s3:DeleteObject` | ||
|
||
</details> | ||
|
||
<details> | ||
|
||
### Click for S3 API compatible storage | ||
|
||
To communicate with a remote object storage that supports an S3 compatible API | ||
(e.g. [Minio](https://min.io/), | ||
[DigitalOcean Spaces](https://www.digitalocean.com/products/spaces/), | ||
[IBM Cloud Object Storage](https://www.ibm.com/cloud/object-storage) etc.) you | ||
must explicitly set the `endpointurl` in the configuration: | ||
|
||
For example: | ||
|
||
```dvc | ||
$ dvc remote add -d myremote s3://mybucket/path/to/dir | ||
$ dvc remote modify myremote endpointurl https://object-storage.example.com | ||
``` | ||
|
||
S3 remotes can also be configured entirely via environment variables: | ||
|
||
```dvc | ||
$ export AWS_ACCESS_KEY_ID="<my-access-key>" | ||
$ export AWS_SECRET_ACCESS_KEY="<my-secret-key>" | ||
$ dvc remote add myremote "s3://bucket/myremote" | ||
``` | ||
|
||
For more information about the variables DVC supports, please visit | ||
[boto3 documentation](https://boto3.amazonaws.com/v1/documentation/api/latest/guide/configuration.html#environment-variable-configuration) | ||
|
||
</details> | ||
|
||
<details> | ||
|
||
### Click for Azure | ||
|
||
```dvc | ||
$ dvc remote add myremote azure://my-container-name/path | ||
$ dvc remote modify myremote connection_string my-connection-string --local | ||
``` | ||
|
||
> The connection string contains access to data and is inserted into the | ||
> `.dvc/config` file. Therefore, it is safer to add the connection string with | ||
> the `--local` option, enforcing it to be written to a Git-ignored config file. | ||
The Azure Blob Storage remote can also be configured entirely via environment | ||
variables: | ||
|
||
```dvc | ||
$ export AZURE_STORAGE_CONNECTION_STRING="<my-connection-string>" | ||
$ export AZURE_STORAGE_CONTAINER_NAME="my-container-name" | ||
$ dvc remote add myremote "azure://" | ||
``` | ||
|
||
> For more information on configuring Azure Storage connection strings, visit | ||
> [here](https://docs.microsoft.com/en-us/azure/storage/common/storage-configure-connection-string). | ||
- `connection string` - this is the connection string to access your Azure | ||
Storage Account. If you don't already have a storage account, you can create | ||
one following | ||
[these instructions](https://docs.microsoft.com/en-us/azure/storage/common/storage-create-storage-account). | ||
The connection string can be found in the "Access Keys" pane of your Storage | ||
Account resource in the Azure portal. | ||
|
||
- `container name` - this is the top-level container in your Azure Storage | ||
Account under which all the files for this remote will be uploaded. If the | ||
container doesn't already exist, it will be created automatically. | ||
|
||
</details> | ||
|
||
<details> | ||
|
||
### Click for Google Cloud Storage | ||
|
||
```dvc | ||
$ dvc remote add myremote gs://bucket/path | ||
``` | ||
|
||
</details> | ||
|
||
<details> | ||
|
||
### Click for SSH | ||
|
||
```dvc | ||
$ dvc remote add myremote ssh://[email protected]/path/to/dir | ||
``` | ||
|
||
> **Note!** DVC requires both SSH and SFTP access to work with SSH remote | ||
> storage. Please check that you are able to connect to the remote location with | ||
> tools like `ssh` and `sftp` (GNU/Linux). | ||
<!-- Separate MD quote: --> | ||
|
||
> Note that your server's SFTP root might differ from its physical root (`/`). | ||
> (On Linux, see the `ChrootDirectory` config option in `/etc/ssh/sshd_config`.) | ||
> In these cases, the path component in the SSH URL (e.g. `/path/to/dir` above) | ||
> should be specified relative to the SFTP root instead. For example, on some | ||
> Sinology NAS drives, the SFTP root might be in directory `/volume1`, in which | ||
> case you should use path `/path/to/dir` instead of `/volume1/path/to/dir`. | ||
</details> | ||
|
||
<details> | ||
|
||
### Click for HDFS | ||
|
||
```dvc | ||
$ dvc remote add myremote hdfs://[email protected]/path/to/dir | ||
``` | ||
|
||
> **Note!** If you are seeing `Unable to load libjvm` error on ubuntu with | ||
> openjdk-8, try setting JAVA_HOME env variable. This issue is solved in the | ||
> [upstream version of pyarrow](https://github.com/apache/arrow/pull/4907) and | ||
> the fix will be included into the next pyarrow release. | ||
</details> | ||
|
||
<details> | ||
|
||
### Click for HTTP | ||
|
||
> **Note!** Currently HTTP remotes only support downloads operations: | ||
> | ||
> - `pull` and `fetch` | ||
> - `import-url` and `get-url` | ||
> - As an [external dependency](/doc/user-guide/external-data/http) | ||
```dvc | ||
$ dvc remote add myremote https://example.com/path/to/dir | ||
``` | ||
|
||
</details> | ||
|
||
<details> | ||
|
||
### Click for Aliyun OSS | ||
|
||
First you need to setup OSS storage on Aliyun Cloud and then use an S3 style URL | ||
for OSS storage and make the endpoint value configurable. An example is shown | ||
below: | ||
|
||
```dvc | ||
$ dvc remote add myremote oss://my-bucket/path | ||
``` | ||
|
||
To set key id, key secret and endpoint you need to use `dvc remote modify`. | ||
Example usage is show below. Make sure to use the `--local` option to avoid | ||
committing your secrets into Git: | ||
|
||
```dvc | ||
$ dvc remote modify myremote --local oss_key_id my-key-id | ||
$ dvc remote modify myremote --local oss_key_secret my-key-secret | ||
$ dvc remote modify myremote oss_endpoint endpoint | ||
``` | ||
|
||
You can also set environment variables and use them later, to set environment | ||
variables use following environment variables: | ||
|
||
```dvc | ||
$ export OSS_ACCESS_KEY_ID="my-key-id" | ||
$ export OSS_ACCESS_KEY_SECRET="my-key-secret" | ||
$ export OSS_ENDPOINT="endpoint" | ||
``` | ||
|
||
#### Test your OSS storage using docker | ||
|
||
Start a container running an OSS emulator. | ||
|
||
```dvc | ||
$ git clone https://github.com/nanaya-tachibana/oss-emulator.git | ||
$ docker image build -t oss:1.0 oss-emulator | ||
$ docker run --detach -p 8880:8880 --name oss-emulator oss:1.0 | ||
``` | ||
|
||
Setup environment variables. | ||
|
||
```dvc | ||
$ export OSS_BUCKET='my-bucket' | ||
$ export OSS_ENDPOINT='localhost:8880' | ||
$ export OSS_ACCESS_KEY_ID='AccessKeyID' | ||
$ export OSS_ACCESS_KEY_SECRET='AccessKeySecret' | ||
``` | ||
|
||
> Uses default key id and key secret when they are not given, which gives read | ||
> access to public read bucket and public bucket. | ||
</details> | ||
|
||
## Example: Custom configuration of an S3 remote | ||
|
||
Add an Amazon S3 remote as the _default_ (via `-d` option), and modify its | ||
|
Oops, something went wrong.