Remote executors
Note: This is documentation for an experimental feature that is under active development. It should not be used in production environments.
dvc machine provides a set of DVC commands for provisioning and managing remote machines, which will eventually be used for executing DVC experiments.
Currently, the dvc machine implementation uses https://github.com/iterative/terraform-provider-iterative and requires the terraform client to be installed and available in your PATH.
- (Optional) Download & install the terraform client for your platform.
- (Optional) Install the latest tpi from master (pip install -e).
- Install the DVC deps (preferably using pip install -e from master): pip install dvc[terraform]
  - This will install tpi from PyPI if you did not already install it from source.
  Note: If you do not install a terraform client yourself, it will be downloaded and installed for you (via tpi).
- Enable the dvc machine feature (either per-repo or globally): dvc config [--global] feature.machine true
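The installation steps above can be sketched as a small shell function. This is only an illustration: the function name install_dvc_machine and the DVC and PIP override variables are hypothetical conveniences (they allow a dry run with echo), not part of DVC.

```shell
# Hypothetical overrides: set DVC=echo PIP=echo to print the commands
# instead of running them.
DVC="${DVC:-dvc}"
PIP="${PIP:-pip}"

install_dvc_machine() {
    # Optional first argument: pass --global to enable the feature globally.
    scope="${1:-}"

    # terraform is optional here; tpi is expected to fetch it when missing.
    if ! command -v terraform >/dev/null 2>&1; then
        echo "terraform not found in PATH; tpi is expected to download it"
    fi

    # Install DVC together with its terraform extras (pulls in tpi from PyPI).
    $PIP install "dvc[terraform]"

    # Enable the experimental feature (per-repo unless --global was given).
    $DVC config $scope feature.machine true
}
```

Calling install_dvc_machine enables the feature for the current repo; install_dvc_machine --global enables it for all repos.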
Machines are configured similarly to DVC remotes, and configuration usage generally mirrors dvc remote add/modify/remove.
- dvc machine add - adds a machine to your repo configuration (note that no machine instance will actually be created until dvc machine create is run).
- dvc machine modify - modifies the configuration of an existing machine. For a full list of available options, refer to the documentation at https://github.com/iterative/terraform-provider-iterative#machine
- dvc machine list - lists the configuration of one or all machines.
- dvc machine remove - removes a machine from your repo configuration (note that any running machine instances should be destroyed with dvc machine destroy before removing the machine from your repo configuration).
- dvc machine rename - renames a machine; this also affects the instances related to this machine.
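As a rough sketch of the configuration commands above, the function below registers a machine and points it at a setup script. The argument order (name then cloud for add, name then option then value for modify) is an assumption based on the statement that usage mirrors dvc remote; check dvc machine --help for the exact syntax. The function name configure_machine and the DVC override variable are hypothetical.

```shell
# Hypothetical override: set DVC=echo to print the commands instead of running them.
DVC="${DVC:-dvc}"

configure_machine() {
    name="$1"     # machine name, e.g. aws-test
    cloud="$2"    # cloud provider, e.g. aws
    script="$3"   # path to the setup script, e.g. ../setup.sh

    # Only writes configuration; no instance is created at this point.
    $DVC machine add "$name" "$cloud"

    # Assumed to follow the same "name option value" shape as dvc remote modify.
    $DVC machine modify "$name" setup_script "$script"

    # Show the resulting configuration for this machine.
    $DVC machine list "$name"
}
```

For example, configure_machine aws-test aws ../setup.sh should produce a configuration like the example .dvc/config shown on this page.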
- dvc machine create - creates and starts an instance of a configured machine.
- dvc machine status - lists the running status of the instances of one specified machine or of all machines.
- dvc machine destroy - stops and destroys a previously created machine instance.
- dvc machine ssh - connects to a machine via SSH.
  - Your default ssh client will be used if available in your PATH.
  - Otherwise, a limited-functionality client session will be provided via asyncssh.
    - Note that interactive programs (particularly line editors like vi) may not work as expected when run in this shell session.
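A minimal sketch of checking on a running instance and then connecting to it, assuming both commands take the machine name as their only argument. The function name inspect_machine and the DVC override variable are hypothetical; the ssh check only reports which kind of session to expect, based on the description above.

```shell
# Hypothetical override: set DVC=echo to print the commands instead of running them.
DVC="${DVC:-dvc}"

inspect_machine() {
    name="$1"

    # Report which kind of SSH session is likely, per the notes above.
    if command -v ssh >/dev/null 2>&1; then
        echo "ssh client found in PATH; it should be used for the session"
    else
        echo "no ssh client in PATH; expect the limited asyncssh session"
    fi

    # Show the running status of this machine's instances.
    $DVC machine status "$name"

    # Open a shell on the instance.
    $DVC machine ssh "$name"
}
```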
- Very basic exp execution can be done over SSH via dvc exp run --machine <machine_name> (see also: https://github.com/iterative/dvc/pull/7173).
- The runtime execution environment for the remote machine can be configured via the setup_script machine configuration option. setup_script should be a shell script, and will be sourced from the root of the user's Git repository prior to running an experiment (i.e. it is sourced before executing dvc exp run). Note that this is separate from the startup_script terraform configuration, which is executed at boot time and meant for installing system packages.
- Detached/unattended execution is not currently supported; killing or interrupting the dvc exp run --machine command will also terminate the exp execution on the remote machine.
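The distinction between sourcing and executing matters because only a sourced script can change the environment of the shell that later runs the experiment (for example, by activating a virtualenv). The following self-contained demonstration illustrates the general shell behavior; it is not DVC's actual implementation, and the variable name DEMO_ENV is made up for the example.

```shell
# Create a throwaway script that sets an environment variable.
workdir="$(mktemp -d)"
cat > "$workdir/setup.sh" <<'EOF'
export DEMO_ENV="activated"
EOF

unset DEMO_ENV

# Executing the script runs it in a child process, so the variable
# never reaches the calling shell.
sh "$workdir/setup.sh"
executed_result="${DEMO_ENV:-unset}"

# Sourcing the script runs it in the current shell, so the variable
# is visible to every command that follows.
. "$workdir/setup.sh"
sourced_result="${DEMO_ENV:-unset}"

echo "after executing: $executed_result"
echo "after sourcing:  $sourced_result"

rm -rf "$workdir"
```

This is why a setup_script can activate a virtualenv and have it remain active for the experiment that follows.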
Example .dvc/config:
['machine "aws-test"']
cloud = aws
setup_script = ../setup.sh
Example setup.sh:
#!/bin/bash
python3.9 -m venv .venv
source .venv/bin/activate
pip install -U pip
pip install -r src/requirements.txt
To run on a remote machine:
$ dvc machine create aws-test
$ dvc exp run --machine aws-test
$ dvc machine destroy aws-test
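Since a created instance keeps running until it is destroyed, it can help to wrap the three commands so that the instance is destroyed even when the experiment fails. The wrapper below is a sketch under that assumption; the function name run_on_machine and the DVC override variable are hypothetical. It does not handle the wrapper itself being killed, in which case dvc machine destroy still has to be run by hand.

```shell
# Hypothetical override: set DVC=echo to print the commands instead of running them.
DVC="${DVC:-dvc}"

run_on_machine() {
    machine="$1"

    # Nothing to clean up if the instance could not be created.
    $DVC machine create "$machine" || return 1

    # Run the experiment and remember its exit status instead of stopping here.
    status=0
    $DVC exp run --machine "$machine" || status=$?

    # Always tear the instance down, whether or not the experiment succeeded.
    $DVC machine destroy "$machine"

    return "$status"
}
```

Usage: run_on_machine aws-test. The function returns the exit status of dvc exp run, so it can be used in scripts that need to detect a failed experiment.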