Commit

Add doc for understanding cluster key and client machine movement (oa…
jerrychenhf authored Jul 4, 2022
1 parent 5272933 commit 0559afd
Showing 6 changed files with 118 additions and 8 deletions.
14 changes: 6 additions & 8 deletions docs/source/UserGuide/AdvancedTasks/submitting-jobs.md
@@ -1,6 +1,6 @@
# Submitting Jobs to Cluster

### Overview
## Overview
CloudTik provides the ability to easily submit scripts to run on your clusters. Currently, CloudTik supports submitting local script files and web script files.
The supported file types include:
- .sh: Shell scripts run by bash
@@ -24,8 +24,7 @@ Usage: cloudtik submit [OPTIONS] /path/to/your-cluster-config.yaml $YOUR_SCRIPT
| --no-config-cache |Disable the local cluster config cache.|



### Specifying additional arguments for the job
## Specifying additional arguments for the job

You can specify additional arguments when submitting a job.
For example, suppose you have a Python file **experiment.py** to submit, and `--smoke-test` is an option for experiment.py. The command is as follows:
@@ -34,7 +33,7 @@ For example, the user has a python file **experiment.py** to submit, and `--smok
```
The script file will be automatically uploaded to the path "~/jobs/" on the head. It will then run on the head with the interpreter selected according to the script type.

### Script arguments quote or escaping
## Script arguments quote or escaping
If the parameters for your script contain special characters such as ***|,\\***, or
if you need environment variable substitution in the parameters, you must quote or escape those parameters.
These cases need to be handled differently, as follows:
@@ -72,14 +71,14 @@ or
"\$abc"
```

### Submitting job running in background
## Submitting job running in background

Sometimes your network connection to the cluster may be unstable, and CloudTik may be disconnected from the remote cluster while a job is running.
Or you may need to run a long task and only want to check the output partway through or after the job has finished.
To handle such scenarios, we provide the options `--screen` and `--tmux` to run jobs in the background.
**[Screen](https://www.gnu.org/software/screen/manual/screen.html)** and **[tmux](https://github.com/tmux/tmux/wiki/Getting-Started)** are the most popular terminal multiplexers; you can choose either according to your needs.

#### Using screen
### Using screen
Submitting a job:
```bash
cloudtik submit --screen /path/to/your-cluster-config.yaml experiment.py
@@ -99,8 +98,7 @@ Checking background job:

```


#### Using tmux
### Using tmux
Submitting a job:
```bash
cloudtik submit --tmux /path/to/your-cluster-config.yaml experiment.py
25 changes: 25 additions & 0 deletions docs/source/UserGuide/AdvancedTasks/switching-client-machine.md
@@ -0,0 +1,25 @@
# Switching Client Machine
Sometimes, you may need to work on another client machine (working machine).
This guide covers the key steps when you have created a cluster from one client machine
and need to switch to another client machine to operate the cluster through the CloudTik CLI.

## Moving the configuration files
You need to move the cluster configuration file to the new client machine
so that you can issue CloudTik CLI commands to the cluster by specifying the same cluster configuration file.

If you want to manage the workspace on the new client machine, back up and move the workspace configuration file as well.

## Moving the cluster private key file
The cluster is accessed using the cluster private key file.
If you don't know where the private key file is located, you can execute:

```bash
cloudtik info your-cluster-config.yaml
```
The command will show where the cluster private key is located.

You need to move the corresponding private key file and, for Azure, also the public key file to the new client machine.
The file names of the private and public keys may be associated with the key pair name on the Cloud side (for AWS and GCP).
Please don't change the file names of the private or public key files when you move them.

For a more detailed description of the cluster key files, refer to [Understanding Cluster Key](./understanding-cluster-key.md).
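
After moving the files, it can help to confirm that the key files named in the cluster configuration actually exist on the new client machine. The following is a minimal sketch, not part of CloudTik: `missing_key_files` is a hypothetical helper, and it assumes the configuration has already been loaded into a dictionary whose `auth` section uses the `ssh_private_key` and `ssh_public_key` keys.

```python
import os


def missing_key_files(config):
    """Return the key file paths named in config["auth"] that do not exist locally.

    Hypothetical helper for illustration only; it is not a CloudTik API.
    """
    auth = config.get("auth", {})
    missing = []
    for option in ("ssh_private_key", "ssh_public_key"):
        path = auth.get(option)
        # Expand "~" so that paths such as ~/.ssh/key.pem are resolved.
        if path and not os.path.isfile(os.path.expanduser(path)):
            missing.append(path)
    return missing
```

An empty list means every key file referenced by the configuration was found. A configuration without explicit key options also returns an empty list, so this check only covers explicitly configured keys.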
81 changes: 81 additions & 0 deletions docs/source/UserGuide/AdvancedTasks/understanding-cluster-key.md
@@ -0,0 +1,81 @@
# Understanding Cluster Key
The cluster is protected by the cluster key files (a private key file and, optionally, a public key file).

The public key is used when creating VM instances. Each Cloud provider has a different way of specifying
the public key when creating a VM instance.
For AWS, an AWS key pair associates a name with the public and private keys,
and the key pair name is specified when creating a VM instance.
For Azure, the public key is specified directly when creating a VM instance.
For GCP, we can create a project-wide SSH key that associates a USERNAME with its public key.
All VM instances in the same project can then be accessed using the private key of the project-wide
SSH key.

The private key is simpler. For all Cloud providers, the CloudTik CLI needs the cluster private key file to connect to the cluster
and issue management commands.

## The location of cluster key files
For a running cluster, if you don't know where the cluster key files are located,
you can execute the info command:

```bash
cloudtik info your-cluster-config.yaml
```
The command will show where the cluster private key file and, if one is configured, the public key file are located.

## AWS cluster key

### Implicit cluster private key
If you don't specify a private key through ssh_private_key in the auth section of the cluster configuration file,
CloudTik will try to find an existing AWS key pair whose name matches the pattern cloudtik_{index}_{region}
and check whether the private key file of that key pair, ~/.ssh/cloudtik_{index}_{region}.pem, exists.
If both are found, the cluster will be created with that key pair and the private key file at ~/.ssh/cloudtik_{index}_{region}.pem.

Otherwise, CloudTik will create a new key pair named with the pattern cloudtik_{index}_{region}
and download the private key file of the key pair to ~/.ssh/cloudtik_{index}_{region}.pem.

If you create multiple clusters on the same client machine, CloudTik will use the same cluster private key as a result
of the above process. If you don't want this, you can explicitly specify a key pair name to use for the cluster
with the provider/key_pair/key_name configuration key in the cluster configuration file.
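
The local half of this lookup can be sketched as follows. This is an illustration under stated assumptions, not CloudTik's implementation: both function names are hypothetical, the range of indexes searched is an assumption, and the check for the key pair on the AWS side is omitted.

```python
import os


def aws_default_key_file(index, region, ssh_dir="~/.ssh"):
    """Build the private key path for the cloudtik_{index}_{region}.pem pattern."""
    key_name = "cloudtik_{}_{}".format(index, region)
    return os.path.join(os.path.expanduser(ssh_dir), key_name + ".pem")


def find_existing_key_file(region, indexes=range(1, 11), ssh_dir="~/.ssh"):
    """Return the first default key file that exists locally, or None.

    The default indexes (1 to 10) are an assumption made for this sketch;
    the actual starting index and limit are not stated in this document.
    """
    for index in indexes:
        path = aws_default_key_file(index, region, ssh_dir)
        if os.path.isfile(path):
            return path
    return None
```

For example, `find_existing_key_file("us-west-2")` returns the path of an existing `~/.ssh/cloudtik_{index}_us-west-2.pem` file if one is present for the searched indexes.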

### Explicit cluster private key
You can also explicitly specify a private key file in the auth section of the cluster configuration file.
In this case, the specified private key file will be used.
You also need to explicitly specify the AWS key pair name for the private key through the 'KeyName' configuration key
in the 'node_config' of each node type defined in 'available_node_types'.

In this case, CloudTik will not try to create any key pairs on AWS.
You need to make sure that the key pair exists on AWS and that the private key file contains the corresponding private key material.
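
The requirement that every node type carries a 'KeyName' can be checked with a small sketch like the one below. It assumes the configuration is already loaded into a dictionary; `node_types_missing_key_name` is a hypothetical helper, and the node type names, key pair name, and key path in the example are made up for illustration.

```python
def node_types_missing_key_name(config):
    """List node types that lack 'KeyName' when ssh_private_key is set explicitly."""
    if not config.get("auth", {}).get("ssh_private_key"):
        return []  # Implicit key: there is nothing to check.
    missing = []
    for name, node_type in config.get("available_node_types", {}).items():
        if "KeyName" not in node_type.get("node_config", {}):
            missing.append(name)
    return missing


# Illustrative configuration only; all values here are invented.
example_config = {
    "auth": {"ssh_private_key": "~/.ssh/my-key.pem"},
    "available_node_types": {
        "head.default": {"node_config": {"KeyName": "my-key"}},
        "worker.default": {"node_config": {}},
    },
}
print(node_types_missing_key_name(example_config))  # ['worker.default']
```

The sketch only checks that the option is present. Whether the named key pair exists on AWS and matches the private key file still has to be verified separately.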

## Azure cluster private key
Azure cluster key management is simple.
When creating the cluster, you need to generate a cluster RSA key pair
and set the ssh_private_key and ssh_public_key configuration keys
in the auth section to the locations of the private and public key files.

The public key file will be used to create VM instances, and the private key file
is used to access the VM instances through SSH.
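
One possible way to prepare the key pair and the matching auth section is sketched below. Both helpers are hypothetical, the 2048-bit key size and the empty passphrase are assumptions, and the sketch relies on the standard `ssh-keygen` behavior of writing the public key next to the private key with a `.pub` suffix.

```python
def ssh_keygen_command(private_key_file, bits=2048):
    """Build an ssh-keygen command that writes an RSA key pair with no passphrase."""
    return ["ssh-keygen", "-t", "rsa", "-b", str(bits),
            "-f", private_key_file, "-N", "", "-q"]


def azure_auth_section(private_key_file):
    """Return auth options that point at the generated private and public key files."""
    return {
        "ssh_private_key": private_key_file,
        "ssh_public_key": private_key_file + ".pub",
    }
```

The command list can be run with `subprocess.run(ssh_keygen_command(path), check=True)`. Pass an already expanded path, because "~" is not expanded when no shell is involved.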

## GCP cluster private key

### Implicit cluster private key
If you don't specify a private key through ssh_private_key in the auth section of the cluster configuration file,
CloudTik will try to find an existing GCP project-wide SSH key whose USERNAME equals ssh_user in the configuration file
and check whether the private and public key files of the key pair, ~/.ssh/cloudtik_gcp_{region}_{project_id}_{ssh_user}_{index}.pem
and ~/.ssh/cloudtik_gcp_{region}_{project_id}_{ssh_user}_{index}.pub, exist locally.
If found, the cluster will be created with that SSH key, using the private key file at ~/.ssh/cloudtik_gcp_{region}_{project_id}_{ssh_user}_{index}.pem
for cluster access.

Otherwise, CloudTik will generate an RSA key pair,
create a GCP project-wide SSH key with USERNAME equal to ssh_user and the public key of the generated RSA key pair,
and save the private key file of the key pair to ~/.ssh/cloudtik_gcp_{region}_{project_id}_{ssh_user}_{index}.pem and
the public key file to ~/.ssh/cloudtik_gcp_{region}_{project_id}_{ssh_user}_{index}.pub.

If you create multiple clusters on the same client machine with the same ssh_user, CloudTik will use the same cluster private key as a result
of the above process.
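
To see which local files this naming pattern refers to, the paths can be derived as in the sketch below. `gcp_default_key_files` is a hypothetical helper written for this document. It hard-codes the `cloudtik` prefix shown in the pattern above and is not the function CloudTik uses internally.

```python
import os


def gcp_default_key_files(index, region, project_id, ssh_user, ssh_dir="~/.ssh"):
    """Return the (private, public) key file paths for the documented GCP pattern."""
    key_name = "cloudtik_gcp_{}_{}_{}_{}".format(region, project_id, ssh_user, index)
    base = os.path.join(os.path.expanduser(ssh_dir), key_name)
    return base + ".pem", base + ".pub"
```

Because region, project_id, and ssh_user are all part of the name, changing any of them leads to a different key file. Clusters that share all three end up with the same one.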

### Explicit cluster private key
You can also explicitly specify a private key file in the auth section of the cluster configuration file.
In this case, the specified private key file will be used.

In this case, CloudTik will not try to create any project-wide SSH key on GCP.
You need to make sure that a project-wide SSH key with USERNAME equal to ssh_user exists in the GCP project and that the private key file contains the corresponding private key material.
2 changes: 2 additions & 0 deletions docs/source/UserGuide/advanced-tasks.rst
@@ -9,3 +9,5 @@ Advanced Tasks
   AdvancedTasks/using-templates.md
   AdvancedTasks/submitting-jobs.md
   AdvancedTasks/using-for-on-premise.md
   AdvancedTasks/understanding-cluster-key.md
   AdvancedTasks/switching-client-machine.md
3 changes: 3 additions & 0 deletions python/cloudtik/core/_private/cluster/cluster_operator.py
@@ -1831,7 +1831,10 @@ def show_useful_commands(config_file: str,
    cli_logger.newline()
    with cli_logger.group("Key information:"):
        private_key_file = config["auth"].get("ssh_private_key", "")
        public_key_file = config["auth"].get("ssh_public_key")
        cli_logger.print("Cluster private key file: {}", private_key_file)
        if public_key_file is not None:
            cli_logger.print("Cluster public key file: {}", public_key_file)
        cli_logger.print("Please keep the cluster private key file safe.")

    cli_logger.newline()
1 change: 1 addition & 0 deletions python/cloudtik/providers/_private/gcp/config.py
@@ -72,6 +72,7 @@

GCP_MANAGED_STORAGE_GCS_BUCKET = "gcp.managed.storage.gcs.bucket"


def key_pair_name(i, region, project_id, ssh_user):
    """Returns the ith default gcp_key_pair_name."""
    key_name = "{}_gcp_{}_{}_{}_{}".format(GCP_RESOURCE_NAME_PREFIX, region, project_id, ssh_user,