first commit
blotus committed Aug 9, 2021
0 parents commit 6dbba4c
Showing 8 changed files with 636 additions and 0 deletions.
6 changes: 6 additions & 0 deletions .gitignore
@@ -0,0 +1,6 @@
jupyterhub*
__pycache__
build/
dist/
*-info
venv
81 changes: 81 additions & 0 deletions README.md
@@ -0,0 +1,81 @@
# ECSSpawner (WIP)

The ECSSpawner enables JupyterHub to spawn notebooks running on EC2 instances, in an ECS cluster.

It is designed to be flexible for the end-user:
- You can choose the region where your instance will run
- You can choose the instance type
- You can request spot instances
- Optionally, you can choose the Docker image to use. If not provided, it defaults to the image configured in the hub config
- Optionally, you can choose the size of the root block volume of the instance. If not provided, it defaults to the size of the volume of the AMI (30 GiB for the ECS-optimized AMIs)

Each notebook server runs on a dedicated EC2 instance, which is destroyed when the notebook server is shut down. This removes the need to manage the instances in the ECS cluster.
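
As a rough illustration of how the choices listed above could reach the spawner: JupyterHub passes submitted form data to a spawner as a mapping from field name to a list of strings. The sketch below turns such a mapping into spawn options. The function name `parse_spawn_form` is hypothetical, and the field names (`region`, `instance`, `spot`, `image`, `volume`) are taken from the form template in this commit; the actual parsing logic in the spawner may differ.

```python
def parse_spawn_form(formdata, default_image):
    """Turn raw form data (field name -> list of strings) into spawn options.

    Hypothetical helper, not the spawner's actual implementation.
    """
    def first(name, default=""):
        values = formdata.get(name) or [default]
        return values[0].strip()

    volume = first("volume")
    return {
        "region": first("region"),
        "instance": first("instance"),
        # An unchecked checkbox is absent from the submitted form.
        "spot": "spot" in formdata,
        # An empty image field falls back to the configured default.
        "image": first("image") or default_image,
        # None means: keep the root volume size of the AMI.
        "volume": int(volume) if volume else None,
    }


options = parse_spawn_form(
    {"region": ["eu-west-1"], "instance": ["t3.medium"], "image": [""], "volume": ["50"]},
    default_image="jupyter/datascience-notebook:notebook-6.4.0",
)
print(options)
```

Keeping the fallbacks in one place makes the "optional" fields explicit: an empty image or volume never reaches the code that launches the instance.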

## Configuration

| Name | Description | Required | Default |
| --- | --- | --- | --- |
| ECSSpawner.default_docker_image | Name of the Docker image to use for non-GPU instances | True | N/A |
| ECSSpawner.default_docker_image_gpu | Name of the Docker image to use for GPU instances | True | N/A |
| ECSSpawner.instance_role_arn | ARN of the role for the EC2 instance | True | N/A |
| ECSSpawner.ecs_cluster | Name of the ECS cluster in which to add the instance | False | default |
| ECSSpawner.key_pair_name | Name of a key pair to add to the instance | False | Not set |
| ECSSpawner.subnet_id | ID of the subnet to place the EC2 instances in. If not provided, the default subnet of the VPC is used | False | Default subnet of the default VPC |
| ECSSpawner.security_group_id | List of security groups to attach to the EC2 instances. If not provided, the default security group of the VPC is used | False | Default security group of the default VPC |
| ECSSpawner.use_public_ip | Use the public IP of the underlying EC2 instance to access the notebook (this requires a public IP to be auto-assigned to the instance). Mainly intended for cases where JupyterHub itself is not running in AWS. | False | False |
| ECSSpawner.ec2_ami | AMI to use for x86 instances | False | The latest ECS-optimized x86 AMI in the region |
| ECSSpawner.ec2_arm_ami | AMI to use for ARM instances | False | The latest ECS-optimized ARM AMI in the region (caution: AWS does not provide an ARM ECS AMI in all regions) |
| ECSSpawner.ec2_gpu_ami | AMI to use for GPU instances | False | The latest ECS-optimized GPU AMI in the region |


### Private images

Currently, only Docker Hub public images and ECR public/private images are supported.

ECR private image support relies on granting the correct permissions to the role attached to the instance through `ECSSpawner.instance_role_arn`.
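
For illustration, the permissions needed to pull a private ECR image usually look like the policy built below. This is a minimal sketch, not a policy shipped with this project: the helper name `ecr_pull_policy` and the repository ARN are placeholders, and the resulting JSON would still have to be attached to the instance role by whatever means you manage IAM.

```python
import json


def ecr_pull_policy(repository_arn):
    """Build an IAM policy document allowing image pulls from one ECR repository.

    Hypothetical helper; the repository ARN is supplied by the caller.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # The authorization token is account-wide and cannot be
                # scoped to a single repository.
                "Effect": "Allow",
                "Action": "ecr:GetAuthorizationToken",
                "Resource": "*",
            },
            {
                # Read-only actions needed to download image layers.
                "Effect": "Allow",
                "Action": [
                    "ecr:BatchCheckLayerAvailability",
                    "ecr:GetDownloadUrlForLayer",
                    "ecr:BatchGetImage",
                ],
                "Resource": repository_arn,
            },
        ],
    }


# Placeholder account ID and repository name.
policy = ecr_pull_policy("arn:aws:ecr:eu-west-1:123456789012:repository/my-notebook")
print(json.dumps(policy, indent=2))
```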

## Setup

On your JupyterHub server, run:

`pip install jupyterhub-ecs-spawner`

As EC2 instances are tied to the lifecycle of a notebook server, the data in them is ephemeral and will be lost when the server is stopped.

As such, it is strongly recommended to set up some kind of persistent storage for the notebooks, such as https://github.com/danielfrg/s3contents.

It is also recommended to set up https://github.com/jupyterhub/jupyterhub-idle-culler to cull idle servers and avoid runaway costs.
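
As a sketch of what the idle culler setup could look like, the snippet below builds the service definition and the matching role in the shape described by the jupyterhub-idle-culler documentation. The helper name `idle_culler_config` and the 30-minute timeout are illustrative choices. The role with scopes applies to JupyterHub 2.0 and later; older versions grant the service admin rights instead, so check the culler's README for your version.

```python
import sys


def idle_culler_config(timeout_seconds=3600):
    """Return (service, role) definitions for jupyterhub-idle-culler.

    Hypothetical helper that only builds plain dictionaries.
    """
    service = {
        "name": "jupyterhub-idle-culler-service",
        "command": [
            sys.executable,
            "-m",
            "jupyterhub_idle_culler",
            f"--timeout={timeout_seconds}",
        ],
    }
    role = {
        "name": "jupyterhub-idle-culler-role",
        "scopes": [
            "list:users",
            "read:users:activity",
            "read:servers",
            "delete:servers",
        ],
        "services": [service["name"]],
    }
    return service, role


service, role = idle_culler_config(timeout_seconds=1800)

# In jupyterhub_config.py, where `c` is provided by JupyterHub:
# c.JupyterHub.services = [service]
# c.JupyterHub.load_roles = [role]
```

Because every server is a dedicated EC2 instance, the timeout directly bounds how long an abandoned instance keeps running.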

You will need to increase the timeout for spawning notebook servers, as the ECS task is very unlikely to be up in less than 60s (the default JupyterHub timeout).

## Resources file

Because it would be impractical to query AWS at runtime for the list of all instance types available in a given region, static files containing the list of instance types per region and the IDs of the ECS AMIs are generated with the `gen_resource_file.py` script.

This means:
- If AWS releases new instance types, they won't be available in the list until a new release of this package is made
- The same goes for the AMIs: if AWS releases new versions, a new release must be made
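
To give an idea of what such a static file could contain, here is a minimal sketch that reduces entries shaped like the EC2 `DescribeInstanceTypes` response to the fields the spawn form reads (`vcpu`, `memory`, `arch`, and an optional `gpu` with `count` and `type`). The function names and the sample entries are illustrative, the memory unit (MiB) is an assumption, and the real `gen_resource_file.py` script may work differently.

```python
import json


def summarize_instance_type(info):
    """Reduce one DescribeInstanceTypes entry to the fields shown in the form."""
    summary = {
        "vcpu": info["VCpuInfo"]["DefaultVCpus"],
        "memory": info["MemoryInfo"]["SizeInMiB"],  # assumption: keep MiB as-is
        "arch": ", ".join(info["ProcessorInfo"]["SupportedArchitectures"]),
    }
    gpus = info.get("GpuInfo", {}).get("Gpus", [])
    if gpus:
        # Only the first GPU entry is kept in this sketch.
        summary["gpu"] = {"count": gpus[0]["Count"], "type": gpus[0]["Name"]}
    return summary


def build_region_table(entries):
    """Map instance type name -> summary for one region."""
    return {e["InstanceType"]: summarize_instance_type(e) for e in entries}


# Illustrative entries; a real run would fetch them from the EC2 API.
sample = [
    {
        "InstanceType": "t3.medium",
        "VCpuInfo": {"DefaultVCpus": 2},
        "MemoryInfo": {"SizeInMiB": 4096},
        "ProcessorInfo": {"SupportedArchitectures": ["x86_64"]},
    },
    {
        "InstanceType": "g4dn.xlarge",
        "VCpuInfo": {"DefaultVCpus": 4},
        "MemoryInfo": {"SizeInMiB": 16384},
        "ProcessorInfo": {"SupportedArchitectures": ["x86_64"]},
        "GpuInfo": {"Gpus": [{"Name": "T4", "Manufacturer": "NVIDIA", "Count": 1}]},
    },
]

table = build_region_table(sample)
print(json.dumps(table, indent=2))
```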


## Example configuration

```python
from ecsspawner import ECSSpawner
c.JupyterHub.spawner_class = 'ecsspawner.ECSSpawner'
c.ECSSpawner.ecs_cluster = 'test-jupyter'
c.ECSSpawner.default_docker_image = 'jupyter/datascience-notebook:notebook-6.4.0'
c.ECSSpawner.default_docker_image_gpu = 'cschranz/gpu-jupyter:latest'
c.ECSSpawner.instance_role_arn = 'arn:aws:iam::XXXX:instance-profile/ecsInstanceRole'
c.Spawner.start_timeout = 180
```


## TODO
- Log configuration (allow using CloudWatch to collect the container logs)
- Support Docker Hub private images
- Have a more robust system when waiting for the ECS task to be up
1 change: 1 addition & 0 deletions ecsspawner/__init__.py
@@ -0,0 +1 @@
from .spawner import ECSSpawner
109 changes: 109 additions & 0 deletions ecsspawner/form_template.html
@@ -0,0 +1,109 @@
<script>
  // The two values below are placeholders filled in when the form is rendered.
  const instance_details = $instance_json;
  const regions = $regions;
  const tbl_header = ["Type", "vCPU", "Memory", "Arch", "GPU"];
  let available_instances = {};

  // Show a one-row table describing the currently selected instance type.
  function displayInstanceDetails() {
    const details = document.getElementById("details");
    const instance_type = document.getElementById("instance").value;
    const instance = available_instances[instance_type];
    details.innerHTML = "";
    if (instance === undefined) {
      return; // nothing selected, e.g. a region with no instance types
    }

    const tbl = document.createElement("table");
    const head_row = tbl.createTHead().insertRow();
    for (let i = 0; i < tbl_header.length; i++) {
      const th = document.createElement("th");
      th.appendChild(document.createTextNode(tbl_header[i]));
      head_row.appendChild(th);
    }

    const row = tbl.createTBody().insertRow();
    for (let i = 0; i < tbl_header.length; i++) {
      let text;
      if (tbl_header[i] == "Type") {
        text = instance_type;
      } else if (tbl_header[i] == "GPU") {
        text = instance.gpu !== undefined
          ? instance.gpu.count + " x " + instance.gpu.type
          : "No GPU";
      } else {
        // Remaining headers map to lowercase keys: vcpu, memory, arch.
        text = instance[tbl_header[i].toLowerCase()];
      }
      row.insertCell().appendChild(document.createTextNode(text));
    }
    tbl.setAttribute("class", "table");
    details.appendChild(tbl);
  }

  // Fill the region dropdown once, at page load.
  function genRegionSelect() {
    const region_select = document.getElementById("region");
    for (let i = 0; i < regions.length; i++) {
      const opt = document.createElement("option");
      opt.value = regions[i];
      opt.textContent = regions[i];
      region_select.appendChild(opt);
    }
  }

  // Rebuild the instance dropdown for the selected region, then refresh
  // the details table so it never shows an instance from another region.
  function updateAvailableInstances() {
    const region = document.getElementById("region").value;
    available_instances = instance_details[region] || {};
    const select = document.getElementById("instance");
    select.innerHTML = "";
    const keys = Object.keys(available_instances).sort();
    for (let i = 0; i < keys.length; i++) {
      const opt = document.createElement("option");
      opt.value = keys[i];
      opt.textContent = keys[i];
      select.appendChild(opt);
    }
    displayInstanceDetails();
  }

  window.addEventListener("load", function () {
    genRegionSelect();
    updateAvailableInstances();
  }, false);
</script>

<div class="form-group">
<label for="region">Region</label>
<select class="form-select" id="region" name="region" onchange="updateAvailableInstances()"></select>
<br />
<label for="instance">Instance type</label>
<select class="form-select" onchange="displayInstanceDetails()" name="instance" id="instance">
</select>
<br />
<label for="spot">Spot instance</label>
<input type="checkbox" id="spot" name="spot">

<div class="wrapper center-block">
<div class="panel-group" id="accordion" role="tablist">
<div class="panel panel-default">
<div class="panel-heading" role="tab" id="headingOne">
<h4 class="panel-title">
<a role="button" data-toggle="collapse" data-parent="#accordion" href="#advancedConfig" aria-expanded="false" aria-controls="advancedConfig">
Advanced Configuration
</a>
</h4>
</div>
<div id="advancedConfig" class="panel-collapse collapse in" role="tabpanel" aria-labelledby="headingOne">
<div class="panel-body">
<label for="image">Docker Image</label>
<input type="text" id="image" name="image">
<br />
<label for="volume">Root Volume Size (GiB)</label>
<input type="number" id="volume" name="volume">
</div>
</div>
</div>
</div>
</div>
</div>
<div id="details">
</div>
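
The `$instance_json` and `$regions` placeholders in the template above use the syntax of Python's `string.Template`. Assuming the spawner fills them that way (the spawner code is not shown in this part of the commit, so this is a guess), rendering could be sketched as follows. The function name `render_form` and the sample data are hypothetical.

```python
import json
from string import Template


def render_form(template_text, instance_details, regions):
    """Substitute JSON-encoded data into the form template placeholders."""
    return Template(template_text).substitute(
        instance_json=json.dumps(instance_details),
        regions=json.dumps(regions),
    )


# A shortened stand-in for form_template.html.
snippet = "<script>instance_details = $instance_json; regions = $regions;</script>"
html = render_form(
    snippet,
    {"eu-west-1": {"t3.medium": {"vcpu": 2}}},
    ["eu-west-1"],
)
print(html)
```

One consequence of this approach: `Template.substitute` raises an error on a stray `$`, so any literal dollar sign added to the template later would have to be written as `$$`.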