add interruptible doc (#527)
migueltol22 authored Sep 29, 2020
1 parent 2724eda commit c00310c
Showing 2 changed files with 51 additions and 0 deletions.
1 change: 1 addition & 0 deletions rsts/user/features/index.rst
@@ -16,3 +16,4 @@ Flyte Features
roles
single_task_execution
on_failure_policy
interruptible
50 changes: 50 additions & 0 deletions rsts/user/features/interruptible.rst
@@ -0,0 +1,50 @@
.. _features-interruptible:

Interruptible
####################

What is interruptible?
======================

Interruptible allows users to specify that their tasks can safely be scheduled on machines that may be preempted, such as AWS Spot Instances.
`Spot instances <https://aws.amazon.com/ec2/spot/?cards.sort-by=item.additionalFields.startDateTime&cards.sort-order=asc>`_ can cost up to 90% less than on-demand instances. Anyone looking to realize cost savings should consider marking tasks as interruptible.

What are Spot Instances?
========================

Spot Instances are unused EC2 capacity in AWS, available at up to a 90% discount compared to on-demand prices. The caveat is that these instances can be preempted at any point and become unavailable. This can happen due to:

* Price – The Spot price is greater than your maximum price.
* Capacity – If there are not enough unused EC2 instances to meet the demand for Spot Instances, Amazon EC2 interrupts Spot Instances. The order in which the instances are interrupted is determined by Amazon EC2.
* Constraints – If your request includes a constraint such as a launch group or an Availability Zone group, these Spot Instances are terminated as a group when the constraint can no longer be met.

As a general rule of thumb, a spot instance is held for a median of around 2 hours, with a floor of around 20 minutes and no upper bound on duration.

Setting Interruptible
=====================

To run your workload on spot instances, set ``interruptible`` to ``True``. Example:

.. code-block:: python

    from flytekit.sdk.tasks import inputs, outputs, python_task
    from flytekit.sdk.types import Types

    @inputs(value_to_print=Types.Integer)
    @outputs(out=Types.Integer)
    @python_task(cache_version='1', interruptible=True)  # opt in to spot scheduling
    def add_one_and_print(workflow_parameters, value_to_print, out):
        workflow_parameters.stats.incr("task_run")
        added = value_to_print + 1
        print("My printed value: {}".format(added))
        out.set(added)

When this value is set, Flyte schedules your task on an Auto Scaling group (ASG) that contains only spot instances. If your task gets preempted, Flyte retries it on a non-spot instance. This retry does not count toward the retries that the user has configured.
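
The retry accounting described above can be pictured with a small toy model. This is only a sketch of the stated behavior, not Flyte's scheduler code: ``Preempted``, ``run_task``, and ``flaky_on_spot`` are hypothetical names invented for illustration, and the model assumes a single fallback from spot to non-spot.

```python
class Preempted(Exception):
    """Hypothetical signal that the spot node running the task was reclaimed."""


def run_task(task, user_retries=0):
    """Toy model: a preemption triggers a retry on a non-spot instance
    and does not consume the user's retry budget."""
    failures = 0      # ordinary failures, counted against user_retries
    use_spot = True   # interruptible tasks start on spot capacity
    while True:
        try:
            return task(use_spot)
        except Preempted:
            if not use_spot:
                # Not expected on a non-spot instance; surface it instead of looping.
                raise
            # Fall back to a non-spot instance; failures is left unchanged.
            use_spot = False
        except Exception:
            failures += 1
            if failures > user_retries:
                raise


def flaky_on_spot(on_spot):
    """Hypothetical task that is always preempted while on spot."""
    if on_spot:
        raise Preempted()
    return "done"


# Succeeds even with zero user retries, because the preemption is not counted.
result = run_task(flaky_on_spot, user_retries=0)
```

In this model an ordinary exception still uses up the user's retries, so a task with ``user_retries=0`` that fails for a reason other than preemption fails immediately.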


What tasks should be set to interruptible?
==========================================

Most Flyte workloads are good candidates for spot instances. If your task has none of the following properties, the recommendation is to set interruptible to true.

* Time sensitive. I need this to run now and cannot have any unexpected delays.
* Side Effects. My task is not idempotent and retrying will cause issues.
* Long Running Task. My task takes more than 2 hours. An interruption during this time frame could waste a lot of computation already done.
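
The checklist above can be written as a small helper. This is a minimal sketch: ``should_be_interruptible`` and its parameters are hypothetical names, not part of flytekit, and the 2-hour default comes from the rule of thumb in this document.

```python
def should_be_interruptible(time_sensitive, has_side_effects,
                            expected_runtime_hours, max_runtime_hours=2.0):
    """Return True when a task shows none of the three disqualifying properties."""
    if time_sensitive:
        # Cannot tolerate the delay a preemption and retry would add.
        return False
    if has_side_effects:
        # Not idempotent, so a retry could cause issues.
        return False
    if expected_runtime_hours > max_runtime_hours:
        # Long-running work risks losing a lot of completed computation.
        return False
    return True


# A short, idempotent batch task is a good candidate for spot.
short_batch_ok = should_be_interruptible(False, False, expected_runtime_hours=0.5)

# A 3-hour job exceeds the 2-hour rule of thumb.
long_job_ok = should_be_interruptible(False, False, expected_runtime_hours=3)
```

The result could then be passed as the ``interruptible`` argument when declaring the task.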
