docs: Link memoization and work-avoidance
Both argoproj#10769 and argoproj#10426 include examples where memoization
does not work because the steps have no outputs, yet no one has picked
up on this.

To address this, improve the documentation for memoization and
work-avoidance, linking the two ideas and pointing people who want to
skip steps towards work-avoidance unless they are really doing what
memoization was designed to do.

Issue argoproj#10426 is problematic in that some steps get memoized when
perhaps they shouldn't, so this commit shouldn't close it.

Fixes argoproj#10769

Signed-off-by: Alan Clucas <[email protected]>
Joibel committed Jun 23, 2023
1 parent 873a58d commit 2903c39
Showing 2 changed files with 13 additions and 3 deletions.
10 changes: 8 additions & 2 deletions docs/memoization.md
@@ -5,12 +5,16 @@
 ## Introduction

 Workflows often have outputs that are expensive to compute.
-This feature reduces cost and workflow execution time by memoizing previously run steps:
+Memoization reduces cost and workflow execution time by recording the result of previously run steps:
 it stores the outputs of a template into a specified cache with a variable key.

+Memoization only works for steps which have outputs. If you attempt to use it on steps which do not, it should not work (there are some cases where it does, but it shouldn't). It is designed for 'pure' steps, where the purpose of running the step is to calculate some outputs based upon the step's inputs, and only the inputs. Pure steps should not interact with the outside world, but workflows won't enforce this on you.
+
+If your steps are not there to create outputs, but you'd still like to skip running them, you should look at the [work avoidance](work-avoidance.md) technique instead of memoization.
+
 ## Cache Method

-Currently, caching can only be performed with config-maps.
+Currently, the cached data is stored in config-maps.
 This allows you to easily manipulate cache entries manually through `kubectl` and the Kubernetes API without having to go through Argo.
 All cache config-maps must have the label `workflows.argoproj.io/configmap-type: Cache` to be used as a cache. This prevents accidental access to other important config-maps in the system.

@@ -50,3 +54,5 @@ spec:
 * Delete the existing `ConfigMap` cache or switch to use a different cache.
 * Reduce the size of the output parameters for the nodes that are being memoized.
 * Split your cache into different memoization keys and cache names so that each cache entry is small.
+1. My step isn't getting memoized, why not?
+Ensure that you have specified at least one output on the step.
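
For illustration, the point about outputs can be sketched as a template that both declares `memoize` and produces an output parameter. This is a minimal sketch and not part of the commit: the template name, image, cache name `example-cache`, parameter names, and file path are all assumptions chosen for the example.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: memoized-example-   # hypothetical name
spec:
  entrypoint: compute
  arguments:
    parameters:
      - name: message
        value: hello
  templates:
    - name: compute
      inputs:
        parameters:
          - name: message
      memoize:
        # The cache key is derived only from the step's inputs,
        # which matches the 'pure' step model described in the docs.
        key: "{{inputs.parameters.message}}"
        maxAge: "10m"
        cache:
          configMap:
            name: example-cache     # hypothetical config-map name
      container:
        image: alpine:latest
        command: [sh, -c]
        args: ["echo {{inputs.parameters.message}} > /tmp/result.txt"]
      # At least one output is declared, so there is a result to
      # store in the cache and reuse on later runs.
      outputs:
        parameters:
          - name: result
            valueFrom:
              path: /tmp/result.txt
```

If the `outputs` section were removed, this step would fall into the case the new FAQ entry describes, and work avoidance would be the better fit.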
6 changes: 5 additions & 1 deletion docs/work-avoidance.md
@@ -2,7 +2,11 @@

 > v2.9 and after
-You can make workflows faster and more robust by employing **work avoidance**. A workflow that utilizes this is simply a workflow containing steps that do not run if the work has already been done. This simplest way to do this is to use **marker files**.
+You can make workflows faster and more robust by employing **work avoidance**. A workflow that utilizes this is simply a workflow containing steps that do not run if the work has already been done.
+
+This technique is similar to [memoization](memoization.md), but they have distinct use cases. Work avoidance is totally in your control and you make the decisions as to how to skip the work. [Memoization](memoization.md) is a feature of Argo Workflows to automatically skip steps which generate outputs; it is designed for 'pure' steps whose purpose is to calculate outputs from their inputs.
+
+The simplest way to do this is to use **marker files**.

 Use cases:


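
As a rough sketch of the marker-file idea mentioned in the work-avoidance change, a step can check for a marker on a shared volume and exit early when it exists. This is not taken from the commit: the volume name `work`, the mount path, the image, and the parameter `num` are assumptions for illustration only.

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: work-avoidance-example-   # hypothetical name
spec:
  entrypoint: main
  volumeClaimTemplates:
    - metadata:
        name: work                        # hypothetical volume holding the markers
      spec:
        accessModes: [ReadWriteOnce]
        resources:
          requests:
            storage: 10Mi
  templates:
    - name: main
      steps:
        - - name: do-work
            template: do-work
            arguments:
              parameters:
                - name: num
                  value: "1"
    - name: do-work
      inputs:
        parameters:
          - name: num
      container:
        image: alpine:latest
        command: [sh, -c]
        args:
          - |
            # Skip the work if the marker file for this input already exists.
            if [ -e /work/{{inputs.parameters.num}} ]; then
              echo "work already done"
              exit 0
            fi
            echo "working very hard"
            # Record completion so a later attempt can skip this step.
            touch /work/{{inputs.parameters.num}}
        volumeMounts:
          - name: work
            mountPath: /work
```

Unlike memoization, the skip logic here lives entirely in the step's own script, so the step needs no outputs. A volume created from a claim template is tied to the workflow, so markers that must survive across separate workflow runs would likely need a pre-existing volume or external storage instead.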