docs: Link memoization and work-avoidance

Despite argoproj#10769 and argoproj#10426 both having examples of memoization not working, where the examples have no outputs, no one has picked up on this.

To address this, improve the documentation for memoization and work-avoidance, linking the two ideas and pointing people who want to skip steps towards work-avoidance unless they are really doing what memoization was designed to do.

Issue argoproj#10426 is problematic in that some steps get memoized when perhaps they shouldn't, so this commit shouldn't close it.

Fixes argoproj#10769

Signed-off-by: Alan Clucas <[email protected]>
Joibel · Jun 23, 2023 · 2903c39
1 parent 873a58d
commit 2903c39

Showing 2 changed files with 13 additions and 3 deletions.
diff --git a/docs/memoization.md b/docs/memoization.md
@@ -5,12 +5,16 @@
 ## Introduction
 
 Workflows often have outputs that are expensive to compute.
-This feature reduces cost and workflow execution time by memoizing previously run steps:
+Memoization reduces cost and workflow execution time by recording the result of previously run steps:
 it stores the outputs of a template into a specified cache with a variable key.
 
+Memoization only works for steps which have outputs. If you attempt to use it on steps which do not, it should not work (there are some cases where it does, but it shouldn't). It is designed for 'pure' steps, where the purpose of running the step is to calculate some outputs based upon the step's inputs, and only the inputs. Pure steps should not interact with the outside world, but workflows won't enforce this on you.
+
+If your steps are not there to create outputs, but you'd still like to skip running them, you should look at the [work avoidance](work-avoidance.md) technique instead of memoization.
+
 ## Cache Method
 
-Currently, caching can only be performed with config-maps.
+Currently, the cached data is stored in config-maps.
 This allows you to easily manipulate cache entries manually through `kubectl` and the Kubernetes API without having to go through Argo.
 All cache config-maps must have the label `workflows.argoproj.io/configmap-type: Cache` to be used as a cache. This prevents accidental access to other important config-maps in the system
 
@@ -50,3 +54,5 @@ spec:
     * Delete the existing `ConfigMap` cache or switch to use a different cache.
     * Reduce the size of the output parameters for the nodes that are being memoized.
     * Split your cache into different memoization keys and cache names so that each cache entry is small.
+1. My step isn't getting memoized, why not?
+   Ensure that you have specified at least one output on the step.
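
To illustrate the FAQ entry above, here is a minimal sketch of a memoized template that declares an output parameter. It is not part of the commit; the workflow, template, cache, and file names (`memoized-echo-`, `echo`, `echo-cache`, `/tmp/out.txt`) and the container image are illustrative assumptions.

```yaml
# Minimal sketch: a memoized template that declares an output parameter.
# Workflow, template, cache and file names are illustrative assumptions.
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: memoized-echo-
spec:
  entrypoint: echo
  arguments:
    parameters:
      - name: message
        value: hello
  templates:
    - name: echo
      inputs:
        parameters:
          - name: message
      memoize:
        # Cache key derived only from the step's inputs (a 'pure' step).
        key: "{{inputs.parameters.message}}"
        cache:
          configMap:
            name: echo-cache
      container:
        image: alpine:3.18
        command: [sh, -c]
        args: ["echo {{inputs.parameters.message}} > /tmp/out.txt"]
      # The step must declare at least one output for memoization to apply.
      outputs:
        parameters:
          - name: result
            valueFrom:
              path: /tmp/out.txt
```

If the `outputs` section were removed from this template, the step would fall into the "no outputs" case that the documentation change warns about, and work avoidance would likely be the better fit.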
diff --git a/docs/work-avoidance.md b/docs/work-avoidance.md
@@ -2,7 +2,11 @@
 
 > v2.9 and after
 
-You can make workflows faster and more robust by employing **work avoidance**. A workflow that utilizes this is simply a workflow containing steps that do not run if the work has already been done. This simplest way to do this is to use **marker files**.
+You can make workflows faster and more robust by employing **work avoidance**. A workflow that utilizes this is simply a workflow containing steps that do not run if the work has already been done.
+
+This technique is similar to [memoization](memoization.md), but they have distinct use cases. Work avoidance is totally in your control and you make the decisions as to how to skip the work. [Memoization](memoization.md) is a feature of Argo Workflows to automatically skip steps which generate outputs - it is designed
+
+The simplest way to do this is to use **marker files**.
 
 Use cases: