Commit

Updated map task information to indicate array node is now the default, added optional return value for min success ratio (flyteorg#5561)

Signed-off-by: pryce-turner <[email protected]>
Signed-off-by: Vladyslav Libov <[email protected]>
pryce-turner authored and VladyslavLibov committed Aug 16, 2024
1 parent ec7892b commit 043e41e
Showing 1 changed file with 9 additions and 21 deletions.
30 changes: 9 additions & 21 deletions docs/user_guide/advanced_composition/map_tasks.md
@@ -67,10 +67,14 @@ If the input size exceeds the concurrency value, multiple batches will run seria

```python
@workflow
-def map_workflow_with_additional_params(data: list[int] = [10, 12, 11, 10, 13, 12, 100, 11, 12, 10]) -> list[bool]:
+def map_workflow_with_additional_params(data: list[int] = [10, 12, 11, 10, 13, 12, 100, 11, 12, 10]) -> list[typing.Optional[bool]]:
    return map_task(detect_anomalies, concurrency=1, min_success_ratio=0.75)(data_point=data)
```

+:::{note}
+Notice the return type of the list has been set to `Optional` when a `min_success_ratio` is added. This is due to the fact we are now tolerating failures, meaning the expected return type from the mapped task may in fact not get returned.
+:::
+
A map task internally uses a compression algorithm (bitsets) to handle every Flyte workflow node’s metadata,
which would have otherwise been in the order of 100s of bytes.

@@ -162,30 +166,14 @@ pyflyte run --remote \

## ArrayNode

-ArrayNode map tasks serve as a seamless substitution for regular map tasks, differing solely in the submodule
-utilized to import the `map_task` function. Specifically, you will need to import `map_task` from the experimental module as illustrated below:
-
-```python
-from flytekit import task, workflow
-from flytekit.experimental import map_task
-
-@task
-def t(a: int) -> int:
-    ...
-
-@workflow
-def array_node_wf(xs: list[int]) -> list[int]:
-    return map_task(t)(a=xs)
-```
-
-Flyte introduces map task to enable parallelization of homogeneous operations,
+Flyte originally introduced map tasks to enable parallelization of homogeneous operations,
offering efficient evaluation and a user-friendly API. Because it’s implemented as a backend plugin,
its evaluation is independent of core Flyte logic, which generates subtask executions that lack full Flyte functionality.
-ArrayNode tackles this issue by offering robust support for subtask executions.
+ArrayNode tackled this issue by offering robust support for subtask executions.
It also extends mapping capabilities across all plugins and Flyte node types.
-This enhancement will be a part of our move from the experimental phase to general availability.
+Starting with `flytekit` version 1.12.0, ArrayNode is the default `map_task` importable via `from flytekit import map_task`.

-In contrast to map tasks, an ArrayNode provides the following enhancements:
+In contrast to the original map tasks, an ArrayNode provides the following enhancements:

- **Wider mapping support**. ArrayNode extends mapping capabilities beyond Kubernetes tasks, encompassing tasks such as Python tasks, container tasks and pod tasks.
- **Cache management**. It supports both cache serialization and cache overwriting for subtask executions.
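
The note added in the first hunk says that once a `min_success_ratio` is set, some mapped results may be missing, so the list's element type becomes `Optional`. The sketch below illustrates that idea in plain Python only. It is not flytekit's implementation and does not call flytekit; the helper `map_with_min_success_ratio`, the function `flaky_detect`, and the threshold of 11 are hypothetical names and values chosen for illustration, and representing a failed element as `None` is an assumption drawn from the `Optional` return type in the diff.

```python
from typing import Callable, Optional, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def map_with_min_success_ratio(
    fn: Callable[[T], R], items: list[T], min_success_ratio: float = 1.0
) -> list[Optional[R]]:
    """Hypothetical helper: apply fn to every item and tolerate some failures.

    A failed item is recorded as None. If the fraction of successful items
    falls below min_success_ratio, the whole map is treated as failed.
    """
    results: list[Optional[R]] = []
    successes = 0
    for item in items:
        try:
            results.append(fn(item))
            successes += 1
        except Exception:
            # Tolerated failure: no value is available for this position.
            results.append(None)
    if items and successes / len(items) < min_success_ratio:
        raise RuntimeError(
            f"only {successes} of {len(items)} items succeeded, "
            f"below the required ratio {min_success_ratio}"
        )
    return results


def flaky_detect(data_point: int) -> bool:
    """Hypothetical stand-in for a mapped task; fails on one input."""
    if data_point == 100:
        raise ValueError("simulated subtask failure")
    return data_point > 11  # assumed anomaly threshold, for illustration


data = [10, 12, 11, 10, 13, 12, 100, 11, 12, 10]
flags = map_with_min_success_ratio(flaky_detect, data, min_success_ratio=0.75)
print(flags)
```

Here nine of the ten items succeed, so the 0.75 ratio is met and the failing position holds `None`. A caller that declared the result as `list[bool]` would have no way to express that gap, which is the motivation the note gives for `list[typing.Optional[bool]]`.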