Add support for different recipes #10641

Merged
merged 79 commits into from
Feb 2, 2022
Changes from 68 commits
78fc387
Add support for different recipes
Jan 6, 2022
661342c
Remove extra import
Jan 7, 2022
4554cc2
Fix changes so tests pass and backwards compatible
Jan 10, 2022
561b282
Merge branch 'main' into tayfun/10473-support-other-recipes
Jan 11, 2022
890812b
Graph recipe beginnings
Jan 11, 2022
e883752
Fix code climate
Jan 11, 2022
08da835
Fix auto_configure test failures
Jan 11, 2022
f78b775
Fix mypy errors
Jan 11, 2022
4b8d17a
Fix tests
Jan 11, 2022
4d1288a
Fix flake8 warnings
Jan 11, 2022
3f8f471
What happens in shared stays in shared
Jan 12, 2022
d967bc2
Merge branch 'main' into tayfun/10473-support-other-recipes
Jan 13, 2022
9332d63
Fix tests
Jan 13, 2022
495fc49
Fix mypy complaints
Jan 14, 2022
3306db5
Fix tests
Jan 14, 2022
a057cdc
Fix tests
Jan 14, 2022
fc19f83
Fix extra import
Jan 14, 2022
df2098e
No need to change auto-config template
Jan 14, 2022
c44ea8c
Merge branch 'main' into tayfun/10473-support-other-recipes
Jan 14, 2022
248d0ea
Mark as experimental and track recipe via telemetry
Jan 14, 2022
1b25fa3
Warn if CLI parameters for graph recipe
Jan 14, 2022
1ddc9be
Fix graph recipe
Jan 19, 2022
bcb659c
Merge branch 'main' into tayfun/10473-support-other-recipes
Jan 21, 2022
ef42783
Fix tests
Jan 21, 2022
3716a7d
Black reformat
Jan 21, 2022
a1fa40a
Update with new component
Jan 21, 2022
ae9f5e5
Fix registered component doc example
Jan 24, 2022
49ce5d6
Add docs for graph recipe
Jan 24, 2022
2293bca
Merge branch 'main' into tayfun/10473-support-other-recipes
Jan 24, 2022
ea70d41
Fix recipe docs URL in warning
Jan 24, 2022
47683f7
Fix example code path
Jan 24, 2022
2ef1118
Fix docs
Jan 24, 2022
8e2568a
Add changelog
Jan 24, 2022
f20c898
Fix docs
Jan 24, 2022
3e93547
Fix docs broken link
Jan 24, 2022
154056f
Merge branch 'main' into tayfun/10473-support-other-recipes
Jan 25, 2022
875a17d
Update docs/docs/graph-recipe.mdx
Jan 26, 2022
22e6e2c
Bump google-github-actions/setup-gcloud from 0.3.0 to 0.4.0
dependabot[bot] Jan 25, 2022
b6b3427
update endpoint and unit tests
ancalita Jan 26, 2022
dec37e0
add changelog
ancalita Jan 26, 2022
5ac4b08
add request timeout to cli command line arguments
donodje Jan 18, 2022
d4c8836
parse request timeout from command line args and set DEFAULT_REQUEST_…
donodje Jan 18, 2022
f44e014
test that the default request timeout argument is used properly
donodje Jan 18, 2022
ef6ce21
fix linter errors
donodje Jan 18, 2022
1f67c2f
directly access sys.argv to read the cmd line parameters
donodje Jan 18, 2022
b5fde26
remove trailing whitespace
donodje Jan 19, 2022
5330be7
run black to fix linter issues
donodje Jan 19, 2022
f79c756
deduplicate declaration of DEFAULT_RESPONSE_TIMEOUT and DEFAULT_REQUE…
donodje Jan 19, 2022
83ae794
correct constants import path in test
donodje Jan 19, 2022
3c15a3a
lint against upgraded black version 21.7b0
donodje Jan 19, 2022
74ab4b2
add changelog entry
donodje Jan 20, 2022
ad49502
set the request timeout from the request timeout command line arg
donodje Jan 25, 2022
094b49b
move DEFAULT_STREAM_READING_TIMEOUT to rasa.core.constants
donodje Jan 25, 2022
1f53e10
set default param for _get_stream_reading_timeout
donodje Jan 25, 2022
de72159
set None default for request_timeout parameter
donodje Jan 26, 2022
fc66008
Remove default schemas which is same for recipes
Jan 26, 2022
7368dd1
Merge branch 'main' into tayfun/10473-support-other-recipes
Jan 26, 2022
8304754
Update rasa/engine/recipes/default_recipe.py
Jan 26, 2022
f8ec9da
Make core|nlu_target to be configurable
Jan 28, 2022
1b9954a
Make tests clear; rewrite without monkeypatch
Jan 28, 2022
cb454dd
Fix typo
Jan 28, 2022
fa43bac
Add tip at the top for graph recipe docs
Jan 28, 2022
249a046
fix over-indent in Tokenizers section of docs (#10555)
Polaris000 Jan 28, 2022
1d3dc70
Merge branch 'main' into tayfun/10473-support-other-recipes
Jan 28, 2022
8e631e3
Fix mypy error
Jan 28, 2022
495fb46
Fix tests for auto-config
Jan 28, 2022
e5fa3ca
Update docs with target nodes
Jan 31, 2022
2b5af0f
Merge branch 'main' into tayfun/10473-support-other-recipes
Feb 1, 2022
fb9479b
Update docs/docs/graph-recipe.mdx
Feb 2, 2022
2bc6fe5
Move graph recipe to data folder for tests
Feb 2, 2022
9c771c8
Warn if multiple importers are used
Feb 2, 2022
b43566f
Add graph config schemas to telemetry
Feb 2, 2022
998da42
Fix changelog URL
Feb 2, 2022
3a491d4
Merge branch 'main' into tayfun/10473-support-other-recipes
Feb 2, 2022
16a4c24
Revert "Fix changelog URL"
Feb 2, 2022
fa1c2eb
Fix link to yet inexistent graph recipe docs
Feb 2, 2022
5efdb50
Add test for graph recipe telemetry event
Feb 2, 2022
b978569
Make targets required, raise if not provided
Feb 2, 2022
506f130
Fix mypy
Feb 2, 2022
7 changes: 7 additions & 0 deletions changelog/10473.feature.md
@@ -0,0 +1,7 @@
Support other recipe types.

This pull request also adds support for graph recipes; for details, see the
Graph Recipes section at https://rasa.com/docs/rasa/model-configuration
Contributor

I guess you can use an anchor link to go directly to that section?

Contributor Author

It's a new page not an anchor and I can't link to it because it's not live yet, docs check fails with dead link :D


A graph recipe is a raw format for specifying the executed graph directly. This is
useful if you need a more powerful way to specify how your model is created.
73 changes: 73 additions & 0 deletions data/graph_schemas/graph_config_short_predict_schema.yml
@@ -0,0 +1,73 @@
nodes:
  nlu_message_converter:
    needs:
      messages: __message__
    uses: rasa.graph_components.converters.nlu_message_converter.NLUMessageConverter
    constructor_name: load
    fn: convert_user_message
    config: {}
    eager: true
    is_target: false
    is_input: false
    resource: null
  custom_nlu_target:
    needs:
      messages: nlu_message_converter
      domain: domain_provider
    uses: rasa.nlu.classifiers.regex_message_handler.RegexMessageHandler
    constructor_name: load
    fn: process
    config: {}
    eager: true
    is_target: false
    is_input: false
    resource: null
  domain_provider:
    needs: {}
    uses: rasa.graph_components.providers.domain_provider.DomainProvider
    constructor_name: load
    fn: provide_inference
    config: {}
    eager: true
    is_target: false
    is_input: false
    resource:
      name: domain_provider
  run_MemoizationPolicy0:
    needs:
      domain: domain_provider
      tracker: __tracker__
      rule_only_data: rule_only_data_provider
    uses: rasa.core.policies.memoization.MemoizationPolicy
    constructor_name: load
    fn: predict_action_probabilities
    config: {}
    eager: true
    is_target: false
    is_input: false
    resource:
      name: train_MemoizationPolicy0
  rule_only_data_provider:
    needs: {}
    uses: rasa.graph_components.providers.rule_only_provider.RuleOnlyDataProvider
    constructor_name: load
    fn: provide
    config: {}
    eager: true
    is_target: false
    is_input: false
    resource:
      name: train_RulePolicy1
  custom_core_target:
    needs:
      policy0: run_MemoizationPolicy0
      domain: domain_provider
      tracker: __tracker__
    uses: rasa.core.policies.ensemble.DefaultPolicyPredictionEnsemble
    constructor_name: load
    fn: combine_predictions_from_kwargs
    config: {}
    eager: true
    is_target: false
    is_input: false
    resource: null
27 changes: 27 additions & 0 deletions data/graph_schemas/graph_config_short_train_schema.yml
@@ -0,0 +1,27 @@
nodes:
  finetuning_validator:
    needs:
      importer: __importer__
    uses: rasa.graph_components.validators.finetuning_validator.FinetuningValidator
    constructor_name: create
    fn: validate
    config:
      validate_core: true
      validate_nlu: true
    eager: false
    is_target: false
    is_input: true
    resource: null
  nlu_training_data_provider:
    needs:
      importer: finetuning_validator
    uses: rasa.graph_components.providers.nlu_training_data_provider.NLUTrainingDataProvider
    constructor_name: create
    fn: provide
    config:
      language: en
      persist: false
    eager: false
    is_target: false
    is_input: true
    resource: null
115 changes: 115 additions & 0 deletions data/test_config/graph_config_short.yml
@@ -0,0 +1,115 @@
# The config recipe.
# https://rasa.com/docs/rasa/model-configuration/
recipe: graph.v1

language: en

core_target: custom_core_target

nlu_target: custom_nlu_target

train_schema:
  nodes:
    # We skip schema_validator node (we only have this for DefaultV1Recipe
    # since we don't do validation for the GraphV1Recipe)
    finetuning_validator:
      needs:
        importer: __importer__
      uses: rasa.graph_components.validators.finetuning_validator.FinetuningValidator
      constructor_name: create
      fn: validate
      config:
        validate_core: true
        validate_nlu: true
      eager: false
      is_target: false
      is_input: true
      resource: null
    nlu_training_data_provider:
      needs:
        importer: finetuning_validator
      uses: rasa.graph_components.providers.nlu_training_data_provider.NLUTrainingDataProvider
      constructor_name: create
      fn: provide
      config:
        language: en
        persist: false
      eager: false
      is_target: false
      is_input: true
      resource: null

predict_schema:
  nodes:
    nlu_message_converter:
      needs:
        messages: __message__
      uses: rasa.graph_components.converters.nlu_message_converter.NLUMessageConverter
      constructor_name: load
      fn: convert_user_message
      config: {}
      eager: true
      is_target: false
      is_input: false
      resource: null
    custom_nlu_target:
      needs:
        messages: nlu_message_converter
        domain: domain_provider
      uses: rasa.nlu.classifiers.regex_message_handler.RegexMessageHandler
      constructor_name: load
      fn: process
      config: {}
      eager: true
      is_target: false
      is_input: false
      resource: null
    domain_provider:
      needs: {}
      uses: rasa.graph_components.providers.domain_provider.DomainProvider
      constructor_name: load
      fn: provide_inference
      config: {}
      eager: true
      is_target: false
      is_input: false
      resource:
        name: domain_provider
    run_MemoizationPolicy0:
      needs:
        domain: domain_provider
        tracker: __tracker__
        rule_only_data: rule_only_data_provider
      uses: rasa.core.policies.memoization.MemoizationPolicy
      constructor_name: load
      fn: predict_action_probabilities
      config: {}
      eager: true
      is_target: false
      is_input: false
      resource:
        name: train_MemoizationPolicy0
    rule_only_data_provider:
      needs: {}
      uses: rasa.graph_components.providers.rule_only_provider.RuleOnlyDataProvider
      constructor_name: load
      fn: provide
      config: {}
      eager: true
      is_target: false
      is_input: false
      resource:
        name: train_RulePolicy1
    custom_core_target:
      needs:
        policy0: run_MemoizationPolicy0
        domain: domain_provider
        tracker: __tracker__
      uses: rasa.core.policies.ensemble.DefaultPolicyPredictionEnsemble
      constructor_name: load
      fn: combine_predictions_from_kwargs
      config: {}
      eager: true
      is_target: false
      is_input: false
      resource: null
12 changes: 6 additions & 6 deletions docs/docs/custom-graph-components.mdx
@@ -228,7 +228,7 @@ Your graph component's train method must return the value of `resource` so that
the training results between trainings.
The `self._model_storage.write_to(self._resource)` context manager provides a path to
a directory where you can persist any data required by your
graph component.
graph component.

```python
from __future__ import annotations
@@ -328,16 +328,16 @@ class MyComponent(GraphComponent):


## Registering Graph Components with the Model Configuration
To make your graph component available to Rasa Open Source you have to register your
To make your graph component available to Rasa Open Source you may have to register your
graph component with a recipe. Rasa Open Source uses recipes to translate the content
of your model configuration to executable
[graphs](custom-graph-components.mdx#graph-components).
Currently, Rasa Open Source only supports the `default.v1` recipe.
Register your graph component with this recipe by using the `DefaultV1Recipe.register`
Currently, Rasa Open Source supports the `default.v1` and the experimental `graph.v1` recipes.
For `default.v1` recipe, you need to register your graph component by using the `DefaultV1Recipe.register`
decorator:

:::code language="python" source="docs/sources/data/test_classes/registered_component.py"
highlight="5-9":::
Contributor Author

This was failing in current documents so I fixed it.

```python (docs/sources/data/test_classes/registered_component.py)
```

Rasa Open Source uses the information provided in the `register` decorator and the
position of your graph component within the configuration file to schedule the execution
134 changes: 134 additions & 0 deletions docs/docs/graph-recipe.mdx
@@ -0,0 +1,134 @@
---
id: graph-recipe
sidebar_label: Graph Recipe
title: Graph Recipe
description: Learn about Graph Recipe for Rasa Open Source.
abstract: Graph recipes provide a more fine-tuned configuration for your executable graphs.
---

:::tip Default Recipe or Graph Recipe?

You will probably only need graph recipes if you're running ML experiments or ablation studies on an existing model. We recommend starting with the default recipe and for many applications that will be all that's needed.
Contributor

"ablation studies" feels a bit specific? (I had to google it 😁 )

Contributor Author

I've used the exact wording from @rctatman actually and I had to google it too :D I think it is good because it has the effect of warning against using graph recipes if the user is not too advanced.


:::

We now support graph recipes in addition to the default recipe. Graph recipes provide more granular control over how execution graph schemas are built.

:::caution New in 3.1
This feature is experimental.
We introduce experimental features to get feedback from our community, so we encourage you to try it out!
However, the functionality might be changed or removed in the future.
If you have feedback (positive or negative) please share it with us on the [Rasa Forum](https://forum.rasa.com).

:::


## Differences with Default Recipe

The default recipe and the new graph recipe differ in several ways. The main differences are:

- The default recipe is named `default.v1` in the config file, whereas the graph recipe is named `graph.v1`.
- The default recipe provides an easy-to-use recipe structure, whereas graph recipes are more advanced and powerful.
- The default recipe is very opinionated and provides various defaults, whereas graph recipes are more explicit.
- The default recipe can auto-configure itself and dump the defaults it used to the file if some sections in `config.yml` are missing. Graph recipes do none of this and assume that what you see is what you get; there are no surprises with graph recipes.
- The default recipe divides the graph configuration into two main parts: `pipeline` and `policies`. These can also be described as the NLU and core (dialogue management) parts. For graph recipes, on the other hand, the separation is between training (i.e. `train_schema`) and prediction (i.e. `predict_schema`).
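
As a rough illustration of the last point, the sketch below tells the two layouts apart by the `recipe` key of an already parsed config and reports which of the expected top-level sections are missing. The helper `missing_sections` and the `EXPECTED_SECTIONS` table are hypothetical names made up for this example and are not part of Rasa; only the key names come from the list above.

```python
from typing import Any, Dict, List

# Top-level sections each recipe is organised around, as described above.
# This table is an illustrative assumption, not Rasa's own validation logic.
EXPECTED_SECTIONS: Dict[str, List[str]] = {
    "default.v1": ["pipeline", "policies"],
    "graph.v1": ["train_schema", "predict_schema"],
}


def missing_sections(config: Dict[str, Any]) -> List[str]:
    """Return the expected sections that are absent from a parsed config."""
    recipe = config.get("recipe")
    if recipe not in EXPECTED_SECTIONS:
        raise ValueError(f"Unknown recipe: {recipe!r}")
    return [key for key in EXPECTED_SECTIONS[recipe] if key not in config]


default_config = {"recipe": "default.v1", "language": "en", "pipeline": []}
graph_config = {
    "recipe": "graph.v1",
    "language": "en",
    "train_schema": {"nodes": {}},
    "predict_schema": {"nodes": {}},
}

print(missing_sections(default_config))  # ['policies']
print(missing_sections(graph_config))  # []
```

For the default recipe a missing section is not an error, since it is auto-configured; for a graph recipe nothing is filled in for you, so a non-empty result would point at a section you still need to write.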

:::tip Starting from scratch?

If you don't know which recipe to choose, use the default recipe to bootstrap your project fast. If later you find that you need more powerful fine-tuning options, you can always change your recipe to be a graph recipe.
tayfun marked this conversation as resolved.

:::

## Graph Configuration File Structure

Graph recipes share the `recipe` and `language` keys with the default recipe, with the same meaning. The similarities end there: graph recipes do not have `pipeline` or `policies` keys, but they do have `train_schema` and `predict_schema` keys, which determine the graph nodes during train and predict runs respectively. In addition, graph recipes let you specify the target nodes for NLU and core explicitly with `nlu_target` and `core_target`. If targets are omitted, the node names used by the default recipe take over: `run_RegexMessageHandler` for NLU and `select_prediction` for core.

Here's an example graph recipe:

```yaml-rasa (docs/sources/data/test_config/graph_config_short.yml)
```

:::note graph targets
If omitted, the NLU target defaults to `run_RegexMessageHandler` and the core (dialogue management) target defaults to `select_prediction`. Make sure your schema definitions contain graph nodes with the relevant names.

Similarly, the default resource needed by the first graph node is fixed: it is `__importer__` (representing configuration, training data etc.) for the training task and `__message__` (representing the message received) for the prediction task. Make sure your first nodes make use of these dependencies.

:::
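
The two checks in this note can be sketched in a few lines of Python over an already parsed config. This is an illustrative sketch only: `check_targets` and `uses_placeholder` are hypothetical helpers, not Rasa functions, the fallback names are the ones quoted in the note, and looking the targets up in `predict_schema` is an assumption of this example.

```python
from typing import Any, Dict, List

# Fallback target names quoted in the note above.
DEFAULT_NLU_TARGET = "run_RegexMessageHandler"
DEFAULT_CORE_TARGET = "select_prediction"


def check_targets(config: Dict[str, Any]) -> List[str]:
    """Return human-readable problems with the target nodes; empty if none."""
    # Assumption of this sketch: targets are looked up in the predict schema.
    predict_nodes = config.get("predict_schema", {}).get("nodes", {})
    targets = {
        "nlu_target": config.get("nlu_target", DEFAULT_NLU_TARGET),
        "core_target": config.get("core_target", DEFAULT_CORE_TARGET),
    }
    return [
        f"{key} '{name}' is not a node in predict_schema"
        for key, name in targets.items()
        if name not in predict_nodes
    ]


def uses_placeholder(schema: Dict[str, Any], placeholder: str) -> bool:
    """True if any node in the schema lists the placeholder under `needs`."""
    return any(
        placeholder in node.get("needs", {}).values()
        for node in schema.get("nodes", {}).values()
    )


# Trimmed-down config: `core_target` is deliberately left out.
config = {
    "recipe": "graph.v1",
    "nlu_target": "custom_nlu_target",
    "train_schema": {
        "nodes": {"finetuning_validator": {"needs": {"importer": "__importer__"}}}
    },
    "predict_schema": {
        "nodes": {
            "nlu_message_converter": {"needs": {"messages": "__message__"}},
            "custom_nlu_target": {"needs": {"messages": "nlu_message_converter"}},
        }
    },
}

print(check_targets(config))
# ["core_target 'select_prediction' is not a node in predict_schema"]
print(uses_placeholder(config["train_schema"], "__importer__"))  # True
print(uses_placeholder(config["predict_schema"], "__message__"))  # True
```

Because `core_target` is omitted, the fallback name `select_prediction` is used, and the check reports that no node with that name exists in the predict schema.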

## Graph Node Configuration

As you can see in the example above, graph recipes are very explicit, and you can configure each graph node as you like. Here is an explanation of what some of the keys mean:

- `needs`: Defines what data your graph node requires and from which parent node. The key is the data name and the value is the name of the node that provides it.
```yaml-rasa
needs:
  messages: nlu_message_converter
```
Here the current graph node needs `messages`, which is provided by the `nlu_message_converter` node.

- `uses`: The class used to instantiate this node. Provide the full path in Python path syntax, e.g.

```yaml-rasa
uses: rasa.graph_components.converters.nlu_message_converter.NLUMessageConverter
```
You are not required to use Rasa's internal graph component classes; you
can use your own components here. Refer to the [custom graph
components](custom-graph-components.mdx) page to find out how to write your
own graph components.

- `constructor_name`: This is the constructor used to instantiate your component. Example:

```yaml-rasa
constructor_name: load
```

- `fn`: This is the function used in executing the graph component. Example:

```yaml-rasa
fn: combine_predictions_from_kwargs
```

- `config`: You can provide any configuration parameters for your components using this key.

```yaml-rasa
config:
  language: en
  persist: false
```

- `eager`: Determines whether your component is loaded eagerly
when the graph is constructed or whether loading waits until
runtime (this is called lazy instantiation). Components are usually
instantiated lazily during training and eagerly during inference (to
avoid a slow first prediction).

```yaml-rasa
eager: true
```

- `resource`: If given, the graph node is loaded from this resource instead of being instantiated from scratch. This is used, for example, to load a trained component for predictions.

```yaml-rasa
resource:
  name: train_RulePolicy1
```

- `is_target`: Boolean value. If `True`, this node can't be pruned
during fingerprinting (it might be replaced with a cached value,
though). This is used, for example, for all components which train, as
their result always needs to be added to the model archive so that the
data is available during inference.

```yaml-rasa
is_target: false
```

- `is_input`: Boolean value. Nodes with `is_input` set are _always_ run (also during the
fingerprint run). This makes sure that changes in file contents, for example,
are detected.

```yaml-rasa
is_input: false
```
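
The `needs` entries turn the nodes into a dependency graph: a node can only run once every node it needs has produced its output. As a minimal sketch of that idea, assuming Python 3.9+ for the standard-library `graphlib` module, the following orders the nodes of a predict schema so that parents come first. The function `execution_order` and the `PLACEHOLDERS` set are hypothetical names for this example, and this is not how Rasa's graph runner is implemented.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+
from typing import Any, Dict, List, Set

# Inputs supplied by the runtime rather than by another node (names taken
# from the schemas in this pull request).
PLACEHOLDERS = {"__importer__", "__message__", "__tracker__"}


def execution_order(schema: Dict[str, Any]) -> List[str]:
    """Order node names so that every node comes after the nodes it needs."""
    nodes = schema["nodes"]
    dependencies: Dict[str, Set[str]] = {}
    for name, node in nodes.items():
        parents = set(node.get("needs", {}).values()) - PLACEHOLDERS
        unknown = parents - set(nodes)
        if unknown:
            raise ValueError(f"{name} needs undefined nodes: {sorted(unknown)}")
        dependencies[name] = parents
    # static_order() raises graphlib.CycleError if `needs` forms a cycle.
    return list(TopologicalSorter(dependencies).static_order())


# Only the `needs` part of each node from the short predict schema is kept.
predict_schema = {
    "nodes": {
        "nlu_message_converter": {"needs": {"messages": "__message__"}},
        "custom_nlu_target": {
            "needs": {
                "messages": "nlu_message_converter",
                "domain": "domain_provider",
            }
        },
        "domain_provider": {"needs": {}},
        "run_MemoizationPolicy0": {
            "needs": {
                "domain": "domain_provider",
                "tracker": "__tracker__",
                "rule_only_data": "rule_only_data_provider",
            }
        },
        "rule_only_data_provider": {"needs": {}},
        "custom_core_target": {
            "needs": {
                "policy0": "run_MemoizationPolicy0",
                "domain": "domain_provider",
                "tracker": "__tracker__",
            }
        },
    }
}

order = execution_order(predict_schema)
print(order)  # parents always appear before the nodes that need them
```

Several valid orders exist for this schema, so the exact output is not fixed. What is guaranteed is that, for instance, `domain_provider` appears before `custom_nlu_target` and `run_MemoizationPolicy0` before `custom_core_target`.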
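
The `uses`, `constructor_name`, `fn` and `config` keys together describe how a node is created and run: import the class, call the named constructor, then call the named method. The sketch below shows that flow under loud simplifications. `run_node` is a hypothetical helper, the standard-library class `fractions.Fraction` stands in for a graph component, and the config is passed straight to the constructor as keyword arguments. Rasa's own loading code and constructor signatures differ and are not reproduced here.

```python
import importlib
from typing import Any, Dict


def run_node(node: Dict[str, Any], **inputs: Any) -> Any:
    """Instantiate the class named by `uses` and call its `fn` method."""
    module_path, _, class_name = node["uses"].rpartition(".")
    component_class = getattr(importlib.import_module(module_path), class_name)
    # `constructor_name` selects an alternative constructor (a classmethod).
    constructor = getattr(component_class, node["constructor_name"])
    # Simplification: the node config is passed as plain keyword arguments.
    component = constructor(**node.get("config", {}))
    return getattr(component, node["fn"])(**inputs)


# Stand-in node: fractions.Fraction plays the role of a graph component.
node = {
    "uses": "fractions.Fraction",
    "constructor_name": "from_float",
    "fn": "limit_denominator",
    "config": {"f": 0.75},
}

result = run_node(node, max_denominator=10)
print(result)  # 3/4
```

The keyword arguments given to `run_node` play the role of the data a node receives through `needs`.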