action_unlikely_intent prediction hides TED errors in failed_test_stories.yml #9057

kedz · 2021-07-06T21:06:03Z

Rasa version: 2.7.1

Python version: 3.7.10

Operating system (windows, osx, ...): osx

Issue:
When running rasa test with a policy ensemble that includes UnexpecTEDIntentPolicy, if the ensemble makes a wrong prediction after UnexpecTEDIntentPolicy has predicted action_unlikely_intent, rasa test records the action as correct in the metrics and the story does not show up in failed_test_stories.yml.

Let’s take an example test story:

- story:
  - intent: greet
  - action: utter_greet

When a prefix of this test story:

- story:
  - intent: greet

is passed to the ensemble, it can result in action_unlikely_intent being triggered:

- story:
  - intent: greet
  - action: action_unlikely_intent

Since the last action triggered by the ensemble was action_unlikely_intent, the ensemble should be queried again to see whether the other policies can actually predict utter_greet. So the following story:

- story:
  - intent: greet
  - action: action_unlikely_intent

should be fed to the ensemble again. There are now two possible cases:

Case 1: utter_greet gets predicted. The story then looks like this:

- story:
  - intent: greet
  - action: action_unlikely_intent
  - action: utter_greet

This story should go in stories_with_warnings.yml, as the expected action was correctly predicted after action_unlikely_intent.

Case 2: utter_greet does not get predicted, but some other action does. The story then looks like this:

- story:
  - intent: greet
  - action: action_unlikely_intent
  - action: some_other_action

This story should end up in failed_test_stories.yml because some_other_action is predicted instead of utter_greet.
Right now, case 2 is not being checked. It can be reproduced with this example:

# Make sure you have checked out the intent-ted branch on your rasa install
git clone https://github.com/rasahq/ited-tolerance-experiments
cd ited-tolerance-experiments

# Train a ted only ensemble
rasa train core -c ted-config.yml -s dataset1/ --out models/ted
# Train a ted+unexpecTEDIntentpolicy ensemble
rasa train core -c ited-config.yml -s dataset1/ --out models/ited

Run rasa test on the TED-only ensemble:

rasa test core -s test-bug/ -m models/ted --out results/ted

and you should see the following results:

...
2021-07-06 16:40:34 INFO     rasa.core.test  - Evaluation Results on CONVERSATION level:
2021-07-06 16:40:34 INFO     rasa.core.test  - 	Correct:          0 / 1
2021-07-06 16:40:34 INFO     rasa.core.test  - 	Accuracy:         0.000
...
2021-07-06 16:40:36 INFO     rasa.core.test  - Evaluation Results on ACTION level:
2021-07-06 16:40:36 INFO     rasa.core.test  - 	Correct:          10 / 11
2021-07-06 16:40:36 INFO     rasa.core.test  - 	F1-Score:         0.879
2021-07-06 16:40:36 INFO     rasa.core.test  - 	Precision:        0.864
2021-07-06 16:40:36 INFO     rasa.core.test  - 	Accuracy:         0.909
2021-07-06 16:40:36 INFO     rasa.core.test  - 	In-data fraction: 0.455
...

which are correct.

Run rasa test on the ensemble with UnexpecTEDIntentPolicy:

rasa test core -s test-bug/ -m models/ited --out results/ited

and you get:

...
2021-07-06 16:41:36 INFO     rasa.core.test  - Evaluation Results on CONVERSATION level:
2021-07-06 16:41:36 INFO     rasa.core.test  - 	Correct:          1 / 1
2021-07-06 16:41:36 INFO     rasa.core.test  - 	Accuracy:         1.000
...
2021-07-06 16:41:37 INFO     rasa.core.test  - Evaluation Results on ACTION level:
2021-07-06 16:41:37 INFO     rasa.core.test  - 	Correct:          10 / 10
2021-07-06 16:41:37 INFO     rasa.core.test  - 	F1-Score:         1.000
2021-07-06 16:41:37 INFO     rasa.core.test  - 	Precision:        1.000
2021-07-06 16:41:37 INFO     rasa.core.test  - 	Accuracy:         1.000
2021-07-06 16:41:37 INFO     rasa.core.test  - 	In-data fraction: 0.5
...

Looking at the failed test stories, you can see TED's error when it is run in isolation (results/ted/failed_test_stories.yml):

...
  - action: utter_flight_available
  - intent: deny
  - action: utter_anything_else
  - intent: deny
  - action: utter_goodbye  # predicted: utter_anything_else

However, the corresponding file for the combined TED+UnexpecTEDIntentPolicy ensemble (results/ited/failed_test_stories.yml) is empty, while the action_unlikely_intent prediction shows up in results/ited/stories_with_warnings.yml.

WARNING: This does not change the case where action_unlikely_intent was actually expected to be triggered at a conversation turn but was not predicted by the ensemble. Such cases should still go to failed_test_stories.yml.
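The bucketing described above can be sketched roughly as follows. This is only an illustration of the expected behaviour, not the actual rasa test code: the function name classify_turn, the bucket labels, and the idea of passing in the list of actions predicted for one turn are all assumptions made for this example.

```python
from typing import List

ACTION_UNLIKELY_INTENT = "action_unlikely_intent"


def classify_turn(expected: str, predicted: List[str]) -> str:
    """Bucket one expected action as 'passed', 'warning', or 'failed'.

    `predicted` is the ordered list of actions the ensemble produced for this
    turn, e.g. ["action_unlikely_intent", "utter_greet"]. (Hypothetical helper,
    not part of Rasa.)
    """
    # No prediction, or the final prediction is wrong: always a failure.
    # This covers case 2, and also the case where action_unlikely_intent
    # was expected but the ensemble predicted something else.
    if not predicted or predicted[-1] != expected:
        return "failed"

    # The expected action was predicted, but only after an unexpected
    # action_unlikely_intent: case 1, goes to stories_with_warnings.yml.
    if expected != ACTION_UNLIKELY_INTENT and ACTION_UNLIKELY_INTENT in predicted[:-1]:
        return "warning"

    return "passed"


if __name__ == "__main__":
    print(classify_turn("utter_greet", [ACTION_UNLIKELY_INTENT, "utter_greet"]))
    print(classify_turn("utter_greet", [ACTION_UNLIKELY_INTENT, "some_other_action"]))
```

The point of checking the final prediction first is that an intervening action_unlikely_intent can never turn a wrong prediction into a correct one; it can only downgrade a correct prediction to a warning.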


dakshvar22 · 2021-07-07T15:36:24Z

We'll also need a way to display the information that action_unlikely_intent was triggered just before an incorrectly predicted action. For example, if the expected story was:

- intent: greet
- action: utter_greet

and the sequence of actions predicted was instead:

- intent: greet
- action: action_unlikely_intent
- action: utter_somethingelse

then the user should be informed that:

utter_somethingelse was predicted instead of utter_greet
action_unlikely_intent was predicted before utter_somethingelse.

Therefore, the output story will need formatting like this:

- intent: greet
- action: utter_greet     # predicted: utter_somethingelse after action_unlikely_intent

Just to be clear, this was not mentioned in the initial scope or implementation document, and that's my bad.
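As a rough sketch of the annotation proposed above, something like the following could build the commented story line. The helper name annotate_step and its parameters are made up for illustration and do not correspond to any existing Rasa function.

```python
ACTION_UNLIKELY_INTENT = "action_unlikely_intent"


def annotate_step(expected: str, predicted: str, after_unlikely_intent: bool) -> str:
    """Render one action step of a failed story, with a trailing comment
    describing the wrong prediction. (Hypothetical helper, not part of Rasa.)
    """
    line = f"- action: {expected}"
    if predicted == expected:
        # Nothing to annotate when the prediction was correct.
        return line

    comment = f"predicted: {predicted}"
    if after_unlikely_intent:
        # Tell the user that action_unlikely_intent fired just before the error.
        comment += f" after {ACTION_UNLIKELY_INTENT}"
    return f"{line}  # {comment}"


if __name__ == "__main__":
    print(annotate_step("utter_greet", "utter_somethingelse", True))
```

Keeping the extra information in the same trailing comment means the story file stays valid YAML, and existing "# predicted: ..." annotations keep their shape.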

kedz added the type:bug, area:rasa-oss, and feature:ml/intent-ted labels Jul 6, 2021

TyDunn assigned alwx Jul 7, 2021

alwx mentioned this issue Jul 9, 2021

UnexpecTEDIntentPolicy predictions fix #9079

Merged

4 tasks

alwx closed this as completed Jul 12, 2021
