Add entities to UserUttered event if they are predicted via a policy #7443
Conversation
@Ghostvv Just wanted to check if I am on the right path. Can you take a quick look? No full review needed yet. Do we already have an e2e test that I can use to test the functionality?
The tests also fail on the |
rasa/shared/core/trackers.py
Outdated
@@ -459,6 +460,10 @@ def applied_events(self) -> List[Event]:
)
if event.use_text_for_featurization is None:
event.use_text_for_featurization = use_text_for_featurization
# update event's entities based on the future event
entities = self._define_user_entities(events_as_list[i + 1 :])
if event.entities is None:
Not sure about that one, because most probably `event.entities` is already set with something from the NLU pipeline. I'd say we have to update the entities dict, treating `TEDPolicy` as another extractor. I'm not sure, though, how the historical `UserUttered` events are persisted. We need to avoid updating entities every time.
Maybe we should do it under `if event.use_text_for_featurization is None:`, but I'm not sure that is the correct condition; it needs to be tested.
I updated the check. Currently checking if the entity is not yet inside the entities dict. If not, it is added.
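A minimal sketch of the check described here, assuming entities are plain dicts held in a list. The helper name `merge_policy_entities` and the sample entity values are hypothetical and only illustrate the idea; this is not the code from the PR.

```python
from typing import Any, Dict, List, Optional

Entity = Dict[str, Any]


def merge_policy_entities(
    existing: Optional[List[Entity]], predicted: Optional[List[Entity]]
) -> List[Entity]:
    """Return the existing entities plus any predicted ones not already present.

    Hypothetical helper: the input lists are left unmodified.
    """
    merged = list(existing or [])
    for entity in predicted or []:
        # `in` on a list of dicts compares whole dicts by equality.
        if entity not in merged:
            merged.append(entity)
    return merged


# Made-up example data: one entity from NLU, two predicted by a policy.
nlu_entities = [{"entity": "city", "value": "Berlin", "start": 10, "end": 16}]
policy_entities = [
    {"entity": "city", "value": "Berlin", "start": 10, "end": 16},
    {"entity": "date", "value": "tomorrow", "start": 17, "end": 25},
]
merged = merge_policy_entities(nlu_entities, policy_entities)
```

Only the `date` entity is appended, because the `city` dict is already present with identical contents.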
@Ghostvv Updated the code and tested it locally on the e2e bot. Everything seems to work as expected.
left a couple of comments
rasa/core/policies/ted_policy.py
Outdated
if tracker.latest_action_name == ACTION_LISTEN_NAME:
last_user_utterance = tracker.latest_message
else:
return None
I think
`if tracker.latest_action_name == ACTION_LISTEN_NAME: return`
should be the first check, to save the calculation and extraction of entity labels etc.
Actually, in this case TED itself should return something like `None`?
I guess you mean `!=` instead of `==`.
yes
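The agreed ordering (guard first, with `!=`) could look roughly like the sketch below. `FakeTracker`, `predict_entities` and the `extract` callback are hypothetical stand-ins for illustration, not Rasa's API; the constant is redefined locally so the snippet runs on its own.

```python
from typing import Any, Callable, Dict, List, Optional

# Local stand-in for the constant used in the diff above.
ACTION_LISTEN_NAME = "action_listen"

Entity = Dict[str, Any]


class FakeTracker:
    """Hypothetical tracker exposing only the two attributes used below."""

    def __init__(self, latest_action_name: str, latest_message: str) -> None:
        self.latest_action_name = latest_action_name
        self.latest_message = latest_message


def predict_entities(
    tracker: FakeTracker, extract: Callable[[str], List[Entity]]
) -> Optional[List[Entity]]:
    # Guard first: if the last action was not action_listen, there is no
    # fresh user utterance, so skip the (potentially expensive) extraction.
    if tracker.latest_action_name != ACTION_LISTEN_NAME:
        return None
    return extract(tracker.latest_message)


# Record calls to show that extraction is skipped by the guard.
calls: List[str] = []


def extract(text: str) -> List[Entity]:
    calls.append(text)
    return [{"entity": "city", "value": text}]


skipped = predict_entities(FakeTracker("utter_greet", "hi"), extract)
found = predict_entities(FakeTracker(ACTION_LISTEN_NAME, "Berlin"), extract)
```

Putting the guard at the top means nothing is computed when the result would be discarded anyway.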
entities = self._define_user_entities(events_as_list[i + 1 :])
if entities is not None:
for entity in entities:
if entity not in event.entities:
entity is a full dict here?
yes
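Since `entity` is a full dict, `entity not in event.entities` compares every key. A hedged illustration of what that implies: two dicts describing the same span but differing in, say, an `extractor` key count as different entities. The `span_key` and `add_new_spans` helpers and the sample values are hypothetical and show one possible alternative, not what the PR does.

```python
from typing import Any, Dict, List, Tuple

Entity = Dict[str, Any]


def span_key(entity: Entity) -> Tuple[Any, Any, Any]:
    """Identify an entity by type and character span only (an assumption)."""
    return (entity.get("entity"), entity.get("start"), entity.get("end"))


def add_new_spans(existing: List[Entity], predicted: List[Entity]) -> List[Entity]:
    """Append predicted entities whose (type, start, end) is not yet present."""
    seen = {span_key(e) for e in existing}
    merged = list(existing)
    for entity in predicted:
        if span_key(entity) not in seen:
            merged.append(entity)
            seen.add(span_key(entity))
    return merged


# Made-up example: same span, different extractor.
from_nlu = {"entity": "city", "start": 0, "end": 6, "value": "Berlin",
            "extractor": "DIETClassifier"}
from_ted = {"entity": "city", "start": 0, "end": 6, "value": "Berlin",
            "extractor": "TEDPolicy"}

# Full-dict membership treats these as different entities.
is_duplicate_by_dict = from_ted in [from_nlu]  # False

# Span-based comparison treats them as the same entity.
merged = add_new_spans([from_nlu], [from_ted])
```

Whether full-dict or span-based comparison is the right choice depends on whether two extractors tagging the same span should both be kept.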
@tabergma the auto slot filling doesn't work for these entities?
Yes, slot filling only works for entities coming directly from user messages, i.e. from NLU. But I guess that is something for a different PR. Also, testing entities in TED is not yet implemented, e.g. via
* fix a test * action texts in importer added to action_text in domain * a little end-to-end bot example * implement hacky e2e prediction * fix warnings * fix user uttered featurization * fix useruttered featurization * 100 epochs * fix writin bot action text to file * store action text in wrong text * add the comment to bot end-to-end utterance * update printing e2e utterances * fix single type input * fix e2e prediction after action * implement comparison between e2e policies * reduce e2e confidence threshold * fix rule policy * fix non e2e prediction * fix cleaning of working data * fix in regex featurizer processing * RasaModelData can handle 4D Tensors (#6833) * handle 4d dense features * padding for dense and sparse works * update RasaModelData * update RasaModelData tests * update shape of sparse tensors * update is_in_4d_format * set eager back to False * fix code quality issues * formatting * fix type issues * refactoring * update types * formatting * fix type issue * subclass numpy array * explicit specify number_of_dimensions * clean up * training is working again * rename feature_dimension to units * reset default eager values * update comments * refactoring * fix type issue * fix types * review comments * formatting * Fix tests on e2e branch (#6984) * fix tests in test_policies * set use_text_for_featurization in example dialogues * fix test for NLU training & test_rule_policy * add missing import * formatting * fix more tests * use intent for featurization by default * fix test_processor * fix test_trackers * fix test_importer * fix test_dialogues and test_policies Co-authored-by: Vova Vv <[email protected]> * Refactor creation of RasaModelData (#7010) * fix tests in test_policies * set use_text_for_featurization in example dialogues * fix test for NLU training & test_rule_policy * add missing import * formatting * fix more tests * update doc strings * refactor create_model_data in DIETClassifier * create sub-methods * make sure response selector 
trains * reset default value of eager * reset epoch of e2e example * fix entity key * fix tests, testing model is failing * clean up * fix import * Fix issues in DIETClassifier * remove zero features from DIETClassifier again * add test * remove whitespace in blank line * clean up * clean up model data utils * fix type issue * make sure only entity recognition works in diet * Add tests for state and tracker featurizers (#7086) * split test_featurizer.py into two files * code style * add test for prepare_from_domain * add tracker featurizer tests * add test * fix imports in tests * add more tests * Add option "featurizers" to TEDPolicy (#7079) * add FEATURIZERS to TEDPolicy parameters * update tests * fix import * fix merging master * Bring DIET into TED (#7131) * add diet to ted * reshape 4d tensors into 3d and back * fix shapes in non eager mode * make shape indices more general * fix add_length * add todo * sentence features are now also 4D * sequence length is 4D * convert 4d to 3 during padding * mask is 4d now * bring mask in correct shape before transformer * keep also the orginial dialogue length * update doc strings * use tf.scatter_nd to tranform 3d back to 4d * move tensor transformation to _encode_features_per_attribute * fix issues in _encode_features_per_attribute * use correct dialogue length * add comments * clean up * update constants * Update rasa/utils/tensorflow/model_data.py Co-authored-by: Vladimir Vlasov <[email protected]> * review comment * fix tests * use correct attribute mask * use 4d attribute mask * set eager back to default * fix test * update _convert_to_original_shape * add indices to model data * fix tf.scatter_nd * fix TED train and predict * remove not needed constants * fix failing tests * update docstrings * fix docstring issues * review comments * fix shape mismatches Co-authored-by: Vova Vv <[email protected]> Co-authored-by: Vladimir Vlasov <[email protected]> * fix entities features * resolve merge conflict in 
yaml_story_writer * use story string when writing user uttered event * create empty fakes (#7198) * substitute fake features with empty arrays and use attribute mask to rebuild input * remove unused import, remove comment * refactor, add comments, add types * support empty features * add prepare_for_predict to precalculate self.all_labels_embed * return to default config * add error * add prepare_for_predict to diet * fix test_model_data_utils * fix test gen_batch * Update rasa/core/policies/ted_policy.py Co-authored-by: Tanja <[email protected]> * rename to filter fakes and create dial len beforehand * add dtype= * fix comment * add comments about fake features Co-authored-by: Tanja <[email protected]> * Monster ted (#7262) * add diet to ted * reshape 4d tensors into 3d and back * fix shapes in non eager mode * make shape indices more general * fix add_length * add todo * sentence features are now also 4D * sequence length is 4D * convert 4d to 3 during padding * mask is 4d now * bring mask in correct shape before transformer * keep also the orginial dialogue length * update doc strings * use tf.scatter_nd to tranform 3d back to 4d * move tensor transformation to _encode_features_per_attribute * fix issues in _encode_features_per_attribute * use correct dialogue length * add comments * clean up * update constants * review comment * keep entity dict * create tag_ids for TED * clean up after merge * add batch_loss_entities (not working) * concatenate text and dialogue transformer output * get last dialogue before CRF * add predicting entities * clean up * differentiate between max history tracker featurizer used or not * add todo * add comments * use correct tag id mapping * check if text exists * fix frozenset issues * ignore actual entity value in MemoizationPolicy * fix import * fix some tests * update after merge * use python if instead of tf.cond * we need to return a tensor in tf.cond instead of None * create entity tags for all texts * update batch loss 
entities (not yet working) * input to entity loss * update entity prediction * fix randomness and shapes * fix ffnn encoding layer name * add todo * Update rasa/core/policies/ted_policy.py Co-authored-by: Tanja <[email protected]> * Update rasa/core/featurizers/single_state_featurizer.py Co-authored-by: Tanja <[email protected]> * rename to entity_tag_id_mapping * add comment to last dial mask * add comments to tf.cond * add docstrings * refactor number of dims check * rename zero features to fake features * pre compute dialogue_indices * create helper methods * calculate number of units for text_transformer_output * add todo * fix tests * use indices constant Co-authored-by: Tanja Bergmann <[email protected]> * refactor e2e ted choice (#7285) * refactor e2e ted choice * add comment why prediction batch of size 2 * fix test policies * fix test ensemble * fix e2e prediction * utter end-to-end bot responses * add docstring * deprecate unused method * add changelog for deprecation * log end-to-end action with text * pass flag instead of determining end-to-end utterance on the fly. * Revert "pass flag instead of determining end-to-end utterance on the fly." This reverts commit 868a715. 
* remove `events_for_prediction` * remove unused import * fix e2e training edge cases * use special action for end-to-end responses * rename to `from_action_name_or_text` * clarify in comment * rename to `ActionEndToEndResponse` * fix form tests * add docstrings * remove useless test * remove user text if intent is present * remove story read check for user and intent message * ignore entities in text if intent is present * Update rasa/shared/nlu/training_data/message.py Co-authored-by: Alexander Khizov <[email protected]> * Update rasa/shared/nlu/training_data/training_data.py Co-authored-by: Alexander Khizov <[email protected]> * PR comments * black * rename var * remove unused `_action_event_for_prediction` * improve phrasing / typing / code structure * add test to ensure action text with `utter_` as start works * rename `Domain.action_names` to `Domain.action_names_or_texts` * fix docstrings * remove unused imports * test and fix writing YAML stories * move `MarkdownStoryWriter` tests to separate file * use `tmp_path` * consider end-to-end stories correctly * fix story reading for retrieval intents * fix missing renames for `prepare_from_domain` * fixup for last merge in from `master` * dump story not as test story * fix docstring errors * remove unused method (not used in Rasa X either) * raise if printing end-to-end things in Markdown * add todos * fix error with entity formatting * move to `rasa.shared` * remove CoreDataImporter * change fingerprinting to use yaml writer * fix tests failing due to new default story file * adapt remaining parts to `as_story_string` failing if end-to-end event * remove `as_story_string` from story validator * don't add e2e entities as features (#7435) * don't add e2e entities as features * remove entities as input features for text * simply don't add entities for e2e user utterances * remove non existent import * fix test * add comment * only train NLU model if data or end to end * fix filter units * fix import * read and 
write in test * fix displaying of end-to-end actions in rasa interactive * skip warning for end-to-end user messages in training data * add docs link * remove trailing whitespace * return `NotImplemented` if other class * remove `md_` as it's not related to md * add docstrings to entire module * add more docstrings * increase timeout due to failing windows tests * improve string representation of `UserUttered` * fix hashing of `UserUttered` * Add entities to UserUttered event if they are predicted via a policy (#7443) * remove unused import * remove unused import * fix problematic docstrings * specify yaml content-type * fix docstrings in featurearray * fix docstrings * Update rasa/core/policies/rule_policy.py Co-authored-by: Tobias Wochinger <[email protected]> * Update rasa/core/policies/rule_policy.py Co-authored-by: Tobias Wochinger <[email protected]> * Update rasa/core/policies/rule_policy.py Co-authored-by: Tobias Wochinger <[email protected]> * Update rasa/utils/tensorflow/model_data_utils.py Co-authored-by: Daksh Varshneya <[email protected]> * Update rasa/utils/tensorflow/model_data_utils.py Co-authored-by: Daksh Varshneya <[email protected]> * add tests for define featurization in ensemble * update docstrings * add tests for e2e rules * Update rasa/core/policies/ted_policy.py Co-authored-by: Daksh Varshneya <[email protected]> * Update rasa/core/featurizers/tracker_featurizers.py Co-authored-by: Daksh Varshneya <[email protected]> * Update rasa/core/featurizers/tracker_featurizers.py Co-authored-by: Daksh Varshneya <[email protected]> * Update rasa/core/policies/sklearn_policy.py Co-authored-by: Daksh Varshneya <[email protected]> * Update rasa/core/policies/ted_policy.py Co-authored-by: Daksh Varshneya <[email protected]> * Update rasa/core/policies/ted_policy.py Co-authored-by: Daksh Varshneya <[email protected]> * Add experimental warning to e2e training (#7524) * Add experimental warning to e2e training * re-order imports * update user messages in 
apply_to of define events (#7503) * update user messages in apply_to of define events * fix e2e entity prediction in ted * rename method * fix entity featurization for text input * fix entity prediction in ted * remove safeguard * fix actionexecuted string * fix comments * keep __str__ inconsistent for actionexecuted * increase number of epochs for ted * add Define events in applied events * clean states during prediction * review comments * add _prediction_with_unhappy_path * review comments * remove unneeded variable * review comments in DIET * renamed empty_features to absent_features * remove unused imports * update docstrings * refactor entity data creation into a separate method * Update rasa/shared/core/domain.py Co-authored-by: Joe Juzl <[email protected]> * Update rasa/shared/core/domain.py Co-authored-by: Joe Juzl <[email protected]> * Update rasa/shared/core/domain.py Co-authored-by: Joe Juzl <[email protected]> * shorten the long comment * create separate constant for prediction features * Update rasa/core/policies/ted_policy.py Co-authored-by: Daksh Varshneya <[email protected]> * add comment why we add 1 to sequence features * rephrase last dial comment * rephrase comments * Add description to RasaModelData. 
* explain choice of warning * fix e2e train tests (#7540) * remove prints * use precise len in tests * remove blank line * type annotations * fix dry run test * fix test_surface_attributes * fix typo in test * Use tokens for story structure validation (#7436) * Add tests * Draft first implementation * Fix random sorting before hash * Update doc strings * Add doc strings * Fix minor issues * Add config file loading * Fix some docstrings * Make test stories part of the test * Update tests * Fix minor issues * Fix config param argument * Add TrainingType to tests to avoid config change * Delete hash again * Update docs * Update rasa/core/training/story_conflict.py Co-authored-by: Tobias Wochinger <[email protected]> * Update rasa/shared/nlu/training_data/features.py Co-authored-by: Tobias Wochinger <[email protected]> * Fix minor issues Co-authored-by: Tobias Wochinger <[email protected]> * move constant to red * do not enumerate * fix domain test conflicts * add example showing e2e functionality (#7535) * add basic files * add NLU examples * increase epochs and remove memoization policy * Update examples/e2ebot/config.yml Co-authored-by: Vladimir Vlasov <[email protected]> * Update examples/e2ebot/domain.yml Co-authored-by: Vladimir Vlasov <[email protected]> * Update examples/e2ebot/data/stories.yml Co-authored-by: Vladimir Vlasov <[email protected]> * added a story with a bot utterance instead of an action label * remove bot utterance again Co-authored-by: Vladimir Vlasov <[email protected]> * code quality check * Update tests/test_server.py Co-authored-by: Tobias Wochinger <[email protected]> * Update rasa/shared/core/training_data/story_reader/yaml_story_reader.py Co-authored-by: Alexander Khizov <[email protected]> * add types * Update tests/test_train.py Co-authored-by: Tobias Wochinger <[email protected]> * fix tests * add types * add types * Update tests/test_train.py Co-authored-by: Tobias Wochinger <[email protected]> * add types * Update 
tests/test_train.py Co-authored-by: Tobias Wochinger <[email protected]> * Update rasa/shared/nlu/training_data/features.py Co-authored-by: Alexander Khizov <[email protected]> * Update rasa/shared/nlu/training_data/message.py Co-authored-by: Alexander Khizov <[email protected]> * Update rasa/shared/nlu/training_data/message.py Co-authored-by: Alexander Khizov <[email protected]> * Update rasa/shared/nlu/training_data/message.py Co-authored-by: Alexander Khizov <[email protected]> * Update rasa/shared/nlu/training_data/training_data.py Co-authored-by: Alexander Khizov <[email protected]> * Update rasa/shared/nlu/training_data/training_data.py Co-authored-by: Alexander Khizov <[email protected]> * Update rasa/shared/nlu/training_data/training_data.py Co-authored-by: Alexander Khizov <[email protected]> * Update rasa/shared/nlu/training_data/message.py Co-authored-by: Alexander Khizov <[email protected]> * Update rasa/shared/nlu/training_data/message.py Co-authored-by: Alexander Khizov <[email protected]> * Update rasa/shared/nlu/training_data/message.py Co-authored-by: Alexander Khizov <[email protected]> * Update rasa/shared/nlu/training_data/message.py Co-authored-by: Alexander Khizov <[email protected]> * Update rasa/shared/nlu/training_data/message.py Co-authored-by: Alexander Khizov <[email protected]> * Update rasa/shared/nlu/training_data/message.py Co-authored-by: Alexander Khizov <[email protected]> * Update rasa/shared/nlu/training_data/training_data.py Co-authored-by: Alexander Khizov <[email protected]> * remove unused parameter * assert return value * use proper objects * use dedent * PR review comments * use function scoped fixture * create new domain to avoid interacting with session one * use safer default for __eq__ * remove not required ignore comment * fix docstrings * mark `Event` class as abtract * use correct docstring * rename to `AlwaysEqualEventMixin` * mark methods as abstract * add docstring * move comment out of docstring * don't persist 
changed entities + autofill slots for policy entities (#7553) * don't persist changed entities * make tracker state return combined `UserUttered` event * autofill slots for policy entities * made if more explicit * use constants * rename `DefinePrevUserUtteredEntities` to `EntitiesAdded` * rename and make `DefinePrevUserUttered` more general * fix docstrings * add e2e docs (#7512) * change TED default params * Update docs/docs/training-data-format.mdx Co-authored-by: Tanja <[email protected]> * Update docs/docs/training-data-format.mdx Co-authored-by: Tanja <[email protected]> * Update docs/docs/training-data-format.mdx Co-authored-by: Tanja <[email protected]> * update docs phrases * break long line * add changelog * add deprecation config for dense dimension * fix new config parameters * add comments for config params * fix docstrings * update comments * update migration guide with new ted parameters * update changelog * a lot of bug fixes regarding updating config * fix updating config dict again * Update docs/docs/training-data-format.mdx Co-authored-by: Tanja <[email protected]> * remove new-old config param descriptions * remove else * add docstring * update changelog * Update docs/docs/stories.mdx Co-authored-by: Ella Rohm-Ensing <[email protected]> * Update docs/docs/stories.mdx Co-authored-by: Ella Rohm-Ensing <[email protected]> * Update docs/docs/training-data-format.mdx Co-authored-by: Ella Rohm-Ensing <[email protected]> * Update docs/docs/training-data-format.mdx Co-authored-by: Ella Rohm-Ensing <[email protected]> * update stories.mdx * update training-data-format.mdx * substitute we with you * don't include e2e in the stories example * make list * remove required * add migration guide for domain changes * mention explicitly * fix import * add link to ted policy * add docstrings * fix updating config * move e2e into separate paragraph * add blank line back * add increased train time note * Update docs/docs/migration-guide.mdx Co-authored-by: Ben 
Quachtran <[email protected]> * Update docs/docs/migration-guide.mdx Co-authored-by: Ben Quachtran <[email protected]> * Update docs/docs/training-data-format.mdx Co-authored-by: Akela Drissner-Schmid <[email protected]> * remove the link to the training data format page * remove the line' * remove the line * break long line * Update changelog/7496.improvement.md Co-authored-by: Tobias Wochinger <[email protected]> * Update docs/docs/stories.mdx Co-authored-by: Tobias Wochinger <[email protected]> * Update docs/docs/stories.mdx Co-authored-by: Tobias Wochinger <[email protected]> * Update docs/docs/stories.mdx Co-authored-by: Tobias Wochinger <[email protected]> * Update docs/docs/training-data-format.mdx Co-authored-by: Tobias Wochinger <[email protected]> * Update docs/docs/training-data-format.mdx Co-authored-by: Tobias Wochinger <[email protected]> * Update docs/docs/migration-guide.mdx Co-authored-by: Sam Sucik <[email protected]> * Update docs/docs/stories.mdx Co-authored-by: Tobias Wochinger <[email protected]> * Update docs/docs/stories.mdx Co-authored-by: Tobias Wochinger <[email protected]> * Update docs/docs/policies.mdx Co-authored-by: Sam Sucik <[email protected]> * Update docs/docs/policies.mdx Co-authored-by: Sam Sucik <[email protected]> * Update docs/docs/policies.mdx Co-authored-by: Sam Sucik <[email protected]> * Update docs/docs/migration-guide.mdx Co-authored-by: Sam Sucik <[email protected]> * Update docs/docs/stories.mdx Co-authored-by: Tobias Wochinger <[email protected]> * Update docs/docs/stories.mdx Co-authored-by: Tobias Wochinger <[email protected]> * Update docs/docs/stories.mdx Co-authored-by: Tobias Wochinger <[email protected]> * Update docs/docs/policies.mdx Co-authored-by: Sam Sucik <[email protected]> * Update docs/docs/policies.mdx Co-authored-by: Sam Sucik <[email protected]> * Update docs/docs/policies.mdx Co-authored-by: Sam Sucik <[email protected]> * Update docs/docs/policies.mdx Co-authored-by: Sam Sucik <[email 
protected]> * Update docs/docs/policies.mdx Co-authored-by: Sam Sucik <[email protected]> * Update docs/docs/stories.mdx Co-authored-by: Sam Sucik <[email protected]> * Update docs/docs/stories.mdx Co-authored-by: Sam Sucik <[email protected]> * Update docs/docs/stories.mdx Co-authored-by: Sam Sucik <[email protected]> * Update docs/docs/stories.mdx Co-authored-by: Sam Sucik <[email protected]> * Update docs/docs/stories.mdx Co-authored-by: Sam Sucik <[email protected]> * Update docs/docs/training-data-format.mdx Co-authored-by: Sam Sucik <[email protected]> * Update docs/docs/training-data-format.mdx Co-authored-by: Sam Sucik <[email protected]> * Update docs/docs/training-data-format.mdx Co-authored-by: Sam Sucik <[email protected]> * Update docs/docs/training-data-format.mdx Co-authored-by: Sam Sucik <[email protected]> * Update docs/docs/training-data-format.mdx Co-authored-by: Sam Sucik <[email protected]> * Update docs/docs/training-data-format.mdx Co-authored-by: Sam Sucik <[email protected]> * Update docs/docs/training-data-format.mdx Co-authored-by: Sam Sucik <[email protected]> * Update docs/docs/migration-guide.mdx Co-authored-by: Sam Sucik <[email protected]> * add example to migration guide * remove notes * rename changelog to feature * Update docs/docs/stories.mdx Co-authored-by: Tobias Wochinger <[email protected]> * expand explanation * update examples in docs to have the same topic as e2ebot * copy ted description from diet * update parameter description * fix overriding default config * Update docs/docs/training-data-format.mdx Co-authored-by: Tobias Wochinger <[email protected]> * Update docs/docs/training-data-format.mdx Co-authored-by: Tobias Wochinger <[email protected]> * add actions to doc stories * update story * more details in error message Co-authored-by: Tanja <[email protected]> Co-authored-by: Ella Rohm-Ensing <[email protected]> Co-authored-by: Tobias Wochinger <[email protected]> Co-authored-by: Ben Quachtran <[email protected]> 
Co-authored-by: Akela Drissner-Schmid <[email protected]> Co-authored-by: Sam Sucik <[email protected]> Co-authored-by: m-vdb <[email protected]> * correctly use mixin class + filter out abstract classes * add clarification comment * make entities a list * remove space * fix assigning variables * fix assigning variables * Update docs/docs/components.mdx Co-authored-by: Tobias Wochinger <[email protected]> * add empty line add end * fix breaking hash function (list is not hashable) Co-authored-by: Zhenya Razumovskaia <[email protected]> Co-authored-by: Tanja Bergmann <[email protected]> Co-authored-by: Tobias Wochinger <[email protected]> Co-authored-by: Joseph Juzl <[email protected]> Co-authored-by: Alexander Khizov <[email protected]> Co-authored-by: Daksh Varshneya <[email protected]> Co-authored-by: Johannes E. M. Mosig <[email protected]> Co-authored-by: Alan Nichol <[email protected]> Co-authored-by: Roberto <[email protected]> Co-authored-by: Ella Rohm-Ensing <[email protected]> Co-authored-by: Ben Quachtran <[email protected]> Co-authored-by: Akela Drissner-Schmid <[email protected]> Co-authored-by: Sam Sucik <[email protected]> Co-authored-by: m-vdb <[email protected]>
write in test * fix displaying of end-to-end actions in rasa interactive * skip warning for end-to-end user messages in training data * add docs link * remove trailing whitespace * return `NotImplemented` if other class * remove `md_` as it's not related to md * add docstrings to entire module * add more docstrings * increase timeout due to failing windows tests * improve string representation of `UserUttered` * fix hashing of `UserUttered` * Add entities to UserUttered event if they are predicted via a policy (#7443) * remove unused import * remove unused import * fix problematic docstrings * specify yaml content-type * fix docstrings in featurearray * fix docstrings * Update rasa/core/policies/rule_policy.py Co-authored-by: Tobias Wochinger <[email protected]> * Update rasa/core/policies/rule_policy.py Co-authored-by: Tobias Wochinger <[email protected]> * Update rasa/core/policies/rule_policy.py Co-authored-by: Tobias Wochinger <[email protected]> * Update rasa/utils/tensorflow/model_data_utils.py Co-authored-by: Daksh Varshneya <[email protected]> * Update rasa/utils/tensorflow/model_data_utils.py Co-authored-by: Daksh Varshneya <[email protected]> * add tests for define featurization in ensemble * update docstrings * add tests for e2e rules * Update rasa/core/policies/ted_policy.py Co-authored-by: Daksh Varshneya <[email protected]> * Update rasa/core/featurizers/tracker_featurizers.py Co-authored-by: Daksh Varshneya <[email protected]> * Update rasa/core/featurizers/tracker_featurizers.py Co-authored-by: Daksh Varshneya <[email protected]> * Update rasa/core/policies/sklearn_policy.py Co-authored-by: Daksh Varshneya <[email protected]> * Update rasa/core/policies/ted_policy.py Co-authored-by: Daksh Varshneya <[email protected]> * Update rasa/core/policies/ted_policy.py Co-authored-by: Daksh Varshneya <[email protected]> * Add experimental warning to e2e training (#7524) * Add experimental warning to e2e training * re-order imports * update user messages in 
apply_to of define events (#7503) * update user messages in apply_to of define events * fix e2e entity prediction in ted * rename method * fix entity featurization for text input * fix entity prediction in ted * remove safeguard * fix actionexecuted string * fix comments * keep __str__ inconsistent for actionexecuted * increase number of epochs for ted * add Define events in applied events * clean states during prediction * review comments * add _prediction_with_unhappy_path * review comments * remove unneeded variable * review comments in DIET * renamed empty_features to absent_features * remove unused imports * update docstrings * refactor entity data creation into a separate method * Update rasa/shared/core/domain.py Co-authored-by: Joe Juzl <[email protected]> * Update rasa/shared/core/domain.py Co-authored-by: Joe Juzl <[email protected]> * Update rasa/shared/core/domain.py Co-authored-by: Joe Juzl <[email protected]> * shorten the long comment * create separate constant for prediction features * Update rasa/core/policies/ted_policy.py Co-authored-by: Daksh Varshneya <[email protected]> * add comment why we add 1 to sequence features * rephrase last dial comment * rephrase comments * Add description to RasaModelData. 
* explain choice of warning * fix e2e train tests (#7540) * remove prints * use precise len in tests * remove blank line * type annotations * fix dry run test * fix test_surface_attributes * fix typo in test * Use tokens for story structure validation (#7436) * Add tests * Draft first implementation * Fix random sorting before hash * Update doc strings * Add doc strings * Fix minor issues * Add config file loading * Fix some docstrings * Make test stories part of the test * Update tests * Fix minor issues * Fix config param argument * Add TrainingType to tests to avoid config change * Delete hash again * Update docs * Update rasa/core/training/story_conflict.py Co-authored-by: Tobias Wochinger <[email protected]> * Update rasa/shared/nlu/training_data/features.py Co-authored-by: Tobias Wochinger <[email protected]> * Fix minor issues Co-authored-by: Tobias Wochinger <[email protected]> * move constant to red * do not enumerate * fix domain test conflicts * add example showing e2e functionality (#7535) * add basic files * add NLU examples * increase epochs and remove memoization policy * Update examples/e2ebot/config.yml Co-authored-by: Vladimir Vlasov <[email protected]> * Update examples/e2ebot/domain.yml Co-authored-by: Vladimir Vlasov <[email protected]> * Update examples/e2ebot/data/stories.yml Co-authored-by: Vladimir Vlasov <[email protected]> * added a story with a bot utterance instead of an action label * remove bot utterance again Co-authored-by: Vladimir Vlasov <[email protected]> * code quality check * Update tests/test_server.py Co-authored-by: Tobias Wochinger <[email protected]> * Update rasa/shared/core/training_data/story_reader/yaml_story_reader.py Co-authored-by: Alexander Khizov <[email protected]> * add types * Update tests/test_train.py Co-authored-by: Tobias Wochinger <[email protected]> * fix tests * add types * add types * Update tests/test_train.py Co-authored-by: Tobias Wochinger <[email protected]> * add types * Update 
tests/test_train.py Co-authored-by: Tobias Wochinger <[email protected]> * Update rasa/shared/nlu/training_data/features.py Co-authored-by: Alexander Khizov <[email protected]> * Update rasa/shared/nlu/training_data/message.py Co-authored-by: Alexander Khizov <[email protected]> * Update rasa/shared/nlu/training_data/message.py Co-authored-by: Alexander Khizov <[email protected]> * Update rasa/shared/nlu/training_data/message.py Co-authored-by: Alexander Khizov <[email protected]> * Update rasa/shared/nlu/training_data/training_data.py Co-authored-by: Alexander Khizov <[email protected]> * Update rasa/shared/nlu/training_data/training_data.py Co-authored-by: Alexander Khizov <[email protected]> * Update rasa/shared/nlu/training_data/training_data.py Co-authored-by: Alexander Khizov <[email protected]> * Update rasa/shared/nlu/training_data/message.py Co-authored-by: Alexander Khizov <[email protected]> * Update rasa/shared/nlu/training_data/message.py Co-authored-by: Alexander Khizov <[email protected]> * Update rasa/shared/nlu/training_data/message.py Co-authored-by: Alexander Khizov <[email protected]> * Update rasa/shared/nlu/training_data/message.py Co-authored-by: Alexander Khizov <[email protected]> * Update rasa/shared/nlu/training_data/message.py Co-authored-by: Alexander Khizov <[email protected]> * Update rasa/shared/nlu/training_data/message.py Co-authored-by: Alexander Khizov <[email protected]> * Update rasa/shared/nlu/training_data/training_data.py Co-authored-by: Alexander Khizov <[email protected]> * remove unused parameter * assert return value * use proper objects * use dedent * PR review comments * use function scoped fixture * create new domain to avoid interacting with session one * use safer default for __eq__ * remove not required ignore comment * fix docstrings * mark `Event` class as abtract * use correct docstring * rename to `AlwaysEqualEventMixin` * mark methods as abstract * add docstring * move comment out of docstring * don't persist 
changed entities + autofill slots for policy entities (#7553) * don't persist changed entities * make tracker state return combined `UserUttered` event * autofill slots for policy entities * made if more explicit * use constants * rename `DefinePrevUserUtteredEntities` to `EntitiesAdded` * rename and make `DefinePrevUserUttered` more general * fix docstrings * add e2e docs (#7512) * change TED default params * Update docs/docs/training-data-format.mdx Co-authored-by: Tanja <[email protected]> * Update docs/docs/training-data-format.mdx Co-authored-by: Tanja <[email protected]> * Update docs/docs/training-data-format.mdx Co-authored-by: Tanja <[email protected]> * update docs phrases * break long line * add changelog * add deprecation config for dense dimension * fix new config parameters * add comments for config params * fix docstrings * update comments * update migration guide with new ted parameters * update changelog * a lot of bug fixes regarding updating config * fix updating config dict again * Update docs/docs/training-data-format.mdx Co-authored-by: Tanja <[email protected]> * remove new-old config param descriptions * remove else * add docstring * update changelog * Update docs/docs/stories.mdx Co-authored-by: Ella Rohm-Ensing <[email protected]> * Update docs/docs/stories.mdx Co-authored-by: Ella Rohm-Ensing <[email protected]> * Update docs/docs/training-data-format.mdx Co-authored-by: Ella Rohm-Ensing <[email protected]> * Update docs/docs/training-data-format.mdx Co-authored-by: Ella Rohm-Ensing <[email protected]> * update stories.mdx * update training-data-format.mdx * substitute we with you * don't include e2e in the stories example * make list * remove required * add migration guide for domain changes * mention explicitly * fix import * add link to ted policy * add docstrings * fix updating config * move e2e into separate paragraph * add blank line back * add increased train time note * Update docs/docs/migration-guide.mdx Co-authored-by: Ben 
Quachtran <[email protected]> * Update docs/docs/migration-guide.mdx Co-authored-by: Ben Quachtran <[email protected]> * Update docs/docs/training-data-format.mdx Co-authored-by: Akela Drissner-Schmid <[email protected]> * remove the link to the training data format page * remove the line' * remove the line * break long line * Update changelog/7496.improvement.md Co-authored-by: Tobias Wochinger <[email protected]> * Update docs/docs/stories.mdx Co-authored-by: Tobias Wochinger <[email protected]> * Update docs/docs/stories.mdx Co-authored-by: Tobias Wochinger <[email protected]> * Update docs/docs/stories.mdx Co-authored-by: Tobias Wochinger <[email protected]> * Update docs/docs/training-data-format.mdx Co-authored-by: Tobias Wochinger <[email protected]> * Update docs/docs/training-data-format.mdx Co-authored-by: Tobias Wochinger <[email protected]> * Update docs/docs/migration-guide.mdx Co-authored-by: Sam Sucik <[email protected]> * Update docs/docs/stories.mdx Co-authored-by: Tobias Wochinger <[email protected]> * Update docs/docs/stories.mdx Co-authored-by: Tobias Wochinger <[email protected]> * Update docs/docs/policies.mdx Co-authored-by: Sam Sucik <[email protected]> * Update docs/docs/policies.mdx Co-authored-by: Sam Sucik <[email protected]> * Update docs/docs/policies.mdx Co-authored-by: Sam Sucik <[email protected]> * Update docs/docs/migration-guide.mdx Co-authored-by: Sam Sucik <[email protected]> * Update docs/docs/stories.mdx Co-authored-by: Tobias Wochinger <[email protected]> * Update docs/docs/stories.mdx Co-authored-by: Tobias Wochinger <[email protected]> * Update docs/docs/stories.mdx Co-authored-by: Tobias Wochinger <[email protected]> * Update docs/docs/policies.mdx Co-authored-by: Sam Sucik <[email protected]> * Update docs/docs/policies.mdx Co-authored-by: Sam Sucik <[email protected]> * Update docs/docs/policies.mdx Co-authored-by: Sam Sucik <[email protected]> * Update docs/docs/policies.mdx Co-authored-by: Sam Sucik <[email 
protected]> * Update docs/docs/policies.mdx Co-authored-by: Sam Sucik <[email protected]> * Update docs/docs/stories.mdx Co-authored-by: Sam Sucik <[email protected]> * Update docs/docs/stories.mdx Co-authored-by: Sam Sucik <[email protected]> * Update docs/docs/stories.mdx Co-authored-by: Sam Sucik <[email protected]> * Update docs/docs/stories.mdx Co-authored-by: Sam Sucik <[email protected]> * Update docs/docs/stories.mdx Co-authored-by: Sam Sucik <[email protected]> * Update docs/docs/training-data-format.mdx Co-authored-by: Sam Sucik <[email protected]> * Update docs/docs/training-data-format.mdx Co-authored-by: Sam Sucik <[email protected]> * Update docs/docs/training-data-format.mdx Co-authored-by: Sam Sucik <[email protected]> * Update docs/docs/training-data-format.mdx Co-authored-by: Sam Sucik <[email protected]> * Update docs/docs/training-data-format.mdx Co-authored-by: Sam Sucik <[email protected]> * Update docs/docs/training-data-format.mdx Co-authored-by: Sam Sucik <[email protected]> * Update docs/docs/training-data-format.mdx Co-authored-by: Sam Sucik <[email protected]> * Update docs/docs/migration-guide.mdx Co-authored-by: Sam Sucik <[email protected]> * add example to migration guide * remove notes * rename changelog to feature * Update docs/docs/stories.mdx Co-authored-by: Tobias Wochinger <[email protected]> * expand explanation * update examples in docs to have the same topic as e2ebot * copy ted description from diet * update parameter description * fix overriding default config * Update docs/docs/training-data-format.mdx Co-authored-by: Tobias Wochinger <[email protected]> * Update docs/docs/training-data-format.mdx Co-authored-by: Tobias Wochinger <[email protected]> * add actions to doc stories * update story * more details in error message Co-authored-by: Tanja <[email protected]> Co-authored-by: Ella Rohm-Ensing <[email protected]> Co-authored-by: Tobias Wochinger <[email protected]> Co-authored-by: Ben Quachtran <[email protected]> 
Co-authored-by: Akela Drissner-Schmid <[email protected]> Co-authored-by: Sam Sucik <[email protected]> Co-authored-by: m-vdb <[email protected]> * correctly use mixin class + filter out abstract classes * add clarification comment * make entities a list * remove space * fix assigning variables * fix assigning variables * Update docs/docs/components.mdx Co-authored-by: Tobias Wochinger <[email protected]> * add empty line add end * fix breaking hash function (list is not hashable) Co-authored-by: Zhenya Razumovskaia <[email protected]> Co-authored-by: Tanja Bergmann <[email protected]> Co-authored-by: Tobias Wochinger <[email protected]> Co-authored-by: Joseph Juzl <[email protected]> Co-authored-by: Alexander Khizov <[email protected]> Co-authored-by: Daksh Varshneya <[email protected]> Co-authored-by: Johannes E. M. Mosig <[email protected]> Co-authored-by: Alan Nichol <[email protected]> Co-authored-by: Roberto <[email protected]> Co-authored-by: Ella Rohm-Ensing <[email protected]> Co-authored-by: Ben Quachtran <[email protected]> Co-authored-by: Akela Drissner-Schmid <[email protected]> Co-authored-by: Sam Sucik <[email protected]> Co-authored-by: m-vdb <[email protected]>
Proposed changes:
- `TEDPolicy`: add a `DefinePrevUserUtteredEntities` event to the policy prediction in case entities are predicted
- add entities to the `UserUttered` event if they are predicted

Related to #6670
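
The review discussion on this PR settled on adding a policy-predicted entity only if it is not already in the event's entities. A minimal sketch of that merge step follows. It is not the code from `rasa/shared/core/trackers.py`: the helper name `merge_policy_entities` and the entity dictionaries are hypothetical, chosen only to illustrate the check.

```python
from typing import Any, Dict, List

# Hypothetical alias: an entity is represented here as a plain dict.
Entity = Dict[str, Any]


def merge_policy_entities(
    existing: List[Entity], predicted: List[Entity]
) -> List[Entity]:
    """Return the existing entities plus any predicted ones not already present."""
    merged = list(existing)  # copy, so the original list is left unchanged
    for entity in predicted:
        if entity not in merged:  # dict equality: same keys and same values
            merged.append(entity)
    return merged


# Entities as they might come out of the NLU pipeline (illustrative values).
nlu_entities = [{"entity": "city", "value": "Berlin", "extractor": "DIETClassifier"}]

# Entities as a policy might predict them; the first one duplicates the NLU result.
policy_entities = [
    {"entity": "city", "value": "Berlin", "extractor": "DIETClassifier"},
    {"entity": "date", "value": "tomorrow", "extractor": "TEDPolicy"},
]

combined = merge_policy_entities(nlu_entities, policy_entities)
```

Returning a new list instead of mutating the input means the entities stored on a historical event stay untouched, so applying the merge repeatedly cannot grow the list.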
Status (please check what you already did):
- reformat files using `black` (please check Readme for instructions)