Commit
introduce end-to-end processing into rasa (#7496)
* fix a test
* action texts in importer added to action_text in domain
* a little end-to-end bot example
* implement hacky e2e prediction
* fix warnings
* fix user uttered featurization
* fix useruttered featurization
* 100 epochs
* fix writin bot action text to file
* store action text in wrong text
* add the comment to bot end-to-end utterance
* update printing e2e utterances
* fix single type input
* fix e2e prediction after action
* implement comparison between e2e policies
* reduce e2e confidence threshold
* fix rule policy
* fix non e2e prediction
* fix cleaning of working data
* fix in regex featurizer processing
* RasaModelData can handle 4D Tensors (#6833)
* handle 4d dense features
* padding for dense and sparse works
* update RasaModelData
* update RasaModelData tests
* update shape of sparse tensors
* update is_in_4d_format
* set eager back to False
* fix code quality issues
* formatting
* fix type issues
* refactoring
* update types
* formatting
* fix type issue
* subclass numpy array
* explicit specify number_of_dimensions
* clean up
* training is working again
* rename feature_dimension to units
* reset default eager values
* update comments
* refactoring
* fix type issue
* fix types
* review comments
* formatting
* Fix tests on e2e branch (#6984)
* fix tests in test_policies
* set use_text_for_featurization in example dialogues
* fix test for NLU training & test_rule_policy
* add missing import
* formatting
* fix more tests
* use intent for featurization by default
* fix test_processor
* fix test_trackers
* fix test_importer
* fix test_dialogues and test_policies
Co-authored-by: Vova Vv <[email protected]>
* Refactor creation of RasaModelData (#7010)
* fix tests in test_policies
* set use_text_for_featurization in example dialogues
* fix test for NLU training & test_rule_policy
* add missing import
* formatting
* fix more tests
* update doc strings
* refactor create_model_data in DIETClassifier
* create sub-methods
* make sure response selector trains
* reset default value of eager
* reset epoch of e2e example
* fix entity key
* fix tests, testing model is failing
* clean up
* fix import
* Fix issues in DIETClassifier
* remove zero features from DIETClassifier again
* add test
* remove whitespace in blank line
* clean up
* clean up model data utils
* fix type issue
* make sure only entity recognition works in diet
* Add tests for state and tracker featurizers (#7086)
* split test_featurizer.py into two files
* code style
* add test for prepare_from_domain
* add tracker featurizer tests
* add test
* fix imports in tests
* add more tests
* Add option "featurizers" to TEDPolicy (#7079)
* add FEATURIZERS to TEDPolicy parameters
* update tests
* fix import
* fix merging master
* Bring DIET into TED (#7131)
* add diet to ted
* reshape 4d tensors into 3d and back
* fix shapes in non eager mode
* make shape indices more general
* fix add_length
* add todo
* sentence features are now also 4D
* sequence length is 4D
* convert 4d to 3 during padding
* mask is 4d now
* bring mask in correct shape before transformer
* keep also the orginial dialogue length
* update doc strings
* use tf.scatter_nd to tranform 3d back to 4d
* move tensor transformation to _encode_features_per_attribute
* fix issues in _encode_features_per_attribute
* use correct dialogue length
* add comments
* clean up
* update constants
* Update rasa/utils/tensorflow/model_data.py
Co-authored-by: Vladimir Vlasov <[email protected]>
* review comment
* fix tests
* use correct attribute mask
* use 4d attribute mask
* set eager back to default
* fix test
* update _convert_to_original_shape
* add indices to model data
* fix tf.scatter_nd
* fix TED train and predict
* remove not needed constants
* fix failing tests
* update docstrings
* fix docstring issues
* review comments
* fix shape mismatches
Co-authored-by: Vova Vv <[email protected]>
Co-authored-by: Vladimir Vlasov <[email protected]>
* fix entities features
* resolve merge conflict in yaml_story_writer
* use story string when writing user uttered event
* create empty fakes (#7198)
* substitute fake features with empty arrays and use attribute mask to rebuild input
* remove unused import, remove comment
* refactor, add comments, add types
* support empty features
* add prepare_for_predict to precalculate self.all_labels_embed
* return to default config
* add error
* add prepare_for_predict to diet
* fix test_model_data_utils
* fix test gen_batch
* Update rasa/core/policies/ted_policy.py
Co-authored-by: Tanja <[email protected]>
* rename to filter fakes and create dial len beforehand
* add dtype=
* fix comment
* add comments about fake features
Co-authored-by: Tanja <[email protected]>
* Monster ted (#7262)
* add diet to ted
* reshape 4d tensors into 3d and back
* fix shapes in non eager mode
* make shape indices more general
* fix add_length
* add todo
* sentence features are now also 4D
* sequence length is 4D
* convert 4d to 3 during padding
* mask is 4d now
* bring mask in correct shape before transformer
* keep also the orginial dialogue length
* update doc strings
* use tf.scatter_nd to tranform 3d back to 4d
* move tensor transformation to _encode_features_per_attribute
* fix issues in _encode_features_per_attribute
* use correct dialogue length
* add comments
* clean up
* update constants
* review comment
* keep entity dict
* create tag_ids for TED
* clean up after merge
* add batch_loss_entities (not working)
* concatenate text and dialogue transformer output
* get last dialogue before CRF
* add predicting entities
* clean up
* differentiate between max history tracker featurizer used or not
* add todo
* add comments
* use correct tag id mapping
* check if text exists
* fix frozenset issues
* ignore actual entity value in MemoizationPolicy
* fix import
* fix some tests
* update after merge
* use python if instead of tf.cond
* we need to return a tensor in tf.cond instead of None
* create entity tags for all texts
* update batch loss entities (not yet working)
* input to entity loss
* update entity prediction
* fix randomness and shapes
* fix ffnn encoding layer name
* add todo
* Update rasa/core/policies/ted_policy.py
Co-authored-by: Tanja <[email protected]>
* Update rasa/core/featurizers/single_state_featurizer.py
Co-authored-by: Tanja <[email protected]>
* rename to entity_tag_id_mapping
* add comment to last dial mask
* add comments to tf.cond
* add docstrings
* refactor number of dims check
* rename zero features to fake features
* pre compute dialogue_indices
* create helper methods
* calculate number of units for text_transformer_output
* add todo
* fix tests
* use indices constant
Co-authored-by: Tanja Bergmann <[email protected]>
* refactor e2e ted choice (#7285)
* refactor e2e ted choice
* add comment why prediction batch of size 2
* fix test policies
* fix test ensemble
* fix e2e prediction
* utter end-to-end bot responses
* add docstring
* deprecate unused method
* add changelog for deprecation
* log end-to-end action with text
* pass flag instead of determining end-to-end utterance on the fly.
* Revert "pass flag instead of determining end-to-end utterance on the fly."
This reverts commit 868a715.
* remove `events_for_prediction`
* remove unused import
* fix e2e training edge cases
* use special action for end-to-end responses
* rename to `from_action_name_or_text`
* clarify in comment
* rename to `ActionEndToEndResponse`
* fix form tests
* add docstrings
* remove useless test
* remove user text if intent is present
* remove story read check for user and intent message
* ignore entities in text if intent is present
* Update rasa/shared/nlu/training_data/message.py
Co-authored-by: Alexander Khizov <[email protected]>
* Update rasa/shared/nlu/training_data/training_data.py
Co-authored-by: Alexander Khizov <[email protected]>
* PR comments
* black
* rename var
* remove unused `_action_event_for_prediction`
* improve phrasing / typing / code structure
* add test to ensure action text with `utter_` as start works
* rename `Domain.action_names` to `Domain.action_names_or_texts`
* fix docstrings
* remove unused imports
* test and fix writing YAML stories
* move `MarkdownStoryWriter` tests to separate file
* use `tmp_path`
* consider end-to-end stories correctly
* fix story reading for retrieval intents
* fix missing renames for `prepare_from_domain`
* fixup for last merge in from `master`
* dump story not as test story
* fix docstring errors
* remove unused method (not used in Rasa X either)
* raise if printing end-to-end things in Markdown
* add todos
* fix error with entity formatting
* move to `rasa.shared`
* remove CoreDataImporter
* change fingerprinting to use yaml writer
* fix tests failing due to new default story file
* adapt remaining parts to `as_story_string` failing if end-to-end event
* remove `as_story_string` from story validator
* don't add e2e entities as features (#7435)
* don't add e2e entities as features
* remove entities as input features for text
* simply don't add entities for e2e user utterances
* remove non existent import
* fix test
* add comment
* only train NLU model if data or end to end
* fix filter units
* fix import
* read and write in test
* fix displaying of end-to-end actions in rasa interactive
* skip warning for end-to-end user messages in training data
* add docs link
* remove trailing whitespace
* return `NotImplemented` if other class
* remove `md_` as it's not related to md
* add docstrings to entire module
* add more docstrings
* increase timeout due to failing windows tests
* improve string representation of `UserUttered`
* fix hashing of `UserUttered`
* Add entities to UserUttered event if they are predicted via a policy (#7443)
* remove unused import
* remove unused import
* fix problematic docstrings
* specify yaml content-type
* fix docstrings in featurearray
* fix docstrings
* Update rasa/core/policies/rule_policy.py
Co-authored-by: Tobias Wochinger <[email protected]>
* Update rasa/core/policies/rule_policy.py
Co-authored-by: Tobias Wochinger <[email protected]>
* Update rasa/core/policies/rule_policy.py
Co-authored-by: Tobias Wochinger <[email protected]>
* Update rasa/utils/tensorflow/model_data_utils.py
Co-authored-by: Daksh Varshneya <[email protected]>
* Update rasa/utils/tensorflow/model_data_utils.py
Co-authored-by: Daksh Varshneya <[email protected]>
* add tests for define featurization in ensemble
* update docstrings
* add tests for e2e rules
* Update rasa/core/policies/ted_policy.py
Co-authored-by: Daksh Varshneya <[email protected]>
* Update rasa/core/featurizers/tracker_featurizers.py
Co-authored-by: Daksh Varshneya <[email protected]>
* Update rasa/core/featurizers/tracker_featurizers.py
Co-authored-by: Daksh Varshneya <[email protected]>
* Update rasa/core/policies/sklearn_policy.py
Co-authored-by: Daksh Varshneya <[email protected]>
* Update rasa/core/policies/ted_policy.py
Co-authored-by: Daksh Varshneya <[email protected]>
* Update rasa/core/policies/ted_policy.py
Co-authored-by: Daksh Varshneya <[email protected]>
* Add experimental warning to e2e training (#7524)
* Add experimental warning to e2e training
* re-order imports
* update user messages in apply_to of define events (#7503)
* update user messages in apply_to of define events
* fix e2e entity prediction in ted
* rename method
* fix entity featurization for text input
* fix entity prediction in ted
* remove safeguard
* fix actionexecuted string
* fix comments
* keep __str__ inconsistent for actionexecuted
* increase number of epochs for ted
* add Define events in applied events
* clean states during prediction
* review comments
* add _prediction_with_unhappy_path
* review comments
* remove unneeded variable
* review comments in DIET
* renamed empty_features to absent_features
* remove unused imports
* update docstrings
* refactor entity data creation into a separate method
* Update rasa/shared/core/domain.py
Co-authored-by: Joe Juzl <[email protected]>
* Update rasa/shared/core/domain.py
Co-authored-by: Joe Juzl <[email protected]>
* Update rasa/shared/core/domain.py
Co-authored-by: Joe Juzl <[email protected]>
* shorten the long comment
* create separate constant for prediction features
* Update rasa/core/policies/ted_policy.py
Co-authored-by: Daksh Varshneya <[email protected]>
* add comment why we add 1 to sequence features
* rephrase last dial comment
* rephrase comments
* Add description to RasaModelData.
* explain choice of warning
* fix e2e train tests (#7540)
* remove prints
* use precise len in tests
* remove blank line
* type annotations
* fix dry run test
* fix test_surface_attributes
* fix typo in test
* Use tokens for story structure validation (#7436)
* Add tests
* Draft first implementation
* Fix random sorting before hash
* Update doc strings
* Add doc strings
* Fix minor issues
* Add config file loading
* Fix some docstrings
* Make test stories part of the test
* Update tests
* Fix minor issues
* Fix config param argument
* Add TrainingType to tests to avoid config change
* Delete hash again
* Update docs
* Update rasa/core/training/story_conflict.py
Co-authored-by: Tobias Wochinger <[email protected]>
* Update rasa/shared/nlu/training_data/features.py
Co-authored-by: Tobias Wochinger <[email protected]>
* Fix minor issues
Co-authored-by: Tobias Wochinger <[email protected]>
* move constant to red
* do not enumerate
* fix domain test conflicts
* add example showing e2e functionality (#7535)
* add basic files
* add NLU examples
* increase epochs and remove memoization policy
* Update examples/e2ebot/config.yml
Co-authored-by: Vladimir Vlasov <[email protected]>
* Update examples/e2ebot/domain.yml
Co-authored-by: Vladimir Vlasov <[email protected]>
* Update examples/e2ebot/data/stories.yml
Co-authored-by: Vladimir Vlasov <[email protected]>
* added a story with a bot utterance instead of an action label
* remove bot utterance again
Co-authored-by: Vladimir Vlasov <[email protected]>
* code quality check
* Update tests/test_server.py
Co-authored-by: Tobias Wochinger <[email protected]>
* Update rasa/shared/core/training_data/story_reader/yaml_story_reader.py
Co-authored-by: Alexander Khizov <[email protected]>
* add types
* Update tests/test_train.py
Co-authored-by: Tobias Wochinger <[email protected]>
* fix tests
* add types
* add types
* Update tests/test_train.py
Co-authored-by: Tobias Wochinger <[email protected]>
* add types
* Update tests/test_train.py
Co-authored-by: Tobias Wochinger <[email protected]>
* Update rasa/shared/nlu/training_data/features.py
Co-authored-by: Alexander Khizov <[email protected]>
* Update rasa/shared/nlu/training_data/message.py
Co-authored-by: Alexander Khizov <[email protected]>
* Update rasa/shared/nlu/training_data/message.py
Co-authored-by: Alexander Khizov <[email protected]>
* Update rasa/shared/nlu/training_data/message.py
Co-authored-by: Alexander Khizov <[email protected]>
* Update rasa/shared/nlu/training_data/training_data.py
Co-authored-by: Alexander Khizov <[email protected]>
* Update rasa/shared/nlu/training_data/training_data.py
Co-authored-by: Alexander Khizov <[email protected]>
* Update rasa/shared/nlu/training_data/training_data.py
Co-authored-by: Alexander Khizov <[email protected]>
* Update rasa/shared/nlu/training_data/message.py
Co-authored-by: Alexander Khizov <[email protected]>
* Update rasa/shared/nlu/training_data/message.py
Co-authored-by: Alexander Khizov <[email protected]>
* Update rasa/shared/nlu/training_data/message.py
Co-authored-by: Alexander Khizov <[email protected]>
* Update rasa/shared/nlu/training_data/message.py
Co-authored-by: Alexander Khizov <[email protected]>
* Update rasa/shared/nlu/training_data/message.py
Co-authored-by: Alexander Khizov <[email protected]>
* Update rasa/shared/nlu/training_data/message.py
Co-authored-by: Alexander Khizov <[email protected]>
* Update rasa/shared/nlu/training_data/training_data.py
Co-authored-by: Alexander Khizov <[email protected]>
* remove unused parameter
* assert return value
* use proper objects
* use dedent
* PR review comments
* use function scoped fixture
* create new domain to avoid interacting with session one
* use safer default for __eq__
* remove not required ignore comment
* fix docstrings
* mark `Event` class as abtract
* use correct docstring
* rename to `AlwaysEqualEventMixin`
* mark methods as abstract
* add docstring
* move comment out of docstring
* don't persist changed entities + autofill slots for policy entities (#7553)
* don't persist changed entities
* make tracker state return combined `UserUttered` event
* autofill slots for policy entities
* made if more explicit
* use constants
* rename `DefinePrevUserUtteredEntities` to `EntitiesAdded`
* rename and make `DefinePrevUserUttered` more general
* fix docstrings
* add e2e docs (#7512)
* change TED default params
* Update docs/docs/training-data-format.mdx
Co-authored-by: Tanja <[email protected]>
* Update docs/docs/training-data-format.mdx
Co-authored-by: Tanja <[email protected]>
* Update docs/docs/training-data-format.mdx
Co-authored-by: Tanja <[email protected]>
* update docs phrases
* break long line
* add changelog
* add deprecation config for dense dimension
* fix new config parameters
* add comments for config params
* fix docstrings
* update comments
* update migration guide with new ted parameters
* update changelog
* a lot of bug fixes regarding updating config
* fix updating config dict again
* Update docs/docs/training-data-format.mdx
Co-authored-by: Tanja <[email protected]>
* remove new-old config param descriptions
* remove else
* add docstring
* update changelog
* Update docs/docs/stories.mdx
Co-authored-by: Ella Rohm-Ensing <[email protected]>
* Update docs/docs/stories.mdx
Co-authored-by: Ella Rohm-Ensing <[email protected]>
* Update docs/docs/training-data-format.mdx
Co-authored-by: Ella Rohm-Ensing <[email protected]>
* Update docs/docs/training-data-format.mdx
Co-authored-by: Ella Rohm-Ensing <[email protected]>
* update stories.mdx
* update training-data-format.mdx
* substitute we with you
* don't include e2e in the stories example
* make list
* remove required
* add migration guide for domain changes
* mention explicitly
* fix import
* add link to ted policy
* add docstrings
* fix updating config
* move e2e into separate paragraph
* add blank line back
* add increased train time note
* Update docs/docs/migration-guide.mdx
Co-authored-by: Ben Quachtran <[email protected]>
* Update docs/docs/migration-guide.mdx
Co-authored-by: Ben Quachtran <[email protected]>
* Update docs/docs/training-data-format.mdx
Co-authored-by: Akela Drissner-Schmid <[email protected]>
* remove the link to the training data format page
* remove the line'
* remove the line
* break long line
* Update changelog/7496.improvement.md
Co-authored-by: Tobias Wochinger <[email protected]>
* Update docs/docs/stories.mdx
Co-authored-by: Tobias Wochinger <[email protected]>
* Update docs/docs/stories.mdx
Co-authored-by: Tobias Wochinger <[email protected]>
* Update docs/docs/stories.mdx
Co-authored-by: Tobias Wochinger <[email protected]>
* Update docs/docs/training-data-format.mdx
Co-authored-by: Tobias Wochinger <[email protected]>
* Update docs/docs/training-data-format.mdx
Co-authored-by: Tobias Wochinger <[email protected]>
* Update docs/docs/migration-guide.mdx
Co-authored-by: Sam Sucik <[email protected]>
* Update docs/docs/stories.mdx
Co-authored-by: Tobias Wochinger <[email protected]>
* Update docs/docs/stories.mdx
Co-authored-by: Tobias Wochinger <[email protected]>
* Update docs/docs/policies.mdx
Co-authored-by: Sam Sucik <[email protected]>
* Update docs/docs/policies.mdx
Co-authored-by: Sam Sucik <[email protected]>
* Update docs/docs/policies.mdx
Co-authored-by: Sam Sucik <[email protected]>
* Update docs/docs/migration-guide.mdx
Co-authored-by: Sam Sucik <[email protected]>
* Update docs/docs/stories.mdx
Co-authored-by: Tobias Wochinger <[email protected]>
* Update docs/docs/stories.mdx
Co-authored-by: Tobias Wochinger <[email protected]>
* Update docs/docs/stories.mdx
Co-authored-by: Tobias Wochinger <[email protected]>
* Update docs/docs/policies.mdx
Co-authored-by: Sam Sucik <[email protected]>
* Update docs/docs/policies.mdx
Co-authored-by: Sam Sucik <[email protected]>
* Update docs/docs/policies.mdx
Co-authored-by: Sam Sucik <[email protected]>
* Update docs/docs/policies.mdx
Co-authored-by: Sam Sucik <[email protected]>
* Update docs/docs/policies.mdx
Co-authored-by: Sam Sucik <[email protected]>
* Update docs/docs/stories.mdx
Co-authored-by: Sam Sucik <[email protected]>
* Update docs/docs/stories.mdx
Co-authored-by: Sam Sucik <[email protected]>
* Update docs/docs/stories.mdx
Co-authored-by: Sam Sucik <[email protected]>
* Update docs/docs/stories.mdx
Co-authored-by: Sam Sucik <[email protected]>
* Update docs/docs/stories.mdx
Co-authored-by: Sam Sucik <[email protected]>
* Update docs/docs/training-data-format.mdx
Co-authored-by: Sam Sucik <[email protected]>
* Update docs/docs/training-data-format.mdx
Co-authored-by: Sam Sucik <[email protected]>
* Update docs/docs/training-data-format.mdx
Co-authored-by: Sam Sucik <[email protected]>
* Update docs/docs/training-data-format.mdx
Co-authored-by: Sam Sucik <[email protected]>
* Update docs/docs/training-data-format.mdx
Co-authored-by: Sam Sucik <[email protected]>
* Update docs/docs/training-data-format.mdx
Co-authored-by: Sam Sucik <[email protected]>
* Update docs/docs/training-data-format.mdx
Co-authored-by: Sam Sucik <[email protected]>
* Update docs/docs/migration-guide.mdx
Co-authored-by: Sam Sucik <[email protected]>
* add example to migration guide
* remove notes
* rename changelog to feature
* Update docs/docs/stories.mdx
Co-authored-by: Tobias Wochinger <[email protected]>
* expand explanation
* update examples in docs to have the same topic as e2ebot
* copy ted description from diet
* update parameter description
* fix overriding default config
* Update docs/docs/training-data-format.mdx
Co-authored-by: Tobias Wochinger <[email protected]>
* Update docs/docs/training-data-format.mdx
Co-authored-by: Tobias Wochinger <[email protected]>
* add actions to doc stories
* update story
* more details in error message
Co-authored-by: Tanja <[email protected]>
Co-authored-by: Ella Rohm-Ensing <[email protected]>
Co-authored-by: Tobias Wochinger <[email protected]>
Co-authored-by: Ben Quachtran <[email protected]>
Co-authored-by: Akela Drissner-Schmid <[email protected]>
Co-authored-by: Sam Sucik <[email protected]>
Co-authored-by: m-vdb <[email protected]>
* correctly use mixin class + filter out abstract classes
* add clarification comment
* make entities a list
* remove space
* fix assigning variables
* fix assigning variables
* Update docs/docs/components.mdx
Co-authored-by: Tobias Wochinger <[email protected]>
* add empty line add end
* fix breaking hash function (list is not hashable)
Co-authored-by: Zhenya Razumovskaia <[email protected]>
Co-authored-by: Tanja Bergmann <[email protected]>
Co-authored-by: Tobias Wochinger <[email protected]>
Co-authored-by: Joseph Juzl <[email protected]>
Co-authored-by: Alexander Khizov <[email protected]>
Co-authored-by: Daksh Varshneya <[email protected]>
Co-authored-by: Johannes E. M. Mosig <[email protected]>
Co-authored-by: Alan Nichol <[email protected]>
Co-authored-by: Roberto <[email protected]>
Co-authored-by: Ella Rohm-Ensing <[email protected]>
Co-authored-by: Ben Quachtran <[email protected]>
Co-authored-by: Akela Drissner-Schmid <[email protected]>
Co-authored-by: Sam Sucik <[email protected]>
Co-authored-by: m-vdb <[email protected]>