update user messages in apply_to of define events #7503
@tabergma I was wrong, sorry. Things are a bit more complicated 🙈: the batch dimension of the entity prediction is not the batch size, but the number of last text inputs in the batch (if the max history featurizer is used; otherwise, all text inputs).
# sub_state is transformed to frozenset because we will later hash it
# for deduplication
entities = tuple(self._get_featurized_entities(latest_message))
entities = tuple(
    self._get_featurized_entities(latest_message)
Nearly there, but not entirely.
This still fails because `self.latest_message` is just a pointer to the event.
def test_tobi():
    tracker = DialogueStateTracker.from_events(
        "Vova",
        evts=[
            ActionExecuted(ACTION_LISTEN_NAME),
            UserUttered("hi", intent={"name": "greet"}),
            DefinePrevUserUtteredFeaturization(True),
            DefinePrevUserUtteredEntities(
                entities=[{"entity": "entity1", "value": "value1"}]
            ),
        ],
    )
    user = list(tracker.events)[1]
    assert isinstance(user, UserUttered)
    # Fails
    assert not user.entities
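The aliasing behind this failure can be reproduced without Rasa. The following is a minimal sketch using hypothetical stand-in classes (`FakeUserUttered` and `FakeTracker` are illustrative names, not Rasa APIs): because `latest_message` refers to the same object that sits in the event list, attaching entities through it also changes the stored event.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class FakeUserUttered:
    """Hypothetical stand-in for a user event; not the Rasa class."""

    text: str
    entities: List[Dict[str, Any]] = field(default_factory=list)


class FakeTracker:
    """Hypothetical tracker that keeps a reference to the latest user event."""

    def __init__(self) -> None:
        self.events: List[FakeUserUttered] = []
        self.latest_message: Optional[FakeUserUttered] = None

    def update(self, event: FakeUserUttered) -> None:
        self.events.append(event)
        # No copy is made: both names now point at the same object.
        self.latest_message = event


tracker = FakeTracker()
tracker.update(FakeUserUttered("hi"))

# Mutating through `latest_message` ...
tracker.latest_message.entities.append({"entity": "entity1", "value": "value1"})

# ... is visible on the event stored in the history.
stored_event = tracker.events[0]
shares_object = stored_event is tracker.latest_message
```

In this sketch the stored event ends up carrying the entity, which mirrors why `assert not user.entities` fails in the test above.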
it should fail
we want user uttered to have entities
but this means that when we store the tracker, we store the updated version of `UserUttered`, which would change history.
that's fine
There is no need to rely on a fragile thing such as the time of persistence.
> We should only persist the UserUttered event after policy prediction has happened and all the policy events are applied
There is no need for that, as we have the two additional / new events. We will just replay the events and come back to the current state. When we discussed the timing of persistence, it was when we were considering whether we need the additional events after all.
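The replay argument can be sketched as follows, assuming a simplified event model in which every event exposes an `apply_to` method. The class names (`State`, `UserSaid`, `EntitiesDefined`) are hypothetical and only illustrate the idea; they are not the Rasa implementations. The stored user event is never modified, and the entities reappear whenever the log is replayed.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Iterable, List, Optional, Tuple


@dataclass
class State:
    """Hypothetical derived state that is rebuilt from the event log."""

    latest_text: Optional[str] = None
    latest_entities: List[Dict[str, Any]] = field(default_factory=list)


@dataclass(frozen=True)
class UserSaid:
    """Hypothetical immutable user event."""

    text: str

    def apply_to(self, state: State) -> None:
        state.latest_text = self.text
        state.latest_entities = []


@dataclass(frozen=True)
class EntitiesDefined:
    """Hypothetical follow-up event that adds entities to the latest message."""

    entities: Tuple[Dict[str, Any], ...]

    def apply_to(self, state: State) -> None:
        state.latest_entities = list(self.entities)


def replay(events: Iterable[Any]) -> State:
    """Rebuild the derived state by applying every event in order."""
    state = State()
    for event in events:
        event.apply_to(state)
    return state


log = [
    UserSaid("hi"),
    EntitiesDefined(({"entity": "entity1", "value": "value1"},)),
]
state = replay(log)
```

With this design the persisted log stays an exact record of what happened, and the moment of persistence does not matter, since any reader can replay the log to recover the current state.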
yeah, but we use them differently
> It means that the TrackerStore is no longer a 1:1 representation of what happened.
which is good, since we effectively made `UserUttered` a dynamic event
This is good for the ML models, but it's not good for the system itself. Breaking the immutability property is opening Pandora's box, in my opinion.
I saw the other comment, I see your point
@wochinge since we established that the problem exists regardless of this change, I'd like to merge this PR, because it contains other fixes.
I left another comment on the `ActionExecuted` class which should be considered.
We can continue the discussion about mutating the `UserUttered` event on the other branch.
@wochinge do you have an idea how the changes in this PR break the failing tests? I don't understand why they depend on each other.
rasa/shared/core/trackers.py (Outdated)
applied_events.append(event)
else:
elif not isinstance(event, DefinePrevUserUttered):
@wochinge for some reason this change breaks everything. If I don't include the `Define...` events in `applied_events`, sometimes they're not applied. I'll remove it from here. Anyhow, this should be redesigned.
Mhm, then we should investigate why this breaks everything. That can only mean that events are not applied properly to the tracker.
Maybe the problem is that your `apply_to` implementations manipulate `last_user_message` and `last_action` instead of the events in `tracker.events` 🤔 They should refer to the same things, but maybe something is actually deep-copying them.
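The deep-copy suspicion is easy to illustrate in isolation. This is a minimal sketch with plain dictionaries standing in for events (an assumption made for brevity, not how Rasa stores events): an alias sees later mutations, while a `copy.deepcopy` snapshot does not.

```python
import copy

# A plain dict stands in for a user event (illustrative only).
event = {"text": "hi", "entities": []}
events = [event]

latest_message = events[0]             # alias of the stored event
copied_events = copy.deepcopy(events)  # independent snapshot

latest_message["entities"].append({"entity": "entity1", "value": "value1"})

# The alias and the original list share one object; the deep copy does not.
alias_updated = len(events[0]["entities"]) == 1
copy_updated = len(copied_events[0]["entities"]) == 1
```

If some code path works on such a snapshot, changes made through `last_user_message` would silently be missing from it.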
Ah, now I understand it! Yes, you definitely still have to add the event to the applied events.
If you don't add it, then the events are not applied in the new tracker (as it doesn't get them) and everything will break
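A rough sketch of why the filter matters, assuming events are encoded as hypothetical `(kind, payload)` tuples and that a new tracker is built only from the filtered list. The helper names `filter_events` and `build_state` are illustrative, not functions from Rasa.

```python
from typing import Any, Dict, Iterable, List, Tuple

Event = Tuple[str, Any]  # hypothetical encoding: (event kind, payload)


def filter_events(events: Iterable[Event], skip_kinds: Tuple[str, ...] = ()) -> List[Event]:
    """Return the events that are handed over to the new tracker."""
    return [event for event in events if event[0] not in skip_kinds]


def build_state(events: Iterable[Event]) -> Dict[str, Any]:
    """Rebuild the latest-message state by applying the events in order."""
    state: Dict[str, Any] = {"text": None, "entities": []}
    for kind, payload in events:
        if kind == "user":
            state["text"] = payload
            state["entities"] = []
        elif kind == "define_entities":
            state["entities"] = list(payload)
    return state


history: List[Event] = [
    ("user", "hi"),
    ("define_entities", [{"entity": "entity1", "value": "value1"}]),
]

# Keeping the define event: the new tracker ends up with the entities.
with_define = build_state(filter_events(history))

# Dropping it: the new tracker never learns about the entities.
without_define = build_state(filter_events(history, skip_kinds=("define_entities",)))
```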
> but the new tracker should already get new updated UserUttered
There is no need to. The new tracker will call `apply_to` on each new event.
but then we do it millions of times, which doesn't feel efficient to me
We actually need to do this for all of the events (e.g. `restarted` needs to reset a lot of state). I think the problem is rather in the featurization.
What do you mean? In the core featurizers? Theoretically, they should always receive `UserUttered` events where their featurization is decided.
I'm not sure about this. Anyway, let's close this discussion and continue it in a comment on the large PR.
* fix a test * action texts in importer added to action_text in domain * a little end-to-end bot example * implement hacky e2e prediction * fix warnings * fix user uttered featurization * fix useruttered featurization * 100 epochs * fix writin bot action text to file * store action text in wrong text * add the comment to bot end-to-end utterance * update printing e2e utterances * fix single type input * fix e2e prediction after action * implement comparison between e2e policies * reduce e2e confidence threshold * fix rule policy * fix non e2e prediction * fix cleaning of working data * fix in regex featurizer processing * RasaModelData can handle 4D Tensors (#6833) * handle 4d dense features * padding for dense and sparse works * update RasaModelData * update RasaModelData tests * update shape of sparse tensors * update is_in_4d_format * set eager back to False * fix code quality issues * formatting * fix type issues * refactoring * update types * formatting * fix type issue * subclass numpy array * explicit specify number_of_dimensions * clean up * training is working again * rename feature_dimension to units * reset default eager values * update comments * refactoring * fix type issue * fix types * review comments * formatting * Fix tests on e2e branch (#6984) * fix tests in test_policies * set use_text_for_featurization in example dialogues * fix test for NLU training & test_rule_policy * add missing import * formatting * fix more tests * use intent for featurization by default * fix test_processor * fix test_trackers * fix test_importer * fix test_dialogues and test_policies Co-authored-by: Vova Vv <[email protected]> * Refactor creation of RasaModelData (#7010) * fix tests in test_policies * set use_text_for_featurization in example dialogues * fix test for NLU training & test_rule_policy * add missing import * formatting * fix more tests * update doc strings * refactor create_model_data in DIETClassifier * create sub-methods * make sure response selector 
trains * reset default value of eager * reset epoch of e2e example * fix entity key * fix tests, testing model is failing * clean up * fix import * Fix issues in DIETClassifier * remove zero features from DIETClassifier again * add test * remove whitespace in blank line * clean up * clean up model data utils * fix type issue * make sure only entity recognition works in diet * Add tests for state and tracker featurizers (#7086) * split test_featurizer.py into two files * code style * add test for prepare_from_domain * add tracker featurizer tests * add test * fix imports in tests * add more tests * Add option "featurizers" to TEDPolicy (#7079) * add FEATURIZERS to TEDPolicy parameters * update tests * fix import * fix merging master * Bring DIET into TED (#7131) * add diet to ted * reshape 4d tensors into 3d and back * fix shapes in non eager mode * make shape indices more general * fix add_length * add todo * sentence features are now also 4D * sequence length is 4D * convert 4d to 3 during padding * mask is 4d now * bring mask in correct shape before transformer * keep also the orginial dialogue length * update doc strings * use tf.scatter_nd to tranform 3d back to 4d * move tensor transformation to _encode_features_per_attribute * fix issues in _encode_features_per_attribute * use correct dialogue length * add comments * clean up * update constants * Update rasa/utils/tensorflow/model_data.py Co-authored-by: Vladimir Vlasov <[email protected]> * review comment * fix tests * use correct attribute mask * use 4d attribute mask * set eager back to default * fix test * update _convert_to_original_shape * add indices to model data * fix tf.scatter_nd * fix TED train and predict * remove not needed constants * fix failing tests * update docstrings * fix docstring issues * review comments * fix shape mismatches Co-authored-by: Vova Vv <[email protected]> Co-authored-by: Vladimir Vlasov <[email protected]> * fix entities features * resolve merge conflict in 
yaml_story_writer * use story string when writing user uttered event * create empty fakes (#7198) * substitute fake features with empty arrays and use attribute mask to rebuild input * remove unused import, remove comment * refactor, add comments, add types * support empty features * add prepare_for_predict to precalculate self.all_labels_embed * return to default config * add error * add prepare_for_predict to diet * fix test_model_data_utils * fix test gen_batch * Update rasa/core/policies/ted_policy.py Co-authored-by: Tanja <[email protected]> * rename to filter fakes and create dial len beforehand * add dtype= * fix comment * add comments about fake features Co-authored-by: Tanja <[email protected]> * Monster ted (#7262) * add diet to ted * reshape 4d tensors into 3d and back * fix shapes in non eager mode * make shape indices more general * fix add_length * add todo * sentence features are now also 4D * sequence length is 4D * convert 4d to 3 during padding * mask is 4d now * bring mask in correct shape before transformer * keep also the orginial dialogue length * update doc strings * use tf.scatter_nd to tranform 3d back to 4d * move tensor transformation to _encode_features_per_attribute * fix issues in _encode_features_per_attribute * use correct dialogue length * add comments * clean up * update constants * review comment * keep entity dict * create tag_ids for TED * clean up after merge * add batch_loss_entities (not working) * concatenate text and dialogue transformer output * get last dialogue before CRF * add predicting entities * clean up * differentiate between max history tracker featurizer used or not * add todo * add comments * use correct tag id mapping * check if text exists * fix frozenset issues * ignore actual entity value in MemoizationPolicy * fix import * fix some tests * update after merge * use python if instead of tf.cond * we need to return a tensor in tf.cond instead of None * create entity tags for all texts * update batch loss 
entities (not yet working) * input to entity loss * update entity prediction * fix randomness and shapes * fix ffnn encoding layer name * add todo * Update rasa/core/policies/ted_policy.py Co-authored-by: Tanja <[email protected]> * Update rasa/core/featurizers/single_state_featurizer.py Co-authored-by: Tanja <[email protected]> * rename to entity_tag_id_mapping * add comment to last dial mask * add comments to tf.cond * add docstrings * refactor number of dims check * rename zero features to fake features * pre compute dialogue_indices * create helper methods * calculate number of units for text_transformer_output * add todo * fix tests * use indices constant Co-authored-by: Tanja Bergmann <[email protected]> * refactor e2e ted choice (#7285) * refactor e2e ted choice * add comment why prediction batch of size 2 * fix test policies * fix test ensemble * fix e2e prediction * utter end-to-end bot responses * add docstring * deprecate unused method * add changelog for deprecation * log end-to-end action with text * pass flag instead of determining end-to-end utterance on the fly. * Revert "pass flag instead of determining end-to-end utterance on the fly." This reverts commit 868a715. 
* remove `events_for_prediction` * remove unused import * fix e2e training edge cases * use special action for end-to-end responses * rename to `from_action_name_or_text` * clarify in comment * rename to `ActionEndToEndResponse` * fix form tests * add docstrings * remove useless test * remove user text if intent is present * remove story read check for user and intent message * ignore entities in text if intent is present * Update rasa/shared/nlu/training_data/message.py Co-authored-by: Alexander Khizov <[email protected]> * Update rasa/shared/nlu/training_data/training_data.py Co-authored-by: Alexander Khizov <[email protected]> * PR comments * black * rename var * remove unused `_action_event_for_prediction` * improve phrasing / typing / code structure * add test to ensure action text with `utter_` as start works * rename `Domain.action_names` to `Domain.action_names_or_texts` * fix docstrings * remove unused imports * test and fix writing YAML stories * move `MarkdownStoryWriter` tests to separate file * use `tmp_path` * consider end-to-end stories correctly * fix story reading for retrieval intents * fix missing renames for `prepare_from_domain` * fixup for last merge in from `master` * dump story not as test story * fix docstring errors * remove unused method (not used in Rasa X either) * raise if printing end-to-end things in Markdown * add todos * fix error with entity formatting * move to `rasa.shared` * remove CoreDataImporter * change fingerprinting to use yaml writer * fix tests failing due to new default story file * adapt remaining parts to `as_story_string` failing if end-to-end event * remove `as_story_string` from story validator * don't add e2e entities as features (#7435) * don't add e2e entities as features * remove entities as input features for text * simply don't add entities for e2e user utterances * remove non existent import * fix test * add comment * only train NLU model if data or end to end * fix filter units * fix import * read and 
write in test * fix displaying of end-to-end actions in rasa interactive * skip warning for end-to-end user messages in training data * add docs link * remove trailing whitespace * return `NotImplemented` if other class * remove `md_` as it's not related to md * add docstrings to entire module * add more docstrings * increase timeout due to failing windows tests * improve string representation of `UserUttered` * fix hashing of `UserUttered` * Add entities to UserUttered event if they are predicted via a policy (#7443) * remove unused import * remove unused import * fix problematic docstrings * specify yaml content-type * fix docstrings in featurearray * fix docstrings * Update rasa/core/policies/rule_policy.py Co-authored-by: Tobias Wochinger <[email protected]> * Update rasa/core/policies/rule_policy.py Co-authored-by: Tobias Wochinger <[email protected]> * Update rasa/core/policies/rule_policy.py Co-authored-by: Tobias Wochinger <[email protected]> * Update rasa/utils/tensorflow/model_data_utils.py Co-authored-by: Daksh Varshneya <[email protected]> * Update rasa/utils/tensorflow/model_data_utils.py Co-authored-by: Daksh Varshneya <[email protected]> * add tests for define featurization in ensemble * update docstrings * add tests for e2e rules * Update rasa/core/policies/ted_policy.py Co-authored-by: Daksh Varshneya <[email protected]> * Update rasa/core/featurizers/tracker_featurizers.py Co-authored-by: Daksh Varshneya <[email protected]> * Update rasa/core/featurizers/tracker_featurizers.py Co-authored-by: Daksh Varshneya <[email protected]> * Update rasa/core/policies/sklearn_policy.py Co-authored-by: Daksh Varshneya <[email protected]> * Update rasa/core/policies/ted_policy.py Co-authored-by: Daksh Varshneya <[email protected]> * Update rasa/core/policies/ted_policy.py Co-authored-by: Daksh Varshneya <[email protected]> * Add experimental warning to e2e training (#7524) * Add experimental warning to e2e training * re-order imports * update user messages in 
apply_to of define events (#7503) * update user messages in apply_to of define events * fix e2e entity prediction in ted * rename method * fix entity featurization for text input * fix entity prediction in ted * remove safeguard * fix actionexecuted string * fix comments * keep __str__ inconsistent for actionexecuted * increase number of epochs for ted * add Define events in applied events * clean states during prediction * review comments * add _prediction_with_unhappy_path * review comments * remove unneeded variable * review comments in DIET * renamed empty_features to absent_features * remove unused imports * update docstrings * refactor entity data creation into a separate method * Update rasa/shared/core/domain.py Co-authored-by: Joe Juzl <[email protected]> * Update rasa/shared/core/domain.py Co-authored-by: Joe Juzl <[email protected]> * Update rasa/shared/core/domain.py Co-authored-by: Joe Juzl <[email protected]> * shorten the long comment * create separate constant for prediction features * Update rasa/core/policies/ted_policy.py Co-authored-by: Daksh Varshneya <[email protected]> * add comment why we add 1 to sequence features * rephrase last dial comment * rephrase comments * Add description to RasaModelData. 
* explain choice of warning * fix e2e train tests (#7540) * remove prints * use precise len in tests * remove blank line * type annotations * fix dry run test * fix test_surface_attributes * fix typo in test * Use tokens for story structure validation (#7436) * Add tests * Draft first implementation * Fix random sorting before hash * Update doc strings * Add doc strings * Fix minor issues * Add config file loading * Fix some docstrings * Make test stories part of the test * Update tests * Fix minor issues * Fix config param argument * Add TrainingType to tests to avoid config change * Delete hash again * Update docs * Update rasa/core/training/story_conflict.py Co-authored-by: Tobias Wochinger <[email protected]> * Update rasa/shared/nlu/training_data/features.py Co-authored-by: Tobias Wochinger <[email protected]> * Fix minor issues Co-authored-by: Tobias Wochinger <[email protected]> * move constant to red * do not enumerate * fix domain test conflicts * add example showing e2e functionality (#7535) * add basic files * add NLU examples * increase epochs and remove memoization policy * Update examples/e2ebot/config.yml Co-authored-by: Vladimir Vlasov <[email protected]> * Update examples/e2ebot/domain.yml Co-authored-by: Vladimir Vlasov <[email protected]> * Update examples/e2ebot/data/stories.yml Co-authored-by: Vladimir Vlasov <[email protected]> * added a story with a bot utterance instead of an action label * remove bot utterance again Co-authored-by: Vladimir Vlasov <[email protected]> * code quality check * Update tests/test_server.py Co-authored-by: Tobias Wochinger <[email protected]> * Update rasa/shared/core/training_data/story_reader/yaml_story_reader.py Co-authored-by: Alexander Khizov <[email protected]> * add types * Update tests/test_train.py Co-authored-by: Tobias Wochinger <[email protected]> * fix tests * add types * add types * Update tests/test_train.py Co-authored-by: Tobias Wochinger <[email protected]> * add types * Update 
tests/test_train.py Co-authored-by: Tobias Wochinger <[email protected]> * Update rasa/shared/nlu/training_data/features.py Co-authored-by: Alexander Khizov <[email protected]> * Update rasa/shared/nlu/training_data/message.py Co-authored-by: Alexander Khizov <[email protected]> * Update rasa/shared/nlu/training_data/message.py Co-authored-by: Alexander Khizov <[email protected]> * Update rasa/shared/nlu/training_data/message.py Co-authored-by: Alexander Khizov <[email protected]> * Update rasa/shared/nlu/training_data/training_data.py Co-authored-by: Alexander Khizov <[email protected]> * Update rasa/shared/nlu/training_data/training_data.py Co-authored-by: Alexander Khizov <[email protected]> * Update rasa/shared/nlu/training_data/training_data.py Co-authored-by: Alexander Khizov <[email protected]> * Update rasa/shared/nlu/training_data/message.py Co-authored-by: Alexander Khizov <[email protected]> * Update rasa/shared/nlu/training_data/message.py Co-authored-by: Alexander Khizov <[email protected]> * Update rasa/shared/nlu/training_data/message.py Co-authored-by: Alexander Khizov <[email protected]> * Update rasa/shared/nlu/training_data/message.py Co-authored-by: Alexander Khizov <[email protected]> * Update rasa/shared/nlu/training_data/message.py Co-authored-by: Alexander Khizov <[email protected]> * Update rasa/shared/nlu/training_data/message.py Co-authored-by: Alexander Khizov <[email protected]> * Update rasa/shared/nlu/training_data/training_data.py Co-authored-by: Alexander Khizov <[email protected]> * remove unused parameter * assert return value * use proper objects * use dedent * PR review comments * use function scoped fixture * create new domain to avoid interacting with session one * use safer default for __eq__ * remove not required ignore comment * fix docstrings * mark `Event` class as abtract * use correct docstring * rename to `AlwaysEqualEventMixin` * mark methods as abstract * add docstring * move comment out of docstring * don't persist 
changed entities + autofill slots for policy entities (#7553) * don't persist changed entities * make tracker state return combined `UserUttered` event * autofill slots for policy entities * made if more explicit * use constants * rename `DefinePrevUserUtteredEntities` to `EntitiesAdded` * rename and make `DefinePrevUserUttered` more general * fix docstrings * add e2e docs (#7512) * change TED default params * Update docs/docs/training-data-format.mdx Co-authored-by: Tanja <[email protected]> * Update docs/docs/training-data-format.mdx Co-authored-by: Tanja <[email protected]> * Update docs/docs/training-data-format.mdx Co-authored-by: Tanja <[email protected]> * update docs phrases * break long line * add changelog * add deprecation config for dense dimension * fix new config parameters * add comments for config params * fix docstrings * update comments * update migration guide with new ted parameters * update changelog * a lot of bug fixes regarding updating config * fix updating config dict again * Update docs/docs/training-data-format.mdx Co-authored-by: Tanja <[email protected]> * remove new-old config param descriptions * remove else * add docstring * update changelog * Update docs/docs/stories.mdx Co-authored-by: Ella Rohm-Ensing <[email protected]> * Update docs/docs/stories.mdx Co-authored-by: Ella Rohm-Ensing <[email protected]> * Update docs/docs/training-data-format.mdx Co-authored-by: Ella Rohm-Ensing <[email protected]> * Update docs/docs/training-data-format.mdx Co-authored-by: Ella Rohm-Ensing <[email protected]> * update stories.mdx * update training-data-format.mdx * substitute we with you * don't include e2e in the stories example * make list * remove required * add migration guide for domain changes * mention explicitly * fix import * add link to ted policy * add docstrings * fix updating config * move e2e into separate paragraph * add blank line back * add increased train time note * Update docs/docs/migration-guide.mdx Co-authored-by: Ben 
Quachtran <[email protected]> * Update docs/docs/migration-guide.mdx Co-authored-by: Ben Quachtran <[email protected]> * Update docs/docs/training-data-format.mdx Co-authored-by: Akela Drissner-Schmid <[email protected]> * remove the link to the training data format page * remove the line' * remove the line * break long line * Update changelog/7496.improvement.md Co-authored-by: Tobias Wochinger <[email protected]> * Update docs/docs/stories.mdx Co-authored-by: Tobias Wochinger <[email protected]> * Update docs/docs/stories.mdx Co-authored-by: Tobias Wochinger <[email protected]> * Update docs/docs/stories.mdx Co-authored-by: Tobias Wochinger <[email protected]> * Update docs/docs/training-data-format.mdx Co-authored-by: Tobias Wochinger <[email protected]> * Update docs/docs/training-data-format.mdx Co-authored-by: Tobias Wochinger <[email protected]> * Update docs/docs/migration-guide.mdx Co-authored-by: Sam Sucik <[email protected]> * Update docs/docs/stories.mdx Co-authored-by: Tobias Wochinger <[email protected]> * Update docs/docs/stories.mdx Co-authored-by: Tobias Wochinger <[email protected]> * Update docs/docs/policies.mdx Co-authored-by: Sam Sucik <[email protected]> * Update docs/docs/policies.mdx Co-authored-by: Sam Sucik <[email protected]> * Update docs/docs/policies.mdx Co-authored-by: Sam Sucik <[email protected]> * Update docs/docs/migration-guide.mdx Co-authored-by: Sam Sucik <[email protected]> * Update docs/docs/stories.mdx Co-authored-by: Tobias Wochinger <[email protected]> * Update docs/docs/stories.mdx Co-authored-by: Tobias Wochinger <[email protected]> * Update docs/docs/stories.mdx Co-authored-by: Tobias Wochinger <[email protected]> * Update docs/docs/policies.mdx Co-authored-by: Sam Sucik <[email protected]> * Update docs/docs/policies.mdx Co-authored-by: Sam Sucik <[email protected]> * Update docs/docs/policies.mdx Co-authored-by: Sam Sucik <[email protected]> * Update docs/docs/policies.mdx Co-authored-by: Sam Sucik <[email 
protected]> * Update docs/docs/policies.mdx Co-authored-by: Sam Sucik <[email protected]> * Update docs/docs/stories.mdx Co-authored-by: Sam Sucik <[email protected]> * Update docs/docs/stories.mdx Co-authored-by: Sam Sucik <[email protected]> * Update docs/docs/stories.mdx Co-authored-by: Sam Sucik <[email protected]> * Update docs/docs/stories.mdx Co-authored-by: Sam Sucik <[email protected]> * Update docs/docs/stories.mdx Co-authored-by: Sam Sucik <[email protected]> * Update docs/docs/training-data-format.mdx Co-authored-by: Sam Sucik <[email protected]> * Update docs/docs/training-data-format.mdx Co-authored-by: Sam Sucik <[email protected]> * Update docs/docs/training-data-format.mdx Co-authored-by: Sam Sucik <[email protected]> * Update docs/docs/training-data-format.mdx Co-authored-by: Sam Sucik <[email protected]> * Update docs/docs/training-data-format.mdx Co-authored-by: Sam Sucik <[email protected]> * Update docs/docs/training-data-format.mdx Co-authored-by: Sam Sucik <[email protected]> * Update docs/docs/training-data-format.mdx Co-authored-by: Sam Sucik <[email protected]> * Update docs/docs/migration-guide.mdx Co-authored-by: Sam Sucik <[email protected]> * add example to migration guide * remove notes * rename changelog to feature * Update docs/docs/stories.mdx Co-authored-by: Tobias Wochinger <[email protected]> * expand explanation * update examples in docs to have the same topic as e2ebot * copy ted description from diet * update parameter description * fix overriding default config * Update docs/docs/training-data-format.mdx Co-authored-by: Tobias Wochinger <[email protected]> * Update docs/docs/training-data-format.mdx Co-authored-by: Tobias Wochinger <[email protected]> * add actions to doc stories * update story * more details in error message Co-authored-by: Tanja <[email protected]> Co-authored-by: Ella Rohm-Ensing <[email protected]> Co-authored-by: Tobias Wochinger <[email protected]> Co-authored-by: Ben Quachtran <[email protected]> 
Co-authored-by: Akela Drissner-Schmid <[email protected]> Co-authored-by: Sam Sucik <[email protected]> Co-authored-by: m-vdb <[email protected]> * correctly use mixin class + filter out abstract classes * add clarification comment * make entities a list * remove space * fix assigning variables * fix assigning variables * Update docs/docs/components.mdx Co-authored-by: Tobias Wochinger <[email protected]> * add empty line add end * fix breaking hash function (list is not hashable) Co-authored-by: Zhenya Razumovskaia <[email protected]> Co-authored-by: Tanja Bergmann <[email protected]> Co-authored-by: Tobias Wochinger <[email protected]> Co-authored-by: Joseph Juzl <[email protected]> Co-authored-by: Alexander Khizov <[email protected]> Co-authored-by: Daksh Varshneya <[email protected]> Co-authored-by: Johannes E. M. Mosig <[email protected]> Co-authored-by: Alan Nichol <[email protected]> Co-authored-by: Roberto <[email protected]> Co-authored-by: Ella Rohm-Ensing <[email protected]> Co-authored-by: Ben Quachtran <[email protected]> Co-authored-by: Akela Drissner-Schmid <[email protected]> Co-authored-by: Sam Sucik <[email protected]> Co-authored-by: m-vdb <[email protected]>
Proposed changes:
- update user messages in `apply_to` methods instead of `tracker.applied_events`
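A rough sketch of the idea, not the actual Rasa code: each event adjusts the tracker's view of the latest user message inside its own `apply_to`, so no post-processing pass over the applied events is needed. The classes `ToyTracker`, `ToyUserUttered` and `ToyEntitiesDefined` are hypothetical stand-ins invented for this illustration. Building a fresh message instead of mutating the stored event is an assumption here, chosen because the review thread points out that `latest_message` is just a pointer to the event.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class ToyUserUttered:
    """Hypothetical stand-in for a user message event."""

    text: str
    entities: List[Dict[str, Any]] = field(default_factory=list)

    def apply_to(self, tracker: "ToyTracker") -> None:
        # The tracker simply points at this event as its latest message.
        tracker.latest_message = self


@dataclass
class ToyEntitiesDefined:
    """Hypothetical stand-in for an event that adds policy-predicted entities."""

    entities: List[Dict[str, Any]]

    def apply_to(self, tracker: "ToyTracker") -> None:
        latest = tracker.latest_message
        if latest is None:
            return
        # Assumption: replace the tracker's latest message with a new object
        # so the originally stored user event is left unchanged.
        tracker.latest_message = ToyUserUttered(
            text=latest.text, entities=latest.entities + self.entities
        )


class ToyTracker:
    """Hypothetical tracker that replays events by calling their apply_to."""

    def __init__(self, events: List[Any]) -> None:
        self.events = list(events)
        self.latest_message: Optional[ToyUserUttered] = None
        for event in self.events:
            event.apply_to(self)


tracker = ToyTracker(
    [
        ToyUserUttered("hi"),
        ToyEntitiesDefined([{"entity": "entity1", "value": "value1"}]),
    ]
)
```

In this sketch the stored user event keeps an empty entity list, while `tracker.latest_message` carries the added entity. Whether the original event should stay untouched is a design question the review discussion leaves open.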
Status (please check what you already did):
- reformat files using `black` (please check Readme for instructions)