E2e story printing #7388
rasa/shared/core/events.py
Outdated
Returns:
    Event as string.
"""
if self.use_text_for_featurization:
@Ghostvv I think printing an end-to-end `UserUttered` event in the conversation test format will just make things worse and more complicated. I'd vote to fail early. What do you think?
I checked the usage and we should be fine except some stuff for the story validation tool which for some reason uses this.
If it is used only in markdown, then OK.
I'd prefer the YAML writer to also use a method inside the event, so that everything is in one place.
Joe and I just had quite a session about it, and the fact that `as_story_string` is used for varying use cases (fingerprinting, printing stories, story validation) actually makes changing the code quite complicated. I prefer having the representation of an event in a specific format separated out. How an event is featurized / uniquely represented should be part of the `Event`, but it shouldn't be related to any markdown stuff.
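The separation argued for in this comment might look roughly like the following. This is a minimal sketch and not Rasa's actual code: `SimpleUserEvent`, `fingerprint`, and `to_markdown_line` are hypothetical names, and the markdown line format is only illustrative.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SimpleUserEvent:
    """Hypothetical stand-in for a user event; not Rasa's UserUttered."""

    text: Optional[str] = None
    intent_name: Optional[str] = None

    def fingerprint(self) -> str:
        """Unique representation owned by the event, free of any file format."""
        return f"text={self.text!r};intent={self.intent_name!r}"


def to_markdown_line(event: SimpleUserEvent) -> str:
    """Format-specific printing lives outside the event (illustrative format)."""
    if event.intent_name is None:
        # Fail early: an end-to-end (text-only) event has no intent to print.
        raise ValueError("Cannot print an end-to-end user event as markdown.")
    return f"* {event.intent_name}"
```

With this split, fingerprinting and story validation can rely on `fingerprint()` while each output format keeps its own writer, so changing one format does not affect the others.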
Rasa X story printing will be broken
@@ -231,7 +232,7 @@ def parse_e2e_message(line: Text, is_used_for_training: bool = True) -> Message:
 intent = match.group(2)
 message = match.group(4)
 example = entities_parser.parse_training_example(message, intent)
-if not is_used_for_training:
+if not is_used_for_training and not self.use_e2e:
When is `parse_e2e_message` used but `self.use_e2e` is `False`?
When converting stories from Markdown to YAML: `core.training.converters.test_story_markdown_to_yaml_converter.test_test_stories` was failing.
@@ -389,7 +372,7 @@ def _user_intent_from_step(
 ) -> Tuple[Text, Optional[Text]]:
     user_intent = step.get(KEY_USER_INTENT, "").strip()

-    if not user_intent:
+    if not user_intent and KEY_USER_MESSAGE not in step:
Does the rest of this method gracefully handle there being no intent?
We have tests for it and they pass 🤷‍♂️
rasa/shared/core/training_data/story_writer/yaml_story_writer.py
Outdated
LGTM
@Ghostvv Would you mind reviewing?
Looks good, but it would be better if someone who knows YAML writing took a look.
rasa/shared/core/events.py
Outdated
def __str__(self) -> Text:
    """Returns text representation of event."""
    return (
        f"UserUttered(text: {self.text}, intent: {self.intent}, "
I think it should be
-f"UserUttered(text: {self.text}, intent: {self.intent}, "
+f"UserUttered(text: {self.text}, intent: {self.intent_name}, "
otherwise you get all the confidences etc.
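To illustrate the difference this suggestion makes, here is a minimal sketch. `DemoUserUttered` is a hypothetical, simplified stand-in for `UserUttered`, and the shape of the intent dict (`name`, `confidence`) is an assumption made for this example.

```python
from typing import Any, Dict, Optional


class DemoUserUttered:
    """Hypothetical, simplified stand-in for UserUttered."""

    def __init__(self, text: Optional[str], intent: Dict[str, Any]) -> None:
        self.text = text
        # Assumed shape: the full parse result, including a confidence score.
        self.intent = intent

    @property
    def intent_name(self) -> Optional[str]:
        """Only the intent's name, without confidence or other metadata."""
        return self.intent.get("name")

    def __str__(self) -> str:
        # Using intent_name keeps the output stable across predictions.
        return f"UserUttered(text: {self.text}, intent: {self.intent_name})"


event = DemoUserUttered("hello", {"name": "greet", "confidence": 0.93})
print(event)  # UserUttered(text: hello, intent: greet)
```

Formatting `self.intent` instead would print the whole dict, so two otherwise identical events with different confidences would produce different strings.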
I agree. We should revisit this on `master` once we have decided on a Python code convention. Let's focus on getting e2e merged and not on fixing all the problems in the code.
What do you mean? I don't think it's a convention problem. If you use `str` as the hash method, you'd get different events all the time, because the confidences will be different.
We don't use `str` as the hash here. The hash actually uses the `intent_name`. `__hash__` is currently not used for hashing on predictions, only on training data.
Then the hash is wrong, because it contains `self.entities`, which contain confidences, extractors, etc., which I don't think should be part of the hash.
fixed representation and hashing
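The concern raised in this thread can be sketched as follows. This is not the fix that was actually committed: `stable_entity_key` and `event_hash` are hypothetical helpers, the entity dict keys are assumed for the example, and keeping only entity names follows the reviewer's suggestion rather than any confirmed implementation.

```python
from typing import Any, Dict, List, Optional, Tuple


def stable_entity_key(entities: List[Dict[str, Any]]) -> Tuple[str, ...]:
    """Keep only entity names; drop confidence, extractor and other metadata."""
    return tuple(sorted(e["entity"] for e in entities if "entity" in e))


def event_hash(
    text: Optional[str], intent_name: Optional[str], entities: List[Dict[str, Any]]
) -> int:
    """Hash built only from fields that do not vary between predictions."""
    return hash((text, intent_name, stable_entity_key(entities)))


# Same entity, but confidence and extractor differ between two runs.
run_a = [{"entity": "city", "value": "Berlin", "confidence": 0.91, "extractor": "A"}]
run_b = [{"entity": "city", "value": "Berlin", "confidence": 0.87, "extractor": "B"}]

assert event_hash(None, "inform", run_a) == event_hash(None, "inform", run_b)
```

Whether entity values should also be part of the key is a judgment call that this sketch leaves open.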
rasa/shared/core/events.py
Outdated
def __str__(self) -> Text:
    """Returns text representation of event."""
    return (
        f"UserUttered(text: {self.text}, intent: {self.intent}, "
        f"entities: {self.entities})"
That will contain a dict with all the confidences, extractors, etc. Should we filter only entity names here?
see my answer to #7388 (comment)
I'd add a |
@@ -377,30 +409,44 @@ def _from_parse_data(
 )

 def __hash__(self) -> int:
-    return hash((self.text, self.intent_name, jsonpickle.encode(self.entities)))
+    """Returns unique hash of object."""
+    return hash(json.dumps(self.as_sub_state()))
❤️
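For readers unfamiliar with the pattern in the new `__hash__`: a dict is not hashable, but its JSON string is. A minimal sketch, assuming the sub-state is a plain JSON-serializable dict; `sub_state_hash` is a hypothetical helper, and `sort_keys=True` is an addition in this sketch, not something the diff above does.

```python
import json
from typing import Any, Dict


def sub_state_hash(sub_state: Dict[str, Any]) -> int:
    """Hash a dict by hashing its JSON serialization."""
    # sort_keys makes the result independent of key insertion order.
    return hash(json.dumps(sub_state, sort_keys=True))


a = {"intent": "greet", "entities": ["city"]}
b = {"entities": ["city"], "intent": "greet"}  # same content, different order

assert sub_state_hash(a) == sub_state_hash(b)
```

Note that Python randomizes string hashes per interpreter process unless `PYTHONHASHSEED` is set, so such a hash is only comparable within one run.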
Proposed changes:
- `rasa interactive`
- `DEFAULT_STORIES_FILE` changed to be yaml
- `UserUttered` / `ActionExecuted` events are tried to be printed to markdown
- `master` and not here
- `str` to represent actions instead of markdown representation which uses `as_story_string`
Status (please check what you already did):
- `black` (please check Readme for instructions)