Remove Markdown training data format #9390
One question I've had that isn't only related to this but also other backwards compatible changes in As an example, Rasa X imports |
@chdorner yes, that's correct — we're currently introducing a lot of architectural changes so it's not guaranteed that Rasa X will work with Rasa from |
Wow - this is big! Thanks for taking this huge thing on 💯
Some other MD things I found
- comment here, here and some others
- do a lot of the `as_story_string` methods actually still make sense? I think we need to keep the `Event.as_story_string` ones for `rasa interactive` but we might be able to delete them for `StoryGraph`, `Story` (Rasa X is using this here but this doesn't make sense any more after dropping MD in Rasa X as well). Might be worth dropping that in a separate PR
- drop it here
@@ -0,0 +1,5 @@
Remove the support of Markdown training data format. This includes:
I'd also add an entry to the migration guide, and maybe a link to the old migration guide where we tell users how to convert old MD files to YAML.
tests/shared/core/training_data/story_reader/test_common_story_reader.py
Wow this PR is an amazing feat 🚀
Left a few small comments.
@@ -62,18 +56,6 @@ def read_from_file(
        """
        raise NotImplementedError

    @staticmethod
    def is_test_stories_file(filename: Text) -> bool:
I'm slightly confused why this method was removed 🤔
There's no reason for it to exist: it's a static method that only proxies its arguments to another function.
Ah I see, you're referring to its implementation in `YAMLStoryReader`.
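To illustrate the point made in this thread, here is a minimal sketch of why a static method that only forwards its arguments adds nothing. All names below (`looks_like_test_stories_file`, `ReaderWithProxy`) and the file-name rule are hypothetical stand-ins for illustration, not Rasa's actual code.

```python
from pathlib import Path


def looks_like_test_stories_file(file_path: str) -> bool:
    """Hypothetical helper: an assumed rule, not Rasa's real check."""
    path = Path(file_path)
    return path.suffix in {".yml", ".yaml"} and path.name.startswith("test_")


class ReaderWithProxy:
    """Hypothetical reader that still carries the pass-through method."""

    @staticmethod
    def is_test_stories_file(filename: str) -> bool:
        # Pure pass-through: no extra logic, so callers can use the helper directly.
        return looks_like_test_stories_file(filename)


# Both calls give the same answer, which is why the proxy can be deleted.
via_proxy = ReaderWithProxy.is_test_stories_file("test_stories.yml")
direct = looks_like_test_stories_file("test_stories.yml")
```

Removing the proxy leaves a single place where the rule lives, so a subclass such as the YAML reader can keep its own implementation without a redundant base-class copy.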
🙌🏼
Very close 🚀
Can we also remove this one?
test_file = test_dir / f"tests{suffix}"
test_file.write_bytes(request.body)
return str(test_file)
def _test_data_file_from_payload(request: Request, temporary_directory: Path) -> Text:
Line 1067 in 7c91b14
training_payload = _training_payload_from_json(request, temporary_directory)
`_training_payload_from_json` can also be removed, right?
@alwx `_training_payload_from_json` is still there.
changelog/9390.removal.md
@@ -0,0 +1,2 @@
Removes `template_variables` argument from `get_stories` method of `TrainingDataImporter`.
Removes `template_variables` argument from `get_stories` method of `TrainingDataImporter`.
Removes `template_variables` and `e2e` arguments from `get_stories` method of `TrainingDataImporter`.
I'd also mention this in the migration guide, as it's an explicit public interface.
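As a rough illustration of why this removal matters to callers, the sketch below uses a hypothetical stand-in class (`ImporterSketch`, not Rasa's real `TrainingDataImporter`) whose `get_stories` no longer accepts `template_variables` or `e2e`. The parameterless signature is an assumption made for the example; the only claim is the general Python behaviour that passing a removed keyword argument raises `TypeError`.

```python
from typing import List


class ImporterSketch:
    """Hypothetical stand-in; the real importer's signature may differ."""

    def get_stories(self) -> List[str]:
        # The `template_variables` and `e2e` parameters are gone in this sketch.
        return ["story placeholder"]


def call_with_removed_arguments(importer: ImporterSketch) -> str:
    """Call get_stories the old way and report what happens."""
    try:
        # Old-style call that still passes the removed keyword arguments.
        importer.get_stories(template_variables=None, e2e=False)
    except TypeError as error:
        return f"TypeError: {error}"
    return "ok"


result = call_with_removed_arguments(ImporterSketch())
```

Code that still passes either argument fails at call time rather than silently ignoring it, which is the kind of break a migration guide entry would help users anticipate.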
Wohoo 🚀
Proposed changes:
Scope/list of changes:
- `_convert_core_data` function and the ability to convert data from markdown to yaml
- `test_yaml_story_writer.py` to work without a MarkdownStoryReader
- `test_extractor.py` to use `RasaYAMLReader` instead of `MarkdownReader`
- `_convert_nlg_data` function and the ability to convert data from NLG markdown to yaml
- `rasa.nlu.training_data.converters`
- `data` directory
- `use_e2e` in Rasa Open Source 3.0.0 when removing Markdown support
Status (please check what you already did):
- `black` (please check Readme for instructions)