Make rasa data validate check for duplicated intents, forms, responses and slots when using domains split between multiple files #10444
rasa data validate should fail if conflicting slots/responses/forms/... are created in different domain files
rasa data validate check for duplicated intents, forms, responses and slots when using domains split between multiple files
Looks great 👍🏼
I left mainly some documentation-related suggestions.
I was also wondering whether we want to raise InvalidDomain when duplicates are not None in from_dict. What happens now if you train with a Domain with conflicting slots, for example?
This could be entirely out of scope for the issue, which refers only to rasa data validate, but it would be just an addition of a few lines of code 🤔
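To make the suggestion concrete, here is a minimal sketch of what such a check could look like. It is an illustration only: the InvalidDomain class below is a local stand-in so the snippet runs on its own, raise_on_duplicates is a hypothetical helper name, and the shape of duplicates (a mapping from domain section to a list of repeated names) is an assumption, not something confirmed by this PR.

```python
from typing import Dict, List, Optional


class InvalidDomain(Exception):
    """Local stand-in for Rasa's InvalidDomain, defined here so the sketch is self-contained."""


def raise_on_duplicates(duplicates: Optional[Dict[str, List[str]]]) -> None:
    """Raise InvalidDomain if any section lists names defined in more than one file.

    `duplicates` is assumed to map a domain section (e.g. "slots") to the
    names that were found in several domain files.
    """
    if not duplicates:
        return

    # Ignore sections whose duplicate list is empty.
    conflicts = {key: names for key, names in duplicates.items() if names}
    if not conflicts:
        return

    details = "; ".join(
        f"{key}: {', '.join(names)}" for key, names in sorted(conflicts.items())
    )
    raise InvalidDomain(f"Duplicated domain entries found across files: {details}")
```

A call such as raise_on_duplicates(duplicates) at the end of domain loading would turn conflicting slots into a hard error during training as well, which is the wider scope being asked about here.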
Looks good 🎉
I'd still explore with Ty and the rest of the squad, who are looking at the other Domain.merge-related issues, whether a holistic approach is required for this entire umbrella, and whether to raise InvalidDomain whenever duplicates are found.
# this code merges lists of dicts of intents
dict1 = {list(i.keys())[0]: i for i in combined[KEY_INTENTS]}
dict2 = {list(i.keys())[0]: i for i in domain_dict[KEY_INTENTS]}
duplicates[KEY_INTENTS] = extract_duplicates(dict1, dict2)
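For readers without the full diff, the quoted lines can be sketched as a standalone example. Everything here is an assumption made for illustration: the value of KEY_INTENTS, the sample intents, the intents_by_name helper, and this version of extract_duplicates, which is taken to return the keys shared by both dictionaries and may differ from the one in the PR.

```python
from typing import Any, Dict, List

KEY_INTENTS = "intents"  # assumed value of the constant used in the diff


def extract_duplicates(dict1: Dict[str, Any], dict2: Dict[str, Any]) -> List[str]:
    """Hypothetical version: return the keys present in both mappings, sorted."""
    return sorted(set(dict1) & set(dict2))


def intents_by_name(intents: List[Dict[str, Any]]) -> Dict[str, Dict[str, Any]]:
    """Index single-key intent dicts by their intent name (hypothetical helper)."""
    return {next(iter(intent)): intent for intent in intents}


# Two domain files that both define the "greet" intent (made-up sample data).
combined = {KEY_INTENTS: [{"greet": {"use_entities": True}}, {"goodbye": {}}]}
domain_dict = {KEY_INTENTS: [{"greet": {"use_entities": False}}, {"affirm": {}}]}

duplicates: Dict[str, List[str]] = {}
duplicates[KEY_INTENTS] = extract_duplicates(
    intents_by_name(combined[KEY_INTENTS]),
    intents_by_name(domain_dict[KEY_INTENTS]),
)
print(duplicates)
```

Keying each list by intent name first is what makes the comparison cheap: the duplicate check becomes a set intersection on names instead of a pairwise comparison of the intent dicts.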
Did we consider just raising an error/warning here so that we don't need to store the duplicates on the class?
The idea was to show this information only when running rasa data validate, as written in the description of this issue. We could update it to show the warning every time the domain is merged, but in that case I don't see the point in making any changes to rasa data validate or even linking this issue to the validator, which means the whole task would be different.
Maybe @TyDunn can take a look at it and say which approach we prefer.
Does the approach that @joejuzl describes happen when you run rasa data validate too?
For context, we already have quite a lot of domain validation that happens just from loading it, e.g. _check_domain_sanity in the __init__ looks for duplicates etc.
@alwx I believe Joe is out for the rest of the year, and I am not going to be able to gain enough context atm. Can you make a judgement call here yourself?
@TyDunn I think it makes sense to keep this one as it is and maybe create a new issue, because what Joe was talking about makes the scope much bigger.
@TyDunn please approve to merge this one.
Proposed changes:
Status (please check what you already did):
black (please check Readme for instructions)