Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix ted training e2e entities when none are given #8194

Merged
merged 7 commits into from
Mar 22, 2021
Merged

Conversation

Ghostvv
Copy link
Contributor

@Ghostvv Ghostvv commented Mar 12, 2021

Proposed changes:

  • fix ted training e2e entities when none are given in the stories but there are entities in the domain

Status (please check what you already did):

  • added some tests for the functionality
  • updated the documentation
  • updated the changelog (please check changelog for instructions)
  • reformat files using black (please check Readme for instructions)

@Ghostvv Ghostvv requested a review from joejuzl March 12, 2021 16:02
@Ghostvv
Copy link
Contributor Author

Ghostvv commented Mar 12, 2021

@joejuzl could you please help me with the test? Testing that TEDPolicy turns off entity_recognition parameter requires trained nlu model and I'm not sure where to write such a test

@joejuzl
Copy link
Contributor

joejuzl commented Mar 12, 2021

@joejuzl could you please help me with the test? Testing that TEDPolicy turns off entity_recognition parameter requires trained nlu model and I'm not sure where to write such a test

There is a fixture called e2e_bot which has a trained NLU model.

@Ghostvv
Copy link
Contributor Author

Ghostvv commented Mar 12, 2021

There is a fixture called e2e_bot which has a trained NLU model.

how can I use it? 🙈

@Ghostvv
Copy link
Contributor Author

Ghostvv commented Mar 12, 2021

There is a fixture called e2e_bot which has a trained NLU model.

oh, I need e2e_bot without e2e entities

@joejuzl
Copy link
Contributor

joejuzl commented Mar 15, 2021

What is it you exactly want to test? A TED with entity extraction turned on but no entities in the training data?

@Ghostvv
Copy link
Contributor Author

Ghostvv commented Mar 15, 2021

A TED with entity extraction turned on but no entities in the training data?

yes the problem is that entity extraction is turned on by default, we need to turn it off if there are no e2e entities, currently it turns off only if there is no e2e stories or no entities defined in the domain file. This fix turns it off if there is e2e user text in the stories but no e2e entities. I'd like to add a test for this to make sure we don't accidentally remove this fix in the future

@Ghostvv
Copy link
Contributor Author

Ghostvv commented Mar 17, 2021

@joejuzl any ideas how to test it?

@joejuzl
Copy link
Contributor

joejuzl commented Mar 18, 2021

@joejuzl any ideas how to test it?

hmm are there no tests that train TED on e2e data?

@Ghostvv
Copy link
Contributor Author

Ghostvv commented Mar 18, 2021

hmm are there no tests that train TED on e2e data?

that's what I was asking 🙈

@joejuzl
Copy link
Contributor

joejuzl commented Mar 18, 2021

that's what I was asking 🙈

Ok well, there are tests in test_model_training.py that test e2e from the scope of loading files and fingerprinting, but they mock the actual testing calls.

Then there is test_e2ebot which is a fixture that trains a simple e2e bot that I used for testing rasa test with e2e entities.

However I would have assumed when the actual changes were made to TED to make it work with e2e that tests would have been created i.e. in test_ted_policy.py to test the actual training and prediction for e2e.
But I'm assuming that's not the case given your question.
So I guess we should create some!

@Ghostvv
Copy link
Contributor Author

Ghostvv commented Mar 18, 2021

test_ted_policy.py doesn't have access to trained nlu model

@joejuzl
Copy link
Contributor

joejuzl commented Mar 19, 2021

test_ted_policy.py doesn't have access to trained nlu model

Could you not use a fixture like default_agent, get the interpreter from it, and pass it into TED in the tests?

@Ghostvv
Copy link
Contributor Author

Ghostvv commented Mar 19, 2021

Could you not use a fixture like default_agent, get the interpreter from it, and pass it into TED in the tests?

I'll try. how can I use the fixture? I don't know what fixture is 🙈

@joejuzl
Copy link
Contributor

joejuzl commented Mar 22, 2021

I'll try. how can I use the fixture? I don't know what fixture is 🙈

Surely you've used fixtures before? A fixture is any method that has the decorator @pytest.fixture and if it (or its module) has been imported it will be available in any test by having an argument to the test method with the same name as the fixture method.
https://docs.pytest.org/en/stable/fixture.html#what-fixtures-are

Search for default_agent in the test files of the code base and you'll see how it's used.
Let me know if you wanna call to discuss using it in the TED tests, or alternative approaches.

@Ghostvv
Copy link
Contributor Author

Ghostvv commented Mar 22, 2021

@joejuzl I just took a look and I think implementing e2e ted tests within TestTEDPolicy would require significant refactor of PolicyTestCollection, because whether ted is e2e is controlled by the training data it is trained on. I think it is out of scope of this PR. I propose merge this fix as is, and create a separate issue for e2e ted tests. What do you think?

@joejuzl
Copy link
Contributor

joejuzl commented Mar 22, 2021

@joejuzl I just took a look and I think implementing e2e ted tests within TestTEDPolicy would require significant refactor of PolicyTestCollection, because whether ted is e2e is controlled by the training data it is trained on. I think it is out of scope of this PR. I propose merge this fix as is, and create a separate issue for e2e ted tests. What do you think?

Yup agreed - adding e2e tests for ted is out of scope for this PR.
However we should make sure we do it soon, as it's a lot of untested functionality...

@Ghostvv Ghostvv merged commit f56c8b8 into 2.4.x Mar 22, 2021
@Ghostvv Ghostvv deleted the fix-e2e-no-entity-1 branch March 22, 2021 14:33
@Ghostvv
Copy link
Contributor Author

Ghostvv commented Mar 22, 2021

created the issue for e2e TED tests: #8273

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants