Make NLU configuration more flexible #5510

Closed
3 tasks
dakshvar22 opened this issue Mar 27, 2020 · 0 comments · Fixed by #5863
Labels
area:rasa-oss 🎡 Anything related to the open source Rasa framework type:enhancement ✨ Additions of new features or changes to existing ones, should be doable in a single PR

Comments

@dakshvar22
Contributor

dakshvar22 commented Mar 27, 2020

Description of Problem:
Currently, the NLU pipeline is interpreted as a flat pipeline: each component produces annotations (tokens, sparse features, dense features, etc.), and every ML component in the pipeline then consumes all of them. This worked well when the pipeline handled only a small number of ML tasks, i.e. intent classification and entity recognition. As we add more ML tasks to the NLU pipeline, such as response selection, this approach doesn't scale: the same set of featurizers is not useful for every ML task and may actually degrade performance for some.
It would be better if each ML component could specify which featurizers should be used as input for its model.
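
To make the request concrete, here is a minimal sketch of per-component feature selection. It is not Rasa's implementation: the `Feature` class, its `origin` field, the `select_features` helper, and the featurizer names are all hypothetical and exist only for illustration. The assumption is that every feature remembers which featurizer produced it, so a component can filter on that.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass
class Feature:
    # Hypothetical: name of the featurizer that produced this feature.
    origin: str
    values: List[float]


def select_features(
    features: Sequence[Feature], wanted: Optional[Sequence[str]] = None
) -> List[Feature]:
    """Keep only the features produced by the wanted featurizers.

    `wanted=None` reproduces today's flat behaviour: use everything.
    """
    if wanted is None:
        return list(features)
    wanted_set = set(wanted)
    return [f for f in features if f.origin in wanted_set]


# Two featurizers have annotated the same message.
features = [
    Feature("word_counts", [1.0, 0.0]),
    Feature("char_ngrams", [0.5, 0.5]),
]

# The intent classifier uses both; the response selector only word counts.
intent_inputs = select_features(features, ["word_counts", "char_ngrams"])
selector_inputs = select_features(features, ["word_counts"])
```

Defaulting to "all features" when nothing is specified would keep existing configurations working unchanged.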

Overview of the Solution:
We add an extra parameter to each component which sets an alias for it in the configuration. That alias can be used in the downstream ML component.

For example:

pipeline:
  - name: WhitespaceTokenizer
  - name: CountVectorsFeaturizer
    alias: cvf_word
  - name: CountVectorsFeaturizer
    analyzer: char
    min_ngram: 1
    max_ngram: 4
    alias: cvf_char
  - name: DIETClassifier
    in: [cvf_word, cvf_char]
  - name: ResponseSelector
    in: [cvf_word]
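
A configuration like the one above could be checked before training. The sketch below is only an illustration under assumptions: `resolve_inputs` is a hypothetical helper (not an existing Rasa function), the pipeline is written as a plain Python list mirroring the YAML, and the rule that an alias must be unique and defined before it is referenced is my assumption about how the proposal would behave.

```python
from typing import Any, Dict, List, Set


def resolve_inputs(pipeline: List[Dict[str, Any]]) -> Dict[int, List[str]]:
    """Map the index of each component that declares `in` to its aliases.

    Raises ValueError if an alias is declared twice or is referenced
    before any upstream component has defined it.
    """
    seen: Set[str] = set()
    resolved: Dict[int, List[str]] = {}
    for index, component in enumerate(pipeline):
        # Every referenced alias must come from an earlier component.
        for alias in component.get("in", []):
            if alias not in seen:
                raise ValueError(
                    f"{component['name']} references unknown alias '{alias}'"
                )
        if "in" in component:
            resolved[index] = list(component["in"])
        # Register this component's own alias, if it declares one.
        alias = component.get("alias")
        if alias is not None:
            if alias in seen:
                raise ValueError(f"alias '{alias}' is declared more than once")
            seen.add(alias)
    return resolved


# The example pipeline from this issue, as parsed configuration.
pipeline = [
    {"name": "WhitespaceTokenizer"},
    {"name": "CountVectorsFeaturizer", "alias": "cvf_word"},
    {
        "name": "CountVectorsFeaturizer",
        "analyzer": "char",
        "min_ngram": 1,
        "max_ngram": 4,
        "alias": "cvf_char",
    },
    {"name": "DIETClassifier", "in": ["cvf_word", "cvf_char"]},
    {"name": "ResponseSelector", "in": ["cvf_word"]},
]

resolved = resolve_inputs(pipeline)
```

Failing fast on an unknown or duplicate alias would surface typos in the config at load time instead of as silently missing features during training.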

Definition of Done:

  • Tests are added
  • Feature described in the docs
  • Feature mentioned in the changelog
@dakshvar22 dakshvar22 added the type:enhancement ✨ Additions of new features or changes to existing ones, should be doable in a single PR label Mar 27, 2020
@tabergma tabergma added the area:rasa-oss 🎡 Anything related to the open source Rasa framework label Mar 27, 2020
@tabergma tabergma self-assigned this May 4, 2020
@tabergma tabergma mentioned this issue May 20, 2020
4 tasks