Let `RegexEntityExtractor` work without the dummy entity annotation #9439

JEM-Mosig · 2021-08-23T17:20:55Z

Description of Problem: When we use `RegexEntityExtractor`, we have to annotate at least one example of each entity in the training data. This is only necessary because of this condition, and it is problematic when you want to use multiple entity extractors.

We recommend that "If you use multiple entity extractors, we advise that each extractor targets an exclusive set of entity types", but we don't actually allow this. An annotation cannot indicate which entity extractor it is meant for, so once you annotate an entity, DIET will try to extract it. But you also have to annotate an entity if you use `RegexEntityExtractor`, because it should only pay attention to those lookup tables and regexes whose names are also names of entities. The code that checks this has no access to the domain and thus asks the training data, hence the required annotation.
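To make the constraint concrete, here is a minimal sketch of the filtering behaviour described above. It is not Rasa's actual code: the dictionaries and the helper names `entities_in_annotations` and `usable_patterns` are hypothetical, simplified stand-ins.

```python
from typing import Dict, List, Set


def entities_in_annotations(examples: List[Dict]) -> Set[str]:
    """Collect entity names that appear in at least one annotated example."""
    return {ent["entity"] for ex in examples for ent in ex.get("entities", [])}


def usable_patterns(patterns: List[Dict[str, str]], examples: List[Dict]) -> List[Dict[str, str]]:
    """Keep only the patterns whose name matches an annotated entity name."""
    known = entities_in_annotations(examples)
    return [p for p in patterns if p["name"] in known]


# Hypothetical training data: two patterns, but only `city` is annotated.
patterns = [
    {"name": "city", "pattern": r"\b(Berlin|Paris)\b"},
    {"name": "zip_code", "pattern": r"\b\d{5}\b"},
]
examples = [
    {"text": "I live in Berlin", "entities": [{"entity": "city", "value": "Berlin"}]},
    {"text": "my zip is 10115", "entities": []},
]

kept_names = [p["name"] for p in usable_patterns(patterns, examples)]
```

In this sketch the `zip_code` pattern is silently dropped because no example annotates it, which is why a dummy annotation is currently needed.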

Overview of the Solution: Make the domain (or the entities in the domain) available here. That will require that Featurizers and Extractors get access to the domain during training, which isn't done there currently, but we already do that for policies.
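A rough sketch of what the proposed change could look like, assuming the extractor is handed the set of entity names declared in the domain. The functions `select_patterns` and `extract` and their parameters are hypothetical illustrations, not an existing Rasa API.

```python
import re
from typing import Dict, List, Optional, Set


def select_patterns(
    patterns: List[Dict[str, str]],
    annotated_entities: Set[str],
    domain_entities: Optional[Set[str]] = None,
) -> List[Dict[str, str]]:
    """Keep patterns named after an entity known from annotations or the domain."""
    allowed = annotated_entities | (domain_entities or set())
    return [p for p in patterns if p["name"] in allowed]


def extract(text: str, patterns: List[Dict[str, str]]) -> List[Dict]:
    """Return one entity dict per regex match in the text."""
    found = []
    for p in patterns:
        for m in re.finditer(p["pattern"], text):
            found.append(
                {"entity": p["name"], "value": m.group(0), "start": m.start(), "end": m.end()}
            )
    return found


# Hypothetical setup: no annotated examples, entity only declared in the domain.
patterns = [{"name": "zip_code", "pattern": r"\b\d{5}\b"}]
selected = select_patterns(patterns, annotated_entities=set(), domain_entities={"zip_code"})
results = extract("my zip is 10115", selected)
```

With the domain entities passed in, the `zip_code` pattern is used even though no training example annotates it. Without them, `select_patterns` returns an empty list, which mirrors the current behaviour.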

Definition of Done:

RegexEntityExtractor works without any annotated entities (only the lookup table or regex is defined)

samsucik · 2021-09-10T10:32:41Z

@wochinge based on this thread what do you say to including this in the 3.0 milestone?

samsucik · 2021-09-10T11:03:23Z

Just to make the definition of done clearer for this one: As part of updating the relevant docs, we should change the bits here that talk about having to include at least two training examples in order for the NLU model to pick up the entity. I haven't been able to find this rule anywhere in our code, and I suspect that, in reality, anything with one or more examples is picked up.

wochinge · 2021-09-15T08:36:25Z

I'm hesitant to add anything to the milestone this late in the process. If so I'd add it to a "nice to have in 3.0" milestone.

samsucik · 2021-09-16T08:03:31Z

@wochinge I think having it as a "nice to have" would be great. (You know the 3.0 milestone better; I am just trying to bump up this particular issue so it gets addressed soon.)

wochinge · 2021-09-17T07:49:36Z

@TyDunn What do you think? Should we create a "nice to have" milestone?

sync-by-unito · 2022-12-19T12:25:09Z

➤ Maxime Verger commented:

💡 Heads up! We're moving issues to Jira: https://rasa-open-source.atlassian.net/browse/OSS.

From now on, this Jira board is the place where you can browse (without an account) and create issues (you'll need a free Jira account for that). This GitHub issue has already been migrated to Jira and will be closed on January 9th, 2023. Do not forget to subscribe to the corresponding Jira issue!

➡️ More information in the forum: https://forum.rasa.com/t/migration-of-rasa-oss-issues-to-jira/56569.

JEM-Mosig added type:enhancement ✨ Additions of new features or changes to existing ones, should be doable in a single PR area:rasa-oss 🎡 Anything related to the open source Rasa framework labels Aug 23, 2021

JEM-Mosig changed the title ~~Let RegexEntityExtractor work without the dummy entity annotation~~ Let RegexEntityExtractor work without the dummy entity annotation Aug 23, 2021

TyDunn added cse-issues area:rasa-oss/ml/nlu-components Issues focused around rasa's NLU components labels Sep 14, 2021

TyDunn added this to the 3.0 Nice to have milestone Sep 17, 2021

TyDunn mentioned this issue Oct 1, 2021

Impossible to pass end to end test when the same entity is extracted by different classifiers #9771

Closed

4 tasks

rasabot-exalate added area:rasa-oss and removed type:enhancement ✨ Additions of new features or changes to existing ones, should be doable in a single PR area:rasa-oss 🎡 Anything related to the open source Rasa framework labels Mar 15, 2022 — with Exalate Issue Sync

rasabot added area:rasa-oss 🎡 Anything related to the open source Rasa framework type:enhancement ✨ Additions of new features or changes to existing ones, should be doable in a single PR and removed area:rasa-oss labels Mar 16, 2022

rasabot-exalate added type:enhancement ✨ Additions of new features or changes to existing ones, should be doable in a single PR and removed type:enhancement_:sparkles: labels Mar 21, 2022 — with Exalate Issue Sync

m-vdb added the type:bug 🐛 Inconsistencies or issues which will cause an issue or problem for users or implementors. label Oct 7, 2022

m-vdb removed the type:bug 🐛 Inconsistencies or issues which will cause an issue or problem for users or implementors. label Dec 7, 2022

m-vdb closed this as completed Jan 9, 2023
