-
Notifications
You must be signed in to change notification settings - Fork 4.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Meta entities #3889
Meta entities #3889
Conversation
@amn41 Can you please explain in a couple of sentences how your approach works? Looking at the code, I'm not 100% sure if I can follow. How do you distinguish between entities and roles? It seems like roles are marked with '@' in the training data. How does the 'Message' in the end look like? Is it correct that 'roles' are added to the 'entities'? How do you know what role belongs to which entity? I guess some explanation would help to understand the PR better. Thanks. |
Sorry I thought I'd included the notes but they were only on slack. Updated description |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the idea of adding a second CRF for the role detection 👍 I'm curious to see, how well it actually works. Especially, in the cases were you group stuff (give me a [small](size@1) pizza with [mushrooms](topping@1) and a [large](size@2) [pepperoni](topping@2)
). I thing using @
is a good way to format it in markdown.
We should also think about
- Testing the approach: How well does the CRF pick up the different roles?
- How are we going to add it to the other training data formats we have?
- I guess, the roles should also be accessible in actions. How are we doing that? Same way as for entities?
Other then that I added some comments to the code itself.
"for details".format(message.text) | ||
) | ||
|
||
def process(self, message: Message, **kwargs: Any) -> None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As far as I can tell, only the methods train
, process
, and extract_roles
are different from the CRFEntityExtractor
class. To avoid code duplication, we should create an abstract class CRFExtractor
and inherit from that in CRFEntityExtractor
and CRFRoleExtractor
(or reuse the class EntityExtractor
).
# r"\[(?P<entity_text>[^\]]+)" r"\]\((?P<entity>[^:)]*?)" r"(?:\:(?P<value>[^)]+))?\)" | ||
#) | ||
|
||
# regex for: `[entity_text](entity_type(^entity_role)?)` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't it be an @
instead of the ^
?
# regex for: `[entity_text](entity_type(@entity_role)?)`
@@ -19,10 +19,16 @@ | |||
available_sections = [INTENT, SYNONYM, REGEX, LOOKUP] | |||
|
|||
# regex for: `[entity_text](entity_type(:entity_synonym)?)` | |||
#ent_regex = re.compile( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this mean we cannot define synonyms anymore?
|
||
start_index = match.start() - offset | ||
end_index = start_index + len(entity_text) | ||
offset += len(match.group(0)) - len(entity_text) | ||
|
||
entity = build_entity(start_index, end_index, entity_value, entity_type) | ||
entity = build_entity(start_index, end_index, entity_value, entity_type, role=entity_role) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
entity_role
is unassigned if no role was found, should be set to None per default.
text_data = self._from_text_to_crf(role_message) | ||
features = self._sentence_to_features(text_data) | ||
ents = self.ent_tagger.predict_marginals_single(features) | ||
formatted = self._from_crf_to_json(role_message, ents) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not used
ent_with_role = ent.copy() | ||
ent_with_role["role"] = starts[start_idx]["entity"] | ||
entities.append(ent_with_role) | ||
else: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess we don't want the else statement. Currently, we get in case no role was detected:
"entities": [
{
"start": x,
"end": y,
"value": "Buenos Aires",
"entity": "location",
"confidence": 0.9792075801341894,
"extractor": "CRFEntityExtractor"
}
],
...
"roles": [
{
"start": x,
"end": y,
"value": "Buenos Aires",
"entity": "location",
"confidence": 0.9792075801341894,
"extractor": "CRFEntityExtractor"
}
],
The roles is just a duplicate of entities. Shouldn't roles only contain "entites" that are actual roles?
role_ent = ent.copy() | ||
role_ent["entity"] = ent["role"] | ||
role_ent["value"] = ent["entity"] | ||
# TODO update start and end values |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So would you simply replace Buenos Aires
by location
and then just obtain the start and end value for location
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes!
self._check_spacy_doc(message) | ||
|
||
extracted = self.extract_roles(message) | ||
#extracted = self.add_extractor_name(self.extract_roles(message)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can be removed
"""Take a sentence and return roles in json format""" | ||
|
||
if self.ent_tagger is not None: | ||
role_message = self.replace_entities_with_roles(message) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess at prediction time this will not work. We want to have a sentence like heading for location and flying out of location
in the end. However, the function replace_entites_with_roles
would do heading for [location](destination) and flying out of [location](origin)
. It is not replacing the actual text, is it? Or do I miss something?
thanks @tabergma ! sorry I should have clarified that the code is terrible and was mostly looking for feedback on the approach :)
This is a good question, especially if you have both entity synonyms AND roles. Maybe we should switch to having a dict like
Yes I think it would become part of the |
This interpretation for composite entities and the solution idea looks similar to |
@amn41 what is the plan for this - is someone working on this? |
I guess, the plan is to tackle this once the update of our NER is done. I would close this PR and keep the branch. |
Sounds good |
Hi @tmbo @tabergma @JustinaPetr my organization would like to leverage our existing partnership with rasa to work together on this for a custom solution Can you please advise how best we could take this forward? |
Proposed changes:
This is experimental and may well not make it into the codebase.
Looking at treating entity 'roles' as just another NER problem.
Idea for doing this: use 2 CRFs
Data format includes roles:
heading for [Buenos Aires](location@destination) and flying out of [Quito](location@origin)
I need [2](number@rooms) rooms for [5](number@guests) people.
Composite entities get grouped using numerical roles:
give me a [small](size@1) pizza with [mushrooms](topping@1) and a [large](size@2) [pepperoni](topping^2)
show me some [black](color@1) [shoes](item@1) and a [grey](color@2) [suit](item@2)
Pt 1: Entity recognition
remove role annotations (the
@xxx
part) so it’s justheading for [Buenos Aires](location) and flying out of [Quito](location)
then train the CRF as normal.
Pt 2: role recognition
remove entity values and replace with entity type:
heading for [location](destination) and flying out of [location](origin)
train a second CRF just to pick up the
destination
andorigin
‘entities’Pt 3: recombine at prediction time
Status (please check what you already did):
black
(please check Readme for instructions)