Dependency Matcher ignore OP for negation #12926

shaked571 · 2023-08-21T09:27:14Z

I wrote a Dependency Matcher pattern that requires the root verb not to have a negation arc.

For some reason, the matcher ignores this constraint and matches anyway.
If 'OP' is not supported, I would expect an exception rather than a silent match.

## How to reproduce the behavior
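import spacy
from spacy.matcher import DependencyMatcher

# Assumption: one of the pipelines listed under "Your Environment";
# the original post does not say which one was loaded.
nlp = spacy.load("en_core_web_sm")
matcher = DependencyMatcher(nlp.vocab)

negative_effect_pattern_dep = [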
    {
        'RIGHT_ID': 'cause_verb',
        'RIGHT_ATTRS': {'LEMMA': {'IN': ['cause', 'make', 'give']}}
    },
    {
        'LEFT_ID': 'cause_verb',
        'REL_OP': '>',
        'RIGHT_ID': 'negation',
        'RIGHT_ATTRS': {'LEMMA': {'IN': ["nt", "n't", "not"]}, 'OP': '!'},
        'OP': '!'
    },
    {
        'LEFT_ID': 'cause_verb',
        'REL_OP': '>>',
        'RIGHT_ID': 'effect',
        'RIGHT_ATTRS': {'LOWER': {'IN': ['eczema', 'psoriasis', 'allergy', 'hive', 'itch', 'blister', 'inflammation', 'acne', 'dermatitis']}}
    },
    {
        'LEFT_ID': 'cause_verb',
        'REL_OP': '>',
        'RIGHT_ID': 'subject',
        'RIGHT_ATTRS': {'POS':{'IN': ['NOUN', 'PRON', 'PROPN']}}
    }
]

matcher.add('NEG_AFFECT1', [negative_effect_pattern_dep])



docs = [
    "it gave a horrible and annoying weird headache and an allergy ",
    "it didn't give a horrible and annoying weird headache and an allergy "

]
print("Found Matches:")
for doc in docs:
    parsed_doc = nlp(doc)
    matches = matcher(parsed_doc)
    for match_id, span in matches:
        string_id = nlp.vocab.strings[match_id]  # Get string representation
        span_t = parsed_doc[min(span):max(span)+1]
        print(f"{string_id:<20}{span} {span_t.text}")

NEG_AFFECT1 [1, 10, 0] it gave a horrible and annoying weird headache and an allergy
NEG_AFFECT1 [1, 10, 7] gave a horrible and annoying weird headache and an allergy
NEG_AFFECT1 [3, 12, 0] it didn't give a horrible and annoying weird headache and an allergy
NEG_AFFECT1 [3, 12, 9] gave a horrible and annoying weird headache and an allergy

## Your Environment

spaCy version: 3.5.1
Platform: macOS-10.16-x86_64-i386-64bit
Python version: 3.9.16
Pipelines: en_core_web_lg (3.5.0), en_core_web_trf (3.5.0), en_core_web_sm (3.5.0), en_core_web_md (3.5.0)


svlandeg · 2023-08-21T13:22:35Z

Hi!

To better understand what is happening, it's helpful to change the printing of the matches so that they show which exact string was matched for which subpart of the pattern:

        if matches:
            match_id, token_ids = matches[0]
            for i in range(len(token_ids)):
                print(negative_effect_pattern_dep[i]["RIGHT_ID"] + ":", parsed_doc[token_ids[i]].text)
            print()

Now, let's first define the negation pattern as an actual, simple negation:

        {
            'LEFT_ID': 'cause_verb',
            'REL_OP': '>',
            'RIGHT_ID': 'negation',
            'RIGHT_ATTRS': {'LEMMA': {'IN': ["nt", "n't", "not"]}}
        },

As expected, this does not produce any matches on your first (positive) example text, but it will on your second:

   cause_verb: give
   negation: n't
   effect: allergy
   subject: it

Now, you've tried adding 'OP': '!' to this subpattern:

        {
            'LEFT_ID': 'cause_verb',
            'REL_OP': '>',
            'RIGHT_ID': 'negation',
            'RIGHT_ATTRS': {'LEMMA': {'IN': ["nt", "n't", "not"]}},
            'OP': '!'
        },

but this is unfortunately not supported and won't change the output. As stated in the docs, only 4 keys are supported: 'LEFT_ID', 'REL_OP', 'RIGHT_ID' and 'RIGHT_ATTRS'. You could just as well add 'FOO': 'BAR', and that wouldn't have any effect either. I agree with your suggestion of emitting a warning when an unrecognized key occurs.

Note that an unrecognized value for an existing key does raise an error. E.g. when you put 'REL_OP': 'BAR', you'll get a ValueError: [E1007] Unsupported DependencyMatcher operator 'BAR'.
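Until the matcher itself warns about this, unrecognized keys can be caught on the caller's side before the pattern is added. The following is only a minimal sketch: `find_unsupported_keys` and `warn_on_unsupported_keys` are hypothetical helper names, not part of spaCy, and the set of supported keys is simply the four listed above.

```python
import warnings

# The four keys named above as supported by the DependencyMatcher.
SUPPORTED_KEYS = {"LEFT_ID", "REL_OP", "RIGHT_ID", "RIGHT_ATTRS"}


def find_unsupported_keys(pattern):
    """Return (node_index, key) pairs for top-level keys outside SUPPORTED_KEYS.

    Only the top level of each node dict is inspected. The contents of
    RIGHT_ATTRS are token-pattern attributes and are left alone.
    """
    problems = []
    for index, node in enumerate(pattern):
        for key in node:
            if key not in SUPPORTED_KEYS:
                problems.append((index, key))
    return problems


def warn_on_unsupported_keys(pattern):
    """Emit one UserWarning per unsupported key and return the findings."""
    problems = find_unsupported_keys(pattern)
    for index, key in problems:
        warnings.warn(f"pattern[{index}] has unsupported key {key!r}; it will be ignored")
    return problems


pattern = [
    {"RIGHT_ID": "cause_verb", "RIGHT_ATTRS": {"LEMMA": {"IN": ["cause", "make", "give"]}}},
    {
        "LEFT_ID": "cause_verb",
        "REL_OP": ">",
        "RIGHT_ID": "negation",
        "RIGHT_ATTRS": {"LEMMA": {"IN": ["nt", "n't", "not"]}},
        "OP": "!",  # the key that is silently ignored in this thread
    },
]

print(find_unsupported_keys(pattern))  # [(1, 'OP')]
```

Calling `warn_on_unsupported_keys(pattern)` right before `matcher.add(...)` would surface the stray 'OP' key instead of letting it pass unnoticed.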

Looking at your example code, it appears that you've also tried putting the 'OP': '!' part within the RIGHT_ATTRS definition, so you get this:

        {
            'LEFT_ID': 'cause_verb',
            'REL_OP': '>',
            'RIGHT_ID': 'negation',
            'RIGHT_ATTRS': {'LEMMA': {'IN': ["nt", "n't", "not"]}, 'OP': '!'},
        },

In this case, both sentences will match and that's actually not a bug. Look at the output:

(text 1)
   cause_verb: gave
   negation: it
   effect: allergy
   subject: it

(text 2)
   cause_verb: give
   negation: it
   effect: allergy
   subject: it

What happened is that in both cases, the matcher found a token dependent on the cause_verb whose lemma is NOT in the given list ["nt", "n't", "not"] - this token was "it".
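The difference between the two readings can be shown without spaCy at all. This is a sketch under assumptions: the lists of child lemmas are written by hand to mirror what a parser would plausibly produce for the two example texts, and the function names are made up for illustration.

```python
NEGATION_LEMMAS = {"nt", "n't", "not"}

# Hand-written lemmas of the direct dependents of the verb (assumed parses).
children_positive = ["it", "headache"]             # "it gave a ... headache"
children_negated = ["it", "do", "n't", "headache"]  # "it didn't give a ... headache"


def some_child_is_not_negation(child_lemmas):
    """The reading the matcher applies with '!' inside RIGHT_ATTRS:
    at least one dependent whose lemma is not in the list."""
    return any(lemma not in NEGATION_LEMMAS for lemma in child_lemmas)


def no_child_is_negation(child_lemmas):
    """The reading the original pattern intended:
    no dependent at all whose lemma is in the list."""
    return not any(lemma in NEGATION_LEMMAS for lemma in child_lemmas)


# "it" satisfies the first reading in both sentences.
print(some_child_is_not_negation(children_positive), some_child_is_not_negation(children_negated))  # True True

# Only the second reading tells the sentences apart.
print(no_child_is_negation(children_positive), no_child_is_negation(children_negated))  # True False
```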

In summary (TLDR):

- Adding the '!' operator within the 'RIGHT_ATTRS' won't give you the desired behaviour, as there may be other tokens that match.
- What you want to do instead is tell the dependency matcher that none of the dependent tokens should match the given specification. This functionality is currently not supported (but we'd accept a PR!).
- We'll have a look at potentially warning when an unrecognized key is used in the pattern dictionary.
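As long as "none of the dependents may match" cannot be expressed in the pattern itself, one possible workaround is to drop the negation node from the pattern and filter the matches afterwards. The sketch below relies only on `Token.children` and `Token.lemma_`, which are real spaCy attributes. `StubToken`, `has_negated_child` and `drop_negated_matches` are hypothetical names, and the stub documents are hand-built so that the example runs without a trained pipeline.

```python
from dataclasses import dataclass, field

NEGATION_LEMMAS = {"nt", "n't", "not"}


@dataclass
class StubToken:
    """Hypothetical stand-in exposing the two spaCy Token attributes used below."""
    lemma_: str
    children: list = field(default_factory=list)


def has_negated_child(token, negation_lemmas=NEGATION_LEMMAS):
    """True if any direct dependent of `token` has a negation lemma."""
    return any(child.lemma_ in negation_lemmas for child in token.children)


def drop_negated_matches(doc, matches, anchor_index=0):
    """Keep only matches whose anchor token has no negation dependent.

    `matches` holds (match_id, token_ids) pairs in the shape the
    DependencyMatcher returns; token_ids[anchor_index] is assumed to be
    the verb, i.e. the first node of the pattern.
    """
    return [
        (match_id, token_ids)
        for match_id, token_ids in matches
        if not has_negated_child(doc[token_ids[anchor_index]])
    ]


# Hand-built stand-ins for "it did n't give allergy" and "it gave allergy".
it, do, neg, allergy = StubToken("it"), StubToken("do"), StubToken("n't"), StubToken("allergy")
negated_doc = [it, do, neg, StubToken("give", [it, do, neg, allergy]), allergy]
positive_doc = [it, StubToken("give", [it, allergy]), allergy]

# Token ids follow the pattern order without the negation node:
# cause_verb, effect, subject. The match id 1 is a placeholder.
print(drop_negated_matches(negated_doc, [(1, [3, 4, 0])]))   # []
print(drop_negated_matches(positive_doc, [(1, [1, 2, 0])]))  # [(1, [1, 2, 0])]
```

With a real pipeline, `doc` would be the parsed `Doc` and `matches` the output of `matcher(doc)`; the filter itself stays the same.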

shaked571 · 2023-08-21T14:54:32Z

OK.

I now understand what happened.

Thank you for your quick and professional response.

I managed to achieve what I need using 'REL_OP': ';':

        {
            'LEFT_ID': 'cause_verb',
            # ';' means: A immediately follows B, i.e. A.i == B.i + 1, and both are
            # within the same dependency tree. Here B is the token right before the
            # verb: did [n't] <- negation slot, [give] <- cause_verb
            'REL_OP': ';',
            'RIGHT_ID': 'negation',
            'RIGHT_ATTRS': {'LEMMA': {'IN': ["nt", "n't", "not"]}, 'OP': '!'},
        },

This way I can enforce that the token immediately before the verb is not a negation.
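For readers reusing this workaround, it may help to spell out what the adjacency check does and does not cover. The sketch below mimics it on plain lists of lemmas. The lemma lists are hand-written assumptions and `immediately_preceded_by_negation` is a hypothetical helper, not a spaCy function.

```python
NEGATION_LEMMAS = {"nt", "n't", "not"}


def immediately_preceded_by_negation(lemmas, verb_index):
    """True if the token directly before the verb is a negation.

    A verb at index 0 has no preceding token, so this returns False.
    (With the actual pattern, the ';' node presumably could not be
    filled in that case and the pattern would not match at all.)
    """
    return verb_index > 0 and lemmas[verb_index - 1] in NEGATION_LEMMAS


# "it didn't give ..." -> negation sits right before the verb and is caught.
print(immediately_preceded_by_negation(["it", "do", "n't", "give", "allergy"], 3))  # True

# "it gave ..." -> no negation.
print(immediately_preceded_by_negation(["it", "give", "allergy"], 1))  # False

# "it did not really give ..." -> the adverb hides the negation from the check.
print(immediately_preceded_by_negation(["it", "do", "not", "really", "give", "allergy"], 4))  # False
```

The last case shows the limit of the approach: only the single token before the verb is inspected, so a negation separated from the verb by another word would slip through.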

svlandeg · 2023-08-21T15:11:02Z

Great, thanks for posting this solution to your specific usage example! That'll be useful for others finding this thread :-)

github-actions · 2023-09-21T00:02:13Z

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

svlandeg added the feat / matcher Feature: Token, phrase and dependency matcher label Aug 21, 2023

svlandeg added the enhancement Feature requests and improvements label Aug 21, 2023

svlandeg mentioned this issue Aug 21, 2023 in "warn for unsupported key in dependency matcher" #12928 (merged)

shaked571 closed this as completed Aug 21, 2023

github-actions bot locked as resolved and limited conversation to collaborators Sep 21, 2023
