Comparatives Without More/less terms #123
//Comparatives Without More/less terms
rc &= test_sentence ("Her great-grandson is nicer than her great-granddaughter.",
"_subj(is, great grandson)\n"+
Here, the dash in great-grandson is missing, but below it's present... ??
That is a TYPO; I wanted to omit "-" from all the relationships. Relex doesn't output the dash "-", so I guess that is not a bug?
Why remove the dashes? They seem just fine to me!
Other than the highly inconsistent normalization and the one LG bug, this looks reasonable, so I'll merge. Please fix the normalization when you get the chance.
I guess I should have let @ruiting review too ... sorry, I forgot. These looked straightforward to me.
@linas I agree with your comments... I can't get normal internet access this week, so it would be great if you could just review them. Thanks. I think we already discussed most of the issues about comparatives by email a couple of weeks ago...
OK, so the .a-c subscript in link-grammar means it's an "adjective-comparative"; this is converted to the relex POS tags in data/relex-tagging.algs. The stemming is done in src/java/relex/morph/*java. I'm just confused about why some of the comparatives get normalized and some don't. Is this a bug in relex, or is this due to WordNet being inconsistent?
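To make the subscript convention concrete, here is a minimal sketch of how a token such as "nicer.a-c" could be split into a word and its subscript. The class and method names are hypothetical and are not taken from relex or link-grammar; the real conversion to POS tags lives in data/relex-tagging.algs and is not reproduced here. The split is deliberately naive (it cuts at the last dot) and only illustrates the idea.

```java
public class SubscriptSplitter {

    /**
     * Splits a token at its last dot into {word, subscript}.
     * Returns an empty subscript when there is no dot, or when the dot
     * is the first or last character of the token.
     * Hypothetical helper, for illustration only.
     */
    public static String[] split(String token) {
        int dot = token.lastIndexOf('.');
        if (dot <= 0 || dot == token.length() - 1) {
            return new String[] { token, "" };
        }
        return new String[] { token.substring(0, dot), token.substring(dot + 1) };
    }

    /** True when the token carries the "a-c" (adjective-comparative) subscript. */
    public static boolean isAdjectiveComparative(String token) {
        return "a-c".equals(split(token)[1]);
    }
}
```

With this sketch, a token like "nicer.a-c" would be flagged as an adjective-comparative, while a bare "nicer" would not.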
It seems "bigger" can be an adjective itself in WordNet...

Overview of adj bigger
The adj bigger has 1 sense (first 1 from tagged texts)
OK, so I take that as the answer: "It's a feature (not a bug) in WordNet".
OK, so if I may, a mini-lecture: the current relex morphology/normalization code is mis-designed/mis-architected. It's good enough to handle most cases, but it can fail badly. The problem is that the current code only looks at single words; it should be looking at larger parts of the sentence. An example of a failure: should "better" normalize to "good" or to "well"? That depends on the actual sentence. I've seen other examples in the past, but I can't remember them.

The correct design would understand normalization as the inverse of a "lexical function": https://en.wikipedia.org/wiki/Lexical_function What we should be doing is not only normalizing, but also identifying the lexical function that applies in the given situation. This would also make relex far more useful for language generation (this last comment is for you, @ruiting!).
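As a rough illustration of that design (not the relex implementation), the sketch below normalizes a word using context supplied by the parse, here reduced to a POS tag, and returns a label for the function that was inverted alongside the base form. Everything in it is an assumption made for the example: the class name, the tiny lookup table, the "adj"/"adv" tags, and the "COMPARATIVE" label, which is an illustrative name rather than a term from the lexical-function literature.

```java
import java.util.Map;

public class ContextualNormalizer {

    /** Hypothetical result: the base form plus the function that produced the surface word. */
    public static final class Normalized {
        public final String lemma;
        public final String function;

        Normalized(String lemma, String function) {
            this.lemma = lemma;
            this.function = function;
        }
    }

    // Toy table (assumed for this example): surface form -> POS tag -> base form.
    private static final Map<String, Map<String, String>> IRREGULAR = Map.of(
        "better", Map.of("adj", "good", "adv", "well"),
        "worse", Map.of("adj", "bad", "adv", "badly"));

    /**
     * Normalizes a word given the POS tag it received in the parse.
     * The same surface word can map to different base forms depending on
     * that context; unknown words are returned unchanged with "NONE".
     */
    public static Normalized normalize(String word, String pos) {
        Map<String, String> byPos = IRREGULAR.get(word);
        if (byPos != null && byPos.containsKey(pos)) {
            return new Normalized(byPos.get(pos), "COMPARATIVE");
        }
        return new Normalized(word, "NONE");
    }
}
```

In "She is a better singer", "better" would be tagged as an adjective and map to "good"; in "She sings better", it would be tagged as an adverb and map to "well". Keeping the function label next to the base form is what would let a generator run the mapping in the other direction.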
@ruiting @linas please check newly added test sentences.