This repository has been archived by the owner on Feb 14, 2018. It is now read-only.

Unable to detect quotes #2

Open
ghost opened this issue Jul 16, 2014 · 2 comments

Comments

@ghost

ghost commented Jul 16, 2014

While using your tagger, I am getting good results.

However, when it comes to quotes, such as inch symbols and quoted text, the tagger completely ignores the quotation marks, which makes these cases difficult to work with.

Is there a way to make sure that the quotes are tagged as quotes, as in the Stanford Parser?

@NickShahML

To add to this, is there a way for the perceptron tagger to tag all punctuation? For example, can it tag periods, question marks, quotes, and all other text it comes across? That would be a really big help. Currently the second-best tagger that handles punctuation is the NLTK tagger, but yours is much better.

@syllog1sm
Collaborator

It should tag all punctuation, but it'll have trouble with Unicode entities.

Unfortunately, I'm no longer supporting this code. I'm working full time on spaCy, which is now under the MIT license too ( http://spacy.io ). spaCy handles non-ASCII characters appropriately, and is both faster and more accurate.

NLTK have recently agreed to use this tagger. However, I don't know how well they support Unicode punctuation.
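For anyone who hits the Unicode issue with curly quotes or inch marks, one possible workaround is to map those characters to their ASCII equivalents before passing text to a tagger. The sketch below is only an illustration: `normalize_quotes` and `QUOTE_MAP` are hypothetical names, not part of this tagger, NLTK, or spaCy, and the character list is an assumption that may need extending for your data.

```python
# Hypothetical pre-processing helper (not part of any tagger's API):
# replace common Unicode quotation marks and primes with ASCII quotes
# so that a tagger limited to ASCII punctuation can see them.
QUOTE_MAP = {
    "\u2018": "'",   # left single quotation mark
    "\u2019": "'",   # right single quotation mark / apostrophe
    "\u201c": '"',   # left double quotation mark
    "\u201d": '"',   # right double quotation mark
    "\u2032": "'",   # prime (often used for feet)
    "\u2033": '"',   # double prime (often used for inches)
}

# str.maketrans accepts a dict of single characters to replacement strings.
_QUOTE_TABLE = str.maketrans(QUOTE_MAP)


def normalize_quotes(text: str) -> str:
    """Return text with the mapped Unicode quotes replaced by ASCII ones."""
    return text.translate(_QUOTE_TABLE)


if __name__ == "__main__":
    sample = "He said \u201cuse a 5\u2033 pipe\u201d."
    print(normalize_quotes(sample))
```

Whether the tagger then assigns a quote tag to the ASCII characters still depends on the tokenizer and the model's training data, so this only removes the Unicode part of the problem.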

(Replied by email on Thursday, October 15, 2015, quoting the previous comment from LeavesBreathe.)
