because it's a string of type 'unicode' in Python 2.
Case 2: Passing encoded Unicode strings:
from pycorenlp import StanfordCoreNLP
nlp = StanfordCoreNLP('http://localhost:9000/')
text = u'Köln is a city in Germany.'.encode('utf-8')
print(nlp.annotate(text))
throws
UnicodeDecodeError Traceback (most recent call last)
/home/jovyan/work/python/pycorenlp/corenlp.py in annotate(self, text, properties)
---> 25 data = text.encode()
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 1: ordinal not in range(128)
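because the string has already been encoded and cannot be encoded again: in Python 2, calling encode() on a byte string first decodes it implicitly with the ASCII codec, which fails on the non-ASCII byte 0xc3.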
These two lines of code in the error messages were both introduced in #6 in May 2016 to fix some Unicode issues. However, it seems the explicit encoding in line 25 is no longer required, because with it removed, case 2 works perfectly (in both Python 2 and Python 3).
Note also that encoding issues were fixed in CoreNLP in October 2016 (stanfordnlp/CoreNLP#270).
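One way to make both cases work would be to encode only when the input is actually text. The following is a minimal sketch, not pycorenlp's code: `to_utf8_bytes` and `text_type` are hypothetical names introduced here for illustration.

```python
# -*- coding: utf-8 -*-
import sys

# Hypothetical helper, not part of pycorenlp.
# Pick the "text" type for the running interpreter.
if sys.version_info[0] >= 3:
    text_type = str
else:
    text_type = unicode  # noqa: F821 (this name only exists in Python 2)


def to_utf8_bytes(text):
    """Return UTF-8 bytes for text input; pass encoded input through unchanged."""
    if isinstance(text, text_type):
        return text.encode('utf-8')
    # Already a byte string: assume the caller encoded it as UTF-8.
    return text


# Case 1 (text) and case 2 (already encoded) yield the same bytes.
assert to_utf8_bytes(u'Köln') == b'K\xc3\xb6ln'
assert to_utf8_bytes(u'Köln'.encode('utf-8')) == b'K\xc3\xb6ln'
```

The guard avoids the implicit ASCII decode that Python 2 performs when encode() is called on a byte string, while still encoding text explicitly on Python 3.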
I have to correct myself. The character encoding in Python 3 will be broken again if text.encode() is removed. So this string problem seems to be caused by one of the incompatible changes in Python 3.
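The thread does not say which layer mangles the characters once the call is removed, so the following Python 3 sketch only illustrates why the choice of encoding matters. It assumes the receiving side expects UTF-8; the variable names are made up for the example.

```python
# Python 3 only: str is text, bytes is binary data.
text = 'Köln'

# str.encode() defaults to UTF-8 in Python 3.
utf8_bytes = text.encode()
print(utf8_bytes)    # b'K\xc3\xb6ln'

# If some other layer picked a different encoding, the bytes would differ.
latin1_bytes = text.encode('latin-1')
print(latin1_bytes)  # b'K\xf6ln'

# A receiver that decodes as UTF-8 cannot recover the original character.
garbled = latin1_bytes.decode('utf-8', errors='replace')
print(garbled == 'K\ufffdln')  # True

# Unlike Python 2's str, Python 3's bytes has no encode() method at all.
print(hasattr(utf8_bytes, 'encode'))  # False
```

Because bytes objects have no encode() in Python 3, case 2 fails there with an AttributeError instead of the UnicodeDecodeError seen in Python 2.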
I'm trying to annotate some Unicode strings, but the following examples throw errors.
Case 1: Passing Unicode strings.
throws
because it's a string of type 'unicode' in Python 2.