The current version doesn't work with Cyrillic text. It fails with a Unicode error.
More specifically:
With markdown:
Unexpected Error: <type 'exceptions.UnicodeDecodeError'>
Traceback (most recent call last):
  File "criticParser_CLI.py", line 348, in <module>
    h = markdown.markdown(h, extensions=['extra', 'codehilite', 'meta'])
  File "/usr/lib/python2.7/dist-packages/markdown/__init__.py", line 396, in markdown
    return md.convert(text)
  File "/usr/lib/python2.7/dist-packages/markdown/__init__.py", line 266, in convert
    source = unicode(source)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xd0 in position 0: ordinal not in range(128). -- Note: Markdown only accepts unicode input!
With markdown2:
Using the Markdown2 module for processing
/path-to-program/CriticMarkup-toolkit/CLI/1.html
Unexpected Error: <type 'exceptions.UnicodeEncodeError'>
Traceback (most recent call last):
  File "criticParser_CLI.py", line 371, in <module>
    filesource.write(h)
UnicodeEncodeError: 'ascii' codec can't encode characters in position 3667-3670: ordinal not in range(128)
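Both tracebacks seem to come down to the same thing: Python 2 falls back to the ASCII codec whenever byte strings and unicode text are mixed without a stated encoding. Here is a minimal sketch of the decode side. The sample string is made up for illustration and is not taken from the file that triggered the report.

```python
# -*- coding: utf-8 -*-
# Illustrative sample only; any Cyrillic text behaves the same way.
text = u"Привет"

# UTF-8 encodes the Cyrillic letter "П" as the two bytes 0xD0 0x9F,
# which is where the "byte 0xd0 in position 0" in the traceback comes from.
raw = text.encode("utf-8")

try:
    # Decoding those bytes as ASCII fails, because 0xD0 is outside range(128).
    raw.decode("ascii")
    message = None
except UnicodeDecodeError as err:
    message = str(err)

# Decoding with the encoding the bytes were written in recovers the text.
decoded = raw.decode("utf-8")
```

The write side is the mirror image: unicode text has to be encoded to bytes with an explicit encoding before it goes into a file.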
I found a workaround after some googling. It may not be very elegant, but it does the job. It applies to the command line tool criticParser_CLI.py. I am not a programmer, so maybe there is a better way to do it.
This is not only a problem with Cyrillic text: it affects any text that contains non-ASCII characters, i.e. anything beyond plain English or classical Latin. It would be enough to replace open(args.source, "r") with codecs.open(args.source, "r", encoding="UTF-8"), or even to add an encoding parameter. This is a little less hacky than sys.setdefaultencoding('utf8').
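A rough sketch of what that could look like, for both reading and writing. Note that it uses io.open rather than codecs.open: it accepts the same encoding keyword and behaves the same on Python 2.7 and Python 3. The helper names read_text and write_text and the sample file are hypothetical and not part of criticParser_CLI.py.

```python
# -*- coding: utf-8 -*-
import io
import os
import tempfile


def read_text(path, encoding="utf-8"):
    """Hypothetical helper: read a file and return unicode text."""
    with io.open(path, "r", encoding=encoding) as handle:
        return handle.read()


def write_text(path, text, encoding="utf-8"):
    """Hypothetical helper: write unicode text using an explicit encoding."""
    with io.open(path, "w", encoding=encoding) as handle:
        handle.write(text)


# Round trip with a made-up sample containing Cyrillic and CriticMarkup syntax.
tmp_dir = tempfile.mkdtemp()
sample_path = os.path.join(tmp_dir, "sample.md")
write_text(sample_path, u"Привет, {++мир++}!\n")
round_trip = read_text(sample_path)
```

Because the text is already unicode when it comes back from read_text, it can be passed straight to markdown, and write_text takes care of encoding it again on the way out.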
teoric added a commit to teoric/CriticMarkup-toolkit that referenced this issue on Jul 7, 2015:
addresses CriticMarkup#33
Just using `open` tends to expect ASCII encoding. UTF-8 seems to be a
more sensible default.
Maybe in the long run, an encoding parameter makes sense?
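One way such an encoding parameter might look, as a sketch only. It assumes the tool parses its arguments with argparse and takes the source file as a positional argument; the --encoding flag and the build_parser function are hypothetical and do not exist in the current script.

```python
import argparse


def build_parser():
    """Hypothetical parser; only args.source is known from the real script."""
    parser = argparse.ArgumentParser(description="Convert a CriticMarkup file")
    parser.add_argument("source", help="path of the file to convert")
    # Assumed flag: UTF-8 stays the default, other encodings are opt-in.
    parser.add_argument(
        "--encoding",
        default="utf-8",
        help="text encoding of the source file (default: utf-8)",
    )
    return parser


# Parsing explicit argument lists here instead of sys.argv, for illustration.
default_args = build_parser().parse_args(["notes.md"])
custom_args = build_parser().parse_args(["notes.md", "--encoding", "cp1251"])
```

The value of args.encoding could then be passed on to whatever call opens args.source.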
I found a workaround after some googling. It may not be very elegant, but it does the job. It applies to the command line tool criticParser_CLI.py. I am not a programmer, so maybe there is a better way to do it.
First, this section
should become
Then this section
should become