Handle non-ASCII chars correctly #22
As per pndurette#21, non-ASCII chars were not being handled correctly when calculating the token. This was due to an error in the calculate_token function. The diff should be self-explanatory when compared with token-script.js. I added a couple of tests to make sure all Unicode chars are handled correctly.
Any suggestions on how to handle Unicode strings properly in both Python 2.7 and 3.x without too much ado? (Besides dropping support for Python 3.2?)
As it turns out, a lot of the token script was just doing the UTF-8 encoding of a piece of text. Python can also do that, so now it's way simpler.
Otherwise there is six.text_type, but that would introduce a new dependency.
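For illustration only, here is a minimal sketch of how the UTF-8 byte values of a string could be obtained the same way on Python 2.7 and 3.3+ without six. The helper name `utf8_byte_values` is hypothetical and is not the code from this PR; it assumes the input is a text (unicode) string. The `u""` literal is accepted by Python 2.7 and by 3.3+, but not by 3.2, which is one reason dropping 3.2 simplifies things.

```python
def utf8_byte_values(text):
    """Return the UTF-8 byte values of a text string as a list of ints.

    Hypothetical helper for illustration; assumes `text` is a unicode
    string (`unicode` on Python 2, `str` on Python 3).
    """
    # encode() yields a byte string; wrapping it in bytearray makes
    # iteration produce ints on both Python 2 and Python 3.
    return list(bytearray(text.encode("utf-8")))


# ASCII characters map to a single byte, non-ASCII ones to several.
ascii_values = utf8_byte_values(u"a")
accented_values = utf8_byte_values(u"\u00e9")  # "é"
euro_values = utf8_byte_values(u"\u20ac")      # "€"
```

Going through `bytearray` avoids the difference between iterating a `str` on Python 2 (one-character strings) and iterating `bytes` on Python 3 (ints).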
From what I've read, dropping Python 3.2 is the thing to do, and it's what most projects are doing. So I guess we can agree on this. Thanks a lot for this (fast) fix @Boudewijn26! Will release this shortly.
Handle non-ASCII chars correctly
According to #21, tokens aren't calculated correctly when the text contains non-ASCII chars. I found and fixed the bug and added a couple of extra tests.