-
Notifications
You must be signed in to change notification settings - Fork 208
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
python3 compatibility #187
Conversation
I've found a workaround for the second type of failing tests. See @8cb6fe8 for details. It's not the best solution possible, but it's the best thing I can do while keeping python2 and python3 supported at the same level. |
Right, I found the issue for the remaining test cases. The vevents with the which the test outputs are compared have UIDs defined. These UIDs are the ones generated by python2 after seeing python3's randomness algorithm differs from python2's one, so these tests fail. I've updated the tests to replace the value of the events with a pre-defined one, since UIDs generation isn't what we indend to test. This fixes the last test when running with python3. |
@@ -21,6 +21,8 @@ | |||
# WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. | |||
# | |||
|
|||
from __future__ import unicode_literals | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't do this! Some Python 2 APIs exhibit weird behavior when fed unicode strings.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wasn't aware of this. :(
Have you found any in khal's case? Some brief testing proved that everything worked, and tests pass, but I'm willing to look into this on a case-base-case basis.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably khal doesn't even use any APIs where this matters, I just wouldn't do it as a general rule.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great. So let's keep this around in khal's case. 😉
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I really wouldn't do it... to argue my points, I've found this, but it doesn't go really into detail for all of the issues we could experience.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe from a readability POV (although a very weak point IMO): Because only some modules have that future-import, you can't know whether a particular string literal produces a unicode or bytestring. So we should then do this import for every module, but that means we'd have to wrap 70% of string literals with str(...)
(because "native strings" is what most stdlib APIs actually expect)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The linked website only mentions a few issues in detail, and I don't see them applying to us.
Because only some modules have that future-import, you can't know whether a particular string literal produces a unicode or bytestring.
Without this import, literals are str
, not bytestrings. With it, they become unicode strings (unicode
). And stdlib still works with unicode strings, eg:
>>> with open(u'test', u'w') as f:
... f.write("1")
It's not merely readability either, IMHO. When writing new code, you don't need to remember to add adding u'
to new strings - and forgetting that may cause subtle unexpected error.
Personally, I don't see drawbacks with using unicode_literals
on khal's codebase, nor any way this can come back and bite us in a forseeable future, so I'd much rather keep it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Without this import, literals are str, not bytestrings. With it, they become unicode strings (unicode).
Somehow the rest of the sentence slipped: You first have to look at the top of the file whether unicode_literals
are imported to make sense of the literals, which is IMO dangerous while reviewing PRs.
and forgetting that may cause subtle unexpected error
The opposite does too. Try the following code in Python 2, with file
having nonascii characters in it:
with open('file') as f:
print(u'output: {}'.format(f.read()))
You don't have those problems when you use native string literals (by which I mean 'asdasd'
) for the format string
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So basically my point is that changing literals everywhere (by importing unicode_literals
) might create regressions for current Python 2 users, which are arguably the majority right now.
elif conf['default']['print_new'] == 'path': | ||
path = collection._calnames[event.calendar].path + event.href | ||
print(path.encode(conf['locale']['encoding'])) | ||
echo(path.encode(conf['locale']['encoding'])) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is always bytes.
I can squash some commits pre-merge if you like since one or two ended up being reversed (I avoided doing so thus far because it'd break the comments on existing commits). |
I'm still really averse to the whole Regarding squashing, I think GitHub preserves the diff comments I've made, at least on this PR view. |
I don't see any downsides to it in our case. Scenarios like wsgi or other network-oriented IO might have issues, but khal doesn't do this, and most likely never will. |
Using unicode_literals from __future__ is a lot cleaner that using the "u" prefix. There's also less change of accidental breakage.
Use the name zip_longest (which is the name in python3), so as to avoid one 2to3 false positive. The old name is also questionably obsolete.
I'd rather say that it tries to conceal the differences badly. I feel like we could do this in a separate PR where we would switch Khal's internal APIs and representations to unicode strings, but personally I am not really happy with it. @geier might have different opinions though. |
Background: Previous tests use the UID that python2 will auto-generate when `random` is seeded with `1`. Different versions of python return different values, and we shouldn't rely the assumption that random can be made non-random anyway. Update the tests to set the UID of generated events to a pre-defined on, so that tests can be written to match with these expected values.
Thanks @untitaker for pointing out the issue here.
Rebased and squashed a few commits. Thanks a lot for all the feedback, @untitaker! |
Awesome, thanks @hobarrera! |
Thanks hobarrera! I've added you as a contributor, but I'd appreciate it, if you would keep making pull requests for anything bigger. |
Thanks for that. :) Cheers! |
Do not merge yetUpdate: This PR is ready for merging. See below for further details.This is a work-in-progress PR, mostly to avoid anybody else duplicating the effort, and to get some preliminary feedback.
khal
andikhal
work flawlessly. There are still nine tests failing, however, I'm inclined to think that it's due to how the tests are written rather than the code itself.There's two types of failing tests with the tests right now: