python3 compatibility #187

WhyNotHugo · 2015-05-02T00:43:13Z

~~Do not merge yet~~ Update: This PR is ready for merging. See below for further details.

This is a work-in-progress PR, mostly to avoid anybody else duplicating the effort, and to get some preliminary feedback.

khal and ikhal work flawlessly. There are still nine tests failing, however, I'm inclined to think that it's due to how the tests are written rather than the code itself.

There's two types of failing tests with the tests right now:

Comparisons where the generated UID does not match the expected one. I haven't looked deeply into this yet.
Comparisons where python3 seems to be including shell escape code for bold, hence, outputs do not match: '\x1b[1mNo events\x1b[0m\n' == 'No events\n'. Fixing the test would break the python2 version, so I'll have to think of a better solution later on.

WhyNotHugo · 2015-05-02T04:14:56Z

I've found a workaround for the second type of failing tests. See @8cb6fe8 for details. It's not the best solution possible, but it's the best thing I can do while keeping python2 and python3 supported at the same level.

WhyNotHugo · 2015-05-02T05:08:34Z

Right, I found the issue for the remaining test cases.

The vevents with the which the test outputs are compared have UIDs defined. These UIDs are the ones generated by python2 after seeing random with 1, hence, they match the autogenerated ones.

python3's randomness algorithm differs from python2's one, so these tests fail. I've updated the tests to replace the value of the events with a pre-defined one, since UIDs generation isn't what we indend to test.

This fixes the last test when running with python3.

untitaker · 2015-05-02T08:57:34Z

khal/controllers.py

@@ -21,6 +21,8 @@
 # WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
 #

+from __future__ import unicode_literals
+


Don't do this! Some Python 2 APIs exhibit weird behavior when fed unicode strings.

I wasn't aware of this. :(

Have you found any in khal's case? Some brief testing proved that everything worked, and tests pass, but I'm willing to look into this on a case-base-case basis.

Probably khal doesn't even use any APIs where this matters, I just wouldn't do it as a general rule.

Great. So let's keep this around in khal's case. 😉

I really wouldn't do it... to argue my points, I've found this, but it doesn't go really into detail for all of the issues we could experience.

Maybe from a readability POV (although a very weak point IMO): Because only some modules have that future-import, you can't know whether a particular string literal produces a unicode or bytestring. So we should then do this import for every module, but that means we'd have to wrap 70% of string literals with str(...) (because "native strings" is what most stdlib APIs actually expect)

The linked website only mentions a few issues in detail, and I don't see them applying to us.

Because only some modules have that future-import, you can't know whether a particular string literal produces a unicode or bytestring.

Without this import, literals are str, not bytestrings. With it, they become unicode strings (unicode). And stdlib still works with unicode strings, eg:

>>> with open(u'test', u'w') as f: ... f.write("1")

It's not merely readability either, IMHO. When writing new code, you don't need to remember to add adding u' to new strings - and forgetting that may cause subtle unexpected error.

Personally, I don't see drawbacks with using unicode_literals on khal's codebase, nor any way this can come back and bite us in a forseeable future, so I'd much rather keep it.

Without this import, literals are str, not bytestrings. With it, they become unicode strings (unicode).

Somehow the rest of the sentence slipped: You first have to look at the top of the file whether unicode_literals are imported to make sense of the literals, which is IMO dangerous while reviewing PRs.

and forgetting that may cause subtle unexpected error

The opposite does too. Try the following code in Python 2, with file having nonascii characters in it:

with open('file') as f: print(u'output: {}'.format(f.read()))

You don't have those problems when you use native string literals (by which I mean 'asdasd') for the format string

So basically my point is that changing literals everywhere (by importing unicode_literals) might create regressions for current Python 2 users, which are arguably the majority right now.

untitaker · 2015-05-02T12:56:17Z

khal/controllers.py

        elif conf['default']['print_new'] == 'path':
            path = collection._calnames[event.calendar].path + event.href
-            print(path.encode(conf['locale']['encoding']))
+            echo(path.encode(conf['locale']['encoding']))


This is always bytes.

WhyNotHugo · 2015-05-02T13:26:38Z

I can squash some commits pre-merge if you like since one or two ended up being reversed (I avoided doing so thus far because it'd break the comments on existing commits).

untitaker · 2015-05-02T13:27:38Z

I'm still really averse to the whole unicode_literals thing.

Regarding squashing, I think GitHub preserves the diff comments I've made, at least on this PR view.

WhyNotHugo · 2015-05-02T13:33:56Z

I'm still really averse to the whole unicode_literals thing.

I don't see any downsides to it in our case.

Scenarios like wsgi or other network-oriented IO might have issues, but khal doesn't do this, and most likely never will.
It also makes sure new code behaves consistently between python2/python3 in case any developer forgets about u' (which is rather common).

Using unicode_literals from __future__ is a lot cleaner that using the "u" prefix. There's also less change of accidental breakage.

Use the name zip_longest (which is the name in python3), so as to avoid one 2to3 false positive. The old name is also questionably obsolete.

untitaker · 2015-05-02T13:36:41Z

It also makes sure new code behaves consistently between python2/python3 in case any developer forgets about u' (which is rather common).

I'd rather say that it tries to conceal the differences badly.

I feel like we could do this in a separate PR where we would switch Khal's internal APIs and representations to unicode strings, but personally I am not really happy with it. @geier might have different opinions though.

Background: Previous tests use the UID that python2 will auto-generate when `random` is seeded with `1`. Different versions of python return different values, and we shouldn't rely the assumption that random can be made non-random anyway. Update the tests to set the UID of generated events to a pre-defined on, so that tests can be written to match with these expected values.

@untitaker

Thanks @untitaker for pointing out the issue here.

WhyNotHugo · 2015-05-02T13:40:41Z

Rebased and squashed a few commits. Thanks a lot for all the feedback, @untitaker!

python3 compatibility

untitaker · 2015-05-02T13:48:43Z

Awesome, thanks @hobarrera!

geier · 2015-05-04T19:49:15Z

Thanks hobarrera!

I've added you as a contributor, but I'd appreciate it, if you would keep making pull requests for anything bigger.

WhyNotHugo · 2015-05-04T20:19:44Z

Thanks for that. :)
Yes, I'll still make PRs so someone else reviews the changes before merging.

Cheers!

WhyNotHugo changed the title ~~[WIP] python3 compatibility~~ python3 compatibility May 2, 2015

untitaker reviewed May 2, 2015
View reviewed changes

untitaker added the inwork label May 2, 2015

untitaker reviewed May 2, 2015
View reviewed changes

Hugo Osvaldo Barrera added 4 commits May 2, 2015 10:35

Use unicode_literals.

4b2e1c4

Using unicode_literals from __future__ is a lot cleaner that using the "u" prefix. There's also less change of accidental breakage.

Suppress false 2to3 positive.

e702091

Use the name zip_longest (which is the name in python3), so as to avoid one 2to3 false positive. The old name is also questionably obsolete.

Use compat functions to convert bytes/unicode.

0f99658

Enable python3 tests.

bfd43ad

Hugo Osvaldo Barrera added 5 commits May 2, 2015 10:37

Use click's styling functions rather than our own.

f347562

Fix bytestrings being badly rendered on the UI.

2163464

Avoid casting output to bytestring.

d65e787

Thanks @untitaker for pointing out the issue here.

Add an XXX task to later fix a string generation.

7df2f83

This was referenced May 2, 2015

Review usage of unicode_literals #189

Closed

Review types of strings used internally #190

Closed

untitaker added a commit that referenced this pull request May 2, 2015

Merge pull request #187 from hobarrera/python3

99a1f7b

python3 compatibility

untitaker merged commit 99a1f7b into pimutils:master May 2, 2015

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

python3 compatibility #187

python3 compatibility #187

WhyNotHugo commented May 2, 2015

WhyNotHugo commented May 2, 2015

WhyNotHugo commented May 2, 2015

untitaker May 2, 2015

WhyNotHugo May 2, 2015

untitaker May 2, 2015

WhyNotHugo May 2, 2015

untitaker May 2, 2015

untitaker May 2, 2015

WhyNotHugo May 2, 2015

untitaker May 2, 2015

untitaker May 2, 2015

untitaker May 2, 2015

WhyNotHugo commented May 2, 2015

untitaker commented May 2, 2015

WhyNotHugo commented May 2, 2015

untitaker commented May 2, 2015

WhyNotHugo commented May 2, 2015

untitaker commented May 2, 2015

geier commented May 4, 2015

WhyNotHugo commented May 4, 2015

python3 compatibility #187

python3 compatibility #187

Conversation

WhyNotHugo commented May 2, 2015

WhyNotHugo commented May 2, 2015

WhyNotHugo commented May 2, 2015

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

WhyNotHugo commented May 2, 2015

untitaker commented May 2, 2015

WhyNotHugo commented May 2, 2015

untitaker commented May 2, 2015

WhyNotHugo commented May 2, 2015

untitaker commented May 2, 2015

geier commented May 4, 2015

WhyNotHugo commented May 4, 2015