Sort dictionaries for doctests independent of ipythons pretty print #29042

antonio-rojas · 2020-01-18T23:11:28Z

This is a prerequisite for #28197 - ipython does no longer honor the DICT_IS_ORDERED variable since 7.10 [1].

[1] ipython/ipython@dffa2c3

For doctesting we overwrite the default pretty print function for dictionaries to sort the keys and behave exactly as before.

CC: @jhpalmieri @dimpase @tscrim @antonio-rojas

Component: refactoring

Author: Jonathan Kliem, Antonio Rojas

Branch/Commit: 68b10e1

Reviewer: François Bissey

Issue created by migration from https://trac.sagemath.org/ticket/29042

The text was updated successfully, but these errors were encountered:

antonio-rojas · 2020-01-18T23:12:26Z

Branch: u/arojas/remove_sorting_of_dicts

antonio-rojas · 2020-01-18T23:15:50Z

Author: Antonio Rojas

antonio-rojas · 2020-01-18T23:15:50Z

New commits:

`264d83a`	`Remove dict sorting and update doctests`

antonio-rojas · 2020-01-18T23:15:50Z

Commit: 264d83a

nbruin · 2020-01-19T00:44:42Z

comment:3

This is going to be quite fragile, because now you're encoding a rather arbitrary dictionary order into the doctests, where no order matters (in py3 it's probably insertion order; a change in code somewhere that changes insertion order would break tests again!).

Why not introduce a little utility function print_dict_sorted that recovers the sorted dictionaries? It would amount to

sorted( D.items(), key=lambda t:t[0])

but with an appropriate routine you could print the result in a more dict-compatible way so that you can fix each doctest with a single-line edit.

That way you end up with a more future-proof fix, and it makes total sense to insert such a print routine at least into the doctest namespace (but then almost by necessity also in the interactive namespace)

antonio-rojas · 2020-01-25T13:09:24Z

comment:4

Sounds reasonable but that's beyond my ability, so someone else will have to do it.

mwageringel · 2020-01-25T21:02:55Z

comment:5

The idiom sorted(D.items()) is already used in several places in the doctests. Alternatively,

sage: from pprint import pprint
sage: pprint(D)

could be used to print a dict sorted by keys, which also seems to take care of nested dictionaries.

On the other hand, if one really wants to keep the sorted output of dicts in general, another solution would be to add a displayhook during doctests that imitates what is currently done by IPython. This would be more robust than using pprint everywhere, as it is likely that new doctests depending on insertion order would be introduced in the long run otherwise.

nbruin · 2020-01-26T04:15:57Z

comment:6

Replying to @mwageringel:

The idiom sorted(D.items()) is already used in several places in the doctests. Alternatively,
sage: from pprint import pprint
sage: pprint(D)
could be used to print a dict sorted by keys, which also seems to take care of nested dictionaries

Thanks for the suggestion! The routine I was thinking about already exists!. I think in "examples" doctests, the pprint is a little more desirable, because its output looks like a dictionary: reading the example with the printed output barely has any additional cognitive load. For that purpose, it might be worth considering to inject pprint in the global namespace by default. I think it serves a real purpose now that IPython dict printing is no longer sorted.

On the other hand, if one really wants to keep the sorted output of dicts in general, another solution would be to add a displayhook during doctests that imitates what is currently done by IPython. This would be more robust than using pprint everywhere, as it is likely that new doctests depending on insertion order would be introduced in the long run otherwise.

True, but that would cause a deviation between doctest output and the output that the actual example in the command line would give. The IPython reason for dispensing with sorting dictionaries is a good one, so I would not be in favour of bringing back that hook in general.

I also think that there are legitimate reasons to make doctests that do print the dict in the "Py3" order, for instance if used to compare with the iteration order. In those cases it would be possible to formulate the doctests differently, no doubt, but I would rather not preempt such use. I think putting up with having to use pprint is reasonable, with the price that it's a magnet for fragile doctests (but in reality, doctests for lots of more complicated routines are hard to make non-fragile. By comparison I think the dict issue is relatively mild.

mwageringel · 2020-01-26T17:11:36Z

comment:7

Replying to @nbruin:

For that purpose, it might be worth considering to inject pprint in the global namespace by default. I think it serves a real purpose now that IPython dict printing is no longer sorted.

In that case, rather than injecting pprint, it might be better to improve the pretty print function that already exists in the global namespace, for example by adding a new keyword pretty_print(D, sort=True). IMO, this makes it clearer why the pretty printing is used. This is synonymous to show(D, sort=True).

True, but that would cause a deviation between doctest output and the output that the actual example in the command line would give. The IPython reason for dispensing with sorting dictionaries is a good one, so I would not be in favour of bringing back that hook in general.

This deviation already exists, as the IPython sorting of dicts is only active during doctests. That said though, I do not have a strong preference for either solution – though I agree it is generally desirable to keep doctest output as close as possible to the actual output.

nbruin · 2020-01-27T02:43:13Z

comment:8

Replying to @mwageringel:

In that case, rather than injecting pprint, it might be better to improve the pretty print function that already exists in the global namespace, for example by adding a new keyword pretty_print(D, sort=True). IMO, this makes it clearer why the pretty printing is used. This is synonymous to show(D, sort=True).

It seems pretty_print aims at a different target: its default is latex output in the command line (that's a bad default), and in general "this function chooses the highest-quality output supported by the user interface", so that doesn't quite hit the goals of doctests (which should be concise and human-readable).

If we do

dm = get_display_manager()
dm.preferences.text='plain'

we end up with better results for our doctesting purposes. But that would be worse than from pprint import pprint.

mwageringel · 2020-01-27T18:43:28Z

comment:9

Replying to @nbruin:

It seems pretty_print aims at a different target: its default is latex output in the command line (that's a bad default), and in general "this function chooses the highest-quality output supported by the user interface", so that doesn't quite hit the goals of doctests (which should be concise and human-readable).

I fail to see the rationale behind defaulting to latex, or the intended use case of pretty_print. Making a distinction between %display plain and %display None is bizarre and seems wrong to me. The usual REPL output at least does not make such a distinction (implemented in sage.repl.rich_output.display_manager.DisplayManager._preferred_text_formatter).

Should we add pprint to the global namespace then? I am not sure this is the best thing to do, but it seems practical. Or just add explicit from pprint import pprint statements in the doctests?

mantepse · 2020-01-27T21:32:25Z

comment:10

It seems to me that it would be good to discuss this at sage-devel, because it affects all developers.

Let me add that it is unclear to me what the best solution is. Initially, I liked the idea of pretty_print(D, sort=True). I think that yet another pretty print function is a bad idea, I find the existence of show, pretty_print, ascii_art, .pp() and possibly others confusing enough.

sagetrac-git · 2020-02-03T07:49:46Z

Changed commit from 264d83a to adcc477

sagetrac-git · 2020-02-03T07:49:46Z

Branch pushed to git repo; I updated commit sha1. New commits:

`f75b41f`	`Merge branch 'develop' of git://git.sagemath.org/sage into t/29042/remove_sorting_of_dicts`
`adcc477`	`Fix two new tests in ring_extension.pyx`

sagetrac-git · 2020-03-02T21:05:01Z

Changed commit from adcc477 to 1f82c67

sagetrac-git · 2020-03-02T21:05:01Z

Branch pushed to git repo; I updated commit sha1. New commits:

`81c03c7`	`Merge branch 'develop' of git://git.sagemath.org/sage into t/29042/remove_sorting_of_dicts`
`1f82c67`	`Sync with develop`

sagetrac-git · 2020-03-08T18:09:11Z

Branch pushed to git repo; I updated commit sha1. New commits:

`5c36c87`	`Merge 9.1.beta7`

sagetrac-git · 2020-03-08T18:09:11Z

Changed commit from 1f82c67 to 5c36c87

sagetrac-git · 2020-03-11T21:29:10Z

Changed commit from 5c36c87 to 71427f1

sagetrac-git · 2020-03-11T21:29:10Z

Branch pushed to git repo; I updated commit sha1. New commits:

`71427f1`	`Update for 9.1.beta7`

kiwifb · 2020-03-29T01:55:36Z

comment:15

This needs rebasing. My own thinking is that calling the structure dictionary lead humans to have an expectation for it to be ordered the way the physical object is. In fact we have look up tables which are very handy for the computer to manipulate but not necessarily meant for full display. In most case sorting doesn't matter objectively.

So I am not sure we should compare whole dictionaries for doctests. Size of the dictionary and some key elements should be enough.

sagetrac-git · 2020-03-29T08:26:32Z

Changed commit from 71427f1 to a86697a

sagetrac-git · 2020-03-29T08:26:32Z

Branch pushed to git repo; I updated commit sha1. New commits:

`55e6360`	`Merge branch 'develop' of git://git.sagemath.org/sage into t/29042/remove_sorting_of_dicts`
`a86697a`	`Merge branch 'develop' of git://git.sagemath.org/sage into t/29042/remove_sorting_of_dicts`

sagetrac-git · 2020-03-29T14:17:09Z

Changed commit from a86697a to 7ba1e5f

sagetrac-git · 2020-03-29T14:17:09Z

Branch pushed to git repo; I updated commit sha1. New commits:

`7ba1e5f`	`Fix new tests in 9.1.beta9`

dimpase · 2020-06-01T11:28:06Z

comment:45

I don't quite understand what this ticket is about:
This will break the affected doctests (but not functionality) on python2

9.2 has nothing to do with python2. We can break it, why not?

antonio-rojas · 2020-06-01T11:31:19Z

comment:46

Replying to @dimpase:

I don't quite understand what this ticket is about:
This will break the affected doctests (but not functionality) on python2

9.2 has nothing to do with python2. We can break it, why not?

This was opened in January

dimpase · 2020-06-01T11:50:09Z

comment:47

is anything needed here for py3 ?

antonio-rojas · 2020-06-01T11:52:07Z

comment:48

Yes - this is meant to fix the breakage that will happen in tests when upgrading ipython. "This will break the affected doctests (but not functionality) on python2" refers to the proposed patch at the time.

dimpase · 2020-06-01T11:56:40Z

comment:49

Could someone knowledgeable about the ticket please amend its description?

kliem · 2020-06-05T09:03:29Z

comment:51

Maybe removing the sorting of dictionaries isn't necessary. We can just overwrite the pretty print function during testing:

+++ b/src/sage/doctest/forker.py
@@ -61,6 +61,7 @@ from .parsing import SageOutputChecker, pre_hash, get_source
 from sage.repl.user_globals import set_globals
 from sage.cpython.atexit import restore_atexit
 from sage.cpython.string import bytes_to_str, str_to_bytes
+import IPython.lib.pretty
 
 
 # All doctests run as if the following future imports are present
@@ -86,6 +87,27 @@ _OSError_SUBCLASSES = [
        exc is not OSError
 ]
 
+def _sorted_dict_pprinter_factory(start, end):
+    """
+    Factory that returns a pprint function used by the default pprint of
+    dicts and dict proxies.
+    """
+    def inner(obj, p, cycle):
+        if cycle:
+            return p.text('{...}')
+        step = len(start)
+        p.begin_group(step, start)
+        keys = obj.keys()
+        keys = IPython.lib.pretty._sorted_for_pprint(keys)
+        for idx, key in p._enumerate(keys):
+            if idx:
+                p.text(',')
+                p.breakable()
+            p.pretty(key)
+            p.text(': ')
+            p.pretty(obj[key])
+        p.end_group(step, end)
+    return inner
 
 
 def init_sage():
@@ -185,11 +207,11 @@ def init_sage():
     # IPython's pretty printer sorts the repr of dicts by their keys by default
     # (or their keys' str() if they are not otherwise orderable).  However, it
     # disables this for CPython 3.6+ opting to instead display dicts' "natural"
-    # insertion order, which is preserved in those versions).  This makes for
-    # inconsistent results with Python 2 tests that return dicts, so here we
-    # force the Python 2 style dict printing
-    import IPython.lib.pretty
-    IPython.lib.pretty.DICT_IS_ORDERED = False
+    # insertion order, which is preserved in those versions).
+    # However, this order is random in some instances.
+    # Also code changes may change the order, but the result stays correct.
+    # So we force sorting the dictionary.
+    IPython.lib.pretty.for_type(dict, _sorted_dict_pprinter_factory('{', '}'))
 
     # Switch on extra debugging
     from sage.structure.debug_options import debug

This appears to do the job. I'm testing the entire library now.

kliem · 2020-06-05T10:48:53Z

Changed commit from 2594a80 to 68b10e1

kliem · 2020-06-05T10:48:53Z

comment:52

sage -t --long --all passes for me with this branch.

New commits:

`68b10e1`	`keep dictionaries ordered for doctesting`

kliem · 2020-06-05T10:48:53Z

Changed author from Antonio Rojas to Jonathan Kliem, Antonio Rojas

kliem · 2020-06-05T10:48:53Z

Changed branch from u/fbissey/dictionaries to public/29042

kiwifb · 2020-06-05T11:05:27Z

comment:53

I can do some testing tomorrow. It looks so much simpler.

kliem · 2020-06-05T12:38:47Z

comment:54

I was suprised how easy this actually is. Took me a number of hours to figure out though.

dimpase · 2020-06-05T12:45:12Z

comment:55

Does this need new iPython?

kliem · 2020-06-05T12:54:13Z

comment:56

No, I didn't upgrade and it worked for me.

The function _sorted_for_pprint that it uses requires iPython at least 5:

https://github.com/ipython/ipython/blob/5.x/IPython/lib/pretty.py

If that is a problem or if this is ever removed, we can just copy paste this tiny function.

The function for_type has been in their since IPython version 1. So I hope that it is somewhat stable.

kiwifb · 2020-06-06T03:27:47Z

comment:57

Works for me. That's one of the things I would like to see merged soon.

kiwifb · 2020-06-06T03:27:47Z

Reviewer: François Bissey

kliem · 2020-06-06T07:42:00Z

comment:58

Thank you.

dimpase · 2020-06-06T09:30:04Z

comment:59

hopefully it simplifies the thankless task of allowing multiple versions of pari/gp, etc.

vbraun · 2020-06-21T22:37:31Z

Changed branch from public/29042 to 68b10e1

antonio-rojas added this to the sage-9.1 milestone Jan 18, 2020

antonio-rojas added the p: major / 3 label Jan 18, 2020

This comment has been minimized.

Sign in to view

antonio-rojas added t: enhancement labels Jan 18, 2020

This comment has been minimized.

Sign in to view

kliem changed the title ~~Remove sorting of dicts~~ Sort dictionaries for doctests independent of ipythons pretty print Jun 5, 2020

kliem added the s: needs review label Jun 5, 2020

kiwifb added s: positive review and removed s: needs review labels Jun 6, 2020

dimpase added p: critical / 2 and removed p: major / 3 labels Jun 6, 2020

vbraun removed the s: positive review label Jun 21, 2020

vbraun closed this as completed in 119b6d8 Jun 21, 2020

slel mentioned this issue Sep 11, 2020

Upgrade: Python 3.8.5 #27754

Closed

Sort dictionaries for doctests independent of ipythons pretty print #29042

Sort dictionaries for doctests independent of ipythons pretty print #29042

Comments

antonio-rojas commented Jan 18, 2020

antonio-rojas commented Jan 18, 2020

antonio-rojas commented Jan 18, 2020

This comment has been minimized.

antonio-rojas commented Jan 18, 2020

antonio-rojas commented Jan 18, 2020

nbruin commented Jan 19, 2020

antonio-rojas commented Jan 25, 2020

mwageringel commented Jan 25, 2020

nbruin commented Jan 26, 2020

mwageringel commented Jan 26, 2020

nbruin commented Jan 27, 2020

mwageringel commented Jan 27, 2020

mantepse commented Jan 27, 2020

sagetrac-git mannequin commented Feb 3, 2020

sagetrac-git mannequin commented Feb 3, 2020

sagetrac-git mannequin commented Mar 2, 2020

sagetrac-git mannequin commented Mar 2, 2020

sagetrac-git mannequin commented Mar 8, 2020

sagetrac-git mannequin commented Mar 8, 2020

sagetrac-git mannequin commented Mar 11, 2020

sagetrac-git mannequin commented Mar 11, 2020

kiwifb commented Mar 29, 2020

sagetrac-git mannequin commented Mar 29, 2020

sagetrac-git mannequin commented Mar 29, 2020

sagetrac-git mannequin commented Mar 29, 2020

sagetrac-git mannequin commented Mar 29, 2020

dimpase commented Jun 1, 2020

antonio-rojas commented Jun 1, 2020

dimpase commented Jun 1, 2020

antonio-rojas commented Jun 1, 2020

dimpase commented Jun 1, 2020

This comment has been minimized.

kliem commented Jun 5, 2020

kliem commented Jun 5, 2020

This comment has been minimized.

kliem commented Jun 5, 2020

kliem commented Jun 5, 2020

kliem commented Jun 5, 2020

kiwifb commented Jun 5, 2020

kliem commented Jun 5, 2020

dimpase commented Jun 5, 2020

kliem commented Jun 5, 2020

kiwifb commented Jun 6, 2020

kiwifb commented Jun 6, 2020

kliem commented Jun 6, 2020

dimpase commented Jun 6, 2020

vbraun commented Jun 21, 2020