Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sort dictionaries for doctests independent of ipythons pretty print #29042

Closed
antonio-rojas opened this issue Jan 18, 2020 · 76 comments
Closed

Comments

@antonio-rojas
Copy link
Contributor

This is a prerequisite for #28197 - ipython does no longer honor the DICT_IS_ORDERED variable since 7.10 [1].

[1] ipython/ipython@dffa2c3

For doctesting we overwrite the default pretty print function for dictionaries to sort the keys and behave exactly as before.

CC: @jhpalmieri @dimpase @tscrim @antonio-rojas

Component: refactoring

Author: Jonathan Kliem, Antonio Rojas

Branch/Commit: 68b10e1

Reviewer: François Bissey

Issue created by migration from https://trac.sagemath.org/ticket/29042

@antonio-rojas antonio-rojas added this to the sage-9.1 milestone Jan 18, 2020
@antonio-rojas
Copy link
Contributor Author

Branch: u/arojas/remove_sorting_of_dicts

@antonio-rojas
Copy link
Contributor Author

Author: Antonio Rojas

@antonio-rojas

This comment has been minimized.

@antonio-rojas
Copy link
Contributor Author

New commits:

264d83aRemove dict sorting and update doctests

@antonio-rojas
Copy link
Contributor Author

Commit: 264d83a

@nbruin
Copy link
Contributor

nbruin commented Jan 19, 2020

comment:3

This is going to be quite fragile, because now you're encoding a rather arbitrary dictionary order into the doctests, where no order matters (in py3 it's probably insertion order; a change in code somewhere that changes insertion order would break tests again!).

Why not introduce a little utility function print_dict_sorted that recovers the sorted dictionaries? It would amount to

sorted( D.items(), key=lambda t:t[0])

but with an appropriate routine you could print the result in a more dict-compatible way so that you can fix each doctest with a single-line edit.

That way you end up with a more future-proof fix, and it makes total sense to insert such a print routine at least into the doctest namespace (but then almost by necessity also in the interactive namespace)

@antonio-rojas
Copy link
Contributor Author

comment:4

Sounds reasonable but that's beyond my ability, so someone else will have to do it.

@mwageringel
Copy link

comment:5

The idiom sorted(D.items()) is already used in several places in the doctests. Alternatively,

sage: from pprint import pprint
sage: pprint(D)

could be used to print a dict sorted by keys, which also seems to take care of nested dictionaries.

On the other hand, if one really wants to keep the sorted output of dicts in general, another solution would be to add a displayhook during doctests that imitates what is currently done by IPython. This would be more robust than using pprint everywhere, as it is likely that new doctests depending on insertion order would be introduced in the long run otherwise.

@nbruin
Copy link
Contributor

nbruin commented Jan 26, 2020

comment:6

Replying to @mwageringel:

The idiom sorted(D.items()) is already used in several places in the doctests. Alternatively,

sage: from pprint import pprint
sage: pprint(D)

could be used to print a dict sorted by keys, which also seems to take care of nested dictionaries

Thanks for the suggestion! The routine I was thinking about already exists!. I think in "examples" doctests, the pprint is a little more desirable, because its output looks like a dictionary: reading the example with the printed output barely has any additional cognitive load. For that purpose, it might be worth considering to inject pprint in the global namespace by default. I think it serves a real purpose now that IPython dict printing is no longer sorted.

On the other hand, if one really wants to keep the sorted output of dicts in general, another solution would be to add a displayhook during doctests that imitates what is currently done by IPython. This would be more robust than using pprint everywhere, as it is likely that new doctests depending on insertion order would be introduced in the long run otherwise.

True, but that would cause a deviation between doctest output and the output that the actual example in the command line would give. The IPython reason for dispensing with sorting dictionaries is a good one, so I would not be in favour of bringing back that hook in general.

I also think that there are legitimate reasons to make doctests that do print the dict in the "Py3" order, for instance if used to compare with the iteration order. In those cases it would be possible to formulate the doctests differently, no doubt, but I would rather not preempt such use. I think putting up with having to use pprint is reasonable, with the price that it's a magnet for fragile doctests (but in reality, doctests for lots of more complicated routines are hard to make non-fragile. By comparison I think the dict issue is relatively mild.

@mwageringel
Copy link

comment:7

Replying to @nbruin:

For that purpose, it might be worth considering to inject pprint in the global namespace by default. I think it serves a real purpose now that IPython dict printing is no longer sorted.

In that case, rather than injecting pprint, it might be better to improve the pretty print function that already exists in the global namespace, for example by adding a new keyword pretty_print(D, sort=True). IMO, this makes it clearer why the pretty printing is used. This is synonymous to show(D, sort=True).

True, but that would cause a deviation between doctest output and the output that the actual example in the command line would give. The IPython reason for dispensing with sorting dictionaries is a good one, so I would not be in favour of bringing back that hook in general.

This deviation already exists, as the IPython sorting of dicts is only active during doctests. That said though, I do not have a strong preference for either solution – though I agree it is generally desirable to keep doctest output as close as possible to the actual output.

@nbruin
Copy link
Contributor

nbruin commented Jan 27, 2020

comment:8

Replying to @mwageringel:

In that case, rather than injecting pprint, it might be better to improve the pretty print function that already exists in the global namespace, for example by adding a new keyword pretty_print(D, sort=True). IMO, this makes it clearer why the pretty printing is used. This is synonymous to show(D, sort=True).

It seems pretty_print aims at a different target: its default is latex output in the command line (that's a bad default), and in general "this function chooses the highest-quality output supported by the user interface", so that doesn't quite hit the goals of doctests (which should be concise and human-readable).

If we do

dm = get_display_manager()
dm.preferences.text='plain'

we end up with better results for our doctesting purposes. But that would be worse than from pprint import pprint.

@mwageringel
Copy link

comment:9

Replying to @nbruin:

It seems pretty_print aims at a different target: its default is latex output in the command line (that's a bad default), and in general "this function chooses the highest-quality output supported by the user interface", so that doesn't quite hit the goals of doctests (which should be concise and human-readable).

I fail to see the rationale behind defaulting to latex, or the intended use case of pretty_print. Making a distinction between %display plain and %display None is bizarre and seems wrong to me. The usual REPL output at least does not make such a distinction (implemented in sage.repl.rich_output.display_manager.DisplayManager._preferred_text_formatter).

Should we add pprint to the global namespace then? I am not sure this is the best thing to do, but it seems practical. Or just add explicit from pprint import pprint statements in the doctests?

@mantepse
Copy link
Collaborator

comment:10

It seems to me that it would be good to discuss this at sage-devel, because it affects all developers.

Let me add that it is unclear to me what the best solution is. Initially, I liked the idea of pretty_print(D, sort=True). I think that yet another pretty print function is a bad idea, I find the existence of show, pretty_print, ascii_art, .pp() and possibly others confusing enough.

@sagetrac-git
Copy link
Mannequin

sagetrac-git mannequin commented Feb 3, 2020

Changed commit from 264d83a to adcc477

@sagetrac-git
Copy link
Mannequin

sagetrac-git mannequin commented Feb 3, 2020

Branch pushed to git repo; I updated commit sha1. New commits:

f75b41fMerge branch 'develop' of git://git.sagemath.org/sage into t/29042/remove_sorting_of_dicts
adcc477Fix two new tests in ring_extension.pyx

@sagetrac-git
Copy link
Mannequin

sagetrac-git mannequin commented Mar 2, 2020

Changed commit from adcc477 to 1f82c67

@sagetrac-git
Copy link
Mannequin

sagetrac-git mannequin commented Mar 2, 2020

Branch pushed to git repo; I updated commit sha1. New commits:

81c03c7Merge branch 'develop' of git://git.sagemath.org/sage into t/29042/remove_sorting_of_dicts
1f82c67Sync with develop

@sagetrac-git
Copy link
Mannequin

sagetrac-git mannequin commented Mar 8, 2020

Branch pushed to git repo; I updated commit sha1. New commits:

5c36c87Merge 9.1.beta7

@sagetrac-git
Copy link
Mannequin

sagetrac-git mannequin commented Mar 8, 2020

Changed commit from 1f82c67 to 5c36c87

@sagetrac-git
Copy link
Mannequin

sagetrac-git mannequin commented Mar 11, 2020

Changed commit from 5c36c87 to 71427f1

@sagetrac-git
Copy link
Mannequin

sagetrac-git mannequin commented Mar 11, 2020

Branch pushed to git repo; I updated commit sha1. New commits:

71427f1Update for 9.1.beta7

@kiwifb
Copy link
Member

kiwifb commented Mar 29, 2020

comment:15

This needs rebasing. My own thinking is that calling the structure dictionary lead humans to have an expectation for it to be ordered the way the physical object is. In fact we have look up tables which are very handy for the computer to manipulate but not necessarily meant for full display. In most case sorting doesn't matter objectively.

So I am not sure we should compare whole dictionaries for doctests. Size of the dictionary and some key elements should be enough.

@sagetrac-git
Copy link
Mannequin

sagetrac-git mannequin commented Mar 29, 2020

Changed commit from 71427f1 to a86697a

@sagetrac-git
Copy link
Mannequin

sagetrac-git mannequin commented Mar 29, 2020

Branch pushed to git repo; I updated commit sha1. New commits:

55e6360Merge branch 'develop' of git://git.sagemath.org/sage into t/29042/remove_sorting_of_dicts
a86697aMerge branch 'develop' of git://git.sagemath.org/sage into t/29042/remove_sorting_of_dicts

@sagetrac-git
Copy link
Mannequin

sagetrac-git mannequin commented Mar 29, 2020

Changed commit from a86697a to 7ba1e5f

@sagetrac-git
Copy link
Mannequin

sagetrac-git mannequin commented Mar 29, 2020

Branch pushed to git repo; I updated commit sha1. New commits:

7ba1e5fFix new tests in 9.1.beta9

@dimpase
Copy link
Member

dimpase commented Jun 1, 2020

comment:45

I don't quite understand what this ticket is about:
This will break the affected doctests (but not functionality) on python2

9.2 has nothing to do with python2. We can break it, why not?

@antonio-rojas
Copy link
Contributor Author

comment:46

Replying to @dimpase:

I don't quite understand what this ticket is about:
This will break the affected doctests (but not functionality) on python2

9.2 has nothing to do with python2. We can break it, why not?

This was opened in January

@dimpase
Copy link
Member

dimpase commented Jun 1, 2020

comment:47

is anything needed here for py3 ?

@antonio-rojas
Copy link
Contributor Author

comment:48

Yes - this is meant to fix the breakage that will happen in tests when upgrading ipython. "This will break the affected doctests (but not functionality) on python2" refers to the proposed patch at the time.

@dimpase
Copy link
Member

dimpase commented Jun 1, 2020

comment:49

Could someone knowledgeable about the ticket please amend its description?

@antonio-rojas

This comment has been minimized.

@kliem
Copy link
Contributor

kliem commented Jun 5, 2020

comment:51

Maybe removing the sorting of dictionaries isn't necessary. We can just overwrite the pretty print function during testing:

+++ b/src/sage/doctest/forker.py
@@ -61,6 +61,7 @@ from .parsing import SageOutputChecker, pre_hash, get_source
 from sage.repl.user_globals import set_globals
 from sage.cpython.atexit import restore_atexit
 from sage.cpython.string import bytes_to_str, str_to_bytes
+import IPython.lib.pretty
 
 
 # All doctests run as if the following future imports are present
@@ -86,6 +87,27 @@ _OSError_SUBCLASSES = [
        exc is not OSError
 ]
 
+def _sorted_dict_pprinter_factory(start, end):
+    """
+    Factory that returns a pprint function used by the default pprint of
+    dicts and dict proxies.
+    """
+    def inner(obj, p, cycle):
+        if cycle:
+            return p.text('{...}')
+        step = len(start)
+        p.begin_group(step, start)
+        keys = obj.keys()
+        keys = IPython.lib.pretty._sorted_for_pprint(keys)
+        for idx, key in p._enumerate(keys):
+            if idx:
+                p.text(',')
+                p.breakable()
+            p.pretty(key)
+            p.text(': ')
+            p.pretty(obj[key])
+        p.end_group(step, end)
+    return inner
 
 
 def init_sage():
@@ -185,11 +207,11 @@ def init_sage():
     # IPython's pretty printer sorts the repr of dicts by their keys by default
     # (or their keys' str() if they are not otherwise orderable).  However, it
     # disables this for CPython 3.6+ opting to instead display dicts' "natural"
-    # insertion order, which is preserved in those versions).  This makes for
-    # inconsistent results with Python 2 tests that return dicts, so here we
-    # force the Python 2 style dict printing
-    import IPython.lib.pretty
-    IPython.lib.pretty.DICT_IS_ORDERED = False
+    # insertion order, which is preserved in those versions).
+    # However, this order is random in some instances.
+    # Also code changes may change the order, but the result stays correct.
+    # So we force sorting the dictionary.
+    IPython.lib.pretty.for_type(dict, _sorted_dict_pprinter_factory('{', '}'))
 
     # Switch on extra debugging
     from sage.structure.debug_options import debug

This appears to do the job. I'm testing the entire library now.

@kliem
Copy link
Contributor

kliem commented Jun 5, 2020

Changed commit from 2594a80 to 68b10e1

@kliem

This comment has been minimized.

@kliem
Copy link
Contributor

kliem commented Jun 5, 2020

comment:52

sage -t --long --all passes for me with this branch.


New commits:

68b10e1keep dictionaries ordered for doctesting

@kliem
Copy link
Contributor

kliem commented Jun 5, 2020

Changed author from Antonio Rojas to Jonathan Kliem, Antonio Rojas

@kliem
Copy link
Contributor

kliem commented Jun 5, 2020

Changed branch from u/fbissey/dictionaries to public/29042

@kliem kliem changed the title Remove sorting of dicts Sort dictionaries for doctests independent of ipythons pretty print Jun 5, 2020
@kiwifb
Copy link
Member

kiwifb commented Jun 5, 2020

comment:53

I can do some testing tomorrow. It looks so much simpler.

@kliem
Copy link
Contributor

kliem commented Jun 5, 2020

comment:54

I was suprised how easy this actually is. Took me a number of hours to figure out though.

@dimpase
Copy link
Member

dimpase commented Jun 5, 2020

comment:55

Does this need new iPython?

@kliem
Copy link
Contributor

kliem commented Jun 5, 2020

comment:56

No, I didn't upgrade and it worked for me.

The function _sorted_for_pprint that it uses requires iPython at least 5:

https://github.com/ipython/ipython/blob/5.x/IPython/lib/pretty.py

If that is a problem or if this is ever removed, we can just copy paste this tiny function.

The function for_type has been in their since IPython version 1. So I hope that it is somewhat stable.

@kiwifb
Copy link
Member

kiwifb commented Jun 6, 2020

comment:57

Works for me. That's one of the things I would like to see merged soon.

@kiwifb
Copy link
Member

kiwifb commented Jun 6, 2020

Reviewer: François Bissey

@kliem
Copy link
Contributor

kliem commented Jun 6, 2020

comment:58

Thank you.

@dimpase
Copy link
Member

dimpase commented Jun 6, 2020

comment:59

hopefully it simplifies the thankless task of allowing multiple versions of pari/gp, etc.

@vbraun
Copy link
Member

vbraun commented Jun 21, 2020

Changed branch from public/29042 to 68b10e1

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests