WIP: html repr #1820
The sizing of variable name and dimensions columns according to their content is tricky because (1) we want these columns to be aligned between different sections ( While using But I don't see any robust way to calculate these sizes either. One option could be to use
>>> import tkinter as tk
>>> from tkinter import font
>>> root = tk.Tk()
>>> front_end_font = font.Font(family='Helvetica', size=11, weight='bold')
>>> front_end_font.measure("variable_name")
76
>>> root.destroy()
That's not very elegant, to say the least, but it has the advantage of being part of the Python standard library. The problem is that we don't know the font-family and font-size. We could define them explicitly in the CSS code, but it's better to inherit them from the notebook front-ends (in some cases they are defined dynamically, e.g., in the jupyterlab presentation mode). So a workaround might be to calculate the width using a common font that has wide characters and add a good safety margin. If anyone has a better idea, e.g., a layout using some kind of smart CSS grid system... that would be great! |
I'm not sure I follow here. It's been a while since I wrote much html, but I would think you could achieve this using a table with
I think we should probably avoid adding a tkinter dependency. I would rather assume a fixed column-width for the first column. |
Something like this with a table? https://codepen.io/rgbkrk/pen/XVYpEE You'll have to embed some JS to do it though (I'm using jquery here, you could write it with I still haven't gotten a chance to use CSS grid, been hoping for a good moment. |
Yes we could, but I was indeed thinking more about the expandable/hide-able part. With pure html/css the hidden/shown container must be child or sibling of its controller, and I don't know how to achieve that with our current layout design using a table.
Even considering that tkinter is already shipped with CPython as part of the standard library? My concern with an arbitrarily fixed column-width is that it should be wide enough to cover a reasonable range of use cases, but when the variable names are really short (this occurs often in examples, e.g., 'foo', 'x', 'y'...) it won't look very nice (I haven't tested it yet, though). I guess we can also calculate the width by hand, considering the worst-case scenario in order to have a good margin...
That solution (JS included) would be nice if we can support all notebook front-ends without any extra installation or configuration step. |
Is this something CSS grid would solve? Or is it not clear yet?
Yes, but that doesn't mean it's actually bundled into every Python install. For example, it requires a separate package on Ubuntu: https://stackoverflow.com/questions/34890383/python3-tkinter-ubuntu-trusty-does-not-work-under-virtual-environment My bigger concern is that it feels hacky and might be slow. |
Agreed! Moreover, I have a "Python" icon appearing in the macOS Dock, which I think is caused by initializing tk. That's bad! I played a bit and it seems feasible to estimate an approximate relationship between the text width and the number of characters (see https://gist.github.com/benbovy/fce796c663728b1bdbb3f1514daa458c -- it's a very naive approach, though).
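For illustration only, the character-count idea could look roughly like the sketch below. This is not the code from the gist; the function name `estimate_column_width` and the per-character width, margin and cap values are assumptions picked for the example, not measured values.

```python
def estimate_column_width(names, char_width_px=8.0, margin_px=16.0, max_width_px=240.0):
    """Return an approximate column width in pixels for a list of names.

    All numeric defaults are illustrative guesses: a real front-end font
    may need a different per-character width and safety margin.
    """
    # Width is driven by the longest name; an empty list only gets the margin.
    longest = max((len(name) for name in names), default=0)
    width = longest * char_width_px + margin_px
    # Cap the width so that one very long name cannot blow up the layout.
    return min(width, max_width_px)
```

With short names such as 'x', 'y' and 'foo' the column stays narrow, while the cap keeps very long names from pushing the other columns out of view.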
I don't know much about it, but it seems very powerful. That would be the cleanest solution. I'll take a look. |
I re-implemented the Dataset repr using CSS grid (https://jsfiddle.net/Lmqq7yzz/9/), which I think is much cleaner for column widths that fit the content. However, one big limitation is that it's currently compatible only in Firefox! Because we want the columns in different sections aligned, I had to define a single grid at the top level and then use Two other, smaller issues:
Note: in the link above, I changed the design a bit. Variable attributes and data repr can now be shown/hidden using clickable icons on the right (tooltips are still needed). This is better from a UX point of view, IMO. EDIT: tooltips would also be very useful to show full variable names and/or lists of dimensions when these are truncated. |
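As a rough sketch of the single top-level grid idea (not the code from the jsfiddle), the Python below emits one grid container whose direct children are all the cells of all sections, so that columns stay aligned across sections. The function name `grid_repr`, the input layout and the inline styles are assumptions made for this example.

```python
from html import escape

# One grid for everything: the first two columns size to their content.
GRID_STYLE = (
    "display: grid; "
    "grid-template-columns: max-content max-content auto; "
    "column-gap: 1em;"
)


def grid_repr(sections):
    """Build a single CSS grid holding every section.

    sections: mapping of section title -> list of (name, dims, summary) rows.
    """
    cells = []
    for title, rows in sections.items():
        # Section headers span all columns of the shared grid.
        cells.append(
            f'<div style="grid-column: 1 / -1; font-weight: bold;">{escape(title)}</div>'
        )
        for name, dims, summary in rows:
            cells.append(f"<div>{escape(name)}</div>")
            cells.append(f"<div>({escape(', '.join(dims))})</div>")
            cells.append(f"<div>{escape(summary)}</div>")
    return f'<div style="{GRID_STYLE}">' + "".join(cells) + "</div>"
```

Keeping every cell a direct child of the one grid avoids nested grids, at the cost of a flat structure that is harder to collapse section by section.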
Wow, that does work really well on Firefox. |
It looks like CSS grid is coming to Chrome very soon -- the relevant bug is now listed as fixed. |
It looks like this will make it into Chrome stable by roughly mid-March 2018: https://www.chromium.org/developers/calendar If we're on Chrome and Firefox, that's probably good enough. We still might want to have an option that makes this easy to turn on/off (default value TBD). |
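A minimal sketch of what such an on/off switch could look like, assuming a module-level option. The names `OPTIONS`, `display_style` and `Demo` are hypothetical and only serve the example. As far as I know, IPython falls back to the plain-text repr when `_repr_html_` returns `None`.

```python
from html import escape

# Hypothetical global option; "text" keeps the plain repr, "html" enables the new one.
OPTIONS = {"display_style": "text"}


class Demo:
    """Stand-in for a Dataset-like object (illustrative only)."""

    def __repr__(self):
        return "<Demo>"

    def _repr_html_(self):
        # Returning None signals that no HTML repr is available.
        if OPTIONS["display_style"] != "html":
            return None
        return f"<pre>{escape(repr(self))}</pre>"
```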
I'll see whether we can get good results using fixed column widths (thus not using |
I played around a little with using I'm sure we could figure out some better CSS magic that shows the full variable name when you hover over it. |
Let's revive this excellent idea! In particular, I would be interested in using the HTML repr on its own in conjunction with #2659 (dict / json serialization of dataset schema). If we could develop a standalone html repr function that interprets the output of |
We could also borrow ideas from https://github.com/agoose77/numpy-html or xtensor-stack/xframe@90638ec for displaying the data of each variable here. |
is there an example of what the xframe output HTML looks like? |
You can see it by running the xframe example notebook with binder. It actually looks very much like pandas dataframe (with "multi-index" rows for ndims > 2), with some hover effects showing the coordinates names/values at data elements. The output of xtensor objects is slightly different but interesting too, with nested tables (xtensor's binder). I haven't checked if |
Hi @benbovy - how can we convince you to work more on this amazing idea? What help / support do you need from other xarray devs? |
I just came by to say that the attached sample notebook is very, very pretty, and I would love to see this line of work continue! |
I did a little more tweaking of text-overflow for truncation. This version shows the full name when you hover over it: https://jsfiddle.net/1g04ykum/ |
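For readers who want the gist of the truncation approach without opening the fiddle, here is a hedged sketch generated from Python. It is not the fiddle's CSS: it uses `text-overflow: ellipsis` for the truncation and the plain HTML `title` attribute for the hover tooltip, and the helper name `truncated_name` and the default width are assumptions.

```python
from html import escape


def truncated_name(name, max_width="10em"):
    """Return a span that truncates long names and shows the full name on hover."""
    style = (
        f"display: inline-block; max-width: {max_width}; "
        "overflow: hidden; text-overflow: ellipsis; white-space: nowrap;"
    )
    # Escape quotes as well, since the name is also used inside an attribute.
    safe = escape(name, quote=True)
    return f'<span style="{style}" title="{safe}">{safe}</span>'
```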
Nice!
I'd really like to see this finally happen soon, especially since I've already spent a good amount of time on it (a while ago, I admit). But honestly (and sadly), it's been hard for me to find free time to continue the work on this feature. I'm sorry for that. I'm also a bit worried about the things (mostly related to compatibility with notebook front-ends and themes) that we'll need to support/fix quickly once this is ready. Maybe we should make it opt-in for one or two releases. I would be extremely pleased if anyone is willing to jump in and help on the front-end part (HTML/CSS)! See the checklist at the top of this PR. Unfortunately, my limited expertise in this area makes me rather unproductive. |
yeah, I guess this is the major issue here. Who could we get in to help out? Does @pydata/xarray know anyone from the extended community with an interest in these things? |
Should we email the "announce" list and ask for help? |
@benbovy is the checklist still up-to-date? The length of it is a bit scary TBH ;-) |
Perhaps we could leverage our recently formed links between Pangeo and the Jupyter folks to help confront these front-end issues, in which we have limited expertise as a project. @ian-r-rose, a developer of jupyter server extensions, has been a very helpful resource. Maybe he could give us some advice? |
Yes it is still up-to-date :-) But this list is exhaustive and a lot of things could be saved for later! Some of the items are easy to implement but require a decision. |
I was just following the new draft dask repr, and it seems the tools are in place to be able to autogenerate a html repr of a full xarray dataset which includes an image, e.g. autogenerate something like: It seems to me @benbovy that 90% of your ToDo list is nice-to-have or special-case stuff which can be left for later? The main thing that has to be done before merging is tests? If that bare-bones version gets merged (even as a hidden feature) then others can start having a go at adding images like dask? |
Ooh that's nice! Iris and zarr html representations look nice too (I hadn't followed those developments), definitely some good ideas for the xarray html repr! I think the dask and zarr html outputs would integrate very well with the repr here, and it would be quite straightforward to encapsulate them in the drop-down html containers of each coordinate / data variable here. I also like the idea of a summary image like the one shown above, although this could be harder to achieve.
Yes, actually most of the work is done. I was mainly worried about how the html repr would look in the different notebook front-ends, but now that other projects (dask, iris, zarr) have such a repr, it looks like there is no major issue. I also struggled with grid column resizing for correctly displaying the variable names, but I think that @shoyer's suggestion https://jsfiddle.net/1g04ykum/ is good enough for now. |
I'm just starting to look at this. Was there any experiment with the HTML "details" and "summary" pairs? They are made for collapsible sections, and would likely let us get rid of (some of) the UUID logic. Here is a full example of a summary section.
|
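As a small sketch of the details/summary idea, the Python helper below builds one collapsible section without any ids or checkboxes. The function name `collapsible_section` and the list-based body are assumptions for illustration, not the layout used in this PR.

```python
from html import escape


def collapsible_section(title, items, open_by_default=True):
    """Return a <details> block whose <summary> toggles a list of items."""
    # The boolean "open" attribute makes the section expanded on first render.
    open_attr = " open" if open_by_default else ""
    rows = "".join(f"<li>{escape(item)}</li>" for item in items)
    return (
        f"<details{open_attr}>"
        f"<summary>{escape(title)}</summary>"
        f"<ul>{rows}</ul>"
        "</details>"
    )
```

Because the browser handles the toggling natively, this needs neither CSS nor JS to collapse, which may help in front-ends that strip styles.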
Sidenote: the CSS is not injected at load time when the notebook is not trusted, so the reprs may look garbled. |
Details/Summary does look like a nice way to simplify things! It's too bad that CSS isn't processed with untrusted inputs. How do Iris and Dask deal with this limitation? |
I agree it would highly simplify the HTML code, but when I tried it things were not that easy (I don't remember exactly what, I think it had to do with alignment of nested lists) and I had some weird issues with conflicts between HTML reprs in different output cells. See: jupyterlab/jupyterlab#3200 (comment) and the comment below. Probably I'm missing something obvious?
I've quickly checked the related PRs dask/dask#4794 and SciTools/iris#2918. Dask adds |
Yeah, we just use raw HTML |
I'll say that I'm looking forward to this getting in, mostly so that I can raise an issue about adding Dask's chunked array images :) |
We have done something similar using inline svg (see, e.g., https://scipp.readthedocs.io/en/latest/user-guide/data-structures.html#Dataset). It is basically a hack for testing right now, but is sufficient for auto-generated illustration in the documentation. I am pretty impressed by the html representation previewed in #1627. Since our data structures are very similar I would be happy to contribute to this output rendering somehow, since we could then also benefit from it (with a few tweaks, probably). So let me know if I can help out somehow (unfortunately I do not know much html and css, just C++ and a bit of Python). |
@SimonHeybrock very cool to see your Scipp project! I will make some comments over in your repo but I'm impressed with what you've done. I'd love to find ways to collaborate more in the future, many of the problems you're solving are also important for xarray users. |
Is there anything that I can do to help get this PR in? Are the items on the TODO list prioritized? One minor comment is that in terms of style for overflow, it might be more legible if the var_names were bolded on hover (fiddle), although that might make them look clickable. |
@jsignell feel free to pick this up, it would be great if you could finally make this happen! (Again, I'm sorry for letting this sit so long.) I'm going to edit the checklist in my 1st comment. There are indeed a lot of things that we can move to follow-up issues. |
Ok thanks! I'll get cracking :) |
The last fiddle and this PR seem fairly different. Does the fiddle have the most up-to-date hierarchy or is it just somewhere where people were playing around with ideas (in which case I should see what is improved and try to pull those bits of css)? |
git diff upstream/master **/*py | flake8 --diff
whats-new.rst for all changes and api.rst for new API
This is work in progress, although the basic functionality is there. You can see a preview here:
http://nbviewer.jupyter.org/gist/benbovy/3009f342fb283bd0288125a1f7883ef2
TODO:
Nice to have (keep this for later):
Coordinates, Data variables and Attributes sections (maybe expose them as global options).
Dataset.coords and Dataset.data_vars as well?
Other thoughts (old)
A big challenge here is to provide both robust and flexible styling (CSS):
!important). This could probably be cleaned up and optimized a bit (unfortunately my CSS skills are limited).