ux: change non-interactive repr to look more like interactive repr #10095

Open
jcrist opened this issue Sep 11, 2024 · 7 comments
Labels
ux User experience related issues

Comments

@jcrist
Member

jcrist commented Sep 11, 2024

Currently when constructing ibis expressions in non-interactive mode (the default), expressions repr as a description of the operations they're composed of:

In [1]: import ibis

In [2]: t = ibis.examples.diamonds.fetch()

In [3]: t.mutate(volume=t.x * t.y * t.z)
Out[3]: 
r0 := DatabaseTable: diamonds
  carat   float64
  cut     string
  color   string
  clarity string
  depth   float64
  table   float64
  price   int64
  x       float64
  y       float64
  z       float64

Project[r0]
  carat:   r0.carat
  cut:     r0.cut
  color:   r0.color
  clarity: r0.clarity
  depth:   r0.depth
  table:   r0.table
  price:   r0.price
  x:       r0.x
  y:       r0.y
  z:       r0.z
  volume:  r0.x * r0.y * r0.z

While this expr repr can be nice for inspection, it's rarely what I want when building up expressions lazily. Since ibis expressions are very composable, rarely do I need to know the steps used to get to a certain expression (e.g. I don't care that a group_by or filter was called earlier). Really all I care about is the schema/type of the object.

I propose we:

  • Keep around the existing expr repr, but expose it via some other method. Perhaps expr.explain() or something.
  • Move to a repr similar to the interactive one, except showing no rows, only ellipses. This would give a similar experience to iterating in interactive mode, but without executing anything. For prior art, this is also what dask does.

A quick mockup:

In [1]: import ibis

In [2]: t = ibis.examples.diamonds.fetch()

In [3]: t.mutate(volume=t.x * t.y * t.z)
Out[3]: 
┏━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━┳━━━━━━━┳━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━━━┓
┃ carat   ┃ cut       ┃ color  ┃ clarity ┃ depth   ┃ table   ┃ price ┃ x       ┃ y       ┃ z       ┃ volume    ┃
┡━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━╇━━━━━━━╇━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━━━┩
│ float64 │ string    │ string │ string  │ float64 │ float64 │ int64 │ float64 │ float64 │ float64 │ float64   │
├─────────┼───────────┼────────┼─────────┼─────────┼─────────┼───────┼─────────┼─────────┼─────────┼───────────┤
│       … │ …         │ …      │ …       │       … │       … │     … │       … │       … │       … │         … │
└─────────┴───────────┴────────┴─────────┴─────────┴─────────┴───────┴─────────┴─────────┴─────────┴───────────┘

In [5]: t.mutate(volume=t.x * t.y * t.z).select("carat", "volume")
Out[5]: 
┏━━━━━━━━━┳━━━━━━━━━━━┓
┃ carat   ┃ volume    ┃
┡━━━━━━━━━╇━━━━━━━━━━━┩
│ float64 │ float64   │
├─────────┼───────────┤
│       … │         … │
└─────────┴───────────┘
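
For anyone who wants to experiment with the layout, here is a minimal sketch of how a schema-only table like the mockup above could be drawn. It uses only the standard library and does not touch ibis internals. `render_schema_preview` and its plain `dict` schema are hypothetical names for illustration, and each column is simply as wide as its longest cell, so the widths will not match the mockup exactly.

```python
def render_schema_preview(schema: dict[str, str]) -> str:
    """Render column names and types as a box table with one ellipsis row.

    Hypothetical helper, not part of ibis: `schema` maps column name -> type name.
    """
    numeric_prefixes = ("int", "uint", "float", "decimal")
    names = list(schema)
    types = [schema[name] for name in names]
    # Each column is as wide as its longest cell (name or type).
    widths = [max(len(n), len(t)) for n, t in zip(names, types)]

    def row(cells, sep, right_aligned):
        parts = []
        for cell, width, right in zip(cells, widths, right_aligned):
            parts.append(f" {cell:>{width}} " if right else f" {cell:<{width}} ")
        return sep + sep.join(parts) + sep

    def rule(left, mid, right, fill):
        return left + mid.join(fill * (w + 2) for w in widths) + right

    left = [False] * len(names)
    # Numeric columns get a right-aligned placeholder, as in the mockup.
    numeric = [t.startswith(numeric_prefixes) for t in types]
    return "\n".join(
        [
            rule("┏", "┳", "┓", "━"),
            row(names, "┃", left),
            rule("┡", "╇", "┩", "━"),
            row(types, "│", left),
            rule("├", "┼", "┤", "─"),
            row(["…"] * len(names), "│", numeric),  # no rows are executed
            rule("└", "┴", "┘", "─"),
        ]
    )


print(render_schema_preview({"carat": "float64", "cut": "string", "price": "int64"}))
```

Nothing here needs a backend connection, since only the names and types are read.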
@cpcloud
Member

cpcloud commented Sep 11, 2024

[image]

@cpcloud
Member

cpcloud commented Sep 11, 2024

In all seriousness, I really like this idea!

@jcrist
Member Author

jcrist commented Sep 11, 2024

Sounds good! I think we should aim to get this in for 10.0 then.

One open question is what to do with scalars (since in interactive mode they only show the value, not the type).

A few options:

  • Add the type to the interactive repr (but keep scalars unnamed)?
# Interactive
┌────────────┐
│ float64    │
├────────────┤
│   43040.87 │
└────────────┘ 

# Non-interactive (could also only add the type to the non-interactive version?)
┌─────────┐
│ float64 │
├─────────┤
│       … │
└─────────┘ 
  • Some non-boxed repr?
# Interactive
┌──────────┐
│ 43040.87 │
└──────────┘

# Non-interactive
Scalar<float64>
  • Put the type in the box?
# Interactive
┌──────────┐
│ 43040.87 │
└──────────┘

# Non-interactive (this might be easy to mistake for an interactive string scalar with value `"Scalar<float64>"`)
┌─────────────────┐
│ Scalar<float64> │
└─────────────────┘
  • Something else?

I have a slight preference for the first option, but 🤷.
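
To make the first option concrete, here is a rough sketch of a scalar box that always carries the type as a header and shows either the value or an ellipsis. `render_scalar` and the `_UNEVALUATED` sentinel are hypothetical names, not ibis API, and the body is right-aligned only because the mockups above show a numeric scalar.

```python
_UNEVALUATED = object()  # sentinel: distinguishes "not executed" from a real None/NULL


def render_scalar(dtype: str, value=_UNEVALUATED) -> str:
    """Box a scalar with its type as the header (hypothetical helper).

    With no value, the body is an ellipsis (non-interactive);
    with a value, the body is its string form (interactive).
    """
    body = "…" if value is _UNEVALUATED else str(value)
    width = max(len(dtype), len(body))
    bar = "─" * (width + 2)
    return "\n".join(
        [
            "┌" + bar + "┐",
            f"│ {dtype:<{width}} │",
            "├" + bar + "┤",
            f"│ {body:>{width}} │",
            "└" + bar + "┘",
        ]
    )


print(render_scalar("float64"))            # non-interactive: type plus ellipsis
print(render_scalar("float64", 43040.87))  # interactive: type plus value
```

Both modes share one layout here, so the only difference between them is whether the body cell holds a value.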

@jcrist jcrist added the ux User experience related issues label Sep 11, 2024
@jcrist jcrist added this to the 10.0 milestone Sep 11, 2024
@gforsyth
Member

I definitely like the look of this -- it might be nice to keep the old repr around for OUR inspection, but make it private.

I like option 1 above, but I can get on board with any of them.

@drin

drin commented Sep 11, 2024

I randomly found this and just wanted to chime in: I think this sounds like a great idea and moving the old repr to an explain function or something similar makes a lot of sense.

> it might be nice to keep the old repr around for OUR inspection, but make it private.

not sure what visibility you mean by private (maybe just surrounded with __?) but it'd be nice for it to be easily accessible for substrait users. I could also imagine wanting to extend it with various verbosity flags (ops only, ops + predicates, etc.) to make validation or general observability easier.

@gforsyth
Member

> not sure what visibility you mean by private

yeah, just with a leading _ so it doesn't show up in tab-completion, but I'm also not opposed to leaving it more readily available if there's desire for that.
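
As a toy illustration of the two ideas in this exchange (a schema-only default repr, plus the operation listing kept behind a leading-underscore method with a verbosity flag), something like the following could work. `ToyExpr`, `_explain`, and the `verbose` flag are made-up stand-ins for the sake of the sketch and say nothing about how ibis expressions are actually structured.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(repr=False)
class ToyExpr:
    """Stand-in for an expression: an op name, a schema, and an optional parent."""

    op: str
    schema: dict[str, str]
    parent: ToyExpr | None = None
    detail: str = ""  # e.g. a table name or predicate, shown only when verbose

    def __repr__(self) -> str:
        # Default repr: schema only, nothing about how the expression was built.
        cols = ", ".join(f"{name}: {dtype}" for name, dtype in self.schema.items())
        return f"Table<{cols}>"

    def _explain(self, verbose: bool = False) -> str:
        # Leading underscore keeps this out of tab-completion by default.
        chain = []
        node = self
        while node is not None:
            chain.append(node)
            node = node.parent
        lines = []
        # Walked from leaf to root above, so reverse to list ops in build order.
        for i, node in enumerate(reversed(chain)):
            line = f"r{i} := {node.op}"
            if verbose and node.detail:
                line += f"[{node.detail}]"
            lines.append(line)
        return "\n".join(lines)


base = ToyExpr("DatabaseTable", {"x": "float64"}, detail="diamonds")
expr = ToyExpr("Filter", base.schema, parent=base, detail="x > 0")
print(repr(expr))
print(expr._explain(verbose=True))
```

Renaming `_explain` to a public name would be the only change needed if it should be readily available instead.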

@cpcloud
Member

cpcloud commented Dec 30, 2024

We don't have any guarantees about what the repr looks like, so this doesn't need to happen in a major release.

@cpcloud cpcloud removed this from the 10.0 milestone Dec 30, 2024
Projects
Status: backlog