Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Include alt.ExprRef capabilities in alt.expr() #2886

Merged
merged 6 commits into from
Feb 22, 2023

Conversation

mattijn
Copy link
Contributor

@mattijn mattijn commented Feb 13, 2023

This PR is an attempt to fix #2880.

Before this PR the following could be written as such:

import altair as alt

slider = alt.binding_range(min=1, max=75, step=1)
param_slider = alt.param(value=50, bind=slider)

alt.ExprRef(alt.expr.pow(param_slider, 2) * alt.expr.PI)
ExprRef({
  expr: (pow(param_1,2) * PI)
})

With this PR it also can be written as such:

alt.expr(alt.expr.pow(param_slider, 2) * alt.expr.PI)
ExprRef({
  expr: (pow(param_1,2) * PI)
})

Basically meaning that the alt.expr method is extended with the abilities to function as a function, alt.expr(expr), which is returned as an alt.ExprRef(expr).

I have no idea if I placed this piece of code in the right place. I've tried it to put it in the altair/expr/__init__.py, but was not able to make that work (it became alt.expr.expr()).

I now placed it in altair/vegalite/v5/__init__.py, there are no other class definitions in there, so I'm open for suggestions if there are better locations to place this code (I placed it there, because that is the first place where alt.expr is imported, from ... import expr)


A few more examples:

Just like the normal alt.ExprRef(), you have to make sure that you call the expr in the right order. The following would fail:

alt.expr(param_slider * param_slider * alt.expr.PI)
altair.vegalite.v5.schema.core.ExprRef->expr, validating 'type'
{'expr': '((param_1 * param_1) * PI)'} is not of type 'string'

You'll have to do:

alt.expr((param_slider * param_slider * alt.expr.PI).expr)
ExprRef({
  expr: ((param_1 * param_1) * PI)
})

Basically nesting the objects and expressing it as an expression (...).expr
Or you have to reorder if possible, to place an alt.expr.<...> as first operand:

alt.expr(alt.expr.PI * (param_slider * param_slider))
ExprRef({
  expr: (PI * (param_1 * param_1))
})

As mentioned in the error above, the expr within expr() has to be of type str, so in order to create a list including a reference to an alt.param(), you can define it for example as such:

alt.expr(f'[{param_slider.name}, 0]')
ExprRef({
  expr: '[param_1, 0]'
})

Since calling the parameter directly, will serialise it into a VariableParameter, something that you don't want:

alt.expr(f'[{param_slider}, 0]')
ExprRef({
  expr: "[Parameter('param_1', VariableParameter({\n  bind: BindRange({\n    input: 'range',\n    max: 75,\n    min: 1,\n    step: 1\n  }),\n  name: 'param_1',\n  value: 50\n})), 0]"
})

@mattijn
Copy link
Contributor Author

mattijn commented Feb 14, 2023

I moved the class that is introduced in this PR to altair/expr/__init__.py, so everything related to expr is together. I then import this expr in altair/vegalite/v5/__init__.py using from ...expr import expr instead of from ... import expr. I think this make a bit more sense.

Copy link
Contributor

@joelostblom joelostblom left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While the changes here are small and well-tested, I struggle a bit to understand the main advantages (probably because I don't use expressions often myself). Is the main benefit here that it is more convenient to type alt.expr(...) than alt.ExprRef(...)? And maybe that this is more in line with how it looks in the VL spec, where it seems like the translated syntax becomes {'expr': ...} and the word "ExprRef" does not appear anywhere in the VL spec if I understand correctly.

I will say that I find something like

alt.expr(alt.expr.pow(param_slider, 2) * alt.expr.PI)

a bit confusing at first glance and I would personally find it easier to read:

alt.ExprRef(alt.expr.pow(param_slider, 2) * alt.expr.PI)

But on the other hand something like the following reads fine:

alt.expr(f"pow({param_slider}, 2) * PI")

What are the main benefit you see in changing the syntax here instead of using ExprRef? (As a side note we should also add docs for expr/ExprRef, maybe based on https://vega.github.io/vega-lite/docs/parameter.html but shorter and more to the point and clear.

altair/vegalite/v5/__init__.py Outdated Show resolved Hide resolved
@mattijn
Copy link
Contributor Author

mattijn commented Feb 19, 2023

Is the main benefit here that it is more convenient to type alt.expr(...) than alt.ExprRef(...)? And maybe that this is more in line with how it looks in the VL spec, where it seems like the translated syntax becomes {'expr': ...} and the word "ExprRef" does not appear anywhere in the VL spec if I understand correctly.

Yes.
And it brings it in line with the existing alt.datum(...), alt.value(...):

  • alt.datum(...): Specify a datum for use in an encoding
  • alt.value(...): Specify a value for use in an encoding
  • alt.expr(...): Specify an expression for use in an attribute

Let me give an example where I use all three of above:
Lets begin with a basic wind rose

import altair as alt
import pandas as pd

source = pd.DataFrame(
    {"order": range(8),
     "value": [2, 2.5, 3, 4, 5, 5.5, 4, 3]}
)

arcs = alt.Chart(source).mark_arc().encode(
    theta="order:N",
    radius="value:Q",
    color="value:Q",
)

arcs

image

Add a outer circle to it and add some text near it, using alt.datum() and alt.value()

arc_datum = alt.Chart(source).mark_arc(
    filled=False, 
    stroke='darkslategray'
).encode(
    radius=alt.datum(5.5)
)

text_value = alt.Chart(source).mark_text(
    color='darkslategray'
).encode(
    radius=alt.datum(6),
    text=alt.value('my_outer_circle')
)

arcs + arc_datum + text_value

image

Now include a few expressions that change the appearance based on the width of the chart using alt.expr()

bind_range = alt.binding_range(min=100, max=300)
param_width = alt.param(name='width', bind=bind_range)
param_height = alt.param(name='height', expr='width')

expr_grad_length = alt.expr("width < 200 ? 100 : 150")
expr_grad_thickness = alt.expr("width < 200 ? 10 : 20")

expr_font_size = alt.expr("width < 200 ? 10 : 15")
expr_color = alt.expr("width < 200 ? 'red' : 'green'")

arcs = alt.Chart(source).mark_arc().encode(
    theta="order:N",
    radius="value:Q",
    color=alt.Color("value:Q").legend(
        gradientLength=expr_grad_length,
        gradientThickness=expr_grad_thickness
    )
)

text_value = alt.Chart(source).mark_text(
    color=expr_color,
    fontSize=expr_font_size,
).encode(
    radius=alt.datum(6),
    text=alt.value('my_outer_circle')
)

(arcs + arc_datum + text_value).add_params(param_width, param_height)

image

Where the ternary operator a ? b : c, also could be expressed using the alt.expr.if_() function, eg:

expr_grad_length = alt.expr(alt.expr.if_(param_width < 200, 100, 150))

I hope this helps you understand why I find this approach useful. Does it make sense to you?

@joelostblom
Copy link
Contributor

joelostblom commented Feb 19, 2023

Great, thanks for adding that additional example @mattijn ! I think we should definitely include an example like this in the docs, either the gallery or in a section that expands on expressions; I opened #2904 to track the latter. Maybe the only change would be the use of width as a name because I think that is reserved in VL expression strings (if I am reading that page correctly).

I think we can probably merge this PR and pursue the docs in a separate PR since the doc upgrade needed is bigger than the changes in this PR. I will try to take a closer look and maybe merge tonight (see my thoughts in the next comment of what to address before merging). One thing I did react to in your comparison to alt.datum and alt.value is that those two always take a single value inside them (I think?), which make them pretty straightforward. alt.expr can contain both alt.expr.<expression_name> as well as alt.datum.<column_name>, which is still confusing to me at first glance but I probably just need to play with it a bit myself and get a feel for it and then it will make sense.

@joelostblom
Copy link
Contributor

joelostblom commented Feb 19, 2023

Before merging this, I think we should decide whether to keep ExprRef around at all or remove it/hide it? I think it can be confusing when keeping multiple ways of achieving the same thing (and since ExprRef is quite new, we don't have to worry as much about backwards compatibility here I think).

We would also need to update the docs to point to expr instead of ExprRef:

image

@mattijn
Copy link
Contributor Author

mattijn commented Feb 21, 2023

[..] and since ExprRef is quite new, we don't have to worry as much about backwards compatibility here I think

ExprRef is available in Altair version 4.2.0, released Dec 29, 2021. I don't think we should delete ExprRef. There is also alt.datum() and alt.XDatum() side-by-side etc. I can imagine it would be nice to have the docs pointing to alt.expr(), but I have no idea how that can be done.

@joelostblom
Copy link
Contributor

Do you think we should hide it from the tab completion menu and schedule for deprecation later similar to how we have done with selection_single/multi which links to selection_point? Or is there anything that Expr can do that expr cannot after this PR?

I would be tempted to say that we should do the same for alt.XDatum etc, but for those there is an argument to keep them due to the help popup being specific to the encoding channel which is more helpful than the general help popup from alt.datum (and I don't see us being able to replicate this inside alt.datum since it will depend on which channel it is used for):

image

image

@mattijn
Copy link
Contributor Author

mattijn commented Feb 22, 2023

No.. These are all part of the Vega-Lite schema and not introduced by Altair.
These are encoding channels or low level schema wrappers. There are many, we can't or shouldn't deprecate these. Hiding is another discussion.

What I'm proposing is to have a utility function named expr() around the low-level wrapper ExprRef() and thus lift it to become one of the Altair API functions.

In the coming days I will try to improve the docs on this in the separate issue you opened.

If you have any ideas how we can patch the docstring/types to use expr instead of ExprRef, that would be great.

Copy link
Contributor

@joelostblom joelostblom left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are all part of the Vega-Lite schema and not introduced by Altair.

Ah okay, thanks for pointing that out. I didn't realize that and I thought ExprRef the was introduced by altair. I agree with you that we should not deprecate or remove anything that's coming directly from VL, but I do think it can be very useful to hide some of all these functions and classes to make it easier to work with Altair. I have opened #2918 to continue this discussion as I now understand that it extends beyond this PR.

I think this PR looks good then, and it would be great to get the docs in also to be able to understand the recommended usage patterns of expressions in Altair. Thank you for working on that!

If you have any ideas how we can patch the docstring/types to use expr instead of ExprRef, that would be great.

I don't unfortunately but added it to #2904 so that we don't lose track of it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Upgrade alt.expr to include behaviour of alt.ExprRef
2 participants