Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update Chart.transform_filter() to mirror alt.when() #3657

Open
dangotbanned opened this issue Oct 24, 2024 · 5 comments · May be fixed by #3664
Open

Update Chart.transform_filter() to mirror alt.when() #3657

dangotbanned opened this issue Oct 24, 2024 · 5 comments · May be fixed by #3664

Comments

@dangotbanned
Copy link
Member

What is your suggestion?

The goal is simply to provide equivalent predicate parsing functionality, that is found in alt.when().

For example, this would be easy to port over:

verbose_composition = (
    (alt.datum.Name == "Name_1")
    & (alt.datum.Color == "Green")
    & (alt.datum.Age == 25)
    & (alt.datum.StartDate == "2000-10-01")
)
when_verbose = alt.when(verbose_composition)
when_concise = alt.when(Name="Name_1", Color="Green", Age=25, StartDate="2000-10-01")

I was reading through transform/filter and looking at the implementation:

.transform_filter() implementation

if isinstance(filter, Parameter):
new_filter: dict[str, Any] = {"param": filter.name}
if "empty" in kwargs:
new_filter["empty"] = kwargs.pop("empty")
elif isinstance(filter.empty, bool):
new_filter["empty"] = filter.empty
filter = new_filter
return self._add_transform(core.FilterTransform(filter=filter, **kwargs))

AFAIK, the signature can be changed without impacting any existing code:

def transform_filter(
self,
filter: str
| Expr
| Expression
| Predicate
| Parameter
| PredicateComposition
| dict[str, Predicate | str | list | bool],
**kwargs: Any,
) -> Self:

  • There are no checks on filter being in **kwargs
  • The only "valid" keyword is empty
    • condition() has already been safely updated in this way

I definitely want to do some more testing, but I can't see much that would be blocking closing the gap between the two:

def when(
predicate: Optional[_PredicateType] = Undefined,
*more_predicates: _ComposablePredicateType,
empty: Optional[bool] = Undefined,
**constraints: _FieldEqualType,
) -> When:

Side note

#695 is referenced on that page, but I think it was resolved a while ago?

Also there are references to selection() which was deprecated in 5.0.0

Have you considered any alternative solutions?

  • Leave .transform_filter() unchanged
  • Accept alt.When (class) as filter
    • Would work by unwrapping When._condition
    • I'd prefer to add the expressiveness, without annotating .transform_filter with When
      • Seems less intuitive to me
@mattijn
Copy link
Contributor

mattijn commented Oct 25, 2024

I think I support this request, but to be sure with an example:

import altair as alt
from altair import datum

from vega_datasets import data
pop = data.population.url

alt.Chart(pop).mark_area().encode(
    x='age:O',
    y='people:Q',
).transform_filter(
    (datum.year == 2000) & (datum.sex == 1)
)
  • with this feature request
import altair as alt
from vega_datasets import data

source = data.population.url

alt.Chart(source).mark_area().encode(
    x='age:O',
    y='people:Q',
).transform_filter(
    alt.when(year=2000, sex=1)
)

@dangotbanned
Copy link
Member Author

dangotbanned commented Oct 25, 2024

I think I support this request, but to be sure with an example:

Thanks for the quick response @mattijn

Apologies, I wrote this up yesterday in a bit of a rush.

To clarify, below I've rewritten your comment with what I intended:

Current (first example on transform/filter)

import altair as alt
from altair import datum
from vega_datasets import data

pop = data.population.url

alt.Chart(pop).mark_area().encode(
    x='age:O',
    y='people:Q',
).transform_filter(
    (datum.year == 2000) & (datum.sex == 1)
)

With this feature request

import altair as alt
from vega_datasets import data

source = data.population.url

alt.Chart(source).mark_area().encode(
    x='age:O',
    y='people:Q',
).transform_filter(year=2000, sex=1)

Alternative (2)

Your example corresponds to what I mentioned at the end of the description

Have you considered any alternative solutions?

  • Leave .transform_filter() unchanged
  • Accept alt.When (class) as filter
    • Would work by unwrapping When._condition
    • I'd prefer to add the expressiveness, without annotating .transform_filter with When
      • Seems less intuitive to me
import altair as alt
from vega_datasets import data

source = data.population.url

alt.Chart(source).mark_area().encode(
    x='age:O',
    y='people:Q',
).transform_filter(
    alt.when(year=2000, sex=1)
)

Expanding on why I prefer not to include When

  • Using *args, **kwds would correspond directly to polars.DataFrame.filter
    • I feel there is a nice alignment then between:
      • pl.when -> alt.when
      • pl.DataFrame.filter -> alt.Chart.transform_filter
    • This makes it intuitive for users introduced to altair via polars
      • And in the future, maybe the reverse would also be true
  • We would not need to add &, |, ~ operators to When
    • E.g. this isn't valid
      • alt.when(year=2000, sex=1) | datum.name == "john"
  • Syntactically, there are fewer concepts happening in the first - but it is not less expressive
    • filter-predicates-constraints
    • filter-when-predicates-constraints

@mattijn
Copy link
Contributor

mattijn commented Oct 25, 2024

Ah, thanks for clarifying. Sounds like a good idea 👍

@jonmmease
Copy link
Contributor

I'm in favor, we're not using the kwargs for anything currently, and using them for direct equality makes sense.

@dangotbanned
Copy link
Member Author

dangotbanned commented Oct 30, 2024

Experimenting with this now, will publish a branch soon tracking in #3664.

Early findings

  • Entirely removing **kwargs breaks no tests/examples
    • Provided empty is added as named keyword-only argument
  • Ranged Dot Plot is the only example which uses filter as a keyword
    • Not a deal-breaker
    • Will probably want to issue a deprecation warning and suggest passing as positional
    • During the deprecation period, we'll have one edge case that cannot support the new syntax
      • .transform_filter(filter=value), will still need to use
      • .transform_filter(alt.datum["filter"] == value)

dangotbanned added a commit that referenced this issue Oct 30, 2024
Interestingly, this doesn't break any tests,
Would indicate what I suspected in (#3657) was true
dangotbanned added a commit that referenced this issue Oct 30, 2024
- Replaces `filter` used as a keyword argument (#3657 (comment))
- Replaces `dict` filters with `FieldOneOfPredicate`
dangotbanned added a commit that referenced this issue Nov 4, 2024
- Builds on the style introdcued for `alt.when`
- Shows a few specific kinds of predicates - due to the prior doc listing 5

#3657
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants