sort argument doesn't sort categories/colors in stacked bar #1987

Closed
gschivley opened this issue Feb 25, 2020 · 5 comments
@gschivley

Several places indicate that the sort argument can be used universally to set a custom ordering by passing an array (list) with the desired order. This works for categories on the x-axis of a bar chart, but not for the stacked categories along the y-axis.

Current versions:

# Name                    Version                   Build  Channel
altair                    4.0.1                      py_0    conda-forge
jupyterlab                1.2.6                      py_0    conda-forge

Example (vega-lite spec):

import pandas as pd
import altair as alt

resource_order = [
    "Onshore Wind",
    "Solar",
    "Battery",
    "NGCC",
]
resource_color_scale = alt.Scale(
    domain=resource_order, 
    range=[
        "#17becf", # onshore wind
        "#d62728",  # solar 
        '#c5b0d5', # battery
        "#bcbd22", # NGCC
    ]
)
resource_colors = alt.Color("Resource Name", scale=resource_color_scale)
    
data = pd.DataFrame(
    {
        "Resource Name": ["Battery", "NGCC", "Onshore Wind", "Solar"] * 2, 
         "Capacity": [2, 20, 10, 15, 3, 10, 30, 25], 
         "Case": ["A", "A", "A", "A", "B", "B", "B", "B"]
    }
)

alt.Chart(data).mark_bar().encode(
    x=alt.X('Case', sort=["B", "A"]), # Default order is alphabetical "A", "B"
    y=alt.Y('Capacity', sort=resource_order),
    color=resource_colors
)

[image: chart rendered by the code above]

I did, however, find a workaround by ordering according to a new indexing column (vega-lite spec):

resource_order_idx = {
    resource: idx 
    for idx, resource in enumerate(resource_order[::-1]) # Reverse list to align colors
}                                                        # with legend order

# Create "idx" column with integer values indicating order in stacked bar
data["idx"] = data["Resource Name"].map(resource_order_idx)

alt.Chart(data).mark_bar().encode(
    x=alt.X('Case', sort=["B", "A"]),
    y=alt.Y('Capacity'),
    order="idx",
    color=resource_colors
)

[image: chart rendered by the code above]

I'm not sure if this is a bug in vega-lite, an oversight in the API, or the way that it is supposed to work, but I'm providing this example and my workaround to help anyone else who might be searching for help with the same issue.
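
For readers who prefer to build the index column without a dictionary, here is a minimal pandas-only sketch. It assumes the same column names as the example above and uses pd.Categorical, which is not part of the original workaround; the result should match the reversed-list mapping.

```python
import pandas as pd

resource_order = ["Onshore Wind", "Solar", "Battery", "NGCC"]

data = pd.DataFrame(
    {
        "Resource Name": ["Battery", "NGCC", "Onshore Wind", "Solar"] * 2,
        "Capacity": [2, 20, 10, 15, 3, 10, 30, 25],
        "Case": ["A"] * 4 + ["B"] * 4,
    }
)

# Integer position of each resource within the reversed order list.
# Names that are missing from the list get the code -1 instead of NaN.
codes = pd.Categorical(
    data["Resource Name"], categories=resource_order[::-1]
).codes

# Store as plain integers so the column can be used with order="idx"
data["idx"] = codes.astype(int)
```

One practical difference from the dictionary-based map is that an unexpected resource name yields -1 rather than a missing value, which may or may not be what you want.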

@jakevdp
Collaborator

jakevdp commented Feb 25, 2020

The order encoding is the documented way to specify sort order for stacked bars, so I believe this is working as intended.

The docs on the Altair side, as always, could be improved.

@jakevdp
Collaborator

jakevdp commented Feb 25, 2020

The y encoding's sort property applies to the axis, not to the marks that appear on the chart. To see why this is not a sensical place to specify stack order, imagine a horizontally-concatenated chart showing two datasets with different stack orders but the same numerical y scale: specifying two different stack orders via one shared axis would not be possible, but each panel can have its own order encoding without any problem.

@gschivley
Author

To see why this is not a sensical place to specify stack order, imagine a horizontally-concatenated chart showing two datasets with different stack orders but the same numerical y scale: specifying two different stack orders via one shared axis would not be possible, but each panel can have its own order encoding without any problem.

Ok, I think I see what you mean. A simple way to order stacked bars that accepts an array would be more intuitive (and easier) but it seems that's a Vega-Lite limitation.

As always, thanks for all the effort!

@jakevdp
Collaborator

jakevdp commented Feb 25, 2020

I believe this issue tracks the functionality you want.

@gilzero

gilzero commented Oct 31, 2022

I came across this 'issue' (perhaps not an issue, but a feature of vega-lite). As suggested above, a workaround is to first sort the dataframe into whatever order is required, then assign a new order column with quantitative values:

A quick example:

import altair as alt
import numpy as np
import pandas as pd

data = [
    {"name": "nike", "share": 61},
    {"name": "adidias", "share": 30},
    {"name": "asic", "share": 6},
    {"name": "reebok", "share": 3},
]

data = pd.DataFrame(data)

# Sort by share (largest first), then record each row's position in "order"
df = data.sort_values(by='share', ascending=False)
df['order'] = np.arange(df.shape[0])

df

[screenshot: the sorted dataframe with the new order column]

# Use df (not data): only df has the "order" column
bars = alt.Chart(df).mark_bar().encode(
    x=alt.X('share:Q', stack='zero'),
    # sort controls the legend order
    color=alt.Color('name', sort=df['name'].tolist()),
    # order controls how the bar segments are stacked
    order='order:Q'
)


text = alt.Chart(df).mark_text(dx=-8, dy=3, color='white').encode(
    x=alt.X('share:Q', stack='zero'),
    text=alt.Text('share:Q'),
    order='order:Q'
)


bars + text

[screenshot: the resulting stacked bar chart with text labels]

(Recall from the comments above that sort applies to the axis or legend, not to the marks: the legend names follow the order given by 'sort=', while the actual bar segments follow 'order='.)

Further reference:
https://altair-viz.github.io/user_guide/generated/channels/altair.Order.html?highlight=order#altair.Order
