Boxplots with many identical values or just one value are missing the median line #8126
Comments
On a closer look, it appears that the median line is actually there, but since it is drawn in white, it is invisible unless the colored box is present or the chart background is dark. It would be nice to add some logic that changes the color of this line to the color of the box/outliers when the box is not present (so blue in this case). This could be thought of as compressing the box (q1 and q3) to a line at q2 and drawing it on top of the median line. Maybe also increase the thickness slightly to 2 (only when there is no box), which I think makes it clear what is going on.
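To make the proposed rule concrete, here is a minimal Python sketch of the decision logic. It is not Vega or Vega-Lite code: `median_style`, `MedianStyle`, the `"blue"` placeholder color, and the default thickness of 1 are all hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class MedianStyle:
    color: str
    thickness: float


def median_style(q1: float, q3: float, box_color: str = "blue") -> MedianStyle:
    """Pick a style for the median line (hypothetical helper, not a Vega API).

    When q1 == q3 the box has zero height, so a white line would be
    invisible on a light background. In that case fall back to the box
    color and a slightly thicker line, as suggested in the comment above.
    """
    if q1 == q3:
        # Box is collapsed: draw the median in the box color, thickness 2.
        return MedianStyle(color=box_color, thickness=2)
    # Box is visible: keep the white median line.
    # The thickness of 1 here is an assumed default, not taken from Vega-Lite.
    return MedianStyle(color="white", thickness=1)
```

In the real implementation this check would have to live in the generated Vega spec, since the quartiles are only known once the data has been aggregated.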
Ahh, good catch. The tricky bit is that Vega-Lite never sees the data, so we have to build the logic into the Vega spec.
Another scenario where the current behavior makes it hard to detect the median is when it is the same as one of the quartiles, as in this case: An alternative to introducing logic for these special cases on the Vega side of things would be to change the default median line to a thicker black line (the same grey as the whiskers is hard to see): This doesn't look quite as great as white in most cases, but it does solve both of the edge cases I have reported here.
Another example:
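For anyone who wants to try the darker median line today, a per-chart override is one way to do it. The sketch below builds a Vega-Lite spec as a plain Python dict and sets the `median` part of the `boxplot` mark definition to black. The inline data, the field names `group` and `y`, and the v5 schema URL are assumptions for illustration. Thickness is left out because I am not certain which property controls it for the median part.

```python
import json

# Hypothetical data: many identical values plus two outliers.
values = [{"group": "a", "y": 5}] * 20 + [
    {"group": "a", "y": 1},
    {"group": "a", "y": 9},
]

spec = {
    "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
    "data": {"values": values},
    # Override the median part of the composite boxplot mark so it is not white.
    "mark": {"type": "boxplot", "median": {"color": "black"}},
    "encoding": {
        "x": {"field": "group", "type": "nominal"},
        "y": {"field": "y", "type": "quantitative"},
    },
}

# Serialize the spec so it can be pasted into the Vega Editor.
spec_json = json.dumps(spec, indent=2)
```

This only changes a single chart; it does not address what the default should be.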
Could we add a colored outline around a white line?
I tried that a little before, but it was difficult to get the top and bottom of the outline flush with the box, since the corners seem to be a bit rounded regardless of the cap style I choose. If you are OK with the median line being contained within the box (rather than the current appearance of splitting the box in two), then I think it can work:
I'll defer to @kanitw, who might have a better idea.
Boxplots with just one value also suffer from this problem. I think another option to consider is conditional encoding (don't use the white color if min === median === max)?
Yes, that sounds like a good alternative too.
For datasets with many identical values, it is understandable that no box is drawn between q1 and q3, but the median line at q2 should always be present. Currently only the outliers are shown, which is confusing because it gives the impression that the dataset contains only a few observations, rather than potentially many observations compressed at the same value.
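The degenerate case described above can be reproduced numerically. The following is a minimal sketch using Python's standard library; `boxplot_summary` is a hypothetical helper, and both the "inclusive" quantile method and the 1.5 × IQR whisker rule are assumptions that may differ in detail from what Vega computes.

```python
import statistics


def boxplot_summary(values, whisker=1.5):
    """Compute quartiles and Tukey-style outliers (hypothetical helper)."""
    data = sorted(values)
    # Three cut points splitting the data into four groups: q1, median, q3.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - whisker * iqr
    high_fence = q3 + whisker * iqr
    # Anything outside the fences is drawn as an individual outlier point.
    outliers = [v for v in data if v < low_fence or v > high_fence]
    return {"q1": q1, "median": median, "q3": q3, "outliers": outliers}


# Twenty identical observations plus two different ones.
data = [5] * 20 + [1, 9]
summary = boxplot_summary(data)
```

For this input q1, the median, and q3 all equal 5, so the box has zero height and the values 1 and 9 are the only outliers. With a white median line, the chart then shows two points for a dataset of 22 observations.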
Open the Chart in the Vega Editor
My expectation would be to see the chart like it is shown in seaborn: