Don't add impute for violin plot #5611

domoritz · 2019-12-02T17:21:52Z

{
  "data": {"url": "data/iris.json"},
  "transform": [
    {
      "fold": ["sepalLength", "sepalWidth",
      "petalLength", "petalWidth"],
      "as": ["feature", "value"]
    },
    {
      "density": "value",
      "extent": [0, 8],
      "groupby": ["feature"]
    }
  ],
  "mark": {"type": "area", "orient": "horizontal"},
  "encoding": {
    "column": {"type": "nominal", "field": "feature"},
    "x": {"type": "quantitative", "field": "density", "stack": "center"},
    "y": {"type": "quantitative", "field": "value"}
  },
  "width": 60
}

@jheer said: I took a closer look and the culprit is not a sorting issue, but rather the auto-magical inclusion of an impute transform that has no business being there. By default, Vega performs adaptive sampling to determine which points along the density curve to include. As this can result in different sample points for the different areas, their domain values should not be used together to perform imputation. @domoritz, @kanitw I think this needs to be fixed prior to a v4 release.
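To make the problem concrete, here is a minimal Python sketch of what cross-group imputation does to two density curves sampled at different positions. The sample points, the group names `a` and `b`, and the helper `impute_missing` are all hypothetical, and the sketch assumes gaps are filled with zero; it is not the Vega-Lite implementation.

```python
def impute_missing(groups, fill=0.0):
    """Union the domain values of all groups and fill the gaps in each group."""
    # Every x value seen in any group becomes part of the shared domain.
    domain = sorted({x for points in groups.values() for x in points})
    return {
        name: [(x, points.get(x, fill)) for x in domain]
        for name, points in groups.items()
    }


# Hypothetical adaptive samples: each group was sampled at different x values.
groups = {
    "a": {1.0: 0.2, 2.0: 0.5, 3.0: 0.2},
    "b": {1.5: 0.3, 2.5: 0.4},
}

imputed = impute_missing(groups)
for name, points in imputed.items():
    print(name, points)
```

In this sketch, group `a` picks up zero-valued points at 1.5 and 2.5, positions that only group `b` sampled, so its curve would drop to zero between its own samples.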

domoritz · 2019-12-03T06:45:18Z

In the future, we will provide alignment to create violin plots. For now, we won't change anything about the spec above (I also don't know what we would change), but we should allow disabling imputation like this:

{
  "$schema": "https://vega.github.io/schema/vega-lite/v4.json",
  "data": {"url": "data/iris.json"},
  "transform": [
    {
      "fold": ["sepalLength", "sepalWidth", "petalLength", "petalWidth"],
      "as": ["feature", "value"]
    },
    {"density": "value", "extent": [0, 8], "groupby": ["feature"]}
  ],
  "mark": {"type": "area", "orient": "horizontal"},
  "encoding": {
    "column": {"type": "nominal", "field": "feature"},
    "x": {
      "type": "quantitative",
      "field": "density",
      "stack": "center",
      "impute": null
    },
    "y": {"type": "quantitative", "field": "value"}
  },
  "width": 60
}

jheer · 2019-12-03T07:04:55Z

Not clear to me why imputation would be opt out rather than opt in. Also, if the same logic also applies to line marks we have the same problem for regression lines. In general it is simply not correct to assume that different groups within a groupby should have identical x/y domain values.

domoritz · 2019-12-03T07:08:09Z

I'm looking into why we initially decided to add imputation by default. Maybe we can disable it.

domoritz · 2019-12-03T07:10:20Z

We are currently only adding imputation when stacking path marks.

domoritz · 2019-12-03T07:35:28Z

Here is what happens without imputation by default for stacked path marks:

And with imputation:

{
  "$schema": "https://vega.github.io/schema/vega-lite/v4.json",
  "data": { "url": "data/population.json"},
  "transform": [
    {"filter": "datum.year == 2000"},
    {"filter": "datum.age>50 || datum.sex == 2"},
    {"calculate": "datum.sex == 2 ? 'Female' : 'Male'", "as": "gender"}
  ],
  "mark": {"type": "area", "line": true},
  "encoding": {
    "y": {
      "aggregate": "sum", "field": "people", "type": "quantitative"
    },
    "x": {"field": "age", "type": "ordinal"},
    "color": {
      "field": "gender", "type": "nominal",
      "scale": {"range": ["#675193", "#ca8861"]}
    },
    "opacity": {"value": 0.7}
  }
}

Can you say more about why imputation should be opt-in for stacked path marks?
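As a rough illustration of the trade-off in the comparison above, the following Python sketch stacks two series where one of them has no rows at some x positions, mirroring the filter in the spec. The counts and the helper `stack` are hypothetical, and the zero fill is an assumption about how the gaps are imputed.

```python
def stack(rows, impute=False):
    """Stack series at each x; return {x: [(series, y0, y1), ...]}."""
    xs = sorted({x for x, _, _ in rows})
    series = sorted({s for _, s, _ in rows})
    values = {(x, s): v for x, s, v in rows}
    stacked = {}
    for x in xs:
        base = 0
        layers = []
        for s in series:
            if (x, s) not in values and not impute:
                continue  # without imputation the series has no point at this x
            v = values.get((x, s), 0)  # with imputation, fill the gap with zero
            layers.append((s, base, base + v))
            base += v
        stacked[x] = layers
    return stacked


# Hypothetical counts: "Male" rows exist only above age 50, as in the filtered data.
rows = [
    (40, "Female", 10),
    (50, "Female", 12),
    (60, "Female", 9),
    (60, "Male", 8),
]

print(stack(rows)[40])
print(stack(rows, impute=True)[40])
```

With imputation, the "Male" series gets a zero-height point sitting on top of the "Female" layer at ages 40 and 50. Without it, the series has no points there at all, so its path has nothing to follow at those positions.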

jheer · 2019-12-03T07:35:57Z

Ah, that makes sense. Do we have any way of knowing at compile time that a stack only has one entry, as in this violin case? Or, can we somehow use Vega’s xc channel instead of a stack transform?

domoritz · 2019-12-03T07:37:27Z

@jheer said: Or, can we somehow use Vega’s xc channel instead of a stack transform?

I think that's the right thing to do in general but I'd like to defer this feature to 4.1. For now, we should support disabling imputation.

domoritz · 2019-12-03T08:17:42Z

I have a fix in #5617 for now (which I think we should have either way).

We could know that there is only a single mark (if we don't encode color, opacity, detail, etc.), but this would require more modifications than I want to make right now, and the right solution is to center marks without stacking. We need to think a bit more about a design for that.
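The check described above could look roughly like the following Python sketch. The helper `splits_into_groups` and the channel list are hypothetical: the list covers only the channels named in this comment, and the rule that only a channel with a `field` can split a mark is an assumption, not the compiler's actual logic.

```python
# Illustrative only: just the channels named in the comment, not an exhaustive list.
GROUPING_CHANNELS = ("color", "opacity", "detail")


def splits_into_groups(encoding):
    """Return True if any grouping channel is bound to a data field."""
    for channel in GROUPING_CHANNELS:
        defs = encoding.get(channel, [])
        if isinstance(defs, dict):
            defs = [defs]  # treat a single definition like a one-element list
        # A constant such as {"value": 0.7} does not split the mark into groups.
        if any("field" in d for d in defs):
            return True
    return False


violin_encoding = {
    "column": {"type": "nominal", "field": "feature"},
    "x": {"type": "quantitative", "field": "density", "stack": "center"},
    "y": {"type": "quantitative", "field": "value"},
}

population_encoding = {
    "y": {"aggregate": "sum", "field": "people", "type": "quantitative"},
    "x": {"field": "age", "type": "ordinal"},
    "color": {"field": "gender", "type": "nominal"},
    "opacity": {"value": 0.7},
}

print(splits_into_groups(violin_encoding))
print(splits_into_groups(population_encoding))
```

Under these assumptions, the violin encoding has no grouping channel, so each facet holds a single area, while the population encoding is split by `color`.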

Fixes #5611

domoritz · 2023-12-14T20:37:28Z

New spec with penguins: Open the Chart in the Vega Editor

domoritz added the Bug 🐛 label Dec 2, 2019

domoritz modified the milestones: 4.0? (Maybe in 4.0), 4.x, 4.0 Dec 2, 2019

domoritz self-assigned this Dec 3, 2019

domoritz mentioned this issue Dec 3, 2019

feat: support disabling impute for stacking #5617

Merged

domoritz closed this as completed in #5617 Dec 3, 2019

domoritz added a commit that referenced this issue Dec 3, 2019

feat: support disabling impute for stacking (#5617)

6dbbd11

Fixes #5611

joelostblom mentioned this issue Nov 11, 2023

feat: add explicit option to control how densities are resolved, change how densities are resolved by default #9172

Merged
