Nested grouping #41

grantmcdermott · 2023-06-28T18:15:37Z

It would be nice if we could support nested grouping. (Or, put differently, allow colours to vary/repeat across units.) This would mostly be useful for line plots where we want to avoid joining the end of one line with the start of another. The idea is similar to how ggplot2 allows you to specify aes(col = var1, group = var2) separately.

Here is an illustration using the following dataset. The setting is a difference-in-differences research design with staggered treatment. So we have treatment cohorts (first_treat) superimposed on individual units (id).

First, points. (Fine.)

plot2(y ~ time | first_treat, dat)

Second, lines. (Not fine, because we have lines rejoining across units in the same cohort.)

plot2(y ~ time | first_treat, dat, type = "l")

Of course, we could group (colour) by the individual IDs. This stops the rejoining, but means that we lose the colouring by treatment group (which is the interesting thing from a causal inference perspective).

plot2(y ~ time | id, dat, type = "l", legend = FALSE)

I don't have a solution right now, but it probably requires a new argument like bycol. On the formula side, we could potentially represent this via a / nesting interaction. So the call would become plot2(y ~ time | first_treat / id, dat, type = "l"), i.e. units are nested within first treatment cohorts.
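
A minimal sketch of how the proposed / syntax might be unpacked, using only base R. This is not tinyplot's actual parsing code, and the names by_expr, nest_fml, nest_labels, col_var and grp_var are hypothetical. It leans on the fact that R's formula machinery expands a / b to a + a:b.

```r
# Proposed (hypothetical) call syntax from the discussion above
f = y ~ time | first_treat / id

# The right-hand side parses as `|`(time, first_treat / id), so the
# nesting expression is the second argument of the `|` call
by_expr = f[[3]][[3]]

# Wrap it in a one-sided formula and let terms() expand the nesting
nest_fml = eval(call("~", by_expr))
nest_labels = attr(terms(nest_fml), "term.labels")
nest_labels

# Assumed mapping: the outer variable drives colour, while the inner
# variable identifies the individual lines
nest_vars = all.vars(by_expr)
col_var = nest_vars[1]
grp_var = nest_vars[2]
```

The expanded labels are first_treat and first_treat:id, which lines up with the intended reading: colour by cohort, one line per unit within cohort.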

grantmcdermott · 2024-08-30T19:38:32Z

Ran into this again recently and am now thinking a simpler solution is just to support passing a variable to col. It should be pretty simple to grab the corresponding colour breaks and pass them to our group-split data, by using something like tapply(factor(col_var), by_var, FUN = `[[`, 1) internally.

Manual proof of concept:

library(tinyplot)

set.seed(123456L)

# 60 time periods, 30 individuals, and 5 waves of treatment
tmax = 60
imax = 30
nlvls = 5

dat = 
  expand.grid(time = 1:tmax, id = 1:imax) |>
  within({
    
    cohort      = NA
    effect      = NA
    first_treat = NA
    
    for (chrt in 1:imax) {
      cohort = ifelse(id==chrt, sample.int(nlvls, 1), cohort)
    }
    
    for (lvls in 1:nlvls) {
      effect      = ifelse(cohort==lvls, sample(2:10, 1), effect)
      first_treat = ifelse(cohort==lvls, sample(1:(tmax+20), 1), first_treat)
    }
    
    first_treat = ifelse(first_treat>tmax, Inf, first_treat)
    treat       = time>=first_treat
    rel_time    = time - first_treat
    y           = id + time + ifelse(treat, effect*rel_time, 0) + rnorm(imax*tmax)
    
    rm(chrt, lvls, cohort, effect)
  })

cols = with(dat, tapply(factor(first_treat), id, FUN = `[[`, 1))  # grab group colours
cols
#>  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 
#>  1  3  3  5  4  1  3  2  2  4  2  1  3  4  5  3  4  2  1  2  3  4  3  2  3  5 
#> 27 28 29 30 
#>  5  3  1  3

plt(y ~ time | id, dat, type = "l", col = palette()[cols], legend = FALSE)
#> Warning in tinyplot.default(x = x, y = y, by = by, facet = facet, facet.args = facet.args, : 
#> Continuous legends not supported for this plot type. Reverting to discrete legend.

Created on 2024-08-30 with reprex v2.1.1

TBD on how to handle legends, as well as NSE vs formula arguments.
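
On the legend question, one possible direction is to key the legend on the colour variable's levels rather than on the individual groups. The sketch below uses only base graphics and a small made-up dataset (toy, with hypothetical columns id, time, cohort and y); it is not how tinyplot builds its legends, just an illustration of the idea.

```r
# Toy long-format data: 4 units nested in 2 cohorts (made-up values)
toy = data.frame(
  id     = rep(1:4, each = 3),
  time   = rep(1:3, times = 4),
  cohort = rep(c("a", "b", "a", "b"), each = 3)
)
toy$y = toy$id + toy$time

# One colour index per unit, taken from the first cohort value in each unit
cohort_f = factor(toy$cohort)
grp_idx  = tapply(as.integer(cohort_f), toy$id, FUN = `[[`, 1)
grp_col  = palette()[grp_idx]

# Legend keys: one per cohort level, not one per unit
key_lab = levels(cohort_f)
key_col = palette()[seq_along(key_lab)]

# Draw one line per unit, then add the cohort-level legend
plot(y ~ time, data = toy, type = "n")
for (i in seq_along(grp_col)) {
  d = toy[toy$id == names(grp_idx)[i], ]
  lines(d$time, d$y, col = grp_col[i])
}
legend("topleft", legend = key_lab, col = key_col, lty = 1, title = "cohort")
```

Here four lines are drawn but the legend has only two entries, so the colour mapping stays readable however many units sit inside each cohort.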

grantmcdermott mentioned this issue Sep 16, 2024

"contour" #218

Open
