Add new functions `fold` and `fold_over` #2

TimTeaFan · 2021-05-11T15:36:26Z

Based on this gist, `fold` and `fold_over` might be useful add-on functions for a future version of dplyover. There should be a better name than `fold` for this kind of function.

library(dplyr)

# simulate n responses to one 7-point Likert item
likert_col <- function(n = 10) {
  sample(7, size = n, replace = TRUE)
}

# toy data
dat <- tibble(
  cat_1 = likert_col(),
  cat_2 = likert_col(),
  cat_3 = likert_col(),
  dog_1 = likert_col(),
  dog_2 = likert_col()
)

# `fold` does not exist yet
dat %>% 
  transmute(fold(starts_with("cat"),
                 list(sum = ~ rowSums(.x),
                      mean = ~ rowMeans(.x))))

#> # A tibble: 10 x 2
#>    cat_sum cat_mean
#>      <dbl>    <dbl>
#>  1      11     3.67
#>  2      10     3.33
#>  3       6     2   
#>  4       4     1.33
#>  5      10     3.33
#>  6       7     2.33
#>  7      12     4   
#>  8      12     4   
#>  9      17     5.67
#> 10      13     4.33

# `fold_over` does not exist yet; the output below is a mock-up
dat %>% 
  transmute(fold_over(cut_names("_[0-9]*$"),
                      ~ starts_with(.x),
                      ~ rowSums(.x)))

#> # A tibble: 10 x 2
#>      cat   dog
#>    <dbl> <dbl>
#>  1    11    11
#>  2    10    10
#>  3     6     6
#>  4     4     4
#>  5    10    10
#>  6     7     7
#>  7    12    12
#>  8    12    12
#>  9    17    17
#> 10    13    13

TimTeaFan · 2021-05-22T16:39:26Z

I think `fold` would be a great extension of {dplyover}, but a better name should be found, given that {rsample} uses `vfold` and {furrr} also has a `fold` function.

Then again, `fold` does pretty much what it says: it folds several columns of a data.frame down into one column, for example by calculating row means.
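For comparison, a single fold can already be written with plain dplyr by handing `across()` to `rowSums()` or `rowMeans()`. The sketch below assumes dplyr >= 1.0.0 and uses a small hypothetical tibble called `toy` with fixed values, not the random data from the example above.

```r
library(dplyr)

# hypothetical toy data with fixed values so the result is easy to check
toy <- tibble(
  cat_1 = c(1, 4, 7),
  cat_2 = c(2, 5, 7),
  cat_3 = c(3, 6, 7),
  dog_1 = c(1, 1, 1)
)

# fold the three cat_* columns down to one sum column and one mean column
folded <- toy %>%
  transmute(
    cat_sum  = rowSums(across(starts_with("cat"))),
    cat_mean = rowMeans(across(starts_with("cat")))
  )

folded
```

This works, but the column selection has to be repeated for every function, which is the repetition a dedicated `fold` would remove.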

vorpalvorpal · 2021-08-18T01:13:06Z

Firstly, thanks for the package. I think this has a far more common use case than Hadley suggested.

Secondly, maybe I'm misunderstanding the purpose of fold here, but wouldn't

summarise(over(starts_with("cat"),
               list(sum = ~ rowSums(.x),
                    mean = ~ rowMeans(.x))))

do the same thing? At least that way you avoid using the name "fold".

TimTeaFan · 2021-08-18T22:13:00Z

Thank you for your feedback! Unfortunately, `over` and the other functions in the over-across family don't work like that. `over` loops over a vector and creates a new column for each element. In addition, `over` does not support tidy-select syntax in its `.x` argument.

However, we could create a named list of data.frames on the fly as input to `over` and produce a similar outcome. I think dedicated functions like `fold` and `fold_over` would still be helpful, since we wouldn't need to pass one or several `select` calls as input to `over`.

# instead of fold_over we could do:
dat %>% 
  summarise(over(list(cat = select(., starts_with("cat")),
                      dog = select(., starts_with("dog"))),
                 list(sum  = rowSums,
                      mean = rowMeans)))

#> # A tibble: 10 x 4
#>    cat_sum cat_mean dog_sum dog_mean
#>      <dbl>    <dbl>   <dbl>    <dbl>
#>  1      12     4         12      6  
#>  2      11     3.67       3      1.5
#>  3      19     6.33       4      2  
#>  4       6     2          9      4.5
#>  5       9     3         14      7  
#>  6       4     1.33       7      3.5
#>  7       7     2.33      10      5  
#>  8       8     2.67       3      1.5
#>  9       9     3          9      4.5
#> 10      10     3.33       7      3.5

Created on 2021-08-19 by the reprex package (v0.3.0)
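The grouping step of the proposed `fold_over` can also be approximated without any new function. This is only a rough sketch using dplyr and base R: it assumes that stripping the numeric suffix with `sub()` gives the same prefixes that `cut_names("_[0-9]*$")` is meant to return, and `toy`, `prefixes` and `sums` are hypothetical names chosen for illustration.

```r
library(dplyr)

# hypothetical toy data with fixed values
toy <- tibble(
  cat_1 = c(1, 4), cat_2 = c(2, 5), cat_3 = c(3, 6),
  dog_1 = c(7, 1), dog_2 = c(7, 2)
)

# derive the group prefixes by stripping the numeric suffix
# (assumed to mirror cut_names("_[0-9]*$") from the proposal)
prefixes <- unique(sub("_[0-9]*$", "", names(toy)))

# one vector of row sums per prefix, collected in a named list
sums <- lapply(setNames(prefixes, prefixes), function(p) {
  unname(rowSums(select(toy, starts_with(p))))
})

# one output column per prefix
folded <- as_tibble(sums)
folded
```

The three separate steps here (derive the groups, select the columns per group, fold each selection) correspond to the three arguments of the `fold_over` call sketched in the first comment.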

TimTeaFan added the `enhancement` label (New feature or request) on May 11, 2021

TimTeaFan self-assigned this on May 11, 2021

TimTeaFan added the `next major release` label (try to implement this in the next major release) on May 22, 2021

TimTeaFan added the `further discussion needed` label (this issue needs further discussion) on May 22, 2021

TimTeaFan added this to the `next major release` milestone on May 22, 2021
