add by.column=F argument in frollapply #4887

matthewgson · 2021-02-02T23:24:03Z

I may not fully understand frollapply, but in my experiments I was unable to run a rolling custom function that requires multiple columns.

I found zoo::rollapply, whose by.column=F argument let me do the job. However, it is not fully compatible with data.table's [, by=] grouping, so I had to loop manually. Could this by.column argument, or something similar, be implemented in the future? Or, if I'm missing an existing feature or workaround, please let me know. Thank you.

jangorecki · 2021-02-03T08:16:21Z

Good feature request. Please provide a reproducible zoo example, or your current loop code.

matthewgson · 2021-02-04T12:17:27Z

Here's sample code similar to what I did.

library(data.table)
library(zoo)
iris = as.data.table(iris)

# rolling calculation on two columns

flow_dt = function(DT){ 
  # Data table with two columns
  # needs to be applied in the zoo::rollapply function. 
  flow = (DT[2,1] - DT[1,1] * (1+DT[2,2])) / (DT[1,1])
  return(flow)
}

return = rollapply(iris[,1:2], 2, flow_dt, by.column=F)
dim(iris) # 150 5
length(return) # 149
frollapply(iris[,1:2], 2, flow_dt) # Error in DT[2, 1] : incorrect number of dimensions

iris[, flow := c(NA, rollapply(iris[,1:2], 2, flow_dt, by.column=F))] # works fine
iris[, flow := c(NA, rollapply(iris[,1:2], 2, flow_dt, by.column=F)), by=Species] # error

I worked around this by splitting the data.table by group (Species) and running zoo::rollapply on each piece.


split_table = split(iris, by='Species')
split_table
for (dt in split_table){
  dt[, flow := c(NA, rollapply(dt[,1:2], 2, flow_dt, by.column=F))]
}

result = rbindlist(split_table)
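
For readers who need a stopgap without zoo, here is a minimal sketch of the same idea using only released data.table features. The helper roll_rows is hypothetical (it is not part of data.table or zoo): it walks over row windows itself and passes each window of .SD to the function, so it can be called inside by=. The names roll_rows and flow_vec are assumptions made for illustration, and the approach is likely much slower than a native implementation.

```r
library(data.table)

# Hypothetical helper, not part of data.table or zoo: apply `f` to every
# complete window of `n` consecutive rows of `d`, padding the start with NA.
roll_rows = function(d, n, f) {
  k = nrow(d)
  if (k < n) return(rep(NA_real_, k))
  starts = seq_len(k - n + 1L)
  c(rep(NA_real_, n - 1L),
    vapply(starts, function(i) f(d[i:(i + n - 1L)]), numeric(1)))
}

# Same formula as flow_dt, but returning a plain numeric scalar.
flow_vec = function(d) {
  v1 = d[[1L]]
  v2 = d[[2L]]
  (v1[2L] - v1[1L] * (1 + v2[2L])) / v1[1L]
}

dt = as.data.table(iris)
# Windows never cross group boundaries because .SD holds one group at a time.
dt[, flow := roll_rows(.SD, 2L, flow_vec), by = Species, .SDcols = 1:2]
```

Because each group is processed separately, the first row of every Species is NA rather than being computed from the last row of the previous group.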

jangorecki · 2022-10-07T16:50:48Z

@matthewgson Hi there,
there is a PR candidate that implements by.column=FALSE

install.packages("data.table", repos="https://jangorecki.gitlab.io/data.table")
library(data.table)
iris = as.data.table(iris)
flow_dt = function(DT){ 
  flow = (DT[2,1] - DT[1,1] * (1+DT[2,2])) / (DT[1,1])
  return(flow)
}
frollapply(iris[,1:2], 2, flow_dt, by.column=FALSE, fill=data.table(Sepal.Length=NA_real_))
#     Sepal.Length
#            <num>
#  1:           NA
#  2:    -3.039216
#  3:    -3.240816
#  4:    -3.121277
#  5:    -3.513043
# ---             
#146:    -3.000000
#147:    -2.559701
#148:    -2.968254
#149:    -3.446154
#150:    -3.048387

It is currently in my private fork, because it is based on another branch rather than the master branch. Once the other branch is merged to master, I will rebase this one onto master and push it to GitHub.
The manual can be found at https://jangorecki.gitlab.io/data.table/reference/frollapply.html
Testing is very welcome.

jangorecki · 2022-10-07T17:09:05Z

Btw. I simplified your function, as it was returning single-row, single-column data.tables. Now it returns just a scalar numeric, and fill is handled automatically as well. That makes it easier to apply by group.

flow = function(DT) {
  v1 = DT[[1L]]
  v2 = DT[[2L]]
  (v1[2L] - v1[1L] * (1+v2[2L])) / v1[1L]
}
iris[, "flow" := frollapply(.SD, 2, flow, by.column=F), by=Species, .SDcols=1:2][]
#     Sepal.Length Sepal.Width Petal.Length Petal.Width   Species      flow
#            <num>       <num>        <num>       <num>    <fctr>     <num>
#  1:          5.1         3.5          1.4         0.2    setosa        NA
#  2:          4.9         3.0          1.4         0.2    setosa -3.039216
#  3:          4.7         3.2          1.3         0.2    setosa -3.240816
#  4:          4.6         3.1          1.5         0.2    setosa -3.121277
#  5:          5.0         3.6          1.4         0.2    setosa -3.513043
# ---                                                                      
#146:          6.7         3.0          5.2         2.3 virginica -3.000000
#147:          6.3         2.5          5.0         1.9 virginica -2.559701
#148:          6.5         3.0          5.2         2.0 virginica -2.968254
#149:          6.2         3.4          5.4         2.3 virginica -3.446154
#150:          5.9         3.0          5.1         1.8 virginica -3.048387
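
As a cross-check that does not depend on the unreleased by.column argument: for a window of exactly two rows, the same per-group quantity can be written with data.table's shift(). This is only a sketch for this particular formula, not a general replacement for rolling over multiple columns; the column name flow_shift is made up for illustration.

```r
library(data.table)

dt = as.data.table(iris)
# shift() lags Sepal.Length by one row within each group, so `prev` plays the
# role of v1[1L] and the current row supplies v1[2L] and v2[2L].
dt[, flow_shift := {
  prev = shift(Sepal.Length)
  (Sepal.Length - prev * (1 + Sepal.Width)) / prev
}, by = Species]
```

The first row of each group has no previous row, so it comes out as NA, matching the grouped output shown above.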

Waldi73 · 2024-03-05T16:23:10Z

@jangorecki, thanks for adding this option, it would be great for rolling regression like here.
I didn't find it in 1.15.2 yet. Is a merge planned for an upcoming version?

jangorecki · 2024-03-06T19:40:50Z

Hopefully in 1.16.0, but there are many PRs on the way that have to be merged first. If you need it very much, you can install the branch of the PR that closes this issue. You are also welcome to contribute by making the requested changes to the PRs that need to be merged before this one.

jangorecki added the feature request label Feb 3, 2021

This was referenced Aug 29, 2022

rolling functions: adaptive left, frollmax, frollapply adaptive, partial #5441

Open

rolling functions, rolling aggregates, sliding window, moving average #2778

Open

jangorecki added the froll label Sep 26, 2022

jangorecki self-assigned this Sep 29, 2022

This was referenced Oct 7, 2022

Rolling regression #4075

Closed

roll regression #5314

Closed

jangorecki added this to the 1.14.7 milestone Oct 10, 2022

jangorecki linked a pull request Jan 3, 2023 that will close this issue

frollapply rewritten, supports by.column=F #5575

Open

jangorecki modified the milestones: 1.14.11, 1.15.1 Oct 29, 2023

MichaelChirico modified the milestones: 1.16.0, 1.17.0 Jul 14, 2024

MichaelChirico modified the milestones: 1.17.0, 1.18.0 Jan 17, 2025
