add a 'having' parameter to `[.data.table` #788

nigmastar · 2014-08-29T13:16:22Z

Currently, to have the equivalent (or something similar) of the SQL having clause you need to write a [.data.table first using by and then feed the result into the i parameter of a second [.data.table, like in:

dt <- data.table(id   = rep(1:2, each = 2),
                 var  = c(0.2, 0.5, 1.5, 1.3))

dt[dt[, mean(var) > 1, by = id]$id]
   id var
1:  2 1.5
2:  2 1.3

Another option is to use conditional statement inside j, very powerful, I do all the time, and so far there is nothing that the current syntax did not allow me to do. However having a having parameter I believe will allow writing much more clear and readable codes. For example the above can be written as:

dt[, if(mean(var) > 1) .SD, by = id]

What I propose is something like:

dt[, .SD, by = id, having = mean(var) > 1]

The idea is to have an expression that always evaluates to a logical of length 1 which would tell whether or not j has to be evaluated for the current group.

Thanks,
Michele

The text was updated successfully, but these errors were encountered:

arunsrinivasan · 2014-08-29T13:24:17Z

Great FR. I've been pondering about this use case for quite a while as well. We can do this without the additional argument like so:

dt[, .SD[mean(var)>1], by=id]

(But for speed, this'll need optimisation of .SD[.] internally - #735.)

It's most likely this case that we resort to .I instead:

dt[dt[, .I[mean(var) > 1], by=id]$V1]

And it'd be great to get this directly (even better if we can achieve it without having) - maybe if j expression evaluates to a 1-column logical vector? Just thinking out loud.

nigmastar · 2014-09-01T16:57:31Z

Hi Arun. Thanks for the answer. Once optimisation of .SD will be available then this would be just a matter of 'taste' in terms of what is clearer to read between:

dt[, .SD[mean(var)>1], by=id]

and

dt[, .SD, by = id, having = mean(var) > 1]

Even though the second may also have a better appeal for people coming from other languages (SQL in particular). But again, this might be just my opinion. Maybe I just using SQL too much in the last period (lol).

eantonya · 2014-09-02T15:58:26Z

As far as the taste part goes - I really dislike adding an extra param, when it can be accomplished with simple and standard syntax (i.e. the first option above).

nigmastar · 2014-09-02T16:12:54Z

Curious. I was sure that you were that one most likely to appreciate this :-) (considering how much you wanted to eliminate by-without-by, mainly to improve readability, especially for people coming from other languages, if I remember correctly). Anyway, I know the two are quite different scenarios. I just wanted to share my point of view,:

I'm pretty sure it will be more readable for anyone not very familiar with R (or just data.table)
15 vs 14 (current) parameters of [.data.table doesn't really harm
You won't be forced to use it and it won't break any code. It will be an expression that, if provided, will possibly skip the execution of j for a particular group

eantonya · 2014-09-02T16:23:32Z

The reasons I didn't like the silent by-without-by and "having" are actually the same - I don't like to remember extra stuff, whether it be extra params or extra strange behavior.

I would argue that the first expression you wrote is much easier to read, because you don't have to keep reading the line, then discover that some new param is specified, and have to go back to the beginning of the sentence and reevaluate your mental model of what's going on.

jangorecki · 2015-01-17T23:56:33Z

What do you think of not adding having arg to [, but turning it into having() function and making it works in i argument, the same as order() works, e.g.:

dt[ having(var > 1), .(var = mean(var)), by = id ]
# would perform below without additional copy:
dt[, .(var = mean(var)), by = id ][ var > 1 ]

having would be a function to evaluate its argument in frame of dt and provide filtering to i.

franknarf1 · 2015-09-30T02:18:19Z

I think this FR is closely tied with #1269 "Returning only groups." I often want to get groups with some attribute and store them in a vector, like my_teams in this SO post. Here's the relevant line:

my_teams <- FantasyTeams[, max(table(Team)) <= 3, by=team_no][(V1)]$team_no
# or 
my_teams <- FantasyTeams[, if ( max(table(Team)) <= 3 ) 1, by=team_no]$team_no

With the having and "Returning only groups" FRs, this could be something like

my_teams <- FantasyTeams[, .(), by = team_no, having = { max(table(Team)) <= 3 }]$team_no

The code is just as long, but I prefer it, so I don't have to read j carefully to understand the objective.

franknarf1 · 2016-03-29T18:36:59Z

Another example from SO. The goal is to overwrite the Value column with 3L if some by-group level condition holds:

DT = setDT(structure(list(Ind = c(1L, 1L, 2L, 2L, 3L, 3L, 4L, 4L), ID = c("A", 
"A", "A", "A", "B", "B", "B", "B"), RegionStart = c(1L, 101L, 
1L, 101L, 1L, 101L, 1L, 101L), RegionEnd = c(100L, 200L, 100L, 
200L, 100L, 200L, 100L, 200L), Value = c(3L, 2L, 3L, 2L, 3L, 
2L, 5L, 5L), TN = c("N", "N", "T", "T", "N", "N", "T", "T")), .Names = c("Ind", 
"ID", "RegionStart", "RegionEnd", "Value", "TN"), row.names = c(NA, 
-8L), class = "data.frame"))

# current syntax 
DT[, Value := { 
  fixit = ( Value[TN=="N"] != 3L ) & ( uniqueN(Value) == 1L )
  if (fixit) 3L else Value
}, by=.(ID, RegionStart)]

# with "having"
DT[,
  Value := 3L
, by=.(ID, RegionStart)
, having={ ( Value[TN=="N"] != 3L ) & ( n_distinct(Value) == 1L ) }]

Besides arguably nicer syntax, I guess the having= way might also be more efficient, since only a subset of by-groups need to be modified. The most efficient way without having= probably looks like...

myeyes = DT[, .I[ ( Value[TN=="N"] != 3L ) & ( uniqueN(Value) == 1L )], by=.(ID, RegionStart)]$V1
DT[ myeyes, Value := 3L]

# or 

mygs = DT[, ( Value[TN=="N"] != 3L ) & ( uniqueN(Value) == 1L ), by=.(ID, RegionStart)][(V1)][, V1 := NULL]
DT[ mygs, Value := 3L, on=names(mygs)]

which are rather convoluted.

Edit: And another example to update if/when this feature is available: http://stackoverflow.com/q/36292702
(2016/4/26:) http://stackoverflow.com/q/36869784
(2016/06/16:) http://stackoverflow.com/q/37855013/

franknarf1 · 2016-05-27T15:17:08Z

Another example from SO. It could be used to select strictly unique rows (related to #1163 ):

DT = setDT(structure(list(id = c(1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4, 1, 
2, 3, 4), dsp = c(5, 6, 7, 8, 6, 6, 7, 8, 5, 6, 9, 8, 5, 6, 7, 
NA), status = c(FALSE, TRUE, FALSE, TRUE, FALSE, FALSE, FALSE, 
TRUE, FALSE, TRUE, FALSE, FALSE, FALSE, TRUE, FALSE, TRUE)), .Names = c("id", 
"dsp", "status"), row.names = c(NA, -16L), class = "data.frame"))

# my current way to select "strictly unique" rows
Bigdt[, .N, by=names(Bigdt)][N == 1][, N := NULL][]

# could be...
Bigdt[, .SD, by=names(Bigdt), having ={.N == 1L}]

Note that Bigdt[, if (.N == 1L) .SD, by=names(Bigdt)] does not work here, since .SD is empty. Maybe that could be helped by #1269 , though.

And another from SO: http://stackoverflow.com/q/38272608/ They want to select groups based on stuff in the last row, so having = heath condition[.N] == "not healthy" should do it.

And another simple case (filtering by size): http://stackoverflow.com/q/39085450/

And another, with an anti join:

ID <- c("A","A","A","B","B","C","D")
Value <- c(0,1,2,0,2,0,0)
df <- data.frame(ID,Value)

library(data.table)
setDT(df)

# use j = max() to get GForce speedup
df[ !df[, max(Value), by=ID][V1 > 0, .(ID, Value = 0)], on=.(ID, Value)]

# do the more standard thing, if j = if (...) x
df[ !df[, if (max(Value) > 0) .(Value = 0), by=ID], on=.(Value, ID) ]

# desired syntax
df[ !df[, .(Value = 0), by=ID, having = max(Value) > 0], on=.(Value, ID) ]

Not such a great example, though.

And another with an answer like dt[, if(uniqueN(time)==1L) .SD, by=name, .SDcols="time"]

And another: http://stackoverflow.com/q/43354165/

And another: http://stackoverflow.com/q/43613087/

Another (though it might get deleted): http://stackoverflow.com/q/43635968/

Another http://stackoverflow.com/a/43765352/

Another http://chat.stackoverflow.com/transcript/message/37148860#37148860

Another https://stackoverflow.com/questions/45464333/assign-a-binary-vector-based-on-blocks-of-data-within-another-vector/

Another https://stackoverflow.com/questions/32259620/how-to-remove-unique-entry-and-keep-duplicates-in-r/32259758#32259758

Un autre https://stackoverflow.com/q/45557011/

Haiyou https://stackoverflow.com/questions/45598397/filter-data-frame-matching-all-values-of-a-vector

Um mais https://stackoverflow.com/a/45721286/

lingwai yige https://stackoverflow.com/a/45820567/

and https://stackoverflow.com/q/46251221/

uno mas https://stackoverflow.com/questions/46307315/show-sequences-that-include-a-variable-in-r

tambem https://stackoverflow.com/q/46638058/

And another. I want to subset my data.table (myDT) to entries that aren't found in a reference table (idDT):

library(data.table)
idDT = data.table(id = 1:3, v = c("A","B","C"))
myDT = data.table(id = 3:4, z = c("gah","egad"))

# my attempt
idDT[myDT, on=.(id), .SD[.N == 0L], by=.EACHI]
# Empty data.table (0 rows) of 2 cols: id,v

# workaround
myDT[, .SD[idDT[.SD, on=.(id), .N == 0, by=.EACHI]$V1]]

# desired notation (with having=)
myDT[, .SD, by = id, having = idDT[.BY, on=.(id), .N]==0L]

This would be inefficient, though, since my desired notation entails each by= value making a separate join to idDT. In that sense, maybe it's not the best example.

mais um https://stackoverflow.com/questions/47765283/r-data-table-group-by-where/47765308?noredirect=1#comment82524998_47765308 could do DT[, if (any(status == "A") && !any(status == "B")) .SD, by=id] or with a having parameter DT[, .SD, by=id, having = any(status == "A") && !any(status == "B")]

and then https://stackoverflow.com/a/48669032/ m[, if(isTRUE(any(passed))) .SD, by=id] should be m[by = id, having = isTRUE(any(passed))]

mais um exemplo https://stackoverflow.com/q/49072250/

ein anderer https://stackoverflow.com/a/49211292/ stock_profile[, sum(Value), by=Pcode, having=any(Location=="A" & NoSales == "Y")][, sum(V1)]

mais um https://stackoverflow.com/a/49366998/

autre https://stackoverflow.com/a/49919015/

y https://stackoverflow.com/questions/50257643/deleting-rows-in-r-with-value-less-than-x

moar https://stackoverflow.com/q/54582048

e https://stackoverflow.com/q/56283005

keep groups if .N==k (also many at the dupe target) https://stackoverflow.com/questions/56794306/only-get-data-table-groups-with-a-given-number-of-rows

keep groups if any(diff(sorted_col)) <= threshold https://stackoverflow.com/q/57512417

keep if max(x) < threshold https://stackoverflow.com/a/57698641

ywhcuhk · 2016-05-28T04:52:14Z

@eantonya IMHO, adding the having parameter will actually make it easier to remember. Excessive conciseness can be hard to memorize. Moreover, making data.table more like SQL isn't a bad idea.

In data.table FAQ:

2.16 I’ve heard that data.table syntax is analogous to SQL.
Yes : ...

jangorecki · 2016-07-09T20:20:53Z

@ywhuofu data.table already accepts order function to i argument, which is what base R user would expect. The same way we translate sql ORDER to i = order(...) we can do with HAVING. It fits well, as i in data.frame is used for subsetting (having is just delayed subsetting after aggregation) or reordering.

ColeMiller1 · 2020-02-15T01:08:35Z

Could this be the API?

dt <- data.table(id   = rep(1:2, each = 2),
                 var  = c(0.2, 0.5, 1.5, 1.3))

dt[having.i(mean(var) > 1, by = id)]
  id var
1  2 1.5
2  2 1.3

I have implemented this version although it sets a restriction of only using gforce optimized functions as well as some functions that do not depend on groupings (e.g., +, |, &, etc.). In SQL, summary functions are what are used although I would understand if Cdogroups would want to be supported.

One additional note. It seems like it would be difficult to fit in dt[having(var > 3), .(var = mean(x)), by = .(grp)] within the current '[.data.table' code. There would need to be some checks to make sure the syntax were correct.

n = 1e6
grps = 1e5
head_n = 2L
dt = data.table::data.table(x = sample(grps, n, TRUE), y = runif(n))
 
# A tibble: 2 x 13
  expression                                                            min   median
  <bch:expr>                                                       <bch:tm> <bch:tm>
1 lw[having.i(.N < 2L | sum(y) > 11 | median(y) < 0.7, by = x)]     114.13ms 124.98ms
2 dt[dt[, .I[.N < 2L | sum(y) > 11 | median(y) < 0.7], by = x]$V1] 4000ms   4000ms


# A tibble: 2 x 13
  expression                           min  median `itr/sec` mem_alloc `gc/sec` n_itr
  <bch:expr>                       <bch:t> <bch:t>     <dbl> <bch:byt>    <dbl> <int>
1 lw[having.i(.N < 2L, by = x)]     30.2ms  35.3ms     27.9     8.02MB     3.99    14
2 dt[dt[, .I[.N < 2L], by = x]$V1] 106.1ms 110.4ms      8.81    6.13MB    10.6      5

MichaelChirico · 2020-02-15T01:20:12Z

I would prefer this as an added parameter, named either as having= or group_filter= (or something else that doesn't rely on SQL awareness to know what it does on sight).

e.g. I think it would be confusing to combine row filters in i with group-level filters also in i

ColeMiller1 · 2020-02-15T01:46:48Z

Would having = work on a subset of the data or would you only be able to use the i arg or the having arg? I also assume that having would occur prior to the j being evaluated. How would .BY and .GRP and soon .NGRP work with ```having = ````?

renkun-ken · 2020-02-15T02:05:54Z

There are not many syntactic choices:

Adding a new argument such as having
Leveraging existing arguments: i, j, by.

If row filter and group filter are both needed, dt[row_selector & group_selector, ...] does not look right therefore it seems that for such use case, row filter and group filter should not appear in the same argument. Then i is ruled out.

Then there won't be many syntactic choices.

Leveraging by could make it look confusing. For example,

dt[, .SD, by = having(.(id), mean(var > 1))]
dt[, .SD, by = id ~ mean(var) > 1]

Adding special function to j does not look good.

dt[, having(mean(var) > 1, .SD), by = id]

Now, the code I find look best to me is the most original version

dt[, if (mean(var) > 1) .SD, by = id]
dt[, if (mean(var) > 1) .(x = sum(x), y = sum(y)), by = id]

What I really want is keep the optimization done after the group filtering. Can we detect theif expression in j and optimize into it such as keeping GForce to work inside if statement?

franknarf1 · 2020-02-15T03:35:41Z

@renkun-ken Or overload another infix operator?

dt[, mean(var) > 1 ? .SD, by=id]

One advantage of a special symbol over if is that there's no chance the user put a matching else later.

renkun-ken · 2020-02-15T03:47:12Z

@franknarf1 It seems that while we are trying to detect if in j, we could also check if has else if and else. We could optimize if-only case and leave if-else un-optimized. Maybe later we could handle if-else case too. Personally I still prefer optimizing code to overriding or leveraging too much on existing operators.

jangorecki · 2020-02-15T03:48:53Z

@franknarf1 this is cool C syntax, although not sure if it wouldn't complicate to much here.
var > 1 ? d : e could work as well, isn't it?

renkun-ken · 2020-02-15T03:52:01Z

var > 1 ? d : e looks concise but only works for inline simple case since d and e could be anything like {...} and the operator precedence may be confusing. Are we only trying to allow .SD to perform pure group filtering, or any expression in j here?

Adding syntax has a problem that user needs to be aware that the syntax is specially handled and should not work inside j. For example, user may expect

dt[, mean(var) > 1 ? 0 : (sd(var) < 1 ? 1 : 0), by = id]

to work, and even

dt[, mean(var) > 1 ? 0 : 1]
dt[, mean(var) > 1 ? 0 : (sd(var) < 1 ? 1 : 0)]

to work in general.

renkun-ken · 2020-02-15T04:17:25Z

I'm a bit confused here.

Does

dt[, .SD, by = id, having = mean(var) > 1]

have any advantage over

dt[, if(mean(var) > 1) .SD, by = id]

since mean(var) > 1 will always be evaluated for each group. Does it only serve as a syntactic sugar or we are trying to optimize over this somehow to have higher performance?

franknarf1 · 2020-02-15T05:40:31Z

@jangorecki

@franknarf1 this is cool C syntax, although not sure if it wouldn't complicate to much here.
var > 1 ? d : e could work as well, isn't it?

Yeah, that would be cool. Operator precedence might get in the way without {}s as @renkun-ken pointed out (ex = quote(x & y ? a+b : v+w); str(rapply(as.list(ex), as.list, how="replace")))

I'm a bit confused here.

Does
dt[, .SD, by = id, having = mean(var) > 1]
have any advantage over
dt[, if(mean(var) > 1) .SD, by = id]
since mean(var) > 1 will always be evaluated for each group. Does it only serve as a syntactic sugar or we are trying to optimize over this somehow to have higher performance?

I guess until now I had preferred having= because I find it a little clearer to read and imagine it's easier to maintain as compared to adding further syntactical magic to j. On the other hand, I think I might instead prefer the j syntax magic, since

I am used to if () ... already; and like the ? way too if it's feasible.
If it is integrated in j, then no additional questions need to be answered about its behavior (eg, DT[, x := if (cond) y, by=id] creates NAs if the condition is met in some groups but not others and this behavior shouldn't need to be re-explained for having=).

Regarding optimization, it seems like there are a lot of examples where the having condition itself could benefit from some version of GForce, since it usually is an expression like max(x) > 0, max(x) == 0.

For my own use, besides the optimization, I guess it would be mostly useful for the return-only-groups case mentioned above #1269

> dt[, if (mean(var) > 1) .(), by=id] 
> # instead of ...
> dt[, mean(var) > 1, by=id][V1 == TRUE, !"V1"]
   id
1:  2

MichaelChirico · 2020-02-15T05:47:51Z

Nice points Frank. in addition to the huge compendium of use cases you've built (thanks again btw!). it may in fact be easier to do GForce in the having= version since we can just apply the gforce logic to having similar to j rather than trying to do NSE to accomplish the same. though that may interact w Jan's WIP to move a lot of j code to C -- any thoughts there Jan?

…

On Sat, Feb 15, 2020, 1:40 PM Frank ***@***.***> wrote: @jangorecki <https://github.com/jangorecki> @franknarf1 <https://github.com/franknarf1> this is cool C syntax, although not sure if it wouldn't complicate to much here. var > 1 ? d : e could work as well, isn't it? Yeah, that would be cool. Operator precedence might get in the way without {}s as @renkun-ken <https://github.com/renkun-ken> pointed out (ex = quote(x & y ? a+b : v+w); str(rapply(as.list(ex), as.list, how="replace")) ) I'm a bit confused here. Does dt[, .SD, by = id, having = mean(var) > 1] have any advantage over dt[, if(mean(var) > 1) .SD, by = id] since mean(var) > 1 will always be evaluated for each group. Does it only serve as a syntactic sugar or we are trying to optimize over this somehow to have higher performance? I guess until now I had preferred having= because I find it a little clearer to read and imagine it's easier to maintain as compared to adding further syntactical magic to j. On the other hand, I think I might instead prefer the j syntax magic, since - I am used to if () ... already; and like the ? way too if it's feasible. - If it is integrated in j, then no additional questions need to be answered about its behavior (eg, DT[, x := if (cond) y, by=id] creates NAs if the condition is met in some groups but not others and this behavior shouldn't need to be re-explained for having=). Regarding optimization, it seems like there are a lot of examples where the having condition itself could benefit from some version of GForce, since it usually is an expression like max(x) > 0, max(x) == 0. For my own use, besides the optimization, I guess it would be mostly useful for the return-only-groups case mentioned above #1269 <#1269> > dt[, if (mean(var) > 1) .(), by=id] > # instead of ... > dt[, mean(var) > 1, by=id][V1 == TRUE, !"V1"] id 1: 2 — You are receiving this because you commented. Reply to this email directly, view it on GitHub <#788?email_source=notifications&email_token=AB2BA5OCN4IW3N6QQJU6RJ3RC555BA5CNFSM4ATSQPMKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEL3CK7A#issuecomment-586556796>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AB2BA5MD7ZXWSRRHVEJM6C3RC555BANCNFSM4ATSQPMA> .

jangorecki · 2020-02-15T14:08:22Z

j code to be moved to C is the code that is responsible for column selection only, so guessing with argument. Won't interfere here.

ColeMiller1 · 2020-02-15T15:23:17Z

Since the FR is for add a 'having' parameter..., the word having should be somewhere in the solution. Optimization for ternary operators seems like a separate issue.

My preference for having.i() is because of the mantra of data.table: subset / order in i, select in j, group in by. having is just a special case of subsetting.

Regardless, if there is a new argument having, would the API support an i arg? Most use cases do not seem to need that requirement.

ColeMiller1 · 2020-05-02T14:57:54Z

What should be the behavior of ordering? That is, most of the current approaches automatically re-order:

library(data.table)

dt = data.table(grp = c(1L, 2L, 1L, 2L), x = letters[sample(4L)])
dt
#>      grp      x
#>    <int> <char>
#> 1:     1      a
#> 2:     2      b
#> 3:     1      c
#> 4:     2      d
dt[dt[, .I[.N > 0L], by = grp]$V1]
#>      grp      x
#>    <int> <char>
#> 1:     1      a
#> 2:     1      c
#> 3:     2      b
#> 4:     2      d

Should the having arg return a result that is re-ordered according to by?

franknarf1 · 2020-05-02T23:50:14Z

What should be the behavior of ordering?
Should the having arg return a result that is re-ordered according to by?

@ColeMiller1 Fwiw, I would expect having= to only appear when by= also appears, so the results would be grouped as in your example with ...$V1.

MichaelChirico · 2020-05-03T01:24:27Z

Yes I would expect the ordering to be consistent:

DT[i, j, by, having]
# < == >
DT[i, if (having) j, by]

jangorecki · 2020-05-03T06:14:54Z

I think there was no agreement on API, particularly on having new having argument in [. @mattdowle wdyt?
Current approach of using DT[, if (.N > 1L) .SD, col1] is good one, not really complicated, easy to extend, but little bit more difficult to optimise.
My idea was to use having as function call in i: DT[having(N > 1L), .N, col1], but then it is not possible to provide normal subsetting to i.
Alternatively new arg could be a sub-argument of by, haven't much thought about it but something like DT[, .N, by=.(col1, .having = N > 1L)], so the extra grouping-related argument is encapsulated into by argument. This is the proper way to scale up number of arguments.

arunsrinivasan added the feature request label Aug 29, 2014

This was referenced Jul 22, 2015

Feature request: return a vector of keys instead of a data.table when conditioning in j #1242

Closed

Why doesn't DT[v1==max(v1), , by=v2] work? #1246

Closed

franknarf1 mentioned this issue Dec 1, 2015

[Request] Filter by group criteria #1456

Closed

franknarf1 mentioned this issue Jan 13, 2016

.I in DT[i, .I, by] should return row numbers of DT not of DT[i] #1494

Open

DavidArenburg mentioned this issue Jun 19, 2019

Generate data tables lazily tidyverse/dtplyr#38

Closed

MichaelChirico mentioned this issue Jul 5, 2019

Two features for data.table one liners #3681

Closed

MichaelChirico mentioned this issue Feb 15, 2020

Master list of most-requested issues #3189

Open

75 tasks

ColeMiller1 mentioned this issue May 2, 2020

Having arg #4412

Draft

7 tasks

MichaelChirico added the High label May 30, 2020

jangorecki removed the High label Jun 3, 2020

MichaelChirico mentioned this issue Apr 7, 2024

Return all duplicated rows #924

Closed

MichaelChirico added the top request One of our most-requested issues label Apr 14, 2024

MichaelChirico mentioned this issue Aug 25, 2024

filter by group #6397

Closed

MichaelChirico mentioned this issue Dec 3, 2024

Topn doing order |> head/tail #5167

Open

4 tasks

MichaelChirico mentioned this issue Dec 12, 2024

New function to evaluate any in i grouped by keyby or by #6661

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

add a 'having' parameter to `[.data.table` #788

add a 'having' parameter to `[.data.table` #788

nigmastar commented Aug 29, 2014

arunsrinivasan commented Aug 29, 2014

nigmastar commented Sep 1, 2014

eantonya commented Sep 2, 2014

nigmastar commented Sep 2, 2014

eantonya commented Sep 2, 2014

jangorecki commented Jan 17, 2015

franknarf1 commented Sep 30, 2015

franknarf1 commented Mar 29, 2016 •

edited

Loading

franknarf1 commented May 27, 2016 •

edited

Loading

ywhcuhk commented May 28, 2016

jangorecki commented Jul 9, 2016 •

edited

Loading

ColeMiller1 commented Feb 15, 2020

MichaelChirico commented Feb 15, 2020

ColeMiller1 commented Feb 15, 2020

renkun-ken commented Feb 15, 2020 •

edited

Loading

franknarf1 commented Feb 15, 2020

renkun-ken commented Feb 15, 2020 •

edited

Loading

jangorecki commented Feb 15, 2020 •

edited

Loading

renkun-ken commented Feb 15, 2020 •

edited

Loading

renkun-ken commented Feb 15, 2020

franknarf1 commented Feb 15, 2020

MichaelChirico commented Feb 15, 2020 via email

jangorecki commented Feb 15, 2020

ColeMiller1 commented Feb 15, 2020

ColeMiller1 commented May 2, 2020

franknarf1 commented May 2, 2020

MichaelChirico commented May 3, 2020

jangorecki commented May 3, 2020

add a 'having' parameter to [.data.table #788

add a 'having' parameter to [.data.table #788

Comments

nigmastar commented Aug 29, 2014

arunsrinivasan commented Aug 29, 2014

nigmastar commented Sep 1, 2014

eantonya commented Sep 2, 2014

nigmastar commented Sep 2, 2014

eantonya commented Sep 2, 2014

jangorecki commented Jan 17, 2015

franknarf1 commented Sep 30, 2015

franknarf1 commented Mar 29, 2016 • edited Loading

franknarf1 commented May 27, 2016 • edited Loading

ywhcuhk commented May 28, 2016

jangorecki commented Jul 9, 2016 • edited Loading

ColeMiller1 commented Feb 15, 2020

MichaelChirico commented Feb 15, 2020

ColeMiller1 commented Feb 15, 2020

renkun-ken commented Feb 15, 2020 • edited Loading

franknarf1 commented Feb 15, 2020

renkun-ken commented Feb 15, 2020 • edited Loading

jangorecki commented Feb 15, 2020 • edited Loading

renkun-ken commented Feb 15, 2020 • edited Loading

renkun-ken commented Feb 15, 2020

franknarf1 commented Feb 15, 2020

MichaelChirico commented Feb 15, 2020 via email

jangorecki commented Feb 15, 2020

ColeMiller1 commented Feb 15, 2020

ColeMiller1 commented May 2, 2020

franknarf1 commented May 2, 2020

MichaelChirico commented May 3, 2020

jangorecki commented May 3, 2020

add a 'having' parameter to `[.data.table` #788

add a 'having' parameter to `[.data.table` #788

franknarf1 commented Mar 29, 2016 •

edited

Loading

franknarf1 commented May 27, 2016 •

edited

Loading

jangorecki commented Jul 9, 2016 •

edited

Loading

renkun-ken commented Feb 15, 2020 •

edited

Loading

renkun-ken commented Feb 15, 2020 •

edited

Loading

jangorecki commented Feb 15, 2020 •

edited

Loading

renkun-ken commented Feb 15, 2020 •

edited

Loading