patterns in .SDcols #1878

eantonya · 2016-10-14T22:18:16Z

I thought I'd seen this FR before, but couldn't find it.

Would be nice if we could specify column names using regular expressions in .SDcols. Currently one has to do something like .SDcols = grep("mypattern", names(myDT)), which doesn't chain (an intermediate result has no name to pass to names()) and is pretty fragile.

Perhaps the patterns function from melt can be reused here, making the syntax .SDcols = patterns("mypattern").
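
For concreteness, a minimal sketch of the workaround described above. The table `myDT` and its column names are made up for illustration only:

```r
library(data.table)

# Hypothetical example table; the column names are illustrative assumptions.
myDT <- data.table(id = 1:3, lag1 = c(1, 2, 3), lag2 = c(4, 5, 6), other = 7:9)

# Workaround: compute the matching column positions with grep() against
# names(myDT), then pass those positions to .SDcols.
res <- myDT[, lapply(.SD, sum), .SDcols = grep("^lag", names(myDT))]
print(res)
```

Note that the table's name has to be repeated inside the brackets, which is what makes this awkward once the table is an unnamed intermediate result.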


ksavin · 2016-10-26T16:37:11Z

I'd like to add that patterns would be super useful in j as well.

One often needs to select columns with grep, and the only way is to refer to them via names(), e.g.
veryLongDataTableName[, grep('lag', names(veryLongDataTableName)), with = FALSE]

or to remove multiple columns, e.g.
dt[, (grep('lag', names(dt))) := NULL]

or to make new column names:
dt[, paste0(grep('^test', names(dt), value = TRUE), '_sqrt') := lapply(.SD, sqrt), .SDcols = grep('^test', names(dt), value = TRUE)]

These would be much shorter with patterns available in j as well:
veryLongDataTableName[, patterns('lag'), with = FALSE]
dt[, patterns('lag') := NULL]
dt[, paste0(patterns('^test'), '_sqrt') := lapply(.SD, sqrt), .SDcols = patterns('^test')]

Alternatively, it would be handy to have a special symbol for the column names selected in .SDcols, e.g. .NM

MichaelChirico · 2016-10-26T16:39:22Z

names(.SD) should suffice...

ksavin · 2016-10-26T16:49:57Z

Forgot I am in fact using names(.SD) for these cases :)
Still, way cleaner and shorter with patterns.

MichaelChirico · 2016-10-26T19:20:51Z

I regularly do things like grep(pattern, names(.SD)) (and maybe add value = TRUE)... Maybe I'm just used to this setup.

mbacou · 2016-12-08T05:43:16Z

Another one I tend to use is .SDcols=names(.SD) %like% "mypattern", a little ugly. Upvoting this FR as well.
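
A rough sketch of the `%like%` variant, under some assumptions: the table `dt` is made up, the names are taken from `names(dt)` rather than `names(.SD)`, and the logical vector is wrapped in `which()` so that .SDcols receives plain column positions:

```r
library(data.table)

# Hypothetical example table; the column names are illustrative assumptions.
dt <- data.table(id = 1:3, lag1 = c(1, 2, 3), lag2 = c(4, 5, 6))

# %like% is data.table's infix wrapper around grepl(): it returns one
# logical value per column name.
is_lag <- names(dt) %like% "^lag"

# Convert the logical vector to integer positions before passing to .SDcols.
res <- dt[, lapply(.SD, mean), .SDcols = which(is_lag)]
print(res)
```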

hannes101 · 2017-03-24T13:26:02Z

For reference, here is a related SO question; please update it there as well if this gets implemented :-)
https://stackoverflow.com/questions/42999949/select-data-table-columns-with-grep-like-partial-matching

franknarf1 · 2017-11-28T16:10:11Z

SO q to update: https://stackoverflow.com/questions/47535845/perform-operations-on-data-table-columns-based-on-regex?noredirect=1

A to update: https://stackoverflow.com/a/51331981/

HughParsonage · 2018-05-18T15:11:47Z

I wrote select_grep in package hutils before realizing this was an outstanding issue:

library(hutils)
library(data.table)
dt <- data.table(x1 = 1, x2 = 2, y = 0)
select_grep(dt, "x")
#>    x1 x2
#> 1:  1  2
select_grep(dt, "x", .and = "y")
#>    x1 x2 y
#> 1:  1  2 0
select_grep(dt, "x", .and = "y", .but.not = "x2")
#>    x1 y
#> 1:  1 0

Created on 2018-05-19 by the reprex package (v0.2.0).

eantonya added the feature request label Oct 14, 2016

MichaelChirico mentioned this issue Dec 5, 2018

.SDcols gives a strange message if length() 0 #3185

Closed

MichaelChirico pushed a commit that referenced this issue Dec 5, 2018

Closes #1878 and #3185 -- .SDcols accepts patterns

b24f490

This was referenced Dec 5, 2018

RFC: .SDcols=patterns() #3186

Merged

Master list of most-requested issues #3189

Open

mattdowle added this to the 1.12.0 milestone Dec 14, 2018

mattdowle closed this as completed in #3186 Dec 14, 2018
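
With the change referenced above (milestone 1.12.0), the syntax requested at the top of the thread should look roughly like the following. This is a minimal sketch, assuming data.table 1.12.0 or later; the table `dt` and its column names are made up for illustration:

```r
library(data.table)

# Hypothetical example table; the column names are illustrative assumptions.
dt <- data.table(id = 1:3, test_a = c(1, 4, 9), test_b = c(16, 25, 36), other = 7:9)

# .SDcols takes patterns() directly, so the table name is not repeated.
res <- dt[, lapply(.SD, sqrt), .SDcols = patterns("^test")]

# The older grep() form, kept here only to compare the two selections.
old <- dt[, lapply(.SD, sqrt), .SDcols = grep("^test", names(dt))]

print(res)
print(all.equal(res, old))
```

Since the pattern is matched against the names of the table being subset, this form also works on an unnamed intermediate result in a chain.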
