Feature Request: pass column to key #1474

Closed
ptoche opened this issue Dec 20, 2015 · 5 comments

@ptoche

ptoche commented Dec 20, 2015

# make a data.table
library(data.table)
set.seed(1)
dt <- data.table(id = 1:5, x1 = 1:5, x2 = 5:1, x3 = round(runif(5, 1, 5), 0), key = "id")

I can define the data.table with either id = 1:10 or "id" = 1:10, but I must define the key with key = "id" as key = id does not work:

# Feature Request:  ``key = id``  
dt <- data.table(id = 1:5, x1 = 1:5, x2 = 5:1, x3 = round(runif(5, 1, 5), 0), key = id)
##Error in data.table(id = 1:5, x1 = 1:5, x2 = 5:1, x3 = round(runif(5,  : 
##  object 'id' not found
@jangorecki
Member

Any proposal for how to handle the following key definition?

library(data.table)
k = "b"
dt = data.table(b = "a", k = "b", key = k)

Just put your expected results.
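
To make the ambiguity concrete, here is a minimal sketch of what the current character-only interface does with that call. Because key is evaluated like any ordinary argument, the variable k from the calling environment is looked up first, and the column that happens to be named k is never considered:

```r
library(data.table)

# A variable in the calling environment that shares its name with a column
k <- "b"

dt <- data.table(b = "a", k = "b", key = k)

# key = k is evaluated to the string "b", so column b becomes the key
key(dt)
```

If key = k were instead captured unevaluated, the same call would key on column k, which is the conflict any proposal would need to resolve.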

@MichaelChirico
Member

@jangorecki this is in relation to a SO question about consistency as it seems some data.table functions (e.g., subset.data.table) accept expressions while others don't.

The relevant code of data.table is here:

if (!is.null(key)) {
        if (!is.character(key)) 
            stop("key argument of data.table() must be character")
        if (length(key) == 1L) {
            key = strsplit(key, split = ",")[[1L]]
        }
        setkeyv(value, key)
    }

This snippet comes after the assignment of the columns; seems reasonable for there to be a way to switch setkeyv to setkey, perhaps with a with option.

This is where things get dicey, though, because leaving with = TRUE as the default (as in [.data.table]) would break existing code.
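
For reference, a short sketch of how the two key-setting functions mentioned above take their column names, plus the comma-separated shorthand handled by the strsplit() call in the snippet. The tables and column names here are made up for illustration:

```r
library(data.table)

dt <- data.table(id = 5:1, x = letters[1:5])

# setkey() takes unquoted column names
setkey(dt, id)
key(dt)

# setkeyv() takes a character vector, which can come from a variable
cols <- "x"
setkeyv(dt, cols)
key(dt)

# data.table(key = ) splits a single comma-separated string into column names
dt2 <- data.table(a = c(2, 1), b = c("y", "x"), key = "a,b")
key(dt2)
```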

Another alternative would be to interpret lists as expressions (as in by/keyby) by default, so that something data.table-y like data.table(b = "a", k = "b", key = .(k)) would set the key to k while data.table(b = "a", k = "b", key = k) would obey the current behavior (i.e. set the key to b).
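
A rough sketch of how that .() idea could be prototyped outside the package. The wrapper data_table_nse is hypothetical and not part of data.table; it only assumes that a key written as a call to .() should be read as literal column names, while anything else is evaluated as it is today:

```r
library(data.table)

# Hypothetical wrapper, NOT part of data.table
data_table_nse <- function(..., key = NULL) {
  # Capture the key argument without evaluating it
  key_expr <- substitute(key)
  if (is.call(key_expr) && identical(key_expr[[1L]], as.name("."))) {
    # key = .(a, b): turn the unevaluated names into c("a", "b")
    key <- vapply(as.list(key_expr)[-1L], deparse, character(1L))
  }
  # Otherwise key is evaluated normally and must be character, as before
  data.table(..., key = key)
}

k <- "b"
d1 <- data_table_nse(b = "a", k = "b", key = .(k))  # keyed on column k
d2 <- data_table_nse(b = "a", k = "b", key = k)     # keyed on column b
key(d1)
key(d2)
```

This keeps existing calls working, at the cost of giving .() a meaning in key that differs from its meaning in by.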

@franknarf1
Contributor

I can define the data.table with either id = 1:10 or "id" = 1:10

This is a basic feature of R and not specific to data.table:

c(a   = 1)   # ok
c("a" = 1)   # ok
c(a   = "b") # ok
c(a   = b)   # nope

something data.table-y like data.table(b = "a", k = "b", key = .(k)) would set the key to k while data.table(b = "a", k = "b", key = k) would obey the current behavior (i.e. set the key to b).

Yeah, I was thinking along those lines too. One problem with that is that it is inconsistent with how .() works in other places, like by=k vs by=.(k), which do the same thing, with the former just being shorthand.

On the other hand, by="k,x", by=.(k,x) and by=c("k","x") all work, so I'm sure it can be done to match the OP's expectations. I just dunno how much work it is... There are other places where such syntax (raw names without quotes) would also be nice, like in .SDcols= (#642). And there's also syntax like .SDcols=V1:V3 (which currently works, even though .SDcols=V1 does not) that might be nice to have in many places. I guess there's an almost endless array of nonstandard-evaluation shortcuts one might ask for...
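
As a quick check of the by= forms listed above, a small sketch on a made-up table showing that the three spellings group the same way:

```r
library(data.table)

dt <- data.table(k = c("a", "a", "b"), x = c(1, 1, 2), v = 1:3)

r1 <- dt[, sum(v), by = "k,x"]         # single comma-separated string
r2 <- dt[, sum(v), by = .(k, x)]       # list of unquoted column names
r3 <- dt[, sum(v), by = c("k", "x")]   # character vector

all.equal(r1, r2)
all.equal(r1, r3)
```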

@MichaelChirico
Member

@franknarf1 Good point about by=k vs. by=.(k); what comes to mind there is the use of character vectors on the LHS of :=. So perhaps key=k should default to setting column k as the key, while key=(k) sets b?
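
For context, the := convention being referred to looks like this; the table and the variable col are made up for illustration. A bare name on the left-hand side is taken literally, while wrapping it in parentheses forces evaluation of the variable:

```r
library(data.table)

dt <- data.table(x = 1:3)
col <- "y"

dt[, col := x * 2L]     # bare name: creates a column literally called "col"
dt[, (col) := x * 3L]   # parentheses: evaluates col, so creates column "y"

names(dt)
```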

@jangorecki
Member

Closing this as won't fix. It doesn't seem to bring anything more useful than two fewer quote characters to type, and it introduces the complexity of capturing the key argument unevaluated. Moreover, the problem with such an interface that I described almost 5 years ago has not been addressed.
