forderv handles complex input #3701

Merged: 33 commits, Jul 19, 2019
Commits
73a63ca
Closes #1444 -- setkey works on tables with complex columns
Jul 10, 2019
bdce4a5
extra test for group operation mentioned in issue
Jul 10, 2019
e6533dc
new tests for coverage
Jul 11, 2019
c210ede
missing arg
Jul 11, 2019
55bd501
Closes #1703 -- forderv handles complex input
Jul 12, 2019
2e8c996
slight re-tooling, now passing tests
Jul 13, 2019
e3a6aa6
more progress; but stonewalled by bmerge
Jul 13, 2019
37e8431
Merge branch 'master' of https://github.com/Rdatatable/data.table int…
Jul 13, 2019
52bcac0
moved new logic to C so e.g. bmerge can call it from there
Jul 13, 2019
bf59da2
some coverage tests, extension to rleid()
Jul 13, 2019
09c8b19
progress making ctwiddle (dtwiddle for cplx)
Jul 13, 2019
e5d1d1d
start preferring Rcomplex type
Jul 13, 2019
9a4cef4
switch to Rcomplex API
Jul 13, 2019
494468f
Merge branch 'master' of https://github.com/Rdatatable/data.table int…
Jul 13, 2019
e0b17b8
Merge branch 'cplx_setkey' into cplx_forder
Jul 14, 2019
761fe90
setkey now works on complex columns
Jul 14, 2019
7261011
ostensibly done uniqlist; progress on bmerge
Jul 14, 2019
25cdf0d
Merge branch 'master' into cplx_forder
Jul 17, 2019
681ab5d
Merge branch 'master' into cplx_forder
mattdowle Jul 17, 2019
81a941a
Merge branch 'master' into cplx_forder
mattdowle Jul 18, 2019
5f443cb
scale back attempts at bmerge, all of uniqlist
Jul 18, 2019
b922ee9
tidy up tests
Jul 18, 2019
b656541
unique also works
Jul 18, 2019
3c3228b
updated NEWS item & added coverage tests
Jul 18, 2019
009935c
one more nocov
Jul 18, 2019
f59dc57
actually hit LGLSXP branch!
Jul 18, 2019
0338226
more coverage
Jul 18, 2019
523ab00
Merge branch 'master' into cplx_forder
Jul 18, 2019
598097b
use direct double comparison instead of type punning
Jul 19, 2019
2df22a3
replaced big block with smaller modification to the main loop; also h…
mattdowle Jul 19, 2019
dd7b24a
memcmp for complex instead of == on double
mattdowle Jul 19, 2019
5e81694
merge master
mattdowle Jul 19, 2019
a9d9165
news item tidy
mattdowle Jul 19, 2019
10 changes: 3 additions & 7 deletions NEWS.md
@@ -96,7 +96,7 @@

16. `as.data.table` now unpacks columns in a `data.frame` which are themselves a `data.frame`. This need arises when parsing JSON, a corollary in [#3369](https://github.com/Rdatatable/data.table/issues/3369#issuecomment-462662752). `data.table` does not allow columns to be objects which themselves have columns (such as `matrix` and `data.frame`), unlike `data.frame` which does. Bug fix 19 in v1.12.2 (see below) added a helpful error (rather than segfault) to detect such invalid `data.table`, and promised that `as.data.table()` would unpack these columns in the next release (i.e. this release) so that the invalid `data.table` is not created in the first place.

17. `CJ` has been ported to C and parallelized, thanks to a PR by Michael Chirico, [#3596](https://github.com/Rdatatable/data.table/pull/3596). All types benefit (including newly supported complex, part of [#3690](https://github.com/Rdatatable/data.table/issues/3690)), and as in many `data.table` operations, factors benefit more than character.
17. `CJ` has been ported to C and parallelized, thanks to a PR by Michael Chirico, [#3596](https://github.com/Rdatatable/data.table/pull/3596). All types benefit, but, as in many `data.table` operations, factors benefit more than character.

```R
# default 4 threads on a laptop with 16GB RAM and 8 logical CPU
@@ -114,7 +114,7 @@
# 0.357 0.763 0.292 # now
```

18. New function `coalesce(...)` has been written in C, and is multithreaded for numeric, complex, and factor types. It replaces missing values according to a prioritized list of candidates (as per SQL COALESCE, `dplyr::coalesce`, and `hutils::coalesce`), [#3424](https://github.com/Rdatatable/data.table/issues/3424). It accepts any number of vectors in several forms. For example, given three vectors `x`, `y`, and `z`, where each `NA` in `x` is to be replaced by the corresponding value in `y` if that is non-NA, else the corresponding value in `z`, the following equivalent forms are all accepted: `coalesce(x,y,z)`, `coalesce(x,list(y,z))`, and `coalesce(list(x,y,z))`.
18. New function `coalesce(...)` has been written in C, and is multithreaded for `numeric` and `factor`. It replaces missing values according to a prioritized list of candidates (as per SQL COALESCE, `dplyr::coalesce`, and `hutils::coalesce`), [#3424](https://github.com/Rdatatable/data.table/issues/3424). It accepts any number of vectors in several forms. For example, given three vectors `x`, `y`, and `z`, where each `NA` in `x` is to be replaced by the corresponding value in `y` if that is non-NA, else the corresponding value in `z`, the following equivalent forms are all accepted: `coalesce(x,y,z)`, `coalesce(x,list(y,z))`, and `coalesce(list(x,y,z))`.

```R
# default 4 threads on a laptop with 16GB RAM and 8 logical CPU
@@ -131,9 +131,7 @@
# TRUE
```

19. `shift` now supports type `complex`, part of [#3690](https://github.com/Rdatatable/data.table/issues/3690).

20. `setkey` now supports type `complex` as value columns (not as key columns), [#1444](https://github.com/Rdatatable/data.table/issues/1444). Thanks Gareth Ward for the report.
19. Type `complex` is now supported by `setkey`, `setorder`, `:=`, `by=`, `keyby=`, `shift`, `dcast`, `frank`, `rowid`, `rleid`, `CJ`, `coalesce`, `unique`, and `uniqueN`, [#3690](https://github.com/Rdatatable/data.table/issues/3690). Thanks to Gareth Ward and Elio Campitelli for their reports and input. Sorting `complex` is achieved the same way as base R; i.e., first by the real part then by the imaginary part (as if the `complex` column were two separate columns of `double`). There is no plan to support joining/merging on `complex` columns until a user demonstrates a need for that.
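
As a rough illustration of the ordering rule described in item 19 (this sketch is not part of the PR), the base R behaviour that the item says `data.table` follows can be checked with base functions alone. The vector `z` is borrowed from column `b` of the new tests; the names `ord_complex` and `ord_parts` are illustrative only.

```R
# Base R orders complex values by real part first, then by imaginary part.
z <- c(3+9i, 10+5i, 8+2i, 10+4i, 3+3i, 1+2i, 5+1i, 8+1i, 8+2i, 10+6i)

ord_complex <- order(z)             # order the complex vector directly
ord_parts   <- order(Re(z), Im(z))  # order as if it were two double columns

# The two orderings should agree, ties included.
identical(ord_complex, ord_parts)

# Sorted values: real parts ascending, imaginary parts break ties.
z[ord_complex]
```

On a `data.table` build that includes this change, `setorder(DT, z)` or `DT[order(z)]` on a complex column `z` would be expected to produce the same row order, as tests 2069.01 to 2069.06 assert against `base::order`.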

#### BUG FIXES

@@ -198,8 +196,6 @@

24. `column not found` could incorrectly occur in rare non-equi-join cases, [#3635](https://github.com/Rdatatable/data.table/issues/3635). Thanks to @UweBlock for the report.

25. Complex columns used in `j` during grouping would get mangled, [#3639](https://github.com/Rdatatable/data.table/issues/3639). A related bug prevented assigning complex values using `:=` except for full-column plonks. We still do not support grouping `by` a complex column. Thanks to @eliocamp for filing the bug report.

#### NOTES

1. `rbindlist`'s `use.names="check"` now emits its message for automatic column names (`"V[0-9]+"`) too, [#3484](https://github.com/Rdatatable/data.table/pull/3484). See news item 5 of v1.12.2 below.
2 changes: 1 addition & 1 deletion R/bmerge.R
@@ -14,7 +14,7 @@ bmerge = function(i, x, icols, xcols, roll, rollends, nomatch, mult, ops, verbos
# careful to only plonk syntax (full column) on i/x from now on otherwise user's i and x would change;
# this is why shallow() is very importantly internal only, currently.

supported = c("logical", "integer", "double", "character", "factor", "integer64")
supported = c(ORDERING_TYPES, "factor", "integer64")

getClass = function(x) {
ans = typeof(x)
2 changes: 1 addition & 1 deletion R/data.table.R
@@ -829,7 +829,7 @@ replace_order = function(isub, verbose, env) {
if (!is.list(byval)) stop("'by' or 'keyby' must evaluate to a vector or a list of vectors (where 'list' includes data.table and data.frame which are lists, too)")
if (length(byval)==1L && is.null(byval[[1L]])) bynull=TRUE #3530 when by=(function()NULL)()
if (!bynull) for (jj in seq_len(length(byval))) {
if (!typeof(byval[[jj]]) %chin% c("integer","logical","character","double")) stop("column or expression ",jj," of 'by' or 'keyby' is type ",typeof(byval[[jj]]),". Do not quote column names. Usage: DT[,sum(colC),by=list(colA,month(colB))]")
if (!typeof(byval[[jj]]) %chin% ORDERING_TYPES) stop("column or expression ",jj," of 'by' or 'keyby' is type ",typeof(byval[[jj]]),". Do not quote column names. Usage: DT[,sum(colC),by=list(colA,month(colB))]")
}
tt = vapply_1i(byval,length)
if (any(tt!=xnrow)) stop("The items in the 'by' or 'keyby' list are length (",paste(tt,collapse=","),"). Each must be length ", xnrow, "; the same length as there are rows in x (after subsetting if i is provided).")
18 changes: 7 additions & 11 deletions R/setkey.R
@@ -51,14 +51,9 @@ setkeyv = function(x, cols, verbose=getOption("datatable.verbose"), physical=TRU
}
if (identical(cols,"")) stop("cols is the empty string. Use NULL to remove the key.")
if (!all(nzchar(cols))) stop("cols contains some blanks.")
if (!length(cols)) {
Review comment from the PR author: this `if` is redundant and contradictory. Just a few lines earlier (L47), `if (!length(cols))` already raises a warning. I guess this branch was a leftover or copy/paste from setkey... revealed by Codecov.

cols = colnames(x) # All columns in the data.table, usually a few when used in this form
} else {
# remove backticks from cols
cols = gsub("`", "", cols, fixed = TRUE)
miss = !(cols %chin% colnames(x))
if (any(miss)) stop("some columns are not in the data.table: ", paste(cols[miss], collapse=","))
}
cols = gsub("`", "", cols, fixed = TRUE)
miss = !(cols %chin% colnames(x))
if (any(miss)) stop("some columns are not in the data.table: ", paste(cols[miss], collapse=","))

## determine, whether key is already present:
if (identical(key(x),cols)) {
@@ -83,7 +78,7 @@ setkeyv = function(x, cols, verbose=getOption("datatable.verbose"), physical=TRU
if (".xi" %chin% names(x)) stop("x contains a column called '.xi'. Conflicts with internal use by data.table.")
for (i in cols) {
.xi = x[[i]] # [[ is copy on write, otherwise checking type would be copying each column
if (!typeof(.xi) %chin% c("integer","logical","character","double")) stop("Column '",i,"' is type '",typeof(.xi),"' which is not supported as a key column type, currently.")
if (!typeof(.xi) %chin% ORDERING_TYPES) stop("Column '",i,"' is type '",typeof(.xi),"' which is not supported as a key column type, currently.")
}
if (!is.character(cols) || length(cols)<1L) stop("Internal error. 'cols' should be character at this point in setkey; please report.") # nocov

@@ -178,6 +173,7 @@ is.sorted = function(x, by=seq_along(x)) {
# Important to call forder.c::fsorted here, for consistent character ordering and numeric/integer64 twiddling.
}

ORDERING_TYPES = c('logical', 'integer', 'double', 'complex', 'character')
forderv = function(x, by=seq_along(x), retGrp=FALSE, sort=TRUE, order=1L, na.last=FALSE)
{
if (!(sort || retGrp)) stop("At least one of retGrp or sort must be TRUE")
@@ -205,7 +201,7 @@ forderv = function(x, by=seq_along(x), retGrp=FALSE, sort=TRUE, order=1L, na.las
stop("'by' is type 'double' and one or more items in it are not whole integers")
}
by = as.integer(by)
if ( (length(order) != 1L && length(order) != length(by)) || any(!order %in% c(1L, -1L)) )
if ( (length(order) != 1L && length(order) != length(by)) || !all(order %in% c(1L, -1L)) )
stop("x is a list, length(order) must be either =1 or =length(by) and each value should be 1 or -1 for each column in 'by', corresponding to ascending or descending order, respectively. If length(order) == 1, it will be recycled to length(by).")
if (length(order) == 1L) order = rep(order, length(by))
}
@@ -327,7 +323,7 @@ setorderv = function(x, cols = colnames(x), order=1L, na.last=FALSE)
if (".xi" %chin% colnames(x)) stop("x contains a column called '.xi'. Conflicts with internal use by data.table.")
for (i in cols) {
.xi = x[[i]] # [[ is copy on write, otherwise checking type would be copying each column
if (!typeof(.xi) %chin% c("integer","logical","character","double")) stop("Column '",i,"' is type '",typeof(.xi),"' which is not supported for ordering currently.")
if (!typeof(.xi) %chin% ORDERING_TYPES) stop("Column '",i,"' is type '",typeof(.xi),"' which is not supported for ordering currently.")
}
if (!is.character(cols) || length(cols)<1L) stop("Internal error. 'cols' should be character at this point in setkey; please report.") # nocov

96 changes: 81 additions & 15 deletions inst/tests/tests.Rraw
@@ -6460,7 +6460,7 @@ test(1464.03, rleidv(DT, "b"), c(1L, 1L, 2L, 2L, 3L, 3L, 3L, 4L, 5L, 5L))
test(1464.04, rleid(DT$b), c(1L, 1L, 2L, 2L, 3L, 3L, 3L, 4L, 5L, 5L))
test(1464.05, rleidv(DT, "c"), c(1L, 1L, 2L, 2L, 3L, 3L, 3L, 4L, 5L, 5L))
test(1464.06, rleid(DT$c), c(1L, 1L, 2L, 2L, 3L, 3L, 3L, 4L, 5L, 5L))
test(1464.07, rleid(as.complex(c(1,0+5i,0+5i,1))), error="Type 'complex' not supported")
test(1464.07, rleid(as.raw(c(3L, 1L, 2L))), error="Type 'raw' not supported")
test(1464.08, rleidv(DT, 0), error="outside range")
test(1464.09, rleidv(DT, 5), error="outside range")
test(1464.10, rleidv(DT, 1:4), 1:nrow(DT))
@@ -11713,11 +11713,11 @@ test(1844.2, forder(DT,V1,V2,na.last=NA), INT(2,1,3,0,4)) # prior to v1.12.0 th
# now with two NAs in that 2-group covers forder.c:forder line 1269 starting: else if (nalast == 0 && tmp==-2) {
DT = data.table(c("a","a","a","b","b"),c(2,1,3,NA,NA))
test(1844.3, forder(DT,V1,V2,na.last=NA), INT(2,1,3,0,0))
DT = data.table((0+0i)^(-3:3), 7:1)
test(1844.4, forder(DT,V1,V2), error="Column 1 of by= (1) is type 'complex', not yet supported")
test(1844.5, forder(DT,V2,V1), error="Column 2 of by= (2) is type 'complex', not yet supported")
DT = data.table((0+0i)^(-3:3), c(5L,5L,1L,2L,2L,2L,2L))
test(1844.6, forder(DT,V2,V1), error="Column 2 of by= (2) is type 'complex', not yet supported")
DT = data.table(as.raw(0:6), 7:1)
test(1844.4, forder(DT,V1,V2), error="Column 1 of by= (1) is type 'raw', not yet supported")
test(1844.5, forder(DT,V2,V1), error="Column 2 of by= (2) is type 'raw', not yet supported")
DT = data.table(as.raw(0:6), c(5L,5L,1L,2L,2L,2L,2L))
test(1844.6, forder(DT,V2,V1), error="Column 2 of by= (2) is type 'raw', not yet supported")

# fix for non-equi joins issue #1991. Thanks to Henrik for the nice minimal example.
d1 <- data.table(x = c(rep(c("b", "a", "c"), each = 3), c("a", "b")), y = c(rep(c(1, 3, 6), 3), 6, 6), id = 1:11)
@@ -13170,9 +13170,9 @@ setnames(DT, '.xi')
setkey(DT, NULL)
test(1962.037, setkey(DT, .xi),
error = "x contains a column called '.xi'")
DT = data.table(a = 1+3i)
DT = data.table(a = as.raw(0))
test(1962.038, setkey(DT, a),
error = "Column 'a' is type 'complex'")
error = "Column 'a' is type 'raw'")

test(1962.039, is.sorted(3:1, by = 'x'),
error = 'x is vector but')
@@ -13228,8 +13228,8 @@ test(1962.064, setorderv(copy(DT)),
test(1962.065, setorderv(DT, 'c'), error = 'some columns are not in the data.table')
setnames(DT, 1L, '.xi')
test(1962.066, setorderv(DT, 'b'), error = "x contains a column called '.xi'")
test(1962.067, setorderv(data.table(a = 1+3i), 'a'),
error = "Column 'a' is type 'complex'")
test(1962.067, setorderv(data.table(a = as.raw(0)), 'a'),
error = "Column 'a' is type 'raw'")

DT = data.table(
color = c("yellow", "red", "green", "red", "green", "red",
@@ -13754,7 +13754,7 @@ test(1984.05, DT[ , sum(b), keyby = c, verbose = TRUE],
### hitting byval = eval(bysub, setattr(as.list(seq_along(xss)), ...)
test(1984.06, DT[1:3, sum(a), by=b:c], data.table(b=10:8, c=1:3, V1=1:3))
test(1984.07, DT[, sum(a), by=call('sin',pi)], error='must evaluate to a vector or a list of vectors')
test(1984.08, DT[, sum(a), by=1+3i], error='column or expression.*type complex')
test(1984.08, DT[, sum(a), by=as.raw(0)], error='column or expression.*type raw')
test(1984.09, DT[, sum(a), by=.(1,1:2)], error='The items.*list are length [(]1,2[)].*Each must be length 10; .*rows in x.*after subsetting')
options('datatable.optimize' = Inf)
test(1984.10, DT[ , 1, by = .(a %% 2), verbose = TRUE],
@@ -14766,14 +14766,14 @@ dt1 <- data.table(int = 1L:10L,
bool = c(rep(FALSE, 9), TRUE),
char = letters[1L:10L],
fact = factor(letters[1L:10L]),
complex = as.complex(1:5))
raw = as.raw(1:5))
dt2 <- data.table(int = 1L:5L,
doubleInt = as.numeric(1:5),
realDouble = seq(0.5, 2.5, by = 0.5),
bool = TRUE,
char = letters[1L:5L],
fact = factor(letters[1L:5L]),
complex = as.complex(1:5))
raw = as.raw(1:5))
if (test_bit64) {
dt1[, int64 := as.integer64(c(1:9, 3e10))]
dt2[, int64 := as.integer64(c(1:4, 3e9))]
@@ -14790,8 +14790,8 @@ test(2044.08, nrow(dt1[dt2, on="fact==fact", verbose=TRUE]), nrow(dt
if (test_bit64) {
test(2044.09, nrow(dt1[dt2, on = "int64==int64", verbose=TRUE]), nrow(dt2), output="No coercion needed")
}
test(2044.10, dt1[dt2, on = "int==complex"], error = "i.complex is type complex which is not supported by data.table join")
test(2044.11, dt1[dt2, on = "complex==int"], error = "x.complex is type complex which is not supported by data.table join")
test(2044.10, dt1[dt2, on = "int==raw"], error = "i.raw is type raw which is not supported by data.table join")
test(2044.11, dt1[dt2, on = "raw==int"], error = "x.raw is type raw which is not supported by data.table join")
# incompatible types
test(2044.20, dt1[dt2, on="bool==int"], error="Incompatible join types: x.bool (logical) and i.int (integer)")
test(2044.21, dt1[dt2, on="bool==doubleInt"], error="Incompatible join types: x.bool (logical) and i.doubleInt (double)")
@@ -15331,6 +15331,72 @@ test(2068.3, setkey(DT, ID), error="Item 2 of list is type 'raw'")
# setreordervec triggers !isNewList branch for coverage
test(2068.4, setreordervec(DT$r, order(DT$ID)), error="reorder accepts vectors but this non-VECSXP")

# forderv (and downstream functions) handles complex vector input, part of #3690
DT = data.table(
a = c(1L, 1L, 8L, 2L, 1L, 9L, 3L, 2L, 6L, 6L),
b = c(3+9i, 10+5i, 8+2i, 10+4i, 3+3i, 1+2i, 5+1i, 8+1i, 8+2i, 10+6i),
c = 6
)
test(2069.01, DT[order(a, b)], DT[base::order(a, b)])
test(2069.02, DT[order(a, -b)], DT[base::order(a, -b)])
test(2069.03, forderv(DT$b, order = 1L), base::order(DT$b))
test(2069.04, forderv(DT$b, order = -1L), base::order(-DT$b))
test(2069.05, forderv(DT, by = 2:1), forderv(DT[ , 2:1]))
test(2069.06, forderv(DT, by = 2:1, order = c(1L, -1L)), DT[order(b, -a), which = TRUE])

# downstreams of forder
DT = data.table(
z = c(0, 0, 1, 1, 2, 3) + c(1, 1, 2, 2, 3, 4)*1i,
grp = rep(1:2, 3L),
v = c(3, 1, 4, 1, 5, 9)
)
unq_z = 0:3 + (1:4)*1i
test(2069.07, DT[ , .N, by=z], data.table(z=unq_z, N=c(2L, 2L, 1L, 1L)))
test(2069.08, DT[ , .N, keyby = z], data.table(z=unq_z, N=c(2L, 2L, 1L, 1L), key='z'))
test(2069.09, dcast(DT, z ~ grp, value.var='v', fill=0),
data.table(z=unq_z, `1`=c(3, 4, 5, 0), `2`=c(1, 1, 0, 9), key='z'))
test(2069.10, frank(DT$z), c(1.5, 1.5, 3.5, 3.5, 5, 6))
test(2069.11, frank(DT$z, ties.method='max'), c(2L, 2L, 4L, 4L, 5L, 6L))
test(2069.12, frank(-DT$z, ties.method='min'), c(5L, 5L, 3L, 3L, 2L, 1L))
test(2069.13, DT[ , rowid(z, grp)], rep(1L, 6L))
test(2069.14, DT[ , rowid(z)], c(1:2, 1:2, 1L, 1L))
test(2069.15, rleid(c(1i, 1i, 1i, 0, 0, 1-1i, 2+3i, 2+3i)), rep(1:4, c(3:1, 2L)))
# handling doubles properly
test(2069.16, rleid(c(1i, 1.1i)), 1:2)
test(2069.17, rleidv(DT, "z"), c(1L, 1L, 2L, 2L, 3L, 4L))
test(2069.18, unique(DT, by = 'z'), data.table(z = unq_z, grp = c(1L, 1L, 1L, 2L), v = c(3, 4, 5, 9)))
test(2069.19, unique(DT, by = 'z', fromLast = TRUE), data.table(z = unq_z, grp = c(2L, 2L, 1L, 2L), v = c(1, 1, 5, 9)))
test(2069.20, uniqueN(DT$z), 4L)

# setkey, setorder work
DT = data.table(a = 2:1, z = 0 + (1:0)*1i)
test(2069.21, setkey(copy(DT), z), data.table(a=1:2, z=0+ (0:1)*1i, key='z'))
test(2069.22, setorder(DT, z), data.table(a=1:2, z=0+ (0:1)*1i))

## assorted coverage tests from along the way
if (test_bit64) {
test(2069.23, is.sorted(as.integer64(10:1)), FALSE)
test(2069.24, is.sorted(as.integer64(1:10)))
}
# sort by vector outside of table
ord = 3:1
test(2069.25, forder(data.table(a=3:1), ord), 3:1)
# dogroups.c coverage
test(2069.26, data.table(c='1')[ , expression(1), by=c], error="j evaluates to type 'expression'")
test(2069.27, data.table(c='1', d=2)[ , d := .(NULL), by=c], error='RHS is NULL when grouping :=')
test(2069.28, data.table(c='1', d=2)[ , c(a='b'), by=c, verbose=TRUE], output='j appears to be a named vector')
test(2069.29, data.table(c = '1', d = 2)[ , .(a = c(nm='b')), by = c, verbose = TRUE], output = 'Column 1 of j is a named vector')
DT <- data.table(a = rep(1:3, each = 4), b = LETTERS[1:4], z = 0:3 + (4:1)*1i)
test(2069.30, DT[, .SD[3,], by=b], DT[9:12, .(b, a, z)])
DT = data.table(x=1:4,y=1:2,lgl=TRUE,key="x,y")
test(2069.31, DT[CJ(1:4,1:4), any(lgl), by=.EACHI]$V1,
c(TRUE, NA, NA, NA, NA, TRUE, NA, NA, TRUE, NA, NA, NA, NA, TRUE, NA, NA))
set.seed(45L)
DT1 = data.table(a = sample(3L, 15L, TRUE) + .1, b=sample(c(TRUE, FALSE, NA), 15L, TRUE))
DT2 = data.table(a = sample(3L, 6L, TRUE) + .1, b=sample(c(TRUE, FALSE, NA), 6L, TRUE))
test(2069.32, DT1[DT2, .(y = sum(b, na.rm=TRUE)), by=.EACHI, on=c(a = 'a', b="b")]$y, rep(0L, 6L))
DT = data.table(z = 1i)
test(2069.33, DT[DT, on = 'z'], error = "Type 'complex' not supported for joining/merging")

###################################
# Add new tests above this line #
6 changes: 3 additions & 3 deletions src/bmerge.c
@@ -299,7 +299,7 @@ void bmerge_r(int xlowIn, int xuppIn, int ilowIn, int iuppIn, int col, int thisg
// ilow and iupp now surround the group in ic, too
}
break;
case STRSXP :
case STRSXP : {
if (op[col] != EQ) error("Only '==' operator is supported for columns of type %s.", type2char(TYPEOF(xc)));
ival.s = ENC2UTF8(STRING_ELT(ic,ir));
while(xlow < xupp-1) {
@@ -338,7 +338,7 @@ void bmerge_r(int xlowIn, int xuppIn, int ilowIn, int iuppIn, int col, int thisg
xval.s = ENC2UTF8(STRING_ELT(ic, o ? o[mid]-1 : mid));
if (xval.s == ival.s) tmpupp=mid; else ilow=mid; // see above re ==
}
break;
} break;
case REALSXP : {
double *dic = REAL(ic);
double *dxc = REAL(xc);
@@ -406,7 +406,7 @@ void bmerge_r(int xlowIn, int xuppIn, int ilowIn, int iuppIn, int col, int thisg
}
break;
default:
error("Type '%s' not supported as key column", type2char(TYPEOF(xc)));
error("Type '%s' not supported for joining/merging", type2char(TYPEOF(xc)));
}
if (xlow<xupp-1) { // if value found, low and upp surround it, unlike standard binary search where low falls on it
if (col<ncol-1) {