names(.SD) should work #4163

Merged 49 commits on Mar 20, 2024
Changes from 10 commits
Commits
316c42b
Update data.table.R
ColeMiller1 Jan 8, 2020
e9ae7d3
Update tests.Rraw
ColeMiller1 Jan 8, 2020
17e80c4
Update data.table.R
ColeMiller1 Jan 8, 2020
d1c7a99
Update tests.Rraw
ColeMiller1 Jan 8, 2020
9d26aa7
Update datatable-reference-semantics.Rmd
ColeMiller1 Jan 8, 2020
54f35f6
Update assign.Rd
ColeMiller1 Jan 8, 2020
3c68d6e
Update NEWS.md
ColeMiller1 Jan 8, 2020
a009df0
Update NEWS.md
ColeMiller1 Jan 8, 2020
7bd494d
Merge branch 'master' into names_SD
ColeMiller1 Jan 8, 2020
18ccd2f
Update data.table.R
ColeMiller1 Jan 9, 2020
15e95f8
Update tests.Rraw
ColeMiller1 Jan 9, 2020
112f81d
Update tests.Rraw
ColeMiller1 Jan 9, 2020
2c39630
Update data.table.R
ColeMiller1 Jan 10, 2020
fcb270a
Update tests.Rraw
ColeMiller1 Jan 10, 2020
21d3a93
replace iris with raw dataset
ColeMiller1 Jan 10, 2020
10b36db
Update tests.Rraw
ColeMiller1 Jan 14, 2020
7993419
update replace_names_sd and made .SD := not work
ColeMiller1 Jan 19, 2020
269967e
change .SD to names(.SD)
ColeMiller1 Jan 19, 2020
76b5e64
update typo; change .SD to names(.SD)
ColeMiller1 Jan 19, 2020
ed879f6
update to names(.SD)
ColeMiller1 Jan 19, 2020
1fbd631
include names(.SD) and fx to .SD usage
ColeMiller1 Jan 21, 2020
8df7af5
Updates news to names(.SD)
ColeMiller1 Jan 21, 2020
8c2d273
Update typo.
ColeMiller1 Jan 30, 2020
7267766
tweak NEWS
MichaelChirico Feb 2, 2020
197cb54
minor grammar
MichaelChirico Feb 2, 2020
8d7f232
jans comment
MichaelChirico Feb 2, 2020
29cc659
jan's comment (ii)
MichaelChirico Feb 2, 2020
f7adef8
added "footnote"
MichaelChirico Feb 2, 2020
9469e4e
Add is.name(e[[2L]])
ColeMiller1 Feb 2, 2020
3ba5518
Put tests above Add new tests here
ColeMiller1 Feb 2, 2020
8e1c109
added test to test names(.SD(2))
ColeMiller1 Feb 2, 2020
2ef29e7
Merge branch 'master' into names_SD
ColeMiller1 Feb 2, 2020
c389b3c
include .SDcols in example for assign
ColeMiller1 Feb 2, 2020
2c3fb51
included .SDcols = function example
ColeMiller1 Feb 2, 2020
a2b568b
Merge branch 'master' into names_SD
ColeMiller1 Feb 17, 2020
82b7cfd
Merge branch 'master' into names_SD
ColeMiller1 Feb 27, 2020
f5ab271
test 2138 is greater than 2137
ColeMiller1 Feb 27, 2020
3be7e22
Merge branch 'master' into names_SD
MichaelChirico Feb 26, 2024
9d816d7
Merge branch 'master' into names_SD
MichaelChirico Feb 27, 2024
be720a3
bad merge
MichaelChirico Feb 27, 2024
7b0f8f1
Make updates per Michael's comments.
ColeMiller1 Mar 19, 2024
3635c3d
Added test where .SD is used as well as some columns not in .SD.
ColeMiller1 Mar 19, 2024
5fec7bc
Mention count of reactions in issue
MichaelChirico Mar 19, 2024
7ae1ea3
small copy-edit
MichaelChirico Mar 19, 2024
2cb48ea
more specific
MichaelChirico Mar 19, 2024
5a587e7
specify LHS/RHS
MichaelChirico Mar 19, 2024
212a774
Simplify implementation to probe for names(.SD) and new test
ColeMiller1 Mar 20, 2024
b91dab5
fine-tune comment
MichaelChirico Mar 20, 2024
8fe60ee
Merge branch 'master' into names_SD
MichaelChirico Mar 20, 2024
2 changes: 2 additions & 0 deletions NEWS.md
@@ -73,6 +73,8 @@ unit = "s")

10. The dimensions of objects in a `list` column are now displayed, [#3671](https://github.com/Rdatatable/data.table/issues/3671). Thanks to @randomgambit for the request, and Tyson Barrett for the PR.

+11. Using `dt[, .SD := lapply(.SD, fx)]` now works, [#795](https://github.com/Rdatatable/data.table/issues/795). Thanks to @brodieG for the report and @ColeMiller1 for PR.

## BUG FIXES

1. A NULL timezone on POSIXct was interpreted by `as.IDate` and `as.ITime` as UTC rather than the session's default timezone (`tz=""`) , [#4085](https://github.com/Rdatatable/data.table/issues/4085).
8 changes: 6 additions & 2 deletions R/data.table.R
@@ -1027,9 +1027,13 @@ replace_dot_alias = function(e) {
lhs = jsub[[2L]]
jsub = jsub[[3L]]
if (is.name(lhs)) {
-lhs = as.character(lhs)
+if (lhs == as.name('.SD')) lhs = sdvars else lhs = as.character(lhs)
} else {
-# e.g. (MyVar):= or get("MyVar"):=
+#i.e lhs is names(.SD) || setdiff(names(.SD), cols) || (cols)
+if (lhs[[1]] == as.name('names') && lhs[[2]] == as.name('.SD')) lhs = sdvars
+for (i in seq_along(lhs)[-1]){
+if (lhs[[i]] == as.name('names(.SD)')) lhs[[i]] = sdvars
+}
lhs = eval(lhs, parent.frame(), parent.frame())
}
} else {
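
The hunk above replaces a `names(.SD)` call on the LHS of `:=` with the resolved `.SDcols` names before the LHS is evaluated. Below is a minimal base-R sketch of that kind of substitution. The helper `sub_names_sd()` and the stand-in vector `sdvars` are hypothetical names chosen for illustration, not data.table's internal code, and the sketch compares language objects with `identical()` instead of `==`.

```r
# Hypothetical helper for illustration only; not data.table's internal code.
# Walks an unevaluated LHS expression and replaces every `names(.SD)` call
# with the character vector of selected column names.
# Assumption: the expression has no empty arguments (e.g. x[, 1]).
sub_names_sd <- function(e, cols) {
  if (!is.call(e)) return(e)  # symbols and constants pass through unchanged
  is_names_sd <- length(e) == 2L &&
    identical(e[[1L]], quote(names)) &&
    identical(e[[2L]], quote(.SD))
  if (is_names_sd) return(cols)
  for (i in seq_along(e)[-1L]) {  # skip the function name at position 1
    if (!is.null(e[[i]])) e[[i]] <- sub_names_sd(e[[i]], cols)
  }
  e
}

sdvars <- c("a", "b")  # stand-in for the columns picked by .SDcols

# LHS is exactly names(.SD): the call is replaced by the names themselves
lhs_plain <- sub_names_sd(quote(names(.SD)), sdvars)

# LHS nests names(.SD) inside another call: substitute, then evaluate
lhs_nested <- eval(sub_names_sd(quote(setdiff(names(.SD), "a")), sdvars))

# LHS does not mention .SD at all: evaluated as written
lhs_other <- eval(sub_names_sd(quote(c("x", "y")), sdvars))
```

Recursing over the call, rather than checking only its top level, is what lets forms such as `setdiff(names(.SD), cols)` resolve in the same pass.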
7 changes: 7 additions & 0 deletions inst/tests/tests.Rraw
@@ -16740,3 +16740,10 @@ test(2131, lapply(x[ , list(dt = list(.SD)), by = a]$dt, attr, '.data.table.lock
########################
# Add new tests here #
########################

+## names(.SD) - issue #795
+DT <- data.table(a=1:6, b=1:6, c=rep(c(T,F), 3))
+mycols <- 1:2
+test(2131.1, DT[, .SD := lapply(.SD, `*`, 2), .SDcols = mycols], data.table(a = (1:6)*2, b = (1:6)*2, c = rep(c(T, F), 3)))
+test(2131.3, DT[, .SD := lapply(.SD, '*', 2), .SDcols = -3L], data.table(a = (1:6)*4, b = (1:6)*4, c = rep(c(T, F), 3)))
+test(2131.5, DT[, .SD := lapply(.SD, '*', 2)], data.table(a = (1:6)*8, b = (1:6)*8, c = rep(c(T, F), 3) * 2))
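
The expected values in these tests grow from `*2` to `*4` to `*8` because `:=` updates `DT` by reference, so each test starts from the result of the previous one. A small sketch of that compounding follows, written with the long-standing `(cols) :=` form so that it does not depend on this PR. It assumes the data.table package is installed; the names `cols` and `all_cols` are illustrative.

```r
library(data.table)

DT <- data.table(a = 1:6, b = 1:6, c = rep(c(TRUE, FALSE), 3))
cols <- c("a", "b")

# First update: a and b are doubled in place
DT[, (cols) := lapply(.SD, `*`, 2), .SDcols = cols]

# Second update acts on the already-doubled columns, so they are now 4x
DT[, (cols) := lapply(.SD, `*`, 2), .SDcols = cols]

# Third update covers every column: a and b reach 8x, and the logical
# column c is coerced to numeric by the multiplication
all_cols <- names(DT)
DT[, (all_cols) := lapply(.SD, `*`, 2)]
```

Calling `copy(DT)` before each update would keep the steps independent, at the cost of duplicating the table.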
3 changes: 3 additions & 0 deletions man/assign.Rd
@@ -18,6 +18,9 @@
# LHS2 = RHS2,
# ...), by = ...]

+# 3. Multiple columns in place
+# DT[i, .SD = lapply(.SD, fx), by = ..., .SDcols = ...]

set(x, i = NULL, j, value)
}
\arguments{
18 changes: 17 additions & 1 deletion vignettes/datatable-reference-semantics.Rmd
@@ -258,14 +258,30 @@ head(flights)
* The `LHS := RHS` form allows us to operate on multiple columns. In the RHS, to compute the `max` on columns specified in `.SDcols`, we make use of the base function `lapply()` along with `.SD` in the same way as we have seen before in the *"Introduction to data.table"* vignette. It returns a list of two elements, containing the maximum value corresponding to `dep_delay` and `arr_delay` for each group.

#
-Before moving on to the next section, let's clean up the newly created columns `speed`, `max_speed`, `max_dep_delay` and `max_arr_delay`.
+Let's clean up the newly created columns `speed`, `max_speed`, `max_dep_delay` and `max_arr_delay`.

```{r}
# RHS gets automatically recycled to length of LHS
flights[, c("speed", "max_speed", "max_dep_delay", "max_arr_delay") := NULL]
head(flights)
```

+#### -- How can we update multiple existing columns in place using `.SD`?

+```{r}
+char_cols <- sapply(flights, is.character)
+flights[, .SD := lapply(.SD, as.factor), .SDcols = char_cols]
+str(flights[, ..char_cols])
+```
+#### {.bs-callout .bs-callout-info}

+* We also could have used `(char_cols)` on the `LHS` but `.SD` is a shorthand.

+Let's clean up again and make our newly made factor columns back to character columns.
+```{r}
+flights[, .SD := lapply(.SD, as.character), .SDcols = char_cols]
+str(flights[, ..char_cols])
+```
## 3) `:=` and `copy()`

`:=` modifies the input object by reference. Apart from the features we have discussed already, sometimes we might want to use the update by reference feature for its side effect. And at other times it may not be desirable to modify the original object, in which case we can use `copy()` function, as we will see in a moment.
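
For the factor round-trip that this hunk adds to the vignette, here is a sketch of the same conversion using the `(cols) :=` form on a small made-up table. It assumes the data.table package is installed and that the columns are selected by name (a character vector) rather than by the logical vector that `sapply(flights, is.character)` returns. `DT` and its contents are illustrative stand-ins, not the `flights` data.

```r
library(data.table)

# Small made-up table standing in for `flights`
DT <- data.table(
  id = 1:3,
  origin = c("JFK", "LGA", "JFK"),
  dest = c("LAX", "PBI", "LAX")
)

# Names of the character columns
char_cols <- names(DT)[vapply(DT, is.character, logical(1L))]

# Convert the selected columns to factor in place
DT[, (char_cols) := lapply(.SD, as.factor), .SDcols = char_cols]
all_factor <- all(vapply(DT[, ..char_cols], is.factor, logical(1L)))

# Convert them back to character, again in place
DT[, (char_cols) := lapply(.SD, as.character), .SDcols = char_cols]
```

Computing `char_cols` once, before the first update, matters here: after the columns become factors, `is.character` would no longer find them.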