[R] replace uses of T and F with TRUE and FALSE #5778

Merged: 3 commits, Jun 11, 2020
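Background for the change (illustrative snippet, not part of the diff): `TRUE` and `FALSE` are reserved words in R, while `T` and `F` are ordinary bindings that default to them and can be reassigned, so code written with the short forms can silently change meaning:

```r
T <- FALSE                      # legal: T is just a variable whose default value is TRUE
isTRUE(T)                       # now FALSE
identical(matrix(1:4, 2, byrow = T),
          matrix(1:4, 2, byrow = TRUE))  # FALSE: the two calls no longer agree
# TRUE <- FALSE                 # error: TRUE is a reserved word and cannot be reassigned
```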
2 changes: 1 addition & 1 deletion R-package/demo/basic_walkthrough.R
@@ -100,7 +100,7 @@ print(paste("test-error=", err))

# You can dump the tree you learned using xgb.dump into a text file
dump_path = file.path(tempdir(), 'dump.raw.txt')
xgb.dump(bst, dump_path, with_stats = T)
xgb.dump(bst, dump_path, with_stats = TRUE)

# Finally, you can check which features are the most important.
print("Most important features (look at column Gain):")
2 changes: 1 addition & 1 deletion R-package/demo/caret_wrapper.R
@@ -9,7 +9,7 @@ require(e1071)
# Load Arthritis dataset in memory.
data(Arthritis)
# Create a copy of the dataset with data.table package (data.table is 100% compliant with R dataframe but its syntax is a lot more consistent and its performance are really good).
df <- data.table(Arthritis, keep.rownames = F)
df <- data.table(Arthritis, keep.rownames = FALSE)

# Let's add some new categorical features to see if it helps. Of course these feature are highly correlated to the Age feature. Usually it's not a good thing in ML, but Tree algorithms (including boosted trees) are able to select the best features, even in case of highly correlated features.
# For the first feature we create groups of age by rounding the real age. Note that we transform it to factor (categorical data) so the algorithm treat them as independant values.
2 changes: 1 addition & 1 deletion R-package/demo/create_sparse_matrix.R
@@ -19,7 +19,7 @@ if (!require(vcd)) {
data(Arthritis)

# create a copy of the dataset with data.table package (data.table is 100% compliant with R dataframe but its syntax is a lot more consistent and its performance are really good).
df <- data.table(Arthritis, keep.rownames = F)
df <- data.table(Arthritis, keep.rownames = FALSE)

# Let's have a look to the data.table
cat("Print the dataset\n")
8 changes: 4 additions & 4 deletions R-package/demo/interaction_constraints.R
@@ -19,18 +19,18 @@ treeInteractions <- function(input_tree, input_max_depth){
setorderv(parents_left, 'ID_merge')
setorderv(parents_right, 'ID_merge')

trees <- merge(trees, parents_left, by='ID_merge', all.x=T)
trees <- merge(trees, parents_left, by='ID_merge', all.x=TRUE)
trees[!is.na(i.id), c(paste0('parent_', i-1), paste0('parent_feat_', i-1)):=list(i.id, i.feature)]
trees[, c('i.id','i.feature'):=NULL]

trees <- merge(trees, parents_right, by='ID_merge', all.x=T)
trees <- merge(trees, parents_right, by='ID_merge', all.x=TRUE)
trees[!is.na(i.id), c(paste0('parent_', i-1), paste0('parent_feat_', i-1)):=list(i.id, i.feature)]
trees[, c('i.id','i.feature'):=NULL]
}

# Extract nodes with interactions
interaction_trees <- trees[!is.na(Split) & !is.na(parent_1),
c('Feature',paste0('parent_feat_',1:(input_max_depth-1))), with=F]
c('Feature',paste0('parent_feat_',1:(input_max_depth-1))), with=FALSE]
interaction_trees_split <- split(interaction_trees, 1:nrow(interaction_trees))
interaction_list <- lapply(interaction_trees_split, as.character)

@@ -96,7 +96,7 @@ x1 <- sort(unique(x[['V1']]))
for (i in 1:length(x1)){
testdata <- copy(x[, -c('V1')])
testdata[['V1']] <- x1[i]
testdata <- testdata[, paste0('V',1:10), with=F]
testdata <- testdata[, paste0('V',1:10), with=FALSE]
pred <- predict(bst3, as.matrix(testdata))

# Should not print out anything due to monotonic constraints
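Since `with = FALSE` appears throughout these demos, a toy example (not part of the diff) of what it does in `data.table`: it makes `j` be read as a vector of column names or positions instead of an expression evaluated inside the table:

```r
library(data.table)
dt <- data.table(V1 = 1:3, V2 = 4:6, V3 = 7:9)
cols <- paste0("V", 1:2)

dt[, cols, with = FALSE]   # selects columns V1 and V2
dt[, ..cols]               # newer equivalent spelling
# dt[, cols]               # without with = FALSE this just returns the character vector
```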
2 changes: 1 addition & 1 deletion R-package/demo/tweedie_regression.R
@@ -13,7 +13,7 @@ exclude <- c('POLICYNO', 'PLCYDATE', 'CLM_FREQ5', 'CLM_AMT5', 'CLM_FLAG', 'IN_Y
# retains the missing values
# NOTE: this dataset is comes ready out of the box
options(na.action = 'na.pass')
x <- sparse.model.matrix(~ . - 1, data = dt[, -exclude, with = F])
x <- sparse.model.matrix(~ . - 1, data = dt[, -exclude, with = FALSE])
options(na.action = 'na.omit')

# response
10 changes: 5 additions & 5 deletions R-package/tests/testthat/test_helpers.R
@@ -12,7 +12,7 @@ flag_32bit = .Machine$sizeof.pointer != 8

set.seed(1982)
data(Arthritis)
df <- data.table(Arthritis, keep.rownames = F)
df <- data.table(Arthritis, keep.rownames = FALSE)
df[,AgeDiscret := as.factor(round(Age / 10,0))]
df[,AgeCat := as.factor(ifelse(Age > 30, "Old", "Young"))]
df[,ID := NULL]
@@ -47,7 +47,7 @@ test_that("xgb.dump works", {
if (!flag_32bit)
expect_length(xgb.dump(bst.Tree), 200)
dump_file = file.path(tempdir(), 'xgb.model.dump')
expect_true(xgb.dump(bst.Tree, dump_file, with_stats = T))
expect_true(xgb.dump(bst.Tree, dump_file, with_stats = TRUE))
expect_true(file.exists(dump_file))
expect_gt(file.size(dump_file), 8000)

@@ -160,16 +160,16 @@ test_that("SHAPs sum to predictions, with or without DART", {
objective = "reg:squarederror",
eval_metric = "rmse"),
if (booster == "dart")
list(rate_drop = .01, one_drop = T)),
list(rate_drop = .01, one_drop = TRUE)),
data = d,
label = y,
nrounds = nrounds)

pr <- function(...)
predict(fit, newdata = d, ...)
pred <- pr()
shap <- pr(predcontrib = T)
shapi <- pr(predinteraction = T)
shap <- pr(predcontrib = TRUE)
shapi <- pr(predinteraction = TRUE)
tol = 1e-5

expect_equal(rowSums(shap), pred, tol = tol)
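For context on the property this test exercises, a minimal self-contained sketch (synthetic data, names made up for the example, not part of the diff): for a squared-error regression, the per-feature contributions returned by `predcontrib = TRUE` include a bias column and sum row-wise to the model's prediction.

```r
library(xgboost)
set.seed(1)
x <- matrix(rnorm(100 * 4), ncol = 4)
y <- as.numeric(x %*% c(1, -2, 0.5, 0)) + rnorm(100, sd = 0.1)
bst <- xgboost(data = x, label = y, nrounds = 20,
               objective = "reg:squarederror", verbose = 0)

pred <- predict(bst, x)
shap <- predict(bst, x, predcontrib = TRUE)   # one column per feature plus a BIAS column
all.equal(rowSums(shap), pred, tolerance = 1e-5)  # contributions sum to the prediction
```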
2 changes: 1 addition & 1 deletion R-package/tests/testthat/test_interactions.R
@@ -107,7 +107,7 @@ test_that("SHAP contribution values are not NAN", {

shaps <- as.data.frame(predict(fit,
newdata = as.matrix(subset(d, fold == 1)[, ivs]),
predcontrib = T))
predcontrib = TRUE))
result <- cbind(shaps, sum = rowSums(shaps), pred = predict(fit,
newdata = as.matrix(subset(d, fold == 1)[, ivs])))

7 changes: 3 additions & 4 deletions R-package/tests/testthat/test_lint.R
@@ -1,8 +1,6 @@
context("Code is of high quality and lint free")
test_that("Code Lint", {
skip_on_cran()
skip_on_travis()
skip_if_not_installed("lintr")
my_linters <- list(
absolute_paths_linter=lintr::absolute_paths_linter,
assignment_linter=lintr::assignment_linter,
@@ -21,7 +19,8 @@ test_that("Code Lint", {
spaces_inside_linter=lintr::spaces_inside_linter,
spaces_left_parentheses_linter=lintr::spaces_left_parentheses_linter,
trailing_blank_lines_linter=lintr::trailing_blank_lines_linter,
trailing_whitespace_linter=lintr::trailing_whitespace_linter
trailing_whitespace_linter=lintr::trailing_whitespace_linter,
true_false=lintr::T_and_F_symbol_linter
)
# lintr::expect_lint_free(linters=my_linters) # uncomment this if you want to check code quality
lintr::expect_lint_free(linters=my_linters) # uncomment this if you want to check code quality
})
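For reference, a quick illustration of what the newly added linter flags (the file and snippet below are made up for the example; recent lintr releases also expect linters to be constructed with `()`, whereas this test passes them unconstructed, as was usual at the time):

```r
library(lintr)

# A throwaway file that still uses the short forms.
bad <- tempfile(fileext = ".R")
writeLines("df <- data.table(Arthritis, keep.rownames = F)", bad)

lint(bad, linters = list(true_false = T_and_F_symbol_linter()))
# Expected: a lint along the lines of "Use FALSE instead of the symbol F."
```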
2 changes: 1 addition & 1 deletion R-package/vignettes/discoverYourData.Rmd
@@ -63,7 +63,7 @@ The first step is to load `Arthritis` dataset in memory and wrap it with `data.t

```{r, results='hide'}
data(Arthritis)
df <- data.table(Arthritis, keep.rownames = F)
df <- data.table(Arthritis, keep.rownames = FALSE)
```

> `data.table` is 100% compliant with **R** `data.frame` but its syntax is more consistent and its performance for large dataset is [best in class](http://stackoverflow.com/questions/21435339/data-table-vs-dplyr-can-one-do-something-well-the-other-cant-or-does-poorly) (`dplyr` from **R** and `Pandas` from **Python** [included](https://github.com/Rdatatable/data.table/wiki/Benchmarks-%3A-Grouping)). Some parts of **Xgboost** **R** package use `data.table`.
2 changes: 1 addition & 1 deletion R-package/vignettes/xgboostPresentation.Rmd
@@ -363,7 +363,7 @@ xgb.plot.importance(importance_matrix = importance_matrix)
You can dump the tree you learned using `xgb.dump` into a text file.

```{r dump, message=T, warning=F}
xgb.dump(bst, with_stats = T)
xgb.dump(bst, with_stats = TRUE)
```

You can plot the trees from your model using ```xgb.plot.tree``
2 changes: 1 addition & 1 deletion demo/data/gen_autoclaims.R
@@ -14,5 +14,5 @@ data$STATE = as.factor(data$STATE)
data$CLASS = as.factor(data$CLASS)
data$GENDER = as.factor(data$GENDER)

data.dummy <- dummy.data.frame(data, dummy.class='factor', omit.constants=T);
data.dummy <- dummy.data.frame(data, dummy.class='factor', omit.constants=TRUE);
write.table(data.dummy, 'autoclaims.csv', sep=',', row.names=F, col.names=F, quote=F)
4 changes: 2 additions & 2 deletions demo/kaggle-otto/otto_train_pred.R
@@ -1,8 +1,8 @@
require(xgboost)
require(methods)

train = read.csv('data/train.csv',header=TRUE,stringsAsFactors = F)
test = read.csv('data/test.csv',header=TRUE,stringsAsFactors = F)
train = read.csv('data/train.csv',header=TRUE,stringsAsFactors = FALSE)
test = read.csv('data/test.csv',header=TRUE,stringsAsFactors = FALSE)
train = train[,-1]
test = test[,-1]

16 changes: 8 additions & 8 deletions demo/kaggle-otto/understandingXGBoostModel.Rmd
@@ -30,8 +30,8 @@ require(xgboost)
require(methods)
require(data.table)
require(magrittr)
train <- fread('data/train.csv', header = T, stringsAsFactors = F)
test <- fread('data/test.csv', header=TRUE, stringsAsFactors = F)
train <- fread('data/train.csv', header = T, stringsAsFactors = FALSE)
test <- fread('data/test.csv', header=TRUE, stringsAsFactors = FALSE)
```
> `magrittr` and `data.table` are here to make the code cleaner and much more rapid.

@@ -42,13 +42,13 @@ Let's explore the dataset.
dim(train)

# Training content
train[1:6,1:5, with =F]
train[1:6,1:5, with =FALSE]

# Test dataset dimensions
dim(test)

# Test content
test[1:6,1:5, with =F]
test[1:6,1:5, with =FALSE]
```
> We only display the 6 first rows and 5 first columns for convenience

@@ -70,7 +70,7 @@ According to its description, the **Otto** challenge is a multi class classifica

```{r searchLabel}
# Check the content of the last column
train[1:6, ncol(train), with = F]
train[1:6, ncol(train), with = FALSE]
# Save the name of the last column
nameLastCol <- names(train)[ncol(train)]
```
@@ -86,7 +86,7 @@ For that purpose, we will:

```{r classToIntegers}
# Convert from classes to numbers
y <- train[, nameLastCol, with = F][[1]] %>% gsub('Class_','',.) %>% {as.integer(.) -1}
y <- train[, nameLastCol, with = FALSE][[1]] %>% gsub('Class_','',.) %>% {as.integer(.) -1}

# Display the first 5 levels
y[1:5]
@@ -95,7 +95,7 @@ y[1:5]
We remove label column from training dataset, otherwise **XGBoost** would use it to guess the labels!

```{r deleteCols, results='hide'}
train[, nameLastCol:=NULL, with = F]
train[, nameLastCol:=NULL, with = FALSE]
```

`data.table` is an awesome implementation of data.frame, unfortunately it is not a format supported natively by **XGBoost**. We need to convert both datasets (training and test) in `numeric` Matrix format.
@@ -163,7 +163,7 @@ Each *split* is done on one feature only at one value.
Let's see what the model looks like.

```{r modelDump}
model <- xgb.dump(bst, with.stats = T)
model <- xgb.dump(bst, with.stats = TRUE)
model[1:10]
```
> For convenience, we are displaying the first 10 lines of the model only.
2 changes: 1 addition & 1 deletion doc/R-package/discoverYourData.md
@@ -52,7 +52,7 @@ The first step is to load `Arthritis` dataset in memory and wrap it with `data.t

```r
data(Arthritis)
df <- data.table(Arthritis, keep.rownames = F)
df <- data.table(Arthritis, keep.rownames = FALSE)
```

> `data.table` is 100% compliant with **R** `data.frame` but its syntax is more consistent and its performance for large dataset is [best in class](http://stackoverflow.com/questions/21435339/data-table-vs-dplyr-can-one-do-something-well-the-other-cant-or-does-poorly) (`dplyr` from **R** and `Pandas` from **Python** [included](https://github.com/Rdatatable/data.table/wiki/Benchmarks-%3A-Grouping)). Some parts of **Xgboost** **R** package use `data.table`.
2 changes: 1 addition & 1 deletion doc/R-package/xgboostPresentation.md
@@ -489,7 +489,7 @@ You can dump the tree you learned using `xgb.dump` into a text file.


```r
xgb.dump(bst, with_stats = T)
xgb.dump(bst, with_stats = TRUE)
```

```