Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add fct_sort function #156

Closed
wants to merge 5 commits into from
Closed
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions NAMESPACE
Original file line number Diff line number Diff line change
Expand Up @@ -23,6 +23,7 @@ export(fct_reorder2)
export(fct_rev)
export(fct_shift)
export(fct_shuffle)
export(fct_sort)
export(fct_unify)
export(fct_unique)
export(last2)
Expand Down
38 changes: 38 additions & 0 deletions R/sort.R
Original file line number Diff line number Diff line change
@@ -0,0 +1,38 @@
#' Automatically sort factor levels according to a user-defined function
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe remove "Automatically"?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The description now starts #' Sort factor levels ...

#'
#' @param .f A factor (or character vector).
#' @param .fun A function that will sort or permute the existing factor levels.
#' It must accept one character argument and return a character argument of
#' the same length as it's input.
#'
#' @param ... Additional arguments to `.fun`.
#' @export
#' @examples
#' chromosomes <- c("chr2", "chr1", "chr10")
#' chr_fac <- factor(chromosomes, levels = chromosomes)
#' chr_fac
#'
#' # naive alphanumeric sorting "1" < "10" < "2"
#' fct_sort(chr_fac, sort)
#'
#' # number-based alphanumeric sorting "1" < "2" < "10"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is maybe a bit complicated for an example? Maybe just do something with alphabetical sorting? And maybe sample() to show random reordering?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I replaced the example. The new examples includes

  • alphabetical-sort,
  • alphabetical-decreasing-sort (equiv to fct_rev)
  • alphabetical sort with an out-of-order baseline level (equiv to fct_relevel)
  • sampling from the levels

#' parse_number <- function(x){
#' as.numeric(gsub(".*?([[:digit:]]+).*", "\\1", x))
#' }
#' sort_numeric <- function(x){
#' x[order(parse_number(x))]
#' }
#' fct_sort(chr_fac, sort_numeric)
#'
fct_sort <- function(.f, .fun, ...) {
f <- check_factor(.f)
.fun <- rlang::as_function(.fun)

old_levels <- levels(f)
new_levels <- .fun(old_levels, ...)
stopifnot(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would you mind making this a slightly friendlier function using the style guide at http://style.tidyverse.org/error-messages.html ?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added more informative messages to indicate:

  • when .fun returns with something other than a vector

  • when the sorted-levels are of different length from the input-levels

  • when the sorted-levels contains at least one level that is absent from the input-levels

length(old_levels) == length(new_levels) &&
all(new_levels %in% old_levels)
)
fct_relevel(f, new_levels)
}
38 changes: 38 additions & 0 deletions man/fct_sort.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

74 changes: 74 additions & 0 deletions tests/testthat/test-fct_sort.R
Original file line number Diff line number Diff line change
@@ -0,0 +1,74 @@
context("fct_sort")

test_that("error if not a factor or character", {
f1 <- 1:3
expect_error(fct_sort(f1, sort))
})

# .fun is a function
test_that(".fun should be a function", {
f1 <- factor(letters[1:3])
expect_error(fct_sort(f1, "Not a function"))
})

# .fun(levels(.f)) should be the same length as levels(.f)
# .fun(levels(.f)) should contain the same values as levels(.f)
test_that("level-set returned by fct_sort should match the input levels", {
f1 <- factor(letters[1:3])
expect_error(
fct_sort(f1, function(x) x[1:2]),
info = "Levels should be the same length"
)
expect_error(
fct_sort(f1, toupper),
info = "Contents of level-sets should be unchanged by fct_sort"
)
})

# .fun = identity ==> result should equal .f
test_that("identity-sort should return an unchanged factor", {
f1 <- factor(letters[1:3])
expect_equal(fct_sort(f1, identity), f1)
})

# .fun = rev ==> result should equal fct_rev(.f)
test_that("reverse-sort should match fct_rev", {
f1 <- factor(letters[1:3])
expect_equal(fct_sort(f1, rev), fct_rev(f1))
})


# .fun with args (sort decreasing=TRUE)
test_that("sort-function with additional arguments", {
f1 <- letters[1:3]
expect_equal(
fct_sort(f1, sort, decreasing = TRUE),
factor(f1, levels = rev(f1))
)
})

# example sort:
# parse_number <- function(x) as.numeric(gsub(".*?([[:digit:]]+).*", "\\1", x))
# sort_numeric <- function(x) x[order(parse_number(x))]
# fct_sort(c("chr2", "chr1", "chr10"), .fun = sort)
# fct_sort(c("chr2", "chr1", "chr10"), .fun = sort_numeric)
test_that("mixed-sort: typical use case", {
f1 <- c("chr2", "chr1", "chr10")
expect_equal(
fct_sort(f1, sort),
factor(f1, levels = c("chr1", "chr10", "chr2")),
info = "naive alphanumeric sort: 1 < 10 < 2"
)

parse_number <- function(x){
as.numeric(gsub(".*?([[:digit:]]+).*", "\\1", x))
}
sort_numeric <- function(x){
x[order(parse_number(x))]
}
expect_equal(
fct_sort(f1, sort_numeric),
factor(f1, levels = c("chr1", "chr2", "chr10")),
info = "numeric sorting of strings: 1 < 2 < 10"
)
})