Add fct_sort function #156
`fct_sort` takes a factor or character vector (implicitly converted to a factor) and reorders the `levels` of that factor using a user-specified function `.fun`. Code added to `R/sort.R`, unit tests to `tests/testthat/test-fct_sort.R`. Includes an example showing how to use `fct_sort` to sort number-containing character `levels` by the contained number.
Fixes #117
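The behaviour described above can be sketched in base R. This is an illustration only, not the code from this PR: `sort_levels_sketch` is a hypothetical stand-in for `fct_sort`, and it assumes `.fun` returns a permutation of the levels.

```r
# Hypothetical stand-in for fct_sort(); base R only, not the PR's code.
sort_levels_sketch <- function(f, .fun, ...) {
  f <- as.factor(f)                   # character vectors become factors
  new_levels <- .fun(levels(f), ...)  # extra arguments are passed on to .fun
  factor(f, levels = new_levels)      # values unchanged, levels reordered
}

x <- c("b", "c", "a", "b")
by_alpha <- sort_levels_sketch(x, sort)
by_alpha_desc <- sort_levels_sketch(x, sort, decreasing = TRUE)

levels(by_alpha)
levels(by_alpha_desc)
as.character(by_alpha_desc)  # the underlying values keep their order
```

Only the level order changes; the values stored in the factor are the same before and after.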
R/sort.R
Outdated
old_levels <- levels(f)
new_levels <- .fun(old_levels, ...)
stopifnot(
Would you mind making this a slightly friendlier function using the style guide at http://style.tidyverse.org/error-messages.html ?
Added more informative messages to indicate:
- when `.fun` returns something other than a vector
- when the sorted levels differ in length from the input levels
- when the sorted levels contain at least one level that is absent from the input levels
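The three checks listed above might look roughly like the following in base R. This is a sketch under assumptions: `check_new_levels` is a hypothetical helper, and the message wording is illustrative rather than the text used in the PR.

```r
# Hypothetical validator; messages are illustrative, not the PR's wording.
check_new_levels <- function(old_levels, new_levels) {
  if (is.null(new_levels) || !is.atomic(new_levels)) {
    stop("`.fun` must return a vector.", call. = FALSE)
  }
  if (length(new_levels) != length(old_levels)) {
    stop(
      "`.fun` must return ", length(old_levels), " levels, not ",
      length(new_levels), ".",
      call. = FALSE
    )
  }
  extra <- setdiff(new_levels, old_levels)
  if (length(extra) > 0) {
    stop(
      "`.fun` returned levels absent from the input: ",
      paste(extra, collapse = ", "), ".",
      call. = FALSE
    )
  }
  invisible(new_levels)
}

old <- c("a", "b", "c")
ok <- check_new_levels(old, rev(old))  # passes silently
msg_short <- tryCatch(check_new_levels(old, old[1:2]), error = conditionMessage)
msg_extra <- tryCatch(check_new_levels(old, c("a", "b", "z")), error = conditionMessage)
```

Note that these three checks on their own would still accept a result containing duplicated levels, such as `c("a", "a", "b")`.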
R/sort.R
Outdated
@@ -0,0 +1,38 @@
#' Automatically sort factor levels according to a user-defined function
Maybe remove "Automatically"?
The description now starts #' Sort factor levels ...
R/sort.R
Outdated
#' # naive alphanumeric sorting "1" < "10" < "2"
#' fct_sort(chr_fac, sort)
#'
#' # number-based alphanumeric sorting "1" < "2" < "10"
I think this is maybe a bit complicated for an example? Maybe just do something with alphabetical sorting? And maybe `sample()` to show random reordering?
I replaced the example. The new examples include:
- alphabetical sort
- alphabetical decreasing sort (equivalent to `fct_rev`)
- alphabetical sort with an out-of-order baseline level (equivalent to `fct_relevel`)
- sampling from the levels
The name Worth contemplating: could this be achieved by making If that's a no-go, I still think it's worth trying to think of a better name here.
The same is true for other But perhaps
I would prefer having a separate
I think @jennybc was suggesting that
Renamed man/R/tests files to correspond to the renamed function. Changed the description of `fct_sort_levels` to "Sort factor levels ..." following code review.
#' Sort factor levels according to a user-defined function
#'
#' @param .f A factor (or character vector).
#' @param .fun A function that will sort or permute the existing factor levels.
#' @param .fun A function that will sort or permute the existing factor levels.
#' @param .fun A function that will permute the existing factor levels.
@hadley: Given the emphasis on permutation in your edited version of the docs, would `fct_permute_levels` be a more appropriate name for the function?
Or, indeed, `lvls_permute`?
#' @param .f A factor (or character vector).
#' @param .fun A function that will sort or permute the existing factor levels.
#' It must accept one character argument and return a character argument of
#' the same length as it's input.
#' the same length as it's input.
#' the same length as its input.
#' # Default (alphabetical) level-sorting:
#' fct_sort_levels(medieval_experiment, sort)
#'
#' # Reversed ordering (equivalent to `fct_rev`):
It's not in general equivalent to `fct_rev()`, so I think it'd be better to just remove that comparison
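A small base-R illustration of why the two differ: `fct_rev()` reverses whatever level order the factor currently has, whereas `sort(..., decreasing = TRUE)` imposes reverse alphabetical order. Here `rev(levels(f))` is used as a stand-in for the level order `fct_rev()` would produce, and the factor `f` is a made-up example.

```r
# Levels deliberately not in alphabetical order.
f <- factor(c("b", "a", "c"), levels = c("b", "a", "c"))

reversed_levels <- rev(levels(f))                         # reverse current order
sorted_desc_levels <- sort(levels(f), decreasing = TRUE)  # reverse alphabetical

reversed_levels
sorted_desc_levels
identical(reversed_levels, sorted_desc_levels)
```

The two results coincide only when the existing levels are already in alphabetical order.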
#' # Reversed ordering (equivalent to `fct_rev`):
#' fct_sort_levels(medieval_experiment, sort, decreasing = TRUE)
#'
#' # Level-sorting with "Control" as the first level
And this seems inferior to `fct_relevel()`, so I'd also drop it.
#' medieval_experiment,
#' function(x) c("Control", sort(setdiff(x, "Control"))))
#'
#' # Randomised sorting of the levels:
#' # Randomised sorting of the levels:
#' # Randomly permute the levels:
call. = FALSE
)
}
if (length(old_levels) != length(new_levels)) {
This feels a little overwrought to me. Doesn't `fct_relevel()` already give error messages?
I don't think it provides an error in this setting. If passed a vector that is a proper subset of a factor's levels, `fct_relevel` just pulls the subset to the start of the factor's levels:
fct_relevel(factor(letters[1:3]), c("b", "a"))
[1] a b c
Levels: b a c
(I thought the error messages might be overkill, but was trying to follow the examples in your style guide)
After thinking about @jennybc's comment more, I think this is best implemented as a special case of
thanks
Ensures that the values in the factor levels are unchanged by the sorting function, and uses `...` to pass further arguments for the sorting function.