Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement defer() on top of on.exit() #221

Merged
merged 7 commits into from
Mar 22, 2023
Merged

Implement defer() on top of on.exit() #221

merged 7 commits into from
Mar 22, 2023

Conversation

lionel-
Copy link
Member

@lionel- lionel- commented Mar 22, 2023

Branched from #220.

With this rewrite defer() is now a thin layer on top of on.exit(). This is possible thanks to two contributions that we made to R 3.5:

  • Added argument for FIFO cleanup: on.exit(after = FALSE)
  • Calling sys.on.exit() elsewhere than top-level didn't work. This is needed for manual invokation with deferred_run().

Because of this change all users of the standalone file now need to update their file. Until they do, the order of execution of defer()-based handlers from different packages will not be correct because they no longer share the same data structure. I'm only aware of rlang and @gaborcsardi's packages that use a standalone defer.

One benefit of switching to on.exit() is that everything now shares the same data structure implemented in the R call stack. This should fix forward-compatibility issues with standalone defer(), and more generally make defer() compatible with on.exit() and other wrappers of on.exit().

The other benefit is increased performance, cc @DavisVaughan:

library(withr)

base <- function() on.exit(NULL)
withr <- function() defer(NULL)

# CRAN
bench::mark(base(), withr(), check = FALSE)[1:8]
#> # A tibble: 2 × 8
#>   expression      min   median `itr/sec` mem_al…¹ gc/se…² n_itr  n_gc
#>   <bch:expr> <bch:tm> <bch:tm>     <dbl> <bch:by>   <dbl> <int> <dbl>
#> 1 base()         41ns  123.1ns  4576094.       0B      0  10000     0
#> 2 withr()      24.9µs   26.3µs    37077.   28.8KB    213.  9943    57

# With `source()` disabled by default
bench::mark(base(), withr(), check = FALSE)[1:8]
#> # A tibble: 2 × 8
#>   expression      min   median `itr/sec` mem_al…¹ gc/se…² n_itr  n_gc
#>   <bch:expr> <bch:tm> <bch:tm>     <dbl> <bch:by>   <dbl> <int> <dbl>
#> 1 base()         41ns  123.1ns   710408.       0B      0  10000     0
#> 2 withr()      13.6µs   14.5µs    67197.   93.3KB    209.  9969    31

# This PR
#> # A tibble: 2 × 8
#>   expression      min   median `itr/sec` mem_al…¹ gc/se…² n_itr  n_gc
#>   <bch:expr> <bch:tm> <bch:tm>     <dbl> <bch:by>   <dbl> <int> <dbl>
#> 1 base()      82.02ns 205.07ns  4261544.       0B     0   10000     0
#> 2 withr()      3.81µs   4.47µs   216726.       0B    43.4  9998     2

The remaining of the time is spent in identical() (to detect global envs) and match.arg().

prof_tbl(for (i in 1:10000) withr())
#> # A tibble: 8 × 5
#>   fn                      total.time total.pct self.time self.pct
#>   <chr>                        <dbl>     <dbl>     <dbl>    <dbl>
#> 1 identical                     0.02      33.3      0.02     33.3
#> 2 match.arg                     0.02      33.3      0.02     33.3
#> 3 do.call                       0.01      16.7      0.01     16.7
#> 4 getOption                     0.01      16.7      0.01     16.7
#> 5 defer                         0.06     100        0         0
#> 6 prof_tbl                      0.06     100        0         0
#> 7 withr                         0.06     100        0         0
#> 8 is_top_level_global_env       0.02      33.3      0         0

It would be hard to do better without implementing in C. But then I think I'd move defer() from withr to rlang. The withr implementation can remain pure R.

defer(expr, envir, priority),
list(expr = substitute(expr), envir = parent.frame(2), priority = priority, defer = defer)
), envir = parent.frame())
defer(expr, parent.frame(2), priority = priority)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This simplification is possible because the handlers are now implemented as thunks.

However I'm tempted to supersede defer_parent() in favour of defer(envir = parent.frame()) which I find easier to understand and update (e.g. pass the frame across one additional layer). Any objections?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function confused me quite a bit so I have no objections to that

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(hmm actually the simplification would have worked with the former impl of defer() too)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've never used defer_parent() so I agree that it's good to supersede it.

} else {
handlers <- c(get_handlers(envir), list(handler))
}
thunk <- as.call(list(function() expr))
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A withr handler is now a thunk, a function with promise semantics that causes evaluation of expr.

defer(expr, envir, priority),
list(expr = substitute(expr), envir = parent.frame(2), priority = priority, defer = defer)
), envir = parent.frame())
defer(expr, parent.frame(2), priority = priority)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've never used defer_parent() so I agree that it's good to supersede it.

@@ -9,11 +9,13 @@
#
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we could now implement defer() in C if we wanted to, I vote for moving defer() to rlang.

To me it feels like the natural place for this function anyways (because it is superpowered on.exit(), i.e. an rlang variant of an existing base function), and then the point of withr becomes providing very useful wrappers on top of it (with rlang still retaining low level ones like local_options()).

I think rlang would then own the pure R variant of standalone-defer.R, and withr could either:

  • Use import-standalone-defer.R to remain at 0 deps
  • Import rlang and use the faster rlang::defer(), at the cost of 1 dep, which does not bother me personally

DESCRIPTION Outdated Show resolved Hide resolved
NEWS.md Outdated Show resolved Hide resolved
R/defer.R Show resolved Hide resolved
}
}
frame_clear_exits <- function(frame = parent.frame()) {
do.call(on.exit, list(), envir = frame)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess the point here is that add = FALSE is being used? Maybe a comment?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is directly equivalent to calling on.exit().

@@ -42,55 +44,29 @@ defer <<- defer <- function(expr, envir = parent.frame(), priority = c("first",
}

priority <- match.arg(priority, choices = c("first", "last"))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it worth hacking together our own match.arg() approximation since this is 1/3 of the remaining time?

This doesn't handle a single NA_character_ but is otherwise ok ish

fn <- function(priority = c("first", "last")) {
  if (identical(priority, c("first", "last"))) {
    "first"
  } else if (typeof(priority) == "character" && length(priority) == 1L && (priority == "first" || priority == "last")) {
    priority
  } else {
    stop("oh no")
  }
}

fn2 <- function(priority = c("first", "last")) {
  match.arg(priority, choices = c("first", "last"))
}

bench::mark(fn(), fn2(), iterations = 100000)
#> # A tibble: 2 × 6
#>   expression      min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr> <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 fn()          855ns   1.04µs   672336.        0B     20.2
#> 2 fn2()        1.76µs   2.06µs   458890.        0B     22.9

bench::mark(fn(priority = "last"), fn2(priority = "last"), iterations = 100000)
#> # A tibble: 2 × 6
#>   expression                  min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>             <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 fn(priority = "last")    1.28µs   1.56µs   613656.        0B     24.5
#> 2 fn2(priority = "last")   3.44µs    3.9µs   244106.        0B     24.4

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think so but let's first see if we rewrite in C.

R/utils.R Show resolved Hide resolved
@lionel- lionel- changed the base branch from global-defer to main March 22, 2023 15:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants