You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
I recently implemented a new feature to my logger that requires the use of .traceback to fetch the function name and line number that generated that logger function call. A while later, I realised this was slowing things down drastically when run sequentially or in parallel with future_Map function. I am not sure if this has the same effect on other future_* functions (e.g., lapply).
Took me a while to figure out and come up with a small example, but here it is:
require(future)
require(future.apply)
myDF<-function(...) {
t<- system.time(x<- .traceback(x=1))[["elapsed"]]
cat("time: ", t, ", length: ", length(x), ", size_mb: ", object.size(x)/1024/1024, "\n", sep="")
# print(x) # prints the entire call stack where the data is completely populated when called from future_Mapbase::data.frame(...)
}
ll<- replicate(3, sample(10, 1e6, TRUE), simplify=FALSE)
system.time(Map(function(x, y) myDF(x, y), ll, ll))
# time: 0, length: 6, size_mb: 0.002937317# time: 0, length: 6, size_mb: 0.002937317# time: 0, length: 6, size_mb: 0.002937317# user system elapsed# 0.15 0.00 0.16
system.time(future_Map(function(x, y) myDF(x, y), ll, ll))
# time: 4.24, length: 27, size_mb: 53.27061# time: 4.15, length: 27, size_mb: 53.27061# time: 4.18, length: 27, size_mb: 53.27061# user system elapsed# 13.01 0.03 13.05
If you uncomment the print(x) statement and run them, you'll see the difference in the way the call is populated. This is probably because of the way you use do.call() in generating the mapply function call: do.call(mapply, args=args). Here args, every element of args gets evaluated and are therefore not just calls/expressions, but materialised objects, I suspect.
The text was updated successfully, but these errors were encountered:
I recently implemented a new feature to my logger that requires the use of
.traceback
to fetch the function name and line number that generated that logger function call. A while later, I realised this was slowing things down drastically when run sequentially or in parallel withfuture_Map
function. I am not sure if this has the same effect on otherfuture_*
functions (e.g.,lapply
).Took me a while to figure out and come up with a small example, but here it is:
If you uncomment the
print(x)
statement and run them, you'll see the difference in the way the call is populated. This is probably because of the way you usedo.call()
in generating themapply
function call:do.call(mapply, args=args)
. Hereargs
, every element ofargs
getseval
uated and are therefore not just calls/expressions, but materialised objects, I suspect.The text was updated successfully, but these errors were encountered: