Performance improvements? #102
Comments
Thanks for the pointer. I tried playing with fmt but did not see any immediate performance improvements. This is potentially because a lot of the writing is broken up into smaller pieces due to the architecture of svglite... So it is possible that a speed gain could be had if each tag were written in one go using fmt, but since the number of style elements and attributes is variable, the logic for creating the format string would end up being quite horrible.
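To make the trade-off above concrete, here is a minimal sketch in R (the actual code paths in svglite are C++, and fmt is a C++ library). The helper names `write_tag_pieces`, `write_tag_once` and `render` are hypothetical and exist only for this illustration. It contrasts emitting a tag fragment by fragment with assembling a tag that has a variable number of attributes and writing it once.

```r
# Hypothetical helpers for illustration only; not part of svglite or fmt.

# Piecewise: one write per fragment, as when a tag is emitted bit by bit.
write_tag_pieces <- function(con, name, attrs) {
  cat("<", name, sep = "", file = con)
  for (key in names(attrs)) {
    cat(" ", key, "='", as.character(attrs[[key]]), "'", sep = "", file = con)
  }
  cat("/>\n", file = con)
}

# One go: assemble the whole tag first, then issue a single write.
# sprintf() returns character(0) for an empty attribute list, so the
# variable number of attributes needs no special-casing here.
write_tag_once <- function(con, name, attrs) {
  values <- vapply(attrs, as.character, character(1))
  attr_str <- paste(sprintf(" %s='%s'", names(attrs), values), collapse = "")
  cat("<", name, attr_str, "/>\n", sep = "", file = con)
}

# Run a writer against an in-memory text connection and return its output.
render <- function(writer, ...) {
  con <- textConnection(NULL, open = "w")
  on.exit(close(con))
  writer(con, ...)
  textConnectionValue(con)
}

attrs <- list(cx = 10, cy = 20, r = 3, fill = "none")
out_pieces <- render(write_tag_pieces, "circle", attrs)
out_once <- render(write_tag_once, "circle", attrs)
```

Both writers produce the same tag; whether the single write is measurably faster depends on the underlying stream, which this sketch does not measure.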
Interesting, yes, I think instruction cache misses could be a part of it. I am collecting all the draw call data in a vector and rendering it in one go.
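The "collect first, render once" idea can be sketched roughly as follows. This is an R illustration under stated assumptions, not httpgd's implementation (which is C++): `new_recorder` and its `circle`/`render` members are hypothetical names, and only circles are recorded.

```r
# Hypothetical recorder for illustration only; not httpgd's API.
new_recorder <- function() {
  calls <- list()  # draw calls accumulate here instead of being written out
  list(
    # Record a draw call; nothing is formatted or written at this point.
    circle = function(x, y, r) {
      calls[[length(calls) + 1L]] <<- list(x = x, y = y, r = r)
      invisible(NULL)
    },
    # Format every recorded call and assemble the document in a single pass.
    render = function() {
      body <- vapply(calls, function(cl) {
        sprintf("<circle cx='%.2f' cy='%.2f' r='%.2f'/>", cl$x, cl$y, cl$r)
      }, character(1))
      paste(c("<svg xmlns='http://www.w3.org/2000/svg'>", body, "</svg>"),
            collapse = "\n")
    }
  )
}

rec <- new_recorder()
rec$circle(1, 2, 3)
rec$circle(4, 5, 6)
svg <- rec$render()
```

Because the document is only assembled in `render()`, the output never has to be valid in an intermediate state.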
Do you have a pointer to where you are doing this?
Sure, Edit: If you were wondering: |
I wonder if the main speed-up comes from the fact that you are formatting directly into the ostream object... Due to the structure of svglite I have to format into a string buffer first and then write the buffer to the final stream... |
So, I figured out the main difference between the performance of svglite and httpgd. It has nothing to do with string formatting or allocations or anything like that. svglite maintains a valid SVG file at all times, which means that it always closes the open tags and then rewinds the stream position. The rewinding makes up half the plotting time, which accounts for exactly the performance difference.
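The "close the tags, then rewind" pattern described above can be sketched like this. It is an R illustration using a file connection and `seek()`, not svglite's code (which does this in C++ on its output stream); `write_chunk` is a hypothetical helper, and note that R's documentation discourages `seek()` on Windows.

```r
# Illustration only: keep the file valid after every element by writing the
# closing tag and then rewinding so the next element overwrites it.
path <- tempfile(fileext = ".svg")
con <- file(path, open = "wb")

# Hypothetical helper: write raw bytes so byte offsets are predictable.
write_chunk <- function(text) writeBin(charToRaw(text), con)

write_chunk("<svg xmlns='http://www.w3.org/2000/svg'>\n")
for (i in 1:3) {
  write_chunk(sprintf("<circle cx='%d' cy='%d' r='1'/>\n", i, i))
  pos <- seek(con, where = NA, rw = "write")  # remember where the body ends
  write_chunk("</svg>\n")                     # file on disk is a complete document
  flush(con)
  seek(con, where = pos, rw = "write")        # rewind over the closing tag
}
close(con)

svg_lines <- readLines(path)
```

The extra write, flush and seek happen once per element, which is the kind of per-element overhead the comment above attributes the slowdown to; this sketch does not measure it.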
That does make a lot of sense. I just re-ran the benchmark with the current development version of svglite.
Thanks. I ran the httpgd benchmarks as part of all this and httpgd was consistently twice as fast as svglite, but that difference disappeared with the removal of the stream seek call. I'm going to make this optional, as that feature in itself is quite niche and can't justify the performance toll... You should look into the new text rendering setup in svglite and update httpgd to match it, as it opens up a lot of new text features.
Must be caused by my setup then. I will look into integrating the benchmark in the CI for it to be more consistent. Thanks for the tip! I will do that.
I have been doing some more optimization and benchmarking (now only calling in-memory functions to not be bottlenecked by disk writes) and noticed that httpgd is still exponentially faster than svglite: I think this is most likely caused by svgstring fixing Lines 1064 to 1072 in 654ab4a
Is there a reason for this?
Benchmark code:
library(svglite)
library(httpgd)
library(ggplot2) # needed for ggplot(), aes() and the scale_*/theme calls below
# Benchmark: Time to plot
results <- bench::press(
  pts = 2^(0:18),
  {
    set.seed(1234)
    x <- runif(pts)
    y <- runif(pts)
    svglite_test <- function() {
      stringSVG({
        plot(x, y)
      })
    }
    httpgd_test <- function() {
      hgd_inline({
        plot(x, y)
      })
    }
    bench::mark(httpgd_test(), svglite_test(), iterations = 128, check = FALSE)
  }
)
# Benchmark: SVG size
df <- data.frame(pts = 2^(0:18))
df$pts
df["svglite_test()"] <- vapply(df$pts, function(i) {
  set.seed(1234)
  x <- runif(i)
  y <- runif(i)
  nchar(stringSVG({
    plot(x, y)
  }))},
  numeric(1)
)
df["httpgd_test()"] <- vapply(df$pts, function(i) {
  set.seed(1234)
  x <- runif(i)
  y <- runif(i)
  nchar(hgd_inline({
    plot(x, y)
  }))},
  numeric(1)
)
# Merge data
df <- tidyr::pivot_longer(df, c("svglite_test()", "httpgd_test()"), names_to = "expression", values_to = "chars")
results$expression <- as.character(results$expression)
df <- dplyr::inner_join(df, results)
df$mem_alloc <- as.numeric(df$mem_alloc)
dfmem <- tidyr::pivot_longer(df, c("mem_alloc", "chars"), names_to = "mem_type", values_to = "mem_val")
# Plot results
g1 <- ggplot(df, aes(x=pts, y=as.numeric(median), colour=expression)) +
  scale_x_log10(name = 'number of plot points',
                breaks = 10^(0:5),
                labels = function(x) format(x, scientific = FALSE)) +
  scale_y_log10(name = 'time to plot (sec)') +
  scale_colour_discrete(name = '', labels=list(`svglite_test()`="svglite", `httpgd_test()`="httpgd")) +
  geom_point() +
  geom_line() +
  theme_bw() +
  annotation_logticks() +
  theme(legend.position="bottom")
g2 <- ggplot(dfmem, aes(x=pts, y=mem_val/1024, colour=expression, shape=mem_type)) +
  scale_x_log10(name = 'number of plot points',
                breaks = 10^(0:5),
                labels = function(x) format(x, scientific = FALSE)) +
  scale_y_log10(name = 'size (KB)') +
  scale_shape_discrete(name = '', labels=list(chars="SVG size", mem_alloc="allocated memory")) +
  scale_colour_discrete(name = '', labels=list(`svglite_test()`="svglite", `httpgd_test()`="httpgd")) +
  geom_point() +
  geom_line() +
  theme_bw() +
  annotation_logticks() +
  theme(legend.position="bottom", legend.box="vertical", legend.margin=margin())
gridExtra::grid.arrange(g1, g2, ncol = 2)
The two devices |
Then it is something else :-) maybe the formatting you mentioned earlier... but it is not related to |
Yes, good to know, thanks for the responses. I primarily wanted to share the new benchmarks.
Much appreciated. I may look at it closer next time I'm working on svglite.
Would you mind sharing your performance comparison setup? I'm a bit unsure how to do a fair comparison with httpgd for pure SVG performance.
No problem at all. httpgd has a helper function that plots to SVG by default and returns an R string:

library(httpgd)
mysvg <- hgd_inline({
  hist(rnorm(100))
})

This basically starts and closes an offline device and is equivalent to:

library(httpgd)
hgd(webserver = FALSE) # start device
hist(rnorm(100))       # plot something
mysvg <- hgd_plot()    # render last plot to SVG
dev.off()              # close device

Keeping the device open and continuously plotting with the same device would have a lower overhead but that should be constant (independent of the number of previous plots) so using

This is the code I used previously for the benchmark, but measuring R overhead with your devoid device is missing: https://github.com/nx10/httpgd/blob/44ccccaa6352ee5a80f43a3d7c79880fce35ad18/docs/benchmark.R

I added an alternative SVG renderer recently that can be set with

All httpgd plots will be returned as memory objects by default, but will be written to disk instead when the
Just a reminder to investigate if there are any avenues open for improving performance