Non-linear scale changes geom_density values #4783
Comments
Related: r-lib/scales#322
Right, density functions need to have the Jacobian correction applied when being transformed. In this case, given a density function f(x) and a monotonic transformation y = g(x), the density of y is f(g^-1(y)) |[g^-1]'(y)|. Here g(x) is log10(x), so g^-1(y) is 10^y and [g^-1]'(y) is 10^y log(10) = x log(10), and the corrected density simplifies to f(x) |x log(10)|. You can apply this correction manually:

library(tidyverse)
set.seed(1234)
tibble(x = rnorm(1000, mean = 10)) %>%
ggplot(aes(x)) +
geom_density(aes(x)) +
geom_function(fun = \(x) dnorm(x, mean = 10) * abs(x * log(10)), color = "red") +
scale_x_log10()

Created on 2022-04-03 by the reprex package (v2.0.1)

Of course it would be nice if the correction did not need to be applied manually. Shameless plug: {ggdist} currently supports this by finding the derivatives of scale transformations and applying them automatically to distributions' density functions, for example:

library(tidyverse)
library(ggdist)
library(distributional)
set.seed(1234)
tibble(x = rnorm(1000, mean = 10)) %>%
ggplot() +
stat_slab(aes(xdist = dist), data = data.frame(dist = dist_normal(10, 1)), normalize = "none", scale = 1) +
geom_density(aes(x)) +
scale_x_log10()

Created on 2022-04-03 by the reprex package (v2.0.1)

Currently ggdist does this through a combination of symbolic and numeric differentiation, but it would be nice not to have to do that (this is the motivation for r-lib/scales#322). I'm not sure what or if there's a good way of handling that in
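
The Jacobian correction discussed in this comment can be checked numerically without any plotting. The following is a minimal base R sketch, not part of the original comment: the names f, f_log10, corrected and uncorrected are illustrative, and the integration limits of 5 and 15 are an assumption chosen to cover essentially all of the Normal(10, 1) mass while staying on positive values where log10 is defined.

```r
# Density of X ~ Normal(10, 1) on the original data scale.
f <- function(x) dnorm(x, mean = 10, sd = 1)

# Density of Y = log10(X) by change of variables:
# f(10^y) * |d/dy 10^y| = f(x) * |x * log(10)| with x = 10^y.
f_log10 <- function(y) {
  x <- 10^y
  f(x) * abs(x * log(10))
}

# Integrate both curves over the log10 axis (assumed limits: x in [5, 15]).
lower <- log10(5)
upper <- log10(15)

# With the correction, the curve is a proper density on the log10 scale.
corrected <- integrate(f_log10, lower = lower, upper = upper)$value

# Without it, the area is far from 1, so the curve is not a density there.
uncorrected <- integrate(function(y) f(10^y), lower = lower, upper = upper)$value

print(c(corrected = corrected, uncorrected = uncorrected))
```

If the correction is right, the corrected area should be very close to 1, while the uncorrected area should be much smaller, which is consistent with the mismatch between the black and red curves before the correction is applied.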
Thanks for filing this issue! Unfortunately, I think it's out of scope for this package: developing good software requires relentless focus, which means that we have to say no to many good ideas. Even though I'm closing this issue, I really appreciate the feedback, and hope you'll continue to contribute in the future 😄
It is often useful to use geom_density() and geom_function() together, to compare data to a model. This is even an example in the geom_function() man page. However, the density values output by geom_density() change when a logarithmic axis is used, making such comparisons difficult.

I suspect that geom_density() is performing the kernel density estimation upon the data after the scale transformation has been applied. This will yield smoother curves in the logarithmic scale. However, it also reduces the distance between points, and this increases the density value.

If this is what's going on, perhaps we can find a way to apply an inverse transformation to the density values?
Created on 2022-03-30 by the reprex package (v2.0.1)
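
The inverse transformation asked about above can be sketched outside of ggplot2. This is a rough illustration only, assuming base R's density() as a stand-in for the kernel density estimate that geom_density() computes on the transformed values; it is not how ggplot2 is implemented, and the names kde_log, grid_x, dens_x, area and max_err are made up for this example.

```r
set.seed(1234)
x <- rnorm(1000, mean = 10)

# Kernel density estimate computed on the log10-transformed values,
# mimicking the suspected behaviour under scale_x_log10().
kde_log <- density(log10(x))

# Map the evaluation grid back to the data scale and undo the Jacobian:
# f_X(x) = f_Y(log10(x)) / (x * log(10)).
grid_x <- 10^kde_log$x
dens_x <- kde_log$y / (grid_x * log(10))

# Trapezoidal area under the back-transformed curve on the data scale.
area <- sum(diff(grid_x) * (head(dens_x, -1) + tail(dens_x, -1)) / 2)

# Largest pointwise gap between the back-transformed estimate and the model.
max_err <- max(abs(dens_x - dnorm(grid_x, mean = 10)))

print(c(area = area, max_err = max_err))
```

The back-transformed values are directly comparable with dnorm(x, mean = 10) on the original scale. The estimate is still smoothed in log space, so it will not match a kernel density estimate computed on the untransformed data exactly.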