It has already been noted (see this issue) that the testGGPlot() function is "too rigorous" in that it detects differences between an expected and a (student) generated plot which actually look identical in R. The problem is not with the testGGPlot() function, however, which does what it is supposed to do. The issue is that ggplot2 stores a graph differently depending on which layers you use. This can be illustrated by the following two graphs, which both draw the same LOESS smoother for the cars data:
library(ggplot2)
p1 <- ggplot(cars, aes(x = speed, y = dist)) + geom_smooth(method = "loess", formula = y ~ x)
p2 <- ggplot(cars, aes(x = speed, y = dist)) + stat_smooth(method = "loess", formula = y ~ x)
The first graph uses a geom layer and the second graph uses a stat layer but both graphs look identical when visualized in R (which can easily be checked with the commands print(p1) and print(p2)). However, the objects p1 and p2, which are made by the ggplot() function of the ggplot2 package, are not the same. This can be checked with the function all.equal():
all.equal(p1, p2)
[1] "Component “layers”: Component 1: Component 2: Names: 3 string mismatches"
[2] "Component “layers”: Component 1: Component 2: Component 1: 1 element mismatch"
[3] "Component “layers”: Component 1: Component 2: Component 2: 'is.NA' value mismatch: 0 in current 1 in target"
[4] "Component “layers”: Component 1: Component 2: Component 3: 'is.NA' value mismatch: 1 in current 0 in target"
[5] "Component “layers”: Component 1: Component 4: Names: 5 string mismatches"
[R output truncated]
All differences between the two objects p1 and p2 appear to be in the first component of the layers component of each object, so we can have a look at what this component contains:
Further exploration shows that the differences actually reside in the components geom_params and stat_params. For geom_params, the difference is just one of the order of the elements:
p1[["layers"]][[1]]$geom_params
$na.rm
[1] FALSE
$orientation
[1] NA
$se
[1] TRUE
p2[["layers"]][[1]]$geom_params
$se
[1] TRUE
$na.rm
[1] FALSE
$orientation
[1] NA
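The order-only difference in geom_params can be made to disappear by sorting both lists by element name before comparing them. The following is a minimal sketch, not part of testGGPlot(): the helper sort_by_name() is hypothetical, and the two lists are typed in by hand to mirror the output printed above rather than taken from the plot objects (in practice they would come from p1[["layers"]][[1]]$geom_params and p2[["layers"]][[1]]$geom_params).

```r
# Lists mirroring the geom_params output printed above (typed in by hand).
g1 <- list(na.rm = FALSE, orientation = NA, se = TRUE)
g2 <- list(se = TRUE, na.rm = FALSE, orientation = NA)

# Hypothetical helper: reorder a named list alphabetically by element name.
sort_by_name <- function(x) x[order(names(x))]

# Compared as-is, the differing name order is reported as a mismatch.
same_as_is <- isTRUE(all.equal(g1, g2))

# Compared after sorting, the two lists are equal.
same_sorted <- isTRUE(all.equal(sort_by_name(g1), sort_by_name(g2)))
```

Here same_as_is is FALSE and same_sorted is TRUE, which confirms that the two geom_params lists hold the same elements in a different order.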
For stat_params, however, the elements of p1 are a subset of the elements of p2 (in a different order):
p1[["layers"]][[1]]$stat_params
$na.rm
[1] FALSE
$orientation
[1] NA
$se
[1] TRUE
$method
[1] "loess"
$formula
y ~ x
It is not immediately obvious how to deal with this. Creating a customized function for testing the equality of geom_params and stat_params (assuming there are no differences for other plot types) is probably too laborious. The solution seems to be to explicitly tell students which layers to use.
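To give an idea of what a customized comparison would involve, here is a rough sketch under stated assumptions. The function params_match() is hypothetical and not part of testGGPlot() or ggplot2; it compares two parameter lists only on the element names they have in common and ignores element order. The example lists are illustrative: full mirrors the stat_params of p1 printed above, and partial is a made-up list holding a subset of those elements.

```r
# Hypothetical helper: compare two named parameter lists on shared names only,
# ignoring the order in which the elements are stored.
params_match <- function(a, b) {
  shared <- sort(intersect(names(a), names(b)))
  isTRUE(all.equal(a[shared], b[shared]))
}

# Mirrors the stat_params of p1 printed above (typed in by hand).
full <- list(na.rm = FALSE, orientation = NA, se = TRUE,
             method = "loess", formula = y ~ x)

# Illustrative only: a subset of the same elements in a different order.
partial <- list(se = TRUE, na.rm = FALSE, orientation = NA)

# Illustrative only: same names as 'partial', but one value differs.
changed <- list(se = FALSE, na.rm = FALSE, orientation = NA)

match_subset  <- params_match(full, partial)
match_changed <- params_match(full, changed)
```

In this sketch match_subset is TRUE and match_changed is FALSE. Note the drawback: elements that exist in only one of the lists are silently ignored, so a student plot that omits method = "loess" would still pass. This is one reason why telling students explicitly which layers to use remains the simpler option.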