Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Formula processing #197

Merged
merged 7 commits into from
Aug 20, 2024
Merged

Formula processing #197

merged 7 commits into from
Aug 20, 2024

Conversation

zeileis
Copy link
Collaborator

@zeileis zeileis commented Aug 6, 2024

Fixes #191

I started to work on a completely revised formula handling to address the issue above. This is still leaner than employing the Formula package but uses a similar approach:

  • Auxiliary function tinyformula explicitly breaks up the formulas into separate formulas for x, y (if any), by (if any), and facet (if a formula) plus a full formula combining all elements (on the right-hand side).
  • Auxiliary function tinyframe to extract the variables from the different formulas from the overall model frame.
  • This allows for better sanity checking, e.g., that there is exactly one x and at most one y etc.

By and large things seem to work. However, I have apparently messed up the two-sided facet formulas. Grant @grantmcdermott if you have the time to look at this, this would be great. I'm not sure whether I fully understand how the attributes work...

@grantmcdermott
Copy link
Owner

Thanks so much for taking this on @zeileis!

By and large things seem to work. However, I have apparently messed up the two-sided facet formulas. Grant @grantmcdermott if you have the time to look at this, this would be great. I'm not sure whether I fully understand how the attributes work...

This week is tough for me due to work deadlines, but I'll try to take a deeper look over the weekend.

@zeileis
Copy link
Collaborator Author

zeileis commented Aug 6, 2024

That's perfectly fine. I'll be on vacation for a bit more than two weeks. So I won't have a lot of time either but should be able to respond to smaller messages.

I hope that the new code is also reasonably easy to read and understand. Most of it works fine but I still didn't understand how the facet = a ~ b specification is processed correctly. Some cases I was able to handle correctly but others I didn't understand properly. BTW, in principle we would now also have the infrastructure to have facet specifications with multiple variables on both siders, e.g., a1 + a2 ~ b1 + b2 etc. But I didn't actually test this, yet, because I couldn't get even the simple case right.

@grantmcdermott
Copy link
Owner

grantmcdermott commented Aug 17, 2024

Hi @zeileis. I finally had some time to review this and everything looks excellent. Thank you!

I only have two minor comments and then we can merge:

@zeileis
Copy link
Collaborator Author

zeileis commented Aug 18, 2024

I hope I have correctly incorporated your second point.

And regarding the first point: When the number of rows and columns is different, then certain combinations of facet specifications do not work. For example, this looks as intended:

tinyplot(mpg ~ wt, facet = vs ~ gear, data = mtcars)

But this is messed up because the layout is still 2 x 3 rather than 3 x 2 and hence it is not surprising that the annotation is not correct:

tinyplot(mpg ~ wt, facet = gear ~ vs, data = mtcars)

If you have more variables in the facets that are then interacted, things can get even worse.

@grantmcdermott
Copy link
Owner

Ah, I see now.

Thanks, let me take a deeper look. Hopefully, the fix won't be too difficult...

@grantmcdermott
Copy link
Owner

grantmcdermott commented Aug 19, 2024

Hi @zeileis. I've pushed a few commits directly to your PR—hope that's okay—that seem to have plugged the problem areas. Quick examples...

pkgload::load_all("~/Documents/Projects/tinyplot")
#> ℹ Loading tinyplot
tpar(las = 1)

This was working beforehand and still does (2 x 3 arrangement):

tinyplot(mpg ~ wt, facet = vs ~ gear, data = mtcars)

This one wasn't working correctly beforehand, but now does (3 x 2 arrangement)):

tinyplot(mpg ~ wt, facet = gear ~ vs, data = mtcars)

And here's a more complicated example with multiple variables in the facet formula LHS , which used to trigger an error due to missing values in some facet windows, but now plots everything correctly.

tinyplot(mpg ~ wt, facet = gear:vs ~ cyl, data = mtcars, pch = 16, axes = "l", grid = TRUE)

Compare against equivalent ggplot2 call:

library(ggplot2)
ggplot(mtcars, aes(wt, mpg)) +
  geom_point() +
  facet_grid(vs+gear~cyl, labeller = labeller(.multi_line = FALSE)) +
  theme_minimal()

Created on 2024-08-18 with reprex v2.1.1

Given that you're on (a well-deserved!) vacation, I imagine that it may be tricky for you to test on your side. But let me know what you think.

@zeileis
Copy link
Collaborator Author

zeileis commented Aug 19, 2024

Wonderful, thanks a lot. I'll try to have a look tonight!

@grantmcdermott
Copy link
Owner

Fixes #205.

@zeileis
Copy link
Collaborator Author

zeileis commented Aug 19, 2024

Very nice, thanks for these fixes! I've played around a bit and didn't find anything else. So IMO you can move forward and merge this.

Just one comment and one question:

@grantmcdermott
Copy link
Owner

Agree with both! The if...{} else...{} construct is mostly a function of habit from my side. I'll revert to your concise suggestion and then merge.

@grantmcdermott grantmcdermott merged commit 0288d76 into main Aug 20, 2024
3 checks passed
@grantmcdermott grantmcdermott deleted the formula-terms branch August 20, 2024 00:19
@grantmcdermott
Copy link
Owner

grantmcdermott commented Aug 20, 2024

Thank you @zeileis! This is great.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

tinyplot(fn(x) ~ x) triggers inadvertent continuous groups
2 participants