Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

r?ply issues #165

Closed
wants to merge 6 commits into from
Closed

r?ply issues #165

wants to merge 6 commits into from

Conversation

krlmlr
Copy link

@krlmlr krlmlr commented Jul 16, 2013

An attempt to seamlessly support both expressions and functions in the r?ply family, maintaining consistency for both side effects and output.

Fixes #158 and #164

@hadley
Copy link
Owner

hadley commented Jul 18, 2013

There's no way to fix this problem in general without breaking old code. For that reason, I think we need an additional argument to r*ply that allows you to override the default guess. I'd suggest quote = NA for the current behaviour, quote = FALSE to assume it's already a function and quote = TRUE to wrap an expression into a function.

@hadley hadley closed this Jul 18, 2013
@krlmlr
Copy link
Author

krlmlr commented Jul 18, 2013

In what way does my implementation break old code?

@hadley
Copy link
Owner

hadley commented Jul 21, 2013

Can we please keep the discussion of the pull request in one place? Here is some more detailed feedback:

  • I really don't like the .cc function - the name is not intuitive, and I think it would be easier to just test for .n == 0 in each of the individual functions (and if you're going to do that, you probably should actually test that .n is an integer vector of length 1)
  • r_ply was deliberately implemented without any intermediate data structures because it is slightly more efficient and more clearly conveys the intent of the function. It also means that you don't need the .print and .discard arguments. It would make more sense to me to extract the code that figures out whether the input is an expression or a function into it's own function.
  • I don't think you've provided a full set of cases for the various things that .expr might be (a function, an expression, an function that returns an expression, a function that evaluates to a function etc), and I don't understand how the code works, so you either need to add comments or refactor it into simpler components.
  • I don't understand why you're testing the r*ply functions using a function with side effects. This is not something I'd expect someone to use except for with r_ply. Also, aliasing the actual function being tested to rply makes it rather hard to understand what's going on.

@krlmlr
Copy link
Author

krlmlr commented Jul 21, 2013

Thank you for your very concise input. I have addressed most of your remarks, please see the updated branch for the most recent changes.

  1. .cc has been removed, I have found an implementation that doesn't need it.
  2. I can convert .rlply_worker to a stripped-down version that does not accumulate a list, but I'd prefer to do this once we agree upon an implementation of .rlply_worker. Also, I'd like to measure the performance impact of using a more generic function -- what do you think?
  3. I have supplied comments to both the tests and the crucial parts of the new code. I hope it is clearer now. Also, tests have been fixed and some new tests for r_ply have been added.
  4. What if .expr is a function that is very expensive to evaluate? In such a case we'd save one evaluation also with the other r*ply functions.

If we agree on the code, I'll be glad to make a sausage and open a new pull request.

@hadley
Copy link
Owner

hadley commented Jul 21, 2013

It'd be great if we could keep everything in this one pull request instead of scattering it in multiple places - could you please merge the commits into this branch?

@krlmlr
Copy link
Author

krlmlr commented Jul 21, 2013

They are in this branch, but the pull request is closed, so they don't show up.

https://github.com/krlmlr/plyr/tree/fix/158-rply-expr

@hadley hadley reopened this Jul 26, 2013
@hadley
Copy link
Owner

hadley commented Jul 26, 2013

That's annoying - reopened just so I can see the changes.

@krlmlr
Copy link
Author

krlmlr commented Jul 26, 2013


result <- vector("list", length = .n)

# The logic below is responsible for ascertaining that .expr is evaluated
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this explanation - it's useful, and I now understand (and like!) your strategy. However, I think it would be better separated into its own function, like:

expr_to_fun <- function(x) {
  expr <- substitute(x)
  res <- force(x)

  if (is.function(res)) {
    list(f = res, val = res())
  } else {
    f <- eval.parent(substitute(function() expr))
    list(f = f, val = res)
  }
}

expr_to_fun(function() runif(10))
expr_to_fun(runif(10))

then that can easily be tested independently of r*ply.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you. This should be doable, thanks to lazy evaluation. Will push to this branch.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately, I had to change the implementation, see 1eaf74d. The reason can be seen in a small self-contained test I have prepared. This test fails, but works if line 20 is replaced by f2e <- .expr_to_fun(inc()). Unfortunately, the code in the test is an extract of what is happening for, e.g., raply or rdply -- and rlply if a generic worker function is used.

@krlmlr
Copy link
Author

krlmlr commented Aug 5, 2013

Indeed, with the current code r_ply runs much slower than before. I will create and use a new function .r_ply_worker from the finally accepted version of .rlply_worker.

@krlmlr
Copy link
Author

krlmlr commented Oct 24, 2013

squashed and optimized somewhat

@hadley
Copy link
Owner

hadley commented Jan 2, 2014

Could you please rebase/merge, update docs with latest roxygen2 and open a new pull request?

@hadley hadley closed this Jan 2, 2014
@krlmlr krlmlr mentioned this pull request Jan 6, 2014
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

r_ply repeats one too many times?
2 participants