Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

r_ply repeats one too many times? #158

Closed
garrettgman opened this issue May 7, 2013 · 14 comments · Fixed by #189
Closed

r_ply repeats one too many times? #158

garrettgman opened this issue May 7, 2013 · 14 comments · Fixed by #189

Comments

@garrettgman
Copy link

I'd expect this to replicate the expression once.
If this isn't a bug, could we change it for teaching purposes?

r_ply(1, print("ping"))
[1] "ping"
[1] "ping"
NULL
@hadley
Copy link
Owner

hadley commented May 7, 2013

Doh. That's definitely a bug.

@everpeace
Copy link

is.function evaluates argument object as below

> is.function(print("ping"))
[1] "ping"
[1] FALSE

I came up this idea for checking the expression is function without evaluating it.

> grepl('^function',substitute(print("ping"))[[1]])
[1] FALSE
> grepl('^function',substitute(function()print("ping"))[[1]])
[1] TRUE

What do you think about this method?
If it's ok, I'm ready for pull request.
I made a small commit to my forked repo below.
everpeace@e77db00

@krlmlr
Copy link

krlmlr commented Jul 14, 2013

The is.function check could be replaced by !is.call(substitute(...)).

> checker <- function (c) { is.call(substitute(c)) }
> checker(print)
[1] FALSE
> checker(print("ping"))
[1] TRUE

@everpeace
Copy link

Thanks for your suggestion.
Unfortunately, is.function couldn't be replaced by !is.call(substitute(...)) as below.
As you can see in #162 , I closed my pull request because my patch does also include some bugs...

> checker <- function (c) { is.call(substitute(c)) }
> checker(print)
[1] FALSE
> checker(function()print("bing"))
[1] TRUE

@everpeace
Copy link

As crowding commented in #162,
There are multiple place where is.function is used for checking expr.

We probably also want to maintain consistency between r_ply rlply raply and rdply.

@krlmlr
Copy link

krlmlr commented Jul 14, 2013

The is.call just seemed to work too beautifully, but of course it's wrong. Also, I don't see a way at all to check, without evaluating, if .expr is a function or an expression to be executed. I was thinking about an implementation along the following lines:

  • If n <= 0 --> early exit, don't look at .expr
  • Otherwise, use is.function to detect the nature of .expr
    • If it is in fact an expression that needs to be wrapped in a function, decrease n by one
  • Proceed as before

Same for the other r?ply functions. Perhaps this could be wrapped to avoid repetition in the code.

Tests can be added to verify the behavior for the three interesting cases (print, print("ping"), function() print("ping")). Printing can be simulated by increasing a global variable.

@everpeace
Copy link

How about this??
This method can be fully reusable and no evaluation for .expr happens.

> library(testthat)
> 
> is_function <- function(.expr){
+   if(is.call(substitute(.expr))){
+     grepl('^function',substitute(.expr)[[1]])
+   } else {
+     is.function(.expr)
+   }
+ }
> 
> 
> expect_false(is_function(print("bing"))) # function call
> expect_false(is_function({print("bing")})) # some block
> expect_true(is_function(function()print("bing"))) # anonymous function
> expect_true(is_function(print)) # function object

I know grepl is agly, but I can't hit on other ideas other than krlmlr's way.

krlmlr's way would be good I think. But we have to decrease n by one outside of the codes which determines .expr is function or not. This breaks DRY principle.

@wch
Copy link
Collaborator

wch commented Jul 16, 2013

Here are some other "fun" test cases, along with the actual results from is_function(). I don't know if these results should be considered correct or not.

is_function({ 1+1; function() 20 })
# FALSE
is_function((function() 20))
# FALSE
is_function({ function() 20 })
# FALSE
functionxx <- function() 20
is_function(functionxx())
# TRUE

I suspect there's not really a perfectly reliable way to solve the problem.

@krlmlr
Copy link

krlmlr commented Jul 16, 2013

@everpeace: If you insist on DRY, it's possible to write a function that returns a list of two components -- converted .expr and remaining repetition count.

@hadley
Copy link
Owner

hadley commented Jul 16, 2013

@everpeace grepl is definitely not the way to work with language objects - see https://github.com/hadley/devtools/wiki/Computing-on-the-language for more details

@wch Right - I don't think it's possible to tell if something is a function without evaluating it, or righting something that is equivalent in complexity to the evaluator

One option would be to add an argument that allows you to manually describe whether or not you're passing in a function or an expression,

@krlmlr
Copy link

krlmlr commented Jul 16, 2013

Especially the requirement to provide a consistent interface for all r?ply function is difficult to fulfill. The functions raply and rdply both call rlply with .expr = .expr, which already evaluates .expr. Any magic treatment of .expr will have to happen inside these functions, so code duplication seems to be unavoidable. I see two options for maintaining consistency of both side effects and results:

  1. Automatic solution
    • Evaluate .expr to see if it is a function
      • Yes: Simple case
      • No:
        • Create a wrapper (this has to be done before forcing)
        • Record the result of the evaluation and use it as first element of the result
        • Decrease .n by one
  2. An explicit parameter, as suggested by @hadley.

The automatic solution is difficult to implement even for r_ply (I will push my humble attempts), and difficult to reuse for the other r?ply functions. On the other hand, how is an explicit parameter different from just prepending function() (11 characters including the space)?

Perhaps the simplest thing would be just to require .expr to be a function.

@hadley
Copy link
Owner

hadley commented Jul 16, 2013

@krlmlr the main difference is that adding a parameter would not break any existing code, and being able to supply an expression to r*ply is very important.

@krlmlr
Copy link

krlmlr commented Jul 16, 2013

@hadley: Indeed, backward compatibility is at a premium for plyr, given the number of reverse depends. Have to jump through hoops, then.

@krlmlr krlmlr mentioned this issue Jul 16, 2013
@krlmlr
Copy link

krlmlr commented Jul 20, 2013

@hadley: I was surprised that you rejected my pull request #165 with the following comment:

There's no way to fix this problem in general without breaking old code. For that reason, I think we need an additional argument to r*ply that allows you to override the default guess. ...

I think that my solution shown in #165 solves the problem without breaking old code (short of making sure that exactly n side effects are seen). Who would rely on such odd behavior as documented in this issue? Is this behavior really worth keeping? -- If yes, then we indeed need an extra parameter.

The implementation follows the "automatic solution" outline of my earlier comment. The expression is evaluated exactly n times in all cases. Other features include:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants