Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-20490][SPARKR] Add R wrappers for eqNullSafe and ! / not #17783

Closed
wants to merge 13 commits into from

Conversation

zero323
Copy link
Member

@zero323 zero323 commented Apr 27, 2017

What changes were proposed in this pull request?

  • Add null-safe equality operator %<=>% (sames as o.a.s.sql.Column.eqNullSafe, o.a.s.sql.Column.<=>)
  • Add boolean negation operator ! and function not .

How was this patch tested?

Existing unit tests, additional unit tests, check-cran.sh.

@zero323 zero323 changed the title [SPARK-20490][SPARKR][WIP] Add R wrappers for eqNullSafe and ! / not. [SPARK-20490][SPARKR][WIP] Add R wrappers for eqNullSafe and ! / not Apr 27, 2017
@SparkQA
Copy link

SparkQA commented Apr 27, 2017

Test build #76226 has finished for PR 17783 at commit 79857c9.

  • This patch fails R style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

#' ## 3 FALSE
#' }
#' @note not since 2.3.0
setMethod("not",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this conflict with testthat..

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is, but testthat::not is deprecated anyway.

R/pkg/R/column.R Outdated
#' ##3 FALSE
#' }
#' @note ! since 2.3.0
setMethod("!",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does this not conflict with any existing R operator?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

! is S4 generic. It is not different than & and | which we already support. I believe it hasn't been added so far just because it didn't fit into autogenerate template.

Copy link
Member Author

@zero323 zero323 Apr 27, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you have any thoughts about providing an example output in the docs. I see it makes Jenkins unhappy

R/column.R:325:5: style: Commented code should be removed.

but I believe this is an internal requirement, not something enforced by CRAN.

Note: I cannot reproduce this with local env (lintr 1.0). How is it checked on Jenkins?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

which lintr? the current release is 0.2.0?
but I don't think we have a pattern for including output in example doc.

I think you could try

#' #  (x = y) (x <=> y)

or

#'  (x = y) (x <=> y)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://cran.r-project.org/web/packages/lintr/, do we use something else here?

@SparkQA
Copy link

SparkQA commented Apr 27, 2017

Test build #76235 has finished for PR 17783 at commit 39c9720.

  • This patch fails R style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Apr 27, 2017

Test build #76237 has finished for PR 17783 at commit a443a84.

  • This patch fails R style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@@ -67,8 +67,7 @@ operators <- list(
"+" = "plus", "-" = "minus", "*" = "multiply", "/" = "divide", "%%" = "mod",
"==" = "equalTo", ">" = "gt", "<" = "lt", "!=" = "notEqual", "<=" = "leq", ">=" = "geq",
# we can not override `&&` and `||`, so use `&` and `|` instead
"&" = "and", "|" = "or", #, "!" = "unary_$bang"
"^" = "pow"
"&" = "and", "|" = "or", "^" = "pow"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what happens with #, ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this PR implements ! I think it is just confusing, don't you? Not to mention that createOperator didn't work with !. I can restore it if you believe it serves some purpose.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually, let me back track... it looks like "!" = "unary_$bang" has always been commented out. the #, does that
I agree we don't need to leave that comment here

R/pkg/R/column.R Outdated
#' @note ! since 2.3.0
setMethod("!",
signature(x = "Column"),
function(x) not(x))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe this should be single line?

setMethod("!", signature(x = "Column"), function(x) not(x))


df <- as.DataFrame(data.frame(is_true = c(TRUE, FALSE, NA)))
expect_equal(
collect(select(df, alias(SparkR::not(df$is_true), "is_false"))),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we need SparkR:: here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Initially I experienced some shading issues with testthat::not, but I cannot reproduce this any longer. Removed.

@@ -1965,6 +1975,16 @@ test_that("filter() on a DataFrame", {

# Test stats::filter is working
#expect_true(is.ts(filter(1:100, rep(1, 3)))) # nolint

# test suites for %<=>%
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you move this before # Test stats::filter is working block

@felixcheung
Copy link
Member

felixcheung commented Apr 28, 2017 via email

@SparkQA
Copy link

SparkQA commented Apr 28, 2017

Test build #76279 has finished for PR 17783 at commit 47c4b41.

  • This patch fails R style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@zero323
Copy link
Member Author

zero323 commented Apr 28, 2017

Looks like the same package, just GitHub tags / releases are behind the CRAN. OK, let's remove example output for now and leave lintr issues for another day.

I wonder if we could actually remove \dontrun, push sparkR.session into \dontshow and still keep inside the CRAN time limits...

@SparkQA
Copy link

SparkQA commented Apr 28, 2017

Test build #76280 has finished for PR 17783 at commit dc7c3dd.

  • This patch fails SparkR unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Apr 28, 2017

Test build #76281 has finished for PR 17783 at commit 3004310.

  • This patch fails SparkR unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Apr 28, 2017

Test build #76282 has finished for PR 17783 at commit 1788d6f.

  • This patch fails SparkR unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Apr 28, 2017

Test build #76285 has finished for PR 17783 at commit a0c9222.

  • This patch fails SparkR unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@zero323 zero323 force-pushed the SPARK-20490 branch 2 times, most recently from f83bcbd to 457b9c2 Compare April 29, 2017 01:10
@SparkQA
Copy link

SparkQA commented Apr 29, 2017

Test build #76290 has finished for PR 17783 at commit f83bcbd.

  • This patch fails SparkR unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Apr 29, 2017

Test build #76291 has finished for PR 17783 at commit 457b9c2.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@zero323
Copy link
Member Author

zero323 commented Apr 29, 2017

Jenkins retest this please.

@SparkQA
Copy link

SparkQA commented Apr 29, 2017

Test build #76295 has finished for PR 17783 at commit 457b9c2.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@felixcheung
Copy link
Member

is this ready? please remove the [WIP] in the title

@felixcheung
Copy link
Member

felixcheung commented Apr 29, 2017

I wonder if we could actually remove \dontrun, push sparkR.session into \dontshow and still keep inside the CRAN time limits...

we could try but the number of methods and examples we have is much? higher than a lot of other packages....

@zero323
Copy link
Member Author

zero323 commented Apr 29, 2017

we could try but the number of methods and examples we have is much? higher than a lot of other packages....

Yeah... Maybe it is possible to hack docs generation process instead. Leave \dontrun for examples to keep CRAN happy but actually execute these blocks when we create docs?

is this ready? please remove the [WIP] in the title

I want to give it some thought first. I am still confused about the way in which CRAN check and roxygen generate docs for !. Between \name should not contain ! and checker don't recognizing that docs exist it feels hacky.

@felixcheung
Copy link
Member

felixcheung commented Apr 29, 2017 via email

@zero323
Copy link
Member Author

zero323 commented Apr 29, 2017

In short, depending on how you set @rdname and @name we get warnings

 `prepare_Rd: not.Rd: \name should not contain !`

or

Undocumented S4 methods:
  generic '!' and siglist 'Column'

(not that we document any other Column operator) and both are considered failures in our build. Attaching these to Column docs seems to do the trick, but we put column function there as well, and it is not that clean.

I kind of know how to handle this by writing Rd files by hand, but it is not a solution here.

@felixcheung
Copy link
Member

didn't it warn about @rdname column being lower cased? Not sure how later tests are passing.
what if we change both @rdname and @name?

#' @rdname not
#' @name not

@zero323
Copy link
Member Author

zero323 commented Apr 30, 2017

didn't it warn about @Rdname column being lower cased?

We use lower case column as @rdname for both column and Column-class so it is OK. We just have both in the same output file (and now ! as well).

I am pretty sure this is the one I used at the beginning, and it worked form me locally, but failed on Jenkins. I'll double check this but it feels like cargo cult programming.

@SparkQA
Copy link

SparkQA commented Apr 30, 2017

Test build #76314 has finished for PR 17783 at commit 99f8dd2.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@zero323
Copy link
Member Author

zero323 commented Apr 30, 2017

Huh, seems to work after all.

@felixcheung
Copy link
Member

I'm surprised @name %<=>% does too

R/pkg/R/column.R Outdated
#' head(select(df, !column("x") > 0))
#' }
#' @note ! since 2.3.0
#' @seealso \link{not}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we don't need this since @rdname not - they both go to the same html page

@SparkQA
Copy link

SparkQA commented Apr 30, 2017

Test build #76316 has finished for PR 17783 at commit f44aa74.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@zero323 zero323 changed the title [SPARK-20490][SPARKR][WIP] Add R wrappers for eqNullSafe and ! / not [SPARK-20490][SPARKR] Add R wrappers for eqNullSafe and ! / not Apr 30, 2017
@SparkQA
Copy link

SparkQA commented Apr 30, 2017

Test build #76329 has finished for PR 17783 at commit 8ab9e0c.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@felixcheung
Copy link
Member

merged to master

@asfgit asfgit closed this in 80e9cf1 May 1, 2017
@zero323
Copy link
Member Author

zero323 commented May 1, 2017

Thanks.

@zero323 zero323 deleted the SPARK-20490 branch May 8, 2017 09:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants