-
Notifications
You must be signed in to change notification settings - Fork 28.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-15490][R][DOC] SparkR 2.0 QA: New R APIs and API docs for non-MLib changes #13394
Conversation
Test build #59589 has finished for PR 13394 at commit
|
Jenkins test this please |
Test build #59600 has finished for PR 13394 at commit
|
@felixcheung Could you take a look at this PR ? |
@vectorijk Thanks for the PR. Changes look pretty good to me. We can do that in a separate JIRA/PR or if you wish we can also do it in this PR. |
|
#' @rdname tojson | ||
#' @noRd | ||
#' @rdname toJSON | ||
#' @name toJSON |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wait, why are we changing this from @noRd
? this is not exported from SparkR and should not be documented.
@shivaram For updating the programming guide, I'd love to do this in a separate PR. |
Test build #59648 has finished for PR 13394 at commit
|
Thanks @vectorijk - I created https://issues.apache.org/jira/browse/SPARK-15672 for that |
@@ -2514,7 +2529,9 @@ setMethod("attach", | |||
#' environment. Then, the given expression is evaluated in this new | |||
#' environment. | |||
#' | |||
#' @title with |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@shivaram Is this supposed to be a long-form title, or just the name of the method? Looking at other examples, it looks like it should be a short description
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@shivaram Yes, I also notice titles of other examples are not consistent. Which one should we use? Short description or just the name of the method.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should follow the example of existing R packages and use the long form as the title. For example if you look at https://stat.ethz.ch/R-manual/R-devel/library/stats/html/glm.html the title of the page is "Fitting Generalized Linear Models"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@vectorijk Could you please update the PR this way?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jkbradley I will do it ASAP.
Test build #59909 has finished for PR 13394 at commit
|
@@ -1069,6 +1079,8 @@ setMethod("first", | |||
#' | |||
#' @param x A SparkDataFrame | |||
#' | |||
#' @family SparkDataFrame functions | |||
#' @rdname tordd |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's revert these 2 lines as well? thanks
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@felixcheung ok, should we also remove these two lines in toJSON
part in line 631?
@@ -628,8 +628,6 @@ setMethod("repartition", | |||
#' | |||
#' @param x A SparkDataFrame | |||
#' @return A StringRRDD of JSON objects | |||
#' @family SparkDataFrame functions |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@felixcheung I removed these two lines in toJSON part. Correct me, if I am wrong.
Test build #60030 has finished for PR 13394 at commit
|
@@ -647,7 +645,7 @@ setMethod("toJSON", | |||
RDD(jrdd, serializedMode = "string") | |||
}) | |||
|
|||
#' write.json | |||
#' Save the contents of DataFrame as a JSON file |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we use SparkDataFrame as opposed to DataFrame (see https://issues.apache.org/jira/browse/SPARK-12148 for some more details).
@@ -582,9 +586,9 @@ setMethod("summary", signature(object = "AFTSurvivalRegressionModel"), | |||
return(list(coefficients = coefficients)) | |||
}) | |||
|
|||
#' Make predictions from an AFT survival regression model | |||
#' predict |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto: keep long title
Test build #60473 has finished for PR 13394 at commit
|
This LGTM now. @shivaram @felixcheung let me know if I missed something; I'm still getting used to R doc syntax & conventions. |
LGTM thanks! only a minor comment - in the cases where you have a short title (eg. "predict", "Histogram") can you think of a longer more descriptive title? that would help make it consistent with everything else. |
@felixcheung Check out the comment above from @vectorijk about putting multiple predict methods in a single page. Is there a better way to organize these? |
Yeah I think the approach used by @vectorijk is fine. We could have the title as |
Thanks! @jkbradley @felixcheung @shivaram Sure. How about use title |
How about
|
Test build #3109 has finished for PR 13394 at commit
|
@@ -402,6 +406,8 @@ setMethod("spark.naiveBayes", signature(data = "SparkDataFrame", formula = "form | |||
return(new("NaiveBayesModel", jobj = jobj)) | |||
}) | |||
|
|||
#' Save fitted MLlib model to the input path |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jkbradley Likewise, I changed title write.ml
to Save fitted MLlib model to the input path
rather than Save the Bernoulli naive Bayes model to the input path.
for all four different models.
Test build #60611 has finished for PR 13394 at commit
|
LGTM |
…MLib changes ## What changes were proposed in this pull request? R Docs changes include typos, format, layout. ## How was this patch tested? Test locally. Author: Kai Jiang <[email protected]> Closes #13394 from vectorijk/spark-15490. (cherry picked from commit 5fd20b6) Signed-off-by: Joseph K. Bradley <[email protected]>
### What changes were proposed in this pull request? This pr aims to upgrade netty from 4.1.92 to 4.1.93. ### Why are the changes needed? 1.v4.1.92 VS v4.1.93 netty/netty@netty-4.1.92.Final...netty-4.1.93.Final 2.The new version brings some bug fix, eg: - Reset byte buffer in loop for AbstractDiskHttpData.setContent ([#13320](netty/netty#13320)) - OpenSSL MAX_CERTIFICATE_LIST_BYTES option supported ([#13365](netty/netty#13365)) - Adapt to DirectByteBuffer constructor in Java 21 ([#13366](netty/netty#13366)) - HTTP/2 encoder: allow HEADER_TABLE_SIZE greater than Integer.MAX_VALUE ([#13368](netty/netty#13368)) - Upgrade to latest netty-tcnative to fix memory leak ([#13375](netty/netty#13375)) - H2/H2C server stream channels deactivated while write still in progress ([#13388](netty/netty#13388)) - Channel#bytesBefore(un)writable off by 1 ([#13389](netty/netty#13389)) - HTTP/2 should forward shutdown user events to active streams ([#13394](netty/netty#13394)) - Respect the number of bytes read per datagram when using recvmmsg ([#13399](netty/netty#13399)) 3.The release notes as follows: - https://netty.io/news/2023/05/25/4-1-93-Final.html 4.Why not upgrade to `4-1-94-Final` version? Because the return value of the 'threadCache()' (from `PoolThreadCache` to `PoolArenasCache`) method of the netty Inner class used in the 'arrow memory netty' version '12.0.1' has changed and belongs to break change, let's wait for the upgrade of the 'arrow memory netty' before upgrading to the '4-1-94-Final' version. The reference is as follows: https://github.com/apache/arrow/blob/6af660f48472b8b45a5e01b7136b9b040b185eb1/java/memory/memory-netty/src/main/java/io/netty/buffer/PooledByteBufAllocatorL.java#L164 https://github.com/netty/netty/blob/da1a448d5bc4f36cc1744db93fcaf64e198db2bd/buffer/src/main/java/io/netty/buffer/PooledByteBufAllocator.java#L732-L736 ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? Pass GA. Closes #41681 from panbingkun/upgrade_netty. Authored-by: panbingkun <[email protected]> Signed-off-by: Dongjoon Hyun <[email protected]>
What changes were proposed in this pull request?
R Docs changes
include typos, format, layout.
How was this patch tested?
Test locally.