-
Notifications
You must be signed in to change notification settings - Fork 28.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
SPARK-22896 Improvement in String interpolation #20070
Conversation
OK, we can do this for consistency. In general we don't spend a ton of time on changes like this, because there is already way too much else to review. Please make one change only (this one) that updates all examples in one go. |
@srowen Absolutely correct, this all in one shot. I did changes in all. |
@srowen also i did merge another similiar PR with graphx to here. so Just FYI - we are good. |
I think there are many more examples of this; a quick search suggests there are about 40 example files that have some string concatenation. |
In scala ? I don't think so. I am re-iterating and doing double check. |
You're correct - I missed other packages. I will re-confirm soon. Thanks. |
@srowen I rechecked all scala examples and this is commulative PR for the same. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few small comments here but looking fine as an improvement. Also have a look for more old .format()
calls if you like; we could update those.
@@ -145,9 +145,9 @@ object Analytics extends Logging { | |||
// TriangleCount requires the graph to be partitioned | |||
.partitionBy(partitionStrategy.getOrElse(RandomVertexCut)).cache() | |||
val triangles = TriangleCount.run(graph) | |||
println("Triangles: " + triangles.vertices.map { | |||
println(s"Triangles: ${triangles.vertices.map { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is probably more readable if the expression is stored in a local val first. Anything nontrivial like this gets hard to parse in an interpolated string. While we're here, worth fixing up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure
println("pValues = " + chi.getAs[Vector](0)) | ||
println("degreesOfFreedom = " + chi.getSeq[Int](1).mkString("[", ",", "]")) | ||
println("statistics = " + chi.getAs[Vector](2)) | ||
println(s"pValues = ${chi.getAs[Vector](0)}") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is OK; anything more complex I might suggest breaking out the expression into a val.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok
@@ -51,10 +51,10 @@ object CorrelationExample { | |||
|
|||
val df = data.map(Tuple1.apply).toDF("features") | |||
val Row(coeff1: Matrix) = Correlation.corr(df, "features").head | |||
println("Pearson correlation matrix:\n" + coeff1.toString) | |||
println(s"Pearson correlation matrix:\n ${coeff1.toString}") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another thing we could improve: .toString
is redundant here I believe
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Addressed.
@@ -169,10 +169,10 @@ private class MyLogisticRegressionModel( | |||
Vectors.dense(-margin, margin) | |||
} | |||
|
|||
/** Number of classes the label can take. 2 indicates binary classification. */ | |||
// Number of classes the label can take. 2 indicates binary classification. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, good to make this a standard comment, not scaladoc style
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
@@ -31,12 +31,11 @@ object QuantileDiscretizerExample { | |||
|
|||
// $example on$ | |||
val data = Array((0, 18.0), (1, 19.0), (2, 8.0), (3, 5.0), (4, 2.2)) | |||
val df = spark.createDataFrame(data).toDF("id", "hour") | |||
val df = spark.createDataFrame(data).toDF("id", "hour").repartition(1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Although it looks weird, I think the author intended the repartition(1) to not appear in the body of the example that's copied into the docs. I wouldn't change this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok
@@ -82,9 +82,9 @@ class CustomReceiver(host: String, port: Int) | |||
var socket: Socket = null | |||
var userInput: String = null | |||
try { | |||
logInfo("Connecting to " + host + ":" + port) | |||
logInfo(s"Connecting to $host $port") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: we could make the string consistent with the one two lines below by adding a colon
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done, addressed
Test build #4026 has finished for PR 20070 at commit
|
} | ||
// $example off$ | ||
|
||
sc.stop() | ||
} | ||
|
||
} | ||
// scalastyle:on println | ||
// scalastyle:on println |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It didn't like the lack of newline at the end of this file
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done, addressed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@srowen apologize for delay :) I have addressed suggestion. thanks
} | ||
// $example off$ | ||
|
||
sc.stop() | ||
} | ||
|
||
} | ||
// scalastyle:on println | ||
// scalastyle:on println |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done, addressed.
@@ -82,9 +82,9 @@ class CustomReceiver(host: String, port: Int) | |||
var socket: Socket = null | |||
var userInput: String = null | |||
try { | |||
logInfo("Connecting to " + host + ":" + port) | |||
logInfo(s"Connecting to $host $port") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done, addressed
Test build #4028 has finished for PR 20070 at commit
|
@srowen please do re-run the build. |
Test build #4029 has finished for PR 20070 at commit
|
println(s"Chose ${categoricalFeatures.size} categorical features: " + | ||
categoricalFeatures.mkString(", ")) | ||
println(s"Chose ${categoricalFeatures.size} " + | ||
s"categorical features: {$categoricalFeatures.mkString(", ")}") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Typo here. Please make sure build/test/style passes locally before pushing again, or else this takes a lot of work to review.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did fixed this. Can you please give me steps as a check list before commit for test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
./dev/run-tests
Test build #4031 has finished for PR 20070 at commit
|
@srowen Okey. current status looks good |
@@ -45,7 +45,7 @@ object QuantileDiscretizerExample { | |||
.setNumBuckets(3) | |||
|
|||
val result = discretizer.fit(df).transform(df) | |||
result.show() | |||
result.show(false) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One more question - is it necessary to make this not truncate?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We're following same style in other examples so it is good to do.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Which other examples? most do not set this, and the Java equivalent doesn't either. If there's a good reason that the output needs to be untruncated, that's fine, just also change the Java example.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@srowen correct either way it works for ex. examples/ml/LDAExamples.scala
@@ -68,7 +68,7 @@ object HypothesisTestingExample { | |||
// against the label. | |||
val featureTestResults: Array[ChiSqTestResult] = Statistics.chiSqTest(obs) | |||
featureTestResults.zipWithIndex.foreach { case (k, v) => | |||
println("Column " + (v + 1).toString + ":") | |||
println(s"Column ${(v + 1).toString} :") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
.toString is redundant here and elsewhere with interpolation. I think that should be simplified.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@srowen Thanks , Changes addressed
print("Topic " + topic + ":") | ||
for (word <- Range(0, ldaModel.vocabSize)) { print(" " + topics(word, topic)); } | ||
print(s"Topic $topic :") | ||
for (word <- Range(0, ldaModel.vocabSize)) { print(s" ${topics(word, topic)}") } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Go ahead and put the print on a new line (I know it wasn't before)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@srowen Thanks for suggestion, it has been addressed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still more instances of .toString. Just search on the changes page, even, to see them.
@@ -45,7 +45,7 @@ object QuantileDiscretizerExample { | |||
.setNumBuckets(3) | |||
|
|||
val result = discretizer.fit(df).transform(df) | |||
result.show() | |||
result.show(false) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Which other examples? most do not set this, and the Java equivalent doesn't either. If there's a good reason that the output needs to be untruncated, that's fine, just also change the Java example.
@@ -46,7 +46,10 @@ object LatentDirichletAllocationExample { | |||
val topics = ldaModel.topicsMatrix | |||
for (topic <- Range(0, 3)) { | |||
print(s"Topic $topic :") | |||
for (word <- Range(0, ldaModel.vocabSize)) { print(s" ${topics(word, topic)}") } | |||
for (word <- Range(0, ldaModel.vocabSize)) | |||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, we put the open brace on the preceding line with for, and don't triple indent. See any other for loop in the code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@srowen sure done.
@srowen Request for review when you get on this. |
Test build #4033 has finished for PR 20070 at commit
|
Merged to master/2.3 |
## What changes were proposed in this pull request? * String interpolation in ml pipeline example has been corrected as per scala standard. ## How was this patch tested? * manually tested. Author: chetkhatri <[email protected]> Closes #20070 from chetkhatri/mllib-chetan-contrib. (cherry picked from commit 9a2b65a) Signed-off-by: Sean Owen <[email protected]>
What changes were proposed in this pull request?
How was this patch tested?