
Fixed coding style issues in Spark SQL #208

Closed
wants to merge 6 commits

Conversation

liancheng
Contributor

This PR addresses various coding style issues in Spark SQL, including but not limited to those mentioned by @mateiz in PR #146.

As this PR affects lots of source files and may cause potential conflicts, it would be better to merge this as soon as possible after PR #205 (In-memory columnar representation for Spark SQL) is merged.

@liancheng
Contributor Author

All comments from @marmbrus were addressed, thanks for all the suggestions! Hope this can be a good foundation for future contributions :)

@AmplabJenkins

Merged build triggered.

@AmplabJenkins

Merged build started.

@AmplabJenkins

Merged build finished.

@AmplabJenkins

All automated tests passed.
Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/13364/

Conflicts:
	sql/core/src/main/scala/org/apache/spark/sql/execution/Exchange.scala
@AmplabJenkins

Merged build triggered.

@AmplabJenkins

Merged build started.

@liancheng
Contributor Author

@pwendell Merged most recent master, should be ready to be merged if Jenkins is happy.

@marmbrus
Contributor

LGTM

@AmplabJenkins

Merged build finished.

@AmplabJenkins

All automated tests passed.
Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/13372/

@pwendell
Contributor

Thanks, merged.

asfgit closed this in 8265dc7 Mar 23, 2014
@rxin
Contributor

rxin commented Mar 24, 2014

To make it consistent with the rest of Spark, can you also submit a PR to update the "package" definitions to use fully qualified packages?

@marmbrus
Contributor

They are fully qualified, as there is only one possible path that will define the package for a given file. For this reason I'm not sure the logic behind avoiding relative imports applies here.

The package clause is just broken up to implicitly include things from the package objects of parent packages. I think this is a useful construct for including system-wide things like logging or type aliases without adding verbosity to every file in the project.

@rxin
Contributor

rxin commented Mar 24, 2014

package org.apache.spark.sql
package catalyst

vs

package org.apache.spark.sql.catalyst

There are three reasons I think it'd be better to just have one package statement in a file.

  1. It is easier to understand by most programmers, especially the ones that come from Java land (I was chatting with another committer just now and he got confused by the meaning of having two or three package statements in a Scala file).
  2. It requires explicit import to open up the parent package scope and avoids polluting the namespace (there is no difference in terms of line count here since you add one import but remove one package)
  3. It is more consistent with the rest of Spark code base.

Now this is a highly subjective topic, so we should get others to chime in.
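For readers following along, the scoping difference can be demonstrated in one self-contained file using package blocks, which behave like chained package clauses. This is an illustrative sketch with hypothetical package and object names, not actual Spark code: with the chained/nested form, members of the parent package are visible without any import.

```scala
// Nested package blocks behave like chained package clauses:
// everything declared in org.example.sql is in scope inside catalyst.
package org.example.sql {
  object Shared { val answer = 42 }

  package catalyst {
    object ChainedDemo {
      // `Shared` resolves without an import because the enclosing
      // package block opens org.example.sql's scope.
      def value: Int = Shared.answer
    }
  }
}
```

With a single fully qualified clause (`package org.example.sql.catalyst`), the same reference would not compile until `org.example.sql.Shared` is imported explicitly, which is the namespace-pollution point above.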

@pwendell
Contributor

Hey I just want to make sure I understand. It seems like there are two options here:

package org.apache.spark.sql
package catalyst

and

package org.apache.spark.sql.catalyst

import org.apache.spark.sql._

And these accomplish the same goal. Is that correct?

If the trade-off is to save ~15 characters vs having something more explicit, I definitely prefer just being explicit. At least in the past we've tried to err on the side of making things accessible.
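The two options do accomplish the same goal; the second just makes the dependency explicit. A hedged sketch with hypothetical names (not actual Spark files), squeezed into one file via package blocks so it compiles standalone:

```scala
package org.example.sql {
  object Shared { val answer = 42 }
}

// A fully qualified package clause does NOT open the parent package's
// scope, so the import below is required (and greppable).
package org.example.sql.catalyst {
  import org.example.sql.Shared

  object ExplicitDemo {
    def value: Int = Shared.answer
  }
}
```

Deleting the `import` line makes `Shared` unresolvable, which is exactly the extra explicitness being traded for those ~15 characters.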

@liancheng
Contributor Author

@pwendell When I started on this PR, I was also puzzled about which option to use at first, until I saw the same usage in the Scala standard library. But I should confess that I wasn't 100% sure what the first option exactly means until I played around with both of them a bit.

I think the most significant "advantage" of the first option is that we open parent packages implicitly. But since we forbid relative package imports, this is not exactly an advantage any more. So I vote for the second option now.

@liancheng
Contributor Author

I think a point here is whether we should assume all contributors use IDEs like IntelliJ and rely on their automated features. At least, I find that in most scenarios the default behaviours of IntelliJ match the Spark coding convention well. Exceptions include indentation and false-positive suggestions about adding/removing parentheses to/from Java getter/setter methods.

Maybe we can suggest that developers (including ourselves) rely on the IDE more, and even provide a default IntelliJ configuration that matches the Spark coding convention better?

@shivaram
Contributor

FWIW I'd vote for package org.apache.spark.sql.catalyst and not assuming that everybody uses an IDE

@aarondav
Contributor

Also vote for package org.apache.spark.sql.catalyst, for new developers and to keep consistent with the rest of the Spark codebase.
edit: prematurely submitted

@alig
Contributor

alig commented Mar 24, 2014

Also +1 for package org.apache.spark.sql.catalyst, just because it's simpler to understand for the majority of the programmers in the world ;)

@liancheng
Contributor Author

Thanks for all your votes! I'll fix this in a separate PR ASAP.

asfgit pushed a commit that referenced this pull request Mar 26, 2014
According to discussions in comments of PR #208, this PR unifies package definition format in Spark SQL.

Some broken links in ScalaDoc and typos detected along the way are also fixed.

Author: Cheng Lian <[email protected]>

Closes #225 from liancheng/packageDefinition and squashes the following commits:

75c47b3 [Cheng Lian] Fixed file line length
4f87968 [Cheng Lian] Unified package definition format in Spark SQL
pdeyhim pushed a commit to pdeyhim/spark-1 that referenced this pull request Jun 25, 2014
liancheng deleted the fixCodingStyle branch September 24, 2014 00:15
davies pushed a commit to davies/spark that referenced this pull request Apr 14, 2015
[SPARKR-188] Add profiling of R execution on worker side
Conflicts:
	pkg/inst/worker/worker.R
asfgit pushed a commit that referenced this pull request Apr 17, 2015
This PR pulls in recent changes in SparkR-pkg, including

cartesian, intersection, sampleByKey, subtract, subtractByKey, except, and some API for StructType and StructField.

Author: cafreeman <[email protected]>
Author: Davies Liu <[email protected]>
Author: Zongheng Yang <[email protected]>
Author: Shivaram Venkataraman <[email protected]>
Author: Shivaram Venkataraman <[email protected]>
Author: Sun Rui <[email protected]>

Closes #5436 from davies/R3 and squashes the following commits:

c2b09be [Davies Liu] SQLTypes -> schema
a5a02f2 [Davies Liu] Merge branch 'master' of github.com:apache/spark into R3
168b7fe [Davies Liu] sort generics
b1fe460 [Davies Liu] fix conflict in README.md
e74c04e [Davies Liu] fix schema.R
4f5ac09 [Davies Liu] Merge branch 'master' of github.com:apache/spark into R5
41f8184 [Davies Liu] rm man
ae78312 [Davies Liu] Merge pull request #237 from sun-rui/SPARKR-154_3
1bdcb63 [Zongheng Yang] Updates to README.md.
5a553e7 [cafreeman] Use object attribute instead of argument
71372d9 [cafreeman] Update docs and examples
8526d2e [cafreeman] Remove `tojson` functions
6ef5f2d [cafreeman] Fix spacing
7741d66 [cafreeman] Rename the SQL DataType function
141efd8 [Shivaram Venkataraman] Merge pull request #245 from hqzizania/upstream
9387402 [Davies Liu] fix style
40199eb [Shivaram Venkataraman] Move except into sorted position
07d0dbc [Sun Rui] [SPARKR-244] Fix test failure after integration of subtract() and subtractByKey() for RDD.
7e8caa3 [Shivaram Venkataraman] Merge pull request #246 from hlin09/fixCombineByKey
ed66c81 [cafreeman] Update `subtract` to work with `generics.R`
f3ba785 [cafreeman] Fixed duplicate export
275deb4 [cafreeman] Update `NAMESPACE` and tests
1a3b63d [cafreeman] new version of `CreateDF`
836c4bf [cafreeman] Update `createDataFrame` and `toDF`
be5d5c1 [cafreeman] refactor schema functions
40338a4 [Zongheng Yang] Merge pull request #244 from sun-rui/SPARKR-154_5
20b97a6 [Zongheng Yang] Merge pull request #234 from hqzizania/assist
ba54e34 [Shivaram Venkataraman] Merge pull request #238 from sun-rui/SPARKR-154_4
c9497a3 [Shivaram Venkataraman] Merge pull request #208 from lythesia/master
b317aa7 [Zongheng Yang] Merge pull request #243 from hqzizania/master
136a07e [Zongheng Yang] Merge pull request #242 from hqzizania/stats
cd66603 [cafreeman] new line at EOF
8b76e81 [Shivaram Venkataraman] Merge pull request #233 from redbaron/fail-early-on-missing-dep
7dd81b7 [cafreeman] Documentation
0e2a94f [cafreeman] Define functions for schema and fields
mccheah pushed a commit to mccheah/spark that referenced this pull request Oct 12, 2017
Resync kube upstream and fix SBT build
bzhaoopenstack pushed a commit to bzhaoopenstack/spark that referenced this pull request Sep 11, 2019
This is to remove related jobs from theopenlab repo
8 participants