pull latest from apache spark #6

Merged
merged 1,347 commits into from
Mar 18, 2016
1347 commits
645c3a8
[SPARK-13423][HOTFIX] Static analysis fixes for 2.x / fixed for Scala…
srowen Mar 3, 2016
70f6f96
[SPARK-13013][DOCS] Replace example code in mllib-clustering.md using…
keypointt Mar 3, 2016
9a48c65
[SPARK-13599][BUILD] remove transitive groovy dependencies from Hive
steveloughran Mar 3, 2016
511d492
[SPARK-12877][ML] Add train-validation-split to pyspark
JeremyNixon Mar 3, 2016
cf95d72
[SPARK-13543][SQL] Support for specifying compression codec for Parqu…
HyukjinKwon Mar 3, 2016
ce58e99
[MINOR][ML][DOC] Remove duplicated periods at the end of some sharedP…
yanboliang Mar 3, 2016
52035d1
[SPARK-13423][HOTFIX] Static analysis fixes for 2.x / fixed for Scala…
srowen Mar 3, 2016
941b270
[MINOR] Fix typos in comments and testcase name of code
dongjoon-hyun Mar 3, 2016
3edcc40
[SPARK-13632][SQL] Move commands.scala to command package
Mar 3, 2016
ad0de99
[SPARK-13584][SQL][TESTS] Make ContinuousQueryManagerSuite not output…
zsxwing Mar 3, 2016
b373a88
[SPARK-13415][SQL] Visualize subquery in SQL web UI
Mar 4, 2016
d062587
[SPARK-13601] [TESTS] use 1 partition in tests to avoid race conditions
Mar 4, 2016
15d57f9
[SPARK-13647] [SQL] also check if numeric value is within allowed ran…
cloud-fan Mar 4, 2016
f6ac7c3
[SPARK-12941][SQL][MASTER] Spark-SQL JDBC Oracle dialect fails to map…
thomastechs Mar 4, 2016
465c665
[SPARK-13652][CORE] Copy ByteBuffer in sendRpcSync as it will be recy…
zsxwing Mar 4, 2016
dd83c20
[SPARK-13603][SQL] support SQL generation for subquery
Mar 4, 2016
27e88fa
[SPARK-13646][MLLIB] QuantileDiscretizer counts dataset twice in get…
eliasah Mar 4, 2016
c04dc27
[SPARK-13398][STREAMING] Move away from thread pool task support to f…
holdenk Mar 4, 2016
204b02b
[SPARK-12925] Improve HiveInspectors.unwrap for StringObjectInspector.…
rbalamohan Mar 4, 2016
e617508
[SPARK-13673][WINDOWS] Fixed not to pollute environment variables.
tsudukim Mar 4, 2016
c8f2545
[SPARK-13676] Fix mismatched default values for regParam in LogisticR…
dongjoon-hyun Mar 4, 2016
83302c3
[SPARK-13036][SPARK-13318][SPARK-13319] Add save/load for feature.py
yinxusen Mar 4, 2016
b7d4147
[SPARK-13633][SQL] Move things into catalyst.parser package
Mar 4, 2016
5f42c28
[SPARK-13459][WEB UI] Separate Alive and Dead Executors in Executor T…
ajbozarth Mar 4, 2016
a6e2bd3
[SPARK-13255] [SQL] Update vectorized reader to directly return Colum…
nongli Mar 4, 2016
f19228e
[SPARK-12073][STREAMING] backpressure rate controller consumes events…
JasonMWhite Mar 5, 2016
adce5ee
[SPARK-12720][SQL] SQL Generation Support for Cube, Rollup, and Group…
gatorsmile Mar 5, 2016
8290004
[SPARK-13693][STREAMING][TESTS] Stop StreamingContext before deleting…
zsxwing Mar 5, 2016
8ff8809
Revert "[SPARK-13616][SQL] Let SQLBuilder convert logical plan withou…
liancheng Mar 6, 2016
ee913e6
[SPARK-13697] [PYSPARK] Fix the missing module name of TransformFunct…
zsxwing Mar 6, 2016
bc7a3ec
[SPARK-13685][SQL] Rename catalog.Catalog to ExternalCatalog
Mar 7, 2016
4b13896
[SPARK-13705][DOCS] UpdateStateByKey Operation documentation incorrec…
Mar 7, 2016
03f57a6
Fixing the type of the sentiment happiness value
heliocentrist Mar 7, 2016
d7eac9d
[SPARK-13651] Generator outputs are not resolved correctly resulting …
dilipbiswal Mar 7, 2016
4896411
[SPARK-13694][SQL] QueryPlan.expressions should always include all ex…
cloud-fan Mar 7, 2016
ef77003
[SPARK-13495][SQL] Add Null Filters in the query plan for Filters/Joi…
sameeragarwal Mar 7, 2016
e72914f
[SPARK-12243][BUILD][PYTHON] PySpark tests are slow in Jenkins.
dongjoon-hyun Mar 7, 2016
a3ec50a
[MINOR][DOC] improve the doc for "spark.memory.offHeap.size"
CodingCat Mar 7, 2016
b6071a7
[SPARK-13722][SQL] No Push Down for Non-deterministics Predicates thr…
gatorsmile Mar 7, 2016
e9e67b3
[SPARK-13655] Improve isolation between tests in KinesisBackedBlockRD…
JoshRosen Mar 7, 2016
e1fb857
[SPARK-529][CORE][YARN] Add type-safe config keys to SparkConf.
Mar 7, 2016
8577260
[SPARK-13442][SQL] Make type inference recognize boolean types
HyukjinKwon Mar 7, 2016
0eea12a
[SPARK-13596][BUILD] Move misc top-level build files into appropriate…
srowen Mar 7, 2016
e720dda
[SPARK-13665][SQL] Separate the concerns of HadoopFsRelation
marmbrus Mar 7, 2016
46f25c2
[SPARK-13648] Add Hive Cli to classes for isolated classloader
preecet Mar 7, 2016
da7bfac
[SPARK-13689][SQL] Move helper things in CatalystQl to new utils object
Mar 8, 2016
25bba58
[SPARK-13404] [SQL] Create variables for input row when it's actually…
Mar 8, 2016
017cdf2
[SPARK-13711][CORE] Don't call SparkUncaughtExceptionHandler in AppCl…
zsxwing Mar 8, 2016
e52e597
[SPARK-13659] Refactor BlockStore put*() APIs to remove returnValues
JoshRosen Mar 8, 2016
7771c73
[HOT-FIX][BUILD] Use the new location of `checkstyle-suppressions.xml`
dongjoon-hyun Mar 8, 2016
9bf76dd
[SPARK-13117][WEB UI] WebUI should use the local ip not 0.0.0.0
Mar 8, 2016
9e86e6e
[SPARK-13675][UI] Fix wrong historyserver url link for application ru…
jerryshao Mar 8, 2016
7d05d02
[SPARK-13637][SQL] use more information to simplify the code in Expan…
cloud-fan Mar 8, 2016
ca1a7b9
[HOTFIX][YARN] Fix yarn cluster mode fire and forget regression
jerryshao Mar 8, 2016
54040f8
[SPARK-13715][MLLIB] Remove last usages of jblas in tests
srowen Mar 8, 2016
78d3b60
[SPARK-13657] [SQL] Support parsing very long AND/OR expressions
Mar 8, 2016
ad3c9a9
[SPARK-13695] Don't cache MEMORY_AND_DISK blocks as bytes in memory a…
JoshRosen Mar 8, 2016
46881b4
[SPARK-12727][SQL] support SQL generation for aggregate with multi-di…
cloud-fan Mar 8, 2016
9740954
[ML] testEstimatorAndModelReadWrite should call checkModelData
yanboliang Mar 8, 2016
d5ce617
[SPARK-13740][SQL] add null check for _verify_type in types.py
cloud-fan Mar 8, 2016
d57daf1
[SPARK-13593] [SQL] improve the `createDataFrame` to accept data type…
cloud-fan Mar 8, 2016
076009b
[SPARK-13400] Stop using deprecated Octal escape literals
dongjoon-hyun Mar 8, 2016
1e28840
[SPARK-13738][SQL] Cleanup Data Source resolution
marmbrus Mar 8, 2016
e430614
[SPARK-13668][SQL] Reorder filter/join predicates to short-circuit is…
sameeragarwal Mar 8, 2016
81f54ac
[SPARK-13755] Escape quotes in SQL plan visualization node labels
JoshRosen Mar 9, 2016
d8813fa
[SPARK-13625][PYSPARK][ML] Added a check to see if an attribute is a …
BryanCutler Mar 9, 2016
982ef2b
[SPARK-13750][SQL] fix sizeInBytes of HadoopFsRelation
Mar 9, 2016
cc4ab37
[SPARK-13754] Keep old data source name for backwards compatibility
falaki Mar 9, 2016
035d3ac
[SPARK-7286][SQL] Deprecate !== in favour of =!=
jodersky Mar 9, 2016
f3201ae
[SPARK-13692][CORE][SQL] Fix trivial Coverity/Checkstyle defects
dongjoon-hyun Mar 9, 2016
2c5af7d
[SPARK-13640][SQL] Synchronize ScalaReflection.mirror method.
ueshin Mar 9, 2016
cbff280
[SPARK-13631][CORE] Thread-safe getLocationsWithLargestOutputs
Mar 9, 2016
c3689bc
[SPARK-13702][CORE][SQL][MLLIB] Use diamond operator for generic inst…
dongjoon-hyun Mar 9, 2016
8e8633e
[SPARK-13769][CORE] Update Java Doc in Spark Submit
Mar 9, 2016
53ba6d6
[SPARK-13698][SQL] Fix Analysis Exceptions when Using Backticks in Ge…
dilipbiswal Mar 9, 2016
9634e17
[SPARK-13242] [SQL] codegen fallback in case-when if there many branches
Mar 9, 2016
7791d0c
Revert "[SPARK-13668][SQL] Reorder filter/join predicates to short-ci…
davies Mar 9, 2016
256704c
[SPARK-13595][BUILD] Move docker, extras modules into external
srowen Mar 9, 2016
23369c3
[SPARK-13763][SQL] Remove Project when its Child's Output is Nil
gatorsmile Mar 9, 2016
cad29a4
[SPARK-13728][SQL] Fix ORC PPD test so that pushed filters can be che…
HyukjinKwon Mar 9, 2016
0dd0648
[SPARK-13615][ML] GeneralizedLinearRegression supports save/load
yanboliang Mar 9, 2016
3dc9ae2
[SPARK-13523] [SQL] Reuse exchanges in a query
Mar 9, 2016
c6aa356
[SPARK-13527][SQL] Prune Filters based on Constraints
gatorsmile Mar 9, 2016
e1772d3
[SPARK-11861][ML] Add feature importances for decision trees
sethah Mar 9, 2016
dbf2a7c
[SPARK-13781][SQL] Use ExpressionSets in ConstraintPropagationSuite
sameeragarwal Mar 9, 2016
37fcda3
[SPARK-13747][SQL] Fix concurrent query with fork-join pool
Mar 10, 2016
40e0676
[SPARK-13778][CORE] Set the executor state for a worker when removing it
zsxwing Mar 10, 2016
238447d
[SPARK-13775] History page sorted by completed time desc by default.
Mar 10, 2016
5f7dbdb
[MINOR] Fix typo in 'hypot' docstring
tristanreid Mar 10, 2016
a4a0add
[SPARK-13492][MESOS] Configurable Mesos framework webui URL.
Mar 10, 2016
926e9c4
[SPARK-13760][SQL] Fix BigDecimal constructor for FloatType
sameeragarwal Mar 10, 2016
7906461
Revert "[SPARK-13760][SQL] Fix BigDecimal constructor for FloatType"
yhuai Mar 10, 2016
aa0eba2
[SPARK-13766][SQL] Consistent file extensions for files written by in…
HyukjinKwon Mar 10, 2016
8a3acb7
[SPARK-13794][SQL] Rename DataFrameWriter.stream() DataFrameWriter.st…
rxin Mar 10, 2016
8bcad28
[SPARK-7420][STREAMING][TESTS] Enable test: o.a.s.streaming.JobGenera…
lw-lin Mar 10, 2016
3e3c3d5
[SPARK-13706][ML] Add Python Example for Train Validation Split
JeremyNixon Mar 10, 2016
9525c56
[MINOR][SQL] Replace DataFrameWriter.stream() with startStream() in c…
dongjoon-hyun Mar 10, 2016
9fe38ab
[SPARK-11108][ML] OneHotEncoder should support other numeric types
sethah Mar 10, 2016
927e22e
[SPARK-13663][CORE] Upgrade Snappy Java to 1.1.2.1
srowen Mar 10, 2016
74267be
[SPARK-13758][STREAMING][CORE] enhance exception message to avoid mis…
wei-mao-intel Mar 10, 2016
d24801a
[SPARK-13636] [SQL] Directly consume UnsafeRow in wholestage codegen …
viirya Mar 10, 2016
235f4ac
[SPARK-13727][CORE] SparkConf.contains does not consider deprecated keys
Mar 10, 2016
19f4ac6
[SPARK-13759][SQL] Add IsNotNull constraints for expressions with an …
sameeragarwal Mar 10, 2016
747d2f5
[SPARK-13790] Speed up ColumnVector's getDecimal
nongli Mar 10, 2016
3d2b6f5
[SQL][TEST] Increased timeouts to reduce flakiness in ContinuousQuery…
tdas Mar 10, 2016
81d4853
[SPARK-13696] Remove BlockStore class & simplify interfaces of mem. &…
JoshRosen Mar 10, 2016
91fed8e
[SPARK-3854][BUILD] Scala style: require spaces before `{`.
dongjoon-hyun Mar 10, 2016
020ff8c
[SPARK-13751] [SQL] generate better code for Filter
Mar 11, 2016
27fe6ba
[SPARK-13604][CORE] Sync worker's state after registering with master
zsxwing Mar 11, 2016
1d54278
[SPARK-13244][SQL] Migrates DataFrame to Dataset
liancheng Mar 11, 2016
88fa866
[MINOR][DOC] Fix supported hive version in doc
dongjoon-hyun Mar 11, 2016
416e71a
[SPARK-13327][SPARKR] Added parameter validations for colnames<-
Mar 11, 2016
c3a6269
[SPARK-13789] Infer additional constraints from attribute equality
sameeragarwal Mar 11, 2016
4d535d1
[SPARK-13389][SPARKR] SparkR support first/last with ignore NAs
yanboliang Mar 11, 2016
560489f
[SPARK-13732][SPARK-13797][SQL] Remove projectList from Window and El…
gatorsmile Mar 11, 2016
6871cc8
[SPARK-12718][SPARK-13720][SQL] SQL generation support for window fun…
cloud-fan Mar 11, 2016
74c4e26
[HOT-FIX] fix compile
cloud-fan Mar 11, 2016
e33bc67
[MINOR][CORE] Fix a duplicate "and" in a log message.
Mar 11, 2016
d18276c
[SPARK-13672][ML] Add python examples of BisectingKMeans in ML and MLLIB
zhengruifeng Mar 11, 2016
6ca990f
[SPARK-13294][PROJECT INFRA] Remove MiMa's dependency on spark-class …
JoshRosen Mar 11, 2016
0b713e0
[SPARK-13512][ML] add example and doc for MaxAbsScaler
hhbyyh Mar 11, 2016
234f781
[SPARK-13787][ML][PYSPARK] Pyspark feature importances for decision t…
sethah Mar 11, 2016
8fff0f9
[HOT-FIX][SQL][ML] Fix compile error from use of DataFrame in Java Ma…
MLnick Mar 11, 2016
07f1c54
[SPARK-13577][YARN] Allow Spark jar to be multiple jars, archive.
Mar 11, 2016
6d37e1e
[SPARK-13817][BUILD][SQL] Re-enable MiMA and removes object DataFrame
liancheng Mar 11, 2016
99b7187
[SPARK-13780][SQL] Add missing dependency to build.
Mar 11, 2016
eb650a8
[STREAMING][MINOR] Fix a duplicate "be" in comments
lw-lin Mar 11, 2016
ff776b2
[SPARK-13328][CORE] Poor read performance for broadcast variables wit…
nezihyigitbasi Mar 11, 2016
073bf9d
[SPARK-13807] De-duplicate `Python*Helper` instantiation code in PySp…
JoshRosen Mar 11, 2016
42afd72
[SPARK-13814] [PYSPARK] Delete unnecessary imports in python examples…
zhengruifeng Mar 11, 2016
66d9d0e
[SPARK-13139][SQL] Parse Hive DDL commands ourselves
Mar 11, 2016
2ef4c59
[SPARK-13830] prefer block manager than direct result for large result
Mar 11, 2016
ba8c86d
[SPARK-13671] [SPARK-13311] [SQL] Use different physical plans for RD…
Mar 12, 2016
4eace4d
[SPARK-13828][SQL] Bring back stack trace of AnalysisException thrown…
liancheng Mar 12, 2016
c079420
[SPARK-13841][SQL] Removes Dataset.collectRows()/takeRows()
liancheng Mar 13, 2016
db88d02
[MINOR][DOCS] Replace `DataFrame` with `Dataset` in Javadoc.
dongjoon-hyun Mar 13, 2016
515e4af
[SPARK-13810][CORE] Add Port Configuration Suggestions on Bind Except…
bjornjon Mar 13, 2016
c7e68c3
[SPARK-13812][SPARKR] Fix SparkR lint-r test errors.
Mar 13, 2016
f3daa09
[SQL] fix typo in DataSourceRegister
jackylk Mar 14, 2016
473263f
[SPARK-13834][BUILD] Update sbt and sbt plugins for 2.x.
dongjoon-hyun Mar 14, 2016
1840852
[SPARK-13823][CORE][STREAMING][SQL] Always specify Charset in String …
srowen Mar 14, 2016
e58fa19
Closes #11668
rxin Mar 14, 2016
acdf219
[MINOR][DOCS] Fix more typos in comments/strings.
dongjoon-hyun Mar 14, 2016
31d069d
[SPARK-13746][TESTS] stop using deprecated SynchronizedSet
Mar 14, 2016
250832c
[SPARK-13207][SQL] Make partitioning discovery ignore _SUCCESS files.
yhuai Mar 14, 2016
9a1680c
[SPARK-13139][SQL] Follow-ups to #11573
Mar 14, 2016
9a87afd
[SPARK-13833] Guard against race condition when re-caching disk block…
JoshRosen Mar 14, 2016
45f8053
[SPARK-13578][CORE] Modify launch scripts to not use assemblies.
Mar 14, 2016
63f642a
[SPARK-13779][YARN] Avoid cancelling non-local container requests.
rdblue Mar 14, 2016
6a4bfcd
[SPARK-13658][SQL] BooleanSimplification rule is slow with large bool…
viirya Mar 14, 2016
07cb323
[SPARK-13848][SPARK-5185] Update to Py4J 0.9.2 in order to fix classl…
JoshRosen Mar 14, 2016
310981d
[SPARK-12583][MESOS] Mesos shuffle service: Don't delete shuffle file…
Mar 14, 2016
9f13f0f
[MINOR][DOCS] Added Missing back slashes
danielsan Mar 14, 2016
e06493c
[MINOR][COMMON] Fix copy-paste oversight in variable naming
bjornjon Mar 14, 2016
23385e8
[SPARK-13054] Always post TaskEnd event for tasks
Mar 14, 2016
a48296f
[SPARK-13686][MLLIB][STREAMING] Add a constructor parameter `reqParam…
dongjoon-hyun Mar 14, 2016
38529d8
[SPARK-10907][SPARK-6157] Remove pendingUnrollMemory from MemoryStore
JoshRosen Mar 14, 2016
8301fad
[SPARK-13626][CORE] Avoid duplicate config deprecation warnings.
Mar 14, 2016
06dec37
[SPARK-13843][STREAMING] Remove streaming-flume, streaming-mqtt, stre…
zsxwing Mar 14, 2016
992142b
[SPARK-11826][MLLIB] Refactor add() and subtract() methods
ehsanmok Mar 15, 2016
17eec0a
[SPARK-13664][SQL] Add a strategy for planning partitioned and bucket…
marmbrus Mar 15, 2016
4bf4609
[SPARK-13882][SQL] Remove org.apache.spark.sql.execution.local
rxin Mar 15, 2016
8e0b030
[SPARK-10380][SQL] Fix confusing documentation examples for astype/dr…
rxin Mar 15, 2016
b5e3bd8
[SPARK-13791][SQL] Add MetadataLog and HDFSMetadataLog
zsxwing Mar 15, 2016
e76679a
[SPARK-13880][SPARK-13881][SQL] Rename DataFrame.scala Dataset.scala,…
rxin Mar 15, 2016
9256840
[SPARK-13661][SQL] avoid the copy in HashedRelation
Mar 15, 2016
f72743d
[SPARK-13353][SQL] fast serialization for collecting DataFrame/Dataset
Mar 15, 2016
e649580
[SPARK-13884][SQL] Remove DescribeCommand's dependency on LogicalPlan
rxin Mar 15, 2016
43304b1
[SPARK-13888][DOC] Remove Akka Receiver doc and refer to the DStream …
zsxwing Mar 15, 2016
a51f877
[SPARK-13870][SQL] Add scalastyle escaping correctly in CVSSuite.scala
dongjoon-hyun Mar 15, 2016
276c2d5
[SPARK-13890][SQL] Remove some internal classes' dependency on SQLCon…
rxin Mar 15, 2016
99bd2f0
[SPARK-13840][SQL] Split Optimizer Rule ColumnPruning to ColumnPrunin…
gatorsmile Mar 15, 2016
10251a7
[SPARK-13660][SQL][TESTS] ContinuousQuerySuite floods the logs with g…
keypointt Mar 15, 2016
dafd70f
[SPARK-12379][ML][MLLIB] Copy GBT implementation to spark.ml
sethah Mar 15, 2016
bd5365b
[SPARK-13803] restore the changes in SPARK-3411
CodingCat Mar 15, 2016
48978ab
[SPARK-13576][BUILD] Don't create assembly for examples.
Mar 15, 2016
5e6f2f4
[SPARK-13893][SQL] Remove SQLContext.catalog/analyzer (internal method)
rxin Mar 15, 2016
d89c714
[SPARK-13642][YARN] Changed the default application exit state to fai…
jerryshao Mar 15, 2016
50e3644
[SPARK-13896][SQL][STRING] Dataset.toJSON should return Dataset
Mar 15, 2016
dddf2f2
[MINOR] a minor fix for the comments of a method in RPC Dispatcher
CodingCat Mar 15, 2016
41eaabf
[SPARK-13626][CORE] Revert change to SparkConf's constructor.
Mar 15, 2016
643649d
[SPARK-13895][SQL] DataFrameReader.text should return Dataset[String]
rxin Mar 15, 2016
bbd887f
[SPARK-13918][SQL] Merge SortMergeJoin and SortMergerOuterJoin
Mar 16, 2016
52b6a89
[MINOR][TEST][SQL] Remove wrong "expected" parameter in checkNaNWitho…
Mar 16, 2016
421f6c2
[SPARK-13917] [SQL] generate broadcast semi join
Mar 16, 2016
3665294
[SPARK-9837][ML] R-like summary statistics for GLMs via iteratively r…
yanboliang Mar 16, 2016
3c578c5
[SPARK-13920][BUILD] MIMA checks should apply to @Experimental and @D…
dongjoon-hyun Mar 16, 2016
9202479
[SPARK-13899][SQL] Produce InternalRow instead of external Row at CSV…
HyukjinKwon Mar 16, 2016
431a3d0
[SPARK-12653][SQL] Re-enable test "SPARK-8489: MissingRequirementErro…
dongjoon-hyun Mar 16, 2016
05ab294
[SPARK-13906] Ensure that there are at least 2 dispatcher threads.
yonran Mar 16, 2016
3b461d9
[SPARK-13823][SPARK-13397][SPARK-13395][CORE] More warnings, Standard…
srowen Mar 16, 2016
56d8824
[SPARK-13396] Stop using our internal deprecated .metrics on Exceptio…
GayathriMurali Mar 16, 2016
1d95fb6
[SPARK-13793][CORE] PipedRDD doesn't propagate exceptions while readi…
tejasapatil Mar 16, 2016
496d2a2
[SPARK-13889][YARN] Fix integer overflow when calculating the max num…
carsonwang Mar 16, 2016
9412547
[SPARK-13823][HOTFIX] Increase tryAcquire timeout and assert it succe…
srowen Mar 16, 2016
5f6bdf9
[SPARK-13281][CORE] Switch broadcast of RDD to exception from warning
Mar 16, 2016
eacd9d8
[SPARK-13360][PYSPARK][YARN] PYSPARK_DRIVER_PYTHON and PYSPARK_PYTHON…
zjffdu Mar 16, 2016
d9e8f26
[SPARK-13924][SQL] officially support multi-insert
cloud-fan Mar 16, 2016
d9670f8
[SPARK-13894][SQL] SqlContext.range return type from DataFrame to Dat…
chenghao-intel Mar 16, 2016
9198497
[SPARK-13816][GRAPHX] Add parameter checks for algorithms in Graphx
zhengruifeng Mar 16, 2016
1d1de28
[SPARK-13827][SQL] Can't add subquery to an operator with same-name o…
cloud-fan Mar 16, 2016
c4bd576
[SPARK-12721][SQL] SQL Generation for Script Transformation
gatorsmile Mar 16, 2016
ae6c677
[SPARK-13038][PYSPARK] Add load/save to pipeline
yinxusen Mar 16, 2016
3f06eb7
[SPARK-13613][ML] Provide ignored tests to export test dataset into C…
yanboliang Mar 16, 2016
6fc2b65
[SPARK-11888][ML] Decision tree persistence in spark.ml
jkbradley Mar 16, 2016
85c42fd
[SPARK-13927][MLLIB] add row/column iterator to local matrices
mengxr Mar 16, 2016
27e1f38
[SPARK-13034] PySpark ml.classification support export/import
GayathriMurali Mar 16, 2016
4ce2d24
[SPARK-13942][CORE][DOCS] Remove Shark-related docs for 2.x
dongjoon-hyun Mar 16, 2016
b90c020
[SPARK-13922][SQL] Filter rows with null attributes in vectorized par…
sameeragarwal Mar 16, 2016
f96997b
[SPARK-13871][SQL] Support for inferring filters from data constraints
sameeragarwal Mar 16, 2016
77ba302
[SPARK-13869][SQL] Remove redundant conditions while combining filters
sameeragarwal Mar 16, 2016
d4d8493
[SPARK-11011][SQL] Narrow type of UDT serialization
jodersky Mar 16, 2016
92b7057
[SPARK-13761][ML] Deprecate validateParams
hhbyyh Mar 17, 2016
ca9ef86
[SPARK-13923][SQL] Implement SessionCatalog
Mar 17, 2016
917f400
[SPARK-13719][SQL] Parse JSON rows having an array type and a struct …
HyukjinKwon Mar 17, 2016
c100d31
[SPARK-13873] [SQL] Avoid copy of UnsafeRow when there is no join in …
Mar 17, 2016
7eef246
[SPARK-13118][SQL] Expression encoding for optional synthetic classes
jodersky Mar 17, 2016
c890c35
[MINOR][SQL][BUILD] Remove duplicated lines
dongjoon-hyun Mar 17, 2016
d1c193a
[SPARK-12855][MINOR][SQL][DOC][TEST] remove spark.sql.dialect from do…
adrian-wang Mar 17, 2016
de1a84e
[SPARK-13926] Automatically use Kryo serializer when shuffling RDDs w…
JoshRosen Mar 17, 2016
5faba9f
[SPARK-13403][SQL] Pass hadoopConfiguration to HiveConf constructors.
rdblue Mar 17, 2016
82066a1
[SPARK-13948] MiMa check should catch if the visibility changes to pr…
JoshRosen Mar 17, 2016
30c1884
Revert "[SPARK-13840][SQL] Split Optimizer Rule ColumnPruning to Colu…
davies Mar 17, 2016
204c9de
[MINOR][DOC] Add JavaStreamingTestExample
zhengruifeng Mar 17, 2016
357d82d
[SPARK-13629][ML] Add binary toggle Param to CountVectorizer
hhbyyh Mar 17, 2016
ea9ca6f
[SPARK-13901][CORE] correct the logDebug information when jump to the…
trueyao Mar 17, 2016
8ef3399
[SPARK-13928] Move org.apache.spark.Logging into org.apache.spark.int…
cloud-fan Mar 17, 2016
1974d1d
[SPARK-12719][SQL] SQL generation support for Generate
cloud-fan Mar 17, 2016
65b75e6
[SPARK-13776][WEBUI] Limit the max number of acceptors and selectors …
zsxwing Mar 17, 2016
637a78f
[SPARK-13427][SQL] Support USING clause in JOIN.
dilipbiswal Mar 17, 2016
5f3bda6
[SPARK-13838] [SQL] Clear variable code to prevent it to be re-evalua…
viirya Mar 17, 2016
3ee7996
[SPARK-12719][HOTFIX] Fix compilation against Scala 2.10
tedyu Mar 17, 2016
828213d
[SPARK-13937][PYSPARK][ML] Change JavaWrapper _java_obj from static t…
BryanCutler Mar 17, 2016
edf8b87
[SPARK-11891] Model export/import for RFormula and RFormulaModel
yinxusen Mar 17, 2016
4c08e2c
Revert "[SPARK-12719][HOTFIX] Fix compilation against Scala 2.10"
yhuai Mar 17, 2016
b39e80d
[SPARK-13761][ML] Remove remaining uses of validateParams
jkbradley Mar 17, 2016
1614485
[SPARK-10788][MLLIB][ML] Remove duplicate bins for decision trees
sethah Mar 17, 2016
453455c
[SPARK-13974][SQL] sub-query names do not need to be globally unique …
cloud-fan Mar 18, 2016
6037ed0
[SPARK-13976][SQL] do not remove sub-queries added by user when gener…
cloud-fan Mar 18, 2016
6c2d894
[SPARK-13921] Store serialized blocks as multiple chunks in MemoryStore
JoshRosen Mar 18, 2016
90a1d8d
[SPARK-12719][HOTFIX] Fix compilation against Scala 2.10
tedyu Mar 18, 2016
10ef4f3
[SPARK-13826][SQL] Revises Dataset ScalaDoc
liancheng Mar 18, 2016
750ed64
[SPARK-13930] [SQL] Apply fast serialization on collect limit operator
viirya Mar 18, 2016
bb1fda0
[SPARK-13826][SQL] Addendum: update documentation for Datasets
rxin Mar 18, 2016
7783b6f
[MINOR][ML] When trainingSummary is None, it should throw RuntimeExce…
yanboliang Mar 18, 2016
0f1015f
[SPARK-14001][SQL] support multi-children Union in SQLBuilder
cloud-fan Mar 18, 2016
53f32a2
[MINOR][DOC] Fix nits in JavaStreamingTestExample
zhengruifeng Mar 18, 2016
0acb32a
[SPARK-13972][SQ] hive tests should fail if SQL generation failed
cloud-fan Mar 18, 2016
14c7236
[SPARK-14004][SQL][MINOR] AttributeReference and Alias should only us…
liancheng Mar 18, 2016
9c23c81
[SPARK-13977] [SQL] Brings back Shuffled hash join
Mar 18, 2016
12 changes: 12 additions & 0 deletions .github/PULL_REQUEST_TEMPLATE
@@ -0,0 +1,12 @@
## What changes were proposed in this pull request?

(Please fill in changes proposed in this fix)


## How was this patch tested?

(Please explain how this patch was tested. E.g. unit tests, integration tests, manual tests)


(If this patch involves UI changes, please attach a screenshot; otherwise, remove this)

4 changes: 0 additions & 4 deletions .gitignore
@@ -17,8 +17,6 @@ cache
work/
out/
.DS_Store
third_party/libmesos.so
third_party/libmesos.dylib
build/apache-maven*
build/zinc*
build/scala*
@@ -60,8 +58,6 @@ dev/create-release/*final
spark-*-bin-*.tgz
unit-tests.log
/lib/
ec2/lib/
rat-results.txt
scalastyle.txt
scalastyle-output.xml
R-unit-tests.log
30 changes: 17 additions & 13 deletions LICENSE
@@ -1,4 +1,3 @@

Apache License
Version 2.0, January 2004
http://www.apache.org/licenses/
@@ -237,8 +236,7 @@ The following components are provided under a BSD-style license. See project lin
The text of each license is also included at licenses/LICENSE-[project].txt.

(BSD 3 Clause) netlib core (com.github.fommil.netlib:core:1.1.2 - https://github.com/fommil/netlib-java/core)
(BSD 3 Clause) JPMML-Model (org.jpmml:pmml-model:1.1.15 - https://github.com/jpmml/jpmml-model)
(BSD 3-clause style license) jblas (org.jblas:jblas:1.2.4 - http://jblas.org/)
(BSD 3 Clause) JPMML-Model (org.jpmml:pmml-model:1.2.7 - https://github.com/jpmml/jpmml-model)
(BSD License) AntLR Parser Generator (antlr:antlr:2.7.7 - http://www.antlr.org/)
(BSD licence) ANTLR ST4 4.0.4 (org.antlr:ST4:4.0.4 - http://www.stringtemplate.org)
(BSD licence) ANTLR StringTemplate (org.antlr:stringtemplate:3.2.1 - http://www.stringtemplate.org)
@@ -250,22 +248,22 @@ The text of each license is also included at licenses/LICENSE-[project].txt.
(Interpreter classes (all .scala files in repl/src/main/scala
except for Main.Scala, SparkHelper.scala and ExecutorClassLoader.scala),
and for SerializableMapWrapper in JavaUtils.scala)
(BSD-like) Scala Actors library (org.scala-lang:scala-actors:2.10.5 - http://www.scala-lang.org/)
(BSD-like) Scala Compiler (org.scala-lang:scala-compiler:2.10.5 - http://www.scala-lang.org/)
(BSD-like) Scala Compiler (org.scala-lang:scala-reflect:2.10.5 - http://www.scala-lang.org/)
(BSD-like) Scala Library (org.scala-lang:scala-library:2.10.5 - http://www.scala-lang.org/)
(BSD-like) Scalap (org.scala-lang:scalap:2.10.5 - http://www.scala-lang.org/)
(BSD-style) scalacheck (org.scalacheck:scalacheck_2.10:1.10.0 - http://www.scalacheck.org)
(BSD-style) spire (org.spire-math:spire_2.10:0.7.1 - http://spire-math.org)
(BSD-style) spire-macros (org.spire-math:spire-macros_2.10:0.7.1 - http://spire-math.org)
(BSD-like) Scala Actors library (org.scala-lang:scala-actors:2.11.7 - http://www.scala-lang.org/)
(BSD-like) Scala Compiler (org.scala-lang:scala-compiler:2.11.7 - http://www.scala-lang.org/)
(BSD-like) Scala Compiler (org.scala-lang:scala-reflect:2.11.7 - http://www.scala-lang.org/)
(BSD-like) Scala Library (org.scala-lang:scala-library:2.11.7 - http://www.scala-lang.org/)
(BSD-like) Scalap (org.scala-lang:scalap:2.11.7 - http://www.scala-lang.org/)
(BSD-style) scalacheck (org.scalacheck:scalacheck_2.11:1.10.0 - http://www.scalacheck.org)
(BSD-style) spire (org.spire-math:spire_2.11:0.7.1 - http://spire-math.org)
(BSD-style) spire-macros (org.spire-math:spire-macros_2.11:0.7.1 - http://spire-math.org)
(New BSD License) Kryo (com.esotericsoftware.kryo:kryo:2.21 - http://code.google.com/p/kryo/)
(New BSD License) MinLog (com.esotericsoftware.minlog:minlog:1.2 - http://code.google.com/p/minlog/)
(New BSD License) ReflectASM (com.esotericsoftware.reflectasm:reflectasm:1.07 - http://code.google.com/p/reflectasm/)
(New BSD license) Protocol Buffer Java API (com.google.protobuf:protobuf-java:2.5.0 - http://code.google.com/p/protobuf)
(New BSD license) Protocol Buffer Java API (org.spark-project.protobuf:protobuf-java:2.4.1-shaded - http://code.google.com/p/protobuf)
(The BSD License) Fortran to Java ARPACK (net.sourceforge.f2j:arpack_combined_all:0.1 - http://f2j.sourceforge.net)
(The BSD License) xmlenc Library (xmlenc:xmlenc:0.52 - http://xmlenc.sourceforge.net)
(The New BSD License) Py4J (net.sf.py4j:py4j:0.9 - http://py4j.sourceforge.net/)
(The New BSD License) Py4J (net.sf.py4j:py4j:0.9.2 - http://py4j.sourceforge.net/)
(Two-clause BSD-style license) JUnit-Interface (com.novocode:junit-interface:0.10 - http://github.com/szeiger/junit-interface/)
(BSD licence) sbt and sbt-launch-lib.bash
(BSD 3 Clause) d3.min.js (https://github.com/mbostock/d3/blob/master/LICENSE)
@@ -284,11 +282,17 @@ The text of each license is also included at licenses/LICENSE-[project].txt.
(MIT License) SLF4J API Module (org.slf4j:slf4j-api:1.7.5 - http://www.slf4j.org)
(MIT License) SLF4J LOG4J-12 Binding (org.slf4j:slf4j-log4j12:1.7.5 - http://www.slf4j.org)
(MIT License) pyrolite (org.spark-project:pyrolite:2.0.1 - http://pythonhosted.org/Pyro4/)
(MIT License) scopt (com.github.scopt:scopt_2.10:3.2.0 - https://github.com/scopt/scopt)
(MIT License) scopt (com.github.scopt:scopt_2.11:3.2.0 - https://github.com/scopt/scopt)
(The MIT License) Mockito (org.mockito:mockito-core:1.9.5 - http://www.mockito.org)
(MIT License) jquery (https://jquery.org/license/)
(MIT License) AnchorJS (https://github.com/bryanbraun/anchorjs)
(MIT License) graphlib-dot (https://github.com/cpettitt/graphlib-dot)
(MIT License) dagre-d3 (https://github.com/cpettitt/dagre-d3)
(MIT License) sorttable (https://github.com/stuartlangridge/sorttable)
(MIT License) boto (https://github.com/boto/boto/blob/develop/LICENSE)
(MIT License) datatables (http://datatables.net/license)
(MIT License) mustache (https://github.com/mustache/mustache/blob/master/LICENSE)
(MIT License) cookies (http://code.google.com/p/cookies/wiki/License)
(MIT License) blockUI (http://jquery.malsup.com/block/)
(MIT License) RowsGroup (http://datatables.net/license/mit)
(MIT License) jsonFormatter (http://www.jqueryscript.net/other/jQuery-Plugin-For-Pretty-JSON-Formatting-jsonFormatter.html)
61 changes: 60 additions & 1 deletion NOTICE
@@ -606,4 +606,63 @@ Vis.js uses and redistributes the following third-party libraries:

- keycharm
https://github.com/AlexDM0/keycharm
The MIT License
The MIT License

===============================================================================

The CSS style for the navigation sidebar of the documentation was originally
submitted by Óscar Nájera for the scikit-learn project. The scikit-learn project
is distributed under the 3-Clause BSD license.
===============================================================================

For CSV functionality:

/*
* Copyright 2014 Databricks
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/

/*
* Copyright 2015 Ayasdi Inc
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/


===============================================================================
For dev/sparktestsupport/toposort.py:

Copyright 2014 True Blade Systems, Inc.

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
10 changes: 10 additions & 0 deletions R/README.md
@@ -1,6 +1,16 @@
# R on Spark

SparkR is an R package that provides a light-weight frontend to use Spark from R.
### Installing sparkR

The SparkR libraries need to be created in `$SPARK_HOME/R/lib`. This can be done by running the script `$SPARK_HOME/R/install-dev.sh`.
By default, the script uses the system-wide installation of R. To use a user-installed copy of R instead, set the environment variable `R_HOME` to the full path of the base directory where R is installed before running the `install-dev.sh` script.
Example:
```
# where /home/username/R is where R is installed and /home/username/R/bin contains the files R and Rscript
export R_HOME=/home/username/R
./install-dev.sh
```
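
The lookup described above can be sketched as a small shell function. This is an illustration only, not the code in `install-dev.sh`: the function name `resolve_r_script_path` is hypothetical, and the sketch assumes a bash-like shell where `local`, `command -v`, and `dirname` are available.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of install-dev.sh): prints the directory
# whose R and Rscript binaries would be used for the build.
resolve_r_script_path() {
  local r_home="$1"
  if [ -n "$r_home" ]; then
    # R_HOME was given: use the bin directory underneath it.
    printf '%s\n' "$r_home/bin"
  else
    # R_HOME is empty: fall back to the directory of the first R on PATH.
    local r_bin
    r_bin="$(command -v R)" || return 1
    dirname "$r_bin"
  fi
}

# With R_HOME set, the result does not depend on PATH.
resolve_r_script_path "/home/username/R"   # prints /home/username/R/bin
```

Passing an empty string exercises the fallback branch, which returns a non-zero status when no `R` executable can be found on `PATH`.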

### SparkR development

11 changes: 9 additions & 2 deletions R/install-dev.sh
@@ -35,12 +35,19 @@ LIB_DIR="$FWDIR/lib"
mkdir -p $LIB_DIR

pushd $FWDIR > /dev/null
if [ ! -z "$R_HOME" ]
then
R_SCRIPT_PATH="$R_HOME/bin"
else
R_SCRIPT_PATH="$(dirname "$(which R)")"
fi
echo "USING R_HOME = $R_HOME"

# Generate Rd files if devtools is installed
Rscript -e ' if("devtools" %in% rownames(installed.packages())) { library(devtools); devtools::document(pkg="./pkg", roclets=c("rd")) }'
"$R_SCRIPT_PATH/"Rscript -e ' if("devtools" %in% rownames(installed.packages())) { library(devtools); devtools::document(pkg="./pkg", roclets=c("rd")) }'

# Install SparkR to $LIB_DIR
R CMD INSTALL --library=$LIB_DIR $FWDIR/pkg/
"$R_SCRIPT_PATH/"R CMD INSTALL --library=$LIB_DIR $FWDIR/pkg/

# Zip the SparkR package so that it can be distributed to worker nodes on YARN
cd $LIB_DIR
7 changes: 4 additions & 3 deletions R/pkg/DESCRIPTION
@@ -1,7 +1,7 @@
Package: SparkR
Type: Package
Title: R frontend for Spark
Version: 1.6.0
Version: 2.0.0
Date: 2013-09-09
Author: The Apache Software Foundation
Maintainer: Shivaram Venkataraman <[email protected]>
@@ -18,10 +18,10 @@ Collate:
'schema.R'
'generics.R'
'jobj.R'
'RDD.R'
'pairRDD.R'
'column.R'
'group.R'
'RDD.R'
'pairRDD.R'
'DataFrame.R'
'SQLContext.R'
'backend.R'
@@ -36,3 +36,4 @@ Collate:
'stats.R'
'types.R'
'utils.R'
RoxygenNote: 5.0.1
38 changes: 30 additions & 8 deletions R/pkg/NAMESPACE
@@ -13,7 +13,9 @@ export("print.jobj")
# MLlib integration
exportMethods("glm",
"predict",
"summary")
"summary",
"kmeans",
"fitted")

# Job group lifecycle management methods
export("setJobGroup",
@@ -27,15 +29,22 @@ exportMethods("arrange",
"attach",
"cache",
"collect",
"colnames",
"colnames<-",
"coltypes",
"coltypes<-",
"columns",
"count",
"cov",
"corr",
"covar_samp",
"covar_pop",
"crosstab",
"describe",
"dim",
"distinct",
"drop",
"dropDuplicates",
"dropna",
"dtypes",
"except",
@@ -56,6 +65,7 @@ exportMethods("arrange",
"mutate",
"na.omit",
"names",
"names<-",
"ncol",
"nrow",
"orderBy",
@@ -88,7 +98,10 @@ exportMethods("arrange",
"with",
"withColumn",
"withColumnRenamed",
"write.df")
"write.df",
"write.json",
"write.parquet",
"write.text")

exportClasses("Column")

@@ -98,6 +111,7 @@ exportMethods("%in%",
"add_months",
"alias",
"approxCountDistinct",
"approxQuantile",
"array_contains",
"asc",
"ascii",
@@ -123,15 +137,18 @@ exportMethods("%in%",
"count",
"countDistinct",
"crc32",
"cumeDist",
"hash",
"cume_dist",
"date_add",
"date_format",
"date_sub",
"datediff",
"dayofmonth",
"dayofyear",
"denseRank",
"decode",
"dense_rank",
"desc",
"encode",
"endsWith",
"exp",
"explode",
@@ -188,7 +205,7 @@ exportMethods("%in%",
"next_day",
"ntile",
"otherwise",
"percentRank",
"percent_rank",
"pmod",
"quarter",
"rand",
@@ -200,7 +217,7 @@ exportMethods("%in%",
"rint",
"rlike",
"round",
"rowNumber",
"row_number",
"rpad",
"rtrim",
"second",
@@ -221,6 +238,7 @@ exportMethods("%in%",
"stddev",
"stddev_pop",
"stddev_samp",
"struct",
"sqrt",
"startsWith",
"substr",
@@ -263,8 +281,12 @@ export("as.DataFrame",
"loadDF",
"parquetFile",
"read.df",
"read.json",
"read.parquet",
"read.text",
"sql",
"table",
"str",
"tableToDF",
"tableNames",
"tables",
"uncacheTable")
@@ -276,4 +298,4 @@ export("structField",
"structType",
"structType.jobj",
"structType.structField",
"print.structType")
"print.structType")