[SPARK-3085] [SQL] Use compact data structures in SQL joins #1993

mateiz · 2014-08-17T02:47:01Z

This reuses the CompactBuffer from Spark Core to save memory and pointer
dereferences. I also tried AppendOnlyMap instead of java.util.HashMap,
but unfortunately that slows things down, because it seems to make more
equals() calls, and equals() on GenericRow, and especially JoinedRow,
is pretty expensive.
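
To make the idea above concrete, here is a minimal sketch of a hash-join build and probe that keeps each key's matches in a compact buffer. It is not the code from this patch. `SmallBuffer` is a hypothetical, simplified stand-in for Spark's `CompactBuffer` (which is `private[spark]`), and `HashJoinSketch`, `buildHashTable`, `probe`, and the `Row = Seq[Any]` alias are illustrative names only. The sketch assumes the saving comes from holding the first two elements in fields, so keys with one or two matching rows never allocate a backing array.

```scala
import java.util.{HashMap => JavaHashMap}

// Hypothetical stand-in for a compact append-only buffer: the first two
// elements live in fields, and an array is allocated only for a third.
final class SmallBuffer[T <: AnyRef] {
  private var element0: T = null.asInstanceOf[T]
  private var element1: T = null.asInstanceOf[T]
  private var otherElements: Array[AnyRef] = null
  private var curSize = 0

  def append(elem: T): Unit = {
    if (curSize == 0) element0 = elem
    else if (curSize == 1) element1 = elem
    else {
      val idx = curSize - 2
      if (otherElements == null) {
        otherElements = new Array[AnyRef](4)
      } else if (idx == otherElements.length) {
        // Grow the overflow array by doubling when it is full.
        otherElements = java.util.Arrays.copyOf(otherElements, otherElements.length * 2)
      }
      otherElements(idx) = elem
    }
    curSize += 1
  }

  def size: Int = curSize

  def iterator: Iterator[T] = new Iterator[T] {
    private var pos = 0
    def hasNext: Boolean = pos < curSize
    def next(): T = {
      if (!hasNext) throw new NoSuchElementException("next on empty iterator")
      val elem =
        if (pos == 0) element0
        else if (pos == 1) element1
        else otherElements(pos - 2).asInstanceOf[T]
      pos += 1
      elem
    }
  }
}

object HashJoinSketch {
  // Illustrative row type; Spark SQL's Row classes are not used here.
  type Row = Seq[Any]

  // Build side: group rows by join key in a java.util.HashMap.
  def buildHashTable(rows: Iterator[Row], keyOf: Row => Row): JavaHashMap[Row, SmallBuffer[Row]] = {
    val table = new JavaHashMap[Row, SmallBuffer[Row]]()
    rows.foreach { row =>
      val key = keyOf(row)
      var matches = table.get(key)
      if (matches == null) {
        matches = new SmallBuffer[Row]
        table.put(key, matches)
      }
      matches.append(row)
    }
    table
  }

  // Probe side: pair each streamed row with every build row sharing its key.
  def probe(
      table: JavaHashMap[Row, SmallBuffer[Row]],
      streamed: Iterator[Row],
      keyOf: Row => Row): Iterator[(Row, Row)] =
    streamed.flatMap { s =>
      val matches = table.get(keyOf(s))
      if (matches == null) Iterator.empty else matches.iterator.map(b => (s, b))
    }
}
```

For example, `HashJoinSketch.buildHashTable(rows, r => Seq(r.head))` groups rows by their first column. The more join keys that match only one or two build rows, the more this layout should help, since none of those keys need an array.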

SparkQA · 2014-08-17T02:50:08Z

QA tests have started for PR 1993 at commit 5f903ee.

This patch merges cleanly.

SparkQA · 2014-08-17T03:57:21Z

QA tests have finished for PR 1993 at commit 5f903ee.

This patch fails unit tests.
This patch merges cleanly.
This patch adds no public classes.

mateiz · 2014-08-17T05:34:06Z

Jenkins, retest this please

SparkQA · 2014-08-17T05:40:12Z

QA tests have started for PR 1993 at commit 5f903ee.

This patch merges cleanly.

SparkQA · 2014-08-17T06:50:43Z

QA tests have finished for PR 1993 at commit 5f903ee.

This patch fails unit tests.
This patch merges cleanly.
This patch adds no public classes.

marmbrus · 2014-08-18T02:01:57Z

sql/core/src/main/scala/org/apache/spark/sql/execution/joins.scala

 import org.apache.spark.sql.catalyst.expressions._
 import org.apache.spark.sql.catalyst.plans._
 import org.apache.spark.sql.catalyst.plans.physical._
+import org.apache.spark.util.collection.{AppendOnlyMap, CompactBuffer}


We aren't actually using AppendOnlyMap though, right?

mateiz · 2014-08-18

Ah, true. I'll remove the import.

SparkQA · 2014-08-18T03:50:17Z

QA tests have started for PR 1993 at commit 188221e.

This patch merges cleanly.

SparkQA · 2014-08-18T04:56:14Z

QA tests have finished for PR 1993 at commit 188221e.

This patch fails unit tests.
This patch merges cleanly.
This patch adds no public classes.

marmbrus · 2014-08-18T17:45:08Z

Looking at the various runs, the only tests this patch fails consistently are the Thrift server tests, so I'm going to go ahead and merge it. Thanks, Matei.

This reuses the CompactBuffer from Spark Core to save memory and pointer dereferences. I also tried AppendOnlyMap instead of java.util.HashMap but unfortunately that slows things down because it seems to do more equals() calls and the equals on GenericRow, and especially JoinedRow, is pretty expensive.

Author: Matei Zaharia <[email protected]>

Closes #1993 from mateiz/spark-3085 and squashes the following commits:

188221e [Matei Zaharia] Remove unneeded import
5f903ee [Matei Zaharia] [SPARK-3085] [SQL] Use compact data structures in SQL joins

(cherry picked from commit 4bf3de7)
Signed-off-by: Michael Armbrust <[email protected]>

marmbrus reviewed Aug 18, 2014
Remove unneeded import

188221e

asfgit closed this in 4bf3de7 Aug 18, 2014

kiszk mentioned this pull request Dec 19, 2017

[SPARK-22668][SQL] Ensure no global variables in arguments of method split by CodegenContext.splitExpressions() #20021

Closed