Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

distsql: distributed hash join #10438

Merged
merged 2 commits into from
Nov 3, 2016

Conversation

irfansharif
Copy link
Contributor

@irfansharif irfansharif commented Nov 3, 2016

Implementation of the HashJoinerSpec with support for INNER JOIN,
FULL OUTER, LEFT OUTER and RIGHT OUTER joins.
Implementation is the standard hash join algorithm, we distinctly use the
left row source/stream to construct our 'seen' hash map/set and use
the right row source/stream to probe in.

No strong opinions of some of the names chosen here (see bucket/
buckets), open to suggestions.
Part of #7587.

cc @RaduBerinde


This change is Reviewable

Implementation of the HashJoinerSpec with support for INNER JOIN, FULL
OUTER, LEFT OUTER and RIGHT OUTER joins. Implementation is the standard
hash join algorithm, we distinctly use the left row source/stream to
construct our 'seen' hash map/set and use the right row source/stream to
probe in.
@RaduBerinde
Copy link
Member

Wow, you are very fast!

:lgtm:


Review status: 0 of 8 files reviewed at latest revision, 2 unresolved discussions, all commit checks successful.


pkg/sql/distsql/hashjoiner.go, line 35 at r2 (raw file):

}

// TODO(irfansharif): It's trivial to use the grace hash join algorithm by using

I would remove this comment - this is exactly the plan, and we don't need to do anything extra in this layer to support it - it's up to the physical planner to set it up. See https://github.com/cockroachdb/cockroach/blob/master/docs/RFCS/distributed_sql.md#stream-joins


pkg/sql/distsql/processors.proto, line 192 at r2 (raw file):

// right row (i+1).
message HashJoinerSpec {
  // The join constraints certain columns from the left stream to equal

This comment applies to both left_eq_columns, right_eq_columns. We should keep them next to each other.


Comments from Reviewable

pulls out common structure between the hashJoiner and mergeJoiner
implementation, similar fashion to what was done with readerbase with
respect to table reader and join reader.
@irfansharif irfansharif force-pushed the distributed-hash-joiner branch from d4b9d17 to 1a84aaa Compare November 3, 2016 21:14
@irfansharif
Copy link
Contributor Author

hahaha, thank you! pulling in once CI is green.


Review status: 0 of 8 files reviewed at latest revision, 2 unresolved discussions, some commit checks pending.


pkg/sql/distsql/hashjoiner.go, line 35 at r2 (raw file):

Previously, RaduBerinde wrote…

I would remove this comment - this is exactly the plan, and we don't need to do anything extra in this layer to support it - it's up to the physical planner to set it up. See https://github.com/cockroachdb/cockroach/blob/master/docs/RFCS/distributed_sql.md#stream-joins

oops, Done.

pkg/sql/distsql/processors.proto, line 192 at r2 (raw file):

Previously, RaduBerinde wrote…

This comment applies to both left_eq_columns, right_eq_columns. We should keep them next to each other.

Done.

Comments from Reviewable

@irfansharif irfansharif merged commit 20c41d4 into cockroachdb:master Nov 3, 2016
@irfansharif irfansharif deleted the distributed-hash-joiner branch November 3, 2016 21:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants