Broadcast spatial join #9474

Merged 2 commits on Mar 8, 2018

Commits on Mar 8, 2018

  1. Add benchmark for a broadcast spatial join

    Benchmark                                        (pointCount)  Mode  Cnt      Score     Error  Units
    BenchmarkSpatialJoin.benchmarkJoin                         10  avgt   30     38.842 ±   2.867  ms/op
    BenchmarkSpatialJoin.benchmarkJoin                        100  avgt   30    212.509 ±   9.965  ms/op
    BenchmarkSpatialJoin.benchmarkJoin                       1000  avgt   30   1937.329 ±  79.988  ms/op
    BenchmarkSpatialJoin.benchmarkJoin                      10000  avgt   30  18822.191 ± 460.088  ms/op
    BenchmarkSpatialJoin.benchmarkUserOptimizedJoin            10  avgt   30     15.621 ±   1.221  ms/op
    BenchmarkSpatialJoin.benchmarkUserOptimizedJoin           100  avgt   30     16.939 ±   1.209  ms/op
    BenchmarkSpatialJoin.benchmarkUserOptimizedJoin          1000  avgt   30     29.448 ±   1.990  ms/op
    BenchmarkSpatialJoin.benchmarkUserOptimizedJoin         10000  avgt   30    102.185 ±   4.111  ms/op
    mbasmanova committed Mar 8, 2018
    6628d41
  2. Broadcast spatial join

    Add an optimizer rule that rewrites a cross join with a spatial filter on top into a spatial join, plus custom operators that execute spatial joins efficiently (broadcast joins only).
    
    For example, the plan for the following query
    
    SELECT ...
    FROM points, polygons
    WHERE ST_Contains(ST_GeometryFromText(wkt), ST_Point(longitude, latitude))
    
    is rewritten from
    
    - FilterProject[filterPredicate = "st_contains"("st_geometryfromtext"("wkt"), "st_point"("longitude", "latitude"))] => []
        - CrossJoin => [latitude:double, longitude:double, wkt:varchar]
    
    into
    
    - SpatialJoin["st_contains"("st_geometryfromtext", "st_point")] => []
         - ScanProject[table = ...
                 st_point := "st_point"("longitude", "latitude")
         - LocalExchange[SINGLE] () => st_geometryfromtext:Geometry
             - RemoteExchange[REPLICATE] => st_geometryfromtext:Geometry
                 - Project[] => [st_geometryfromtext:Geometry]
                         st_geometryfromtext := "st_geometryfromtext"("wkt")
                     - ScanFilterProject[table = ...
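
To make the shape of the rewritten plan concrete, here is a minimal Python sketch of a broadcast spatial join. It is not the operator code from this PR: the names `build_side`, `probe`, and `IndexedPolygon` are hypothetical, and a linear scan over bounding boxes stands in for whatever spatial index the real operators use. What it illustrates is the split visible in the plan: the polygon side is prepared once and replicated, while points are streamed past it and tested with the containment predicate.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List, Sequence, Tuple

Point = Tuple[float, float]  # (longitude, latitude)


@dataclass(frozen=True)
class IndexedPolygon:
    """Build-side entry: a polygon ring plus its precomputed bounding box."""
    name: str
    ring: Sequence[Point]
    min_x: float
    min_y: float
    max_x: float
    max_y: float


def build_side(polygons: Iterable[Tuple[str, Sequence[Point]]]) -> List[IndexedPolygon]:
    """Prepare every polygon once; this list is what would be replicated."""
    built = []
    for name, ring in polygons:
        xs = [x for x, _ in ring]
        ys = [y for _, y in ring]
        built.append(IndexedPolygon(name, tuple(ring), min(xs), min(ys), max(xs), max(ys)))
    return built


def contains(ring: Sequence[Point], point: Point) -> bool:
    """Ray-casting point-in-polygon test (boundary handling is simplified)."""
    x, y = point
    inside = False
    for (x1, y1), (x2, y2) in zip(ring, list(ring[1:]) + [ring[0]]):
        if (y1 > y) != (y2 > y):  # edge straddles the horizontal ray
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def probe(points: Iterable[Point], built: List[IndexedPolygon]) -> Iterator[Tuple[Point, str]]:
    """Stream points past the build side; bounding boxes prune most pairs."""
    for p in points:
        for poly in built:
            if poly.min_x <= p[0] <= poly.max_x and poly.min_y <= p[1] <= poly.max_y:
                if contains(poly.ring, p):
                    yield p, poly.name


polygons = [
    ("unit_square", [(0, 0), (1, 0), (1, 1), (0, 1)]),
    ("far_triangle", [(10, 10), (12, 10), (11, 12)]),
]
points = [(0.5, 0.5), (11.0, 10.5), (5.0, 5.0)]
matches = list(probe(points, build_side(polygons)))
print(matches)
```

Unlike the cross join plus filter, this shape never materializes every point/polygon pair, and each polygon is prepared once rather than once per pair.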
    
    Benchmark results:
    
    Benchmark                                        (pointCount)  Mode  Cnt   Score   Error  Units
    BenchmarkSpatialJoin.benchmarkJoin                         10  avgt   30  15.163 ± 1.610  ms/op
    BenchmarkSpatialJoin.benchmarkJoin                        100  avgt   30  13.837 ± 0.919  ms/op
    BenchmarkSpatialJoin.benchmarkJoin                       1000  avgt   30  16.205 ± 1.360  ms/op
    BenchmarkSpatialJoin.benchmarkJoin                      10000  avgt   30  22.915 ± 1.731  ms/op
    BenchmarkSpatialJoin.benchmarkUserOptimizedJoin            10  avgt   30  14.426 ± 1.048  ms/op
    BenchmarkSpatialJoin.benchmarkUserOptimizedJoin           100  avgt   30  14.507 ± 0.518  ms/op
    BenchmarkSpatialJoin.benchmarkUserOptimizedJoin          1000  avgt   30  16.265 ± 1.447  ms/op
    BenchmarkSpatialJoin.benchmarkUserOptimizedJoin         10000  avgt   30  22.076 ± 1.547  ms/op
    mbasmanova committed Mar 8, 2018
    9f2ac1d
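
The two `benchmarkJoin` tables can be compared directly. The short Python sketch below computes the speedup per `pointCount`, assuming the table in the first commit is the baseline measured before the rewrite and the table in the second commit was measured after it. The scores are copied from the commit messages; error bars are ignored, so the ratios are approximate.

```python
# Scores (ms/op) copied from the benchmarkJoin rows of the two commits.
before = {10: 38.842, 100: 212.509, 1000: 1937.329, 10000: 18822.191}
after = {10: 15.163, 100: 13.837, 1000: 16.205, 10000: 22.915}

# Ratio of baseline time to post-rewrite time for each point count.
speedup = {n: before[n] / after[n] for n in before}

for n, s in speedup.items():
    print(f"pointCount={n:>5}: {s:7.1f}x faster")
```

Under that assumption the gain grows with input size, from roughly 2.6x at 10 points to more than 800x at 10,000 points, and after the rewrite `benchmarkJoin` lands within the error bars of `benchmarkUserOptimizedJoin`.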