Implement Query Partitioning #1094

Shelnutt2 · 2018-12-13T21:44:52Z

For high-level applications such as Presto or Spark, a query partitioner that can break a query into optimally sized subarrays would be beneficial. Ideally, the partition function takes one or more subarrays and the desired number of partitions as input and returns a list of new subarrays to query.
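To make the requested shape concrete, here is a minimal sketch in Python. It is not the TileDB API: the names `partition_subarrays`, `split_longest_dim`, and `cell_count` are hypothetical, subarrays are assumed to be tuples of inclusive integer `(low, high)` ranges, and the strategy (repeatedly halving the largest piece along its longest dimension) is just one simple choice that ignores how the data is actually distributed.

```python
import heapq
from typing import List, Tuple

Range = Tuple[int, int]          # inclusive (low, high) bounds of one dimension
Subarray = Tuple[Range, ...]     # one range per dimension (assumed representation)


def cell_count(sub: Subarray) -> int:
    """Number of cells covered by a subarray with inclusive integer bounds."""
    n = 1
    for lo, hi in sub:
        n *= hi - lo + 1
    return n


def split_longest_dim(sub: Subarray) -> Tuple[Subarray, Subarray]:
    """Halve a subarray along its longest dimension."""
    d = max(range(len(sub)), key=lambda i: sub[i][1] - sub[i][0])
    lo, hi = sub[d]
    mid = (lo + hi) // 2
    left = sub[:d] + ((lo, mid),) + sub[d + 1:]
    right = sub[:d] + ((mid + 1, hi),) + sub[d + 1:]
    return left, right


def partition_subarrays(subarrays: List[Subarray],
                        num_partitions: int) -> List[Subarray]:
    """Split the largest subarray repeatedly until num_partitions pieces exist."""
    # Max-heap keyed on cell count (negated, since heapq is a min-heap).
    heap = [(-cell_count(s), s) for s in subarrays]
    heapq.heapify(heap)
    while len(heap) < num_partitions:
        neg_cells, largest = heapq.heappop(heap)
        if -neg_cells <= 1:
            # A single cell cannot be split further; put it back and stop.
            heapq.heappush(heap, (neg_cells, largest))
            break
        for piece in split_longest_dim(largest):
            heapq.heappush(heap, (-cell_count(piece), piece))
    return sorted(s for _, s in heap)


parts = partition_subarrays([((1, 100), (1, 50))], 4)
```

Here `parts` holds four subarrays of 1250 cells each. Equal cell counts only mean equal work on dense arrays, which is why the sparse case needs separate treatment.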

Implementing heuristics so that partitions can be balanced for sparse arrays is important. Currently, Presto and Spark each implement their own naive partitioning, which can result in unbalanced reads on a sparse array.
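As a rough illustration of what a balancing heuristic could look like, the sketch below cuts a single dimension at quantiles of known cell coordinates instead of at equal-width boundaries. Everything here is an assumption for illustration: `balanced_ranges` is a hypothetical helper, not part of TileDB, and it presumes a sample of non-empty coordinates within the domain is available (for example from a prior sampling read).

```python
from typing import List, Sequence, Tuple


def balanced_ranges(coords: Sequence[int],
                    domain: Tuple[int, int],
                    num_partitions: int) -> List[Tuple[int, int]]:
    """Cut `domain` into contiguous inclusive ranges that hold roughly
    equal numbers of the given coordinates (assumed to lie in `domain`)."""
    lo, hi = domain
    ordered = sorted(coords)
    ranges = []
    start = lo
    for k in range(1, num_partitions):
        # The coordinate at the k-th quantile closes the current range.
        idx = k * len(ordered) // num_partitions - 1
        if idx < 0:
            continue  # too few coordinates to place this cut
        end = ordered[idx]
        if end < start or end >= hi:
            continue  # duplicate cut, or a cut that would empty the last range
        ranges.append((start, end))
        start = end + 1
    ranges.append((start, hi))
    return ranges


# Skewed sparse data: most cells sit at the low end of the domain.
sample = [1, 2, 3, 4, 5, 6, 7, 8, 9, 500, 900, 1000]
ranges = balanced_ranges(sample, (1, 1000), 3)
# ranges == [(1, 4), (5, 8), (9, 1000)]
```

Each of the three ranges covers four of the sampled cells, whereas equal-width ranges over the same domain would cover nine, one, and two. The function may return fewer ranges than requested when the sample has too few distinct coordinates.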

tdenniston · 2019-04-25T13:05:42Z

An experimental partitioner was added in #1197. See also #1225; a decision still needs to be made on whether to provide an API for this.

Shelnutt2 added the enhancement label Dec 13, 2018

Shelnutt2 added this to the 1.6.0 milestone Dec 13, 2018

jakebolewski added the API Addition label Dec 13, 2018

tdenniston modified the milestones: 1.6.0, 1.7.0 Jun 13, 2019

tdenniston modified the milestones: 1.7.0, 1.8.0 Nov 12, 2019

joe-maley removed this from the 1.8.0 milestone Oct 1, 2020
