Added support for page-level filter pushdown (indexes) #102

jorgecarleitao · 2022-03-17T17:56:04Z

With #100 merged, we can now work on using indexes to support page-level filter pushdown via indexes.

This issue describe my initial ideas about the topic:

Design using column indexes

Say we have two columns in a row group, c1 and c2, with the following page structure:

c1: [----][----][--][--] (p11, p12, p13, p14)
c2: [----][--][----][--] (p21, p22, p23, p24)

and that we have a filter over c1 that selects it as follows:

selected:        ----      --
c1      : [----][----][--][--] (p11, p12, p13, p14)
c2      : [----][--][----][--] (p21, p22, p23, p24)

the goal is to iterate over c1 and c2 so that we select rows from them accordingly:

            I_1      I_2
           ----      --
c1: [----][----][--][--] (p12, p14)
c2: [----][--][----][--] (p22, p23, p24)

For this, 3 steps come to mind:

compute the row intervals I_j = (start, len)
for each column, compute the set of pages that fill all intervals
pass pages (and their ranges) to consumers to deserialize to their favorite in-memory format

For c1 these are full pages; for c2 these are slices of pages (slices of p22,p23,p24).

Implementation

Because ColumnIndex has encoded data, we need a trait object Index to describe the
different physical layouts, just like we do for statistics, so that consumers do not need to worry about de-serializing the values prior to using them:

pub trait Index: Send + Sync + std::fmt::Debug {
    fn as_any(&self) -> &dyn Any;

    fn physical_type(&self) -> &PhysicalType;
}

/// The index of a page, containing the min and max values of the page.
pub struct PageIndex<T> {
    /// The minimum value in the page. It is None when all values are null
    pub min: Option<T>,
    /// The maximum value in the page. It is None when all values are null
    pub max: Option<T>,
    /// The number of null values in the page
    pub null_count: Option<i64>,
}

/// An index of a column of [`NativeType`] physical representation
pub struct NativeIndex<T: NativeType> {
    pub indexes: Vec<PageIndex<T>>,
    pub boundary_order: BoundaryOrder,
}

impl<T: NativeType> Index for NativeIndex<T> {
    fn as_any(&self) -> &dyn Any {
        self
    }

    fn physical_type(&self) -> &PhysicalType {
        &T::TYPE
    }
}

/// An index of a column of bytes physical type
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct ByteIndex {
    pub indexes: Vec<PageIndex<Vec<u8>>>,
    pub boundary_order: BoundaryOrder,
}

...

For step 1, we need to compute the set of intervals that selects the relevant rows from a set of pages:

pub fn compute_rows<'a, T>(
    index: &'a [PageIndex<T>],
    locations: &[PageLocation],
    num_rows: u64,
    selector: &dyn Fn(&'a PageIndex<T>) -> bool,
) -> Result<Vec<Interval>, ParquetError>;

where:

PageLocation contains the location of the page in the file (in bytes)
Interval contain the start and length (in number of rows) of the elements
to retrieve from said page. The invariant is start + length <= page.num_rows().

(multiple-column selectors are computed by the overlapping of each column's interval).

For step 2, we need a function to select pages from any column chunk:

pub fn select_pages(
    intervals: &[Interval],
    locations: &[PageLocation],
    num_rows: u64,
) -> Result<Vec<FilteredPage>, ParquetError>;

where FilteredPage is something like

/// An enum describing a page that was either selected in a filter pushdown or skipped
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub enum FilteredPage {
    Select {
        /// Location of the page in the file in bytes
        start: u64,
        /// the length of the page in bytes
        length: usize,
        /// Location of rows to select in the page
        rows_offset: usize,
        rows_length: usize,
    },
    Skip {
        /// Location of the page in the file
        start: u64,
        /// the length of the page in bytes
        length: usize,
        /// number of rows that are skip by skipping this page
        num_rows: usize,
    },
}

With this struct, we have everything we need to implement a reader that skips pages that are not needed. When pages are selected, the rows_offset and rows_length are passed along so that consumers know how which part of the page they must decode.

Consumers (of pages) that apply this filter must be able to decode pages with an offset (Rust's .skip) and length (Rust's .take).

The text was updated successfully, but these errors were encountered:

jorgecarleitao added the feature A new feature label Mar 17, 2022

jorgecarleitao self-assigned this Mar 17, 2022

jorgecarleitao mentioned this issue Mar 20, 2022

Added support for reading indexed parquet pages jorgecarleitao/arrow2#923

Merged

2 tasks

jorgecarleitao mentioned this issue Apr 4, 2022

Fixed design error in filtered pages #114

Merged

jorgecarleitao closed this as completed in #114 Apr 4, 2022

jorgecarleitao mentioned this issue Apr 4, 2022

Feature Request: Predicate Pushdown #101

Closed

jorgecarleitao changed the title ~~Add support for page-level filter pushdown (indexes)~~ Added support for page-level filter pushdown (indexes) Apr 15, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Added support for page-level filter pushdown (indexes) #102

Added support for page-level filter pushdown (indexes) #102

jorgecarleitao commented Mar 17, 2022 •

edited

Loading

Added support for page-level filter pushdown (indexes) #102

Added support for page-level filter pushdown (indexes) #102

Comments

jorgecarleitao commented Mar 17, 2022 • edited Loading

Design using column indexes

Implementation

jorgecarleitao commented Mar 17, 2022 •

edited

Loading