Optimize pattern matching on constants #1185

ospencer · 2022-04-15T22:02:15Z

When matching on constants, we essentially compile something like match (...) { 'a' => ..., 'b' => ..., ... } into an if chain of if (val == 'a') { goto branch 1 } else if (val == 'b') { goto branch 2 } else .... We can instead compile them as though they were variants, and just do a br_table.

The text was updated successfully, but these errors were encountered:

spotandjake · 2024-10-08T02:18:40Z

Investigating this a bit, it does not seem as though using a br_table is always the most efficent, if you have sparse datasets the table is not the best optionally additionally we need to map the data to a 0 based range as such it probably makes sense to use a few heuristics here notably:

Primarily continuous dataset (We can have a few caps and just direct them to the default branch manually)
More then a few elements as br_table is only really more efficent then branching in the case that checking all the options would take longer.
for void constants it would make sense to just eliminate the match all together and do a direct jump.
for bool constants it would make sense to build a select statement instead of a conditional

We can however do an additional optimization when implementing this that is mentioned in the todo comment and even if the dataset is sparse we can load the number data from the heap and untag and do a direct match for some of our number types to speed things up.

My current plan of attack is something like the following, (some furthur optimizations could be made such as on large match statements if we notice there are subranges of data we could break it into multiple chained jump tables or a mix of jumps and conditionals):

// Plan
// Get rid of Conditional (Always compile to a table)
// Two table types, Jump_Table, Conditional_Table
// Ensure all constants are the same type
// Jump_Table: The range of the data is some heuristic relative to the branch count (We want the data to be contigous)
// Steps:
// 1) Validate all constants are the same type
// 2) Check if our constant is eligable for our table based approach (i.e != string, bytes, rational)
// 3) Determine the min, and max constant values
// 4) compare the range of the data to the branch paths if our data looks contigous make jump table otherwise conditional
// 5) Generate our mapping code should just be value-min
// 6) Generate our table for all branches
// 7) If we failed at any step generate a conditional approach like before
// Notes:
// The number type is interesting I think it is possible to handle them but we need to look
// Conditional_Table: Our generic Fallback
// table_type = Jump_Table | Conditional_Table
// Table(
// table_type: table_type,
// alias: Ident.t,
// branches: list((const, descision_tree)),
// min: int,
// max: int,
// default_branch: decision_tree
// ) -- We should be able to untag and subtract min, (Untagging heap is different)

ospencer mentioned this issue Apr 15, 2022

feat(compiler): Support pattern matching "or" patterns #1173

Merged

spotandjake added enhancement compiler labels Mar 6, 2023

spotandjake linked a pull request Oct 9, 2024 that will close this issue

feat(compiler): Improve constant pattern matching #2173

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Optimize pattern matching on constants #1185

Optimize pattern matching on constants #1185

ospencer commented Apr 15, 2022

spotandjake commented Oct 8, 2024 •

edited

Loading

Optimize pattern matching on constants #1185

Optimize pattern matching on constants #1185

Comments

ospencer commented Apr 15, 2022

spotandjake commented Oct 8, 2024 • edited Loading

spotandjake commented Oct 8, 2024 •

edited

Loading