Gen4 Tracking #7280
Labels
Component: Query Serving
Type: Enhancement
Logical improvement (somewhere between a bug and feature)
systay added the Type: Enhancement label on Jan 11, 2021
Should we close this issue? It seems very old and redundant now.
This issue is meant to track the ongoing work on the Gen4 planner.
The Gen4 planner is a new planner in Vitess that explores many different join alternatives and uses a little bit of cost analysis to pick the cheapest plan. In contrast, the V3 planner merges and joins tables from left to right, which made it important for the user to list tables in a good order so that the planner could produce an efficient route.
The gen4 algorithm that we will start implementing is based on the GOO paper (http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.31.737), but the infrastructure for it can be reused for other models.
This is a larger rewrite of the vtgate planner. It introduces new passes and intermediate representations of the query.
The old code used these passes over the query:
This refactored planner now uses the following passes:
By splitting the planning process into smaller pieces, each part can be simplified and extended to do more.
Here follows a short description of each new pass.
Semantic Analysis
Responsibilities: Scoping, Binding
Walks the AST and does scoping and binding, so whenever a column name is found, the planner knows which table is being referenced. Tables are given a TableSet identifier, a bitmask struct that allows the planner to quickly find the dependencies of every expression.
Extract Query Graph
Responsibilities: Extract Subqueries, Create Query Graph
The query graph is an intermediate representation that is designed to allow the route planner to quickly consider many different solutions for the query. Instead of keeping the query in the AST, which is limited by its tree structure, we produce a graph-like representation with all used tables (nodes) in one list, and the edges between them in a separate list.
In this pass, subqueries are extracted into a list of queries and the relationships between them. This makes it easier for later passes to plan fully without having to switch back and forth between passes - when doing route planning, we can do all of route planning in one go and don't have to wait for SELECT expressions to be considered before planning subqueries used in SELECT expressions.
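To make the two ideas above concrete, here is a minimal sketch of a bitmask table identifier and a query graph that keeps nodes and edges in flat lists. Only the TableSet name comes from the description; every other type, method, and field name below is hypothetical, and the sketch assumes at most 64 tables per query. It is not the actual vtgate code.

```go
package main

import (
	"fmt"
	"math/bits"
)

// TableSet is a bitmask: bit i is set when table i is referenced.
// Assumption: a uint64 is enough, i.e. at most 64 tables per query.
type TableSet uint64

// SingleTable returns the set containing only the table at the given index.
func SingleTable(idx int) TableSet { return TableSet(1) << uint(idx) }

// Merge returns the union of both sets.
func (ts TableSet) Merge(other TableSet) TableSet { return ts | other }

// IsSolvedBy reports whether every table in ts is also present in other.
func (ts TableSet) IsSolvedBy(other TableSet) bool { return ts&other == ts }

// NumberOfTables counts the tables in the set.
func (ts TableSet) NumberOfTables() int { return bits.OnesCount64(uint64(ts)) }

// QueryTable is a node in the graph (hypothetical shape).
type QueryTable struct {
	ID   TableSet
	Name string
}

// Edge is a predicate together with the tables it depends on.
type Edge struct {
	Deps      TableSet
	Predicate string
}

// QueryGraph keeps nodes and edges in two separate flat lists.
type QueryGraph struct {
	Tables []QueryTable
	Edges  []Edge
}

// AddTable registers a table and returns its single-bit identifier.
func (qg *QueryGraph) AddTable(name string) TableSet {
	id := SingleTable(len(qg.Tables))
	qg.Tables = append(qg.Tables, QueryTable{ID: id, Name: name})
	return id
}

// AddPredicate records a predicate and the tables it needs.
func (qg *QueryGraph) AddPredicate(pred string, deps TableSet) {
	qg.Edges = append(qg.Edges, Edge{Deps: deps, Predicate: pred})
}

// PredicatesFor returns the predicates that can be evaluated once the
// given tables are available.
func (qg *QueryGraph) PredicatesFor(available TableSet) []string {
	var out []string
	for _, e := range qg.Edges {
		if e.Deps.IsSolvedBy(available) {
			out = append(out, e.Predicate)
		}
	}
	return out
}

func main() {
	qg := &QueryGraph{}
	a := qg.AddTable("a")
	b := qg.AddTable("b")
	c := qg.AddTable("c")
	qg.AddPredicate("a.id = b.a_id", a.Merge(b))
	qg.AddPredicate("b.id = c.b_id", b.Merge(c))
	qg.AddPredicate("c.x = 1", c)
	fmt.Println(qg.PredicatesFor(a.Merge(b)))
	fmt.Println(qg.PredicatesFor(b.Merge(c)))
}
```

Because dependencies are plain bitmasks, checking whether a predicate can be pushed to a given group of tables is a single AND and compare, which is what lets a planner try many table groupings cheaply.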
Route planning
Responsibilities: Plan how to route the query - plan FROM and WHERE
This pass uses dynamic programming to consider all combinations of tables in order to find the optimal plan. Optimal here means the minimal number of route primitives in the plan.
At the end of this stage, we have a tree structure that represents all the route primitives needed and how they should be joined.
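As a rough illustration of the subset-based dynamic programming described above, the sketch below compares a left-to-right join order with a search over all table subsets, using the number of route primitives as the cost. The merge rule is a deliberate simplification and an assumption of this sketch: two single routes merge into one only when they target the same keyspace. Real merge decisions depend on more than that, and all names here (plan, join, leftToRight, bestPlan) are hypothetical.

```go
package main

import (
	"fmt"
	"math/bits"
)

// plan summarises a sub-plan: how many route primitives it needs and, when
// it is a single route, which keyspace that route targets.
type plan struct {
	routes   int
	keyspace string // empty unless routes == 1
}

// join combines two sub-plans. Assumption for this sketch: two single
// routes merge into one route only when they target the same keyspace.
func join(l, r plan) plan {
	if l.routes == 1 && r.routes == 1 && l.keyspace == r.keyspace {
		return plan{routes: 1, keyspace: l.keyspace}
	}
	return plan{routes: l.routes + r.routes}
}

// leftToRight joins the tables in the order they are listed.
// keyspaces[i] is the keyspace of table i; the slice must be non-empty.
func leftToRight(keyspaces []string) plan {
	p := plan{routes: 1, keyspace: keyspaces[0]}
	for _, ks := range keyspaces[1:] {
		p = join(p, plan{routes: 1, keyspace: ks})
	}
	return p
}

// bestPlan fills a table indexed by table subset (a bitmask) with the
// cheapest plan found for that subset, trying every way of splitting the
// subset into a left and a right side. The slice must be non-empty.
func bestPlan(keyspaces []string) plan {
	n := len(keyspaces)
	best := make([]plan, 1<<uint(n))
	for set := 1; set < len(best); set++ {
		if bits.OnesCount(uint(set)) == 1 {
			idx := bits.TrailingZeros(uint(set))
			best[set] = plan{routes: 1, keyspace: keyspaces[idx]}
			continue
		}
		best[set] = plan{routes: n + 1} // worse than any real plan
		// Enumerate every non-empty proper subset of set as the left side.
		// Both halves are numerically smaller than set, so they are already solved.
		for left := (set - 1) & set; left > 0; left = (left - 1) & set {
			cand := join(best[left], best[set^left])
			if cand.routes < best[set].routes {
				best[set] = cand
			}
		}
	}
	return best[len(best)-1]
}

func main() {
	// Tables listed in an unlucky order: the first and third share a keyspace.
	keyspaces := []string{"ks1", "ks2", "ks1"}
	fmt.Println("left to right:", leftToRight(keyspaces).routes)
	fmt.Println("subset search:", bestPlan(keyspaces).routes)
}
```

Exhaustive subset enumeration grows exponentially with the number of tables, which is one reason a greedy approach such as the GOO algorithm mentioned earlier is attractive for larger queries.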
Horizon planning
Responsibilities: Plan projections, aggregations, grouping and ordering
Once we have a plan for how to route queries, we plan what projections we need from each route, and how to do ORDER BY/GROUP BY/LIMIT et al.
Positive outcomes from this refactoring.
Why do this non-trivial piece of work?
We still have a number of query types that are not supported. To support more queries, we need to extend the planner. Instead of adding to the legacy planner, which is not very easy to work with, we felt that it was time to introduce this new design, which will not only allow us to support these queries but also sets us up to do more optimisations in the future.
Known tasks:
SELECT expressions (Planner refactoring #7103)
A join B vs B join A (Gen4 Planner: AxB vs BxA #7274)
Clone() methods for AST structs (Update AST helper generation #7558)