Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor: simplify relational operators #2919

Closed
cpcloud opened this issue Aug 30, 2021 · 6 comments
Closed

refactor: simplify relational operators #2919

cpcloud opened this issue Aug 30, 2021 · 6 comments
Labels
breaking change Changes that introduce an API break at any level chained joins The bane of Ibis's existence internals Issues or PRs related to ibis's internal APIs refactor Issues or PRs related to refactoring the codebase

Comments

@cpcloud
Copy link
Member

cpcloud commented Aug 30, 2021

Many logically distinct operations like filter, project, and aggregate are fused together into a single operation.

While this was useful initially for generating clean SQL, it has a number of downsides regarding maintenance and complexity of the library:

  1. The current design has led to a very tight coupling between the logical representation of an expression tree, and its optimization.
  2. It becomes difficult to maintain the behavior of these operators in isolation, because one must handle every aspect of a non-join relation when dealing with them in pretty much every context. There's no way to just think about a predicate.

To the end of simplifying the design and maintenance of ibis, I propose two related internal changes.

Proposal:

Separate Selection into distinct operators for each operation:

  1. Project
  2. Filter
  3. OrderBy

Whittle down Aggregation into a class with only metrics and by as instance members, and use where to implement the having API, and Filter and OrderBy to take the place of predicates and sort_keys respectively.

@cpcloud cpcloud added the internals Issues or PRs related to ibis's internal APIs label Aug 30, 2021
@cpcloud cpcloud self-assigned this Aug 30, 2021
@jreback
Copy link
Contributor

jreback commented Aug 31, 2021

would for sure be +1 here

maybe could even do this w/o breaking the world (eg back compatible)

@cpcloud
Copy link
Member Author

cpcloud commented Aug 31, 2021

Yeah, I think this can be done in a backwards compatible way. The generated SQL for certain backends will almost certainly change though.

@icexelloss
Copy link
Contributor

+1 to this. We internally depend on it currently being projection (isinstance checks) but not a huge deal to update to using the new classes.

@cpcloud cpcloud changed the title DSN: Simplify relational operators refactor: simplify relational operators Dec 28, 2021
@cpcloud cpcloud added breaking change Changes that introduce an API break at any level refactor Issues or PRs related to refactoring the codebase labels Dec 28, 2021
@cpcloud cpcloud modified the milestones: 3.0.0, 4.0.0 Jan 14, 2022
@cpcloud cpcloud modified the milestones: 4.0.0, 4.x Jul 6, 2022
@cpcloud cpcloud modified the milestones: 4.x, 4.0.0 Sep 16, 2022
@cpcloud cpcloud removed this from the 4.0.0 milestone Nov 7, 2022
@lostmygithubaccount
Copy link
Member

@cpcloud is this issue still valid?

@cpcloud cpcloud added the chained joins The bane of Ibis's existence label Dec 16, 2023
@cpcloud
Copy link
Member Author

cpcloud commented Dec 16, 2023

@lostmygithubaccount Yep, happening as part of #7580!

@cpcloud cpcloud linked a pull request Dec 16, 2023 that will close this issue
@cpcloud cpcloud removed their assignment Dec 23, 2023
@cpcloud
Copy link
Member Author

cpcloud commented Jan 27, 2024

Done in the-epic-split!

@cpcloud cpcloud closed this as completed Jan 27, 2024
@github-project-automation github-project-automation bot moved this from backlog to done in Ibis planning and roadmap Jan 27, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
breaking change Changes that introduce an API break at any level chained joins The bane of Ibis's existence internals Issues or PRs related to ibis's internal APIs refactor Issues or PRs related to refactoring the codebase
Projects
Archived in project
Development

Successfully merging a pull request may close this issue.

4 participants