Simplify cohort filters #6277
ClickHouse query benchmark results from GitHub Actions

Lower numbers are good, higher numbers are bad. A ratio less than 1 means the PR is faster than master.

Significantly changed benchmark results (PR vs master):

       before            after          ratio
     [04045ff4]        [7203ca4e]
+    7617.0±2.2e+02    11564.0±2.7e+02   1.52  benchmarks.QuerySuite.track_trends_filter_by_cohort
-    7158.5±59         5905.5±1.4e+02    0.82  benchmarks.QuerySuite.track_trends_filter_by_cohort_materialized
All benchmarks:

       before            after          ratio
     [04045ff4]        [7203ca4e]
     19180.5±1.4e+03   20986.0±1.2e+03   1.09  benchmarks.QuerySuite.track_trends_event_property_filter
     4276.0±4.3e+02    4184.5±88         0.98  benchmarks.QuerySuite.track_trends_event_property_filter_materialized
+    7617.0±2.2e+02    11564.0±2.7e+02   1.52  benchmarks.QuerySuite.track_trends_filter_by_cohort
-    7158.5±59         5905.5±1.4e+02    0.82  benchmarks.QuerySuite.track_trends_filter_by_cohort_materialized
     5592.0±2.6e+02    5416.0±1.3e+02    0.97  benchmarks.QuerySuite.track_trends_filter_by_cohort_precalculated
     2074.0±40         2107.5±22         1.02  benchmarks.QuerySuite.track_trends_no_filter
     11185.5±72        11656.0±88        1.04  benchmarks.QuerySuite.track_trends_person_property_filter
     5879.5±65         6072.0±81         1.03  benchmarks.QuerySuite.track_trends_person_property_filter_materialized
all points I had were covered in the description.
Thought I had while reading through this PR: would be nice to have some sort of full print of queries generated by certain functions. (We've talked about this previously but just thought of it again)
Absolutely. I'm waiting for the dice to fall on the way we'll write queries - if we use pytest going forward, we can bring in a functioning snapshot library.
Changes
Follow-up to #6221
This PR starts "simplifying" cohort filters. The idea is to have the business logic bits happen before we do query building, which allows for better optimizations, e.g. avoiding unneeded subqueries.
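To illustrate the idea only, here is a minimal sketch and not the code in this PR. The `Cohort` class, `simplify_filters`, `build_where`, the `cohort_people` table and the `person_props` column are all hypothetical names made up for this example. A cohort filter whose cohort is just a set of person-property conditions gets rewritten into plain property filters up front, so the query builder never needs to emit a subquery for it:

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Tuple

Filter = Dict[str, Any]


@dataclass
class Cohort:
    """Hypothetical stand-in: a cohort defined by person-property conditions."""
    id: int
    properties: List[Filter] = field(default_factory=list)


def simplify_filters(filters: List[Filter], cohorts: Dict[int, Cohort]) -> List[Filter]:
    """Business-logic step: expand cohort filters into property filters where possible."""
    simplified: List[Filter] = []
    for f in filters:
        cohort = cohorts.get(f["value"]) if f.get("type") == "cohort" else None
        if cohort is not None and cohort.properties:
            # The cohort is only a bundle of property conditions: inline them.
            simplified.extend({**p, "type": "person"} for p in cohort.properties)
        else:
            # Not a cohort filter, or a cohort we cannot simplify: keep as-is.
            simplified.append(f)
    return simplified


def build_where(filters: List[Filter]) -> Tuple[str, Dict[str, Any]]:
    """Query-building step: only unsimplified cohort filters need a subquery."""
    clauses: List[str] = []
    params: Dict[str, Any] = {}
    for i, f in enumerate(filters):
        if f["type"] == "cohort":
            clauses.append(
                f"person_id IN (SELECT person_id FROM cohort_people WHERE cohort_id = %(p{i})s)"
            )
        else:
            clauses.append(f"person_props[%(k{i})s] = %(p{i})s")
            params[f"k{i}"] = f["key"]
        params[f"p{i}"] = f["value"]
    return " AND ".join(clauses), params
```

With this split, `build_where(simplify_filters(filters, cohorts))` produces a subquery only for the cohorts that could not be simplified, which is the kind of special-casing the query builder can later drop.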
The cases that got simplified are:
Not everything is simplified yet. There are 3 cases which I'd love to tackle in a follow-up:
Once all of these are done we can "delete" a lot of special-case query building logic :)
One benchmark got slower. This is because a large column that was previously used only in a subquery now gets passed between threads. This will be improved again by #5850, for which this PR is a prerequisite.
How did you test this code?
See test coverage. I verified all of insights + sessions work with the new cohort cases.