forked from prestodb/presto
Statistics 12 fix unknown estimates #614
Open
sopel39 wants to merge 105 commits into master from statistics-12-fix_unknown_estimates
Remove statistics tests that only display statistics for the whole tree for manual analysis. They proved not to be very useful, and the mechanism they use (getting stats from the EXPLAIN plan) would not work for more detailed column statistics anyway.
Test queries cover the whole current implementation of CoefficientCostCalculator. The test compares estimated statistics with the actual numbers obtained from query execution.
Instead of recursing in the StatsCalculator API, pass the Lookup instance into the stats computation. The method computing stats for a plan node can then use the Lookup to compute stats (and possibly other traits) for its source nodes. The StatelessLookup is a temporary measure for places that don't have access to the lookup for individual plans. It can be used across multiple queries because it doesn't keep state, but it won't resolve GroupReferences.
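The shape of this change can be sketched as follows. This is a minimal illustration, not the code from the commit: the node classes, the `getRowCount` method, and the reduction of stats to a single row count are all assumptions made for the example. Only the names `StatsCalculator`, `Lookup`, `StatelessLookup`, and `GroupReference` come from the commit message.

```java
import java.util.Collections;
import java.util.List;

// Hypothetical, simplified stand-ins for the planner types named above.
interface PlanNode {
    List<PlanNode> getSources();
}

// Leaf node with a known row count.
final class ScanNode implements PlanNode {
    final double rowCount;
    ScanNode(double rowCount) { this.rowCount = rowCount; }
    @Override public List<PlanNode> getSources() { return Collections.emptyList(); }
}

// Keeps a fixed fraction (selectivity) of the rows of its single source.
final class FilterNode implements PlanNode {
    final PlanNode source;
    final double selectivity;
    FilterNode(PlanNode source, double selectivity) {
        this.source = source;
        this.selectivity = selectivity;
    }
    @Override public List<PlanNode> getSources() { return Collections.singletonList(source); }
}

// Placeholder for a group of equivalent plans; resolving it needs per-plan state.
final class GroupReference implements PlanNode {
    final int groupId;
    GroupReference(int groupId) { this.groupId = groupId; }
    @Override public List<PlanNode> getSources() { return Collections.emptyList(); }
}

// Assumed interface: resolves nodes and answers stats questions about them.
interface Lookup {
    PlanNode resolve(PlanNode node);
    double getRowCount(PlanNode node);
}

final class StatsCalculator {
    // Computes stats for this node only; source stats are requested from the
    // Lookup instead of being computed by recursing into the calculator.
    double calculateRowCount(PlanNode node, Lookup lookup) {
        if (node instanceof ScanNode) {
            return ((ScanNode) node).rowCount;
        }
        if (node instanceof FilterNode) {
            FilterNode filter = (FilterNode) node;
            return lookup.getRowCount(filter.source) * filter.selectivity;
        }
        return Double.NaN; // unknown node type: unknown estimate
    }
}

// Keeps no state, so one instance can serve many queries,
// but it has nothing to resolve a GroupReference against.
final class StatelessLookup implements Lookup {
    private final StatsCalculator calculator;
    StatelessLookup(StatsCalculator calculator) { this.calculator = calculator; }

    @Override public PlanNode resolve(PlanNode node) {
        if (node instanceof GroupReference) {
            throw new UnsupportedOperationException("StatelessLookup cannot resolve GroupReferences");
        }
        return node;
    }

    @Override public double getRowCount(PlanNode node) {
        return calculator.calculateRowCount(resolve(node), this);
    }
}
```

In this sketch a stateful lookup could implement the same interface and resolve a `GroupReference` before delegating, without any change to `StatsCalculator`.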
Using Estimates for statistics computation in StatsCalculators is problematic because method-call-based arithmetic must be used, which makes the code hard to read and maintain. Since Estimate was nothing more than a wrapper around a double value, we decided to use plain doubles for computation and represent unknown values as NaN. Estimate is still used in the SPI to make the contract between presto-main and connectors clearer.
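The readability argument can be illustrated with a small comparison. This is a sketch under stated assumptions: `BoxedEstimate` is a hypothetical wrapper invented for the example (it is not the SPI `Estimate` class), and the join row-count formula is only a placeholder computation.

```java
// Hypothetical wrapper type standing in for a method-call based estimate.
final class BoxedEstimate {
    private final double value;
    BoxedEstimate(double value) { this.value = value; }
    BoxedEstimate multiply(BoxedEstimate other) { return new BoxedEstimate(value * other.value); }
    double getValue() { return value; }
}

final class NanArithmeticSketch {
    // Unknown values are represented as NaN.
    static final double UNKNOWN = Double.NaN;

    // Method-call arithmetic on the wrapper type.
    static BoxedEstimate joinRowsBoxed(BoxedEstimate left, BoxedEstimate right, BoxedEstimate selectivity) {
        return left.multiply(right).multiply(selectivity);
    }

    // The same computation on plain doubles; NaN propagates through the operators,
    // so an unknown input yields an unknown result without any special casing.
    static double joinRows(double left, double right, double selectivity) {
        return left * right * selectivity;
    }

    // NaN is not equal to anything, including itself, so "== UNKNOWN" never works.
    static boolean isUnknown(double value) {
        return Double.isNaN(value);
    }
}
```

One consequence of the NaN convention is that every comparison involving an unknown value evaluates to false, so checks for unknown inputs have to go through `Double.isNaN`.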
The capping of the limit set by the session manager was moved from ClusterMemoryManager.java to SystemSessionProperties.java
It makes this field more independent. It just describes a data distribution characteristic of a column and does not tie it to the table-wide statistic of the total number of rows in the table.
Just one range is supported for now
Move pattern matching to separate package and make it independent from optimizer. Thanks to that it will be possible to use pattern matching not only for optimizer, but other components as well.
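What "independent from the optimizer" can mean in practice is sketched below: a pattern type that knows nothing about plan nodes and therefore matches any object. The class `TypePattern` and its methods are hypothetical and are not taken from the package this commit introduces.

```java
import java.util.Optional;
import java.util.function.Predicate;

// Hypothetical generic pattern: matches objects of a given type that also
// satisfy a predicate. Nothing here refers to plan nodes or optimizer rules.
final class TypePattern<T> {
    private final Class<T> type;
    private final Predicate<? super T> predicate;

    private TypePattern(Class<T> type, Predicate<? super T> predicate) {
        this.type = type;
        this.predicate = predicate;
    }

    // Pattern that accepts every instance of the given type.
    static <T> TypePattern<T> typeOf(Class<T> type) {
        return new TypePattern<>(type, value -> true);
    }

    // Returns a narrower pattern that additionally requires the extra condition.
    TypePattern<T> matching(Predicate<? super T> extra) {
        Predicate<T> combined = value -> predicate.test(value) && extra.test(value);
        return new TypePattern<>(type, combined);
    }

    // Returns the typed value on a match, or an empty Optional otherwise.
    Optional<T> match(Object candidate) {
        if (!type.isInstance(candidate)) {
            return Optional.empty();
        }
        T typed = type.cast(candidate);
        return predicate.test(typed) ? Optional.of(typed) : Optional.empty();
    }
}
```

Because the pattern is parameterized only by a Java type, an optimizer rule could use it with plan node classes while another component uses it with its own types.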
Fix a copy-paste error in PlanNodeCostEstimate.memoryCost()
Return NaN from PNStatsEstimate#getOutputSizeInBytes when rowCount is NaN
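One situation where such an explicit check matters can be sketched as follows. The commit message does not say how the output size is computed, so the per-column summation, the class name, and the parameter names here are assumptions for illustration only.

```java
import java.util.Map;

final class OutputSizeSketch {
    // rowCount uses NaN for "unknown"; the map holds an assumed average
    // number of bytes per row for each output column.
    static double outputSizeInBytes(double rowCount, Map<String, Double> avgBytesPerRowByColumn) {
        // Without this guard, a node with no output columns would report 0 bytes
        // even though its row count is unknown, because the loop below never runs.
        if (Double.isNaN(rowCount)) {
            return Double.NaN;
        }
        double totalBytes = 0;
        for (double columnBytesPerRow : avgBytesPerRowByColumn.values()) {
            totalBytes += rowCount * columnBytesPerRow;
        }
        return totalBytes;
    }
}
```

With at least one column the multiplication would already propagate NaN; the guard makes the unknown case explicit and covers the empty-column case as well.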
Not all CostCalculators are thread safe.
…ying Previously there was a lot of map copying (through ImmutableMap.copyOf(...) and new HashMap(...)), which significantly hurt the performance of the stats code. HashTreePMap is much better suited to cases where individual entries of a base map are modified, which is the common case in stats code. TODO: This should be split into fixups. Keeping it as one commit for now since it is one integral change.
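The reason a persistent map helps here is structural sharing: an update creates a new map that reuses most of the old one instead of copying every entry. The sketch below does not use HashTreePMap, which comes from the PCollections library. To stay dependency-free it substitutes a tiny, unbalanced path-copying binary search tree, and the class `PersistentMapSketch` is hypothetical.

```java
// Minimal persistent (immutable) map keyed by String.
// An update copies only the nodes on the path from the root to the key.
final class PersistentMapSketch<V> {
    static final class Node<V> {
        final String key;
        final V value;
        final Node<V> left;
        final Node<V> right;
        Node(String key, V value, Node<V> left, Node<V> right) {
            this.key = key;
            this.value = value;
            this.left = left;
            this.right = right;
        }
    }

    final Node<V> root; // null for the empty map

    private PersistentMapSketch(Node<V> root) { this.root = root; }

    static <V> PersistentMapSketch<V> empty() {
        return new PersistentMapSketch<>(null);
    }

    // Returns a new map with the entry added or replaced; this map is unchanged.
    PersistentMapSketch<V> plus(String key, V value) {
        return new PersistentMapSketch<>(insert(root, key, value));
    }

    private static <V> Node<V> insert(Node<V> node, String key, V value) {
        if (node == null) {
            return new Node<>(key, value, null, null);
        }
        int order = key.compareTo(node.key);
        if (order < 0) {
            // Copy this node, rebuild the left side, share the right subtree as-is.
            return new Node<>(node.key, node.value, insert(node.left, key, value), node.right);
        }
        if (order > 0) {
            return new Node<>(node.key, node.value, node.left, insert(node.right, key, value));
        }
        return new Node<>(key, value, node.left, node.right);
    }

    // Returns the value for the key, or null if the key is absent.
    V get(String key) {
        Node<V> node = root;
        while (node != null) {
            int order = key.compareTo(node.key);
            if (order == 0) {
                return node.value;
            }
            node = order < 0 ? node.left : node.right;
        }
        return null;
    }
}
```

Replacing one entry in this sketch allocates nodes only along one root-to-key path, whereas `ImmutableMap.copyOf(...)` or `new HashMap<>(...)` copies every entry of the base map for each modified version.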
sopel39 force-pushed the statistics-12-fix_unknown_estimates branch from 75f2098 to 0379997 on July 12, 2017 11:15
No description provided.