planner: large under-estimated cardinality error if the value in equal condition is not in TopN or Histogram Bucket boundary #45919
Labels
affects-6.5: This bug affects the 6.5.x (LTS) versions.
affects-7.1: This bug affects the 7.1.x (LTS) versions.
epic/cardinality-estimation: the optimizer cardinality estimation
sig/planner: SIG: Planner
type/enhancement: The issue or PR belongs to an enhancement.
Run the SQL statements below to prepare some data:
Then choose a value that is neither in the TopN nor on a histogram bucket boundary. Its cardinality is heavily under-estimated (the p-error is around 200):
The root cause is that if the value is neither in the TopN nor on a bucket boundary, we use total / NDV as the estimated row count. For frequent values that fall outside the TopN, this formula can produce a large estimation error.
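As a rough illustration of why total / NDV under-estimates frequent values, the sketch below builds a hypothetical skewed column and applies a simplified version of the fallback. The data distribution, the TopN size, and the helper names `build_column` and `estimate_outside_topn` are all assumptions made for this example. It ignores histogram buckets entirely and is not TiDB's actual implementation or the data set from this issue.

```python
from collections import Counter


def build_column(num_frequent=300, frequent_count=1000, num_rare=100_000):
    """Hypothetical skewed column: a few frequent values plus many values that appear once."""
    counts = Counter()
    for v in range(num_frequent):
        counts[v] = frequent_count
    for v in range(num_frequent, num_frequent + num_rare):
        counts[v] = 1
    return counts


def estimate_outside_topn(counts, topn_size):
    """Simplified fallback (assumption): rows not covered by TopN / NDV not covered by TopN."""
    topn = dict(counts.most_common(topn_size))
    remaining_rows = sum(counts.values()) - sum(topn.values())
    remaining_ndv = len(counts) - len(topn)
    return topn, remaining_rows / remaining_ndv


if __name__ == "__main__":
    counts = build_column()
    topn, estimate = estimate_outside_topn(counts, topn_size=100)

    # Pick a frequent value that did not make it into the TopN.
    value = next(v for v, c in counts.items() if c > 1 and v not in topn)
    actual = counts[value]
    print(f"value={value} actual={actual} estimate={estimate:.2f} "
          f"under-estimation factor={actual / estimate:.0f}x")
```

Because the many single-occurrence values dominate the NDV, the average row count per value is small, while a frequent value that just misses the TopN has a row count far above that average.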