[SPARK-43199][SQL] Make InlineCTE idempotent
### What changes were proposed in this pull request?

This PR makes `InlineCTE` idempotent. For example, the following query:

```
WITH
  x(r) AS (SELECT random()),
  y(r) AS (SELECT * FROM x),
  z(r) AS (SELECT * FROM x)
SELECT * FROM z
```

currently breaks idempotence. In the first round, the reference to `x` from `y` is counted when deciding whether to inline `x`, so `x` has two references and is not inlined, even though `y` itself is never referenced:

```
=== Applying Rule org.apache.spark.sql.catalyst.optimizer.InlineCTE ===
 WithCTE                                                        WithCTE
 :- CTERelationDef 0, false                                     :- CTERelationDef 0, false
 :  +- Project [rand()#218 AS r#219]                            :  +- Project [rand()#218 AS r#219]
 :     +- Project [random(2957388522017368375) AS rand()#218]   :     +- Project [random(2957388522017368375) AS rand()#218]
 :        +- OneRowRelation                                     :        +- OneRowRelation
!:- CTERelationDef 1, false                                     +- Project [r#222]
!:  +- Project [r#219 AS r#221]                                    +- Project [r#220 AS r#222]
!:     +- Project [r#219]                                             +- Project [r#220]
!:        +- CTERelationRef 0, true, [r#219]                             +- CTERelationRef 0, true, [r#220]
!:- CTERelationDef 2, false
!:  +- Project [r#220 AS r#222]
!:     +- Project [r#220]
!:        +- CTERelationRef 0, true, [r#220]
!+- Project [r#222]
!   +- CTERelationRef 2, true, [r#222]
```

In the next round, `y` has already been removed because nothing references it, so `x` has a single remaining reference and is inlined:

```
Once strategy's idempotence is broken for batch Inline CTE
!WithCTE                                                        Project [r#222]
!:- CTERelationDef 0, false                                     +- Project [r#220 AS r#222]
!:  +- Project [rand()#218 AS r#219]                               +- Project [r#220]
!:     +- Project [random(2957388522017368375) AS rand()#218]         +- Project [r#225 AS r#220]
!:        +- OneRowRelation                                              +- Project [rand()#218 AS r#225]
!+- Project [r#222]                                                         +- Project [random(2957388522017368375) AS rand()#218]
!   +- Project [r#220 AS r#222]                                                +- OneRowRelation
!      +- Project [r#220]
!         +- CTERelationRef 0, true, [r#220]
```

### Why are the changes needed?

We use `InlineCTE` as an idempotent rule in the `Optimizer`, `CheckAnalysis` and `ProgressReporter`.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Added a new unit test.
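The reference-counting problem in the example query can be illustrated with a small toy model. This is only a sketch in Python, not Spark's Scala implementation: `naive_ref_counts`, `live_ref_counts`, and `drop_unreferenced` are hypothetical helpers, and each CTE is reduced to a name plus the list of earlier CTEs it references. It assumes, as the plans for the example query suggest, that a non-deterministic CTE such as `x` is inlined only when it has exactly one reference.

```python
def naive_ref_counts(cte_defs, main_refs):
    """Count every reference, including those made by unreferenced CTEs."""
    counts = {name: 0 for name in cte_defs}
    for refs in cte_defs.values():
        for ref in refs:
            counts[ref] += 1
    for ref in main_refs:
        counts[ref] += 1
    return counts


def live_ref_counts(cte_defs, main_refs):
    """Count only references that are reachable from the main query.

    Assumes a CTE can only reference CTEs defined before it, so walking the
    definitions in reverse lets a dead CTE release its references before the
    CTEs it points to are inspected.
    """
    counts = naive_ref_counts(cte_defs, main_refs)
    for name in reversed(list(cte_defs)):
        if counts[name] == 0:
            for ref in cte_defs[name]:
                counts[ref] -= 1
    return counts


def drop_unreferenced(cte_defs, counts):
    """Model the removal of CTE definitions that nothing references."""
    return {name: refs for name, refs in cte_defs.items() if counts[name] > 0}


# Toy version of the example query: y and z both read x, the main query reads z.
ctes = {"x": [], "y": ["x"], "z": ["x"]}
main = ["z"]

# Naive counting: x has 2 references in round one, but 1 after y is dropped,
# so the inlining decision for x flips between rounds.
round1 = naive_ref_counts(ctes, main)
round2 = naive_ref_counts(drop_unreferenced(ctes, round1), main)
print(round1["x"], round2["x"])

# Counting only live references: x has 1 reference in both rounds.
live1 = live_ref_counts(ctes, main)
live2 = live_ref_counts(drop_unreferenced(ctes, live1), main)
print(live1["x"], live2["x"])
```

In this model, a count that ignores references coming from unreferenced CTEs yields the same inlining decision for `x` on the first and second application, which is the property an idempotent rule needs.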
Closes #40856 from peter-toth/SPARK-43199-make-inlinecte-idempotent.

Authored-by: Peter Toth <[email protected]>
Signed-off-by: Wenchen Fan <[email protected]>