Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MySQL Bugs: #103609 Unknown column 'upper.j' in 'field list' #24444

Open
ChenPeng2013 opened this issue May 7, 2021 · 3 comments
Open

MySQL Bugs: #103609 Unknown column 'upper.j' in 'field list' #24444

ChenPeng2013 opened this issue May 7, 2021 · 3 comments
Assignees
Labels
sig/planner SIG: Planner type/enhancement The issue or PR belongs to an enhancement.

Comments

@ChenPeng2013
Copy link
Contributor

Bug Report

Please answer these questions before submitting your issue. Thanks!

1. Minimal reproduce step (Required)

use test;
drop table if exists t;
CREATE TABLE t (i char(10), j int);
INSERT INTO t VALUES('A', 1);
INSERT INTO t VALUES('A', 3);
INSERT INTO t VALUES('A', 5);
INSERT INTO t VALUES('B', 1);
INSERT INTO t VALUES('B', 7);
SELECT LAST_VALUE((SELECT upper.j FROM t LIMIT 1)) OVER (PARTITION BY i)  FROM t AS upper;

2. What did you expect to see? (Required)

MySQL 8.0.21

mysql> SELECT LAST_VALUE((SELECT upper.j FROM t LIMIT 1)) OVER (PARTITION BY i)  FROM t AS upper;
+-------------------------------------------------------------------+
| LAST_VALUE((SELECT upper.j FROM t LIMIT 1)) OVER (PARTITION BY i) |
+-------------------------------------------------------------------+
|                                                                 5 |
|                                                                 5 |
|                                                                 5 |
|                                                                 7 |
|                                                                 7 |
+-------------------------------------------------------------------+

3. What did you see instead (Required)

mysql> SELECT LAST_VALUE((SELECT upper.j FROM t LIMIT 1)) OVER (PARTITION BY i)  FROM t AS upper;
ERROR 1054 (42S22): Unknown column 'upper.j' in 'field list'

4. What is your TiDB version? (Required)

Release Version: v4.0.0-beta.2-2807-g289dcfefd-dirty
Edition: Community
Git Commit Hash: 289dcfefdd159345972f71cf42c9d6e594255290
Git Branch: master
UTC Build Time: 2021-05-06 15:11:16
GoVersion: go1.13
Race Enabled: false
TiKV Min Version: v3.0.0-60965b006877ca7234adaced7890d7b029ed1306
Check Table Before Drop: false
@ChenPeng2013 ChenPeng2013 added type/bug The issue is confirmed as a bug. severity/major sig/sql-infra SIG: SQL Infra labels May 7, 2021
@wjhuang2016 wjhuang2016 added sig/planner SIG: Planner and removed sig/sql-infra SIG: SQL Infra labels May 7, 2021
@eurekaka eurekaka self-assigned this May 8, 2021
@eurekaka
Copy link
Contributor

eurekaka commented May 10, 2021

The root cause of the problem is that, in resolveWindowFunction called by buildSelect, the havingWindowAndOrderbyExprResolver would do nothing and return quickly for SubuqeryExpr, so the j column is not appended into the sel.Fields.Fields. Then when rewriting the subquery in the window function, the expression rewriter cannot find column j in the outer schema.

It is pretty difficult / tricky to fix this problem, we have to write an ad-hoc logic to flatten the subquery containing correlated columns into a simple column list, i.e, expression rewriter should convert the subquery select upper.j from t limit 1 to a column list [upper.j]. It is not easy to achieve that since we have to enforce the MaxOneRow constraint at the same time.

Currently we have only 2 possible actions for a scalar subquery: building an apply if the subquery contains correlated columns, or executing it directly if it is an uncorrelated subquery, both of them are not suitable for our case.

A tricky solution would be that, we hack buildSelect to make it be aware of the fact that we are building a subquery in the window function, and specially substitute the upper.j in the projection list to a constant(e.g, 1), which represents the offset of upper.j in the upper schema(before that, we need to modify resolveWindowFunction to dive into the subquery in the window function and append column j into sel.Fields.Fields, then this j column can be in the outer schema then and be used). After the substitution, the subquery is actually converted to select 1 from t limit 1, then handleScalarSubquery can execute this uncorrelated subquery, and get a constant result 1(MaxOneRow constraint can be enforced in this execution). Then in handleScalarSubquery, instead of directly pushing this constant 1 into the result stack of the expression rewriter, we use it as the offset to retrieve upper schema columns and get the corresponding column j, then push it to the stack.

This approach should be able to solve the problem, but it is too tricky and would make the code messy and complicated. Considered that we have an equivalent, simpler and more reasonable rewriting for the query, i.e, SELECT LAST_VALUE(upper.j) OVER (PARTITION BY i) FROM t AS upper, I suggest we don't optimize this problem(the issue should be an enhancement rather than a bug by the way), and mark it as an known issue instead.

@eurekaka eurekaka added type/enhancement The issue or PR belongs to an enhancement. and removed type/bug The issue is confirmed as a bug. labels May 10, 2021
@eurekaka
Copy link
Contributor

I think "support correlated columns as the window function arguments" should be a more proper title for this issue.

@eurekaka
Copy link
Contributor

Credit to @time-and-fate for reminding that

both of them are not suitable for our case

is wrong, we can actually build the window function on top of the apply after appending j to sel.Fields.Fields.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
sig/planner SIG: Planner type/enhancement The issue or PR belongs to an enhancement.
Projects
None yet
Development

No branches or pull requests

3 participants