-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
sql: deprecate ordinal column references #93754
Conversation
Huh, but what feature then remains to select a column from a source by ordinal position? like "i want just the first column" |
also, what's the motivation for this change generally? |
maybe I can still do |
When does this come up? |
NB I'm ok with the change if we don't deprecate the syntax |
5297fc9
to
ccd263e
Compare
I've added an explanation of the motivation to the commit message. |
cheers that works for me 👍 |
This commit disallows ordinal column references by default. Any statements using them will result in an error. The session setting `allow_ordinal_column_references` can be set to true to revert to previous behavior. Note that numeric tuple indexing (e.g., `SELECT ((1,2,3)).@2`) is still allowed. This deprecation is motivated by the inconsistent and surprising behavior that ordinal column references provide. Ordinal column references (e.g., `SELECT @1 FROM t`) are a vestigial and undocumented feature originally motivated to aid the heuristic planner. The PR that added them, cockroachdb#10729, discourages their use by users because "they are not robust against schema changes". It also points out a subtle difference between ordinal column references and SQL standard ordinals that could confuse users. As an example, the following statements are not equivalent: ```sql SELECT @2 FROM t ORDER BY @1; SELECT @2 FROM t ORDER BY 1; ``` In the current implementation, an ordinal column reference `@n` resolves to the n-th column in the current scope's slice of columns. This has several implications: 1. Ordinal columns can refer to any column, not just columns of data sources in `FROM` clauses, as was originally intended. This makes it hard to reason about the resolution of an ordinal column reference. For example, it's somewhat surprising that the result of the `SELECT` below is not `(10, 1)`. ``` CREATE TABLE t (a INT, b INT); INSERT INTO t VALUES (1, 10); SELECT @2, @1 FROM (SELECT @2, @1 FROM t); ?column? | ?column? -----------+----------- 1 | 10 (1 row) ``` 2. The ordering of columns in the scope's slice is not guaranteed to be consistent, so any reasonable ordinal column resolution occurs more-or-less by chance. As an example of unexpected behavior, consider: ``` CREATE TABLE t (a INT PRIMARY KEY, INDEX ((a+10))); INSERT INTO t(a) VALUES (1); ALTER TABLE t ADD COLUMN b INT; SELECT @1, @2 FROM t; ?column? | ?column? -----------+----------- 1 | 11 (1 row) ``` Most users would expect the result of the `SELECT` statement to be `(1, NULL)` because `@1` should resolve to `a` and `@2` should resolve to `b`. Instead, the ordinal column reference `@2` circumvents logic that keeps inaccessible columns from being referenced, and resolves to the virtual computed column backing the secondary index. Epic: None Release note (sql change): Ordinal column references (e.g., `SELECT @1, @2 FROM t`) are now deprecated. By default, statements using this syntax will now result in an error. They can be allowed using the session setting `SET allow_ordinal_column_references=true`. Support for ordinal column references is scheduled to be removed in version 23.2.
ccd263e
to
cfa32f1
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 37 of 37 files at r1, all commit messages.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @benbardin, @knz, and @rafiss)
TFTRs! bors r+ |
Build succeeded: |
This commit removes some leftover ordinal column references in `partialidx.Implicator`'s benchmarks. These benchmarks are failing since cockroachdb#93754 which deprecates ordinal column references. Epic: None Release note: None
93858: xform: use ordering from LIMIT as a hint for streaming group-by r=DrewKimball,rharding6373 a=msirek Fixes #93410 A query with a grouped aggregation, a LIMIT and an ORDER BY may not always explore the best-cost query plan. Due to the existence of unique constraints on a table, the set of grouping columns may be reduced during normalization via rule ReduceGroupingCols such that it no longer includes columns present in the ORDER BY clause. This eliminates possibly cheap plans from consideration, for example, if the input to the aggregation is a lookup join, it may be cheaper to sort the input to the lookup join on the ORDER BY columns if they overlap with the grouping columns, so that a streaming group-by with no TopK operator can be used, and a full scan of the inputs to the join is avoided. This fix adds a new exploration rule which ensures that a grouped aggregation with a LIMIT and ORDER BY clause considers using streaming group-by with no TopK when possible. Release note (bug fix): This patch fixes join queries involving tables with unique constraints using LIMIT, GROUP BY and ORDER BY clauses to ensure the optimizer considers streaming group-by with no TopK operation, when possible. This is often the most efficient query plan. 93992: cli,workload: don't hide useful workloads r=j82w a=knz Prior to this patch, at least half of the workloads were hidden from view in the output of `cockroach --help`. There was no good reason for this: most of the workloads are useful for teaching/learning and for experimentation. They all deserve more exposure, so that folk can learn about them without being told by the one person who built the workload in the first place. So this patch fixes that by exposing of all of them through the online help. One question that could remain is how much teaching value there is in letting someone experiment with a tool that was built for the benefit of one team only. One specific workload is under consideration here: `bulkingest`, used for benchmarking inside the D&R team, does not really do anything akin to what an end-user would possibly expect to do with a database. For that workload, and the benefit of future workloas akin to it, this patch adds a notice in its help text that it was developed for internal testing only. Release note: None Epic: None 94005: opt/partialidx: fix Implicator benchmark r=mgartner a=mgartner This commit removes some leftover ordinal column references in `partialidx.Implicator`'s benchmarks. These benchmarks are failing since #93754 which deprecates ordinal column references. Epic: None Release note: None Co-authored-by: Mark Sirek <[email protected]> Co-authored-by: Raphael 'kena' Poss <[email protected]> Co-authored-by: Marcus Gartner <[email protected]>
This commit fixes a broken roachtest caused by cockroachdb#93754. Fixes cockroachdb#94058 Release note: None
93991: tree: correct mutation/DDL property for some opaque operators r=ZhouXing19 a=rafiss fixes #91713 In ed733ad, a framework was added to mark certain opaque operators as DDL or mutations. This was enhanced in 06581b3, but that change wasn't exhaustive since it marked some statements as read-only, even if they could perform DDL. With the addition of `StatementType()` in 8962176, we can make this a little more correct. This allows the check at https://github.com/cockroachdb/cockroach/blob/48ef0d89e6179c0d348a5236ad308d81fa392f7c/pkg/sql/opt/exec/execbuilder/relational.go#L163 to work correctly, and reject operations that shouldn't be allowed when using a read-only transaction. To explain each change: - BACKUP can modify job state and write to userfiles, so shouldn't be allowed in read-only mode. - SET commands are always allowed in read-only mode in order to match Postgres behavior, and since those changes are all in-memory and session setting modifications don't respect transactions anyway. - The crdb_internal tenant functions modify system tables. - GRANT, REVOKE, and many other privilege-related statements are "DCL" (data control language), and all modify system tables or descriptors. Release note (bug fix): CREATE ROLE, DELETE ROLE, GRANT, and REVOKE statements no longer work when the transaction is in read-only mode. 94025: storage: return error from {MVCC,Engine}Iterator.Value r=mgartner,nicktrav a=sumeerbhola Informs cockroachdb/pebble#1170 Epic: CRDB-20378 Release note: None 94031: kvstreamer: fix the usage of the range iterator r=yuzefovich a=yuzefovich Previously, after `Seek`ing the range iterator to the next key in the batch of requests in the streamer we forgot to check the validity of the iterator. In particular, this could lead to a crash of the process if `Seek` encountered an error for whatever reason. In practice, I've only observed this when running TPCH with high concurrency when GOMEMLIMIT is set. The bug was introduced in an innocently-looking refactor in 041b104. Epic: None Release note (bug fix): CockroachDB could previously crash in rare circumstances when evaluating lookup and index joins. The bug is present since 22.2.0 release. Temporary workaround without upgrading to the release with this fix is changing the value of undocumented cluster setting `sql.distsql.use_streamer.enabled` to `false`. 94073: roachtest: replace ordinal column reference in transfer-leases r=mgartner a=mgartner This commit fixes a broken roachtest caused by #93754. Fixes #94058 Release note: None Co-authored-by: Rafi Shamim <[email protected]> Co-authored-by: sumeerbhola <[email protected]> Co-authored-by: Yahor Yuzefovich <[email protected]> Co-authored-by: Marcus Gartner <[email protected]>
This commit fixes a broken roachtest caused by cockroachdb#93754. Fixes cockroachdb#94058 Release note: None
sql: deprecate ordinal column references
This commit disallows ordinal column references by default. Any
statements using them will result in an error. The session setting
allow_ordinal_column_references
can be set to true to revert toprevious behavior.
Note that numeric tuple indexing (e.g.,
SELECT ((1,2,3)).@2
) is stillallowed.
This deprecation is motivated by the inconsistent and surprising
behavior that ordinal column references provide. Ordinal column
references (e.g.,
SELECT @1 FROM t
) are a vestigial and undocumentedfeature originally motivated to aid the heuristic planner. The PR that
added them, #10729, discourages their use by users because "they are not
robust against schema changes". It also points out a subtle difference
between ordinal column references and SQL standard ordinals that could
confuse users. As an example, the following statements are not
equivalent:
In the current implementation, an ordinal column reference
@n
resolvesto the n-th column in the current scope's slice of columns. This has
several implications:
sources in
FROM
clauses, as was originally intended. This makes ithard to reason about the resolution of an ordinal column reference.
For example, it's somewhat surprising that the result of the
SELECT
below is not
(10, 1)
.consistent, so any reasonable ordinal column resolution occurs
more-or-less by chance. As an example of unexpected behavior,
consider:
Most users would expect the result of the
SELECT
statement to be(1, NULL)
because@1
should resolve toa
and@2
should resolveto
b
. Instead, the ordinal column reference@2
circumvents logicthat keeps inaccessible columns from being referenced, and resolves to
the virtual computed column backing the secondary index.
Epic: None
Release note (sql change): Ordinal column references (e.g.,
SELECT @1, @2 FROM t
) are now deprecated. By default, statements usingthis syntax will now result in an error. They can be allowed using the
session setting
SET allow_ordinal_column_references=true
. Support forordinal column references is scheduled to be removed in version 23.2.