sql: sequences created by `SERIAL` have weird semantics with regards to database #51090

ajwerner · 2020-07-07T21:00:01Z

Describe the problem

When we create SERIAL types and we have serial_normalization set to 'sql_sequence' we construct a new sequence. Unfortunately, the sequence we construct is in the current database rather than the database of the table. This leads to strange semantics where the table can only be written to if the current session is in the database of the sequence.

The problem here is that we use the session's database to resolve the sequence rather than the table's. In postgres, IIUC, the table's database and schema are used to resolve the sequence.

To Reproduce

root@localhost:26257/defaultdb> SET serial_normalization = 'sql_sequence';
SET

root@localhost:26257/defaultdb> CREATE DATABASE other_db;
CREATE DATABASE

root@localhost:26257/defaultdb> CREATE TABLE other_db.t (s SERIAL PRIMARY KEY, i INT);
CREATE TABLE

root@localhost:26257/defaultdb> INSERT INTO other_db.t (i) VALUES (1);
INSERT 1

root@localhost:26257/defaultdb> USE other_db;
SET

root@localhost:26257/other_db> INSERT INTO t (i) VALUES (2);
ERROR: nextval(): relation "t_s_seq" does not exist
SQLSTATE: 42P01

Even worse, if I create a sequence in that database with that name, it'll get used:

root@localhost:26257/other_db> CREATE SEQUENCE t_s_seq;
CREATE SEQUENCE

root@localhost:26257/other_db> INSERT INTO t (i) VALUES (2);
ERROR: duplicate key value (s)=(1) violates unique constraint "primary"
SQLSTATE: 23505

Expected behavior
The SERIAL column will work regardless of which database the current session is in. Furthermore, as we move to user-defined schemas, it should refer to a single sequence regardless of the current session's search_path.

The text was updated successfully, but these errors were encountered:

blathers-crl · 2020-07-07T21:00:02Z

Hi @ajwerner, please add a C-ategory label to your issue. Check out the label system docs.

While you're here, please consider adding an A- label to help keep our repository tidy.

_{🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan.}

ajwerner · 2020-07-07T21:04:54Z

@arulajmani you'll probably find this one interesting. In postgres this works better because nextval takes a regclass and not a string. Curious to hear your thoughts.

knz · 2020-07-08T14:02:03Z

FWIW I recall there was a recent discussion to disallow cross-db references altogether.

ajwerner · 2020-07-08T15:05:45Z

FWIW I recall there was a recent discussion to disallow cross-db references altogether.

That's true. You can, however, take the above example and apply it to user-defined schemas and search path and if we don't change anything, we'll still have a problem.

This issue has several dimensions

We use nextvar with a string instead of a regclass so the element being referenced is ambiguous because we do resolution at runtime.
(The deeper issue) is that we store our default expressions as strings rather than a richer structure. In postgres, it seems, the regclass is stored in the default expression as its integer value but regclass values display as their resolved string names.
Even if we had a way to store the default expr such that it was expressed as an integer regclass (which we very much do not), cockroach doesn't render regclass intergers by their resolved names.

postgres=# SELECT 16441::regclass;                                             
 regclass                              
----------                             
 foo

Imagine today we do the following:

s INT8 NOT NULL DEFAULT nextval('foo_s_seq':::STRING)

I have two related concrete proposals that both have the same first:

We provide an overload for nextval that takes a regclass.

A. In the default expression we just use an integer cast to a regclass. For example:

s INT8 NOT NULL DEFAULT nextval(60::regclass)

B. In the default expression we generate for columns using SERIAL with 'sql_sequence' we start to use crdb_internal.create_regclass as the input to nextval. The advantage with this is that it's clearer which sequence is in use for the reader.

s INT8 NOT NULL DEFAULT nextval(crdb_internal.create_regclass(60, 'foo_s_seq'))

arulajmani · 2020-07-08T15:05:50Z

Nice find! I guess this is currently blocking us from a logictest configuration that uses sequences for serial normalization.

I tried creating a sequence with a fully qualified name which would force the sequence to be created in the same database as the table, and storing the fully qualified name in the default expression, and that seems to fix the problem. I'll add some tests and put a PR out.

ajwerner · 2020-07-08T15:34:09Z

Yeah, that proposal makes sense. In the absence of a regclass overloaded nextval and a proper way to store that regclass in the default expr, using a FQN in the string seems like the right idea. Thanks!

knz · 2020-07-08T15:56:42Z

Can you check that doesn't explode if/when the db gets renamed.

ajwerner · 2020-07-08T15:57:53Z

Can you check that doesn't explode if/when the db gets renamed.

That's a great point. We'll need to deal with this when renaming database (and schemas for UDS).

arulajmani · 2020-07-08T17:38:33Z

It doesn't implode, but it will prohibit renaming databases that use SERIAL and I'm not sure that's much better. There is a closed issue for this (#34416), but I don't think the issue has been fully addressed -- Referring to sequences by their FQNs when using them still results in an error.

[email protected]:59882/defaultdb> CREATE SEQUENCE foo;
CREATE SEQUENCE

Time: 3.217ms

[email protected]:59882/defaultdb> CREATE TABLE t(a INT DEFAULT nextval('defaultdb.foo'));
CREATE TABLE

Time: 25.509ms

[email protected]:59882/defaultdb> ALTER DATABASE defaultdb RENAME TO default_db;
ERROR: cannot rename database because relation "defaultdb.public.t" depends on relation "defaultdb.public.foo"
SQLSTATE: 2BP01
HINT: you can drop the column default "a" of "defaultdb.public.foo" referencing "defaultdb.public.t" or modify the default to not reference the database name "defaultdb"

With this rename issue + the fact that we'll be supporting UDS very soon, I'm leaning more towards Andrew's proposal of solving this by having default expressions store something more robust and overloading nextval. (as opposed to the earlier proposed quick win solution). Additionally, this would allow us to rename sequences which are used by columns, functionality that is currently not supported.

To me it seems like rendering the regclass to its string representation is the major issue there. I'm not too familiar with how/how much regclass compatibility we support, but this seems hard.

I wonder if as an intermediate step there is value in storing the sequence's descriptor ID and name (at creation) in the default expression which will allow us to implement Alternative B in @ajwerner's proposal above. We could be displaying stale information if the sequence is renamed, or we could continue to disallow sequence renames like we do today.

ajwerner · 2020-07-14T21:22:13Z

I wonder if as an intermediate step there is value in storing the sequence's descriptor ID and name (at creation) in the default expression which will allow us to implement Alternative B in @ajwerner's proposal above. We could be displaying stale information if the sequence is renamed, or we could continue to disallow sequence renames like we do today.

I'm happy with B as a first step though do want to ask about sequence renames. It seems to me like getting sequence renames right isn't too terrible if all we need to do is update the table[s] which reference it. I was much more worried about needing to traverse all of the tables when renaming a database or schema.

ajwerner · 2020-07-30T23:38:37Z

Alright here's a perhaps real pathway forward, when do we need to know the name of referenced descriptor? Seems like only for display purposes. So the latest thinking on this proposal is these steps that each get us closer to postgres

implement a regclass overload on nextval
stop creating a new table when truncating because that makes table IDs non stable (and is the only thing as far as I know) sql: create new indexes rather than a new table in TRUNCATE #52170
start encoding the expression with the integer regclass
resolve the regclass to a name for display

Only 2 seems hard. This could be pretty good. This proposal is in many ways exactly what's asked for in #22212. Relates to #10083 as we'd want to do the same thing in view queries with names.

knz · 2020-07-31T08:20:06Z

In fact making table ID stable would be good for pg compatibility. UI tools that use pg_catalog get confused if you truncate a table (and rotate the ID) from "under" them.

ajwerner · 2020-09-15T19:04:26Z

in positive news here, we did the truncate thing so the path to a brighter future is now clearer. There are open questions with regards to migrations of existing expressions. I'm rather open to not migrating them.

Ideally we'd make the "default" overload for the function into the regclass so that new expressions which utilize the string get resolved to regclasses so that the same expression ends up with the better serialization. For existing string serialization the regclass resolution at runtime should end up doing exactly what we do today.

That might be wishful thinking, I don't know enough about the way the overloads work out with the constant expressions. Worst case we could put some hacks in place when to prefer resolving the strings to regclass integers when looking to serialize the sequence operations into descriptors.

knz · 2020-09-16T11:32:13Z

@otan says that the issue in #53819 is a duplicate, here's the symptom

> create sequence s;
> create view a as select nextval('s'); -- OK!
> create materialized view am as select nextval('s');
ERROR: nextval(): relation "s" does not exist
SQLSTATE: 42P01

I don't actually understand why they are marked as duplicate to each other (see my question in #53819 (comment) )

and thus would appreciate an explanation thanks

otan · 2020-09-16T14:44:37Z

The suggested solution to resolving this (#54410) is the same.

knz · 2020-09-16T19:58:26Z

that PR you just linkd to has been closed and not merged, so can you explain what is the actual suggested solution?
we don't typically close issues as duplicate when they have the same solution, but rather when they have overlapping set of causes

knz · 2020-09-16T19:58:55Z

(re my point 2: if there are separate causes but common solutions, we can have 1 PR that "closes" 2 or more issues at a time)

ajwerner · 2020-09-22T18:39:34Z

@knz I agree with your assessment and have determined that while the above proposal would fix #53819, that is a different issue. In the short term I'd like to take the approach used by #42508 to somewhat resolve the issue. It's not pretty but it's clearer than what we see today.

59396: sql: overload sequence operators to accept a regclass r=the-ericwang35 a=the-ericwang35 sql: overload sequence operators to accept a regclass Previously, sequence operators (nextval, setval, currval) only accepted a string to refer to the referenced sequence. This patch allows these operators to accept a regclass as well. Tests were also added to verify this. This is the first step to fix this issue: #51090. Release note (sql change): overload sequence operators to accept a regclass Co-authored-by: Eric Wang <[email protected]>

60957: sql: add backup/restore tests for sequences r=the-ericwang35 a=the-ericwang35 This patch just adds tests for backup/restore with respect to sequences. Now that all new sequences are referenced by ID and can be renamed, we want to test that previously created sequences are not affected, and still follow the old rules (i.e. cannot be renamed and and are referenced by their names. Part of #51090 In the future, we'll want to convert these old sequences to new sequences (i.e. can be renamed and are referenced by ID, see #60942) which means these tests will need to change, but for now, we'll keep the old sequences as is. Release note (None): add backup/restore tests for sequences 60972: sql: error if SURVIVE is set with no regions r=ajstorm a=otan Release justification: bug fix to new functionality Release note: None 60993: colexec: remove duplication around bazel utility functions r=yuzefovich a=yuzefovich Previously, we had essentially duplicated `.bzl` files in two packages that need to generate code using `execgen`, but this can be avoided by defining a separate `.bzl` file as an extension used in both places. Release note: None 61008: TeamCity: update build agents to use the latest sudo r=rail a=rail Release note: None 61024: authors: add erikgrinaker r=erikgrinaker a=erikgrinaker Release note: none 61025: authors: add fqazi to authors r=fqazi a=fqazi authors: add fqazi to authors Release note: None 61026: authors: add abarganier to authors r=abarganier a=abarganier Release note: None Co-authored-by: Eric Wang <[email protected]> Co-authored-by: Oliver Tan <[email protected]> Co-authored-by: Yahor Yuzefovich <[email protected]> Co-authored-by: Rail Aliiev <[email protected]> Co-authored-by: Erik Grinaker <[email protected]> Co-authored-by: Faizan Qazi <[email protected]> Co-authored-by: Alex Barganier <[email protected]>

61126: sql: add version upgrade test for sequences r=ajwerner a=the-ericwang35 This patch just adds some version upgrade tests for sequences. Now that all new sequences are referenced by ID and can be renamed, we want to test that previously created sequences are not affected, and still follow the old rules (i.e. cannot be renamed and and are referenced by their names. This test creates some sequences in 20.2.5, upgrades the cluster, and verifies this behaviour. Part of #51090. In the future, we'll want to convert these old sequences to new sequences (i.e. can be renamed and are referenced by ID, see #60942) which means these tests will need to change, but for now, we'll keep the old sequences as is. Release note: None Release justification: only adds tests for existing functionality 62203: sql: capture sequences used in cast exprs r=ajwerner a=the-ericwang35 Previously, `GetSequenceFromFunc` only looked for `*tree.DString` and `*tree.DOid`. However, it's also possible for the sequence to be given via a `*tree.CastExpr` (for example, `nextval('s'::regclass)`). In this case, the sequence was not being captured, which meant no back references were created and no sequence encoding was done. This PR addresses this by adding an additional case to capture these sequences. This will probably need to be backported. Release note: None Co-authored-by: Eric Wang <[email protected]>

knz added A-sql-pgcompat Semantic compatibility with PostgreSQL C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. labels Jul 8, 2020

ajwerner assigned arulajmani Jul 8, 2020

arulajmani mentioned this issue Jul 24, 2020

sql: DROP DATABASE/SCHEMA CASCADE may break sequences that have cross schema references #51889

Closed

ajwerner mentioned this issue Jul 30, 2020

sql: create new indexes rather than a new table in TRUNCATE #52170

Closed

This was referenced Sep 9, 2020

sql: add setting to gate cross-db reference, default to disallow #54126

Closed

sql: set planner database scope of SchemaChanger to descriptor db #54410

Closed

otan mentioned this issue Sep 15, 2020

sql: CREATE MATERIALIZED VIEW does not use the right name resolution scope for its scalar expressions #53819

Closed

This was referenced Jan 25, 2021

sql: overload sequence operators to accept a regclass #59396

Merged

sql: encode and decode sequence regclasses, and allow renaming sequences #59655

Closed

ajwerner assigned the-ericwang35 Feb 9, 2021

the-ericwang35 linked a pull request Feb 18, 2021 that will close this issue

sql: encode and decode sequence regclasses, and allow sequence renaming #59864

Merged

the-ericwang35 mentioned this issue Feb 23, 2021

sql: add backup/restore tests for sequences #60957

Merged

the-ericwang35 linked a pull request Feb 23, 2021 that will close this issue

sql: add backup/restore tests for sequences #60957

Merged

craig bot closed this as completed in #60957 Feb 23, 2021

the-ericwang35 mentioned this issue Feb 25, 2021

sql: add version upgrade test for sequences #61126

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

sql: sequences created by `SERIAL` have weird semantics with regards to database #51090

sql: sequences created by `SERIAL` have weird semantics with regards to database #51090

ajwerner commented Jul 7, 2020

blathers-crl bot commented Jul 7, 2020

ajwerner commented Jul 7, 2020

knz commented Jul 8, 2020

ajwerner commented Jul 8, 2020

arulajmani commented Jul 8, 2020 •

edited

Loading

ajwerner commented Jul 8, 2020

knz commented Jul 8, 2020

ajwerner commented Jul 8, 2020

arulajmani commented Jul 8, 2020

ajwerner commented Jul 14, 2020

ajwerner commented Jul 30, 2020 •

edited

Loading

knz commented Jul 31, 2020 •

edited

Loading

ajwerner commented Sep 15, 2020

knz commented Sep 16, 2020

otan commented Sep 16, 2020

knz commented Sep 16, 2020

knz commented Sep 16, 2020

ajwerner commented Sep 22, 2020

sql: sequences created by SERIAL have weird semantics with regards to database #51090

sql: sequences created by SERIAL have weird semantics with regards to database #51090

Comments

ajwerner commented Jul 7, 2020

blathers-crl bot commented Jul 7, 2020

ajwerner commented Jul 7, 2020

knz commented Jul 8, 2020

ajwerner commented Jul 8, 2020

arulajmani commented Jul 8, 2020 • edited Loading

ajwerner commented Jul 8, 2020

knz commented Jul 8, 2020

ajwerner commented Jul 8, 2020

arulajmani commented Jul 8, 2020

ajwerner commented Jul 14, 2020

ajwerner commented Jul 30, 2020 • edited Loading

knz commented Jul 31, 2020 • edited Loading

ajwerner commented Sep 15, 2020

knz commented Sep 16, 2020

otan commented Sep 16, 2020

knz commented Sep 16, 2020

knz commented Sep 16, 2020

ajwerner commented Sep 22, 2020

sql: sequences created by `SERIAL` have weird semantics with regards to database #51090

sql: sequences created by `SERIAL` have weird semantics with regards to database #51090

arulajmani commented Jul 8, 2020 •

edited

Loading

ajwerner commented Jul 30, 2020 •

edited

Loading

knz commented Jul 31, 2020 •

edited

Loading