Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

roachtest/sqlsmith: internal error: executing declarative schema change: index ID 1 found in depended-on-by references, no such index in this relation #108974

Closed
yuzefovich opened this issue Aug 18, 2023 · 7 comments · Fixed by #109611
Assignees
Labels
branch-master Failures and bugs on the master branch. C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. db-cy-23 regression Regression from a release. T-sql-foundations SQL Foundations Team (formerly SQL Schema + SQL Sessions)

Comments

@yuzefovich
Copy link
Member

yuzefovich commented Aug 18, 2023

Hit this when running slightly adjusted roachtest/sqlsmith (which includes #108972) manually:

ERROR: internal error: executing declarative schema change PostCommitNonRevertiblePhase stage 3 of 3 with 5 MutationType ops (rollback=false) for ALTER TABLE: error executing PostCommitNonRevertiblePhase stage 3 of 3 with 5 MutationType ops: relation "tab4000" (140): index ID 1 found in depended-on-by references, no such index in this relation
SQLSTATE: XX000
DETAIL: • Schema change plan for ALTER TABLE ‹defaultdb›.public.‹tab4000› ADD COLUMN ‹"Coͩl14456"› FLOAT4 DEFAULT ‹-1791783363›;

The way to reproduce is

./cockroach demo --multitenant=false --set=errexit=false --nodes 3 --no-example-database -f sqlsmith.log

sqlsmith.zip

(it's likely this can be significantly reduced - I didn't try.)

Note that it seems to be a regression on master given that this doesn't repro on 23.1.7.

Jira issue: CRDB-30725

@yuzefovich yuzefovich added the C-test-failure Broken test (automatically or manually discovered). label Aug 18, 2023
@blathers-crl blathers-crl bot added the T-sql-foundations SQL Foundations Team (formerly SQL Schema + SQL Sessions) label Aug 18, 2023
@yuzefovich yuzefovich added the branch-master Failures and bugs on the master branch. label Aug 18, 2023
@rafiss rafiss added regression Regression from a release. db-cy-23 C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. and removed C-test-failure Broken test (automatically or manually discovered). labels Aug 18, 2023
@Xiang-Gu
Copy link
Contributor

a smaller repo:

create table t (i int primary key);
create function f() returns record stable language sql as $$ select * from t@t_pkey; $$;

alter table t add column j int default 30;

The issue is that the UDF body forces the query to use the primary index of t (t_pkey). This is tracked on the table as a back-reference in its DependedOnBy field (something like back-ref={desc: "f", indexID: 1, columnIDs: 1}) but our declarative schema change implementation for add column will create a new primary index and drop the old one, so that during validation, the table would have a back-reference that specifically uses its indexID=1 but that index has been dropped.

In contrast, if we were to use legacy schema changer, which rewrite the primary index in-place, the ADD COLUMN would succeed.

This has made me wonder "what should be the right behavior?" My initial thoughts are "ah, okay, let me disallow this case for declarative schema changer then!" but that will create behavioral divergence between legacy and declarative schema changer. It's a bit unfortunate that the way we track back-references from tables to UDFs uses index IDs so that our two schema changer implementation differences would leak and cause a SQL difference. @fqazi and @rafiss I'd like to hear your thoughts on this.

@fqazi
Copy link
Collaborator

fqazi commented Aug 22, 2023

The right answer here is we need to be able to rewrite back-references since the builder should be able to detect the function dependency and update it for the new index (i.e. drop and recreate it with the new index ID). We can block this operation can fallback for the declarative schema changer in the short term.

@Xiang-Gu
Copy link
Contributor

@fqazi I thin what you suggested is dangerous. Consider a function

create function f 
$$
select * from t@t_pkey;
select * from t@[1];
$$

this will create two identical back-references in the table

              {
                  "columnIds": [
                      1
                  ],
                  "id": 105,
                  "indexId": 1
              },
              {
                  "columnIds": [
                      1
                  ],
                  "id": 105,
                  "indexId": 1
              },

What you suggested is to, in the builder, look at back-references and schedule rewrites such that those back-references be updated to new index ID. But that will break the UDF since select * from t@[1] won't work anymore.

I had a chat with @mgartner yesterday and learned that in PG users bear the risk of breaking the UDF if they change the referencing tables (PG won't prevent them from modifying the table in a udf-breaking way), but so far in CRDB we have tried to track the dependencies and disallow schema changes to tables that would break UDFs that reference them.

For now, I will teach the declarative schema changer to detect this case and fallback to legacy schema changer

@fqazi
Copy link
Collaborator

fqazi commented Aug 23, 2023

I think it could be argued both of those are the same, you are pointing to the same object (index). I think, at least from a usability perspective, we probably should aim for better since dangling references are just bad. Like its okay to aim for PG compatibility in general, but if a behaviour is suspicious or bad, we should have some leeway. That's just my opinion on it.

@Xiang-Gu
Copy link
Contributor

I agree that we don't want to copy every single bit of PG's behavior.

I am saying that if we were to update the back-reference in the table, we would need to potentially update/rewrite the UDF too (e.g. change select * from t@[1]; to select * from t@[2] after the schema change on t). I am just raising the question that "do we want to go down that route?"

@Xiang-Gu
Copy link
Contributor

Another note: for ADD COLUMN and DROP COLUMN, legacy schema changer will work bc its implementation luckily rewrite the old primary index in-place, but it's susceptible to the same problem as declarative schema changer.

This means schema changes that will actually drop the old primary index will be broken in the legacy schema changer. For example, off my head, that will be ALTER PRIMARY KEY.
Indeed, if I run

set use_declarative_schema_changer = 'off';
create table t (i int primary key, j int not null);
create function f() returns record language sql as $$ select * from t@[1]; $$;

alter table t alter primary key using columns (j);

The ALTER PK will hang with logs like

E230823 15:58:14.683954 3072 jobs/registry.go:1699 ⋮ [T1,Vsystem,n1] 361  job 893700979575848961: running execution encountered retriable error: non-cance
lable: relation ‹"t"› (104): index ID 1 found in depended-on-by references, no such index in this relation. ref = ‹id:105 index_id:1 column_ids:1 column_i
ds:2 by_id:false ›

This makes me think maybe we should alter both schema changers to reject such schema changes.

@fqazi
Copy link
Collaborator

fqazi commented Aug 23, 2023

I agree that we don't want to copy every single bit of PG's behavior.

I am saying that if we were to update the back-reference in the table, we would need to potentially update/rewrite the UDF too (e.g. change select * from t@[1]; to select * from t@[2] after the schema change on t). I am just raising the question that "do we want to go down that route?"

The answer is maybe...but I'm sure @rafiss has thoughts on it, too. This is probably the longer-term answer, I think.

This makes me think maybe we should alter both schema changers to reject such schema changes.

It also makes me think of chicken and egg problems here...i.e. you can't replace the routine to refer to the new index, without having the new index.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
branch-master Failures and bugs on the master branch. C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. db-cy-23 regression Regression from a release. T-sql-foundations SQL Foundations Team (formerly SQL Schema + SQL Sessions)
Projects
No open projects
Archived in project
Development

Successfully merging a pull request may close this issue.

4 participants