Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

release-20.2: DROP DATABASE causes ERROR: descriptor not found #62281

Closed
mgartner opened this issue Mar 19, 2021 · 13 comments
Closed

release-20.2: DROP DATABASE causes ERROR: descriptor not found #62281

mgartner opened this issue Mar 19, 2021 · 13 comments
Assignees
Labels
A-schema-descriptors Relating to SQL table/db descriptor handling. C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior.

Comments

@mgartner
Copy link
Collaborator

While updating SQLancer to run against Cockroach v20.2.6, SQLancer revealed a bug. Oddly, I'm unable to reproduce the error by manually running the statements that caused it:

-- assuming a "test" database already exists
USE test;
DROP DATABASE IF EXISTS database5 CASCADE;
CREATE DATABASE database5;
USE database5;

The error that SQLancer encounters is: ERROR: descriptor not found. The SQLancer test output can be found here.

This isn't much information to work with, so I'll try to gather more clues.

Environment:

  • CockroachDB version v20.2.6
  • JDBC
@mgartner mgartner added C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. A-schema-descriptors Relating to SQL table/db descriptor handling. labels Mar 19, 2021
mrigger added a commit to sqlancer/sqlancer that referenced this issue Mar 27, 2021
@mrigger
Copy link

mrigger commented Mar 27, 2021

I can reliably reproduce the issue when executing COCKROACHDB_AVAILABLE=true mvn -Dtest=TestCockroachDB test in the SQLancer directory. I cannot reproduce the issue when (repeatedly) executing the statements using cockroach sql. I thought that the issue might be related to running multiple SQLancer threads at once, but reducing the number of threads still causes the issue to appear in the CI (see https://github.com/sqlancer/sqlancer/runs/2209576849?check_suite_focus=true) as well as locally. Since this issue only appeared after upgrading to the latest CockroachDB version, I suspect that this is a regression bug related to the interaction with JDBC.

@mgartner
Copy link
Collaborator Author

Interesting, thanks for the additional info. It's possible that JDBC is issuing the problematic queries on its own. I'm going to try to repoduce and log all queries hitting the cluster to see if that provides any clues.

cc @rafiss for visibility

@mgartner
Copy link
Collaborator Author

mgartner commented Mar 29, 2021

I'm able to reliably reproduce as well. I've attached some of the logs collected during the reproduction. Nothing stood out to me, but maybe there is some useful info in there? I can collect more information while reproducing if it'd be helpful.

cockroach-sql-exec.log
cockroach.log
cockroach-stderr.log

cc @ajwerner incase he has any ideas

Reproduction Steps

1. Start a CockroachDB v20.2.6 single-node cluster and add a sqlancer user.

cockroach start-single-node --insecure
cockroach sql --insecure -e 'CREATE USER sqlancer; GRANT admin TO sqlancer;'

2. Clone SQLancer and run the tests.

git clone https://github.com/sqlancer/sqlancer
cd sqlancer
COCKROACHDB_AVAILABLE=true mvn -Dtest=TestCockroachDB test

After a few thousand queries the error is produced.

@ajwerner
Copy link
Contributor

Ack, with a repro this should be easy enough to track down. I'll take a look in a bit.

@mgartner
Copy link
Collaborator Author

@ajwerner any idea when you'll have a chance to look into this?

@ajwerner
Copy link
Contributor

Sorry, slipped through the cracks, I'll prioritize it and try to have something to say by tomorrow.

@ajwerner
Copy link
Contributor

ajwerner commented Apr 16, 2021

Well this ended up being quite far from where I expected it to be. It also is fixed on 21.1 and master in #60775 (perhaps not totally intentionally). All in all, one of those 2-char code diffs:

--- a/pkg/sql/create_view.go
+++ b/pkg/sql/create_view.go
@@ -176,7 +176,7 @@ func (n *createViewNode) startExec(params runParams) error {
                        privs,
                        &params.p.semaCtx,
                        params.p.EvalContext(),
-                       n.persistence,
+                       persistence,
                )

PR inbound once I get 20.2 tests running.

@ajwerner ajwerner changed the title DROP DATABASE causes ERROR: descriptor not found release-20.2: DROP DATABASE causes ERROR: descriptor not found Apr 16, 2021
@ajwerner
Copy link
Contributor

Fixed by #63781. Will be in the next 20.2.x release.

@mgartner
Copy link
Collaborator Author

Thanks @ajwerner!

mgartner added a commit to mgartner/sqlancer that referenced this issue Apr 26, 2021
CockroachDB tests were disabled after they revealed a bug in v20.2.6,
cockroachdb/cockroach#62281. This bug was
fixed and v20.2.8. This commit updates the CockroachDB version and
enables the tests.
wan1y added a commit to wan1y/sqlancer that referenced this issue Apr 27, 2021
add more tidb error merssage

format

checkstyle pass

Added some expected TiDB error messages

format

Update CockroachDB to v20.2.8 and enable tests

CockroachDB tests were disabled after they revealed a bug in v20.2.6,
cockroachdb/cockroach#62281. This bug was
fixed and v20.2.8. This commit updates the CockroachDB version and
enables the tests.
wan1y added a commit to wan1y/sqlancer that referenced this issue Apr 27, 2021
add more tidb error merssage

format

checkstyle pass

Added some expected TiDB error messages

format

Update CockroachDB to v20.2.8 and enable tests

CockroachDB tests were disabled after they revealed a bug in v20.2.6,
cockroachdb/cockroach#62281. This bug was
fixed and v20.2.8. This commit updates the CockroachDB version and
enables the tests.

Update CockroachDB to v20.2.8 and enable tests

CockroachDB tests were disabled after they revealed a bug in v20.2.6,
cockroachdb/cockroach#62281. This bug was
fixed and v20.2.8. This commit updates the CockroachDB version and
enables the tests.
wan1y added a commit to wan1y/sqlancer that referenced this issue Apr 27, 2021
add more tidb error merssage

format

checkstyle pass

Added some expected TiDB error messages

format

Update CockroachDB to v20.2.8 and enable tests

CockroachDB tests were disabled after they revealed a bug in v20.2.6,
cockroachdb/cockroach#62281. This bug was
fixed and v20.2.8. This commit updates the CockroachDB version and
enables the tests.

Update CockroachDB to v20.2.8 and enable tests

CockroachDB tests were disabled after they revealed a bug in v20.2.6,
cockroachdb/cockroach#62281. This bug was
fixed and v20.2.8. This commit updates the CockroachDB version and
enables the tests.
@mgartner
Copy link
Collaborator Author

mgartner commented Jul 1, 2021

@ajwerner I just encountered the same error message when running SQLancer against v21.1.3 (I'm trying to upgrade SQLancer to run against v21.1.3 in sqlancer/sqlancer#369) with the same repro steps as above.

I noticed that the 2-character change is not included in v21.1. Should it be?

It could be that some other bug is causing the same error. I stressed your regression test in #63781 and was unable to get it to fail.

@ajwerner
Copy link
Contributor

ajwerner commented Jul 1, 2021

I noticed that the 2-character change is not included in v21.1. Should it be?

I recall that code having changed on master at the time which is why I didn't do anything on master. I suspect that it's a different bug. Let me have a look.

@ajwerner
Copy link
Contributor

ajwerner commented Jul 1, 2021

This one turned out to be different, but still related to temporary views! In 21.1 we added a lot of code on properly tracking events in the event log with fully qualified names. Turns out that it can be hard to qualify names of views in other temporary databases. At least, it's hard if you don't keep around more state. See #67104 (which will backport reasonably cleanly).

What's nice is that I've been working on a change just today that should have fixed this internally by recognizing that to have gotten here we must have read the schemas for the database and therefore we know what this thing should be called.

@mgartner
Copy link
Collaborator Author

mgartner commented Jul 1, 2021

Thanks for the quick response!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-schema-descriptors Relating to SQL table/db descriptor handling. C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior.
Projects
None yet
Development

No branches or pull requests

3 participants