Information queries failed after upgrade to 20.2 #56810
Hello, I am Blathers. I am here to help you get the issue triaged. Hoot - a bug! Though bugs are the bane of my existence, rest assured the wretched thing will get the best of care here. I have CC'd a few people who may be able to assist you:
If we have not gotten back to your issue within a few business days, you can try the following:
🦉 Hoot! I am Blathers, a bot for CockroachDB. My owner is otan.
Hi @ikgo, sorry you're running into this. This is a known issue caused by validation for table metadata becoming more strict in 20.2, thus uncovering pre-existing inconsistencies. Could you run `cockroach debug doctor` and post the output? If you'd rather not post the output publicly, you can send it to us at https://support.cockroachlabs.com/ (and link to this issue) or email me ([email protected]).
cmd:
Thanks. This points to #50997, which is a bug where we failed to drop tables which use sequences during `DROP DATABASE CASCADE`, leaving behind tables with no parent database. Did you attempt to drop the parent database for those orphaned tables at some point? As for dropping those tables, removing the table data would be a somewhat involved process. Simply removing the table metadata would be easier, but less desirable since the table data would still take up space. Do you know approximately how much data was in those tables (the campaign tables and blends)?
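As a hedged sketch of the #50997 failure mode described above (the database, table, and sequence names here are illustrative, not taken from the reporter's cluster):

```sql
-- Hypothetical shape of the #50997 bug: a table whose column default
-- uses a sequence, inside a database dropped with CASCADE.
CREATE DATABASE app;
CREATE SEQUENCE app.campaign_seq;
CREATE TABLE app.campaigns (
    id INT DEFAULT nextval('app.campaign_seq'),
    name STRING
);
-- On affected 20.1 versions, this could leave the descriptor for
-- app.campaigns behind with no parent database; 20.2's stricter
-- validation then reports the orphan as an inconsistency.
DROP DATABASE app CASCADE;
```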
Hi, it's related to a previous issue I reported several months ago. We tried to change the structure of the tables, but the ALTER query failed to run, so we renamed the tables and kept them empty. DROP TABLE failed for those tables. The upgrade to 20.2 was performed on a testing environment with a small amount of data, but the production DB has hundreds of millions of records. Is it possible to somehow clean up the metadata? What will the upgrade process on production be?
What happens if you run the same `debug doctor` command on the production cluster? We'll get back to you with steps to fix the problem. I expect that we'll be able to take steps to fix the production cluster before upgrading to 20.2, so that you won't run into the error.
*Output from production DB*:
```
$ cockroach debug doctor cluster --insecure --url postgresql://127.0.0.1:26257
Examining 101 descriptors and 107 namespace entries...
  Table 2: ParentID 1, ParentSchemaID 29, Name 'namespace': not being dropped but no namespace entry found
  Table 30: ParentID 1, ParentSchemaID 29, Name 'namespace2': namespace entry {ParentID:1 ParentSchemaID:29 Name:namespace} not found in draining names
  Table 30: ParentID 1, ParentSchemaID 29, Name 'namespace2': could not find name in namespace table
  Table 157: ParentID 61, ParentSchemaID 29, Name 'campaigns': namespace entry {ParentID:61 ParentSchemaID:29 Name:blends} not found in draining names
  Table 157: ParentID 61, ParentSchemaID 29, Name 'campaigns': could not find name in namespace table
  Table 158: ParentID 61, ParentSchemaID 29, Name 'campaign_sources': namespace entry {ParentID:61 ParentSchemaID:29 Name:blend_sources} not found in draining names
Examining 0 running jobs...
ERROR: validation failed
Failed running "debug doctor cluster"
```
This same thing happened to us after an accidental auto-update of the database cluster, and now one of our production databases is inaccessible because of it. Is there any fix for this?
@lime008 could you also post (or send us) the output of `cockroach debug doctor`?
@ikgo The problem with your production cluster seems to be different from the one with the test cluster. Would you mind sending us a debug.zip for each of the clusters?
@lime008 Thanks. Nothing looks obviously out of place here (the namespace-related entries are expected, I believe), but there may be other descriptor issues not reported by `debug doctor`. What queries are you issuing, and what errors are you getting? Also, was this cluster ever running 19.2 at some point, or was it bootstrapped on 20.1?
@lime008 it would also help if you sent us a debug.zip. Instructions here: https://www.cockroachlabs.com/docs/v20.2/cockroach-debug-zip.html
Testing environment
Removed the whole database that contains the "campaigns" and "blends" tables, but the output is still the same. Also, meta info queries still do not work for other databases.
Production environment
Hi, please help us fix the database or back up/restore the data. Any workaround or short-term solution would be much appreciated. Problems:
@ikgo to resolve the problem with your test environment, try these steps to manually update the table metadata. This should resolve the validation errors. First you'll need to download a nightly build from the 20.2 release branch, which has new builtin functions to enable manual descriptor repair. You can download it from here (or replace with …).
Restart one of the nodes on your test cluster to run that binary. Then, while connected to that node, do the following:
Let us know how that goes.
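The exact repair statements did not survive this copy of the thread. As a rough, hedged illustration only (not the steps originally posted; descriptor ID 52 is hypothetical, while 61/29/'campaigns'/157 are taken from the `debug doctor` output earlier in this issue), the 20.2 descriptor-repair builtins are used along these lines:

```sql
-- Inspect a suspect descriptor as JSON.
SELECT id,
       crdb_internal.pb_to_json('cockroach.sql.sqlbase.Descriptor', descriptor)
FROM system.descriptor
WHERE id = 52;

-- Write back an edited descriptor.
SELECT crdb_internal.unsafe_upsert_descriptor(
    52,
    crdb_internal.json_to_pb('cockroach.sql.sqlbase.Descriptor', '<edited json>')
);

-- Or remove a stale namespace entry for an orphaned table
-- (parent ID, parent schema ID, name, descriptor ID).
SELECT crdb_internal.unsafe_delete_namespace_entry(61, 29, 'campaigns', 157);
```

These builtins bypass normal safety checks, which is why the comment above recommends running them only on a nightly build under guidance.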
Running the query below produces an error.
Query:
Query output:
I'm not sure where that error is coming from. Judging from that stack trace, the
Hello (coming over from #56957), it's happening to us too! debug doctor:
Tried running the steps described for table 262227
And now I get
Which seems to work OK (I get only the "namespace" errors).
which I think breaks the upsert_descriptor function:
The JSON can be "prettified", so I think it's syntactically sound; not quite sure what else could be causing that.
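Since the comment above suspects the pretty-printed descriptor JSON, a quick local sanity check can rule out syntax or whitespace problems before handing the string back to the upsert builtin. This is a minimal sketch; `compact_descriptor_json` and the sample fields are hypothetical, not part of any CockroachDB API:

```python
import json

def compact_descriptor_json(pretty: str) -> str:
    """Validate descriptor JSON and re-serialize it without the
    insignificant whitespace added by pretty-printing."""
    obj = json.loads(pretty)  # raises json.JSONDecodeError if malformed
    return json.dumps(obj, separators=(",", ":"))

# Illustrative descriptor fragment, not a real dump from this cluster.
pretty = """{
  "table": {
    "id": 262227,
    "name": "example",
    "unexposedParentSchemaId": 29
  }
}"""
compact = compact_descriptor_json(pretty)
print(compact)
```

If `json.loads` succeeds, the string is syntactically sound and any remaining failure is on the server side.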
Same output.
Hi, I managed to export all needed data from the test environment. A clean install of 20.2 works well. What is the status of my production environment?
Hi ikgo -- so you know, some of the team members who can help you are currently away because of the US Thanksgiving holiday. Is this something that can wait until next Monday? Otherwise, please state the urgency and I will try to escalate this.
Hi, I was able to attach the tables with a missing parent_id to a new table and drop that with the instructions above, so the database is now working as expected.
Not urgent; it can wait.
With 20.2.3 out, I believe that this issue is resolved and am closing it.
Hi @ajwerner,
Describe the problem
After upgrading to 20.2, any information queries (`SHOW`) fail with the following error.

To Reproduce
1. upgrade from 20.1.8 to 20.2
2. run queries

Environment: