Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Information queries failed after upgrade to 20.2 #56810

Closed
ikgo opened this issue Nov 17, 2020 · 26 comments
Closed

Information queries failed after upgrade to 20.2 #56810

ikgo opened this issue Nov 17, 2020 · 26 comments
Labels
C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. O-community Originated from the community X-blathers-triaged blathers was able to find an owner

Comments

@ikgo
Copy link

ikgo commented Nov 17, 2020

Describe the problem

After upgrade to 20.2 any information queries (SHOW) failed with following error

SQL Error [XX000]: ERROR: internal error: parentID 87 does not exist
  Detail: stack trace:
/go/src/github.com/cockroachdb/cockroach/pkg/sql/catalog/tabledesc/structured.go:1453: validateCrossReferences()
/go/src/github.com/cockroachdb/cockroach/pkg/sql/catalog/tabledesc/structured.go:1440: Validate()
/go/src/github.com/cockroachdb/cockroach/pkg/sql/catalog/catalogkv/catalogkv.go:298: unwrapDescriptor()
/go/src/github.com/cockroachdb/cockroach/pkg/sql/catalog/catalogkv/catalogkv.go:390: GetAllDescriptors()
/go/src/github.com/cockroachdb/cockroach/pkg/sql/catalog/descs/collection.go:1545: GetAllDescriptors()
/go/src/github.com/cockroachdb/cockroach/pkg/sql/information_schema.go:1864: forEachTableDescWithTableLookupInternal()
/go/src/github.com/cockroachdb/cockroach/pkg/sql/information_schema.go:1825: forEachTableDescWithTableLookup()
/go/src/github.com/cockroachdb/cockroach/pkg/sql/pg_catalog.go:1072: func1()
/go/src/github.com/cockroachdb/cockroach/pkg/sql/virtual_schema.go:492: 1()
/go/src/github.com/cockroachdb/cockroach/pkg/sql/virtual_table.go:115: func2()
/usr/local/go/src/runtime/asm_amd64.s:1357: goexit()

To Reproduce
upgrade from 20.1.8 to 20.2
run queries

show index from TABLE
show tables from DB

Environment:

  • CockroachDB version 20.2
  • Server OS: Kubernetes
  • Client app: any
@blathers-crl
Copy link

blathers-crl bot commented Nov 17, 2020

Hello, I am Blathers. I am here to help you get the issue triaged.

Hoot - a bug! Though bugs are the bane of my existence, rest assured the wretched thing will get the best of care here.

I have CC'd a few people who may be able to assist you:

If we have not gotten back to your issue within a few business days, you can try the following:

  • Join our community slack channel and ask on #cockroachdb.
  • Try find someone from here if you know they worked closely on the area and CC them.

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan.

@blathers-crl blathers-crl bot added C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. O-community Originated from the community X-blathers-triaged blathers was able to find an owner labels Nov 17, 2020
@thoszhang
Copy link
Contributor

Hi @ikgo, sorry you're running into this. This is a known issue caused by validation for table metadata becoming more strict in 20.2 and thus uncovering pre-existing inconsistencies.

Could you run cockroach debug doctor cluster --url=<cluster connection string> (you'll have to use a 20.2 cockroach binary for this) to connect to your running cluster and run checks against the metadata, and send the output to us? (This command isn't publicly documented, but cockroach debug doctor cluster --help provides more documentation, including flags for connecting to the cluster.)

If you'd rather not post the output publicly, you can send it to us at https://support.cockroachlabs.com/ (and link to this issue) or email me ([email protected]).

@ikgo
Copy link
Author

ikgo commented Nov 18, 2020

cmd: cockroach debug doctor cluster --insecure --url postgresql://127.0.0.1:26257

Examining 87 descriptors and 96 namespace entries...
   Table   2: ParentID   1, ParentSchemaID 29, Name 'namespace': not being dropped but no namespace entry found
   Table  30: ParentID   1, ParentSchemaID 29, Name 'namespace2': namespace entry {ParentID:1 ParentSchemaID:29 Name:namespace} not found in draining names
   Table  30: ParentID   1, ParentSchemaID 29, Name 'namespace2': could not find name in namespace table
   Table  90: ParentID  87, ParentSchemaID 29, Name 'campaigns': parentID 87 does not exist
   Table 109: ParentID 106, ParentSchemaID 29, Name 'campaigns': parentID 106 does not exist
   Table 129: ParentID 126, ParentSchemaID 29, Name 'blends': parentID 126 does not exist
   Table 149: ParentID 146, ParentSchemaID 29, Name 'campaigns': parentID 146 does not exist
Examining 0 running jobs...
ERROR: validation failed
Failed running "debug doctor cluster"

@thoszhang
Copy link
Contributor

Thanks. This points to #50997, which is a bug where we failed to drop tables which use sequences during DROP DATABASE CASCADE, leaving behind tables with no parent database. Did you attempt to drop the parent database for those orphaned tables at some point?

As for dropping those tables, removing the table data would be a somewhat involved process. Simply removing the table metadata would be easier, but less desirable since the table data would still take up space. Do you know approximately how much data was in those tables (the campaign tables and blends)?

@ikgo
Copy link
Author

ikgo commented Nov 18, 2020 via email

@thoszhang
Copy link
Contributor

What happens if you run the same debug doctor command on the production cluster?

We'll get back to you with steps to fix the problem. I expect that we'll be able to take steps to fix the production cluster before upgrading to 20.2, so that you won't run into the error.

@ikgo
Copy link
Author

ikgo commented Nov 18, 2020 via email

@lime008
Copy link

lime008 commented Nov 19, 2020

This same thing happened to us by an accidental auto-update of the database cluster and now one of our production databases is inaccessible because of this. Is there any fix to this?

@thoszhang
Copy link
Contributor

@lime008 could you also post (or send us) the output of cockroach debug doctor cluster?

@thoszhang
Copy link
Contributor

@ikgo The problem with your production cluster seems to be different from the one with the test cluster. Would you mind sending us a debug.zip for each of the clusters?

@lime008
Copy link

lime008 commented Nov 20, 2020

@lucy-zhang

Examining 195 descriptors and 159 namespace entries...
   Table   2: ParentID   1, ParentSchemaID 29, Name 'namespace': not being dropped but no namespace entry found
   Table  30: ParentID   1, ParentSchemaID 29, Name 'namespace2': namespace entry {ParentID:1 ParentSchemaID:29 Name:namespace} not found in draining names
   Table  30: ParentID   1, ParentSchemaID 29, Name 'namespace2': could not find name in namespace table
Examining 12 running jobs...
ERROR: validation failed
Failed running "debug doctor cluster"

@thoszhang
Copy link
Contributor

@lime008 Thanks. Nothing looks obviously out of place here (the namespace-related entries are expected, I believe), but there may be other descriptor issues not reported by doctor.

What queries are you issuing, and what errors are you getting?

Also, was this cluster ever running 19.2 at some point, or was it bootstrapped on 20.1?

@thoszhang
Copy link
Contributor

@lime008 it would also help if you sent us a debug.zip. Instructions here: https://www.cockroachlabs.com/docs/v20.2/cockroach-debug-zip.html

@ikgo
Copy link
Author

ikgo commented Nov 20, 2020

Testing environment
debug doctor output

Examining 85 descriptors and 55 namespace entries...
   Table   2: ParentID   1, ParentSchemaID 29, Name 'namespace': not being dropped but no namespace entry found
   Table  30: ParentID   1, ParentSchemaID 29, Name 'namespace2': namespace entry {ParentID:1 ParentSchemaID:29 Name:namespace} not found in draining names
   Table  30: ParentID   1, ParentSchemaID 29, Name 'namespace2': could not find name in namespace table
   Table  90: ParentID  87, ParentSchemaID 29, Name 'campaigns': parentID 87 does not exist
   Table 109: ParentID 106, ParentSchemaID 29, Name 'campaigns': parentID 106 does not exist
   Table 129: ParentID 126, ParentSchemaID 29, Name 'blends': parentID 126 does not exist
   Table 149: ParentID 146, ParentSchemaID 29, Name 'campaigns': parentID 146 does not exist
Examining 2 running jobs...
ERROR: validation failed
Failed running "debug doctor cluste

Remove whole database that contains "campaigns" and "blends" tables, but the output still same. Also meta info queries still not works for other databases.

Production Environment
the debug.zip is a large file. more that 100MB. How you would like I share it with you?
Ican upload file to google drive. please send me email so I can share file.

@ikgo
Copy link
Author

ikgo commented Nov 22, 2020

Hi,
I need help, kind of urgent:)
After migration to 20.2 in test environment instead of problem reported above I have a couple another problems.
Test (local cluster) cluster contains data that I need and I must somehow rescue those data.

Please help to fix database or backup/restore data. Any workaround, short term solution will be very appreciated.

Problems

  • it's impossible to rollback to 20.1.8
  • significant performance degradation after migration to 20.2
  • dump command not exists in 20.2
  • database backup failed with error below
./cockroach sql --insecure --execute="backup database XXX to 'nodelocal://self/cockroach-data/backups/XXX'"
ERROR: internal error: failed to resolve targets specified in the BACKUP stmt: parentID 87 does not exist
SQLSTATE: XX000
DETAIL: stack trace:
/go/src/github.com/cockroachdb/cockroach/pkg/sql/catalog/tabledesc/structured.go:1453: validateCrossReferences()
/go/src/github.com/cockroachdb/cockroach/pkg/sql/catalog/tabledesc/structured.go:1440: Validate()
/go/src/github.com/cockroachdb/cockroach/pkg/sql/catalog/catalogkv/catalogkv.go:298: unwrapDescriptor()
/go/src/github.com/cockroachdb/cockroach/pkg/sql/catalog/catalogkv/catalogkv.go:390: GetAllDescriptors()
/go/src/github.com/cockroachdb/cockroach/pkg/ccl/backupccl/targets.go:702: func1()
/go/src/github.com/cockroachdb/cockroach/pkg/kv/db.go:707: func1()
/go/src/github.com/cockroachdb/cockroach/pkg/kv/txn.go:811: exec()
/go/src/github.com/cockroachdb/cockroach/pkg/kv/db.go:706: Txn()
/go/src/github.com/cockroachdb/cockroach/pkg/ccl/backupccl/targets.go:698: loadAllDescs()
/go/src/github.com/cockroachdb/cockroach/pkg/ccl/backupccl/targets.go:715: ResolveTargetsToDescriptors()
/go/src/github.com/cockroachdb/cockroach/pkg/ccl/backupccl/backup_planning.go:564: func3()
/go/src/github.com/cockroachdb/cockroach/pkg/sql/planhook.go:146: func1()
/usr/local/go/src/runtime/asm_amd64.s:1357: goexit()

@thoszhang
Copy link
Contributor

@ikgo to resolve the problem with your test environment, try these steps to manually update the table metadata. This should resolve the parentID ... does not exist errors, which reflect metadata corruption that's blocking both the information queries and the backup.

First you'll need to download a nightly build from the 20.2 release branch, which has new builtin functions to enable manual descriptor repair. You can download it from here (or replace with darwin-10.9-amd64 or windows-6.2-amd64 as needed):

https://cockroach-builds.s3.amazonaws.com/cockroach-v20.2.0-230-g8729c06ed8.linux-amd64.tgz

Restart one of the nodes on your test cluster to run that binary. Then, while connected to that node, do the following:

  1. Create a new database: CREATE DATABASE temp_db_to_drop. This will be a new temporary parent database for the orphaned tables.
  2. Get its ID: SELECT id FROM system.namespace WHERE name = 'temp_db_to_drop'.
  3. For each table with the Table <tableid>: ParentID <parentid>, ParentSchemaID <schemaid>, Name ...: parentID <parentid> does not exist error in the doctor output (note that there are 3 distinct IDs here):
    1. Get the JSON-encoded table descriptor and save it to a text file:
      SELECT crdb_internal.pb_to_json('cockroach.sql.sqlbase.Descriptor', descriptor) FROM system.descriptor WHERE id = <tableid>
      
    2. Replace the ID in "parentId": <parentid> with the ID of temp_db_to_drop, found in step 2.
    3. Replace the ID in "unexposedParentSchemaId": <schemaid> with the constant value 29. (It may already be that value.)
    4. Write the updated descriptor:
      SELECT crdb_internal.unsafe_upsert_descriptor(<tableid>, crdb_internal.json_to_pb('cockroach.sql.sqlbase.Descriptor', '<modified json descriptor>')
      
  4. Run debug doctor again. You should only get the namespace-related results. If there are other results, post the output here and we'll take a look.
  5. Drop the database along with all its tables with DROP DATABASE temp_db_to_drop CASCADE.

Let us know how that goes.

@ikgo
Copy link
Author

ikgo commented Nov 23, 2020

Running query below produce error
Doctor output:

--   Table   2: ParentID   1, ParentSchemaID 29, Name 'namespace': not being dropped but no namespace entry found
--   Table  30: ParentID   1, ParentSchemaID 29, Name 'namespace2': namespace entry {ParentID:1 ParentSchemaID:29 Name:namespace} not found in draining names
--   Table  30: ParentID   1, ParentSchemaID 29, Name 'namespace2': could not find name in namespace table
--   Table  90: ParentID  87, ParentSchemaID 29, Name 'campaigns': parentID 87 does not exist
--   Table 109: ParentID 106, ParentSchemaID 29, Name 'campaigns': parentID 106 does not exist
--   Table 129: ParentID 126, ParentSchemaID 29, Name 'blends': parentID 126 does not exist
--   Table 149: ParentID 146, ParentSchemaID 29, Name 'campaigns': parentID 146 does not exist

Query:

SELECT crdb_internal.pb_to_json('cockroach.sql.sqlbase.Descriptor', descriptor) FROM system.descriptor WHERE id = 90;

Query output:

org.jkiss.dbeaver.model.exec.DBCException: SQL Error [XX000]: ERROR: internal error: parentID 87 does not exist
  Detail: stack trace:
/go/src/github.com/cockroachdb/cockroach/pkg/sql/catalog/tabledesc/structured.go:1453: validateCrossReferences()
/go/src/github.com/cockroachdb/cockroach/pkg/sql/catalog/tabledesc/structured.go:1440: Validate()
/go/src/github.com/cockroachdb/cockroach/pkg/sql/catalog/catalogkv/catalogkv.go:298: unwrapDescriptor()
/go/src/github.com/cockroachdb/cockroach/pkg/sql/catalog/catalogkv/catalogkv.go:390: GetAllDescriptors()
/go/src/github.com/cockroachdb/cockroach/pkg/sql/catalog/descs/collection.go:1545: GetAllDescriptors()
/go/src/github.com/cockroachdb/cockroach/pkg/sql/information_schema.go:1705: forEachTypeDesc()
/go/src/github.com/cockroachdb/cockroach/pkg/sql/pg_catalog.go:2297: 1()
/go/src/github.com/cockroachdb/cockroach/pkg/sql/information_schema.go:1687: forEachDatabaseDesc()
/go/src/github.com/cockroachdb/cockroach/pkg/sql/pg_catalog.go:2285: func133()
/go/src/github.com/cockroachdb/cockroach/pkg/sql/virtual_schema.go:492: 1()
/go/src/github.com/cockroachdb/cockroach/pkg/sql/virtual_table.go:115: func2()
/usr/local/go/src/runtime/asm_amd64.s:1357: goexit()

@thoszhang
Copy link
Contributor

I'm not sure where that error is coming from. Judging from that stack trace, the SELECT crdb_internal.pb_to_json(...) query is being issued via some external tool, right? What happens if you issue the query directly via ./cockroach sql --execute?

@CMajeri
Copy link

CMajeri commented Nov 25, 2020

Hello (coming over from #56957), it's happening to us too!

debug doctor:

cockroach debug doctor cluster --url postgresql://127.0.0.1:26257 --certs-dir .
Examining 744 descriptors and 760 namespace entries...
   Table   2: ParentID   1, ParentSchemaID 29, Name 'namespace': not being dropped but no namespace entry found
   Table  30: ParentID   1, ParentSchemaID 29, Name 'namespace2': namespace entry {ParentID:1 ParentSchemaID:29 Name:namespace} not found in draining names
   Table  30: ParentID   1, ParentSchemaID 29, Name 'namespace2': could not find name in namespace table
   Table 262227: ParentID 262141, ParentSchemaID 29, Name 'usersgroups': parentID 262141 does not exist
   Table 262231: ParentID 262141, ParentSchemaID 29, Name 'request_trails': parentID 262141 does not exist
   Table 262240: ParentID 262141, ParentSchemaID 29, Name 'address_attributes': parentID 262141 does not exist
   Table 262246: ParentID 262141, ParentSchemaID 29, Name 'rules_trails': parentID 262141 does not exist
   Table 262248: ParentID 262141, ParentSchemaID 29, Name 'entity_changes': parentID 262141 does not exist
   Table 262259: ParentID 262141, ParentSchemaID 29, Name 'wl_address_trails': parentID 262141 does not exist
   Table 262261: ParentID 262141, ParentSchemaID 29, Name 'wl_contract_address_trails': parentID 262141 does not exist
   Table 262267: ParentID 262141, ParentSchemaID 29, Name 'wl_addresses_internal_addresses': parentID 262141 does not exist
   Table 262274: ParentID 262141, ParentSchemaID 29, Name 'reserved_funds': parentID 262141 does not exist
   Table 262485: ParentID 262395, ParentSchemaID 29, Name 'transaction_attributes': parentID 262395 does not exist
   Table 264820: ParentID 264772, ParentSchemaID 29, Name 'token_holders': parentID 264772 does not exist
Examining 1 running jobs...
ERROR: validation failed
Failed running "debug doctor cluster"

Tried running the steps described for table 262227

cockroach sql --host 127.0.0.1:26257 --certs-dir . -e "SELECT id FROM system.namespace WHERE name='tmpdb'"                                                                                                                                                                                                                                                                                                        
    id                                                                                                                                                                                                                                                                                                                                                                                                                                   
----------                                                                                                                                                                                                                                                                                                                                                                                                                               
  269246
cockroach sql --host 127.0.0.1:26257 --certs-dir . -e "SELECT crdb_internal.pb_to_json('cockroach.sql.sqlbase.Descriptor', descriptor) FROM system.descriptor WHERE id = 262227"
...
{"table": {"auditMode": "DISABLED", "checks": [], "columns": [{"alterColumnTypeInProgress": false, "computeExpr": null, "defaultExpr": null, "hidden": false, "id": 1, "name": "tenantid", "nullable": true, "ownsSequenceIds": [], "pgAttributeNum": 0, "systemColumnKind": "NONE", "type": {}, "usesSequenceIds": []}, {"alterColumnTypeInProgress": false, "computeExpr": null, "defaultExpr": null, "hidden": false, "id": 2, "name": "userid", "nullable": true, "ownsSequenceIds": [], "pgAttributeNum": 0, "systemColumnKind": "NONE", "type": {}, "usesSequenceIds": []}, {"alterColumnTypeInProgress": false, "computeExpr": null, "defaultExpr": null, "hidden": false, "id": 3, "name": "groupid", "nullable": true, "ownsSequenceIds": [], "pgAttributeNum": 0, "systemColumnKind": "NONE", "type": {}, "usesSequenceIds": []}, {"alterColumnTypeInProgress": false, "computeExpr": null, "defaultExpr": "unique_rowid()", "hidden": true, "id": 4, "name": "rowid", "nullable": false, "ownsSequenceIds": [], "pgAttributeNum": 0, "systemColumnKind": "NONE", "type": {}, "usesSequenceIds": []}], "createAsOfTime": {"logical": 2, "wallTime": "1605802536470365180"}, "createQuery": "", "dependedOnBy": [], "dependsOn": [], "drainingNames": [], "dropJobId": "0", "dropTime": "0", "families": [{"columnIds": [1, 2, 3, 4], "columnNames": ["tenantid", "userid", "groupid", "rowid"], "defaultColumnId": 0, "id": 0, "name": "primary"}], "formatVersion": 3, "gcMutations": [], "id": 262227, "inboundFks": [], "indexes": [{"columnDirections": ["ASC", "ASC", "ASC"], "columnIds": [1, 2, 3], "columnNames": ["tenantid", "userid", "groupid"], "compositeColumnIds": [], "createdExplicitly": false, "disabled": false, "encodingType": 0, "extraColumnIds": [4], "foreignKey": {"index": 0, "match": "SIMPLE", "name": "", "onDelete": "NO_ACTION", "onUpdate": "NO_ACTION", "sharedPrefixLen": 0, "table": 0, "validity": "Validated"}, "geoConfig": {"s2Geography": null, "s2Geometry": null}, "id": 2, "interleave": {"ancestors": []}, "interleavedBy": [], "name": "usersgroups_tenantid_userid_groupid_key", "partitioning": {"list": [], "numColumns": 0, "range": []}, "predicate": "", "referencedBy": [], "sharded": {"columnNames": [], "isSharded": false, "name": "", "shardBuckets": 0}, "storeColumnIds": [], "storeColumnNames": [], "type": "FORWARD", "unique": true, "version": 1}], "isMaterializedView": false, "lease": null, "modificationTime": {"logical": 0, "wallTime": "0"}, "mutationJobs": [], "mutations": [], "name": "usersgroups", "nextColumnId": 5, "nextFamilyId": 1, "nextIndexId": 3, "nextMutationId": 1, "offlineReason": "", "outboundFks": [], "parentId": 262141, "primaryIndex": {"columnDirections": ["ASC"], "columnIds": [4], "columnNames": ["rowid"], "compositeColumnIds": [], "createdExplicitly": false, "disabled": false, "encodingType": 0, "extraColumnIds": [], "foreignKey": {"index": 0, "match": "SIMPLE", "name": "", "onDelete": "NO_ACTION", "onUpdate": "NO_ACTION", "sharedPrefixLen": 0, "table": 0, "validity": "Validated"}, "geoConfig": {"s2Geography": null, "s2Geometry": null}, "id": 1, "interleave": {"ancestors": []}, "interleavedBy": [], "name": "primary", "partitioning": {"list": [], "numColumns": 0, "range": []}, "predicate": "", "referencedBy": [], "sharded": {"columnNames": [], "isSharded": false, "name": "", "shardBuckets": 0}, "storeColumnIds": [], "storeColumnNames": [], "type": "FORWARD", "unique": true, "version": 0}, "privileges": {"owner": "root", "users": [{"privileges": 2, "user": "admin"}, {"privileges": 2, "user": "root"}], "version": 1}, "replacementOf": {"id": 0, "time": {"logical": 0, "wallTime": "0"}}, "sequenceOpts": null, "state": "PUBLIC", "temporary": false, "unexposedParentSchemaId": 29, "version": 3, "viewQuery": ""}}                                                                                     ```
[email protected]:26257/defaultdb> SELECT crdb_internal.unsafe_upsert_descriptor(262227, crdb_internal.json_to_pb('cockroach.sql.sqlbase.Descriptor', '{"table": {"auditMode": "DISABLED", "checks": [], "columns": [{"alterColumnTypeInProgress": false, "computeExpr": null, "defaultExpr": null, "hidden": false, "id": 1, "name": "tenantid", "nullable": true, "ownsSequenceIds": [], "pgAttributeNum": 0, "systemColumnKind": "NONE", "type": {}, "usesSequenceIds": []}, {"alterColumnTypeInProgress": false, "computeExpr": null, "defaultExpr": null, "hidden": false, "id": 2, "name": "userid", "nullable": true, "ownsSequenceIds": [], "pgAttributeNum": 0, "systemColumnKind": "NONE", "type": {}, "usesSequenceIds": []}, {"alterColumnTypeInProgress": false, "computeExpr": null, "defaultExpr": null, "hidden": false, "id": 3, "name": "groupid", "nullable": true, "ownsSequenceIds": [], "pgAttributeNum": 0, "systemColumnKind": "NONE", "type": {}, "usesSequenceIds": []}, {"alterColumnTypeInProgress": false, "computeExpr": null, "defaultExpr": "unique_rowid()", "hidden": true, "id": 4, "name": "rowid", "nullable": false, "ownsSequenceIds": [], "pgAttributeNum": 0, "systemColumnKind": "NONE", "type": {}, "usesSequenceIds": []}], "createAsOfTime": {"logical": 2, "wallTime": "1605802536470365180"}, "createQuery": "", "dependedOnBy": [], "dependsOn": [], "drainingNames": [], "dropJobId": "0", "dropTime": "0", "families": [{"columnIds": [1, 2, 3, 4], "columnNames": ["tenantid", "userid", "groupid", "rowid"], "defaultColumnId": 0, "id": 0, "name": "primary"}], "formatVersion": 3, "gcMutations": [], "id": 262227, "inboundFks": [], "indexes": [{"columnDirections": ["ASC", "ASC", "ASC"], "columnIds": [1, 2, 3], "columnNames": ["tenantid", "userid", "groupid"], "compositeColumnIds": [], "createdExplicitly": false, "disabled": false, "encodingType": 0, "extraColumnIds": [4], "foreignKey": {"index": 0, "match": "SIMPLE", "name": "", "onDelete": "NO_ACTION", "onUpdate": "NO_ACTION", "sharedPrefixLen": 0, "table": 0, "validity": "Validated"}, "geoConfig": {"s2Geography": null, "s2Geometry": null}, "id": 2, "interleave": {"ancestors": []}, "interleavedBy": [], "name": "usersgroups_tenantid_userid_groupid_key", "partitioning": {"list": [], "numColumns": 0, "range": []}, "predicate": "", "referencedBy": [], "sharded": {"columnNames": [], "isSharded": false, "name": "", "shardBuckets": 0}, "storeColumnIds": [], "storeColumnNames": [], "type": "FORWARD", "unique": true, "version": 1}], "isMaterializedView": false, "lease": null, "modificationTime": {"logical": 0, "wallTime": "0"}, "mutationJobs": [], "mutations": [], "name": "usersgroups", "nextColumnId": 5, "nextFamilyId": 1, "nextIndexId": 3, "nextMutationId": 1, "offlineReason": "", "outboundFks": [], "parentId": 269246, "primaryIndex": {"columnDirections": ["ASC"], "columnIds": [4], "columnNames": ["rowid"], "compositeColumnIds": [], "createdExplicitly": false, "disabled": false, "encodingType": 0, "extraColumnIds": [], "foreignKey": {"index": 0, "match": "SIMPLE", "name": "", "onDelete": "NO_ACTION", "onUpdate": "NO_ACTION", "sharedPrefixLen": 0, "table": 0, "validity": "Validated"}, "geoConfig": {"s2Geography": null, "s2Geometry": null}, "id": 1, "interleave": {"ancestors": []}, "interleavedBy": [], "name": "primary", "partitioning": {"list": [], "numColumns": 0, "range": []}, "predicate": "", "referencedBy": [], "sharded": {"columnNames": [], "isSharded": false, "name": "", "shardBuckets": 0}, "storeColumnIds": [], "storeColumnNames": [], "type": "FORWARD", "unique": true, "version": 0}, "privileges": {"owner": "root", "users": [{"privileges": 2, "user": "admin"}, {"privileges": 2, "user": "root"}], "version": 1}, "replacementOf": {"id": 0, "time": {"logical": 0, "wallTime": "0"}}, "sequenceOpts": null, "state": "PUBLIC", "temporary": false, "unexposedParentSchemaId": 29, "version": 3, "viewQuery": ""}}'));

And now I get

cockroach debug doctor cluster --url postgresql://127.0.0.1:26257 --certs-dir .
Examining 745 descriptors and 762 namespace entries...
   Table   2: ParentID   1, ParentSchemaID 29, Name 'namespace': not being dropped but no namespace entry found
   Table  30: ParentID   1, ParentSchemaID 29, Name 'namespace2': namespace entry {ParentID:1 ParentSchemaID:29 Name:namespace} not found in draining names
   Table  30: ParentID   1, ParentSchemaID 29, Name 'namespace2': could not find name in namespace table
   Table 262227: ParentID 269246, ParentSchemaID 29, Name 'usersgroups': namespace entry {ParentID:262141 ParentSchemaID:29 Name:usersgroups} not found in draining names
   Table 262227: ParentID 269246, ParentSchemaID 29, Name 'usersgroups': could not find name in namespace table
   Table 262231: ParentID 262141, ParentSchemaID 29, Name 'request_trails': parentID 262141 does not exist
   Table 262240: ParentID 262141, ParentSchemaID 29, Name 'address_attributes': parentID 262141 does not exist
   Table 262246: ParentID 262141, ParentSchemaID 29, Name 'rules_trails': parentID 262141 does not exist
   Table 262248: ParentID 262141, ParentSchemaID 29, Name 'entity_changes': parentID 262141 does not exist
   Table 262259: ParentID 262141, ParentSchemaID 29, Name 'wl_address_trails': parentID 262141 does not exist
   Table 262261: ParentID 262141, ParentSchemaID 29, Name 'wl_contract_address_trails': parentID 262141 does not exist
   Table 262267: ParentID 262141, ParentSchemaID 29, Name 'wl_addresses_internal_addresses': parentID 262141 does not exist
   Table 262274: ParentID 262141, ParentSchemaID 29, Name 'reserved_funds': parentID 262141 does not exist
   Table 262485: ParentID 262395, ParentSchemaID 29, Name 'transaction_attributes': parentID 262395 does not exist
   Table 264820: ParentID 264772, ParentSchemaID 29, Name 'token_holders': parentID 264772 does not exist

Which seems to work ok (I get only the "namespace" errors).
However the next table contains a sequence as its primary key, on a column with underscores

CREATE SEQUENCE IF NOT EXISTS request_trails_seq MAXVALUE 9007199254740991;
...
id INT64 PRIMARY KEY DEFAULT nextval('request_trails_seq'),

which I think breaks the upsert_descriptor function:

[email protected]:26257/defaultdb> SELECT crdb_internal.unsafe_upsert_descriptor(262231, crdb_internal.json_to_pb('cockroach.sql.sqlbase.Descriptor', '{"table": {"auditMode": "DISABLED", "checks": [], "columns": [{"alterColumnTypeInProgress": false, "computeExpr": null, "defaultExpr": "nextval('request_trails_seq':::STRING)", "hidden": false, "id": 1, "name": "id", "nullable": false, "ownsSequenceIds": [], "pgAttributeNum":0, "systemColumnKind": "NONE", "type": {}, "usesSequenceIds": [262230]}, {"alterColumnTypeInProgress": false, "computeExpr": null, "defaultExpr": null, "hidden": false, "id": 2, "name": "requestid", "nullable": true, "ownsSequenceIds": [], "pgAttributeNum": 0, "systemColumnKind": "NONE", "type": {}, "usesSequenceIds": []}, {"alterColumnTypeInProgress": false, "computeExpr": null, "defaultExpr": null, "hidden": false, "id": 3, "name": "requeststatus", "nullable": true, "ownsSequenceIds": [], "pgAttributeNum": 0, "systemColumnKind": "NONE", "type": {}, "usesSequenceIds": []}, {"alterColumnTypeInProgress": false, "computeExpr": null, "defaultExpr": null, "hidden": false, "id": 4, "name": "userid", "nullable": true, "ownsSequenceIds": [], "pgAttributeNum": 0, "systemColumnKind": "NONE", "type": {}, "usesSequenceIds": []}, {"alterColumnTypeInProgress": false, "computeExpr": null, "defaultExpr": null, "hidden": false, "id": 5, "name": "externaluserid", "nullable": true, "ownsSequenceIds": [], "pgAttributeNum": 0, "systemColumnKind": "NONE", "type": {}, "usesSequenceIds": []}, {"alterColumnTypeInProgress": false, "computeExpr": null, "defaultExpr": null, "hidden": false, "id": 6, "name": "action", "nullable": true, "ownsSequenceIds": [], "pgAttributeNum": 0, "systemColumnKind": "NONE", "type": {}, "usesSequenceIds": []}, {"alterColumnTypeInProgress": false, "computeExpr": null, "defaultExpr": "now():::TIMESTAMP", "hidden": false, "id": 7, "name": "date", "nullable": true, "ownsSequenceIds": [], "pgAttributeNum": 0, "systemColumnKind": "NONE", "type": {}, "usesSequenceIds": []}, {"alterColumnTypeInProgress": false, "computeExpr": null, "defaultExpr": null, "hidden": false, "id":8, "name": "comment", "nullable": true, "ownsSequenceIds": [], "pgAttributeNum": 0, "systemColumnKind": "NONE", "type": {}, "usesSequenceIds": []}], "createAsOfTime": {"logical": 2, "wallTime": "1605802536470365180"}, "createQuery": "", "dependedOnBy": [], "dependsOn": [], "drainingNames": [], "dropJobId": "0", "dropTime": "0", "families": [{"columnIds": [1, 2, 3, 4, 5, 6, 7, 8], "columnNames": ["id", "requestid", "requeststatus", "userid", "externaluserid", "action", "date", "comment"], "defaultColumnId": 0, "id": 0, "name": "primary"}], "formatVersion": 3, "gcMutations": [], "id": 262231, "inboundFks": [], "indexes": [], "isMaterializedView": false, "lease": null, "modificationTime": {"logical": 0, "wallTime": "0"}, "mutationJobs": [], "mutations": [], "name": "request_trails", "nextColumnId": 9, "nextFamilyId": 1, "nextIndexId": 2, "nextMutationId": 1, "offlineReason": "", "outboundFks": [], "parentId": 269246, "primaryIndex": {"columnDirections": ["ASC"], "columnIds": [1], "columnNames": ["id"], "compositeColumnIds": [], "createdExplicitly": false, "disabled": false, "encodingType": 0, "extraColumnIds": [], "foreignKey": {"index": 0, "match": "SIMPLE", "name": "", "onDelete": "NO_ACTION", "onUpdate": "NO_ACTION", "sharedPrefixLen": 0, "table": 0, "validity": "Validated"}, "geoConfig": {"s2Geography": null, "s2Geometry": null}, "id": 1, "interleave": {"ancestors": []}, "interleavedBy": [], "name": "primary", "partitioning": {"list": [], "numColumns": 0, "range": []}, "predicate": "", "referencedBy": [], "sharded": {"columnNames": [], "isSharded": false, "name": "", "shardBuckets": 0}, "storeColumnIds": [], "storeColumnNames": [], "type": "FORWARD", "unique": true, "version": 1}, "privileges": {"owner": "root", "users": [{"privileges": 2, "user": "admin"}, {"privileges": 2, "user": "root"}], "version": 1}, "replacementOf": {"id": 0, "time": {"logical": 0, "wallTime": "0"}}, "sequenceOpts": null, "state": "PUBLIC", "temporary": false, "unexposedParentSchemaId": 29, "version": 3, "viewQuery": ""}}'));
invalid syntax: statement ignored: at or near "request_trails_seq": syntax error
SQLSTATE: 42601
DETAIL: source SQL:

The json can be "prettified" so I think it's syntactically sound, not quite sure what else could be causing that.
I'd try messing around with changing the format of the 'request_trails_seq':::STRING bit but I'm afraid i'd break things :)

@ikgo
Copy link
Author

ikgo commented Nov 25, 2020

I'm not sure where that error is coming from. Judging from that stack trace, the SELECT crdb_internal.pb_to_json(...) query is being issued via some external tool, right? What happens if you issue the query directly via ./cockroach sql --execute?

same output

@ikgo
Copy link
Author

ikgo commented Nov 25, 2020

Hi, I manage to export all needed data from test environment. Clean install of 20.2 works good.

What status with my production environment?
#56810 (comment)

@knz
Copy link
Contributor

knz commented Nov 26, 2020

Hi ikgo -- so you know some of the team members that can help you are currently away because of the US thanksgiving holiday. Is this something that can wait until next Monday? Otherwise, please state the urgency and I will try to escalate this.

@lime008
Copy link

lime008 commented Nov 27, 2020

Hi, I was able to attach the tables with a missing parent_id to a new table and drop that with the instructions above, so the database is now working as expected.

@ikgo
Copy link
Author

ikgo commented Nov 27, 2020

Hi ikgo -- so you know some of the team members that can help you are currently away because of the US thanksgiving holiday. Is this something that can wait until next Monday? Otherwise, please state the urgency and I will try to escalate this.

not urgent, it can wait.
thanks

@ajwerner
Copy link
Contributor

With 20.2.3 out, I believe that this issue is resolved and am closing it.

@ikgo
Copy link
Author

ikgo commented Dec 21, 2020

Hi @ajwerner,
This issue have two problems. One for testing environment and one for production.
My question about production, see #56810 (comment)
Is it will be safe to migrate?
How to check if migration possible?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. O-community Originated from the community X-blathers-triaged blathers was able to find an owner
Projects
None yet
Development

No branches or pull requests

6 participants