Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

backup/restore: restoring table that references a type that is already defined in the schema being restored to causes error #70168

Closed
Tracked by #70164
RichardJCai opened this issue Sep 13, 2021 · 0 comments · Fixed by #70452
Assignees
Labels
A-disaster-recovery C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. T-disaster-recovery

Comments

@RichardJCai
Copy link
Contributor

RichardJCai commented Sep 13, 2021

Sample test case added to backup_test.go to reproduce

Produces error: error executing 'RESTORE TABLE d.s.t FROM 'nodelocal://0/test/' WITH into_db = 'd2'': pq: restoring 1 TableDescriptors from 0 databases: restoring table desc and namespace entries: table already exists

	// Test cases where we attempt to remap types in the backup to types that
	// already exist in the cluster.
	t.Run("backup-remap2", func(t *testing.T) {
		_, _, sqlDB, _, cleanupFn := BackupRestoreTestSetup(t, singleNode, 0, InitManualReplication)
		defer cleanupFn()
		sqlDB.Exec(t, `
CREATE DATABASE d;
CREATE SCHEMA d.s;
CREATE TYPE d.s.greeting AS ENUM ('hello', 'howdy', 'hi');
CREATE TABLE d.s.t (x d.s.greeting);
INSERT INTO d.s.t VALUES ('hello'), ('howdy');
CREATE TYPE d.s.farewell AS ENUM ('bye', 'cya');
CREATE TABLE d.s.t2 (x d.s.greeting[]);
INSERT INTO d.s.t2 VALUES (ARRAY['hello']);
`)
		{
			// Backup and restore t.
			sqlDB.Exec(t, `BACKUP TABLE d.s.t TO 'nodelocal://0/test/'`)
			sqlDB.Exec(t, `DROP TABLE d.s.t`)
			sqlDB.Exec(t, `RESTORE TABLE d.s.t FROM 'nodelocal://0/test/'`)

			// Check that the table data is restored correctly and the types aren't touched.
			sqlDB.CheckQueryResults(t, `SELECT 'hello'::d.s.greeting, ARRAY['hello']::d.s.greeting[]`, [][]string{{"hello", "{hello}"}})
			sqlDB.CheckQueryResults(t, `SELECT * FROM d.s.t ORDER BY x`, [][]string{{"hello"}, {"howdy"}})

			// d.t should be added as a back reference to greeting.
			sqlDB.ExpectErr(t, `pq: cannot drop type "greeting" because other objects \(.*\) still depend on it`, `DROP TYPE d.s.greeting`)
		}

		{
			// Test that backing up an restoring a table with just the array type
			// will remap types appropriately.
			sqlDB.Exec(t, `BACKUP TABLE d.s.t2 TO 'nodelocal://0/test2/'`)
			sqlDB.Exec(t, `DROP TABLE d.s.t2`)
			sqlDB.Exec(t, `RESTORE TABLE d.s.t2 FROM 'nodelocal://0/test2/'`)
			sqlDB.CheckQueryResults(t, `SELECT 'hello'::d.s.greeting, ARRAY['hello']::d.s.greeting[]`, [][]string{{"hello", "{hello}"}})
			sqlDB.CheckQueryResults(t, `SELECT * FROM d.s.t2 ORDER BY x`, [][]string{{"{hello}"}})
		}

		{
			// Create another database with compatible types.
			sqlDB.Exec(t, `CREATE DATABASE d2`)
			sqlDB.Exec(t, `CREATE SCHEMA d2.s`)
			sqlDB.Exec(t, `CREATE TYPE d2.s.greeting AS ENUM ('hello', 'howdy', 'hi')`)

			// Now restore t into this database. It should remap d.greeting to d2.greeting.
			sqlDB.Exec(t, `RESTORE TABLE d.s.t FROM 'nodelocal://0/test/' WITH into_db = 'd2'`)
                }
})

The culprit is that within restore_planning.go in the allocateDescriptorRewrites function, the collision logic check uses the wrong parent schema id.

https://github.com/cockroachdb/cockroach/blob/master/pkg/ccl/backupccl/restore_planning.go#L721-L728

				// See if there is an existing type with the same name.
				desc, err := catalogkv.GetDescriptorCollidingWithObject(
					ctx,
					txn,
					p.ExecCfg().Codec,
					parentID,
					typ.GetParentSchemaID(),
					typ.Name,
				)

typ.GetParentSchemaID() returns the schema ID of the schema the type was originally backed up from, not the id of the schema being restored to. Note that this is fine with the public schema since it will be 29 either but with UDS, this issue occurs.

I'm not sure at the moment how to map one schema to the other given just the schema id.

@RichardJCai RichardJCai added C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. A-disaster-recovery T-disaster-recovery labels Sep 13, 2021
@gh-casper gh-casper self-assigned this Sep 20, 2021
gh-casper added a commit to gh-casper/cockroach that referenced this issue Sep 20, 2021
…at is defined in user-defined schema

Previously, restore would fail if trying to restore a table that references a type defined in a user defined schema in a new database.

This change adds logic to find correct parent schema ID for a type.

Fixes: cockroachdb#70168

Release note: none
craig bot pushed a commit that referenced this issue Sep 21, 2021
69905: colexec: adds support for partial ordering in topk sorter r=rharding6373 a=rharding6373

Previously, topKSorter had to process all input rows before returning
the top K rows according to its specified ordering. If a subset of the
input rows were already ordered, topKSorter would still iterate over the
entire input.

However, if the input was partially ordered, topKSorter could
potentially stop iterating early, since after it has found K candidates
it is guaranteed not to find any better top candidates.

For example, take the following query and table with an index on a:

```
  a | b
----+----
  1 | 5
  2 | 3
  2 | 1
  3 | 3
  5 | 3

SELECT * FROM t ORDER BY a, b LIMIT 2
```

Given an index scan on a to provide `a`'s ordering, topk only needs to
process 3 rows in order to guarantee that it has found the top K rows.
Once it finishes processing the third row `[2, 1]`, all subsequent rows
have higher values of `a` than the top 2 rows found so far, and
therefore cannot be in the top 2 rows.

This change modifies the vectorized engine's TopKSorter signature to include
a partial ordering. The TopKSorter chunks the input according to the
sorted columns and processes each chunk with its existing heap
algorithm. At the end of each chunk, if K rows are in the heap,
TopKSorter emits the rows and stops execution.

A later commit, once merged with top K optimizer and distsql changes, will adjust the cost model for top K to reflect this change.

Release note: N/A

70285: log: add `version` field to `json` formatted log entries r=knz a=cameronnunez

Fixes [#70202](#70202).

Release note (cli change): version details have been added to all json formatted 
log entries. Refer to the reference docs for details about the field.

70380: backupccl: drop temp system database on failed restore r=irfansharif a=adityamaru

Previously, if a restore failed during execution
we would not cleanup the temproary system db descriptor
that we create during a cluster restore. A
`SHOW DATABASES` after the failed restore would show
the `crdb_temp_system` database as well.

This change adds logic to drop the database in the
OnFailOrCancel hook of the retore job.

Fixes: #70324

Release note: None

70452: backupccl: fix error when restoring a table that references a type defined in user-defined schema r=gh-casper a=gh-casper

Previously, restore would fail if trying to restore a table that references a type defined in a user defined schema in a new database.

This change adds logic to use ID of the schema in the restoring DB for a type if this schema has the same name as in the DB backed up.

Fixes: #70168

Release note: none

70472: sql: include ON UPDATE on CREATE TABLE LIKE for INCLUDING DEFAULTS r=rafiss a=otan

Resolves #69258

Release note (sql change): CREATE TABLE ... LIKE ... now copies ON
UPDATE definitions for INCLUDING DEFAULTS.

70500: vendor: bump Pebble to 9509dcb7a53a r=sumeerbhola a=jbowens

```
9509dcb compaction: fix nil pointer during errored compactions
d27f1d7 internal/base: add SetWithDelete key kind
971533d base: add `InternalKeyKindSeparator`
3f0c125 cmd/pebble: specify format major version
```

Informs #70443.

Release note: none

70511: authors: add Jon Tsiros to authors r=jtsiros a=jtsiros

Release note: None

70518: authors: add mbookham7 to authors r=mbookham7 a=mbookham7

Release note: None

70525: sql: interleaved tables notice was incorrectly labeled as an error r=fqazi a=fqazi

Previously, interleaved tables were only deprecated, and
later on we fully dropped support for them returning a new
notice with the word "error" indicating they were no-ops. This
was inadequates because the message is not fatal and only a notice.
To address this, this patch will change them message type to notice.

Release note: None

Co-authored-by: rharding6373 <[email protected]>
Co-authored-by: Cameron Nunez <[email protected]>
Co-authored-by: Aditya Maru <[email protected]>
Co-authored-by: Casper <[email protected]>
Co-authored-by: Oliver Tan <[email protected]>
Co-authored-by: Jackson Owens <[email protected]>
Co-authored-by: Jon Tsiros <[email protected]>
Co-authored-by: Mike Bookham <[email protected]>
Co-authored-by: Faizan Qazi <[email protected]>
@craig craig bot closed this as completed in 7da0dd7 Sep 21, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-disaster-recovery C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. T-disaster-recovery
Projects
No open projects
Archived in project
2 participants