Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql: change table truncation to create new indexes rather than new ID #52235

Merged
merged 1 commit into from
Aug 13, 2020

Conversation

rohany
Copy link
Contributor

@rohany rohany commented Aug 3, 2020

Fixes #52170.

This commit updates the implementation of table truncation to create a
new set of indexes for the table, rather than creating a new table ID
for the table's data.

Release note: None

@cockroach-teamcity
Copy link
Member

This change is Reviewable

@rohany
Copy link
Contributor Author

rohany commented Aug 3, 2020

cc @ajwerner @lucy-zhang wanted to get some eyes here before cleaning this up heavily and adding some schema changer tests

Copy link
Contributor

@ajwerner ajwerner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 2 of 13 files at r1.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @rohany)


pkg/sql/backfill.go, line 646 at r1 (raw file):

		if err := execCfg.DB.Txn(ctx, func(ctx context.Context, txn *kv.Txn) error {
			var err error
			table, err = sqlbase.GetTableDescFromID(ctx, txn, execCfg.Codec, tableID)

can you use catalogkv here?


pkg/sql/backfill.go, line 665 at r1 (raw file):

			return err
		case errors.Is(err, sqlbase.ErrDescriptorNotFound):
			log.Infof(ctx, "descriptor %d not found for GCJob index truncation", tableID)

Don't you want to return here?


pkg/sql/truncate.go, line 236 at r1 (raw file):

	// TODO (rohany): I'm confused about what job API to use -- i want to start the
	//  job and not wait for it at all.

You want CreateAdoptableJobWithTxn


I feel like we need to make sure in the gcjob that the leases have drained for the current version. Does that happen anywhere?


pkg/sql/truncate.go, line 361 at r1 (raw file):

	// TODO (rohany): Do a query here first to see if there are any index comments
	//  that need to be updated.
	// TODO (rohany): Is there a natural way of doing this so that it is all in

This seems fine.

Copy link
Contributor Author

@rohany rohany left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll update this tomorrow with a test that ensures that interleaved data gets cleaned up on truncate.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner)


pkg/ccl/changefeedccl/changefeed_test.go, line 1313 at r2 (raw file):

		sqlDB := sqlutils.MakeSQLRunner(db)

		// TODO (rohany): Not sure what we expect from truncations now in changefeedland.

@ajwerner what's going on here? Right now, we hang on the first call to next after the truncation -- I assume this is because now the table ID is the same and there isn't any data coming into the table. What should happen now?


pkg/sql/backfill.go, line 646 at r1 (raw file):

Previously, ajwerner wrote…

can you use catalogkv here?

yeah, this PR was behind those changes.


pkg/sql/truncate.go, line 236 at r1 (raw file):

Previously, ajwerner wrote…
	// TODO (rohany): I'm confused about what job API to use -- i want to start the
	//  job and not wait for it at all.

You want CreateAdoptableJobWithTxn


I feel like we need to make sure in the gcjob that the leases have drained for the current version. Does that happen anywhere?

Done. Added a lease drain stage to the GC Job when indexes are being deleted.

@rohany rohany marked this pull request as ready for review August 11, 2020 17:56
@rohany rohany requested review from a team, adityamaru, thoszhang and ajwerner and removed request for a team August 11, 2020 17:56
Copy link
Contributor

@ajwerner ajwerner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great! I had to think for a while about the logic of interleaved secondary indexes but it seems spot on.

Let's get the changefeed stuff in order and merge it

Reviewed 2 of 12 files at r3.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @adityamaru, @ajwerner, @lucy-zhang, and @rohany)


pkg/ccl/changefeedccl/changefeed_test.go, line 1313 at r2 (raw file):

Previously, rohany (Rohan Yadav) wrote…

@ajwerner what's going on here? Right now, we hang on the first call to next after the truncation -- I assume this is because now the table ID is the same and there isn't any data coming into the table. What should happen now?

Yeesh yeah this is a big behavior change. We need to do something here. For now let's work to preserve the behavior that exists, I think ideally what we want is to perform a backfill that issues deletes for every key that used to exist but that can come later.

In order to preserve the behavior, we want to classify the error appropriately such that this call returns an error:

shouldFilter, err := tf.filter.shouldFilter(ctx, e)

We should make

func classifyTableEvent(e TableEvent) tableEventType {
detect and return a new truncate event then have shouldFilter return an error indicating that the table has been truncated with a TODO to do something better.


pkg/sql/truncate.go, line 379 at r3 (raw file):

}

// reassignInterleaveIndexReferences reassigns all index ID's present in

This is sort of magical and amazing that for primary indexes of interleave children you don't actually need to do anything fancy like change the index ID because you've changed the parent index id. 🤯

@rohany
Copy link
Contributor Author

rohany commented Aug 13, 2020

This is sort of magical and amazing that for primary indexes of interleave children you don't actually need to do anything fancy like change the index ID because you've changed the parent index id. 🤯

IDK if this is as 🧠 as you're imagining -- we traverse the interleaved hierarchy during the truncate and reassign all of the index ID's when processing each table.

@rohany rohany force-pushed the truncate-redo branch 2 times, most recently from 7013258 to 00670a2 Compare August 13, 2020 18:13
Copy link
Contributor Author

@rohany rohany left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @adityamaru, @ajwerner, and @lucy-zhang)


pkg/ccl/changefeedccl/changefeed_test.go, line 1313 at r2 (raw file):

Previously, ajwerner wrote…

Yeesh yeah this is a big behavior change. We need to do something here. For now let's work to preserve the behavior that exists, I think ideally what we want is to perform a backfill that issues deletes for every key that used to exist but that can come later.

In order to preserve the behavior, we want to classify the error appropriately such that this call returns an error:

shouldFilter, err := tf.filter.shouldFilter(ctx, e)

We should make

func classifyTableEvent(e TableEvent) tableEventType {
detect and return a new truncate event then have shouldFilter return an error indicating that the table has been truncated with a TODO to do something better.

Done. However, i tried to add some tests for this (to differentiate table truncation from primary key changes) and my test failed with errors out side of this (we can differentiate truncation from pk changes). I added this test:


// TestChangefeedPrimaryKeyChange ensures that primary key changes aren't
// interpreted as table truncations by the changefeed table event filters.
func TestChangefeedPrimaryKeyChange(t *testing.T) {
	defer leaktest.AfterTest(t)()
	defer log.Scope(t).Close(t)

	testFn := func(t *testing.T, db *gosql.DB, f cdctest.TestFeedFactory) {
		sqlDB := sqlutils.MakeSQLRunner(db)

		// Create a table and add some data.
		sqlDB.Exec(t, `CREATE TABLE t (x INT PRIMARY KEY, y INT NOT NULL)`)
		sqlDB.Exec(t, `INSERT INTO t VALUES (1, 1)`)
		tfeed := feed(t, f, `CREATE CHANGEFEED FOR t`)
		defer closeFeed(t, tfeed)
		assertPayloads(t, tfeed, []string{`t: [1]->{"after": {"x": 1, "y": 1}}`})
		// Now change the primary key.
		sqlDB.Exec(t, `ALTER TABLE t ALTER PRIMARY KEY USING COLUMNS (y)`)
		// Insert a new value.
		sqlDB.Exec(t, `INSERT INTO t VALUES (2, 2)`)
		assertPayloads(t, tfeed, []string{`t: [1]->{"after": {"x": 2, "y": 2}}`})
	}

	t.Run(`sinkless`, sinklessTest(testFn))
	t.Run(`enterprise`, enterpriseTest(testFn))
}

and it fails with an decoding unset encdatum error. I think this can be addressed in future PRs by folks with more change feed knowledge.

Copy link
Contributor

@ajwerner ajwerner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm: after a rebase and the nits

Reviewed 1 of 12 files at r3, 2 of 8 files at r4.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @adityamaru, @ajwerner, @lucy-zhang, and @rohany)


pkg/ccl/changefeedccl/changefeed_test.go, line 1313 at r2 (raw file):

Previously, rohany (Rohan Yadav) wrote…

Done. However, i tried to add some tests for this (to differentiate table truncation from primary key changes) and my test failed with errors out side of this (we can differentiate truncation from pk changes). I added this test:


// TestChangefeedPrimaryKeyChange ensures that primary key changes aren't
// interpreted as table truncations by the changefeed table event filters.
func TestChangefeedPrimaryKeyChange(t *testing.T) {
	defer leaktest.AfterTest(t)()
	defer log.Scope(t).Close(t)

	testFn := func(t *testing.T, db *gosql.DB, f cdctest.TestFeedFactory) {
		sqlDB := sqlutils.MakeSQLRunner(db)

		// Create a table and add some data.
		sqlDB.Exec(t, `CREATE TABLE t (x INT PRIMARY KEY, y INT NOT NULL)`)
		sqlDB.Exec(t, `INSERT INTO t VALUES (1, 1)`)
		tfeed := feed(t, f, `CREATE CHANGEFEED FOR t`)
		defer closeFeed(t, tfeed)
		assertPayloads(t, tfeed, []string{`t: [1]->{"after": {"x": 1, "y": 1}}`})
		// Now change the primary key.
		sqlDB.Exec(t, `ALTER TABLE t ALTER PRIMARY KEY USING COLUMNS (y)`)
		// Insert a new value.
		sqlDB.Exec(t, `INSERT INTO t VALUES (2, 2)`)
		assertPayloads(t, tfeed, []string{`t: [1]->{"after": {"x": 2, "y": 2}}`})
	}

	t.Run(`sinkless`, sinklessTest(testFn))
	t.Run(`enterprise`, enterpriseTest(testFn))
}

and it fails with an decoding unset encdatum error. I think this can be addressed in future PRs by folks with more change feed knowledge.

I created #52796 to track.


pkg/sql/backfill.go, line 627 at r4 (raw file):

// SchemaChanger.truncateIndexes instead because that accesses the most recent
// version of the table when deleting. In this case, we need to use the version
// of the table before truncation, which is passed in.

Is it worth trying to unify some code between the two?


pkg/sql/backfill.go, line 643 at r4 (raw file):

Quoted 4 lines of code…
			if log.V(2) {
				log.Infof(
					ctx, "truncate interleaved index (%d) at row: %d, span: %s", table.ID, rowIdx, resume)
			}

This is a weird wrapping. Also I think log.VEventf is strictly better in this case.

Fixes cockroachdb#52170.

This commit updates the implementation of table truncation to create a
new set of indexes for the table, rather than creating a new table ID
for the table's data.

Release note: None
Copy link
Contributor Author

@rohany rohany left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @adityamaru, @ajwerner, and @lucy-zhang)


pkg/sql/backfill.go, line 627 at r4 (raw file):

Previously, ajwerner wrote…
// SchemaChanger.truncateIndexes instead because that accesses the most recent
// version of the table when deleting. In this case, we need to use the version
// of the table before truncation, which is passed in.

Is it worth trying to unify some code between the two?

Trust me, I spent a long time trying. Trying to unify them but also allow the schema changer use case to refetch the table at every iteration led to some weird interfaces.

@rohany
Copy link
Contributor Author

rohany commented Aug 13, 2020

This is a weird wrapping. Also I think log.VEventf is strictly better in this case.

Switched to a veventf

Copy link
Contributor

@ajwerner ajwerner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @adityamaru, @ajwerner, and @lucy-zhang)

@rohany
Copy link
Contributor Author

rohany commented Aug 13, 2020

bors r+

@craig
Copy link
Contributor

craig bot commented Aug 13, 2020

Build succeeded:

@craig craig bot merged commit 673ca44 into cockroachdb:master Aug 13, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

sql: create new indexes rather than a new table in TRUNCATE
3 participants