Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

workload/schemachange: add select support for workload #84399

Merged
merged 5 commits into from
Aug 10, 2022

Conversation

fqazi
Copy link
Collaborator

@fqazi fqazi commented Jul 13, 2022

  1. Refactor the op interface to allow us to easily support different types of queries
  2. Add a new select operation and make interface changes to help support some level of validation
  3. Introduce a new connection watch dog, which ensures that connections make progress
  4. Limit the CTAS operations size, since with join we can produce extremely large inserts
  5. Skip constraint validation when generated columns cannot be evalauted

@cockroach-teamcity
Copy link
Member

This change is Reviewable

@fqazi fqazi force-pushed the scEnableSelect branch 3 times, most recently from e915ea1 to 3b1d04e Compare July 14, 2022 15:03
@fqazi fqazi marked this pull request as ready for review July 14, 2022 15:03
@fqazi fqazi requested a review from a team July 14, 2022 15:03
Copy link
Contributor

@ajwerner ajwerner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is looking pretty close. Has anything interesting come from it?

Reviewed 4 of 4 files at r1, 3 of 3 files at r2, 5 of 5 files at r3, 1 of 1 files at r4, 1 of 1 files at r5, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @fqazi)


-- commits line 39 at r4:
nit: remove this


pkg/ccl/testccl/workload/schemachange/schema_change_external_test.go line 94 at r3 (raw file):

	workerFn := func(ctx context.Context, fn func(ctx context.Context) error) func() error {
		return func() error {
			defer fmt.Printf("FINISHED WORKER %p\n", fn)

remove these?


pkg/workload/schemachange/error_screening.go line 345 at r1 (raw file):

				continue
			}
			evalTxn, err := tx.Begin(ctx)

you should check this error

edit: you do something about this in a later commit


pkg/workload/schemachange/error_screening.go line 347 at r1 (raw file):

			evalTxn, err := tx.Begin(ctx)
			newCols[colInfo.name], err = og.generateColumn(ctx, tx, colInfo, columnsToValues)
			evalTxn.Rollback(ctx)

do you need to check this error?


pkg/workload/schemachange/operation_generator.go line 2562 at r1 (raw file):

			)
		}
		// FIXME: Operation tracking..

What is the TODO here?


pkg/workload/schemachange/operation_generator.go line 3334 at r2 (raw file):

	const MaxTablesForSelect = 3
	const MaxColumnsForSelect = 16
	const MaxRowsToConsume = 1

why export these? should these parameters live in the og struct?

Code quote:

	const MaxTablesForSelect = 3
	const MaxColumnsForSelect = 16
	const MaxRowsToConsume = 1

pkg/workload/schemachange/watch_dog.go line 61 at r3 (raw file):

	lastActiveQuery := w.activeQuery
	if err := sessionInfo.Scan(&w.activeQuery, &w.txnID); err != nil {
		fmt.Printf("failed to get session information: %v\n", err)

should we plumb some sort of logger into this object so we get context?


pkg/workload/schemachange/watch_dog.go line 85 at r3 (raw file):

// or a timeout is hit.
func (w *schemaChangeWatchDog) watchLoop(ctx context.Context) {
	const maxTimeOutForDump = 300

time.Duration? otherwise, change the language to be about iterations?


pkg/workload/schemachange/watch_dog.go line 94 at r3 (raw file):

			return
		case <-time.After(time.Second):
			if deadline, ok := ctx.Deadline(); ok {

why not just use ctx.Done to wait for the context deadline?

Copy link
Collaborator Author

@fqazi fqazi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me try scale testing with this stuff again. It's been on the back burner while I was digging into the other weird issues.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner)


pkg/ccl/testccl/workload/schemachange/schema_change_external_test.go line 94 at r3 (raw file):

Previously, ajwerner wrote…

remove these?

Done.


pkg/workload/schemachange/error_screening.go line 345 at r1 (raw file):

Previously, ajwerner wrote…

you should check this error

edit: you do something about this in a later commit

Done.

Yes moved it one commit back


pkg/workload/schemachange/error_screening.go line 347 at r1 (raw file):

Previously, ajwerner wrote…

do you need to check this error?

Done.


pkg/workload/schemachange/operation_generator.go line 2562 at r1 (raw file):

Previously, ajwerner wrote…

What is the TODO here?

Ugh left over from development before I fixed the logging stuff.


pkg/workload/schemachange/operation_generator.go line 3334 at r2 (raw file):

Previously, ajwerner wrote…

why export these? should these parameters live in the og struct?

Accidentally exported for now, but longer term maybe have something shared with CTAS.


pkg/workload/schemachange/watch_dog.go line 61 at r3 (raw file):

Previously, ajwerner wrote…

should we plumb some sort of logger into this object so we get context?

Done.

Sharing the logger with the worker now, which will allow us to know which connection is having trouble.


pkg/workload/schemachange/watch_dog.go line 85 at r3 (raw file):

Previously, ajwerner wrote…

time.Duration? otherwise, change the language to be about iterations?

Done.


pkg/workload/schemachange/watch_dog.go line 94 at r3 (raw file):

Previously, ajwerner wrote…

why not just use ctx.Done to wait for the context deadline?

Done.

Copy link
Collaborator Author

@fqazi fqazi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is at least one memory limit error that we are working around, which I don't think makes sense we don't have crazy scale here.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner)

@fqazi fqazi requested a review from ajwerner August 9, 2022 13:26
Copy link
Contributor

@ajwerner ajwerner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm: -- let's be vigilant and see if flakes come out of this

Reviewed 1 of 5 files at r8, 1 of 2 files at r9, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @ajwerner and @fqazi)


pkg/workload/schemachange/error_screening.go line 348 at r10 (raw file):

			newCols[colInfo.name], err = og.generateColumn(ctx, tx, colInfo, columnsToValues)
			if err != nil {
				_ = evalTxn.Rollback(ctx)

Do you need to check the Rollback error?

Copy link
Collaborator Author

@fqazi fqazi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @ajwerner)


pkg/workload/schemachange/error_screening.go line 348 at r10 (raw file):

Previously, ajwerner wrote…

Do you need to check the Rollback error?

Done.

fqazi added 4 commits August 9, 2022 23:00
To help support SELECT statements we are going to refactor,
the returned statements inside the schemachange operation
generator into structures.

Release note: None
Previously, the schema change workload did not
select from any of the tables that were created.
This was inadequate because we didn't fully validate
readability of tables during schema changes. To address,
this patch adds supports for select statements.

Release note: None
Previously, this workload had no watch dogs for connections,
to determine when a connection is hung. This was inadequate
because it made it hard to detect hangs within the workload.
To address this, this patch will add a connection watch
dog to detect if connection are making progress. If no
progress is detected, then a panic will be caused.

Release note: None
Previously, we would still try to validate constraints, if any of
the generated columns could be evaluated. To address this, this
patch will skip constraint validation in these scenarios.

Release note: None
Previously, the schemachange workload could generate
pretty large tables because of joins on CTAS statements.
These could take a fairly long time to CREATE. To address,
this patch will limit the maximum number of rows in
these tables.

Release note: None
@fqazi
Copy link
Collaborator Author

fqazi commented Aug 10, 2022

bors r+

@craig
Copy link
Contributor

craig bot commented Aug 10, 2022

Build succeeded:

@craig craig bot merged commit 65fe1a5 into cockroachdb:master Aug 10, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants