Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

persistedsqlstats: add AOST clause to statement statistics size check #107549

Merged

Conversation

THardy98
Copy link

@THardy98 THardy98 commented Jul 25, 2023

Epic: None

This change as an AOST clause to the statement statistics size check.
This helps mitigate contention on the system.statement_statistics
table which has caused the sql stats compaction job to fail.

Release note: None

@THardy98 THardy98 requested a review from a team July 25, 2023 17:35
@THardy98 THardy98 marked this pull request as draft July 25, 2023 17:35
@cockroach-teamcity
Copy link
Member

This change is Reviewable

@THardy98 THardy98 force-pushed the update_fetch_stmt_count_as_follower_read branch 2 times, most recently from f58ef30 to 47adc69 Compare July 25, 2023 19:16
@THardy98 THardy98 changed the title wip persistedsqlstats: add AOST clause to statement statistics size check Jul 25, 2023
@THardy98 THardy98 marked this pull request as ready for review July 25, 2023 19:17
Copy link
Contributor

@maryliag maryliag left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @THardy98)


pkg/sql/sqlstats/persistedsqlstats/flush.go line 78 at r1 (raw file):

	if err != nil {
		log.Errorf(ctx, "encountered an error at flush, checking for statement statistics size limit: %v", err)
		fmt.Println("err", err)

I assume you added this print for debugging, so you can remove it

Copy link
Contributor

@maryliag maryliag left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @THardy98)


-- commits line 8 at r1:
can you add a release note? you don't need to go into details about the AOST, but add that this is fixing the retry error on compaction job

Copy link
Contributor

@j82w j82w left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @maryliag and @THardy98)


pkg/sql/sqlstats/persistedsqlstats/flush.go line 75 at r1 (raw file):

	aggregatedTs := s.ComputeAggregatedTs()

	limitReached, err := s.StmtsLimitSizeReached(ctx)

nit: add a comment explaining that only stmt count needs to be checked because there should always be more stmts than txns.


pkg/sql/sqlstats/persistedsqlstats/flush.go line 79 at r1 (raw file):

		log.Errorf(ctx, "encountered an error at flush, checking for statement statistics size limit: %v", err)
		fmt.Println("err", err)
	} else if limitReached {

Before if an error occurred in the StmtsLimitSizeReached it returned false and still allowed the flush. Let's keep the same behavior.

defer s.Stopper().Stop(context.Background())
sqlConn := sqlutils.MakeSQLRunner(conn)
sqlConn.Exec(t, `INSERT INTO system.users VALUES ('node', NULL, true, 3); GRANT node TO root`)
waitForFollowerReadTimestamp(t, sqlConn)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why can't you just set the testing knob to make the AOST time a more recent value? We usually use '-1us'.

@THardy98 THardy98 force-pushed the update_fetch_stmt_count_as_follower_read branch from 47adc69 to 5dc9995 Compare July 25, 2023 19:54
This change as an AOST clause to the statement statistics size check.
This helps mitigate contention on the `system.statement_statistics`
table which has caused the sql stats compaction job to fail.

Release note (bug fix): this change reduces contention on the
`system.statement_statistics` table which has caused the SQL statistics
compaction job to fail
@THardy98 THardy98 force-pushed the update_fetch_stmt_count_as_follower_read branch from 5dc9995 to cead5f7 Compare July 25, 2023 19:59
Copy link
Author

@THardy98 THardy98 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @j82w, @maryliag, and @xinhaoz)


-- commits line 8 at r1:

Previously, maryliag (Marylia Gutierrez) wrote…

can you add a release note? you don't need to go into details about the AOST, but add that this is fixing the retry error on compaction job

Added one, labelled it as a bug fix.


pkg/sql/sqlstats/persistedsqlstats/flush.go line 75 at r1 (raw file):

Previously, j82w (Jake) wrote…

nit: add a comment explaining that only stmt count needs to be checked because there should always be more stmts than txns.

Done.


pkg/sql/sqlstats/persistedsqlstats/flush.go line 78 at r1 (raw file):

Previously, maryliag (Marylia Gutierrez) wrote…

I assume you added this print for debugging, so you can remove it

Removed.


pkg/sql/sqlstats/persistedsqlstats/flush.go line 79 at r1 (raw file):

Previously, j82w (Jake) wrote…

Before if an error occurred in the StmtsLimitSizeReached it returned false and still allowed the flush. Let's keep the same behavior.

Removed the conditional. If an error occurs, we just log it.


pkg/sql/sqlstats/persistedsqlstats/flush_test.go line 539 at r1 (raw file):

Previously, xinhaoz (Xin Hao Zhang) wrote…

Why can't you just set the testing knob to make the AOST time a more recent value? We usually use '-1us'.

Considered it but after some offline discussion (here) opted to wait for the follower read timestamp to better replicate the code flow.

Copy link
Contributor

@j82w j82w left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @maryliag and @xinhaoz)

@j82w
Copy link
Contributor

j82w commented Jul 25, 2023

pkg/sql/sqlstats/persistedsqlstats/flush_test.go line 576 at r2 (raw file):

	// (meaning that the limit will be reached) and no error. We need SucceedsSoon here for the follower
	// read timestamp to catch up enough for this state to be reached.
	testutils.SucceedsSoon(t, func() error {

How could this ever succeed if the statement above took an update lock? It shouldn't be possible to add rows right?

Copy link
Author

@THardy98 THardy98 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @j82w, @maryliag, and @xinhaoz)


pkg/sql/sqlstats/persistedsqlstats/flush_test.go line 576 at r2 (raw file):

Previously, j82w (Jake) wrote…

How could this ever succeed if the statement above took an update lock? It shouldn't be possible to add rows right?

It doesn't fail initially because there aren't enough rows in the table, it fails initially because the number of rows AOST follower_read_timestamp is less than the actual number of rows needed for limitReached == true. We SucceedsSoon to keep checking the rows until AOST catches up to see them all

@maryliag
Copy link
Contributor

bors r+

@craig
Copy link
Contributor

craig bot commented Jul 26, 2023

Build succeeded:

@craig craig bot merged commit 68985a2 into cockroachdb:master Jul 26, 2023
@blathers-crl
Copy link

blathers-crl bot commented Jul 26, 2023

Encountered an error creating backports. Some common things that can go wrong:

  1. The backport branch might have already existed.
  2. There was a merge conflict.
  3. The backport branch contained merge commits.

You might need to create your backport manually using the backport tool.


error creating merge commit from cead5f7 to blathers/backport-release-22.2-107549: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict []

you may need to manually resolve merge conflicts with the backport tool.

Backport to branch 22.2.x failed. See errors above.


🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants