
storage: Make logging event listener async for DiskSlow #96021

Merged
merged 1 commit into cockroachdb:master from logging-event-listener-diskslow on Jan 27, 2023

Conversation

itsbilal
Member

The pebble logger could block if we're experiencing a slow / stalling disk. If the call to the pebble logger is synchronous from the EventListener passed into Pebble, it could end up slowing down Pebble's internal disk health checks as those rely on EventListener methods being quick to run.

This change updates the logging event listener to asynchronously call the logger on a DiskSlow event.

Related to #94373.

Epic: none

Release note: None.
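
For illustration, a minimal sketch of the wrapping pattern this change describes (not the actual pkg/storage/pebble.go diff; wrapDiskSlowAsync is a hypothetical helper, while pebble.EventListener and pebble.DiskSlowInfo are the real Pebble types):

```go
package storage

import "github.com/cockroachdb/pebble"

// wrapDiskSlowAsync rewrites an EventListener so its DiskSlow callback
// hands the potentially blocking logging work off to a goroutine and
// returns immediately, keeping Pebble's disk health checking responsive.
func wrapDiskSlowAsync(l pebble.EventListener) pebble.EventListener {
	oldDiskSlow := l.DiskSlow
	l.DiskSlow = func(info pebble.DiskSlowInfo) {
		// The wrapped handler may log, and logging can itself stall on a
		// slow disk, so run it asynchronously.
		go oldDiskSlow(info)
	}
	return l
}
```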

@itsbilal itsbilal requested a review from a team as a code owner January 26, 2023 19:51
@itsbilal itsbilal self-assigned this Jan 26, 2023
@cockroach-teamcity
Member

This change is Reviewable

@blathers-crl

blathers-crl bot commented Jan 26, 2023

It looks like your PR touches production code but doesn't add or edit any test code. Did you consider adding tests to your PR?

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.

Collaborator

@sumeerbhola left a comment

Reviewed 1 of 1 files at r1.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @itsbilal and @RaduBerinde)


pkg/storage/pebble.go line 995 at r1 (raw file):

	// up Pebble's internal disk health checking, and better obeys the
	// EventListener contract for not having any functions block or take a while to
	// run.

Can you add something like
// Creating goroutines is acceptable given their low cost, and the low write concurrency to Pebble's FS (Pebble compactions + flushes + SQL spilling to disk). If the maximum concurrency of DiskSlow events increases significantly in the future, we can improve the logic here by queueing up most of the logging work (except for the Fatalf call), and have it be done by a single goroutine.

Hmm, what if someone configured the system such that the logger had 0 write bandwidth (say on a different device) and Pebble's FS was slow enough to cause this event to fire for each write, but be lower than MaxSyncDuration. We'd keep generating new goroutines that would get stuck and eventually have a huge number of goroutines. Maybe this is not a real problem because if the logger is getting stuck all the other goroutines in CockroachDB that are logging would get stuck too. I don't really understand how our logging system is implemented -- does it buffer log statements with severity other than Severity_FATAL?
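
A minimal sketch of the single-goroutine queue alternative mentioned above, assuming made-up names (diskSlowLogQueue, notify); this is not part of this PR:

```go
package storage

import "github.com/cockroachdb/pebble"

// diskSlowLogQueue sketches the queued alternative: DiskSlow events are
// pushed onto a buffered channel and logged by a single long-lived
// goroutine, so at most one goroutine can ever be stuck in the logger.
type diskSlowLogQueue struct {
	events chan pebble.DiskSlowInfo
}

func newDiskSlowLogQueue(logFn func(pebble.DiskSlowInfo)) *diskSlowLogQueue {
	q := &diskSlowLogQueue{events: make(chan pebble.DiskSlowInfo, 16)}
	go func() {
		for info := range q.events {
			logFn(info) // only this goroutine ever blocks on logging.
		}
	}()
	return q
}

// notify enqueues a DiskSlow event without blocking; if the queue is full
// because the logger is stuck, the event is dropped instead of piling up
// goroutines.
func (q *diskSlowLogQueue) notify(info pebble.DiskSlowInfo) {
	select {
	case q.events <- info:
	default:
	}
}
```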


pkg/storage/pebble.go line 1004 at r1 (raw file):

		// Run oldDiskSlow asynchronously.
		go oldDiskSlow(info)
	}

In light of ExitTimeoutOnFatalLog being 20s, and to obey the contract that Pebble is specifying, should we put the DiskSlow returned by makeMetricEtcEventListener behind a goroutine too?

@itsbilal itsbilal force-pushed the logging-event-listener-diskslow branch from 7c91314 to 782a25d Compare January 26, 2023 21:16
Member Author

@itsbilal left a comment

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @RaduBerinde and @sumeerbhola)


pkg/storage/pebble.go line 995 at r1 (raw file):

Previously, sumeerbhola wrote…

Can you add something like
// Creating goroutines is acceptable given their low cost, and the low write concurrency to Pebble's FS (Pebble compactions + flushes + SQL spilling to disk). If the maximum concurrency of DiskSlow events increases significantly in the future, we can improve the logic here by queueing up most of the logging work (except for the Fatalf call), and have it be done by a single goroutine.

Hmm, what if someone configured the system such that the logger had 0 write bandwidth (say on a different device) and Pebble's FS was slow enough to cause this event to fire for each write, but be lower than MaxSyncDuration. We'd keep generating new goroutines that would get stuck and eventually have a huge number of goroutines. Maybe this is not a real problem because if the logger is getting stuck all the other goroutines in CockroachDB that are logging would get stuck too. I don't really understand how our logging system is implemented -- does it buffer log statements with severity other than Severity_FATAL?

From what I'm reading, yes it'd buffer log statements and their respective goroutines until their appropriate output "turn" has occurred. It's really only Fatals that get the "exit in 20s if nothing happens" timer.

Made the comment change.


pkg/storage/pebble.go line 1004 at r1 (raw file):

Previously, sumeerbhola wrote…

In light of ExitTimeoutOnFatalLog being 20s, and to obey the contract that Pebble is specifying, should we put the DiskSlow returned by makeMetricEtcEventListener behind a goroutine too?

Done. Think it's a good idea.

@itsbilal itsbilal force-pushed the logging-event-listener-diskslow branch 2 times, most recently from 0023170 to fe86de5 Compare January 26, 2023 21:36
Collaborator

@sumeerbhola left a comment

:lgtm:

Reviewed 2 of 2 files at r4, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @RaduBerinde)

@itsbilal
Member Author

TFTR!

bors r=sumeerbhola

Member

@RaduBerinde left a comment

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @itsbilal and @sumeerbhola)


pkg/storage/pebble.go line 1098 at r4 (raw file):

				// quickly.
				if fatalOnExceeded {
					go log.Fatalf(ctx, "file write stall detected: %s", info)

What is the logic behind this go? The reasoning above applies well to the case below but here we will crash, so not much point in returning control. Plus the stack trace inside Fatalf will be much less useful

@itsbilal
Member Author

pkg/storage/pebble.go line 1098 at r4 (raw file):

Previously, RaduBerinde wrote…

What is the logic behind this go? The reasoning above applies well to the case below but here we will crash, so not much point in returning control. Plus the stack trace inside Fatalf will be much less useful

The concern is that the ExitTimeoutOnFatalLog at 20s is a little too high. I guess at that point we are likely to crash no matter what, but it's best to try all avenues to crashing instead of waiting, say, 19s for one log to go through.

@RaduBerinde
Member

pkg/storage/pebble.go line 1098 at r4 (raw file):

Previously, itsbilal (Bilal Akhtar) wrote…

The concern is that the ExitTimeoutOnFatalLog at 20s is a little too high. I guess at that point we are likely to crash no matter what, but it's best to try all avenues to crashing instead of waiting, say, 19s for one log to go through.

Are you saying that if we return control here we might also crash through a different pathway that does not involve Fatalf?

@itsbilal
Member Author

pkg/storage/pebble.go line 1098 at r4 (raw file):

Previously, RaduBerinde wrote…

Are you saying that if we return control here we might also crash through a different pathway that does not involve Fatalf?

I realized that's somewhat unlikely after I wrote that comment, so yes I can probably revert this go.

bors cancel

@craig
Contributor

craig bot commented Jan 26, 2023

Canceled.

@itsbilal itsbilal force-pushed the logging-event-listener-diskslow branch from fe86de5 to 922238d Compare January 26, 2023 23:19
Member Author

@itsbilal left a comment

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @RaduBerinde and @sumeerbhola)


pkg/storage/pebble.go line 1098 at r4 (raw file):

Previously, itsbilal (Bilal Akhtar) wrote…

I realized that's somewhat unlikely after I wrote that comment, so yes I can probably revert this go.

bors cancel

Fixed, PTAL

Member

@RaduBerinde left a comment

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @sumeerbhola)


pkg/storage/pebble.go line 1098 at r4 (raw file):

Previously, itsbilal (Bilal Akhtar) wrote…

Fixed, PTAL

LGTM!

@itsbilal
Member Author

bors r=sumeerbhola

@craig
Contributor

craig bot commented Jan 27, 2023

Build failed (retrying...):

@craig
Contributor

craig bot commented Jan 27, 2023

Build succeeded:

@craig craig bot merged commit efb4481 into cockroachdb:master Jan 27, 2023
@blathers-crl

blathers-crl bot commented Jan 27, 2023

Encountered an error creating backports. Some common things that can go wrong:

  1. The backport branch might have already existed.
  2. There was a merge conflict.
  3. The backport branch contained merge commits.

You might need to create your backport manually using the backport tool.


error creating merge commit from 922238d to blathers/backport-release-22.2-96021: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict []

you may need to manually resolve merge conflicts with the backport tool.

Backport to branch 22.2.x failed. See errors above.


🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.

Collaborator

@sumeerbhola left a comment

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale)


pkg/storage/pebble.go line 1098 at r4 (raw file):

Plus the stack trace inside Fatalf will be much less useful

That is compelling enough for this change.
I was being a stickler about obeying the contract with Pebble that no IO should happen on the DiskSlow callback. It makes sense to make an exception when we are guaranteed to crash.

jbowens added a commit to jbowens/cockroach that referenced this pull request Feb 6, 2023
In cockroachdb#96021, handling of disk slow events was made asynchronous. This introduced a
race during Close where the logging goroutine could still be in flight.

Fix cockroachdb#96635.

Epic: None
Release note: None
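
The fix itself isn't quoted in this thread; as a hedged sketch of one common way to close such a race, the in-flight logging goroutine can be tracked with a sync.WaitGroup so Close waits for it (illustrative names, not the actual patch):

```go
package storage

import "sync"

// asyncLogger is an illustrative sketch: each asynchronous logging call is
// tracked with a WaitGroup so Close can wait for in-flight goroutines
// instead of racing with them.
type asyncLogger struct {
	wg sync.WaitGroup
}

func (a *asyncLogger) logAsync(f func()) {
	a.wg.Add(1)
	go func() {
		defer a.wg.Done()
		f()
	}()
}

// Close blocks until all previously started logging goroutines finish.
func (a *asyncLogger) Close() {
	a.wg.Wait()
}
```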
craig bot pushed a commit that referenced this pull request Feb 6, 2023
95245: sem/tree: avoid heap allocs during pretty-printing r=chengxiong-ruan,rafiss a=knz

Everywhere applicable, change
```go
  for _, xxx := range yyy {
      ctx.FormatNode(&xxx)
  }
```
to:
```go
  for i := range yyy {
      ctx.FormatNode(&yyy[i])
  }
```

Epic: None

96476: sql: validate function volatility r=chengxiong-ruan a=chengxiong-ruan

This commit adds a simple UDF volatility check by utilizing
the volatilities collected by the optbuilder when building a
statement. Errors are thrown if any statement is not as strict
as the target volatility. Stable function folding is turned off
when building statements in CREATE FUNCTION to catch a stable
function's volatility.

Release note (sql change): Previously, UDFs could be created with
any volatility regardless of whether the function body statements
contained an expression that would violate the target volatility. For
example, an immutable function might use `random()` in it. This
change adds validation to guarantee that all statements in the
function body are at least as strict as the expected UDF volatility.

Informs: #87699

96515: ui: fix time scale selection r=maryliag a=maryliag

When making a call to retrieve the list of fingerprints per index, it was supposed to use the rounded dates from the time picker; otherwise it wouldn't get the rounded hour that has the aggregated timestamp.

Epic: None
Release note: None

96651: storage: fix TestPebbleMetricEventListener r=itsbilal a=jbowens

In #96021, handling of disk slow events was made asynchronous. This introduced a race during Close where the logging goroutine could still be in flight.

Fix #96635.

Epic: None
Release note: None

Co-authored-by: Raphael 'kena' Poss <[email protected]>
Co-authored-by: Chengxiong Ruan <[email protected]>
Co-authored-by: maryliag <[email protected]>
Co-authored-by: Jackson Owens <[email protected]>