
storage: Make logging event listener async for DiskSlow #96021

Merged
merged 1 commit into cockroachdb:master from logging-event-listener-diskslow on Jan 27, 2023

Conversation

itsbilal
Member

The pebble logger could block if we're experiencing a slow / stalling disk. If the call to the pebble logger is synchronous from the EventListener passed into Pebble, it could end up slowing down Pebble's internal disk health checks as those rely on EventListener methods being quick to run.

This change updates the logging event listener to asynchronously call the logger on a DiskSlow event.

Related to #94373.

Epic: none

Release note: None.
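
For illustration, a minimal sketch of the wrapping pattern this change describes (not the actual pkg/storage/pebble.go diff; wrapDiskSlowAsync is a hypothetical helper, while pebble.EventListener and pebble.DiskSlowInfo are the real Pebble types):

```go
package storage

import "github.com/cockroachdb/pebble"

// wrapDiskSlowAsync rewrites an EventListener so its DiskSlow callback
// hands the potentially blocking logging work off to a goroutine and
// returns immediately, keeping Pebble's disk health checking responsive.
func wrapDiskSlowAsync(l pebble.EventListener) pebble.EventListener {
	oldDiskSlow := l.DiskSlow
	l.DiskSlow = func(info pebble.DiskSlowInfo) {
		// The wrapped handler may log, and logging can itself stall on a
		// slow disk, so run it asynchronously.
		go oldDiskSlow(info)
	}
	return l
}
```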

@itsbilal itsbilal requested a review from a team as a code owner January 26, 2023 19:51
@itsbilal itsbilal self-assigned this Jan 26, 2023
@cockroach-teamcity
Member

This change is Reviewable

@blathers-crl

blathers-crl bot commented Jan 26, 2023

It looks like your PR touches production code but doesn't add or edit any test code. Did you consider adding tests to your PR?

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.

Collaborator

@sumeerbhola left a comment

Reviewed 1 of 1 files at r1.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @itsbilal and @RaduBerinde)


pkg/storage/pebble.go line 995 at r1 (raw file):

	// up Pebble's internal disk health checking, and better obeys the
	// EventListener contract for not having any functions block or take a while to
	// run.

Can you add something like
// Creating goroutines is acceptable given their low cost, and the low write concurrency to Pebble's FS (Pebble compactions + flushes + SQL spilling to disk). If the maximum concurrency of DiskSlow events increases significantly in the future, we can improve the logic here by queueing up most of the logging work (except for the Fatalf call), and have it be done by a single goroutine.

Hmm, what if someone configured the system such that the logger had 0 write bandwidth (say on a different device) and Pebble's FS was slow enough to cause this event to fire for each write, but be lower than MaxSyncDuration. We'd keep generating new goroutines that would get stuck and eventually have a huge number of goroutines. Maybe this is not a real problem because if the logger is getting stuck all the other goroutines in CockroachDB that are logging would get stuck too. I don't really understand how our logging system is implemented -- does it buffer log statements with severity other than Severity_FATAL?
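
A minimal sketch of the single-goroutine queue alternative mentioned above, assuming made-up names (diskSlowLogQueue, notify); this is not part of this PR:

```go
package storage

import "github.com/cockroachdb/pebble"

// diskSlowLogQueue sketches the queued alternative: DiskSlow events are
// pushed onto a buffered channel and logged by a single long-lived
// goroutine, so at most one goroutine can ever be stuck in the logger.
type diskSlowLogQueue struct {
	events chan pebble.DiskSlowInfo
}

func newDiskSlowLogQueue(logFn func(pebble.DiskSlowInfo)) *diskSlowLogQueue {
	q := &diskSlowLogQueue{events: make(chan pebble.DiskSlowInfo, 16)}
	go func() {
		for info := range q.events {
			logFn(info) // only this goroutine ever blocks on logging.
		}
	}()
	return q
}

// notify enqueues a DiskSlow event without blocking; if the queue is full
// because the logger is stuck, the event is dropped instead of piling up
// goroutines.
func (q *diskSlowLogQueue) notify(info pebble.DiskSlowInfo) {
	select {
	case q.events <- info:
	default:
	}
}
```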


pkg/storage/pebble.go line 1004 at r1 (raw file):

		// Run oldDiskSlow asynchronously.
		go oldDiskSlow(info)
	}

In light of ExitTimeoutOnFatalLog being 20s, and to obey the contract that Pebble is specifying, should we put the DiskSlow returned by makeMetricEtcEventListener behind a goroutine too?

@itsbilal itsbilal force-pushed the logging-event-listener-diskslow branch from 7c91314 to 782a25d Compare January 26, 2023 21:16
Member Author

@itsbilal left a comment

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @RaduBerinde and @sumeerbhola)


pkg/storage/pebble.go line 995 at r1 (raw file):

Previously, sumeerbhola wrote…

Can you add something like
// Creating goroutines is acceptable given their low cost, and the low write concurrency to Pebble's FS (Pebble compactions + flushes + SQL spilling to disk). If the maximum concurrency of DiskSlow events increases significantly in the future, we can improve the logic here by queueing up most of the logging work (except for the Fatalf call), and have it be done by a single goroutine.

Hmm, what if someone configured the system such that the logger had 0 write bandwidth (say on a different device) and Pebble's FS was slow enough to cause this event to fire for each write, but be lower than MaxSyncDuration. We'd keep generating new goroutines that would get stuck and eventually have a huge number of goroutines. Maybe this is not a real problem because if the logger is getting stuck all the other goroutines in CockroachDB that are logging would get stuck too. I don't really understand how our logging system is implemented -- does it buffer log statements with severity other than Severity_FATAL?

From what I'm reading, yes it'd buffer log statements and their respective goroutines until their appropriate output "turn" has occurred. It's really only Fatals that get the "exit in 20s if nothing happens" timer.

Made the comment change.


pkg/storage/pebble.go line 1004 at r1 (raw file):

Previously, sumeerbhola wrote…

In light of ExitTimeoutOnFatalLog being 20s, and to obey the contract that Pebble is specifying, should we put the DiskSlow returned by makeMetricEtcEventListener behind a goroutine too?

Done. Think it's a good idea.

@itsbilal itsbilal force-pushed the logging-event-listener-diskslow branch 2 times, most recently from 0023170 to fe86de5 Compare January 26, 2023 21:36
Collaborator

@sumeerbhola left a comment

:lgtm:

Reviewed 2 of 2 files at r4, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @RaduBerinde)

@itsbilal
Member Author

TFTR!

bors r=sumeerbhola

Member

@RaduBerinde left a comment

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @itsbilal and @sumeerbhola)


pkg/storage/pebble.go line 1098 at r4 (raw file):

				// quickly.
				if fatalOnExceeded {
					go log.Fatalf(ctx, "file write stall detected: %s", info)

What is the logic behind this go? The reasoning above applies well to the case below but here we will crash, so not much point in returning control. Plus the stack trace inside Fatalf will be much less useful

@itsbilal
Member Author

pkg/storage/pebble.go line 1098 at r4 (raw file):

Previously, RaduBerinde wrote…

What is the logic behind this go? The reasoning above applies well to the case below but here we will crash, so not much point in returning control. Plus the stack trace inside Fatalf will be much less useful

The concern is that the ExitTimeoutOnFatalLog at 20s is a little too high. I guess at that point we are likely to crash no matter what, but it's best to try all avenues to crashing instead of waiting, say, 19s for one log to go through.

@RaduBerinde
Member

pkg/storage/pebble.go line 1098 at r4 (raw file):

Previously, itsbilal (Bilal Akhtar) wrote…

The concern is that the ExitTimeoutOnFatalLog at 20s is a little too high. I guess at that point we are likely to crash no matter what, but it's best to try all avenues to crashing instead of waiting, say, 19s for one log to go through.

Are you saying that if we return control here we might also crash through a different pathway that does not involve Fatalf?

@itsbilal
Member Author

pkg/storage/pebble.go line 1098 at r4 (raw file):

Previously, RaduBerinde wrote…

Are you saying that if we return control here we might also crash through a different pathway that does not involve Fatalf?

I realized that's somewhat unlikely after I wrote that comment, so yes I can probably revert this go.

bors cancel

@craig
Contributor

craig bot commented Jan 26, 2023

Canceled.

@itsbilal itsbilal force-pushed the logging-event-listener-diskslow branch from fe86de5 to 922238d Compare January 26, 2023 23:19
Member Author

@itsbilal left a comment

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @RaduBerinde and @sumeerbhola)


pkg/storage/pebble.go line 1098 at r4 (raw file):

Previously, itsbilal (Bilal Akhtar) wrote…

I realized that's somewhat unlikely after I wrote that comment, so yes I can probably revert this go.

bors cancel

Fixed, PTAL

Member

@RaduBerinde left a comment

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @sumeerbhola)


pkg/storage/pebble.go line 1098 at r4 (raw file):

Previously, itsbilal (Bilal Akhtar) wrote…

Fixed, PTAL

LGTM!

@itsbilal
Member Author

bors r=sumeerbhola

@craig
Contributor

craig bot commented Jan 27, 2023

Build failed (retrying...):

@craig
Contributor

craig bot commented Jan 27, 2023

Build succeeded:

@craig craig bot merged commit efb4481 into cockroachdb:master Jan 27, 2023
@blathers-crl

blathers-crl bot commented Jan 27, 2023

Encountered an error creating backports. Some common things that can go wrong:

  1. The backport branch might have already existed.
  2. There was a merge conflict.
  3. The backport branch contained merge commits.

You might need to create your backport manually using the backport tool.


error creating merge commit from 922238d to blathers/backport-release-22.2-96021: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict []

you may need to manually resolve merge conflicts with the backport tool.

Backport to branch 22.2.x failed. See errors above.


🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.

Collaborator

@sumeerbhola left a comment

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale)


pkg/storage/pebble.go line 1098 at r4 (raw file):

Plus the stack trace inside Fatalf will be much less useful

That is compelling enough for this change.
I was being a stickler about obeying the contract with Pebble that no IO should happen on the DiskSlow callback. It makes sense to make an exception when we are guaranteed to crash.

jbowens added a commit to jbowens/cockroach that referenced this pull request Feb 6, 2023
In cockroachdb#96021, handling of disk slow events was made asynchronous. This introduced a
race during Close where the logging goroutine could still be in flight.

Fix cockroachdb#96635.

Epic: None
Release note: None
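
The fix itself isn't quoted in this thread; as a hedged sketch of one common way to close such a race, the in-flight logging goroutine can be tracked with a sync.WaitGroup so Close waits for it (illustrative names, not the actual patch):

```go
package storage

import "sync"

// asyncLogger is an illustrative sketch: each asynchronous logging call is
// tracked with a WaitGroup so Close can wait for in-flight goroutines
// instead of racing with them.
type asyncLogger struct {
	wg sync.WaitGroup
}

func (a *asyncLogger) logAsync(f func()) {
	a.wg.Add(1)
	go func() {
		defer a.wg.Done()
		f()
	}()
}

// Close blocks until all previously started logging goroutines finish.
func (a *asyncLogger) Close() {
	a.wg.Wait()
}
```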
craig bot pushed a commit that referenced this pull request Feb 6, 2023
95245: sem/tree: avoid heap allocs during pretty-printing r=chengxiong-ruan,rafiss a=knz

Everywhere applicable, change
```go
  for _, xxx := range yyy {
      ctx.FormatNode(&xxx)
  }
```
to:
```go
  for i := range yyy {
      ctx.FormatNode(&yyy[i])
  }
```

Epic: None

96476: sql: validate function volatility r=chengxiong-ruan a=chengxiong-ruan

This commit adds a simple UDF volatility check by utilizing
the volatilities collected by the optbuilder when building a
statement. Errors are thrown if any statement is not as strict
as the target volatility. Stable function folding is turned off
when building statements in CREATE FUNCTION to catch a stable
function's volatility.

Release note (sql change): Previously, UDFs could be created with
any volatility regardless of whether the function body statements
contained an expression that would violate the target volatility. For
example, an immutable function might use `random()` in it. This
change adds validation to guarantee that all statements in the
function body are at least as strict as the expected UDF volatility.

Informs: #87699

96515: ui: fix time scale selection r=maryliag a=maryliag

When making a call to retrieve the list of fingerprints per index, it was supposed to use the rounded dates from the time picker; otherwise it wouldn't get the rounded hour that has the aggregated timestamp.

Epic: None
Release note: None

96651: storage: fix TestPebbleMetricEventListener r=itsbilal a=jbowens

In #96021, handling of disk slow events was made asynchronous. This introduced a race during Close where the logging goroutine could still be in flight.

Fix #96635.

Epic: None
Release note: None

Co-authored-by: Raphael 'kena' Poss <[email protected]>
Co-authored-by: Chengxiong Ruan <[email protected]>
Co-authored-by: maryliag <[email protected]>
Co-authored-by: Jackson Owens <[email protected]>