Priority index locking and logging of long operations #1847
Conversation
maxBuffered:  maxBuffered,
maxDelay:     maxDelay,
flushTrigger: make(chan struct{}, 1),
flushPending: false,
This is a large change in WriteQueue behavior to make maxBuffered more of a heuristic than an absolute max. This was done to take more control of the locking in the flush function.
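Roughly, the shape is something like the sketch below. This is not the actual implementation, just an illustration built around the fields from the hunk above (maxBuffered, maxDelay, flushTrigger, flushPending); Queue, flush, and the loop are assumed for the sake of the example.

```go
package memory

import (
	"sync"
	"time"
)

// Sketch: maxBuffered acts as a flush heuristic rather than a hard cap.
// Queue never flushes inline, it only signals the background loop, so the
// buffer can temporarily exceed maxBuffered while the loop waits for the
// (possibly slow) index write lock.
type WriteQueue struct {
	sync.Mutex
	buffer       []string // stand-in for the real queued entries
	flushPending bool
	maxBuffered  int
	maxDelay     time.Duration
	flushTrigger chan struct{} // capacity 1, so repeated signals coalesce
	shutdown     chan struct{}
}

func NewWriteQueue(maxBuffered int, maxDelay time.Duration) *WriteQueue {
	wq := &WriteQueue{
		maxBuffered:  maxBuffered,
		maxDelay:     maxDelay,
		flushTrigger: make(chan struct{}, 1),
		flushPending: false,
		shutdown:     make(chan struct{}),
	}
	go wq.loop()
	return wq
}

// Queue buffers an entry and nudges the flusher once maxBuffered is reached.
func (wq *WriteQueue) Queue(entry string) {
	wq.Lock()
	wq.buffer = append(wq.buffer, entry)
	trigger := len(wq.buffer) >= wq.maxBuffered && !wq.flushPending
	if trigger {
		wq.flushPending = true
	}
	wq.Unlock()
	if trigger {
		select {
		case wq.flushTrigger <- struct{}{}:
		default:
		}
	}
}

// flush drains whatever has accumulated, however much that is.
func (wq *WriteQueue) flush() {
	wq.Lock()
	batch := wq.buffer
	wq.buffer = nil
	wq.flushPending = false
	wq.Unlock()
	_ = batch // the real queue writes this to the index under its write lock
}

func (wq *WriteQueue) loop() {
	timer := time.NewTimer(wq.maxDelay)
	for {
		select {
		case <-wq.flushTrigger:
			wq.flush()
		case <-timer.C:
			wq.flush()
			timer.Reset(wq.maxDelay)
		case <-wq.shutdown:
			wq.flush()
			if !timer.Stop() {
				<-timer.C
			}
			return
		}
	}
}
```

The point being: Queue only signals the flusher and never flushes inline, so the buffer can grow past maxBuffered by however much arrives while the flusher is waiting for the index write lock.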
idx/memory/write_queue.go
			if !timer.Stop() {
				<-timer.C
			}
			return
		}
	}
}

func (wq *WriteQueue) isFlushPending() bool {
The intent was to use this in testing, or something similar.
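Something like this, presumably (building on the sketch above; whether flushPending is guarded exactly like this is an assumption):

```go
// isFlushPending reports whether a flush has been requested but not yet
// carried out. Useful for tests that want to wait for the queue to settle.
func (wq *WriteQueue) isFlushPending() bool {
	wq.Lock()
	defer wq.Unlock()
	return wq.flushPending
}
```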
// Wait for any pending writes
bc := BlockContext{lock: pm, preLockTime: time.Now()}
pm.lowPrioLock.RLock()
pm.lock.RLock()
I'm not entirely convinced we actually need to lock both of these here, if Lock is guaranteed to acquire both write locks.
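For reference, a rough sketch of the arrangement in question. The field names lowPrioLock and lock come from the diff; the method names and structure here are assumptions, not the actual PriorityRWMutex API.

```go
package memory

import "sync"

// PriorityRWMutex (sketch): two nested RWMutexes. Writers take both, slow
// ("low priority") readers take both read locks, and fast ("high priority")
// readers take only the inner one, so they are never stuck behind a writer
// that is itself stuck behind a slow reader.
type PriorityRWMutex struct {
	lowPrioLock sync.RWMutex // taken first by writers and by slow readers
	lock        sync.RWMutex // the only lock the fast read path touches
}

// Lock acquires both write locks, outer lock first.
func (pm *PriorityRWMutex) Lock() {
	pm.lowPrioLock.Lock()
	pm.lock.Lock()
}

func (pm *PriorityRWMutex) Unlock() {
	pm.lock.Unlock()
	pm.lowPrioLock.Unlock()
}

// RLockLowPrio is the slow-read path, mirroring the hunk above: it waits for
// any pending writer on the outer lock before touching the inner one.
func (pm *PriorityRWMutex) RLockLowPrio() {
	pm.lowPrioLock.RLock()
	pm.lock.RLock()
}

func (pm *PriorityRWMutex) RUnlockLowPrio() {
	pm.lock.RUnlock()
	pm.lowPrioLock.RUnlock()
}

// RLockHighPrio is the fast-read path: a writer that is still blocked on
// lowPrioLock has not taken pm.lock yet, so fast reads keep flowing.
func (pm *PriorityRWMutex) RLockHighPrio() {
	pm.lock.RLock()
}

func (pm *PriorityRWMutex) RUnlockHighPrio() {
	pm.lock.RUnlock()
}
```

If Lock really does hold lowPrioLock for the entire write, then a low-priority reader holding only lowPrioLock.RLock already excludes writers, which is presumably why the second RLock looks questionable.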
At a high level, the PriorityRWMutex seems to make sense. Here's my breakdown of master vs PR 1847.

Master: [1] it wouldn't be so bad if we could squeeze in some fast reads while the original fast read is still executing, but how many do we want to squeeze in? Where do we want to draw the line?

PR 1847 (new): so this tackles precisely the 2nd case.
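To make the second case concrete, here's a small standalone demo (plain sync.RWMutex, not project code) of the behavior on master: once a writer is queued behind a long read, new readers queue behind the writer too.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// Demonstrates the failure mode with a single RWMutex: a long read plus a
// waiting writer causes even cheap reads (think ingest-path Update calls)
// to block until the long read finishes and the writer completes.
func main() {
	var idx sync.RWMutex
	start := time.Now()

	// Long index query holding the read lock for 2 seconds.
	idx.RLock()
	go func() {
		time.Sleep(2 * time.Second)
		idx.RUnlock()
	}()

	// A write arrives and waits for the long read to finish. While it waits,
	// Go's RWMutex blocks any new readers to avoid writer starvation.
	go func() {
		idx.Lock()
		fmt.Printf("write acquired after %v\n", time.Since(start))
		idx.Unlock()
	}()

	time.Sleep(100 * time.Millisecond) // let the writer start waiting

	// A "fast" read (ingest path) now also has to wait ~2s.
	idx.RLock()
	fmt.Printf("fast read acquired after %v\n", time.Since(start))
	idx.RUnlock()
}
```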
Yes! Excellent breakdown.
So, what difference in behavior do you observe with this PR applied? Were you observing missing data due to the ingestion blocks? Or was it more along the lines of instances marking themselves as unavailable due to too-high priority (kafka lag)? I suspect that your request latencies are unchanged?
Yes, this mainly solves the issue of instances getting marked unavailable due to queries that took a few seconds to run and blocked ingest at the same time a write was waiting.

We did see a very small improvement in long-tail latencies, but I expect that was mostly due to fewer "bursts" of ingest after being blocked for a few seconds. Most of the latency improvement we got stemmed from enabling the write queue, so the write locks were acquired less frequently.
The PriorityRWMutex makes sense to me, though I still need to review the WriteQueue buffering/flushing changes.
High confidence. We've been running it in prod for about 2 months now.
The write queue change makes sense too. Since getting the index lock is slow, we keep ingesting into the write queue until we acquire it, and thus may breach maxBuffered by however many entries come in while we wait for the index lock. Seems like a very sensible tradeoff.
Still needs cleaning up and more testing, but this PR changes the locking behavior in the index so that long read queries can't block ingest (Update and AddOrUpdate calls).

There is a scenario (common in our case) where long index queries (say 5s) can block the acquisition of a write lock (like in the WriteQueue, or add if not using the WriteQueue), which in turn blocks new read locks (like in Update). This means long index read ops can actually block ingest.

This PR alleviates that problem by differentiating the possibly slow operations from the almost certainly fast operations, so that a write lock is still blocked on long index operations, but the fast operations should only get blocked by write locks (which we should make sure are fast).
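As a rough standalone illustration of that split (two plain sync.RWMutexes and some sleeps, not the PR's actual types): the write lock still waits behind the slow read, but a fast read only ever waits on the write lock itself.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// Reads are split into two classes: slow reads take lowPrio (and, in the
// real thing, the fast lock as well), writers take lowPrio then fast, and
// fast reads take only fast. A writer stuck behind a slow read has not
// taken the fast lock yet, so ingest-path reads are not blocked by it.
func main() {
	var lowPrio, fast sync.RWMutex
	start := time.Now()

	// Long index query: a low-priority read holding lowPrio for 2 seconds.
	lowPrio.RLock()
	go func() {
		time.Sleep(2 * time.Second)
		lowPrio.RUnlock()
	}()

	// Write: still has to wait for the slow read, same as before.
	go func() {
		lowPrio.Lock()
		fast.Lock()
		fmt.Printf("write acquired after %v\n", time.Since(start))
		fast.Unlock()
		lowPrio.Unlock()
	}()

	time.Sleep(100 * time.Millisecond) // let the writer start waiting

	// Fast read (ingest path): only touches the fast lock, which no writer
	// holds yet, so it returns almost immediately.
	fast.RLock()
	fmt.Printf("fast read acquired after %v\n", time.Since(start))
	fast.RUnlock()

	time.Sleep(2 * time.Second) // keep main alive long enough to see the write land
}
```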
It also adds logging around long lock waits/holds for non-high-priority locking. I decided not to track the timings for the high-priority locks, since we may be doing them many thousands of times per second.
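Presumably along these lines (BlockContext and preLockTime appear in the diff above; the threshold, helper methods, and log wording are made up for illustration):

```go
package memory

import (
	"log"
	"time"
)

// longLockThreshold is an assumed cutoff; anything slower gets logged.
const longLockThreshold = time.Second

// BlockContext (sketch) remembers when we started waiting for a low-priority
// lock, so we can log both how long we waited and how long we held it.
type BlockContext struct {
	lock         *PriorityRWMutex // the mutex being acquired (see sketch above)
	preLockTime  time.Time        // when we started waiting
	postLockTime time.Time        // when we actually got the lock
}

// Acquired is called right after the lock is obtained and logs long waits.
func (bc *BlockContext) Acquired(op string) {
	bc.postLockTime = time.Now()
	if wait := bc.postLockTime.Sub(bc.preLockTime); wait > longLockThreshold {
		log.Printf("memory-idx: %s waited %s for index lock", op, wait)
	}
}

// Released is called right after the lock is released and logs long holds.
func (bc *BlockContext) Released(op string) {
	if held := time.Since(bc.postLockTime); held > longLockThreshold {
		log.Printf("memory-idx: %s held index lock for %s", op, held)
	}
}
```

The high-priority (per-datapoint) path would skip this bookkeeping entirely, which is consistent with not wanting to time something that runs many thousands of times per second.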