agent: Cache can return errors older than the last non-error result #4480
Labels: theme/connect, type/bug
Assume the first call to the Roots endpoint failed. That cache entry looks a little like this (pseudo code):
The next blocking call comes in with index=1 since we set that in the leaf client:
consul/agent/cache-types/connect_ca_leaf.go
Lines 171 to 175 in ec755b4
So the `Get` makes another request to roots. Let's assume this request succeeds.
The non-nil value is populated into the new cache entry, and the index updated:
consul/agent/cache/cache.go
Lines 346 to 364 in ec755b4
The cache Entry now looks like this:
Then we close the `Waiter` chan, unblocking any `Get` calls, which loop and hit:
consul/agent/cache/cache.go
Lines 211 to 231 in ec755b4
But since there is now actually a change in Index, we make it through OK, and the comment there explains why the Error value was left in place and is ignored in this case.
So far so good.
Then the next blocking `cache.Get` comes in with the same index, `123`. Assume it times out (because roots didn't change), which means the roots `Fetch` returns a good response (same as the one in cache) and the same Index.
consul/agent/cache/cache.go
Lines 346 to 364 in ec755b4
That is a no-op since it sets the same Value, Index, and Valid fields. So the cache entry still looks the same:
BUT we close the `Waiter` chan, which causes our `Get` to unblock again. This time we have the same index as the cached entry, so we fail to enter the happy path here:
consul/agent/cache/cache.go
Lines 211 to 212 in ec755b4
We should go back to blocking on another query if `timeoutCh` hasn't fired yet, but instead we hit:
consul/agent/cache/cache.go
Lines 233 to 240 in ec755b4
And so we return (early) the error from before the current valid cached result!