diff --git a/docs/RFCS/20200810_leaseholder_cache_invalidation.md b/docs/RFCS/20200810_leaseholder_cache_invalidation.md
new file mode 100644
index 000000000000..7c5d44505ebf
--- /dev/null
+++ b/docs/RFCS/20200810_leaseholder_cache_invalidation.md
@@ -0,0 +1,625 @@
- Feature Name: Range leaseholder cache invalidation
- Status: draft
- Start Date: 2020-08-10
- Authors: knz, nathan, andrei
- RFC PR: [#52593](https://github.com/cockroachdb/cockroach/pull/52593)
- Cockroach Issue: [This github card](https://github.com/cockroachdb/cockroach/projects/19#card-36490737),
  [#50199](https://github.com/cockroachdb/cockroach/issues/50199)

# Summary

This RFC proposes an improvement to the range cache to reduce the
unavailability and performance irregularities that occur when nodes
are restarted, migrated, or simply become unavailable.

To understand the proposal, consider the following scenario:

- consider 3 nodes n1, n2, n3;
- n1 has a lease initially;
- n3 knows about n1's lease via its cache;
- then n1's lease is transferred to n2.

In that scenario, *today* node n3 does not get informed proactively
that the lease has been transferred to n2. Instead, upon the next
request for the range, n3 will first attempt to send the request
to n1. It then expects n1 to *redirect* the request to n2. If n1
happens to still be running, this works; however, if n1 is down
(whether planned or unplanned), the request fails to reach n1, and
n3 then starts trying every other node until it discovers the lease
on n2.

This stall and subsequent discovery process have been observed to
cause measurable latency blips and other unexpected behavior, which
this RFC aims to address.

The selected proposal combines two mechanisms, denoted as [solutions
4 & 6 below](#summary-of-solutions-and-choice): for background lease
transfers, it introduces low-traffic gossiping of new leaseholders;
and for node shutdowns/restarts it makes range caches use liveness
records.

# Motivation

## Scenarios

This RFC looks at three scenarios:

**Scenario A.** Nodes get shut down gracefully. In this case, there
is a possibility for the node being shut down to "announce" this to
the rest of the cluster.

Today the mechanism for graceful shutdown moves all the leases from
the node being shut down to other nodes. Both the stopped node and
the other nodes with the transferred leases have up-to-date
knowledge.

(These would be n1 and n2 in the example at the beginning.)

However, 3rd party nodes that don't receive leases do not get
informed of this process. So they may retain an outdated cache entry
that points to the node being shut down. As long as the node is
down (e.g. during a restart), any further queries to ranges
previously leased to that node will experience transient failures.

(This would be n3 in the example at the beginning.)

**Scenario B.** Nodes disappear from the cluster ungracefully --
either because the process crashes or because of a network partition.
In this case, the remainder of the cluster must discover this fact.
We consider the case where the remainder of the cluster votes for a
lease to appear elsewhere. The replicas involved in the lease vote
know about the new leaseholder.

(In the example at the beginning, we can assume replication factor 1;
then n2 could self-elect to be leaseholder for the range when n1 dies.)

However, 3rd party nodes will not be informed of the new lease
election. So they may retain an outdated cache entry, like above,
with similar consequences.
Even worse, if there is a network partition, the attempt to connect
to the now-unreachable node may incur a much higher delay (the TCP
timeout, 30s by default). Note that our existing RPC heartbeat
timeout, which drops unhealthy connections, only kicks in after a
connection has been established; for brand new connections, the
timeout can be longer.

(This would be n3 in the example at the beginning.)

**Scenario C.** Leases get transferred from one node to another
as part of the background rebalancing that is continuously happening
throughout the cluster, or in response to zone config changes.

(In the example at the beginning, a lease gets “naturally” transferred
from n1 to n2.)

In this case, when a lease gets transferred from one node to another,
the other nodes are not notified. So they may retain an outdated
cache entry, like above, with similar consequences.

(In the example at the beginning, n3 does not know of the transfer
between n1 and n2.)

**All three scenarios** have the following in common:

- a range lease is transferred from n1 to n2.
- n1 becomes unreachable.
- n3 then mistakenly tries
  to reach out to n1, which it believes is still the leaseholder.

In this context we want to:

1. reduce tail latencies by reducing mis-routing
2. help with the pathological behaviors in cases where a node is
   unresponsive and other nodes contact it even though it lost its leases
   (e.g. #37906).

# Background

## Gossip

The gossip subsystem propagates K/V pairs.

Each pair has a "time to live" (TTL) after which it disappears from
the gossip network.

Nodes can "subscribe" to particular key prefixes to receive a
notification (a code callback) every time a new value is received for
a particular key.

**All K/V data in gossip is copied to every node in the network.** (In
a lazy manner, but still: every node has a copy of all the data under
TTL.)

## Draining

A graceful node shutdown proceeds as follows (simplified):

1. stop accepting new leases.
2. stop SQL clients and DistSQL flows.
3. mark the node as draining in its liveness record.
4. transfer leases to other nodes.
5. terminate the process.

## Liveness updates

Remember from above that [draining nodes update their liveness
records](#draining). Liveness record updates are automatically
gossiped.

Separately, every time a node notices that another node has
disappeared (when it looks the node up and notices that there was no
recent heartbeat on the liveness record), it marks it as unavailable
(bumps its epoch). This also gets gossiped automatically.
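To make the subscription mechanism above concrete, here is a minimal
Go sketch of a component registering a callback for gossiped liveness
keys. The `Gossip`, `Liveness`, and `RegisterCallback` shapes below
are simplified stand-ins for illustration only, not the actual
signatures of the gossip or liveness packages.

```go
package main

import (
	"fmt"
	"strings"
)

// Simplified stand-ins for the real node ID and liveness record types.
type NodeID int32

type Liveness struct {
	NodeID   NodeID
	Epoch    int64
	Draining bool
}

// Gossip models only the subscription surface described above: callers
// register a callback for a key prefix and are invoked every time a
// matching key/value pair arrives on the gossip network.
type Gossip struct {
	callbacks map[string][]func(key string, l Liveness)
}

func (g *Gossip) RegisterCallback(prefix string, cb func(key string, l Liveness)) {
	if g.callbacks == nil {
		g.callbacks = make(map[string][]func(key string, l Liveness))
	}
	g.callbacks[prefix] = append(g.callbacks[prefix], cb)
}

// onUpdate is what the gossip layer would invoke when a liveness record
// is (re-)gossiped, e.g. because a node marked itself as draining.
func (g *Gossip) onUpdate(key string, l Liveness) {
	for prefix, cbs := range g.callbacks {
		if strings.HasPrefix(key, prefix) {
			for _, cb := range cbs {
				cb(key, l)
			}
		}
	}
}

func main() {
	var g Gossip
	// A range cache (or any other component) subscribes to liveness keys.
	g.RegisterCallback("liveness:", func(key string, l Liveness) {
		fmt.Printf("liveness update for n%d: epoch=%d draining=%t\n",
			l.NodeID, l.Epoch, l.Draining)
	})
	// Simulate n1 announcing, via its liveness record, that it is draining.
	g.onUpdate("liveness:1", Liveness{NodeID: 1, Epoch: 3, Draining: true})
}
```

Several of the solutions below are built directly on this kind of
callback, either for liveness keys or for new lease-related keys.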
## General CockroachDB design principles

“Don't pay for what you don't use”

We try as much as possible not to store data on a node unless it is
necessary for requests served by that node. In particular, we avoid
O(N) storage requirements on every node, where N is the total
number of ranges in the cluster.

“Caches don't have a minimum size requirement”

It should be possible to size caches up and down (e.g. down to obey
memory limit constraints) and still benefit from them. A cache that
only works if its size is larger than some number defined cluster-wide
is undesirable.

In the remainder of the RFC, the “cache” in this principle designates
the gossip K/V space as a whole. It means we don't want to mandate
that gossip grow to an arbitrarily large number of entries to make
the solutions work.

## Multi-tenant CockroachDB

Multi-tenant CockroachDB splits the system into 3 components:

1. SQL proxy
2. SQL tenant servers
3. KV servers

Each SQL tenant server hosts a `DistSender`, which is responsible for
routing requests to ranges and nodes. Each `DistSender` has a range cache.

However, only the KV servers are connected to each other and partake in
gossip. Each SQL tenant server is connected to exactly 1 KV server
and does not have access to gossip.

# Guide-level explanation

All the technical solutions that were considered here are invisible to
end-users of CockroachDB.

For the CockroachDB eng teams, see [the summary section](#summary-of-solutions-and-choice) below.

# Reference-level explanation

The following solutions were considered:

[1. Cluster-wide invalidation via RPC](#1-cluster-wide-invalidation-via-rpc)

[2. Gossiping lease transfers](#2-gossiping-lease-transfers)

[3. Gossiping lease acquisitions](#3-gossiping-lease-acquisitions)

[4. Liveness-based cache invalidation with poison](#4-liveness-based-cache-invalidation-with-poison)

[5. Liveness-based cache invalidation with bypass](#5-liveness-based-cache-invalidation-with-bypass)

[6. Best-effort gossiping of lease acquisitions](#6-best-effort-gossiping-of-lease-acquisitions)

All these solutions were proposed in the context of single-tenant CockroachDB.
For each of them, there is a trivial extension to make it work in the multi-tenant case, explained in a sub-section.

[Summary of solutions and choice](#summary-of-solutions-and-choice)

## 1. Cluster-wide invalidation via RPC

Outline:

- between steps 4 and 5 of the [draining process](#draining), the node
  shutting down calls a new RPC `InvalidateNode()` on every other node
  in the cluster to announce itself as now-unavailable.
- upon receiving `InvalidateNode()`, each other node removes any
  range entry for that node from its cache.
- upon the next request, the nodes go through
  the discovery process to learn the new leaseholder.

**Pros:**

- Addresses [scenarios A and C](#scenarios).
- Simple to explain/understand, simple to implement.
- It doesn't rely on gossip, so assuming the cluster is otherwise
  healthy it will reliably invalidate/update all the caches as a
  synchronous part of the draining process.

**Cons:**

- Does not address [scenario B](#scenarios).
- A cluster-wide RPC will not propagate to nodes that are briefly
  unavailable. When these nodes come back, their cache is still
  outdated. So even [scenarios A and C](#scenarios) are only partially
  covered.
- This still depends on the lease discovery mechanism. If
  the discovery re-attempts to use the node that just went down,
  that will block/fail. If the discovery takes a while, some
  client requests will block/fail.

  This discovery process could be skipped if the `InvalidateNode()`
  request included the new leaseholders (as in the "drain record"
  proposed for solution 2 below). OTOH, this would mean pushing a lot
  of data out of the draining node in a less network-efficient way
  than gossip (but more memory-efficient).

### Multi-tenant variant

- The KV servers receive the `InvalidateNode()` RPC from other KV servers.
- The KV servers *forward* `InvalidateNode()` RPCs to SQL tenants,
  where the range caches are stored.
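For illustration, here is a minimal sketch of what the receiving side
of the proposed `InvalidateNode()` RPC could look like. The RPC does
not exist today; the message shapes, field names, and the toy
`rangeCache` below are hypothetical stand-ins for the real
`DistSender` range cache.

```go
package main

import "sync"

// Illustrative identifiers; the real types live in roachpb.
type NodeID int32
type RangeID int64

// Hypothetical request/response messages for the proposed RPC.
type InvalidateNodeRequest struct {
	// NodeID is the node that is shutting down and should no longer
	// be routed to.
	NodeID NodeID
}
type InvalidateNodeResponse struct{}

// rangeCache is a toy stand-in for the DistSender's range cache,
// mapping each range to the node currently believed to hold its lease.
type rangeCache struct {
	mu     sync.Mutex
	leases map[RangeID]NodeID
}

// InvalidateNode drops every cached lease that points at the draining
// node. The next request for any of those ranges falls back to the
// usual lease discovery process.
func (c *rangeCache) InvalidateNode(req *InvalidateNodeRequest) *InvalidateNodeResponse {
	c.mu.Lock()
	defer c.mu.Unlock()
	for id, holder := range c.leases {
		if holder == req.NodeID {
			delete(c.leases, id)
		}
	}
	return &InvalidateNodeResponse{}
}

func main() {
	c := &rangeCache{leases: map[RangeID]NodeID{1: 1, 2: 1, 3: 2}}
	c.InvalidateNode(&InvalidateNodeRequest{NodeID: 1})
	// Ranges 1 and 2 are now cache misses; range 3 still points at n2.
}
```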
## 2. Gossiping lease transfers

Outline:

- at the end of step 4 of the [draining process](#draining), the node
  shutting down emits a gossip "drain record" keyed on its own node ID,
  with a payload containing the list of all range IDs it had
  a lease for, and for each of them the destination of the lease.
- upon receiving a drain record, every other node updates the
  entries in its lease cache to point to the new leaseholder.

**Pros:**

- Addresses [scenarios A and C](#scenarios).
- By using gossip, the data will eventually reach every other
  node, even those that were temporarily unavailable
  (i.e. this eliminates one drawback of solution 1 above).
- By informing the other nodes of the new leaseholder,
  this solution accelerates/removes the need for the
  lease discovery mechanism
  (i.e. this eliminates the other drawback of solution 1 above).

**Cons:**

- Does not address [scenario B](#scenarios).
- Each "drain record" in gossip has N entries, where N is the number
  of leases previously held by that node. This number can be
  arbitrarily large, up to the total number of ranges in the cluster.
  This violates the [design principles](#general-cockroachdb-design-principles).

  We can partially alleviate this cost by using the lease expiry
  time as the [gossip TTL](#gossip). This reduces
  the space complexity from O(#ranges) to O(#active ranges).

  Note from Ben:

  > Choosing a timeout here is tricky (there's no particular reason
  > for it to be related to the lease expiry time). Too long and you
  > increase the gossip memory requirements. Too short and it may
  > expire before getting propagated to all nodes and you're back to
  > dealing with stale caches. (and because it's all on one gossip
  > key, expiration is all or nothing. It may even be possible that
  > large gossip values are more likely to experience delays
  > propagating through the system).

- Due to the eventual consistency of gossip, there's a potentially
  long delay between the moment a node terminates ([step 5 of
  drain](#draining)) and the moment other nodes learn of the drain
  record. During that time, requests can be mis-routed.

  This last drawback can be alleviated by inserting a "sleep period"
  before step 5 of the draining process to let gossip propagate,
  calibrated to the standard cross-cluster gossip delivery latency
  (for reference, that's around 10-20s for up to 100 geo-distributed
  nodes in normal operation).

### Multi-tenant variant

- The KV servers receive the "drain records" over gossip from other KV servers.
- The KV servers *forward* the drain records to SQL tenants,
  where the range caches are stored.
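To make the "drain record" concrete, the sketch below shows one
possible shape for the gossiped payload and how a receiving node would
apply it to its range cache. The type and field names are illustrative
only, not a proposed wire format.

```go
package main

import "time"

// Illustrative identifiers; the real types live in roachpb.
type NodeID int32
type StoreID int32
type RangeID int64

// LeaseDestination records where a lease was transferred.
type LeaseDestination struct {
	Node  NodeID
	Store StoreID
}

// DrainRecord is a hypothetical gossip payload, keyed on the draining
// node's ID. It lists, for every lease the node held, where that lease
// went. Its size is the O(#leases) cost discussed above.
type DrainRecord struct {
	DrainingNode NodeID
	Transfers    map[RangeID]LeaseDestination
	// TTL bounds how long the record lives in gossip; deriving it from
	// the lease expiry time is one of the options discussed above.
	TTL time.Duration
}

// applyDrainRecord is the receiver side: every node rewrites the
// affected entries of its range cache to point at the new leaseholders.
func applyDrainRecord(cache map[RangeID]NodeID, rec DrainRecord) {
	for rangeID, dest := range rec.Transfers {
		cache[rangeID] = dest.Node
	}
}

func main() {
	cache := map[RangeID]NodeID{7: 1} // n3's stale view: range 7 on n1
	rec := DrainRecord{
		DrainingNode: 1,
		Transfers:    map[RangeID]LeaseDestination{7: {Node: 2, Store: 2}},
		TTL:          30 * time.Second, // illustrative value
	}
	applyDrainRecord(cache, rec)
	// cache[7] now points at n2.
}
```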
## 3. Gossiping lease acquisitions

Outline:

- each node acquiring a lease (either because the lease was handed to
  it during another node's drain, or because it was elected
  leaseholder when a node went AWOL) gossips this fact in a "lease
  record", keyed by the range ID, with the lease as payload.
- upon receiving "lease records", other nodes update their cache
  with the gossiped lease, if it's newer than what they know.

This approach was prototyped here: https://github.com/cockroachdb/cockroach/pull/52572

**Pros:**

- Addresses all three [scenarios A, B and C](#scenarios).
- By using gossip, the data will eventually reach every other
  node, even those that were temporarily unavailable
  (i.e. this eliminates one drawback of solution 1 above).
- By informing the other nodes of the new leaseholder,
  this solution accelerates/removes the need for the
  lease discovery mechanism
  (i.e. this eliminates the other drawback of solution 1 above).

**Cons:**

- There is one "lease record" in gossip per range.
  This violates the [design principles](#general-cockroachdb-design-principles).

  We can partially alleviate this cost by using the lease expiry
  time as the [gossip TTL](#gossip). This reduces
  the space complexity from O(#ranges) to O(#active ranges).

  Remark from Andrei:

  > Besides the TTL, the other big possible mitigation here is various
  > rate limits - like a rate limit per node dictating how many
  > acquisitions per second it can add to gossip, and maybe also one
  > in gossip about how many of these updates it forwards every second
  > (like, once that is reached, a node could pretend that the TTL is
  > hit for everything else it receives). I don't know how feasible
  > the latter is.

  Remark from Ben:

  > There is some room for cleverness here — we could choose not to
  > gossip *every* lease acquisition (and similarly, we could use a
  > rather short TTL and not worry if some records get dropped before
  > propagating). Maybe have a rate limit and prioritize those ranges
  > with higher traffic, for example.
  >
  > With a short enough TTL, we can think of the gossip costs here
  > primarily in terms of the communications cost instead of the
  > memory size.

  NB: the above remarks do not take into account scenarios A and B,
  where a large number of leases get transferred at once and it's
  likely important that every other node takes note of all of it.

- Like above, there's a delay between the moment a node terminates
  ([step 5 of drain](#draining)) and the moment other nodes learn of
  the new lease records. During that time, requests can be mis-routed.

  Like above, this last drawback can be alleviated by inserting a
  "sleep period" before step 5 of the draining process.

### Multi-tenant variant

- The KV servers receive the "lease records" over gossip from other KV servers.
- The KV servers *forward* the lease records to SQL tenants,
  where the range caches are stored.
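The "if it's newer" check in the outline matters because gossip can
deliver records late or out of order. The sketch below illustrates one
way a receiving node could guard its cache, using a lease sequence
number as the ordering criterion; the types are simplified stand-ins
and real leases carry more state than this.

```go
package main

// Illustrative identifiers; real leases carry more state than this.
type NodeID int32
type RangeID int64

// LeaseRecord is a hypothetical per-range gossip payload for this
// solution: keyed by range ID, carrying the current lease.
type LeaseRecord struct {
	RangeID  RangeID
	Holder   NodeID
	Sequence int64 // increases with every lease change
}

type cachedLease struct {
	Holder   NodeID
	Sequence int64
}

// applyLeaseRecord updates the cache only if the gossiped lease is
// newer than the cached one; stale or reordered records are ignored.
func applyLeaseRecord(cache map[RangeID]cachedLease, rec LeaseRecord) {
	if cur, ok := cache[rec.RangeID]; ok && cur.Sequence >= rec.Sequence {
		return // cached entry is at least as recent, keep it
	}
	cache[rec.RangeID] = cachedLease{Holder: rec.Holder, Sequence: rec.Sequence}
}

func main() {
	cache := map[RangeID]cachedLease{42: {Holder: 1, Sequence: 5}}
	applyLeaseRecord(cache, LeaseRecord{RangeID: 42, Holder: 2, Sequence: 6}) // applied
	applyLeaseRecord(cache, LeaseRecord{RangeID: 42, Holder: 1, Sequence: 4}) // ignored
}
```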
## 4. Liveness-based cache invalidation with poison

Outline:

- [liveness updates](#liveness-updates) are gossiped as usual.
- upon receiving a liveness update that indicates that another node is
  not live, each node updates its range cache as follows:

  - invalidate all leases pointing to the now-unavailable node, i.e.
    remove the lease from the cache.
  - mark that store/node ID as "not preferred" for the subsequent lease
    discovery. This *“poisons”* that store/node ID.

- upon a cache miss, the range cache will still do lease discovery,
  but it will skip over "not preferred" nodes/stores in the first
  round.

  Only if a lease is not discovered during the first round are
  poisoned nodes added to the lookup in the second and subsequent
  rounds.

**Pros:**

- Addresses both [scenarios A and B](#scenarios).
- By using gossip, the data will eventually reach every other
  node, even those that were temporarily unavailable
  (i.e. this eliminates one drawback of solution 1 above).
- Does not add new data in gossip. Does not violate the [design
  principles](#general-cockroachdb-design-principles).

**Cons:**

- Does not address [scenario C](#scenarios): if there are no node
  liveness changes, other nodes are not informed of lease transfers.
- Added complexity in the range cache to mark node/store IDs as
  poisoned.
- Added complexity in the lease discovery process.
- There is still a need for lease discovery; there is a chance
  that the first replica contacted during discovery
  will take a while to respond.
- Like above, there's a delay between the moment a node terminates
  ([step 5 of drain](#draining)) and the moment other nodes learn of
  its changed liveness. During that time, requests can be mis-routed.

  Like above, this last drawback can be alleviated by inserting a
  "sleep period" before step 5 of the draining process.

### Multi-tenant variant

- The KV servers receive the liveness updates over gossip from other
  KV servers.
- The KV servers *forward* the liveness updates to SQL tenants,
  where the range caches are stored.
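For concreteness, here is a minimal sketch of the "poisoning" behavior
described above. The names are hypothetical, and clearing poison (e.g.
when the node becomes live again) is deliberately omitted; only the
two-round ordering of lease discovery is illustrated.

```go
package main

// NodeID is an illustrative stand-in for the real node ID type.
type NodeID int32

// replicaOrderer sorts the candidate replicas for lease discovery so
// that replicas on poisoned (recently non-live) nodes are tried last.
type replicaOrderer struct {
	poisoned map[NodeID]bool
}

// onLivenessUpdate poisons a node when gossip reports it as not live.
func (o *replicaOrderer) onLivenessUpdate(node NodeID, live bool) {
	if o.poisoned == nil {
		o.poisoned = make(map[NodeID]bool)
	}
	o.poisoned[node] = !live
}

// order returns the replicas in the order discovery should contact
// them: non-poisoned nodes first (the first round), poisoned nodes
// appended at the end (the second and subsequent rounds).
func (o *replicaOrderer) order(replicas []NodeID) []NodeID {
	var preferred, deprioritized []NodeID
	for _, r := range replicas {
		if o.poisoned[r] {
			deprioritized = append(deprioritized, r)
		} else {
			preferred = append(preferred, r)
		}
	}
	return append(preferred, deprioritized...)
}

func main() {
	var o replicaOrderer
	o.onLivenessUpdate(1, false)   // gossip said n1 is not live
	_ = o.order([]NodeID{1, 2, 3}) // -> [2 3 1]: n1 is only tried last
}
```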
## 5. Liveness-based cache invalidation with bypass

Outline:

- [liveness updates](#liveness-updates) are gossiped as usual.
- nothing particular happens upon receiving a liveness update
  (other than updating the liveness cache as usual).
- upon a cache lookup, the range cache consults liveness.

  If the entry in the cache points to a node for which liveness
  currently says the node is not live, the cache entry is ignored (is
  *“bypassed”*), and another live replica is tried. If that replica
  has the lease, then it will serve the request; otherwise it will
  redirect to the updated lease and that will also refresh the cache.

**Pros:**

- Addresses both [scenarios A and B](#scenarios).
- By using gossip, the data will eventually reach every other
  node, even those that were temporarily unavailable
  (i.e. this eliminates one drawback of solution 1 above).
- Does not add new data in gossip. Does not violate the [design
  principles](#general-cockroachdb-design-principles).
- No new data stored in the range cache to mark nodes
  as poisoned.

**Cons:**

- Does not address [scenario C](#scenarios): if there are no node
  liveness changes, other nodes are not informed of lease transfers.
- Some complexity in cache lookups, to consult liveness
  upon both cache hits and misses.
- Added complexity during the lease discovery process.
- Like above, there's a delay between the moment a node terminates
  ([step 5 of drain](#draining)) and the moment other nodes learn of
  its changed liveness. During that time, requests can be mis-routed.

  Like above, this last drawback can be alleviated by inserting a
  "sleep period" before step 5 of the draining process.

### Multi-tenant variant

- The KV servers receive the liveness updates over gossip from other
  KV servers.
- The KV servers *forward* the liveness updates to SQL tenants,
  where the range caches are stored.

## 6. Best-effort gossiping of lease acquisitions

Outline:

- same principle as [solution 3
  above](#3-gossiping-lease-acquisitions), except that the TTL on
  gossip entries is set to a short value.

**Pros:**

- Addresses [scenario C](#scenarios).
- Partially addresses [scenario A](#scenarios).
- By using gossip, the data will eventually reach every other
  node, even those that were temporarily unavailable.
- By informing the other nodes of the new leaseholder,
  this solution accelerates/removes the need for the
  lease discovery mechanism.

**Cons:**

- Does not fully address [scenario A](#scenarios): the gossip records
  may not propagate throughout the entire cluster, and during a node
  shutdown (when many leases are transferred in a short time) some
  lease transfer notifications may be lost.
- Might cause a gossip storm during node restarts.

  Remark from Ben about this:

  > There is some room for cleverness here — we could choose not to
  > gossip *every* lease acquisition (and similarly, we could use a
  > rather short TTL and not worry if some records get dropped before
  > propagating). Maybe have a rate limit and prioritize those ranges
  > with higher traffic, for example.
  >
  > With a short enough TTL, we can think of the gossip costs here
  > primarily in terms of the communications cost instead of the
  > memory size.

- Like above, there's a delay between the moment a node terminates
  ([step 5 of drain](#draining)) and the moment other nodes learn of
  the new lease records. During that time, requests can be mis-routed.

  Like above, this last drawback can be alleviated by inserting a
  "sleep period" before step 5 of the draining process.
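One way to bound the gossip traffic, along the lines of Ben's remark
above, is a small per-node budget on how many lease acquisitions are
announced per second; anything over the budget is simply not gossiped
and the cache falls back to ordinary discovery. The sketch below is
one possible shape, with a purely illustrative budget value.

```go
package main

import (
	"sync"
	"time"
)

// leaseGossipLimiter caps how many lease acquisitions a node announces
// through gossip per second. Acquisitions over the budget are simply
// not gossiped, which is safe because the range cache still falls back
// to the ordinary lease discovery mechanism.
type leaseGossipLimiter struct {
	mu          sync.Mutex
	perSecond   int
	windowStart time.Time
	sent        int
}

// allow reports whether one more acquisition may be gossiped right now.
func (l *leaseGossipLimiter) allow(now time.Time) bool {
	l.mu.Lock()
	defer l.mu.Unlock()
	if now.Sub(l.windowStart) >= time.Second {
		l.windowStart, l.sent = now, 0
	}
	if l.sent >= l.perSecond {
		return false // over budget: skip announcing this acquisition
	}
	l.sent++
	return true
}

func main() {
	limiter := &leaseGossipLimiter{perSecond: 100} // illustrative budget
	if limiter.allow(time.Now()) {
		// Add a short-TTL lease record for the acquired range to gossip.
	}
}
```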
## Summary of solutions and choice

| Solution | Addresses [scenario A](#scenarios) (restarts) | Addresses [scenario B](#scenarios) (node down) | Addresses [scenario C](#scenarios) (rebalance) | New data in gossip | New logic in cache |
|----------|-----------------------------------------------|------------------------------------------------|------------------------------------------------|--------------------|--------------------|
| [1](#1-cluster-wide-invalidation-via-rpc) (cluster-wide RPC) | partially | no | partially | N/A | no |
| [2](#2-gossiping-lease-transfers) (gossip transfers) | yes | no | yes | O(#leases) | no |
| [3](#3-gossiping-lease-acquisitions) (gossip *all* acquisitions) | yes | yes | yes | O(#ranges) | no |
| [4](#4-liveness-based-cache-invalidation-with-poison) (liveness poison) | yes | yes | no | N/A | Poisoned node/store IDs |
| [5](#5-liveness-based-cache-invalidation-with-bypass) (liveness bypass) | yes | yes | no | N/A | Liveness bypass in lookups |
| [6](#6-best-effort-gossiping-of-lease-acquisitions) (gossip *some* acquisitions) | partially | no | yes | O(#ranges) but much lower than solution 3 | no |

Solution 1 is undesirable because it doesn't fully address [scenarios
A & C](#scenarios): the synchronous RPC will fail to update
nodes that are temporarily unreachable.

Solution 2 is undesirable because it does not address [scenario
B](#scenarios) and has an irreducibly high gossip storage footprint
that goes against the [design
principles](#general-cockroachdb-design-principles).

Solution 3 is desirable because it addresses [all 3
scenarios](#scenarios). However, it is worrisome because of the gossip
storage requirement, which goes against the [design
principles](#general-cockroachdb-design-principles).

Solutions 4 & 5 satisfy both [scenarios A & B](#scenarios) without
incurring extra gossip cost. However, they may require one additional
cross-node hop for every range with outdated cache information, to
discover the new lease. In contrast, solutions 2, 3 & 6 pre-populate
the caches with the location of the new lease. Additionally, solutions
4 & 5 do not satisfy scenario C, whereas solutions 3 & 6 do.

Solution 5 is simpler to implement than solution 4 but may incur lock
contention around the liveness cache in Go.

Comment from Andrei, about the relative desirability of solutions 3 (perhaps 6)
and 4+5:

> So, whereas I think Nathan sees more gossipping to
> be a risk to stability, I see it as aiming to help stability. It
> sucks for a node to lose its leases in whatever ways (e.g. say it
> shed them away through load balancing, or it's so overloaded that
> it's failing to heartbeat its node liveness) but then that has no
> effect because nobody can find out about where the new leases
> are. For each individual situation where something like this
> happens there'll be something else to fix, but still I see people
> being up to date with lease locations as a stabilizing factor.

Other comment from Andrei, about the future utility of solutions 3 & 6:

> I think we'd benefit from caches being kept up to date under all
> conditions (well, maybe not under extreme conditions where there's
> too many updates to propagate), and I'm also interested in
> expanding a gossiping mechanism that I hope we introduce here to
> also deal with range splits/merges. Range descriptor changes are
> even more disruptive to traffic than lease changes, so I want to be
> proactive about them too.

Comment from Ben, about combining solutions:

> We should make use of the gossiped liveness information [in any
> case] to either proactively invalidate the lease cache or bypass
> entries pointing to down nodes.
>
> [i.e. apply solutions 4&5 regardless of whichever solution we
> want for scenario C]
>
> It's not out of the question to gossip lease acquisitions with a
> short TTL as a best-effort supplement to pre-populate caches. It
> sounds potentially expensive so we'd need to do a lot of testing,
> but I think if this were available as an option I'd turn it on.
>
> [i.e. apply solution 6 for scenario C, with the understanding that
> scenarios A and B are covered by solutions 4&5]

### Selection

Taking this feedback into account, the RFC proposes to apply solution 4
or 5 to address scenarios A & B (the most important for v20.2), and to
combine it with solution 6 to address scenario C.

## Unresolved questions

TBD