proposal: Added timely vote credits proposal. (#28162)
bji authored Oct 20, 2022
1 parent f207af7 commit a2e1228
Showing 1 changed file with 190 additions and 0 deletions.
190 changes: 190 additions & 0 deletions docs/src/proposals/timely-vote-credits.md
@@ -0,0 +1,190 @@
---
title: Timely Vote Credits
---

## Timely Vote Credits

This design describes a modification to the method that is used to calculate
vote credits earned by validator votes.

Vote credits are the accounting method used to determine what percentage of
inflation rewards a validator earns on behalf of its stakers. Currently, when
a slot that a validator has previously voted on is "rooted", it earns 1 vote
credit. A "rooted" slot is one which has received full commitment by the
validator (i.e. has been finalized).

One problem with this simple accounting method is that it awards one credit
regardless of how "old" the voted-on slot was at the time the vote was cast.
This means that a validator can delay its voting for many slots in
order to survey forks and wait to make votes that are less likely to be
expired, without incurring any penalty for doing so. This is not just a
theoretical concern: there are known and documented instances of validators
using this technique to significantly delay their voting while earning more
credits as a result.


### Proposed Change

The proposal is to award a variable number of vote credits per voted on slot,
with more credits being given for votes that have "less latency" than votes
that have "more latency".

In this context, "latency" is the number of slots in between the slot that is
being voted on and the slot in which the vote has landed. Because a slot
cannot be voted on until after it has been completed, the minimum possible
latency is 1, which would occur when a validator voted as quickly as possible,
transmitting its vote on that slot in time for it to be included in the very
next slot.

Credits awarded would become a function of this latency, with lower latencies
awarding more credits. This will discourage intentional "lagging": a delayed
vote necessarily lands in a later slot, and so earns fewer credits than it
would have earned had it been transmitted immediately and landed in an
earlier slot.

### Grace Period

If landing a vote with 1 slot latency awarded more credit than landing that
same vote in 2 slots latency, then validators who could land votes
consistently within 1 slot would have a credits earning advantage over those
who could not. Part of the latency when transmitting votes is unavoidable as
it's a function of geographical distance between the sender and receiver of
the vote. The Solana network is spread around the world but it is not evenly
distributed over the whole planet; there are some locations which are, on
average, more distant from the network than others are.

It would likely be harmful to the network to encourage tight geographical
concentration - if, for example, the only way to achieve 1 slot latency was to
be within a specific country - then a very strict credit rewards schedule
would encourage all validators to move to the same country in order to
maximize their credit earnings.

For this reason, the credits reward schedule should have a built-in "grace
period" that gives all validators a "reasonable" amount of time to land their
votes. This will reduce the credits earning disadvantage that comes from
being more distant from the network. A balance needs to be struck between the
strictest rewards schedule, which most strongly discourages intentional
lagging, and more lenient rewards schedules, which improve credit earnings
for distant validators who are not artificially lagging.

Historical voting data has been analyzed over many epochs and the data
suggests that the smallest grace period that allows for very minimal impact on
well behaved distant validators is 3 slots, which means that all slots voted
on within 3 slots will award maximum vote credits to the voting validator.
This gives validators nearly 2 seconds to land their votes without penalty.
The maximum latency between two points on Earth is about 100 ms, so allowing a
full 1,500 ms to 2,000 ms latency without penalty should not have adverse
impact on distant validators.

### Maximum Vote Credits

Another factor to consider is what the maximum vote credits to award for a
vote should be. Assuming linear reduction in vote credits awarded (where 1
slot of additional lag reduces earned vote credits by 1), the maximum vote
credits value determines how much "penalty" there is for each additional slot
of latency. For example, a value of 10 would mean that after the grace period
slots, every additional slot of latency would result in a 10% reduction in
vote credits earned as each subsequent slot earns 1 credit less out of a
maximum possible 10 credits.

Again, historical voting data was analyzed over many epochs and the conclusion
drawn was that a maximum credits of 10 is the largest value that can be used
and still have a noticeable effect on known laggers. Values higher than that
result in such a small penalty for each slot of lagging that intentional
lagging is still too profitable. Lower values are even more punishing to
intentional lagging; but an attempt has been made to conservatively choose the
highest value that produces noticeable results.

The selection of these values is partially documented here:

https://www.shinobi-systems.com/timely_voting_proposal

The above document is somewhat out of date relative to more recent analysis,
which occurred in this GitHub issue:

https://github.com/solana-labs/solana/issues/19002

To summarize the findings of these documents: analysis over many epochs showed
that almost all validators from all regions have an average vote latency of 1
slot or less. The validators with higher average latency are either known
laggers, or are not representative of their region since many other validators
in the same region achieve low latency voting. With a maximum vote credit of
10, there is almost no change in the vote credits earned by the majority of
validators relative to the highest vote earner, aside from a general uplift of
about 0.4%.
Additionally, data centers were analyzed to ensure that there aren't regions of
the world that would be adversely affected, and none were found.


### Method of Implementation

When a Vote or VoteStateUpdate instruction is received by a validator, it will
use the Clock sysvar to identify the slot in which that instruction has
landed. For any newly voted on slot within that Vote or VoteStateUpdate
transaction, the validator will record the vote latency of that slot as
(voted_in_slot - voted_on_slot).
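
The latency calculation above can be sketched in Rust as follows. This is a minimal illustration, not the actual implementation: the function name `vote_latency` is hypothetical, and clamping oversized latencies to `u8::MAX` and using a saturating subtraction are assumptions made here so that the result always fits the u8 storage that this design describes.

```rust
/// Hypothetical helper: computes the latency of a vote as
/// (voted_in_slot - voted_on_slot), narrowed to a u8.
pub fn vote_latency(voted_on_slot: u64, voted_in_slot: u64) -> u8 {
    // Saturating subtraction avoids underflow if the inputs are malformed
    // (a vote cannot legitimately land before the slot it votes on).
    let latency = voted_in_slot.saturating_sub(voted_on_slot);
    // Assumption of this sketch: latencies too large for a u8 are clamped
    // to u8::MAX rather than rejected.
    latency.min(u64::from(u8::MAX)) as u8
}
```

For example, a vote on slot 100 that lands in slot 101 has the minimum latency of 1, and one that lands in slot 104 has a latency of 4.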

These vote latencies will be stored in a new vector of u8 latency values appended
to the end of the VoteState. VoteState currently has ~200 bytes of free space
at the end that is unused, so this new vector of u8 values should easily fit
within this available space. Because VoteState is an ABI frozen structure,
utilizing the mechanisms for updating frozen ABI will be required, which will
complicate the change. Furthermore, because VoteState is embedded in the
Tower data structure, which is frozen ABI as well, updates to the frozen ABI
mechanisms for Tower will also be needed. These are almost entirely
mechanical changes though, that involve ensuring that older versions of these
data structures can be updated to the new version as they are read in, and the
new version written out when the data structure is next persisted.

The credits to award for a rooted slot will be calculated using the latency
value stored in the latency vector for the slot, and a formula that awards
latencies of 1 - 3 slots ten credits, with a 1 credit reduction for each vote
latency after 3. Rooted slots will always be awarded a minimum credit of 1
(never 0) so that very old votes, possibly necessary in times of network
stress, are not discouraged.
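
The credit formula described above can be sketched as a small Rust function. The names `GRACE_SLOTS`, `MAX_CREDITS`, and `credits_for_latency` are hypothetical and chosen only for illustration; the values 3 and 10 and the minimum of 1 credit come from this proposal. The sketch does not special-case a latency of 0, which cannot occur under the definition of latency given earlier.

```rust
/// Latencies up to this many slots earn the maximum credits (from the proposal).
pub const GRACE_SLOTS: u8 = 3;
/// Maximum credits awarded for a single rooted slot (from the proposal).
pub const MAX_CREDITS: u8 = 10;

/// Hypothetical helper: credits earned by a rooted slot whose vote landed
/// with the given latency.
pub fn credits_for_latency(latency: u8) -> u64 {
    // Slots of latency beyond the grace period; zero inside the grace period.
    let late_slots = latency.saturating_sub(GRACE_SLOTS);
    // One credit is lost per late slot, but never fewer than 1 credit is awarded.
    let credits = MAX_CREDITS.saturating_sub(late_slots).max(1);
    u64::from(credits)
}
```

With these values, latencies of 1 to 3 earn 10 credits, a latency of 4 earns 9, and any latency of 12 or more earns the minimum of 1.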

To summarize the above: latency is recorded in a new Vector at the end of
VoteState when a vote first lands, but the credits for that slot are not
awarded until the slot becomes rooted, at which point the latency that was
recorded is used to compute the credits to award for that newly rooted slot.

When a Vote instruction is processed, the changes are fairly easy to implement
as Vote can only append new slots to Lockouts and pop the oldest slots off of
the opposite end (which become rooted), so the logic merely has to compute
rewards for the new roots, and new latencies for the newly added slots, both
of which can be processed in the fairly simple existing logic for Vote
processing.

When a VoteStateUpdate instruction is processed:

1. For each slot that was in the previous VoteState but is not in the new
VoteState because it has been rooted in the transition from the old
VoteState to the new VoteState, credits to award are calculated based on the
latency that was recorded for it and is still available in the old VoteState.

2. For each slot that was in both the previous VoteState and the new
VoteState, the latency that was previously recorded for that slot is copied
from the old VoteState to the new VoteState.

3. For each slot that is in the new VoteState but wasn't in the old VoteState,
the latency value is calculated for this new slot according to what slot the
vote is for and what slot is in the Clock (i.e. the slot this VoteStateUpdate
tx landed in) and this latency is stored in VoteState for that slot.
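
The three cases above can be sketched in Rust as follows. This is a simplified illustration under stated assumptions, not the actual VoteStateUpdate processing code: the function `apply_vote_state_update` and the representation of vote state as `(slot, latency)` pairs are hypothetical, the input is assumed to have already passed the existing correctness checks, and a slot is treated as rooted if it left the state and is at or below the new root (slots that left the state above the root are treated as expired and earn nothing).

```rust
use std::collections::HashMap;

// Credit formula from this proposal: 10 credits within 3 slots, minus 1 per
// additional slot of latency, never fewer than 1.
fn credits_for_latency(latency: u8) -> u64 {
    u64::from(10u8.saturating_sub(latency.saturating_sub(3)).max(1))
}

/// Hypothetical sketch. `old` holds (slot, recorded latency) pairs from the
/// previous state, `new_slots` the slots in the new state, `new_root` the root
/// of the new state, and `landed_in_slot` the Clock slot of the transaction.
/// Returns the latencies for the new state and the credits to award.
pub fn apply_vote_state_update(
    old: &[(u64, u8)],
    new_slots: &[u64],
    new_root: Option<u64>,
    landed_in_slot: u64,
) -> (Vec<(u64, u8)>, u64) {
    let old_latency: HashMap<u64, u8> = old.iter().copied().collect();

    // Case 1: slots that left the state because they were rooted earn credits
    // based on the latency recorded for them in the old state.
    let credits: u64 = old
        .iter()
        .filter(|(slot, _)| {
            !new_slots.contains(slot) && new_root.map_or(false, |root| *slot <= root)
        })
        .map(|(_, latency)| credits_for_latency(*latency))
        .sum();

    let latencies: Vec<(u64, u8)> = new_slots
        .iter()
        .map(|&slot| {
            let latency = match old_latency.get(&slot) {
                // Case 2: slot present in both states, so copy its latency.
                Some(&recorded) => recorded,
                // Case 3: newly voted-on slot, so compute latency from the
                // Clock slot (clamping to u8::MAX is an assumption here).
                None => landed_in_slot.saturating_sub(slot).min(u64::from(u8::MAX)) as u8,
            };
            (slot, latency)
        })
        .collect();

    (latencies, credits)
}
```

As an example, suppose the old state holds slots 10, 11, 12 and 13 with latencies 1, 1, 5 and 2, and the new state holds slots 12 and 14 with root 11, landing in slot 15. Slots 10 and 11 are rooted and earn 10 credits each, slot 13 has expired and earns nothing, slot 12 keeps its recorded latency of 5, and slot 14 is recorded with latency 1.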

The code to handle this is more complex, because VoteStateUpdate may include
removal of slots that expired as performed by the voting validator, in
addition to slots that have been rooted and new slots added. However, the
assumptions that are needed to handle VoteStateUpdate with timely vote credits
are already guaranteed by existing VoteStateUpdate correctness checking code:

The existing VoteStateUpdate processing code already ensures that (1) the only
slots treated as rooted are those that could actually have been rooted in the
transition from the old VoteState to the new VoteState, so there is no danger
of over-counting credits. For example, a 'cheating' validator might "pretend"
that slots were rooted by dropping them off of the back of the new VoteState
before they have actually achieved 32 confirmations; the existing logic
prevents this.

The existing VoteStateUpdate processing code already ensures that (2) new
slots included in the new VoteState are only slots after slots that have
already been voted on in the old VoteState (i.e. can't inject new slots in the
middle of slots already voted on).
