replica_server: reimplement uniq_timestamp generator #8
Reimplement this for 2 reasons:
1. All threads shared a lock in the old implementation, which hurt performance. In fact, different replicas do not need to share a globally increasing timestamp, so we can try this optimization.
2. Although the timestamp was replicated from the primary to the secondaries, the secondaries never updated their own timestamp value accordingly, so a newer mutation might get a smaller timestamp after a primary switch.
//
class uniq_timestamp_us {
private:
    uint64_t last_ts;
Variable names should start with an underscore.
public:
    uniq_timestamp_us() { last_ts = dsn_now_us(); }

    void try_update(uint64_t new_ts)
Adding a comment would make this clearer:
// Update the local timestamp (being a Secondary) to ensure
// when it's elected as primary, the timestamp is monotonically
// increasing.
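I thought carefully about how this increasing timestamp relates to the real timestamp:
Also, when discussing this with @neverchanje, we considered whether a secondary could blindly adopt the timestamp sent by the primary in on_prepare. On reflection, that does not seem like a good idea: the secondary may be a former primary that was demoted, so its local timestamp may be larger than the one it receives. With the current implementation, the main way the timestamp could jump backwards is when a new primary is elected after its mutation log has been deleted entirely and its physical clock has also jumped backwards.
More precisely, the three replicas must maintain a shared increasing timestamp relationship, so the secondary's timestamp must also increase monotonically and never decrease.
In other words, under the replicated state machine model, the timestamps of the operations in the operation sequence must be strictly increasing (the same ordering property that decree has). Then, for any given key, a later operation always has a larger timestamp than the operations before it, which avoids the case where data written later fails to take effect.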
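The behaviour discussed in this thread can be sketched as follows. This is only an illustration under stated assumptions, not the code in this PR: the class name `uniq_timestamp_sketch`, the `next()` and `last()` methods, the seeded constructor, and the `now_us()` helper (a stand-in for `dsn_now_us()` built on `std::chrono`) are all hypothetical. It assumes one generator per replica, used by a single thread, which is why no lock appears.

```cpp
#include <algorithm>
#include <chrono>
#include <cstdint>

// Hypothetical stand-in for dsn_now_us(): wall-clock time in microseconds.
inline uint64_t now_us()
{
    using namespace std::chrono;
    return static_cast<uint64_t>(
        duration_cast<microseconds>(system_clock::now().time_since_epoch()).count());
}

// Assumption: one instance per replica, accessed by one thread only.
class uniq_timestamp_sketch
{
public:
    uniq_timestamp_sketch() : _last_ts(now_us()) {}
    // Seeded constructor, added here only to make the example deterministic.
    explicit uniq_timestamp_sketch(uint64_t start) : _last_ts(start) {}

    // Secondary path: adopt the primary's timestamp only if it is larger.
    // A demoted former primary may already hold a larger local value,
    // so the received value must never overwrite it blindly.
    void try_update(uint64_t new_ts) { _last_ts = std::max(_last_ts, new_ts); }

    // Primary path: return a value strictly greater than anything issued
    // or observed so far, even if the physical clock has jumped backwards.
    uint64_t next()
    {
        _last_ts = std::max(now_us(), _last_ts + 1);
        return _last_ts;
    }

    uint64_t last() const { return _last_ts; }

private:
    uint64_t _last_ts;
};
```

With this shape, a secondary that calls `try_update` on every prepared mutation will, once promoted, issue timestamps larger than any it replicated, as long as the replicated timestamps are still available to it.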
LGTM