[GLCC] Implement Sequencer API with snowflake algorithm #193

seeflood · 2021-08-23T11:11:24Z

What would you like to be added:
Implement Sequencer API with snowflake algorithm.
We need to be careful to avoid clock rollback problems.

Reference:

Sequencer API design doc(in chinese)
Existing implementation. You can refer to https://github.com/baidu/uid-generator or https://github.com/Meituan-Dianping/Leaf . These existing snowflake implementation depends on a non-volatile storage like zookeeper to avoid clock rollback.

I don't know if we can ignore clock rollback problem just warning in the document "Hey,please use a reliable NTP server which prevent clock rollback. Otherwise the generated id might duplicate."

为了解决时钟回拨问题，得依赖存储；
我不确定能否无视时钟回拨问题、省掉存储，仅仅声明“我们不帮你解决时钟回拨问题，请使用一个可靠的NTP服务器、让他确保你的机器不会时钟回拨”。我不确定有没有这样的NTP服务（我记得即使关闭NTP同步，闰秒还是会导致回拨?)

Why is this needed:
Snowflake algorithms is a useful tool to generate distributed unique id.

keleqnma · 2021-08-23T12:12:58Z

assign to me pls, thanks for your replys and tips

ZLBer · 2021-08-27T16:01:01Z

我不确定能否无视时钟回拨问题、省掉存储，仅仅声明“我们不帮你解决时钟回拨问题，请使用一个可靠的NTP服务器、让他确保你的机器不会时钟回拨”。我不确定有没有这样的NTP服务（我记得即使关闭NTP同步，闰秒还是会导致回拨?)

阿里云有NTP服务器,可以参考下 : https://help.aliyun.com/document_detail/92704.html

github-actions · 2021-10-21T02:21:24Z

This issue has been automatically marked as stale because it has not had recent activity in the last 30 days. It will be closed in the next 7 days unless it is tagged (pinned, good first issue or help wanted) or other activity occurs. Thank you for your contributions.

github-actions · 2021-10-29T02:08:37Z

This issue has been automatically closed because it has not had activity in the last 37 days. If this issue is still valid, please ping a maintainer and ask them to label it as pinned, good first issue or help wanted. Thank you for your contributions.

seeflood · 2022-05-17T08:40:51Z

@keleqnma Hi, are u still working on it?
Since this issue has been automatically closed, we will reopen it and make it assignable to others

ZLBer · 2022-05-18T12:15:33Z

This issue will participate in the chinese GLCC activity

1.题目描述

Layotto Sequencer API 的雪花算法实现

2.编码任务

实现雪花算法的Sequencer组件，解决时钟漂移问题
编写合适的测试用例测试算法的正确性
编写组件文档(中英文)

3.技能要求和编程语言

熟悉go语言
了解雪花算法的原理与实现
了解layotto组件开发流程

4.预期完成的结果

逻辑正确的雪花算法自增id组件
在时钟漂移发生时，保证算法的正确性

5.题目详情

Layotto Sequencer API已经有了基于redis、zookeeper、etcd的自增id实现，基于mysql与postgresql的实现也已经提交pr，雪花算法的实现存在一定的难度。在这里不细说雪花算法的距离原理，其实现分成时间戳+机器id+序列号三部分，且能保证递增和全局唯一。但由于机器时间的不确定性，，有可能会发生时间漂移出现时间回退的现象，则会出现数据重复和非递增性。可以参考业界的方案来解决此问题。

参考文档：
Layotto Sequencer API 设计文档
已有的算法实现：
https://github.com/baidu/uid-generator
https://github.com/Meituan-Dianping/Leaf .

导师

@ZLBer
[email protected]

价值

扩展Layotto的Sequencer API，提供更多的选择。

github-actions · 2022-06-18T03:10:00Z

This issue has been automatically marked as stale because it has not had recent activity in the last 30 days. It will be closed in the next 7 days unless it is tagged (pinned, good first issue or help wanted) or other activity occurs. Thank you for your contributions.

seeflood · 2022-07-09T12:23:07Z

进展：
@OOOOlh 刚开始开发
参考 https://github.com/baidu/uid-generator
需要考虑下重启+时钟回拨的情况

OOOOlh · 2022-07-11T07:47:34Z

1、位数分配

下面各个部分的位数支持用户根据需要自定义。

sign(1 bit)
delta seconds(28bits)

精确到秒，大约可以使用8.7年
worker id(22 bits)

机器id，最多可支持约420w次机器启动。内置实现为在启动时由数据库分配，默认分配策略为用后即弃
sequence(12 bits)

每秒下的并发序列，13 bits可支持每秒8192个并发。

2、worker id生成

在程序启动时，将“主机IP”和“当前时间戳 (单位：秒) - (0-10000之内的随机数)”两个字段作为一条记录写入Mysql，并获得一个唯一自增id作为work id，如下。

只有当添加新的机器或机器重启时，才会往Mysql中添加新的记录，并生成新的唯一的work id（来自Mysql中的ID字段（auto_increment））。

3、delta seconds生成

开启多个线程向RingBuffer中填入预先生成的UID。当sequence到达最大位数，delta seconds直接+1（除了程序启动时读取当前时间，之后不再需要）。

4、时钟回拨问题的解决

delta seconds的生成已经不再依靠"当前时间"了，所以在运行时发生时钟回拨自然不受影响。
当重启时发生时钟回拨，考虑在程序启动后，生成一条记录，在写入Mysql之前，先在Mysql检测“HOST_NAME” 和“PORT”两个字段是否有重复的记录，如果有，就重新生成一条记录，直到不再重复。然后写入Mysql。此时生成的work id是唯一的，所以后面生成的UID也都是唯一的，可以解决重启+时钟回拨的问题。（ps：“PORT”字段中后面的随机数部分已经可以大概率保证即使发生时钟回拨也不会生成重复的“PORT”，进而也不会生成重复的work id）。

ZLBer · 2022-07-13T07:58:35Z

开启多个线程向RingBuffer中填入预先生成的UID。当sequence到达最大位数，delta seconds直接+1（除了程序启动时读取当前时间，之后不再需要）。

@OOOOlh 可以展开说一下这里吗。按这种方式，delta seconds已经和时间没有什么关系了。当sequence到达最大位数，delta seconds直接+1 这与直接递增id好像是一样的了。这种方式的的并发度应该不如和与时间绑定的算法吧？

OOOOlh · 2022-07-13T08:42:22Z

1、传统雪花算法是完全依赖时间，所以发生时钟回拨后容易出问题。所以为了减少对时间的依赖，就采取序列依次+1，满后delta seconds +1的方法。只在程序启动时读取当前时间（这个时间一般是比上次RingBuffer中的uid的时间戳晚的），这也可以保证趋势递增。这种方法的一个主要不足是业务请求的uid反解析后的时间不是业务真实发生的时间，我不确定这个影响算不算大。
2、使用RingBuffer存储提前生成的uid。RingBuffer的容量是比sequence大几倍的，而且当RingBuffer中可用uid数小于阈值会异步填充，所以在同一秒内并发请求RingBuffer中的uid应该是承受的住的。百度UidGenerator官方测试是单机可提供600万/s的稳定吞吐量。
@ZLBer

ZLBer · 2022-07-13T15:32:39Z

这种方法的一个主要不足是业务请求的uid反解析后的时间不是业务真实发生的时间，我不确定这个影响算不算大。

这个问题应该不大，只要是递增的就行

只在程序启动时读取当前时间（这个时间一般是比上次RingBuffer中的uid的时间戳晚的

还有这里，按照上面的设计，每次启动不都是获取的新的workid吗？那么时间戳从0开始不都可以？

OOOOlh · 2022-07-13T16:08:52Z

百度的这个雪花算法顺序是sign+delta seconds+work id +sequence，delta seconds在高位，所以必须读当前时间。@ZLBer

ZLBer · 2022-07-14T15:25:34Z

好，我大概明白了，就是不再依赖于物理时间。可以大概描述下整个id生成的过程，然后可以写代码了。

ZLBer · 2022-07-14T15:26:58Z

@seeflood 周老师看下方案吧

OOOOlh · 2022-07-15T07:34:51Z

1、程序开始时，从mysql读workid，再结合当前时间生成uid，并填满整个RingBuffer；
2、然后其它程序来读，读后的uid标志位设为false，表示不可用；
3、当RingBuffer中可用uid少于阈值时（比如buffer容量的三分之一），进行异步填充。循环2、3步。

seeflood mentioned this issue Aug 23, 2021

Community tasks 新手任务计划 #108

Open

seeflood assigned keleqnma Aug 23, 2021

github-actions bot added the stale label Oct 21, 2021

github-actions bot closed this as completed Oct 29, 2021

seeflood reopened this May 17, 2022

seeflood unassigned keleqnma May 17, 2022

github-actions bot removed the stale label May 18, 2022

seeflood added the GLCC label May 18, 2022

seeflood changed the title ~~Implement Sequencer API with snowflake algorithm~~ [GLCC] Implement Sequencer API with snowflake algorithm May 18, 2022

github-actions bot added the stale label Jun 18, 2022

seeflood added pinned and removed stale labels Jun 18, 2022

OOOOlh mentioned this issue Jul 24, 2022

[WIP] feat: add sequencer api component with snowflake algorithm #734

Merged

3 tasks

ZLBer assigned OOOOlh Jul 27, 2022

This was referenced Aug 16, 2022

[WIP] feat:add sequencer api component with snowflake algorithm #766

Closed

feat: add sequencer api component with snowflake algorithm #767

Merged

seeflood closed this as completed in #767 Oct 20, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[GLCC] Implement Sequencer API with snowflake algorithm #193

[GLCC] Implement Sequencer API with snowflake algorithm #193

seeflood commented Aug 23, 2021 •

edited

Loading

keleqnma commented Aug 23, 2021

ZLBer commented Aug 27, 2021 •

edited

Loading

github-actions bot commented Oct 21, 2021

github-actions bot commented Oct 29, 2021

seeflood commented May 17, 2022

ZLBer commented May 18, 2022

github-actions bot commented Jun 18, 2022

seeflood commented Jul 9, 2022 •

edited

Loading

OOOOlh commented Jul 11, 2022

ZLBer commented Jul 13, 2022 •

edited

Loading

OOOOlh commented Jul 13, 2022

ZLBer commented Jul 13, 2022

OOOOlh commented Jul 13, 2022

ZLBer commented Jul 14, 2022

ZLBer commented Jul 14, 2022

OOOOlh commented Jul 15, 2022

[GLCC] Implement Sequencer API with snowflake algorithm #193

[GLCC] Implement Sequencer API with snowflake algorithm #193

Comments

seeflood commented Aug 23, 2021 • edited Loading

keleqnma commented Aug 23, 2021

ZLBer commented Aug 27, 2021 • edited Loading

github-actions bot commented Oct 21, 2021

github-actions bot commented Oct 29, 2021

seeflood commented May 17, 2022

ZLBer commented May 18, 2022

1.题目描述

2.编码任务

3.技能要求和编程语言

4.预期完成的结果

5.题目详情

导师

价值

github-actions bot commented Jun 18, 2022

seeflood commented Jul 9, 2022 • edited Loading

OOOOlh commented Jul 11, 2022

1、位数分配

2、worker id生成

3、delta seconds生成

4、时钟回拨问题的解决

ZLBer commented Jul 13, 2022 • edited Loading

OOOOlh commented Jul 13, 2022

ZLBer commented Jul 13, 2022

OOOOlh commented Jul 13, 2022

ZLBer commented Jul 14, 2022

ZLBer commented Jul 14, 2022

OOOOlh commented Jul 15, 2022

seeflood commented Aug 23, 2021 •

edited

Loading

ZLBer commented Aug 27, 2021 •

edited

Loading

seeflood commented Jul 9, 2022 •

edited

Loading

ZLBer commented Jul 13, 2022 •

edited

Loading