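Overview
The SnowFlake algorithm generates IDs that are 64-bit integers, laid out as follows:
- 1 bit, unused. In binary, a number whose most significant bit is 1 is negative, and generated IDs are normally positive integers, so this bit is fixed at 0.
- 41 bits, the timestamp in milliseconds.
  - 41 bits can represent 2^41 distinct values.
  - Used only for non-negative integers, the range is 0 to 2^41 - 1; the 1 is subtracted because the range starts at 0, not at 1.
  - In other words, 41 bits can hold a millisecond value of up to 2^41 - 1, which in years is (2^41 - 1) / (1000 * 60 * 60 * 24 * 365) ≈ 69 years.
- 10 bits, the worker machine ID.
  - This allows deployment on 2^10 = 1024 nodes, split into a 5-bit datacenterId and a 5-bit workerId.
  - The largest positive integer 5 bits can represent is 2^5 - 1 = 31, so the 32 numbers 0, 1, 2, 3, ..., 31 are available for distinguishing datacenterId or workerId values.
- 12 bits, the sequence number, which distinguishes IDs generated within the same millisecond.
  - The largest positive integer 12 bits can represent is 2^12 - 1 = 4095, so the 4096 numbers 0, 1, 2, 3, ..., 4095 number the IDs generated on one machine within one timestamp (millisecond).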
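The capacity figures in the list above can be checked with a few lines of Java. This is only a sketch of the arithmetic; the class name `SnowflakeLayout` and its method names are made up for illustration and are not part of Twitter's code.

```java
// Hypothetical helper that reproduces the capacity arithmetic of the SnowFlake layout.
final class SnowflakeLayout {
    static final int TIMESTAMP_BITS = 41;
    static final int DATACENTER_ID_BITS = 5;
    static final int WORKER_ID_BITS = 5;
    static final int SEQUENCE_BITS = 12;

    /** Largest value that fits in the given number of bits: 2^bits - 1. */
    static long maxValue(int bits) {
        return (1L << bits) - 1;
    }

    /** Whole years covered by a 41-bit millisecond counter, using 365-day years. */
    static long timestampYears() {
        long millisPerYear = 1000L * 60 * 60 * 24 * 365;
        return maxValue(TIMESTAMP_BITS) / millisPerYear;
    }

    public static void main(String[] args) {
        int total = 1 + TIMESTAMP_BITS + DATACENTER_ID_BITS + WORKER_ID_BITS + SEQUENCE_BITS;
        System.out.println("total bits: " + total);
        System.out.println("max timestamp offset (ms): " + maxValue(TIMESTAMP_BITS));
        System.out.println("years covered: " + timestampYears());
        System.out.println("nodes: " + (1L << (DATACENTER_ID_BITS + WORKER_ID_BITS)));
        System.out.println("max datacenterId/workerId: " + maxValue(WORKER_ID_BITS));
        System.out.println("max sequence: " + maxValue(SEQUENCE_BITS));
    }
}
```

Integer division truncates, which is why the 69.7 years covered by 2^41 - 1 milliseconds shows up as 69.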
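Because a 64-bit integer in Java is a long, a SnowFlake ID is stored as a long in Java.
SnowFlake guarantees that:
- all generated IDs trend upward over time
- no duplicate IDs are produced anywhere in the distributed system (because datacenterId and workerId tell the generators apart, provided every node is given a distinct pair)
Talk is cheap, show you the code
Below is Twitter's official original version, written in Scala (I don't know Scala either; just read it as if it were Java):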
    /** Copyright 2010-2012 Twitter, Inc.*/
    package com.twitter.service.snowflake

    import com.twitter.ostrich.stats.Stats
    import com.twitter.service.snowflake.gen._
    import java.util.Random
    import com.twitter.logging.Logger

    /**
     * An object that generates IDs.
     * This is broken into a separate class in case
     * we ever want to support multiple worker threads
     * per process
     */
    class IdWorker(
        val workerId: Long,
        val datacenterId: Long,
        private val reporter: Reporter,
        var sequence: Long = 0L) extends Snowflake.Iface {

      private[this] def genCounter(agent: String) = {
        Stats.incr("ids_generated")
        Stats.incr("ids_generated_%s".format(agent))
      }
      private[this] val exceptionCounter = Stats.getCounter("exceptions")
      private[this] val log = Logger.get
      private[this] val rand = new Random

      val twepoch = 1288834974657L

      private[this] val workerIdBits = 5L
      private[this] val datacenterIdBits = 5L
      private[this] val maxWorkerId = -1L ^ (-1L << workerIdBits)
      private[this] val maxDatacenterId = -1L ^ (-1L << datacenterIdBits)
      private[this] val sequenceBits = 12L

      private[this] val workerIdShift = sequenceBits
      private[this] val datacenterIdShift = sequenceBits + workerIdBits
      private[this] val timestampLeftShift = sequenceBits + workerIdBits + datacenterIdBits
      private[this] val sequenceMask = -1L ^ (-1L << sequenceBits)

      private[this] var lastTimestamp = -1L

      // sanity check for workerId
      if (workerId > maxWorkerId || workerId < 0) {
        exceptionCounter.incr(1)
        throw new IllegalArgumentException("worker Id can't be greater than %d or less than 0".format(maxWorkerId))
      }

      if (datacenterId > maxDatacenterId || datacenterId < 0) {
        exceptionCounter.incr(1)
        throw new IllegalArgumentException("datacenter Id can't be greater than %d or less than 0".format(maxDatacenterId))
      }

      log.info("worker starting. timestamp left shift %d, datacenter id bits %d, worker id bits %d, sequence bits %d, workerid %d",
        timestampLeftShift, datacenterIdBits, workerIdBits, sequenceBits, workerId)

      def get_id(useragent: String): Long = {
        if (!validUseragent(useragent)) {
          exceptionCounter.incr(1)
          throw new InvalidUserAgentError
        }

        val id = nextId()
        genCounter(useragent)

        reporter.report(new AuditLogEntry(id, useragent, rand.nextLong))
        id
      }

      def get_worker_id(): Long = workerId
      def get_datacenter_id(): Long = datacenterId
      def get_timestamp() = System.currentTimeMillis

      protected[snowflake] def nextId(): Long = synchronized {
        var timestamp = timeGen()

        if (timestamp < lastTimestamp) {
          exceptionCounter.incr(1)
          log.error("clock is moving backwards. Rejecting requests until %d.", lastTimestamp);
          throw new InvalidSystemClock("Clock moved backwards. Refusing to generate id for %d milliseconds".format(
            lastTimestamp - timestamp))
        }

        if (lastTimestamp == timestamp) {
          sequence = (sequence + 1) & sequenceMask
          if (sequence == 0) {
            timestamp = tilNextMillis(lastTimestamp)
          }
        } else {
          sequence = 0
        }

        lastTimestamp = timestamp
        ((timestamp - twepoch) << timestampLeftShift) |
          (datacenterId << datacenterIdShift) |
          (workerId << workerIdShift) |
          sequence
      }

      protected def tilNextMillis(lastTimestamp: Long): Long = {
        var timestamp = timeGen()
        while (timestamp <= lastTimestamp) {
          timestamp = timeGen()
        }
        timestamp
      }

      protected def timeGen(): Long = System.currentTimeMillis()

      val AgentParser = """([a-zA-Z][a-zA-Z\-0-9]*)""".r

      def validUseragent(useragent: String): Boolean = useragent match {
        case AgentParser(_) => true
        case _ => false
      }
    }
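Scala is a language that compiles to JVM bytecode. As a rough first approximation, you can read it as Java with a lot of syntactic sugar added: statements do not need a trailing semicolon, types can be inferred instead of written out, and so on. Just to give it a try, I "translated" the Scala code into Java. The places where I changed the Scala code are marked below: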
    /** Copyright 2010-2012 Twitter, Inc.*/
    package com.twitter.service.snowflake

    import com.twitter.ostrich.stats.Stats
    import com.twitter.service.snowflake.gen._
    import java.util.Random
    import com.twitter.logging.Logger

    /**
     * An object that generates IDs.
     * This is broken into a separate class in case
     * we ever want to support multiple worker threads
     * per process
     */
    class IdWorker(                                  // |
        val workerId: Long,                          // |
        val datacenterId: Long,                      // |<-- rewrite this part as a Java constructor
        private val reporter: Reporter,              // |    (reporter is logging-related: deleted)
        var sequence: Long = 0L)                     // |
        extends Snowflake.Iface {                    // |    (interface not available: deleted)

      private[this] def genCounter(agent: String) = {                        // |
        Stats.incr("ids_generated")                                          // |
        Stats.incr("ids_generated_%s".format(agent))                         // |<-- error and log handling: deleted
      }                                                                      // |
      private[this] val exceptionCounter = Stats.getCounter("exceptions")    // |
      private[this] val log = Logger.get                                     // |
      private[this] val rand = new Random                                    // |

      val twepoch = 1288834974657L

      private[this] val workerIdBits = 5L
      private[this] val datacenterIdBits = 5L
      private[this] val maxWorkerId = -1L ^ (-1L << workerIdBits)
      private[this] val maxDatacenterId = -1L ^ (-1L << datacenterIdBits)
      private[this] val sequenceBits = 12L

      private[this] val workerIdShift = sequenceBits
      private[this] val datacenterIdShift = sequenceBits + workerIdBits
      private[this] val timestampLeftShift = sequenceBits + workerIdBits + datacenterIdBits
      private[this] val sequenceMask = -1L ^ (-1L << sequenceBits)

      private[this] var lastTimestamp = -1L

      //---------------------------------------------------------------------//
      // Everything between these two rules moves into the Java constructor. //
      // sanity check for workerId
      if (workerId > maxWorkerId || workerId < 0) {
        exceptionCounter.incr(1) // <-- error handling: deleted
        throw new IllegalArgumentException("worker Id can't be greater than %d or less than 0".format(maxWorkerId))
        // |--> becomes: throw new IllegalArgumentException(
        //          String.format("worker Id can't be greater than %d or less than 0", maxWorkerId))
      }

      if (datacenterId > maxDatacenterId || datacenterId < 0) {
        exceptionCounter.incr(1) // <-- error handling: deleted
        throw new IllegalArgumentException("datacenter Id can't be greater than %d or less than 0".format(maxDatacenterId))
        // |--> becomes: throw new IllegalArgumentException(
        //          String.format("datacenter Id can't be greater than %d or less than 0", maxDatacenterId))
      }

      log.info("worker starting. timestamp left shift %d, datacenter id bits %d, worker id bits %d, sequence bits %d, workerid %d",
        timestampLeftShift, datacenterIdBits, workerIdBits, sequenceBits, workerId)
      // |--> becomes: System.out.printf("worker...%d...", timestampLeftShift, ...);
      //---------------------------------------------------------------------//

      //---------------------------------------------------------------------//
      // Once the error handling is removed, this function has one line      //
      // left: val id = nextId(). Callers can simply call nextId() directly, //
      // so the whole function is deleted in the "translation".              //
      def get_id(useragent: String): Long = {
        if (!validUseragent(useragent)) {
          exceptionCounter.incr(1)
          throw new InvalidUserAgentError
        }

        val id = nextId()
        genCounter(useragent)

        reporter.report(new AuditLogEntry(id, useragent, rand.nextLong))
        id
      }
      //---------------------------------------------------------------------//

      def get_worker_id(): Long = workerId              // |
      def get_datacenter_id(): Long = datacenterId      // |<-- rewrite as Java methods
      def get_timestamp() = System.currentTimeMillis    // |

      protected[snowflake] def nextId(): Long = synchronized { // rewrite as a Java method
        var timestamp = timeGen()

        if (timestamp < lastTimestamp) {
          exceptionCounter.incr(1) // error handling: deleted
          log.error("clock is moving backwards. Rejecting requests until %d.", lastTimestamp); // becomes System.err.printf(...)
          throw new InvalidSystemClock("Clock moved backwards. Refusing to generate id for %d milliseconds".format(
            lastTimestamp - timestamp)) // becomes RuntimeException
        }

        if (lastTimestamp == timestamp) {
          sequence = (sequence + 1) & sequenceMask
          if (sequence == 0) {
            timestamp = tilNextMillis(lastTimestamp)
          }
        } else {
          sequence = 0
        }

        lastTimestamp = timestamp
        ((timestamp - twepoch) << timestampLeftShift) |   // |<-- add the return keyword
          (datacenterId << datacenterIdShift) |           // |
          (workerId << workerIdShift) |                   // |
          sequence                                        // |
      }

      protected def tilNextMillis(lastTimestamp: Long): Long = { // rewrite as a Java method
        var timestamp = timeGen()
        while (timestamp <= lastTimestamp) {
          timestamp = timeGen()
        }
        timestamp // add the return keyword
      }

      protected def timeGen(): Long = System.currentTimeMillis() // rewrite as a Java method

      val AgentParser = """([a-zA-Z][a-zA-Z\-0-9]*)""".r                    // |
                                                                            // |
      def validUseragent(useragent: String): Boolean = useragent match {   // |<-- user-agent check, used only by
        case AgentParser(_) => true                                         // |    the deleted get_id: deleted
        case _ => false                                                     // |
      }                                                                     // |
    }
The resulting Java version:
    public class IdWorker {

        private long workerId;
        private long datacenterId;
        private long sequence;

        public IdWorker(long workerId, long datacenterId, long sequence) {
            // Java runs the field initializers declared further down before this
            // constructor body, so maxWorkerId and maxDatacenterId are already set here.
            // sanity check for workerId
            if (workerId > maxWorkerId || workerId < 0) {
                throw new IllegalArgumentException(String.format("worker Id can't be greater than %d or less than 0", maxWorkerId));
            }
            if (datacenterId > maxDatacenterId || datacenterId < 0) {
                throw new IllegalArgumentException(String.format("datacenter Id can't be greater than %d or less than 0", maxDatacenterId));
            }
            System.out.printf("worker starting. timestamp left shift %d, datacenter id bits %d, worker id bits %d, sequence bits %d, workerid %d%n",
                    timestampLeftShift, datacenterIdBits, workerIdBits, sequenceBits, workerId);

            this.workerId = workerId;
            this.datacenterId = datacenterId;
            this.sequence = sequence;
        }

        private long twepoch = 1288834974657L;

        private long workerIdBits = 5L;
        private long datacenterIdBits = 5L;
        private long maxWorkerId = -1L ^ (-1L << workerIdBits);
        private long maxDatacenterId = -1L ^ (-1L << datacenterIdBits);
        private long sequenceBits = 12L;

        private long workerIdShift = sequenceBits;
        private long datacenterIdShift = sequenceBits + workerIdBits;
        private long timestampLeftShift = sequenceBits + workerIdBits + datacenterIdBits;
        private long sequenceMask = -1L ^ (-1L << sequenceBits);

        private long lastTimestamp = -1L;

        public long getWorkerId() {
            return workerId;
        }

        public long getDatacenterId() {
            return datacenterId;
        }

        public long getTimestamp() {
            return System.currentTimeMillis();
        }

        public synchronized long nextId() {
            long timestamp = timeGen();

            if (timestamp < lastTimestamp) {
                System.err.printf("clock is moving backwards. Rejecting requests until %d.%n", lastTimestamp);
                throw new RuntimeException(String.format("Clock moved backwards. Refusing to generate id for %d milliseconds",
                        lastTimestamp - timestamp));
            }

            if (lastTimestamp == timestamp) {
                // same millisecond: advance the sequence, wrapping at 4096
                sequence = (sequence + 1) & sequenceMask;
                if (sequence == 0) {
                    // sequence exhausted for this millisecond: wait for the next one
                    timestamp = tilNextMillis(lastTimestamp);
                }
            } else {
                sequence = 0;
            }

            lastTimestamp = timestamp;
            return ((timestamp - twepoch) << timestampLeftShift) |
                    (datacenterId << datacenterIdShift) |
                    (workerId << workerIdShift) |
                    sequence;
        }

        private long tilNextMillis(long lastTimestamp) {
            long timestamp = timeGen();
            while (timestamp <= lastTimestamp) {
                timestamp = timeGen();
            }
            return timestamp;
        }

        private long timeGen() {
            return System.currentTimeMillis();
        }

        //--------------- test ---------------
        public static void main(String[] args) {
            IdWorker worker = new IdWorker(1, 1, 1);
            for (int i = 0; i < 30; i++) {
                System.out.println(worker.nextId());
            }
        }

    }
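One way to gain some confidence in the "no duplicates" claim is to hammer a single generator from several threads and count the distinct IDs. The sketch below does that with a compact generator, `MiniIdWorker`, that uses the same bit layout as the Java version above; both class names are made up for this check, and it assumes the system clock does not jump backwards while it runs.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical check: several threads share one generator and every ID must be distinct.
final class IdWorkerUniquenessCheck {

    /** Compact generator using the same 41/5/5/12 layout as the IdWorker above. */
    static final class MiniIdWorker {
        private static final long EPOCH = 1288834974657L;
        private static final long SEQUENCE_MASK = 4095L;
        private final long workerId;
        private final long datacenterId;
        private long sequence = 0L;
        private long lastTimestamp = -1L;

        MiniIdWorker(long workerId, long datacenterId) {
            this.workerId = workerId;
            this.datacenterId = datacenterId;
        }

        synchronized long nextId() {
            long timestamp = System.currentTimeMillis();
            if (timestamp < lastTimestamp) {
                throw new IllegalStateException("clock moved backwards");
            }
            if (timestamp == lastTimestamp) {
                sequence = (sequence + 1) & SEQUENCE_MASK;
                if (sequence == 0) {
                    // 4096 IDs already issued in this millisecond: spin until the next one
                    while (timestamp <= lastTimestamp) {
                        timestamp = System.currentTimeMillis();
                    }
                }
            } else {
                sequence = 0;
            }
            lastTimestamp = timestamp;
            return ((timestamp - EPOCH) << 22) | (datacenterId << 17) | (workerId << 12) | sequence;
        }
    }

    /** Generates threads * idsPerThread IDs concurrently and returns how many were distinct. */
    static int countDistinct(int threads, int idsPerThread) {
        MiniIdWorker worker = new MiniIdWorker(1, 1);
        Set<Long> seen = ConcurrentHashMap.newKeySet();
        Thread[] pool = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            pool[t] = new Thread(() -> {
                for (int i = 0; i < idsPerThread; i++) {
                    seen.add(worker.nextId());
                }
            });
            pool[t].start();
        }
        try {
            for (Thread thread : pool) {
                thread.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RuntimeException(e);
        }
        return seen.size();
    }

    public static void main(String[] args) {
        int distinct = countDistinct(4, 5000);
        System.out.println("generated 20000 ids, distinct: " + distinct);
    }
}
```

The `synchronized` keyword on `nextId()` is what makes this safe within one process; across machines, uniqueness rests on each node having its own datacenterId/workerId pair.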
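Understanding the code
The code above contains several bit operations, such as:
    sequence = (sequence + 1) & sequenceMask;

    private long maxWorkerId = -1L ^ (-1L << workerIdBits);

    return ((timestamp - twepoch) << timestampLeftShift) |
            (datacenterId << datacenterIdShift) |
            (workerId << workerIdShift) |
            sequence;
To understand them better, I looked into the relevant background.
Binary representation of negative numbers
Computers represent negative numbers in binary using two's complement.
Assume the number is stored in a Java int, which is 32 binary digits (bits), that is, 4 bytes (1 byte = 8 bits).
The decimal number 3 then looks like this in binary:
    00000000 00000000 00000000 00000011
    // the binary representation of 3, i.e. its plain binary form
How, then, should the number -3 be represented in binary?
We can work backwards: since -3 + 3 = 0, treat the binary form of -3 as an unknown x and solve for it.
In binary, the equation looks like this:
      00000000 00000000 00000000 00000011 // 3, plain binary form
    + xxxxxxxx xxxxxxxx xxxxxxxx xxxxxxxx // -3, two's complement
    -----------------------------------------------
      00000000 00000000 00000000 00000000
Solving for x: what must be added to the binary form of 3 so that the result becomes 00000000 00000000 00000000 00000000?
      00000000 00000000 00000000 00000011 // 3, plain binary form
    + 11111111 11111111 11111111 11111101 // -3, two's complement
    -----------------------------------------------
    1 00000000 00000000 00000000 00000000
The idea is to choose x so that the addition produces a carry at every position, starting from the lowest bit, with the carried 1 moving up until it overflows into a 33rd bit. Because an int can hold at most 32 bits, that top 1 is discarded, and the remaining 32 bits are (decimal) 0.
That is the point of two's complement: adding a number's two's complement to the number itself (here, the binary form of 3) yields an "overflowed 0".
That was the reasoning. In practice, the formulas are easy to remember and apply:
- two's complement = ones' complement (all bits inverted) + 1
- two's complement = (plain binary form - 1), then all bits inverted
So the binary form of -1 is computed like this:
    00000000 00000000 00000000 00000001 // plain binary form of 1
    11111111 11111111 11111111 11111110 // invert every bit: the ones' complement of 1
    11111111 11111111 11111111 11111111 // add 1: the binary representation of -1 (two's complement)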
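The two formulas above are easy to try out in Java, where `~` inverts every bit of an int. The following is a small sketch; the class and method names (`TwosComplementDemo`, `bits`, and so on) are invented for this illustration.

```java
// Hypothetical demo of two's complement on 32-bit ints.
final class TwosComplementDemo {

    /** 32-bit binary string of an int, zero-padded and grouped into bytes. */
    static String bits(int value) {
        String raw = Integer.toBinaryString(value);             // drops leading zeros
        String padded = String.format("%32s", raw).replace(' ', '0');
        return padded.replaceAll("(.{8})(?!$)", "$1 ");         // space after every 8 bits
    }

    /** Formula 1: invert every bit, then add 1. */
    static int negateByInvertPlusOne(int x) {
        return ~x + 1;
    }

    /** Formula 2: subtract 1, then invert every bit. */
    static int negateByMinusOneInvert(int x) {
        return ~(x - 1);
    }

    public static void main(String[] args) {
        System.out.println(bits(3) + "  // 3");
        System.out.println(bits(negateByInvertPlusOne(3)) + "  // -3 via ~3 + 1");
        System.out.println(bits(negateByMinusOneInvert(3)) + "  // -3 via ~(3 - 1)");
        System.out.println(bits(3 + -3) + "  // 3 + (-3): the carry out of bit 31 is discarded");
        System.out.println(bits(-1) + "  // -1");
    }
}
```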
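Using bit operations to compute the largest value n bits can hold
Take these lines of code, for example:
    private long workerIdBits = 5L;
    private long maxWorkerId = -1L ^ (-1L << workerIdBits);
They are easier to read with the value substituted in:
    long maxWorkerId = -1L ^ (-1L << 5L)
At first glance it is hard to tell which part is evaluated first, so I looked up Java's operator precedence table: the shift operator << binds more tightly than bitwise XOR ^, and here the parentheses force the shift to happen first in any case.
So the line above is evaluated in this order:
- shift -1 left by 5, giving result a
- XOR -1 with a
The binary steps of long maxWorkerId = -1L ^ (-1L << 5L) are shown below. A long has 64 bits; only the low 32 bits are shown to keep the lines short, and the upper 32 bits behave the same way.
Shift -1 left by 5, giving result a:
          11111111 11111111 11111111 11111111 // binary representation of -1 (two's complement)
    11111 11111111 11111111 11111111 11100000 // bits that overflow on the left are dropped, the right is filled with 0
          11111111 11111111 11111111 11100000 // result a
XOR -1 with a:
      11111111 11111111 11111111 11111111 // binary representation of -1 (two's complement)
    ^ 11111111 11111111 11111111 11100000 // for each bit position: same bits give 0, different bits give 1
    ---------------------------------------------------------------------------
      00000000 00000000 00000000 00011111 // final result: 31
The final result is 31. The binary value 00000000 00000000 00000000 00011111 converts to decimal like this:
2^4 + 2^3 + 2^2 + 2^1 + 2^0 = 16 + 8 + 4 + 2 + 1 = 31
Now that we know long maxWorkerId = -1L ^ (-1L << 5L) gives maxWorkerId = 31, what does that mean? Why shift left by 5? If you read the Overview section, look for this passage:
5 bits can represent a maximum positive integer of 2^5 - 1 = 31, so the 32 numbers 0, 1, 2, 3, ..., 31 are available for distinguishing datacenterId or workerId values
-1L ^ (-1L << 5L) evaluates to 31, and 2^5 - 1 is also 31, so in the code the expression -1L ^ (-1L << 5L) is a bit-operation way of computing the largest positive integer that 5 bits can hold.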
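The same maximum can be written in a couple of equivalent ways, which makes the intent of the XOR expression easier to see. A minimal sketch follows; `MaxValueForBits` and its method names are hypothetical, chosen only for this comparison.

```java
// Hypothetical comparison of three ways to compute the largest value that fits in n bits.
final class MaxValueForBits {

    /** The form used in the Snowflake code: XOR all-ones with all-ones shifted left. */
    static long viaXor(int bits) {
        return -1L ^ (-1L << bits);
    }

    /** XOR with all-ones is the same as inverting every bit. */
    static long viaNot(int bits) {
        return ~(-1L << bits);
    }

    /** The arithmetic form: 2^bits - 1. */
    static long viaPower(int bits) {
        return (1L << bits) - 1;
    }

    public static void main(String[] args) {
        for (int bits : new int[] {5, 12, 41}) {
            System.out.printf("%2d bits: xor=%d not=%d power=%d%n",
                    bits, viaXor(bits), viaNot(bits), viaPower(bits));
        }
    }
}
```

All three agree for bit counts from 1 to 62, which covers every field width used here.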
Using a mask to prevent overflow
Here is an interesting piece of code:
    sequence = (sequence + 1) & sequenceMask;
Try it with a few different values and you will see what makes it interesting:
    long seqMask = -1L ^ (-1L << 12L); // largest positive integer 12 bits can hold, i.e. 2^12 - 1 = 4095
    System.out.println("seqMask: " + seqMask);
    System.out.println(1L & seqMask);
    System.out.println(2L & seqMask);
    System.out.println(3L & seqMask);
    System.out.println(4L & seqMask);
    System.out.println(4095L & seqMask);
    System.out.println(4096L & seqMask);
    System.out.println(4097L & seqMask);
    System.out.println(4098L & seqMask);

    /*
    seqMask: 4095
    1
    2
    3
    4
    4095
    0
    1
    2
    */
Through the bitwise AND, this code guarantees that the result always stays within the range 0 to 4095!
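Seen as a counter, the masked increment simply wraps from 4095 back to 0, which is the signal `nextId()` uses to wait for the next millisecond. The sketch below shows the wraparound and compares it with the remainder operator; the class name `SequenceWrapDemo` is an assumption for illustration, and the remainder form matches only because 4096 is a power of two and the sequence is never negative.

```java
// Hypothetical demo of the sequence counter wrapping around at 4096.
final class SequenceWrapDemo {
    static final long SEQUENCE_MASK = -1L ^ (-1L << 12); // 4095

    /** The increment used in nextId(): add 1, keep only the low 12 bits. */
    static long next(long sequence) {
        return (sequence + 1) & SEQUENCE_MASK;
    }

    /** Same result for non-negative input, written with the remainder operator. */
    static long nextByModulo(long sequence) {
        return (sequence + 1) % 4096;
    }

    public static void main(String[] args) {
        long sequence = 4093;
        for (int i = 0; i < 5; i++) {
            sequence = next(sequence);
            System.out.println(sequence); // 4094, 4095, 0, 1, 2
        }
    }
}
```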
Combining the parts with bit operations
Here is another odd-looking piece of code:
    return ((timestamp - twepoch) << timestampLeftShift) |
            (datacenterId << datacenterIdShift) |
            (workerId << workerIdShift) |
            sequence;
To make sense of it, first work out the values involved:
    private long twepoch = 1288834974657L; // start epoch; subtracted from the current timestamp to get the offset

    private long workerIdBits = 5L; // bits used by workerId: 5
    private long datacenterIdBits = 5L; // bits used by datacenterId: 5
    private long maxWorkerId = -1L ^ (-1L << workerIdBits); // largest usable workerId: 31
    private long maxDatacenterId = -1L ^ (-1L << datacenterIdBits); // largest usable datacenterId: 31
    private long sequenceBits = 12L; // bits used by the sequence number: 12

    private long workerIdShift = sequenceBits; // 12
    private long datacenterIdShift = sequenceBits + workerIdBits; // 12+5 = 17
    private long timestampLeftShift = sequenceBits + workerIdBits + datacenterIdBits; // 12+5+5 = 22
    private long sequenceMask = -1L ^ (-1L << sequenceBits); // 4095

    private long lastTimestamp = -1L;
Next, write a test with every parameter hard-coded, run it, and print the intermediate values so that the hand calculation can be checked against them later:
    //--------------- test ---------------
    public static void main(String[] args) {
        long timestamp = 1505914988849L;
        long twepoch = 1288834974657L;
        long datacenterId = 17L;
        long workerId = 25L;
        long sequence = 0L;

        System.out.printf("\ntimestamp: %d \n", timestamp);
        System.out.printf("twepoch: %d \n", twepoch);
        System.out.printf("datacenterId: %d \n", datacenterId);
        System.out.printf("workerId: %d \n", workerId);
        System.out.printf("sequence: %d \n", sequence);
        System.out.println();
        System.out.printf("(timestamp - twepoch): %d \n", (timestamp - twepoch));
        System.out.printf("((timestamp - twepoch) << 22L): %d \n", ((timestamp - twepoch) << 22L));
        System.out.printf("(datacenterId << 17L): %d \n", (datacenterId << 17L));
        System.out.printf("(workerId << 12L): %d \n", (workerId << 12L));
        System.out.printf("sequence: %d \n", sequence);

        long result = ((timestamp - twepoch) << 22L) |
                (datacenterId << 17L) |
                (workerId << 12L) |
                sequence;
        System.out.println(result);

    }

    /* Output:
    timestamp: 1505914988849
    twepoch: 1288834974657
    datacenterId: 17
    workerId: 25
    sequence: 0

    (timestamp - twepoch): 217080014192
    ((timestamp - twepoch) << 22L): 910499571845562368
    (datacenterId << 17L): 2228224
    (workerId << 12L): 102400
    sequence: 0
    910499571847892992
    */
With the shift amounts substituted in, the expression becomes:
    return ((timestamp - 1288834974657L) << 22) |
            (datacenterId << 17) |
            (workerId << 12) |
            sequence;
For the values that are not yet known, we could read the explanation of the SnowFlake structure in the Overview and plug in values from the legal ranges (on Windows, the Calculator makes it easy to get their binary forms) to follow the calculation.
Since my test code already hard-codes these values, though, we can use them directly to verify the result by hand:
    long timestamp = 1505914988849L;
    long twepoch = 1288834974657L;
    long datacenterId = 17L;
    long workerId = 25L;
    long sequence = 0L;

    Let timestamp = 1505914988849 and twepoch = 1288834974657.
    1505914988849 - 1288834974657 = 217080014192 (the millisecond offset of timestamp from the start epoch). Call it a; shifting its binary form left by 22 bits goes like this:

                            |<-- the left shift by 22 bits starts here
    00000000 00000000 000000|00 00110010 10001010 11111010 00100101 01110000 // a = 217080014192
    00001100 10100010 10111110 10001001 01011100 00|000000 00000000 00000000 // a shifted left by 22 bits (la)
                                                   |<-- the bits from here on are filled with 0

    Let datacenterId = 17. Call it b; shifting its binary form left by 17 bits goes like this:

                       |<-- the left shift by 17 bits starts here
    00000000 00000000 0|0000000 00000000 00000000 00000000 00000000 00010001 // b = 17
    00000000 00000000 00000000 00000000 00000000 0010001|0 00000000 00000000 // b shifted left by 17 bits (lb)
                                                        |<-- the bits from here on are filled with 0

    Let workerId = 25. Call it c; shifting its binary form left by 12 bits goes like this:

                 |<-- the left shift by 12 bits starts here
    00000000 0000|0000 00000000 00000000 00000000 00000000 00000000 00011001 // c = 25
    00000000 00000000 00000000 00000000 00000000 00000001 1001|0000 00000000 // c shifted left by 12 bits (lc)
                                                              |<-- the bits from here on are filled with 0

    Let sequence = 0. Its binary form is:

    00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 // sequence = 0
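Instead of using a calculator, the shifted bit patterns can also be printed from Java. This is a small sketch under the same hard-coded test values; `ShiftedFieldsDemo` and `bits64` are hypothetical names introduced here.

```java
// Hypothetical helper that prints the 64-bit pattern of each shifted field.
final class ShiftedFieldsDemo {

    /** 64-bit binary string of a long, zero-padded and grouped into bytes. */
    static String bits64(long value) {
        String raw = Long.toBinaryString(value);                // drops leading zeros
        String padded = String.format("%64s", raw).replace(' ', '0');
        return padded.replaceAll("(.{8})(?!$)", "$1 ");         // space after every 8 bits
    }

    public static void main(String[] args) {
        long a = 1505914988849L - 1288834974657L; // timestamp offset: 217080014192
        long b = 17L;                             // datacenterId
        long c = 25L;                             // workerId

        System.out.println(bits64(a) + " // a");
        System.out.println(bits64(a << 22) + " // la = a << 22");
        System.out.println(bits64(b << 17) + " // lb = b << 17");
        System.out.println(bits64(c << 12) + " // lc = c << 12");
    }
}
```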
Now that the shifted value of each part is known (la, lb, lc), the code can be simplified like this to make it easier to follow:
    return ((timestamp - 1288834974657L) << 22) |
            (datacenterId << 17) |
            (workerId << 12) |
            sequence;
    -----------------------------
               |
               | simplify
              \|/
    -----------------------------
    return (la) |
            (lb) |
            (lc) |
            sequence;
The pipe symbol | above is also a bitwise operator in Java. It means: if bit n of x or bit n of y is 1, then bit n of the result is 1; otherwise it is 0. The bitwise OR of the four numbers therefore works out as follows:
       1|                      41                      |  5  |  5   |     12

       0|0001100 10100010 10111110 10001001 01011100 00|00000|0 0000|0000 00000000 //la
       0|0000000 00000000 00000000 00000000 00000000 00|10001|0 0000|0000 00000000 //lb
       0|0000000 00000000 00000000 00000000 00000000 00|00000|1 1001|0000 00000000 //lc
    or 0|0000000 00000000 00000000 00000000 00000000 00|00000|0 0000|0000 00000000 //sequence
    ------------------------------------------------------------------------------------------
       0|0001100 10100010 10111110 10001001 01011100 00|10001|1 1001|0000 00000000 //result: 910499571847892992
How the result is converted to decimal:
1) Going from left to right, list the index of every 1 bit (indices count from 0 at the rightmost bit):
    00001100 10100010 10111110 10001001 01011100 00100011 10010000 00000000
    indices of the 1 bits: 59 58 55 53 49 47 45 44 43 42 41 39 35 32 30 28 27 26 21 17 16 15 12
2) Use each index as a power of 2 and add the powers up:
2^59 + 2^58 + 2^55 + 2^53 + 2^49 + 2^47 + 2^45 + 2^44 + 2^43 + 2^42 + 2^41 + 2^39 + 2^35 + 2^32 + 2^30 + 2^28 + 2^27 + 2^26 + 2^21 + 2^17 + 2^16 + 2^15 + 2^12
      2^59 : 576460752303423488
      2^58 : 288230376151711744
      2^55 : 36028797018963968
      2^53 : 9007199254740992
      2^49 : 562949953421312
      2^47 : 140737488355328
      2^45 : 35184372088832
      2^44 : 17592186044416
      2^43 : 8796093022208
      2^42 : 4398046511104
      2^41 : 2199023255552
      2^39 : 549755813888
      2^35 : 34359738368
      2^32 : 4294967296
      2^30 : 1073741824
      2^28 : 268435456
      2^27 : 134217728
      2^26 : 67108864
      2^21 : 2097152
      2^17 : 131072
      2^16 : 65536
      2^15 : 32768
    + 2^12 : 4096
    ----------------------------------------
             910499571847892992
[Screenshot of the calculation]
This is the same as the result printed by the test program, so the manual verification is complete!
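The hand calculation can be repeated in code: collect the indices of the 1 bits, then add up the matching powers of two. The sketch below does this for the ID from the test; `SetBitSum` and its methods are made-up names for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical re-run of the manual "indices of 1 bits, then sum of powers" check.
final class SetBitSum {

    /** Indices (0 = rightmost bit) of the 1 bits in value, highest index first. */
    static List<Integer> setBitIndices(long value) {
        List<Integer> indices = new ArrayList<>();
        for (int i = 63; i >= 0; i--) {
            if (((value >>> i) & 1L) == 1L) {
                indices.add(i);
            }
        }
        return indices;
    }

    /** Adds up 2^i for every index i in the list. */
    static long sumOfPowers(List<Integer> indices) {
        long sum = 0;
        for (int i : indices) {
            sum += 1L << i;
        }
        return sum;
    }

    public static void main(String[] args) {
        long id = 910499571847892992L;
        List<Integer> indices = setBitIndices(id);
        System.out.println("indices: " + indices);
        System.out.println("sum of powers: " + sumOfPowers(indices));
    }
}
```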
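Observations
       1|                      41                      |  5  |  5   |     12

       0|0001100 10100010 10111110 10001001 01011100 00|     |      |              //la
       0|                                              |10001|      |              //lb
       0|                                              |     |1 1001|              //lc
    or 0|                                              |     |      |0000 00000000 //sequence
    ------------------------------------------------------------------------------------------
       0|0001100 10100010 10111110 10001001 01011100 00|10001|1 1001|0000 00000000 //result: 910499571847892992
I split the 64 bits above into segments of 1, 41, 5, 5 and 12 bits to make them easier to compare.
- Reading down the columns:
  - In the 41-bit segment, only the la row has a value; the other rows (lb, lc, sequence) are all 0.
  - In the first 5-bit segment from the left, only the lb row has a value; the other rows are all 0.
  - In the second 5-bit segment from the left, only the lc row has a value; the other rows are all 0.
  - By the same pattern, if sequence were anything other than 0, the 12-bit segment would have a value in the sequence row only, and the other rows would be 0 there.
- Reading across the rows:
  - In the la row, the left shift by 5+5+12 bits filled the 5-, 5- and 12-bit segments with 0, so everything in the la row outside the 41-bit segment is necessarily 0.
  - The same reasoning applies to the lb, lc and sequence rows.
  - Because the left shifts move the four values to their places in the SnowFlake layout, a bitwise OR of the four rows (any 1 gives 1) merges the four binary segments into a single binary number.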
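Because no two rows share a 1 bit, OR-ing them gives the same number as adding them. The sketch below checks that for the test values and also shows what would happen with an out-of-range workerId of 32, which is one way to see why the constructor's range checks matter. The class name `DisjointFieldsDemo` and the method `disjoint` are assumptions made for this example.

```java
// Hypothetical check that the shifted fields occupy non-overlapping bits.
final class DisjointFieldsDemo {

    /** True when no bit position is set in more than one of the given values. */
    static boolean disjoint(long... parts) {
        long seen = 0L;
        for (long part : parts) {
            if ((seen & part) != 0L) {
                return false;
            }
            seen |= part;
        }
        return true;
    }

    public static void main(String[] args) {
        long la = (1505914988849L - 1288834974657L) << 22;
        long lb = 17L << 17;
        long lc = 25L << 12;
        long sequence = 0L;

        System.out.println("fields disjoint: " + disjoint(la, lb, lc, sequence));
        System.out.println("or : " + (la | lb | lc | sequence));
        System.out.println("sum: " + (la + lb + lc + sequence));

        // A workerId of 32 needs 6 bits, so after the shift it spills into bit 17,
        // which belongs to the datacenterId segment.
        long badWorker = 32L << 12;
        System.out.println("workerId 32 stays disjoint from datacenterId 17: " + disjoint(lb, badWorker));
    }
}
```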
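Conclusion:
So, in this piece of code
    return ((timestamp - 1288834974657L) << 22) |
            (datacenterId << 17) |
            (workerId << 12) |
            sequence;
the left shifts move each value into its own segment (the 41-, 5- and 5-bit segments; the 12-bit segment is already at the far right, so it needs no shift).
The bitwise OR of the shifted values (la, lb, lc, sequence) then merges the data of the individual segments into a single binary number.
Finally, that number converted to decimal is the generated id.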
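Extensions
Once the algorithm is understood, there are a few more things that can be done with it:
- Change what each bit segment stores to suit your own business. The algorithm is generic, so the size of each segment and the information it holds can be adjusted to your needs.
- Decode an id. Because each segment of an id stores specific information, it should be possible to take an id and recover the original value of every segment. The recovered information can help with analysis. For an order ID, for example, it reveals the date the order was created, the data center that handled it, and so on.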
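Decoding is just the composition run in reverse: shift right to bring a segment down to the low bits, then mask off everything above it. The following sketch decodes the ID from the test above, assuming the default 41/5/5/12 layout and Twitter's epoch; the class name `SnowflakeDecoder` and its methods are made up for illustration.

```java
import java.time.Instant;

// Hypothetical decoder for IDs built with the 41/5/5/12 layout described above.
final class SnowflakeDecoder {
    static final long TWEPOCH = 1288834974657L; // same start epoch as the generator

    /** Milliseconds since the Unix epoch at which the ID was generated. */
    static long timestamp(long id) {
        return (id >>> 22) + TWEPOCH;
    }

    /** The 5 bits above the workerId. */
    static long datacenterId(long id) {
        return (id >>> 17) & 31L;
    }

    /** The 5 bits above the sequence. */
    static long workerId(long id) {
        return (id >>> 12) & 31L;
    }

    /** The lowest 12 bits. */
    static long sequence(long id) {
        return id & 4095L;
    }

    public static void main(String[] args) {
        long id = 910499571847892992L;
        System.out.println("timestamp:    " + timestamp(id) + " (" + Instant.ofEpochMilli(timestamp(id)) + ")");
        System.out.println("datacenterId: " + datacenterId(id));
        System.out.println("workerId:     " + workerId(id));
        System.out.println("sequence:     " + sequence(id));
    }
}
```

If the segment sizes are changed for a particular business, the shift amounts and masks here have to change with them.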