
Understanding the SnowFlake Distributed ID Generation Algorithm

https://segmentfault.com/a/1190000011282426#articleHeader2

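There are many algorithms for generating distributed ids; Twitter's SnowFlake is one of the classics.
Overview
An id generated by SnowFlake is a 64-bit integer, structured as shown in the figure below:

[Figure: structure of the 64-bit SnowFlake id]

1 bit, unused. In binary, any value whose highest bit is 1 is negative, but generated ids are normally positive integers, so this highest bit is fixed at 0.
41 bits, recording the timestamp (in milliseconds).

41 bits can represent 2^41 distinct values.
When used only for non-negative integers (in computing, this range includes 0), the representable range is 0 to 2^41 - 1; the 1 is subtracted because the range starts at 0, not 1.

In other words, 41 bits can hold up to 2^41 - 1 milliseconds, which converted to years is (2^41 - 1) / (1000 * 60 * 60 * 24 * 365) ≈ 69 years.
10 bits, recording the worker machine id.

This allows deployment on 2^10 = 1024 nodes, and is made up of a 5-bit datacenterId and a 5-bit workerId.
The largest non-negative integer that 5 bits can represent is 2^5 - 1 = 31, so the 32 numbers 0, 1, 2, 3, ... 31 can be used as distinct datacenterId or workerId values.
12 bits, a sequence number, distinguishing the different ids generated within the same millisecond.

The largest non-negative integer that 12 bits can represent is 2^12 - 1 = 4095, so the 4096 numbers 0, 1, 2, 3, ... 4095 can be used as the sequence numbers of up to 4096 ids generated by one machine within one timestamp (millisecond).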
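The capacities quoted for each segment can be reproduced with a few lines of Java. This is only a minimal sketch: the class name SnowflakeLayout and its constants are made up for illustration and do not come from Twitter's code.

```java
// Hypothetical helper that recomputes the capacity of each SnowFlake segment.
final class SnowflakeLayout {
    static final int TIMESTAMP_BITS = 41;
    static final int DATACENTER_ID_BITS = 5;
    static final int WORKER_ID_BITS = 5;
    static final int SEQUENCE_BITS = 12;

    /** Largest value that fits in the given number of bits: 2^bits - 1. */
    static long maxValue(int bits) {
        return (1L << bits) - 1;
    }

    /** Whole 365-day years covered by a 41-bit millisecond counter. */
    static long timestampYears() {
        long millisPerYear = 1000L * 60 * 60 * 24 * 365;
        return maxValue(TIMESTAMP_BITS) / millisPerYear;
    }

    public static void main(String[] args) {
        int totalBits = 1 + TIMESTAMP_BITS + DATACENTER_ID_BITS + WORKER_ID_BITS + SEQUENCE_BITS;
        System.out.println("total bits: " + totalBits);
        System.out.println("max timestamp offset (ms): " + maxValue(TIMESTAMP_BITS));
        System.out.println("years covered: " + timestampYears());
        System.out.println("nodes: " + (maxValue(DATACENTER_ID_BITS + WORKER_ID_BITS) + 1));
        System.out.println("ids per millisecond per node: " + (maxValue(SEQUENCE_BITS) + 1));
    }
}
```

The integer division truncates, which is why the timestamp range comes out as 69 whole years.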

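Because a 64-bit integer in Java is the long type, ids generated by SnowFlake are stored as long in Java.

SnowFlake guarantees that:

All generated ids trend upward over time
No duplicate ids are produced anywhere in the distributed system (because datacenterId and workerId tell the nodes apart)
Talk is cheap, show you the code
Below is Twitter's official original version, written in Scala (I don't know Scala either; just read it as if it were Java):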

/** Copyright 2010-2012 Twitter, Inc.*/
package com.twitter.service.snowflake

import com.twitter.ostrich.stats.Stats

import com.twitter.service.snowflake.gen._
import java.util.Random
import com.twitter.logging.Logger

/**
 * An object that generates IDs.
 * This is broken into a separate class in case
 * we ever want to support multiple worker threads
 * per process
 */
class IdWorker(
    val workerId: Long,
    val datacenterId: Long,
    private val reporter: Reporter,
    var sequence: Long = 0L) extends Snowflake.Iface {

private[this] def genCounter(agent: String) = {
Stats.incr("ids_generated")
Stats.incr("ids_generated_%s".format(agent))
}
private[this] val exceptionCounter = Stats.getCounter("exceptions")
private[this] val log = Logger.get
private[this] val rand = new Random

val twepoch = 1288834974657L

private[this] val workerIdBits = 5L
private[this] val datacenterIdBits = 5L
private[this] val maxWorkerId = -1L ^ (-1L << workerIdBits)
private[this] val maxDatacenterId = -1L ^ (-1L << datacenterIdBits)
private[this] val sequenceBits = 12L

private[this] val workerIdShift = sequenceBits
private[this] val datacenterIdShift = sequenceBits + workerIdBits
private[this] val timestampLeftShift = sequenceBits + workerIdBits + datacenterIdBits
private[this] val sequenceMask = -1L ^ (-1L << sequenceBits)

private[this] var lastTimestamp = -1L

// sanity check for workerId
if (workerId > maxWorkerId || workerId < 0) {
exceptionCounter.incr(1)
throw new IllegalArgumentException("worker Id can't be greater than %d or less than 0".format(maxWorkerId))
}

if (datacenterId > maxDatacenterId || datacenterId < 0) {
exceptionCounter.incr(1)
throw new IllegalArgumentException("datacenter Id can't be greater than %d or less than 0".format(maxDatacenterId))
}

log.info("worker starting. timestamp left shift %d, datacenter id bits %d, worker id bits %d, sequence bits %d, workerid %d",
timestampLeftShift, datacenterIdBits, workerIdBits, sequenceBits, workerId)

def get_id(useragent: String): Long = {
if (!validUseragent(useragent)) {
exceptionCounter.incr(1)
throw new InvalidUserAgentError
}

val id = nextId()
genCounter(useragent)

reporter.report(new AuditLogEntry(id, useragent, rand.nextLong))
id

}

def get_worker_id(): Long = workerId
def get_datacenter_id(): Long = datacenterId
def get_timestamp() = System.currentTimeMillis

protected[snowflake] def nextId(): Long = synchronized {
var timestamp = timeGen()

if (timestamp < lastTimestamp) {
  exceptionCounter.incr(1)
  log.error("clock is moving backwards.  Rejecting requests until %d.", lastTimestamp);
  throw new InvalidSystemClock("Clock moved backwards.  Refusing to generate id for %d milliseconds".format(
    lastTimestamp - timestamp))
}

if (lastTimestamp == timestamp) {
  sequence = (sequence + 1) & sequenceMask
  if (sequence == 0) {
    timestamp = tilNextMillis(lastTimestamp)
  }
} else {
  sequence = 0
}

lastTimestamp = timestamp
((timestamp - twepoch) << timestampLeftShift) |
  (datacenterId << datacenterIdShift) |
  (workerId << workerIdShift) | 
  sequence

}

protected def tilNextMillis(lastTimestamp: Long): Long = {
var timestamp = timeGen()
while (timestamp <= lastTimestamp) {
timestamp = timeGen()
}
timestamp
}

protected def timeGen(): Long = System.currentTimeMillis()

val AgentParser = """([a-zA-Z][a-zA-Z-0-9]*)""".r

def validUseragent(useragent: String): Boolean = useragent match {
case AgentParser(_) => true
case _ => false
}
}
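Scala is a language that compiles to JVM bytecode. Loosely speaking, it adds a lot of syntactic sugar on top of a Java-like syntax: for example, you don't have to end every statement with a semicolon, and types can often be inferred instead of written out. In a give-it-a-try spirit, I "translated" the Scala code into Java. The places where I changed the Scala code are marked below: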

/** Copyright 2010-2012 Twitter, Inc.*/
package com.twitter.service.snowflake

import com.twitter.ostrich.stats.Stats
import com.twitter.service.snowflake.gen._
import java.util.Random
import com.twitter.logging.Logger

/**
 * An object that generates IDs.
 * This is broken into a separate class in case
 * we ever want to support multiple worker threads
 * per process
 */
class IdWorker( // |
    val workerId: Long, // |
    val datacenterId: Long, // |<-- rewrite this part as a Java constructor
    private val reporter: Reporter, // logging related, delete // |
    var sequence: Long = 0L) // |
    extends Snowflake.Iface { // interface not available, delete // |

private[this] def genCounter(agent: String) = { // |
Stats.incr("ids_generated") // |
Stats.incr("ids_generated_%s".format(agent)) // |<-- error handling / logging related, delete
} // |
private[this] val exceptionCounter = Stats.getCounter("exceptions") // |
private[this] val log = Logger.get // |
private[this] val rand = new Random // |

val twepoch = 1288834974657L

private[this] val workerIdBits = 5L
private[this] val datacenterIdBits = 5L
private[this] val maxWorkerId = -1L ^ (-1L << workerIdBits)
private[this] val maxDatacenterId = -1L ^ (-1L << datacenterIdBits)
private[this] val sequenceBits = 12L

private[this] val workerIdShift = sequenceBits
private[this] val datacenterIdShift = sequenceBits + workerIdBits
private[this] val timestampLeftShift = sequenceBits + workerIdBits + datacenterIdBits
private[this] val sequenceMask = -1L ^ (-1L << sequenceBits)

private[this] var lastTimestamp = -1L

//----------------------------------------------------------------------------------------------------------------------------//
// Move this whole part into the constructor
// sanity check for workerId
if (workerId > maxWorkerId || workerId < 0) {
exceptionCounter.incr(1) // <-- error handling related, delete
throw new IllegalArgumentException("worker Id can't be greater than %d or less than 0".format(maxWorkerId))
// |--> change to: throw new IllegalArgumentException
//      (String.format("worker Id can't be greater than %d or less than 0", maxWorkerId))
}

if (datacenterId > maxDatacenterId || datacenterId < 0) {
exceptionCounter.incr(1) // <-- error handling related, delete
throw new IllegalArgumentException("datacenter Id can't be greater than %d or less than 0".format(maxDatacenterId))
// |--> change to: throw new IllegalArgumentException
//      (String.format("datacenter Id can't be greater than %d or less than 0", maxDatacenterId))
}

log.info("worker starting. timestamp left shift %d, datacenter id bits %d, worker id bits %d, sequence bits %d, workerid %d",
timestampLeftShift, datacenterIdBits, workerIdBits, sequenceBits, workerId)
// |--> change to: System.out.printf("worker...%d...", timestampLeftShift, ...);
//----------------------------------------------------------------------------------------------------------------------------//

//-------------------------------------------------------------------//
// Once the error handling and logging code is removed, only one line of this function remains: val id = nextId()
// So we can simply call nextId() directly, and this whole function can be deleted in the "translation"
def get_id(useragent: String): Long = { // delete
if (!validUseragent(useragent)) {
exceptionCounter.incr(1)
throw new InvalidUserAgentError
}

val id = nextId()
genCounter(useragent)

reporter.report(new AuditLogEntry(id, useragent, rand.nextLong))
id
}
//-------------------------------------------------------------------//

def get_worker_id(): Long = workerId // |
def get_datacenter_id(): Long = datacenterId // |<-- rewrite as Java methods
def get_timestamp() = System.currentTimeMillis // |

protected[snowflake] def nextId(): Long = synchronized { // rewrite as a Java method
var timestamp = timeGen()

if (timestamp < lastTimestamp) {
  exceptionCounter.incr(1) // error handling related, delete
  log.error("clock is moving backwards.  Rejecting requests until %d.", lastTimestamp); // change to System.err.printf(...)
  throw new InvalidSystemClock("Clock moved backwards.  Refusing to generate id for %d milliseconds".format(
    lastTimestamp - timestamp)) // change to RuntimeException
}

if (lastTimestamp == timestamp) {
  sequence = (sequence + 1) & sequenceMask
  if (sequence == 0) {
    timestamp = tilNextMillis(lastTimestamp)
  }
} else {
  sequence = 0
}

lastTimestamp = timestamp
((timestamp - twepoch) << timestampLeftShift) | // |<-- add the return keyword
  (datacenterId << datacenterIdShift) |         // |
  (workerId << workerIdShift) |                 // |
  sequence                                      // |

}

protected def tilNextMillis(lastTimestamp: Long): Long = { // rewrite as a Java method
var timestamp = timeGen()
while (timestamp <= lastTimestamp) {
timestamp = timeGen()
}
timestamp // add the return keyword
}

protected def timeGen(): Long = System.currentTimeMillis() // rewrite as a Java method

val AgentParser = """([a-zA-Z][a-zA-Z-0-9]*)""".r // |
// |
def validUseragent(useragent: String): Boolean = useragent match { // |<-- logging related, delete
case AgentParser(_) => true // |
case _ => false // |
} // |
}
The resulting Java version:

public class IdWorker{

private long workerId;
private long datacenterId;
private long sequence;

public IdWorker(long workerId, long datacenterId, long sequence){
    // sanity check for workerId
    if (workerId > maxWorkerId || workerId < 0) {
        throw new IllegalArgumentException(String.format("worker Id can't be greater than %d or less than 0",maxWorkerId));
    }
    if (datacenterId > maxDatacenterId || datacenterId < 0) {
        throw new IllegalArgumentException(String.format("datacenter Id can't be greater than %d or less than 0",maxDatacenterId));
    }
    System.out.printf("worker starting. timestamp left shift %d, datacenter id bits %d, worker id bits %d, sequence bits %d, workerid %d%n",
            timestampLeftShift, datacenterIdBits, workerIdBits, sequenceBits, workerId);

    this.workerId = workerId;
    this.datacenterId = datacenterId;
    this.sequence = sequence;
}

private long twepoch = 1288834974657L;

private long workerIdBits = 5L;
private long datacenterIdBits = 5L;
private long maxWorkerId = -1L ^ (-1L << workerIdBits);
private long maxDatacenterId = -1L ^ (-1L << datacenterIdBits);
private long sequenceBits = 12L;

private long workerIdShift = sequenceBits;
private long datacenterIdShift = sequenceBits + workerIdBits;
private long timestampLeftShift = sequenceBits + workerIdBits + datacenterIdBits;
private long sequenceMask = -1L ^ (-1L << sequenceBits);

private long lastTimestamp = -1L;

public long getWorkerId(){
    return workerId;
}

public long getDatacenterId(){
    return datacenterId;
}

public long getTimestamp(){
    return System.currentTimeMillis();
}

public synchronized long nextId() {
    long timestamp = timeGen();

    if (timestamp < lastTimestamp) {
        System.err.printf("clock is moving backwards.  Rejecting requests until %d.", lastTimestamp);
        throw new RuntimeException(String.format("Clock moved backwards.  Refusing to generate id for %d milliseconds",
                lastTimestamp - timestamp));
    }

    if (lastTimestamp == timestamp) {
        sequence = (sequence + 1) & sequenceMask;
        if (sequence == 0) {
            timestamp = tilNextMillis(lastTimestamp);
        }
    } else {
        sequence = 0;
    }

    lastTimestamp = timestamp;
    return ((timestamp - twepoch) << timestampLeftShift) |
            (datacenterId << datacenterIdShift) |
            (workerId << workerIdShift) |
            sequence;
}

private long tilNextMillis(long lastTimestamp) {
    long timestamp = timeGen();
    while (timestamp <= lastTimestamp) {
        timestamp = timeGen();
    }
    return timestamp;
}

private long timeGen(){
    return System.currentTimeMillis();
}

//---------------test---------------
public static void main(String[] args) {
    IdWorker worker = new IdWorker(1,1,1);
    for (int i = 0; i < 30; i++) {
        System.out.println(worker.nextId());
    }
}

}
Understanding the code
The code above contains some bit manipulation, such as:

sequence = (sequence + 1) & sequenceMask;

private long maxWorkerId = -1L ^ (-1L << workerIdBits);

return ((timestamp - twepoch) << timestampLeftShift) |
(datacenterId << datacenterIdShift) |
(workerId << workerIdShift) |
sequence;
To understand it better, I looked into the background knowledge involved.

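Binary representation of negative numbers
In a computer, negative numbers are represented in binary using two's complement.
Suppose I store numbers in Java's int type.
An int is 32 binary digits (bits), i.e. 4 bytes. (1 byte = 8 bits)
The decimal number 3 is then represented in binary like this:

00000000 00000000 00000000 00000011
// binary representation of 3, i.e. its plain binary form
So how should the number -3 be represented in binary?
We can work backwards: since -3 + 3 = 0,
treat the binary form of -3 as an unknown x and solve for it in binary arithmetic.
The equation in binary looks like this:

    00000000 00000000 00000000 00000011 // 3, plain binary
+   xxxxxxxx xxxxxxxx xxxxxxxx xxxxxxxx // -3, two's complement
---------------------------------------
    00000000 00000000 00000000 00000000
Working back to x: what has to be added to the binary form of 3 so that the result becomes 00000000 00000000 00000000 00000000?

    00000000 00000000 00000000 00000011 // 3, plain binary
+   11111111 11111111 11111111 11111101 // -3, two's complement
---------------------------------------
  1 00000000 00000000 00000000 00000000
The idea is to choose x bit by bit, starting from the lowest bit, so that every column sums to 0 and carries a 1 into the next higher bit, until the carry reaches the 33rd bit. Since an int can hold at most 32 bits, that highest 1 overflows and is dropped, and the remaining 32 bits are (decimal) 0.

The point of two's complement is that adding it to the original value (the binary form of 3) produces an "overflowed 0".

That was the reasoning. In practice it is easy to compute by remembering the formulas (for a negative number -n, where n is positive):

two's complement of -n = (bitwise inverse of n) + 1
two's complement of -n = bitwise inverse of (n - 1)
So the binary form of -1 is computed like this:

00000000 00000000 00000000 00000001 // binary form of 1
11111111 11111111 11111111 11111110 // invert every bit of 1
11111111 11111111 11111111 11111111 // add 1: binary representation of -1 (two's complement)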
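Both formulas can be checked directly in Java, whose int type uses two's complement. The following is a small sketch for experimenting; the class TwosComplementDemo and its helpers are illustrative names, not part of the SnowFlake code.

```java
// Hypothetical demo of the two's-complement rules on 32-bit ints.
final class TwosComplementDemo {
    /** Negates x the two's-complement way: invert every bit, then add 1. */
    static int negate(int x) {
        return ~x + 1;
    }

    /** Formats x as 32 binary digits, zero-padded on the left. */
    static String bits(int x) {
        return String.format("%32s", Integer.toBinaryString(x)).replace(' ', '0');
    }

    public static void main(String[] args) {
        System.out.println(bits(3) + "  // 3");
        System.out.println(bits(negate(3)) + "  // -3");
        System.out.println(bits(-1) + "  // -1");
        // Adding a value to its two's complement overflows to 0.
        System.out.println(3 + negate(3));
        // The second formula: inverting (n - 1) gives the same bits as inverting n and adding 1.
        System.out.println(negate(3) == ~(3 - 1));
    }
}
```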
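Using bit operations to compute the largest value n bits can represent
Take this code, for example:

private long workerIdBits = 5L;
private long maxWorkerId = -1L ^ (-1L << workerIdBits);

It is easier to read when rewritten like this:
long maxWorkerId = -1L ^ (-1L << 5L)

At first glance it is hard to tell which part is evaluated first, so I looked up Java's operator precedence table:
[Figure: Java operator precedence table]

So in the line above, the order of evaluation is:

-1 shifted left by 5, giving result a
-1 XOR a
The binary steps of long maxWorkerId = -1L ^ (-1L << 5L) are as follows (a long has 64 bits; only the low 32 are shown here, the upper 32 behave the same way):

-1 shifted left by 5, giving result a:

          11111111 11111111 11111111 11111111 // binary representation of -1 (two's complement)
    11111 11111111 11111111 11111111 11100000 // bits overflowing at the high end are dropped, the low end is filled with 0
          11111111 11111111 11111111 11100000 // result a
-1 XOR a:

    11111111 11111111 11111111 11111111 // binary representation of -1 (two's complement)
^   11111111 11111111 11111111 11100000 // per bit position: equal bits give 0, different bits give 1
---------------------------------------
    00000000 00000000 00000000 00011111 // final result: 31

The final result is 31. The binary value 00000000 00000000 00000000 00011111 converts to decimal like this:
2^4 + 2^3 + 2^2 + 2^1 + 2^0 = 16 + 8 + 4 + 2 + 1 = 31
Now that we know maxWorkerId = 31 in long maxWorkerId = -1L ^ (-1L << 5L), what does it mean? Why shift left by 5? If you read the overview, recall this passage:

The largest non-negative integer that 5 bits can represent is 2^5 - 1 = 31, so the 32 numbers 0, 1, 2, 3, ... 31 can be used as distinct datacenterId or workerId values.
-1L ^ (-1L << 5L) evaluates to 31, and 2^5 - 1 is also 31. So in the code, -1L ^ (-1L << 5L) is a way of using bit operations to compute the largest non-negative integer that 5 bits can represent.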
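To confirm that the trick is not specific to 5 bits, the expression can be compared against 2^n - 1 for a range of widths. A minimal sketch, with hypothetical names (BitMax, maxBySnowflakeTrick, maxByPowerOfTwo) chosen only for this illustration:

```java
// Hypothetical comparison of the two ways of computing "largest value in n bits".
final class BitMax {
    /** The expression used in the SnowFlake code: -1L ^ (-1L << bits). */
    static long maxBySnowflakeTrick(long bits) {
        return -1L ^ (-1L << bits);
    }

    /** The textbook formula: 2^bits - 1. */
    static long maxByPowerOfTwo(long bits) {
        return (1L << bits) - 1;
    }

    public static void main(String[] args) {
        // Both forms agree for every width from 1 to 62 bits.
        for (long bits = 1; bits <= 62; bits++) {
            if (maxBySnowflakeTrick(bits) != maxByPowerOfTwo(bits)) {
                throw new IllegalStateException("mismatch at " + bits + " bits");
            }
        }
        System.out.println(maxBySnowflakeTrick(5));   // 5-bit workerId / datacenterId
        System.out.println(maxBySnowflakeTrick(12));  // 12-bit sequence
        System.out.println(maxBySnowflakeTrick(41));  // 41-bit timestamp
    }
}
```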

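Using a mask to prevent overflow
There is an interesting piece of code:

sequence = (sequence + 1) & sequenceMask;
Test it with a few different values and you will see what makes it interesting:

    long seqMask = -1L ^ (-1L << 12L); // largest non-negative integer that 12 bits can store, i.e. 2^12 - 1 = 4095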
    System.out.println("seqMask: "+seqMask);
    System.out.println(1L & seqMask);
    System.out.println(2L & seqMask);
    System.out.println(3L & seqMask);
    System.out.println(4L & seqMask);
    System.out.println(4095L & seqMask);
    System.out.println(4096L & seqMask);
    System.out.println(4097L & seqMask);
    System.out.println(4098L & seqMask);


    /**
    seqMask: 4095
    1
    2
    3
    4
    4095
    0
    1
    2
    */

Through the bitwise AND, this code guarantees that the result always stays within the range 0 to 4095!
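Seen from the generator's side, the mask makes the sequence wrap back to 0 after 4095, which is the signal nextId() uses to wait for the next millisecond. The sketch below isolates that step; SequenceRollover and next are illustrative names, not taken from the original code.

```java
// Hypothetical isolation of the "increment and mask" step from nextId().
final class SequenceRollover {
    static final long SEQUENCE_MASK = -1L ^ (-1L << 12); // 4095

    /** Returns the sequence value following the given one, wrapping to 0 after 4095. */
    static long next(long sequence) {
        return (sequence + 1) & SEQUENCE_MASK;
    }

    public static void main(String[] args) {
        long sequence = 4093;
        for (int i = 0; i < 4; i++) {
            sequence = next(sequence);
            // Prints 4094, 4095, 0, 1: the counter wraps instead of spilling into bit 12.
            System.out.println(sequence);
        }
    }
}
```

For non-negative values, masking with 4095 gives the same result as taking the remainder modulo 4096, because 4096 is a power of two.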

Combining the result with bit operations
Here is another odd-looking piece of code:

return ((timestamp - twepoch) << timestampLeftShift) |
(datacenterId << datacenterIdShift) |
(workerId << workerIdShift) |
sequence;
To figure out what this code does,

first, we need to work out the relevant values:

private long twepoch = 1288834974657L; // starting timestamp; subtracted from the current timestamp to get the offset

private long workerIdBits = 5L; // number of bits used by workerId: 5
private long datacenterIdBits = 5L; // number of bits used by datacenterId: 5
private long maxWorkerId = -1L ^ (-1L << workerIdBits);  // largest value workerId can take: 31
private long maxDatacenterId = -1L ^ (-1L << datacenterIdBits); // largest value datacenterId can take: 31
private long sequenceBits = 12L; // number of bits used by the sequence number: 12

private long workerIdShift = sequenceBits; // 12
private long datacenterIdShift = sequenceBits + workerIdBits; // 12+5 = 17
private long timestampLeftShift = sequenceBits + workerIdBits + datacenterIdBits; // 12+5+5 = 22
private long sequenceMask = -1L ^ (-1L << sequenceBits); // 4095

private long lastTimestamp = -1L;

Second, write a test with every parameter hard-coded and print the intermediate values, so the calculation can be checked afterwards:

//---------------test---------------
public static void main(String[] args) {
    long timestamp = 1505914988849L;
    long twepoch = 1288834974657L;
    long datacenterId = 17L;
    long workerId = 25L;
    long sequence = 0L;

    System.out.printf("\ntimestamp: %d \n",timestamp);
    System.out.printf("twepoch: %d \n",twepoch);
    System.out.printf("datacenterId: %d \n",datacenterId);
    System.out.printf("workerId: %d \n",workerId);
    System.out.printf("sequence: %d \n",sequence);
    System.out.println();
    System.out.printf("(timestamp - twepoch): %d \n",(timestamp - twepoch));
    System.out.printf("((timestamp - twepoch) << 22L): %d \n",((timestamp - twepoch) << 22L));
    System.out.printf("(datacenterId << 17L): %d \n" ,(datacenterId << 17L));
    System.out.printf("(workerId << 12L): %d \n",(workerId << 12L));
    System.out.printf("sequence: %d \n",sequence);

    long result = ((timestamp - twepoch) << 22L) |
            (datacenterId << 17L) |
            (workerId << 12L) |
            sequence;
    System.out.println(result);

}

/** Printed output:
    timestamp: 1505914988849 
    twepoch: 1288834974657 
    datacenterId: 17 
    workerId: 25 
    sequence: 0 
    
    (timestamp - twepoch): 217080014192 
    ((timestamp - twepoch) << 22L): 910499571845562368 
    (datacenterId << 17L): 2228224 
    (workerId << 12L): 102400 
    sequence: 0 
    910499571847892992
*/

After substituting the shift amounts, the code looks like this:

return ((timestamp - 1288834974657) << 22) |
(datacenterId << 17) |
(workerId << 12) |
sequence;
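For the values that are still unknown, we can first reread the explanation of the SnowFlake structure in the overview, then plug in values within the legal ranges (on Windows, the Calculator makes it easy to get the binary form of these values) to follow the calculation.
Of course, since my test code already hard-codes these values, we can simply use them to verify the result by hand: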

    long timestamp = 1505914988849L;
    long twepoch = 1288834974657L;
    long datacenterId = 17L;
    long workerId = 25L;
    long sequence = 0L;

Let timestamp = 1505914988849 and twepoch = 1288834974657.
1505914988849 - 1288834974657 = 217080014192 (the millisecond offset of timestamp from the starting time). Call it a; shifting its binary form left by 22 bits works like this:

                        |<-- the left shift by 22 bits starts here
00000000 00000000 000000|00 00110010 10001010 11111010 00100101 01110000 // a = 217080014192
00001100 10100010 10111110 10001001 01011100 00|000000 00000000 00000000 // a shifted left by 22 bits (la)
                                               |<-- the bits after this mark are filled with 0
Let datacenterId = 17. Call it b; shifting its binary form left by 17 bits works like this:

                   |<-- the left shift by 17 bits starts here
00000000 00000000 0|0000000 00000000 00000000 00000000 00000000 00010001 // b = 17
00000000 00000000 00000000 00000000 00000000 0010001|0 00000000 00000000 // b shifted left by 17 bits (lb)
                                                    |<-- the bits after this mark are filled with 0
Let workerId = 25. Call it c; shifting its binary form left by 12 bits works like this:

             |<-- the left shift by 12 bits starts here
00000000 0000|0000 00000000 00000000 00000000 00000000 00000000 00011001 // c = 25
00000000 00000000 00000000 00000000 00000000 00000001 1001|0000 00000000 // c shifted left by 12 bits (lc)
                                                          |<-- the bits after this mark are filled with 0
Let sequence = 0. Its binary form is:

00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 // sequence = 0
Now that we know the left-shifted value of each part (la, lb, lc), the code can be simplified like this to make it easier to understand:

return ((timestamp - 1288834974657) << 22) |
(datacenterId << 17) |
(workerId << 12) |
sequence;
-----------------------------
           |
           | simplify
          \|/
-----------------------------
return (la) |
(lb) |
(lc) |
sequence;
The pipe symbol | above is also a bitwise operator in Java (bitwise OR). It means:
bit n of the result is 1 if bit n of x or bit n of y (or both) is 1, and 0 otherwise. So the bitwise OR of our four numbers works out as follows:

   1 | 41 | 5 | 5 | 12

   0|0001100 10100010 10111110 10001001 01011100 00|00000|0 0000|0000 00000000 // la
   0|0000000 00000000 00000000 00000000 00000000 00|10001|0 0000|0000 00000000 // lb
   0|0000000 00000000 00000000 00000000 00000000 00|00000|1 1001|0000 00000000 // lc
or 0|0000000 00000000 00000000 00000000 00000000 00|00000|0 0000|0000 00000000 // sequence
------------------------------------------------------------------------------------------
   0|0001100 10100010 10111110 10001001 01011100 00|10001|1 1001|0000 00000000 // result: 910499571847892992
How the result is computed:
1) List the index of every bit that is 1 (indices count from 0 at the rightmost bit):

0000 1100 1010 0010 1011 1110 1000 1001 0101 1100 0010 0011 1001 0000 0000 0000
59 58 55 53 49 47 45 44 43 42 41 39 35 32 30 28 27 26 21 17 16 15 12
2) Use each index as a power of 2 and add the results:

2^59 + 2^58 + 2^55 + 2^53 + 2^49 + 2^47 + 2^45 + 2^44 + 2^43 + 2^42 + 2^41 + 2^39 + 2^35 + 2^32 + 2^30 + 2^28 + 2^27 + 2^26 + 2^21 + 2^17 + 2^16 + 2^15 + 2^12
  2^59 : 576460752303423488
  2^58 : 288230376151711744
  2^55 : 36028797018963968
  2^53 : 9007199254740992
  2^49 : 562949953421312
  2^47 : 140737488355328
  2^45 : 35184372088832
  2^44 : 17592186044416
  2^43 : 8796093022208
  2^42 : 4398046511104
  2^41 : 2199023255552
  2^39 : 549755813888
  2^35 : 34359738368
  2^32 : 4294967296
  2^30 : 1073741824
  2^28 : 268435456
  2^27 : 134217728
  2^26 : 67108864
  2^21 : 2097152
  2^17 : 131072
  2^16 : 65536
  2^15 : 32768
+ 2^12 : 4096
----------------------------
         910499571847892992
Screenshot of the calculation:
[Figure: screenshot of the calculation]

This matches the result printed by the test program, so the manual verification is complete!
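The same manual check can be repeated in code: list the indices of the 1 bits, then add up the corresponding powers of two. This is a sketch for verification only; SetBitSum and its methods are hypothetical names introduced here.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper that repeats the manual "sum of powers of two" check.
final class SetBitSum {
    /** Indices (0 = rightmost bit) of the bits that are 1, highest index first. */
    static List<Integer> setBitIndices(long value) {
        List<Integer> indices = new ArrayList<>();
        for (int i = 63; i >= 0; i--) {
            if (((value >>> i) & 1L) == 1L) {
                indices.add(i);
            }
        }
        return indices;
    }

    /** Adds up 2^i for every index i in the list. */
    static long sumOfPowers(List<Integer> indices) {
        long sum = 0;
        for (int i : indices) {
            sum += 1L << i;
        }
        return sum;
    }

    public static void main(String[] args) {
        long id = 910499571847892992L;
        List<Integer> indices = setBitIndices(id);
        System.out.println(indices);
        System.out.println(sumOfPowers(indices) == id);
    }
}
```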

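Observations
   1 | 41 | 5 | 5 | 12

   0|0001100 10100010 10111110 10001001 01011100 00|     |      |              // la
   0|                                              |10001|      |              // lb
   0|                                              |     |1 1001|              // lc
or 0|                                              |     |      |0000 00000000 // sequence
------------------------------------------------------------------------------------------
   0|0001100 10100010 10111110 10001001 01011100 00|10001|1 1001|0000 00000000 // result: 910499571847892992
I split the 64 bits above into segments of 1, 41, 5, 5 and 12 bits to make them easier to inspect.

Looking down the columns:

In the 41-bit segment, only the la row has a value; all other rows (lb, lc, sequence) are 0
In the first 5-bit segment from the left, only the lb row has a value; all other rows are 0
In the second 5-bit segment from the left, only the lc row has a value; all other rows are 0
By the same pattern, if sequence were anything other than 0, the 12-bit segment would hold a value in the sequence row only, and all other rows would be 0 there
Looking along the rows:

In the la row, because of the left shift by 5+5+12 bits, the 5, 5 and 12 segments are all filled with 0, so everything in the la row outside the 41-bit segment must be 0
The same reasoning applies to the lb, lc and sequence rows
It is precisely the left shifts that move the four values to their positions in the SnowFlake layout; a bitwise OR of the four rows (any 1 gives a result of 1) then merges the four binary segments into a single binary number.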
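The observation that no two rows ever have a 1 in the same column can be stated as a property of the segment masks: they are pairwise disjoint, so OR-ing the shifted values gives the same result as adding them. A minimal sketch under that reading; FieldOverlapCheck and its constants are illustrative names only.

```java
// Hypothetical check that the four SnowFlake segments never overlap.
final class FieldOverlapCheck {
    static final long SEQUENCE_MASK = (1L << 12) - 1;           // bits 0..11
    static final long WORKER_MASK = ((1L << 5) - 1) << 12;      // bits 12..16
    static final long DATACENTER_MASK = ((1L << 5) - 1) << 17;  // bits 17..21
    static final long TIMESTAMP_MASK = ((1L << 41) - 1) << 22;  // bits 22..62

    /** True when no bit is set in more than one of the given masks. */
    static boolean disjoint(long... masks) {
        long seen = 0;
        for (long mask : masks) {
            if ((seen & mask) != 0) {
                return false;
            }
            seen |= mask;
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(disjoint(TIMESTAMP_MASK, DATACENTER_MASK, WORKER_MASK, SEQUENCE_MASK));

        // With disjoint segments, bitwise OR and plain addition give the same id.
        long la = 217080014192L << 22;
        long lb = 17L << 17;
        long lc = 25L << 12;
        long sequence = 0L;
        System.out.println((la | lb | lc | sequence) == (la + lb + lc + sequence));
    }
}
```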
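Conclusion:
So, in this code

return ((timestamp - 1288834974657) << 22) |
(datacenterId << 17) |
(workerId << 12) |
sequence;
the left shifts move each value into its own segment (the 41, 5 and 5 segments; the 12-bit segment already sits at the far right, so it needs no shift).

The bitwise OR of the shifted values (la, lb, lc, sequence) then merges the data of all segments into a single binary number.

Finally, converted to decimal, that number is the generated id.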

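Extensions
Once you understand the algorithm, there are some further things you can do:

Change what each bit segment stores to suit your own business. The algorithm is generic, so you can adjust the size of each segment and the information it stores as needed.
Decode an id. Because every segment of an id stores specific information, given an id you should be able to work back to the original value of each segment. That information can help with analysis: for an order id, for example, you can tell when the order was created, which data center handled it, and so on.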
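Decoding can be sketched by reversing the shifts and masking each segment. The example below assumes the default layout used throughout this article (41/5/5/12 bits and Twitter's twepoch); SnowflakeDecoder is a hypothetical helper, not part of the original code, and would need different constants for a customized layout.

```java
import java.time.Instant;

// Hypothetical decoder for ids built with the default 41/5/5/12 layout.
final class SnowflakeDecoder {
    static final long TWEPOCH = 1288834974657L;

    /** Millisecond Unix timestamp at which the id was generated. */
    static long timestamp(long id) {
        return (id >>> 22) + TWEPOCH;
    }

    /** The 5-bit datacenterId stored in bits 17..21. */
    static long datacenterId(long id) {
        return (id >>> 17) & 31L;
    }

    /** The 5-bit workerId stored in bits 12..16. */
    static long workerId(long id) {
        return (id >>> 12) & 31L;
    }

    /** The 12-bit sequence number stored in bits 0..11. */
    static long sequence(long id) {
        return id & 4095L;
    }

    public static void main(String[] args) {
        long id = 910499571847892992L; // the id verified by hand earlier
        System.out.println("timestamp:    " + timestamp(id) + " (" + Instant.ofEpochMilli(timestamp(id)) + ")");
        System.out.println("datacenterId: " + datacenterId(id));
        System.out.println("workerId:     " + workerId(id));
        System.out.println("sequence:     " + sequence(id));
    }
}
```

Run on the id from the test above, this recovers the hard-coded inputs: timestamp 1505914988849, datacenterId 17, workerId 25 and sequence 0.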
