MySQL 8.0.12: Understanding the BIT Type in Depth

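Background:
Alibaba's recommended MySQL table design conventions state:
A column that expresses a yes/no concept must be named is_xxx and must use the data type unsigned tinyint
(1 means yes, 0 means no).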

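Explanation:
In MySQL a yes/no value can be modelled in three ways:
1. Use the bit(1) type. The column stores a single binary bit, 0 or 1 (not the ASCII characters '0' and '1'); 0 means no and 1 means yes.
2. Use the tinyint unsigned type, which stores the decimal numbers 0 and 1.
3. Use the boolean type. MySQL does not really implement it: BOOL and BOOLEAN are synonyms for tinyint(1), provided for compatibility with other databases.
TRUE is equivalent to 1 (yes) and FALSE is equivalent to 0 (no); TRUE and FALSE are case-insensitive.
true and false are stored as 1 and 0.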

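Experiment:
1. First, use sysbench to create table t1 with 10 million rows, then create three copies of it, each of which gets a column of a different type: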
mysql> desc t1;
+----------------+-----------+------+-----+-------------------+-----------------------------+
| Field          | Type      | Null | Key | Default           | Extra                       |
+----------------+-----------+------+-----+-------------------+-----------------------------+
| id             | int(11)   | NO   | PRI | NULL              | auto_increment              |
| k              | int(11)   | NO   | MUL | 0                 |                             |
| c              | char(120) | NO   |     |                   |                             |
| pad            | char(60)  | NO   |     |                   |                             |
| LastModifyTime | datetime  | NO   |     | CURRENT_TIMESTAMP | on update CURRENT_TIMESTAMP |
+----------------+-----------+------+-----+-------------------+-----------------------------+
5 rows in set (0.01 sec)

mysql> create table b1 like t1;
Query OK, 0 rows affected (0.02 sec)

mysql> create table b2 like t1;
Query OK, 0 rows affected (0.04 sec)
mysql> create table b3 like t1;
Query OK, 0 rows affected (0.06 sec)

2. Add the column:
Table b1 uses the bit(1) type;
table b2 uses the tinyint unsigned type;
table b3 uses the bool type.
mysql> alter table b1 add column b bit(1) not null default b'0' comment '0 = no, 1 = yes';
Query OK, 0 rows affected (0.08 sec)
Records: 0  Duplicates: 0  Warnings: 0
mysql> alter table b2 add column b tinyint unsigned not null default 0 comment '0 = no, 1 = yes';
Query OK, 0 rows affected (0.02 sec)
mysql> alter table b3 add column b bool not null default false comment 'false = no (0), true = yes (1)';
Query OK, 0 rows affected (0.44 sec)
Records: 0  Duplicates: 0  Warnings: 0

3. Insert the data:
mysql> insert into b1 (id, k, c, pad, LastModifyTime) select * from t1;
Query OK, 10000000 rows affected (3 min 12.15 sec)
Records: 10000000  Duplicates: 0  Warnings: 0

mysql> insert into b2 (id, k, c, pad, LastModifyTime) select * from t1;
Query OK, 10000000 rows affected (3 min 10.29 sec)
Records: 10000000  Duplicates: 0  Warnings: 0
mysql> insert into b3 (id, k, c, pad, LastModifyTime) select * from t1;
Query OK, 10000000 rows affected (3 min 10.29 sec)
Records: 10000000  Duplicates: 0  Warnings: 0

4. Update the data:
-- Update statements:
mysql> update b1 set b=case when id%2=1 then b'1' when id%2=0 then b'0' end;
Query OK, 5000000 rows affected (2 min 3.22 sec)
Rows matched: 10000000  Changed: 5000000  Warnings: 0

mysql> update b2 set b=case when id%2=1 then 1 when id%2=0 then 0 end;
Query OK, 5000000 rows affected (1 min 50.12 sec)
Rows matched: 10000000  Changed: 5000000  Warnings: 0

mysql> update b3 set b=case when id%2=1 then true when id%2=0 then false end;
Query OK, 5000000 rows affected (1 min 51.17 sec)
Rows matched: 10000000  Changed: 5000000  Warnings: 0

5. Query the data:
-- Queries:
mysql> select id,k,lastmodifytime,bin(b) from b1 limit 2;
+----+---------+---------------------+--------+
| id | k       | lastmodifytime      | bin(b) |
+----+---------+---------------------+--------+
|  1 | 5014614 | 2018-09-13 09:38:02 | 1      |
|  2 | 5024801 | 2018-09-12 14:39:49 | 0      |
+----+---------+---------------------+--------+
2 rows in set (0.00 sec)

mysql> select id,k,lastmodifytime,b from b2 limit 2;
+----+---------+---------------------+---+
| id | k       | lastmodifytime      | b |
+----+---------+---------------------+---+
|  1 | 5014614 | 2018-09-13 09:42:06 | 1 |
|  2 | 5024801 | 2018-09-12 14:39:49 | 0 |
+----+---------+---------------------+---+
2 rows in set (0.00 sec)

mysql> select id,k,lastmodifytime,b from b3 limit 2;
+----+---------+---------------------+---+
| id | k       | lastmodifytime      | b |
+----+---------+---------------------+---+
|  1 | 5014614 | 2018-09-13 09:48:28 | 1 |
|  2 | 5024801 | 2018-09-12 14:39:49 | 0 |
+----+---------+---------------------+---+
2 rows in set (0.00 sec)

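When b is a bit column, a function such as bin() is needed to display it readably; selecting the raw column in the mysql command-line client shows what looks like an empty value: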
mysql> select id,k,lastmodifytime,b from b1 limit 2;
+----+---------+---------------------+---+
| id | k       | lastmodifytime      | b |
+----+---------+---------------------+---+
|  1 | 5014614 | 2018-09-13 09:38:02 |   |
|  2 | 5024801 | 2018-09-12 14:39:49 |   |
+----+---------+---------------------+---+
2 rows in set (0.00 sec)
Querying from a GUI client such as SQLyog:
Query on b1:
    id        k  lastmodifytime       b       
------  -------  -------------------  --------
     1  5014614  2018-09-13 09:38:02  b'1'    
     2  5024801  2018-09-12 14:39:49  b'000000
Queries on b2 and b3:
    id        k  lastmodifytime            b  
------  -------  -------------------  --------
     1  5014614  2018-09-13 09:48:28         1
     2  5024801  2018-09-12 14:39:49         0

Conclusion: for readability, tinyint and bool are equivalent, whereas bit needs a conversion function.
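
The blank b column in the mysql client has a simple cause: the value arrives as raw bytes, and a bit(1) value of 0 or 1 is the byte 0x00 or 0x01, neither of which has a visible glyph. The Python sketch below imitates that display step outside MySQL. `raw_display` is a hypothetical helper written for this illustration, not part of any MySQL client, and it assumes the client writes the bytes to the terminal unchanged.

```python
def raw_display(value: int, width_bits: int = 1) -> str:
    """Return the text a terminal would receive for a BIT(width_bits) value.

    Assumption: the client emits the value's bytes unchanged (big-endian).
    """
    nbytes = (width_bits + 7) // 8  # bytes needed to hold the column
    return value.to_bytes(nbytes, "big").decode("latin-1")


for value, width in [(0, 1), (1, 1), (48, 8), (65, 8)]:
    text = raw_display(value, width)
    # repr() makes control characters visible instead of printing them
    state = "printable" if text.isprintable() else "not printable"
    print(value, repr(text), state)
```

Only values that happen to be codes of printable characters, such as 48 or 65, show up as text; 0 and 1 do not, which is why bin(b) or cast(b as signed) is needed.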

6. Storage space:
Because statements such as update leave fragmentation in the table, defragment the tables before checking their real size on disk:
mysql>optimize table b1;
mysql>optimize table b2;
mysql>optimize table b3;
# du -sb b*
2868903936 b1.ibd
2868903936 b2.ibd
2868903936 b3.ibd
Conclusion: bool and bit(1) take the same storage space as tinyint.

7. Query verification:
The correct query for table b1 (b1 uses the bit type):
 select id,k,lastmodifytime,b,bin(b) from b1 where b=b'0' limit 2;
The correct query for table b2 (b2 uses the tinyint type):
select id,k,lastmodifytime,b,bin(b) from b2 where b=0 limit 2;
The correct query for table b3 (b3 uses the boolean type):
select id,k,lastmodifytime,b,bin(b) from b3 where b is false limit 2;

Now run all of the above predicates against B1, B2 and B3:
-- Table B1:
mysql> select id,k,lastmodifytime,b,bin(b) from b1 where b=b'0' limit 2;
+----+---------+---------------------+---+--------+
| id | k       | lastmodifytime      | b | bin(b) |
+----+---------+---------------------+---+--------+
|  2 | 5024801 | 2018-09-12 14:39:49 |   | 0      |
|  4 | 5026450 | 2018-09-12 14:39:49 |   | 0      |
+----+---------+---------------------+---+--------+
2 rows in set (0.03 sec)

mysql> select id,k,lastmodifytime,b,bin(b) from b1 where b=0 limit 2;
+----+---------+---------------------+---+--------+
| id | k       | lastmodifytime      | b | bin(b) |
+----+---------+---------------------+---+--------+
|  2 | 5024801 | 2018-09-12 14:39:49 |   | 0      |
|  4 | 5026450 | 2018-09-12 14:39:49 |   | 0      |
+----+---------+---------------------+---+--------+
2 rows in set (0.03 sec)

mysql> select id,k,lastmodifytime,b,bin(b) from b1 where b=false limit 2;
+----+---------+---------------------+---+--------+
| id | k       | lastmodifytime      | b | bin(b) |
+----+---------+---------------------+---+--------+
|  2 | 5024801 | 2018-09-12 14:39:49 |   | 0      |
|  4 | 5026450 | 2018-09-12 14:39:49 |   | 0      |
+----+---------+---------------------+---+--------+
2 rows in set (0.00 sec)

mysql> select id,k,lastmodifytime,b,bin(b) from b1 where b is false limit 2;
+----+---------+---------------------+---+--------+
| id | k       | lastmodifytime      | b | bin(b) |
+----+---------+---------------------+---+--------+
|  2 | 5024801 | 2018-09-12 14:39:49 |   | 0      |
|  4 | 5026450 | 2018-09-12 14:39:49 |   | 0      |
+----+---------+---------------------+---+--------+
2 rows in set (0.03 sec)

The results are identical.

-- Table B2:
mysql>  select id,k,lastmodifytime,b,bin(b) from b2 where b=b'0' limit 2;
+----+---------+---------------------+---+--------+
| id | k       | lastmodifytime      | b | bin(b) |
+----+---------+---------------------+---+--------+
|  2 | 5024801 | 2018-09-12 14:39:49 | 0 | 0      |
|  4 | 5026450 | 2018-09-12 14:39:49 | 0 | 0      |
+----+---------+---------------------+---+--------+
2 rows in set (0.00 sec)

mysql>  select id,k,lastmodifytime,b,bin(b) from b2 where b=0 limit 2;
+----+---------+---------------------+---+--------+
| id | k       | lastmodifytime      | b | bin(b) |
+----+---------+---------------------+---+--------+
|  2 | 5024801 | 2018-09-12 14:39:49 | 0 | 0      |
|  4 | 5026450 | 2018-09-12 14:39:49 | 0 | 0      |
+----+---------+---------------------+---+--------+
2 rows in set (0.00 sec)

mysql>  select id,k,lastmodifytime,b,bin(b) from b2 where b is false limit 2;
+----+---------+---------------------+---+--------+
| id | k       | lastmodifytime      | b | bin(b) |
+----+---------+---------------------+---+--------+
|  2 | 5024801 | 2018-09-12 14:39:49 | 0 | 0      |
|  4 | 5026450 | 2018-09-12 14:39:49 | 0 | 0      |
+----+---------+---------------------+---+--------+
2 rows in set (0.00 sec)

-- Table B3:
mysql> select id,k,lastmodifytime,b,bin(b) from b3 where b=b'0' limit 2;
+----+---------+---------------------+---+--------+
| id | k       | lastmodifytime      | b | bin(b) |
+----+---------+---------------------+---+--------+
|  2 | 5024801 | 2018-09-12 14:39:49 | 0 | 0      |
|  4 | 5026450 | 2018-09-12 14:39:49 | 0 | 0      |
+----+---------+---------------------+---+--------+
2 rows in set (0.03 sec)

mysql> select id,k,lastmodifytime,b,bin(b) from b3 where b=0 limit 2;
+----+---------+---------------------+---+--------+
| id | k       | lastmodifytime      | b | bin(b) |
+----+---------+---------------------+---+--------+
|  2 | 5024801 | 2018-09-12 14:39:49 | 0 | 0      |
|  4 | 5026450 | 2018-09-12 14:39:49 | 0 | 0      |
+----+---------+---------------------+---+--------+
2 rows in set (0.00 sec)

mysql> select id,k,lastmodifytime,b,bin(b) from b3 where b is false limit 2;
+----+---------+---------------------+---+--------+
| id | k       | lastmodifytime      | b | bin(b) |
+----+---------+---------------------+---+--------+
|  2 | 5024801 | 2018-09-12 14:39:49 | 0 | 0      |
|  4 | 5026450 | 2018-09-12 14:39:49 | 0 | 0      |
+----+---------+---------------------+---+--------+
2 rows in set (0.00 sec)

All three forms of the predicate return the rows directly on every table.



8. Querying with an index on column b:
mysql> alter table b1 add key ix_b(b);
Query OK, 0 rows affected (11.45 sec)
Records: 0  Duplicates: 0  Warnings: 0
mysql> alter table b2 add key ix_b(b);
Query OK, 0 rows affected (11.93 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql> alter table b3 add key ix_b(b);
Query OK, 0 rows affected (11.59 sec)
Records: 0  Duplicates: 0  Warnings: 0

-- Execution plans for queries on table B1:
mysql> explain select id,k,lastmodifytime,b,bin(b),cast(b as signed) from b1 where b=b'0' limit 2\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: b1
   partitions: NULL
         type: ref
possible_keys: ix_b
          key: ix_b
      key_len: 1
          ref: const
         rows: 4929222
     filtered: 100.00
        Extra: NULL
1 row in set, 1 warning (0.00 sec)
The query uses the index ix_b.
mysql> explain select id,k,lastmodifytime,b,bin(b),cast(b as signed) from b1 where b='0' limit 2\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: NULL
   partitions: NULL
         type: NULL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: NULL
     filtered: NULL
        Extra: no matching row in const table
1 row in set, 1 warning (0.00 sec)
The plan reports that no row can match.

mysql> explain select id,k,lastmodifytime,b,bin(b),cast(b as signed) from b1 where b=0 limit 2\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: b1
   partitions: NULL
         type: ref
possible_keys: ix_b
          key: ix_b
      key_len: 1
          ref: const
         rows: 4929222
     filtered: 100.00
        Extra: NULL
1 row in set, 1 warning (0.02 sec)

*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: b1
   partitions: NULL
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 9858444
     filtered: 100.00
        Extra: Using where
1 row in set, 1 warning (0.02 sec)
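Full table scan. (The statement that produced this plan is missing here; judging by the b2 test later in this section, it is presumably the where b is false form.)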

mysql> explain select id,k,lastmodifytime,b,bin(b),cast(b as signed) from b1 where b=false limit 2\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: b1
   partitions: NULL
         type: ref
possible_keys: ix_b
          key: ix_b
      key_len: 1
          ref: const
         rows: 4929222
     filtered: 100.00
        Extra: NULL
1 row in set, 1 warning (0.00 sec)

This one uses the index.
Now run the queries for real:
mysql> select id,k,lastmodifytime,b,bin(b),cast(b as signed) from b1 where b=b'0' limit 2;
+----+---------+---------------------+---+--------+-------------------+
| id | k       | lastmodifytime      | b | bin(b) | cast(b as signed) |
+----+---------+---------------------+---+--------+-------------------+
|  2 | 5024801 | 2018-09-12 14:39:49 |   | 0      |                 0 |
|  4 | 5026450 | 2018-09-12 14:39:49 |   | 0      |                 0 |
+----+---------+---------------------+---+--------+-------------------+
2 rows in set (0.00 sec)

mysql> select id,k,lastmodifytime,b,bin(b),cast(b as signed) from b1 where b=0 limit 2;
+----+---------+---------------------+---+--------+-------------------+
| id | k       | lastmodifytime      | b | bin(b) | cast(b as signed) |
+----+---------+---------------------+---+--------+-------------------+
|  2 | 5024801 | 2018-09-12 14:39:49 |   | 0      |                 0 |
|  4 | 5026450 | 2018-09-12 14:39:49 |   | 0      |                 0 |
+----+---------+---------------------+---+--------+-------------------+
2 rows in set (0.00 sec)

mysql> select id,k,lastmodifytime,b,bin(b),cast(b as signed) from b1 where b=false limit 2;
+----+---------+---------------------+---+--------+-------------------+
| id | k       | lastmodifytime      | b | bin(b) | cast(b as signed) |
+----+---------+---------------------+---+--------+-------------------+
|  2 | 5024801 | 2018-09-12 14:39:49 |   | 0      |                 0 |
|  4 | 5026450 | 2018-09-12 14:39:49 |   | 0      |                 0 |
+----+---------+---------------------+---+--------+-------------------+
2 rows in set (0.00 sec)

mysql> select id,k,lastmodifytime,b,bin(b),cast(b as signed) from b1 where b='0' limit 2;
Empty set (0.00 sec)

mysql> select id,k,lastmodifytime,b,bin(b),cast(b as signed) from b1 where b is false limit 2;
+----+---------+---------------------+---+--------+-------------------+
| id | k       | lastmodifytime      | b | bin(b) | cast(b as signed) |
+----+---------+---------------------+---+--------+-------------------+
|  2 | 5024801 | 2018-09-12 14:39:49 |   | 0      |                 0 |
|  4 | 5026450 | 2018-09-12 14:39:49 |   | 0      |                 0 |
+----+---------+---------------------+---+--------+-------------------+
2 rows in set (0.00 sec)

Conclusion: for the bit type, the most correct way to query is
select id,k,lastmodifytime,b,bin(b),cast(b as signed) from b1 where b=b'0' limit 2;
The other forms involve an implicit type conversion.


Execution plans for the tinyint type:
mysql> explain select id,k,lastmodifytime,b,bin(b),cast(b as signed) from b2 where b is false limit 2\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: b2
   partitions: NULL
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 9858444
     filtered: 100.00
        Extra: Using where
1 row in set, 1 warning (0.00 sec)
Full table scan.
The following statement, by contrast, produces this plan:
mysql> explain select id,k,lastmodifytime,b,bin(b),cast(b as signed) from b2 where b=b'0' limit 2\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: b2
   partitions: NULL
         type: ref
possible_keys: ix_b
          key: ix_b
      key_len: 1
          ref: const
         rows: 4929222
     filtered: 100.00
        Extra: NULL
1 row in set, 1 warning (0.00 sec)
Query statements:
select id,k,lastmodifytime,b,bin(b),cast(b as signed) from b2 where b=b'0' limit 2;
select id,k,lastmodifytime,b,bin(b),cast(b as signed) from b2 where b=0 limit 2;
select id,k,lastmodifytime,b,bin(b),cast(b as signed) from b2 where b='0' limit 2;
select id,k,lastmodifytime,b,bin(b),cast(b as signed) from b2 where b=false limit 2;

-- For the boolean type:
The execution plans are the same as for B2.

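Analysis:
MySQL returns BIT values as binary strings rather than as numbers. When a BIT(1) column is retrieved, the result is a string whose content is the binary value 0 or 1, not the ASCII character "0" or "1".
Conclusion:
When a bit column is compared with a constant, MySQL converts the constant in the where condition implicitly. A numeric constant (0, false) is compared by value and, as the plans above show, can still use the index. A string constant is compared through the character codes of its bytes: '0' is ASCII 48, which a bit(1) column can never hold, so the result is empty.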

9. The bit type in ETL:
A boolean column is stored in MySQL as 0 and 1, and a query returns those values directly.
A bit(1) column has to be converted:
mysql> select id,k,lastmodifytime,b,bin(b),cast(b as signed) from b1 where b is false limit 2;
+----+---------+---------------------+---+--------+-------------------+
| id | k       | lastmodifytime      | b | bin(b) | cast(b as signed) |
+----+---------+---------------------+---+--------+-------------------+
|  2 | 5024801 | 2018-09-12 14:39:49 |   | 0      |                 0 |
|  4 | 5026450 | 2018-09-12 14:39:49 |   | 0      |                 0 |
+----+---------+---------------------+---+--------+-------------------+
2 rows in set (0.00 sec)
As shown, without a conversion the column comes back looking empty, which makes it awkward for other databases to read during ETL.

10. Boolean comparisons on MySQL bit values and how bit values are stored:
mysql> select b'0'=0,b'0'='0',b'110000'='0',b'110000'+0;
+--------+----------+---------------+-------------+
| b'0'=0 | b'0'='0' | b'110000'='0' | b'110000'+0 |
+--------+----------+---------------+-------------+
|      1 |        0 |             1 |          48 |
+--------+----------+---------------+-------------+
1 row in set (0.03 sec)

mysql> select b'1'=1,b'1'='1',b'110001'='1',b'110001'+0;
+--------+----------+---------------+-------------+
| b'1'=1 | b'1'='1' | b'110001'='1' | b'110001'+0 |
+--------+----------+---------------+-------------+
|      1 |        0 |             1 |          49 |
+--------+----------+---------------+-------------+
1 row in set (0.00 sec)
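Conclusion: b'110000' is the number 48, which is the ASCII code of the character '0'; that is why it compares equal to the string '0' while b'0' does not.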
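
The four results in each row above can be reproduced with a small model: a bit literal behaves as a number when compared with a number, and as a byte string when compared with a string. The Python sketch below encodes just that rule. The function names are made up for this illustration, and this is a simplified model of the observed results, not MySQL's actual comparison code.

```python
def bit_value(bits: str) -> int:
    """Numeric value of a bit literal such as b'110000'."""
    return int(bits, 2)


def bit_bytes(bits: str) -> bytes:
    """Byte-string form of the same literal (one byte per 8 bits, at least one)."""
    nbytes = max(1, (len(bits) + 7) // 8)
    return bit_value(bits).to_bytes(nbytes, "big")


def bit_equals(bits: str, other) -> bool:
    """Model of the comparison b'...' = other for the cases shown above."""
    if isinstance(other, str):
        # String operand: compare the raw bytes.
        return bit_bytes(bits) == other.encode("ascii")
    # Numeric operand: compare by value.
    return bit_value(bits) == other


# Same expressions as the two select statements above.
print(bit_equals("0", 0), bit_equals("0", "0"), bit_equals("110000", "0"), bit_value("110000"))
print(bit_equals("1", 1), bit_equals("1", "1"), bit_equals("110001", "1"), bit_value("110001"))
```

Under this model b'0' is the single byte 0x00 and the string '0' is the byte 0x30, so they differ, while b'110000' is exactly 0x30.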

Verification:
mysql> create table b(a bit(64));
Query OK, 0 rows affected (0.04 sec)

mysql> create table bb(a bit(65));
ERROR 1439 (42000): Display width out of range for column 'a' (max = 64)

-- Insert data:
insert into b values(b'1'),(b'0'),(b'01000001'),(b'01011010'),(b'01100001'),(b'00110000'),(b'00110001');
mysql> select a,a+0,bin(a),oct(a),hex(a) from b;
+----------+------+---------+--------+--------+
| a        | a+0  | bin(a)  | oct(a) | hex(a) |
+----------+------+---------+--------+--------+
|          |    1 | 1       | 1      | 1      |
|          |    0 | 0       | 0      | 0      |
|        A |   65 | 1000001 | 101    | 41     |
|        Z |   90 | 1011010 | 132    | 5A     |
|        a |   97 | 1100001 | 141    | 61     |
|        0 |   48 | 110000  | 60     | 30     |
|        1 |   49 | 110001  | 61     | 31     |
+----------+------+---------+--------+--------+
7 rows in set (0.02 sec)

mysql> select lpad(bin(a),8,'0') la,a,a+0,bin(a),oct(a),hex(a) from b;
+----------+----------+------+---------+--------+--------+
| la       | a        | a+0  | bin(a)  | oct(a) | hex(a) |
+----------+----------+------+---------+--------+--------+
| 00000000 |          |    0 | 0       | 0      | 0      |
| 00000001 |          |    1 | 1       | 1      | 1      |
| 00110000 |        0 |   48 | 110000  | 60     | 30     |
| 00110001 |        1 |   49 | 110001  | 61     | 31     |
| 01000001 |        A |   65 | 1000001 | 101    | 41     |
| 01011010 |        Z |   90 | 1011010 | 132    | 5A     |
| 01100001 |        a |   97 | 1100001 | 141    | 61     |
+----------+----------+------+---------+--------+--------+
7 rows in set (0.00 sec)

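Conclusion: the bit type is declared as bit(M), where M ranges from 1 to 64. A bit column stores a binary string.
The type therefore spans bit(1) to bit(64); in decimal, bit(64) covers 0 to 2^64 - 1.
tinyint unsigned covers 0 to 255 in decimal.
A bit(M) column takes approximately (M+7)/8 bytes (integer division), while tinyint takes 1 byte.
So bit(1) and tinyint use the same amount of storage.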
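
The size and range figures above are plain arithmetic and can be checked with a short sketch. The Python below simply evaluates the (M+7)/8 rule and the 2^M - 1 upper bound; `bit_storage_bytes` and `bit_max_value` are hypothetical helper names for this illustration.

```python
def bit_storage_bytes(m: int) -> int:
    """Approximate storage for a BIT(M) column: (M + 7) / 8 with integer division."""
    if not 1 <= m <= 64:
        raise ValueError("BIT(M) requires 1 <= M <= 64")
    return (m + 7) // 8


def bit_max_value(m: int) -> int:
    """Largest unsigned value that fits in M bits."""
    if not 1 <= m <= 64:
        raise ValueError("BIT(M) requires 1 <= M <= 64")
    return 2 ** m - 1


for m in (1, 8, 9, 64):
    print(f"bit({m}): {bit_storage_bytes(m)} byte(s), max value {bit_max_value(m)}")
```

bit(1) through bit(8) all need one byte, the same as tinyint, so bit(1) saves no space over tinyint for a single flag column.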


11. JDBC and the bit type:

MySQL Type Name | Return value of GetColumnTypeName | Return value of GetColumnClassName
BIT(1)          | BIT                               | java.lang.Boolean
BIT( > 1)       | BIT                               | byte[]
TINYINT         | TINYINT                           | java.lang.Boolean if the configuration property tinyInt1isBit is set to true (the default) and the storage size is 1, or java.lang.Integer if not.
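In some applications a tinyint column is therefore read as true or false by default rather than as the intended 0 and 1,
because JDBC treats tinyint(1) as java.sql.Types.BIT.
In that case, add a setting like the following to the connection URL: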
jdbc:mysql://localhost/databaseName?tinyInt1isBit=false
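
The mapping in the table above can be restated as a small decision function, which makes the effect of tinyInt1isBit easier to see. The Python sketch below only mirrors the three table rows; `column_class_name` and its parameters are hypothetical names for this illustration and are not a Connector/J API.

```python
def column_class_name(mysql_type: str, size: int, tiny_int1_is_bit: bool = True) -> str:
    """Java class name the table above lists for a column type and size."""
    mysql_type = mysql_type.upper()
    if mysql_type == "BIT":
        # BIT(1) maps to Boolean, wider BIT columns to a byte array.
        return "java.lang.Boolean" if size == 1 else "byte[]"
    if mysql_type == "TINYINT":
        # Only size 1 with tinyInt1isBit=true is read as Boolean.
        if tiny_int1_is_bit and size == 1:
            return "java.lang.Boolean"
        return "java.lang.Integer"
    raise ValueError(f"type not covered by the table: {mysql_type}")


print(column_class_name("TINYINT", 1))                          # default setting
print(column_class_name("TINYINT", 1, tiny_int1_is_bit=False))  # with tinyInt1isBit=false
```

With the property set to false, a size-1 tinyint falls through to the java.lang.Integer branch, which is the behaviour the connection URL above asks for.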

Conclusion:
For readability, general familiarity and compatibility across software, tinyint is the recommended way to represent yes/no.