MySQL 8.0.12: A Deep Dive into the BIT Type
阿新 • Published: 2018-12-09
Background:
The MySQL table design conventions recommended by Alibaba require the following: a column that expresses a yes/no concept must be named is_xxx, and its data type must be unsigned tinyint (1 means yes, 0 means no).

Explanation:
MySQL offers three ways to represent yes/no:
1. The bit(1) type. The column stores a single binary bit, 0 or 1 (not the ASCII characters '0' and '1'); 0 means no and 1 means yes.
2. The tinyint unsigned type. The column stores the decimal numbers 0 and 1.
3. The boolean type. MySQL does not truly support this type; it exists for compatibility with other databases, and BOOL/BOOLEAN is a synonym for TINYINT(1). TRUE is equivalent to 1 (yes) and FALSE is equivalent to 0 (no); both keywords are case-insensitive, and they are stored as 1 and 0.

Experiment:
1. Use sysbench to create a table t1 with 10 million rows, then create three copies of it to hold the different column types:

mysql> desc t1;
+----------------+-----------+------+-----+-------------------+-----------------------------+
| Field          | Type      | Null | Key | Default           | Extra                       |
+----------------+-----------+------+-----+-------------------+-----------------------------+
| id             | int(11)   | NO   | PRI | NULL              | auto_increment              |
| k              | int(11)   | NO   | MUL | 0                 |                             |
| c              | char(120) | NO   |     |                   |                             |
| pad            | char(60)  | NO   |     |                   |                             |
| LastModifyTime | datetime  | NO   |     | CURRENT_TIMESTAMP | on update CURRENT_TIMESTAMP |
+----------------+-----------+------+-----+-------------------+-----------------------------+
5 rows in set (0.01 sec)

mysql> create table b1 like t1;
Query OK, 0 rows affected (0.02 sec)

mysql> create table b2 like t1;
Query OK, 0 rows affected (0.04 sec)

mysql> create table b3 like t1;
Query OK, 0 rows affected (0.06 sec)

2. Add the column:
Table b1 uses the bit(1) type;
table b2 uses the tinyint unsigned type;
table b3 uses the bool type.

mysql> alter table b1 add column b bit(1) not null default b'0' comment '0否 1是';
Query OK, 0 rows affected (0.08 sec)
Records: 0 Duplicates: 0 Warnings: 0

mysql> alter table b2 add column b tinyint unsigned not null default 0 comment '0否 1是';
Query OK, 0 rows affected (0.02 sec)

mysql> alter table b3 add column b bool not null default false comment 'false否0 true 是1';
Query OK, 0 rows affected (0.44 sec)
Records: 0 Duplicates: 0 Warnings: 0

3. Insert the data:

mysql> insert into b1 select * from t1;
Query OK, 10000000 rows affected (3 min 12.15 sec)
Records: 10000000 Duplicates: 0 Warnings: 0

mysql> insert into b2 select * from t1;
Query OK, 10000000 rows affected (3 min 10.29 sec)
Records: 10000000 Duplicates: 0 Warnings: 0

mysql> insert into b3 select * from t1;
Query OK, 10000000 rows affected (3 min 10.29 sec)
Records: 10000000 Duplicates: 0 Warnings: 0

4. Update the data:

-- Update:
mysql> update b1 set b=case when id%2=1 then b'1' when id%2=0 then b'0' end;
Query OK, 5000000 rows affected (2 min 3.22 sec)
Rows matched: 10000000 Changed: 5000000 Warnings: 0

mysql> update b2 set b=case when id%2=1 then 1 when id%2=0 then 0 end;
Query OK, 5000000 rows affected (1 min 50.12 sec)
Rows matched: 10000000 Changed: 5000000 Warnings: 0

mysql> update b3 set b=case when id%2=1 then true when id%2=0 then false end;
Query OK, 5000000 rows affected (1 min 51.17 sec)
Rows matched: 10000000 Changed: 5000000 Warnings: 0

5. Query the data:

-- Query:
mysql> select id,k,lastmodifytime,bin(b) from b1 limit 2;
+----+---------+---------------------+--------+
| id | k       | lastmodifytime      | bin(b) |
+----+---------+---------------------+--------+
|  1 | 5014614 | 2018-09-13 09:38:02 | 1      |
|  2 | 5024801 | 2018-09-12 14:39:49 | 0      |
+----+---------+---------------------+--------+
2 rows in set (0.00 sec)

mysql> select id,k,lastmodifytime,b from b2 limit 2;
+----+---------+---------------------+---+
| id | k       | lastmodifytime      | b |
+----+---------+---------------------+---+
|  1 | 5014614 | 2018-09-13 09:42:06 | 1 |
|  2 | 5024801 | 2018-09-12 14:39:49 | 0 |
+----+---------+---------------------+---+
2 rows in set (0.00 sec)

mysql> select id,k,lastmodifytime,b from b3 limit 2;
+----+---------+---------------------+---+
| id | k       | lastmodifytime      | b |
+----+---------+---------------------+---+
|  1 | 5014614 | 2018-09-13 09:48:28 | 1 |
|  2 | 5024801 | 2018-09-12 14:39:49 | 0 |
+----+---------+---------------------+---+
2 rows in set (0.00 sec)

When b is a bit column, the bin() function is needed to display it correctly; selecting the column directly in the command-line client shows an empty value:

mysql> select id,k,lastmodifytime,b from b1 limit 2;
+----+---------+---------------------+---+
| id | k       | lastmodifytime      | b |
+----+---------+---------------------+---+
|  1 | 5014614 | 2018-09-13 09:38:02 |   |
|  2 | 5024801 | 2018-09-12 14:39:49 |   |
+----+---------+---------------------+---+
2 rows in set (0.00 sec)

Querying from a client tool such as SQLyog:

Query on b1:
    id  k        lastmodifytime       b
------  -------  -------------------  --------
     1  5014614  2018-09-13 09:38:02  b'1'
     2  5024801  2018-09-12 14:39:49  b'000000

Query on b2 and b3:
    id  k        lastmodifytime       b
------  -------  -------------------  --------
     1  5014614  2018-09-13 09:48:28  1
     2  5024801  2018-09-12 14:39:49  0

Conclusion: in terms of readability, tinyint and bool are equivalent, whereas bit requires a conversion function.

6. Storage space:
Statements such as update leave fragmentation in the table, so defragment the tables before measuring their actual size:

mysql> optimize table b1;
mysql> optimize table b2;
mysql> optimize table b3;

# du -sb b*
2868903936 b1.ibd
2868903936 b2.ibd
2868903936 b3.ibd

Conclusion: bool and bit(1) occupy the same storage as tinyint.

7. Query verification:
The correct query for table b1 (bit type):
select id,k,lastmodifytime,b,bin(b) from b1 where b=b'0' limit 2;
The correct query for table b2 (tinyint type):
select id,k,lastmodifytime,b,bin(b) from b2 where b=0 limit 2;
The correct query for table b3 (boolean type):
select id,k,lastmodifytime,b,bin(b) from b3 where b is false limit 2;

Now run these forms of the query against b1, b2, and b3:

-- Table b1:
mysql> select id,k,lastmodifytime,b,bin(b) from b1 where b=b'0' limit 2;
+----+---------+---------------------+---+--------+
| id | k       | lastmodifytime      | b | bin(b) |
+----+---------+---------------------+---+--------+
|  2 | 5024801 | 2018-09-12 14:39:49 |   | 0      |
|  4 | 5026450 | 2018-09-12 14:39:49 |   | 0      |
+----+---------+---------------------+---+--------+
2 rows in set (0.03 sec)

mysql> select id,k,lastmodifytime,b,bin(b) from b1 where b=0 limit 2;
+----+---------+---------------------+---+--------+
| id | k       | lastmodifytime      | b | bin(b) |
+----+---------+---------------------+---+--------+
|  2 | 5024801 | 2018-09-12 14:39:49 |   | 0      |
|  4 | 5026450 | 2018-09-12 14:39:49 |   | 0      |
+----+---------+---------------------+---+--------+
2 rows in set (0.03 sec)

mysql> select id,k,lastmodifytime,b,bin(b) from b1 where b=false limit 2;
+----+---------+---------------------+---+--------+
| id | k       | lastmodifytime      | b | bin(b) |
+----+---------+---------------------+---+--------+
|  2 | 5024801 | 2018-09-12 14:39:49 |   | 0      |
|  4 | 5026450 | 2018-09-12 14:39:49 |   | 0      |
+----+---------+---------------------+---+--------+
2 rows in set (0.00 sec)

mysql> select id,k,lastmodifytime,b,bin(b) from b1 where b is false limit 2;
+----+---------+---------------------+---+--------+
| id | k       | lastmodifytime      | b | bin(b) |
+----+---------+---------------------+---+--------+
|  2 | 5024801 | 2018-09-12 14:39:49 |   | 0      |
|  4 | 5026450 | 2018-09-12 14:39:49 |   | 0      |
+----+---------+---------------------+---+--------+
2 rows in set (0.03 sec)

The results are identical.

-- Table b2:
mysql> select id,k,lastmodifytime,b,bin(b) from b2 where b=b'0' limit 2;
+----+---------+---------------------+---+--------+
| id | k       | lastmodifytime      | b | bin(b) |
+----+---------+---------------------+---+--------+
|  2 | 5024801 | 2018-09-12 14:39:49 | 0 | 0      |
|  4 | 5026450 | 2018-09-12 14:39:49 | 0 | 0      |
+----+---------+---------------------+---+--------+
2 rows in set (0.00 sec)

mysql> select id,k,lastmodifytime,b,bin(b) from b2 where b=0 limit 2;
+----+---------+---------------------+---+--------+
| id | k       | lastmodifytime      | b | bin(b) |
+----+---------+---------------------+---+--------+
|  2 | 5024801 | 2018-09-12 14:39:49 | 0 | 0      |
|  4 | 5026450 | 2018-09-12 14:39:49 | 0 | 0      |
+----+---------+---------------------+---+--------+
2 rows in set (0.00 sec)

mysql> select id,k,lastmodifytime,b,bin(b) from b2 where b is false limit 2;
+----+---------+---------------------+---+--------+
| id | k       | lastmodifytime      | b | bin(b) |
+----+---------+---------------------+---+--------+
|  2 | 5024801 | 2018-09-12 14:39:49 | 0 | 0      |
|  4 | 5026450 | 2018-09-12 14:39:49 | 0 | 0      |
+----+---------+---------------------+---+--------+
2 rows in set (0.00 sec)

-- Table b3:
mysql> select id,k,lastmodifytime,b,bin(b) from b3 where b=b'0' limit 2;
+----+---------+---------------------+---+--------+
| id | k       | lastmodifytime      | b | bin(b) |
+----+---------+---------------------+---+--------+
|  2 | 5024801 | 2018-09-12 14:39:49 | 0 | 0      |
|  4 | 5026450 | 2018-09-12 14:39:49 | 0 | 0      |
+----+---------+---------------------+---+--------+
2 rows in set (0.03 sec)

mysql> select id,k,lastmodifytime,b,bin(b) from b3 where b=0 limit 2;
+----+---------+---------------------+---+--------+
| id | k       | lastmodifytime      | b | bin(b) |
+----+---------+---------------------+---+--------+
|  2 | 5024801 | 2018-09-12 14:39:49 | 0 | 0      |
|  4 | 5026450 | 2018-09-12 14:39:49 | 0 | 0      |
+----+---------+---------------------+---+--------+
2 rows in set (0.00 sec)

mysql> select id,k,lastmodifytime,b,bin(b) from b3 where b is false limit 2;
+----+---------+---------------------+---+--------+
| id | k       | lastmodifytime      | b | bin(b) |
+----+---------+---------------------+---+--------+
|  2 | 5024801 | 2018-09-12 14:39:49 | 0 | 0      |
|  4 | 5026450 | 2018-09-12 14:39:49 | 0 | 0      |
+----+---------+---------------------+---+--------+
2 rows in set (0.00 sec)

All three forms of the predicate return results directly on every table.

8. Querying with an index on column b:

mysql> alter table b1 add key ix_b(b);
Query OK, 0 rows affected (11.45 sec)
Records: 0 Duplicates: 0 Warnings: 0

mysql> alter table b2 add key ix_b(b);
Query OK, 0 rows affected (11.93 sec)
Records: 0 Duplicates: 0 Warnings: 0

mysql> alter table b3 add key ix_b(b);
Query OK, 0 rows affected (11.59 sec)
Records: 0 Duplicates: 0 Warnings: 0

-- Execution plans for queries on b1:
mysql> explain select id,k,lastmodifytime,b,bin(b),cast(b as signed) from b1 where b=b'0' limit 2\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: b1
   partitions: NULL
         type: ref
possible_keys: ix_b
          key: ix_b
      key_len: 1
          ref: const
         rows: 4929222
     filtered: 100.00
        Extra: NULL
1 row in set, 1 warning (0.00 sec)

The query uses the index.

mysql> explain select id,k,lastmodifytime,b,bin(b),cast(b as signed) from b1 where b='0' limit 2\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: NULL
   partitions: NULL
         type: NULL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: NULL
     filtered: NULL
        Extra: no matching row in const table
1 row in set, 1 warning (0.00 sec)

The message says that no row matches.

mysql> explain select id,k,lastmodifytime,b,bin(b),cast(b as signed) from b1 where b=0 limit 2\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: b1
   partitions: NULL
         type: ref
possible_keys: ix_b
          key: ix_b
      key_len: 1
          ref: const
         rows: 4929222
     filtered: 100.00
        Extra: NULL
1 row in set, 1 warning (0.02 sec)

*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: b1
   partitions: NULL
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 9858444
     filtered: 100.00
        Extra: Using where
1 row in set, 1 warning (0.02 sec)

Full table scan.

mysql> explain select id,k,lastmodifytime,b,bin(b),cast(b as signed) from b1 where b=false limit 2\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: b1
   partitions: NULL
         type: ref
possible_keys: ix_b
          key: ix_b
      key_len: 1
          ref: const
         rows: 4929222
     filtered: 100.00
        Extra: NULL
1 row in set, 1 warning (0.00 sec)

The index is used.

Now run the queries for real:

mysql> select id,k,lastmodifytime,b,bin(b),cast(b as signed) from b1 where b=b'0' limit 2;
+----+---------+---------------------+---+--------+-------------------+
| id | k       | lastmodifytime      | b | bin(b) | cast(b as signed) |
+----+---------+---------------------+---+--------+-------------------+
|  2 | 5024801 | 2018-09-12 14:39:49 |   | 0      |                 0 |
|  4 | 5026450 | 2018-09-12 14:39:49 |   | 0      |                 0 |
+----+---------+---------------------+---+--------+-------------------+
2 rows in set (0.00 sec)

mysql> select id,k,lastmodifytime,b,bin(b),cast(b as signed) from b1 where b=0 limit 2;
+----+---------+---------------------+---+--------+-------------------+
| id | k       | lastmodifytime      | b | bin(b) | cast(b as signed) |
+----+---------+---------------------+---+--------+-------------------+
|  2 | 5024801 | 2018-09-12 14:39:49 |   | 0      |                 0 |
|  4 | 5026450 | 2018-09-12 14:39:49 |   | 0      |                 0 |
+----+---------+---------------------+---+--------+-------------------+
2 rows in set (0.00 sec)

mysql> select id,k,lastmodifytime,b,bin(b),cast(b as signed) from b1 where b=false limit 2;
+----+---------+---------------------+---+--------+-------------------+
| id | k       | lastmodifytime      | b | bin(b) | cast(b as signed) |
+----+---------+---------------------+---+--------+-------------------+
|  2 | 5024801 | 2018-09-12 14:39:49 |   | 0      |                 0 |
|  4 | 5026450 | 2018-09-12 14:39:49 |   | 0      |                 0 |
+----+---------+---------------------+---+--------+-------------------+
2 rows in set (0.00 sec)

mysql> select id,k,lastmodifytime,b,bin(b),cast(b as signed) from b1 where b='0' limit 2;
Empty set (0.00 sec)

mysql> select id,k,lastmodifytime,b,bin(b),cast(b as signed) from b1 where b is false limit 2;
+----+---------+---------------------+---+--------+-------------------+
| id | k       | lastmodifytime      | b | bin(b) | cast(b as signed) |
+----+---------+---------------------+---+--------+-------------------+
|  2 | 5024801 | 2018-09-12 14:39:49 |   | 0      |                 0 |
|  4 | 5026450 | 2018-09-12 14:39:49 |   | 0      |                 0 |
+----+---------+---------------------+---+--------+-------------------+
2 rows in set (0.00 sec)

Conclusion: for the bit type, the most correct form of the query is
select id,k,lastmodifytime,b,bin(b),cast(b as signed) from b1 where b=b'0' limit 2;
The other forms trigger implicit type conversion.

Execution plans for the tinyint type:

mysql> explain select id,k,lastmodifytime,b,bin(b),cast(b as signed) from b2 where b is false limit 2\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: b2
   partitions: NULL
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 9858444
     filtered: 100.00
        Extra: Using where
1 row in set, 1 warning (0.00 sec)

Full table scan.

The queries listed after this plan produce the following execution plan:

mysql> explain select id,k,lastmodifytime,b,bin(b),cast(b as signed) from b2 where b=b'0' limit 2\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: b2
   partitions: NULL
         type: ref
possible_keys: ix_b
          key: ix_b
      key_len: 1
          ref: const
         rows: 4929222
     filtered: 100.00
        Extra: NULL
1 row in set, 1 warning (0.00 sec)

Queries:
select id,k,lastmodifytime,b,bin(b),cast(b as signed) from b2 where b=b'0' limit 2;
select id,k,lastmodifytime,b,bin(b),cast(b as signed) from b2 where b=0 limit 2;
select id,k,lastmodifytime,b,bin(b),cast(b as signed) from b2 where b='0' limit 2;
select id,k,lastmodifytime,b,bin(b),cast(b as signed) from b2 where b=false limit 2;

-- For the boolean type:
The execution plans are the same as for b2.

Analysis:
MySQL treats BIT as a string type rather than a numeric type. When a BIT(1) column is retrieved, the result is a string whose content is the binary bit 0 or 1, not the ASCII character "0" or "1".
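The blank b column in the command-line output can be pictured with a rough model, written here in Python rather than SQL. It is only a sketch under an assumption: that the client receives each BIT(1) value as one raw byte. The helper name bit1_as_bytes is hypothetical and not part of MySQL.

```python
def bit1_as_bytes(value: int) -> bytes:
    """Model a BIT(1) value as the single raw byte a client would see (assumption)."""
    if value not in (0, 1):
        raise ValueError("BIT(1) holds only 0 or 1")
    return bytes([value])


zero, one = bit1_as_bytes(0), bit1_as_bytes(1)

# Bytes 0x00 and 0x01 are control characters, so a terminal shows nothing readable.
print(zero.decode("ascii").isprintable())  # False
print(one.decode("ascii").isprintable())   # False

# The ASCII characters '0' and '1' are different bytes (0x30 and 0x31).
print(zero == b"0", one == b"1")  # False False

# Numeric conversion, analogous to bin(b) or cast(b as signed) in the queries above.
print(int.from_bytes(one, "big"))  # 1
```

Under this model, b='0' compares the byte 0x00 with the byte 0x30, which is consistent with the empty result seen for that predicate on b1.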
Conclusion:
For bit columns the index is not used by every form of the predicate. Whether the value in the WHERE condition is written as a number or as a character string, MySQL applies type conversion to it: a character is compared through its ASCII code rather than as the digit it displays. If the converted value matches the stored bit, rows are returned; otherwise the result is empty.

9. The bit type in ETL:
Boolean columns are stored as 0 and 1 in MySQL, so they can be queried directly.
Columns stored as bit(1) need to be converted:

mysql> select id,k,lastmodifytime,b,bin(b),cast(b as signed) from b1 where b is false limit 2;
+----+---------+---------------------+---+--------+-------------------+
| id | k       | lastmodifytime      | b | bin(b) | cast(b as signed) |
+----+---------+---------------------+---+--------+-------------------+
|  2 | 5024801 | 2018-09-12 14:39:49 |   | 0      |                 0 |
|  4 | 5026450 | 2018-09-12 14:39:49 |   | 0      |                 0 |
+----+---------+---------------------+---+--------+-------------------+
2 rows in set (0.00 sec)

Without a conversion the column displays as empty, which makes the data awkward for other databases to read during ETL.

10. Boolean operations on MySQL bit values and how bit values are stored:

mysql> select b'0'=0,b'0'='0',b'110000'='0',b'110000'+0;
+--------+----------+---------------+-------------+
| b'0'=0 | b'0'='0' | b'110000'='0' | b'110000'+0 |
+--------+----------+---------------+-------------+
|      1 |        0 |             1 |          48 |
+--------+----------+---------------+-------------+
1 row in set (0.03 sec)

mysql> select b'1'=1,b'1'='1',b'110001'='1',b'110001'+0;
+--------+----------+---------------+-------------+
| b'1'=1 | b'1'='1' | b'110001'='1' | b'110001'+0 |
+--------+----------+---------------+-------------+
|      1 |        0 |             1 |          49 |
+--------+----------+---------------+-------------+
1 row in set (0.00 sec)

Conclusion: b'110000' is the decimal number 48, not the number 0. Because 48 is the ASCII code of the character '0', b'110000'='0' evaluates to 1 while b'0'='0' evaluates to 0.
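The comparison results in section 10 can be reproduced with a small model, again in Python and only as a sketch. It assumes that a bit literal acts as an integer in numeric context and as raw bytes in string context; the two helper names are hypothetical.

```python
def bit_literal_as_number(bits: str) -> int:
    """Value of b'...' in numeric context, e.g. b'110000'+0."""
    return int(bits, 2)


def bit_literal_as_string(bits: str) -> bytes:
    """Value of b'...' in string context, modelled as big-endian raw bytes."""
    nbytes = max(1, (len(bits) + 7) // 8)
    return int(bits, 2).to_bytes(nbytes, "big")


# Numeric context: b'110000'+0 is 48 and b'0'=0 holds.
print(bit_literal_as_number("110000"))   # 48
print(bit_literal_as_number("0") == 0)   # True

# String context: byte 0x30 is the character '0', byte 0x00 is not.
print(bit_literal_as_string("110000") == b"0")  # True
print(bit_literal_as_string("0") == b"0")       # False
```

The same reasoning covers the second query: b'110001' is 49, the ASCII code of '1'.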
Verification:

mysql> create table b(a bit(64));
Query OK, 0 rows affected (0.04 sec)

mysql> create table bb(a bit(65));
ERROR 1439 (42000): Display width out of range for column 'a' (max = 64)

-- Insert data:
insert into b values(b'1'),(b'0'),(b'01000001'),(b'01011010'),(b'01100001'),(b'00110000'),(b'00110001');

mysql> select a,a+0,bin(a),oct(a),hex(a) from b;
+----------+------+---------+--------+--------+
| a        | a+0  | bin(a)  | oct(a) | hex(a) |
+----------+------+---------+--------+--------+
|          |    1 | 1       | 1      | 1      |
|          |    0 | 0       | 0      | 0      |
|        A |   65 | 1000001 | 101    | 41     |
|        Z |   90 | 1011010 | 132    | 5A     |
|        a |   97 | 1100001 | 141    | 61     |
|        0 |   48 | 110000  | 60     | 30     |
|        1 |   49 | 110001  | 61     | 31     |
+----------+------+---------+--------+--------+
7 rows in set (0.02 sec)

mysql> select lpad(bin(a),8,'0') la,a,a+0,bin(a),oct(a),hex(a) from b;
+----------+----------+------+---------+--------+--------+
| la       | a        | a+0  | bin(a)  | oct(a) | hex(a) |
+----------+----------+------+---------+--------+--------+
| 00000000 |          |    0 | 0       | 0      | 0      |
| 00000001 |          |    1 | 1       | 1      | 1      |
| 00110000 |        0 |   48 | 110000  | 60     | 30     |
| 00110001 |        1 |   49 | 110001  | 61     | 31     |
| 01000001 |        A |   65 | 1000001 | 101    | 41     |
| 01011010 |        Z |   90 | 1011010 | 132    | 5A     |
| 01100001 |        a |   97 | 1100001 | 141    | 61     |
+----------+----------+------+---------+--------+--------+
7 rows in set (0.00 sec)

Conclusion: the bit type is declared as bit(M), where M ranges from 1 to 64, and it stores a binary string.
The bit type spans bit(1) to bit(64), which in decimal is the range 0 to 2^64 - 1;
tinyint unsigned covers the decimal range 0 to 255.
A bit(M) column needs approximately (M+7)/8 bytes, while tinyint needs 1 byte.
bit(1) and tinyint therefore use the same amount of storage.

11. JDBC and the bit type:

MySQL Type Name   Return value of GetColumnTypeName   Return value of GetColumnClassName
BIT(1)            BIT                                 java.lang.Boolean
BIT( > 1)         BIT                                 byte[]
TINYINT           TINYINT                             java.lang.Boolean if the configuration property tinyInt1isBit is set to true (the default) and the storage size is 1, or java.lang.Integer if not.

In some applications a tinyint column is therefore read as true or false by default, rather than as the intended values 0 and 1.
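JDBC treats such a tinyint column as java.sql.Types.BIT. In that case, add a setting like the following to the connection URL:
jdbc:mysql://localhost/databaseName?tinyInt1isBit=false

Conclusion:
In terms of readability, general acceptance, and compatibility across software, tinyint is the recommended way to represent yes/no.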
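The JDBC mapping table in section 11 can be restated as a small lookup, sketched in Python for illustration only. It is not driver code: the function jdbc_column_class and its parameters are hypothetical, and treating the table's "storage size is 1" as a size argument of 1 is an assumption.

```python
def jdbc_column_class(mysql_type: str, size: int = 1, tiny_int1_is_bit: bool = True) -> str:
    """Return the Java class named in the section 11 table for a MySQL column type."""
    kind = mysql_type.upper()
    if kind == "BIT":
        # BIT(1) maps to Boolean, wider BIT columns to byte[].
        return "java.lang.Boolean" if size == 1 else "byte[]"
    if kind == "TINYINT":
        # Boolean only when tinyInt1isBit is true (the default) and the size is 1.
        if tiny_int1_is_bit and size == 1:
            return "java.lang.Boolean"
        return "java.lang.Integer"
    raise ValueError(f"type not covered by the table: {mysql_type}")


print(jdbc_column_class("bit", 1))                               # java.lang.Boolean
print(jdbc_column_class("bit", 8))                               # byte[]
print(jdbc_column_class("tinyint", 1))                           # java.lang.Boolean
print(jdbc_column_class("tinyint", 1, tiny_int1_is_bit=False))   # java.lang.Integer
```

The last call corresponds to connecting with tinyInt1isBit=false, where the application reads 0 and 1 as integers.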