MySQL: should a column with very few distinct values be indexed?

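When I was still new to MySQL, I saw colleagues add indexes to columns that hold only a handful of distinct values. That contradicted most of the MySQL index-optimization articles I had read, and it puzzled me.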

For example, an order status column with only six values: 0 pending confirmation, 1 confirmed, 2 received, 3 cancelled, 4 completed, 5 closed.

Once I understood how MySQL's B+tree indexes work, it seemed well worth testing in practice whether an index is actually needed in this case.

Create the table with the index

drop table if exists `bool_index`;
create table `bool_index` (
  `id` int (11) not null auto_increment,
  `rand_id` varchar (200) comment 'random string',
  `order_status` tinyint (1) not null default '0' comment 'order status: 0 pending confirmation, 1 confirmed, 2 received, 3 cancelled, 4 completed, 5 closed',
  `created_at` datetime not null,
  primary key (`id`),
  key `idx_order_status` (`order_status`)
) engine = innodb default charset = utf8;

Create the table without the index

drop table if exists `bool_no_index`;
create table `bool_no_index` (
  `id` int (11) not null auto_increment,
  `rand_id` varchar (200) comment 'random string',
  `order_status` tinyint (1) not null default '0' comment 'order status: 0 pending confirmation, 1 confirmed, 2 received, 3 cancelled, 4 completed, 5 closed',
  `created_at` datetime not null,
  primary key (`id`)
) engine = innodb default charset = utf8;

Generate some test data with a stored procedure

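delimiter $$
drop procedure if exists `proc_index`$$
create procedure proc_index()
begin
  -- the v_ prefix keeps local variables from sharing names with table columns
  declare v_rand_id varchar(120);
  declare v_order_status tinyint;
  declare i int default 0;
  declare v_created_at datetime;
  -- insert identical test rows into both tables; raise the loop bound for larger tables
  while i < 10000 do
    set v_rand_id = substring(md5(rand()), 1, 28);
    -- order status: 0 pending confirmation, 1 confirmed, 2 received, 3 cancelled, 4 completed, 5 closed
    set v_order_status = floor(rand() * 10) % 6;
    set v_created_at = now();
    insert into `bool_index` (`rand_id`, `order_status`, `created_at`) values (v_rand_id, v_order_status, v_created_at);
    insert into `bool_no_index` (`rand_id`, `order_status`, `created_at`) values (v_rand_id, v_order_status, v_created_at);
    set i = i + 1;
  end while;
end$$
-- restore the default delimiter before calling the procedure
delimiter ;
call proc_index();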
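As a quick sanity check on the generated data, the distribution of statuses can be counted. This is only a sketch against the table defined above. Note that `floor(rand()*10)%6` maps the ten digits 0 to 9 onto six values, so statuses 0 to 3 should each receive about 20% of the rows and statuses 4 and 5 about 10% each.

```sql
-- Count rows per status in the indexed table.
-- With 10,000 rows, expect roughly 2,000 rows for each of statuses 0-3
-- and roughly 1,000 rows for each of statuses 4 and 5.
select order_status, count(*) as row_count
from bool_index
group by order_status
order by order_status;
```

This is why order_status=3 matches about one fifth of each table in the measurements that follow.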

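Table size / query time

Queries timed at each table size:

With index:    select * from bool_index where order_status=3 and rand_id='bd0bcd23960dbe0140bea563e7bd';
Without index: select * from bool_no_index where order_status=3 and rand_id='bd0bcd23960dbe0140bea563e7bd';

Rows in table   bool_index (indexed)   bool_no_index (no index)   Rows with order_status=3
10,000          0.002s                 0.002s                     ~2,000
40,000          0.011s                 0.009s                     ~8,000
80,000          0.021s                 0.021s                     ~16,000
160,000         0.059s                 0.040s                     ~32,000
320,000         0.142s                 0.110s                     ~63,000
640,000         1.194s                 0.383s                     ~120,000
1,000,000       2.761s                 0.563s                     ~200,000
2,000,000       7.025s                 1.158s                     ~400,000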

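The comparison shows that below about 160,000 rows the indexed and unindexed queries take roughly the same time. Above 160,000 rows, the indexed query falls further and further behind as the table grows: at 2,000,000 rows it takes 7.025s against 1.158s without the index.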

For example, take the record with rand_id='56079ad22da839c1a00bd812a191' and order_status=3.

Analyze the execution with explain

With the index, explain reports rows=366798; without it, rows=997976 (a full table scan). The indexed query clearly examines fewer rows, so why is it slower?
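The row estimates come from running `explain` on the same query against each table. A minimal sketch, using the tables and the sample rand_id from this article; the `rows` figures are optimizer estimates and will differ on other datasets, and the access types in the comments are what the reported numbers suggest rather than a guaranteed plan.

```sql
-- Indexed table: the article's result (rows=366798) suggests the optimizer
-- used idx_order_status here; check the "key" and "rows" columns.
explain select * from bool_index
where order_status = 3 and rand_id = '56079ad22da839c1a00bd812a191';

-- Table without the index: only a full table scan is possible (type = ALL).
explain select * from bool_no_index
where order_status = 3 and rand_id = '56079ad22da839c1a00bd812a191';
```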

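Here is an easy-to-follow scenario (reading 20% of a table's rows through an index) that explains this interesting behavior:

Suppose a table holds 100,000 rows.

We want to read 20% of them, that is, 20,000 rows.

Each row is 80 bytes.

The database block size is 8 KB (taken as 8,000 bytes to keep the numbers round).

It follows that:

Each block holds 100 rows.

The whole table occupies 1,000 blocks.

These are simple numbers, but look at the story behind them:

Reading 20,000 rows through the index = roughly 20,000 "table access by rowid" operations = up to 20,000 block visits to run the query, because in the worst case every matching row has to be fetched from its block separately.

But note: the whole table is only 1,000 blocks!

So reading 20% of the rows through the index is equivalent to reading the entire table about 20 times over. In this situation, reading the whole table directly is more efficient. In InnoDB the cause is the lookup back to the table: each match found in a secondary index requires a further lookup in the clustered index to fetch the full row.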
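The arithmetic of this scenario can be written out as a single query. It is a sketch that uses only the figures from the example and assumes, as the example does, one block visit per row fetched through the index; the column aliases are illustrative names.

```sql
-- Worked arithmetic for the 100,000-row example.
select
  8000 div 80                                as rows_per_block,   -- 100
  100000 * 80 div 8000                       as table_blocks,     -- 1000
  100000 div 5                               as rows_via_index,   -- 20000 (20% of the table)
  (100000 div 5) div (100000 * 80 div 8000)  as full_scan_equivalents; -- 20
```

The last column is the ratio of block visits through the index (20,000) to the blocks a full scan reads (1,000).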

Summary: do not create indexes on columns that are updated very frequently or that have low selectivity.
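One rough way to judge selectivity before adding an index is to compare the number of distinct values with the number of rows. The query below is a sketch against the test table from this article, and the column aliases are illustrative names.

```sql
-- Selectivity = distinct values / total rows.
-- Values close to 0 indicate a low-selectivity column such as order_status;
-- values close to 1 indicate a nearly unique column.
select count(distinct order_status)            as distinct_values,
       count(*)                                as total_rows,
       count(distinct order_status) / count(*) as selectivity
from bool_index;
```

With six statuses spread over 10,000 rows, the result is about 0.0006, far below what an index on this column alone needs to pay off.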

For the deeper reasons behind this, start by studying how B+trees work underneath.
