性能问题描述
之前公司接火车票下订单的项目,由于购买火车票的人数众多,所以订单记录也很多,单表有100多万,订单里会存放购票人的身份证号,由于监管要求,身份证属于敏感信息,不能明文存储,需要加密存储,用公司加密算法加密之后,身份证号就变成了128位的字符长度,而一些场景有需要使用身份证号去查询。
所以怎么写SQL和设计索引,既能减少数据库资源消耗,又能保证高效的查询性能呢?
示例表
下面创建一张示例表t_test4
[root@localhost] 17:03:22 [testdb]>show create table t_test4G;*************************** 1. row *************************** Table: t_test4Create Table: CREATE TABLE `t_test4` ( `order_id` int(10) unsigned NOT NULL AUTO_INCREMENT, `id_sha2` varchar(512) COLLATE utf8mb4_unicode_ci DEFAULT NULL, PRIMARY KEY (`order_id`)) ENGINE=InnoDB AUTO_INCREMENT=49997686 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci MAX_ROWS=10000001 row in set (0.00 sec)ERROR: No query specified
模拟创建1万条记录,取其中10条记录示例,其中id_sha2就是模拟的身份证号加密字段
[root@localhost] 16:58:57 [testdb]>select order_id,id_sha2 from t_test4 limit 10;+----------+----------------------------------------------------------------------------------------------------------------------------------+| order_id | id_sha2 |+----------+----------------------------------------------------------------------------------------------------------------------------------+| 8217 | 375f902e071d1ac817b3aaa847b13335cc560761b168ec427ec5bf952b3029a96710c9111d8cd35b7f243d5f52302a4ba9ddded66c93b84a9e451ed62335eff8 || 13214 | 499415d203ab4bb5644c263645c213f2f93f20194e28ad5acde8de6ffd734853f4bfcd27222ab1957dbe92ee7b8487ca6fb02f3cde72af008a4f3369b51cc56f || 27054 | 2c301044742d02dc4e1d82ebda451032dede921e0bc7ec08e2879b8406e32cb1155d86ac2900141b718dc818a3c2bfed974726fe34cc0dbc60cceb1facbe98ab || 29352 | 127b2250a4823fa3b51d30b2d621f666e45c17c807bb1a909827aa277ec7ea6986c1ee75273d389748dd5e239595f3e45efeef2754aede5fd20e8fea6acc0bf2 || 30485 | 630f48ca90d93448e25c3ae4c838641ec9c45f49af191d992287ab6e28a7329dd3ce7afe239e5ed02a2aca4d976ca506d4b44c9c0d9f01107076c3015d40965d || 40517 | a5cd97274ac976271745990b5c7a9f05f806b15fdeca2a37431fdb84eb0868eb99ee2b0b43389c6c8ec5687d7c162300eebc14ecb7bcae17c502fd628e7de98b || 41811 | 6698dbd0898ab61f78f7d7de3011248e7dd75aeef56c0e11c85779d9c62200773802cde1f7f95d19bac00c1d3251c3765be8b760cf11b8fae886dd90a908783c || 49089 | 2b8a793e174669f41f9e7c56a01dc6a708b0071dbad3da5ca4be9ffd9ce577ba44bed92c7db86f7941fb14cb828e66a7c6315500decb7474e02e7637dcad6b99 || 56027 | a237ab3a1398e3785f6dad02800055d4832e6cff4db7d4942a332961a8026fbf3f24eb0e80e458bad594d7e5c0efae8492eb1b0d3ba50a3f2a0aebaad958da73 || 56735 | 53bb95d87c8241b4d707a59b14b00fb14e3162ea91ada3e6cb17fc6339e85a76e870b56e8acf0c2220b5720971359916b2b1977c9c5aa1c1e43d201665448629 |+----------+----------------------------------------------------------------------------------------------------------------------------------+10 rows in set (0.00 sec)
业务查询SQL
select order_id,id_sha2 from t_test4 where id_sha2='a5cd97274ac976271745990b5c7a9f05f806b15fdeca2a37431fdb84eb0868eb99ee2b0b43389c6c8ec5687d7c162300eebc14ecb7bcae17c502fd628e7de98b';
看到这里,大家发现问题了,如果id_sha2上没有索引,从一张100W记录的业务表查出某个身份证号购买的所有火车票,简直要人命,可是直接id_sha2列上创建索引,又不合适。
原因有3个:第一:Mysql数据库默认页大小是16K,那么一个页大约可以存放16*1024/128=128记录,而且很快因为页空间占满,导致页分裂,由于页分裂过快,影响性能
第二:占用空间比短长度列索引空间大
第三:数据库查询时,加载到内存占用更多的内存空间
优化思路
关键问题是索引列太宽,优化思路就是缩小列宽度,在Mysql5.7中是不支持列上创建函数索引,那我们有2种方案
方案1:在t_test4表上,创建一个新列(id_sha2_idx),列的长度可以设置成身份证号长度(14位),每次写入id_sha2列时,同时往id_sha2_idx写入substr(id_sha2,1,13)的值,并在id_sha2_idx列上创建索引
方案2:在t_test4表上,创建一个虚拟列(id_sha2_idx),列的长度同样设置成身份证号长度(14位),并在id_sha2_idx列上创建索引。
这2个方案有什么不一样的地方呢,方案一添加的列是需要在硬盘上写入实际数据的,而方案二创建的是虚拟列,不需要存储数据,而且创建虚拟列和删除虚拟列都不会改变表的存储结构,所以创建和删除很方便,快捷。
在这里,我选取方案二来给大家演示优化后的效果。
创建虚拟列,并创建索引
[root@localhost] 17:30:00 [testdb]>alter table t_test4 add id_sha2_idx varchar(13) generated always as (substr(id_sha2,1,13));Query OK, 0 rows affected (0.45 sec)Records: 0 Duplicates: 0 Warnings: 0[root@localhost] 17:30:01 [testdb]>alter table t_test4 add index idx_t_test4_id_sha2_idx(id_sha2_idx);Query OK, 0 rows affected (0.41 sec)Records: 0 Duplicates: 0 Warnings: 0[root@localhost] 17:30:06 [testdb]>show create table t_test4G;*************************** 1. row *************************** Table: t_test4Create Table: CREATE TABLE `t_test4` ( `order_id` int(10) unsigned NOT NULL AUTO_INCREMENT, `id_sha2` varchar(512) COLLATE utf8mb4_unicode_ci DEFAULT NULL, `id_sha2_idx` varchar(13) COLLATE utf8mb4_unicode_ci GENERATED ALWAYS AS (substr(`id_sha2`,1,13)) VIRTUAL, PRIMARY KEY (`order_id`), KEY `idx_t_test4_id_sha2_idx` (`id_sha2_idx`)) ENGINE=InnoDB AUTO_INCREMENT=49997686 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci MAX_ROWS=10000001 row in set (0.00 sec)ERROR: No query specified
优化之后的业务SQL语句
[root@localhost] 17:31:43 [testdb]>select order_id,id_sha2,id_sha2_idx from t_test4 where id_sha2='499415d203ab4bb5644c263645c213f2f93f20194e28ad5acde8de6ffd734853f4bfcd27222ab1957dbe92ee7b8487ca6fb02f3cde72af008a4f3369b51cc56f' and id_sha2_idx='499415d203ab4';+----------+----------------------------------------------------------------------------------------------------------------------------------+---------------+| order_id | id_sha2 | id_sha2_idx |+----------+----------------------------------------------------------------------------------------------------------------------------------+---------------+| 13214 | 499415d203ab4bb5644c263645c213f2f93f20194e28ad5acde8de6ffd734853f4bfcd27222ab1957dbe92ee7b8487ca6fb02f3cde72af008a4f3369b51cc56f | 499415d203ab4 |+----------+----------------------------------------------------------------------------------------------------------------------------------+---------------+1 row in set (0.01 sec)[root@localhost] 17:32:28 [testdb]>explain select order_id,id_sha2,id_sha2_idx from t_test4 where id_sha2='499415d203ab4bb5644c263645c213f2f93f20194e28ad5acde8de6ffd734853f4bfcd27222ab1957dbe92ee7b8487ca6fb02f3cde72af008a4f3369b51cc56f' and id_sha2_idx='499415d203ab4';+----+-------------+---------+------------+------+-------------------------+-------------------------+---------+-------+------+----------+-------------+| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |+----+-------------+---------+------------+------+-------------------------+-------------------------+---------+-------+------+----------+-------------+| 1 | SIMPLE | t_test4 | NULL | ref | idx_t_test4_id_sha2_idx | idx_t_test4_id_sha2_idx | 55 | const | 1 | 10.00 | Using where |+----+-------------+---------+------------+------+-------------------------+-------------------------+---------+-------+------+----------+-------------+1 row in set, 1 warning (0.00 sec)
可以看到,现在已经非常优雅的解决了,索引占用空大和慢的问题。
SQL优化系列文章 MySQL数据库SQL语句优化原理专题(一)
关注
1.如果您喜欢这篇文章,请点赞+转发。
2.如果您特别喜欢,请加关注。