MySQL反连接的优化总结(2)

日期：2020-06-02 栏目：程序人生浏览：次

可见通过这种拆分，不断的猜测和排除，已经找到了一些思路。
我们开始抓住问题的本质。
首先删除test_tab1上的索引，看看执行效果如何。
> alter table test_tab1 drop index ind_tmp_account;
Query OK, 0 rows affected (0.00 sec)
Records: 0 Duplicates: 0 Warnings: 0
语句如下，执行时间0.15秒
select account from test_tab1
where not exists (select login_account from test_tab where login_account=test_tab1.account );
+--------------------------------+
11364 rows in set (0.15 sec)
是否not in的方式会有很大的差别呢，持续0.18秒，有差别，但差别不大。
select account from test_tab1
where account not in (select login_account from test_tab );
+--------------------------------+
11364 rows in set (0.18 sec)
我们逐步恢复原来的查询，去除临时表test_tab1,整个查询持续了1.12秒。
select account from t_fund_info
where money >=300 and account not in(select login_account from test_tab);
+--------------------------------+
11364 rows in set (1.12 sec)
使用explain extended解析的内容如下：
Message: select `test`.`t_fund_info`.`account` AS `account` from `test`.`t_fund_info` where ((`test`.`t_fund_info`.`money` >= 300) and (not((`test`.`t_fund_info`.`account`,(<index_lookup>((`test`.`t_fund_info`.`account`) in test_tab on ind_tmp_login_account checking NULL having (`test`.`test_tab`.`login_account`)))))))
这个时候，问题已经基本定位了。在反连接的查询中，在这个问题场景中，需要对子查询的表添加一个索引基于login_account，可以和外层的查询字段映射，提高查询效率。
当然在一个数据量庞大，业务相对繁忙的系统中，添加一个临时需求的索引可能不是一个很好的方案。不过我们还是在测试环境体验一下。
> create index ind_tmp_account1 on t_user_login_record(login_account);
Query OK, 0 rows affected (4 min 45.48 sec)
Records: 0 Duplicates: 0 Warnings: 0
添加索引的过程持续了近4分钟，在这个时候我们使用最开始的查询语句，性能如何呢。
select account from t_fund_info where money >=300 and account not in (select distinct(login_account) from t_user_login_record where);
+--------------------------------+
11364 rows in set (2.52 sec)
只要2.52秒就可以完成之前20多个小时查询结果，性能简直就是天壤之别。
不过话说回来，跑批查询可以在从库上执行，从库上创建一个这样的索引，用完再删掉也是不错的选择，要么就是创建一个临时表，在临时表上创建索引，临时表的意义就在于此，不是为了做查询结果缓存而是创建索引来提高数据过滤效率。
在此有个问题就是临时表只有一个字段，创建索引的意义在哪里呢。
我画一个图来解释一下。

MySQL反连接的优化总结

转载注明出处：https://www.heiqu.com/74c4ac091b24860ef41160dce6502780.html

MySQL反连接的优化总结(2)

相关推荐