其中,ONLY_FULL_GROUP_BY与group by语句有关,其要求select列表里只能出现分组列(即group by后面的列)和聚合函数(sum,avg,max等),这也是SQL92的标准。
但在工作中,却经常看到开发写出下面这种SQL。
mysql> select deptno,ename,max(sal) from emp group by deptno;
+--------+----------+----------+
| deptno | ename | max(sal) |
+--------+----------+----------+
| 10 | emp_1001 | 200.00 |
| 20 | emp_1003 | 400.00 |
| 30 | emp_1005 | 600.00 |
+--------+----------+----------+
3 rows in set (0.01 sec)
实在不明白,这里的ename在业务层有何意义,毕竟,他并不是工资最高的那位员工。
分组求最值,MySQL的实现方式
其实分组求最值是一个很普遍的需求。在工作中,也经常被开发同事问到。 下面具体来看看,MySQL中有哪些实现方式。
方法1
SELECT
e.deptno,
ename,
sal
FROM
emp e,
( SELECT deptno, max( sal ) maxsal FROM emp GROUP BY deptno ) t
WHERE
e.deptno = t.deptno
AND e.sal = t.maxsal;
方法2
SELECT
a.deptno,
a.ename,
a.sal
FROM
emp a
LEFT JOIN emp b ON a.deptno = b.deptno
AND a.sal < b.sal
WHERE
b.sal IS NULL;
这两种实现方式,其实是通用的,不仅适用于MySQL,也适用于其它主流关系型数据库。
方法3
MySQL 8.0推出了分析函数,其也可实现类似功能。
SELECT
deptno,
ename,
sal
FROM
(
SELECT
deptno,
ename,
sal,
LAST_VALUE ( sal ) OVER ( PARTITION BY deptno ORDER BY sal ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING ) maxsal
FROM
emp
) a
WHERE
sal = maxsal;
三种实现方式的性能对比
因上面测试案例的数据量太小,三种实现方式的结果都是秒出,仅凭执行计划很难直观地看出实现方式的优劣。
下面换上数据量更大的测试数据,官方示例数据库employees中的dept_emp表,https://github.com/datacharmer/test_db
表的相关信息如下,其中emp_no是员工编号,dept_no是部门编号,from_date是入职日期。
mysql> show create table dept_emp\G
*************************** 1. row ***************************
Table: dept_emp
Create Table: CREATE TABLE `dept_emp` (
`emp_no` int(11) NOT NULL,
`dept_no` char(4) NOT NULL,
`from_date` date NOT NULL,
`to_date` date NOT NULL,
KEY `dept_no` (`dept_no`,`from_date`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci
1 row in set (0.00 sec)
mysql> select count(*) from dept_emp;
+----------+
| count(*) |
+----------+
| 331603 |
+----------+
1 row in set (0.09 sec)
mysql> select * from dept_emp limit 1;
+--------+---------+------------+------------+
| emp_no | dept_no | from_date | to_date |
+--------+---------+------------+------------+
| 10001 | d005 | 1986-06-26 | 9999-01-01 |
+--------+---------+------------+------------+
1 row in set (0.00 sec)
方法1
mysql> select d.dept_no,d.emp_no,d.from_date from dept_emp d, (select dept_no,max(from_date) max_hiredate from dept_emp group by dept_no) t where d.dept_no=t.dept_no and d.from_date=t.max_hiredate;
…
12 rows in set (0.00 sec)
mysql> explain select d.dept_no,d.emp_no,d.from_date from dept_emp d, (select dept_no,max(from_date) max_hiredate from dept_emp group by dept_no) t where d.dept_no=t.dept_no and d.from_date=t.max_hiredate;
+----+-------------+------------+------------+-------+---------------+---------+---------+--------------------------+------+----------+----------------------
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra
+----+-------------+------------+------------+-------+---------------+---------+---------+--------------------------+------+----------+----------------------
| 1 | PRIMARY | <derived2> | NULL | ALL | NULL | NULL | NULL | NULL | 9 | 100.00 | Using where
| 1 | PRIMARY | d | NULL | ref | dept_no | dept_no | 19 | t.dept_no,t.max_hiredate | 5 | 100.00 | NULL
| 2 | DERIVED | dept_emp | NULL | range | dept_no | dept_no | 16 | NULL | 9 | 100.00 | Using index for group-by
+----+-------------+------------+------------+-------+---------------+---------+---------+--------------------------+------+----------+----------------------
方法2