Postgresql中的explain (3)

日期：2021-05-25 栏目：程序人生浏览：次

另一个可能的连接类型是merge join：

EXPLAIN SELECT * FROM tenk1 t1, onek t2 WHERE t1.unique1 < 100 AND t1.unique2 = t2.unique2; QUERY PLAN ------------------------------------------------------------------------------------------ Merge Join (cost=198.11..268.19 rows=10 width=488) Merge Cond: (t1.unique2 = t2.unique2) -> Index Scan using tenk1_unique2 on tenk1 t1 (cost=0.29..656.28 rows=101 width=244) Filter: (unique1 < 100) -> Sort (cost=197.83..200.33 rows=1000 width=244) Sort Key: t2.unique2 -> Seq Scan on onek t2 (cost=0.00..148.00 rows=1000 width=244)

Merge Join需要已经排序的输入数据。在这个规划中按正确顺序索引扫描tenk1的数据，但是对onek表执行排序和顺序扫描，因为需要在这个表中查询多条数据。因为索引扫描需要访问不连续的磁盘，所以索引扫描多条数据时会频繁使用排序顺序扫描（Sequential-scan-and-sort）。

有一种方法可以看到不同的规划，就是强制规划器忽略任何策略。例如，如果我们不相信排序顺序扫描（sequential-scan-and-sort）是最好的办法，我们可以尝试这样的做法：

SET enable_sort = off; EXPLAIN SELECT * FROM tenk1 t1, onek t2 WHERE t1.unique1 < 100 AND t1.unique2 = t2.unique2; QUERY PLAN ------------------------------------------------------------------------------------------ Merge Join (cost=0.56..292.65 rows=10 width=488) Merge Cond: (t1.unique2 = t2.unique2) -> Index Scan using tenk1_unique2 on tenk1 t1 (cost=0.29..656.28 rows=101 width=244) Filter: (unique1 < 100) -> Index Scan using onek_unique2 on onek t2 (cost=0.28..224.79 rows=1000 width=244)

显示测结果表明，规划器认为索引扫描比排序顺序扫描消耗高12%。当然下一个问题就是规划器的评估为什么是正确的。我们可以通过EXPLAIN ANALYZE进行考察。

EXPLAIN ANALYZE

通过EXPLAIN ANALYZE可以检查规划器评估的准确性。使用ANALYZE选项，EXPLAIN实际运行查询，显示真实的返回记录数和运行每个规划节点的时间，例如我们可以得到下面的结果：

EXPLAIN ANALYZE SELECT * FROM tenk1 t1, tenk2 t2 WHERE t1.unique1 < 10 AND t1.unique2 = t2.unique2; QUERY PLAN --------------------------------------------------------------------------------------------------------------------------------- Nested Loop (cost=4.65..118.62 rows=10 width=488) (actual time=0.128..0.377 rows=10 loops=1) -> Bitmap Heap Scan on tenk1 t1 (cost=4.36..39.47 rows=10 width=244) (actual time=0.057..0.121 rows=10 loops=1) Recheck Cond: (unique1 < 10) -> Bitmap Index Scan on tenk1_unique1 (cost=0.00..4.36 rows=10 width=0) (actual time=0.024..0.024 rows=10 loops=1) Index Cond: (unique1 < 10) -> Index Scan using tenk2_unique2 on tenk2 t2 (cost=0.29..7.91 rows=1 width=244) (actual time=0.021..0.022 rows=1 loops=10) Index Cond: (unique2 = t1.unique2) Total runtime: 0.501 ms

注意，实际时间(actual time)的值是已毫秒为单位的实际时间，cost是评估的消耗，是个虚拟单位时间，所以他们看起来不匹配。

通常最重要的是看评估的记录数是否和实际得到的记录数接近。在这个例子里评估数完全和实际一样，但这种情况很少出现。

某些查询规划可能执行多次子规划。比如之前提过的内循环规划（nested-loop），内部索引扫描的次数是外部数据的数量。在这种情况下，报告显示循环执行的总次数、平均实际执行时间和数据条数。这样做是为了和评估值表示方式一至。由循环次数和平均值相乘得到总消耗时间。

某些情况EXPLAIN ANALYZE会显示额外的信息，比如sort和hash节点的时候：

EXPLAIN ANALYZE SELECT * FROM tenk1 t1, tenk2 t2 WHERE t1.unique1 < 100 AND t1.unique2 = t2.unique2 ORDER BY t1.fivethous; QUERY PLAN -------------------------------------------------------------------------------------------------------------------------------------------- Sort (cost=717.34..717.59 rows=101 width=488) (actual time=7.761..7.774 rows=100 loops=1) Sort Key: t1.fivethous Sort Method: quicksort Memory: 77kB -> Hash Join (cost=230.47..713.98 rows=101 width=488) (actual time=0.711..7.427 rows=100 loops=1) Hash Cond: (t2.unique2 = t1.unique2) -> Seq Scan on tenk2 t2 (cost=0.00..445.00 rows=10000 width=244) (actual time=0.007..2.583 rows=10000 loops=1) -> Hash (cost=229.20..229.20 rows=101 width=244) (actual time=0.659..0.659 rows=100 loops=1) Buckets: 1024 Batches: 1 Memory Usage: 28kB -> Bitmap Heap Scan on tenk1 t1 (cost=5.07..229.20 rows=101 width=244) (actual time=0.080..0.526 rows=100 loops=1) Recheck Cond: (unique1 < 100) -> Bitmap Index Scan on tenk1_unique1 (cost=0.00..5.04 rows=101 width=0) (actual time=0.049..0.049 rows=100 loops=1) Index Cond: (unique1 < 100) Total runtime: 8.008 ms

排序节点（Sort）显示排序类型（一般是在内存还是在磁盘）和使用多少内存。哈希节点（Hash）显示哈希桶和批数以及使用内存的峰值。

转载注明出处：https://www.heiqu.com/wpdwpf.html

Postgresql中的explain (3)

相关推荐