PG–执行计划之辅助算法

Sort

需要排序数据时,这个节点就会作为计划树的一部分被添加。可以显式或隐式地要求排序数据

postgres=# drop table demotable;
DROP TABLE
postgres=# CREATE TABLE demotable (num numeric, id int);
CREATE TABLE
postgres=# INSERT INTO demotable SELECT random() * 1000, generate_series(1, 10000);
INSERT 0 10000
postgres=# explain select * from demotable order by num;
QUERY PLAN
---------------------------------------------------------------------
Sort (cost=570.84..588.31 rows=6985 width=36)
Sort Key: num
-> Seq Scan on demotable (cost=0.00..124.85 rows=6985 width=36)
(3 rows)

即使用户要求最终输出按已排序的顺序,如果对应的表和排序列上有索引,则可能不会在最终计划中添加Sort节点。

postgres=# CREATE INDEX demoidx ON demotable(num);
CREATE INDEX
postgres=# explain select * from demotable order by num;
QUERY PLAN
--------------------------------------------------------------------------------
Index Scan using demoidx on demotable (cost=0.29..534.28 rows=10000 width=15)
(1 row)

Merge Join需要在连接之前对两个表数据进行排序。因此,可能会发现Merge Join比其他任何连接方法的成本都要低,即使需要额外的排序成本。因此,在本例中,Sort节点将添加到表的连接和扫描方法之间,以便将已排序的记录传递给连接方法。

postgres=# create table demo1(id int, id2 int);
CREATE TABLE
postgres=# insert into demo1 values(generate_series(1,1000), generate_series(1,1000));
INSERT 0 1000
postgres=# create table demo2(id int, id2 int);
CREATE TABLE
postgres=# create index demoidx2 on demo2(id);
CREATE INDEX
postgres=# insert into demo2 values(generate_series(1,100000), generate_series(1,100000));
INSERT 0 100000
postgres=# analyze;
ANALYZE
postgres=# explain select * from demo1, demo2 where demo1.id=demo2.id;
QUERY PLAN
------------------------------------------------------------------------------------
Merge Join (cost=65.12..115.04 rows=1000 width=16)
Merge Cond: (demo2.id = demo1.id)
-> Index Scan using demoidx2 on demo2 (cost=0.29..3050.29 rows=100000 width=8)
-> Sort (cost=64.83..67.33 rows=1000 width=8)
Sort Key: demo1.id
-> Seq Scan on demo1 (cost=0.00..15.00 rows=1000 width=8)
(6 rows)

Aggregate

如果有某个聚合函数用于从多个输入元组中计算以得到单个结果,那么Aggregate节点就会作为规划树的一部分被添加。一些常用的聚合函数是COUNT, SUM, AVG (AVERAGE), MAX (MAXIMUM)和MIN (MINIMUM)。

postgres=# explain select count(*) from demo1;
QUERY PLAN
---------------------------------------------------------------
Aggregate (cost=17.50..17.51 rows=1 width=8)
-> Seq Scan on demo1 (cost=0.00..15.00 rows=1000 width=0)
(2 rows)

postgres=# explain select sum(demo1.id) from demo1, demo2 where demo1.id=demo2.id;
QUERY PLAN
-----------------------------------------------------------------------------------------------
Aggregate (cost=117.54..117.55 rows=1 width=8)
-> Merge Join (cost=65.12..115.04 rows=1000 width=4)
Merge Cond: (demo2.id = demo1.id)
-> Index Only Scan using demoidx2 on demo2 (cost=0.29..3050.29 rows=100000 width=4)
-> Sort (cost=64.83..67.33 rows=1000 width=4)
Sort Key: demo1.id
-> Seq Scan on demo1 (cost=0.00..15.00 rows=1000 width=4)
(7 rows)

HashAggregate

如果聚合发生在未排序的数据集上,那么HashAggregate可以被组级聚合使用

postgres=# explain select count(*) from demo1 group by id2;
QUERY PLAN
---------------------------------------------------------------
HashAggregate (cost=20.00..30.00 rows=1000 width=12)
Group Key: id2
-> Seq Scan on demo1 (cost=0.00..15.00 rows=1000 width=4)
(3 rows)

这里的demo1表模式数据与前一节中显示的示例相同。由于只需要对1000行进行分组,因此构建哈希表所需的资源要小于排序的成本。查询规划器决定选择HashAggregate。

GroupAggregate

GroupAggregate用于处理已经排好序的数据,因此不需要任何额外的数据结构

postgres=# explain select count(*) from demo2 group by id2;
QUERY PLAN
-------------------------------------------------------------------------
GroupAggregate (cost=9747.82..11497.82 rows=100000 width=12)
Group Key: id2
-> Sort (cost=9747.82..9997.82 rows=100000 width=4)
Sort Key: id2
-> Seq Scan on demo2 (cost=0.00..1443.00 rows=100000 width=4)
(5 rows)

其他常用

limit (SELECT查询中使用了“ limit / offset”子句

explain select * from demo1 offset 5 limit 10;

unique (distinct)

explain select distinct(id) from demo2 where id<100;

lockrows(for update)

explain select * from demo1 for update;

setop (合并结果)

explain select * from emp where age>25 intersect select * from emp where salary > 95;

append (union)

subquery scan (子查询)

function scan (使用函数)

values scan (指定结果集)

发表评论

相关文章