pg_hint_plan — tweak execution plans in SQL comments
pg_hint_plan
makes it possible to tweak execution plans using "hints" in SQL comments, as of /*+ SeqScan(a) */
.
ProtonBase uses a cost-based optimizer, which utilizes data statistics, not static rules. The planner (optimizer) estimates costs of each possible execution plans for a SQL statement then the execution plan with the lowest cost is executed. The planner does its best to select the best execution plan, but is not always perfect, since it doesn't take into account some of the data properties or correlations between columns.
Basic Usage
pg_hint_plan
reads hinting phrases in a comment of special form given a SQL statement. A hint can be specified by prefixing it with the sequence "/\*+"
and ending it with "\*/"
. Hint phrases consist of hint names and parameters enclosed by parentheses and delimited by whitespaces. Hint phrases can use newlines for readability.
In the example below, a hash join is selected as the join method while doing a sequential scan on pgbench_accounts
:
=# /*+
HashJoin(a b)
SeqScan(a)
*/
EXPLAIN SELECT *
FROM pgbench_branches b
JOIN pgbench_accounts a ON b.bid = a.bid
ORDER BY a.aid;
QUERY PLAN
---------------------------------------------------------------------------------------
Sort (cost=31465.84..31715.84 rows=100000 width=197)
Sort Key: a.aid
-> <b>Hash Join</b> (cost=1.02..4016.02 rows=100000 width=197)
Hash Cond: (a.bid = b.bid)
-> <b>Seq Scan on pgbench_accounts a</b> (cost=0.00..2640.00 rows=100000 width=97)
-> Hash (cost=1.01..1.01 rows=1 width=100)
-> Seq Scan on pgbench_branches b (cost=0.00..1.01 rows=1 width=100)
(7 rows)
The hint table
Hints can be specified in a comment, still this can be inconvenient in the case where queries cannot be edited. In the case, hints can be placed in a special table named "hint_plan.hints"
. The table consists of the following columns:
column | description |
---|---|
id | Unique number to identify a row for a hint. This column is filled automatically by sequence. |
query_id | A unique query ID, generated by the backend when the GUC compute_query_id is enabled |
application_name | The value of application_name where sessions can apply a hint. The hint in the example below applies to sessions connected from psql. An empty string implies that all sessions will apply the hint. |
hints | Hint phrase. This must be a series of hints excluding surrounding comment marks. |
The following example shows how to operate with the hint table.
=# EXPLAIN (VERBOSE, COSTS false) SELECT * FROM t1 WHERE t1.id = 1;
QUERY PLAN
----------------------------------------
Seq Scan on public.t1
Output: id, id2
Filter: (t1.id = 1)
Query Identifier: -7164653396197960701
(4 rows)
=# INSERT INTO hint_plan.hints(query_id, application_name, hints)
VALUES (-7164653396197960701, '', 'SeqScan(t1)');
INSERT 0 1
=# UPDATE hint_plan.hints
SET hints = 'IndexScan(t1)'
WHERE id = 1;
UPDATE 1
=# DELETE FROM hint_plan.hints WHERE id = 1;
DELETE 1
The hint table is owned by the extension owner and has the same default privileges as of the time of its creation, during CREATE EXTENSION
. Hints in the hint table are prioritized over hints in comments.
The query ID can be retrieved with pg_stat_statements
or with EXPLAIN (VERBOSE)
.
Types of hints
Hinting phrases are classified in multiple types based on what kind of object and how they can affect the planner. See Hint list for more details.
Hints for Scan methods
Scan method hints enforce specific scanning methods on the target table. pg_hint_plan
recognizes the target table by alias names if any. These are for example SeqScan
or IndexScan
.
Scan hints work on ordinary tables, inheritance tables, UNLOGGED tables, temporary tables and system catalogs. External (foreign) tables, table functions, VALUES clause, CTEs, views and subqueries are not affected.
=# /*+
SeqScan(t1)
IndexScan(t2 t2_pkey)
*/
SELECT * FROM table1 t1 JOIN table table2 t2 ON (t1.key = t2.key);
Hints for Join methods
Join method hints enforce the join methods of the joins involving the specified tables.
This can affect joins only on ordinary tables. Inheritance tables, UNLOGGED tables, temporary tables, external (foreign) tables, system catalogs, table functions, VALUES command results and CTEs are allowed to be in the parameter list. Joins on views and subqueries are not affected.
Hints for Joining order
This hint, named "Leading", enforces the order of join on two or more tables. There are two methods of enforcing it. The first method enforces a specific order of joining but does not restrict the direction at each join level. The second method enforces the join direction additionally. See hint list for more details. For example:
=# /*+
NestLoop(t1 t2)
MergeJoin(t1 t2 t3)
Leading(t1 t2 t3)
*/
SELECT * FROM table1 t1
JOIN table table2 t2 ON (t1.key = t2.key)
JOIN table table3 t3 ON (t2.key = t3.key);
Details in hinting
Syntax and placement
pg_hint_plan
reads hints from only the first block comment and stops parsing from any characters except alphabetical characters, digits, spaces, underscores, commas and parentheses. In the following example, HashJoin(a b)
and SeqScan(a)
are parsed as hints, but IndexScan(a)
and MergeJoin(a b)
are not:
=# /*+
HashJoin(a b)
SeqScan(a)
*/
/*+ IndexScan(a) */
EXPLAIN SELECT /*+ MergeJoin(a b) */ *
FROM pgbench_branches b
JOIN pgbench_accounts a ON b.bid = a.bid
ORDER BY a.aid;
QUERY PLAN
---------------------------------------------------------------------------------------
Sort (cost=31465.84..31715.84 rows=100000 width=197)
Sort Key: a.aid
-> Hash Join (cost=1.02..4016.02 rows=100000 width=197)
Hash Cond: (a.bid = b.bid)
-> Seq Scan on pgbench_accounts a (cost=0.00..2640.00 rows=100000 width=97)
-> Hash (cost=1.01..1.01 rows=1 width=100)
-> Seq Scan on pgbench_branches b (cost=0.00..1.01 rows=1 width=100)
(7 rows)
Upper and lower case handling in object names
Unlike the way ProtonBase handles object names, pg_hint_plan
compares bare object names in hints against the database internal object names in a case-sensitive manner. Therefore, an object name TBL in a hint matches only "TBL" in the database and does not match any unquoted names like TBL, tbl or Tbl.
Escaping special characters in object names
The objects defined in a hint's parameter can use double quotes if they include parentheses, double quotes and white spaces. The escaping rules are the same as ProtonBase.
Distinction between multiple occurences of a table
pg_hint_plan
identifies the target object by using aliases if any. This behavior is useful to point to a specific occurrence among multiple occurrences of one table.
=# /*+ HashJoin(t1 t1) */
EXPLAIN SELECT * FROM s1.t1
JOIN public.t1 ON (s1.t1.id=public.t1.id);
INFO: hint syntax error at or near "HashJoin(t1 t1)"
DETAIL: Relation name "t1" is ambiguous.
...
=# /*+ HashJoin(pt st) */
EXPLAIN SELECT * FROM s1.t1 st
JOIN public.t1 pt ON (st.id=pt.id);
QUERY PLAN
---------------------------------------------------------------------
Hash Join (cost=64.00..1112.00 rows=28800 width=8)
Hash Cond: (st.id = pt.id)
-> Seq Scan on t1 st (cost=0.00..34.00 rows=2400 width=4)
-> Hash (cost=34.00..34.00 rows=2400 width=4)
-> Seq Scan on t1 pt (cost=0.00..34.00 rows=2400 width=4)
Underlying tables of views or rules
Hints are not applicable on views, but they can affect the queries within the view if the object names match the names in the expanded query on the view. Assigning aliases to the tables in a view enables them to be manipulated from outside the view.
=# CREATE VIEW v1 AS SELECT * FROM t2;
=# EXPLAIN /*+ HashJoin(t1 v1) */
SELECT * FROM t1 JOIN v1 ON (c1.a = v1.a);
QUERY PLAN
------------------------------------------------------------------
Hash Join (cost=3.27..18181.67 rows=101 width=8)
Hash Cond: (t1.a = t2.a)
-> Seq Scan on t1 (cost=0.00..14427.01 rows=1000101 width=4)
-> Hash (cost=2.01..2.01 rows=101 width=4)
-> Seq Scan on t2 (cost=0.00..2.01 rows=101 width=4)
Hint list
The available hints are listed below.
Group | Format | Description |
---|---|---|
Scan method | SeqScan(table) | Forces sequential scan on the table. |
IndexScan(table[ index...]) | Forces index scan on the table. Restricts to specified indexes if any. | |
Join method | NestLoop(table table[ table...]) | Forces nested loop for the joins on the tables specified. |
HashJoin(table table[ table...]) | Forces hash join for the joins on the tables specified. | |
MergeJoin(table table[ table...]) | Forces merge join for the joins on the tables specified. | |
Join order | Leading(table table[ table...]) | Forces join order as specified. |
Leading(<join pair>) | Forces join order and directions as specified. A join pair is a pair of tables and/or other join pairs enclosed by parentheses, which can make a nested structure. |