`

Oracle 12c.1新特性--Adaptive Query Optimization(原创)

 
阅读更多

概述
众 说周知CBO通过统计信息决定sql的最优执行计划,如果统计信息不准确,谓词过于复杂,或者表连接基数估算不够准确便有可能导致错误的执行计划。在 12c之前,生成执行计划后,sql执行期间便无法改变执行计划。Oracle 12c的Adaptive Query Optimization便是为了避免使用效率低下的执行计划的新特性,该特性包含adaptive query optimization和adaptive statistics两方面,请参见下图架构。

Adaptive Query Plan

Adaptive Query Plans包含两方面,一个是Adaptive Join Methods,另一个是Adaptive Parallel Distribution Methods, 这两方面特性都允许sql在执行期间改变执行计划,其中Adaptive Join Methods会通过实际的统计信息改变表join的方式,Adaptive Parallel Distribution Methods则会改变parallel distribute server分配parallel slaves的方式。一个adaptive plan 包含了多个预定的子执行计划和 optimizer statistics collector. 子执行计划是CBO在sql执行期间的可选执行计划。在Oracle 12c,

在sql执行期间,statistics collector会收集sql执行的信息并根据之前生成的执行计划缓存一定量的数据。根据这些信息,CBO便可以在sql执行过程中选择更优的执行计 划。在决定了最终的执行计划以后,collector 停止收集信息和缓冲数据,CBO以该执行计划获取所有数据。

Adaptive Join Methods
实 验场景:根据现有统计信息,CBO认为是两张小表tab_a, tab_b进行关联,并生产了使用nested loop的执行计划。如果CBO在实际执行中发现tab1的统计信息远大于统计信息显示的值,CBO便有可能生成使用hash join的执行计划。对于Adaptive Join Methods一旦CBO决定了最终的执行计划,便会在后续的sql执行中沿用该执行计划,直到这个执行计划被刷出shared pool为止,并不再使用statistics collectors产生更优的执行计划。

CREATE TABLE tab_a (
  id    NUMBER,
  code  VARCHAR2(5),
  data  NUMBER(5),
  CONSTRAINT tab_a_pk PRIMARY KEY (id)
);

CREATE INDEX tab_a_code ON tab_a(code);

CREATE SEQUENCE tab_a_seq;

INSERT INTO tab_a VALUES (tab_a_seq.nextval, 'ONE', 1);
INSERT INTO tab_a VALUES (tab_a_seq.nextval, 'TWO', 2);
INSERT INTO tab_a VALUES (tab_a_seq.nextval, 'THREE', 3);
INSERT INTO tab_a VALUES (tab_a_seq.nextval, 'FOUR', 4);
INSERT INTO tab_a VALUES (tab_a_seq.nextval, 'FIVE', 5);
COMMIT;

CREATE TABLE tab_b (
  id       NUMBER,
  tab_a_id  NUMBER,
  data     NUMBER(5),
  CONSTRAINT tab_b_pk PRIMARY KEY (id),
  CONSTRAINT tab_b_tab_a_fk FOREIGN KEY (tab_a_id) REFERENCES tab_a(id)
);

CREATE SEQUENCE tab_b_seq;

CREATE INDEX tab_b_tab_a_fki ON tab_b(tab_a_id);

INSERT /*+ APPEND */ INTO tab_b
SELECT tab_b_seq.nextval,
       TRUNC(DBMS_RANDOM.value(1,5)),
       level
FROM   dual
CONNECT BY level <= 100;
COMMIT;

EXEC DBMS_STATS.gather_table_stats(USER, 'tab_a');
EXEC DBMS_STATS.gather_table_stats(USER, 'tab_b');

查看下面sql的执行计划,这里我们使用DBMS_XPLAN.display_cursor中新增的adaptive参数查看adaptive query plan的信息

SQL>     SELECT /*+ GATHER_PLAN_STATISTICS */
  2             a.data AS tab_a_data,
  3             b.data AS tab_b_data
  4      FROM   tab_a a
  5             JOIN tab_b b ON b.tab_a_id = a.id
  6      WHERE  a.code = 'ONE';
TAB_A_DATA TAB_B_DATA
---------- ----------
         1          4
         1          6
 .................输出省略.............................
         1         92
28 rows selected.
SQL> SET LINESIZE 200 PAGESIZE 100
SQL>     SELECT * FROM TABLE(DBMS_XPLAN.display_cursor(format => 'adaptive allstats last'));

PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------------------------------
SQL_ID  dmg8qawnr6pur, child number 0
-------------------------------------
    SELECT /*+ GATHER_PLAN_STATISTICS */            a.data AS
tab_a_data,            b.data AS tab_b_data     FROM   tab_a a
  JOIN tab_b b ON b.tab_a_id = a.id     WHERE  a.code = 'ONE'

Plan hash value: 1300943669

-----------------------------------------------------------------------------------------------------------------------
|   Id  | Operation                               | Name            | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
-----------------------------------------------------------------------------------------------------------------------
|     0 | SELECT STATEMENT                        |                 |      1 |        |     28 |00:00:00.01 |       8 |
|- *  1 |  HASH JOIN                              |                 |      1 |     25 |     28 |00:00:00.01 |       8 |
|     2 |   NESTED LOOPS                          |                 |      1 |     25 |     28 |00:00:00.01 |       8 |
|     3 |    NESTED LOOPS                         |                 |      1 |     25 |     28 |00:00:00.01 |       5 |
|-    4 |     STATISTICS COLLECTOR                |                 |      1 |        |      1 |00:00:00.01 |       2 |
|     5 |      TABLE ACCESS BY INDEX ROWID BATCHED| TAB_A           |      1 |      1 |      1 |00:00:00.01 |       2 |
|  *  6 |       INDEX RANGE SCAN                  | TAB_A_CODE      |      1 |      1 |      1 |00:00:00.01 |       1 |
|  *  7 |     INDEX RANGE SCAN                    | TAB_B_TAB_A_FKI |      1 |     25 |     28 |00:00:00.01 |       3 |
|     8 |    TABLE ACCESS BY INDEX ROWID          | TAB_B           |     28 |     25 |     28 |00:00:00.01 |       3 |
|-    9 |   TABLE ACCESS FULL                     | TAB_B           |      0 |     25 |      0 |00:00:00.01 |       0 |
-----------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("B"."TAB_A_ID"="A"."ID")
   6 - access("A"."CODE"='ONE')
   7 - access("B"."TAB_A_ID"="A"."ID")
Note
-----

   - this is an adaptive plan (rows marked '-' are inactive)

34 rows selected.

再来看看没有使用adaptive参数时的执行计划显示情况

SQL> SELECT /*+ GATHER_PLAN_STATISTICS */
  2             a.data AS tab_a_data,
  3             b.data AS tab_b_data
  4      FROM   tab_a a
  5             JOIN tab_b b ON b.tab_a_id = a.id
  6      WHERE  a.code = 'ONE';
TAB_A_DATA TAB_B_DATA
---------- ----------
         1          4
         1          6
 .................输出省略.............................
         1         92
28 rows selected.
SQL> SET LINESIZE 200 PAGESIZE 100

SQL>  SELECT * FROM TABLE(DBMS_XPLAN.display_cursor(format => 'allstats last'));

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  dmg8qawnr6pur, child number 0
-------------------------------------
    SELECT /*+ GATHER_PLAN_STATISTICS */            a.data AS
tab_a_data,            b.data AS tab_b_data     FROM   tab_a a
  JOIN tab_b b ON b.tab_a_id = a.id     WHERE  a.code = 'ONE'

Plan hash value: 1300943669

-------------------------------------------------------------------------------------------------------------------
| Id  | Operation                             | Name            | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
-------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                      |                 |      1 |        |     28 |00:00:00.01 |       9 |
|   1 |  NESTED LOOPS                         |                 |      1 |     25 |     28 |00:00:00.01 |       9 |
|   2 |   NESTED LOOPS                        |                 |      1 |     25 |     28 |00:00:00.01 |       6 |
|   3 |    TABLE ACCESS BY INDEX ROWID BATCHED| TAB_A      |      1 |      1 |      1 |00:00:00.01 |       3 |
|*  4 |     INDEX RANGE SCAN                  | TAB_A_CODE      |      1 |      1 |      1 |00:00:00.01 |       2 |
|*  5 |    INDEX RANGE SCAN                   | TAB_B_TAB_A_FKI |      1 |     25 |     28 |00:00:00.01 |       3 |
|   6 |   TABLE ACCESS BY INDEX ROWID         | TAB_B           |     28 |     25 |     28 |00:00:00.01 |       3 |
-------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   4 - access("A"."CODE"='ONE')
   5 - access("B"."TAB_A_ID"="A"."ID")

Note
-----
   - this is an adaptive plan
30 rows selected.

注意:带"-"号的部分并没有显示。

对于adaptive sql plan, v$sql.is_resolved_adaptive_plan的值为Y

select IS_RESOLVED_ADAPTIVE_PLAN ,s.sql_text from v$sql s
 where sql_text like '%tab_a%tab_b%ONE%';

IS_RESOLVED_ADAPTIVE_PLAN        SQL_TEXT

------------------------------------------------        ------------------------

Y   SELECT /*+ GATHER_PLAN_STATISTICS */            a.data AS tab_a_data,            b.data AS tab_b_data     FROM   tab_a a            JOIN tab_b b ON b.tab_a_id = a.id     WHERE  a.code = 'ONE'

现在为表插入大量数据,数据集从1变为了10001,在没有收集新的统计信息的情况下,现有的nested loop将不再是”最优“的执行计划

INSERT /*+ APPEND */ INTO tab_a
SELECT tab_a_seq.nextval,
       'ONE',
       level
FROM   dual
CONNECT BY level <= 10000;
COMMIT;

INSERT /*+ APPEND */ INTO tab_b
SELECT tab_b_seq.nextval,
       TRUNC(DBMS_RANDOM.value(11,10005)),
       level
FROM   dual
CONNECT BY level <= 10000;
COMMIT;


SQL>  SELECT num_rows FROM user_tables WHERE table_name = 'TAB_B';
  NUM_ROWS
----------
       100

 

SQL>  SELECT num_rows FROM user_tables WHERE table_name = 'TAB_A';
  NUM_ROWS
----------
       5
执行相同的语句,注意,执行计划并没有因为adaptive query plan而改变

SQL>     SELECT /*+ GATHER_PLAN_STATISTICS */
  2             a.data AS tab_a_data,
  3             b.data AS tab_b_data
  4      FROM   tab_a a
  5             JOIN tab_b b ON b.tab_a_id = a.id
  6      WHERE  a.code = 'ONE';
TAB_A_DATA TAB_B_DATA
---------- ----------
         1          4
         1          6
 .................输出省略.............................
      5180       9765
      1226       9766
10028 rows selected.
SQL> SET LINESIZE 200 PAGESIZE 100

SQL>  SELECT * FROM TABLE(DBMS_XPLAN.display_cursor(format => 'allstats last adaptive'));

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  2qrt2y39aq9z3, child number 0
-------------------------------------
SELECT /*+ GATHER_PLAN_STATISTICS */            a.data AS tab_a_data,
         b.data AS tab_b_data     FROM   tab_a a            JOIN tab_b
b ON b.tab_a_id = a.id     WHERE  a.code = 'ONE'

Plan hash value: 1300943669

--------------------------------------------------------------------------------------------------------------------------------
|   Id  | Operation                               | Name            | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  |
--------------------------------------------------------------------------------------------------------------------------------
|     0 | SELECT STATEMENT                        |                 |      1 |        |  10019 |00:00:00.04 |   12617 |     30 |
|- *  1 |  HASH JOIN                              |                 |      1 |     25 |  10019 |00:00:00.04 |   12617 |     30 |
|     2 |   NESTED LOOPS                          |                 |      1 |     25 |  10019 |00:00:00.04 |   12617 |     30 |
|     3 |    NESTED LOOPS                         |                 |      1 |     25 |  10019 |00:00:00.02 |    3040 |     30 |
|-    4 |     STATISTICS COLLECTOR                |                 |      1 |        |  10001 |00:00:00.01 |    1428 |     30 |
|     5 |      TABLE ACCESS BY INDEX ROWID BATCHED| TAB_A           |      1 |      1 |  10001 |00:00:00.01 |    1428 |     30 |
|  *  6 |       INDEX RANGE SCAN                  | TAB_A_CODE      |      1 |      1 |  10001 |00:00:00.01 |     738 |      0 |
|  *  7 |     INDEX RANGE SCAN                    | TAB_B_TAB_A_FKI |  10001 |     25 |  10019 |00:00:00.01 |    1612 |      0 |
|     8 |    TABLE ACCESS BY INDEX ROWID          | TAB_B           |  10019 |     25 |  10019 |00:00:00.01 |    9577 |      0 |
|-    9 |   TABLE ACCESS FULL                     | TAB_B           |      0 |     25 |      0 |00:00:00.01 |       0 |      0 |
--------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("B"."TAB_A_ID"="A"."ID")
   6 - access("A"."CODE"='ONE')
   7 - access("B"."TAB_A_ID"="A"."ID")

Note
-----
   - this is an adaptive plan (rows marked '-' are inactive)
34 rows selected.

将执行计划刷出shared_pool,依旧可以看到CBO使用了adaptive query plan,但这次,CBO发现用于估算的统计信息和实际执行期间收集的统计信息相差较大,CBO改使用hash join进行连接。

SQL>   ALTER SYSTEM FLUSH SHARED_POOL;

SQL>     SELECT /*+ GATHER_PLAN_STATISTICS */
  2             a.data AS tab_a_data,
  3             b.data AS tab_b_data
  4      FROM   tab_a a
  5             JOIN tab_b b ON b.tab_a_id = a.id
  6      WHERE  a.code = 'ONE';
TAB_A_DATA TAB_B_DATA
---------- ----------
         1          4
         1          6

.................输出省略.............................
      5180       9765
      1226       9766
10028 rows selected.

SQL> SET LINESIZE 200 PAGESIZE 100
SQL> SELECT * FROM TABLE(DBMS_XPLAN.display_cursor(format => 'adaptive allstats last'));
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  dmg8qawnr6pur, child number 0
-------------------------------------
    SELECT /*+ GATHER_PLAN_STATISTICS */            a.data AS
tab_a_data,            b.data AS tab_b_data     FROM   tab_a a
  JOIN tab_b b ON b.tab_a_id = a.id     WHERE  a.code = 'ONE'

Plan hash value: 1213658146

--------------------------------------------------------------------------------------------------------------------------------------------------
|   Id  | Operation                               | Name            | Starts | E-Rows | A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
--------------------------------------------------------------------------------------------------------------------------------------------------
|     0 | SELECT STATEMENT                        |                 |      1 |        |  10028 |00:00:00.02 |     761 |       |       |      |
|  *  1 |  HASH JOIN                              |                 |      1 |     25 |  10028 |00:00:00.02 |     761 |  1899K|  1899K| 1958K (0)|
|-    2 |   NESTED LOOPS                          |                 |      1 |     25 |  10001 |00:00:00.03 |      63 |       |       |      |
|-    3 |    NESTED LOOPS                         |                 |      1 |     25 |  10001 |00:00:00.03 |      63 |       |       |      |
|-    4 |     STATISTICS COLLECTOR                |                 |      1 |        |  10001 |00:00:00.02 |      63 |       |       |      |
|     5 |      TABLE ACCESS BY INDEX ROWID BATCHED| TAB_A           |      1 |      1 |  10001 |00:00:00.01 |      63 |       |       |      |
|  *  6 |       INDEX RANGE SCAN                  | TAB_A_CODE      |      1 |      1 |  10001 |00:00:00.01 |      37 |       |       |      |
|- *  7 |     INDEX RANGE SCAN                    | TAB_B_TAB_A_FKI |      0 |     25 |      0 |00:00:00.01 |       0 |       |       |      |
|-    8 |    TABLE ACCESS BY INDEX ROWID          | TAB_B           |      0 |     25 |      0 |00:00:00.01 |       0 |       |       |      |
|     9 |   TABLE ACCESS FULL                     | TAB_B           |      1 |     25 |  10100 |00:00:00.01 |     698 |       |       |      |
--------------------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("B"."TAB_A_ID"="A"."ID")
   6 - access("A"."CODE"='ONE')
   7 - access("B"."TAB_A_ID"="A"."ID")

Note
-----
   - this is an adaptive plan (rows marked '-' are inactive)

34 rows selected.

Adaptive Parallel Distribution Method

通 常情况下, 并行SQL会进行数据的重新分部来执行相关操作,如并行排序,聚会和join. Oracle有多种不同的数据分布方式,如broadcast ,hash for the left input, round-robin和hash for the right input等。Oracle根据重新分布的数据量和并行进程的个数决定使用哪一种数据分布方式。
通常情况下可概括为以下两种情况:

  • 大量parallel server processes处理少数数据。在这种情况下,数据库有可能使用broadcast方式进行分布,每个结果集中的数据行会由单独的parallel server processes处理
  • 少数parallel server processes处理大量数据。一种典型的情况就是,当在倾斜的数据重新分布时,CBO很可能使用hash的分布方式以保证每个并行进程处理等量的数据

Oracle Database 12c引进了hybrid hash的分布方式,该分布方式会根据statistics collectors在执行期间收集的信息决定如何分配parallel slave process之间的数据。和adaptive join method不同的是,在sql每次执行的时候adaptive parallel distribution method都有机会改变数据的分布方式。

Hybrid Hash Adaptive Distribution方式会假设使用hash的数据分布方式,如果producer返回的数据比阀值(两倍于degree of parallelism (DOP) )少,则会改用broadcast的分布方式。

接下来会用之前的例子来继续使用,但需要重新收集和flush shared pool。

SQL> EXEC DBMS_STATS.gather_table_stats(USER, 'TAB_A');
PL/SQL procedure successfully completed.
SQL> EXEC DBMS_STATS.gather_table_stats(USER, 'TAB_B');
PL/SQL procedure successfully completed.

SQL> ALTER SYSTEM FLUSH SHARED_POOL;
System altered.

执行相同sql,但这次加上了parallel hint

SQL> SELECT /*+ GATHER_PLAN_STATISTICS
           PARALLEL(16) */
       a.data AS tab_a_data,
       b.data AS tab_b_data
FROM   tab_a a
       JOIN tab_b b ON b.tab_a_id = a.id
WHERE  a.code = 'ONE';

SQL> SET LINESIZE 200 PAGESIZE 100

SQL> SELECT * FROM TABLE(DBMS_XPLAN.display_cursor(format => 'allstats last adaptive'));

PLAN_TABLE_OUTPUT

--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  6dyu38vv8p592, child number 0
-------------------------------------
SELECT /*+ GATHER_PLAN_STATISTICS            PARALLEL(16) */
a.data AS tab_a_data,        b.data AS tab_b_data FROM   tab_a a
JOIN tab_b b ON b.tab_a_id = a.id WHERE  a.code = 'ONE'

Plan hash value: 3273022468

----------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                  | Name     | Starts | E-Rows | A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
----------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT           |          |      1 |        |  10028 |00:00:00.08 |     742 |       |       |          |
|   1 |  PX COORDINATOR            |          |      1 |        |  10028 |00:00:00.08 |     742 |       |       |          |
|   2 |   PX SEND QC (RANDOM)      | :TQ10002 |      1 |  10100 |  10028 |00:00:00.04 |     732 |       |       |          |
|*  3 |    HASH JOIN BUFFERED      |          |      1 |  10100 |  10028 |00:00:00.03 |     732 |  2517K|  2517K| 2892K (0)|
|   4 |     PX RECEIVE             |          |      1 |  10001 |  10001 |00:00:00.01 |      33 |       |       |          |
|   5 |      PX SEND HYBRID HASH   | :TQ10000 |      1 |  10001 |  10001 |00:00:00.01 |      33 |       |       |          |
|   6 |       STATISTICS COLLECTOR |          |      1 |        |  10001 |00:00:00.01 |      33 |       |       |          |
|   7 |        PX BLOCK ITERATOR   |          |      1 |  10001 |  10001 |00:00:00.01 |      33 |       |       |          |
|*  8 |         TABLE ACCESS FULL  | TAB_A    |      1 |  10001 |  10001 |00:00:00.01 |      33 |       |       |          |
|   9 |     PX RECEIVE             |          |      1 |  10100 |  10100 |00:00:00.01 |     699 |       |       |          |
|  10 |      PX SEND HYBRID HASH   | :TQ10001 |      1 |  10100 |  10100 |00:00:00.01 |     699 |       |       |          |
|  11 |       PX BLOCK ITERATOR    |          |      1 |  10100 |  10100 |00:00:00.01 |     699 |       |       |          |
|* 12 |        TABLE ACCESS FULL   | TAB_B    |      1 |  10100 |  10100 |00:00:00.01 |     699 |       |       |          |
----------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("B"."TAB_A_ID"="A"."ID")
   8 - access(:Z>=:Z AND :Z<=:Z)
       filter("A"."CODE"='ONE')
  12 - access(:Z>=:Z AND :Z<=:Z)

Note
-----
   - Degree of Parallelism is 16 because of hint

38 rows selected.

可以看到CBO使用了PX SEND HYBRID HASH的方式。

Adaptive Statistics
12c中的Adaptive Statistics由以下3部分组成

  • Dynamic Statistics,是统计信息收集的一种补充手段,与之前的Dynamic Sampling类似,该技术能使CBO在没有收集统计信息的前提下得到足够的信息以保证做出正确的执行计划。
  • Automatic Reoptimization,尽管adaptive query plans可以让CBO在多个子执行计划中进行选择,但并是总能改变执行期间的执行计划。例如,在sql查询时使用了不理想的驱动表, adaptive plans对此便无能为力。在这种情况下,CBO将会考虑Automatic Reoptimization. 与adaptive plans不同的是, automatic reoptimization对sql后续的执行都起作用。
    在第一次SQL执行结束后,优化器会根据 执行期间收集的信息决定是否进行reoptimization。如果执行信息和优化器的估计值有很大不同,优化器会在下一次执行时尝试生成新的可选执行计 划,这种特性会在sql每次执行时都触发,通过每次收集的信息不断改善执行计划。
  • SQL Plan Directives,SQL plan directive可以看作是额外的统计信息,优化器利用SQL plan directive生成更优的执行计划.例如,SQL语句join的表有数据倾斜的情况,优化器会根据SQL plan directive判断是否进行动态采样以获得更准确的执行计划。SQL plan directive不是以语句为单位,而是以表达式为单位,这也就意味着优化器可以对多个不同的sql应用相同的SQL plan directive。默认情况下Oracle每15分钟会自动将内存中的SQL plan directive写入SYSAUX表空间,也可以通过DBMS_SPD包进行手动管理。
Dynamic Statistics

详细内容可以参见笔者的另一篇文章http://czmmiao.iteye.com/blog/1484571,这里不再赘述。

Automatic Reoptimization

Reoptimization: Statistics Feedback

这里我们新建一张tab_c表

SQL> create table tab_c as select * from tab_b where data between 4000 and 8000;
Table created.

设置错误的统计信息

EXEC DBMS_STATS.SET_TABLE_STATS('SCOTT','TAB_A',NUMROWS=>1);
EXEC DBMS_STATS.SET_TABLE_STATS('SCOTT','TAB_B',NUMROWS=>1);
EXEC DBMS_STATS.SET_TABLE_STATS('SCOTT','TAB_C',NUMROWS=>1);|

执行下列语句

SQL> SELECT /*+ GATHER_PLAN_STATISTICS  */ b.id, v.CODE
FROM   tab_b b,
       ( SELECT c.id, a.CODE
         FROM   tab_a a, tab_c c
         WHERE  c.ID = a.ID
         AND    C.ID < 7000
         AND    C.DATA > 5000 ) v
WHERE  b.ID = v.ID;

     ID        CODE

----------- --------------

       6992 ONE
      6993 ONE
      6994 ONE
 .................输出省略.............................
      5180       9765
      1226       9766
1899 rows selected.

SQL> SET LINESIZE 200 PAGESIZE 100
SQL> SELECT * FROM TABLE(DBMS_XPLAN.display_cursor(format => 'allstats last adaptive'));

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  as7bp59xsbp2m, child number 0
-------------------------------------
SELECT /*+ GATHER_PLAN_STATISTICS  */ b.id, v.CODE FROM   tab_b b,
  ( SELECT c.id, a.CODE          FROM   tab_a a, tab_c c          WHERE
 c.ID = a.ID          AND    C.ID < 7000          AND    C.DATA > 5000
) v WHERE  b.ID = v.ID

Plan hash value: 1222210870

---------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name     | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
---------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |          |      1 |        |   1899 |00:00:00.01 |    2373 |
|   1 |  NESTED LOOPS                |          |      1 |      1 |   1899 |00:00:00.01 |    2373 |
|   2 |   NESTED LOOPS               |          |      1 |      1 |   1899 |00:00:00.01 |     474 |
|   3 |    NESTED LOOPS              |          |      1 |      1 |   1899 |00:00:00.01 |     307 |
|*  4 |     TABLE ACCESS FULL        | TAB_C    |      1 |      1 |   1899 |00:00:00.01 |     140 |
|*  5 |     INDEX UNIQUE SCAN        | TAB_B_PK |   1899 |      1 |   1899 |00:00:00.01 |     167 |
|*  6 |    INDEX UNIQUE SCAN         | TAB_A_PK |   1899 |      1 |   1899 |00:00:00.01 |     167 |
|   7 |   TABLE ACCESS BY INDEX ROWID| TAB_A    |   1899 |      1 |   1899 |00:00:00.01 |    1899 |
---------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   4 - filter(("C"."ID"<7000 AND "C"."DATA">5000))
   5 - access("B"."ID"="C"."ID")
       filter("B"."ID"<7000)
   6 - access("C"."ID"="A"."ID")
       filter("A"."ID"<7000)

31 rows selected.
可以看到统计信息的估计值和实际值有严重偏差,再次执行相同语句。

SQL> SELECT /*+ GATHER_PLAN_STATISTICS  */ b.id, v.CODE
FROM   tab_b b,
       ( SELECT c.id, a.CODE
         FROM   tab_a a, tab_c c
         WHERE  c.ID = a.ID
         AND    C.ID < 7000
         AND    C.DATA > 5000 ) v
WHERE  b.ID = v.ID;

     ID        CODE

----------- --------------

       6992 ONE
      6993 ONE
      6994 ONE
 .................输出省略.............................
      5180       9765
      1226       9766
1899 rows selected.

SQL> SET LINESIZE 200 PAGESIZE 100
SQL> SELECT * FROM TABLE(DBMS_XPLAN.display_cursor(format => 'allstats last adaptive'));

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  as7bp59xsbp2m, child number 1
-------------------------------------
SELECT /*+ GATHER_PLAN_STATISTICS  */ b.id, v.CODE FROM   tab_b b,
  ( SELECT c.id, a.CODE          FROM   tab_a a, tab_c c          WHERE
 c.ID = a.ID          AND    C.ID < 7000          AND    C.DATA > 5000
) v WHERE  b.ID = v.ID

Plan hash value: 204912182

----------------------------------------------------------------------------------------------------
| Id  | Operation                     | Name     | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
----------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |          |      1 |        |   1899 |00:00:00.03 |    2373 |
|   1 |  NESTED LOOPS                 |          |      1 |   1899 |   1899 |00:00:00.03 |    2373 |
|   2 |   NESTED LOOPS                |          |      1 |      1 |   1899 |00:00:00.05 |    2206 |
|*  3 |    TABLE ACCESS FULL          | TAB_C    |      1 |   1899 |   1899 |00:00:00.01 |     140 |
|   4 |    TABLE ACCESS BY INDEX ROWID| TAB_A    |   1899 |      1 |   1899 |00:00:00.01 |    2066 |
|*  5 |     INDEX UNIQUE SCAN         | TAB_A_PK |   1899 |      1 |   1899 |00:00:00.01 |     167 |
|*  6 |   INDEX UNIQUE SCAN           | TAB_B_PK |   1899 |  80020 |   1899 |00:00:00.01 |     167 |
----------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - filter(("C"."ID"<7000 AND "C"."DATA">5000))
   5 - access("C"."ID"="A"."ID")
       filter("A"."ID"<7000)
   6 - access("B"."ID"="C"."ID")
       filter("B"."ID"<7000)

Note
-----
   - statistics feedback used for this statement

34 rows selected.

可以看到,表的连接顺序发生了变化,statistics feedback used for this statement表明,执行期间收集的统计信息使优化器进行了Automatic Reoptimization。注意,第6步队TAB_B的估算仍然偏差较大。

再次执行相同语句

SQL> SELECT /*+ GATHER_PLAN_STATISTICS  */ b.id, v.CODE
FROM   tab_b b,
       ( SELECT c.id, a.CODE
         FROM   tab_a a, tab_c c
         WHERE  c.ID = a.ID
         AND    C.ID < 7000
         AND    C.DATA > 5000 ) v
WHERE  b.ID = v.ID;

     ID        CODE

----------- --------------

       6992 ONE
      6993 ONE
      6994 ONE
 .................输出省略.............................
      5180       9765
      1226       9766
1899 rows selected.

SQL> SET LINESIZE 200 PAGESIZE 100
SQL> SELECT * FROM TABLE(DBMS_XPLAN.display_cursor(format => 'allstats last adaptive'));

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------
SQL_ID  as7bp59xsbp2m, child number 2
-------------------------------------
SELECT /*+ GATHER_PLAN_STATISTICS  */ b.id, v.CODE FROM   tab_b b,
  ( SELECT c.id, a.CODE          FROM   tab_a a, tab_c c          WHERE
 c.ID = a.ID          AND    C.ID < 7000          AND    C.DATA > 5000
) v WHERE  b.ID = v.ID

Plan hash value: 1222210870

---------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name     | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
---------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |          |      1 |        |   1899 |00:00:00.01 |    2373 |
|   1 |  NESTED LOOPS                |          |      1 |   1899 |   1899 |00:00:00.01 |    2373 |
|   2 |   NESTED LOOPS               |          |      1 |   1899 |   1899 |00:00:00.01 |     474 |
|   3 |    NESTED LOOPS              |          |      1 |   1899 |   1899 |00:00:00.01 |     307 |
|*  4 |     TABLE ACCESS FULL        | TAB_C    |      1 |   1899 |   1899 |00:00:00.01 |     140 |
|*  5 |     INDEX UNIQUE SCAN        | TAB_B_PK |   1899 |      1 |   1899 |00:00:00.01 |     167 |
|*  6 |    INDEX UNIQUE SCAN         | TAB_A_PK |   1899 |      1 |   1899 |00:00:00.01 |     167 |
|   7 |   TABLE ACCESS BY INDEX ROWID| TAB_A    |   1899 |      1 |   1899 |00:00:00.01 |    1899 |
---------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   4 - filter(("C"."ID"<7000 AND "C"."DATA">5000))
   5 - access("B"."ID"="C"."ID")
       filter("B"."ID"<7000)
   6 - access("C"."ID"="A"."ID")
       filter("A"."ID"<7000)

Note
-----
   - statistics feedback used for this statement
35 rows selected.

可以看到,执行计划有回到了之前的版本,但有了更准确的估计信息,显然优化器再次进行了Automatic Reoptimization

注意,对于可Automatic Reoptimization的语句,v$sql.IS_REOPTIMIZABLE 会被标记为'Y'

select  EXECUTIONS, sql_id, CHILD_NUMBER,IS_REOPTIMIZABLE, LAST_LOAD_TIME from v$sql
 where sql_text like '% /*+ GATHER_PLAN_STATISTICS  */ b.id, v.CODE FROM   tab_b b%';
EXECUTIONS  SQL_ID  CHILD_NUMBER  IS_REOPTIMIZABLE     LAST_LOAD_TIME
 --------  ---------   ---------   ---------          ------------------                   
1             as7bp59xsbp2m    0                 Y                2015-02-05/11:47:15
1             as7bp59xsbp2m    1                 Y                2015-02-05/11:47:49
1             as7bp59xsbp2m    2                 N                2015-02-05/11:47:49

Reoptimization: Performance Feedback
另一种形式的
reoptimization 是performance feedback. 当PARALLEL_DEGREE_POLICY设为adaptive时,该reoptimization能够自动根据性能状况改善并行度。 performance feedback的过程可以概括如下:

  1. 当PARALLEL_DEGREE_POLICY设置为ADAPTIVE,在第一次执行sql时,优化器判断是否使用并行,
  2. 如果使用并行,优化器根据估算的语句性能决定并行度,同时监控语句性能。
  3. 在第一次语句执行结束后,优化器进行对如下事项进行比较:
    • 优化器选择的并行度
    • 优化器根据语句执行时收集的性能统计信息(CPU时间等)计算出的并行度。
    • 如果两者差别很大,Oracle将会标识该语句为需要重新解析,并存储初次手机的执行信息作为反馈,用于为后续的执行生成更优的并行度。

SQL Plan Directives

在sql执行期间,如果错误的估计了cardinality, Oracle便会生成SQL plan directives。在sql编译期间,相应的SQL plan directives信息以判断是否缺少扩展统计信息或者直方图。如果缺少,优化器便会进行记录,并后续的调用DBMS_STATS进行收集。

只要优化器没有足够的directive对应的统计信息,优化器都会通过dynamic statistics进行收集。例如, 优化器会在发现错误估算统计信息之后,通过dynamic statistics收集column group的统计信息。目前优化器还只能监控column group,无法监控表达式上的extension statistics如上所述,SQL plan directives,并不是以sql语句为单位,优化器可以在相似的语句上应用相同的SQL plan directives。

创建新表tab_d, flush shared pool,并设置错误的统计信息。

create table tab_d
 as
 select  TRUNC(DBMS_RANDOM.value(1,10)) col1, TRUNC(DBMS_RANDOM.value(11,20)) col2
   from dual connect by level <= 1000;

 

ALTER SYSTEM FLUSH SHARED_POOL;

 

EXEC DBMS_STATS.SET_TABLE_STATS('SCOTT','TAB_D',NUMROWS=>1);
执行下列语句,

SQL>    select /*+ GATHER_PLAN_STATISTICS  */ * from tab_d
  2      where col1=2 and col2=12;
      COL1       COL2
---------- ----------
         2         12
         2         12
         2         12
         2         12
         2         12
         2         12
         2         12

7 rows selected.

SQL>  SELECT * FROM TABLE(DBMS_XPLAN.display_cursor(format => 'allstats last adaptive'));
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------
SQL_ID  amnmarqm0py0a, child number 0
-------------------------------------
   select /*+ GATHER_PLAN_STATISTICS  */ * from tab_d     where col1=2
and col2=12

Plan hash value: 1458238053

-------------------------------------------------------------------------------------
| Id  | Operation         | Name  | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |       |      1 |        |      7 |00:00:00.01 |       5 |
|*  1 |  TABLE ACCESS FULL| TAB_D |      1 |     1 |      7 |00:00:00.01 |       5 |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(("COL1"=2 AND "COL2"=12))
19 rows selected.
查询v$sql发现IS_REOPTIMIZABLE为Y
select  EXECUTIONS,SQL_ID,CHILD_NUMBER,IS_REOPTIMIZABLE,LAST_LOAD_TIME from v$sql
 where sql_text like '% /*+ GATHER_PLAN_STATISTICS  */%tab_d%';

EXECUTIONS  SQL_ID  CHILD_NUMBER  IS_REOPTIMIZABLE     LAST_LOAD_TIME
 --------  ---------   ---------   ---------          ------------------ 

1    1k7puah78vpdn    0    Y    2015-02-05/13:05:18
再次执行相同语句
SQL>  select /*+ GATHER_PLAN_STATISTICS  */ * from tab_d
  2    where col1=2 and col2=12;

      COL1       COL2
---------- ----------
         2         12
         2         12
         2         12
         2         12
         2         12
         2         12
         2         12

7 rows selected.
SQL> SELECT * FROM TABLE(DBMS_XPLAN.display_cursor(format => 'allstats last adaptive'));

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  1k7puah78vpdn, child number 1
-------------------------------------
 select /*+ GATHER_PLAN_STATISTICS  */ * from tab_d   where col1=2 and
col2=12

Plan hash value: 1458238053

-------------------------------------------------------------------------------------
| Id  | Operation         | Name  | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |       |      1 |        |      7 |00:00:00.01 |       5 |
|*  1 |  TABLE ACCESS FULL| TAB_D |      1 |      7 |      7 |00:00:00.01 |       5 |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(("COL1"=2 AND "COL2"=12))

Note
-----
   - statistics feedback used for this statement

23 rows selected.
使用了statistics feedback, 将shared pool中的sql directive写入到sysaux表空间中。

EXEC DBMS_SPD.FLUSH_SQL_PLAN_DIRECTIVE;
     
SELECT o.OWNER, o.OBJECT_NAME,
o.SUBOBJECT_NAME col_name, o.OBJECT_TYPE, d.TYPE, d.STATE, d.REASON
FROM   DBA_SQL_PLAN_DIRECTIVES d, DBA_SQL_PLAN_DIR_OBJECTS o
WHERE  d.DIRECTIVE_ID=o.DIRECTIVE_ID
AND    o.OWNER IN ('SCOTT')
AND o.object_name='TAB_D'
ORDER BY 1,2,3,4,5;

OWNER  OBJECT_NAME  col_name       OBJECT_TYPE       TYPE       STATE      REASON

-------------   ---------------       ---------------     ----------------------    -------------  -------------  -------------

SCOTT    TAB_D    COL1    COLUMN    DYNAMIC_SAMPLING    USABLE    SINGLE TABLE CARDINALITY MISESTIMATE
SCOTT    TAB_D    COL2    COLUMN    DYNAMIC_SAMPLING    USABLE    SINGLE TABLE CARDINALITY MISESTIMATE
SCOTT    TAB_D        TABLE    DYNAMIC_SAMPLING    USABLE    SINGLE TABLE CARDINALITY MISESTIMATE
SQL>    select /*+ GATHER_PLAN_STATISTICS  */ * from tab_d
  2      where col1=2 and col2=12;
      COL1       COL2
---------- ----------
         2         12
         2         12
         2         12
         2         12
         2         12
         2         12
         2         12

7 rows selected.

SQL> SELECT * FROM TABLE(DBMS_XPLAN.display_cursor(format => 'allstats last adaptive'));

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  amnmarqm0py0a, child number 0
-------------------------------------
   select /*+ GATHER_PLAN_STATISTICS  */ * from tab_d     where col1=2
and col2=12

Plan hash value: 1458238053

-------------------------------------------------------------------------------------
| Id  | Operation         | Name  | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |       |      1 |        |      7 |00:00:00.01 |       5 |
|*  1 |  TABLE ACCESS FULL| TAB_D |      1 |      7 |      7 |00:00:00.01 |       5 |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(("COL1"=2 AND "COL2"=12))

Note
-----
   - dynamic statistics used: dynamic sampling (level=2)
   - 1 Sql Plan Directive used for this statement

24 rows selected.
可以看到该语句使用了sql plan directive,尝试使用不同谓词,看下结果如何
SQL>    select /*+ GATHER_PLAN_STATISTICS  */ * from tab_d
  2      where col1=2 and col2=13;

      COL1       COL2
---------- ----------
         2         13
         2         13
         2         13
         2         13
         2         13
         2         13
         2         13
         2         13
         2         13
         2         13
         2         13
         2         13
         2         13

13 rows selected.

SQL> SELECT * FROM TABLE(DBMS_XPLAN.display_cursor(format => 'allstats last adaptive'));

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  9w6458332qkgy, child number 0
-------------------------------------
   select /*+ GATHER_PLAN_STATISTICS  */ * from tab_d     where col1=2
and col2=13

Plan hash value: 1458238053

-------------------------------------------------------------------------------------
| Id  | Operation         | Name  | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |       |      1 |        |     13 |00:00:00.01 |       5 |
|*  1 |  TABLE ACCESS FULL| TAB_D |      1 |     13 |     13 |00:00:00.01 |       5 |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(("COL1"=2 AND "COL2"=13))

Note
-----
   - dynamic statistics used: dynamic sampling (level=2)
   - 1 Sql Plan Directive used for this statement

24 rows selected.
发现同样适用了sql plan directive,由此可证sql plan directive不是语句级的,而是表达式级的。

 

参考至:http://docs.oracle.com/database/121/TGSQL/tgsql_stats.htm#TGSQL424
            http://docs.oracle.com/database/121/TGSQL/tgsql_optcncpt.htm#TGSQL94983
            http://docs.oracle.com/database/121/TGSQL/tgsql_statscon.htm#TGSQL341
            http://docs.oracle.com/database/121/TGSQL/tgsql_statscon.htm#TGSQL347
            http://docs.oracle.com/database/121/TGSQL/tgsql_optcncpt.htm#TGSQL221
            http://docs.oracle.com/database/121/TGSQL/tgsql_statscon.htm#TGSQL344
            http://docs.oracle.com/database/121/TGSQL/tgsql_astat.htm#TGSQL95102
            http://docs.oracle.com/database/121/TGSQL/tgsql_influence.htm#TGSQL257
            http://oracle-base.com/articles/12c/adaptive-plans-12cr1.php

本文原创,转载请出处,作者

如有错误,欢迎指正

邮箱:czmcj@163.com

0
0
分享到:
评论

相关推荐

Global site tag (gtag.js) - Google Analytics