By Franck Pachot
.
This was initially posted to CERN Database blog on Wednesday, 12 September 2018 where it seems to be lost. Here is a copy thanks to web.archive.org
Demo table
I create a table for the demo. The CTAS gathers statistics (12c online statistics gathering) with one row and then I insert more rows:
10:33:56 SQL> create table DEMO as select rownum n from dual;
Table DEMO created.
10:33:56 SQL> insert into DEMO select rownum n from xmltable('1 to 41');
41 rows inserted.
10:33:56 SQL> commit;
Commit complete.
The estimations are stale: estimates 1 row (E-Rows) but 42 actual rows (A-Rows)
10:33:56 SQL> select /*+ gather_plan_statistics */ count(*) from DEMO;
COUNT(*)
--------
42
10:33:57 SQL> select * from table(dbms_xplan.display_cursor(format=>'basic +rows +rowstats last'));
PLAN_TABLE_OUTPUT
-----------------
EXPLAINED SQL STATEMENT:
------------------------
select /*+ gather_plan_statistics */ count(*) from DEMO
Plan hash value: 2180342005
--------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows |
--------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 1 |
| 1 | SORT AGGREGATE | | 1 | 1 | 1 |
| 2 | TABLE ACCESS FULL| DEMO | 1 | 1 | 42 |
--------------------------------------------------------------
Pending Statistics
Here we are: I want to gather statistics on this table. But I will lower all risks by not publishing them immediately. Current statistics preferences are set to PUBLISH=TRUE:
10:33:58 SQL> select num_rows,cast(last_analyzed as timestamp),dbms_stats.get_prefs('PUBLISH',owner,table_name) from dba_tab_statistics where owner='DEMO' and table_name in ('DEMO');
NUM_ROWS CAST(LAST_ANALYZEDASTIMESTAMP) DBMS_STATS.GET_PREFS('PUBLISH',OWNER,TABLE_NAME)
-------- ------------------------------ ------------------------------------------------
1 12-SEP-18 10.33.56.000000000 AM TRUE
I set it to FALSE:
10:33:59 SQL> exec dbms_stats.set_table_prefs('DEMO','DEMO','publish','false');
PL/SQL procedure successfully completed.
10:34:00 SQL> select num_rows,cast(last_analyzed as timestamp),dbms_stats.get_prefs('PUBLISH',owner,table_name) from dba_tab_statistics where owner='DEMO' and table_name in ('DEMO');
NUM_ROWS CAST(LAST_ANALYZEDASTIMESTAMP) DBMS_STATS.GET_PREFS('PUBLISH',OWNER,TABLE_NAME)
-------- ------------------------------ ------------------------------------------------
1 12-SEP-18 10.33.56.000000000 AM FALSE
I’m now gathering stats as I want to:
10:34:01 SQL> exec dbms_stats.gather_table_stats('DEMO','DEMO');
PL/SQL procedure successfully completed.
Test Pending Statistics
They are not published. But to test my queries with those new stats, I can set my session to use pending statistics:
10:34:02 SQL> alter session set optimizer_use_pending_statistics=true;
Session altered.
Running my query again, I can see the good estimations (E-Rows=A-Rows)
10:34:03 SQL> select /*+ gather_plan_statistics */ count(*) from DEMO;
COUNT(*)
--------
42
10:34:04 SQL> select * from table(dbms_xplan.display_cursor(format=>'basic +rows +rowstats last'));
PLAN_TABLE_OUTPUT
-----------------
EXPLAINED SQL STATEMENT:
------------------------
select /*+ gather_plan_statistics */ count(*) from DEMO
Plan hash value: 2180342005
--------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows |
--------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 1 |
| 1 | SORT AGGREGATE | | 1 | 1 | 1 |
| 2 | TABLE ACCESS FULL| DEMO | 1 | 42 | 42 |
--------------------------------------------------------------
The published statistics still show 1 row:
10:34:05 SQL> select num_rows,cast(last_analyzed as timestamp),dbms_stats.get_prefs('PUBLISH',owner,table_name) from dba_tab_statistics where owner='DEMO' and table_name in ('DEMO');
NUM_ROWS CAST(LAST_ANALYZEDASTIMESTAMP) DBMS_STATS.GET_PREFS('PUBLISH',OWNER,TABLE_NAME)
-------- ------------------------------ ------------------------------------------------
1 12-SEP-18 10.33.56.000000000 AM FALSE
But I can query the pending ones before publishing them:
10:34:05 SQL> c/dba_tab_statistics/dba_tab_pending_stats
1* select num_rows,cast(last_analyzed as timestamp),dbms_stats.get_prefs('PUBLISH',owner,table_name) from dba_tab_pending_stats where owner='DEMO' and table_name in ('DEMO');
10:34:05 SQL> /
NUM_ROWS CAST(LAST_ANALYZEDASTIMESTAMP) DBMS_STATS.GET_PREFS('PUBLISH',OWNER,TABLE_NAME)
-------- ------------------------------ ------------------------------------------------
42 12-SEP-18 10.34.01.000000000 AM FALSE
I’ve finished my test with pending statistics:
10:34:05 SQL> alter session set optimizer_use_pending_statistics=false;
Session altered.
Note that if you have Real Application Testing, you can use SQL Performance Analyzer to test the pending statistics on a whole SQL Tuning Set representing the critical queries of your application. Of course, the more you test there, the better it is.
Delete Pending Statistics
Now let’s say that my test shows that the new statistics are not good, I can simply delete the pending statistics:
10:34:06 SQL> exec dbms_stats.delete_pending_stats('DEMO','DEMO');
PL/SQL procedure successfully completed.
Then all queries are still using the previous statistics:
10:34:07 SQL> show parameter pending
NAME TYPE VALUE
-------------------------------- ------- -----
optimizer_use_pending_statistics boolean FALSE
10:34:07 SQL> select /*+ gather_plan_statistics */ count(*) from DEMO;
COUNT(*)
--------
42
10:34:08 SQL> select * from table(dbms_xplan.display_cursor(format=>'basic +rows +rowstats last'));
PLAN_TABLE_OUTPUT
-----------------
EXPLAINED SQL STATEMENT:
------------------------
select /*+ gather_plan_statistics */ count(*) from DEMO
Plan hash value: 2180342005
--------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows |
--------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 1 |
| 1 | SORT AGGREGATE | | 1 | 1 | 1 |
| 2 | TABLE ACCESS FULL| DEMO | 1 | 1 | 42 |
--------------------------------------------------------------
Accept Pending Statistics
Now I’ll show the second case where my tests show that the new statistics gathering is ok. I gather statistics again:
10:34:09 SQL> exec dbms_stats.gather_table_stats('DEMO','DEMO');
PL/SQL procedure successfully completed.
10:34:09 SQL> alter session set optimizer_use_pending_statistics=true;
Session altered.
10:34:11 SQL> select /*+ gather_plan_statistics */ count(*) from DEMO;
COUNT(*)
--------
42
10:34:12 SQL> select * from table(dbms_xplan.display_cursor(format=>'basic +rows +rowstats last'));
PLAN_TABLE_OUTPUT
-----------------
EXPLAINED SQL STATEMENT:
------------------------
select /*+ gather_plan_statistics */ count(*) from DEMO
Plan hash value: 2180342005
--------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows |
--------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 1 |
| 1 | SORT AGGREGATE | | 1 | 1 | 1 |
| 2 | TABLE ACCESS FULL| DEMO | 1 | 42 | 42 |
--------------------------------------------------------------
10:34:12 SQL> alter session set optimizer_use_pending_statistics=false;
Session altered.
When I’m ok with the new statistics I can publish them so that other sessions can see them. As doing this in production is probably a fix for a critical problem, I want the effects to take immediately, invalidating all cursors:
10:34:13 SQL> exec dbms_stats.publish_pending_stats('DEMO','DEMO',no_invalidate=>false);
PL/SQL procedure successfully completed.
The default NO_INVALIDATE value is probably to avoid in those cases because you want to see the side effects, if any, as soon as possible. Not within a random window of 5 hours later where you have left the office. I set back the table preference to PUBLISH=TRUE and check that the new statistics are visible in DBA_TAB_STATISTICS (and no more in DBA_TAB_PENDING_STATS):
10:34:14 SQL> exec dbms_stats.set_table_prefs('DEMO','DEMO','publish','true');
PL/SQL procedure successfully completed.
10:34:15 SQL> select num_rows,cast(last_analyzed as timestamp),dbms_stats.get_prefs('PUBLISH',owner,table_name) from dba_tab_statistics where owner='DEMO' and table_name in ('DEMO');
NUM_ROWS CAST(LAST_ANALYZEDASTIMESTAMP) DBMS_STATS.GET_PREFS('PUBLISH',OWNER,TABLE_NAME)
-------- ------------------------------ ------------------------------------------------
42 12-SEP-18 10.34.09.000000000 AM TRUE
10:34:15 SQL> c/dba_tab_statistics/dba_tab_pending_stats
1* select num_rows,cast(last_analyzed as timestamp),dbms_stats.get_prefs('PUBLISH',owner,table_name) from dba_tab_pending_stats where owner='DEMO' and table_name in ('DEMO');
10:34:15 SQL> /
no rows selected
Report Differences
Then what if a citical regression is observed later? I still have the possibility to revert to the old statistics. First I can check in detail what has changed:
10:34:16 SQL> select report from table(dbms_stats.diff_table_stats_in_history('DEMO','DEMO',sysdate-1,sysdate,0));
REPORT
------
###############################################################################
STATISTICS DIFFERENCE REPORT FOR:
.................................
TABLE : DEMO
OWNER : DEMO
SOURCE A : Statistics as of 11-SEP-18 10.34.16.000000 AM EUROPE/ZURICH
SOURCE B : Statistics as of 12-SEP-18 10.34.16.000000 AM EUROPE/ZURICH
PCTTHRESHOLD : 0
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
TABLE / (SUB)PARTITION STATISTICS DIFFERENCE:
.............................................
OBJECTNAME TYP SRC ROWS BLOCKS ROWLEN SAMPSIZE
...............................................................................
DEMO T A 1 4 3 1
B 42 8 3 42
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
COLUMN STATISTICS DIFFERENCE:
.............................
COLUMN_NAME SRC NDV DENSITY HIST NULLS LEN MIN MAX SAMPSIZ
...............................................................................
N A 1 1 NO 0 3 C102 C102 1
B 41 .024390243 NO 0 3 C102 C12A 42
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
NO DIFFERENCE IN INDEX / (SUB)PARTITION STATISTICS
###############################################################################
Restore Previous Statistics
If nothing is obvious and the regression is more critical than the original problem, I still have the possibility to revert back to the old statistics:
10:34:17 SQL> exec dbms_stats.restore_table_stats('DEMO','DEMO',sysdate-1,no_invalidate=>false);
PL/SQL procedure successfully completed.
Again, invalidating all cursors immediately is probably required as I solve a critical problem here. Immediately, the same query uses the old statistics:
10:34:17 SQL> select /*+ gather_plan_statistics */ count(*) from DEMO;
COUNT(*)
--------
42
10:34:17 SQL> select * from table(dbms_xplan.display_cursor(format=>'basic +rows +rowstats last'));
PLAN_TABLE_OUTPUT
-----------------
EXPLAINED SQL STATEMENT:
------------------------
select /*+ gather_plan_statistics */ count(*) from DEMO
Plan hash value: 2180342005
--------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows |
--------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 1 |
| 1 | SORT AGGREGATE | | 1 | 1 | 1 |
| 2 | TABLE ACCESS FULL| DEMO | 1 | 1 | 42 |
--------------------------------------------------------------
If I want to see what happened recently on this table, I can query the history of operations (you can replace my ugly regexp_replace with XQuery):
10:34:18 SQL> select end_time,end_time-start_time,operation,target,regexp_replace(regexp_replace(notes,'" val="','=>'),'(||)',' '),status from DBA_OPTSTAT_OPERATIONS where regexp_like(target,'"?'||'DEMO'||'"?."?'||'DEMO'||'"?') order by end_time desc fetch first 10 rows only;
END_TIME END_TIME-START_TIME OPERATION TARGET REGEXP_REPLACE(REGEXP_REPLACE(NOTES,'"VAL="','=>'),'(||)','') STATUS
-------- ------------------- --------- ------ ---------------------------------------------------------------------------------------------- ------
12-SEP-18 10.34.17.718800000 AM +02:00 +00 00:00:00.017215 restore_table_stats "DEMO"."DEMO" as_of_timestamp=>09-11-2018 10:34:17 force=>FALSE no_invalidate=>FALSE ownname=>DEMO restore_cluster_index=>FALSE tabname=>DEMO COMPLETED
12-SEP-18 10.34.13.262234000 AM +02:00 +00 00:00:00.010021 restore_table_stats "DEMO"."DEMO" as_of_timestamp=>11-30-3000 01:00:00 force=>FALSE no_invalidate=>FALSE ownname=>DEMO restore_cluster_index=>FALSE tabname=>DEMO COMPLETED
12-SEP-18 10.34.09.974873000 AM +02:00 +00 00:00:00.032513 gather_table_stats "DEMO"."DEMO" block_sample=>FALSE cascade=>NULL concurrent=>FALSE degree=>NULL estimate_percent=>DBMS_STATS.AUTO_SAMPLE_SIZE force=>FALSE granularity=>AUTO method_opt=>FOR ALL COLUMNS SIZE AUTO no_invalidate=>NULL ownname=>DEMO partname=> reporting_mode=>FALSE statid=> statown=> stattab=> stattype=>DATA tabname=>DEMO COMPLETED
12-SEP-18 10.34.01.194735000 AM +02:00 +00 00:00:00.052087 gather_table_stats "DEMO"."DEMO" block_sample=>FALSE cascade=>NULL concurrent=>FALSE degree=>NULL estimate_percent=>DBMS_STATS.AUTO_SAMPLE_SIZE force=>FALSE granularity=>AUTO method_opt=>FOR ALL COLUMNS SIZE AUTO no_invalidate=>NULL ownname=>DEMO partname=> reporting_mode=>FALSE statid=> statown=> stattab=> stattype=>DATA tabname=>DEMO COMPLETED
We can see here that the publishing of pending stats was actually a restore of stats as of Nov 30th of Year 3000. This is probably because the pending status is hardcoded as a date in the future. Does that mean that all pending stats will become autonomously published at that time? I don’t think we have to worry about Y3K bugs for the moment…
Here is the full receipe I’ve given to an application owner who needs to gather statistics on his tables on a highly critical database. Then he has all the info to limit the risks. My recommendation is to prepare this fallback scenario before doing any change, and test it as I did, on a test environment, in order to be ready to react on any unexpected side effect. Be careful, the pending statsitics do not work correctly with system statistics and can have very nasty side effects (Bug 21326597), but restoring from history is possible.
Cet article Optimizer Statistics Gathering – pending and history est apparu en premier sur Blog dbi services.