Document Type | Technical Information
Category | Administration
Applicable Product Version | 7FS02PS
Document Number | TADTI178
Overview
This document explains the method for collecting Object statistics.
Optimizer statistics provide detailed information about the Database and Objects, and the query Optimizer uses statistics to select the best execution plan for SQL statements.
Method
1. Object Statistics
Object statistics include Table statistics, Index statistics, and Column statistics including histograms.
1.1 Table Statistics
-- Check for existing statistics SQL> SELECT NUM_ROWS,BLOCKS,AVG_ROW_LEN,LAST_ANALYZED FROM ALL_TABLES WHERE OWNER='TEST_USER' AND TABLE_NAME='TEST_TABLE'; NUM_ROWS BLOCKS AVG_ROW_LEN LAST_ANALYZED --------- ---------- ----------- -------------------------------- 0 0 0 -- Generate Table statistics SQL> BEGIN DBMS_STATS.GATHER_TABLE_STATS (OWNNAME='TEST_USER',TABNAME='TEST_TABLE', PARTNAME=NULL,ESTIMATE_PERCENT=10, METHOD_OPT='FOR ALL COLUMNS SIZE 1', DEGREE=4,CASCADE_INDEXES=FALSE, NO_INVALIDATE=FALSE, GRANULARITY='ALL'); END; / -- Check for existing statistics SQL> SELECT NUM_ROWS,BLOCKS,AVG_ROW_LEN,LAST_ANALYZED FROM ALL_TABLES WHERE OWNER='TEST_USER' AND TABLE_NAME='TEST_TABLE'; NUM_ROWS BLOCKS AVG_ROW_LEN LAST_ANALYZED ---------- ---------- ----------- -------------------------------- 996480 2128 7.96044075 2025/12/05
1.2 Index Statistics
- Check for existing statistics
SQL> SELECT BLEVEL,LEAF_BLOCKS,DISTINCT_KEYS,CLUSTERING_FACTOR,NUM_ROWS,LAST_ANALYZED FROM ALL_INDEXES WHERE OWNER='TEST_USER' AND INDEX_NAME='TEST_INDEX'; BLEVEL LEAF_BLOCKS DISTINCT_KEYS CLUSTERING_FACTOR NUM_ROWS LAST_ANALYZED ---------- ----------- ------------- ----------------- ---------- ---------------------------- 0 0 0 0 0 0 -- Collect only Index statistics SQL> BEGIN DBMS_STATS.GATHER_INDEX_STATS (OWNNAME='TEST_USER',IDXNAME='TEST_INDEX',PARTNAME=NULL, ESTIMATE_PERCENT=1,DEGREE=4,NO_INVALIDATE=TRUE); END; / -- Collect Table and Index statistics simultaneously BEGIN DBMS_STATS.GATHER_TABLE_STATS (OWNNAME='TEST_USER',TABNAME='TEST_TABLE',PARTNAME=NULL, ESTIMATE_PERCENT=10, METHOD_OPT='FOR ALL COLUMNS SIZE 1', DEGREE=4,CASCADE_INDEXES=TRUE, NO_INVALIDATE=FALSE, GRANULARITY='ALL'); END; / -- Check for existing statistics SQL> SELECT BLEVEL,LEAF_BLOCKS,DISTINCT_KEYS,CLUSTERING_FACTOR,NUM_ROWS,LAST_ANALYZED FROM ALL_INDEXES WHERE OWNER='TEST_USER' AND INDEX_NAME='TEST_INDEX'; BLEVEL LEAF_BLOCKS DISTINCT_KEYS CLUSTERING_FACTOR NUM_ROWS LAST_ANALYZED ---------- ----------- ------------- ----------------- ---------- ---------------------------- 2 2830 944650 .116093795 1013600 2025/12/05
GATHER_INDEX_STATS collects statistics only for the specified Index and does not generate column information.
1.3 Column Statistics
-- Collect all Columns with a single Bucket (no histogram generation) BEGIN DBMS_STATS.GATHER_TABLE_STATS (OWNNAME='TEST_USER',TABNAME='TEST_TABLE', PARTNAME=NULL,ESTIMATE_PERCENT=0, METHOD_OPT='FOR ALL COLUMNS SIZE 1'); END; / -- Collect only Columns with Indexes with a single Bucket (no histogram generation) BEGIN DBMS_STATS.GATHER_TABLE_STATS (OWNNAME='TEST_USER',TABNAME='TEST_TABLE', PARTNAME=NULL,ESTIMATE_PERCENT=0, METHOD_OPT='FOR ALL INDEXED COLUMNS SIZE 1'); END; / -- Collect Columns (specify number of histogram Buckets) BEGIN DBMS_STATS.GATHER_TABLE_STATS (OWNNAME='TEST_USER',TABNAME='TEST_TABLE', PARTNAME=NULL,ESTIMATE_PERCENT=0, METHOD_OPT='FOR ALL COLUMNS SIZE 100'); END; / -- Specify different Bucket counts per Column BEGIN DBMS_STATS.GATHER_TABLE_STATS (OWNNAME='TEST_USER',TABNAME='TEST_TABLE', PARTNAME=NULL,ESTIMATE_PERCENT=0, METHOD_OPT='FOR COLUMNS SIZE 2 A SIZE 100 C SIZE 200'); END; / -- Query Column histograms SELECT COLUMN_NAME, NUM_DISTINCT, LOW_VALUE,HIGH_VALUE,DENSITY,NUM_NULLS,NUM_BUCKETS,LAST_ANALYZED,SAMPLE_SIZE,AVG_COL_LEN FROM ALL_TAB_COL_STATISTICS WHERE TABLE_OWNER='TEST_USER' AND TABLE_NAME='TEST_TABLE'; COLUMN_NAME NUM_DISTINCT LOW_VALUE HIGH_VALUE DENSITY NUM_NULLS NUM_BUCKETS LAST_ANALYZED SAMPLE_SIZE AVG_COL_LEN ----------- ------------ ------------------------------------------ ------------------ ----------- ------------ --------- ----------- ------------- ----------- A 1000501 1.5000000000000000000000000000000000000E+01 9.9999900000000000000000000000000000000E+05 .000014521 0 100 2025/12/05 68763 3.97988744 B 1001370 2.6000000000000000000000000000000000000E+01 1.0000000000000000000000000000000000000E+06 .000000999 0 1 2025/12/05 100137 3.98002736 C 1000501 -9.999990000000000000000000000000000000E+05 -1.500000000000000000000000000000000000E+01 .000014566 0 200 2025/12/05 68763 4.97988744
2. Partition Table Statistics
In an operational system, if there are many log and statistics tables where partitions are added or split daily or monthly, it is difficult to determine the timing for collecting statistics, and in large-scale systems, the time burden for statistics collection is significant.
This section explains the statistics collection levels as a method to reduce the statistics collection load in large-scale systems. It also describes strategies and utilization of statistics information in systems where partitions are added or split periodically, such as daily or monthly.
2.1 Statistics Level by Query Condition
In PARTITION, either Global statistics or Partition statistics are used depending on SQL conditions. Except when the partition key is queried as a constant, Global statistics are used, so collecting Global statistics is better for accurate execution plans.
- Create Table schema
SQL> CREATE TABLE TEST_TABLE (A NUMBER, B NUMBER, C CHAR(2000))
PARTITION BY RANGE (A)
(PARTITION P1 VALUES LESS THAN (10000),
PARTITION P2 VALUES LESS THAN (40000),
PARTITION P3 VALUES LESS THAN (MAXVALUE));
SQL> INSERT INTO TEST_TABLE SELECT LEVEL,MOD(LEVEL,10),LPAD(LEVEL,10,'0')
FROM DUAL CONNECT BY LEVEL<=100000;
SQL> commit;
SQL> CREATE INDEX TEST_INDEX ON TEST_TABLE(A,B) LOCAL;
SQL> BEGIN
DBMS_STATS.GATHER_TABLE_STATS ('TEST_USER','TEST_TABLE',GRANULARITY='ALL');
END;
/- Select Global or Partition statistics differently depending on the query condition.
Statistics Level - GLOBAL Statistics
- When no condition clause is used
SQL> SELECT * FROM TEST_TABLE; Execution Plan -------------------------------------------------------------------------------- 1 PARTITION RANGE (ALL PART) (Cost:14037, %CPU:0, Rows:100090) (PS:1, PE:3) 2 TABLE ACCESS (FULL): TEST_TABLE (Cost:14037, %CPU:0, Rows:100090)
- When partition key condition is not used
SQL> SELECT * FROM TEST_TABLE WHERE B=3; Execution Plan -------------------------------------------------------------------------------- 1 PARTITION RANGE (ALL PART) (Cost:14037, %CPU:0, Rows:100090) (PS:1, PE:3) 2 TABLE ACCESS (FULL): TEST_TABLE (Cost:14037, %CPU:0, Rows:100090
- When two or more partition key conditions are used
SQL> SELECT * FROM TEST_TABLE WHERE A<10 OR A>50000; Execution Plan -------------------------------------------------------------------------------- 1 PARTITION RANGE (ALL PART) (Cost:14037, %CPU:0, Rows:50049) (PS:1, PE:3) 2 TABLE ACCESS (FULL): TEST_TABLE (Cost:14037, %CPU:0, Rows:50049)
- When a single partition condition is used but with a Bind variable
SQL> SELECT * FROM TEST_TABLE WHERE A=:x; Execution Plan -------------------------------------------------------------------------------- 1 PARTITION RANGE (SINGLE PART) (Cost:3, %CPU:0, Rows:1) (PS:KEY, PE:KEY) 2 TABLE ACCESS (ROWID) LOCAL: TEST_TABLE (Cost:3, %CPU:0, Rows:1) 3 INDEX (RANGE SCAN): TEST_INDEX (Cost:2, %CPU:0, Rows:1)
- After collecting Global statistics, check statistics information
SQL> BEGIN
DBMS_STATS.DELETE_TABLE_STATS ('TEST_USER','TEST_TABLE');
DBMS_STATS.GATHER_TABLE_STATS ('TEST_USER','TEST_TABLE',GRANULARITY='GLOBAL');
END;
/
SQL> SELECT NUM_ROWS,BLOCKS,AVG_ROW_LEN,LAST_ANALYZED FROM USER_TABLES WHERE
TABLE_NAME='TEST_TABLE';
NUM_ROWS BLOCKS AVG_ROW_LEN LAST_ANALYZED
--------- --------- ------------ --------------------------------
990030 33920 2005.787=279 2025/12/05
SQL> SELECT NUM_ROWS,BLOCKS,LAST_ANALYZED FROM USER_TAB_PARTITIONS WHERE
TABLE_NAME='TEST_TABLE';
NUM_ROWS BLOCKS LAST_ANALYZED
--------- --------- --------------
0 0
0 0
0 0Statistics Level - PARTITION Statistics
- Single partition key condition
SQL> SELECT * FROM TEST_TABLE WHERE A=100; Execution Plan -------------------------------------------------------------------------------- 1 TABLE ACCESS (ROWID): TEST_TABLE (Cost:3, %CPU:0, Rows:1) 2 PARTITION RANGE (SINGLE PART) (Cost:2, %CPU:0, Rows:1) (PS:1, PE:1) 3 INDEX (RANGE SCAN): TEST_INDEX (Cost:2, %CPU:0, Rows:1)
- Using multiple partition key conditions that express a closed range
SQL> SELECT * FROM TEST_TABLE WHERE A>100 AND A<35000; Execution Plan -------------------------------------------------------------------------------- 1 PARTITION RANGE (SUBSET PART) (Cost:5669, %CPU:0, Rows:34993) (PS:1, PE:2) 2 TABLE ACCESS (FULL): TEST_TABLE (Cost:5669, %CPU:0, Rows:34993
- Except when querying the partition key as a constant value, Global statistics are used, so collecting Global statistics is better for accurate execution plans.
- Example of checking statistics after collecting Partition statistics
SQL> BEGIN
DBMS_STATS.DELETE_TABLE_STATS ('TEST_USER','TEST_TABLE');
DBMS_STATS.GATHER_TABLE_STATS ('TEST_USER','TEST_TABLE',GRANULARITY='PARTITION');
END;
/
SQL> SELECT NUM_ROWS,BLOCKS,AVG_ROW_LEN,LAST_ANALYZED FROM USER_TABLES WHERE
TABLE_NAME='TEST_TABLE';
NUM_ROWS BLOCKS AVG_ROW_LEN LAST_ANALYZED
---------- ---------- ----------- -------------------------------
0 0 0
SQL> SELECT NUM_ROWS,BLOCKS,LAST_ANALYZED FROM USER_TAB_PARTITIONS WHERE
TABLE_NAME='TEST_TABLE';
NUM_ROWS BLOCKS LAST_ANALYZED
--------- --------- -------------
10040 3456 2025/12/05
30320 10240 2025/12/05
59950 20224 2025/12/05
2.2 Whether to Collect Global Statistics
For Tables with large amounts of data, collecting Global statistics can take a lot of time and cause high load, so sometimes only Partition statistics are collected.
Tibero provides a feature to reduce the load of Global statistics collection by deriving Global statistics values from Partition-level statistics. This means that instead of scanning the entire Table for Global statistics, the system updates Global statistics using information collected at the Partition level.
However, since these are estimates, if the Global statistics collection date is old, Global statistics should be recollected.
2.3 Collecting Global and Partition Level Statistics
The following explains the principles of collecting Global and Partition statistics. Choose the appropriate method when collecting statistics.
- Collect all statistics at Global and Partition levels
Use the granularity option 'ALL' to collect statistics at both Partition and Global levels simultaneously.
Because this requires two full scans of all PARTITIONs, it takes a long time, but this method is chosen when initially generating statistics.
- Collect only Partition-level statistics and update Global statistics from Partition-level statistics
When the target Table is very large and collecting Global statistics takes a long time, collect Partition-level statistics. At this time, min, max, distinct count, histogram information of partition key columns are updated, improving the accuracy of collected statistics.
Although execution plans might be slightly less accurate than when collecting Global statistics, this method has the advantage of shorter statistics collection time.
- Collection time for Global and Partition statistics
The following is an example of collection time for Global and Partition statistics.
-- When collecting Global and all Partition-level statistics SQL> /* Collect Partition-level statistics (granularity='PARTITION', no_invalidate= FALSE) */ BEGIN DBMS_STATS.DELETE_TABLE_STATS (OWNNAME=USER,TABNAME='TEST_TABLE' ); DBMS_STATS.GATHER_TABLE_STATS (OWNNAME='TEST_USER',TABNAME='TEST_TABLE', PARTNAME=NULL, NO_INVALIDATE=FALSE, GRANULARITY='PARTITION'); END; / -- Check Global and Partition statistics collection SQL> SELECT NUM_ROWS,BLOCKS,AVG_ROW_LEN,LAST_ANALYZED FROM USER_TABLES WHERE TABLE_NAME='TEST_TABLE'; SQL> SELECT NUM_ROWS,BLOCKS,LAST_ANALYZED FROM USER_TAB_PARTITIONS WHERE TABLE_NAME='TEST_TABLE'; SQL> /* Collect Global and Partition-level statistics (granularity='GLOBAL', estimate_percent = 10, degree = 4,no_invalidate= FALSE) Uses 10% sampling and Parallel=4 options to reduce time */ BEGIN DBMS_STATS.DELETE_TABLE_STATS (OWNNAME=USER,TABNAME='TEST_TABLE' ); DBMS_STATS.GATHER_TABLE_STATS (OWNNAME='TEST_USER',TABNAME='TEST_TABLE', PARTNAME=NULL,ESTIMATE_PERCENT=10, DEGREE=4,NO_INVALIDATE=FALSE, GRANULARITY='GLOBAL'); END; / -- Check Global and Partition statistics collection SQL> SELECT NUM_ROWS,BLOCKS,AVG_ROW_LEN,LAST_ANALYZED FROM USER_TABLES WHERE TABLE_NAME='TEST_TABLE'; SQL> SELECT NUM_ROWS,BLOCKS,LAST_ANALYZED FROM USER_TAB_PARTITIONS WHERE TABLE_NAME='TEST_TABLE';
2.4 Tables with Partitions Added Periodically
When day or month is the Range Partition Key Column value and partitions continuously increase at regular intervals, the range of key column values changes daily, causing statistics information to lag behind the actual data distribution. This negatively affects the accuracy of selectivity and cardinality estimates in query conditions.
The following example explains statistics collection in systems where partitions are added or split periodically, such as daily or monthly, using schema format.
- Create a Partition table.
SQL> @create_table_partition.sql
- Insert test data.
SQL> BEGIN
FOR i IN 1..20 LOOP
INSERT INTO DAILY_LOG
SELECT '202501'||LPAD(i,2,'0'), LEVEL, LEVEL, LEVEL, LEVEL, LEVEL
FROM DUAL CONNECT BY LEVEL <= 10000;
COMMIT;
END LOOP;
END;- Collect statistics information.
SQL> CALL DBMS_STATS.GATHER_TABLE_STATS (USER, 'DAILY_LOG', granularity=>'ALL');
- Add new partition and insert data. (add_partition.sql)
The Table uses a column representing date as a string as the partition key, and since new partitions are added daily with the current date value, the maximum key column value increases daily.
SQL> ALTER TABLE DAILY_LOG ADD PARITTION P20250121 VALUES LESS TAHN ('20250122');
SQL> INSERT INTO DAILY_LOG SELECT '20250121',1,1,1,1, FROM DUAL CONNECT BY LEVEL <=10000;
SQL> COMMIT;In the above example, each partition has 10,000 rows, and after partitions P20250101 through P20250120 were added and statistics collected, partition P20250121 was added and data inserted. Since P20250121 has no statistics collected, the LOG_DAY column statistics maximum value will be '20250120'.
- Check the Plan.
- Query plan for 20250120 with statistics (view_partition_plan.sql)
SQL> @view_partition_plan.sql
SQL> set autotrace traceonly exp planstat
SQL> SELECT * FROM DAILY_LOG WHERE LOG_DAY='20250120';
Execution Plan
--------------------------------------------------------------------------------
1 PARTITION RANGE (SINGLE PART) (Cost:4240, %CPU:0, Rows:9944) (PS:20, PE:20)
2
TABLE ACCESS (FULL): DAILY_LOG (Cost:4240, %CPU:0, Rows:9944)
Predicate Information
---------------------------------------------------------------------------
2 - filter: ("DAILY_LOG"."LOG_DAY" = '20250120') (0.050)
Execution Stat
--------------------------------------------------------------------------------
1 PARTITION RANGE (SINGLE PART) (Time:.02 ms, Rows:10000, CR:0, CU:0, Starts:1)
2
TABLE ACCESS (FULL): DAILY_LOG (Time:108.34 ms, Rows:10000, CR:10145, CU:203,
Starts:1
SQL> set autotrace off
Since the maximum value in the LOG_DAY column histogram is '20230120', cardinality is predicted similarly, and a correct Plan (FULL) is generated.
- Query plan for 20250121 without statistics (view_partition_plan.sql)
SQL> @view_partition_plan.sql
SQL> set autotrace traceonly exp planstat
SQL> SELECT * FROM DAILY_LOG WHERE LOG_DAY='20250121';
Execution Plan
--------------------------------------------------------------------------------
1 TABLE ACCESS (ROWID): DAILY_LOG (Cost:3, %CPU:0, Rows:1)
2
PARTITION RANGE (SINGLE PART) (Cost:2, %CPU:0, Rows:1) (PS:21, PE:21)
3
INDEX (RANGE SCAN): DAILY_LOG_IDX (Cost:2, %CPU:0, Rows:1)
Predicate Information
---------------------------------------------------------------------------
2 - filter: ("DAILY_LOG"."LOG_DAY" = '20250121') (0.000)
Execution Stat
--------------------------------------------------------------------------------
1 TABLE ACCESS (ROWID): DAILY_LOG (Time:2582.04 ms, Rows:10000, CR:10000, CU:0,
Starts:1)
2
PARTITION RANGE (SINGLE PART) (Time:0. ms, Rows:10000, CR:0, CU:0, Starts:1)
3 INDEX (RANGE SCAN): DAILY_LOG_IDX (Time:12.49 ms, Rows:10000, CR:369, CU:0,
Starts:1)
SQL> set autotrace offFor the date '20230121' which exceeds the maximum histogram value, an incorrect Plan (RANGE SCAN) is generated, resulting in slower query performance. In this table with daily partitions, the query speed difference may not be noticeable, but for hourly partitions, statistics information can become outdated even faster.
2.5 Large-Scale Partition Statistics Collection
In large partition environments, the automatic statistics collection feature provided by Tibero by default is not recommended, and strategic statistics collection and utilization tailored to the operational system environment are required.
The following explains the impact of the Granularity value on partition statistics collection performance.
Large-Scale Partitions and Granularity
In large-scale systems, some tables may have thousands of partitions and sizes of several terabytes. Even sampling less than 1% means sampling several gigabytes of data, and collecting statistics for each partition means thousands of samplings, which cannot be ignored in terms of time. If there are several local indexes, even more work is generated.
The following explains the number of samplings and collection processes when creating local indexes on the LOG_DAY partition in the example described in โ2.2.4. Tables with Partitions Added Periodicallyโ and setting Granularity to the default ALL for collection.
- Create Indexes (create_index.sql)
SQL> CREATE INDEX DAILY_LOG_IDX_01 ON DAILY_LOG (LOG_DAY) LOCAL; SQL> CREATE INDEX DAILY_LOG_IDX_02 ON DAILY_LOG (LOG_DAY, C1) LOCAL; SQL> CREATE INDEX DAILY_LOG_IDX_03 ON DAILY_LOG (LOG_DAY, C1, C2) LOCAL; SQL> CREATE INDEX DAILY_LOG_IDX_04 ON DAILY_LOG (LOG_DAY, C3) LOCAL;
Considering leap years 2024 and 2028, the number of partitions is 365 x 10 + 2 = 3562.
- Statistics Collection
The sequence of statistics collection is as follows. If the number of partitions is 3562 as above, the number of samplings can be calculated.
a. Collect for Table: 1 time
b. Collect for Table partitions: 3,562 times
c. Collect for each Index: 4 times
d. Collect for each Index partition: 3,562 * 4 times
In total, 17,815 samplings and collections are performed. If options like 'FOR ALL COLUMNS SIZE 200' are used for precise histogram collection, sampling is done for each column as well.
Even if ESTIMATE_PERCENT is low, performing 17,815 iterations can take several hours or days. For partitions increasing by units, spending this much time on one Table is a very inefficient collection method.
Note
Strategic planning considering operational environment characteristics is necessary when collecting statistics in large-scale systems. For features usable in such environments, see "Advanced Features of Statistics".
3. Incremental Global Statistics
When data is added to new partitions or existing partitions have modified data, the entire Table must be scanned again to recollect Table-level statistics. In other words, collecting Global statistics for partitioned Tables requires scanning each partition to collect Partition-level statistics, then scanning all partitions again to collect Global statistics.
For example, specifying Granularity as ALL to collect both Partition and Table statistics causes all partitions to be scanned twice, impacting time. To avoid this, one might generate statistics only for newly added partitions and then generate Global statistics, but this is inefficient because it still requires scanning all partitions once (one scan for the new partition + one scan for all partitions).
Using Incremental Global ('APPROX_GLOBAL AND PARTITION' granularity) can improve this performance issue. Incremental Global statistics update Global Table statistics by scanning only changed partitions without scanning the entire Table. (One scan for the changed partitions + update Global statistics using existing partition information)
This feature aims to improve Global statistics collection for partitioned Tables. The system maintains both Partition statistics and overall Global statistics for the Table.
The process of applying Incremental Global statistics is as follows.
Create the Table, insert data, then create partitions.
Specify Granularity as ALL to collect initial GLOBAL and PARTITION statistics.
Check the collected statistics status. (Statistics collected for all GLOBAL, PARTITION, and Table columns)
Add new partitions and insert corresponding data.
Specify Granularity as โAPPROX_GLOBAL AND PARTITIONโ and execute gather_table_stats.
Check the collected statistics status.
For large partitioned Tables with Range partitions, data changes in past partitions are rare, so collecting Global statistics every time can be inefficient. To improve this, information needed for collecting statistics of past partitions is stored, and when collecting Global statistics, the stored information is used to shorten statistics collection time.