Document TypeㅣTechnical Information
CategoryㅣAdministration
Applicable Product Versionㅣ7FS02PS
Document NumberㅣTADTI180
Overview
This document explains the method of Incremental Global statistics collection.
Optimizer statistics represent detailed information about the Database and Objects, and the query Optimizer uses statistics to select the best execution plan for SQL statements.
Method
1. Incremental Global Statistics
When data is added to a new partition or existing partitions have modified data, it is necessary to scan the entire Table again to recollect statistics at the Table level. In other words, to collect Global statistics information for a partitioned Table, after collecting Partition-level statistics by scanning each partition, a full scan of all partitions is required to collect Global statistics information.
For example, if you specify the Granularity as ALL to collect statistics for partitions and the Table, the entire partitions are scanned twice, which affects the time. To avoid this, generating statistics only for the newly added partitions and then creating Global statistics is also inefficient because it still requires scanning all partitions to generate Global statistics. (One scan for the new partition + one full scan of all partitions)
At this point, using Incremental Global ('APPROX_GLOBAL AND PARTITION' granularity) can improve this performance issue. Incremental Global statistics update the Global Table statistics by scanning only the changed partitions without scanning the entire Table. (One scan for the changed partition + updating GLOBAL statistics using existing partition information)
This feature aims to enhance Global statistics collection for partitioned Tables. For partitioned Tables, the system maintains statistics for each Partition as well as overall Global statistics for the Table.
The application process of Incremental Global statistics is as follows.
- Create the Table, input data, and create partitions
- Initially collect GLOBAL and PARTITION statistics with Granularity set to ALL
- Check the statistics collection status (statistics are collected for all GLOBAL, PARTITION, and Table columns)
- Add new partitions and input the corresponding data
- Additionally, collect gather_table_stats with Granularity set to “APPROX_GLOBAL AND PARTITION”
- Check the statistics collection status
Among large-scale partitioned Tables, Range partitions rarely have data changes in past partitions, so collecting Global statistics can be inefficient. To improve this, information required for collecting past partition statistics is stored, and when collecting Global statistics, the stored information is used for past partitions to reduce the statistics collection time.