Document Type | Technical Information

Category | Administration

Applicable Product Version | 7FS02PS

Document Number | TADTI186

Overview

This chapter explains the parameters that affect the optimizer when establishing execution plans and the considerations for measuring statistics.

Method

1. Optimizer Statistics Settings

1.1 Data Dictionary

The Data Dictionary views where you can check statistical information are as follows.

• ALL_TABLES
• ALL_INDEXES
• ALL_TAB_STATISTICS
• ALL_TAB_COL_STATISTICS
• ALL_TAB_PARTITIONS
• ALL_TAB_SUBPARTITIONS
• ALL_IND_PARTITIONS
• ALL_IDX_SUBPARTITIONS

Note

For the static views where statistics information can be checked, refer to "Static View" in the "Tibero Reference Manual".

1.2 Parameters

Parameters that affect the optimizer's execution plans must be set with care. Understand the system and business characteristics of the production environment first, then determine the most effective settings for each task.

Settings tested in a development environment may be appropriate for the production system at first, but data characteristics can change as usage grows and time passes. Parameters should therefore be adjusted over time, with criteria established for optimizing each task.

Below is an explanation of parameters that may affect the optimizer.

Parameter	Description
OPTIMIZER_MODE	Determines how the optimizer calculates cost. Five modes are available (default: ALL_ROWS): • FIRST_ROWS_1 • FIRST_ROWS_10 • FIRST_ROWS_100 • FIRST_ROWS_1000 • ALL_ROWS. FIRST_ROWS_n selects the plan that is optimal for fetching the first n rows, while ALL_ROWS selects the plan that is optimal for fetching all result rows. If the client fetches only n rows, FIRST_ROWS_n mode is recommended even when the query returns more than n rows.
CURSOR_SHARING	Tibero supports the following two modes (default: EXACT): • EXACT: Reuses a plan only if the entire SQL string exactly matches a previously parsed statement. • FORCE: Converts constants in the SQL string to bind variables so that the same plan can be reused. For example, for the queries SELECT * FROM T WHERE C=1 and SELECT * FROM T WHERE C=2, EXACT generates separate execution plans, whereas FORCE converts the constants to the same bind variable, as in SELECT * FROM T WHERE C=:SYS_B_0, so that the plan can be reused. FORCE mode maximizes plan sharing among queries and reduces PP cache memory usage, at the cost of an internal conversion of constants to bind parameters. In addition, if the values of the condition column are unevenly distributed, selectivity calculation may be inaccurate.
_OPT_JOIN_MEMORY_LIMIT	The optimizer tries to generate as many join plan variations as possible: for n join targets, it considers the join order and the join algorithms and generates all possible plans. However, optimization time increases exponentially as the number of join targets grows, so inefficient plans are pruned early based on this parameter (default: 5M, configurable range: 1M to 50M). Lowering the value can reduce parsing time for queries with many joins, but potentially good plans may be pruned prematurely.
_USE_DYNAMIC_SAMPLING	Parameter to enable or disable dynamic sampling (default: Y) • Y: Use dynamic sampling • N: Do not use dynamic sampling
_DYNAMIC_SAMPLING_CONFIDENCE	Dynamic sampling reads very few sample blocks, which reduces the accuracy of the resulting statistics. Increasing this parameter causes more sample blocks to be read, which reduces the statistical error (default: 50, range: 1 to 99).
ENABLE_HASH_JOIN ENABLE_MERGE_JOIN ENABLE_IDX_JOIN ENABLE_HASH_JOIN_FULL_OUTER	The join algorithms are hash join, index join (nested loop join using an index), and sort merge join. These parameters enable or disable each algorithm (default: Y); setting a parameter to N excludes that algorithm from optimizer consideration. _ENABLE_HASH_JOIN_FULL_OUTER (default: Y) determines whether hash join is used for full outer joins.
ENABLE_HASH_GROUPBY ENABLE_SORT_GROUPBY	Two algorithms are available for grouping by the key columns of GROUP BY: hash and sort (default: Y for both). If the data is already sorted, only the grouping step is needed; otherwise, grouping is performed with the hash or sort algorithm, as controlled by these parameters. If sort group by performs poorly, set _ENABLE_SORT_GROUPBY=N.
_ENABLE_ISS	Parameter to enable or disable Index Skip Scan (default: Y) • Y: Enabled • N: Disabled; if set to N, index skip scan will not be chosen even if hinted.
_OPT_PGROUPBY_PUSH_RATIO	For parallel GROUP BY, the optimizer may generate a plan that performs GROUP BY twice as an optimization (default: 100). If the optimizer predicts that the row reduction ratio after GROUP BY is less than _OPT_PGROUPBY_PUSH_RATIO percent, it favors the additional grouping. Because parallel queries usually process large volumes of data, this is often efficient, but in some cases the double GROUP BY hurts performance. In such cases, lower this parameter (set it to 0 to disable the feature) so that the query is processed with a single GROUP BY.
_OPT_BOUND_SELEC_ADJUST_DEGREE	Adjusts the selectivity of equality conditions on values that fall outside the min/max range recorded in the histogram, applying a correction as long as the value is not far outside the range (default: 100). If set to 100, values that lie beyond max or min by no more than 'max - min' are assigned a selectivity based on 1/NDV of the bucket. For example, if the histogram has max=100 and min=0, the condition C=101 would normally have selectivity=0, but with this parameter the selectivity is calculated as 1/NDV * (a correction factor).
_SAMPLE_SCAN_SKIP_BLK	Parameter to enable skipping blocks during sampling scans if possible. For example, with a 1% sample rate, one row is selected and the next 99 rows skipped, which is effective when sample percent is low or rows per block are few. • N: Disabled (default) • Y: Enabled
_EX_BLOCK_SAMPLING_LVL	Parameter to improve block sampling performance. • N: Disabled (default) • Y: Enabled
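
The CURSOR_SHARING example in the table (C=1 and C=2 both becoming C=:SYS_B_0) can be illustrated with a small Python sketch. This is only a simplified model of the idea, not Tibero's parser: the regular expression and the function name normalize_literals are assumptions made for illustration, and real SQL needs far more careful handling (comments, quoted identifiers, and so on).

```python
import re

# Assumption: a literal is either a single-quoted string or a standalone number.
# This is deliberately simplified and is NOT how Tibero parses SQL.
_LITERAL = re.compile(r"'(?:[^']|'')*'|\b\d+(?:\.\d+)?\b")

def normalize_literals(sql):
    """Replace each literal with :SYS_B_<n>; return the new text and the captured values."""
    binds = []

    def _swap(match):
        binds.append(match.group(0))
        return f":SYS_B_{len(binds) - 1}"

    return _LITERAL.sub(_swap, sql), binds

q1, b1 = normalize_literals("SELECT * FROM T WHERE C=1")
q2, b2 = normalize_literals("SELECT * FROM T WHERE C=2")

print(q1)        # SELECT * FROM T WHERE C=:SYS_B_0
print(q1 == q2)  # True: both statements now share one text, so one plan can serve both
print(b1, b2)    # the literal values survive only as bind values
```

Because the two statements end up with identical text, a plan cache keyed on the SQL string would hold one entry instead of two. The trade-off noted in the table also becomes visible: the plan is chosen for the shared text, not for the specific value 1 or 2.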

2. Limitations of Statistics Measurement

2.1 Inaccurate Statistics

Statistics collected by sampling differ from the actual data and cannot be considered perfect. Previously collected statistics may also become inaccurate over time. If data updates increase suddenly or new objects are created, statistics must be collected again.

A common cause of inefficient execution plans is inaccurate row count estimation, which can occur even when the statistics themselves are accurate. An inefficient plan chosen despite accurate statistics reflects a limitation of the statistics, not an error.

For example, when a LIKE condition contains the % wildcard, selectivity is difficult to predict accurately from statistics alone. In such cases, dynamic sampling can help by estimating the selectivity from sampled data.
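
The difference between a statistics-only guess and a sampled estimate for a LIKE '%...%' condition can be sketched in Python as follows. The data is synthetic, the 5% figure in FIXED_GUESS is a hypothetical placeholder (not a documented Tibero default), and the sampling here is plain random row sampling rather than Tibero's block-based dynamic sampling.

```python
import random

def like_contains(value, needle):
    """Equivalent of: value LIKE '%' || needle || '%' (no other wildcards)."""
    return needle in value

def sampled_selectivity(rows, predicate, sample_size, rng):
    """Estimate selectivity by evaluating the predicate on a random sample of rows."""
    sample = rng.sample(rows, sample_size)
    return sum(1 for r in sample if predicate(r)) / sample_size

rng = random.Random(42)

# Synthetic column: 40% of the rows contain 'ERR' in the middle of the string.
rows = [f"job-{i}-ERR-x" if i % 5 < 2 else f"job-{i}-OK-x" for i in range(100_000)]

actual = sum(like_contains(r, "ERR") for r in rows) / len(rows)
FIXED_GUESS = 0.05  # hypothetical fixed guess used when statistics cannot help
estimate = sampled_selectivity(rows, lambda r: like_contains(r, "ERR"), 2_000, rng)

print(f"actual={actual:.3f} fixed_guess={FIXED_GUESS:.3f} sampled={estimate:.3f}")
```

Min/max values and histograms describe the leading characters of a string column, so they say little about a substring in the middle. Evaluating the predicate on a sample sidesteps that gap, at the price of extra reads at parse time.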

2.2 Histogram Limitations

Because the number of histogram buckets is limited, height-balanced histograms cannot accurately represent distributions of low-frequency values.

Cardinality for values not present in histogram buckets may be inaccurate. When bind variables are used, even with bind peeking enabled, variable values at plan generation and execution may differ, so 100% accuracy cannot be guaranteed.
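
A minimal Python sketch of a height-balanced histogram shows why low-frequency values are poorly represented. It uses the generic textbook construction (sort the column, cut it into buckets with equal row counts, keep each bucket's endpoint value); it is an assumption that this matches Tibero's internal algorithm in detail, and the column data is synthetic.

```python
from collections import Counter

def height_balanced_endpoints(values, buckets):
    """Return the endpoint value of each equal-height bucket over the sorted column."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[(i * n) // buckets - 1] for i in range(1, buckets + 1)]

# Synthetic column: value 1 dominates; values 2..6 have small, differing frequencies.
column = [1] * 900 + [2] * 50 + [3] * 30 + [4] * 15 + [5] * 4 + [6] * 1

endpoints = height_balanced_endpoints(column, buckets=10)
popular = {v: c for v, c in Counter(endpoints).items() if c > 1}
missing = sorted(set(column) - set(endpoints))

print(endpoints)  # [1, 1, 1, 1, 1, 1, 1, 1, 1, 6]
print(popular)    # {1: 9} -> value 1 spans nine buckets, so it is recognisably frequent
print(missing)    # [2, 3, 4, 5] -> these values never appear as an endpoint
```

Values 2 to 5 occur 50, 30, 15, and 4 times respectively, yet none of them reaches a bucket boundary, so the histogram cannot tell them apart. Any estimate for them has to fall back on an averaged density.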

2.3 Incorrect Cost Estimation

Selectivity and cardinality for conditions and joins are calculated from statistics and are then used to compute the cost of data access methods (such as index access) and join methods (nested loop, sort merge, hash).

However, current statistics may differ from actual data distribution, and selectivity and cardinality calculations may not match real data.
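
The gap between estimated and actual cardinality can be shown with a small worked example in Python. The formula rows / NDV is the usual uniform-distribution estimate for an equality condition; whether Tibero applies exactly this formula is an assumption, and all numbers are hypothetical.

```python
def estimated_cardinality(num_rows, ndv):
    """Equality-condition estimate under a uniform-distribution assumption: rows / NDV."""
    return num_rows / ndv

# Statistics as last collected (hypothetical numbers).
stats_num_rows, stats_ndv = 1_000, 100

# Current table contents (hypothetical): the table grew and value 'A' became dominant.
actual_rows = ["A"] * 25_000 + [f"v{i % 99}" for i in range(75_000)]

estimate = estimated_cardinality(stats_num_rows, stats_ndv)
actual = actual_rows.count("A")

print(f"estimated rows={estimate:.0f}, actual rows={actual}, ratio={actual / estimate:.0f}x")
```

An access path or join method that is cheap for about 10 rows can be a poor choice for 25,000 rows, which is how stale or skew-blind statistics turn into an incorrect cost estimate.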

2.4 Sampling Rate

Generally, higher sampling rates improve statistics accuracy, but accuracy decreases when many null values exist or data distribution is uneven.
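
The effect of the sampling rate on an unevenly distributed column can be sketched with a small Python simulation. The data is synthetic, the function names are hypothetical, and the sampling is plain random row sampling, which is an assumption and not a model of how Tibero samples blocks.

```python
import random

def scaled_count(rows, value, rate, rng):
    """Estimate how many rows equal `value` by counting in a random sample and scaling up."""
    sample = rng.sample(rows, round(len(rows) * rate))
    return sample.count(value) / rate

def mean_relative_error(rows, value, rate, rng, trials=30):
    """Average relative error of the scaled estimate over several independent samples."""
    actual = rows.count(value)
    errors = [abs(scaled_count(rows, value, rate, rng) - actual) / actual
              for _ in range(trials)]
    return sum(errors) / trials

rng = random.Random(7)

# Synthetic skewed column: 'common' fills half the rows, 'rare' only 0.1%,
# and the remaining rows are NULL (None).
rows = ["common"] * 50_000 + ["rare"] * 100 + [None] * 49_900

errs = {}
for rate in (0.01, 0.20):
    errs[rate] = (mean_relative_error(rows, "common", rate, rng),
                  mean_relative_error(rows, "rare", rate, rng))
    print(f"rate={rate:.0%} common_err={errs[rate][0]:.3f} rare_err={errs[rate][1]:.3f}")
```

In this sketch the frequent value is estimated well even at a low rate, while the rare value needs a much higher rate before its estimate stabilises. NULL rows take up part of every sample without contributing any information about the non-null values.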

Tibero Large-Scale System Statistics Collection Guide - 11. Optimizer Statistics Settings and Limitations of Statistics Measurement