Document Type | Technical Information
Category | Administration
Applicable Product Version | Tibero 7.2.1.
Document Number | TADTI168
Overview
This document explains the methods for creating/deleting CONTEXT indexes and the objects related to the indexes.
Tibero TEXT is a feature that enhances query performance for data composed of text.
Tibero TEXT provides a CONTEXT index that indexes large, consistent documents such as book contents.
Method
Creating an Index
Usage
CREATE INDEX idx_name on [sch_name.]tbl_name INDEXTYPE IS CTXSYS.CONTEXT
parameters('preferences_class_name user_defined_preference_name sync(option)')| Item | Description |
|---|---|
| idx_name | Name of the CONTEXT index to be created |
| sch_name | Owner of the target table where the index will be created |
| tbl_name | Target table on which the index will be created |
| CTXSYS.CONTEXT | Reserved word indicating creation of a CONTEXT index |
| parameters | Reserved word specifying settings to use when creating the CONTEXT index |
| preferences_class_name | Class specifying settings when creating the CONTEXT index |
| user_defined_preference_name | Preference created for the specified CLASS |
| sync(option) |
Used for automatic synchronization of the CONTEXT index
|
Example
SQL> create table t(name varchar(400), value varchar(1000));
Table 'T' created.
SQL> create index t_idx on t(name) indextype is ctxsys.context
2 parameters('sync(on commit)');
Index 'T_IDX' created.
SQL> create index t_idx2 on t(value) indextype is ctxsys.context
2 parameters('sync(every sysdate+1/24/60)');
Index 'T_IDX2' created.
#Note
#When executing queries related to TEXT indexes, an internal package named 'TEXT' is used, so do not name tables 'TEXT'.Note
For detailed information on index settings such as PREFERENCE and CLASS, refer to โTEXT Indexing Elementsโ.
Dropping an Index
The SQL for dropping a CONTEXT index is the same as for other indexes.
Usage
DROP INDEX index_name| Item | Description |
|---|---|
| index_name | Name of the TEXT index to be dropped |
Example
SQL> drop index t_idx;
Index 'T_IDX' dropped.
Objects Related to the Index
The following objects are created when a CONTEXT index is created.
| Item | Object Type | Description |
|---|---|---|
| DR$index_name$I | Table | Table storing tokens of column text |
| DR$index_name$X | Index | Index on the token column and rowid column of the token table |
| DR$index_name$K | Table | Table storing document IDs of the token table |
Index Constraints
CONTEXT indexes have the following constraints.
You cannot create a single CONTEXT index on multiple columns.
CONTEXT indexes do not support Global Partitioning. However, Global Non-Partitioned indexes can be created.
When using
BASIC_LEXERorCHINESE_LEXER, CONTEXT indexes can only be created onVARCHAR2andCLOBtype columns.When using
JAPANESE_LEXER, CONTEXT indexes can only be created onVARCHAR2type columns.
Context Index Preferences
STORAGE
Specifies the storage space of the index and table created when creating a TEXT index.
LEXER
Specifies the settings of the LEXER used when creating a TEXT index.
1) BASIC LEXER
- MIXED_CASE: Determines whether to index case sensitively. If TRUE, indexing is case sensitive.
- BASE_LETTER: Sets whether to distinguish base letters such as umlaut and acute accent when indexing. If TRUE, these characters are not distinguished and indexed as the same.
Umlaut: A phenomenon where pronunciation changes due to the influence of the following syllable
Acute accent: For example, the โ symbol in fiancรฉ
- WHITESPACE: Recognizes specified characters as whitespace and uses them along with punctuation and newline characters as delimiters for sentences and paragraphs. This is used during sentence and paragraph searches.
Default is space or tab characters. If the user specifies whitespace characters, those characters are also recognized as whitespace in addition to the defaults.
Property value: string, Default: space or tab characters
2) JAPANESE_LEXER: A LEXER for generating tokens from Japanese text.
3) CHINESE_LEXER: A LEXER for generating tokens from Chinese text.
Supported only for UTF8 and GBK character sets.
CTXCAT indexes cannot be used; only CONTEXT indexes are supported.
4) WORLD_LEXER: A LEXER for generating tokens from documents containing multiple languages, automatically detecting the language.
Supported only for UTF8 character set.
WORDLIST TEXT
Specifies settings related to tokens generated when creating the index.
STOPLIST TEXT
Specifies words to exclude from the index.
DATASTORE TEXT
Specifies the method of storing documents when creating the index.
FILTER
Specifies the filtering method used to index documents.
SECTION GROUP
Defines criteria for dividing documents into multiple sections.