Document Type | Technical Information
Category | Administration
Applicable Product Version | OpenSQL 3
Document Number | OADT003
This document is an opensql technical document based on postgresql.
Overview
Reindex should be performed in the following situations:
- When the index is corrupted
- When the index becomes bloated and contains a lot of empty space
- When storage parameters for the index have been changed and you want to confirm whether the changes have been applied
- When an index build with the CONCURRENTLY option fails and you want to efficiently rebuild an index in the โINVALIDโ state
Method
Criteria for Determining Reindex Targets
Checking for Index Corruption
- Check for corrupted indexes in the database using the pg_amcheck binary.
There are various options, but to check for index corruption you can use the following options:
$ pg_amcheck -d database_name --heapallindexed --rootdescend
- -d option specifies the database to check.
- --heapallindexed option checks whether all heap tuples exist as index tuples in the index.
- --rootdescend option performs a new search from the root page for each tuple to find tuples again at the leaf level. At this time, amcheckโs bt_index_parent_check function is implicitly used, which will be mentioned again in item 2.
- Check using the provided functions.
You can check using the following two functions.
- bt_index_check(index regclass, heapallindexed boolean) returns void
- bt_index_parent_check(index regclass, heapallindexed boolean, rootdescend boolean) returns void
Both are functions to check for index corruption, but bt_index_parent_check can be considered a superset of bt_index_check.
bt_index_parent_check checks for abnormalities across parent/child relationships.
Checking for Empty Space in Indexes
You can check function provided by the pgstattuple extension.
The pgstattuple extension is also included in contrib, so you can use it immediately by creating the extension.
The following is about the pgstatindex function.
pgstatindex(regclass) returns record
Among the record values returned by pgstatindex, you use avg_leaf_density. avg_leaf_density is the average density of the leaf pages, which is the actual usable space ratio occupied by data in the index. In other words, the remainder after subtracting this ratio from 100 is the ratio of empty space (bloat ratio) in the index.
An example query to check is as follows.
tester=# select 100 - (pgstatindex('index_name')).avg_leaf_density as index_bloat_ratio;Changing Storage Parameters for Indexes
This applies when modifying the fillfactor among index options.
fillfactor is an option indicating the percentage of a page to fill, and can be set as a percentage between 10 and 100 (default = 100). When performing INSERT operations, the table page is filled up to this ratio, and the empty space is used for UPDATE operations.
You can modify fillfactor as shown below, and you must perform a reindex after making the change.
tester=# alter index index_name set (fillfactor = 80);
Note
- fillfactor is also related to the index bloat ratio (empty space ratio) described above.
- avg_leaf_density cannot exceed the value set by fillfactor.
- Also, when determining whether an index is bloated enough to require rebuilding, this should be considered.
Rebuilding After Index Rebuild Failure with the concurrently Option, Invalid State
If an index rebuild using the concurrently option fails, the newly created index during the rebuild process may become invalid.
There may be multiple such indexes that need to be rebuilt. To view their list, check the indislive and indisvalid columns in the pg_index system catalog.
indislive and indisvalid are both boolean values, showing โtโ for normal and โfโ for abnormal.
tester=# select s.indexrelname, i.indisready, i.indislive, i.indisvalid from pg_stat_all_indexes s, pg_index i where i.indisready = 'f' or i.indislive = 'f' or i.indisvalid = 'f';
Reproduction Scenario
- Reproducing index corruption
- Confirming that the index is corrupted
- Normalizing via reindex
Performing Reindex After Checking Index Corruption
- Create a table and index, then forcibly corrupt them.
First, create the table and index.
-- Create a separate tablespace.
-- /var/lib/pgsql/tablespace/test_tbs is an arbitrary path for example purposes.
tester=# create tablespace test_tbs location'/var/lib/pgsql/tablespace/test_tbs';
-- Generate 5 million rows of random data.
tester=# CREATE TABLE concur_test tablespace test_tbs AS
SELECT
gs as idx,
md5(random()::text) AS random_string
FROM
generate_series(1, 5000000) AS gs;
tester=# CREATE INDEX concur_test_idx on concur_test(idx) tablespace test_tbs;
-- Check the oid of the created index.
tester=# select oid, relname from pg_class where relname = 'concur_test_idx';-- The path PG_15_202209061/16477 under the tablespace will be different in each environment, so enter the correct path. $ cd /var/lib/pgsql/tablespace/test_tbs/PG_15_202209061/16477 $ ls -arlt 38261 -- Use vi editor to manipulate the file contents. $ vi 38261
- Check for index corruption.
$ pg_amcheck -d tester --heapallindexed --rootdescend
-- There are various system catalogs you can use; this is an example using pg_stat_all_indexes.
-- When applying in a real work environment, it is recommended to rewrite it according to your environment.
tester=# CREATE OR REPLACE FUNCTION check_corrupted_indexes()
RETURNS TABLE (
index_name TEXT,
table_name TEXT
) AS $$
DECLARE
index_record RECORD;
BEGIN
FOR index_record IN
SELECT indexrelid::regclass AS indexname, relid::regclass::text AS tablename
FROM pg_stat_all_indexes
LOOP
BEGIN
PERFORM bt_index_parent_check(index_record.indexname);
EXCEPTION
WHEN others THEN
-- Return the corrupted index.
index_name := index_record.indexname::text;
table_name := index_record.tablename;
RETURN NEXT;
END;
END LOOP;
END;
$$ LANGUAGE plpgsql;
tester=# select * from check_corrupted_indexes();- After reindex, confirm normalization.
tester=# reindex index concur_test_idx; tester=# select * from check_corrupted_indexes(); $ pg_amcheck -d tester --heapallindexed --rootdescend
Performing Reindex After Confirming Increase in Empty Index Space
- Create an additional table and check the status before the update.
tester=# CREATE TABLE concur_test2 tablespace test_tbs AS
SELECT
gs as idx,
md5(random()::text) AS random_string
FROM
generate_series(1, 5000000) AS gs
;
tester=# create index concur_test2_idx on concur_test2(random_string) tablespace test_tbs;
tester=# select pg_relation_size('concur_test2') as table_size,
pg_relation_size('concur_test2_idx') as index_size,
100 - (pgstatindex('concur_test2_idx')).avg_leaf_density as bloat_ratio;- Perform update/delete on the table.
tester=# update concur_test2 set random_string = 'abcdefghijklmnopqrstuvwxyz1234567890123456789012345678901234567890' where idx between 1 and 2500000; tester=# delete from concur_test2 where idx between 1 and 2500000;
- After the update, refresh statistics with analyze and check index bloat using the pre-update/delete query.
tester=# analyze concur_test2;
tester=# select pg_relation_size('concur_test2') as table_size,
pg_relation_size('concur_test2_idx') as index_size,
100 - (pgstatindex('concur_test2_idx')).avg_leaf_density as bloat_ratio;- After reindex, confirm that it has been properly adjusted.
tester=# reindex index concur_test2_idx;
tester=# select pg_relation_size('concur_test2') as table_size,
pg_relation_size('concur_test2_idx') as index_size,
100 - (pgstatindex('concur_test2_idx')).avg_leaf_density as bloat_ratio;Performing Reindex After Adjusting Index Fillfactor
tester=# alter index concur_test2_idx set (fillfactor=90); tester=# reindex index concur_test2_idx;
Performing Reindex After Checking for Invalid Index
tester=# \d concur_test2; tester=# reindex index concurrently concur_test2_idx; -- Press Ctrl + C to stop during execution
NoteAn index with the suffix ccnew is a temporary index created during the reindex process.
tester=# select s.indexrelname, i.indisready, i.indislive, i.indisvalid from pg_stat_all_indexes s, pg_index i where s.indexrelid = i.indexrelid and (i.indisready = 'f' or i.indislive = 'f' or i.indisvalid = 'f') ;
NoteAccording to the official website, if a ccnew index becomes invalid, you should delete it and reindex again. However, in this document, we show an example of immediately performing reindex on an invalid index and checking that it becomes normal after reindex.
tester=# reindex index concurrently concur_test2_idx;
NoteThere is no function in PostgreSQL to check all four criteria for determining reindex targets at once.
You can create a script or query combining these checks.