Document Type | Technical Information
Category | Installation
Applicable Product Version | Tibero 7FS04PS
Document Number | PINTI009
Overview
- Part 1: Prosync Installation Preparation and Agent Installation
- Part 2: Instance Installation
- Part 3: CM Failover Setup
- Part 4: CM Failover Testing
Test Environment
| Category | OS | IP | DBMS Version | Prosync Version |
|---|---|---|---|---|
| Source_TAC0 | Rocky Linux release 8.10 | 10.10.10.61 | Tibero 7.2.4 (build 305455) | 4.6.0 (build 308985) |
| Source_TAC1 | Rocky Linux release 8.10 | 10.10.10.62 | Tibero 7.2.4 (build 305455) | 4.6.0 (build 308985) |
| Target_TAC0 | Rocky Linux release 8.10 | 10.10.10.63 | Tibero 7.2.4 (build 305455) | 4.6.0 (build 308985) |
| Target_TAC1 | Rocky Linux release 8.10 | 10.10.10.64 | Tibero 7.2.4 (build 305455) | 4.6.0 (build 308985) |
Note
The Prosync CM Failover feature is supported in Prosync version 4.3 and later.
Method
CM FAILOVER Setup
Stop DB & CM
Stop the Tibero and CM processes on all servers.
$ tbdown immediate
Tibero instance terminated (IMMEDIATE mode).
$ tbcm -d
CM DOWN SUCCESS!
Modify $CM_SID.tip File
Add CM_ID as the first line of $TB_HOME/config/$CM_SID.tip. The CM_ID value must match the AGENT_CM_ID configured in prs_install_agent.cfg during agent installation.
$ vi $TB_HOME/config/$CM_SID.tip
############ TAC (Tibero Active Cluster)
#### must
## node1
CM_ID=0
CM_NAME=cm0
CM_UI_PORT=28629
CM_RESOURCE_FILE=/share/tibero_engine/config/cm0_res_file
### recommend
CM_HEARTBEAT_EXPIRE=300
CM_WATCHDOG_EXPIRE=290
CM_ENABLE_FAST_NET_ERROR_DETECTION=Y
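The manual edit above can also be scripted. The following is a minimal sketch, not part of the product: `set_cm_id_first` is a hypothetical helper name, and it assumes the tip file uses plain `KEY=value` lines as shown. It keeps a `.bak` copy, removes any existing CM_ID line, and writes the new CM_ID as the first line.

```shell
#!/bin/sh
# Hypothetical helper: make "CM_ID=<id>" the first line of a CM tip file.
# An existing CM_ID line is dropped first, so the function is safe to re-run.
set_cm_id_first() {
    tip_file=$1
    cm_id=$2
    tmp_file="${tip_file}.tmp.$$"

    # Keep a backup of the original file next to it.
    cp -p "$tip_file" "${tip_file}.bak" || return 1

    {
        printf 'CM_ID=%s\n' "$cm_id"
        # Copy every line except a previous CM_ID assignment.
        grep -v '^[[:space:]]*CM_ID=' "$tip_file"
    } > "$tmp_file"

    # Overwrite in place so ownership and permissions are preserved.
    cat "$tmp_file" > "$tip_file"
    rm -f "$tmp_file"
}

# Usage (run once per node, with that node's AGENT_CM_ID):
#   set_cm_id_first "$TB_HOME/config/$CM_SID.tip" 0
```

The ID passed in should be the AGENT_CM_ID from prs_install_agent.cfg for that node.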
Start CM & DB
Start the CM and Tibero processes on all servers.
$ tbcm -b
CM Guard daemon started up.
import resources from '/share/tibero_engine/config/cm0_res_file'...
TBCM 7.1.1 (Build 305455)
TmaxTibero Corporation Copyright (c) 2020-. All rights reserved.
Tibero cluster manager started up.
Local node name is (cm0:28629).
$ tbboot
Change core dump dir to /share/tibero_engine/bin/prof.
Listener port = 8629
Tibero 7
TmaxTibero Corporation Copyright (c) 2020-. All rights reserved.
Tibero instance started up (NORMAL mode).
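The stop and start steps are order-sensitive: the database goes down before CM, and CM comes up before the database. The sketch below wraps the four commands from this document in that order. The function names and the `DRY_RUN` switch are illustrative assumptions. With `DRY_RUN=1` (the default here) the commands are only printed.

```shell
#!/bin/sh
# Sketch: keep the stop/start order used in this document.
# DRY_RUN=1 prints the commands instead of running them.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '+ %s\n' "$*"
    else
        "$@" || { printf 'FAILED: %s\n' "$*" >&2; return 1; }
    fi
}

# Stop the database first, then the cluster manager.
stop_db_and_cm() {
    run tbdown immediate && run tbcm -d
}

# Start the cluster manager first, then the database.
start_cm_and_db() {
    run tbcm -b && run tbboot
}

# Usage on each server:
#   stop_db_and_cm
#   (edit $TB_HOME/config/$CM_SID.tip)
#   start_cm_and_db
```

Set `DRY_RUN=0` only after confirming the printed sequence is what you want on that server.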
Check CM
Use cmrctl show all to check the status.
$ cmrctl show all
Resource List of Node cm0
=====================================================================
CLUSTER TYPE NAME STATUS DETAIL
----------- -------- -------------- -------- ------------------------
COMMON network net0 UP (private) 10.10.10.61/29000
COMMON cluster cls0 UP inc: net0, pub: N/A
cls0 file cls0:0 UP /share/tac/CMFILE/CMFILE
cls0 service source_tac UP Database, Active Cluster (auto-restart: OFF)
cls0 db source_tac0 UP(NRML) source_tac, /share/tibero_engine, failed retry cnt: 0
=====================================================================
Resource List of Node cm1
=====================================================================
CLUSTER TYPE NAME STATUS DETAIL
----------- -------- -------------- -------- ------------------------
COMMON network net1 UP (private) 10.10.10.62/29000
COMMON cluster cls0 UP inc: net1, pub: N/A
cls0 file cls0:0 UP /share/tac/CMFILE/CMFILE
cls0 service source_tac UP Database, Active Cluster (auto-restart: OFF)
cls0 db source_tac1 UP(NRML) source_tac, /share/tibero_engine, failed retry cnt: 0
=====================================================================
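On larger clusters it may be easier to filter this listing than to read it line by line. The following sketch reads `cmrctl show all` output on standard input and prints only resources whose STATUS does not start with UP. It assumes the whitespace-separated column layout shown above (TYPE in column 2, NAME in column 3, STATUS in column 4). `list_not_up` is a hypothetical name.

```shell
#!/bin/sh
# Print "<type> <name> <status>" for every resource row that is not UP.
# Input: output of "cmrctl show all" on stdin.
list_not_up() {
    awk '
        # Resource rows have a known resource type in column 2;
        # headers and separator lines do not match and are skipped.
        $2 ~ /^(network|cluster|file|service|db|group|agent)$/ {
            if ($4 !~ /^UP/) print $2, $3, $4
        }
    '
}

# Usage (on a CM node):
#   cmrctl show all | list_not_up
```

No output means every listed resource reported a status beginning with UP, which includes UP(NRML).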
Register CM GROUP
Add CM GROUP based on the information written in prs_install_agent.cfg during agent installation. Perform this on only one node.
<Syntax>
cmrctl add group --name <group_name> --cname <cluster_name> --grptype <type> --failover <true|false>
<Example>
$ cmrctl add group --name SRC_CM --cname cls0 --grptype prosync
Resource add success! (group, SRC_CM)
$ cmrctl add group --name TAR_CM --cname cls0 --grptype prosync
Resource add success! (group, TAR_CM)
Register CM AGENT
Add CM AGENT based on the information written in prs_install_agent.cfg during agent installation. Perform this on each node separately.
<Syntax>
cmrctl add agent --name <agent_name> --grpname <group_name> --script <directory_path> --pubnet <public_network_resource_name> --retry_cnt <retry_cnt>
<Example>
$ cmrctl add agent --name src_agent1 --grpname SRC_CM --script $PRS_HOME/bin/prs_0.sh
Resource add success! (agent, src_agent1)
$ cmrctl add agent --name src_agent2 --grpname SRC_CM --script $PRS_HOME/bin/prs_1.sh
Resource add success! (agent, src_agent2)
$ cmrctl add agent --name tar_agent1 --grpname TAR_CM --script $PRS_HOME/bin/prs_0.sh
Resource add success! (agent, tar_agent1)
$ cmrctl add agent --name tar_agent2 --grpname TAR_CM --script $PRS_HOME/bin/prs_1.sh
Resource add success! (agent, tar_agent2)
Verify CM Registration
Use cmrctl show all to check the registration status of GROUP & AGENT.
$ cmrctl show all
Resource List of Node cm0
=====================================================================
CLUSTER TYPE NAME STATUS DETAIL
----------- -------- -------------- -------- ------------------------
COMMON network net0 UP (private) 10.10.10.61/29000
COMMON cluster cls0 UP inc: net0, pub: N/A
cls0 file cls0:0 UP /share/tac/CMFILE/CMFILE
cls0 service source_tac UP Database, Active Cluster (auto-restart: OFF)
cls0 db source_tac0 UP(NRML) source_tac, /share/tibero_engine, failed retry cnt: 0
cls0 group SRC_CM DOWN type: prosync (failover: ON)
cls0 agent src_agent1 DOWN /share/prosync4/bin/prs_0.sh, start retry cnt: 0
=====================================================================
Resource List of Node cm1
=====================================================================
CLUSTER TYPE NAME STATUS DETAIL
----------- -------- -------------- -------- ------------------------
COMMON network net1 UP (private) 10.10.10.62/29000
COMMON cluster cls0 UP inc: net1, pub: N/A
cls0 file cls0:0 UP /share/tac/CMFILE/CMFILE
cls0 service source_tac UP Database, Active Cluster (auto-restart: OFF)
cls0 db source_tac1 UP(NRML) source_tac, /share/tibero_engine, failed retry cnt: 0
cls0 group SRC_CM DOWN type: prosync (failover: ON)
cls0 agent src_agent2 DOWN /share/prosync4/bin/prs_1.sh, start retry cnt: 0
=====================================================================
Note
If the agent is in DEACT status, check $PRS_HOME/var/cmagent.log for the rc result value.
If rc is 127, correct the execute permission of $PRS_HOME/bin/prs_0.sh or change /bin/sh to /bin/bash to resolve the issue.
The command to reactivate an agent in DEACT status is as follows:
$ cmrctl act agent --name <AGENT_ID>
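Both causes named in the note can be checked before reactivating the agent. The sketch below is illustrative only: `check_agent_script` is a hypothetical helper, and it assumes the "/bin/sh" mentioned in the note refers to the interpreter line at the top of the agent script.

```shell
#!/bin/sh
# Report the two conditions the note associates with rc 127:
# a missing execute permission and a /bin/sh interpreter line.
check_agent_script() {
    script=$1
    if [ ! -f "$script" ]; then
        echo "missing: $script"
        return 1
    fi

    # Execute permission for the current user.
    [ -x "$script" ] || echo "not executable: $script (try: chmod +x)"

    # First line of the script, expected to be the interpreter line.
    first_line=$(head -n 1 "$script")
    case $first_line in
        '#!/bin/sh'*) echo "interpreter is /bin/sh: consider /bin/bash" ;;
    esac
    return 0
}

# Usage (as the OS user that runs CM):
#   check_agent_script "$PRS_HOME/bin/prs_0.sh"
```

An empty result means neither condition was found, in which case cmagent.log is the next place to look.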
Start CM GROUP
To use the failover feature, Prosync must be started through CM rather than directly. Start each CM group, then confirm that the group and its agents change from DOWN to UP.
$ cmrctl start group --name SRC_CM
=================================== SUCCESS! ===================================
Succeeded to request at each node to boot resources under the group(SRC_CM).
Please use "cmrctl show group --name SRC_CM" to verify the result.
================================================================================
$ cmrctl show all
Resource List of Node cm0
=====================================================================
CLUSTER TYPE NAME STATUS DETAIL
----------- -------- -------------- -------- ------------------------
COMMON network net0 UP (private) 10.10.10.61/29000
COMMON cluster cls0 UP inc: net0, pub: N/A
cls0 file cls0:0 UP /share/tac/CMFILE/CMFILE
cls0 service source_tac UP Database, Active Cluster (auto-restart: OFF)
cls0 db source_tac0 UP(NRML) source_tac, /share/tibero_engine, failed retry cnt: 0
cls0 group SRC_CM UP type: prosync (failover: ON)
cls0 agent src_agent1 UP /share/prosync4/bin/prs_0.sh, start retry cnt: 0
=====================================================================
Resource List of Node cm1
=====================================================================
CLUSTER TYPE NAME STATUS DETAIL
----------- -------- -------------- -------- ------------------------
COMMON network net1 UP (private) 10.10.10.62/29000
COMMON cluster cls0 UP inc: net1, pub: N/A
cls0 file cls0:0 UP /share/tac/CMFILE/CMFILE
cls0 service source_tac UP Database, Active Cluster (auto-restart: OFF)
cls0 db source_tac1 UP(NRML) source_tac, /share/tibero_engine, failed retry cnt: 0
cls0 group SRC_CM UP type: prosync (failover: ON)
cls0 agent src_agent2 UP /share/prosync4/bin/prs_1.sh, start retry cnt: 0
=====================================================================
$ cmrctl start group --name TAR_CM
=================================== SUCCESS! ===================================
Succeeded to request at each node to boot resources under the group(TAR_CM).
Please use "cmrctl show group --name TAR_CM" to verify the result.
================================================================================
$ cmrctl show all
Resource List of Node cm0
=====================================================================
CLUSTER TYPE NAME STATUS DETAIL
----------- -------- -------------- -------- ------------------------
COMMON network net0 UP (private) 10.10.10.63/29000
COMMON cluster cls0 UP inc: net0, pub: N/A
cls0 file cls0:0 UP /share/tac/CMFILE/CMFILE
cls0 service target_tac UP Database, Active Cluster (auto-restart: OFF)
cls0 db target_tac0 UP(NRML) target_tac, /share/tibero_engine, failed retry cnt: 0
cls0 group TAR_CM UP type: prosync (failover: ON)
cls0 agent tar_agent1 UP /share/prosync4/bin/prs_0.sh, start retry cnt: 0
=====================================================================
Resource List of Node cm1
=====================================================================
CLUSTER TYPE NAME STATUS DETAIL
----------- -------- -------------- -------- ------------------------
COMMON network net1 UP (private) 10.10.10.64/29000
COMMON cluster cls0 UP inc: net1, pub: N/A
cls0 file cls0:0 UP /share/tac/CMFILE/CMFILE
cls0 service target_tac UP Database, Active Cluster (auto-restart: OFF)
cls0 db target_tac1 UP(NRML) target_tac, /share/tibero_engine, failed retry cnt: 0
cls0 group TAR_CM UP type: prosync (failover: ON)
cls0 agent tar_agent2 UP /share/prosync4/bin/prs_1.sh, start retry cnt: 0
=====================================================================
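Because `cmrctl start group` only reports that the request was sent, the status may need to be polled until the group is UP. The sketch below repeats `cmrctl show all` and succeeds once every row for the named group shows STATUS UP. The function name, retry count, delay, and the `CMRCTL` override variable are assumptions for illustration. The row layout is taken from the listings above.

```shell
#!/bin/sh
# Poll "cmrctl show all" until all rows of the given group show STATUS UP.
# CMRCTL can be overridden, for example with a stub, for testing.
CMRCTL=${CMRCTL:-cmrctl}

wait_group_up() {
    group=$1
    tries=${2:-30}   # number of polls before giving up
    delay=${3:-2}    # seconds between polls
    i=0
    while [ "$i" -lt "$tries" ]; do
        # Group rows look like: "cls0 group SRC_CM UP type: prosync ..."
        if $CMRCTL show all | awk -v g="$group" '
            $2 == "group" && $3 == g { n++; if ($4 == "UP") up++ }
            END { exit !(n > 0 && up == n) }'
        then
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    return 1
}

# Usage:
#   cmrctl start group --name SRC_CM
#   wait_group_up SRC_CM || echo "SRC_CM did not come up; check cmagent.log"
```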
Verify prosync Startup
Check the Prosync startup status using the Prosync admin utility (prs_adm).
$ prs_adm
ProSync 4 - Admin Utility
TmaxData Corporation Copyright (c) 2024-. All rights reserved.
Admin> status
prs_agent ID: src_agent1, HOST: 10.10.10.61, PORT: 7600, CM_GROUP: SRC_CM, CM_ID: 0 is running
prs_agent ID: src_agent2, HOST: 10.10.10.62, PORT: 7700, CM_GROUP: SRC_CM, CM_ID: 1 is running
prs_agent ID: tar_agent1, HOST: 10.10.10.63, PORT: 7800, CM_GROUP: TAR_CM, CM_ID: 0 is running
prs_agent ID: tar_agent2, HOST: 10.10.10.64, PORT: 7900, CM_GROUP: TAR_CM, CM_ID: 1 is running
Instance ID: [PRS_FAILOVER]
PRS_FAILOVER_ext1 (1) is running (prs_agent ID : src_agent1, HOST: 10.10.10.61, PORT: 7600)
PRS_FAILOVER_ext2 (2) is running (prs_agent ID : src_agent2, HOST: 10.10.10.62, PORT: 7700)
PRS_FAILOVER_apply1 (1) is running (prs_agent ID : tar_agent1, HOST: 10.10.10.63, PORT: 7800)
PRS_FAILOVER_llob (1) is running (prs_agent ID : src_agent1, HOST: 10.10.10.61, PORT: 7600)
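If the status output is saved to a file, a short filter can confirm that every agent and process is running. This is a sketch under assumptions: it expects one entry per line, treats lines beginning with "prs_agent ID:" or having a "(N)" second field as entries, and flags any entry that lacks the text "is running". The wording of a non-running entry is not shown in this document, so the filter only tests for the absence of "is running". `summarize_prs_status` is a hypothetical name.

```shell
#!/bin/sh
# Summarize saved "status" output from prs_adm (read from stdin).
# Exit status is 0 only if at least one entry exists and all are running.
summarize_prs_status() {
    awk '
        # Agent lines start with "prs_agent ID:"; process lines have "(N)"
        # as their second field, e.g. "PRS_FAILOVER_ext1 (1) is running ...".
        /^prs_agent ID:/ || $2 ~ /^\([0-9]+\)$/ {
            total++
            if ($0 ~ /is running/) running++
            else print "NOT RUNNING: " $0
        }
        END {
            printf "%d of %d entries running\n", running, total
            exit (total > 0 && running == total) ? 0 : 1
        }
    '
}

# Usage, with status.txt holding the saved output:
#   summarize_prs_status < status.txt
```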