How to Manually Add a Standby Node to a MogDB Cluster
First of all, this is not our officially recommended procedure. For a MogDB cluster, if read pressure is too high and you need an additional Standby to offload reads, the recommended way is to use gs_expansion. I am doing this purely for practice and fun, to get more familiar with some of MogDB's mechanics, so I am deliberately taking the unusual route; please do not imitate this.
The current architecture looks like this:
[omm@mogdb1 dn]$ gs_om -t status --detail
[   Cluster State   ]
cluster_state   : Normal
redistributing  : No
current_az      : AZ_ALL
[  Datanode State   ]
node    node_ip         port    instance                  state
--------------------------------------------------------------------------------
1  mogdb1 192.168.33.22  26000   6001 /mogdb/data/dn       P Primary Normal
2  mogdb2 192.168.33.23  26000   6002 /mogdb/data/dn       S Standby Normal
The goal here is to add one more server to the cluster above as an additional Standby.
1. Modify /etc/hosts on every node
192.168.33.22 mogdb1
192.168.33.23 mogdb2
192.168.33.24 mogdb
2. Adjust the relevant OS configuration on the new node
/etc/selinux/config
/etc/sysctl.conf
systemctl disable firewalld.service
systemctl stop firewalld.service
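Concretely, this amounts to something like the following sketch. The kernel values are not spelled out here, so treat them as assumptions and mirror the settings already used on mogdb1/mogdb2:

# /etc/selinux/config -- disable SELinux (reboot, or setenforce 0 for the current session)
SELINUX=disabled

# /etc/sysctl.conf -- copy the kernel parameters from an existing node (shared memory limits matter, see step 7), then reload
sysctl -p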
3. Create the user and the software directories
– omitted
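Roughly, and assuming the new node should mirror the existing ones (user omm in group dbgrp, as used by gs_preinstall later, and the directories referenced in the install XML further below), this might look like:

groupadd dbgrp
useradd -g dbgrp -m omm
passwd omm
mkdir -p /opt/mogdb /mogdb/data/dn /var/log/mogdb
chown -R omm:dbgrp /opt/mogdb /mogdb /var/log/mogdb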
4. Copy the MogDB software from node mogdb2 to the new node
– omitted
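One possible way to do this, assuming the paths match the existing installation under /opt/mogdb:

# run as omm on mogdb2
scp -r /opt/mogdb/app /opt/mogdb/tools omm@mogdb:/opt/mogdb/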
5. Add the replconninfo2 parameter on the primary node
replconninfo1 = 'localhost=192.168.33.22 localport=26001 localheartbeatport=26005 localservice=26004 remotehost=192.168.33.23 remoteport=26001 remoteheartbeatport=26005 remoteservice=26004'   # replication connection information used to connect primary on standby, or standby on primary,
replconninfo2 = 'localhost=192.168.33.22 localport=26001 localheartbeatport=26005 localservice=26004 remotehost=192.168.33.24 remoteport=26001 remoteheartbeatport=26005 remoteservice=26004'   # replication connection information used to connect secondary on primary or standby,
#replconninfo3 = ''      # replication connection information used to connect primary on standby, or standby on primary,
#replconninfo4 = ''      # replication connection information used to connect primary on standby, or standby on primary,
#replconninfo5 = ''      # replication connection information used to connect primary on standby, or standby on primary,
#replconninfo6 = ''      # replication connection information used to connect primary on standby, or standby on primary,
#replconninfo7 = ''      # replication connection information used to connect primary on standby, or standby on primary,
[root@mogdb1 ~]#
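Only the primary's postgresql.conf is shown above. For the replication channels to come up in both directions, the existing standby mogdb2 presumably needs an analogous replconninfo2 pointing at the new node, and the new node needs replconninfo1/replconninfo2 pointing back at mogdb1 and mogdb2. A sketch mirroring the entries above (my assumption, with the same ports):

# postgresql.conf on mogdb2 (192.168.33.23), assumed
replconninfo2 = 'localhost=192.168.33.23 localport=26001 localheartbeatport=26005 localservice=26004 remotehost=192.168.33.24 remoteport=26001 remoteheartbeatport=26005 remoteservice=26004'

# postgresql.conf on the new node (192.168.33.24), assumed
replconninfo1 = 'localhost=192.168.33.24 localport=26001 localheartbeatport=26005 localservice=26004 remotehost=192.168.33.22 remoteport=26001 remoteheartbeatport=26005 remoteservice=26004'
replconninfo2 = 'localhost=192.168.33.24 localport=26001 localheartbeatport=26005 localservice=26004 remotehost=192.168.33.23 remoteport=26001 remoteheartbeatport=26005 remoteservice=26004'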
6. Modify postgresql.conf / pg_hba.conf
Note that pg_hba.conf must be modified on all nodes.
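The entries themselves are not listed here; a minimal sketch of the kind of rules that let the three nodes reach each other (the authentication method is an assumption, adjust it to your security requirements):

# pg_hba.conf on every node -- assumed entries for the three node IPs
host    all    all    192.168.33.22/32    trust
host    all    all    192.168.33.23/32    trust
host    all    all    192.168.33.24/32    trust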
7. Perform the initial full build (synchronization) on the new node
[omm@mogdb dn]$ gs_ctl build -D /opt/mogdb/data/ -b full -M standby
[2022-04-02 23:42:00.339][17100][][gs_ctl]: gs_ctl full build ,datadir is /opt/mogdb/data
[2022-04-02 23:42:00.339][17100][][gs_ctl]: stop failed, killing mogdb by force ...
[2022-04-02 23:42:00.339][17100][][gs_ctl]: command [ps c -eo pid,euid,cmd | grep mogdb | grep -v grep | awk '{if($2 == curuid && $1!="-n") print "/proc/"$1"/cwd"}' curuid=`id -u`| xargs ls -l | awk '{if ($NF=="/opt/mogdb/data") print $(NF-2)}' | awk -F/ '{print $3 }' | xargs kill -9 >/dev/null 2>&1 ] path: [/opt/mogdb/data]
[2022-04-02 23:42:00.441][17100][][gs_ctl]: server stopped
[2022-04-02 23:42:00.441][17100][][gs_ctl]: current workdir is (/mogdb/data/dn).
[2022-04-02 23:42:00.441][17100][][gs_ctl]: /opt/mogdb/data/postgresql.conf cannot be opened.
[omm@mogdb dn]$ mv postgresql.conf /opt/mogdb/data/
[omm@mogdb dn]$
[omm@mogdb dn]$ gs_ctl build -D /opt/mogdb/data/ -b full -M standby
0 LOG:  [Alarm Module]can not read GAUSS_WARNING_TYPE env.
0 LOG:  [Alarm Module]Host Name: mogdb
0 LOG:  [Alarm Module]Host IP: 192.168.33.24
0 LOG:  [Alarm Module]Cluster Name: enmocluster
[2022-04-02 23:43:51.512][17117][][gs_ctl]: gs_ctl full build ,datadir is /opt/mogdb/data
[2022-04-02 23:43:51.513][17117][][gs_ctl]: stop failed, killing mogdb by force ...
[2022-04-02 23:43:51.513][17117][][gs_ctl]: command [ps c -eo pid,euid,cmd | grep mogdb | grep -v grep | awk '{if($2 == curuid && $1!="-n") print "/proc/"$1"/cwd"}' curuid=`id -u`| xargs ls -l | awk '{if ($NF=="/opt/mogdb/data") print $(NF-2)}' | awk -F/ '{print $3 }' | xargs kill -9 >/dev/null 2>&1 ] path: [/opt/mogdb/data]
[2022-04-02 23:43:51.552][17117][][gs_ctl]: server stopped
[2022-04-02 23:43:51.553][17117][][gs_ctl]: current workdir is (/mogdb/data/dn).
[2022-04-02 23:43:51.553][17117][][gs_ctl]: fopen build pid file "/opt/mogdb/data/gs_build.pid" success
[2022-04-02 23:43:51.554][17117][][gs_ctl]: fprintf build pid file "/opt/mogdb/data/gs_build.pid" success
[2022-04-02 23:43:51.554][17117][][gs_ctl]: fsync build pid file "/opt/mogdb/data/gs_build.pid" success
[2022-04-02 23:43:51.555][17117][][gs_ctl]: set gaussdb state file when full build:db state(BUILDING_STATE), server mode(STANDBY_MODE), build mode(FULL_BUILD).
[2022-04-02 23:43:51.568][17117][dn_6001_6002][gs_ctl]: connect to server success, build started.
[2022-04-02 23:43:51.568][17117][dn_6001_6002][gs_ctl]: create build tag file success
[2022-04-02 23:43:51.568][17117][dn_6001_6002][gs_ctl]: clear old target dir success
[2022-04-02 23:43:51.569][17117][dn_6001_6002][gs_ctl]: create build tag file again success
[2022-04-02 23:43:51.569][17117][dn_6001_6002][gs_ctl]: get system identifier success
[2022-04-02 23:43:51.569][17117][dn_6001_6002][gs_ctl]: receiving and unpacking files...
[2022-04-02 23:43:51.569][17117][dn_6001_6002][gs_ctl]: create backup label success
[2022-04-02 23:43:51.864][17117][dn_6001_6002][gs_ctl]: xlog start point: 4/24413930
[2022-04-02 23:43:51.864][17117][dn_6001_6002][gs_ctl]: begin build tablespace list
[2022-04-02 23:43:51.864][17117][dn_6001_6002][gs_ctl]: finish build tablespace list
[2022-04-02 23:43:51.864][17117][dn_6001_6002][gs_ctl]: begin get xlog by xlogstream
[2022-04-02 23:43:51.864][17117][dn_6001_6002][gs_ctl]: starting background WAL receiver
[2022-04-02 23:43:51.864][17117][dn_6001_6002][gs_ctl]: starting walreceiver
[2022-04-02 23:43:51.865][17117][dn_6001_6002][gs_ctl]: begin receive tar files
[2022-04-02 23:43:51.865][17117][dn_6001_6002][gs_ctl]: receiving and unpacking files...
[2022-04-02 23:43:51.875][17117][dn_6001_6002][gs_ctl]: check identify system success
[2022-04-02 23:43:51.876][17117][dn_6001_6002][gs_ctl]: send START_REPLICATION 4/24000000 success
[2022-04-02 23:44:01.278][17117][dn_6001_6002][gs_ctl]: finish receive tar files
[2022-04-02 23:44:01.278][17117][dn_6001_6002][gs_ctl]: xlog end point: 4/25000058
[2022-04-02 23:44:01.278][17117][dn_6001_6002][gs_ctl]: fetching MOT checkpoint
[2022-04-02 23:44:01.279][17117][dn_6001_6002][gs_ctl]: waiting for background process to finish streaming...
[2022-04-02 23:44:06.257][17117][dn_6001_6002][gs_ctl]: starting fsync all files come from source.
[2022-04-02 23:44:09.129][17117][dn_6001_6002][gs_ctl]: finish fsync all files.
[2022-04-02 23:44:09.130][17117][dn_6001_6002][gs_ctl]: build dummy dw file success
[2022-04-02 23:44:09.130][17117][dn_6001_6002][gs_ctl]: rename build status file success
[2022-04-02 23:44:09.139][17117][dn_6001_6002][gs_ctl]: build completed(/opt/mogdb/data).
[2022-04-02 23:44:09.314][17117][dn_6001_6002][gs_ctl]: waiting for server to start...
.0 LOG:  [Alarm Module]can not read GAUSS_WARNING_TYPE env.
0 LOG:  [Alarm Module]Host Name: mogdb
0 LOG:  [Alarm Module]Host IP: 192.168.33.24
0 LOG:  [Alarm Module]Cluster Name: enmocluster
2022-04-02 23:44:09.740 [unknown] [unknown] localhost 140629046322752 0[0:0#0]  0 [REDO] LOG:  Recovery parallelism, cpu count = 8, max = 4, actual = 4
2022-04-02 23:44:09.740 [unknown] [unknown] localhost 140629046322752 0[0:0#0]  0 [REDO] LOG:  ConfigRecoveryParallelism, true_max_recovery_parallelism:4, max_recovery_parallelism:4
2022-04-02 23:44:09.740 [unknown] [unknown] localhost 140629046322752 0[0:0#0]  0 [BACKEND] LOG:  [Alarm Module]can not read GAUSS_WARNING_TYPE env.
2022-04-02 23:44:09.740 [unknown] [unknown] localhost 140629046322752 0[0:0#0]  0 [BACKEND] LOG:  [Alarm Module]Host Name: mogdb
2022-04-02 23:44:09.819 [unknown] [unknown] localhost 140629046322752 0[0:0#0]  0 [BACKEND] LOG:  [Alarm Module]Host IP: 192.168.33.24
2022-04-02 23:44:09.819 [unknown] [unknown] localhost 140629046322752 0[0:0#0]  0 [BACKEND] LOG:  [Alarm Module]Cluster Name: enmocluster
2022-04-02 23:44:09.824 [unknown] [unknown] localhost 140629046322752 0[0:0#0]  0 [BACKEND] LOG:  loaded library "security_plugin"
2022-04-02 23:44:09.830 [unknown] [unknown] localhost 140629046322752 0[0:0#0]  0 [BACKEND] LOG:  InitNuma numaNodeNum: 1 numa_distribute_mode: none inheritThreadPool: 0.
2022-04-02 23:44:09.830 [unknown] [unknown] localhost 140629046322752 0[0:0#0]  0 [BACKEND] LOG:  reserved memory for backend threads is: 340 MB
2022-04-02 23:44:09.830 [unknown] [unknown] localhost 140629046322752 0[0:0#0]  0 [BACKEND] LOG:  reserved memory for WAL buffers is: 320 MB
2022-04-02 23:44:09.830 [unknown] [unknown] localhost 140629046322752 0[0:0#0]  0 [BACKEND] LOG:  Set max backend reserve memory is: 660 MB, max dynamic memory is: 4692 MB
2022-04-02 23:44:09.830 [unknown] [unknown] localhost 140629046322752 0[0:0#0]  0 [BACKEND] LOG:  shared memory 20231 Mbytes, memory context 5352 Mbytes, max process memory 25600 Mbytes
2022-04-02 23:44:09.831 [unknown] [unknown] localhost 140629046322752 0[0:0#0]  0 [BACKEND] FATAL:  could not create shared memory segment: Cannot allocate memory
2022-04-02 23:44:09.831 [unknown] [unknown] localhost 140629046322752 0[0:0#0]  0 [BACKEND] DETAIL:  Failed system call was shmget(key=26000001, size=21214617600, 03600).
2022-04-02 23:44:09.831 [unknown] [unknown] localhost 140629046322752 0[0:0#0]  0 [BACKEND] HINT:  This error usually means that PostgreSQL's request for a shared memory segment exceeded available memory or swap space, or exceeded your kernel's SHMALL parameter. You can either reduce the request size or reconfigure the kernel with larger SHMALL. To reduce the request size (currently 21214617600 bytes), reduce PostgreSQL's shared memory usage, perhaps by reducing shared_buffers. The PostgreSQL documentation contains more information about shared memory configuration.
2022-04-02 23:44:09.835 [unknown] [unknown] localhost 140629046322752 0[0:0#0]  0 [BACKEND] LOG:  FiniNuma allocIndex: 0.
[2022-04-02 23:44:09.838][17117][dn_6001_6002][gs_ctl]: waitpid 17143 failed, exitstatus is 256, ret is 2
[2022-04-02 23:44:09.838][17117][dn_6001_6002][gs_ctl]: stopped waiting
[2022-04-02 23:44:09.838][17117][dn_6001_6002][gs_ctl]: could not start server
Examine the log output.
[2022-04-02 23:44:09.838][17117][dn_6001_6002][gs_ctl]: fopen build pid file "/opt/mogdb/data/gs_build.pid" success
[2022-04-02 23:44:09.838][17117][dn_6001_6002][gs_ctl]: fprintf build pid file "/opt/mogdb/data/gs_build.pid" success
[2022-04-02 23:44:09.839][17117][dn_6001_6002][gs_ctl]: fsync build pid file "/opt/mogdb/data/gs_build.pid" success
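The full build itself completed, but the instance then failed to start: shmget() could not allocate the roughly 20 GB shared memory segment that the configuration copied from the primary asks for. This circles back to step 2: the kernel shared memory limits on the new node have to cover that segment size (or shared_buffers / max_process_memory must be lowered on the new node). A sketch of the kernel-side fix, with assumed values that should be sized to the host:

# /etc/sysctl.conf on the new node -- assumed values, must exceed the requested segment size
kernel.shmmax = 34359738368    # largest single segment, in bytes (32 GB here)
kernel.shmall = 8388608        # total shared memory, in pages (shmmax / `getconf PAGE_SIZE`)
sysctl -p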
8. Check the cluster status after the build completes
During this check I ran into the following series of problems:
[omm@mogdb script]$ gs_ctl query -D /mogdb/data/dn/
[2022-04-03 00:30:53.178][2596][][gs_ctl]: gs_ctl query ,datadir is /mogdb/data/dn
 HA state:
        local_role                     : Standby
        static_connections             : 1
        db_state                       : Normal
        detail_information             : FATAL:  Forbid remote connection with initial user.
FATAL:  Forbid remote connection with initial user.
        sync_mode                      : Async
 Senders info:
No information
 Receiver info:
No information
The problem above actually turned out to be a firewall issue. Another problem was that, although replication looked fine, the cluster could not display the node information. By tracing gs_om -t status --detail with strace I found that it reads a static config file, and it turned out that this cluster configuration file can be rebuilt with the command shown below.
At first I tried editing this binary file by hand (something I used to do all the time when repairing Oracle databases), but that did not work here, so I went back to generating it properly with the command.
Before rebuilding the configuration file, the new node's information has to be added to the XML file that was used when installing the software on the primary:
[omm@mogdb1 ~]$ cat /opt/soft/script/install_new.xml
<?xml version="1.0" encoding="UTF-8"?>
<ROOT>
  <!-- Overall openGauss cluster information -->
  <CLUSTER>
    <!-- Cluster name -->
    <PARAM name="clusterName" value="enmocluster" />
    <!-- Node names (hostnames) -->
    <PARAM name="nodeNames" value="mogdb1,mogdb2,mogdb" />
    <!-- Installation directory -->
    <PARAM name="gaussdbAppPath" value="/opt/mogdb/app" />
    <!-- Log directory -->
    <PARAM name="gaussdbLogPath" value="/var/log/mogdb" />
    <!-- Temporary file directory -->
    <PARAM name="tmpMppdbPath" value="/opt/mogdb/tmp"/>
    <!-- Tool directory -->
    <PARAM name="gaussdbToolPath" value="/opt/mogdb/tools" />
    <!-- Core file directory -->
    <PARAM name="corePath" value="/opt/mogdb/corefile"/>
    <!-- Node IPs, in the same order as the node name list -->
    <PARAM name="backIp1s" value="192.168.33.22,192.168.33.23,192.168.33.24"/>
  </CLUSTER>
  <!-- Per-server deployment information -->
  <DEVICELIST>
    <!-- Deployment information for node 1 -->
    <DEVICE sn="10000">
      <!-- Hostname of node 1 -->
      <PARAM name="name" value="mogdb1"/>
      <!-- AZ and AZ priority of node 1 -->
      <PARAM name="azName" value="AZ1"/>
      <PARAM name="azPriority" value="1"/>
      <!-- IP of node 1; if the server has only one usable NIC, set backIp1 and sshIp1 to the same IP -->
      <PARAM name="backIp1" value="192.168.33.22"/>
      <PARAM name="sshIp1" value="192.168.33.22"/>
      <!--dn-->
      <PARAM name="dataNum" value="1"/>
      <PARAM name="dataPortBase" value="26000"/>
      <PARAM name="dataNode1" value="/mogdb/data/dn,mogdb2,/mogdb/data/dn,mogdb,/mogdb/data/dn"/>
      <PARAM name="dataNode1_syncNum" value="0"/>
    </DEVICE>
    <!-- Deployment information for node 2; "name" is the hostname -->
    <DEVICE sn="10001">
      <!-- Hostname of node 2 -->
      <PARAM name="name" value="mogdb2"/>
      <!-- AZ and AZ priority of node 2 -->
      <PARAM name="azName" value="AZ1"/>
      <PARAM name="azPriority" value="1"/>
      <!-- IP of node 2; if the server has only one usable NIC, set backIp1 and sshIp1 to the same IP -->
      <PARAM name="backIp1" value="192.168.33.23"/>
      <PARAM name="sshIp1" value="192.168.33.23"/>
    </DEVICE>
    <!-- Deployment information for node 3; "name" is the hostname -->
    <DEVICE sn="10002">
      <!-- Hostname of node 3 -->
      <PARAM name="name" value="mogdb"/>
      <!-- AZ and AZ priority of node 3 -->
      <PARAM name="azName" value="AZ1"/>
      <PARAM name="azPriority" value="1"/>
      <!-- IP of node 3; if the server has only one usable NIC, set backIp1 and sshIp1 to the same IP -->
      <PARAM name="backIp1" value="192.168.33.24"/>
      <PARAM name="sshIp1" value="192.168.33.24"/>
    </DEVICE>
  </DEVICELIST>
</ROOT>
Next, regenerate the static configuration file and distribute it to all nodes:
[omm@mogdb1 om]$ gs_om -t generateconf -X /opt/soft/script/install_new.xml --distribute
Generating static configuration files for all nodes.
Creating temp directory to store static configuration files.
Successfully created the temp directory.
Generating static configuration files.
Successfully generated static configuration files.
Static configuration files for all nodes are saved in /opt/mogdb/tools/script/static_config_files.
Distributing static configuration files to all nodes.
omm@mogdb's password:
Successfully distributed static configuration files.
Checking the cluster status at this point, the new node now shows up, but its state is wrong:
[   Cluster State   ]
cluster_state   : Degraded
redistributing  : No
current_az      : AZ_ALL
[  Datanode State   ]
node    node_ip         port    instance                  state
--------------------------------------------------------------------------------
1  mogdb1 192.168.33.22  26000   6001 /mogdb/data/dn       P Primary Normal
2  mogdb2 192.168.33.23  26000   6002 /mogdb/data/dn       S Standby Normal
3  mogdb  192.168.33.24  26000   6003 /mogdb/data/dn       S Unknown Unknown
Meanwhile, querying the cluster status from the new node fails to recognize the existing nodes:
[omm@mogdb ~]$ gs_om -t status --detail
[   Cluster State   ]
cluster_state   : Unavailable
redistributing  : No
current_az      : AZ_ALL
[  Datanode State   ]
node    node_ip         port    instance                  state
--------------------------------------------------------------------------------
1  mogdb1 192.168.33.22  26000   6001 /mogdb/data/dn       P Unknown Unknown
2  mogdb2 192.168.33.23  26000   6002 /mogdb/data/dn       S Unknown Unknown
3  mogdb  192.168.33.24  26000   6003 /mogdb/data/dn       S Standby Normal
Stopping the cluster from the primary likewise fails to operate the newly added standby:
[omm@mogdb1 om]$ gs_om -t stop
Stopping cluster.
=========================================
[GAUSS-53606]: Can not stop the database, the cmd is . /home/omm/.bashrc; python3 '/opt/mogdb/tools/script/local/StopInstance.py' -U omm -R /opt/mogdb/app -t 300 -m fast, Error:
[GAUSS-51400] : Failed to execute the command: . /home/omm/.bashrc; python3 '/opt/mogdb/tools/script/local/StopInstance.py' -U omm -R /opt/mogdb/app -t 300 -m fast. Error:
[SUCCESS] mogdb1:
[SUCCESS] mogdb2:
[FAILURE] mogdb:
..
Running the same script directly on the standby, however, worked fine. So what was the problem? Remembering that the standby likewise could not operate on the other nodes, I suspected it was most likely an SSH mutual-trust problem for the omm user, because I had never configured trust between the omm user on the new standby host and the existing cluster nodes.
In fact, managing and operating the cluster through gs_om does depend on this mutual trust. Configuring the trust by hand is also fine (a rough sketch follows the gs_preinstall output below), but it is simpler to let the gs_preinstall script do it:
[root@mogdb1 script]# ./gs_preinstall -U omm -G dbgrp -X install_new.xml
Parsing the configuration file.
Successfully parsed the configuration file.
Installing the tools on the local node.
Successfully installed the tools on the local node.
Are you sure you want to create trust for root (yes/no)? yes
Please type 'yes' or 'no': yes
Please enter password for root.
Password:
Password authentication failed, please try again.
Password:
Creating SSH trust for the root permission user.
Checking network information.
All nodes in the network are Normal.
Successfully checked network information.
Creating SSH trust.
Creating the local key file.
Successfully created the local key files.
Appending local ID to authorized_keys.
Successfully appended local ID to authorized_keys.
Updating the known_hosts file.
Successfully updated the known_hosts file.
Appending authorized_key on the remote node.
Successfully appended authorized_key on all remote node.
Checking common authentication file content.
Successfully checked common authentication content.
Distributing SSH trust file to all node.
Successfully distributed SSH trust file to all node.
Verifying SSH trust on all hosts.
Successfully verified SSH trust on all hosts.
Successfully created SSH trust.
Successfully created SSH trust for the root permission user.
Setting pssh path
Successfully set core path.
Distributing package.
Begin to distribute package to tool path.
Successfully distribute package to tool path.
Begin to distribute package to package path.
Successfully distribute package to package path.
Successfully distributed package.
Are you sure you want to create the user[omm] and create trust for it (yes/no)? yes
Please type 'yes' or 'no': yes
Preparing SSH service.
Successfully prepared SSH service.
Installing the tools in the cluster.
Successfully installed the tools in the cluster.
Checking hostname mapping.
Successfully checked hostname mapping.
Creating SSH trust for [omm] user.
Please enter password for current user[omm].
Password:
[GAUSS-50306] : The password of omm is incorrect.
[root@mogdb1 script]#
[root@mogdb1 script]#
[root@mogdb1 script]# ./gs_preinstall -U omm -G dbgrp -X install_new.xml
Parsing the configuration file.
Successfully parsed the configuration file.
Installing the tools on the local node.
Successfully installed the tools on the local node.
Are you sure you want to create trust for root (yes/no)? yes
Please enter password for root.
Password:
Creating SSH trust for the root permission user.
Checking network information.
All nodes in the network are Normal.
Successfully checked network information.
Creating SSH trust.
Creating the local key file.
Successfully created the local key files.
Appending local ID to authorized_keys.
Successfully appended local ID to authorized_keys.
Updating the known_hosts file.
Successfully updated the known_hosts file.
Appending authorized_key on the remote node.
Successfully appended authorized_key on all remote node.
Checking common authentication file content.
Successfully checked common authentication content.
Distributing SSH trust file to all node.
Successfully distributed SSH trust file to all node.
Verifying SSH trust on all hosts.
Successfully verified SSH trust on all hosts.
Successfully created SSH trust.
Successfully created SSH trust for the root permission user.
Setting pssh path
Successfully set core path.
Distributing package.
Begin to distribute package to tool path.
Successfully distribute package to tool path.
Begin to distribute package to package path.
Successfully distribute package to package path.
Successfully distributed package.
Are you sure you want to create the user[omm] and create trust for it (yes/no)? yes
Please type 'yes' or 'no': yes
Preparing SSH service.
Successfully prepared SSH service.
Installing the tools in the cluster.
Successfully installed the tools in the cluster.
Checking hostname mapping.
Successfully checked hostname mapping.
Creating SSH trust for [omm] user.
Please enter password for current user[omm].
Password:
Checking network information.
All nodes in the network are Normal.
Successfully checked network information.
Creating SSH trust.
Creating the local key file.
Successfully created the local key files.
Appending local ID to authorized_keys.
Successfully appended local ID to authorized_keys.
Updating the known_hosts file.
Successfully updated the known_hosts file.
Appending authorized_key on the remote node.
Successfully appended authorized_key on all remote node.
Checking common authentication file content.
Successfully checked common authentication content.
Distributing SSH trust file to all node.
Successfully distributed SSH trust file to all node.
Verifying SSH trust on all hosts.
Successfully verified SSH trust on all hosts.
Successfully created SSH trust.
Successfully created SSH trust for [omm] user.
Checking OS software.
Successfully check os software.
Checking OS version.
Successfully checked OS version.
Creating cluster's path.
[SUCCESS] mogdb1:
[SUCCESS] mogdb2:
[FAILURE] mogdb:
[GAUSS-50200] : The /opt/mogdb/app already exists. Please remove it. It should be a symbolic link to $GAUSSHOME if it
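For reference, the manual alternative mentioned above is plain SSH key distribution for the omm user; a minimal sketch (host names taken from /etc/hosts above, everything else assumed):

# run as omm on the new node "mogdb"; repeat in the other direction from mogdb1 and mogdb2
ssh-keygen -t rsa -N '' -f ~/.ssh/id_rsa
ssh-copy-id omm@mogdb1
ssh-copy-id omm@mogdb2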
Once all of the above operations were complete, restarting the cluster one more time shows that everything is normal:
[omm@mogdb1 mogdb]$ gs_om -t start
Starting cluster.
=========================================
[SUCCESS] mogdb
2022-04-03 07:50:30.324 [unknown] [unknown] localhost 139764927858240 0[0:0#0]  0 [BACKEND] WARNING:  Failed to initialize the memory protect for g_instance.attr.attr_storage.cstore_buffers (16 Mbytes) or shared memory (12800 Mbytes) is larger.
2022-04-03 07:50:33.331 [unknown] [unknown] localhost 139764927858240 0[0:0#0]  0 [BACKEND] WARNING:  Cgroup get_cgroup Gaussdb:omm information: Cgroup does not exist(50002)
2022-04-03 07:50:33.331 [unknown] [unknown] localhost 139764927858240 0[0:0#0]  0 [BACKEND] WARNING:  Cgroup get_cgroup Gaussdb:omm/Class/DefaultClass information: Cgroup does not exist(50002)
=========================================
Successfully started.
[omm@mogdb1 mogdb]$ gs_om -t status --detail
[   Cluster State   ]
cluster_state   : Normal
redistributing  : No
current_az      : AZ_ALL
[  Datanode State   ]
node    node_ip         port    instance                  state
--------------------------------------------------------------------------------
1  mogdb1 192.168.33.22  26000   6001 /mogdb/data/dn       P Primary Normal
2  mogdb2 192.168.33.23  26000   6002 /mogdb/data/dn       S Standby Normal
3  mogdb  192.168.33.24  26000   6003 /mogdb/data/dn       S Standby Normal
[omm@mogdb1 mogdb]$
We can further check the replication status of the standbys:
[omm@mogdb1 mogdb]$ gs_om -t status --all
-----------------------------------------------------------------------
cluster_state             : Normal
redistributing            : No
-----------------------------------------------------------------------
node                      : 1
node_name                 : mogdb1
instance_id               : 6001
node_ip                   : 192.168.33.22
data_path                 : /mogdb/data/dn
instance_port             : 26000
type                      : Datanode
instance_state            : Normal
az_name                   : AZ1
static_connections        : 2
HA_state                  : Normal
instance_role             : Primary
-----------------------------------------------------------------------
node                      : 2
node_name                 : mogdb2
instance_id               : 6002
node_ip                   : 192.168.33.23
data_path                 : /mogdb/data/dn
instance_port             : 26000
type                      : Datanode
instance_state            : Normal
az_name                   : AZ1
instance_role             : Standby
HA_state                  : Streaming
sender_sent_location      : 4/2E1C4930
sender_write_location     : 4/2E1C4930
sender_flush_location     : 4/2E1C4930
sender_replay_location    : 4/2E1C4930
receiver_received_location: 4/2E1C4930
receiver_write_location   : 4/2E1C4930
receiver_flush_location   : 4/2E1C4930
receiver_replay_location  : 4/2E1C4930
sync_percent              : 100%
sync_state                : Quorum
-----------------------------------------------------------------------
node                      : 3
node_name                 : mogdb
instance_id               : 6003
node_ip                   : 192.168.33.24
data_path                 : /mogdb/data/dn
instance_port             : 26000
type                      : Datanode
instance_state            : Normal
az_name                   : AZ1
instance_role             : Standby
HA_state                  : Streaming
sender_sent_location      : 4/2E1C4930
sender_write_location     : 4/2E1C4930
sender_flush_location     : 4/2E1C4930
sender_replay_location    : 4/2E1C4930
receiver_received_location: 4/2E1C4930
receiver_write_location   : 4/2E1C4930
receiver_flush_location   : 4/2E1C4930
receiver_replay_location  : 4/2E1C4930
sync_percent              : 100%
sync_state                : Quorum
-----------------------------------------------------------------------
With that, the purely manual procedure for adding a standby to the cluster comes to an end. This was just for my own amusement; for a production MogDB environment, this approach is not recommended!