达梦数据库学习笔记 – dataWatch守护集群
本站文章除注明转载外,均为本站原创: 转载自love wife love life —Roger的Oracle/MySQL/PostgreSQL数据恢复博客
本文链接地址: 达梦数据库学习笔记 – dataWatch守护集群
对于达梦数据库而言,目前其高可用解决方案主要还是以主备守护集群维护,通过datawatch来实现。
我们知道Oracle 主备也是非常成熟的,几乎所有Oracle用户的容灾都使用dataguard/active dataguard来实现,同时我们也知道Oracle standby的状态可以是mount或open read only 只读模式,这样可以进行读写分离。在Oracle 21c版本中,还支持多租户pdb
级别的read only 同步,这是非常牛。然而在达梦数据库中,standby不存在read only模式的说法,就是mount或者open,只是open时数据库角色是standby,仍然是可以进行只读查询的,不允许进行写操作。但从者来看,跟Oracle类似,只是感觉稍微有点别扭。
对于dm8的主备集群部署,相对来讲比较简单,这里我就简单列一下主备的配置情况:
1、主库dm.ini参数
1 2 3 4 5 6 7 8 9 |
[dmdba@kylin-dm1 enmotech]$ cat dm.ini |grep INI MAL_INI = 1 #dmmal.ini ARCH_INI = 1 #dmarch.ini REP_INI = 0 #dmrep.ini LLOG_INI = 0 #dmllog.ini TIMER_INI = 0 #dmtimer.ini MPP_INI = 0 #dmmpp.ini DFS_INI = 0 #dmdfs.ini DSC_FILE_INIT_ASYNC = 1 #DSC file init async flag |
这里主要修改mal_ini和arch_ini。
2、主库dmarch.ini归档相关参数
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 |
[dmdba@kylin-dm1 enmotech]$ cat dmarch.ini [ARCHIVE_REALTIME] ARCH_TYPE = REALTIME ARCH_DEST = test_dr [ARCHIVE_LOCAL1] ARCH_TYPE = LOCAL ARCH_DEST = /opt/dm/dmarch ARCH_FILE_SIZE = 256 ARCH_SPACE_LIMIT = 0 |
主库这里arch_dest目录地写standby 实例名称即可,arch_type选择实时同步模式。
3、dmmal.ini
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 |
[dmdba@kylin-dm1 enmotech]$ cat dmmal.ini MAL_CHECK_INTERVAL = 5 ##MAL链路检测时间间隔 MAL_CONN_FAIL_INTERVAL = 5 ##判定MAL链路断开的时间 [MAL_INST1] MAL_INST_NAME = test ##实例名,和dm.ini中的INSTANCE_NAME一致 MAL_HOST = 172.20.22.189 ##MAL系统监听TCP连接的IP地址 MAL_PORT = 5237 ##MAL系统监听TCP连接的端口 MAL_INST_HOST = 172.20.22.189 ##实例的对外服务IP地址 MAL_INST_PORT = 5236 ##实例的对外服务端口,和dm.ini中的PORT_NUM一致 MAL_DW_PORT = 5238 ##实例本地的守护进程监听TCP连接的端口 MAL_INST_DW_PORT = 5239 ##实例监听守护进程TCP连接的端口 [MAL_INST2] MAL_INST_NAME = test_dr MAL_HOST = 172.20.22.190 MAL_PORT = 5237 MAL_INST_HOST = 172.20.22.190 MAL_INST_PORT = 5236 MAL_DW_PORT = 5238 MAL_INST_DW_PORT = 5239 |
对于dmmal.ini中的参数,主备保持参数一致。
4、dmwatcher.ini
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 |
[dmdba@kylin-dm1 enmotech]$ cat dmwatcher.ini [TEST] DW_TYPE =GLOBAL ##全局守护类型 DW_MODE = AUTO ##自动切换模式 DW_ERROR_TIME = 10 ##远程守护进程故障认定时间 INST_RECOVER_TIME = 60 ##主库守护进程启动恢复的间隔时间 INST_ERROR_TIME = 10 ##本地实例故障认定时间 INST_OGUID = 666666 ##守护系统唯一OGUID值 INST_INI = /opt/dm/data/enmotech/dm.ini ##dm.ini配置文件路径 INST_AUTO_RESTART = 1 ##打开实例的自动启动功能 INST_STARTUP_CMD = /opt/dm/dmdbms/bin/dmserver ##命令行方式启动 RLOG_SEND_THRESHOLD = 0 ##指定主库发送日志到备库的时间阈值,默认关闭 RLOG_APPLY_THRESHOLD = 0 ##指定备库重演日志的时间阈值,默认关闭 |
对于dmwatcher.ini参数,主备保持一致即可。
5、dmmonitor.ini参数
1 2 3 4 5 6 7 8 9 10 |
[dmdba@kylin-dm1 enmotech]$ cat dmmonitor.ini MON_DW_Confirm = 1 MON_LOG_PATH = /opt/dm/data/log MON_LOG_INTERVAL = 60 MON_LOG_FILE_SIZE = 256 MON_LOG_SPACE_LIMIT = 1024 [test] MON_INST_OGUID = 666666 MON_DW_IP = 172.20.22.189:5238 MON_DW_IP = 172.20.22.190:5238 |
同样,主备保持参数一致。
6、数据库设置oguid值,该参数大小随意
–主库设置OGUID值:
SP_SET_PARA_VALUE(1, ‘ALTER_MODE_STATUS’, 1);
sp_set_oguid(666666);
alter database primary;
SP_SET_PARA_VALUE(1, ‘ALTER_MODE_STATUS’, 0);
—备库
设置OGUID值:
SP_SET_PARA_VALUE(1, ‘ALTER_MODE_STATUS’, 1);
sp_set_oguid(666666);
alter database standby;
SP_SET_PARA_VALUE(1, ‘ALTER_MODE_STATUS’, 0);
对于oguid的设置,均将数据库实例启动到mount状态后,登录disql执行即可。oguid的参数设置要与dmwatcher和dmmonitor中的参数保持一致。
7、备份恢复standby
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 |
+++主库备份全库 RMAN> backup database '/opt/dm/data/enmotech/dm.ini' full backupset '/opt/dm/dmbak/enmotech_full'; backup database '/opt/dm/data/enmotech/dm.ini' full backupset '/opt/dm/dmbak/enmotech_full'; Database mode = 0, oguid = 0 Normal of FAST Normal of DEFAULT Normal of RECYCLE Normal of KEEP Normal of ROLL begin redo pwr log collect, last ckpt lsn: 43428 ... redo pwr log collect finished EP[0]'s cur_lsn[43428], file_lsn[43428] Processing backupset /opt/dm/dmbak/enmotech_full [Percent:100.00%][Speed:0.00M/s][Cost:00:00:02][Remaining:00:00:00] backup successfully! time used: 00:00:03.379 RMAN> +++传输备份到standby [dmdba@kylin-dm1 dmbak]$ scp -r enmotech_full dmdba@172.20.22.190:/opt/dm/dmbak The authenticity of host '172.20.22.190 (172.20.22.190)' can't be established. ECDSA key fingerprint is SHA256:fKglM4A3W56Dd+7XBJBnDjyI0OqNpKYJLVYdJXHSjoI. Are you sure you want to continue connecting (yes/no/[fingerprint])? yes Warning: Permanently added '172.20.22.190' (ECDSA) to the list of known hosts. Authorized users only. All activities may be monitored and reported. dmdba@172.20.22.190's password: enmotech_full.bak 100% 7726KB 190.2MB/s 00:00 enmotech_full.meta 100% 81KB 73.0MB/s 00:00 [dmdba@kylin-dm1 dmbak]$ +++standby进行还原 [dmdba@kylin-dm2 ~]$ dmrman dmrman V8 RMAN> restore database '/opt/dm/data/enmotech/dm.ini' from backupset '/opt/dm/dmbak/enmotech_full'; restore database '/opt/dm/data/enmotech/dm.ini' from backupset '/opt/dm/dmbak/enmotech_full'; file dm.key not found, use default license! Normal of FAST Normal of DEFAULT Normal of RECYCLE Normal of KEEP Normal of ROLL [Percent:100.00%][Speed:0.00M/s][Cost:00:00:02][Remaining:00:00:00] restore successfully. time used: 00:00:02.528 RMAN> recover database '/opt/dm/data/enmotech/dm.ini' from backupset '/opt/dm/dmbak/enmotech_full'; recover database '/opt/dm/data/enmotech/dm.ini' from backupset '/opt/dm/dmbak/enmotech_full'; Database mode = 0, oguid = 0 Normal of FAST Normal of DEFAULT Normal of RECYCLE Normal of KEEP Normal of ROLL EP[0]'s cur_lsn[43428], file_lsn[43428] 备份集[/opt/dm/dmbak/enmotech_full]备份过程中未产生日志 recover successfully! time used: 371.904(ms) RMAN> RMAN> recover database '/opt/dm/data/enmotech/dm.ini' update DB_MAGIC; recover database '/opt/dm/data/enmotech/dm.ini' update DB_MAGIC; Database mode = 0, oguid = 0 Normal of FAST Normal of DEFAULT Normal of RECYCLE Normal of KEEP Normal of ROLL EP[0]'s cur_lsn[43428], file_lsn[43428] recover successfully! time used: 00:00:01.039 RMAN> update DB_MAGIC; RMAN> |
8、启动主备以及dmwatch和dmmonitor监控。
/opt/dm/dmdbms/bin/dmserver path=/opt/dm/data/enmotech/dm.ini mount &
dmwatcher /opt/dm/data/enmotech/dmwatcher.ini &
dmmonitor /opt/dm/data/enmotech/dmmonitor.ini &
也可以通过注册service的方式来启动数据库。
9、检查监控是否正常
查看/opt/dm/data/enmotech/log 中的dmmonitorlog 即可,如下:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 |
[dmdba@kylin-dm1 ~]$ [monitor] 2022-11-09 09:07:17: Dmwatcher process TEST status switching [OPEN-->MON CONFIRM] WTIME WSTATUS INST_OK INAME ISTATUS IMODE RSTAT N_OPEN FLSN CLSN 2022-11-09 09:07:17 MON CONFIRM OK TEST SUSPEND PRIMARY VALID 10 86408129 86408130 [monitor] 2022-11-09 09:07:18: Dmwatcher process TEST status switching [MON CONFIRM-->FAILOVER] WTIME WSTATUS INST_OK INAME ISTATUS IMODE RSTAT N_OPEN FLSN CLSN 2022-11-09 09:07:17 FAILOVER OK TEST SUSPEND PRIMARY VALID 10 86408129 86408130 [monitor] 2022-11-09 09:07:20: Dmwatcher process TEST status switching [FAILOVER-->OPEN] WTIME WSTATUS INST_OK INAME ISTATUS IMODE RSTAT N_OPEN FLSN CLSN 2022-11-09 09:07:20 OPEN OK TEST OPEN PRIMARY VALID 10 86408130 86408130 [monitor] 2022-11-09 09:07:20: Instance TEST_DR[STANDBY, OPEN, ISTAT_SAME:TRUE] error WTIME WSTATUS INST_OK INAME ISTATUS IMODE RSTAT N_OPEN FLSN CLSN 2022-11-09 09:07:19 STARTUP ERROR TEST_DR OPEN STANDBY INVALID 10 86408126 86408126 [monitor] 2022-11-09 09:07:20: Dmwatcher process TEST_DR status switching [OPEN-->STARTUP] WTIME WSTATUS INST_OK INAME ISTATUS IMODE RSTAT N_OPEN FLSN CLSN 2022-11-09 09:07:19 STARTUP ERROR TEST_DR OPEN STANDBY INVALID 10 86408126 86408126 [monitor] 2022-11-09 09:07:33: Instance TEST_DR[STANDBY, OPEN, ISTAT_SAME:TRUE] recover to OK WTIME WSTATUS INST_OK INAME ISTATUS IMODE RSTAT N_OPEN FLSN CLSN 2022-11-09 09:07:32 STARTUP OK TEST_DR OPEN STANDBY INVALID 10 86408126 86408126 [monitor] 2022-11-09 09:07:33: Dmwatcher process TEST_DR status switching [STARTUP-->OPEN] WTIME WSTATUS INST_OK INAME ISTATUS IMODE RSTAT N_OPEN FLSN CLSN 2022-11-09 09:07:32 OPEN OK TEST_DR OPEN STANDBY INVALID 10 86408126 86408126 [monitor] 2022-11-09 09:07:38: Dmwatcher process TEST status switching [OPEN-->RECOVERY] WTIME WSTATUS INST_OK INAME ISTATUS IMODE RSTAT N_OPEN FLSN CLSN 2022-11-09 09:07:38 RECOVERY OK TEST OPEN PRIMARY VALID 10 86408142 86408142 [monitor] 2022-11-09 09:07:42: Dmwatcher process TEST status switching [RECOVERY-->OPEN] WTIME WSTATUS INST_OK INAME ISTATUS IMODE RSTAT N_OPEN FLSN CLSN 2022-11-09 09:07:42 OPEN OK TEST OPEN PRIMARY VALID 10 86408143 86408143 |
从目前监控来看,现在的达梦主备守护集群正常,实际上我进行了多次kill进程测试,数据库很快会被自动拉起。说明主备集群还是比较稳定的。
经过多次benchmark压测,达梦主备集群整体来看还算稳定,性能波动在15-20%之间。
期间监控发现操作系统sys%较高,虽然磁盘IO不是太理想,不过总感觉有一定的优化空间,通过perf top看了下,发现系统中断对性能影响似乎较大。
这里我就是简单测试一下达梦主备的稳定性,没有做过多深入研究测试。另外对于测试的虚拟机环境配置为:4c/16G memory.
参数优化参考了达梦官方的脚本:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 |
declare v_mem_mb int; v_cpus int; mem_per int:=100; MEMORY_POOL int; BUFFER INT; MAX_BUFFER INT; RECYCLE int; CACHE_POOL_SIZE int; BUFFER_POOLS int; RECYCLE_POOLS int; SORT_BUF_SIZE int; SORT_BUF_GLOBAL_SIZE INT; DICT_BUF_SIZE INT; SESS_POOL_SIZE INT; HJ_BUF_SIZE INT; HAGR_BUF_SIZE INT; HJ_BUF_GLOBAL_SIZE INT; HAGR_BUF_GLOBAL_SIZE INT; TASK_THREADS INT; SORT_FLAG INT; SORT_BLK_SIZE INT; RLOG_POOL_SIZE INT; begin SELECT TOP 1 N_CPU,TOTAL_PHY_SIZE/1024/1024 INTO v_cpus,v_mem_mb FROM V$SYSTEMINFO; print v_cpus; print v_mem_mb; v_mem_mb=round(v_mem_mb,-3); print v_mem_mb; IF v_mem_mb <= 2000 THEN return; END IF; IF v_mem_mb > 512000 THEN v_mem_mb :=v_mem_mb*0.8; END IF; TASK_THREADS :=8; IF v_cpus < 8 THEN TASK_THREADS :=4; END IF; IF v_cpus >= 64 THEN v_cpus := 64; TASK_THREADS :=16; END IF; SP_SET_PARA_VALUE(2,'WORKER_THREADS',v_cpus); SP_SET_PARA_VALUE(2,'TASK_THREADS',TASK_THREADS); BUFFER := round(cast(v_mem_mb * 0.5 as int),-3); MAX_BUFFER := BUFFER; RECYCLE :=LEAST(cast(v_mem_mb * 0.125 as int),50000); IF v_mem_mb < 70000 THEN with t as ( select rownum rn from dual connect by level <= 100 ) , t1 as ( select * from t where rn > 1 minus select ta.rn * tb.rn from t ta, t tb where ta.rn <= tb.rn and ta.rn > 1 and tb.rn > 1 ) select top 1 rn into BUFFER_POOLS from t1 where rn > v_mem_mb/800 order by 1; PRINT 'BUFFER_POOLS ' || BUFFER_POOLS; --设置根据内存情况RECYCLE_POOLS参数 with t as ( select rownum rn from dual connect by level <= 100 ) , t1 as ( select * from t where rn > 1 minus select ta.rn * tb.rn from t ta, t tb where ta.rn <= tb.rn and ta.rn > 1 and tb.rn > 1 ) select top 1 rn into RECYCLE_POOLS from t1 where rn > v_mem_mb/800/3 order by 1; PRINT 'RECYCLE_POOLS ' || RECYCLE_POOLS; ELSE BUFFER_POOLS := 101; RECYCLE_POOLS := 41; END IF; --修改内存池 IF v_mem_mb >= 16000 THEN MEMORY_POOL := 2048; CACHE_POOL_SIZE := 1024; SORT_FLAG = 0; SORT_BLK_SIZE=1; SORT_BUF_SIZE := 10; SORT_BUF_GLOBAL_SIZE := 2000; SESS_POOL_SIZE := 16; RLOG_POOL_SIZE := 1024; HJ_BUF_GLOBAL_SIZE := LEAST(cast(v_mem_mb * 0.0625 as int),10000); HAGR_BUF_GLOBAL_SIZE := LEAST(cast(v_mem_mb * 0.0625 as int),10000); HJ_BUF_SIZE :=500; HAGR_BUF_SIZE :=500; IF v_mem_mb >= 64000 THEN CACHE_POOL_SIZE := 2048; RLOG_POOL_SIZE := 2048; SORT_FLAG = 1; SORT_BLK_SIZE=1; SORT_BUF_SIZE=512; SORT_BUF_GLOBAL_SIZE=5120 ; SESS_POOL_SIZE := 32; HJ_BUF_GLOBAL_SIZE := LEAST(cast(v_mem_mb * 0.0625 as int),15000); HAGR_BUF_GLOBAL_SIZE := LEAST(cast(v_mem_mb * 0.0625 as int),15000); HJ_BUF_SIZE :=1000; HAGR_BUF_SIZE :=1000; END IF; DICT_BUF_SIZE := 500; HJ_BUF_GLOBAL_SIZE :=round(HJ_BUF_GLOBAL_SIZE,-3); HAGR_BUF_GLOBAL_SIZE :=round(HAGR_BUF_GLOBAL_SIZE,-3); RECYCLE :=round(RECYCLE,-3); ELSE MEMORY_POOL :=GREAT(cast(v_mem_mb * 0.0625 as int),100); CACHE_POOL_SIZE := 200; RLOG_POOL_SIZE := 256; SORT_BUF_SIZE := 10; SORT_BUF_GLOBAL_SIZE := 500; DICT_BUF_SIZE := 50; SESS_POOL_SIZE =16; SORT_FLAG = 0; SORT_BLK_SIZE=1; HJ_BUF_GLOBAL_SIZE := GREAT(cast(v_mem_mb * 0.0625 as int),500); HAGR_BUF_GLOBAL_SIZE := GREAT(cast(v_mem_mb * 0.0625 as int),500); HJ_BUF_SIZE := GREAT(cast(v_mem_mb * 0.00625 as int),50); HAGR_BUF_SIZE :=GREAT(cast(v_mem_mb * 0.00625 as int),50); END IF; v_mem_mb := v_mem_mb * (mem_per/100.0); --修改内存 SP_SET_PARA_VALUE(2,'MAX_OS_MEMORY', mem_per); SP_SET_PARA_VALUE(2,'MEMORY_POOL', MEMORY_POOL); SP_SET_PARA_VALUE(2,'BUFFER', BUFFER); SP_SET_PARA_VALUE(2,'MAX_BUFFER', MAX_BUFFER); SP_SET_PARA_VALUE(2,'BUFFER_POOLS', BUFFER_POOLS); SP_SET_PARA_VALUE(2,'RECYCLE', RECYCLE); SP_SET_PARA_VALUE(2,'RECYCLE_POOLS', RECYCLE_POOLS); SP_SET_PARA_VALUE(2,'HJ_BUF_GLOBAL_SIZE', HJ_BUF_GLOBAL_SIZE); SP_SET_PARA_VALUE(2,'HJ_BUF_SIZE', HJ_BUF_SIZE ); SP_SET_PARA_VALUE(2,'HAGR_BUF_GLOBAL_SIZE',HAGR_BUF_GLOBAL_SIZE); SP_SET_PARA_VALUE(2,'HAGR_BUF_SIZE', HAGR_BUF_SIZE ); SP_SET_PARA_VALUE(2,'RLOG_POOL_SIZE', RLOG_POOL_SIZE); SP_SET_PARA_VALUE(2,'SESS_POOL_SIZE', SESS_POOL_SIZE); SP_SET_PARA_VALUE(2,'CACHE_POOL_SIZE', CACHE_POOL_SIZE); SP_SET_PARA_VALUE(2,'DICT_BUF_SIZE', DICT_BUF_SIZE); SP_SET_PARA_VALUE(2,'SORT_FLAG',SORT_FLAG); SP_SET_PARA_VALUE(2,'SORT_BLK_SIZE',SORT_BLK_SIZE); SP_SET_PARA_VALUE(2,'SORT_BUF_SIZE', SORT_BUF_SIZE); SP_SET_PARA_VALUE(2,'SORT_BUF_GLOBAL_SIZE', SORT_BUF_GLOBAL_SIZE); SP_SET_PARA_VALUE(2,'USE_PLN_POOL', 1); SP_SET_PARA_VALUE(2,'OLAP_FLAG', 2); SP_SET_PARA_VALUE(2,'VIEW_PULLUP_FLAG', 1); SP_SET_PARA_VALUE(2,'ENABLE_MONITOR', 1); SP_SET_PARA_VALUE(2,'SVR_LOG', 0); SP_SET_PARA_VALUE(2,'TEMP_SIZE', 1024); SP_SET_PARA_VALUE(2,'TEMP_SPACE_LIMIT', 20480); SP_SET_PARA_VALUE(2,'MAX_SESSIONS', 1500); SP_SET_PARA_VALUE(2,'MAX_SESSION_STATEMENT', 20000); SP_SET_PARA_VALUE(1,'PK_WITH_CLUSTER', 0); SP_SET_PARA_VALUE(2,'ENABLE_ENCRYPT',0); SP_SET_PARA_VALUE(2,'OPTIMIZER_MODE',1); end; / |
Leave a Reply
You must be logged in to post a comment.