达梦数据库学习笔记 — 达梦数据库IO写入性能与Partial write
本站文章除注明转载外,均为本站原创: 转载自love wife love life —Roger的Oracle/MySQL/PostgreSQL数据恢复博客
我们知道达梦数据库默认有多个IO线程,负责数据落盘写入,如下:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 |
[dmdba@ecs-1461 ~]$ disql sysdba/roger007 Server[LOCALHOST:5236]:mode is normal, state is open login used time : 23.694(ms) disql V8 SQL> SQL> set LINESHOW off SQL> set pagesize 100 SQL> select id,name,THREAD_DESC from v$threads where name like '%io%'; ID NAME THREAD_DESC -------------------- ---------- ------------------------------------ 194377 dm_io_thd IO thread 194378 dm_io_thd IO thread 194379 dm_io_thd IO thread 194380 dm_io_thd IO thread 194391 dm_hio_thd IO thread for HFS to read data pages 194390 dm_hio_thd IO thread for HFS to read data pages 194389 dm_hio_thd IO thread for HFS to read data pages 194388 dm_hio_thd IO thread for HFS to read data pages 8 rows got used time: 8.095(ms). Execute id is 537102. |
接下来我们分别准备几个会话窗口来进行一些IO写入测试和trace跟踪,观察一下IO写入情况:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 |
+++session 1 SQL> create table test20211020 as select * from dba_objects; executed successfully used time: 202.155(ms). Execute id is 537105. SQL> insert into test20211020 select * from test20211020; affect rows 843 used time: 5.729(ms). Execute id is 537106. SQL> / affect rows 1686 used time: 23.849(ms). Execute id is 537107. SQL> / affect rows 3372 used time: 68.884(ms). Execute id is 537108. SQL> / affect rows 6744 used time: 62.430(ms). Execute id is 537109. SQL> / affect rows 13488 used time: 176.609(ms). Execute id is 537110. SQL> / affect rows 26976 used time: 233.167(ms). Execute id is 537111. SQL> commit; executed successfully used time: 7.788(ms). Execute id is 537112. SQL> insert into test20211020 select * from test20211020; affect rows 53952 used time: 442.755(ms). Execute id is 537113. SQL> / affect rows 107904 used time: 00:00:01.291. Execute id is 537114. SQL> / affect rows 215808 used time: 00:00:12.630. Execute id is 537115. SQL> commit; executed successfully used time: 13.885(ms). Execute id is 537116. SQL> select checkpoint(100); CHECKPOINT(100) --------------- 0 used time: 00:00:07.064. Execute id is 537117. SQL> select SEGMENT_NAME,HEADER_FILE,TABLESPACE_NAME from dba_Segments 2 where segment_name=upper('test20211020'); SEGMENT_NAME HEADER_FILE TABLESPACE_NAME ------------ ----------- --------------- TEST20211020 0 MAIN used time: 697.673(ms). Execute id is 537119. SQL> |
再另一窗口使用strace先跟踪一下:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 |
+++ session 2 [root@ecs-1461 ~]# strace -T -tt -f -e trace=read,open,write,pwrite64,pread64 -o /tmp/194377.log -p 194377 strace: Process 194377 attached with 39 threads ^Cstrace: Process 194377 detached strace: Process 194372 detached strace: Process 194376 detached strace: Process 194378 detached strace: Process 194379 detached strace: Process 194380 detached strace: Process 194381 detached strace: Process 194382 detached strace: Process 194383 detached strace: Process 194386 detached strace: Process 194387 detached strace: Process 194388 detached strace: Process 194389 detached strace: Process 194390 detached strace: Process 194391 detached strace: Process 194395 detached strace: Process 194396 detached strace: Process 194397 detached strace: Process 194398 detached strace: Process 194399 detached strace: Process 194400 detached strace: Process 194401 detached strace: Process 194402 detached strace: Process 194403 detached strace: Process 194404 detached strace: Process 194405 detached strace: Process 194406 detached strace: Process 194407 detached strace: Process 194408 detached strace: Process 194409 detached strace: Process 194410 detached strace: Process 194411 detached strace: Process 194412 detached strace: Process 194413 detached strace: Process 194414 detached strace: Process 194415 detached strace: Process 194416 detached strace: Process 194417 detached strace: Process 491351 detached [root@ecs-1461 ~]# cd /proc/194372/fd [root@ecs-1461 fd]# ls -ltr total 0 lrwx------ 1 dmdba dinstall 64 Sep 13 15:20 4 -> 'socket:[1229039]' lrwx------ 1 dmdba dinstall 64 Oct 20 10:35 9 -> /opt/dm/dmdbms/data/enmotech/enmotech03.log lrwx------ 1 dmdba dinstall 64 Oct 20 10:35 8 -> /opt/dm/dmdbms/data/enmotech/enmotech02.log lrwx------ 1 dmdba dinstall 64 Oct 20 10:35 7 -> /opt/dm/dmdbms/data/enmotech/enmotech01.log lrwx------ 1 dmdba dinstall 64 Oct 20 10:35 6 -> /opt/dm/dmdbms/data/enmotech/TEMP.DBF lrwx------ 1 dmdba dinstall 64 Oct 20 10:35 5 -> /opt/dm/dmdbms/data/enmotech/SYSTEM.DBF lrwx------ 1 dmdba dinstall 64 Oct 20 10:35 32 -> '/data/dmarch/ARCHIVE_LOCAL2_0x2997A4FC[0]_2021-10-20_21-01-03.log' lrwx------ 1 dmdba dinstall 64 Oct 20 10:35 31 -> '/opt/dm/dmarch/ARCHIVE_LOCAL1_0x2997A4FC[0]_2021-10-20_21-01-03.log' l-wx------ 1 dmdba dinstall 64 Oct 20 10:35 30 -> 'pipe:[1229052]' lrwx------ 1 dmdba dinstall 64 Oct 20 10:35 3 -> 'socket:[1229038]' lr-x------ 1 dmdba dinstall 64 Oct 20 10:35 29 -> 'pipe:[1229052]' l-wx------ 1 dmdba dinstall 64 Oct 20 10:35 28 -> 'pipe:[1229051]' lr-x------ 1 dmdba dinstall 64 Oct 20 10:35 27 -> 'pipe:[1229051]' l-wx------ 1 dmdba dinstall 64 Oct 20 10:35 26 -> 'pipe:[1229050]' lr-x------ 1 dmdba dinstall 64 Oct 20 10:35 25 -> 'pipe:[1229050]' l-wx------ 1 dmdba dinstall 64 Oct 20 10:35 24 -> 'pipe:[1229049]' lr-x------ 1 dmdba dinstall 64 Oct 20 10:35 23 -> 'pipe:[1229049]' l-wx------ 1 dmdba dinstall 64 Oct 20 10:35 22 -> 'pipe:[1229048]' lr-x------ 1 dmdba dinstall 64 Oct 20 10:35 21 -> 'pipe:[1229048]' l-wx------ 1 dmdba dinstall 64 Oct 20 10:35 20 -> 'pipe:[1229047]' l-wx------ 1 dmdba dinstall 64 Oct 20 10:35 2 -> /opt/dmdbms/log/DmServicedmdb.log lr-x------ 1 dmdba dinstall 64 Oct 20 10:35 19 -> 'pipe:[1229047]' l-wx------ 1 dmdba dinstall 64 Oct 20 10:35 18 -> 'pipe:[1229046]' lr-x------ 1 dmdba dinstall 64 Oct 20 10:35 17 -> 'pipe:[1229046]' l-wx------ 1 dmdba dinstall 64 Oct 20 10:35 16 -> 'pipe:[1229045]' lr-x------ 1 dmdba dinstall 64 Oct 20 10:35 15 -> 'pipe:[1229045]' lrwx------ 1 dmdba dinstall 64 Oct 20 10:35 14 -> /opt/dm/dmdbms/data/enmotech/SYSAWR.DBF lrwx------ 1 dmdba dinstall 64 Oct 20 10:35 13 -> /opt/dm/dmdbms/data/enmotech/MAIN.DBF lrwx------ 1 dmdba dinstall 64 Oct 20 10:35 12 -> /opt/dm/dmdbms/data/enmotech/ROLL.DBF lrwx------ 1 dmdba dinstall 64 Oct 20 10:35 11 -> /opt/dm/dmdbms/data/enmotech/enmotech05.log lrwx------ 1 dmdba dinstall 64 Oct 20 10:35 10 -> /opt/dm/dmdbms/data/enmotech/enmotech04.log l-wx------ 1 dmdba dinstall 64 Oct 20 10:35 1 -> /opt/dmdbms/log/DmServicedmdb.log lr-x------ 1 dmdba dinstall 64 Oct 20 10:35 0 -> /dev/null lrwx------ 1 dmdba dinstall 64 Oct 20 21:39 33 -> 'socket:[2373584]' [root@ecs-1461 ~]# cat /tmp/194377.log |grep 'pwrite64(13'|head -5 194377 21:37:01.951610 pwrite64(13, "\4\0\0\0 \1\0\0\0\0\37\1\0\0\0\0!\1\0\0\24\0\0\0\0\0\0\0\210MS*"..., 8192, 2359296) = 8192 <0.000034> 194380 21:37:01.953046 pwrite64(13, "\4\0\0\0\37\1\0\0\0\0\36\1\0\0\0\0 \1\0\0\24\0\0\0\0\0\0\0\206MS*"..., 8192, 2351104) = 8192 <0.000045> 194378 21:37:01.953172 pwrite64(13, "\4\0\0\0!\1\0\0\0\0 \1\0\0\0\0\"\1\0\0\24\0\0\0\0\0\0\0\212MS*"..., 8192, 2367488) = 8192 <0.000025> 194379 21:37:01.953257 pwrite64(13, "\4\0\0\0\"\1\0\0\0\0!\1\0\0\0\0#\1\0\0\24\0\0\0\0\0\0\0\214MS*"..., 8192, 2375680) = 8192 <0.000020> 194380 21:37:01.953331 pwrite64(13, "\4\0\0\0#\1\0\0\0\0\"\1\0\0\0\0$\1\0\0\24\0\0\0\0\0\0\0\216MS*"..., 8192, 2383872) = 8192 <0.000018> [root@ecs-1461 ~]# [root@ecs-1461 ~]# cat /tmp/194377.log |grep 'pwrite64(13'|awk '{print $1}'|sort|uniq 194377 194378 194379 194380 [root@ecs-1461 ~]# [root@ecs-1461 ~]# cat /tmp/194377.log |grep 194377|grep 'pwrite64(13'|more 194377 21:37:01.951610 pwrite64(13, "\4\0\0\0 \1\0\0\0\0\37\1\0\0\0\0!\1\0\0\24\0\0\0\0\0\0\0\210MS*"..., 8192, 2359296) = 8192 <0.000034> 194377 21:37:01.958897 pwrite64(13, "\4\0\0\0$\1\0\0\0\0#\1\0\0\0\0%\1\0\0\24\0\0\0\0\0\0\0\220MS*"..., 8192, 2392064) = 8192 <0.000039> 194377 21:37:01.959169 pwrite64(13, "\4\0\0\0(\1\0\0\0\0'\1\0\0\0\0)\1\0\0\24\0\0\0\0\0\0\0\236MS*"..., 8192, 2424832) = 8192 <0.000031> 194377 21:37:01.959492 pwrite64(13, "\4\0\0\0\3\1\0\0\0\0\2\1\0\0\0\0\4\1\0\0\24\0\0\0\0\0\0\0@MS*"..., 8192, 2121728) = 8192 <0.000020> 194377 21:37:01.959768 pwrite64(13, "\4\0\0\0\7\1\0\0\0\0\6\1\0\0\0\0\10\1\0\0\24\0\0\0\0\0\0\0HMS*"..., 8192, 2154496) = 8192 <0.000018> 194377 21:37:01.960049 pwrite64(13, "\4\0\0\0\v\1\0\0\0\0\n\1\0\0\0\0\f\1\0\0\24\0\0\0\0\0\0\0PMS*"..., 8192, 2187264) = 8192 <0.000019> 194377 21:37:01.960323 pwrite64(13, "\4\0\0\0\17\1\0\0\0\0\16\1\0\0\0\0\20\1\0\0\24\0\0\0\0\0\0\0_MS*"..., 8192, 2220032) = 8192 <0.000019> 194377 21:37:01.960609 pwrite64(13, "\4\0\0\0\23\1\0\0\0\0\22\1\0\0\0\0\24\1\0\0\24\0\0\0\0\0\0\0gMS*"..., 8192, 2252800) = 8192 <0.000019> 194377 21:37:01.960883 pwrite64(13, "\4\0\0\0\27\1\0\0\0\0\26\1\0\0\0\0\30\1\0\0\24\0\0\0\0\0\0\0oMS*"..., 8192, 2285568) = 8192 <0.000019> 194377 21:37:01.961159 pwrite64(13, "\4\0\0\0\34\1\0\0\0\0\33\1\0\0\0\0\35\1\0\0\24\0\0\0\0\0\0\0\177MS*"..., 8192, 2326528) = 8192 <0.000018> 194377 21:37:02.952345 pwrite64(13, "\4\0\0\0I\1\0\0\0\0H\1\0\0\0\0J\1\0\0\24\0\0\0\0\0\0\0\364MS*"..., 8192, 2695168) = 8192 <0.000022> 194377 21:37:02.952657 pwrite64(13, "\4\0\0\0M\1\0\0\0\0L\1\0\0\0\0N\1\0\0\24\0\0\0\0\0\0\0\374MS*"..., 8192, 2727936) = 8192 <0.000020> 194377 21:37:02.952939 pwrite64(13, "\4\0\0\0Q\1\0\0\0\0P\1\0\0\0\0R\1\0\0\24\0\0\0\0\0\0\0\5NS*"..., 8192, 2760704) = 8192 <0.000019> 194377 21:37:02.953657 pwrite64(13, "\4\0\0\0,\1\0\0\0\0+\1\0\0\0\0-\1\0\0\24\0\0\0\0\0\0\0\246MS*"..., 8192, 2457600 <unfinished ...> 194377 21:37:02.954023 pwrite64(13, "\4\0\0\0000\1\0\0\0\0/\1\0\0\0\0001\1\0\0\24\0\0\0\0\0\0\0\257MS*"..., 8192, 2490368) = 8192 <0.000023> 194377 21:37:02.954301 pwrite64(13, "\4\0\0\0004\1\0\0\0\0003\1\0\0\0\0005\1\0\0\24\0\0\0\0\0\0\0\275MS*"..., 8192, 2523136) = 8192 <0.000018> 194377 21:37:02.954574 pwrite64(13, "\4\0\0\08\1\0\0\0\0007\1\0\0\0\09\1\0\0\24\0\0\0\0\0\0\0\305MS*"..., 8192, 2555904) = 8192 <0.000023> 194377 21:37:02.954880 pwrite64(13, "\4\0\0\0<\1\0\0\0\0;\1\0\0\0\0=\1\0\0\24\0\0\0\0\0\0\0\315MS*"..., 8192, 2588672) = 8192 <0.000020> 194377 21:37:02.955179 pwrite64(13, "\4\0\0\0@\1\0\0\0\0?\1\0\0\0\0A\1\0\0\24\0\0\0\0\0\0\0\334MS*"..., 8192, 2621440) = 8192 <0.000029> 194377 21:37:02.955494 pwrite64(13, "\4\0\0\0E\1\0\0\0\0D\1\0\0\0\0F\1\0\0\24\0\0\0\0\0\0\0\346MS*"..., 8192, 2662400) = 8192 <0.000019> 194377 21:37:03.955655 pwrite64(13, "\4\0\0\0r\1\0\0\0\0q\1\0\0\0\0s\1\0\0\24\0\0\0\0\0\0\0[NS*"..., 8192, 3031040) = 8192 <0.000021> 194377 21:37:03.955947 pwrite64(13, "\4\0\0\0v\1\0\0\0\0u\1\0\0\0\0w\1\0\0\24\0\0\0\0\0\0\0cNS*"..., 8192, 3063808) = 8192 <0.000020> 194377 21:37:03.956232 pwrite64(13, "\4\0\0\0z\1\0\0\0\0y\1\0\0\0\0{\1\0\0\24\0\0\0\0\0\0\0qNS*"..., 8192, 3096576) = 8192 <0.000020> 194377 21:37:03.956509 pwrite64(13, "\4\0\0\0U\1\0\0\0\0T\1\0\0\0\0V\1\0\0\24\0\0\0\0\0\0\0\23NS*"..., 8192, 2793472) = 8192 <0.000019> 194377 21:37:03.956786 pwrite64(13, "\4\0\0\0Y\1\0\0\0\0X\1\0\0\0\0Z\1\0\0\24\0\0\0\0\0\0\0\33NS*"..., 8192, 2826240) = 8192 <0.000019> 194377 21:37:03.957062 pwrite64(13, "\4\0\0\0]\1\0\0\0\0\\\1\0\0\0\0^\1\0\0\24\0\0\0\0\0\0\0#NS*"..., 8192, 2859008) = 8192 <0.000019> ....... |
从上面的trace跟踪结果来看,IO线程的写入均为8192,即达梦数据库默认的page size大小;说明达梦数据库不会对IO进行合并。
我们进一步观察一下checkpoint和redo log刷新线程:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 |
SQL> select * from v$rlogfile; GROUP_ID FILE_ID PATH CLIENT_PATH CREATE_TIME RLOG_SIZE MIN_EXEC_VER MIN_DCT_VER ----------- ----------- ------------------------------------------- -------------- -------------------------- -------------------- ------------ ----------- 2 0 /opt/dm/dmdbms/data/enmotech/enmotech01.log enmotech01.log 2021-09-03 10:41:47.000000 268435456 V8.1.1.1 4 2 1 /opt/dm/dmdbms/data/enmotech/enmotech02.log enmotech02.log 2021-09-03 10:41:47.000000 268435456 V8.1.1.1 4 2 2 /opt/dm/dmdbms/data/enmotech/enmotech03.log enmotech03.log 2021-09-03 13:20:54.000000 268435456 V8.1.1.1 4 2 3 /opt/dm/dmdbms/data/enmotech/enmotech04.log enmotech04.log 2021-09-03 13:21:07.000000 268435456 V8.1.1.1 4 2 4 /opt/dm/dmdbms/data/enmotech/enmotech05.log enmotech05.log 2021-09-03 13:21:18.000000 268435456 V8.1.1.1 4 used time: 0.511(ms). Execute id is 537124. SQL> SQL> select CKPT_LSN,FILE_LSN,CUR_LSN,NEXT_SEQ,DB_MAGIC,FLUSH_PAGES,CUR_FILE,CUR_OFFSET from v$rlog; CKPT_LSN FILE_LSN CUR_LSN NEXT_SEQ DB_MAGIC FLUSH_PAGES CUR_FILE CUR_OFFSET -------------------- -------------------- -------------------- -------------------- -------------------- ----------- ----------- -------------------- 710137448 710357694 710357694 831394 697804028 0 4 77251584 used time: 1.307(ms). Execute id is 537126. SQL> |
下面是checkpoint 线程的写情况:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 |
[root@ecs-1461 ~]# cat /tmp/194377.log |grep 194386|grep write|more 194386 21:37:14.995009 write(2, "checkpoint begin, used_space[105"..., 67) = 67 <0.000015> 194386 21:37:14.995247 write(35, "2021-10-20 21:37:14.995 [INFO] d"..., 141) = 141 <0.000013> 194386 21:37:18.342807 pwrite64(7, "DMRLOG7006\360;\216@\0\0\0\0\7\0\0\0\374\244\227)\360SS*\0\0"..., 4096, 0 <unfinished ...> 194386 21:37:18.352600 <... pwrite64 resumed>) = 4096 <0.009739> 194386 21:37:18.481562 write(34, "2021-10-20 21:37:18.387 [INFO] d"..., 126) = 126 <0.005925> 194386 21:37:18.595245 write(34, "2021-10-20 21:37:18.501 [INFO] d"..., 282) = 282 <0.006232> 194386 21:37:19.457746 write(2, "checkpoint end, 162 pages flushe"..., 81) = 81 <0.006132> 194386 21:37:19.563914 write(34, "2021-10-20 21:37:19.467 [INFO] d"..., 155) = 155 <0.005379> 194386 21:38:12.301514 write(2, "checkpoint begin, used_space[350"..., 66) = 66 <0.005652> 194386 21:38:12.400937 write(34, "2021-10-20 21:38:12.311 [INFO] d"..., 140) = 140 <0.005790> 194386 21:38:13.334158 pwrite64(7, "DMRLOG7006\253\5\216\16\0\0\0\0\7\0\0\0\374\244\227)\224lS*\0\0"..., 4096, 0) = 4096 <0.000031> 194386 21:38:13.346433 write(34, "2021-10-20 21:38:13.346 [INFO] d"..., 126) = 126 <0.000031> 194386 21:38:13.346645 write(34, "2021-10-20 21:38:13.346 [INFO] d"..., 283) = 283 <0.000011> |
接下来看看redo log写进程:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 |
[root@ecs-1461 ~]# cat /tmp/194377.log |grep 194387|grep write|tail -20 194387 21:37:28.993519 pwrite64(11, "\0~\1\0t|\1\0\337\233W\23\374\244\227)\1\0\0\0\0\0\30\257\f\0\0\0\0\0\30\257"..., 97792, 39133696) = 97792 <0.000041> 194387 21:37:29.001324 pwrite64(11, "\0z\1\0\265x\1\0\337\233W\23\374\244\227)\1\0\0\0\0\0\31\257\f\0\0\0\0\0\31\257"..., 96768, 39231488) = 96768 <0.000046> 194387 21:37:29.205380 pwrite64(11, "\0004\1\0\3473\1\0\337\233W\23\374\244\227)\1\0\0\0\0\0\32\257\f\0\0\0\0\0\32\257"..., 78848, 39328256) = 78848 <0.000055> 194387 21:37:32.103943 pwrite64(11, "\0\16\0\0\220\f\0\0\337\233W\23\374\244\227)\1\0\0\0\0\0\33\257\f\0\0\0\0\0\33\257"..., 3584, 39407104) = 3584 <0.000028> 194387 21:37:32.211396 pwrite64(11, "\0\2\0\0\23\1\0\0\337\233W\23\374\244\227)\1\0\0\0\0\0\34\257\f\0\0\0\0\0\34\257"..., 512, 39410688) = 512 <0.000037> 194387 21:38:13.275938 pwrite64(11, "\0\250\0\0`\247\0\0\337\233W\23\374\244\227)\1\0\0\0\0\0\35\257\f\0\0\0\0\0\35\257"..., 43008, 39411200) = 43008 <0.000030> 194387 21:38:13.374092 pwrite64(11, "\0\16\0\0\230\f\0\0\337\233W\23\374\244\227)\1\0\0\0\0\0\36\257\f\0\0\0\0\0\36\257"..., 3584, 39454208) = 3584 <0.000018> 194387 21:38:13.470289 pwrite64(11, "\0\16\0\0\200\f\0\0\337\233W\23\374\244\227)\1\0\0\0\0\0\37\257\f\0\0\0\0\0\37\257"..., 3584, 39457792) = 3584 <0.000018> 194387 21:38:13.567063 pwrite64(11, "\0\16\0\0\200\f\0\0\337\233W\23\374\244\227)\1\0\0\0\0\0 \257\f\0\0\0\0\0 \257"..., 3584, 39461376) = 3584 <0.000014> 194387 21:38:13.732557 pwrite64(11, "\0\16\0\0\200\f\0\0\337\233W\23\374\244\227)\1\0\0\0\0\0!\257\f\0\0\0\0\0!\257"..., 3584, 39464960) = 3584 <0.005734> 194387 21:38:13.952807 pwrite64(11, "\0\16\0\0\230\f\0\0\337\233W\23\374\244\227)\1\0\0\0\0\0\"\257\f\0\0\0\0\0\"\257"..., 3584, 39468544) = 3584 <0.006052> 194387 21:38:14.301298 pwrite64(11, "\0\16\0\0\230\f\0\0\337\233W\23\374\244\227)\1\0\0\0\0\0#\257\f\0\0\0\0\0#\257"..., 3584, 39472128) = 3584 <0.005806> 194387 21:38:14.481043 pwrite64(11, "\0\16\0\0\200\f\0\0\337\233W\23\374\244\227)\1\0\0\0\0\0$\257\f\0\0\0\0\0$\257"..., 3584, 39475712) = 3584 <0.005432> 194387 21:38:14.675334 pwrite64(11, "\0\16\0\0\260\f\0\0\337\233W\23\374\244\227)\1\0\0\0\0\0%\257\f\0\0\0\0\0%\257"..., 3584, 39479296) = 3584 <0.005809> 194387 21:38:15.248544 pwrite64(11, "\0\16\0\0\200\f\0\0\337\233W\23\374\244\227)\1\0\0\0\0\0&\257\f\0\0\0\0\0&\257"..., 3584, 39482880) = 3584 <0.005697> 194387 21:38:18.179307 pwrite64(11, "\0\16\0\0\200\f\0\0\337\233W\23\374\244\227)\1\0\0\0\0\0'\257\f\0\0\0\0\0'\257"..., 3584, 39486464) = 3584 <0.005788> 194387 21:38:18.418174 pwrite64(11, "\0\16\0\0\200\f\0\0\337\233W\23\374\244\227)\1\0\0\0\0\0(\257\f\0\0\0\0\0(\257"..., 3584, 39490048) = 3584 <0.005659> 194387 21:38:18.634811 pwrite64(11, "\0\16\0\0\230\f\0\0\337\233W\23\374\244\227)\1\0\0\0\0\0)\257\f\0\0\0\0\0)\257"..., 3584, 39493632) = 3584 <0.005697> 194387 21:38:18.866807 pwrite64(11, "\0\16\0\0\200\f\0\0\337\233W\23\374\244\227)\1\0\0\0\0\0*\257\f\0\0\0\0\0*\257"..., 3584, 39497216) = 3584 <0.003993> 194387 21:38:19.097366 pwrite64(11, "\0\6\0\0H\5\0\0\337\233W\23\374\244\227)\1\0\0\0\0\0+\257\f\0\0\0\0\0+\257"..., 1536, 39500800) = 1536 <0.000028> |
从上面信息来看,redo log的写入是会进行IO合并的,最小IO粒度是512,也就是redo log block size大小。同时也可以看到,触发检查点之后,检查点线程会更新其他非current redo,这一点跟Oracle类似。
最后我们再通过blktrace来观察一下操作系统底层的IO情况。
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 |
SQL> delete from test20211020; affect rows 441615 used time: 00:00:01.933. Execute id is 537130. SQL> commit; executed successfully used time: 5.456(ms). Execute id is 537131. SQL> select checkpoint(100); CHECKPOINT(100) --------------- 0 used time: 00:00:02.669. Execute id is 537132. SQL> [root@ecs-1461 ~]# blktrace -d /dev/vda2 ^C=== vda2 === CPU 0: 373 events, 18 KiB data CPU 1: 0 events, 0 KiB data CPU 2: 10 events, 1 KiB data CPU 3: 146 events, 7 KiB data CPU 4: 137 events, 7 KiB data CPU 5: 0 events, 0 KiB data CPU 6: 0 events, 0 KiB data CPU 7: 0 events, 0 KiB data Total: 666 events (dropped 0), 32 KiB data [root@ecs-1461 ~]# [root@ecs-1461 ~]# blkparse -i vda2.blktrace.0 > vda2.blktrace.prase [root@ecs-1461 ~]# btt -i vda2.blktrace.0 -B vda2.offset ==================== All Devices ==================== ALL MIN AVG MAX N --------------- ------------- ------------- ------------- ----------- Q2Qdm 0.000001270 0.025478517 9.458378019 574 Q2Cdm 0.000617186 0.001875641 0.009620685 568 Q2G 0.000000380 0.000032536 0.006712866 493 G2I 0.000000570 0.000043607 0.001057280 493 Q2M 0.000000440 0.000001368 0.000002190 110 I2D 0.000000700 0.000076288 0.000966670 458 M2D 0.000002320 0.000020718 0.000285483 110 D2C 0.000493425 0.001762272 0.009595925 568 ....... |
由于产生的trace文件内容较多,尝试过滤看看:
1 2 3 4 5 6 7 8 9 |
[root@ecs-1461 ~]# cat vda2.blktrace.prase |grep 194377 | more 253,0 0 2872 15.274699168 194377 A R 56231872 + 8 <- (253,2) 54132672 253,2 0 2873 15.274700258 194377 Q R 56231872 + 8 [dm_io_thd] 253,2 0 2874 15.274703968 194377 G R 56231872 + 8 [dm_io_thd] 253,2 0 2875 15.274705808 194377 I R 56231872 + 8 [dm_io_thd] 253,0 0 3325 15.395695171 194377 A R 58115968 + 8 <- (253,2) 56016768 253,2 0 3326 15.395696181 194377 Q R 58115968 + 8 [dm_io_thd] 253,2 0 3327 15.395699231 194377 G R 58115968 + 8 [dm_io_thd] 253,2 0 3328 15.395701601 194377 I R 58115968 + 8 [dm_io_thd] |
这里我们直接搜索IO线程:
1 2 3 4 5 6 7 8 9 10 11 12 |
搜索关于194377 线程的完整信息: 253,0 0 2872 15.274699168 194377 A R 56231872 + 8 <- (253,2) 54132672 253,2 0 2873 15.274700258 194377 Q R 56231872 + 8 [dm_io_thd] 253,2 0 2874 15.274703968 194377 G R 56231872 + 8 [dm_io_thd] 253,2 0 2875 15.274705808 194377 I R 56231872 + 8 [dm_io_thd]。 +++ 56231872表示偏移量 8 表示操作block个数 253,2 0 2876 15.274715948 451 D R 56231872 + 8 [kworker/0:1H] 253,2 0 2877 15.275371584 492322 C R 56231872 + 8 [0] 253,2 0 2878 15.275700457 194380 Q FWS [dm_io_thd] 253,2 0 2879 15.275703268 194380 G FWS [dm_io_thd] 253,2 0 2880 15.275715288 451 D FN [kworker/0:1H] 253,2 0 2881 15.275797308 451 C FN 0 [0] 253,2 0 2882 15.275798608 451 C WS 0 [0]。 ++ C表示IO完成 |
通过btt分析之后,会产生dat文件,进一步分析dat文件。
1 2 3 4 5 |
[root@ecs-1461 ~]# cat vda2.offset_253,2_c.dat|grep 58115968 7985276.944615264 58115968 58115976 [root@ecs-1461 ~]# [root@ecs-1461 ~]# cat vda2.offset_253,2_c.dat|grep 56231872 7985276.823620591 56231872 56231880 |
这里58115968 58115976 相减为8. 说明这一步IO操作,共进行了8次。其中56231872到56231880 也是8次IO。整个一共进行了16次IO。
1 2 |
[root@ecs-1461 ~]# cat /sys/block/vda/queue/physical_block_size 512 |
最后简单总结一下:
1、 从前面测试来看,达梦数据库IO线程在进行数据落盘写入时,是以page size大小为单位的,而且是单块写入,每次只能写1个。
当然在操作系统层来看,一次数据库最小IO(8k),则需要完成16次IO操作。 这应该也是一个较大的性能瓶颈点;这或许也算是达梦数据库为什么性能不是太强悍的重要原因之一。
2、 从测试来看,达梦数据库的机制也存在大多数数据库都存在的partial write(page断裂)问题,再加上达梦没有类似MySQL或MogDB一样的Double Write机制,因此这算是一个数据安全隐患。
3、Redo log线程在写入日志时会进行IO合并。
Leave a Reply
You must be logged in to post a comment.