Following on from the previous chapter, we now combine the osd3_watch.txt log file with pgmap_active_clean.txt, pgmap_in_down.txt, and pgmap_out_down.txt, pick out four representative PGs, and analyze what each PG does over the whole course of osd0 going from in+up, to in+down, and finally to out+down.

The four selected PGs are listed below (up is the replica set computed by CRUSH from the current osdmap; acting is the set actually serving I/O, which can differ from up while a pg_temp mapping is in force):

                        in + up                                           in + down                                         out + down
--------------------------------------------------------------------------------------------------------------------------------------------------------
pg_stat   up      up_primary  acting  acting_primary    up      up_primary  acting  acting_primary    up      up_primary  acting  acting_primary
11.4      [0,3]   0           [0,3]   0                 [3]     3           [3]     3                 [3,2]   3           [3,2]   3
22.2c     [0,3]   0           [0,3]   0                 [3]     3           [3]     3                 [5,7]   5           [5,7]   5
22.2a     [3,0]   3           [3,0]   3                 [3]     3           [3]     3                 [3,6]   3           [3,6]   3
22.16     [3,7]   3           [3,7]   3                 [3,7]   3           [3,7]   3                 [3,7]   3           [3,7]   3

These four PGs represent four typical scenarios (formalized in the sketch after the list):

  • PG 11.4 : the PG's primary OSD is shut down; after the OSD is marked out, one of the PG's replicas is remapped

  • PG 22.2c: the PG's primary OSD is shut down; after the OSD is marked out, both of the PG's replicas are remapped

  • PG 22.2a: one of the PG's replica OSDs is shut down; after the OSD is marked out, one of the PG's replicas is remapped

  • PG 22.16: the shut-down OSD holds no replica of the PG at all
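
To make this classification concrete, below is a small self-contained sketch (our own illustration, not anything from the Ceph code base; classify, up_before, stopped_osd, and up_after_out are all invented names) that derives the scenario from a PG's up set while osd0 was still in+up, the id of the stopped OSD, and the up set after that OSD has been marked out:

#include <algorithm>
#include <iostream>
#include <string>
#include <vector>

// Illustrative helper (not Ceph code): classify which of the four scenarios
// a PG falls into, given its up set before the shutdown, the stopped OSD's
// id, and its up set after that OSD has been marked out.
std::string classify(const std::vector<int>& up_before,
                     int stopped_osd,
                     const std::vector<int>& up_after_out)
{
    bool holds_replica =
        std::find(up_before.begin(), up_before.end(), stopped_osd) !=
        up_before.end();
    if (!holds_replica)
        return "stopped OSD holds no replica of the PG (22.16)";
    bool was_primary = (up_before.front() == stopped_osd);
    // Count surviving replicas that keep their placement after the remap.
    int kept = 0;
    for (int osd : up_before)
        if (osd != stopped_osd &&
            std::find(up_after_out.begin(), up_after_out.end(), osd) !=
                up_after_out.end())
            ++kept;
    int remapped = static_cast<int>(up_after_out.size()) - kept;
    if (was_primary)
        return remapped == 1 ? "primary down, one replica remapped (11.4)"
                             : "primary down, both replicas remapped (22.2c)";
    return "replica down, one replica remapped (22.2a)";
}

int main()
{
    std::cout << classify({0, 3}, 0, {3, 2}) << "\n";  // PG 11.4
    std::cout << classify({0, 3}, 0, {5, 7}) << "\n";  // PG 22.2c
    std::cout << classify({3, 0}, 0, {3, 6}) << "\n";  // PG 22.2a
    std::cout << classify({3, 7}, 0, {3, 7}) << "\n";  // PG 22.16
}

Fed the four rows of the table above, it prints the four scenario labels in order.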

Use the following commands to extract the log entries relevant to each of these PGs from osd3_watch.txt:

# grep -rnw "11.4" ./osd3_watch.txt > ./osd3_watch_pg_11.4.txt
# grep -rnw "22.2c" ./osd3_watch.txt > ./osd3_watch_pg_22.2c.txt
# grep -rnw "22.2a" ./osd3_watch.txt > ./osd3_watch_pg_22.2a.txt
# grep -rnw "22.16" ./osd3_watch.txt > ./osd3_watch_pg_22.16.txt
# ls -al
total 32900
drwxr-xr-x  2 root root     4096 Sep 12 19:59 .
dr-xr-x---. 8 root root     4096 Sep 11 11:32 ..
-rw-r--r--  1 root root   170772 Sep 12 19:58 osd3_watch_pg_11.4.txt
-rw-r--r--  1 root root    39861 Sep 12 19:59 osd3_watch_pg_22.16.txt
-rw-r--r--  1 root root   127958 Sep 12 19:59 osd3_watch_pg_22.2a.txt
-rw-r--r--  1 root root    96501 Sep 12 19:59 osd3_watch_pg_22.2c.txt
-rw-r--r--  1 root root 33228328 Sep 11 14:19 osd3_watch.txt

Continuing from the previous article, we now analyze the actions PG 11.4 performs after osd0 is kicked out of the osdmap.

1. PG 11.4 receives the new osdmap

We first give the log of PG 11.4's recovery process during this stage:

40998:2020-09-11 14:10:18.963214 7fba49ec6700 30 osd.3 pg_epoch: 2225 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2223/2223/2223) [3] r=0 lpr=2223 crt=201'1 lcod 0'0 mlcod 0'0 active+undersized+degraded] lock
41206:2020-09-11 14:10:18.964245 7fba49ec6700 30 osd.3 pg_epoch: 2225 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2223/2223/2223) [3] r=0 lpr=2223 crt=201'1 lcod 0'0 mlcod 0'0 active+undersized+degraded] lock
41208:2020-09-11 14:10:18.964251 7fba49ec6700 10 osd.3 pg_epoch: 2225 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2223/2223/2223) [3] r=0 lpr=2223 crt=201'1 lcod 0'0 mlcod 0'0 active+undersized+degraded] null
41632:2020-09-11 14:10:18.965606 7fba49ec6700 20 osd.3 2226 11.4 heartbeat_peers 3
42218:2020-09-11 14:10:18.974032 7fba3d925700 30 osd.3 pg_epoch: 2225 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2223/2223/2223) [3] r=0 lpr=2223 crt=201'1 lcod 0'0 mlcod 0'0 active+undersized+degraded] lock
42220:2020-09-11 14:10:18.974041 7fba3d925700 10 osd.3 pg_epoch: 2225 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2223/2223/2223) [3] r=0 lpr=2223 crt=201'1 lcod 0'0 mlcod 0'0 active+undersized+degraded] handle_advance_map [3,2]/[3] -- 3/3
42222:2020-09-11 14:10:18.974048 7fba3d925700 10 osd.3 pg_epoch: 2226 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2223/2223/2223) [3] r=0 lpr=2223 crt=201'1 lcod 0'0 mlcod 0'0 active+undersized+degraded] state<Started/Primary/Active>: Active advmap
42223:2020-09-11 14:10:18.974052 7fba3d925700 10 osd.3 pg_epoch: 2226 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2223/2223/2223) [3] r=0 lpr=2223 crt=201'1 lcod 0'0 mlcod 0'0 active+undersized+degraded] state<Started>: Started advmap
42224:2020-09-11 14:10:18.974056 7fba3d925700 20 osd.3 pg_epoch: 2226 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2223/2223/2223) [3] r=0 lpr=2223 crt=201'1 lcod 0'0 mlcod 0'0 active+undersized+degraded] new interval newup [3,2] newacting [3]
42225:2020-09-11 14:10:18.974060 7fba3d925700 10 osd.3 pg_epoch: 2226 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2223/2223/2223) [3] r=0 lpr=2223 crt=201'1 lcod 0'0 mlcod 0'0 active+undersized+degraded] state<Started>: should_restart_peering, transitioning to Reset
42226:2020-09-11 14:10:18.974064 7fba3d925700  5 osd.3 pg_epoch: 2226 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2223/2223/2223) [3] r=0 lpr=2223 crt=201'1 lcod 0'0 mlcod 0'0 active+undersized+degraded] exit Started/Primary/Active/Clean 299.069829 4 0.000234
42227:2020-09-11 14:10:18.974070 7fba3d925700  5 osd.3 pg_epoch: 2226 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2223/2223/2223) [3] r=0 lpr=2223 crt=201'1 lcod 0'0 mlcod 0'0 active+undersized+degraded] exit Started/Primary/Active 299.106820 0 0.000000
42228:2020-09-11 14:10:18.974074 7fba3d925700 20 osd.3 pg_epoch: 2226 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2223/2223/2223) [3] r=0 lpr=2223 crt=201'1 lcod 0'0 mlcod 0'0 active] agent_stop
42229:2020-09-11 14:10:18.974078 7fba3d925700  5 osd.3 pg_epoch: 2226 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2223/2223/2223) [3] r=0 lpr=2223 crt=201'1 lcod 0'0 mlcod 0'0 active] exit Started/Primary 299.838624 0 0.000000
42230:2020-09-11 14:10:18.974082 7fba3d925700 10 osd.3 pg_epoch: 2226 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2223/2223/2223) [3] r=0 lpr=2223 crt=201'1 lcod 0'0 mlcod 0'0 active] clear_primary_state
42231:2020-09-11 14:10:18.974086 7fba3d925700 20 osd.3 pg_epoch: 2226 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2223/2223/2223) [3] r=0 lpr=2223 luod=0'0 crt=201'1 lcod 0'0 mlcod 0'0 active] agent_stop
42232:2020-09-11 14:10:18.974090 7fba3d925700  5 osd.3 pg_epoch: 2226 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2223/2223/2223) [3] r=0 lpr=2223 luod=0'0 crt=201'1 lcod 0'0 mlcod 0'0 active] exit Started 299.838686 0 0.000000
42233:2020-09-11 14:10:18.974095 7fba3d925700  5 osd.3 pg_epoch: 2226 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2223/2223/2223) [3] r=0 lpr=2223 luod=0'0 crt=201'1 lcod 0'0 mlcod 0'0 active] enter Reset
42234:2020-09-11 14:10:18.974099 7fba3d925700 20 osd.3 pg_epoch: 2226 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2223/2223/2223) [3] r=0 lpr=2223 luod=0'0 crt=201'1 lcod 0'0 mlcod 0'0 active] set_last_peering_reset 2226
42235:2020-09-11 14:10:18.974102 7fba3d925700 10 osd.3 pg_epoch: 2226 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2223/2223/2223) [3] r=0 lpr=2226 luod=0'0 crt=201'1 lcod 0'0 mlcod 0'0 active] Clearing blocked outgoing recovery messages
42236:2020-09-11 14:10:18.974106 7fba3d925700 10 osd.3 pg_epoch: 2226 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2223/2223/2223) [3] r=0 lpr=2226 luod=0'0 crt=201'1 lcod 0'0 mlcod 0'0 active] Not blocking outgoing recovery messages
42237:2020-09-11 14:10:18.974110 7fba3d925700 10 osd.3 pg_epoch: 2226 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2223/2223/2223) [3] r=0 lpr=2226 luod=0'0 crt=201'1 lcod 0'0 mlcod 0'0 active] state<Reset>: Reset advmap
42238:2020-09-11 14:10:18.974114 7fba3d925700 10 osd.3 pg_epoch: 2226 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2223/2223/2223) [3] r=0 lpr=2226 luod=0'0 crt=201'1 lcod 0'0 mlcod 0'0 active] _calc_past_interval_range start epoch 2224 >= end epoch 2223, nothing to do
42239:2020-09-11 14:10:18.974118 7fba3d925700 20 osd.3 pg_epoch: 2226 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2223/2223/2223) [3] r=0 lpr=2226 luod=0'0 crt=201'1 lcod 0'0 mlcod 0'0 active] new interval newup [3,2] newacting [3]
42240:2020-09-11 14:10:18.974122 7fba3d925700 10 osd.3 pg_epoch: 2226 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2223/2223/2223) [3] r=0 lpr=2226 luod=0'0 crt=201'1 lcod 0'0 mlcod 0'0 active] state<Reset>: should restart peering, calling start_peering_interval again
42241:2020-09-11 14:10:18.974125 7fba3d925700 20 osd.3 pg_epoch: 2226 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2223/2223/2223) [3] r=0 lpr=2226 luod=0'0 crt=201'1 lcod 0'0 mlcod 0'0 active] set_last_peering_reset 2226
42242:2020-09-11 14:10:18.974135 7fba3d925700 10 osd.3 pg_epoch: 2226 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2223/2223/2223) [3,2]/[3] r=0 lpr=2226 pi=2223-2225/1 luod=0'0 crt=201'1 lcod 0'0 mlcod 0'0 active+remapped] start_peering_interval: check_new_interval output: generate_past_intervals interval(2223-2225 up [3](3) acting [3](3)): not rw, up_thru 2223 up_from 2123 last_epoch_clean 2224
42245:2020-09-11 14:10:18.974140 7fba3d925700 10 osd.3 pg_epoch: 2226 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2223/2223/2223) [3,2]/[3] r=0 lpr=2226 pi=2223-2225/1 luod=0'0 crt=201'1 lcod 0'0 mlcod 0'0 active+remapped]  noting past interval(2223-2225 up [3](3) acting [3](3) maybe_went_rw)
42246:2020-09-11 14:10:18.974148 7fba3d925700 10 osd.3 pg_epoch: 2226 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223) [3,2]/[3] r=0 lpr=2226 pi=2223-2225/1 luod=0'0 crt=201'1 lcod 0'0 mlcod 0'0 active+remapped]  up [3] -> [3,2], acting [3] -> [3], acting_primary 3 -> 3, up_primary 3 -> 3, role 0 -> 0, features acting 576460752032874495 upacting 576460752032874495
42247:2020-09-11 14:10:18.974155 7fba3d925700 10 osd.3 pg_epoch: 2226 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223) [3,2]/[3] r=0 lpr=2226 pi=2223-2225/1 crt=201'1 lcod 0'0 mlcod 0'0 remapped] clear_primary_state
42248:2020-09-11 14:10:18.974161 7fba3d925700 20 osd.3 pg_epoch: 2226 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223) [3,2]/[3] r=0 lpr=2226 pi=2223-2225/1 crt=201'1 lcod 0'0 mlcod 0'0 remapped] agent_stop
42249:2020-09-11 14:10:18.974167 7fba3d925700 10 osd.3 pg_epoch: 2226 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223) [3,2]/[3] r=0 lpr=2226 pi=2223-2225/1 crt=201'1 lcod 0'0 mlcod 0'0 remapped] on_change
42250:2020-09-11 14:10:18.974171 7fba3d925700 15 osd.3 pg_epoch: 2226 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223) [3,2]/[3] r=0 lpr=2226 pi=2223-2225/1 crt=201'1 lcod 0'0 mlcod 0'0 remapped]  requeue_ops 
42251:2020-09-11 14:10:18.974176 7fba3d925700 15 osd.3 pg_epoch: 2226 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223) [3,2]/[3] r=0 lpr=2226 pi=2223-2225/1 crt=201'1 lcod 0'0 mlcod 0'0 remapped] publish_stats_to_osd 2226:1334
42252:2020-09-11 14:10:18.974179 7fba3d925700 15 osd.3 pg_epoch: 2226 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223) [3,2]/[3] r=0 lpr=2226 pi=2223-2225/1 crt=201'1 lcod 0'0 mlcod 0'0 remapped]  requeue_ops 
42253:2020-09-11 14:10:18.974183 7fba3d925700 10 osd.3 pg_epoch: 2226 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223) [3,2]/[3] r=0 lpr=2226 pi=2223-2225/1 crt=201'1 lcod 0'0 mlcod 0'0 remapped] cancel_copy_ops
42254:2020-09-11 14:10:18.974187 7fba3d925700 10 osd.3 pg_epoch: 2226 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223) [3,2]/[3] r=0 lpr=2226 pi=2223-2225/1 crt=201'1 lcod 0'0 mlcod 0'0 remapped] cancel_flush_ops
42255:2020-09-11 14:10:18.974190 7fba3d925700 10 osd.3 pg_epoch: 2226 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223) [3,2]/[3] r=0 lpr=2226 pi=2223-2225/1 crt=201'1 lcod 0'0 mlcod 0'0 remapped] cancel_proxy_ops
42256:2020-09-11 14:10:18.974194 7fba3d925700 15 osd.3 pg_epoch: 2226 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223) [3,2]/[3] r=0 lpr=2226 pi=2223-2225/1 crt=201'1 lcod 0'0 mlcod 0'0 remapped]  requeue_ops 
42257:2020-09-11 14:10:18.974197 7fba3d925700 15 osd.3 pg_epoch: 2226 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223) [3,2]/[3] r=0 lpr=2226 pi=2223-2225/1 crt=201'1 lcod 0'0 mlcod 0'0 remapped]  requeue_ops 
42258:2020-09-11 14:10:18.974201 7fba3d925700 15 osd.3 pg_epoch: 2226 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223) [3,2]/[3] r=0 lpr=2226 pi=2223-2225/1 crt=201'1 lcod 0'0 mlcod 0'0 remapped]  requeue_ops 
42259:2020-09-11 14:10:18.974205 7fba3d925700 10 osd.3 pg_epoch: 2226 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223) [3,2]/[3] r=0 lpr=2226 pi=2223-2225/1 crt=201'1 lcod 0'0 mlcod 0'0 remapped] on_change_cleanup
42260:2020-09-11 14:10:18.974219 7fba3d925700 10 osd.3 pg_epoch: 2226 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223) [3,2]/[3] r=0 lpr=2226 pi=2223-2225/1 crt=201'1 lcod 0'0 mlcod 0'0 remapped] on_change
42261:2020-09-11 14:10:18.974223 7fba3d925700 20 osd.3 pg_epoch: 2226 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223) [3,2]/[3] r=0 lpr=2226 pi=2223-2225/1 crt=201'1 lcod 0'0 mlcod 0'0 remapped] exit NotTrimming
42262:2020-09-11 14:10:18.974227 7fba3d925700 20 osd.3 pg_epoch: 2226 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223) [3,2]/[3] r=0 lpr=2226 pi=2223-2225/1 crt=201'1 lcod 0'0 mlcod 0'0 remapped] enter NotTrimming
42263:2020-09-11 14:10:18.974231 7fba3d925700 10 osd.3 pg_epoch: 2226 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223) [3,2]/[3] r=0 lpr=2226 pi=2223-2225/1 crt=201'1 lcod 0'0 mlcod 0'0 remapped] [3] -> [3], replicas changed
42264:2020-09-11 14:10:18.974235 7fba3d925700 10 osd.3 pg_epoch: 2226 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223) [3,2]/[3] r=0 lpr=2226 pi=2223-2225/1 crt=201'1 lcod 0'0 mlcod 0'0 remapped] cancel_recovery
42265:2020-09-11 14:10:18.974238 7fba3d925700 10 osd.3 pg_epoch: 2226 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223) [3,2]/[3] r=0 lpr=2226 pi=2223-2225/1 crt=201'1 lcod 0'0 mlcod 0'0 remapped] clear_recovery_state
42266:2020-09-11 14:10:18.974242 7fba3d925700 10 osd.3 pg_epoch: 2226 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223) [3,2]/[3] r=0 lpr=2226 pi=2223-2225/1 crt=201'1 lcod 0'0 mlcod 0'0 remapped] check_recovery_sources no source osds () went down
42267:2020-09-11 14:10:18.974247 7fba3d925700 10 osd.3 pg_epoch: 2226 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223) [3,2]/[3] r=0 lpr=2226 pi=2223-2225/1 crt=201'1 lcod 0'0 mlcod 0'0 remapped] handle_activate_map 
42268:2020-09-11 14:10:18.974251 7fba3d925700 10 osd.3 pg_epoch: 2226 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223) [3,2]/[3] r=0 lpr=2226 pi=2223-2225/1 crt=201'1 lcod 0'0 mlcod 0'0 remapped] update_heartbeat_peers 3 -> 2,3
42270:2020-09-11 14:10:18.974256 7fba3d925700 10 osd.3 pg_epoch: 2226 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223) [3,2]/[3] r=0 lpr=2226 pi=2223-2225/1 crt=201'1 lcod 0'0 mlcod 0'0 remapped] take_waiters
42271:2020-09-11 14:10:18.974260 7fba3d925700  5 osd.3 pg_epoch: 2226 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223) [3,2]/[3] r=0 lpr=2226 pi=2223-2225/1 crt=201'1 lcod 0'0 mlcod 0'0 remapped] exit Reset 0.000165 1 0.000199
42272:2020-09-11 14:10:18.974264 7fba3d925700  5 osd.3 pg_epoch: 2226 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223) [3,2]/[3] r=0 lpr=2226 pi=2223-2225/1 crt=201'1 lcod 0'0 mlcod 0'0 remapped] enter Started
42273:2020-09-11 14:10:18.974268 7fba3d925700  5 osd.3 pg_epoch: 2226 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223) [3,2]/[3] r=0 lpr=2226 pi=2223-2225/1 crt=201'1 lcod 0'0 mlcod 0'0 remapped] enter Start
42274:2020-09-11 14:10:18.974272 7fba3d925700  1 osd.3 pg_epoch: 2226 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223) [3,2]/[3] r=0 lpr=2226 pi=2223-2225/1 crt=201'1 lcod 0'0 mlcod 0'0 remapped] state<Start>: transitioning to Primary
42275:2020-09-11 14:10:18.974276 7fba3d925700  5 osd.3 pg_epoch: 2226 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223) [3,2]/[3] r=0 lpr=2226 pi=2223-2225/1 crt=201'1 lcod 0'0 mlcod 0'0 remapped] exit Start 0.000007 0 0.000000
42276:2020-09-11 14:10:18.974280 7fba3d925700  5 osd.3 pg_epoch: 2226 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223) [3,2]/[3] r=0 lpr=2226 pi=2223-2225/1 crt=201'1 lcod 0'0 mlcod 0'0 remapped] enter Started/Primary
42277:2020-09-11 14:10:18.974284 7fba3d925700  5 osd.3 pg_epoch: 2226 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223) [3,2]/[3] r=0 lpr=2226 pi=2223-2225/1 crt=201'1 lcod 0'0 mlcod 0'0 remapped] enter Started/Primary/Peering
42278:2020-09-11 14:10:18.974287 7fba3d925700  5 osd.3 pg_epoch: 2226 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223) [3,2]/[3] r=0 lpr=2226 pi=2223-2225/1 crt=201'1 lcod 0'0 mlcod 0'0 remapped+peering] enter Started/Primary/Peering/GetInfo
42279:2020-09-11 14:10:18.974292 7fba3d925700 10 osd.3 pg_epoch: 2226 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223) [3,2]/[3] r=0 lpr=2226 pi=2223-2225/1 crt=201'1 lcod 0'0 mlcod 0'0 remapped+peering] _calc_past_interval_range: already have past intervals back to 2224
42280:2020-09-11 14:10:18.974297 7fba3d925700 10 osd.3 pg_epoch: 2226 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223) [3,2]/[3] r=0 lpr=2226 pi=2223-2225/1 crt=201'1 lcod 0'0 mlcod 0'0 remapped+peering]  PriorSet: build_prior interval(2223-2225 up [3](3) acting [3](3) maybe_went_rw)
42281:2020-09-11 14:10:18.974301 7fba3d925700 10 osd.3 pg_epoch: 2226 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223) [3,2]/[3] r=0 lpr=2226 pi=2223-2225/1 crt=201'1 lcod 0'0 mlcod 0'0 remapped+peering]  PriorSet: build_prior final: probe 2,3 down  blocked_by {}
42282:2020-09-11 14:10:18.974305 7fba3d925700 10 osd.3 pg_epoch: 2226 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223) [3,2]/[3] r=0 lpr=2226 pi=2223-2225/1 crt=201'1 lcod 0'0 mlcod 0'0 remapped+peering] up_thru 2223 < same_since 2226, must notify monitor
42283:2020-09-11 14:10:18.974310 7fba3d925700 10 osd.3 pg_epoch: 2226 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223) [3,2]/[3] r=0 lpr=2226 pi=2223-2225/1 crt=201'1 lcod 0'0 mlcod 0'0 remapped+peering] state<Started/Primary/Peering/GetInfo>:  querying info from osd.2
42284:2020-09-11 14:10:18.974318 7fba3d925700 15 osd.3 pg_epoch: 2226 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223) [3,2]/[3] r=0 lpr=2226 pi=2223-2225/1 crt=201'1 lcod 0'0 mlcod 0'0 remapped+peering] publish_stats_to_osd 2226:1335
42285:2020-09-11 14:10:18.974322 7fba3d925700 20 osd.3 pg_epoch: 2226 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223) [3,2]/[3] r=0 lpr=2226 pi=2223-2225/1 crt=201'1 lcod 0'0 mlcod 0'0 remapped+peering] handle_activate_map: Not dirtying info: last_persisted is 2224 while current is 2226
42286:2020-09-11 14:10:18.974326 7fba3d925700 10 osd.3 pg_epoch: 2226 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223) [3,2]/[3] r=0 lpr=2226 pi=2223-2225/1 crt=201'1 lcod 0'0 mlcod 0'0 remapped+peering] handle_peering_event: epoch_sent: 2226 epoch_requested: 2226 NullEvt
44149:2020-09-11 14:10:18.981601 7fba45134700 20 osd.3 2226 _dispatch 0x7fba6de6d0e0 pg_notify(11.4 epoch 2226) v5
44160:2020-09-11 14:10:18.981660 7fba45134700 30 osd.3 pg_epoch: 2226 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223) [3,2]/[3] r=0 lpr=2226 pi=2223-2225/1 crt=201'1 lcod 0'0 mlcod 0'0 remapped+peering] lock
44968:2020-09-11 14:10:18.984778 7fba3d124700 30 osd.3 pg_epoch: 2226 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223) [3,2]/[3] r=0 lpr=2226 pi=2223-2225/1 crt=201'1 lcod 0'0 mlcod 0'0 remapped+peering] lock
44969:2020-09-11 14:10:18.984783 7fba3d124700 10 osd.3 pg_epoch: 2226 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223) [3,2]/[3] r=0 lpr=2226 pi=2223-2225/1 crt=201'1 lcod 0'0 mlcod 0'0 remapped+peering] handle_peering_event: epoch_sent: 2226 epoch_requested: 2226 MNotifyRec from 2 notify: (query_epoch:2226, epoch_sent:2226, info:11.4( DNE empty local-les=0 n=0 ec=0 les/c/f 0/0/0 0/0/0)) features: 0x7ffffffefdfbfff
44970:2020-09-11 14:10:18.984789 7fba3d124700 10 osd.3 pg_epoch: 2226 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223) [3,2]/[3] r=0 lpr=2226 pi=2223-2225/1 crt=201'1 lcod 0'0 mlcod 0'0 remapped+peering]  got osd.2 11.4( DNE empty local-les=0 n=0 ec=0 les/c/f 0/0/0 0/0/0)
44971:2020-09-11 14:10:18.984795 7fba3d124700 10 osd.3 pg_epoch: 2226 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223) [3,2]/[3] r=0 lpr=2226 pi=2223-2225/1 crt=201'1 lcod 0'0 mlcod 0'0 remapped+peering] update_heartbeat_peers 2,3 unchanged
44972:2020-09-11 14:10:18.984799 7fba3d124700 20 osd.3 pg_epoch: 2226 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223) [3,2]/[3] r=0 lpr=2226 pi=2223-2225/1 crt=201'1 lcod 0'0 mlcod 0'0 remapped+peering] state<Started/Primary/Peering/GetInfo>: Adding osd: 2 peer features: 7ffffffefdfbfff
44973:2020-09-11 14:10:18.984803 7fba3d124700 20 osd.3 pg_epoch: 2226 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223) [3,2]/[3] r=0 lpr=2226 pi=2223-2225/1 crt=201'1 lcod 0'0 mlcod 0'0 remapped+peering] state<Started/Primary/Peering/GetInfo>: Common peer features: 7ffffffefdfbfff
44974:2020-09-11 14:10:18.984817 7fba3d124700 20 osd.3 pg_epoch: 2226 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223) [3,2]/[3] r=0 lpr=2226 pi=2223-2225/1 crt=201'1 lcod 0'0 mlcod 0'0 remapped+peering] state<Started/Primary/Peering/GetInfo>: Common acting features: 7ffffffefdfbfff
44975:2020-09-11 14:10:18.984820 7fba3d124700 20 osd.3 pg_epoch: 2226 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223) [3,2]/[3] r=0 lpr=2226 pi=2223-2225/1 crt=201'1 lcod 0'0 mlcod 0'0 remapped+peering] state<Started/Primary/Peering/GetInfo>: Common upacting features: 7ffffffefdfbfff
44976:2020-09-11 14:10:18.984825 7fba3d124700  5 osd.3 pg_epoch: 2226 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223) [3,2]/[3] r=0 lpr=2226 pi=2223-2225/1 crt=201'1 lcod 0'0 mlcod 0'0 remapped+peering] exit Started/Primary/Peering/GetInfo 0.010536 2 0.000071
44977:2020-09-11 14:10:18.984830 7fba3d124700  5 osd.3 pg_epoch: 2226 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223) [3,2]/[3] r=0 lpr=2226 pi=2223-2225/1 crt=201'1 lcod 0'0 mlcod 0'0 remapped+peering] enter Started/Primary/Peering/GetLog
44978:2020-09-11 14:10:18.984834 7fba3d124700 10 osd.3 pg_epoch: 2226 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223) [3,2]/[3] r=0 lpr=2226 pi=2223-2225/1 crt=201'1 lcod 0'0 mlcod 0'0 remapped+peering] calc_acting osd.2 11.4( DNE empty local-les=0 n=0 ec=0 les/c/f 0/0/0 0/0/0)
44979:2020-09-11 14:10:18.984839 7fba3d124700 10 osd.3 pg_epoch: 2226 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223) [3,2]/[3] r=0 lpr=2226 pi=2223-2225/1 crt=201'1 lcod 0'0 mlcod 0'0 remapped+peering] calc_acting osd.3 11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223)
44980:2020-09-11 14:10:18.984847 7fba3d124700 10 osd.3 pg_epoch: 2226 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223) [3,2]/[3] r=0 lpr=2226 pi=2223-2225/1 crt=201'1 lcod 0'0 mlcod 0'0 remapped+peering] calc_acting newest update on osd.3 with 11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223)
44982:calc_acting primary is osd.3 with 11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223)
44983: osd.2 (up) accepted 11.4( DNE empty local-les=0 n=0 ec=0 les/c/f 0/0/0 0/0/0)
44985:2020-09-11 14:10:18.984851 7fba3d124700 10 osd.3 pg_epoch: 2226 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223) [3,2]/[3] r=0 lpr=2226 pi=2223-2225/1 crt=201'1 lcod 0'0 mlcod 0'0 remapped+peering] choose_acting want [3,2] != acting [3], requesting pg_temp change
44986:2020-09-11 14:10:18.984857 7fba3d124700  5 osd.3 pg_epoch: 2226 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223) [3,2]/[3] r=0 lpr=2226 pi=2223-2225/1 crt=201'1 lcod 0'0 mlcod 0'0 remapped+peering] exit Started/Primary/Peering/GetLog 0.000026 0 0.000000
44987:2020-09-11 14:10:18.984861 7fba3d124700 15 osd.3 pg_epoch: 2226 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223) [3,2]/[3] r=0 lpr=2226 pi=2223-2225/1 crt=201'1 lcod 0'0 mlcod 0'0 remapped+peering] publish_stats_to_osd 2226:1336
44988:2020-09-11 14:10:18.984865 7fba3d124700 10 osd.3 pg_epoch: 2226 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223) [3,2]/[3] r=0 lpr=2226 pi=2223-2225/1 crt=201'1 lcod 0'0 mlcod 0'0 remapped+peering] state<Started/Primary/Peering>: Leaving Peering
44989:2020-09-11 14:10:18.984869 7fba3d124700  5 osd.3 pg_epoch: 2226 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223) [3,2]/[3] r=0 lpr=2226 pi=2223-2225/1 crt=201'1 lcod 0'0 mlcod 0'0 remapped+peering] exit Started/Primary/Peering 0.010585 0 0.000000
44990:2020-09-11 14:10:18.984874 7fba3d124700  5 osd.3 pg_epoch: 2226 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223) [3,2]/[3] r=0 lpr=2226 pi=2223-2225/1 crt=201'1 lcod 0'0 mlcod 0'0 remapped] exit Started/Primary 0.010593 0 0.000000
44991:2020-09-11 14:10:18.984878 7fba3d124700 10 osd.3 pg_epoch: 2226 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223) [3,2]/[3] r=0 lpr=2226 pi=2223-2225/1 crt=201'1 lcod 0'0 mlcod 0'0 remapped] clear_primary_state
44992:2020-09-11 14:10:18.984883 7fba3d124700 20 osd.3 pg_epoch: 2226 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223) [3,2]/[3] r=0 lpr=2226 pi=2223-2225/1 crt=201'1 lcod 0'0 mlcod 0'0 remapped] agent_stop
44993:2020-09-11 14:10:18.984887 7fba3d124700  5 osd.3 pg_epoch: 2226 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223) [3,2]/[3] r=0 lpr=2226 pi=2223-2225/1 crt=201'1 lcod 0'0 mlcod 0'0 remapped] enter Started/Primary
44994:2020-09-11 14:10:18.984891 7fba3d124700  5 osd.3 pg_epoch: 2226 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223) [3,2]/[3] r=0 lpr=2226 pi=2223-2225/1 crt=201'1 lcod 0'0 mlcod 0'0 remapped] enter Started/Primary/Peering/WaitActingChange
45166:2020-09-11 14:10:18.985528 7fba3d124700 10 osd.3 2226 send_pg_temp {11.4=[],11.6=[],19.1=[],22.2c=[],22.44=[],22.a4=[],22.b5=[],22.ca=[],22.d0=[],22.ec=[],23.d=[],23.13=[],23.30=[],23.6d=[]}
46089:2020-09-11 14:10:19.631436 7fba4fd58700 20 osd.3 2226 11.4 heartbeat_peers 2,3
46393:2020-09-11 14:10:20.010087 7fba49ec6700 30 osd.3 pg_epoch: 2226 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223) [3,2]/[3] r=0 lpr=2226 pi=2223-2225/1 crt=201'1 lcod 0'0 mlcod 0'0 remapped] lock
46688:2020-09-11 14:10:20.011975 7fba49ec6700 30 osd.3 pg_epoch: 2226 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223) [3,2]/[3] r=0 lpr=2226 pi=2223-2225/1 crt=201'1 lcod 0'0 mlcod 0'0 remapped] lock
46691:2020-09-11 14:10:20.011990 7fba49ec6700 10 osd.3 pg_epoch: 2226 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223) [3,2]/[3] r=0 lpr=2226 pi=2223-2225/1 crt=201'1 lcod 0'0 mlcod 0'0 remapped] null
47206:2020-09-11 14:10:20.014765 7fba3d925700 30 osd.3 pg_epoch: 2226 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223) [3,2]/[3] r=0 lpr=2226 pi=2223-2225/1 crt=201'1 lcod 0'0 mlcod 0'0 remapped] lock
47209:2020-09-11 14:10:20.014782 7fba3d925700 10 osd.3 pg_epoch: 2226 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223) [3,2]/[3] r=0 lpr=2226 pi=2223-2225/1 crt=201'1 lcod 0'0 mlcod 0'0 remapped] handle_advance_map [3,2]/[3,2] -- 3/3
47212:2020-09-11 14:10:20.014796 7fba3d925700 10 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223) [3,2]/[3] r=0 lpr=2226 pi=2223-2225/1 crt=201'1 lcod 0'0 mlcod 0'0 remapped] state<Started/Primary/Peering/WaitActingChange>: verifying no want_acting [] targets didn't go down
47279:2020-09-11 14:10:20.014814 7fba3d925700 10 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223) [3,2]/[3] r=0 lpr=2226 pi=2223-2225/1 crt=201'1 lcod 0'0 mlcod 0'0 remapped] state<Started>: Started advmap
47281:2020-09-11 14:10:20.015156 7fba3d925700 20 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223) [3,2]/[3] r=0 lpr=2226 pi=2223-2225/1 crt=201'1 lcod 0'0 mlcod 0'0 remapped] new interval newup [3,2] newacting [3,2]
47283:2020-09-11 14:10:20.015174 7fba3d925700 10 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223) [3,2]/[3] r=0 lpr=2226 pi=2223-2225/1 crt=201'1 lcod 0'0 mlcod 0'0 remapped] state<Started>: should_restart_peering, transitioning to Reset
47285:2020-09-11 14:10:20.015184 7fba3d925700  5 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223) [3,2]/[3] r=0 lpr=2226 pi=2223-2225/1 crt=201'1 lcod 0'0 mlcod 0'0 remapped] exit Started/Primary/Peering/WaitActingChange 1.030293 1 0.000106
47288:2020-09-11 14:10:20.015195 7fba3d925700  5 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223) [3,2]/[3] r=0 lpr=2226 pi=2223-2225/1 crt=201'1 lcod 0'0 mlcod 0'0 remapped] exit Started/Primary 1.030307 0 0.000000
47290:2020-09-11 14:10:20.015205 7fba3d925700 10 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223) [3,2]/[3] r=0 lpr=2226 pi=2223-2225/1 crt=201'1 lcod 0'0 mlcod 0'0 remapped] clear_primary_state
47292:2020-09-11 14:10:20.015215 7fba3d925700 20 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223) [3,2]/[3] r=0 lpr=2226 pi=2223-2225/1 crt=201'1 lcod 0'0 mlcod 0'0 remapped] agent_stop
47294:2020-09-11 14:10:20.015227 7fba3d925700  5 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223) [3,2]/[3] r=0 lpr=2226 pi=2223-2225/1 crt=201'1 lcod 0'0 mlcod 0'0 remapped] exit Started 1.040962 0 0.000000
47295:2020-09-11 14:10:20.015238 7fba3d925700  5 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223) [3,2]/[3] r=0 lpr=2226 pi=2223-2225/1 crt=201'1 lcod 0'0 mlcod 0'0 remapped] enter Reset
47296:2020-09-11 14:10:20.015245 7fba3d925700 20 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223) [3,2]/[3] r=0 lpr=2226 pi=2223-2225/1 crt=201'1 lcod 0'0 mlcod 0'0 remapped] set_last_peering_reset 2227
47299:2020-09-11 14:10:20.015254 7fba3d925700 10 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223) [3,2]/[3] r=0 lpr=2227 pi=2223-2225/1 crt=201'1 lcod 0'0 mlcod 0'0 remapped] Clearing blocked outgoing recovery messages
47301:2020-09-11 14:10:20.015266 7fba3d925700 10 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223) [3,2]/[3] r=0 lpr=2227 pi=2223-2225/1 crt=201'1 lcod 0'0 mlcod 0'0 remapped] Not blocking outgoing recovery messages
47303:2020-09-11 14:10:20.015280 7fba3d925700 10 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223) [3,2]/[3] r=0 lpr=2227 pi=2223-2225/1 crt=201'1 lcod 0'0 mlcod 0'0 remapped] state<Reset>: Reset advmap
47394:2020-09-11 14:10:20.015296 7fba3d925700 10 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223) [3,2]/[3] r=0 lpr=2227 pi=2223-2225/1 crt=201'1 lcod 0'0 mlcod 0'0 remapped] _calc_past_interval_range: already have past intervals back to 2224
47396:2020-09-11 14:10:20.015871 7fba3d925700 20 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223) [3,2]/[3] r=0 lpr=2227 pi=2223-2225/1 crt=201'1 lcod 0'0 mlcod 0'0 remapped] new interval newup [3,2] newacting [3,2]
47399:2020-09-11 14:10:20.015889 7fba3d925700 10 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223) [3,2]/[3] r=0 lpr=2227 pi=2223-2225/1 crt=201'1 lcod 0'0 mlcod 0'0 remapped] state<Reset>: should restart peering, calling start_peering_interval again
47400:2020-09-11 14:10:20.015898 7fba3d925700 20 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223) [3,2]/[3] r=0 lpr=2227 pi=2223-2225/1 crt=201'1 lcod 0'0 mlcod 0'0 remapped] set_last_peering_reset 2227
47403:2020-09-11 14:10:20.015920 7fba3d925700 10 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 inactive] start_peering_interval: check_new_interval output: generate_past_intervals interval(2226-2226 up [3,2](3) acting [3](3)): not rw, up_thru 2223 up_from 2123 last_epoch_clean 2224
47409:2020-09-11 14:10:20.015946 7fba3d925700 10 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 inactive]  noting past interval(2226-2226 up [3,2](3) acting [3](3))
47411:2020-09-11 14:10:20.015960 7fba3d925700 10 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 inactive]  up [3,2] -> [3,2], acting [3] -> [3,2], acting_primary 3 -> 3, up_primary 3 -> 3, role 0 -> 0, features acting 576460752032874495 upacting 576460752032874495
47413:2020-09-11 14:10:20.015972 7fba3d925700 10 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 inactive] clear_primary_state
47414:2020-09-11 14:10:20.015983 7fba3d925700 20 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 inactive] agent_stop
47415:2020-09-11 14:10:20.015991 7fba3d925700 10 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 inactive] on_change
47417:2020-09-11 14:10:20.015998 7fba3d925700 15 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 inactive]  requeue_ops 
47419:2020-09-11 14:10:20.016009 7fba3d925700 15 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 inactive] publish_stats_to_osd 2227:1337
47421:2020-09-11 14:10:20.016019 7fba3d925700 15 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 inactive]  requeue_ops 
47422:2020-09-11 14:10:20.016027 7fba3d925700 10 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 inactive] cancel_copy_ops
47424:2020-09-11 14:10:20.016034 7fba3d925700 10 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 inactive] cancel_flush_ops
47426:2020-09-11 14:10:20.016042 7fba3d925700 10 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 inactive] cancel_proxy_ops
47428:2020-09-11 14:10:20.016051 7fba3d925700 15 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 inactive]  requeue_ops 
47430:2020-09-11 14:10:20.016060 7fba3d925700 15 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 inactive]  requeue_ops 
47432:2020-09-11 14:10:20.016070 7fba3d925700 15 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 inactive]  requeue_ops 
47435:2020-09-11 14:10:20.016080 7fba3d925700 10 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 inactive] on_change_cleanup
47437:2020-09-11 14:10:20.016090 7fba3d925700 10 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 inactive] on_change
47440:2020-09-11 14:10:20.016101 7fba3d925700 20 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 inactive] exit NotTrimming
47442:2020-09-11 14:10:20.016112 7fba3d925700 20 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 inactive] enter NotTrimming
47444:2020-09-11 14:10:20.016122 7fba3d925700 10 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 inactive] [3] -> [3,2], replicas changed
47447:2020-09-11 14:10:20.016131 7fba3d925700 10 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 inactive] cancel_recovery
47449:2020-09-11 14:10:20.016140 7fba3d925700 10 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 inactive] clear_recovery_state
47451:2020-09-11 14:10:20.016150 7fba3d925700 10 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 inactive] check_recovery_sources no source osds () went down
47454:2020-09-11 14:10:20.016168 7fba3d925700 10 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 inactive] handle_activate_map 
47457:2020-09-11 14:10:20.016178 7fba3d925700 10 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 inactive] update_heartbeat_peers 2,3 unchanged
47458:2020-09-11 14:10:20.016188 7fba3d925700 10 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 inactive] take_waiters
47461:2020-09-11 14:10:20.016198 7fba3d925700  5 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 inactive] exit Reset 0.000960 1 0.001368
47463:2020-09-11 14:10:20.016211 7fba3d925700  5 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 inactive] enter Started
47465:2020-09-11 14:10:20.016221 7fba3d925700  5 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 inactive] enter Start
47467:2020-09-11 14:10:20.016231 7fba3d925700  1 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 inactive] state<Start>: transitioning to Primary
47469:2020-09-11 14:10:20.016242 7fba3d925700  5 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 inactive] exit Start 0.000020 0 0.000000
47472:2020-09-11 14:10:20.016253 7fba3d925700  5 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 inactive] enter Started/Primary
47474:2020-09-11 14:10:20.016262 7fba3d925700  5 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 inactive] enter Started/Primary/Peering
47475:2020-09-11 14:10:20.016271 7fba3d925700  5 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 peering] enter Started/Primary/Peering/GetInfo
47478:2020-09-11 14:10:20.016281 7fba3d925700 10 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 peering] _calc_past_interval_range: already have past intervals back to 2224
47481:2020-09-11 14:10:20.016292 7fba3d925700 10 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 peering]  PriorSet: build_prior interval(2226-2226 up [3,2](3) acting [3](3))
47482:2020-09-11 14:10:20.016301 7fba3d925700 10 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 peering]  PriorSet: build_prior interval(2223-2225 up [3](3) acting [3](3) maybe_went_rw)
47483:2020-09-11 14:10:20.016310 7fba3d925700 10 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 peering]  PriorSet: build_prior final: probe 2,3 down  blocked_by {}
47484:2020-09-11 14:10:20.016318 7fba3d925700 10 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 peering] up_thru 2226 < same_since 2227, must notify monitor
47485:2020-09-11 14:10:20.016327 7fba3d925700 10 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 peering] state<Started/Primary/Peering/GetInfo>:  querying info from osd.2
47487:2020-09-11 14:10:20.016337 7fba3d925700 15 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 peering] publish_stats_to_osd 2227:1338
47489:2020-09-11 14:10:20.016346 7fba3d925700 20 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 peering] handle_activate_map: Not dirtying info: last_persisted is 2226 while current is 2227
47492:2020-09-11 14:10:20.016356 7fba3d925700 10 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 peering] handle_peering_event: epoch_sent: 2227 epoch_requested: 2227 NullEvt
50548:2020-09-11 14:10:20.028700 7fba3d124700 20   ? 1956'80 (1956'79) modify   23:1ce16903:::obj-9bjaNA0lG2ZpqCJ:head by client.291879.0:1164 2020-05-30 10:20:25.595078
50723:2020-09-11 14:10:20.029810 7fba3d124700 20 update missing, append 1956'80 (1956'79) modify   23:1ce16903:::obj-9bjaNA0lG2ZpqCJ:head by client.291879.0:1164 2020-05-30 10:20:25.595078
50826:2020-09-11 14:10:20.031026 7fba45134700 20 osd.3 2227 _dispatch 0x7fba6d7401e0 pg_notify(11.4 epoch 2227) v5
50829:2020-09-11 14:10:20.031044 7fba45134700 30 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 peering] lock
50838:2020-09-11 14:10:20.031074 7fba3d124700 30 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 peering] lock
50841:2020-09-11 14:10:20.031095 7fba3d124700 10 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 peering] handle_peering_event: epoch_sent: 2227 epoch_requested: 2227 MNotifyRec from 2 notify: (query_epoch:2227, epoch_sent:2227, info:11.4( DNE empty local-les=0 n=0 ec=0 les/c/f 0/0/0 0/0/0)) features: 0x7ffffffefdfbfff
50846:2020-09-11 14:10:20.031104 7fba3d124700 10 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 peering]  got osd.2 11.4( DNE empty local-les=0 n=0 ec=0 les/c/f 0/0/0 0/0/0)
50848:2020-09-11 14:10:20.031115 7fba3d124700 10 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 peering] update_heartbeat_peers 2,3 unchanged
50850:2020-09-11 14:10:20.031122 7fba3d124700 20 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 peering] state<Started/Primary/Peering/GetInfo>: Adding osd: 2 peer features: 7ffffffefdfbfff
50851:2020-09-11 14:10:20.031129 7fba3d124700 20 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 peering] state<Started/Primary/Peering/GetInfo>: Common peer features: 7ffffffefdfbfff
50854:2020-09-11 14:10:20.031135 7fba3d124700 20 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 peering] state<Started/Primary/Peering/GetInfo>: Common acting features: 7ffffffefdfbfff
50857:2020-09-11 14:10:20.031140 7fba3d124700 20 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 peering] state<Started/Primary/Peering/GetInfo>: Common upacting features: 7ffffffefdfbfff
50859:2020-09-11 14:10:20.031147 7fba3d124700  5 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 peering] exit Started/Primary/Peering/GetInfo 0.014876 2 0.000170
50862:2020-09-11 14:10:20.031154 7fba3d124700  5 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 peering] enter Started/Primary/Peering/GetLog
50863:2020-09-11 14:10:20.031161 7fba3d124700 10 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 peering] calc_acting osd.2 11.4( DNE empty local-les=0 n=0 ec=0 les/c/f 0/0/0 0/0/0)
50865:2020-09-11 14:10:20.031168 7fba3d124700 10 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 peering] calc_acting osd.3 11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223)
50873:2020-09-11 14:10:20.031192 7fba3d124700 10 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 peering] calc_acting newest update on osd.3 with 11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223)
50875:calc_acting primary is osd.3 with 11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223)
50876: osd.2 (up) accepted 11.4( DNE empty local-les=0 n=0 ec=0 les/c/f 0/0/0 0/0/0)
50878:2020-09-11 14:10:20.031210 7fba3d124700 10 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 peering] actingbackfill is 2,3
50880:2020-09-11 14:10:20.031215 7fba3d124700 10 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 peering] choose_acting want [3,2] (== acting) backfill_targets 
50881:2020-09-11 14:10:20.031220 7fba3d124700 10 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 peering] state<Started/Primary/Peering/GetLog>: leaving GetLog
50882:2020-09-11 14:10:20.031225 7fba3d124700  5 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 peering] exit Started/Primary/Peering/GetLog 0.000070 0 0.000000
50883:2020-09-11 14:10:20.031242 7fba3d124700 15 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 peering] publish_stats_to_osd 2227:1339
50885:2020-09-11 14:10:20.031249 7fba3d124700  5 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 peering] enter Started/Primary/Peering/GetMissing
50892:2020-09-11 14:10:20.031254 7fba3d124700 10 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 peering] state<Started/Primary/Peering/GetMissing>:  still need up_thru update before going active
50894:2020-09-11 14:10:20.031270 7fba3d124700  5 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 peering] exit Started/Primary/Peering/GetMissing 0.000021 0 0.000000
50895:2020-09-11 14:10:20.031275 7fba3d124700 15 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 peering] publish_stats_to_osd 2227: no change since 2020-09-11 14:10:20.031240
50896:2020-09-11 14:10:20.031285 7fba3d124700  5 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 peering] enter Started/Primary/Peering/WaitUpThru
51762:2020-09-11 14:10:20.420033 7fba496c5700 30 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 peering] lock
51764:2020-09-11 14:10:20.420048 7fba496c5700 10 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 peering] flushed
51770:2020-09-11 14:10:20.420077 7fba3d124700 30 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 peering] lock
51772:2020-09-11 14:10:20.420086 7fba3d124700 10 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 peering] handle_peering_event: epoch_sent: 2227 epoch_requested: 2227 FlushedEvt
51773:2020-09-11 14:10:20.420096 7fba3d124700 15 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 peering]  requeue_ops 
52620:2020-09-11 14:10:20.631849 7fba4fd58700 20 osd.3 2227 11.4 heartbeat_peers 2,3
53389:2020-09-11 14:10:20.745327 7fba4f557700 25 osd.3 2227  sending 11.4 2227:1339
53689:2020-09-11 14:10:21.017381 7fba49ec6700 30 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 peering] lock
53917:2020-09-11 14:10:21.021443 7fba49ec6700 30 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 peering] lock
53919:2020-09-11 14:10:21.021465 7fba49ec6700 10 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 peering] null
54472:2020-09-11 14:10:21.026500 7fba49ec6700 20 osd.3 2228 11.4 heartbeat_peers 2,3
54757:2020-09-11 14:10:21.028048 7fba3d925700 30 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 peering] lock
54759:2020-09-11 14:10:21.028061 7fba3d925700 10 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 peering] handle_advance_map [3,2]/[3,2] -- 3/3
54761:2020-09-11 14:10:21.028073 7fba3d925700 10 osd.3 pg_epoch: 2228 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 peering] state<Started/Primary/Peering>: Peering advmap
54762:2020-09-11 14:10:21.028081 7fba3d925700 10 osd.3 pg_epoch: 2228 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 peering] adjust_need_up_thru now 2227, need_up_thru now false
54763:2020-09-11 14:10:21.028087 7fba3d925700 10 osd.3 pg_epoch: 2228 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 peering] state<Started>: Started advmap
54764:2020-09-11 14:10:21.028095 7fba3d925700 10 osd.3 pg_epoch: 2228 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 peering] check_recovery_sources no source osds () went down
54765:2020-09-11 14:10:21.028104 7fba3d925700 10 osd.3 pg_epoch: 2228 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 peering] handle_activate_map 
54766:2020-09-11 14:10:21.028112 7fba3d925700  7 osd.3 pg_epoch: 2228 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 peering] state<Started/Primary>: handle ActMap primary
54767:2020-09-11 14:10:21.028120 7fba3d925700 15 osd.3 pg_epoch: 2228 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 peering] publish_stats_to_osd 2227: no change since 2020-09-11 14:10:20.031240
54768:2020-09-11 14:10:21.028129 7fba3d925700 10 osd.3 pg_epoch: 2228 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 peering] take_waiters
54769:2020-09-11 14:10:21.028136 7fba3d925700  5 osd.3 pg_epoch: 2228 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 peering] exit Started/Primary/Peering/WaitUpThru 0.996851 3 0.000229
54770:2020-09-11 14:10:21.028145 7fba3d925700 10 osd.3 pg_epoch: 2228 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 peering] state<Started/Primary/Peering>: Leaving Peering
54771:2020-09-11 14:10:21.028152 7fba3d925700  5 osd.3 pg_epoch: 2228 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 peering] exit Started/Primary/Peering 1.011890 0 0.000000
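At this point peering is over: the exit lines record that the PG spent 0.996851s in WaitUpThru and 1.011890s in Started/Primary/Peering overall. The full state-machine trace for this PG can be reconstructed from the enter/exit lines alone, which is often the fastest way to orient yourself in a log of this size:

# grep -E "\] (enter|exit) " ./osd3_watch_pg_11.4.txt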
54772:2020-09-11 14:10:21.028162 7fba3d925700  5 osd.3 pg_epoch: 2228 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 inactive] enter Started/Primary/Active
54773:2020-09-11 14:10:21.028170 7fba3d925700 10 osd.3 pg_epoch: 2228 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 inactive] state<Started/Primary/Active>: In Active, about to call activate
54774:2020-09-11 14:10:21.028179 7fba3d925700 20 osd.3 pg_epoch: 2228 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 inactive] activate - purged_snaps [] cached_removed_snaps []
54775:2020-09-11 14:10:21.028186 7fba3d925700 10 osd.3 pg_epoch: 2228 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 inactive] activate - snap_trimq []
54776:2020-09-11 14:10:21.028193 7fba3d925700 10 osd.3 pg_epoch: 2228 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 inactive] activate - no missing, moving last_complete 201'1 -> 201'1
54777:2020-09-11 14:10:21.028200 7fba3d925700 10 osd.3 pg_epoch: 2228 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 inactive] activate peer osd.2 11.4( DNE empty local-les=0 n=0 ec=0 les/c/f 0/0/0 0/0/0)
54778:2020-09-11 14:10:21.028213 7fba3d925700 10 osd.3 pg_epoch: 2228 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 inactive] activate peer osd.2 sending log((0'0,201'1], crt=201'1)
54780:2020-09-11 14:10:21.028240 7fba3d925700 10 osd.3 pg_epoch: 2228 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 inactive] activate peer osd.2 11.4( DNE v 201'1 lc 0'0 (0'0,201'1] local-les=0 n=0 ec=0 les/c/f 0/0/0 0/0/0) missing missing(1)
54781:2020-09-11 14:10:21.028251 7fba3d925700 10 osd.3 pg_epoch: 2228 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 inactive] needs_recovery osd.2 has 1 missing
54782:2020-09-11 14:10:21.028258 7fba3d925700 10 osd.3 pg_epoch: 2228 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 inactive] add_batch_sources_info: adding sources in batch 1
54783:2020-09-11 14:10:21.028265 7fba3d925700 10 osd.3 pg_epoch: 2228 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 inactive] build_might_have_unfound
54784:2020-09-11 14:10:21.028274 7fba3d925700 10 osd.3 pg_epoch: 2228 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 inactive] _calc_past_interval_range: already have past intervals back to 2224
54785:2020-09-11 14:10:21.028282 7fba3d925700 15 osd.3 pg_epoch: 2228 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 inactive] build_might_have_unfound: built 2
54786:2020-09-11 14:10:21.028288 7fba3d925700 10 osd.3 pg_epoch: 2228 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 degraded] activate - starting recovery
54787:2020-09-11 14:10:21.028299 7fba3d925700 10 osd.3 2228 queue_for_recovery queued pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 degraded]
54789:2020-09-11 14:10:21.028307 7fba3d925700 15 osd.3 pg_epoch: 2228 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 activating+degraded] publish_stats_to_osd 2228:1340
54790:2020-09-11 14:10:21.028315 7fba3d925700 10 osd.3 pg_epoch: 2228 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 activating+degraded] state<Started/Primary/Active>: Activate Finished
54791:2020-09-11 14:10:21.028322 7fba3d925700  5 osd.3 pg_epoch: 2228 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 activating+degraded] enter Started/Primary/Active/Activating
54792:2020-09-11 14:10:21.028330 7fba3d925700 20 osd.3 pg_epoch: 2228 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 activating+degraded] handle_activate_map: Not dirtying info: last_persisted is 2227 while current is 2228
54793:2020-09-11 14:10:21.028337 7fba3d925700 10 osd.3 pg_epoch: 2228 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 activating+degraded] handle_peering_event: epoch_sent: 2228 epoch_requested: 2228 NullEvt
54797:2020-09-11 14:10:21.028438 7fba37919700 30 osd.3 pg_epoch: 2228 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 activating+degraded] lock
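Inside activate() the primary inspects each acting peer. osd.2's info is still "DNE ... les/c/f 0/0/0", i.e. a brand-new empty replica created by the remap, so the primary sends it the entire log (0'0,201'1], computes a missing set of 1 object for it ("needs_recovery osd.2 has 1 missing"), and queues the PG for recovery before entering Started/Primary/Active/Activating. The per-peer decisions are all on the "activate peer" lines:

# grep "activate peer osd.2" ./osd3_watch_pg_11.4.txt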
55341:2020-09-11 14:10:21.031547 7fba46937700 25 osd.3 2228  still pending 11.4 2228:1340 > acked 1339,2227
57028:2020-09-11 14:10:21.067607 7fba49ec6700 30 osd.3 pg_epoch: 2228 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 activating+degraded] lock
57029:2020-09-11 14:10:21.067621 7fba49ec6700 10 osd.3 pg_epoch: 2228 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 activating+degraded] _activate_committed 2228 peer_activated now 3 last_epoch_started 2224 same_interval_since 2227
57030:2020-09-11 14:10:21.067629 7fba49ec6700 30 osd.3 pg_epoch: 2228 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 activating+degraded] lock
57031:2020-09-11 14:10:21.067635 7fba49ec6700 10 osd.3 pg_epoch: 2228 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 activating+degraded] flushed
57048:2020-09-11 14:10:21.067861 7fba3d124700 30 osd.3 pg_epoch: 2228 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 activating+degraded] lock
57049:2020-09-11 14:10:21.067870 7fba3d124700 10 osd.3 pg_epoch: 2228 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 activating+degraded] handle_peering_event: epoch_sent: 2228 epoch_requested: 2228 FlushedEvt
57050:2020-09-11 14:10:21.067881 7fba3d124700 15 osd.3 pg_epoch: 2228 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 activating+degraded]  requeue_ops 
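_activate_committed fires once the activation transaction commits locally: the primary adds itself to the set of activated peers ("peer_activated now 3"). The PG cannot report active until every member of the acting set has both activated and committed, which is what the AllReplicasActivated event further down waits for. The three markers can be lined up with:

# grep -E "_activate_committed|activated and committed|AllReplicasActivated" ./osd3_watch_pg_11.4.txt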
60117:2020-09-11 14:10:21.632324 7fba4fd58700 20 osd.3 2228 11.4 heartbeat_peers 2,3
60468:2020-09-11 14:10:21.964346 7fba45134700 20 osd.3 2228 _dispatch 0x7fba69a9dfe0 pg_info(1 pgs e2228:11.4) v4
60469:2020-09-11 14:10:21.964354 7fba45134700  7 osd.3 2228 handle_pg_info pg_info(1 pgs e2228:11.4) v4 from osd.2
60471:2020-09-11 14:10:21.964414 7fba45134700 30 osd.3 pg_epoch: 2228 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 activating+degraded] lock
60474:2020-09-11 14:10:21.964472 7fba3d124700 30 osd.3 pg_epoch: 2228 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 activating+degraded] lock
60475:2020-09-11 14:10:21.964509 7fba3d124700 10 osd.3 pg_epoch: 2228 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 activating+degraded] handle_peering_event: epoch_sent: 2228 epoch_requested: 2228 MInfoRec from 2 info: 11.4( v 201'1 lc 0'0 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223)
60476:2020-09-11 14:10:21.964530 7fba3d124700 10 osd.3 pg_epoch: 2228 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 activating+degraded] state<Started/Primary/Active>:  peer osd.2 activated and committed
60477:2020-09-11 14:10:21.964549 7fba3d124700 15 osd.3 pg_epoch: 2228 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 activating+degraded] publish_stats_to_osd 2228: no change since 2020-09-11 14:10:21.028307
60478:2020-09-11 14:10:21.964575 7fba3d124700 10 osd.3 pg_epoch: 2228 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 activating+degraded] all_activated_and_committed
60480:2020-09-11 14:10:21.964635 7fba3d124700 30 osd.3 pg_epoch: 2228 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 activating+degraded] lock
60481:2020-09-11 14:10:21.964670 7fba3d124700 10 osd.3 pg_epoch: 2228 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 activating+degraded] handle_peering_event: epoch_sent: 2228 epoch_requested: 2228 AllReplicasActivated
60482:2020-09-11 14:10:21.964685 7fba3d124700 10 osd.3 pg_epoch: 2228 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+degraded] share_pg_info
60484:2020-09-11 14:10:21.964747 7fba3d124700 15 osd.3 pg_epoch: 2228 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+degraded] publish_stats_to_osd 2228:1341
60485:2020-09-11 14:10:21.964771 7fba3d124700 10 osd.3 pg_epoch: 2228 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+degraded] check_local
60486:2020-09-11 14:10:21.964782 7fba3d124700 15 osd.3 pg_epoch: 2228 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+degraded]  requeue_ops 
60487:2020-09-11 14:10:21.964794 7fba3d124700 10 osd.3 pg_epoch: 2228 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+degraded] needs_recovery osd.2 has 1 missing
60488:2020-09-11 14:10:21.964805 7fba3d124700 10 osd.3 pg_epoch: 2228 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+degraded] activate not all replicas are up-to-date, queueing recovery
60489:2020-09-11 14:10:21.964828 7fba3d124700 15 osd.3 pg_epoch: 2228 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+degraded] publish_stats_to_osd 2228: no change since 2020-09-11 14:10:21.964740
60490:2020-09-11 14:10:21.964867 7fba3d124700 20 osd.3 pg_epoch: 2228 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+degraded] hit_set_clear
60491:2020-09-11 14:10:21.964881 7fba3d124700 20 osd.3 pg_epoch: 2228 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+degraded] agent_stop
60493:2020-09-11 14:10:21.965001 7fba3d124700 30 osd.3 pg_epoch: 2228 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+degraded] lock
60494:2020-09-11 14:10:21.965026 7fba3d124700 10 osd.3 pg_epoch: 2228 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+degraded] handle_peering_event: epoch_sent: 2228 epoch_requested: 2228 DoRecovery
60495:2020-09-11 14:10:21.965041 7fba3d124700  5 osd.3 pg_epoch: 2228 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+degraded] exit Started/Primary/Active/Activating 0.936718 5 0.000532
60496:2020-09-11 14:10:21.965057 7fba3d124700  5 osd.3 pg_epoch: 2228 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+degraded] enter Started/Primary/Active/WaitLocalRecoveryReserved
60497:2020-09-11 14:10:21.965075 7fba3d124700 15 osd.3 pg_epoch: 2228 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded] publish_stats_to_osd 2228:1342
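Recovery does not start immediately: the PG first has to acquire a local reservation slot on osd.3 (WaitLocalRecoveryReserved) and then a remote slot on osd.2 (WaitRemoteRecoveryReserved). These slots are handed out by per-OSD reservers so that one node never runs too many recoveries at once; in this vintage of the code the slot count is governed by osd_max_backfills, with osd_recovery_max_active additionally capping in-flight recovery ops. Assuming the admin socket is available on the node, the current limits can be checked with:

# ceph daemon osd.3 config get osd_max_backfills
# ceph daemon osd.3 config get osd_recovery_max_active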
60621:2020-09-11 14:10:22.049880 7fba49ec6700 30 osd.3 pg_epoch: 2228 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded] lock
60891:2020-09-11 14:10:22.051626 7fba49ec6700 30 osd.3 pg_epoch: 2228 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded] lock
60892:2020-09-11 14:10:22.051634 7fba49ec6700 10 osd.3 pg_epoch: 2228 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded] null
61652:2020-09-11 14:10:22.055162 7fba3d925700 30 osd.3 pg_epoch: 2228 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded] lock
61659:2020-09-11 14:10:22.055181 7fba3d925700 10 osd.3 pg_epoch: 2228 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded] handle_advance_map [3,2]/[3,2] -- 3/3
61664:2020-09-11 14:10:22.055197 7fba3d925700 10 osd.3 pg_epoch: 2229 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded] state<Started/Primary/Active>: Active advmap
61667:2020-09-11 14:10:22.055208 7fba3d925700 10 osd.3 pg_epoch: 2229 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded] state<Started>: Started advmap
61670:2020-09-11 14:10:22.055219 7fba3d925700 10 osd.3 pg_epoch: 2229 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded] check_recovery_sources no source osds (3) went down
61675:2020-09-11 14:10:22.055233 7fba3d925700 10 osd.3 pg_epoch: 2229 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded] handle_activate_map 
61680:2020-09-11 14:10:22.055248 7fba3d925700 10 osd.3 pg_epoch: 2229 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded] state<Started/Primary/Active>: Active: handling ActMap
61686:2020-09-11 14:10:22.055273 7fba3d925700 10 osd.3 2229 queue_for_recovery queued pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded]
61690:2020-09-11 14:10:22.055286 7fba3d925700  7 osd.3 pg_epoch: 2229 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded] state<Started/Primary>: handle ActMap primary
61695:2020-09-11 14:10:22.055305 7fba3d925700 15 osd.3 pg_epoch: 2229 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded] publish_stats_to_osd 2228: no change since 2020-09-11 14:10:21.965071
61699:2020-09-11 14:10:22.055323 7fba3d925700 10 osd.3 pg_epoch: 2229 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded] take_waiters
61703:2020-09-11 14:10:22.055333 7fba3d925700 20 osd.3 pg_epoch: 2229 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded] handle_activate_map: Not dirtying info: last_persisted is 2228 while current is 2229
61706:2020-09-11 14:10:22.055344 7fba3d925700 10 osd.3 pg_epoch: 2229 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded] handle_peering_event: epoch_sent: 2229 epoch_requested: 2229 NullEvt
61849:2020-09-11 14:10:22.055821 7fba37919700 30 osd.3 pg_epoch: 2229 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded] lock
61851:2020-09-11 14:10:22.055840 7fba37919700 10 osd.3 2229 do_recovery starting 1 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded]
61855:2020-09-11 14:10:22.055850 7fba37919700 10 osd.3 pg_epoch: 2229 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded] recovery raced and were queued twice, ignoring!
61859:2020-09-11 14:10:22.055862 7fba37919700 10 osd.3 2229 do_recovery started 0/1 on pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded]
63796:2020-09-11 14:10:22.633040 7fba4fd58700 20 osd.3 2229 11.4 heartbeat_peers 2,3
64613:2020-09-11 14:10:23.233528 7fba49ec6700 30 osd.3 pg_epoch: 2229 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded] lock
64889:2020-09-11 14:10:23.236332 7fba49ec6700 30 osd.3 pg_epoch: 2229 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded] lock
64891:2020-09-11 14:10:23.236346 7fba49ec6700 10 osd.3 pg_epoch: 2229 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded] null
65325:2020-09-11 14:10:23.239004 7fba3d925700 30 osd.3 pg_epoch: 2229 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded] lock
65329:2020-09-11 14:10:23.239029 7fba3d925700 10 osd.3 pg_epoch: 2229 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded] handle_advance_map [3,2]/[3,2] -- 3/3
65331:2020-09-11 14:10:23.239052 7fba3d925700 10 osd.3 pg_epoch: 2230 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded] state<Started/Primary/Active>: Active advmap
65334:2020-09-11 14:10:23.239067 7fba3d925700 10 osd.3 pg_epoch: 2230 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded] state<Started>: Started advmap
65337:2020-09-11 14:10:23.239085 7fba3d925700 10 osd.3 pg_epoch: 2230 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded] check_recovery_sources no source osds (3) went down
65340:2020-09-11 14:10:23.239104 7fba3d925700 10 osd.3 pg_epoch: 2230 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded] handle_activate_map 
65343:2020-09-11 14:10:23.239121 7fba3d925700 10 osd.3 pg_epoch: 2230 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded] state<Started/Primary/Active>: Active: handling ActMap
65346:2020-09-11 14:10:23.239141 7fba3d925700 10 osd.3 2230 queue_for_recovery queued pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded]
65348:2020-09-11 14:10:23.239155 7fba3d925700  7 osd.3 pg_epoch: 2230 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded] state<Started/Primary>: handle ActMap primary
65353:2020-09-11 14:10:23.239177 7fba3d925700 15 osd.3 pg_epoch: 2230 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded] publish_stats_to_osd 2228: no change since 2020-09-11 14:10:21.965071
65356:2020-09-11 14:10:23.239201 7fba3d925700 10 osd.3 pg_epoch: 2230 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded] take_waiters
65360:2020-09-11 14:10:23.239214 7fba3d925700 20 osd.3 pg_epoch: 2230 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded] handle_activate_map: Not dirtying info: last_persisted is 2228 while current is 2230
65362:2020-09-11 14:10:23.239229 7fba3d925700 10 osd.3 pg_epoch: 2230 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded] handle_peering_event: epoch_sent: 2230 epoch_requested: 2230 NullEvt
65624:2020-09-11 14:10:23.242068 7fba37919700 30 osd.3 pg_epoch: 2230 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded] lock
65625:2020-09-11 14:10:23.242099 7fba37919700 10 osd.3 2230 do_recovery starting 1 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded]
65626:2020-09-11 14:10:23.242111 7fba37919700 10 osd.3 pg_epoch: 2230 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded] recovery raced and were queued twice, ignoring!
65627:2020-09-11 14:10:23.242124 7fba37919700 10 osd.3 2230 do_recovery started 0/1 on pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded]
66155:2020-09-11 14:10:23.245392 7fba49ec6700 20 osd.3 2230 11.4 heartbeat_peers 2,3
68073:2020-09-11 14:10:24.319966 7fba49ec6700 30 osd.3 pg_epoch: 2230 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded] lock
68718:2020-09-11 14:10:24.323051 7fba49ec6700 30 osd.3 pg_epoch: 2230 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded] lock
68726:2020-09-11 14:10:24.323069 7fba49ec6700 10 osd.3 pg_epoch: 2230 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded] null
68741:2020-09-11 14:10:24.323112 7fba3d124700 30 osd.3 pg_epoch: 2230 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded] lock
68748:2020-09-11 14:10:24.323142 7fba3d124700 10 osd.3 pg_epoch: 2230 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded] handle_advance_map [3,2]/[3,2] -- 3/3
68752:2020-09-11 14:10:24.323158 7fba3d124700 10 osd.3 pg_epoch: 2231 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded] state<Started/Primary/Active>: Active advmap
68755:2020-09-11 14:10:24.323166 7fba3d124700 10 osd.3 pg_epoch: 2231 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded] state<Started>: Started advmap
68758:2020-09-11 14:10:24.323174 7fba3d124700 10 osd.3 pg_epoch: 2231 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded] check_recovery_sources no source osds (3) went down
68760:2020-09-11 14:10:24.323183 7fba3d124700 10 osd.3 pg_epoch: 2231 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded] handle_activate_map 
68764:2020-09-11 14:10:24.323189 7fba3d124700 10 osd.3 pg_epoch: 2231 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded] state<Started/Primary/Active>: Active: handling ActMap
68766:2020-09-11 14:10:24.323199 7fba3d124700 10 osd.3 2231 queue_for_recovery queued pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded]
68769:2020-09-11 14:10:24.323204 7fba3d124700  7 osd.3 pg_epoch: 2231 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded] state<Started/Primary>: handle ActMap primary
68772:2020-09-11 14:10:24.323212 7fba3d124700 15 osd.3 pg_epoch: 2231 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded] publish_stats_to_osd 2228: no change since 2020-09-11 14:10:21.965071
68775:2020-09-11 14:10:24.323220 7fba3d124700 10 osd.3 pg_epoch: 2231 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded] take_waiters
68777:2020-09-11 14:10:24.323226 7fba3d124700 20 osd.3 pg_epoch: 2231 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded] handle_activate_map: Not dirtying info: last_persisted is 2228 while current is 2231
68778:2020-09-11 14:10:24.323232 7fba3d124700 10 osd.3 pg_epoch: 2231 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded] handle_peering_event: epoch_sent: 2231 epoch_requested: 2231 NullEvt
71105:2020-09-11 14:10:25.746633 7fba4f557700 25 osd.3 2231  sending 11.4 2228:1342
71198:2020-09-11 14:10:25.789587 7fba37919700 30 osd.3 pg_epoch: 2231 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded] lock
71199:2020-09-11 14:10:25.789593 7fba37919700 10 osd.3 2231 do_recovery starting 1 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded]
71200:2020-09-11 14:10:25.789598 7fba37919700 10 osd.3 pg_epoch: 2231 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded] recovery raced and were queued twice, ignoring!
71201:2020-09-11 14:10:25.789602 7fba37919700 10 osd.3 2231 do_recovery started 0/1 on pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded]
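Epochs 2229-2231 bring no mapping change for this PG ([3,2]/[3,2]), but every ActMap re-queues it for recovery while it is still in recovery_wait, so each do_recovery pass reports "recovery raced and were queued twice, ignoring!" and starts 0/1. This is harmless noise; you can count how often it happened with:

# grep -c "recovery raced and were queued twice" ./osd3_watch_pg_11.4.txt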
71309:2020-09-11 14:10:25.797456 7fba35915700 30 osd.3 pg_epoch: 2231 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded] lock
71315:2020-09-11 14:10:25.797495 7fba3d124700 30 osd.3 pg_epoch: 2231 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded] lock
71317:2020-09-11 14:10:25.797531 7fba3d124700 10 osd.3 pg_epoch: 2231 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded] handle_peering_event: epoch_sent: 2228 epoch_requested: 2228 LocalRecoveryReserved
71318:2020-09-11 14:10:25.797542 7fba3d124700  5 osd.3 pg_epoch: 2231 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded] exit Started/Primary/Active/WaitLocalRecoveryReserved 3.832485 10 0.000382
71320:2020-09-11 14:10:25.797552 7fba3d124700  5 osd.3 pg_epoch: 2231 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded] enter Started/Primary/Active/WaitRemoteRecoveryReserved
71375:2020-09-11 14:10:26.249869 7fba46937700 25 osd.3 2231  ack on 11.4 2228:1342
71484:2020-09-11 14:10:26.635094 7fba4fd58700 20 osd.3 2231 11.4 heartbeat_peers 2,3
94065:2020-09-11 14:13:14.601725 7fba45134700 20 osd.3 2231 _dispatch 0x7fba6c134b40 MRecoveryReserve GRANT  pgid: 11.4, query_epoch: 2231 v2
94067:2020-09-11 14:13:14.601753 7fba45134700 30 osd.3 pg_epoch: 2231 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded] lock
94070:2020-09-11 14:13:14.601819 7fba3d124700 30 osd.3 pg_epoch: 2231 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded] lock
94071:2020-09-11 14:13:14.601855 7fba3d124700 10 osd.3 pg_epoch: 2231 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded] handle_peering_event: epoch_sent: 2231 epoch_requested: 2231 RemoteRecoveryReserved
94072:2020-09-11 14:13:14.601876 7fba3d124700  5 osd.3 pg_epoch: 2231 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded] exit Started/Primary/Active/WaitRemoteRecoveryReserved 168.804323 1 0.000031
94073:2020-09-11 14:13:14.601894 7fba3d124700  5 osd.3 pg_epoch: 2231 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded] enter Started/Primary/Active/Recovering
94074:2020-09-11 14:13:14.601916 7fba3d124700 15 osd.3 pg_epoch: 2231 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovering+degraded] publish_stats_to_osd 2231:1343
94075:2020-09-11 14:13:14.601948 7fba3d124700 10 osd.3 2231 queue_for_recovery queued pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovering+degraded]
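The local reservation took about 3.8s (exit WaitLocalRecoveryReserved 3.832485), but the remote grant from osd.2 only arrives at 14:13:14, some 168.8s later (exit WaitRemoteRecoveryReserved 168.804323). The likely reason is contention: osd0 has just been marked out, so many remapped PGs are competing for osd.2's reservation slots and 11.4 simply had to wait its turn in the queue. The grant message itself is easy to locate:

# grep -n "MRecoveryReserve" ./osd3_watch_pg_11.4.txt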
94078:2020-09-11 14:13:14.601970 7fba37919700 30 osd.3 pg_epoch: 2231 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovering+degraded] lock
94079:2020-09-11 14:13:14.601984 7fba37919700 10 osd.3 2231 do_recovery starting 1 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovering+degraded]
94080:2020-09-11 14:13:14.601992 7fba37919700 10 osd.3 pg_epoch: 2231 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovering+degraded] recover_replicas(1)
94081:2020-09-11 14:13:14.601999 7fba37919700 10 osd.3 pg_epoch: 2231 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovering+degraded]  peer osd.2 missing 1 objects.
94082:2020-09-11 14:13:14.602005 7fba37919700 20 osd.3 pg_epoch: 2231 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovering+degraded]  peer osd.2 missing {11:3b9ee248:::zone_info.28570f88-31e7-4166-b6d7-d22903cead75:head=201'1}
94083:2020-09-11 14:13:14.602016 7fba37919700 10 osd.3 pg_epoch: 2231 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovering+degraded] recover_replicas: recover_object_replicas(11:3b9ee248:::zone_info.28570f88-31e7-4166-b6d7-d22903cead75:head)
94084:2020-09-11 14:13:14.602023 7fba37919700 10 osd.3 pg_epoch: 2231 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovering+degraded] prep_object_replica_pushes: on 11:3b9ee248:::zone_info.28570f88-31e7-4166-b6d7-d22903cead75:head
94085:2020-09-11 14:13:14.602031 7fba37919700 10 osd.3 pg_epoch: 2231 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovering+degraded] get_object_context: obc NOT found in cache: 11:3b9ee248:::zone_info.28570f88-31e7-4166-b6d7-d22903cead75:head
94086:2020-09-11 14:13:14.602088 7fba37919700 10 osd.3 pg_epoch: 2231 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovering+degraded] populate_obc_watchers 11:3b9ee248:::zone_info.28570f88-31e7-4166-b6d7-d22903cead75:head
94087:2020-09-11 14:13:14.602104 7fba37919700 20 osd.3 pg_epoch: 2231 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovering+degraded] ReplicatedPG::check_blacklisted_obc_watchers for obc 11:3b9ee248:::zone_info.28570f88-31e7-4166-b6d7-d22903cead75:head
94088:2020-09-11 14:13:14.602112 7fba37919700 10 osd.3 pg_epoch: 2231 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovering+degraded] get_object_context: creating obc from disk: 0x7fba66a80c00
94089:2020-09-11 14:13:14.602118 7fba37919700 10 osd.3 pg_epoch: 2231 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovering+degraded] get_object_context: 0x7fba66a80c00 11:3b9ee248:::zone_info.28570f88-31e7-4166-b6d7-d22903cead75:head rwstate(none n=0 w=0) oi: 11:3b9ee248:::zone_info.28570f88-31e7-4166-b6d7-d22903cead75:head(201'1 client.34245.0:25 dirty|data_digest|omap_digest s 1172 uv 1 dd 21971aec od ffffffff alloc_hint [0 0]) ssc: 0x7fba6f624460 snapset: 0=[]:[]+head
94090:2020-09-11 14:13:14.602137 7fba37919700 20 osd.3 pg_epoch: 2231 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovering+degraded] recovery got recovery read lock on 11:3b9ee248:::zone_info.28570f88-31e7-4166-b6d7-d22903cead75:head
94091:2020-09-11 14:13:14.602147 7fba37919700 10 osd.3 pg_epoch: 2231 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovering+degraded] start_recovery_op 11:3b9ee248:::zone_info.28570f88-31e7-4166-b6d7-d22903cead75:head
94092:2020-09-11 14:13:14.602158 7fba37919700 10 osd.3 2231 start_recovery_op pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 rops=1 crt=201'1 lcod 0'0 mlcod 0'0 active+recovering+degraded] 11:3b9ee248:::zone_info.28570f88-31e7-4166-b6d7-d22903cead75:head (1/3 rops)
94093:2020-09-11 14:13:14.602179 7fba37919700 10 osd.3 pg_epoch: 2231 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 rops=1 crt=201'1 lcod 0'0 mlcod 0'0 active+recovering+degraded] recover_object: 11:3b9ee248:::zone_info.28570f88-31e7-4166-b6d7-d22903cead75:head
94094:2020-09-11 14:13:14.602193 7fba37919700 10 osd.3 pg_epoch: 2231 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 rops=1 crt=201'1 lcod 0'0 mlcod 0'0 active+recovering+degraded] prep_push_to_replica: 11:3b9ee248:::zone_info.28570f88-31e7-4166-b6d7-d22903cead75:head v201'1 size 1172 to osd.2
94095:2020-09-11 14:13:14.602201 7fba37919700 15 osd.3 pg_epoch: 2231 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 rops=1 crt=201'1 lcod 0'0 mlcod 0'0 active+recovering+degraded] push_to_replica snapset is 0=[]:[]+head
94096:2020-09-11 14:13:14.602208 7fba37919700 10 osd.3 pg_epoch: 2231 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 rops=1 crt=201'1 lcod 0'0 mlcod 0'0 active+recovering+degraded] calc_head_subsets 11:3b9ee248:::zone_info.28570f88-31e7-4166-b6d7-d22903cead75:head clone_overlap {}
94097:2020-09-11 14:13:14.602215 7fba37919700 10 osd.3 pg_epoch: 2231 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 rops=1 crt=201'1 lcod 0'0 mlcod 0'0 active+recovering+degraded] calc_head_subsets 11:3b9ee248:::zone_info.28570f88-31e7-4166-b6d7-d22903cead75:head  data_subset [0~1172]  clone_subsets {}
94098:2020-09-11 14:13:14.602227 7fba37919700  7 osd.3 pg_epoch: 2231 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 rops=1 crt=201'1 lcod 0'0 mlcod 0'0 active+recovering+degraded] send_push_op 11:3b9ee248:::zone_info.28570f88-31e7-4166-b6d7-d22903cead75:head v 201'1 size 1172 recovery_info: ObjectRecoveryInfo(11:3b9ee248:::zone_info.28570f88-31e7-4166-b6d7-d22903cead75:head@201'1, size: 1172, copy_subset: [0~1172], clone_subset: {})
94099:2020-09-11 14:13:14.602395 7fba37919700 20 osd.3 pg_epoch: 2231 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 rops=1 crt=201'1 lcod 0'0 mlcod 0'0 active+recovering+degraded] send_pushes: sending push PushOp(11:3b9ee248:::zone_info.28570f88-31e7-4166-b6d7-d22903cead75:head, version: 201'1, data_included: [0~1172], data_size: 1172, omap_header_size: 0, omap_entries_size: 0, attrset_size: 2, recovery_info: ObjectRecoveryInfo(11:3b9ee248:::zone_info.28570f88-31e7-4166-b6d7-d22903cead75:head@201'1, size: 1172, copy_subset: [0~1172], clone_subset: {}), after_progress: ObjectRecoveryProgress(!first, data_recovered_to:1172, data_complete:true, omap_recovered_to:, omap_complete:true), before_progress: ObjectRecoveryProgress(first, data_recovered_to:0, data_complete:false, omap_recovered_to:, omap_complete:false)) to osd.2
94100:2020-09-11 14:13:14.602442 7fba37919700 10 osd.3 pg_epoch: 2231 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 rops=1 crt=201'1 lcod 0'0 mlcod 0'0 active+recovering+degraded]  started 1
94101:2020-09-11 14:13:14.602448 7fba37919700 10 osd.3 2231 do_recovery started 1/1 on pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 rops=1 crt=201'1 lcod 0'0 mlcod 0'0 active+recovering+degraded]
94130:2020-09-11 14:13:14.663609 7fba45134700 20 osd.3 2231 _dispatch 0x7fba6b4d3400 pg_trim(11.4 to 201'1 e2231) v1
94131:2020-09-11 14:13:14.663622 7fba45134700  7 osd.3 2231 handle_pg_trim pg_trim(11.4 to 201'1 e2231) v1 from osd.2
94133:2020-09-11 14:13:14.663669 7fba45134700 30 osd.3 pg_epoch: 2231 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 rops=1 crt=201'1 lcod 0'0 mlcod 0'0 active+recovering+degraded] lock
94134:2020-09-11 14:13:14.663695 7fba45134700 10 osd.3 2231 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 rops=1 crt=201'1 lcod 0'0 mlcod 0'0 active+recovering+degraded] replica osd.2 lcod 201'1
94142:2020-09-11 14:13:14.664128 7fba306fb700 10 osd.3 2231 handle_replica_op MOSDPGPushReply(11.4 2231 [PushReplyOp(11:3b9ee248:::zone_info.28570f88-31e7-4166-b6d7-d22903cead75:head)]) v2 epoch 2231
94144:2020-09-11 14:13:14.664146 7fba306fb700 15 osd.3 2231 enqueue_op 0x7fba6ba44700 prio 3 cost 8389608 latency 0.000100 MOSDPGPushReply(11.4 2231 [PushReplyOp(11:3b9ee248:::zone_info.28570f88-31e7-4166-b6d7-d22903cead75:head)]) v2
94147:2020-09-11 14:13:14.664228 7fba3811a700 30 osd.3 pg_epoch: 2231 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 rops=1 crt=201'1 lcod 0'0 mlcod 0'0 active+recovering+degraded] lock
94150:2020-09-11 14:13:14.664304 7fba3811a700 10 osd.3 2231 dequeue_op 0x7fba6ba44700 prio 3 cost 8389608 latency 0.000257 MOSDPGPushReply(11.4 2231 [PushReplyOp(11:3b9ee248:::zone_info.28570f88-31e7-4166-b6d7-d22903cead75:head)]) v2 pg pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 rops=1 crt=201'1 lcod 0'0 mlcod 0'0 active+recovering+degraded]
94151:2020-09-11 14:13:14.664341 7fba3811a700 10 osd.3 pg_epoch: 2231 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 rops=1 crt=201'1 lcod 0'0 mlcod 0'0 active+recovering+degraded] handle_message: 0x7fba6ba44700
94153:2020-09-11 14:13:14.664373 7fba3811a700 15 osd.3 pg_epoch: 2231 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 rops=1 crt=201'1 lcod 0'0 mlcod 0'0 active+recovering+degraded] publish_stats_to_osd 2231:1344
94155:2020-09-11 14:13:14.664396 7fba3811a700 15 osd.3 pg_epoch: 2231 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 rops=1 crt=201'1 lcod 0'0 mlcod 0'0 active+recovering+degraded] publish_stats_to_osd 2231:1345
94156:2020-09-11 14:13:14.664408 7fba3811a700 10 osd.3 pg_epoch: 2231 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 rops=1 crt=201'1 lcod 0'0 mlcod 0'0 active+recovering+degraded] pushed 11:3b9ee248:::zone_info.28570f88-31e7-4166-b6d7-d22903cead75:head to all replicas
94157:2020-09-11 14:13:14.664423 7fba3811a700 15 osd.3 pg_epoch: 2231 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 rops=1 crt=201'1 lcod 0'0 mlcod 0'0 active+recovering+degraded]  requeue_ops 
94158:2020-09-11 14:13:14.664435 7fba3811a700 10 osd.3 pg_epoch: 2231 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 rops=1 crt=201'1 lcod 0'0 mlcod 0'0 active+recovering+degraded] finish_recovery_op 11:3b9ee248:::zone_info.28570f88-31e7-4166-b6d7-d22903cead75:head
94160:2020-09-11 14:13:14.664451 7fba3811a700 10 osd.3 2231 finish_recovery_op pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovering+degraded] 11:3b9ee248:::zone_info.28570f88-31e7-4166-b6d7-d22903cead75:head dequeue=0 (1/3 rops)
94161:2020-09-11 14:13:14.664488 7fba3811a700 10 osd.3 pg_epoch: 2231 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovering+degraded] finish_degraded_object 11:3b9ee248:::zone_info.28570f88-31e7-4166-b6d7-d22903cead75:head
94166:2020-09-11 14:13:14.664540 7fba37919700 30 osd.3 pg_epoch: 2231 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovering+degraded] lock
94167:2020-09-11 14:13:14.664562 7fba37919700 10 osd.3 2231 do_recovery starting 1 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovering+degraded]
94168:2020-09-11 14:13:14.664573 7fba37919700 10 osd.3 pg_epoch: 2231 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovering+degraded] recover_replicas(1)
94170:2020-09-11 14:13:14.664586 7fba37919700 10 osd.3 pg_epoch: 2231 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovering+degraded]  peer osd.2 missing 0 objects.
94171:2020-09-11 14:13:14.664598 7fba37919700 20 osd.3 pg_epoch: 2231 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovering+degraded]  peer osd.2 missing {}
94173:2020-09-11 14:13:14.664621 7fba37919700 10 osd.3 pg_epoch: 2231 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovering+degraded] recover_primary recovering 0 in pg
94174:2020-09-11 14:13:14.664633 7fba37919700 10 osd.3 pg_epoch: 2231 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovering+degraded] recover_primary missing(0)
94175:2020-09-11 14:13:14.664664 7fba37919700 25 osd.3 pg_epoch: 2231 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovering+degraded] recover_primary {}
94176:2020-09-11 14:13:14.664676 7fba37919700 10 osd.3 pg_epoch: 2231 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovering+degraded]  started 0
94178:2020-09-11 14:13:14.664687 7fba37919700 10 osd.3 pg_epoch: 2231 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovering+degraded] start_recovery_ops needs_recovery: {}
94179:2020-09-11 14:13:14.664699 7fba37919700 10 osd.3 pg_epoch: 2231 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovering+degraded] start_recovery_ops missing_loc: {}
94180:2020-09-11 14:13:14.664710 7fba37919700 10 osd.3 pg_epoch: 2231 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovering+degraded] needs_recovery is recovered
94182:2020-09-11 14:13:14.664721 7fba37919700 10 osd.3 pg_epoch: 2231 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+degraded] needs_backfill does not need backfill
94183:2020-09-11 14:13:14.664733 7fba37919700 10 osd.3 pg_epoch: 2231 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+degraded] recovery done, no backfill
94184:2020-09-11 14:13:14.664754 7fba37919700 10 osd.3 2231 do_recovery started 0/1 on pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+degraded]
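When the MOSDPGPushReply comes back, the primary marks the object as pushed to all replicas, finishes the recovery op, and re-runs do_recovery: osd.2 is now missing 0 objects, recover_primary finds nothing to do locally, and needs_recovery/needs_backfill both come back clean. Note the "recovery done, no backfill" line: log-based recovery sufficed here because osd.2 received the full PG log (0'0,201'1]; backfill, a whole-PG object scan, is only needed when the logs cannot bridge the gap. To confirm nothing was backfilled:

# grep -n "backfill" ./osd3_watch_pg_11.4.txt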
94186:2020-09-11 14:13:14.664987 7fba3d925700 30 osd.3 pg_epoch: 2231 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+degraded] lock
94187:2020-09-11 14:13:14.665019 7fba3d925700 10 osd.3 pg_epoch: 2231 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+degraded] handle_peering_event: epoch_sent: 2231 epoch_requested: 2231 AllReplicasRecovered
94188:2020-09-11 14:13:14.665069 7fba3d925700  5 osd.3 pg_epoch: 2231 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+degraded] exit Started/Primary/Active/Recovering 0.063174 1 0.000091
94189:2020-09-11 14:13:14.665093 7fba3d925700  5 osd.3 pg_epoch: 2231 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+degraded] enter Started/Primary/Active/Recovered
94190:2020-09-11 14:13:14.665116 7fba3d925700 10 osd.3 pg_epoch: 2231 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+degraded] needs_recovery is recovered
94192:2020-09-11 14:13:14.665139 7fba3d925700 15 osd.3 pg_epoch: 2231 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active] publish_stats_to_osd 2231:1346
94193:2020-09-11 14:13:14.665158 7fba3d925700  5 osd.3 pg_epoch: 2231 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active] exit Started/Primary/Active/Recovered 0.000065 0 0.000000
94194:2020-09-11 14:13:14.665187 7fba3d925700  5 osd.3 pg_epoch: 2231 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active] enter Started/Primary/Active/Clean
94195:2020-09-11 14:13:14.665199 7fba3d925700 10 osd.3 pg_epoch: 2231 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active] finish_recovery
94197:2020-09-11 14:13:14.665210 7fba3d925700 10 osd.3 pg_epoch: 2231 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active] clear_recovery_state
94199:2020-09-11 14:13:14.665229 7fba3d925700 10 osd.3 pg_epoch: 2231 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2231/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+clean] trim_past_intervals: trimming interval(2223-2225 up [3](3) acting [3](3) maybe_went_rw)
94201:2020-09-11 14:13:14.665246 7fba3d925700 10 osd.3 pg_epoch: 2231 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2231/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2226-2226/1 crt=201'1 lcod 0'0 mlcod 0'0 active+clean] trim_past_intervals: trimming interval(2226-2226 up [3,2](3) acting [3](3))
94202:2020-09-11 14:13:14.665261 7fba3d925700 10 osd.3 pg_epoch: 2231 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2231/0 2226/2227/2223) [3,2] r=0 lpr=2227 crt=201'1 lcod 0'0 mlcod 0'0 active+clean] share_pg_info
94206:2020-09-11 14:13:14.665311 7fba3d925700 15 osd.3 pg_epoch: 2231 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2231/0 2226/2227/2223) [3,2] r=0 lpr=2227 crt=201'1 lcod 0'0 mlcod 0'0 active+clean] publish_stats_to_osd 2231:1347
94229:2020-09-11 14:13:14.666750 7fba49ec6700 30 osd.3 pg_epoch: 2231 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2231/0 2226/2227/2223) [3,2] r=0 lpr=2227 crt=201'1 lcod 0'0 mlcod 0'0 active+clean] lock
94231:2020-09-11 14:13:14.666768 7fba49ec6700 10 osd.3 pg_epoch: 2231 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2231/0 2226/2227/2223) [3,2] r=0 lpr=2227 crt=201'1 lcod 0'0 mlcod 0'0 active+clean] _finish_recovery
94232:2020-09-11 14:13:14.666780 7fba49ec6700 10 osd.3 pg_epoch: 2231 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2231/0 2226/2227/2223) [3,2] r=0 lpr=2227 crt=201'1 lcod 0'0 mlcod 0'0 active+clean] purge_strays 
94234:2020-09-11 14:13:14.666796 7fba49ec6700 15 osd.3 pg_epoch: 2231 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2231/0 2226/2227/2223) [3,2] r=0 lpr=2227 crt=201'1 lcod 0'0 mlcod 0'0 active+clean] publish_stats_to_osd 2231: no change since 2020-09-11 14:13:14.665310
94471:2020-09-11 14:13:15.100566 7fba49ec6700 30 osd.3 pg_epoch: 2231 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2231/0 2226/2227/2223) [3,2] r=0 lpr=2227 crt=201'1 lcod 0'0 mlcod 0'0 active+clean] lock
94696:2020-09-11 14:13:15.102433 7fba49ec6700 30 osd.3 pg_epoch: 2231 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2231/0 2226/2227/2223) [3,2] r=0 lpr=2227 crt=201'1 lcod 0'0 mlcod 0'0 active+clean] lock
94697:2020-09-11 14:13:15.102438 7fba49ec6700 10 osd.3 pg_epoch: 2231 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2231/0 2226/2227/2223) [3,2] r=0 lpr=2227 crt=201'1 lcod 0'0 mlcod 0'0 active+clean] null
95427:2020-09-11 14:13:15.107961 7fba3d124700 30 osd.3 pg_epoch: 2231 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2231/0 2226/2227/2223) [3,2] r=0 lpr=2227 crt=201'1 lcod 0'0 mlcod 0'0 active+clean] lock
95430:2020-09-11 14:13:15.107975 7fba3d124700 10 osd.3 pg_epoch: 2231 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2231/0 2226/2227/2223) [3,2] r=0 lpr=2227 crt=201'1 lcod 0'0 mlcod 0'0 active+clean] handle_advance_map [3,2]/[3,2] -- 3/3
95434:2020-09-11 14:13:15.108002 7fba3d124700 10 osd.3 pg_epoch: 2232 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2231/0 2226/2227/2223) [3,2] r=0 lpr=2227 crt=201'1 lcod 0'0 mlcod 0'0 active+clean] state<Started/Primary/Active>: Active advmap
95436:2020-09-11 14:13:15.108010 7fba3d124700 10 osd.3 pg_epoch: 2232 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2231/0 2226/2227/2223) [3,2] r=0 lpr=2227 crt=201'1 lcod 0'0 mlcod 0'0 active+clean] state<Started>: Started advmap
95438:2020-09-11 14:13:15.108016 7fba3d124700 10 osd.3 pg_epoch: 2232 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2231/0 2226/2227/2223) [3,2] r=0 lpr=2227 crt=201'1 lcod 0'0 mlcod 0'0 active+clean] check_recovery_sources no source osds () went down
95439:2020-09-11 14:13:15.108023 7fba3d124700 10 osd.3 pg_epoch: 2232 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2231/0 2226/2227/2223) [3,2] r=0 lpr=2227 crt=201'1 lcod 0'0 mlcod 0'0 active+clean] handle_activate_map 
95440:2020-09-11 14:13:15.108032 7fba3d124700 10 osd.3 pg_epoch: 2232 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2231/0 2226/2227/2223) [3,2] r=0 lpr=2227 crt=201'1 lcod 0'0 mlcod 0'0 active+clean] state<Started/Primary/Active>: Active: handling ActMap
95442:2020-09-11 14:13:15.108040 7fba3d124700  7 osd.3 pg_epoch: 2232 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2231/0 2226/2227/2223) [3,2] r=0 lpr=2227 crt=201'1 lcod 0'0 mlcod 0'0 active+clean] state<Started/Primary>: handle ActMap primary
95444:2020-09-11 14:13:15.108047 7fba3d124700 15 osd.3 pg_epoch: 2232 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2231/0 2226/2227/2223) [3,2] r=0 lpr=2227 crt=201'1 lcod 0'0 mlcod 0'0 active+clean] publish_stats_to_osd 2231: no change since 2020-09-11 14:13:14.665310
95445:2020-09-11 14:13:15.108055 7fba3d124700 10 osd.3 pg_epoch: 2232 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2231/0 2226/2227/2223) [3,2] r=0 lpr=2227 crt=201'1 lcod 0'0 mlcod 0'0 active+clean] take_waiters
95446:2020-09-11 14:13:15.108062 7fba3d124700 20 osd.3 pg_epoch: 2232 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2231/0 2226/2227/2223) [3,2] r=0 lpr=2227 crt=201'1 lcod 0'0 mlcod 0'0 active+clean] handle_activate_map: Not dirtying info: last_persisted is 2231 while current is 2232
95447:2020-09-11 14:13:15.108069 7fba3d124700 10 osd.3 pg_epoch: 2232 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2231/0 2226/2227/2223) [3,2] r=0 lpr=2227 crt=201'1 lcod 0'0 mlcod 0'0 active+clean] handle_peering_event: epoch_sent: 2232 epoch_requested: 2232 NullEvt
97188:2020-09-11 14:13:15.784921 7fba4f557700 25 osd.3 2232  sending 11.4 2231:1347
97408:2020-09-11 14:13:16.204365 7fba46937700 25 osd.3 2232  ack on 11.4 2231:1347
97507:2020-09-11 14:13:16.231365 7fba49ec6700 30 osd.3 pg_epoch: 2232 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2231/0 2226/2227/2223) [3,2] r=0 lpr=2227 crt=201'1 lcod 0'0 mlcod 0'0 active+clean] lock
97833:2020-09-11 14:13:16.263230 7fba49ec6700 30 osd.3 pg_epoch: 2232 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2231/0 2226/2227/2223) [3,2] r=0 lpr=2227 crt=201'1 lcod 0'0 mlcod 0'0 active+clean] lock
97836:2020-09-11 14:13:16.263246 7fba49ec6700 10 osd.3 pg_epoch: 2232 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2231/0 2226/2227/2223) [3,2] r=0 lpr=2227 crt=201'1 lcod 0'0 mlcod 0'0 active+clean] null
98381:2020-09-11 14:13:16.268353 7fba3d124700 30 osd.3 pg_epoch: 2232 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2231/0 2226/2227/2223) [3,2] r=0 lpr=2227 crt=201'1 lcod 0'0 mlcod 0'0 active+clean] lock
98383:2020-09-11 14:13:16.268362 7fba3d124700 10 osd.3 pg_epoch: 2232 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2231/0 2226/2227/2223) [3,2] r=0 lpr=2227 crt=201'1 lcod 0'0 mlcod 0'0 active+clean] handle_advance_map [3,2]/[3,2] -- 3/3
98385:2020-09-11 14:13:16.268371 7fba3d124700 10 osd.3 pg_epoch: 2233 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2231/0 2226/2227/2223) [3,2] r=0 lpr=2227 crt=201'1 lcod 0'0 mlcod 0'0 active+clean] state<Started/Primary/Active>: Active advmap
98386:2020-09-11 14:13:16.268375 7fba3d124700 10 osd.3 pg_epoch: 2233 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2231/0 2226/2227/2223) [3,2] r=0 lpr=2227 crt=201'1 lcod 0'0 mlcod 0'0 active+clean] state<Started>: Started advmap
98387:2020-09-11 14:13:16.268380 7fba3d124700 10 osd.3 pg_epoch: 2233 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2231/0 2226/2227/2223) [3,2] r=0 lpr=2227 crt=201'1 lcod 0'0 mlcod 0'0 active+clean] check_recovery_sources no source osds () went down
98388:2020-09-11 14:13:16.268385 7fba3d124700 10 osd.3 pg_epoch: 2233 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2231/0 2226/2227/2223) [3,2] r=0 lpr=2227 crt=201'1 lcod 0'0 mlcod 0'0 active+clean] handle_activate_map 
98389:2020-09-11 14:13:16.268389 7fba3d124700 10 osd.3 pg_epoch: 2233 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2231/0 2226/2227/2223) [3,2] r=0 lpr=2227 crt=201'1 lcod 0'0 mlcod 0'0 active+clean] state<Started/Primary/Active>: Active: handling ActMap
98390:2020-09-11 14:13:16.268393 7fba3d124700  7 osd.3 pg_epoch: 2233 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2231/0 2226/2227/2223) [3,2] r=0 lpr=2227 crt=201'1 lcod 0'0 mlcod 0'0 active+clean] state<Started/Primary>: handle ActMap primary
98391:2020-09-11 14:13:16.268398 7fba3d124700 15 osd.3 pg_epoch: 2233 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2231/0 2226/2227/2223) [3,2] r=0 lpr=2227 crt=201'1 lcod 0'0 mlcod 0'0 active+clean] publish_stats_to_osd 2231: no change since 2020-09-11 14:13:14.665310
98392:2020-09-11 14:13:16.268403 7fba3d124700 10 osd.3 pg_epoch: 2233 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2231/0 2226/2227/2223) [3,2] r=0 lpr=2227 crt=201'1 lcod 0'0 mlcod 0'0 active+clean] take_waiters
98393:2020-09-11 14:13:16.268407 7fba3d124700 20 osd.3 pg_epoch: 2233 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2231/0 2226/2227/2223) [3,2] r=0 lpr=2227 crt=201'1 lcod 0'0 mlcod 0'0 active+clean] handle_activate_map: Not dirtying info: last_persisted is 2231 while current is 2233
98394:2020-09-11 14:13:16.268412 7fba3d124700 10 osd.3 pg_epoch: 2233 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2231/0 2226/2227/2223) [3,2] r=0 lpr=2227 crt=201'1 lcod 0'0 mlcod 0'0 active+clean] handle_peering_event: epoch_sent: 2233 epoch_requested: 2233 NullEvt
132571:2020-09-11 14:17:52.776670 7fba4fd58700 20 osd.3 2233 11.4 heartbeat_peers 2,3

1.1 收到新的osdmap

在将osd0踢出osdmap之后,osd3会接收到这一变化,从而触发新的Peering进程。此时osd3接收到e2226版本的osdmap:

2020-09-11 14:10:18.954484 7fba46937700 20 osd.3 2225 _dispatch 0x7fba6e721180 osd_map(2226..2226 src has 1514..2226) v3
2020-09-11 14:10:18.954505 7fba46937700  3 osd.3 2225 handle_osd_map epochs [2226,2226], i have 2226, src has [1514,2226]

然后调用如下函数来进行处理:

void OSD::handle_osd_map(MOSDMap *m)
{
	...

	store->queue_transaction(
		service.meta_osr.get(),
		std::move(t),
		new C_OnMapApply(&service, pinned_maps, last),
		new C_OnMapCommit(this, start, last, m), 0);

	...
}

void OSD::_committed_osd_maps(epoch_t first, epoch_t last, MOSDMap *m)
{
	...
	
	// yay!
	consume_map();

	...
}

void OSD::consume_map()
{
	...

	// scan pg's
	{
		RWLock::RLocker l(pg_map_lock);
		for (ceph::unordered_map<spg_t,PG*>::iterator it = pg_map.begin();it != pg_map.end();++it) {
			PG *pg = it->second;
			pg->lock();
			pg->queue_null(osdmap->get_epoch(), osdmap->get_epoch());
			pg->unlock();
		}
		
		logger->set(l_osd_pg, pg_map.size());
	}
	...
}

程序运行到这里,就为每个PG构造了一个NullEvt事件并放入OSD的peering消息队列,从而触发相应的Peering流程。
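
其中pg->queue_null()只是将一个NullEvt包装成CephPeeringEvt放入PG的peering队列,其实现大致如下(与后文queue_query()的形式一致,具体以源码为准):

void PG::queue_null(epoch_t msg_epoch, epoch_t query_epoch)
{
	dout(10) << "null" << dendl;
	queue_peering_event(
	  CephPeeringEvtRef(std::make_shared<CephPeeringEvt>(msg_epoch, query_epoch,
					 NullEvt())));
}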

2. 进入Peering流程

进入Peering流程的第一个处理函数为:

void OSD::process_peering_events(
  const list<PG*> &pgs,
  ThreadPool::TPHandle &handle
  )
{
	bool need_up_thru = false;
	epoch_t same_interval_since = 0;
	OSDMapRef curmap = service.get_osdmap();
	PG::RecoveryCtx rctx = create_context();
	rctx.handle = &handle;

	for (list<PG*>::const_iterator i = pgs.begin();i != pgs.end();++i) {
		set<boost::intrusive_ptr<PG> > split_pgs;
		PG *pg = *i;
		pg->lock_suspend_timeout(handle);
		curmap = service.get_osdmap();

		if (pg->deleting) {
			pg->unlock();
			continue;
		}

		if (!advance_pg(curmap->get_epoch(), pg, handle, &rctx, &split_pgs)) {
			// we need to requeue the PG explicitly since we didn't actually
			// handle an event
			peering_wq.queue(pg);
		} else {
			assert(!pg->peering_queue.empty());
			PG::CephPeeringEvtRef evt = pg->peering_queue.front();
			pg->peering_queue.pop_front();
			pg->handle_peering_event(evt, &rctx);
		}

		need_up_thru = pg->need_up_thru || need_up_thru;
		same_interval_since = MAX(pg->info.history.same_interval_since,
		same_interval_since);
		pg->write_if_dirty(*rctx.transaction);

		if (!split_pgs.empty()) {
			rctx.on_applied->add(new C_CompleteSplits(this, split_pgs));
			split_pgs.clear();
		}
		dispatch_context_transaction(rctx, pg, &handle);
		pg->unlock();
		handle.reset_tp_timeout();
	}
	if (need_up_thru)
		queue_want_up_thru(same_interval_since);

	dispatch_context(rctx, 0, curmap, &handle);
	
	service.send_pg_temp();
}

在上面的OSD::process_peering_events()函数中,遍历传入的每一个PG,调用OSD::advance_pg()来进行osdmap追赶。如果OSD::advance_pg()返回true,表明该PG已经追赶上最新的osdmap,可以处理peering event了;否则表明尚未追赶上,因此会将该PG重新放入peering_wq中。

bool OSD::advance_pg(
  epoch_t osd_epoch, PG *pg,
  ThreadPool::TPHandle &handle,
  PG::RecoveryCtx *rctx,
  set<boost::intrusive_ptr<PG> > *new_pgs)
{
	...

	for (;next_epoch <= osd_epoch && next_epoch <= max;++next_epoch) {
		...

		nextmap->pg_to_up_acting_osds(
			pg->info.pgid.pgid,
			&newup, &up_primary,
			&newacting, &acting_primary);

		pg->handle_advance_map(
			nextmap, lastmap, newup, up_primary,
			newacting, acting_primary, rctx);

		...
	}

	service.pg_update_epoch(pg->info.pgid, lastmap->get_epoch());
	pg->handle_activate_map(rctx);

	...
}

在这里针对PG 11.4而言,当前osd3的osdmap版本为e2226,而pg 11.4的osdmap版本为e2225,因此会执行上面的for循环。

1) 函数pg_to_up_acting_osds()

void pg_to_up_acting_osds(pg_t pg, vector<int> *up, int *up_primary,
            vector<int> *acting, int *acting_primary) const {
	_pg_to_up_acting_osds(pg, up, up_primary, acting, acting_primary);
}
void OSDMap::_pg_to_up_acting_osds(const pg_t& pg, vector<int> *up, int *up_primary,
                                   vector<int> *acting, int *acting_primary) const
{
	// 先由CRUSH计算原始映射并过滤掉不存在/out的osd得到up集合,
	// 再应用primary_affinity以及pg_temp/primary_temp等覆盖项得到acting集合(具体实现略)
	...
}

函数OSDMap::pg_to_up_acting_osds()根据osdmap来计算指定PG所映射到的osd。这里针对PG 11.4而言,计算出的newup为[3,2], newacting为[3].

2) 函数handle_advance_map()

void PG::handle_advance_map(
  OSDMapRef osdmap, OSDMapRef lastmap,
  vector<int>& newup, int up_primary,
  vector<int>& newacting, int acting_primary,
  RecoveryCtx *rctx)
{
	dout(10) << "handle_advance_map "
		<< newup << "/" << newacting
		<< " -- " << up_primary << "/" << acting_primary
		<< dendl;

	update_osdmap_ref(osdmap);

	...
	
	AdvMap evt(
		osdmap, lastmap, newup, up_primary,
		newacting, acting_primary);
	recovery_state.handle_event(evt, rctx);
	...
}

通过上面第一行的打印信息,我们了解到:PG 11.4新的up set为[3,2],acting set为[3],up_primary为3,acting_primary也为3。

之后会调用PG::update_osdmap_ref()将PG当前的osdmap进行更新;最后产生一个AdvMap事件,交由recovery_state来进行处理,从而触发peering进程。

3) 函数pg_update_epoch()

void pg_update_epoch(spg_t pgid, epoch_t epoch) {
	Mutex::Locker l(pg_epoch_lock);
	map<spg_t,epoch_t>::iterator t = pg_epoch.find(pgid);
	assert(t != pg_epoch.end());
	pg_epochs.erase(pg_epochs.find(t->second));
	t->second = epoch;
	pg_epochs.insert(epoch);
}

函数pg_update_epoch()更新OSDService中记录的该PG的epoch值(pg_epoch)。

4) 函数handle_activate_map()

void PG::handle_activate_map(RecoveryCtx *rctx)
{
	dout(10) << "handle_activate_map " << dendl;
	ActMap evt;
	recovery_state.handle_event(evt, rctx);
	if (osdmap_ref->get_epoch() - last_persisted_osdmap_ref->get_epoch() > cct->_conf->osd_pg_epoch_persisted_max_stale) {
		dout(20) << __func__ << ": Dirtying info: last_persisted is "<< last_persisted_osdmap_ref->get_epoch()
			<< " while current is " << osdmap_ref->get_epoch() << dendl;
		dirty_info = true;
	} else {
		dout(20) << __func__ << ": Not dirtying info: last_persisted is "<< last_persisted_osdmap_ref->get_epoch()
			<< " while current is " << osdmap_ref->get_epoch() << dendl;
	}

	if (osdmap_ref->check_new_blacklist_entries()) check_blacklisted_watchers();
}

函数handle_activate_map()用于激活当前阶段的osdmap,通常是用于触发向Replica发送通知消息,以推动Peering的进程。

注: 通常来说,AdvMap是同步完成的,而ActMap是异步完成的。

2.1 Clean状态对AdvMap事件的处理

由于在当前状态,PG 11.4已经处于clean状态,且osd3对于PG 11.4而言是主OSD,因此接收到AdvMap事件时,处理流程如下:

boost::statechart::result PG::RecoveryState::Active::react(const AdvMap& advmap)
{
	...

	return forward_event();
}

boost::statechart::result PG::RecoveryState::Started::react(const AdvMap& advmap)
{
  dout(10) << "Started advmap" << dendl;
  PG *pg = context< RecoveryMachine >().pg;
  pg->check_full_transition(advmap.lastmap, advmap.osdmap);
  if (pg->should_restart_peering(
	advmap.up_primary,
	advmap.acting_primary,
	advmap.newup,
	advmap.newacting,
	advmap.lastmap,
	advmap.osdmap)) {
    dout(10) << "should_restart_peering, transitioning to Reset" << dendl;
    post_event(advmap);
    return transit< Reset >();
  }
  pg->remove_down_peer_info(advmap.osdmap);
  return discard_event();
}

从以上代码可以看出,其最终会调用pg->should_restart_peering()来检查是否需要触发新的peering操作。通过以前的代码分析,我们知道只要满足如下条件之一即需要重新peering:

  • PG的acting primary发生了改变;

  • PG的acting set发生了改变;

  • PG的up primary发生了改变;

  • PG的up set发生了改变;

  • PG的min_size发生了改变(通常是调整了rule规则);

  • PG的副本size值发生了改变;

  • PG进行了分裂

  • PG的sort bitwise发生了改变

这里针对PG 11.4,其up set发生了变化,因此会触发新的peering操作,状态机随之进入Reset状态。
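
事实上,上述条件正是pg_interval_t::is_new_interval()中的各项判断,should_restart_peering()基本上只是对其做了一层封装,大致形式如下(简化示意,具体以源码为准):

bool PG::should_restart_peering(
  int newupprimary, int newactingprimary,
  const vector<int>& newup, const vector<int>& newacting,
  OSDMapRef lastmap, OSDMapRef osdmap)
{
	if (pg_interval_t::is_new_interval(
		primary.osd, newactingprimary,
		acting, newacting,
		up_primary.osd, newupprimary,
		up, newup,
		osdmap, lastmap,
		info.pgid.pgid)) {
		dout(20) << "new interval newup " << newup
			<< " newacting " << newacting << dendl;
		return true;
	} else {
		return false;
	}
}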

2.2 进入Reset状态

2.2.1 Reset构造函数
PG::RecoveryState::Reset::Reset(my_context ctx)
  : my_base(ctx),
    NamedState(context< RecoveryMachine >().pg->cct, "Reset")
{
	context< RecoveryMachine >().log_enter(state_name);
	PG *pg = context< RecoveryMachine >().pg;
	
	pg->flushes_in_progress = 0;
	pg->set_last_peering_reset();
}
void PG::set_last_peering_reset()
{
  dout(20) << "set_last_peering_reset " << get_osdmap()->get_epoch() << dendl;
  if (last_peering_reset != get_osdmap()->get_epoch()) {
    last_peering_reset = get_osdmap()->get_epoch();
    reset_interval_flush();
  }
}

在Reset构造函数中,调用pg->set_last_peering_reset将last_peering_reset设置为e2226.

2.2.2 处理AdvMap事件

在转入Reset状态之前,我们通过post_event()重新投递了AdvMap事件,进入Reset状态后即对该事件进行处理:

boost::statechart::result PG::RecoveryState::Reset::react(const AdvMap& advmap)
{
	PG *pg = context< RecoveryMachine >().pg;
	dout(10) << "Reset advmap" << dendl;
	
	// make sure we have past_intervals filled in.  hopefully this will happen
	// _before_ we are active.
	pg->generate_past_intervals();
	
	pg->check_full_transition(advmap.lastmap, advmap.osdmap);
	
	if (pg->should_restart_peering(
	  advmap.up_primary,
	  advmap.acting_primary,
	  advmap.newup,
	  advmap.newacting,
	  advmap.lastmap,
	  advmap.osdmap)) {
		dout(10) << "should restart peering, calling start_peering_interval again"<< dendl;
		pg->start_peering_interval(
		advmap.lastmap,
		advmap.newup, advmap.up_primary,
		advmap.newacting, advmap.acting_primary,
		context< RecoveryMachine >().get_cur_transaction());
	}

	pg->remove_down_peer_info(advmap.osdmap);
	return discard_event();
}

上面调用pg->should_restart_peering()再一次检查是否需要重新peering;如果需要,则调用pg->start_peering_interval()来启动新的peering流程。

1) 函数generate_past_intervals()

void PG::generate_past_intervals(){
	....

	if (!_calc_past_interval_range(&cur_epoch, &end_epoch,
		osd->get_superblock().oldest_map)) {
		if (info.history.same_interval_since == 0) {
			info.history.same_interval_since = end_epoch;
			dirty_info = true;
		}
		return;
	}

	...
}
bool PG::_calc_past_interval_range(epoch_t *start, epoch_t *end, epoch_t oldest_map)
{
	if (info.history.same_interval_since) {
		*end = info.history.same_interval_since;
	} else {
		// PG must be imported, so let's calculate the whole range.
		*end = osdmap_ref->get_epoch();
	}
	
	// Do we already have the intervals we want?
	map<epoch_t,pg_interval_t>::const_iterator pif = past_intervals.begin();
	if (pif != past_intervals.end()) {
		if (pif->first <= info.history.last_epoch_clean) {
			dout(10) << __func__ << ": already have past intervals back to "<< info.history.last_epoch_clean << dendl;
			return false;
		}
		*end = past_intervals.begin()->first;
	}
	
	*start = MAX(MAX(info.history.epoch_created,
		info.history.last_epoch_clean),
		oldest_map);
	if (*start >= *end) {
		dout(10) << __func__ << " start epoch " << *start << " >= end epoch " << *end<< ", nothing to do" << dendl;
		return false;
	}
	
	return true;
}

函数_calc_past_interval_range()用于计算一个past_interval的范围。所谓一个past_interval,是指一个连续的epoch段[epoch_start, epoch_end],在该epoch段内:

  • PG的acting primary保持不变;

  • PG的acting set保持不变;

  • PG的up primary保持不变;

  • PG的up set保持不变;

  • PG的min_size保持不变;

  • PG的副本size值保持不变;

  • PG没有进行分裂;

  • PG的sort bitwise保持不变;

函数_calc_past_interval_range()如果返回true,表示成功计算到一个past_interval;如果返回false,则表示该interval已经计算过,不必再计算,或者不是一个有效的past_interval。

结合当前的打印日志消息:

42238:2020-09-11 14:10:18.974114 7fba3d925700 10 osd.3 pg_epoch: 2226 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2223/2223/2223) [3] r=0 lpr=2226 luod=0'0 crt=201'1 lcod 0'0 mlcod 0'0 active] _calc_past_interval_range start epoch 2224 >= end epoch 2223, nothing to do

当前info.history.same_interval_since的值为e2223,故end epoch为e2223;而start epoch取epoch_created、last_epoch_clean(e2224)与oldest_map中的最大值,即e2224。由于start >= end,函数_calc_past_interval_range()返回false,表明这里没有有效的past_interval需要计算。

2) 函数start_peering_interval()

void PG::start_peering_interval(
  const OSDMapRef lastmap,
  const vector<int>& newup, int new_up_primary,
  const vector<int>& newacting, int new_acting_primary,
  ObjectStore::Transaction *t)
{
	...

	set_last_peering_reset();

	...

	init_primary_up_acting(
		newup,
		newacting,
		new_up_primary,
		new_acting_primary);

	....
	if (!lastmap) {
		// 没有lastmap(如新创建/导入的PG),直接将same_interval_since置为当前epoch
	} else {
		...
		pg_interval_t::check_new_interval(....);

		dout(10) << __func__ << ": check_new_interval output: "
			<< debug.str() << dendl;
		if (new_interval) {
			dout(10) << " noting past " << past_intervals.rbegin()->second << dendl;
			dirty_info = true;
			dirty_big_info = true;
			info.history.same_interval_since = osdmap->get_epoch();
		}
	}

	...

	dout(10) << " up " << oldup << " -> " << up 
		<< ", acting " << oldacting << " -> " << acting 
		<< ", acting_primary " << old_acting_primary << " -> " << new_acting_primary
		<< ", up_primary " << old_up_primary << " -> " << new_up_primary
		<< ", role " << oldrole << " -> " << role
		<< ", features acting " << acting_features
		<< " upacting " << upacting_features
		<< dendl;
	
	// deactivate.
	state_clear(PG_STATE_ACTIVE);
	state_clear(PG_STATE_PEERED);
	state_clear(PG_STATE_DOWN);
	state_clear(PG_STATE_RECOVERY_WAIT);
	state_clear(PG_STATE_RECOVERING);
	
	peer_purged.clear();
	actingbackfill.clear();
	snap_trim_queued = false;
	scrub_queued = false;
	
	// reset primary state?
	if (was_old_primary || is_primary()) {
		osd->remove_want_pg_temp(info.pgid.pgid);
	}
	clear_primary_state();

	// pg->on_*
	on_change(t);

	assert(!deleting);
	
	// should we tell the primary we are here?
	send_notify = !is_primary();
	
	if (role != oldrole || was_old_primary != is_primary()) {
		// did primary change?
		if (was_old_primary != is_primary()) {
			state_clear(PG_STATE_CLEAN);
			clear_publish_stats();
	
			// take replay queue waiters
			list<OpRequestRef> ls;

			for (map<eversion_t,OpRequestRef>::iterator it = replay_queue.begin();it != replay_queue.end();++it)
				ls.push_back(it->second);

			replay_queue.clear();
			requeue_ops(ls);
		}
	
		on_role_change();
	
		// take active waiters
		requeue_ops(waiting_for_peered);
	
	} else {
		....
	}

	cancel_recovery();
	....
}

在收到新的osdmap、调用handle_advance_map(),进而触发新的Peering之前,会调用PG::start_peering_interval()完成相关状态的重新设置。

  • set_last_peering_reset(): 将last_peering_reset更新为当前osdmap的epoch,并重置interval flush,相当于废弃上一次的peering结果;

  • init_primary_up_acting(): 设置当前新的acting set以及up set;

  • 设置info.stats的up set、acting set的值;

  • 根据up与acting是否一致,设置或清除PG的REMAPPED状态:

// This will now be remapped during a backfill in cases
// that it would not have been before.
if (up != acting)
	state_set(PG_STATE_REMAPPED);
else
	state_clear(PG_STATE_REMAPPED);

由于当前PG的up set的值为[3,2],acting set的值为[3],因此这里设置PG的状态为REMAPPED

  • 计算PG 11.4中OSD3的角色
int OSDMap::calc_pg_rank(int osd, const vector<int>& acting, int nrep)
{
	if (!nrep)
		nrep = acting.size();
	for (int i=0; i<nrep; i++) 
		if (acting[i] == osd)
			return i;
	return -1;
}

int OSDMap::calc_pg_role(int osd, const vector<int>& acting, int nrep)
{
	if (!nrep)
		nrep = acting.size();
	return calc_pg_rank(osd, acting, nrep);
}

这里按acting set [3]计算,osd3的rank为0,即角色为primary;若calc_pg_rank()返回-1,则表示该osd不在acting set中,角色为stray。

  • check_new_interval(): 用于检查是否是一个新的interval。如果是新interval,则计算出该新的past_interval:

42242:2020-09-11 14:10:18.974135 7fba3d925700 10 osd.3 pg_epoch: 2226 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2223/2223/2223) [3,2]/[3] r=0 lpr=2226 pi=2223-2225/1 luod=0'0 crt=201'1 lcod 0'0 mlcod 0'0 active+remapped] start_peering_interval: check_new_interval output: generate_past_intervals interval(2223-2225 up 3 acting 3): not rw, up_thru 2223 up_from 2123 last_epoch_clean 2224

当前PG的osdmap epoch为e2226,up set为[3,2],acting set为[3];计算出的past interval为[e2223,e2225],在该interval期间,up set与acting set均为[3],up_thru为e2223,up_from为e2123,last_epoch_clean为e2224。
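
check_new_interval()的参数较多,这里仅给出其核心逻辑的简化示意(省略了大部分参数与分支,具体以源码为准):

bool pg_interval_t::check_new_interval(/* 旧/新的up、acting、lastmap、osdmap等 */)
{
	if (!is_new_interval(/* ... */))
		return false;                      // 不是新interval,无需记录

	pg_interval_t& i = (*past_intervals)[same_interval_since];
	i.first = same_interval_since;         // 刚结束的interval的起始epoch
	i.last = osdmap->get_epoch() - 1;      // 结束epoch为新map的前一个epoch
	i.up = old_up;
	i.acting = old_acting;

	// maybe_went_rw: 若该interval内acting数量达到min_size、PG曾经激活
	// (last_epoch_started >= i.first),且primary的up_thru覆盖该interval,
	// 则认为期间可能发生过写操作
	...
	return true;
}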

  • 清除PG的相关状态。PG的状态由一个unsigned类型的整数表示:
/*
 * pg states
 */
#define PG_STATE_CREATING     (1<<0)  // creating
#define PG_STATE_ACTIVE       (1<<1)  // i am active.  (primary: replicas too)
#define PG_STATE_CLEAN        (1<<2)  // peers are complete, clean of stray replicas.
#define PG_STATE_DOWN         (1<<4)  // a needed replica is down, PG offline
#define PG_STATE_REPLAY       (1<<5)  // crashed, waiting for replay
//#define PG_STATE_STRAY      (1<<6)  // i must notify the primary i exist.
#define PG_STATE_SPLITTING    (1<<7)  // i am splitting
#define PG_STATE_SCRUBBING    (1<<8)  // scrubbing
#define PG_STATE_SCRUBQ       (1<<9)  // queued for scrub
#define PG_STATE_DEGRADED     (1<<10) // pg contains objects with reduced redundancy
#define PG_STATE_INCONSISTENT (1<<11) // pg replicas are inconsistent (but shouldn't be)
#define PG_STATE_PEERING      (1<<12) // pg is (re)peering
#define PG_STATE_REPAIR       (1<<13) // pg should repair on next scrub
#define PG_STATE_RECOVERING   (1<<14) // pg is recovering/migrating objects
#define PG_STATE_BACKFILL_WAIT (1<<15) // [active] reserving backfill
#define PG_STATE_INCOMPLETE   (1<<16) // incomplete content, peering failed.
#define PG_STATE_STALE        (1<<17) // our state for this pg is stale, unknown.
#define PG_STATE_REMAPPED     (1<<18) // pg is explicitly remapped to different OSDs than CRUSH
#define PG_STATE_DEEP_SCRUB   (1<<19) // deep scrub: check CRC32 on files
#define PG_STATE_BACKFILL     (1<<20) // [active] backfilling pg content
#define PG_STATE_BACKFILL_TOOFULL (1<<21) // backfill can't proceed: too full
#define PG_STATE_RECOVERY_WAIT (1<<22) // waiting for recovery reservations
#define PG_STATE_UNDERSIZED   (1<<23) // pg acting < pool size
#define PG_STATE_ACTIVATING   (1<<24) // pg is peered but not yet active
#define PG_STATE_PEERED       (1<<25) // peered, cannot go active, can recover
#define PG_STATE_SNAPTRIM     (1<<26) // trimming snaps
#define PG_STATE_SNAPTRIM_WAIT (1<<27) // queued to trim snaps

这里将active状态清除了,因此变为inactive状态。

  • 清空primary的peering状态
if (was_old_primary || is_primary()) {
	osd->remove_want_pg_temp(info.pgid.pgid);
}
clear_primary_state();
  • PG 11.4所在OSD3的角色未发生变化,仍然为primary,因此有如下输出
42263:2020-09-11 14:10:18.974231 7fba3d925700 10 osd.3 pg_epoch: 2226 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223) [3,2]/[3] r=0 lpr=2226 pi=2223-2225/1 crt=201'1 lcod 0'0 mlcod 0'0 remapped] [3] -> [3], replicas changed
  • cancel_recovery(): 取消PG当前正在进行中的recovery动作;

到此为止,由OSD::advance_pg()函数中pg->handle_advance_map()所触发的AdvMap事件就已经处理完成。

AdvMap事件的主要作用是:检查是否需要重新Peering;如果需要,则完成Peering前的相关重置与初始化操作。

2.2.3 Reset状态下对ActMap事件的处理

这里针对PG 11.4而言,通过AdvMap事件完成了Peering的相关检查操作。然后OSD::advance_pg()会调用PG::handle_activate_map()来激活该osdmap,并产生ActMap事件。Reset状态下对该事件的处理参看如下日志片段:

42267:2020-09-11 14:10:18.974247 7fba3d925700 10 osd.3 pg_epoch: 2226 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223) [3,2]/[3] r=0 lpr=2226 pi=2223-2225/1 crt=201'1 lcod 0'0 mlcod 0'0 remapped] handle_activate_map 
42268:2020-09-11 14:10:18.974251 7fba3d925700 10 osd.3 pg_epoch: 2226 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223) [3,2]/[3] r=0 lpr=2226 pi=2223-2225/1 crt=201'1 lcod 0'0 mlcod 0'0 remapped] update_heartbeat_peers 3 -> 2,3
42270:2020-09-11 14:10:18.974256 7fba3d925700 10 osd.3 pg_epoch: 2226 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223) [3,2]/[3] r=0 lpr=2226 pi=2223-2225/1 crt=201'1 lcod 0'0 mlcod 0'0 remapped] take_waiters
42271:2020-09-11 14:10:18.974260 7fba3d925700  5 osd.3 pg_epoch: 2226 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223) [3,2]/[3] r=0 lpr=2226 pi=2223-2225/1 crt=201'1 lcod 0'0 mlcod 0'0 remapped] exit Reset 0.000165 1 0.000199

如下是相关的处理函数:

boost::statechart::result PG::RecoveryState::Reset::react(const ActMap&)
{
  PG *pg = context< RecoveryMachine >().pg;
  if (pg->should_send_notify() && pg->get_primary().osd >= 0) {
    context< RecoveryMachine >().send_notify(
      pg->get_primary(),
      pg_notify_t(
	pg->get_primary().shard, pg->pg_whoami.shard,
	pg->get_osdmap()->get_epoch(),
	pg->get_osdmap()->get_epoch(),
	pg->info),
      pg->past_intervals);
  }

  pg->update_heartbeat_peers();
  pg->take_waiters();

  return transit< Started >();
}

通常在osdmap发生变动,从而引发PG的acting set、up set发生变化时,PG的副本OSD会向Primary OSD发送一个通知消息。这里由于osd3本身就是PG 11.4的primary OSD,因此不需要发送通知消息。

注: 关于send_notify变量的设置,是在Initial::react(const Load&)函数中。另外send_notify()最终的发送函数为OSD::do_notifies()

2.3 进入Started状态

PG::RecoveryState::Started::Started(my_context ctx)
  : my_base(ctx),
    NamedState(context< RecoveryMachine >().pg->cct, "Started")
{
	context< RecoveryMachine >().log_enter(state_name);
}

2.3.1 进入Started/Start状态

Started的默认初始子状态为Start状态:

PG::RecoveryState::Start::Start(my_context ctx)
  : my_base(ctx),
    NamedState(context< RecoveryMachine >().pg->cct, "Start")
{
	context< RecoveryMachine >().log_enter(state_name);
	
	PG *pg = context< RecoveryMachine >().pg;
	if (pg->is_primary()) {
		dout(1) << "transitioning to Primary" << dendl;
		post_event(MakePrimary());
	} else { //is_stray
		dout(1) << "transitioning to Stray" << dendl; 
		post_event(MakeStray());
	}
}

这里osd3对于PG 11.4而言为主OSD,因此产生MakePrimary()事件,进入Started/Primary状态。

2.3.2 进入Started/Primary状态
PG::RecoveryState::Primary::Primary(my_context ctx)
  : my_base(ctx),
    NamedState(context< RecoveryMachine >().pg->cct, "Started/Primary")
{
	context< RecoveryMachine >().log_enter(state_name);
	PG *pg = context< RecoveryMachine >().pg;
	assert(pg->want_acting.empty());
	
	// set CREATING bit until we have peered for the first time.
	if (pg->info.history.last_epoch_started == 0) {
		pg->state_set(PG_STATE_CREATING);
		// use the history timestamp, which ultimately comes from the
		// monitor in the create case.
		utime_t t = pg->info.history.last_scrub_stamp;
		pg->info.stats.last_fresh = t;
		pg->info.stats.last_active = t;
		pg->info.stats.last_change = t;
		pg->info.stats.last_peered = t;
		pg->info.stats.last_clean = t;
		pg->info.stats.last_unstale = t;
		pg->info.stats.last_undegraded = t;
		pg->info.stats.last_fullsized = t;
		pg->info.stats.last_scrub_stamp = t;
		pg->info.stats.last_deep_scrub_stamp = t;
		pg->info.stats.last_clean_scrub_stamp = t;
	}
}

2.4 进入Peering状态

Started/Primary的默认初始子状态为Started/Primary/Peering,我们来看如下Peering的构造函数:

PG::RecoveryState::Peering::Peering(my_context ctx)
  : my_base(ctx),
    NamedState(context< RecoveryMachine >().pg->cct, "Started/Primary/Peering"),
    history_les_bound(false)
{
	context< RecoveryMachine >().log_enter(state_name);
	
	PG *pg = context< RecoveryMachine >().pg;
	assert(!pg->is_peered());
	assert(!pg->is_peering());
	assert(pg->is_primary());
	pg->state_set(PG_STATE_PEERING);
}

进入Peering状态后,会将PG的状态设置为Peering。然后直接进入Peering的默认初始子状态GetInfo.

2.4.1 进入Peering/GetInfo状态
/*--------GetInfo---------*/
PG::RecoveryState::GetInfo::GetInfo(my_context ctx)
  : my_base(ctx),
    NamedState(context< RecoveryMachine >().pg->cct, "Started/Primary/Peering/GetInfo")
{
	context< RecoveryMachine >().log_enter(state_name);
	
	PG *pg = context< RecoveryMachine >().pg;
	pg->generate_past_intervals();
	unique_ptr<PriorSet> &prior_set = context< Peering >().prior_set;
	
	assert(pg->blocked_by.empty());
	
	if (!prior_set.get())
		pg->build_prior(prior_set);
	
	pg->reset_min_peer_features();
	get_infos();

	if (peer_info_requested.empty() && !prior_set->pg_down) {
		post_event(GotInfo());
	}
}

进入GetInfo状态后,首先调用generate_past_intervals()计算past_intervals。这里我们在前面Reset阶段就已经计算出了新的past_interval为[2223,2225]。从如下日志片段中也能看出:

42279:2020-09-11 14:10:18.974292 7fba3d925700 10 osd.3 pg_epoch: 2226 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223) [3,2]/[3] r=0 lpr=2226 pi=2223-2225/1 crt=201'1 lcod 0'0 mlcod 0'0 remapped+peering] _calc_past_interval_range: already have past intervals back to 2224

1) 函数PG::build_prior()

void PG::build_prior(std::unique_ptr<PriorSet> &prior_set)
{
	if (1) {
		// sanity check
		for (map<pg_shard_t,pg_info_t>::iterator it = peer_info.begin();it != peer_info.end();++it) {
			assert(info.history.last_epoch_started >= it->second.history.last_epoch_started);
		}
	}

	prior_set.reset(
		new PriorSet(
			pool.info.ec_pool(),
			get_pgbackend()->get_is_recoverable_predicate(),
			*get_osdmap(),
			past_intervals,
			up,
			acting,
			info,
			this));
	PriorSet &prior(*prior_set.get());
	
	if (prior.pg_down) {
		state_set(PG_STATE_DOWN);
	}
	
	if (get_osdmap()->get_up_thru(osd->whoami) < info.history.same_interval_since) {
		dout(10) << "up_thru " << get_osdmap()->get_up_thru(osd->whoami)
			<< " < same_since " << info.history.same_interval_since
			<< ", must notify monitor" << dendl;
			need_up_thru = true;
	} else {
		dout(10) << "up_thru " << get_osdmap()->get_up_thru(osd->whoami)
			<< " >= same_since " << info.history.same_interval_since
			<< ", all is well" << dendl;
		need_up_thru = false;
	}
	set_probe_targets(prior_set->probe);
}

关于PriorSet,我们这里不做介绍,后面我们会再开辟专门的章节来进行说明(参见up_thru)。这里我们给出针对PG 11.4所构建的PriorSet的结果:

42280:2020-09-11 14:10:18.974297 7fba3d925700 10 osd.3 pg_epoch: 2226 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223) [3,2]/[3] r=0 lpr=2226 pi=2223-2225/1 crt=201'1 lcod 0'0 mlcod 0'0 remapped+peering]  PriorSet: build_prior interval(2223-2225 up [3](3) acting [3](3) maybe_went_rw)
42281:2020-09-11 14:10:18.974301 7fba3d925700 10 osd.3 pg_epoch: 2226 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223) [3,2]/[3] r=0 lpr=2226 pi=2223-2225/1 crt=201'1 lcod 0'0 mlcod 0'0 remapped+peering]  PriorSet: build_prior final: probe 2,3 down  blocked_by {}
42282:2020-09-11 14:10:18.974305 7fba3d925700 10 osd.3 pg_epoch: 2226 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223) [3,2]/[3] r=0 lpr=2226 pi=2223-2225/1 crt=201'1 lcod 0'0 mlcod 0'0 remapped+peering] up_thru 2223 < same_since 2226, must notify monitor

当前osdmap的epoch为e2226,对于osd3来说其上一次申请的up_thru为e2223,而当前的same_interval_since为e2226,因此这里need_up_thru需要设置为true。
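
need_up_thru置位后,前面OSD::process_peering_events()尾部会调用queue_want_up_thru(same_interval_since),通过send_alive()向monitor申请新的up_thru,其实现大致如下(以源码为准):

void OSD::queue_want_up_thru(epoch_t want)
{
	map_lock.get_read();
	epoch_t cur = osdmap->get_up_thru(whoami);
	Mutex::Locker l(mon_report_lock);
	if (want > up_thru_wanted) {
		dout(10) << "queue_want_up_thru now " << want << " (was " << up_thru_wanted << ")"
			<< ", currently " << cur << dendl;
		up_thru_wanted = want;
		send_alive();   // 通过MOSDAlive消息向monitor申请up_thru
	} else {
		dout(10) << "queue_want_up_thru want " << want << " <= queued " << up_thru_wanted
			<< ", currently " << cur << dendl;
	}
	map_lock.put_read();
}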

2) 函数PG::get_infos()

void PG::RecoveryState::GetInfo::get_infos()
{
	PG *pg = context< RecoveryMachine >().pg;
	unique_ptr<PriorSet> &prior_set = context< Peering >().prior_set;
	
	pg->blocked_by.clear();
	for (set<pg_shard_t>::const_iterator it = prior_set->probe.begin();it != prior_set->probe.end();++it) {
		pg_shard_t peer = *it;
		if (peer == pg->pg_whoami) {
			continue;
		}
		if (pg->peer_info.count(peer)) {
			dout(10) << " have osd." << peer << " info " << pg->peer_info[peer] << dendl;
			continue;
		}

		if (peer_info_requested.count(peer)) {
			dout(10) << " already requested info from osd." << peer << dendl;
			pg->blocked_by.insert(peer.osd);
		} else if (!pg->get_osdmap()->is_up(peer.osd)) {

			dout(10) << " not querying info from down osd." << peer << dendl;

		} else {
			dout(10) << " querying info from osd." << peer << dendl;
			context< RecoveryMachine >().send_query(
				peer, pg_query_t(pg_query_t::INFO,
				it->shard, pg->pg_whoami.shard,
				pg->info.history,
				pg->get_osdmap()->get_epoch()));

			peer_info_requested.insert(peer);
			pg->blocked_by.insert(peer.osd);
		}
	}
	
	pg->publish_stats_to_osd();
}


//位于src/osd/pg.h的RecoveryMachine中
void send_query(pg_shard_t to, const pg_query_t &query) {
	assert(state->rctx);
	assert(state->rctx->query_map);
	(*state->rctx->query_map)[to.osd][spg_t(pg->info.pgid.pgid, to.shard)] = query;
}

这里,上面构造出的PriorSet为:

42280:2020-09-11 14:10:18.974297 7fba3d925700 10 osd.3 pg_epoch: 2226 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223) [3,2]/[3] r=0 lpr=2226 pi=2223-2225/1 crt=201'1 lcod 0'0 mlcod 0'0 remapped+peering]  PriorSet: build_prior interval(2223-2225 up [3](3) acting [3](3) maybe_went_rw)
42281:2020-09-11 14:10:18.974301 7fba3d925700 10 osd.3 pg_epoch: 2226 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223) [3,2]/[3] r=0 lpr=2226 pi=2223-2225/1 crt=201'1 lcod 0'0 mlcod 0'0 remapped+peering]  PriorSet: build_prior final: probe 2,3 down  blocked_by {}

GetInfo阶段由PG的主OSD向从OSD发送pg_query_t::INFO消息,以获取从OSD的pg_info_t信息。对于PG 11.4而言,目前其prior_set->probe为[2,3],因此这里会向osd2发出pg_query_t::INFO请求。我们先来讲述一下GetInfo究竟要获取哪些info信息,下面先看一下pg_query_t结构体的定义:

/** 
 * pg_query_t - used to ask a peer for information about a pg.
 *
 * note: if version=0, type=LOG, then we just provide our full log.
 */
struct pg_query_t {
	enum {
		INFO = 0,
		LOG = 1,
		MISSING = 4,
		FULLLOG = 5,
	};
	
	__s32 type;
	eversion_t since;
	pg_history_t history;
	epoch_t epoch_sent;
	shard_id_t to;
	shard_id_t from;
};

如下则是pg_info_t结构体的定义(src/osd/osd_types.h):

/**
 * pg_info_t - summary of PG statistics.
 *
 * some notes: 
 *  - last_complete implies we have all objects that existed as of that
 *    stamp, OR a newer object, OR have already applied a later delete.
 *  - if last_complete >= log.bottom, then we know pg contents thru log.head.
 *    otherwise, we have no idea what the pg is supposed to contain.
 */
struct pg_info_t {
	spg_t pgid;
	eversion_t last_update;                     ///< last object version applied to store.
	eversion_t last_complete;                  ///< last version pg was complete through.
	epoch_t last_epoch_started;                ///< last epoch at which this pg started on this osd
	
	version_t last_user_version;               ///< last user object version applied to store
	
	eversion_t log_tail;                       ///< oldest log entry.
	
	hobject_t last_backfill;                   ///< objects >= this and < last_complete may be missing
	bool last_backfill_bitwise;                ///< true if last_backfill reflects a bitwise (vs nibblewise) sort
	
	interval_set<snapid_t> purged_snaps;
	
	pg_stat_t stats;
	
	pg_history_t history;
	pg_hit_set_history_t hit_set;
};

上面send_query()函数将要查询的pg_query_t::INFO请求放入RecoveryMachine所持有的rctx->query_map中,然后由OSD::process_peering_events()中的dispatch_context()将消息发送出去:

void OSD::process_peering_events(
  const list<PG*> &pgs,
  ThreadPool::TPHandle &handle
  )
{
	...
	
	dispatch_context(rctx, 0, curmap, &handle);

	...
}

void OSD::dispatch_context(PG::RecoveryCtx &ctx, PG *pg, OSDMapRef curmap,
                           ThreadPool::TPHandle *handle)
{
	if (service.get_osdmap()->is_up(whoami) && is_active()) {

		do_notifies(*ctx.notify_list, curmap);
		do_queries(*ctx.query_map, curmap);
		do_infos(*ctx.info_map, curmap);

	}
}

/** do_queries
 * send out pending queries for info | summaries
 */
void OSD::do_queries(map<int, map<spg_t,pg_query_t> >& query_map,
		     OSDMapRef curmap)
{
	for (map<int, map<spg_t,pg_query_t> >::iterator pit = query_map.begin();pit != query_map.end();++pit) {
		if (!curmap->is_up(pit->first)) {
			dout(20) << __func__ << " skipping down osd." << pit->first << dendl;
			continue;
		}
		int who = pit->first;
		ConnectionRef con = service.get_con_osd_cluster(who, curmap->get_epoch());

		if (!con) {
			dout(20) << __func__ << " skipping osd." << who
				<< " (NULL con)" << dendl;
			continue;
		}
		service.share_map_peer(who, con.get(), curmap);
		dout(7) << __func__ << " querying osd." << who
			<< " on " << pit->second.size() << " PGs" << dendl;

		MOSDPGQuery *m = new MOSDPGQuery(curmap->get_epoch(), pit->second);
		con->send_message(m);
	}
}

接着,从OSD在接收到MSG_OSD_PG_QUERY消息后进行如下处理:

void OSD::dispatch_op(OpRequestRef op)
{
	switch (op->get_req()->get_type()) {
	
		case MSG_OSD_PG_CREATE:
			handle_pg_create(op);
			break;
		case MSG_OSD_PG_NOTIFY:
			handle_pg_notify(op);
			break;
		case MSG_OSD_PG_QUERY:
			handle_pg_query(op);
			break;
		
		...
	}
}

/** PGQuery
 * from primary to replica | stray
 * NOTE: called with opqueue active.
 */
void OSD::handle_pg_query(OpRequestRef op)
{
	....

	map< int, vector<pair<pg_notify_t, pg_interval_map_t> > > notify_list;

	for (map<spg_t,pg_query_t>::iterator it = m->pg_list.begin();it != m->pg_list.end();++it) {
	
		...

		
		//处理在pg_map中找到了的PG的query请求		
		{
			RWLock::RLocker l(pg_map_lock);
			if (pg_map.count(pgid)) {
				PG *pg = 0;
				pg = _lookup_lock_pg_with_map_lock_held(pgid);
				pg->queue_query(
					it->second.epoch_sent, it->second.epoch_sent,
					pg_shard_t(from, it->second.from), it->second);
				pg->unlock();
				continue;
			}
		}

		//处理在PG_map中未找到的PG的query请求
	}

	do_notifies(notify_list, osdmap);
}

void PG::queue_query(epoch_t msg_epoch,
		     epoch_t query_epoch,
		     pg_shard_t from, const pg_query_t& q)
{
	dout(10) << "handle_query " << q << " from replica " << from << dendl;
	queue_peering_event(
      CephPeeringEvtRef(std::make_shared<CephPeeringEvt>(msg_epoch, query_epoch,
					 MQuery(from, q, query_epoch))));
}

void OSD::process_peering_events(
  const list<PG*> &pgs,
  ThreadPool::TPHandle &handle
  )	
{
	...

	PG::CephPeeringEvtRef evt = pg->peering_queue.front();
	pg->peering_queue.pop_front();
	pg->handle_peering_event(evt, &rctx);

	...
}

void PG::handle_peering_event(CephPeeringEvtRef evt, RecoveryCtx *rctx)
{
	dout(10) << "handle_peering_event: " << evt->get_desc() << dendl;
	if (!have_same_or_newer_map(evt->get_epoch_sent())) {
		dout(10) << "deferring event " << evt->get_desc() << dendl;
		peering_waiters.push_back(evt);
		return;
	}

	if (old_peering_evt(evt))
		return;
	recovery_state.handle_event(evt, rctx);
}


boost::statechart::result PG::RecoveryState::Stray::react(const MQuery& query)
{
	PG *pg = context< RecoveryMachine >().pg;
	if (query.query.type == pg_query_t::INFO) {
		pair<pg_shard_t, pg_info_t> notify_info;
		pg->update_history_from_master(query.query.history);
		pg->fulfill_info(query.from, query.query, notify_info);

		context< RecoveryMachine >().send_notify(
			notify_info.first,
			pg_notify_t(
			  notify_info.first.shard, pg->pg_whoami.shard,
			  query.query_epoch,
			  pg->get_osdmap()->get_epoch(),
			  notify_info.second),
			  pg->past_intervals);
	}

	...
}

/** do_notifies
 * Send an MOSDPGNotify to a primary, with a list of PGs that I have
 * content for, and they are primary for.
 */

void OSD::do_notifies(
  map<int,vector<pair<pg_notify_t,pg_interval_map_t> > >& notify_list,
  OSDMapRef curmap)
{
	for (map<int,vector<pair<pg_notify_t,pg_interval_map_t> > >::iterator it =notify_list.begin();it != notify_list.end();++it) {
		if (!curmap->is_up(it->first)) {
			dout(20) << __func__ << " skipping down osd." << it->first << dendl;
			continue;
		}

		ConnectionRef con = service.get_con_osd_cluster(it->first, curmap->get_epoch());
		if (!con) {
			dout(20) << __func__ << " skipping osd." << it->first<< " (NULL con)" << dendl;
			continue;
		}
		service.share_map_peer(it->first, con.get(), curmap);
		dout(7) << __func__ << " osd " << it->first << " on " << it->second.size() << " PGs" << dendl;
		MOSDPGNotify *m = new MOSDPGNotify(curmap->get_epoch(),it->second);
		con->send_message(m);
	}
}

从上面我们可以看到,查询pg_query_t::INFO时返回的是一个pg_notify_t类型的包装,其中封装了pg_info_t数据结构。
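
如下是pg_notify_t结构体的定义(位于src/osd/osd_types.h,这里仅列出主要成员):

struct pg_notify_t {
	epoch_t query_epoch;      ///< 所响应的query对应的epoch
	epoch_t epoch_sent;       ///< 发送该notify时的osdmap epoch
	pg_info_t info;           ///< 本副本的pg_info_t
	shard_id_t to;
	shard_id_t from;
	...
};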

注: 从上面OSD::handle_pg_query()的注释可看出,该查询消息由Primary OSD发往replica/stray,而handle_pg_query()则是replica/stray一侧处理该查询请求的入口


之后,PG主OSD接收到pg_query_t::INFO的响应信息(MSG_OSD_PG_NOTIFY):

void OSD::dispatch_op(OpRequestRef op)
{
	switch (op->get_req()->get_type()) {
	
		case MSG_OSD_PG_CREATE:
			handle_pg_create(op);
			break;
		case MSG_OSD_PG_NOTIFY:
			handle_pg_notify(op);
			break;
		case MSG_OSD_PG_QUERY:
			handle_pg_query(op);
			break;
		
		...
	}
}

/** PGNotify
 * from non-primary to primary
 * includes pg_info_t.
 * NOTE: called with opqueue active.
 */
void OSD::handle_pg_notify(OpRequestRef op)
{
	for (vector<pair<pg_notify_t, pg_interval_map_t> >::iterator it = m->get_pg_list().begin();it != m->get_pg_list().end();++it) {
		
		if (it->first.info.pgid.preferred() >= 0) {
			dout(20) << "ignoring localized pg " << it->first.info.pgid << dendl;
			continue;
		}
		
		handle_pg_peering_evt(
			spg_t(it->first.info.pgid.pgid, it->first.to),
			it->first.info.history, it->second,
			it->first.query_epoch,
			PG::CephPeeringEvtRef(
				new PG::CephPeeringEvt(
					it->first.epoch_sent, it->first.query_epoch,
					PG::MNotifyRec(pg_shard_t(from, it->first.from), it->first,
					op->get_req()->get_connection()->get_features())))
		);
	}
}

/*
 * look up a pg.  if we have it, great.  if not, consider creating it IF the pg mapping
 * hasn't changed since the given epoch and we are the primary.
 */
void OSD::handle_pg_peering_evt(
  spg_t pgid,
  const pg_history_t& orig_history,
  pg_interval_map_t& pi,
  epoch_t epoch,
  PG::CephPeeringEvtRef evt)
{
	...
	if (!_have_pg(pgid)) {

		...

	} else {
		// already had it.  did the mapping change?
		PG *pg = _lookup_lock_pg(pgid);
		if (epoch < pg->info.history.same_interval_since) {
			dout(10) << *pg << " get_or_create_pg acting changed in "<< pg->info.history.same_interval_since<< " (msg from " << epoch << ")" << dendl;
			pg->unlock();
			return;
		}

		pg->queue_peering_event(evt);
		pg->unlock();
		return;
	}
}

boost::statechart::result PG::RecoveryState::GetInfo::react(const MNotifyRec& infoevt) 
{
	PG *pg = context< RecoveryMachine >().pg;
	
	set<pg_shard_t>::iterator p = peer_info_requested.find(infoevt.from);
	if (p != peer_info_requested.end()) {
		peer_info_requested.erase(p);
		pg->blocked_by.erase(infoevt.from.osd);
	}
	
	epoch_t old_start = pg->info.history.last_epoch_started;
	if (pg->proc_replica_info(infoevt.from, infoevt.notify.info, infoevt.notify.epoch_sent)) {
		// we got something new ...
		unique_ptr<PriorSet> &prior_set = context< Peering >().prior_set;
		if (old_start < pg->info.history.last_epoch_started) {
			dout(10) << " last_epoch_started moved forward, rebuilding prior" << dendl;
			pg->build_prior(prior_set);
	
			// filter out any osds that got dropped from the probe set from
			// peer_info_requested.  this is less expensive than restarting
			// peering (which would re-probe everyone).
			set<pg_shard_t>::iterator p = peer_info_requested.begin();

			while (p != peer_info_requested.end()) {
				if (prior_set->probe.count(*p) == 0) {
					dout(20) << " dropping osd." << *p << " from info_requested, no longer in probe set" << dendl;
					peer_info_requested.erase(p++);
				} else {
					++p;
				}
			}
			get_infos();
		}

		dout(20) << "Adding osd: " << infoevt.from.osd << " peer features: "<< hex << infoevt.features << dec << dendl;

		pg->apply_peer_features(infoevt.features);

		// are we done getting everything?
		if (peer_info_requested.empty() && !prior_set->pg_down) {
			/*
			* make sure we have at least one !incomplete() osd from the
			* last rw interval.  the incomplete (backfilling) replicas
			* get a copy of the log, but they don't get all the object
			* updates, so they are insufficient to recover changes during
			* that interval.
			*/
			if (pg->info.history.last_epoch_started) {
				for (map<epoch_t,pg_interval_t>::reverse_iterator p = pg->past_intervals.rbegin();p != pg->past_intervals.rend();++p) {
					if (p->first < pg->info.history.last_epoch_started)
						break;
					if (!p->second.maybe_went_rw)
						continue;

					pg_interval_t& interval = p->second;
					dout(10) << " last maybe_went_rw interval was " << interval << dendl;
					OSDMapRef osdmap = pg->get_osdmap();
	
					/*
					* this mirrors the PriorSet calculation: we wait if we
					* don't have an up (AND !incomplete) node AND there are
					* nodes down that might be usable.
					*/
					bool any_up_complete_now = false;
					bool any_down_now = false;
					for (unsigned i=0; i<interval.acting.size(); i++) {
						int o = interval.acting[i];
						if (o == CRUSH_ITEM_NONE)
							continue;

						pg_shard_t so(o, pg->pool.info.ec_pool() ? shard_id_t(i) : shard_id_t::NO_SHARD);
						if (!osdmap->exists(o) || osdmap->get_info(o).lost_at > interval.first)
							continue;  // dne or lost
						if (osdmap->is_up(o)) {
							pg_info_t *pinfo;
							if (so == pg->pg_whoami) {
								pinfo = &pg->info;
							} else {
								assert(pg->peer_info.count(so));
								pinfo = &pg->peer_info[so];
							}

							if (!pinfo->is_incomplete())
								any_up_complete_now = true;
						} else {
								any_down_now = true;
						}
					}

					if (!any_up_complete_now && any_down_now) {
						dout(10) << " no osds up+complete from interval " << interval << dendl;
						pg->state_set(PG_STATE_DOWN);
						pg->publish_stats_to_osd();
						return discard_event();
					}
					break;
				}
			}

			dout(20) << "Common peer features: " << hex << pg->get_min_peer_features() << dec << dendl;
			dout(20) << "Common acting features: " << hex << pg->get_min_acting_features() << dec << dendl;
			dout(20) << "Common upacting features: " << hex << pg->get_min_upacting_features() << dec << dendl;
			post_event(GotInfo());
		}
	}
	return discard_event();
}

上面获取到的PG info交由GetInfo::react(const MNotifyRec&)进行处理。到此为止,GetInfo流程执行完毕。
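
其中pg->proc_replica_info()负责将副本返回的pg_info_t记录到peer_info中,并合并history信息,其大致逻辑如下(简化示意,具体以源码为准):

bool PG::proc_replica_info(pg_shard_t from, const pg_info_t &oinfo, epoch_t send_epoch)
{
	map<pg_shard_t, pg_info_t>::iterator p = peer_info.find(from);
	if (p != peer_info.end() && p->second.last_update == oinfo.last_update) {
		dout(10) << " got dup osd." << from << " info " << oinfo << ", identical to ours" << dendl;
		return false;                        // 重复的info,忽略
	}

	...

	dout(10) << " got osd." << from << " " << oinfo << dendl;
	peer_info[from] = oinfo;                 // 记录该副本的pg_info_t
	might_have_unfound.insert(from);

	if (info.history.merge(oinfo.history))   // 合并history信息
		dirty_info = true;

	...
	return true;
}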

参看如下日志片段:

42283:2020-09-11 14:10:18.974310 7fba3d925700 10 osd.3 pg_epoch: 2226 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223) [3,2]/[3] r=0 lpr=2226 pi=2223-2225/1 crt=201'1 lcod 0'0 mlcod 0'0 remapped+peering] state<Started/Primary/Peering/GetInfo>:  querying info from osd.2
42284:2020-09-11 14:10:18.974318 7fba3d925700 15 osd.3 pg_epoch: 2226 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223) [3,2]/[3] r=0 lpr=2226 pi=2223-2225/1 crt=201'1 lcod 0'0 mlcod 0'0 remapped+peering] publish_stats_to_osd 2226:1335
42285:2020-09-11 14:10:18.974322 7fba3d925700 20 osd.3 pg_epoch: 2226 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223) [3,2]/[3] r=0 lpr=2226 pi=2223-2225/1 crt=201'1 lcod 0'0 mlcod 0'0 remapped+peering] handle_activate_map: Not dirtying info: last_persisted is 2224 while current is 2226
42286:2020-09-11 14:10:18.974326 7fba3d925700 10 osd.3 pg_epoch: 2226 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223) [3,2]/[3] r=0 lpr=2226 pi=2223-2225/1 crt=201'1 lcod 0'0 mlcod 0'0 remapped+peering] handle_peering_event: epoch_sent: 2226 epoch_requested: 2226 NullEvt
44149:2020-09-11 14:10:18.981601 7fba45134700 20 osd.3 2226 _dispatch 0x7fba6de6d0e0 pg_notify(11.4 epoch 2226) v5
44160:2020-09-11 14:10:18.981660 7fba45134700 30 osd.3 pg_epoch: 2226 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223) [3,2]/[3] r=0 lpr=2226 pi=2223-2225/1 crt=201'1 lcod 0'0 mlcod 0'0 remapped+peering] lock
44968:2020-09-11 14:10:18.984778 7fba3d124700 30 osd.3 pg_epoch: 2226 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223) [3,2]/[3] r=0 lpr=2226 pi=2223-2225/1 crt=201'1 lcod 0'0 mlcod 0'0 remapped+peering] lock
44969:2020-09-11 14:10:18.984783 7fba3d124700 10 osd.3 pg_epoch: 2226 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223) [3,2]/[3] r=0 lpr=2226 pi=2223-2225/1 crt=201'1 lcod 0'0 mlcod 0'0 remapped+peering] handle_peering_event: epoch_sent: 2226 epoch_requested: 2226 MNotifyRec from 2 notify: (query_epoch:2226, epoch_sent:2226, info:11.4( DNE empty local-les=0 n=0 ec=0 les/c/f 0/0/0 0/0/0)) features: 0x7ffffffefdfbfff
44970:2020-09-11 14:10:18.984789 7fba3d124700 10 osd.3 pg_epoch: 2226 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223) [3,2]/[3] r=0 lpr=2226 pi=2223-2225/1 crt=201'1 lcod 0'0 mlcod 0'0 remapped+peering]  got osd.2 11.4( DNE empty local-les=0 n=0 ec=0 les/c/f 0/0/0 0/0/0)
44971:2020-09-11 14:10:18.984795 7fba3d124700 10 osd.3 pg_epoch: 2226 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223) [3,2]/[3] r=0 lpr=2226 pi=2223-2225/1 crt=201'1 lcod 0'0 mlcod 0'0 remapped+peering] update_heartbeat_peers 2,3 unchanged
44972:2020-09-11 14:10:18.984799 7fba3d124700 20 osd.3 pg_epoch: 2226 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223) [3,2]/[3] r=0 lpr=2226 pi=2223-2225/1 crt=201'1 lcod 0'0 mlcod 0'0 remapped+peering] state<Started/Primary/Peering/GetInfo>: Adding osd: 2 peer features: 7ffffffefdfbfff
44973:2020-09-11 14:10:18.984803 7fba3d124700 20 osd.3 pg_epoch: 2226 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223) [3,2]/[3] r=0 lpr=2226 pi=2223-2225/1 crt=201'1 lcod 0'0 mlcod 0'0 remapped+peering] state<Started/Primary/Peering/GetInfo>: Common peer features: 7ffffffefdfbfff
44974:2020-09-11 14:10:18.984817 7fba3d124700 20 osd.3 pg_epoch: 2226 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223) [3,2]/[3] r=0 lpr=2226 pi=2223-2225/1 crt=201'1 lcod 0'0 mlcod 0'0 remapped+peering] state<Started/Primary/Peering/GetInfo>: Common acting features: 7ffffffefdfbfff
44975:2020-09-11 14:10:18.984820 7fba3d124700 20 osd.3 pg_epoch: 2226 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223) [3,2]/[3] r=0 lpr=2226 pi=2223-2225/1 crt=201'1 lcod 0'0 mlcod 0'0 remapped+peering] state<Started/Primary/Peering/GetInfo>: Common upacting features: 7ffffffefdfbfff
44976:2020-09-11 14:10:18.984825 7fba3d124700  5 osd.3 pg_epoch: 2226 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223) [3,2]/[3] r=0 lpr=2226 pi=2223-2225/1 crt=201'1 lcod 0'0 mlcod 0'0 remapped+peering] exit Started/Primary/Peering/GetInfo 0.010536 2 0.000071

2.4.2 进入Peering/GetLog状态

我们先给出这一阶段的一个日志片段:

44977:2020-09-11 14:10:18.984830 7fba3d124700  5 osd.3 pg_epoch: 2226 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223) [3,2]/[3] r=0 lpr=2226 pi=2223-2225/1 crt=201'1 lcod 0'0 mlcod 0'0 remapped+peering] enter Started/Primary/Peering/GetLog
44978:2020-09-11 14:10:18.984834 7fba3d124700 10 osd.3 pg_epoch: 2226 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223) [3,2]/[3] r=0 lpr=2226 pi=2223-2225/1 crt=201'1 lcod 0'0 mlcod 0'0 remapped+peering] calc_acting osd.2 11.4( DNE empty local-les=0 n=0 ec=0 les/c/f 0/0/0 0/0/0)
44979:2020-09-11 14:10:18.984839 7fba3d124700 10 osd.3 pg_epoch: 2226 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223) [3,2]/[3] r=0 lpr=2226 pi=2223-2225/1 crt=201'1 lcod 0'0 mlcod 0'0 remapped+peering] calc_acting osd.3 11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223)
44980:2020-09-11 14:10:18.984847 7fba3d124700 10 osd.3 pg_epoch: 2226 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223) [3,2]/[3] r=0 lpr=2226 pi=2223-2225/1 crt=201'1 lcod 0'0 mlcod 0'0 remapped+peering] calc_acting newest update on osd.3 with 11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223)
44982:calc_acting primary is osd.3 with 11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223)
44983: osd.2 (up) accepted 11.4( DNE empty local-les=0 n=0 ec=0 les/c/f 0/0/0 0/0/0)
44985:2020-09-11 14:10:18.984851 7fba3d124700 10 osd.3 pg_epoch: 2226 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223) [3,2]/[3] r=0 lpr=2226 pi=2223-2225/1 crt=201'1 lcod 0'0 mlcod 0'0 remapped+peering] choose_acting want [3,2] != acting [3], requesting pg_temp change
44986:2020-09-11 14:10:18.984857 7fba3d124700  5 osd.3 pg_epoch: 2226 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223) [3,2]/[3] r=0 lpr=2226 pi=2223-2225/1 crt=201'1 lcod 0'0 mlcod 0'0 remapped+peering] exit Started/Primary/Peering/GetLog 0.000026 0 0.000000

The GetLog constructor is as follows:

PG::RecoveryState::GetLog::GetLog(my_context ctx)
  : my_base(ctx),
    NamedState(
      context< RecoveryMachine >().pg->cct, "Started/Primary/Peering/GetLog"),
    msg(0)
{
	context< RecoveryMachine >().log_enter(state_name);
	
	PG *pg = context< RecoveryMachine >().pg;
	
	// adjust acting?
	if (!pg->choose_acting(auth_log_shard, false,&context< Peering >().history_les_bound)) {
		if (!pg->want_acting.empty()) {
			post_event(NeedActingChange());
		} else {
			post_event(IsIncomplete());
		}
		return;
	}
	
	// am i the best?
	if (auth_log_shard == pg->pg_whoami) {
		post_event(GotLog());
		return;
	}
	
	const pg_info_t& best = pg->peer_info[auth_log_shard];
	
	// am i broken?
	if (pg->info.last_update < best.log_tail) {
		dout(10) << " not contiguous with osd." << auth_log_shard << ", down" << dendl;
		post_event(IsIncomplete());
		return;
	}
	
	// how much log to request?
	eversion_t request_log_from = pg->info.last_update;
	assert(!pg->actingbackfill.empty());
	for (set<pg_shard_t>::iterator p = pg->actingbackfill.begin();p != pg->actingbackfill.end();++p) {
		if (*p == pg->pg_whoami) continue;

		pg_info_t& ri = pg->peer_info[*p];
		if (ri.last_update >= best.log_tail && ri.last_update < request_log_from)
			request_log_from = ri.last_update;
	}
	
	// how much?
	dout(10) << " requesting log from osd." << auth_log_shard << dendl;
	context<RecoveryMachine>().send_query(
	  auth_log_shard,
	  pg_query_t(
		pg_query_t::LOG,
		auth_log_shard.shard, pg->pg_whoami.shard,
		request_log_from, pg->info.history,
		pg->get_osdmap()->get_epoch()));
	
	assert(pg->blocked_by.empty());
	pg->blocked_by.insert(auth_log_shard.osd);
	pg->publish_stats_to_osd();
}

1) Call pg->choose_acting() to select the OSD holding the authoritative log and to compute the acting_backfill and backfill_targets OSD lists. The chosen authoritative OSD is returned through the auth_log_shard parameter;

2) If the selection fails and want_acting is not empty, post a NeedActingChange event; the state machine moves to Primary/WaitActingChange to wait for the result of the temporary-PG request. If want_acting is empty, post an IsIncomplete event; the state machine moves to Primary/Peering/Incomplete, meaning the attempt failed and the PG stays in the Incomplete state.

3) If auth_log_shard equals pg->pg_whoami, i.e. the OSD holding the authoritative log is the current primary itself, post GotLog() directly to finish the GetLog phase;

4) If pg->info.last_update is smaller than best.log_tail, this OSD's log does not overlap the authoritative log, so this OSD cannot recover from it and IsIncomplete is posted. After choose_acting() has made its choice the primary must be recoverable; if it is not, a temporary PG must be requested, making the OSD holding the authoritative log the temporary primary;

5) If this OSD does not hold the authoritative log, the authoritative log must be pulled from the OSD that does and merged with the local log. Sending the pg_query_t::LOG request works the same way as the pg_query_t::INFO request.

2.4.2.1 The PG::choose_acting() function

choose_acting() computes the PG's acting_backfill and backfill_targets OSD lists: acting_backfill holds the PG's current acting list plus the OSDs that need a Backfill operation, while backfill_targets holds only the OSDs that need Backfill.

bool PG::choose_acting(pg_shard_t &auth_log_shard_id,
		       bool restrict_to_up_acting,
		       bool *history_les_bound)
{
	...
	map<pg_shard_t, pg_info_t>::const_iterator auth_log_shard =
	   find_best_info(all_info, restrict_to_up_acting, history_les_bound);
	
	if (auth_log_shard == all_info.end()) {
		if (up != acting) {
			dout(10) << "choose_acting no suitable info found (incomplete backfills?)," << " reverting to up" << dendl;
			want_acting = up;
			vector<int> empty;
			osd->queue_want_pg_temp(info.pgid.pgid, empty);
		} else {
			dout(10) << "choose_acting failed" << dendl;
			assert(want_acting.empty());
		}

		return false;
	}

	....

	if (want != acting) {
		dout(10) << "choose_acting want " << want << " != acting " << acting << ", requesting pg_temp change" << dendl;
		want_acting = want;

		if (want_acting == up) {
			// There can't be any pending backfill if
			// want is the same as crush map up OSDs.
			assert(compat_mode || want_backfill.empty());
			vector<int> empty;
			osd->queue_want_pg_temp(info.pgid.pgid, empty);
		} else {
			osd->queue_want_pg_temp(info.pgid.pgid, want);
		}
		return false;
	}

	...
}

1) First call PG::find_best_info() to elect an OSD holding the authoritative log; the result is kept in the auth_log_shard iterator;

2) If no authoritative-log OSD could be elected, proceed as follows:

a) if up is not equal to acting, request a temporary PG with an empty mapping and return false;

b) otherwise assert that the want_acting list is empty and return false;

Note: whenever the osdmap changes, OSD::advance_pg() recomputes acting and up, ultimately through OSDMap::_pg_to_up_acting_osds();

3) Work out whether compat_mode applies: if all OSDs support erasure codes, compat_mode is set to true;

4) Call a function matching the PG type to compute the lists the PG needs; for a ReplicatedPG this is calc_replicated_acting().

For PG 11.4 here, the elected authoritative-log OSD is osd3; the PG's up set is [3,2] and its acting set is [3]. ReplicatedPG::calc_replicated_acting() computes want as [3,2], so want != acting and a pg_temp change has to be requested.

The OSDService::send_pg_temp() request is issued at the end of OSD::process_peering_events(). The log below shows that in the outgoing request the pg_temp for PG 11.4 is []:

45166:2020-09-11 14:10:18.985528 7fba3d124700 10 osd.3 2226 send_pg_temp {11.4=[],11.6=[],19.1=[],22.2c=[],22.44=[],22.a4=[],22.b5=[],22.ca=[],22.d0=[],22.ec=[],23.d=[],23.13=[],23.30=[],23.6d=[]}
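
For reference, the OSD-side plumbing behind this is small. Below is a sketch of OSDService::queue_want_pg_temp()/send_pg_temp() recalled from Jewel-era sources, with locking and resend bookkeeping elided, so treat it as an approximation rather than the exact code:

void OSDService::queue_want_pg_temp(pg_t pgid, vector<int>& want)
{
	// an empty `want` vector means "remove the pg_temp entry for this pg"
	pg_temp_wanted[pgid] = want;
}

void OSDService::send_pg_temp()
{
	if (pg_temp_wanted.empty())
		return;
	dout(10) << "send_pg_temp " << pg_temp_wanted << dendl;

	// one MOSDPGTemp message carries every queued entry, which is why the
	// log above shows 11.4, 11.6, 19.1, ... batched into a single request
	MOSDPGTemp *m = new MOSDPGTemp(osdmap->get_epoch());
	m->pg_temp = pg_temp_wanted;
	monc->send_mon_message(m);
}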

As the log above shows, when want_acting equals up, PG::queue_want_pg_temp() is passed an empty vector. What does that mean? It is in fact a request asking the OSDMonitor to delete the pg_temp entry for PG 11.4; see the following log (mon-node7-1.txt):

2020-09-11 14:10:18.985380 7f8f92d0c700 10 mon.node7-1@0(leader).osd e2226 preprocess_query osd_pgtemp(e2226 {11.4=[],11.6=[],19.1=[],22.2c=[],22.44=[],22.a4=[],22.b5=[],22.ca=[],22.d0=[],22.ec=[],23.d=[],23.13=[],23.30=[],23.6d=[]} v2226) v1 from osd.3 10.17.155.114:6800/894
2020-09-11 14:10:18.985393 7f8f92d0c700 10 mon.node7-1@0(leader).osd e2226 preprocess_pgtemp osd_pgtemp(e2226 {11.4=[],11.6=[],19.1=[],22.2c=[],22.44=[],22.a4=[],22.b5=[],22.ca=[],22.d0=[],22.ec=[],23.d=[],23.13=[],23.30=[],23.6d=[]} v2226) v1
2020-09-11 14:10:18.985403 7f8f92d0c700 20 is_capable service=osd command= exec on cap allow rwx
2020-09-11 14:10:18.985406 7f8f92d0c700 20  allow so far , doing grant allow rwx
2020-09-11 14:10:18.985408 7f8f92d0c700 20  match
2020-09-11 14:10:18.985410 7f8f92d0c700 20 mon.node7-1@0(leader).osd e2226  11.4[0,3] -> []
2020-09-11 14:10:18.985452 7f8f92d0c700  7 mon.node7-1@0(leader).osd e2226 prepare_update osd_pgtemp(e2226 {11.4=[],11.6=[],19.1=[],22.2c=[],22.44=[],22.a4=[],22.b5=[],22.ca=[],22.d0=[],22.ec=[],23.d=[],23.13=[],23.30=[],23.6d=[]} v2226) v1 from osd.3 10.17.155.114:6800/894
2020-09-11 14:10:18.985486 7f8f92d0c700  7 mon.node7-1@0(leader).osd e2226 prepare_pgtemp e2226 from osd.3 10.17.155.114:6800/894

The call flow through the code is as follows:

bool PaxosService::dispatch(MonOpRequestRef op){
	...

	// preprocess
	if (preprocess_query(op)) 
		return true;  // easy!

	...

	if (prepare_update(op)) {
		...
	}

	...
}

bool OSDMonitor::preprocess_query(MonOpRequestRef op){

	...

	switch (m->get_type()) {
		...
		case MSG_OSD_PGTEMP:
    		return preprocess_pgtemp(op);

		...
	}

	...
}

bool OSDMonitor::prepare_update(MonOpRequestRef op){
	...

	switch (m->get_type()) {
		...
		case MSG_OSD_PGTEMP:
    		return prepare_pgtemp(op);

		...
	}

	...
}
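
On the monitor side, prepare_pgtemp() stages the wanted mappings in the pending osdmap incremental. The following is a rough reconstruction from Jewel-era sources (error handling and the duplicate filtering done in preprocess_pgtemp are elided; consult src/mon/OSDMonitor.cc for the exact code):

bool OSDMonitor::prepare_pgtemp(MonOpRequestRef op)
{
	MOSDPGTemp *m = static_cast<MOSDPGTemp*>(op->get_req());
	int from = m->get_orig_source().num();
	dout(7) << "prepare_pgtemp e" << m->map_epoch << " from " << m->get_orig_source_inst() << dendl;

	// stage each wanted mapping in the pending incremental; an empty
	// vector removes the pg_temp entry ("11.4[0,3] -> []" in the log above)
	for (map<pg_t,vector<int32_t> >::iterator p = m->pg_temp.begin(); p != m->pg_temp.end(); ++p)
		pending_inc.new_pg_temp[p->first] = p->second;

	// also record up_thru for the sender, so it does not have to ask again
	pending_inc.new_up_thru[from] = m->map_epoch;

	wait_for_finished_proposal(op, new C_ReplyMap(this, op, m->map_epoch));
	return true;
}

Once the proposal commits, the incremental is published as osdmap e2227, which is what drives the next round of peering below.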
2.4.3 Entering the Peering/WaitActingChange state

In the Peering/GetLog constructor above, PG::choose_acting() returned false while pg->want_acting was not empty, so a NeedActingChange() event was produced and the PG entered the WaitActingChange state.

PG::RecoveryState::WaitActingChange::WaitActingChange(my_context ctx)
  : my_base(ctx),
    NamedState(context< RecoveryMachine >().pg->cct, "Started/Primary/Peering/WaitActingChange")
{
	context< RecoveryMachine >().log_enter(state_name);
}

1) Receiving the AdvMap event

Before entering WaitActingChange, osd3 had requested a pg_temp from the OSDMonitor; the resulting osdmap change now triggers the following calls (the osdmap epoch at this point is e2227):

void PG::handle_advance_map(
  OSDMapRef osdmap, OSDMapRef lastmap,
  vector<int>& newup, int up_primary,
  vector<int>& newacting, int acting_primary,
  RecoveryCtx *rctx)
{
	...

	AdvMap evt(
		osdmap, lastmap, newup, up_primary,
		newacting, acting_primary);
	recovery_state.handle_event(evt, rctx);
}

boost::statechart::result PG::RecoveryState::WaitActingChange::react(const AdvMap& advmap)
{
	PG *pg = context< RecoveryMachine >().pg;
	OSDMapRef osdmap = advmap.osdmap;
	
	dout(10) << "verifying no want_acting " << pg->want_acting << " targets didn't go down" << dendl;
	for (vector<int>::iterator p = pg->want_acting.begin(); p != pg->want_acting.end(); ++p) {
		if (!osdmap->is_up(*p)) {
			dout(10) << " want_acting target osd." << *p << " went down, resetting" << dendl;
			post_event(advmap);
			return transit< Reset >();
		}
	}
	return forward_event();
}

boost::statechart::result PG::RecoveryState::Started::react(const AdvMap& advmap)
{
	dout(10) << "Started advmap" << dendl;
	PG *pg = context< RecoveryMachine >().pg;
	pg->check_full_transition(advmap.lastmap, advmap.osdmap);
	if (pg->should_restart_peering(
	  advmap.up_primary,
	  advmap.acting_primary,
	  advmap.newup,
	  advmap.newacting,
	  advmap.lastmap,
	  advmap.osdmap)) {
		dout(10) << "should_restart_peering, transitioning to Reset" << dendl;
		post_event(advmap);
		return transit< Reset >();
	}
	pg->remove_down_peer_info(advmap.osdmap);
	return discard_event();
}

Since the up set is now [3,2] and the acting set has likewise changed from [3] to [3,2], Peering is triggered once more and the PG re-enters the Reset state.
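
The post_event()-then-transit<>() pattern in Started::react() above deserves a remark: events posted via post_event() are queued and dispatched only after the transition finishes, which is how the very AdvMap that caused the transition ends up being handled again by the newly entered Reset state. A minimal, self-contained boost::statechart sketch of just this mechanism (illustrative only, not Ceph code):

#include <boost/statechart/state_machine.hpp>
#include <boost/statechart/simple_state.hpp>
#include <boost/statechart/custom_reaction.hpp>
#include <boost/statechart/event.hpp>
#include <iostream>

namespace sc = boost::statechart;

struct AdvMap : sc::event<AdvMap> {};

struct Started;
struct Reset;

struct Machine : sc::state_machine<Machine, Started> {};

struct Started : sc::simple_state<Started, Machine> {
	typedef sc::custom_reaction<AdvMap> reactions;
	sc::result react(const AdvMap&);
};

struct Reset : sc::simple_state<Reset, Machine> {
	typedef sc::custom_reaction<AdvMap> reactions;
	sc::result react(const AdvMap&) {
		std::cout << "Reset: handling the re-posted AdvMap\n";
		return discard_event();
	}
};

// defined out of line so transit<Reset>() sees the complete type
sc::result Started::react(const AdvMap& evt) {
	std::cout << "Started: re-posting AdvMap, then transit<Reset>()\n";
	post_event(evt);          // queued: delivered only after the transition
	return transit<Reset>();
}

int main() {
	Machine m;
	m.initiate();
	m.process_event(AdvMap());  // prints the Started line, then the Reset line
	return 0;
}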

Here is a log fragment for this stage:

47209:2020-09-11 14:10:20.014782 7fba3d925700 10 osd.3 pg_epoch: 2226 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223) [3,2]/[3] r=0 lpr=2226 pi=2223-2225/1 crt=201'1 lcod 0'0 mlcod 0'0 remapped] handle_advance_map [3,2]/[3,2] -- 3/3
47212:2020-09-11 14:10:20.014796 7fba3d925700 10 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223) [3,2]/[3] r=0 lpr=2226 pi=2223-2225/1 crt=201'1 lcod 0'0 mlcod 0'0 remapped] state<Started/Primary/Peering/WaitActingChange>: verifying no want_acting [] targets didn't go down
47279:2020-09-11 14:10:20.014814 7fba3d925700 10 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223) [3,2]/[3] r=0 lpr=2226 pi=2223-2225/1 crt=201'1 lcod 0'0 mlcod 0'0 remapped] state<Started>: Started advmap
47281:2020-09-11 14:10:20.015156 7fba3d925700 20 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223) [3,2]/[3] r=0 lpr=2226 pi=2223-2225/1 crt=201'1 lcod 0'0 mlcod 0'0 remapped] new interval newup [3,2] newacting [3,2]
47283:2020-09-11 14:10:20.015174 7fba3d925700 10 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223) [3,2]/[3] r=0 lpr=2226 pi=2223-2225/1 crt=201'1 lcod 0'0 mlcod 0'0 remapped] state<Started>: should_restart_peering, transitioning to Reset
47285:2020-09-11 14:10:20.015184 7fba3d925700  5 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223) [3,2]/[3] r=0 lpr=2226 pi=2223-2225/1 crt=201'1 lcod 0'0 mlcod 0'0 remapped] exit Started/Primary/Peering/WaitActingChange 1.030293 1 0.000106
47288:2020-09-11 14:10:20.015195 7fba3d925700  5 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223) [3,2]/[3] r=0 lpr=2226 pi=2223-2225/1 crt=201'1 lcod 0'0 mlcod 0'0 remapped] exit Started/Primary 1.030307 0 0.000000

3. Re-entering the Peering process

The last osdmap received above was e2227; the PG then re-enters the Reset state and a new round of Peering begins.

3.1 The Reset state

PG::RecoveryState::Reset::Reset(my_context ctx)
  : my_base(ctx),
    NamedState(context< RecoveryMachine >().pg->cct, "Reset")
{
	context< RecoveryMachine >().log_enter(state_name);
	PG *pg = context< RecoveryMachine >().pg;
	
	pg->flushes_in_progress = 0;
	pg->set_last_peering_reset();
}

PG::set_last_peering_reset() records the current osdmap epoch, e2227, as last_peering_reset.
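
From memory of the Jewel-era sources, set_last_peering_reset() is only a few lines; a sketch (treat the exact text as approximate):

void PG::set_last_peering_reset()
{
	// matches the "set_last_peering_reset 2227" line at debug level 20 below
	dout(20) << "set_last_peering_reset " << get_osdmap()->get_epoch() << dendl;
	if (last_peering_reset != get_osdmap()->get_epoch()) {
		last_peering_reset = get_osdmap()->get_epoch();
		reset_interval_flush();
	}
}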

3.1.1 Handling the AdvMap event in the Reset state

Before entering the Reset state we posted the AdvMap event to the state machine; let's look at how Reset handles it:

boost::statechart::result PG::RecoveryState::Reset::react(const AdvMap& advmap)
{
	...
}

Here is a log fragment of this process:

47303:2020-09-11 14:10:20.015280 7fba3d925700 10 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223) [3,2]/[3] r=0 lpr=2227 pi=2223-2225/1 crt=201'1 lcod 0'0 mlcod 0'0 remapped] state<Reset>: Reset advmap
47394:2020-09-11 14:10:20.015296 7fba3d925700 10 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223) [3,2]/[3] r=0 lpr=2227 pi=2223-2225/1 crt=201'1 lcod 0'0 mlcod 0'0 remapped] _calc_past_interval_range: already have past intervals back to 2224
47396:2020-09-11 14:10:20.015871 7fba3d925700 20 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223) [3,2]/[3] r=0 lpr=2227 pi=2223-2225/1 crt=201'1 lcod 0'0 mlcod 0'0 remapped] new interval newup [3,2] newacting [3,2]
47399:2020-09-11 14:10:20.015889 7fba3d925700 10 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223) [3,2]/[3] r=0 lpr=2227 pi=2223-2225/1 crt=201'1 lcod 0'0 mlcod 0'0 remapped] state<Reset>: should restart peering, calling start_peering_interval again
47400:2020-09-11 14:10:20.015898 7fba3d925700 20 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223) [3,2]/[3] r=0 lpr=2227 pi=2223-2225/1 crt=201'1 lcod 0'0 mlcod 0'0 remapped] set_last_peering_reset 2227
47403:2020-09-11 14:10:20.015920 7fba3d925700 10 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 inactive] start_peering_interval: check_new_interval output: generate_past_intervals interval(2226-2226 up [3,2](3) acting [3](3)): not rw, up_thru 2223 up_from 2123 last_epoch_clean 2224
47409:2020-09-11 14:10:20.015946 7fba3d925700 10 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2226/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 inactive]  noting past interval(2226-2226 up [3,2](3) acting [3](3))
47411:2020-09-11 14:10:20.015960 7fba3d925700 10 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 inactive]  up [3,2] -> [3,2], acting [3] -> [3,2], acting_primary 3 -> 3, up_primary 3 -> 3, role 0 -> 0, features acting 576460752032874495 upacting 576460752032874495
47413:2020-09-11 14:10:20.015972 7fba3d925700 10 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 inactive] clear_primary_state
47414:2020-09-11 14:10:20.015983 7fba3d925700 20 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 inactive] agent_stop
47415:2020-09-11 14:10:20.015991 7fba3d925700 10 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 inactive] on_change
47417:2020-09-11 14:10:20.015998 7fba3d925700 15 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 inactive]  requeue_ops 
47419:2020-09-11 14:10:20.016009 7fba3d925700 15 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 inactive] publish_stats_to_osd 2227:1337
47421:2020-09-11 14:10:20.016019 7fba3d925700 15 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 inactive]  requeue_ops 
47422:2020-09-11 14:10:20.016027 7fba3d925700 10 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 inactive] cancel_copy_ops
47424:2020-09-11 14:10:20.016034 7fba3d925700 10 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 inactive] cancel_flush_ops
47426:2020-09-11 14:10:20.016042 7fba3d925700 10 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 inactive] cancel_proxy_ops
47428:2020-09-11 14:10:20.016051 7fba3d925700 15 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 inactive]  requeue_ops 
47430:2020-09-11 14:10:20.016060 7fba3d925700 15 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 inactive]  requeue_ops 
47432:2020-09-11 14:10:20.016070 7fba3d925700 15 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 inactive]  requeue_ops 
47435:2020-09-11 14:10:20.016080 7fba3d925700 10 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 inactive] on_change_cleanup
47437:2020-09-11 14:10:20.016090 7fba3d925700 10 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 inactive] on_change
47440:2020-09-11 14:10:20.016101 7fba3d925700 20 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 inactive] exit NotTrimming
47442:2020-09-11 14:10:20.016112 7fba3d925700 20 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 inactive] enter NotTrimming
47444:2020-09-11 14:10:20.016122 7fba3d925700 10 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 inactive] [3] -> [3,2], replicas changed
47447:2020-09-11 14:10:20.016131 7fba3d925700 10 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 inactive] cancel_recovery
47449:2020-09-11 14:10:20.016140 7fba3d925700 10 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 inactive] clear_recovery_state
47451:2020-09-11 14:10:20.016150 7fba3d925700 10 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 inactive] check_recovery_sources no source osds () went down

The detailed code analysis was already covered earlier, so it is not repeated here.

3.2 Handling the ActMap event

After the new OSDMap has been received and the AdvMap event processed, PG::handle_activate_map() is called to activate the osdmap:

void PG::handle_activate_map(RecoveryCtx *rctx)
{
	...
}

Below is the relevant log fragment; the detailed code analysis is again not repeated here:

47454:2020-09-11 14:10:20.016168 7fba3d925700 10 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 inactive] handle_activate_map 
47457:2020-09-11 14:10:20.016178 7fba3d925700 10 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 inactive] update_heartbeat_peers 2,3 unchanged
47458:2020-09-11 14:10:20.016188 7fba3d925700 10 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 inactive] take_waiters
47461:2020-09-11 14:10:20.016198 7fba3d925700  5 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 inactive] exit Reset 0.000960 1 0.001368
47463:2020-09-11 14:10:20.016211 7fba3d925700  5 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 inactive] enter Started
47465:2020-09-11 14:10:20.016221 7fba3d925700  5 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 inactive] enter Start

3.3 Entering the Started state

The Started constructor is as follows:

/*------Started-------*/
PG::RecoveryState::Started::Started(my_context ctx)
  : my_base(ctx),
    NamedState(context< RecoveryMachine >().pg->cct, "Started")
{
	context< RecoveryMachine >().log_enter(state_name);
}
3.3.1 Entering the Started/Start state

After entering the Started state, the state machine proceeds into its default child state, Start:

/*-------Start---------*/
PG::RecoveryState::Start::Start(my_context ctx)
  : my_base(ctx),
    NamedState(context< RecoveryMachine >().pg->cct, "Start")
{
	context< RecoveryMachine >().log_enter(state_name);
	
	PG *pg = context< RecoveryMachine >().pg;
	if (pg->is_primary()) {
		dout(1) << "transitioning to Primary" << dendl;
		post_event(MakePrimary());
	} else { //is_stray
		dout(1) << "transitioning to Stray" << dendl; 
		post_event(MakeStray());
	}
}

In the Start constructor, the state machine enters either the Primary state or the Stray state, depending on the role this OSD plays for the PG.

Here osd3 is the primary OSD for PG 11.4, so a MakePrimary() event is posted and the PG enters the Started/Primary state.
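
The role test itself is trivial; in Jewel-era PG.h it is essentially the following one-liner (role 0, as seen in the "role 0 -> 0" log line earlier, corresponds to the primary):

// PG.h (Jewel-era, approximate): Start's role test
bool is_primary() const { return pg_whoami == primary; }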

3.3.2 Entering the Started/Primary state

The Primary constructor is as follows:

/*---------Primary--------*/
PG::RecoveryState::Primary::Primary(my_context ctx)
  : my_base(ctx),
    NamedState(context< RecoveryMachine >().pg->cct, "Started/Primary")
{
	context< RecoveryMachine >().log_enter(state_name);
	PG *pg = context< RecoveryMachine >().pg;
	assert(pg->want_acting.empty());
	
	// set CREATING bit until we have peered for the first time.
	if (pg->info.history.last_epoch_started == 0) {
		pg->state_set(PG_STATE_CREATING);
		// use the history timestamp, which ultimately comes from the
		// monitor in the create case.
		utime_t t = pg->info.history.last_scrub_stamp;
		pg->info.stats.last_fresh = t;
		pg->info.stats.last_active = t;
		pg->info.stats.last_change = t;
		pg->info.stats.last_peered = t;
		pg->info.stats.last_clean = t;
		pg->info.stats.last_unstale = t;
		pg->info.stats.last_undegraded = t;
		pg->info.stats.last_fullsized = t;
		pg->info.stats.last_scrub_stamp = t;
		pg->info.stats.last_deep_scrub_stamp = t;
		pg->info.stats.last_clean_scrub_stamp = t;
	}
}

Note: pg->want_acting was already cleared when the PG last exited the Primary state.

3.4 Entering the Peering state

The default initial child state of Started/Primary is Started/Primary/Peering. Let's look at the Peering constructor:

/*---------Peering--------*/
PG::RecoveryState::Peering::Peering(my_context ctx)
  : my_base(ctx),
    NamedState(context< RecoveryMachine >().pg->cct, "Started/Primary/Peering"),
    history_les_bound(false)
{
	context< RecoveryMachine >().log_enter(state_name);
	
	PG *pg = context< RecoveryMachine >().pg;
	assert(!pg->is_peered());
	assert(!pg->is_peering());
	assert(pg->is_primary());
	pg->state_set(PG_STATE_PEERING);
}
3.4.1 Entering the Started/Primary/Peering/GetInfo state

After entering Peering, the state machine drops into Peering's default child state, GetInfo. Here is a log fragment of this process:

47475:2020-09-11 14:10:20.016271 7fba3d925700  5 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 peering] enter Started/Primary/Peering/GetInfo
47478:2020-09-11 14:10:20.016281 7fba3d925700 10 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 peering] _calc_past_interval_range: already have past intervals back to 2224
47481:2020-09-11 14:10:20.016292 7fba3d925700 10 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 peering]  PriorSet: build_prior interval(2226-2226 up [3,2](3) acting [3](3))
47482:2020-09-11 14:10:20.016301 7fba3d925700 10 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 peering]  PriorSet: build_prior interval(2223-2225 up [3](3) acting [3](3) maybe_went_rw)
47483:2020-09-11 14:10:20.016310 7fba3d925700 10 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 peering]  PriorSet: build_prior final: probe 2,3 down  blocked_by {}
47484:2020-09-11 14:10:20.016318 7fba3d925700 10 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 peering] up_thru 2226 < same_since 2227, must notify monitor
47485:2020-09-11 14:10:20.016327 7fba3d925700 10 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 peering] state<Started/Primary/Peering/GetInfo>:  querying info from osd.2
47487:2020-09-11 14:10:20.016337 7fba3d925700 15 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 peering] publish_stats_to_osd 2227:1338
47489:2020-09-11 14:10:20.016346 7fba3d925700 20 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 peering] handle_activate_map: Not dirtying info: last_persisted is 2226 while current is 2227
47492:2020-09-11 14:10:20.016356 7fba3d925700 10 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 peering] handle_peering_event: epoch_sent: 2227 epoch_requested: 2227 NullEvt
50548:2020-09-11 14:10:20.028700 7fba3d124700 20   ? 1956'80 (1956'79) modify   23:1ce16903:::obj-9bjaNA0lG2ZpqCJ:head by client.291879.0:1164 2020-05-30 10:20:25.595078
50723:2020-09-11 14:10:20.029810 7fba3d124700 20 update missing, append 1956'80 (1956'79) modify   23:1ce16903:::obj-9bjaNA0lG2ZpqCJ:head by client.291879.0:1164 2020-05-30 10:20:25.595078
50826:2020-09-11 14:10:20.031026 7fba45134700 20 osd.3 2227 _dispatch 0x7fba6d7401e0 pg_notify(11.4 epoch 2227) v5
50829:2020-09-11 14:10:20.031044 7fba45134700 30 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 peering] lock
50838:2020-09-11 14:10:20.031074 7fba3d124700 30 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 peering] lock
50841:2020-09-11 14:10:20.031095 7fba3d124700 10 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 peering] handle_peering_event: epoch_sent: 2227 epoch_requested: 2227 MNotifyRec from 2 notify: (query_epoch:2227, epoch_sent:2227, info:11.4( DNE empty local-les=0 n=0 ec=0 les/c/f 0/0/0 0/0/0)) features: 0x7ffffffefdfbfff
50846:2020-09-11 14:10:20.031104 7fba3d124700 10 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 peering]  got osd.2 11.4( DNE empty local-les=0 n=0 ec=0 les/c/f 0/0/0 0/0/0)
50848:2020-09-11 14:10:20.031115 7fba3d124700 10 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 peering] update_heartbeat_peers 2,3 unchanged
50850:2020-09-11 14:10:20.031122 7fba3d124700 20 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 peering] state<Started/Primary/Peering/GetInfo>: Adding osd: 2 peer features: 7ffffffefdfbfff
50851:2020-09-11 14:10:20.031129 7fba3d124700 20 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 peering] state<Started/Primary/Peering/GetInfo>: Common peer features: 7ffffffefdfbfff
50854:2020-09-11 14:10:20.031135 7fba3d124700 20 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 peering] state<Started/Primary/Peering/GetInfo>: Common acting features: 7ffffffefdfbfff
50857:2020-09-11 14:10:20.031140 7fba3d124700 20 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 peering] state<Started/Primary/Peering/GetInfo>: Common upacting features: 7ffffffefdfbfff
50859:2020-09-11 14:10:20.031147 7fba3d124700  5 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 peering] exit Started/Primary/Peering/GetInfo 0.014876 2 0.000170

The constructor of the GetInfo state is as follows:

/*--------GetInfo---------*/
PG::RecoveryState::GetInfo::GetInfo(my_context ctx)
  : my_base(ctx),
    NamedState(context< RecoveryMachine >().pg->cct, "Started/Primary/Peering/GetInfo")
{
	context< RecoveryMachine >().log_enter(state_name);
	
	PG *pg = context< RecoveryMachine >().pg;
	pg->generate_past_intervals();
	unique_ptr<PriorSet> &prior_set = context< Peering >().prior_set;
	
	assert(pg->blocked_by.empty());
	
	if (!prior_set.get())
		pg->build_prior(prior_set);
	
	pg->reset_min_peer_features();
	get_infos();
	if (peer_info_requested.empty() && !prior_set->pg_down) {
		post_event(GotInfo());
	}
}

1) The PG::generate_past_intervals() function

void PG::generate_past_intervals()
{
	epoch_t cur_epoch, end_epoch;
	if (!_calc_past_interval_range(&cur_epoch, &end_epoch,osd->get_superblock().oldest_map)) {
		if (info.history.same_interval_since == 0) {
			info.history.same_interval_since = end_epoch;
			dirty_info = true;
		}
		return;
	}

	...
	
}

bool PG::_calc_past_interval_range(epoch_t *start, epoch_t *end, epoch_t oldest_map)
{
	if (info.history.same_interval_since) {
		*end = info.history.same_interval_since;
	} else {
		// PG must be imported, so let's calculate the whole range.
		*end = osdmap_ref->get_epoch();
	}
	
	// Do we already have the intervals we want?
	map<epoch_t,pg_interval_t>::const_iterator pif = past_intervals.begin();
	if (pif != past_intervals.end()) {
		if (pif->first <= info.history.last_epoch_clean) {
			dout(10) << __func__ << ": already have past intervals back to "<< info.history.last_epoch_clean << dendl;
			return false;
		}
		*end = past_intervals.begin()->first;
	}
	
	*start = MAX(MAX(info.history.epoch_created,
	                 info.history.last_epoch_clean),
	             oldest_map);
	if (*start >= *end) {
		dout(10) << __func__ << " start epoch " << *start << " >= end epoch " << *end<< ", nothing to do" << dendl;
		return false;
	}
	
	return true;
}

The current osdmap epoch is e2227. At e2226 PG 11.4 triggered Peering again before it had reached Active, so inside PG::_calc_past_interval_range() info.history.last_epoch_clean is still e2224, and past_intervals already reaches back to e2223 <= e2224, so the function returns false. See the following log fragment:

47478:2020-09-11 14:10:20.016281 7fba3d925700 10 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 peering] _calc_past_interval_range: already have past intervals back to 2224
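
Tracing _calc_past_interval_range() for PG 11.4 with the values visible in this log line:

// info.history.same_interval_since = 2227          -> *end = 2227
// past_intervals = {[2223,2225], [2226,2226]}      (pi=2223-2226/2 above)
// past_intervals.begin()->first = 2223
//   2223 <= info.history.last_epoch_clean = 2224
//   -> dout "already have past intervals back to 2224", return false
//   -> generate_past_intervals() therefore has nothing to do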

2) The PG::build_prior() function

void PG::build_prior(std::unique_ptr<PriorSet> &prior_set)
{
	if (1) {
		// sanity check
		for (map<pg_shard_t,pg_info_t>::iterator it = peer_info.begin();it != peer_info.end();++it) {
			assert(info.history.last_epoch_started >= it->second.history.last_epoch_started);
		}
	}

	prior_set.reset(
	  new PriorSet(
		pool.info.ec_pool(),
		get_pgbackend()->get_is_recoverable_predicate(),
		*get_osdmap(),
		past_intervals,
		up,
		acting,
		info,
		this));

	PriorSet &prior(*prior_set.get());
	
	if (prior.pg_down) {
		state_set(PG_STATE_DOWN);
	}
	
	if (get_osdmap()->get_up_thru(osd->whoami) < info.history.same_interval_since) {
		dout(10) << "up_thru " << get_osdmap()->get_up_thru(osd->whoami)<< " < same_since " << info.history.same_interval_since<< ", must notify monitor" << dendl;
		need_up_thru = true;
	} else {
		dout(10) << "up_thru " << get_osdmap()->get_up_thru(osd->whoami)<< " >= same_since " << info.history.same_interval_since<< ", all is well" << dendl;
		need_up_thru = false;
	}
	set_probe_targets(prior_set->probe);
}

A PriorSet records the PG's state over earlier intervals and is mainly used to guide recovery in the current stage. At this point the osdmap epoch is e2227, the up set is [3,2], the acting set is [3,2], and info.history.last_epoch_clean is e2224. The PriorSet construction can be seen in the following log fragment:

47481:2020-09-11 14:10:20.016292 7fba3d925700 10 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 peering]  PriorSet: build_prior interval(2226-2226 up [3,2](3) acting [3](3))
47482:2020-09-11 14:10:20.016301 7fba3d925700 10 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 peering]  PriorSet: build_prior interval(2223-2225 up [3](3) acting [3](3) maybe_went_rw)
47483:2020-09-11 14:10:20.016310 7fba3d925700 10 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 peering]  PriorSet: build_prior final: probe 2,3 down  blocked_by {}
47484:2020-09-11 14:10:20.016318 7fba3d925700 10 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 peering] up_thru 2226 < same_since 2227, must notify monitor

From the above: interval [e2226,e2226] had up set [3,2] and acting set [3], while interval [e2223,e2225] had up and acting both [3] and may have served writes (maybe_went_rw). In the current epoch e2227 the probe set is [2,3] and the down set is empty; osd3's recorded up_thru is e2226, older than same_interval_since e2227, so the monitor must be notified.

Notifying the monitor is done at the end of OSD::process_peering_events():

void OSD::process_peering_events(
  const list<PG*> &pgs,
  ThreadPool::TPHandle &handle
  )
{
	...
	if (need_up_thru)
		queue_want_up_thru(same_interval_since);

	...
}
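
queue_want_up_thru() itself just remembers the highest same_interval_since that any PG wants published and pokes the monitor via send_alive(), which sends an MOSDAlive message. A sketch from memory of the Jewel-era code, with locking elided (treat as approximate):

void OSD::queue_want_up_thru(epoch_t want)
{
	epoch_t cur = osdmap->get_up_thru(whoami);
	if (want > up_thru_wanted) {
		dout(10) << "queue_want_up_thru now " << want << " (was " << up_thru_wanted
			 << "), currently " << cur << dendl;
		up_thru_wanted = want;
		send_alive();          // sends MOSDAlive to the monitor
	} else {
		dout(10) << "queue_want_up_thru want " << want << " <= queued " << up_thru_wanted
			 << ", currently " << cur << dendl;
	}
}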

A more detailed discussion of PriorSet will follow in a later article on up_thru.

3) The GetInfo::get_infos() function

void PG::RecoveryState::GetInfo::get_infos()
{
	PG *pg = context< RecoveryMachine >().pg;
	unique_ptr<PriorSet> &prior_set = context< Peering >().prior_set;
	
	pg->blocked_by.clear();
	for (set<pg_shard_t>::const_iterator it = prior_set->probe.begin();it != prior_set->probe.end();++it) {
		pg_shard_t peer = *it;
		if (peer == pg->pg_whoami) {
			continue;
		}

		if (pg->peer_info.count(peer)) {
			dout(10) << " have osd." << peer << " info " << pg->peer_info[peer] << dendl;
			continue;
		}

		if (peer_info_requested.count(peer)) {
			dout(10) << " already requested info from osd." << peer << dendl;
			pg->blocked_by.insert(peer.osd);
		} else if (!pg->get_osdmap()->is_up(peer.osd)) {
			dout(10) << " not querying info from down osd." << peer << dendl;
		} else {
			dout(10) << " querying info from osd." << peer << dendl;
			context< RecoveryMachine >().send_query(
			  peer, pg_query_t(pg_query_t::INFO,
				it->shard, pg->pg_whoami.shard,
				pg->info.history,
				pg->get_osdmap()->get_epoch()));
			peer_info_requested.insert(peer);
			pg->blocked_by.insert(peer.osd);
		}
	}
	
	pg->publish_stats_to_osd();
}

get_infos() iterates over prior_set->probe and sends a pg_query_t::INFO query to each peer OSD to fetch its pg_info_t. For PG 11.4, the probe set built by PG::build_prior() above is [2,3], so the INFO query goes to osd2.

47483:2020-09-11 14:10:20.016310 7fba3d925700 10 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 peering]  PriorSet: build_prior final: probe 2,3 down  blocked_by {}

4) Receiving the MNotifyRec message

In step 3) above a pg_query_t::INFO query was sent to osd2; the response has now arrived and is handled by GetInfo::react(const MNotifyRec&):

boost::statechart::result PG::RecoveryState::GetInfo::react(const MNotifyRec& infoevt)
{
	...
}

Here is the corresponding log fragment for this process:

50841:2020-09-11 14:10:20.031095 7fba3d124700 10 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 peering] handle_peering_event: epoch_sent: 2227 epoch_requested: 2227 MNotifyRec from 2 notify: (query_epoch:2227, epoch_sent:2227, info:11.4( DNE empty local-les=0 n=0 ec=0 les/c/f 0/0/0 0/0/0)) features: 0x7ffffffefdfbfff
50846:2020-09-11 14:10:20.031104 7fba3d124700 10 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 peering]  got osd.2 11.4( DNE empty local-les=0 n=0 ec=0 les/c/f 0/0/0 0/0/0)
50848:2020-09-11 14:10:20.031115 7fba3d124700 10 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 peering] update_heartbeat_peers 2,3 unchanged
50850:2020-09-11 14:10:20.031122 7fba3d124700 20 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 peering] state<Started/Primary/Peering/GetInfo>: Adding osd: 2 peer features: 7ffffffefdfbfff
50851:2020-09-11 14:10:20.031129 7fba3d124700 20 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 peering] state<Started/Primary/Peering/GetInfo>: Common peer features: 7ffffffefdfbfff
50854:2020-09-11 14:10:20.031135 7fba3d124700 20 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 peering] state<Started/Primary/Peering/GetInfo>: Common acting features: 7ffffffefdfbfff
50857:2020-09-11 14:10:20.031140 7fba3d124700 20 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 peering] state<Started/Primary/Peering/GetInfo>: Common upacting features: 7ffffffefdfbfff
50859:2020-09-11 14:10:20.031147 7fba3d124700  5 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 peering] exit Started/Primary/Peering/GetInfo 0.014876 2 0.000170

In react(const MNotifyRec&) above, PG::proc_replica_info() is called first to process the pg_info returned by the replica OSD. If PG::proc_replica_info() returns true, the returned pg_info is valid, and the following steps are executed:

4.1) Check pg->info.history.last_epoch_started: if the primary's old_start (its value before the merge) is smaller than the returned history.last_epoch_started, the primary's view is too old, so build_prior() must be redone and get_infos() reissued;

Note: at this point pg->info.history.last_epoch_started is e2224 (the les field in the log lines above).

4.2) Check whether responses to all the requests in peer_info_requested have been received. Among these responses, at least one OSD from the last rw interval must be complete (!incomplete()); only then are the PG's most recent modifications fully known, so that recovery can proceed:

boost::statechart::result PG::RecoveryState::GetInfo::react(const MNotifyRec& infoevt) 
{
	...

	// are we done getting everything?
	if (peer_info_requested.empty() && !prior_set->pg_down){
		/*
		* make sure we have at least one !incomplete() osd from the
		* last rw interval.  the incomplete (backfilling) replicas
		* get a copy of the log, but they don't get all the object
		* updates, so they are insufficient to recover changes during
		* that interval.
		*/
		if (pg->info.history.last_epoch_started) {
			for (map<epoch_t,pg_interval_t>::reverse_iterator p = pg->past_intervals.rbegin();p != pg->past_intervals.rend();++p) {
				if (p->first < pg->info.history.last_epoch_started)
					break;
				if (!p->second.maybe_went_rw)
					continue;

				pg_interval_t& interval = p->second;
				dout(10) << " last maybe_went_rw interval was " << interval << dendl;
				OSDMapRef osdmap = pg->get_osdmap();
	
				/*
				* this mirrors the PriorSet calculation: we wait if we
				* don't have an up (AND !incomplete) node AND there are
				* nodes down that might be usable.
				*/
				bool any_up_complete_now = false;
				bool any_down_now = false;
				for (unsigned i=0; i<interval.acting.size(); i++) {
					int o = interval.acting[i];

					if (o == CRUSH_ITEM_NONE)
						continue;
					pg_shard_t so(o, pg->pool.info.ec_pool() ? shard_id_t(i) : shard_id_t::NO_SHARD);
					if (!osdmap->exists(o) || osdmap->get_info(o).lost_at > interval.first)
						continue;  // dne or lost
					if (osdmap->is_up(o)) {
						pg_info_t *pinfo;
						if (so == pg->pg_whoami) {
							pinfo = &pg->info;
						} else {
							assert(pg->peer_info.count(so));
							pinfo = &pg->peer_info[so];
						}
						if (!pinfo->is_incomplete())
							any_up_complete_now = true;

					} else {
						any_down_now = true;
					}

				}  //end for

				if (!any_up_complete_now && any_down_now) {
					dout(10) << " no osds up+complete from interval " << interval << dendl;

					pg->state_set(PG_STATE_DOWN);
					pg->publish_stats_to_osd();
					return discard_event();

				}  //end if

				break;

			} //end for

		}  //end if
		dout(20) << "Common peer features: " << hex << pg->get_min_peer_features() << dec << dendl;
		dout(20) << "Common acting features: " << hex << pg->get_min_acting_features() << dec << dendl;
		dout(20) << "Common upacting features: " << hex << pg->get_min_upacting_features() << dec << dendl;
		post_event(GotInfo());

	}  //end if

	...
}

For PG 11.4 here, the last past_interval is [e2226,e2226] and the one before it is [e2223,e2225].
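
Tracing the loop above for PG 11.4:

// reverse-iterate past_intervals, with les = info.history.last_epoch_started = 2224:
//   [2226,2226]: "not rw"                     -> !maybe_went_rw, continue
//   [2223,2225]: p->first = 2223 < les = 2224 -> break (les falls inside this
//       interval, so there is nothing older to examine)
// peer_info_requested is now empty and !prior_set->pg_down
//   -> print the "Common ... features" lines and post_event(GotInfo())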

5) The PG::proc_replica_info() function

bool PG::proc_replica_info(
  pg_shard_t from, const pg_info_t &oinfo, epoch_t send_epoch)
{
	map<pg_shard_t, pg_info_t>::iterator p = peer_info.find(from);
	if (p != peer_info.end() && p->second.last_update == oinfo.last_update) {
		dout(10) << " got dup osd." << from << " info " << oinfo << ", identical to ours" << dendl;
		return false;
	}
	
	if (!get_osdmap()->has_been_up_since(from.osd, send_epoch)) {
		dout(10) << " got info " << oinfo << " from down osd." << from<< " discarding" << dendl;
		return false;
	}
	
	dout(10) << " got osd." << from << " " << oinfo << dendl;
	assert(is_primary());
	peer_info[from] = oinfo;
	might_have_unfound.insert(from);
	
	unreg_next_scrub();
	if (info.history.merge(oinfo.history))
		dirty_info = true;

	reg_next_scrub();
	
	// stray?
	if (!is_up(from) && !is_acting(from)) {
		dout(10) << " osd." << from << " has stray content: " << oinfo << dendl;
		stray_set.insert(from);
		if (is_clean()) {
			purge_strays();
		}
	}
	
	// was this a new info?  if so, update peers!
	if (p == peer_info.end())
		update_heartbeat_peers();
	
	return true;
}

PG::proc_replica_info() first checks whether the received pg_info is valid; the following two cases are judged invalid and false is returned directly:

  • a duplicate pg_info was received (its last_update is identical to the copy we already hold);

  • the source OSD of the pg_info has not stayed up since the epoch in which it was sent.

Next, pg_history_t::merge() is called to merge the history; this is a very important function:

bool merge(const pg_history_t &other) {
	// Here, we only update the fields which cannot be calculated from the OSDmap.
	bool modified = false;
	if (epoch_created < other.epoch_created) {
		epoch_created = other.epoch_created;
		modified = true;
	}

	if (last_epoch_started < other.last_epoch_started) {
		last_epoch_started = other.last_epoch_started;
		modified = true;
	}

	if (last_epoch_clean < other.last_epoch_clean) {
		last_epoch_clean = other.last_epoch_clean;
		modified = true;
	}
	if (last_epoch_split < other.last_epoch_split) {
		last_epoch_split = other.last_epoch_split; 
		modified = true;
	}
	if (last_epoch_marked_full < other.last_epoch_marked_full) {
		last_epoch_marked_full = other.last_epoch_marked_full;
		modified = true;
	}
	if (other.last_scrub > last_scrub) {
		last_scrub = other.last_scrub;
		modified = true;
	}
	if (other.last_scrub_stamp > last_scrub_stamp) {
		last_scrub_stamp = other.last_scrub_stamp;
		modified = true;
	}
	if (other.last_deep_scrub > last_deep_scrub) {
		last_deep_scrub = other.last_deep_scrub;
		modified = true;
	}
	if (other.last_deep_scrub_stamp > last_deep_scrub_stamp) {
		last_deep_scrub_stamp = other.last_deep_scrub_stamp;
		modified = true;
	}
	if (other.last_clean_scrub_stamp > last_clean_scrub_stamp) {
		last_clean_scrub_stamp = other.last_clean_scrub_stamp;
		modified = true;
	}
	return modified;
}

As the code shows, merge() returns true if any history field was updated, and false otherwise. pg_history_t::merge() merges the following fields:

  • history.epoch_created

  • history.last_epoch_started

  • history.last_epoch_clean

  • history.last_epoch_split

  • history.last_epoch_marked_full

  • history.last_scrub

  • history.last_scrub_stamp

  • history.last_deep_scrub

  • history.last_deep_scrub_stamp

  • history.last_clean_scrub_stamp

Reading this together with the following log output:

50841:2020-09-11 14:10:20.031095 7fba3d124700 10 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 peering] handle_peering_event: epoch_sent: 2227 epoch_requested: 2227 MNotifyRec from 2 notify: (query_epoch:2227, epoch_sent:2227, info:11.4( DNE empty local-les=0 n=0 ec=0 les/c/f 0/0/0 0/0/0)) features: 0x7ffffffefdfbfff
50846:2020-09-11 14:10:20.031104 7fba3d124700 10 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 peering]  got osd.2 11.4( DNE empty local-les=0 n=0 ec=0 les/c/f 0/0/0 0/0/0)

The printing operators for pg_info are implemented as follows (src/osd/osd_types.h):

inline ostream& operator<<(ostream& out, const pg_info_t& pgi) 
{
	out << pgi.pgid << "(";
	if (pgi.dne())
		out << " DNE";
	if (pgi.is_empty())
		out << " empty";
	else {
		out << " v " << pgi.last_update;
		if (pgi.last_complete != pgi.last_update)
			out << " lc " << pgi.last_complete;
		out << " (" << pgi.log_tail << "," << pgi.last_update << "]";
	}
	if (pgi.is_incomplete())
		out << " lb " << pgi.last_backfill << (pgi.last_backfill_bitwise ? " (bitwise)" : " (NIBBLEWISE)");

	//out << " c " << pgi.epoch_created;
	out << " local-les=" << pgi.last_epoch_started;
	out << " n=" << pgi.stats.stats.sum.num_objects;
	out << " " << pgi.history<< ")";

	return out;
}

inline ostream& operator<<(ostream& out, const pg_history_t& h) {
	return out << "ec=" << h.epoch_created
		<< " les/c/f " << h.last_epoch_started << "/" << h.last_epoch_clean
		<< "/" << h.last_epoch_marked_full
		<< " " << h.same_up_since << "/" << h.same_interval_since << "/" << h.same_primary_since;
}

From this we can tell that for PG 11.4 the pg_info obtained from osd2 is:

  • info.last_epoch_started is e0

  • info.stats.stats.sum.num_objects is 0

  • info.history.epoch_created is 0

  • info.history.last_epoch_started is e0

  • info.history.last_epoch_clean is e0

  • info.history.last_epoch_marked_full is e0

  • info.history.same_up_since is e0

  • info.history.same_interval_since is e0

  • info.history.same_primary_since is e0

3.4.2 Entering the Started/Primary/Peering/GetLog state

After the Info stage completes successfully, the PG enters the GetLog state:

PG::RecoveryState::GetLog::GetLog(my_context ctx)
  : my_base(ctx),
    NamedState(
      context< RecoveryMachine >().pg->cct, "Started/Primary/Peering/GetLog"),
    msg(0)
{
	context< RecoveryMachine >().log_enter(state_name);
	
	PG *pg = context< RecoveryMachine >().pg;
	
	// adjust acting?
	if (!pg->choose_acting(auth_log_shard, false,&context< Peering >().history_les_bound)) {
		if (!pg->want_acting.empty()) {
			post_event(NeedActingChange());
		} else {
			post_event(IsIncomplete());
		}
		return;
	}
	
	// am i the best?
	if (auth_log_shard == pg->pg_whoami) {
		post_event(GotLog());
		return;
	}
	
	const pg_info_t& best = pg->peer_info[auth_log_shard];
	
	// am i broken?
	if (pg->info.last_update < best.log_tail) {
		dout(10) << " not contiguous with osd." << auth_log_shard << ", down" << dendl;
		post_event(IsIncomplete());
		return;
	}
	
	// how much log to request?
	eversion_t request_log_from = pg->info.last_update;
	assert(!pg->actingbackfill.empty());
	for (set<pg_shard_t>::iterator p = pg->actingbackfill.begin();p != pg->actingbackfill.end();++p) {
		if (*p == pg->pg_whoami) continue;

		pg_info_t& ri = pg->peer_info[*p];
		if (ri.last_update >= best.log_tail && ri.last_update < request_log_from)
			request_log_from = ri.last_update;
	}
	
	// how much?
	dout(10) << " requesting log from osd." << auth_log_shard << dendl;
	context<RecoveryMachine>().send_query(
	  auth_log_shard,
	  pg_query_t(
		pg_query_t::LOG,
		auth_log_shard.shard, pg->pg_whoami.shard,
		request_log_from, pg->info.history,
		pg->get_osdmap()->get_epoch()));
	
	assert(pg->blocked_by.empty());
	pg->blocked_by.insert(auth_log_shard.osd);
	pg->publish_stats_to_osd();
}

The execution steps are as follows:

1) Call pg->choose_acting() to select the OSD holding the authoritative log and to compute the acting_backfill and backfill_targets OSD lists. The chosen authoritative OSD is returned through the auth_log_shard parameter;

2) If the selection fails and want_acting is not empty, post a NeedActingChange event; the state machine moves to Primary/WaitActingChange to wait for the result of the temporary-PG request. If want_acting is empty, post an IsIncomplete event; the state machine moves to Primary/Peering/Incomplete, meaning the attempt failed and the PG stays in the Incomplete state.

3) If auth_log_shard equals pg->pg_whoami, i.e. the OSD holding the authoritative log is the current primary itself, post GotLog() directly to finish the GetLog phase;

4) If pg->info.last_update is smaller than best.log_tail, this OSD's log does not overlap the authoritative log, so this OSD cannot recover from it and IsIncomplete is posted. After choose_acting() has made its choice the primary must be recoverable; if it is not, a temporary PG must be requested, making the OSD holding the authoritative log the temporary primary;

5) If this OSD does not hold the authoritative log, the authoritative log must be pulled from the OSD that does and merged with the local log. Sending the pg_query_t::LOG request works the same way as the pg_query_t::INFO request.

Here is a log fragment for this stage:

50862:2020-09-11 14:10:20.031154 7fba3d124700  5 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 peering] enter Started/Primary/Peering/GetLog
50863:2020-09-11 14:10:20.031161 7fba3d124700 10 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 peering] calc_acting osd.2 11.4( DNE empty local-les=0 n=0 ec=0 les/c/f 0/0/0 0/0/0)
50865:2020-09-11 14:10:20.031168 7fba3d124700 10 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 peering] calc_acting osd.3 11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223)
50873:2020-09-11 14:10:20.031192 7fba3d124700 10 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 peering] calc_acting newest update on osd.3 with 11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223)
50875:calc_acting primary is osd.3 with 11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223)
50876: osd.2 (up) accepted 11.4( DNE empty local-les=0 n=0 ec=0 les/c/f 0/0/0 0/0/0)
50878:2020-09-11 14:10:20.031210 7fba3d124700 10 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 peering] actingbackfill is 2,3
50880:2020-09-11 14:10:20.031215 7fba3d124700 10 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 peering] choose_acting want [3,2] (== acting) backfill_targets 
50881:2020-09-11 14:10:20.031220 7fba3d124700 10 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 peering] state<Started/Primary/Peering/GetLog>: leaving GetLog
50882:2020-09-11 14:10:20.031225 7fba3d124700  5 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 peering] exit Started/Primary/Peering/GetLog 0.000070 0 0.000000
50883:2020-09-11 14:10:20.031242 7fba3d124700 15 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 peering] publish_stats_to_osd 2227:1339

3.4.2.1 The PG::choose_acting() function

bool PG::choose_acting(pg_shard_t &auth_log_shard_id,
		       bool restrict_to_up_acting,
		       bool *history_les_bound);

As before, choose_acting() computes the PG's acting_backfill and backfill_targets OSD lists: acting_backfill holds the PG's current acting list plus the OSDs that need a Backfill operation, while backfill_targets holds only the OSDs that need Backfill.

The current acting set is [3,2]; the pg_info candidates for electing the authoritative log are as follows:

50863:2020-09-11 14:10:20.031161 7fba3d124700 10 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 peering] calc_acting osd.2 11.4( DNE empty local-les=0 n=0 ec=0 les/c/f 0/0/0 0/0/0)
50865:2020-09-11 14:10:20.031168 7fba3d124700 10 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 peering] calc_acting osd.3 11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223)

So the authoritative log is clearly on osd3.

The computed actingbackfill is [2,3] and want is [3,2], which equals the current acting set [3,2], so no pg_temp change is needed this time. Note that although osd2's info is empty (last_update 0'0), 0'0 is not below osd3's log_tail 0'0, so osd2 can still be caught up by log-based recovery rather than backfill, and backfill_targets stays empty (see the "osd.2 (up) accepted" and "choose_acting want [3,2] (== acting) backfill_targets" log lines above).


The authoritative-log OSD chosen in the GetLog() constructor is osd3, i.e. the current primary itself, so GotLog() is posted directly and the PG proceeds to the GetMissing state.

3.4.3 Entering the Started/Primary/Peering/GetMissing state
/*------GetMissing--------*/
PG::RecoveryState::GetMissing::GetMissing(my_context ctx)
  : my_base(ctx),
    NamedState(context< RecoveryMachine >().pg->cct, "Started/Primary/Peering/GetMissing")
{
	context< RecoveryMachine >().log_enter(state_name);
	
	PG *pg = context< RecoveryMachine >().pg;
	assert(!pg->actingbackfill.empty());
	for (set<pg_shard_t>::iterator i = pg->actingbackfill.begin();i != pg->actingbackfill.end();++i) {
		if (*i == pg->get_primary()) continue;
		const pg_info_t& pi = pg->peer_info[*i];
	
		if (pi.is_empty())
			continue;                                // no pg data, nothing divergent
	
		if (pi.last_update < pg->pg_log.get_tail()) {
			dout(10) << " osd." << *i << " is not contiguous, will restart backfill" << dendl;
			pg->peer_missing[*i];
			continue;
		}

		if (pi.last_backfill == hobject_t()) {
			dout(10) << " osd." << *i << " will fully backfill; can infer empty missing set" << dendl;
			pg->peer_missing[*i];
			continue;
		}

		if (pi.last_update == pi.last_complete &&  // peer has no missing
		  pi.last_update == pg->info.last_update) {  // peer is up to date
			// replica has no missing and identical log as us.  no need to
			// pull anything.
			// FIXME: we can do better here.  if last_update==last_complete we
			//        can infer the rest!
			dout(10) << " osd." << *i << " has no missing, identical log" << dendl;
			pg->peer_missing[*i];
			continue;
		}

		// We pull the log from the peer's last_epoch_started to ensure we
		// get enough log to detect divergent updates.
		eversion_t since(pi.last_epoch_started, 0);
		assert(pi.last_update >= pg->info.log_tail);  // or else choose_acting() did a bad thing
		if (pi.log_tail <= since) {
			dout(10) << " requesting log+missing since " << since << " from osd." << *i << dendl;
			context< RecoveryMachine >().send_query(
			  *i,
			  pg_query_t(
				pg_query_t::LOG,
				i->shard, pg->pg_whoami.shard,
				since, pg->info.history,
				pg->get_osdmap()->get_epoch()));
		} else {
			dout(10) << " requesting fulllog+missing from osd." << *i<< " (want since " << since << " < log.tail " << pi.log_tail << ")"<< dendl;
			context< RecoveryMachine >().send_query(
			  *i, pg_query_t(
				pg_query_t::FULLLOG,
				i->shard, pg->pg_whoami.shard,
				pg->info.history, pg->get_osdmap()->get_epoch()));
		}
		peer_missing_requested.insert(*i);
		pg->blocked_by.insert(i->osd);
	}
	
	if (peer_missing_requested.empty()) {
		if (pg->need_up_thru) {
			dout(10) << " still need up_thru update before going active" << dendl;
			post_event(NeedUpThru());
			return;
		}
	
		// all good!
		post_event(Activate(pg->get_osdmap()->get_epoch()));
	} else {
		pg->publish_stats_to_osd();
	}
}

After entering the GetMissing state, the constructor iterates over the actingbackfill list, which for PG 11.4 is [2,3]. Since the pg_info osd2 returned has last_update 0 (an empty info), the pi.is_empty() check makes the loop skip osd2 right away.
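
The branch ladder of the constructor can be condensed into the following decision sketch (a hypothetical simplification with mocked-up parameters; versions are reduced to plain numbers):

enum class PeerQuery { None, Log, FullLog };

// Condensed decision ladder of GetMissing (simplified sketch, not Ceph code).
PeerQuery classify_peer(bool info_empty,
                        unsigned peer_last_update,
                        unsigned peer_log_tail,
                        unsigned my_log_tail,
                        unsigned peer_last_epoch_started)
{
	if (info_empty)
		return PeerQuery::None;                  // no pg data, nothing divergent
	if (peer_last_update < my_log_tail)
		return PeerQuery::None;                  // not contiguous: backfill instead
	unsigned since = peer_last_epoch_started;    // pull from last_epoch_started
	if (peer_log_tail <= since)
		return PeerQuery::Log;                   // request log+missing since `since`
	return PeerQuery::FullLog;                   // peer's log does not reach back
}

For PG 11.4, osd2 falls into the first branch, which is why no query is sent and peer_missing_requested stays empty.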

Then, since build_prior() (invoked from the GetInfo() constructor) had determined that PG 11.4 needs an up_thru update, a NeedUpThru event is posted here and the PG enters the WaitUpThru state.

The log fragment for this stage:

50885:2020-09-11 14:10:20.031249 7fba3d124700  5 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 peering] enter Started/Primary/Peering/GetMissing
50892:2020-09-11 14:10:20.031254 7fba3d124700 10 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 peering] state<Started/Primary/Peering/GetMissing>:  still need up_thru update before going active
50894:2020-09-11 14:10:20.031270 7fba3d124700  5 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 peering] exit Started/Primary/Peering/GetMissing 0.000021 0 0.000000
50895:2020-09-11 14:10:20.031275 7fba3d124700 15 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 peering] publish_stats_to_osd 2227: no change since 2020-09-11 14:10:20.031240

3.4.4 Entering the Started/Primary/Peering/WaitUpThru state
/*------WaitUpThru--------*/
PG::RecoveryState::WaitUpThru::WaitUpThru(my_context ctx)
  : my_base(ctx),
    NamedState(context< RecoveryMachine >().pg->cct, "Started/Primary/Peering/WaitUpThru")
{
	context< RecoveryMachine >().log_enter(state_name);
}

The log fragment for this stage:

50896:2020-09-11 14:10:20.031285 7fba3d124700  5 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 peering] enter Started/Primary/Peering/WaitUpThru
51762:2020-09-11 14:10:20.420033 7fba496c5700 30 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 peering] lock
51764:2020-09-11 14:10:20.420048 7fba496c5700 10 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 peering] flushed
51770:2020-09-11 14:10:20.420077 7fba3d124700 30 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 peering] lock
51772:2020-09-11 14:10:20.420086 7fba3d124700 10 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 peering] handle_peering_event: epoch_sent: 2227 epoch_requested: 2227 FlushedEvt
51773:2020-09-11 14:10:20.420096 7fba3d124700 15 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 peering]  requeue_ops 
52620:2020-09-11 14:10:20.631849 7fba4fd58700 20 osd.3 2227 11.4 heartbeat_peers 2,3
53389:2020-09-11 14:10:20.745327 7fba4f557700 25 osd.3 2227  sending 11.4 2227:1339
53689:2020-09-11 14:10:21.017381 7fba49ec6700 30 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 peering] lock
53917:2020-09-11 14:10:21.021443 7fba49ec6700 30 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 peering] lock
53919:2020-09-11 14:10:21.021465 7fba49ec6700 10 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 peering] null
54472:2020-09-11 14:10:21.026500 7fba49ec6700 20 osd.3 2228 11.4 heartbeat_peers 2,3
54757:2020-09-11 14:10:21.028048 7fba3d925700 30 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 peering] lock
54759:2020-09-11 14:10:21.028061 7fba3d925700 10 osd.3 pg_epoch: 2227 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 peering] handle_advance_map [3,2]/[3,2] -- 3/3
54761:2020-09-11 14:10:21.028073 7fba3d925700 10 osd.3 pg_epoch: 2228 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 peering] state<Started/Primary/Peering>: Peering advmap
54762:2020-09-11 14:10:21.028081 7fba3d925700 10 osd.3 pg_epoch: 2228 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 peering] adjust_need_up_thru now 2227, need_up_thru now false
54763:2020-09-11 14:10:21.028087 7fba3d925700 10 osd.3 pg_epoch: 2228 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 peering] state<Started>: Started advmap
54764:2020-09-11 14:10:21.028095 7fba3d925700 10 osd.3 pg_epoch: 2228 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 peering] check_recovery_sources no source osds () went down
54765:2020-09-11 14:10:21.028104 7fba3d925700 10 osd.3 pg_epoch: 2228 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 peering] handle_activate_map 
54766:2020-09-11 14:10:21.028112 7fba3d925700  7 osd.3 pg_epoch: 2228 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 peering] state<Started/Primary>: handle ActMap primary
54767:2020-09-11 14:10:21.028120 7fba3d925700 15 osd.3 pg_epoch: 2228 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 peering] publish_stats_to_osd 2227: no change since 2020-09-11 14:10:20.031240
54768:2020-09-11 14:10:21.028129 7fba3d925700 10 osd.3 pg_epoch: 2228 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 peering] take_waiters
54769:2020-09-11 14:10:21.028136 7fba3d925700  5 osd.3 pg_epoch: 2228 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 peering] exit Started/Primary/Peering/WaitUpThru 0.996851 3 0.000229

1) Handling the AdvMap event

Because need_up_thru was set as described above, the OSD requests an up_thru update from the OSDMonitor, which results in a new osdmap. OSD::process_peering_events(), OSD::advance_pg() and PG::handle_advance_map() are then called to process the change, producing an AdvMap event. At this point the newest osdmap we hold is e2228.

Since the current state WaitUpThru cannot handle the AdvMap event itself, the reactions of its outer states are invoked instead. The relevant code:

boost::statechart::result PG::RecoveryState::Peering::react(const AdvMap& advmap) 
{
	PG *pg = context< RecoveryMachine >().pg;
	dout(10) << "Peering advmap" << dendl;
	if (prior_set.get()->affected_by_map(advmap.osdmap, pg)) {
		dout(1) << "Peering, affected_by_map, going to Reset" << dendl;
		post_event(advmap);
		return transit< Reset >();
	}
	
	pg->adjust_need_up_thru(advmap.osdmap);
	
	return forward_event();
}

boost::statechart::result PG::RecoveryState::Started::react(const AdvMap& advmap)
{
	dout(10) << "Started advmap" << dendl;
	PG *pg = context< RecoveryMachine >().pg;
	pg->check_full_transition(advmap.lastmap, advmap.osdmap);
	if (pg->should_restart_peering(
	  advmap.up_primary,
	  advmap.acting_primary,
	  advmap.newup,
	  advmap.newacting,
	  advmap.lastmap,
	  advmap.osdmap)) {
		dout(10) << "should_restart_peering, transitioning to Reset" << dendl;
		post_event(advmap);
		return transit< Reset >();
	}
	pg->remove_down_peer_info(advmap.osdmap);
	return discard_event();
}

// true if the given map affects the prior set
bool PG::PriorSet::affected_by_map(const OSDMapRef osdmap, const PG *debug_pg) const
{
	...
}

At this stage, prior_set->probe is [2,3], prior_set->down is [], and prior_set->blocked_by is {}.

PG::PriorSet::affected_by_map() decides whether the given osdmap affects the current prior_set. If it returns true, the map does affect it and the PG should go back to the Reset state and re-peer; otherwise it returns false.
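
The function body is elided above; roughly speaking, it compares the new map against the probe/down/blocked_by sets. A self-contained sketch of those checks (mock types; simplified, not the verbatim Ceph implementation):

#include <map>
#include <set>

// Mock of the OSDMap/PriorSet pieces used below (illustration only).
struct MockOSDMap {
	std::set<int> up_osds;                 // OSDs currently up
	std::map<int, unsigned> lost_at;       // osd -> epoch it was marked lost
	bool is_up(int o) const   { return up_osds.count(o) > 0; }
	bool is_down(int o) const { return !is_up(o); }
	unsigned get_lost_at(int o) const {
		auto it = lost_at.find(o);
		return it == lost_at.end() ? 0 : it->second;
	}
};

struct MockPriorSet {
	std::set<int> probe;                   // OSDs we want to query
	std::set<int> down;                    // OSDs we already know are down
	std::map<int, unsigned> blocked_by;    // osd -> lost_at value we rely on

	// Simplified version of the affected_by_map() checks.
	bool affected_by_map(const MockOSDMap &osdmap) const {
		for (int o : probe)                // did a probe target go down?
			if (osdmap.is_down(o) && down.count(o) == 0)
				return true;
		for (int o : down)                 // did a known-down OSD come up?
			if (osdmap.is_up(o))
				return true;
		for (const auto &b : blocked_by)   // did a blocker's lost_at change?
			if (osdmap.get_lost_at(b.first) != b.second)
				return true;
		return false;
	}
};

For PG 11.4, probe is [2,3] and both OSDs remain up in e2228, so every check fails and the function returns false.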

Here, for PG 11.4, osdmap e2228 does not affect the current prior_set, so PG::adjust_need_up_thru() is called to adjust need_up_thru:

bool PG::adjust_need_up_thru(const OSDMapRef osdmap)
{
	epoch_t up_thru = get_osdmap()->get_up_thru(osd->whoami);
	if (need_up_thru &&
	  up_thru >= info.history.same_interval_since) {
		dout(10) << "adjust_need_up_thru now " << up_thru << ", need_up_thru now false" << dendl;
		need_up_thru = false;
		return true;
	}
	return false;
}

Since the current up_thru is now e2227, which is not older than info.history.same_interval_since, the condition up_thru >= same_interval_since holds and need_up_thru is set to false here.

The AdvMap event is then forwarded upwards and handled by Started::react(const AdvMap&); no re-peering needs to be triggered here.

2) Handling the ActMap event

After the AdvMap event for the osdmap change has been processed, an ActMap event is generated next. It is handled as follows:

boost::statechart::result PG::RecoveryState::WaitUpThru::react(const ActMap& am)
{
	PG *pg = context< RecoveryMachine >().pg;
	if (!pg->need_up_thru) {
		post_event(Activate(pg->get_osdmap()->get_epoch()));
	}
	return forward_event();
}

boost::statechart::result PG::RecoveryState::Primary::react(const ActMap&)
{
	dout(7) << "handle ActMap primary" << dendl;
	PG *pg = context< RecoveryMachine >().pg;
	pg->publish_stats_to_osd();
	pg->take_waiters();
	return discard_event();
}

As we can see above, when pg->need_up_thru is false, an Activate() event is posted and the PG enters the Active state. PG 11.4 does not need another up_thru round, so it proceeds to Active here.

3.5 Entering the Active state

/*---------Active---------*/
PG::RecoveryState::Active::Active(my_context ctx)
  : my_base(ctx),
    NamedState(context< RecoveryMachine >().pg->cct, "Started/Primary/Active"),
    remote_shards_to_reserve_recovery(
      unique_osd_shard_set(
	context< RecoveryMachine >().pg->pg_whoami,
	context< RecoveryMachine >().pg->actingbackfill)),
    remote_shards_to_reserve_backfill(
      unique_osd_shard_set(
	context< RecoveryMachine >().pg->pg_whoami,
	context< RecoveryMachine >().pg->backfill_targets)),
    all_replicas_activated(false)
{
	context< RecoveryMachine >().log_enter(state_name);
	
	PG *pg = context< RecoveryMachine >().pg;
	
	assert(!pg->backfill_reserving);
	assert(!pg->backfill_reserved);
	assert(pg->is_primary());
	dout(10) << "In Active, about to call activate" << dendl;
	pg->start_flush(
		context< RecoveryMachine >().get_cur_transaction(),
		context< RecoveryMachine >().get_on_applied_context_list(),
		context< RecoveryMachine >().get_on_safe_context_list());

	pg->activate(*context< RecoveryMachine >().get_cur_transaction(),
		pg->get_osdmap()->get_epoch(),
		*context< RecoveryMachine >().get_on_safe_context_list(),
		*context< RecoveryMachine >().get_query_map(),
		context< RecoveryMachine >().get_info_map(),
		context< RecoveryMachine >().get_recovery_ctx());
	
	// everyone has to commit/ack before we are truly active
	pg->blocked_by.clear();
	for (set<pg_shard_t>::iterator p = pg->actingbackfill.begin();p != pg->actingbackfill.end();	++p) {
		if (p->shard != pg->pg_whoami.shard) {
			pg->blocked_by.insert(p->shard);
		}
	}
	pg->publish_stats_to_osd();
	dout(10) << "Activate Finished" << dendl;
}

In the Active constructor above, pg->activate() is called to activate the PG. A log fragment for this stage:

54772:2020-09-11 14:10:21.028162 7fba3d925700  5 osd.3 pg_epoch: 2228 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 inactive] enter Started/Primary/Active
54773:2020-09-11 14:10:21.028170 7fba3d925700 10 osd.3 pg_epoch: 2228 pg[11.4( v 201'1 (0'0,201'1] local-les=2224 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 inactive] state<Started/Primary/Active>: In Active, about to call activate
54774:2020-09-11 14:10:21.028179 7fba3d925700 20 osd.3 pg_epoch: 2228 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 inactive] activate - purged_snaps [] cached_removed_snaps []
54775:2020-09-11 14:10:21.028186 7fba3d925700 10 osd.3 pg_epoch: 2228 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 inactive] activate - snap_trimq []
54776:2020-09-11 14:10:21.028193 7fba3d925700 10 osd.3 pg_epoch: 2228 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 inactive] activate - no missing, moving last_complete 201'1 -> 201'1
54777:2020-09-11 14:10:21.028200 7fba3d925700 10 osd.3 pg_epoch: 2228 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 inactive] activate peer osd.2 11.4( DNE empty local-les=0 n=0 ec=0 les/c/f 0/0/0 0/0/0)
54778:2020-09-11 14:10:21.028213 7fba3d925700 10 osd.3 pg_epoch: 2228 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 inactive] activate peer osd.2 sending log((0'0,201'1], crt=201'1)
54780:2020-09-11 14:10:21.028240 7fba3d925700 10 osd.3 pg_epoch: 2228 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 inactive] activate peer osd.2 11.4( DNE v 201'1 lc 0'0 (0'0,201'1] local-les=0 n=0 ec=0 les/c/f 0/0/0 0/0/0) missing missing(1)
54781:2020-09-11 14:10:21.028251 7fba3d925700 10 osd.3 pg_epoch: 2228 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 inactive] needs_recovery osd.2 has 1 missing
54782:2020-09-11 14:10:21.028258 7fba3d925700 10 osd.3 pg_epoch: 2228 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 inactive] add_batch_sources_info: adding sources in batch 1
54783:2020-09-11 14:10:21.028265 7fba3d925700 10 osd.3 pg_epoch: 2228 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 inactive] build_might_have_unfound
54784:2020-09-11 14:10:21.028274 7fba3d925700 10 osd.3 pg_epoch: 2228 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 inactive] _calc_past_interval_range: already have past intervals back to 2224
54785:2020-09-11 14:10:21.028282 7fba3d925700 15 osd.3 pg_epoch: 2228 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 inactive] build_might_have_unfound: built 2
54786:2020-09-11 14:10:21.028288 7fba3d925700 10 osd.3 pg_epoch: 2228 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 degraded] activate - starting recovery
54787:2020-09-11 14:10:21.028299 7fba3d925700 10 osd.3 2228 queue_for_recovery queued pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 degraded]
54789:2020-09-11 14:10:21.028307 7fba3d925700 15 osd.3 pg_epoch: 2228 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 activating+degraded] publish_stats_to_osd 2228:1340
54790:2020-09-11 14:10:21.028315 7fba3d925700 10 osd.3 pg_epoch: 2228 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 activating+degraded] state<Started/Primary/Active>: Activate Finished

The Active constructor proceeds as follows:

1) Initialize remote_shards_to_reserve_recovery and remote_shards_to_reserve_backfill, the OSD shards on which Recovery and Backfill reservations will later be requested;

2) Call pg->start_flush() to flush the relevant data;

3) Call pg->activate() to do the actual activation work.

At this point PG 11.4 maps to [3,2], but osd2 is missing one object, which is why the log above still shows the state as inactive.

Note: after entering Active, the PG enters its initial substate Started/Primary/Active/Activating by default.

3.5.1 MissingLoc

Before walking through pg->activate(), let us first look at the MissingLoc it relies on. The MissingLoc class records the locations of objects in the missing state, i.e. which OSDs hold a correct version of each missing object; recovery then pulls the object data from those OSDs:

class MissingLoc {
	// missing object ---> item(current version, missing version)
	map<hobject_t, pg_missing_t::item, hobject_t::BitwiseComparator> needs_recovery_map;

	// missing object ---> set of OSDs that hold a copy
	map<hobject_t, set<pg_shard_t>, hobject_t::BitwiseComparator > missing_loc;

	// set of all OSDs that any missing object lives on
	set<pg_shard_t> missing_loc_sources;
	PG *pg;
	set<pg_shard_t> empty_set;
};

Next we look at the MissingLoc helpers that populate these missing-object structures. There are two of them: add_active_missing() and add_source_info().

  • add_active_missing(): adds all the missing objects of one replica into MissingLoc's needs_recovery_map;
void add_active_missing(const pg_missing_t &missing) {
	...
}
  • add_source_info(): determines, for each missing object, whether the given source OSD holds a usable copy of it.
/// Adds info about a possible recovery source
bool add_source_info(
	pg_shard_t source,               ///< [in] source
	const pg_info_t &oinfo,         ///< [in] info
	const pg_missing_t &omissing,   ///< [in] (optional) missing
	bool sort_bitwise,             ///< [in] local sort bitwise (vs nibblewise)
	ThreadPool::TPHandle* handle   ///< [in] ThreadPool handle
	); 

Its implementation works as follows:

It iterates over all objects in needs_recovery_map and processes each one as follows (condensed in the sketch after this list):

1) If oinfo.last_update < need (the version at which the object is missing), skip;

2) If the peer's last_backfill is below MAX, the peer is still in the Backfill stage; if its backfill position was additionally recorded under a mismatching sort_bitwise order, skip;

3) If the object sorts beyond last_backfill, it obviously does not exist on that OSD yet, skip;

4) If the object is beyond last_complete, it is either an object that went missing after the last peering and has not yet been repaired, or a newly created one: if a missing record already exists for it, it is the former, skip; otherwise it is newly created and present on this OSD;

5) Any object that passes the checks above is present on this source OSD, so missing_loc records this OSD as a location for it.
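
The filtering can be condensed into the following self-contained sketch (mocked-up types; the sort_bitwise corner case is omitted; this is an illustration, not the Ceph implementation):

#include <map>
#include <set>
#include <string>
#include <utility>

using Version = std::pair<unsigned, unsigned>;   // (epoch, version), like 201'1
using Hobject = std::string;                     // stand-in for hobject_t

struct SourceInfo {
	Version last_update;      // how far this source's log goes
	Hobject last_backfill;    // objects sorting past this do not exist yet
};

// needs_recovery_map: missing object -> version we need
// omissing:           objects this source itself is missing
// missing_loc:        missing object -> OSDs that can serve it
void add_source_info(int source_osd,
                     const SourceInfo &oinfo,
                     const std::set<Hobject> &omissing,
                     const std::map<Hobject, Version> &needs_recovery_map,
                     std::map<Hobject, std::set<int>> &missing_loc)
{
	for (const auto &entry : needs_recovery_map) {
		const Hobject &soid = entry.first;
		const Version &need = entry.second;
		if (oinfo.last_update < need)
			continue;                          // 1) source too old for this object
		if (soid > oinfo.last_backfill)
			continue;                          // 3) not backfilled onto source yet
		if (omissing.count(soid))
			continue;                          // 4) source is missing it as well
		missing_loc[soid].insert(source_osd);  // 5) source can serve it
	}
}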

3.5.2 The activate operation

PG::activate() is the first step the primary OSD performs after entering the Active state:

void PG::activate(ObjectStore::Transaction& t,
		  epoch_t activation_epoch,
		  list<Context*>& tfin,
		  map<int, map<spg_t,pg_query_t> >& query_map,
		  map<int,
		      vector<
			pair<pg_notify_t,
			     pg_interval_map_t> > > *activator_map,
                  RecoveryCtx *ctx)
{
	...
}

This function does the following:

  • Update various pg_info fields;

  • Send messages to the replicas to activate the PG's replica copies;

  • Compute MissingLoc, i.e. which OSDs the missing objects live on, for the subsequent recovery.

The detailed steps are as follows:

1) If client replay is required, add the PG to the replay_queue;

2) Update info.last_epoch_started. info.last_epoch_started is this OSD's own record of completing the current peering, whereas info.history.last_epoch_started is only advanced once every OSD of the PG has confirmed that peering completed.

3) Update some related fields;

4) Register the C_PG_ActivateCommitted() callback, which will finalize the activation;

5) Initialize snap_trimq and other snapshot-related state;

6) Set the info.last_complete pointer (see the sketch after the two cases below):

  • If missing.num_missing() is 0, the PG is clean: set info.last_complete equal to info.last_update and call pg_log.reset_recovery_pointers() to reset the log's complete_to pointer;

  • Otherwise there are objects to recover: call pg_log.activate_not_complete(info), which sets info.last_complete to the version just before the first missing object.
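
A toy illustration of how last_complete stops just before the first missing object (hypothetical types and values, mirroring the intent of pg_log.activate_not_complete()):

#include <set>
#include <string>
#include <vector>

struct Entry { unsigned version; std::string soid; };

// last_complete advances along the log until the first entry whose object
// is still missing (simplified sketch).
unsigned compute_last_complete(const std::vector<Entry> &log,
                               const std::set<std::string> &missing,
                               unsigned tail_version)
{
	unsigned last_complete = tail_version;
	for (const Entry &e : log) {
		if (missing.count(e.soid))
			break;                    // first missing object stops the pointer
		last_complete = e.version;
	}
	return last_complete;
}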

7) The rest is primary-only work: send an MOSDPGLog message to each replica OSD to activate that PG replica. Three cases are distinguished (see the summary sketch after the list below):

void PG::activate(ObjectStore::Transaction& t,
		  epoch_t activation_epoch,
		  list<Context*>& tfin,
		  map<int, map<spg_t,pg_query_t> >& query_map,
		  map<int,
		      vector<
			pair<pg_notify_t,
			     pg_interval_map_t> > > *activator_map,
                  RecoveryCtx *ctx)
{
	...
	for (set<pg_shard_t>::iterator i = actingbackfill.begin();i != actingbackfill.end();++i) {

		...

		if (pi.last_update == info.last_update && !force_restart_backfill){

			...

		}else if (pg_log.get_tail() > pi.last_update || pi.last_backfill == hobject_t() ||
		force_restart_backfill || (backfill_targets.count(*i) && pi.last_backfill.is_max())) {	

			...
		}else{

		}

		// share past_intervals if we are creating the pg on the replica
		// based on whether our info for that peer was dne() *before*
		// updating pi.history in the backfill block above.
		if (needs_past_intervals)
			m->past_intervals = past_intervals;
		
		// update local version of peer's missing list!
		if (m && pi.last_backfill != hobject_t()) {
			for (list<pg_log_entry_t>::iterator p = m->log.log.begin();p != m->log.log.end();++p)
				if (cmp(p->soid, pi.last_backfill, get_sort_bitwise()) <= 0)
					pm.add_next_event(*p);
		}
		
		if (m) {
			dout(10) << "activate peer osd." << peer << " sending " << m->log << dendl;
			//m->log.print(cout);
			osd->send_message_osd_cluster(peer.osd, m, get_osdmap()->get_epoch());
		}
		
		// peer now has 
		pi.last_update = info.last_update;
		
		// update our missing
		if (pm.num_missing() == 0) {
			pi.last_complete = pi.last_update;
			dout(10) << "activate peer osd." << peer << " " << pi << " uptodate" << dendl;
		} else {
			dout(10) << "activate peer osd." << peer << " " << pi << " missing " << pm << dendl;
		}

	}
}
  • If pi.last_update equals info.last_update, the OSD is already clean and nothing extra needs to be sent to it; it is added to activator_map so that only pg_info is sent to activate the replica. That finally happens after the PeeringWQ thread finishes the state-machine event handling, when OSD::dispatch_context() calls OSD::do_infos();

  • For an OSD that needs Backfill, send pg_info plus osd_min_pg_log_entries entries of PG log;

  • For an OSD that needs Recovery, send pg_info plus the log entries it is missing.
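
A compact sketch of this three-way split (a hypothetical simplification of the branch structure quoted above, with versions reduced to single numbers):

#include <iostream>

enum class PeerAction { InfoOnly, Backfill, Recovery };

PeerAction choose_peer_action(unsigned peer_last_update,
                              unsigned my_last_update,
                              unsigned my_log_tail,
                              bool backfill_from_scratch)
{
	if (peer_last_update == my_last_update)
		return PeerAction::InfoOnly;   // clean peer: pg_info only
	if (my_log_tail > peer_last_update || backfill_from_scratch)
		return PeerAction::Backfill;   // our log cannot bridge the gap
	return PeerAction::Recovery;       // send the log entries it lacks
}

int main() {
	// PG 11.4 / osd2: peer last_update 0, our log is (0'0, 201'1]
	PeerAction a = choose_peer_action(0, 201, 0, false);
	std::cout << (a == PeerAction::Recovery) << std::endl;  // prints 1
}

With peer_last_update = 0 and our log tail at 0, osd2 lands in the Recovery branch, matching the "activate peer osd.2 sending log((0'0,201'1]" line in the log above.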

8) Set up MissingLoc, i.e. collect the missing objects and the OSDs they live on. The core is the call to MissingLoc::add_source_info(); see the MissingLoc analysis above.

void PG::activate(ObjectStore::Transaction& t,
		  epoch_t activation_epoch,
		  list<Context*>& tfin,
		  map<int, map<spg_t,pg_query_t> >& query_map,
		  map<int,
		      vector<
			pair<pg_notify_t,
			     pg_interval_map_t> > > *activator_map,
                  RecoveryCtx *ctx)
{
	...

	// if primary..
	if (is_primary()) {

		...

		// Set up missing_loc
		set<pg_shard_t> complete_shards;
		for (set<pg_shard_t>::iterator i = actingbackfill.begin();i != actingbackfill.end();++i) {
			if (*i == get_primary()) {
				missing_loc.add_active_missing(missing);
				if (!missing.have_missing())
					complete_shards.insert(*i);
			} else {
				assert(peer_missing.count(*i));
				missing_loc.add_active_missing(peer_missing[*i]);
				if (!peer_missing[*i].have_missing() && peer_info[*i].last_backfill.is_max())
					complete_shards.insert(*i);
			}
		}

		...
	}

	...

}

Here, for PG 11.4, osd2 needs recovery, so its missing object is recorded in missing_loc.

9) If recovery is needed, queue the PG for recovery via osd->queue_for_recovery(this);

void PG::activate(ObjectStore::Transaction& t,
		  epoch_t activation_epoch,
		  list<Context*>& tfin,
		  map<int, map<spg_t,pg_query_t> >& query_map,
		  map<int,
		      vector<
			pair<pg_notify_t,
			     pg_interval_map_t> > > *activator_map,
                  RecoveryCtx *ctx)
{
	...

	// if primary..
	if (is_primary()) {

		...

		if (needs_recovery()) {
			// If only one shard has missing, we do a trick to add all others as recovery
			// source, this is considered safe since the PGLogs have been merged locally,
			// and covers vast majority of the use cases, like one OSD/host is down for
			// a while for hardware repairing
			if (complete_shards.size() + 1 == actingbackfill.size()) {
				missing_loc.add_batch_sources_info(complete_shards, ctx->handle);
			} else {
				missing_loc.add_source_info(pg_whoami, info, pg_log.get_missing(),
					get_sort_bitwise(), ctx->handle);

				for (set<pg_shard_t>::iterator i = actingbackfill.begin();i != actingbackfill.end();++i) {
					if (*i == pg_whoami) continue;
					dout(10) << __func__ << ": adding " << *i << " as a source" << dendl;
					assert(peer_missing.count(*i));
					assert(peer_info.count(*i));

					missing_loc.add_source_info(
						*i,
						peer_info[*i],
						peer_missing[*i],
						get_sort_bitwise(),
						ctx->handle);
				}
			}

			for (map<pg_shard_t, pg_missing_t>::iterator i = peer_missing.begin();i != peer_missing.end();++i) {
				if (is_actingbackfill(i->first))
					continue;
				assert(peer_info.count(i->first));
				search_for_missing(
					peer_info[i->first],
					i->second,
					i->first,
					ctx);
			}
		
			build_might_have_unfound();
		
			state_set(PG_STATE_DEGRADED);
			dout(10) << "activate - starting recovery" << dendl;
			osd->queue_for_recovery(this);
			if (have_unfound())
				discover_all_missing(query_map);
		}
	}
}

10) If the PG's current acting set is smaller than the replica size configured on the PG's pool, i.e. there are currently not enough OSDs, mark the PG with PG_STATE_DEGRADED and PG_STATE_UNDERSIZED; finally mark the PG as PG_STATE_ACTIVATING.

For PG 11.4, osd2's pg_info.last_update is e0, so execution takes the final else branch and performs the Recovery path: pg_info is sent together with the log entries the peer is missing, and osd2 is added to the recovery list so that its data can be repaired.

3.5.3 The primary's activation-commit callback

In PG::activate() above, a commit callback is registered on the corresponding RecoveryCtx.transaction:

void PG::activate(ObjectStore::Transaction& t,
		  epoch_t activation_epoch,
		  list<Context*>& tfin,
		  map<int, map<spg_t,pg_query_t> >& query_map,
		  map<int,
		      vector<
			pair<pg_notify_t,
			     pg_interval_map_t> > > *activator_map,
                  RecoveryCtx *ctx)
{

	...
	// write pg info, log
	dirty_info = true;
	dirty_big_info = true; // maybe


	t.register_on_complete(
		new C_PG_ActivateCommitted(
		this,
		get_osdmap()->get_epoch(),
		activation_epoch));

	...
}

The RecoveryCtx.transaction itself is queued for commit in the following functions:

void OSD::process_peering_events(
  const list<PG*> &pgs,
  ThreadPool::TPHandle &handle
  )
{
	...

	PG::RecoveryCtx rctx = create_context();
	rctx.handle = &handle;
	for (list<PG*>::const_iterator i = pgs.begin();i != pgs.end();++i) {

		...
		dispatch_context_transaction(rctx, pg, &handle);
	}

	...
	dispatch_context(rctx, 0, curmap, &handle);
}

void OSD::dispatch_context_transaction(PG::RecoveryCtx &ctx, PG *pg,
                                       ThreadPool::TPHandle *handle)
{
	if (!ctx.transaction->empty()) {
		if (!ctx.created_pgs.empty()) {
			ctx.on_applied->add(new C_OpenPGs(ctx.created_pgs, store));
		}

		int tr = store->queue_transaction(
			pg->osr.get(),
			std::move(*ctx.transaction), ctx.on_applied, ctx.on_safe, NULL,
			TrackedOpRef(), handle);

		delete (ctx.transaction);
		assert(tr == 0);
		ctx.transaction = new ObjectStore::Transaction;
		ctx.on_applied = new C_Contexts(cct);
		ctx.on_safe = new C_Contexts(cct);
	}
}
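
The chain boils down to the classic Ceph Context pattern: register a Context on the transaction, and the object store completes it once the transaction is durable. A minimal self-contained sketch (mock Transaction type; in Ceph, finish() would end up calling PG::_activate_committed()):

#include <iostream>
#include <list>

// Minimal mock of the Context/Transaction completion pattern (not Ceph code).
struct Context {
	virtual ~Context() {}
	virtual void finish(int r) = 0;
	void complete(int r) { finish(r); delete this; }
};

struct C_ActivateCommitted_Sketch : Context {
	unsigned epoch, activation_epoch;
	C_ActivateCommitted_Sketch(unsigned e, unsigned ae)
		: epoch(e), activation_epoch(ae) {}
	void finish(int) override {
		// In Ceph: pg->_activate_committed(epoch, activation_epoch)
		std::cout << "activate committed at epoch " << epoch << std::endl;
	}
};

struct MockTransaction {
	std::list<Context*> on_complete;
	void register_on_complete(Context *c) { on_complete.push_back(c); }
	void committed() {                 // called once durable on disk
		for (Context *c : on_complete)
			c->complete(0);
		on_complete.clear();
	}
};

int main() {
	MockTransaction t;
	t.register_on_complete(new C_ActivateCommitted_Sketch(2228, 2228));
	t.committed();   // prints: activate committed at epoch 2228
}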

A log fragment for this step:

57029:2020-09-11 14:10:21.067621 7fba49ec6700 10 osd.3 pg_epoch: 2228 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 activating+degraded] _activate_committed 2228 peer_activated now 3 last_epoch_started 2224 same_interval_since 2227

3.5.4 Receiving the replicas' acknowledgements to MOSDPGLog

When the ACK for MOSDPGLog arrives from a replica OSD, an MInfoRec event is triggered and handled by the following function:

boost::statechart::result PG::RecoveryState::Active::react(const MInfoRec& infoevt)
{
	PG *pg = context< RecoveryMachine >().pg;
	assert(pg->is_primary());
	
	assert(!pg->actingbackfill.empty());
	// don't update history (yet) if we are active and primary; the replica
	// may be telling us they have activated (and committed) but we can't
	// share that until _everyone_ does the same.
	if (pg->is_actingbackfill(infoevt.from)) {
		dout(10) << " peer osd." << infoevt.from << " activated and committed" << dendl;
		pg->peer_activated.insert(infoevt.from);
		pg->blocked_by.erase(infoevt.from.shard);
		pg->publish_stats_to_osd();

		if (pg->peer_activated.size() == pg->actingbackfill.size()) {
			pg->all_activated_and_committed();
		}
	}

	return discard_event();
}

The handling is straightforward: check that the message's source OSD is in this PG's actingbackfill list, record it in peer_activated, and erase it from the blocked_by waiting set. Once the ACKs from all OSDs in actingbackfill have been collected, all_activated_and_committed() is called:

/*
 * update info.history.last_epoch_started ONLY after we and all
 * replicas have activated AND committed the activate transaction
 * (i.e. the peering results are stable on disk).
 */
void PG::all_activated_and_committed()
{
  dout(10) << "all_activated_and_committed" << dendl;
  assert(is_primary());
  assert(peer_activated.size() == actingbackfill.size());
  assert(!actingbackfill.empty());
  assert(blocked_by.empty());

  queue_peering_event(
    CephPeeringEvtRef(
      std::make_shared<CephPeeringEvt>(
        get_osdmap()->get_epoch(),
        get_osdmap()->get_epoch(),
        AllReplicasActivated())));
}

This function posts an AllReplicasActivated() event.
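
Note from the _activate_committed log line earlier that the primary counts itself into peer_activated first ("peer_activated now 3"); the replicas' ACKs then complete the set. A tiny sketch of this bookkeeping (mock types, values taken from PG 11.4):

#include <iostream>
#include <set>

struct AckTracker {
	std::set<int> actingbackfill{2, 3};  // PG 11.4: osd3 (primary) and osd2
	std::set<int> peer_activated{3};     // primary registered itself already

	// Returns true once every shard in actingbackfill has activated.
	bool on_ack(int from) {
		if (!actingbackfill.count(from))
			return false;
		peer_activated.insert(from);
		return peer_activated.size() == actingbackfill.size();
	}
};

int main() {
	AckTracker t;
	std::cout << t.on_ack(2) << std::endl;  // prints 1: all activated
}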

A log fragment for this step:

60468:2020-09-11 14:10:21.964346 7fba45134700 20 osd.3 2228 _dispatch 0x7fba69a9dfe0 pg_info(1 pgs e2228:11.4) v4
60469:2020-09-11 14:10:21.964354 7fba45134700  7 osd.3 2228 handle_pg_info pg_info(1 pgs e2228:11.4) v4 from osd.2
60471:2020-09-11 14:10:21.964414 7fba45134700 30 osd.3 pg_epoch: 2228 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 activating+degraded] lock
60474:2020-09-11 14:10:21.964472 7fba3d124700 30 osd.3 pg_epoch: 2228 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 activating+degraded] lock
60475:2020-09-11 14:10:21.964509 7fba3d124700 10 osd.3 pg_epoch: 2228 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 activating+degraded] handle_peering_event: epoch_sent: 2228 epoch_requested: 2228 MInfoRec from 2 info: 11.4( v 201'1 lc 0'0 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223)
60476:2020-09-11 14:10:21.964530 7fba3d124700 10 osd.3 pg_epoch: 2228 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 activating+degraded] state<Started/Primary/Active>:  peer osd.2 activated and committed
60477:2020-09-11 14:10:21.964549 7fba3d124700 15 osd.3 pg_epoch: 2228 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 activating+degraded] publish_stats_to_osd 2228: no change since 2020-09-11 14:10:21.028307
60478:2020-09-11 14:10:21.964575 7fba3d124700 10 osd.3 pg_epoch: 2228 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 activating+degraded] all_activated_and_committed

Note: when a replica receives the MOSDPGLog message from the primary, it enters the ReplicaActive state and likewise calls PG::activate(), which in turn reaches PG::_activate_committed(); it is in that function that the ACK is sent back to the primary.

3.5.5 AllReplicasActivated

The following function handles the AllReplicasActivated event:

boost::statechart::result PG::RecoveryState::Active::react(const AllReplicasActivated &evt)
{
	PG *pg = context< RecoveryMachine >().pg;
	all_replicas_activated = true;
	
	pg->state_clear(PG_STATE_ACTIVATING);
	pg->state_clear(PG_STATE_CREATING);
	if (pg->acting.size() >= pg->pool.info.min_size) {
		pg->state_set(PG_STATE_ACTIVE);
	} else {
		pg->state_set(PG_STATE_PEERED);
	}
	
	// info.last_epoch_started is set during activate()
	pg->info.history.last_epoch_started = pg->info.last_epoch_started;
	pg->dirty_info = true;
	
	pg->share_pg_info();
	pg->publish_stats_to_osd();
	
	pg->check_local();
	
	// waiters
	if (pg->flushes_in_progress == 0) {
		pg->requeue_ops(pg->waiting_for_peered);
	}
	
	pg->on_activate();
	
	return discard_event();
}

Once all replicas are activated, the following processing takes place:

1) Clear the PG_STATE_ACTIVATING and PG_STATE_CREATING states; if the number of acting OSDs of the PG is at least the pool's min_size, set the PG to PG_STATE_ACTIVE, otherwise to PG_STATE_PEERED;

2) Set pg->info.history.last_epoch_started to pg->info.last_epoch_started;

3) Call pg->share_pg_info() to send the latest pg_info_t to the replicas in the actingbackfill list;

4) Call ReplicatedPG::check_local() to check that local stray objects have been deleted;

5) If read/write requests are waiting for peering to complete, requeue them via pg->requeue_ops(pg->waiting_for_peered);

Note: in ReplicatedPG::do_request(), if the PG has not yet peered successfully, the request is stashed on the waiting_for_peered queue; see the OSD read/write path for details.

6) Call ReplicatedPG::on_activate(): if a Recovery operation is needed, post a DoRecovery event; if a Backfill operation is needed, post a RequestBackfill event; otherwise post an AllReplicasRecovered event.

void ReplicatedPG::on_activate()
{
	// all clean?
	if (needs_recovery()) {
		dout(10) << "activate not all replicas are up-to-date, queueing recovery" << dendl;
		queue_peering_event(
		  CephPeeringEvtRef(
			std::make_shared<CephPeeringEvt>(
			get_osdmap()->get_epoch(),
			get_osdmap()->get_epoch(),
			DoRecovery())));
	} else if (needs_backfill()) {
		dout(10) << "activate queueing backfill" << dendl;
		queue_peering_event(
		  CephPeeringEvtRef(
			std::make_shared<CephPeeringEvt>(
			get_osdmap()->get_epoch(),
			get_osdmap()->get_epoch(),
			RequestBackfill())));
	} else {
		dout(10) << "activate all replicas clean, no recovery" << dendl;
		queue_peering_event(
	  	  CephPeeringEvtRef(
			std::make_shared<CephPeeringEvt>(
			get_osdmap()->get_epoch(),
			get_osdmap()->get_epoch(),
			AllReplicasRecovered())));
	}
	
	publish_stats_to_osd();
	
	if (!backfill_targets.empty()) {
		last_backfill_started = earliest_backfill();
		new_backfill = true;
		assert(!last_backfill_started.is_max());

		dout(5) << "on activate: bft=" << backfill_targets<< " from " << last_backfill_started << dendl;
		for (set<pg_shard_t>::iterator i = backfill_targets.begin();i != backfill_targets.end();++i) {
			dout(5) << "target shard " << *i<< " from " << peer_info[*i].last_backfill<< dendl;
		}
	}
	
	hit_set_setup();
	agent_setup();
}

As just described, on_activate() posts DoRecovery, RequestBackfill or AllReplicasRecovered as appropriate. It then calls hit_set_setup() to initialize the hit_set objects needed by Cache Tiering, and agent_setup() to initialize the Cache Tier agent.

A log fragment for this stage:

60481:2020-09-11 14:10:21.964670 7fba3d124700 10 osd.3 pg_epoch: 2228 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2224/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 activating+degraded] handle_peering_event: epoch_sent: 2228 epoch_requested: 2228 AllReplicasActivated
60482:2020-09-11 14:10:21.964685 7fba3d124700 10 osd.3 pg_epoch: 2228 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+degraded] share_pg_info
60484:2020-09-11 14:10:21.964747 7fba3d124700 15 osd.3 pg_epoch: 2228 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+degraded] publish_stats_to_osd 2228:1341
60485:2020-09-11 14:10:21.964771 7fba3d124700 10 osd.3 pg_epoch: 2228 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+degraded] check_local
60486:2020-09-11 14:10:21.964782 7fba3d124700 15 osd.3 pg_epoch: 2228 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+degraded]  requeue_ops 
60487:2020-09-11 14:10:21.964794 7fba3d124700 10 osd.3 pg_epoch: 2228 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+degraded] needs_recovery osd.2 has 1 missing
60488:2020-09-11 14:10:21.964805 7fba3d124700 10 osd.3 pg_epoch: 2228 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+degraded] activate not all replicas are up-to-date, queueing recovery
60489:2020-09-11 14:10:21.964828 7fba3d124700 15 osd.3 pg_epoch: 2228 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+degraded] publish_stats_to_osd 2228: no change since 2020-09-11 14:10:21.964740
60490:2020-09-11 14:10:21.964867 7fba3d124700 20 osd.3 pg_epoch: 2228 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+degraded] hit_set_clear
60491:2020-09-11 14:10:21.964881 7fba3d124700 20 osd.3 pg_epoch: 2228 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+degraded] agent_stop

3.5.6 Entering the Active/Activating state

In fact, the PG enters the Activating substate as soon as the Active constructor returns; it does not wait for the replica OSDs' responses to the MOSDPGLog requests above. For reference, the Activating constructor:

/*------Activating--------*/
PG::RecoveryState::Activating::Activating(my_context ctx)
  : my_base(ctx),
    NamedState(context< RecoveryMachine >().pg->cct, "Started/Primary/Active/Activating")
{
	context< RecoveryMachine >().log_enter(state_name);
}

3.5.7 Entering the Active/WaitLocalRecoveryReserved state

In ReplicatedPG::on_activate() above, since PG 11.4's osd2 needs recovery, a DoRecovery() event is produced and the PG enters the WaitLocalRecoveryReserved state. A log fragment for this stage:

60496:2020-09-11 14:10:21.965057 7fba3d124700  5 osd.3 pg_epoch: 2228 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+degraded] enter Started/Primary/Active/WaitLocalRecoveryReserved
60497:2020-09-11 14:10:21.965075 7fba3d124700 15 osd.3 pg_epoch: 2228 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded] publish_stats_to_osd 2228:1342
60621:2020-09-11 14:10:22.049880 7fba49ec6700 30 osd.3 pg_epoch: 2228 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded] lock
60891:2020-09-11 14:10:22.051626 7fba49ec6700 30 osd.3 pg_epoch: 2228 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded] lock
60892:2020-09-11 14:10:22.051634 7fba49ec6700 10 osd.3 pg_epoch: 2228 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded] null
61652:2020-09-11 14:10:22.055162 7fba3d925700 30 osd.3 pg_epoch: 2228 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded] lock
61659:2020-09-11 14:10:22.055181 7fba3d925700 10 osd.3 pg_epoch: 2228 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded] handle_advance_map [3,2]/[3,2] -- 3/3
61664:2020-09-11 14:10:22.055197 7fba3d925700 10 osd.3 pg_epoch: 2229 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded] state<Started/Primary/Active>: Active advmap
61667:2020-09-11 14:10:22.055208 7fba3d925700 10 osd.3 pg_epoch: 2229 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded] state<Started>: Started advmap
61670:2020-09-11 14:10:22.055219 7fba3d925700 10 osd.3 pg_epoch: 2229 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded] check_recovery_sources no source osds (3) went down
61675:2020-09-11 14:10:22.055233 7fba3d925700 10 osd.3 pg_epoch: 2229 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded] handle_activate_map 
61680:2020-09-11 14:10:22.055248 7fba3d925700 10 osd.3 pg_epoch: 2229 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded] state<Started/Primary/Active>: Active: handling ActMap
61686:2020-09-11 14:10:22.055273 7fba3d925700 10 osd.3 2229 queue_for_recovery queued pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded]
61690:2020-09-11 14:10:22.055286 7fba3d925700  7 osd.3 pg_epoch: 2229 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded] state<Started/Primary>: handle ActMap primary
61695:2020-09-11 14:10:22.055305 7fba3d925700 15 osd.3 pg_epoch: 2229 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded] publish_stats_to_osd 2228: no change since 2020-09-11 14:10:21.965071
61699:2020-09-11 14:10:22.055323 7fba3d925700 10 osd.3 pg_epoch: 2229 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded] take_waiters
61703:2020-09-11 14:10:22.055333 7fba3d925700 20 osd.3 pg_epoch: 2229 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded] handle_activate_map: Not dirtying info: last_persisted is 2228 while current is 2229
61706:2020-09-11 14:10:22.055344 7fba3d925700 10 osd.3 pg_epoch: 2229 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded] handle_peering_event: epoch_sent: 2229 epoch_requested: 2229 NullEvt
61849:2020-09-11 14:10:22.055821 7fba37919700 30 osd.3 pg_epoch: 2229 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded] lock
61851:2020-09-11 14:10:22.055840 7fba37919700 10 osd.3 2229 do_recovery starting 1 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded]
61855:2020-09-11 14:10:22.055850 7fba37919700 10 osd.3 pg_epoch: 2229 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded] recovery raced and were queued twice, ignoring!
61859:2020-09-11 14:10:22.055862 7fba37919700 10 osd.3 2229 do_recovery started 0/1 on pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded]
63796:2020-09-11 14:10:22.633040 7fba4fd58700 20 osd.3 2229 11.4 heartbeat_peers 2,3
64613:2020-09-11 14:10:23.233528 7fba49ec6700 30 osd.3 pg_epoch: 2229 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded] lock
64889:2020-09-11 14:10:23.236332 7fba49ec6700 30 osd.3 pg_epoch: 2229 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded] lock
64891:2020-09-11 14:10:23.236346 7fba49ec6700 10 osd.3 pg_epoch: 2229 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded] null
65325:2020-09-11 14:10:23.239004 7fba3d925700 30 osd.3 pg_epoch: 2229 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded] lock
65329:2020-09-11 14:10:23.239029 7fba3d925700 10 osd.3 pg_epoch: 2229 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded] handle_advance_map [3,2]/[3,2] -- 3/3
65331:2020-09-11 14:10:23.239052 7fba3d925700 10 osd.3 pg_epoch: 2230 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded] state<Started/Primary/Active>: Active advmap
65334:2020-09-11 14:10:23.239067 7fba3d925700 10 osd.3 pg_epoch: 2230 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded] state<Started>: Started advmap
65337:2020-09-11 14:10:23.239085 7fba3d925700 10 osd.3 pg_epoch: 2230 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded] check_recovery_sources no source osds (3) went down
65340:2020-09-11 14:10:23.239104 7fba3d925700 10 osd.3 pg_epoch: 2230 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded] handle_activate_map 
65343:2020-09-11 14:10:23.239121 7fba3d925700 10 osd.3 pg_epoch: 2230 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded] state<Started/Primary/Active>: Active: handling ActMap
65346:2020-09-11 14:10:23.239141 7fba3d925700 10 osd.3 2230 queue_for_recovery queued pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded]
65348:2020-09-11 14:10:23.239155 7fba3d925700  7 osd.3 pg_epoch: 2230 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded] state<Started/Primary>: handle ActMap primary
65353:2020-09-11 14:10:23.239177 7fba3d925700 15 osd.3 pg_epoch: 2230 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded] publish_stats_to_osd 2228: no change since 2020-09-11 14:10:21.965071
65356:2020-09-11 14:10:23.239201 7fba3d925700 10 osd.3 pg_epoch: 2230 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded] take_waiters
65360:2020-09-11 14:10:23.239214 7fba3d925700 20 osd.3 pg_epoch: 2230 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded] handle_activate_map: Not dirtying info: last_persisted is 2228 while current is 2230
65362:2020-09-11 14:10:23.239229 7fba3d925700 10 osd.3 pg_epoch: 2230 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded] handle_peering_event: epoch_sent: 2230 epoch_requested: 2230 NullEvt
65624:2020-09-11 14:10:23.242068 7fba37919700 30 osd.3 pg_epoch: 2230 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded] lock
65625:2020-09-11 14:10:23.242099 7fba37919700 10 osd.3 2230 do_recovery starting 1 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded]
65626:2020-09-11 14:10:23.242111 7fba37919700 10 osd.3 pg_epoch: 2230 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded] recovery raced and were queued twice, ignoring!
65627:2020-09-11 14:10:23.242124 7fba37919700 10 osd.3 2230 do_recovery started 0/1 on pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded]
66155:2020-09-11 14:10:23.245392 7fba49ec6700 20 osd.3 2230 11.4 heartbeat_peers 2,3
68073:2020-09-11 14:10:24.319966 7fba49ec6700 30 osd.3 pg_epoch: 2230 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded] lock
68718:2020-09-11 14:10:24.323051 7fba49ec6700 30 osd.3 pg_epoch: 2230 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded] lock
68726:2020-09-11 14:10:24.323069 7fba49ec6700 10 osd.3 pg_epoch: 2230 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded] null
68741:2020-09-11 14:10:24.323112 7fba3d124700 30 osd.3 pg_epoch: 2230 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded] lock
68748:2020-09-11 14:10:24.323142 7fba3d124700 10 osd.3 pg_epoch: 2230 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded] handle_advance_map [3,2]/[3,2] -- 3/3
68752:2020-09-11 14:10:24.323158 7fba3d124700 10 osd.3 pg_epoch: 2231 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded] state<Started/Primary/Active>: Active advmap
68755:2020-09-11 14:10:24.323166 7fba3d124700 10 osd.3 pg_epoch: 2231 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded] state<Started>: Started advmap
68758:2020-09-11 14:10:24.323174 7fba3d124700 10 osd.3 pg_epoch: 2231 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded] check_recovery_sources no source osds (3) went down
68760:2020-09-11 14:10:24.323183 7fba3d124700 10 osd.3 pg_epoch: 2231 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded] handle_activate_map 
68764:2020-09-11 14:10:24.323189 7fba3d124700 10 osd.3 pg_epoch: 2231 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded] state<Started/Primary/Active>: Active: handling ActMap
68766:2020-09-11 14:10:24.323199 7fba3d124700 10 osd.3 2231 queue_for_recovery queued pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded]
68769:2020-09-11 14:10:24.323204 7fba3d124700  7 osd.3 pg_epoch: 2231 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded] state<Started/Primary>: handle ActMap primary
68772:2020-09-11 14:10:24.323212 7fba3d124700 15 osd.3 pg_epoch: 2231 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded] publish_stats_to_osd 2228: no change since 2020-09-11 14:10:21.965071
68775:2020-09-11 14:10:24.323220 7fba3d124700 10 osd.3 pg_epoch: 2231 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded] take_waiters
68777:2020-09-11 14:10:24.323226 7fba3d124700 20 osd.3 pg_epoch: 2231 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded] handle_activate_map: Not dirtying info: last_persisted is 2228 while current is 2231
68778:2020-09-11 14:10:24.323232 7fba3d124700 10 osd.3 pg_epoch: 2231 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded] handle_peering_event: epoch_sent: 2231 epoch_requested: 2231 NullEvt
71105:2020-09-11 14:10:25.746633 7fba4f557700 25 osd.3 2231  sending 11.4 2228:1342
71198:2020-09-11 14:10:25.789587 7fba37919700 30 osd.3 pg_epoch: 2231 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded] lock
71199:2020-09-11 14:10:25.789593 7fba37919700 10 osd.3 2231 do_recovery starting 1 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded]
71200:2020-09-11 14:10:25.789598 7fba37919700 10 osd.3 pg_epoch: 2231 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded] recovery raced and were queued twice, ignoring!
71201:2020-09-11 14:10:25.789602 7fba37919700 10 osd.3 2231 do_recovery started 0/1 on pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded]
71309:2020-09-11 14:10:25.797456 7fba35915700 30 osd.3 pg_epoch: 2231 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded] lock
71315:2020-09-11 14:10:25.797495 7fba3d124700 30 osd.3 pg_epoch: 2231 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded] lock
71317:2020-09-11 14:10:25.797531 7fba3d124700 10 osd.3 pg_epoch: 2231 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded] handle_peering_event: epoch_sent: 2228 epoch_requested: 2228 LocalRecoveryReserved
71318:2020-09-11 14:10:25.797542 7fba3d124700  5 osd.3 pg_epoch: 2231 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded] exit Started/Primary/Active/WaitLocalRecoveryReserved 3.832485 10 0.000382

The WaitLocalRecoveryReserved constructor is as follows:

PG::RecoveryState::WaitLocalRecoveryReserved::WaitLocalRecoveryReserved(my_context ctx)
  : my_base(ctx),
    NamedState(context< RecoveryMachine >().pg->cct, "Started/Primary/Active/WaitLocalRecoveryReserved")
{
	context< RecoveryMachine >().log_enter(state_name);
	PG *pg = context< RecoveryMachine >().pg;
	pg->state_set(PG_STATE_RECOVERY_WAIT);
	pg->osd->local_reserver.request_reservation(
		pg->info.pgid,
		new QueuePeeringEvt<LocalRecoveryReserved>(
			pg, pg->get_osdmap()->get_epoch(),
			LocalRecoveryReserved()),
		pg->get_recovery_priority());

	pg->publish_stats_to_osd();
}

In the WaitLocalRecoveryReserved constructor, the PG state is first set to PG_STATE_RECOVERY_WAIT (shown as recovery_wait in the logs above), and then AsyncReserver::request_reservation() is called to request a local recovery reservation.
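
The on_reserved callback registered here is a QueuePeeringEvt<LocalRecoveryReserved>. It is a small Context that, once the reservation is eventually granted, re-queues the corresponding peering event on the PG under the PG lock. Its definition looks roughly as follows (a sketch based on the same vintage of PG.h as the excerpts above; comments added):

template <class EVT>
struct QueuePeeringEvt : Context {
	PGRef pg;
	epoch_t epoch;
	EVT evt;
	QueuePeeringEvt(PG *pg, epoch_t epoch, EVT evt) :
		pg(pg), epoch(epoch), evt(evt) {}
	void finish(int r) {
		pg->lock();
		// wrap the statechart event in a CephPeeringEvt and queue it, so it is
		// delivered on the peering work queue rather than on the caller's thread
		pg->queue_peering_event(PG::CephPeeringEvtRef(
			new PG::CephPeeringEvt(epoch, epoch, evt)));
		pg->unlock();
	}
};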

Note 1: every time the PG's state changes, PG::publish_stats_to_osd() is called to publish the updated stats to the OSD.

Note 2: handle_advance_map() shows up several more times in the log fragment above. The likely cause is pg_temp requests made by other PGs; for PG 11.4 these map advances only update pg_epoch in OSD::advance_pg(), so we can ignore them for now. The relevant call (heavily abridged) is:

bool OSD::advance_pg(
  epoch_t osd_epoch, PG *pg,
  ThreadPool::TPHandle &handle,
  PG::RecoveryCtx *rctx,
  set<boost::intrusive_ptr<PG> > *new_pgs)
{
	...
	// record the epoch the PG has been advanced to
	service.pg_update_epoch(pg->info.pgid, lastmap->get_epoch());
	...
}

1) The AsyncReserver::request_reservation() function

/**
* Requests a reservation
*
* Note, on_reserved may be called following cancel_reservation.  Thus,
* the callback must be safe in that case.  Callback will be called
* with no locks held.  cancel_reservation must be called to release the
* reservation slot.
*/
void request_reservation(
  T item,                   ///< [in] reservation key
  Context *on_reserved,     ///< [in] callback to be called on reservation
  unsigned prio
) {
	Mutex::Locker l(lock);
	assert(!queue_pointers.count(item) &&
		!in_progress.count(item));

	queues[prio].push_back(make_pair(item, on_reserved));
	queue_pointers.insert(make_pair(item, make_pair(prio,--(queues[prio]).end())));
	do_queues();
}

As we can see, request_reservation() merely appends the request to a priority queue and then calls do_queues(), which decides whether any queued request can actually be granted.
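
For context, here is a lightly abridged sketch of do_queues(), based on the same vintage of common/AsyncReserver.h (the member names match the real class, but treat the details as approximate). It scans the queues from the highest priority downwards and, while fewer than max_allowed reservations are in progress, dequeues requests and schedules their on_reserved callbacks on a Finisher:

void do_queues() {
	// walk the priority buckets from highest to lowest
	typename map<unsigned, list<pair<T, Context*> > >::reverse_iterator it;
	for (it = queues.rbegin();
	     it != queues.rend() &&
	       in_progress.size() < max_allowed &&
	       it->first >= min_priority;
	     ++it) {
		while (in_progress.size() < max_allowed &&
		       !it->second.empty()) {
			pair<T, Context*> p = it->second.front();
			queue_pointers.erase(p.first);
			it->second.pop_front();
			f->queue(p.second);           // fire on_reserved via the Finisher
			in_progress.insert(p.first);  // slot held until cancel_reservation()
		}
	}
}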

2) The LocalRecoveryReserved event

struct WaitLocalRecoveryReserved : boost::statechart::state< WaitLocalRecoveryReserved, Active >, NamedState {
	typedef boost::mpl::list <
		boost::statechart::transition< LocalRecoveryReserved, WaitRemoteRecoveryReserved >
	> reactions;
	explicit WaitLocalRecoveryReserved(my_context ctx);
	void exit();
};

Once the local recovery reservation is granted, a LocalRecoveryReserved event is generated and, per the transition above, the PG moves directly into the WaitRemoteRecoveryReserved state.

3.5.8 Entering the Active/WaitRemoteRecoveryReserved state

After the local recovery resources have been reserved, recovery resources on the remote OSDs must be reserved as well. Here is a log fragment from this stage:

71320:2020-09-11 14:10:25.797552 7fba3d124700  5 osd.3 pg_epoch: 2231 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded] enter Started/Primary/Active/WaitRemoteRecoveryReserved
71375:2020-09-11 14:10:26.249869 7fba46937700 25 osd.3 2231  ack on 11.4 2228:1342
71484:2020-09-11 14:10:26.635094 7fba4fd58700 20 osd.3 2231 11.4 heartbeat_peers 2,3
94065:2020-09-11 14:13:14.601725 7fba45134700 20 osd.3 2231 _dispatch 0x7fba6c134b40 MRecoveryReserve GRANT  pgid: 11.4, query_epoch: 2231 v2
94067:2020-09-11 14:13:14.601753 7fba45134700 30 osd.3 pg_epoch: 2231 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded] lock
94070:2020-09-11 14:13:14.601819 7fba3d124700 30 osd.3 pg_epoch: 2231 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded] lock
94071:2020-09-11 14:13:14.601855 7fba3d124700 10 osd.3 pg_epoch: 2231 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded] handle_peering_event: epoch_sent: 2231 epoch_requested: 2231 RemoteRecoveryReserved
94072:2020-09-11 14:13:14.601876 7fba3d124700  5 osd.3 pg_epoch: 2231 pg[11.4( v 201'1 (0'0,201'1] local-les=2228 n=1 ec=132 les/c/f 2228/2224/0 2226/2227/2223) [3,2] r=0 lpr=2227 pi=2223-2226/2 crt=201'1 lcod 0'0 mlcod 0'0 active+recovery_wait+degraded] exit Started/Primary/Active/WaitRemoteRecoveryReserved 168.804323 1 0.000031

Note from the timestamps above that PG 11.4 waited about 169 seconds in this state before osd.2's GRANT arrived (the exit line reports 168.804323 seconds). The WaitRemoteRecoveryReserved constructor is as follows:

PG::RecoveryState::WaitRemoteRecoveryReserved::WaitRemoteRecoveryReserved(my_context ctx)
  : my_base(ctx),
    NamedState(context< RecoveryMachine >().pg->cct, "Started/Primary/Active/WaitRemoteRecoveryReserved"),
    remote_recovery_reservation_it(context< Active >().remote_shards_to_reserve_recovery.begin())
{
	context< RecoveryMachine >().log_enter(state_name);
	post_event(RemoteRecoveryReserved());
}

The constructor posts a RemoteRecoveryReserved event to itself, which starts the iteration over the remote shards.

1) Handling the RemoteRecoveryReserved event

boost::statechart::result
PG::RecoveryState::WaitRemoteRecoveryReserved::react(const RemoteRecoveryReserved &evt) {
	PG *pg = context< RecoveryMachine >().pg;
	
	if (remote_recovery_reservation_it != context< Active >().remote_shards_to_reserve_recovery.end()) {
		assert(*remote_recovery_reservation_it != pg->pg_whoami);
		ConnectionRef con = pg->osd->get_con_osd_cluster(
			remote_recovery_reservation_it->osd, pg->get_osdmap()->get_epoch());
		if (con) {
			pg->osd->send_message_osd_cluster(
				new MRecoveryReserve(
				  MRecoveryReserve::REQUEST,
				  spg_t(pg->info.pgid.pgid, remote_recovery_reservation_it->shard),
				  pg->get_osdmap()->get_epoch()),
				con.get());
		}
		++remote_recovery_reservation_it;
	} else {
		post_event(AllRemotesReserved());
	}
	return discard_event();
}

PG::RecoveryState::Active::Active(my_context ctx)
  : my_base(ctx),
    NamedState(context< RecoveryMachine >().pg->cct, "Started/Primary/Active"),
    remote_shards_to_reserve_recovery(
      unique_osd_shard_set(
	context< RecoveryMachine >().pg->pg_whoami,
	context< RecoveryMachine >().pg->actingbackfill)),
    remote_shards_to_reserve_backfill(
      unique_osd_shard_set(
	context< RecoveryMachine >().pg->pg_whoami,
	context< RecoveryMachine >().pg->backfill_targets)),
    all_replicas_activated(false)
{
	...
}

When the PG entered the Active state (see the Active constructor above), every shard that needs recovery was already collected into Active::remote_shards_to_reserve_recovery. For PG 11.4 this set contains only osd.2, so the react() above sends an MRecoveryReserve::REQUEST to osd.2 to reserve the resources recovery needs.
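
The unique_osd_shard_set() helper used in the constructor simply reduces the shard list to one pg_shard_t per OSD, skipping the primary itself. A sketch, modeled on the helper of the same vintage in PG.cc:

template <typename T>
set<pg_shard_t> unique_osd_shard_set(const pg_shard_t &skip, const T &in)
{
	set<int> osds_found;
	set<pg_shard_t> out;
	for (typename T::const_iterator i = in.begin(); i != in.end(); ++i) {
		// keep the first shard seen for each remote OSD; skip ourselves
		if (*i != skip && !osds_found.count(i->osd)) {
			osds_found.insert(i->osd);
			out.insert(*i);
		}
	}
	return out;
}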

2) Handling the MRecoveryReserve::REQUEST message

For PG 11.4, after osd.2 receives the MRecoveryReserve::REQUEST, the processing flow is as follows:

void OSD::dispatch_op(OpRequestRef op){
	switch (op->get_req()->get_type()) {

		...

		case MSG_OSD_RECOVERY_RESERVE:
   		  handle_pg_recovery_reserve(op);
   		  break;
	}
}

void OSD::handle_pg_recovery_reserve(OpRequestRef op)
{
	MRecoveryReserve *m = static_cast<MRecoveryReserve*>(op->get_req());
	assert(m->get_type() == MSG_OSD_RECOVERY_RESERVE);
	
	if (!require_osd_peer(op->get_req()))
		return;
	if (!require_same_or_newer_map(op, m->query_epoch, false))
		return;
	
	PG::CephPeeringEvtRef evt;
	if (m->type == MRecoveryReserve::REQUEST) {
		evt = PG::CephPeeringEvtRef(
			new PG::CephPeeringEvt(
			  m->query_epoch,
			  m->query_epoch,
			  PG::RequestRecovery()));
	} else if (m->type == MRecoveryReserve::GRANT) {
		evt = PG::CephPeeringEvtRef(
			new PG::CephPeeringEvt(
			  m->query_epoch,
			  m->query_epoch,
			  PG::RemoteRecoveryReserved()));
	} else if (m->type == MRecoveryReserve::RELEASE) {
		evt = PG::CephPeeringEvtRef(
			new PG::CephPeeringEvt(
			  m->query_epoch,
			  m->query_epoch,
			  PG::RecoveryDone()));
	} else {
		assert(0);
	}
	
	if (service.splitting(m->pgid)) {
		peering_wait_for_split[m->pgid].push_back(evt);
		return;
	}
	
	PG *pg = _lookup_lock_pg(m->pgid);
	if (!pg) {
		dout(10) << " don't have pg " << m->pgid << dendl;
		return;
	}
	
	pg->queue_peering_event(evt);
	pg->unlock();
}

For a REQUEST message, handle_pg_recovery_reserve() wraps a RequestRecovery event into a CephPeeringEvt and queues it on the target PG.
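
The queueing at the end of handle_pg_recovery_reserve() goes through PG::queue_peering_event(), which looks roughly like this (a sketch based on the same vintage of PG.cc): stale events are dropped, and everything else lands on the PG's peering queue to be handled by the peering work queue:

void PG::queue_peering_event(CephPeeringEvtRef evt)
{
	// drop events from an epoch the PG has already moved past
	if (old_peering_evt(evt))
		return;
	peering_queue.push_back(evt);
	osd->queue_for_peering(this);  // schedule this PG on the peering work queue
}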

For PG 11.4's replica on osd.2, entering ReplicaActive lands it by default in the initial substate RepNotRecovering, so it is RepNotRecovering that handles the RequestRecovery event:

struct RepNotRecovering : boost::statechart::state< RepNotRecovering, ReplicaActive>, NamedState {
	typedef boost::mpl::list<
		boost::statechart::custom_reaction< RequestBackfillPrio >,
		boost::statechart::transition< RequestRecovery, RepWaitRecoveryReserved >,
		boost::statechart::transition< RecoveryDone, RepNotRecovering >  // for compat with pre-reservation peers
	> reactions;
	explicit RepNotRecovering(my_context ctx);
	boost::statechart::result react(const RequestBackfillPrio &evt);
	void exit();
};

On receiving RequestRecovery, RepNotRecovering transitions directly into the RepWaitRecoveryReserved state:

/*---RepWaitRecoveryReserved--*/
PG::RecoveryState::RepWaitRecoveryReserved::RepWaitRecoveryReserved(my_context ctx)
  : my_base(ctx),
    NamedState(context< RecoveryMachine >().pg->cct, "Started/ReplicaActive/RepWaitRecoveryReserved")
{
	context< RecoveryMachine >().log_enter(state_name);
	PG *pg = context< RecoveryMachine >().pg;
	
	pg->osd->remote_reserver.request_reservation(
		pg->info.pgid,
		new QueuePeeringEvt<RemoteRecoveryReserved>(
		  pg, pg->get_osdmap()->get_epoch(),
		  RemoteRecoveryReserved()),
		pg->get_recovery_priority());
}

In the RepWaitRecoveryReserved constructor, request_reservation() is called again (the same AsyncReserver method shown earlier, but this time on osd->remote_reserver) to reserve recovery resources on the replica. Once the reservation is granted, the QueuePeeringEvt callback fires and generates a RemoteRecoveryReserved event.
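
Both local_reserver and remote_reserver are AsyncReserver<spg_t> members of OSDService, created in its constructor. The initializer-list fragment below (abridged from the same vintage of OSD.cc) shows that both are bounded by osd_max_backfills, which is therefore what caps how many PGs an OSD recovers or backfills concurrently, whether as primary or as replica:

// from the OSDService constructor's initializer list (abridged)
local_reserver(&reserver_finisher, cct->_conf->osd_max_backfills,
	       cct->_conf->osd_min_recovery_priority),
remote_reserver(&reserver_finisher, cct->_conf->osd_max_backfills,
		cct->_conf->osd_min_recovery_priority),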

The replica then handles the RemoteRecoveryReserved event as follows:

boost::statechart::result
PG::RecoveryState::RepWaitRecoveryReserved::react(const RemoteRecoveryReserved &evt)
{
	PG *pg = context< RecoveryMachine >().pg;
	pg->osd->send_message_osd_cluster(
		pg->primary.osd,
		new MRecoveryReserve(
		  MRecoveryReserve::GRANT,
		  spg_t(pg->info.pgid.pgid, pg->primary.shard),
		  pg->get_osdmap()->get_epoch()),
		pg->get_osdmap()->get_epoch());

	return transit<RepRecovering>();
}

The function above sends an MRecoveryReserve::GRANT back to the primary OSD (osd.3 for PG 11.4), and the replica then transitions directly into the RepRecovering state.
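
The replica now stays in RepRecovering until the primary finishes recovery and sends MRecoveryReserve::RELEASE, which handle_pg_recovery_reserve() (shown above) turns into a RecoveryDone event. A sketch of the state declaration, following the same vintage of PG.h (treat the exact reaction list as approximate):

struct RepRecovering : boost::statechart::state< RepRecovering, ReplicaActive >, NamedState {
	typedef boost::mpl::list<
		// MRecoveryReserve::RELEASE from the primary arrives as RecoveryDone
		boost::statechart::transition< RecoveryDone, RepNotRecovering >,
		boost::statechart::transition< RemoteReservationRejected, RepNotRecovering >,
		boost::statechart::custom_reaction< BackfillTooFull >
	> reactions;
	explicit RepRecovering(my_context ctx);
	boost::statechart::result react(const BackfillTooFull &evt);
	void exit();  // cancels the remote_reserver reservation, freeing the slot
};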

3) The primary OSD's handling of MRecoveryReserve::GRANT

void OSD::dispatch_op(OpRequestRef op){
	switch (op->get_req()->get_type()) {

		...

		case MSG_OSD_RECOVERY_RESERVE:
   		  handle_pg_recovery_reserve(op);
   		  break;
	}
}

void OSD::handle_pg_recovery_reserve(OpRequestRef op)
{
	MRecoveryReserve *m = static_cast<MRecoveryReserve*>(op->get_req());
	assert(m->get_type() == MSG_OSD_RECOVERY_RESERVE);
	
	...
	else if (m->type == MRecoveryReserve::GRANT) {
		evt = PG::CephPeeringEvtRef(
			new PG::CephPeeringEvt(
			  m->query_epoch,
			  m->query_epoch,
			  PG::RemoteRecoveryReserved()));
	}
	...
}

On receiving the GRANT, the primary constructs a RemoteRecoveryReserved event for the PG. It is handled by the same WaitRemoteRecoveryReserved::react() we saw above; with only osd.2 to reserve, the iterator has now reached the end of remote_shards_to_reserve_recovery:

boost::statechart::result
PG::RecoveryState::WaitRemoteRecoveryReserved::react(const RemoteRecoveryReserved &evt) {
	PG *pg = context< RecoveryMachine >().pg;
	
	if (remote_recovery_reservation_it != context< Active >().remote_shards_to_reserve_recovery.end()) {
		assert(*remote_recovery_reservation_it != pg->pg_whoami);
		ConnectionRef con = pg->osd->get_con_osd_cluster(
			remote_recovery_reservation_it->osd, pg->get_osdmap()->get_epoch());
		if (con) {
			pg->osd->send_message_osd_cluster(
				new MRecoveryReserve(
				  MRecoveryReserve::REQUEST,
				  spg_t(pg->info.pgid.pgid, remote_recovery_reservation_it->shard),
				  pg->get_osdmap()->get_epoch()),
				con.get());
		}
		++remote_recovery_reservation_it;
	} else {
		post_event(AllRemotesReserved());
	}
	return discard_event();
}

As we can see, once every remote shard has been reserved, an AllRemotesReserved event is posted, and the PG finally enters the Recovering state.
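
That final transition is declared on the state itself; a sketch following the same vintage of PG.h:

struct WaitRemoteRecoveryReserved : boost::statechart::state< WaitRemoteRecoveryReserved, Active >, NamedState {
	typedef boost::mpl::list <
		boost::statechart::custom_reaction< RemoteRecoveryReserved >,
		// posted by react() once it has walked past the last remote shard
		boost::statechart::transition< AllRemotesReserved, Recovering >
	> reactions;
	set<pg_shard_t>::const_iterator remote_recovery_reservation_it;
	explicit WaitRemoteRecoveryReserved(my_context ctx);
	boost::statechart::result react(const RemoteRecoveryReserved &evt);
	void exit();
};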



