(02.120130) 46208: open file flags:1
(02.120136) 46208: inet: Restore: family AF_INET type SOCK_STREAM proto IPPROTO_TCP port 41810 state TCP_CLOSE_WAIT src_addr 127.0.0.1 dst_addr 127.0.0.1
(02.120163) 46208: sockets: Binding socket to lo dev
(02.120168) 46208: tcp: Restoring TCP connection
(02.120174) 46208: tcp: Restoring TCP connection id 2a2 ino 1c13a
(02.120190) 46208: Debug: Setting 1 queue seq to 238346158
(02.120197) 46208: Debug: Setting 2 queue seq to 3717721167
(02.120210) 46208: Debug: Restoring TCP options
(02.120215) 46208: Debug: Will turn SAK on
(02.120221) 46208: Debug: Will set snd_wscale to 7
(02.120227) 46208: Debug: Will set rcv_wscale to 7
(02.120233) 46208: Debug: Will turn timestamps on
(02.120238) 46208: Debug: Will set mss clamp to 65495
(02.120245) 46208: Debug: Restoring TCP 1 queue data 1078 bytes
(02.122296) 46208: Error (soccr/soccr.c:690): Unable to send a fin packet: libnet_write_raw_ipv4(): -1 bytes written (Operation not permitted)
(02.122329) 46208: Error (criu/files.c:1372): Unable to open fd=469 id=0x2a2
(02.555720) Error (criu/cr-restore.c:1634): 46208 killed by signal 9: Killed
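In a long restore log, the Error entries are easy to miss among the Debug output. The following is a minimal sketch for pulling them out, assuming the log has one entry per line in the format quoted above; the regular expression and the helper name `extract_errors` are illustrative and are not part of criu.

```python
import re

# Pattern inferred from the log lines quoted above (an assumption, not a
# documented criu format): "(SS.UUUUUU) [PID:] Error (file.c:LINE): message"
ERROR_RE = re.compile(
    r"\((?P<ts>\d+\.\d+)\)\s+(?:(?P<pid>\d+):\s+)?"
    r"Error \((?P<src>[^():]+):(?P<line>\d+)\):\s*(?P<msg>.*)"
)

def extract_errors(log_text):
    """Return (timestamp, pid, source file, line number, message) per Error entry."""
    errors = []
    for raw in log_text.splitlines():
        m = ERROR_RE.search(raw)
        if m:
            errors.append(
                (m["ts"], m["pid"], m["src"], int(m["line"]), m["msg"])
            )
    return errors
```

Applied to the log above, this would surface the first failure, the entry from soccr/soccr.c reporting "Operation not permitted", ahead of the follow-on errors it triggers.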
Debug
Step 1: The scenario when the problem occurs
The criu checkpoint log shows:
(00.424489) inet: Collected: ino 0x1c13a family AF_INET type SOCK_STREAM port 41810 state TCP_ESTABLISHED src_addr 127.0.0.1
(09.312339) 46208 fdinfo 469: pos: 0 flags: 4002/0x1
(09.312345) fdinfoEntry fd: 469
(09.312347) sockets: Searching for socket 0x1c13a family 2
(09.312358) sockets: Dumping lo bound dev for sk
(09.312360) sockets: No filter for socket
(09.312361) inet: Dumping inet socket at 469
(09.312363) inet: Dumping: ino 0x1c13a family AF_INET type SOCK_STREAM port 41810 state TCP_ESTABLISHED src_addr 127.0.0.1
(09.312366) inet: Dumped: family AF_INET type SOCK_STREAM proto IPPROTO_TCP port 41810 state 0 src_addr 127.0.0.1 dst_addr 127.0.0.1
(09.312369) tcp: Dumping TCP connection
(09.312370) tcp: Turning repair on for socket 1c13a
(09.312372) Locked 127.0.0.1:41810 - 127.0.0.1:30018 connection
(09.312410) tcp: Done
(09.312419) write fdinfoEntry fd=469 id=674
-A OUTPUT -j KUBE-FIREWALL
-A KUBE-FIREWALL -m comment --comment "kubernetes firewall for dropping marked packets" -m mark --mark 0x8000/0x8000 -j DROP
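This rule fits on every count: the OUTPUT hook with the filter chain, a comment match plus a mark match, and a DROP target. I suspected it was the culprit behind the failing sendto, so I deleted the rule and reran criu; the problem no longer occurred.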
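To make the `-m mark --mark 0x8000/0x8000` condition concrete: the mark match ANDs the packet's mark with the mask and compares the result with the value. Below is a minimal sketch of that comparison; the helper names `parse_mark_spec` and `mark_matches` are hypothetical and serve only as illustration. It says nothing about which packets actually carry the mark, only how the rule decides.

```python
def parse_mark_spec(spec):
    """Split an iptables-style 'value[/mask]' string into (value, mask).

    When no mask is given, all 32 bits are compared.
    """
    value_s, _, mask_s = spec.partition("/")
    value = int(value_s, 0)  # base 0 accepts both "0x8000" and "32768"
    mask = int(mask_s, 0) if mask_s else 0xFFFFFFFF
    return value, mask

def mark_matches(packet_mark, value, mask):
    """Mark match semantics: (packet mark AND mask) == value."""
    return (packet_mark & mask) == value

value, mask = parse_mark_spec("0x8000/0x8000")
# Any packet with bit 0x8000 set matches, regardless of its other mark bits.
hit = mark_matches(0xC000, value, mask)
miss = mark_matches(0x4000, value, mask)
```

With this rule in the OUTPUT path, any locally generated packet whose mark has bit 0x8000 set is dropped, which is consistent with the send failing with "Operation not permitted".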