A Kubernetes cluster is leaking file descriptors...

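Symptoms
A customer's Kubernetes environment became unhealthy because of "too many open files" errors, that is, a file descriptor leak.
We have since established that the trigger was an outage of the NFS server that provides external storage for that environment. The analysis below is based on a test environment in which we reproduced the problem by manually stopping the NFS service.
Initial analysis
We first determine the system-wide maximum number of open files on the Linux host that runs the Kubernetes node:
$ cat /proc/sys/fs/file-max
100000
Then we look at the number of files currently open:
$ cat /proc/sys/fs/file-nr
83552 0 100000
$ cat /proc/sys/fs/file-nr
84640 0 100000
The count keeps climbing and is already close to the system-wide limit.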
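The three fields of /proc/sys/fs/file-nr are the number of allocated file handles, the number of allocated but unused handles, and the system-wide maximum (the same value as file-max). As a minimal sketch, the helper below turns one sample into a utilisation figure. The function name file_nr_summary and its optional path argument are illustrative and were not part of the original investigation.

```shell
# Summarise one file-nr sample as "in_use=<n> max=<n> used=<n>%".
# $1: path to a file in file-nr format (hypothetical parameter, defaults to the real file).
file_nr_summary() {
    awk '{ printf "in_use=%d max=%d used=%d%%\n", $1 - $2, $3, ($1 - $2) * 100 / $3 }' \
        "${1:-/proc/sys/fs/file-nr}"
}

# Example: file_nr_summary            # reads /proc/sys/fs/file-nr
```

Applied to the first sample above, this reports 83% of the limit already in use, which makes the trend easier to alert on than the raw triple.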
When troubleshooting on the customer's environment we looked only at socket statistics. We wrongly concluded that some process was frantically opening /var/run/docker.sock, and we did not find the real culprit:
$ ss -a | grep "docker.sock" | wc -l
1176
$ ss -a | grep "docker.sock" | wc -l
1220
A note on why this socket shows up: Docker uses a Unix domain socket (IPC socket) for local communication by default, and kubelet uses dockershim to translate CRI requests into the corresponding Docker API requests, which it sends to dockerd (the Docker daemon). So /var/run/docker.sock is used by kubelet.
Under normal conditions lsof can tell us how many files a process has open, but in extreme situations like this one lsof itself stops working. In that case we read the process data from the /proc virtual file system instead. The entries under /proc/${pid}/fd map one-to-one to the files the process has open, so counting the entries under /proc/${pid}/fd gives the number of files the process has open:
$ ps -ef | grep kubelet
root     28051     1  7 15:37 ?        00:29:57 /usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --config=/var/lib/kubelet/config.yaml --cgroup-driver=systemd --network-plugin=cni --pod-infra-container-image=10.10.10.254:5000/pause:3.2
$ ls /proc/28051/fd | wc -l
1253
$ ls /proc/28051/fd | wc -l
1256
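The /proc-based count can be wrapped in a small function so that it can be reused for any PID. This is only a sketch: the name count_fds and the optional second argument (an alternative proc root, useful for trying the function against a scratch directory) are assumptions made for illustration.

```shell
# Print the number of open file descriptors of a process.
# $1: PID
# $2: proc root (hypothetical parameter, defaults to /proc)
count_fds() (
    fd_dir="${2:-/proc}/$1/fd"
    # The directory is missing if the process does not exist or has already exited.
    [ -d "$fd_dir" ] || exit 1
    ls -A "$fd_dir" 2>/dev/null | wc -l | tr -d ' '
)

# Example: count_fds 28051
```

Reading another user's /proc/<pid>/fd normally requires root, which matches how the commands above were run.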
I used a script to print, every 10 seconds, the number of files opened by kubelet next to the system-wide count. It showed that kubelet's count grows far more slowly than file-nr:
$ cat << 'EOF' > kubelet.sh
while true
do
    # 28051 is the kubelet PID found above
    sock=$(ss -a | grep "docker.sock" | wc -l)
    fd=$(ls /proc/28051/fd | wc -l)
    file_nr=$(cat /proc/sys/fs/file-nr)
    echo "docker.sock: $sock; file opened: $fd; file-nr: $file_nr"
    sleep 10
done
EOF
$ sh kubelet.sh
docker.sock: 1280; file opened: 1316; file-nr: 92224 0 100000
docker.sock: 1280; file opened: 1315; file-nr: 92192 0 100000
docker.sock: 1280; file opened: 1315; file-nr: 92224 0 100000
docker.sock: 1281; file opened: 1316; file-nr: 92256 0 100000
docker.sock: 1283; file opened: 1318; file-nr: 92416 0 100000
docker.sock: 1283; file opened: 1318; file-nr: 92448 0 100000
docker.sock: 1284; file opened: 1319; file-nr: 92480 0 100000
docker.sock: 1285; file opened: 1321; file-nr: 92544 0 100000
docker.sock: 1286; file opened: 1321; file-nr: 92640 0 100000
docker.sock: 1286; file opened: 1321; file-nr: 92640 0 100000
docker.sock: 1286; file opened: 1321; file-nr: 92608 0 100000
docker.sock: 1286; file opened: 1321; file-nr: 92608 0 100000
docker.sock: 1286; file opened: 1321; file-nr: 92608 0 100000
docker.sock: 1286; file opened: 1321; file-nr: 92640 0 100000
docker.sock: 1286; file opened: 1321; file-nr: 92576 0 100000
docker.sock: 1287; file opened: 1322; file-nr: 92672 0 100000
docker.sock: 1289; file opened: 1324; file-nr: 92768 0 100000
This shows that other processes are also continuously opening files.
In-depth analysis
So we use a script to watch how the number of open files changes for every process, printing only those with more than 1000 open files:
$ cat << 'EOF' > all.sh
while true
do
    for proc in $(find /proc/ -maxdepth 1 -type d -name "[0-9]*")
    do
        # ls fails if the process exits between find and ls
        fd=$(ls $proc/fd | wc -l)
        if [ "$fd" -gt 1000 ]; then
            pid=$(echo $proc | awk -F/ '{print $3}')
            echo "process $pid opened $fd files"
        fi
    done
    echo ==========================
    sleep 10
done
EOF
$ sh all.sh
process 21932 opened 4441 files
ls: cannot access /proc/25665/fd: No such file or directory
process 26140 opened 7035 files
process 26659 opened 4441 files
process 27160 opened 4441 files
process 28051 opened 1366 files
==========================
ls: cannot access /proc/21786/fd: No such file or directory
process 21932 opened 4441 files
process 26140 opened 7040 files
process 26659 opened 4441 files
process 27160 opened 4451 files
process 28051 opened 1368 files
==========================
ls: cannot access /proc/3912/fd: No such file or directory
ls: cannot access /proc/3917/fd: No such file or directory
ls: cannot access /proc/3935/fd: No such file or directory
ls: cannot access /proc/3938/fd: No such file or directory
process 21932 opened 4451 files
process 26140 opened 7060 files
process 26659 opened 4461 files
process 27160 opened 4461 files
process 28051 opened 1371 files
ls: cannot access /proc/21122/fd: No such file or directory
process 21932 opened 4481 files
process 26140 opened 7095 files
process 26659 opened 4481 files
process 27160 opened 4481 files
process 28051 opened 1379 files
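The "No such file or directory" lines in this output come from short-lived processes that exit between being listed and being inspected. A possible variant of the scan that tolerates this race is sketched below. The function name scan_fd_hogs and the optional proc-root argument are illustrative assumptions, not something taken from the original script.

```shell
# List processes whose open-file count exceeds a threshold.
# $1: threshold
# $2: proc root (hypothetical parameter, defaults to /proc)
scan_fd_hogs() (
    threshold=$1
    root=${2:-/proc}
    for dir in "$root"/[0-9]*; do
        # Skip processes that have already exited (or an unmatched glob).
        [ -d "$dir/fd" ] || continue
        # A process may still vanish here; ls errors are discarded and the count is then 0.
        count=$(ls -A "$dir/fd" 2>/dev/null | wc -l | tr -d ' ')
        if [ "$count" -gt "$threshold" ]; then
            echo "process ${dir##*/} opened $count files"
        fi
    done
)

# Example: scan_fd_hogs 1000
```

Wrapping the call in the same while/sleep loop as before gives the periodic view without the error noise.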
Let's look at the process tree for each of these PIDs:
$ pstree 21932 -s
systemd───containerd───containerd-shim─┬─patroni─┬─patroni───463*[{patroni}]
                                       │         ├─postgres───14*[postgres]
                                       │         └─sshd
                                       ├─221*[pgha-liveness.s───curl]
                                       ├─237*[pgha-readiness.───pgha-readiness.───curl]
                                       └─10*[{containerd-shim}]
$ pstree 26659 -s
systemd───containerd───containerd-shim─┬─patroni─┬─patroni───463*[{patroni}]
                                       │         ├─postgres───7*[postgres]
                                       │         └─sshd
                                       ├─221*[pgha-liveness.s───curl]
                                       ├─237*[pgha-readiness.───pgha-readiness.───curl]
                                       └─10*[{containerd-shim}]
$ pstree 27160 -s
systemd───containerd───containerd-shim─┬─patroni─┬─patroni───464*[{patroni}]
                                       │         ├─postgres───7*[postgres]
                                       │         └─sshd
                                       ├─221*[pgha-liveness.s───curl]
                                       ├─238*[pgha-readiness.───pgha-readiness.───curl]
                                       └─10*[{containerd-shim}]
$ pstree 26140 -s
systemd───dockerd───40*[{dockerd}]
$ pstree 28051 -s
systemd───kubelet───31*[{kubelet}]
There are patroni threads and curl processes everywhere: each shim has hundreds of liveness and readiness probe scripts stuck in curl. It would be a miracle if nothing went wrong.
lsof is unusable at this point, so let's inspect by hand which files process 21932 has open:
$ ll -tr /proc/21932/fd | tail -n 50
lr-x------ 1 root root 64 Feb 26 23:56 4563 -> pipe:[4883358]
lr-x------ 1 root root 64 Feb 26 23:56 4561 -> pipe:[4883357]
lr-x------ 1 root root 64 Feb 26 23:56 4560 -> /run/docker/containerd/46f400679dc5867a1bd4de1cc97deadeb049bec0ad4e3c74b3cc2cd77aa9afc8/2b6b459b99e213722d3b36b4ac447fb63d1e77908e8583a443443350eb8b6dbe-stderr
l-wx------ 1 root root 64 Feb 26 23:56 4558 -> /run/docker/containerd/46f400679dc5867a1bd4de1cc97deadeb049bec0ad4e3c74b3cc2cd77aa9afc8/2b6b459b99e213722d3b36b4ac447fb63d1e77908e8583a443443350eb8b6dbe-stderr
lr-x------ 1 root root 64 Feb 26 23:56 4556 -> /run/docker/containerd/46f400679dc5867a1bd4de1cc97deadeb049bec0ad4e3c74b3cc2cd77aa9afc8/2b6b459b99e213722d3b36b4ac447fb63d1e77908e8583a443443350eb8b6dbe-stdout
l-wx------ 1 root root 64 Feb 26 23:56 4554 -> /run/docker/containerd/46f400679dc5867a1bd4de1cc97deadeb049bec0ad4e3c74b3cc2cd77aa9afc8/2b6b459b99e213722d3b36b4ac447fb63d1e77908e8583a443443350eb8b6dbe-stdout
lr-x------ 1 root root 64 Feb 26 23:56 4553 -> pipe:[4874646]
lr-x------ 1 root root 64 Feb 26 23:56 4551 -> pipe:[4874645]
lr-x------ 1 root root 64 Feb 26 23:56 4550 -> /run/docker/containerd/46f400679dc5867a1bd4de1cc97deadeb049bec0ad4e3c74b3cc2cd77aa9afc8/7ae25e44f7286fe796a51546f0311bbedad89ff3c12cde6089665b814cc7abb8-stderr
l-wx------ 1 root root 64 Feb 26 23:56 4548 -> /run/docker/containerd/46f400679dc5867a1bd4de1cc97deadeb049bec0ad4e3c74b3cc2cd77aa9afc8/7ae25e44f7286fe796a51546f0311bbedad89ff3c12cde6089665b814cc7abb8-stderr
lr-x------ 1 root root 64 Feb 26 23:56 4546 -> /run/docker/containerd/46f400679dc5867a1bd4de1cc97deadeb049bec0ad4e3c74b3cc2cd77aa9afc8/7ae25e44f7286fe796a51546f0311bbedad89ff3c12cde6089665b814cc7abb8-stdout
l-wx------ 1 root root 64 Feb 26 23:56 4544 -> /run/docker/containerd/46f400679dc5867a1bd4de1cc97deadeb049bec0ad4e3c74b3cc2cd77aa9afc8/7ae25e44f7286fe796a51546f0311bbedad89ff3c12cde6089665b814cc7abb8-stdout
lr-x------ 1 root root 64 Feb 26 23:56 4543 -> pipe:[4869051]
lr-x------ 1 root root 64 Feb 26 23:56 4541 -> pipe:[4869050]
lr-x------ 1 root root 64 Feb 26 23:56 4540 -> /run/docker/containerd/46f400679dc5867a1bd4de1cc97deadeb049bec0ad4e3c74b3cc2cd77aa9afc8/a491b0b22bdaa8950fa83d44a350683fa2b2fe8b45b46281ff09025cb1640285-stderr
l-wx------ 1 root root 64 Feb 26 23:56 4538 -> /run/docker/containerd/46f400679dc5867a1bd4de1cc97deadeb049bec0ad4e3c74b3cc2cd77aa9afc8/a491b0b22bdaa8950fa83d44a350683fa2b2fe8b45b46281ff09025cb1640285-stderr
lr-x------ 1 root root 64 Feb 26 23:56 4536 -> /run/docker/containerd/46f400679dc5867a1bd4de1cc97deadeb049bec0ad4e3c74b3cc2cd77aa9afc8/a491b0b22bdaa8950fa83d44a350683fa2b2fe8b45b46281ff09025cb1640285-stdout
l-wx------ 1 root root 64 Feb 26 23:56 4534 -> /run/docker/containerd/46f400679dc5867a1bd4de1cc97deadeb049bec0ad4e3c74b3cc2cd77aa9afc8/a491b0b22bdaa8950fa83d44a350683fa2b2fe8b45b46281ff09025cb1640285-stdout
lr-x------ 1 root root 64 Feb 26 23:56 4533 -> pipe:[4859412]
lr-x------ 1 root root 64 Feb 26 23:56 4531 -> pipe:[4859411]
lr-x------ 1 root root 64 Feb 26 23:56 4530 -> /run/docker/containerd/46f400679dc5867a1bd4de1cc97deadeb049bec0ad4e3c74b3cc2cd77aa9afc8/4b19d3a6fc67b5a8aeccfff3edb671315743ebb3fef07f55d76bac37a36e62c3-stderr
l-wx------ 1 root root 64 Feb 26 23:56 4528 -> /run/docker/containerd/46f400679dc5867a1bd4de1cc97deadeb049bec0ad4e3c74b3cc2cd77aa9afc8/4b19d3a6fc67b5a8aeccfff3edb671315743ebb3fef07f55d76bac37a36e62c3-stderr
lr-x------ 1 root root 64 Feb 26 23:56 4526 -> /run/docker/containerd/46f400679dc5867a1bd4de1cc97deadeb049bec0ad4e3c74b3cc2cd77aa9afc8/4b19d3a6fc67b5a8aeccfff3edb671315743ebb3fef07f55d76bac37a36e62c3-stdout
l-wx------ 1 root root 64 Feb 26 23:56 4524 -> /run/docker/containerd/46f400679dc5867a1bd4de1cc97deadeb049bec0ad4e3c74b3cc2cd77aa9afc8/4b19d3a6fc67b5a8aeccfff3edb671315743ebb3fef07f55d76bac37a36e62c3-stdout
lr-x------ 1 root root 64 Feb 26 23:56 4523 -> pipe:[4857797]
lr-x------ 1 root root 64 Feb 26 23:56 4521 -> pipe:[4857796]
lr-x------ 1 root root 64 Feb 26 23:56 4520 -> /run/docker/containerd/46f400679dc5867a1bd4de1cc97deadeb049bec0ad4e3c74b3cc2cd77aa9afc8/e9315eacd861101be78d7b1b7eabf139f8bd27e902908e915798830341db8ab4-stderr
l-wx------ 1 root root 64 Feb 26 23:56 4518 -> /run/docker/containerd/46f400679dc5867a1bd4de1cc97deadeb049bec0ad4e3c74b3cc2cd77aa9afc8/e9315eacd861101be78d7b1b7eabf139f8bd27e902908e915798830341db8ab4-stderr
lr-x------ 1 root root 64 Feb 26 23:56 4516 -> /run/docker/containerd/46f400679dc5867a1bd4de1cc97deadeb049bec0ad4e3c74b3cc2cd77aa9afc8/e9315eacd861101be78d7b1b7eabf139f8bd27e902908e915798830341db8ab4-stdout
l-wx------ 1 root root 64 Feb 26 23:56 4514 -> /run/docker/containerd/46f400679dc5867a1bd4de1cc97deadeb049bec0ad4e3c74b3cc2cd77aa9afc8/e9315eacd861101be78d7b1b7eabf139f8bd27e902908e915798830341db8ab4-stdout
lr-x------ 1 root root 64 Feb 26 23:56 4513 -> pipe:[4847408]
lr-x------ 1 root root 64 Feb 26 23:56 4511 -> pipe:[4847407]
lr-x------ 1 root root 64 Feb 26 23:56 4510 -> /run/docker/containerd/46f400679dc5867a1bd4de1cc97deadeb049bec0ad4e3c74b3cc2cd77aa9afc8/a8ed17bac296cb67aa9e09b26b6556d98ca35f89a8d4872aedf93165ce6e771c-stderr
l-wx------ 1 root root 64 Feb 26 23:56 4508 -> /run/docker/containerd/46f400679dc5867a1bd4de1cc97deadeb049bec0ad4e3c74b3cc2cd77aa9afc8/a8ed17bac296cb67aa9e09b26b6556d98ca35f89a8d4872aedf93165ce6e771c-stderr
lr-x------ 1 root root 64 Feb 26 23:56 4506 -> /run/docker/containerd/46f400679dc5867a1bd4de1cc97deadeb049bec0ad4e3c74b3cc2cd77aa9afc8/a8ed17bac296cb67aa9e09b26b6556d98ca35f89a8d4872aedf93165ce6e771c-stdout
l-wx------ 1 root root 64 Feb 26 23:56 4504 -> /run/docker/containerd/46f400679dc5867a1bd4de1cc97deadeb049bec0ad4e3c74b3cc2cd77aa9afc8/a8ed17bac296cb67aa9e09b26b6556d98ca35f89a8d4872aedf93165ce6e771c-stdout
lr-x------ 1 root root 64 Feb 26 23:56 4503 -> pipe:[4842434]
lr-x------ 1 root root 64 Feb 26 23:56 4501 -> pipe:[4842433]
lr-x------ 1 root root 64 Feb 26 23:56 4500 -> /run/docker/containerd/46f400679dc5867a1bd4de1cc97deadeb049bec0ad4e3c74b3cc2cd77aa9afc8/59e1ada1cda0fed1656465b371b410973db83c420a285a3b270e148896218117-stderr
l-wx------ 1 root root 64 Feb 26 23:56 4498 -> /run/docker/containerd/46f400679dc5867a1bd4de1cc97deadeb049bec0ad4e3c74b3cc2cd77aa9afc8/59e1ada1cda0fed1656465b371b410973db83c420a285a3b270e148896218117-stderr
lr-x------ 1 root root 64 Feb 26 23:56 4496 -> /run/docker/containerd/46f400679dc5867a1bd4de1cc97deadeb049bec0ad4e3c74b3cc2cd77aa9afc8/59e1ada1cda0fed1656465b371b410973db83c420a285a3b270e148896218117-stdout
l-wx------ 1 root root 64 Feb 26 23:56 4494 -> /run/docker/containerd/46f400679dc5867a1bd4de1cc97deadeb049bec0ad4e3c74b3cc2cd77aa9afc8/59e1ada1cda0fed1656465b371b410973db83c420a285a3b270e148896218117-stdout
lr-x------ 1 root root 64 Feb 26 23:56 4493 -> pipe:[4800931]
lr-x------ 1 root root 64 Feb 26 23:56 4491 -> pipe:[4800930]
lr-x------ 1 root root 64 Feb 26 23:56 4490 -> /run/docker/containerd/46f400679dc5867a1bd4de1cc97deadeb049bec0ad4e3c74b3cc2cd77aa9afc8/6dc135d11312e1033360fb714bc3f2aac04dad9fb494cb70c924c54091c54e68-stderr
l-wx------ 1 root root 64 Feb 26 23:56 4488 -> /run/docker/containerd/46f400679dc5867a1bd4de1cc97deadeb049bec0ad4e3c74b3cc2cd77aa9afc8/6dc135d11312e1033360fb714bc3f2aac04dad9fb494cb70c924c54091c54e68-stderr
lr-x------ 1 root root 64 Feb 26 23:56 4486 -> /run/docker/containerd/46f400679dc5867a1bd4de1cc97deadeb049bec0ad4e3c74b3cc2cd77aa9afc8/6dc135d11312e1033360fb714bc3f2aac04dad9fb494cb70c924c54091c54e68-stdout
l-wx------ 1 root root 64 Feb 26 23:56 4484 -> /run/docker/containerd/46f400679dc5867a1bd4de1cc97deadeb049bec0ad4e3c74b3cc2cd77aa9afc8/6dc135d11312e1033360fb714bc3f2aac04dad9fb494cb70c924c54091c54e68-stdout
lr-x------ 1 root root 64 Feb 26 23:56 4483 -> pipe:[4800791]
lr-x------ 1 root root 64 Feb 26 23:56 4481 -> pipe:[4800790]
$ ll /run/docker/containerd/46f400679dc5867a1bd4de1cc97deadeb049bec0ad4e3c74b3cc2cd77aa9afc8/6dc135d11312e1033360fb714bc3f2aac04dad9fb494cb70c924c54091c54e68-stderr
prwx------ 1 root root 0 Feb 26 23:31 /run/docker/containerd/46f400679dc5867a1bd4de1cc97deadeb049bec0ad4e3c74b3cc2cd77aa9afc8/6dc135d11312e1033360fb714bc3f2aac04dad9fb494cb70c924c54091c54e68-stderr
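The files under /run/docker/containerd are named pipes (FIFOs), as the leading p in the file mode shows. The | vertical bar we use every day in the shell creates an anonymous pipe instead. Pipes of both kinds are commonly used for communication between parent and child processes.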
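To make the distinction concrete, the sketch below creates a named pipe in a scratch directory, confirms its file type, and passes one line through it. The function name fifo_roundtrip and the file name demo-stdout are made up for this illustration and only mimic the naming seen above.

```shell
# Create a FIFO, send one line through it, and clean up.
fifo_roundtrip() (
    dir=$(mktemp -d) || exit 1
    fifo="$dir/demo-stdout"
    mkfifo "$fifo" || exit 1
    # -p is true only for named pipes; ls -l would show a leading "p" for this file.
    [ -p "$fifo" ] || exit 1
    # Opening a FIFO for writing blocks until a reader opens it, so write in the background.
    printf 'hello from writer\n' > "$fifo" &
    cat "$fifo"
    wait
    rm -rf "$dir"
)

# Example: fifo_roundtrip
```

Unlike an anonymous pipe, the FIFO has a path on disk, so unrelated processes can open it by name. Each such open costs the opener one file descriptor.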
All of these pipe files live under a path that starts with /run/docker/containerd/46f400679dc5, so a reasonable guess is that 46f400679dc5 is the ID of some container:
$ docker ps | grep 46f400679dc5
46f400679dc5        10.10.10.254:5000/caas4/crunchy-postgres-ha           "/opt/cpm/bin/bootst…"   9 hours ago         Up 9 hours                              k8s_database_host-7f49c5c448-5vkdh_pgo-namespace_ac0a3524-3b65-4876-ab87-9610a70afda2_0
This is exactly the container created by containerd-shim process 21932.
$ pstree 21932 -s
systemd───containerd───containerd-shim─┬─patroni─┬─patroni───475*[{patroni}]
                                       │         ├─postgres───14*[postgres]
                                       │         └─sshd
                                       ├─226*[pgha-liveness.s───curl]
                                       ├─244*[pgha-readiness.───pgha-readiness.───curl]
                                       ├─runc───5*[{runc}]
                                       └─10*[{containerd-shim}]
The above is the process tree of the container with ID 46f400679dc5.
The symbolic links under /proc/21932/fd that refer to named pipes all point into the directory whose name starts with /run/docker/containerd/46f400679dc5:
$ ll -t /run/docker/containerd/46f400679dc5867a1bd4de1cc97deadeb049bec0ad4e3c74b3cc2cd77aa9afc8/ | head -20
total 0
prwx------ 1 root root 0 Feb 27 01:17 init-stderr
prwx------ 1 root root 0 Feb 27 00:09 fdbf8eb06956db4b897ed0fd7bf9ef51b74bab74499d609e14c42f99991b4c56-stderr
prwx------ 1 root root 0 Feb 27 00:09 fdbf8eb06956db4b897ed0fd7bf9ef51b74bab74499d609e14c42f99991b4c56-stdout
prwx------ 1 root root 0 Feb 27 00:07 763baa451bc729810eeb9ba0bbb9bf52161a9e2e352c09c5fe2c60ef697444f2-stderr
prwx------ 1 root root 0 Feb 27 00:07 763baa451bc729810eeb9ba0bbb9bf52161a9e2e352c09c5fe2c60ef697444f2-stdout
prwx------ 1 root root 0 Feb 27 00:07 9429e6fdb63269d484c26f6391177d269400e2e4fec8f8a594f50fbd0905f8a9-stderr
prwx------ 1 root root 0 Feb 27 00:07 9429e6fdb63269d484c26f6391177d269400e2e4fec8f8a594f50fbd0905f8a9-stdout
prwx------ 1 root root 0 Feb 27 00:06 8b4a7652cddebb52294a7a66fd620f6ac4258077cfde21531cce6ee86ac1ccda-stderr
prwx------ 1 root root 0 Feb 27 00:06 8b4a7652cddebb52294a7a66fd620f6ac4258077cfde21531cce6ee86ac1ccda-stdout
prwx------ 1 root root 0 Feb 26 23:51 25cb04f4897dc47670ef5fe5dcdb7725cba9dca76bf0991c199acfd77b87faaa-stderr
prwx------ 1 root root 0 Feb 26 23:51 25cb04f4897dc47670ef5fe5dcdb7725cba9dca76bf0991c199acfd77b87faaa-stdout
prwx------ 1 root root 0 Feb 26 23:50 fd13511325a0bd065bd0ab99c83dca555580dba0f8d51c95fb8d388bee659eaf-stderr
prwx------ 1 root root 0 Feb 26 23:50 fd13511325a0bd065bd0ab99c83dca555580dba0f8d51c95fb8d388bee659eaf-stdout
prwx------ 1 root root 0 Feb 26 23:49 c8dce3743c86adebee9043a2adf80eab08b4c565b90d62d85f1e542a4c84b19a-stderr
prwx------ 1 root root 0 Feb 26 23:49 c8dce3743c86adebee9043a2adf80eab08b4c565b90d62d85f1e542a4c84b19a-stdout
prwx------ 1 root root 0 Feb 26 23:48 fc4a2b410671cbe1e3a3706e3e281bde89be0de537af60aa676365aeba911386-stderr
prwx------ 1 root root 0 Feb 26 23:48 fc4a2b410671cbe1e3a3706e3e281bde89be0de537af60aa676365aeba911386-stdout
prwx------ 1 root root 0 Feb 26 23:47 c5a0233a9651a3bf629004cfd94b79f2f96688fce59764d1adda5bc4d8f7622c-stderr
prwx------ 1 root root 0 Feb 26 23:47 c5a0233a9651a3bf629004cfd94b79f2f96688fce59764d1adda5bc4d8f7622c-stdout
$ ls /run/docker/containerd/46f400679dc5867a1bd4de1cc97deadeb049bec0ad4e3c74b3cc2cd77aa9afc8 | wc -l
944
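With thousands of descriptors, reading the listing by eye does not scale, so it can help to group the link targets by kind. The following is a rough sketch: the function name summarize_fd_targets and the category labels are invented here, and the "stdio-fifo" category is only a heuristic based on the -stdin, -stdout and -stderr suffixes seen in the listings above.

```shell
# Group the symlink targets in an fd directory by kind and print "<kind> <count>" lines.
# $1: an fd directory such as /proc/21932/fd
summarize_fd_targets() (
    fd_dir=$1
    for link in "$fd_dir"/*; do
        # readlink fails if the descriptor was closed in the meantime; skip it.
        target=$(readlink "$link") || continue
        case $target in
            pipe:*)                    echo "anonymous-pipe" ;;
            socket:*)                  echo "socket" ;;
            anon_inode:*)              echo "anon-inode" ;;
            *-stdin|*-stdout|*-stderr) echo "stdio-fifo" ;;   # heuristic, based on the name only
            *)                         echo "other" ;;
        esac
    done | sort | uniq -c | awk '{ print $2, $1 }'
)

# Example: summarize_fd_targets /proc/21932/fd
```

For a process like 21932, such a summary would be expected to be dominated by the pipe and stdio categories, which is what points the investigation at the container's exec sessions.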
What are all these files for?
Let's start with the Docker source code at https://github.com/moby/moby. The code that sets up the named pipes (Linux FIFOs) for a process is at https://github.com/moby/moby/blob/v19.03.12/libcontainerd/remote/client_linux.go#L97-L122:
func newFIFOSet(bundleDir, processID string, withStdin, withTerminal bool) *cio.FIFOSet {
    config := cio.Config{
        Terminal: withTerminal,
        Stdout:   filepath.Join(bundleDir, processID+"-stdout"),
    }
    paths := []string{config.Stdout}
    if withStdin {
        config.Stdin = filepath.Join(bundleDir, processID+"-stdin")
        paths = append(paths, config.Stdin)
    }
    if !withTerminal {
        config.Stderr = filepath.Join(bundleDir, processID+"-stderr")
        paths = append(paths, config.Stderr)
    }
    closer := func() error {
        for _, path := range paths {
            if err := os.RemoveAll(path); err != nil {
                logrus.Warnf("libcontainerd: failed to remove fifo %v: %v", path, err)
            }
        }
        return nil
    }
    return cio.NewFIFOSet(config, closer)
}
Next, find where newFIFOSet is called, at https://github.com/moby/moby/blob/v19.03.12/libcontainerd/remote/client.go#L194:
// Exec creates exec process.
//
// The containerd client calls Exec to register the exec config in the shim side.
// When the client calls Start, the shim will create stdin fifo if needs. But
// for the container main process, the stdin fifo will be created in Create not
// the Start call. stdinCloseSync channel should be closed after Start exec
// process.
func (c *client) Exec(ctx context.Context, containerID, processID string, spec *specs.Process, withStdin bool, attachStdio libcontainerdtypes.StdioCallback) (int, error) {
    // a lot of code here
    fifos := newFIFOSet(labels[DockerContainerBundlePath], processID, withStdin, spec.Terminal)
    p, err = t.Exec(ctx, processID, spec, func(id string) (cio.IO, error) {
        rio, err = c.createIO(fifos, containerID, processID, stdinCloseSync, attachStdio)
        return rio, err
    })
    // a lot of code here
}
It looks as if Linux FIFOs (named pipes) are created every time a new process is exec'd in a container. Let's verify this in a Kubernetes environment:
$ docker ps | grep grafana-646d9cfddf-w8xt4
10669996b0c8        10.10.10.158:5000/caas4/grafana                                 "/run.sh"                16 hours ago        Up 16 hours                             k8s_grafana_grafana-646d9cfddf-w8xt4_gap_f385ebe8-58e8-45bc-919a-4353db754b79_0
$ docker inspect -f {{.State.Pid}} 10669996b0c8
7612
$ pstree 7612 -p -s
systemd(1)───containerd(22719)───containerd-shim(7585)───grafana-server(7612)─┬─{grafana-server}(7666)
                                                                              ├─{grafana-server}(7667)
                                                                              ├─{grafana-server}(7668)
                                                                              ├─{grafana-server}(7669)
                                                                              ├─{grafana-server}(7670)
                                                                              ├─{grafana-server}(7672)
                                                                              ├─{grafana-server}(7673)
                                                                              ├─{grafana-server}(7703)
                                                                              ├─{grafana-server}(8197)
                                                                              ├─{grafana-server}(8198)
                                                                              ├─{grafana-server}(8199)
                                                                              ├─{grafana-server}(8200)
                                                                              ├─{grafana-server}(12076)
                                                                              ├─{grafana-server}(17397)
                                                                              └─{grafana-server}(29134)
The containerd-shim process responsible for the grafana container is PID 7585. These are the files it has open:
$ ll /proc/7585/fd
total 0
lr-x------ 1 root root 64 Feb 27 23:15 0 -> /dev/null
l-wx------ 1 root root 64 Feb 27 23:15 1 -> /dev/null
lrwx------ 1 root root 64 Feb 27 23:15 10 -> socket:[230322]
lr-x------ 1 root root 64 Feb 27 23:15 11 -> pipe:[230333]
l--------- 1 root root 64 Feb 27 23:15 12 -> /run/docker/containerd/10669996b0c8ef8a6e84c33bd54fb78c1776ef3a93abccca96f92337aaf8cfcc/init-stdout
lr-x------ 1 root root 64 Feb 27 23:15 13 -> pipe:[230334]
l-wx------ 1 root root 64 Feb 27 23:15 14 -> /run/docker/containerd/10669996b0c8ef8a6e84c33bd54fb78c1776ef3a93abccca96f92337aaf8cfcc/init-stdout
l--------- 1 root root 64 Feb 27 23:15 15 -> /run/docker/containerd/10669996b0c8ef8a6e84c33bd54fb78c1776ef3a93abccca96f92337aaf8cfcc/init-stdout
lr-x------ 1 root root 64 Feb 27 23:15 16 -> /run/docker/containerd/10669996b0c8ef8a6e84c33bd54fb78c1776ef3a93abccca96f92337aaf8cfcc/init-stdout
l--------- 1 root root 64 Feb 27 23:15 17 -> /run/docker/containerd/10669996b0c8ef8a6e84c33bd54fb78c1776ef3a93abccca96f92337aaf8cfcc/init-stderr
l-wx------ 1 root root 64 Feb 27 23:15 18 -> /run/docker/containerd/10669996b0c8ef8a6e84c33bd54fb78c1776ef3a93abccca96f92337aaf8cfcc/init-stderr
l--------- 1 root root 64 Feb 27 23:15 19 -> /run/docker/containerd/10669996b0c8ef8a6e84c33bd54fb78c1776ef3a93abccca96f92337aaf8cfcc/init-stderr
l-wx------ 1 root root 64 Feb 27 23:15 2 -> /dev/null
lr-x------ 1 root root 64 Feb 27 23:15 20 -> /run/docker/containerd/10669996b0c8ef8a6e84c33bd54fb78c1776ef3a93abccca96f92337aaf8cfcc/init-stderr
lrwx------ 1 root root 64 Feb 27 23:15 3 -> socket:[230331]
l--------- 1 root root 64 Feb 27 23:15 4 -> /var/lib/containerd/io.containerd.runtime.v1.linux/moby/10669996b0c8ef8a6e84c33bd54fb78c1776ef3a93abccca96f92337aaf8cfcc/shim.stdout.log
lrwx------ 1 root root 64 Feb 27 23:15 5 -> anon_inode:[eventpoll]
lrwx------ 1 root root 64 Feb 27 23:15 6 -> /var/lib/containerd/io.containerd.runtime.v1.linux/moby/10669996b0c8ef8a6e84c33bd54fb78c1776ef3a93abccca96f92337aaf8cfcc/shim.stdout.log
l--------- 1 root root 64 Feb 27 23:15 7 -> /var/lib/containerd/io.containerd.runtime.v1.linux/moby/10669996b0c8ef8a6e84c33bd54fb78c1776ef3a93abccca96f92337aaf8cfcc/shim.stderr.log
lrwx------ 1 root root 64 Feb 27 23:15 8 -> /var/lib/containerd/io.containerd.runtime.v1.linux/moby/10669996b0c8ef8a6e84c33bd54fb78c1776ef3a93abccca96f92337aaf8cfcc/shim.stderr.log
lrwx------ 1 root root 64 Feb 27 23:15 9 -> anon_inode:[eventpoll]
$ ls /proc/7585/fd | wc -l
21
Next, we manually enter the grafana container and start a bash process:
$ kubectl exec -it grafana-646d9cfddf-w8xt4 -n gap bash
Then, on the host, we list the files opened by the containerd-shim process again:
$ ll /proc/7585/fd
total 0
lr-x------ 1 root root 64 Feb 27 23:15 0 -> /dev/null
l-wx------ 1 root root 64 Feb 27 23:15 1 -> /dev/null
lrwx------ 1 root root 64 Feb 27 23:15 10 -> socket:[230322]
lr-x------ 1 root root 64 Feb 27 23:15 11 -> pipe:[230333]
l--------- 1 root root 64 Feb 27 23:15 12 -> /run/docker/containerd/10669996b0c8ef8a6e84c33bd54fb78c1776ef3a93abccca96f92337aaf8cfcc/init-stdout
lr-x------ 1 root root 64 Feb 27 23:15 13 -> pipe:[230334]
l-wx------ 1 root root 64 Feb 27 23:15 14 -> /run/docker/containerd/10669996b0c8ef8a6e84c33bd54fb78c1776ef3a93abccca96f92337aaf8cfcc/init-stdout
l--------- 1 root root 64 Feb 27 23:15 15 -> /run/docker/containerd/10669996b0c8ef8a6e84c33bd54fb78c1776ef3a93abccca96f92337aaf8cfcc/init-stdout
lr-x------ 1 root root 64 Feb 27 23:15 16 -> /run/docker/containerd/10669996b0c8ef8a6e84c33bd54fb78c1776ef3a93abccca96f92337aaf8cfcc/init-stdout
l--------- 1 root root 64 Feb 27 23:15 17 -> /run/docker/containerd/10669996b0c8ef8a6e84c33bd54fb78c1776ef3a93abccca96f92337aaf8cfcc/init-stderr
l-wx------ 1 root root 64 Feb 27 23:15 18 -> /run/docker/containerd/10669996b0c8ef8a6e84c33bd54fb78c1776ef3a93abccca96f92337aaf8cfcc/init-stderr
l--------- 1 root root 64 Feb 27 23:15 19 -> /run/docker/containerd/10669996b0c8ef8a6e84c33bd54fb78c1776ef3a93abccca96f92337aaf8cfcc/init-stderr
l-wx------ 1 root root 64 Feb 27 23:15 2 -> /dev/null
lr-x------ 1 root root 64 Feb 27 23:15 20 -> /run/docker/containerd/10669996b0c8ef8a6e84c33bd54fb78c1776ef3a93abccca96f92337aaf8cfcc/init-stderr
l--------- 1 root root 64 Feb 27 23:18 22 -> /run/docker/containerd/10669996b0c8ef8a6e84c33bd54fb78c1776ef3a93abccca96f92337aaf8cfcc/9cf438e02f7ff6e96466a2a9e5316ff4fea819a69f6f197534409be6be561464-stdin
l--------- 1 root root 64 Feb 27 23:18 23 -> /run/docker/containerd/10669996b0c8ef8a6e84c33bd54fb78c1776ef3a93abccca96f92337aaf8cfcc/9cf438e02f7ff6e96466a2a9e5316ff4fea819a69f6f197534409be6be561464-stdin
lrwx------ 1 root root 64 Feb 27 23:18 24 -> /dev/pts/ptmx
l-wx------ 1 root root 64 Feb 27 23:18 25 -> /run/docker/containerd/10669996b0c8ef8a6e84c33bd54fb78c1776ef3a93abccca96f92337aaf8cfcc/9cf438e02f7ff6e96466a2a9e5316ff4fea819a69f6f197534409be6be561464-stdin
lr-x------ 1 root root 64 Feb 27 23:18 26 -> /run/docker/containerd/10669996b0c8ef8a6e84c33bd54fb78c1776ef3a93abccca96f92337aaf8cfcc/9cf438e02f7ff6e96466a2a9e5316ff4fea819a69f6f197534409be6be561464-stdin
l--------- 1 root root 64 Feb 27 23:18 27 -> /run/docker/containerd/10669996b0c8ef8a6e84c33bd54fb78c1776ef3a93abccca96f92337aaf8cfcc/9cf438e02f7ff6e96466a2a9e5316ff4fea819a69f6f197534409be6be561464-stdout
l-wx------ 1 root root 64 Feb 27 23:18 28 -> /run/docker/containerd/10669996b0c8ef8a6e84c33bd54fb78c1776ef3a93abccca96f92337aaf8cfcc/9cf438e02f7ff6e96466a2a9e5316ff4fea819a69f6f197534409be6be561464-stdout
l--------- 1 root root 64 Feb 27 23:18 29 -> /run/docker/containerd/10669996b0c8ef8a6e84c33bd54fb78c1776ef3a93abccca96f92337aaf8cfcc/9cf438e02f7ff6e96466a2a9e5316ff4fea819a69f6f197534409be6be561464-stdout
lrwx------ 1 root root 64 Feb 27 23:15 3 -> socket:[230331]
lr-x------ 1 root root 64 Feb 27 23:18 30 -> /run/docker/containerd/10669996b0c8ef8a6e84c33bd54fb78c1776ef3a93abccca96f92337aaf8cfcc/9cf438e02f7ff6e96466a2a9e5316ff4fea819a69f6f197534409be6be561464-stdout
l--------- 1 root root 64 Feb 27 23:15 4 -> /var/lib/containerd/io.containerd.runtime.v1.linux/moby/10669996b0c8ef8a6e84c33bd54fb78c1776ef3a93abccca96f92337aaf8cfcc/shim.stdout.log
lrwx------ 1 root root 64 Feb 27 23:15 5 -> anon_inode:[eventpoll]
lrwx------ 1 root root 64 Feb 27 23:15 6 -> /var/lib/containerd/io.containerd.runtime.v1.linux/moby/10669996b0c8ef8a6e84c33bd54fb78c1776ef3a93abccca96f92337aaf8cfcc/shim.stdout.log
l--------- 1 root root 64 Feb 27 23:15 7 -> /var/lib/containerd/io.containerd.runtime.v1.linux/moby/10669996b0c8ef8a6e84c33bd54fb78c1776ef3a93abccca96f92337aaf8cfcc/shim.stderr.log
lrwx------ 1 root root 64 Feb 27 23:15 8 -> /var/lib/containerd/io.containerd.runtime.v1.linux/moby/10669996b0c8ef8a6e84c33bd54fb78c1776ef3a93abccca96f92337aaf8cfcc/shim.stderr.log
lrwx------ 1 root root 64 Feb 27 23:15 9 -> anon_inode:[eventpoll]
$ ls /proc/7585/fd | wc -l
30
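Sure enough, the count went up: the exec session added nine descriptors, eight of them on the new -stdin and -stdout FIFOs. After ending the bash process with Ctrl+C, the number of files opened by containerd-shim drops back to 21:
$ ls /proc/7585/fd | wc -l
21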
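The same open-then-release behaviour can be reproduced without Docker. The sketch below is only an analogy for what containerd-shim does, not its actual code path: a shell opens a FIFO on descriptor 9, measures how many descriptors a child process inherits, then closes it again. The function names, the file name demo-stdout and the choice of descriptor 9 are arbitrary. It assumes a Linux /proc and that descriptor 9 is not already in use.

```shell
# Count the descriptors visible to a freshly started child (ls itself).
# ls adds a constant overhead for the directory it opens, which cancels out in comparisons.
open_fd_count() {
    ls /proc/self/fd | wc -l | tr -d ' '
}

# Print three counts: before opening a FIFO, while it is open, and after closing it.
fifo_fd_delta() (
    dir=$(mktemp -d) || exit 1
    mkfifo "$dir/demo-stdout" || exit 1
    before=$(open_fd_count)
    # On Linux, opening a FIFO read-write does not block waiting for a peer.
    exec 9<>"$dir/demo-stdout"
    during=$(open_fd_count)
    exec 9>&-
    after=$(open_fd_count)
    rm -rf "$dir"
    echo "$before $during $after"
)

# Example: fifo_fd_delta
```

The middle number should be exactly one higher than the other two. If a process keeps starting exec sessions that never finish, nothing ever performs the closing step, and the count only grows.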
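Summary
At this point everything fits together. After the NFS server went down, certain components of the Postgres cluster in our Kubernetes environment (patroni, we are looking at you) kept spawning child processes at a furious rate. Each of them made the containerd-shim process that manages the container open more named pipes, until the total exceeded the Linux system-wide limit and produced the "too many open files" errors.
Original article: https://blog.crazytaxii.com/posts/k8s_file_descriptor_leaks/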




