[PVE] A temporary workaround for a spurious log message in PVE

2022-05-20 14:25:34

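Today I went through the logs of a PVE cluster and found one error being reported continuously. This single error had bloated /var/log/syslog to 600 MB. Almost all of it is this one message: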

<14>Nov 15 09:20:10 xnode010 lxcfs[2201]: proc_fuse.c: 1018: proc_stat_read: cpu0 from /lxc/113/ns has unexpected cpu time: 20509567 in /proc/stat, 25054664 in cpuacct.usage_all; unable to determine idle time
<14>Nov 15 09:20:08 xnode010 lxcfs[2201]: proc_fuse.c: 1018: proc_stat_read: cpu0 from /lxc/113/ns has unexpected cpu time: 20509565 in /proc/stat, 25054662 in cpuacct.usage_all; unable to determine idle time
<14>Nov 15 09:20:03 xnode010 lxcfs[2201]: proc_fuse.c: 1018: proc_stat_read: cpu0 from /lxc/113/ns has unexpected cpu time: 20509559 in /proc/stat, 25054653 in cpuacct.usage_all; unable to determine idle time
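To put a number on how much of the file this one message accounts for, something like the following could be used. It is only a sketch: `count_noise` is a hypothetical helper name (not part of PVE or lxcfs), and it assumes a plain-text log file that grep can read.

```shell
# Sketch: measure how much of a log file one repeated message takes up.
# count_noise FILE PATTERN prints: <matching lines> <total lines> <matching bytes>
# (count_noise is a hypothetical helper, not an existing tool.)
count_noise() {
    file=$1
    pattern=$2
    # -F treats the pattern as a fixed string, -c counts matching lines
    matched=$(grep -cF -- "$pattern" "$file")
    # tr strips the padding some wc implementations add
    total=$(wc -l < "$file" | tr -d ' ')
    bytes=$(grep -F -- "$pattern" "$file" | wc -c | tr -d ' ')
    echo "$matched $total $bytes"
}

# Example call on the node (not executed here):
#   count_noise /var/log/syslog "unable to determine idle time"
```

If the first two numbers are close to each other, the file is almost entirely this message, which matches what was seen here.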

At first glance this looks like a CPU usage problem, but on closer inspection CPU usage is actually very low:

# top -bn1 | head
top - 09:45:45 up 159 days, 20:00,  3 users,  load average: 0.20, 0.16, 0.17
Tasks: 733 total,   1 running, 732 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.0 us,  0.4 sy,  0.0 ni, 99.6 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
MiB Mem : 257551.1 total, 140682.6 free,   8178.7 used, 108689.9 buff/cache
MiB Swap:   8192.0 total,   8034.0 free,    158.0 used. 242308.6 avail Mem 


PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME  COMMAND
2302 root      10 -10 3039908 361588  12140 S   5.9   0.1   2633:36 ovs-vswitchd
363712 root      rt   0  580032 184640  51464 S   5.9   0.1   4208:06 corosync
2198179 root      20   0       0      0      0 I   5.9   0.0   0:45.37 kworker/u80:2-ixgbe

The CPU assignment of the containers does not overload any single CPU either:

# pct cpusets
----------------------------------------------------------------------------------------------------------------
100:      2
106:        3                                                       24
110:  0                                  15
113:              6                                                       26                            36    38
149:                    9 10    12             17
190:                  8            13
----------------------------------------------------------------------------------------------------------------

The messages come from lxcfs.service.

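The official forum says this is caused by running lxcfs with '-l', but in practice the messages appear without '-l' as well. As the service status below shows, lxcfs is started here without that option.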

https://forum.proxmox.com/threads/syslog-is-spammed-with-unable-to-determine-idle-time.55032/
# systemctl status lxcfs.service

● lxcfs.service - FUSE filesystem for LXC
   Loaded: loaded (/lib/systemd/system/lxcfs.service; enabled; vendor preset: enabled)
   Active: active (running) since Tue 2021-06-08 13:44:57 CST; 5 months 7 days ago
     Docs: man:lxcfs(1)
 Main PID: 2201 (lxcfs)
    Tasks: 11 (limit: 308995)
   Memory: 28.5M
   CGroup: /system.slice/lxcfs.service
           └─2201 /usr/bin/lxcfs /var/lib/lxcfs

Nov 15 09:35:27 xnode010 lxcfs[2201]: proc_fuse.c: 1018: proc_stat_read: cpu0 from /lxc/113/ns has unexpected cpu time: 20511032 in /proc/stat, 25056479 in cp
Nov 15 09:35:29 xnode010 lxcfs[2201]: proc_fuse.c: 1018: proc_stat_read: cpu0 from /lxc/113/ns has unexpected cpu time: 20511032 in /proc/stat, 25056480 in cp
Nov 15 09:35:34 xnode010 lxcfs[2201]: proc_fuse.c: 1018: proc_stat_read: cpu0 from /lxc/113/ns has unexpected cpu time: 20511038 in /proc/stat, 25056488 in cp
Nov 15 09:35:39 xnode010 lxcfs[2201]: proc_fuse.c: 1018: proc_stat_read: cpu0 from /lxc/113/ns has unexpected cpu time: 20511043 in /proc/stat, 25056497 in cp
Nov 15 09:35:44 xnode010 lxcfs[2201]: proc_fuse.c: 1018: proc_stat_read: cpu0 from /lxc/113/ns has unexpected cpu time: 20511049 in /proc/stat, 25056505 in cp

Since this is a false alarm, the simplest fix is to keep it out of the log.

Create a new configuration file:

# cat /etc/rsyslog.d/pve-local.conf
#  filter out
:msg, contains, "unable to determine idle time" stop
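The listing above only shows the finished file. One way to create it could look like the sketch below. `write_filter` is a hypothetical helper name, and the optional directory argument exists only so the function can be tried against a scratch directory first. The default path is the one used in this post.

```shell
# Sketch: create the rsyslog drop-in shown above.
# write_filter [DIR] writes DIR/pve-local.conf (DIR defaults to /etc/rsyslog.d).
# (write_filter is a hypothetical helper, not an existing tool.)
write_filter() {
    dir=${1:-/etc/rsyslog.d}
    # Quoted heredoc delimiter: the content is written literally, no expansion
    cat > "$dir/pve-local.conf" <<'EOF'
#  filter out
:msg, contains, "unable to determine idle time" stop
EOF
}

# Example call on the node, as root (not executed here):
#   write_filter
```

The rule is a property-based filter: any message whose text contains the quoted string is discarded by `stop` before it reaches the rules that write /var/log/syslog. Before restarting, `rsyslogd -N1` can be run to check the configuration for syntax errors without starting the daemon.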

Restart the logging service so that the change takes effect:

# systemctl restart  rsyslog
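To confirm that the filter is active after the restart, a test message containing the filtered phrase can be sent with `logger` and then looked up in the log. The sketch below assumes the log is /var/log/syslog, as in this post. `check_filtered` and the marker string are hypothetical names chosen for illustration.

```shell
# Sketch: check whether a marker string made it into a log file.
# check_filtered FILE MARKER prints "filtered" if MARKER is absent from FILE,
# otherwise "still logged". (check_filtered is a hypothetical helper.)
check_filtered() {
    log=$1
    marker=$2
    if grep -qF -- "$marker" "$log"; then
        echo "still logged"
    else
        echo "filtered"
    fi
}

# Example on the node (not executed here):
#   marker="filter-test-$$"
#   logger -t lxcfs "$marker unable to determine idle time"
#   sleep 1
#   check_filtered /var/log/syslog "$marker"
```

Because the test message contains "unable to determine idle time", a working filter should drop it and the check should report "filtered".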
