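Many services on our servers used to be started by shell scripts. To make them easier to manage, we moved them to systemd units controlled through systemctl. Everything was fine after the morning restart, but during the evening traffic peak a large number of users could not connect to the server, and the service's process log filled up with alarm messages. A senior colleague tracked the problem down: a process started by systemd does not inherit the resource limits configured for login sessions (the values that ulimit reports in a shell). It runs with systemd's own defaults instead, so once it reached the default open-file limit it could not open any more files.
The limits in a login shell look like this: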
[root@kilig ~]# ulimit -a
core file size          (blocks, -c) unlimited
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 15082
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 102400
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) 102400
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited
Compare that with the limits of the process started by systemd:
[root@kilig ~]# cat /proc/1024/limits
Limit                     Soft Limit           Hard Limit           Units
Max cpu time              unlimited            unlimited            seconds
Max file size             unlimited            unlimited            bytes
Max data size             unlimited            unlimited            bytes
Max stack size            8388608              unlimited            bytes
Max core file size        unlimited            unlimited            bytes
Max resident set          unlimited            unlimited            bytes
Max processes             1024                 1024                 processes
Max open files            1024                 1024                 files
Max locked memory         8048                 8048                 bytes
Max address space         unlimited            unlimited            bytes
Max file locks            unlimited            unlimited            locks
Max pending signals       1508                 1508                 signals
Max msgqueue size         8192                 8192                 bytes
Max nice priority         0                    0
Max realtime priority     0                    0
Max realtime timeout      unlimited            unlimited            us
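The comparison above can also be scripted. The following is a minimal sketch, assuming a Linux system with /proc mounted; the function name `nofile_soft_limit` is hypothetical and not part of the original setup. It reads the soft "Max open files" value for a given PID, so it can be checked against `ulimit -Sn` in the current shell.

```shell
# Print the soft "Max open files" limit of a process.
# Argument: a PID (defaults to the current shell).
# In /proc/<pid>/limits the row is "Max open files <soft> <hard> files",
# so the soft limit is the 4th whitespace-separated field.
nofile_soft_limit() {
    pid="${1:-$$}"
    awk '/^Max open files/ {print $4}' "/proc/$pid/limits"
}

# Example: compare the current shell with PID 1024 from the output above.
#   echo "shell:   $(ulimit -Sn)"
#   echo "service: $(nofile_soft_limit 1024)"
```

If the two numbers differ, the service is not running with the limits the shell reports.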
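The difference is obvious: the process started by systemd is limited to 1024 open files and 1024 processes, far below the 102400 reported in the shell. The fix is to set the limits explicitly in the [Service] section of the unit file with LimitCORE, LimitNOFILE, and LimitNPROC:
[Unit]
Description=kilig.systemctl
[Service]
LimitCORE=infinity
LimitNOFILE=102400
LimitNPROC=102400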
WorkingDirectory=xxxxxxx
ExecStart=xxxxxxxxxxxxxx
ExecStopPost=xxxxxxxxxxxxxxxxxxxxxx
Restart=always
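After editing a unit file, systemd has to reload its configuration (`systemctl daemon-reload`) and the service has to be restarted before the new limits take effect. The sketch below is one way to confirm the result. It assumes the unit is installed under the placeholder name `kilig.service`, which is not given in the original; `show_unit_limits` is a hypothetical helper name.

```shell
# Print the effective core/process/open-file limits of a unit's main process.
# Argument: the unit name (for example the placeholder "kilig.service").
show_unit_limits() {
    unit="$1"
    # "systemctl show -p MainPID" prints "MainPID=<pid>"; the value is 0
    # when the service has no running main process.
    pid="$(systemctl show -p MainPID "$unit" | cut -d= -f2)"
    if [ -z "$pid" ] || [ "$pid" = "0" ]; then
        echo "unit $unit has no running main process" >&2
        return 1
    fi
    grep -E '^Max (core file size|processes|open files)' "/proc/$pid/limits"
}

# Example (run as root after daemon-reload and a restart of the service):
#   show_unit_limits kilig.service
```

With the directives above in place, the "Max processes" and "Max open files" rows should show 102400 instead of 1024.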